Protein Prospector Automation Guidance

Most of the programs in the Protein Prospector package are implemented according to the Common Gateway Interface (CGI) standard. The parameters for the programs are thus a set of name-value pairs. These can be supplied from an HTML form, from a script that mimics the CGI format, or from the command line.

A detailed description of the CGI format is beyond the scope of this document. Protein Prospector should be able to recognize forms where the method is either GET or POST. If the method is POST then the enctype attribute can either be multipart/form-data or text/plain. A useful book on CGI Programming is:

CGI Programming on the World Wide Web, Shishir Gundavaram, O'Reilly, 1996

The programs can be run from the command line in two ways, either with a long list of name-value pairs:

$ programName - name1=value1 name2=value2 ............... nameN=valueN

or with the parameters stored in an XML file:

$ programName -f params.xml name1=value1 name2=value2 ..... nameN=valueN

For example to index a database using FA-Index on a Windows system:

faindex.cgi - create_database_indicies=1 database=SwissProt.11.02

On a UNIX system you should use a command of the following form (run from the directory containing the program):

./faindex.cgi - create_database_indicies=1 database=SwissProt.11.02 

Note that the first argument has to be a dash character.

When the parameters are stored in an XML file, any parameters given on the command line override those in the file. The format of the XML file should be as follows:

<?xml version="1.0" encoding="UTF-8"?>
<parameters>
<name1>value1</name1>
<name2>value2</name2>
.....
<nameN>valueN</nameN>
</parameters>
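
A parameter file of this shape can also be generated by a script. The following is a minimal Python sketch, not part of Protein Prospector; the function name params_to_xml is hypothetical. The output is written on a single line, and it is an assumption that Prospector's XML reader accepts that as readily as the one-element-per-line layout shown above.

```python
import xml.etree.ElementTree as ET


def params_to_xml(pairs):
    """Serialise (name, value) pairs as children of a <parameters> root.

    A name may appear in several pairs; each pair becomes its own element.
    """
    root = ET.Element("parameters")
    for name, value in pairs:
        ET.SubElement(root, name).text = str(value)
    # xml_declaration=True emits the <?xml ...?> header (Python 3.8+).
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


xml_bytes = params_to_xml(
    [("create_database_indicies", 1), ("database", "SwissProt.11.02")]
)
print(xml_bytes.decode("utf-8"))
```

The returned bytes can be written to params.xml and passed to a program with the -f option.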

The XML file form may be preferred if there are many parameters or if the environment you submit the command from imposes a maximum command length. Another alternative in this case is to use a Perl script:

$c = "programName";
$c .= " - ";
$c .= "name1=value1";
$c .= " ";
$c .= "name2=value2";
$c .= " ";
........
$c .= "nameN=valueN";
$c .= " ";
system "$c";
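
The same idea can be sketched in Python. This is an illustration only: build_command is a hypothetical helper, and the call that would actually start the program is left commented out because it depends on the local installation. Passing the arguments as a list means no shell quoting is needed.

```python
import subprocess


def build_command(program, pairs, params_file=None):
    """Return the argument list: program, '-' or '-f file', then name=value items."""
    args = [program]
    # The first argument is either a lone dash or -f followed by the XML file.
    args += ["-f", params_file] if params_file else ["-"]
    args += [f"{name}={value}" for name, value in pairs]
    return args


cmd = build_command(
    "./faindex.cgi",
    [("create_database_indicies", 1), ("database", "SwissProt.11.02")],
)
print(" ".join(cmd))
# subprocess.run(cmd)  # uncomment on a machine where the program is installed
```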

Certain characters need to be escaped when included in a value. An escape code is a percent sign followed by the character's ASCII value as two uppercase hexadecimal digits. Values stored in an XML file don't strictly need to be escaped like this, but can be. Some commonly used escape codes are listed in the table below:

Character   Escape code
space       %20
line end    %0D%0A
comma       %2C
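
Escaping by hand is error prone, so a script may be preferable. The sketch below uses Python's standard urllib.parse.quote, which produces uppercase hexadecimal codes. It escapes every character other than letters, digits and the characters _ . - ~, which is more than the three codes in the table; that Prospector decodes any such code is an assumption, not something stated here.

```python
from urllib.parse import quote


def escape_value(value):
    """Percent-escape a parameter value using uppercase hex codes."""
    # safe="" means "/" is escaped too; letters, digits and _.-~ are kept.
    return quote(value, safe="")


print(escape_value("Oxidation of M"))    # Oxidation%20of%20M
print(escape_value("P40069\r\nP15180"))  # P40069%0D%0AP15180
print(escape_value("a,b"))               # a%2Cb
```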


Some parameters, such as those that would appear as multiple-choice menus or tick boxes on an HTML form, may be specified multiple times with different values. For example, in an XML parameter file:

.....
<mod_AA>Peptide%20N-terminal%20Gln%20to%20pyroGlu</mod_AA>
<mod_AA>Oxidation%20of%20M</mod_AA>
<mod_AA>Protein%20N-terminus%20Acetylated</mod_AA>
.....

or in a command line context:

$ command.cgi - it=a it=b it=y ...................

If the parameter is entered via a text box on an HTML form then its value can extend over several lines.

In an XML parameter file this could be specified thus:

<accession_nums>P40069
P15180
P11484</accession_nums>

Or thus:

<accession_nums>P40069%0D%0AP15180%0D%0AP11484</accession_nums>

In a list of command line arguments it would be specified as follows:

$ command.cgi - accession_nums=P40069%0D%0AP15180%0D%0AP11484 .....
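
Both conventions, repeated names and multi-line values, can be produced from ordinary data structures. The following Python sketch is illustrative only; expand_pairs and join_lines are hypothetical helper names.

```python
def expand_pairs(params):
    """Flatten a dict into (name, value) pairs; a list value repeats the name."""
    pairs = []
    for name, value in params.items():
        values = value if isinstance(value, list) else [value]
        pairs.extend((name, v) for v in values)
    return pairs


def join_lines(lines):
    """Join the lines of a text-box value with the escaped line end %0D%0A."""
    return "%0D%0A".join(lines)


params = {
    "it": ["a", "b", "y"],
    "accession_nums": join_lines(["P40069", "P15180", "P11484"]),
}
print(" ".join(f"{name}={value}" for name, value in expand_pairs(params)))
# it=a it=b it=y accession_nums=P40069%0D%0AP15180%0D%0AP11484
```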

Peak Spotter

Peak Spotter is used to extract data from an ABI 4700/4800 TOF-TOF Oracle database (version 3). It doesn't require the rest of the Protein Prospector installation to work. However the Oracle client software needs to be installed to allow the database to be accessed. The program can work on UNIX type platforms as well as Windows ones. The program parameters are as follows:

Parameter          Default Value   Valid Values
server_name        ""              text
username           TSQUARED        text
password           TS              text
spot_set_list      0               0 or 1
run_number         0               integer
all_runs           0               0 or 1
write_raw_ms       0               0 or 1
write_raw_msms     0               0 or 1
retain_isotopes    0               0 or 1
minimum_area       100.0           floating point number
intensity_type     Height          Height, Area, Cluster Area, Signal to Noise
spot_set_names     None defined    Spot set names from the database
centroid_dir       ""              valid directory name
centroid_filename  output.txt      valid filename
raw_dir            ""              valid directory name

If the parameter is set to the default value then it does not need to be specified.
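
A script that assembles Peak Spotter parameters can use this rule to keep commands short. The Python sketch below copies the defaults from the table above; the helper name drop_defaults is hypothetical. The three connection parameters are left out of the defaults on purpose, since they are described below as always required, and spot_set_names is left out because it has no default value.

```python
# Defaults copied from the Peak Spotter parameter table.
# server_name, username and password are omitted: they are always required.
PEAK_SPOTTER_DEFAULTS = {
    "spot_set_list": "0",
    "run_number": "0",
    "all_runs": "0",
    "write_raw_ms": "0",
    "write_raw_msms": "0",
    "retain_isotopes": "0",
    "minimum_area": "100.0",
    "intensity_type": "Height",
    "centroid_dir": "",
    "centroid_filename": "output.txt",
    "raw_dir": "",
}


def drop_defaults(params, defaults=PEAK_SPOTTER_DEFAULTS):
    """Return only the parameters whose value differs from the default."""
    # Values are compared as text, so 100 is not treated as equal to "100.0".
    return {n: v for n, v in params.items() if str(v) != defaults.get(n)}


print(drop_defaults({"server_name": "server", "all_runs": 1, "retain_isotopes": 0}))
# {'server_name': 'server', 'all_runs': 1}
```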

server_name

This is the server name as defined in the Oracle client file tnsnames.ora. This parameter is always required.

username

A username to log in to the database. This parameter is always required.

password

A password for the given username. This parameter is always required.

spot_set_list

This parameter is only required if you want to get a list of the spot sets in the database. If you set this parameter to 1 then none of the parameters below should be used.

run_number

The run number can be specified if you just want to extract the data for a given run.

all_runs

This parameter should be set unless you just want to extract the data for a single run.

write_raw_ms

If you set this parameter the raw MS data is also extracted from the database. A separate T2D file is extracted for each spectrum. You need to extract this if you want to do quantitation, such as SILAC, that uses the MS data. You should also extract it if you want to include columns in the Search Compare report, such as intensity, which are calculated from the raw data or to be able to see the parent ion raw data by clicking on the m/z column in the Search Compare peptide report. These files are not used by the database search program.

write_raw_msms

If you set this parameter the raw MSMS data is also extracted from the database. A separate T2D file is extracted for each spectrum. You need to extract this if you want to do quantitation, such as iTRAQ, that uses the MSMS data. You should also extract it if you want to be able to look at the raw data from the Search Compare report. These files are not used by the database search program.

retain_isotopes

If you set this parameter then the isotope peaks are retained in the centroid file. Note that the default setting is for Prospector to allow the ABI software to do the deisotoping. If you want Protein Prospector to do the deisotoping then the Prospector instrument file will need to be set up accordingly.

minimum_area

The minimum area for a peak before it is included in the centroid file. Sometimes the peak lists can be very large if this value isn't set appropriately.

intensity_type

The units for the intensities stored in the centroid file. The possible values for the intensity type are:

  • Height - The peak intensity of the monoisotopic peak.
  • Area - The peak area of the monoisotopic peak.
  • Cluster Area - The summed areas of all the peaks in the isotope cluster.
  • Signal to Noise - The signal to noise ratio of the monoisotopic peak.

spot_set_names

A list of the names of the spot sets (one per line) to extract from the database. Generally only a single spot set is extracted at one time.

centroid_dir

The directory where you want the centroid file to be created. This directory should exist before running the program. A full path should be given for the directory.

centroid_filename

The name for the centroid file created.

raw_dir

The directory where you want the raw files to be created. This directory should exist before running the program. A full path should be given for the directory.

Examples

To get a list of spot sets from the database:

$ peakSpotter.cgi - server_name=server username=n password=p spot_set_list=1

Alternatively the following could be stored in a file called p.xml:

<?xml version="1.0" encoding="UTF-8"?>
<parameters>
<server_name>server</server_name>
<username>n</username>
<password>p</password>
<spot_set_list>1</spot_set_list>
</parameters>

and the command run as follows:

$ peakSpotter.cgi -f p.xml

To extract a spot set from the database into a Protein Prospector data repository:

Create a file (say p.xml) containing the following:

<?xml version="1.0" encoding="UTF-8"?>
<parameters>
<server_name>server</server_name>
<username>n</username>
<password>p</password>
<all_runs>1</all_runs>
<write_raw_ms>1</write_raw_ms>
<write_raw_msms>1</write_raw_msms>
<retain_isotopes>0</retain_isotopes>
<minimum_area>100.0</minimum_area>
<intensity_type>Height</intensity_type>
<spot_set_names>User Project 1\test</spot_set_names>
<centroid_dir>R:\peaklists\TOFTOF1\2005\11</centroid_dir>
<centroid_filename>User Project 1$test.txt</centroid_filename>
<raw_dir>R:\raw\TOFTOF1\2005\11\User Project 1$test</raw_dir>
</parameters>

and run the command as follows:

$ peakSpotter.cgi -f p.xml

Note that

  1. The raw directory needs to exist before running peakSpotter.
  2. Once several spot sets have been extracted you can combine them together to make a Protein Prospector project using the Make Project program. However for this to work the directory where the raw files are going to be stored needs to have the same name as the centroid file (without the .txt suffix). The name doesn't have to correspond to the name of the spot set in the database. The .txt suffix for the centroid file is also currently mandatory.
  3. The \ character in the original spot set name has been replaced by a $ character so it isn't mistaken for part of the raw directory path.
  4. Peaklists for both the MS and MSMS spectra are stored in the one centroid file.
  5. If you extract the raw data then the program will typically take a lot longer to run.
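
The naming rules in notes 2 and 3 can be applied mechanically. The following Python sketch derives the centroid file name and raw directory from a spot set name; repository_names is a hypothetical helper, and the paths are the ones from the example above. It uses ntpath so that Windows path rules apply whatever system the script runs on.

```python
import ntpath  # Windows path rules, independent of the host operating system


def repository_names(spot_set_name, centroid_dir, raw_root):
    """Derive centroid and raw locations that share one base name."""
    # Note 3: replace "\" so it is not mistaken for a path separator.
    base = spot_set_name.replace("\\", "$")
    return {
        "spot_set_names": spot_set_name,
        "centroid_dir": centroid_dir,
        # Note 2: the .txt suffix is mandatory and the raw directory must
        # have the same name as the centroid file without that suffix.
        "centroid_filename": base + ".txt",
        "raw_dir": ntpath.join(raw_root, base),
    }


names = repository_names(
    r"User Project 1\test",
    r"R:\peaklists\TOFTOF1\2005\11",
    r"R:\raw\TOFTOF1\2005\11",
)
for key, value in names.items():
    print(f"{key}={value}")
```

Because of note 1, a real script would also create the raw directory before running peakSpotter, for example with os.makedirs(names["raw_dir"], exist_ok=True).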

Wiff To Centroid

The program wiffToCentroid.exe is a simple command line program to extract mgf (Mascot Generic Format) peak lists from ABI Sciex wiff files. It is compatible with Analyst QS 2.0 and Mascot.dll version 1.1446.0.20. An example use of the program is:

$ wiffToCentroid.exe C:\wiffs\example.wiff C:\mgfs\example.mgf

The full paths to the wiff file and the mgf file must be specified.

The Analyst service must be running on the computer on which wiffToCentroid is running. The Analyst centroiding parameters stored in the registry are used to do the centroiding.

This program can only run on a Windows computer and does not support a CGI or XML parameter file interface.
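
Since the program takes only two positional arguments, a wrapper script mainly needs to guard the full-path requirement. The Python sketch below is an illustration under stated assumptions: wiff_command is a hypothetical helper, and the program name defaults to wiffToCentroid.exe as named in the description above.

```python
import ntpath  # Windows path rules, since the program runs only on Windows


def wiff_command(wiff_path, mgf_path, program="wiffToCentroid.exe"):
    """Return the argument list, rejecting paths that are not full paths."""
    for path in (wiff_path, mgf_path):
        if not ntpath.isabs(path):
            raise ValueError(f"full path required: {path}")
    return [program, wiff_path, mgf_path]


print(wiff_command(r"C:\wiffs\example.wiff", r"C:\mgfs\example.mgf"))
```

The resulting list can be passed to subprocess.run on the Windows computer where the Analyst service is running.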

Parameters for Other Protein Prospector Programs

MS-Fit
name Default Value Valid Values
search_name "" needs to be set to msfit
report_title "" text
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
instrument_name "" valid text strings from params/instrument.txt
dna_frame_translation 3 6, 3, -3, 1, -1
results_from_file 0 0, 1
results_input_dir "results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern, mshomology, msseq
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
ms_prot_low_mass 1000 integer
ms_prot_high_mass 100000 integer
ms_full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
sort_type Score Sort Score Sort
MW Sort
pI Sort
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
missed_cleavages 1 integer
comment "" text
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_msfit
text
output_filename "" text
max_reported_hits 50 integer
detailed_report 0 0,1
display_graph 0 0,1
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
parent_mass_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
parent_mass_systematic_error 0.0 double
parent_contaminant_masses NULL list of singly charged masses
average_to_mono_convert NULL list of 0's and 1's
0 = monoisotopic
1 = average
mod_AA None Defined Peptide N-terminal Gln to pyroGlu
Oxidation of M
Protein N-terminus Acetylated
User Defined 1
Acrylamide Modified Cys
user1_name "" valid text strings defined in usermod.txt
min_parent_ion_matches 1 integer
min_matches 5 integer
mowse_on 0 0, 1
mowse_pfactor 0.4 double
chem_score 0 0, 1
met_ox_factor 1.0 0.2-5.0
data_source Data Paste Area List of Files, Upload Data From File, Data Paste Area
upload_data "" file name
data_directory "" directory name
data_files NULL list of file names
data_format "" M/Z Charge, M/Z Intensity Charge
data NULL See the data_format parameter.
ms_search_type None Defined valid text strings defined in params/homology.txt


MS-Tag
name Default Value Valid Values
search_name "" needs to be set to mstag
report_title "" text
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
instrument_name "" valid text strings from params/instrument.txt
dna_frame_translation 3 6, 3, -3, 1, -1
results_from_file 0 0, 1
results_input_dir "results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
msms_prot_low_mass 1000 integer
msms_prot_high_mass 100000 integer
msms_full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
missed_cleavages 1 integer
comment "" text
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_mstag
text
output_filename "" text
max_reported_hits 50 integer
max_hits 200 integer
protein_report 0 0,1
detailed_report 0 0,1
display_graph 0 0,1
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
parent_mass_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
data_source Data Paste Area List of Files, Upload Data From File, Data Paste Area
upload_data "" file name
data_directory "" directory name
data_files NULL list of file names
data_format "" M/Z Charge, M/Z Intensity Charge
data NULL See the data_format parameter. For MS-Tag the
first line contains the m/z and optionally the charge
of the parent ion.
msms_search_type None Defined valid text strings defined in params/homology.txt
it None Defined a,a-H2O,a-NH3,a-H3PO4,
b,b-H2O,b-NH3,b+H2O,
b-H3PO4,b-SOCH4,
y,y-H2O,y-NH3,y-H3PO4,y-SOCH4,
MH+,c,B,n,h,P,S,I,N,C
ion_unmatched_type Max. % Unmatched Ions Max. % Unmatched Ions
Min. % Matched Ions
Max. Num Unmatched Ions
Min. Num Matched Ions
unused_ions 10 double or integer
fragment_mass_tolerance 1.0 double
comp_ion None Defined regular expression
composition_search 0 0, 1
exclude_flag 0 0, 1
aa_exclude "" text string containing the following characters
A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
aa_add "" m,q,h,s,t,y,u
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
regular_expression . regular expression
use_instrument_ion_types 0 0,1


MS-Seq
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to msseq
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
instrument_name "" valid text strings from params/instrument.txt
dna_frame_translation 3 6, 3, -3, 1, -1
results_from_file 0 0, 1
results_input_dir "results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern, mshomology, msseq
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
msms_prot_low_mass 1000 integer
msms_prot_high_mass 100000 integer
msms_full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
missed_cleavages 1 integer
comment "" text
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_msseq
text
output_filename "" text
max_reported_hits 50 integer
max_hits 200 integer
protein_report 0 0,1
detailed_report 0 0,1
display_graph 0 0,1
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
parent_mass_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
data_format "" M/Z Charge, M/Z Intensity Charge
data NULL See the data_format parameter. For MS-Tag the
first line contains the m/z and optionally the charge
of the parent ion.
msms_search_type None Defined no errors, high mass error, low mass error,
middle masses error, parent mass
ion_type b a,b,c,y
fragment_mass_tolerance 1.0 double
comp_ion None Defined regular expression
composition_search 0 0, 1
composition_exclude "" ACDEFGHIKLMNPQRSTVWY
regular_expression . regular expression


MS-Pattern
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to mspattern
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
dna_frame_translation 3 6, 3, -3, 1, -1
results_from_file 0 0, 1
results_input_dir "results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern, mshomology, msseq
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
prot_low_mass 1000 integer
prot_high_mass 100000 integer
full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
comment "" text
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_mspattern
text
output_filename "" text
max_reported_hits 50 integer
pre_search_only 0 0,1
regular_expression . regular expression
possible_sequences NULL list of peptides
max_aa_substitutions 0 integer


MS-Homology
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to mshomology
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
dna_frame_translation 3 6, 3, -3, 1, -1
results_from_file 0 0, 1
results_input_dir "results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern, mshomology, msseq
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
prot_low_mass 1000 integer
prot_high_mass 100000 integer
full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
comment "" text
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_mshomology
text
output_filename "" text
max_reported_hits 50 integer
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
fragment_masses_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
min_matches 5 integer
possible_sequences NULL list of peptides
score_matrix 0 integer


MS-Digest
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to msdigest
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
instrument_name "" valid text strings from params/instrument.txt
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
missed_cleavages 1 integer
end_terminus 0 0, 1
stripping_terminus N N, C
start_strip 2 integer
end_strip 4 integer
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_msdigest
text
output_filename "" text
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
mod_AA None Defined Peptide N-terminal Gln to pyroGlu
Oxidation of M
Protein N-terminus Acetylated
User Defined 1
Acrylamide Modified Cys
chem_score 0 0, 1
met_ox_factor 1.0 0.2-5.0
bull_breese 0 0,1
hplc_index 0 0,1
comp_ion None Defined regular expression
comp_mask_type AND AND or OR
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
dna_reading_frame 1 1, 2, 3, 4, 5, 6
open_reading_frame 1 non-zero positive integer
user_protein_sequence "" protein as a text string
report_mult_charge 0 0, 1
hide_html_links 0 0, 1
separate_proteins 0 0, 1
hide_protein_sequence 0 0, 1
access_method Index Number Index Number
Accession Number
entry_data "" See manual
index_num 80707 integer
accession_num L39370 text
min_digest_fragment_mass 500.0 double
max_digest_fragment_mass 4000.0 double
min_digest_fragment_length 5 integer


MS-Bridge
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to msbridge
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
instrument_name "" valid text strings from params/instrument.txt
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
missed_cleavages 1 integer
end_terminus 0 0, 1
stripping_terminus N N, C
start_strip 2 integer
end_strip 4 integer
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_msbridge
text
output_filename "" text
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
parent_mass_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
parent_contaminant_masses NULL list of singly charged masses
mod_AA None Defined Peptide N-terminal Gln to pyroGlu
Oxidation of M
Protein N-terminus Acetylated
User Defined 1
Acrylamide Modified Cys
link_search_type Xlink:Dehydro (C) valid text strings defined in links.txt
max_link_molecules 5 integer
chem_score 0 0, 1
met_ox_factor 1.0 0.2-5.0
data_source Data Paste Area List of Files, Upload Data From File, Data Paste Area
upload_data "" file name
data_format "" M/Z Charge, M/Z Intensity Charge
data NULL See the data_format parameter.
comp_ion None Defined regular expression
comp_mask_type AND AND or OR
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
dna_reading_frame 1 1, 2, 3, 4, 5, 6
open_reading_frame 1 non-zero positive integer
user_protein_sequence "" protein as a text string
separate_proteins 0 0, 1
hide_protein_sequence 0 0, 1
access_method Index Number Index Number
Accession Number
entry_data "" See manual
index_num 80707 integer
accession_num L39370 text
min_digest_fragment_mass 500.0 double
max_digest_fragment_mass 4000.0 double
min_digest_fragment_length 5 integer


MS-NonSpecific
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to msnonspecific
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
instrument_name "" valid text strings from params/instrument.txt
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_msnonspecific
text
output_filename "" text
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
parent_mass_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
parent_contaminant_masses NULL list of singly charged masses
data_source Data Paste Area List of Files, Upload Data From File, Data Paste Area
upload_data "" file name
data_format "" M/Z Charge, M/Z Intensity Charge
data NULL See the data_format parameter.
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
dna_reading_frame 1 1, 2, 3, 4, 5, 6
open_reading_frame 1 non-zero positive integer
user_protein_sequence "" protein as a text string
hide_protein_sequence 0 0, 1
access_method Index Number Index Number
Accession Number
entry_data "" See manual
index_num 80707 integer
accession_num L39370 text


MS-Product
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to msproduct
output_type HTML HTML or XML
script "" text
script_type "" text
instrument_name "" valid text strings from params/instrument.txt
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_msproduct
text
output_filename "" text
nterm "" N-termini defined in params/usermod.txt or ""
cterm "" C-termini defined in params/usermod.txt or ""
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
tolerance_units Da Da, %, ppm, mmu
data_source Data Paste Area List of Files, Upload Data From File, Data Paste Area
upload_data "" file name
data_format "" M/Z Charge, M/Z Intensity Charge
data NULL See the data_format parameter. For MS-Tag the
first line contains the m/z and optionally the charge
of the parent ion.
it None Defined a,a-H2O,a-NH3,a-H3PO4,
b,b-H2O,b-NH3,b+H2O,
b-H3PO4,b-SOCH4,
y,y-H2O,y-NH3,y-H3PO4,y-SOCH4,
MH+,c,B,n,h,P,S,I,N,C
multiple_losses 0 0,1
fragment_masses_tolerance 1.0 double
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
use_instrument_ion_types 0 0,1
max_charge 1 positive integer or No Limit
sequence SAMPLER A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y,m,q,h,s,t,y,u


MS-Comp
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to mscomp
output_type HTML HTML or XML
script "" text
script_type "" text
instrument_name "" valid text strings from params/instrument.txt
results_to_file 0 0, 1
output_dir "results_" + value of
search_name parameter ie:
results_mscomp
text
output_filename "" text
max_reported_hits 50 integer
parent_mass_convert monoisotopic monoisotopic
average
Par(mi)Frag(av)
Par(av)Frag(mi)
parent_mass_tolerance 0.5 double
tolerance_units Da Da, %, ppm, mmu
it None Defined a,a-H2O,a-NH3,a-H3PO4,
b,b-H2O,b-NH3,b+H2O,
b-H3PO4,b-SOCH4,
y,y-H2O,y-NH3,y-H3PO4,y-SOCH4,
MH+,c,B,n,h,P,S,I,N,C
comp_ion None Defined regular expression
parent_mass 1000.0 double
parent_charge 1 integer
composition_search 0 0, 1
aa_exclude "" text string containing the following characters
A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
aa_add "" m,q,h,s,t,y,u
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
combination_type Amino Acid Amino Acid
Peptide Elemental
Elemental


DB-Stat
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to dbstat
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
results_from_file 0 0, 1
results_input_dir results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern, mshomology, msseq
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
prot_low_mass 1000 integer
prot_high_mass 100000 integer
full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
enzyme Trypsin valid text strings from params/enzyme.txt
or params/enzyme_comb.txt
results_to_file 0 0, 1
output_dir "results_" + value of search_name parameter (eg: results_dbstat) text
output_filename "" text
dna_reading_frame 1 1, 2, 3, 4, 5, 6
show_aa_statistics 0 0, 1
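
A DB-Stat command line might be assembled as in the following sketch, which builds the string piece by piece. The executable name dbstat.cgi is an assumption made by analogy with faindex.cgi. The database name is the one used in the FA-Index example earlier in this document. The mass range check is an illustrative safeguard, not something the program is known to require.

```shell
# Hypothetical executable name, assumed by analogy with faindex.cgi.
program="./dbstat.cgi"

prot_low_mass=1000
prot_high_mass=100000

# Refuse to build a command with an empty or inverted mass range.
if [ "$prot_low_mass" -ge "$prot_high_mass" ]; then
    echo "prot_low_mass must be below prot_high_mass" >&2
    exit 1
fi

cmd="$program - report_title=dbstat database=SwissProt.11.02"
cmd="$cmd enzyme=Trypsin"
cmd="$cmd prot_low_mass=$prot_low_mass prot_high_mass=$prot_high_mass"
cmd="$cmd low_pi=3.0 high_pi=10.0 show_aa_statistics=1"

# Print the command for inspection; it is not executed here.
echo "$cmd"
```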


MS-Isotope
name Default Value Valid Values
search_name "" text which can be part of a filename.
report_title "" needs to be set to msisotope
output_type HTML HTML or XML
script "" text
script_type "" text
results_to_file 0 0, 1
output_dir "results_" + value of search_name parameter (eg: results_msisotope) text
output_filename "" text
nterm "" N-termini defined in params/usermod.txt or ""
cterm "" C-termini defined in params/usermod.txt or ""
parent_charge 1 integer
user_aa_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_2_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_3_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
user_aa_4_composition C2 H3 N1 O1 Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
sequence SAMPLER A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y,m,q,h,s,t,y,u
distribution_type Peptide Sequence Peptide Sequence, Elemental Composition
profile_type Peptide Sequence Stick, Gaussian, Lorentzian
resolution 10000.0 double
averagine_mass 1000.0 double
percent_C13 100.0 0.0-100.0
percent_N15 100.0 0.0-100.0
percent_O18 100.0 0.0-100.0
elemental_composition "" Elemental formula of the form Cx Hy Oz etc where
x, y and z are integers. Elements defined in
params/elements.txt can be used.
detailed_report 0 0, 1
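
As an illustration of how the MS-Isotope parameters might be varied from a script, the sketch below prints one command for each listed profile_type value. The executable name msisotope.cgi and the output filenames are assumptions made for the example. The commands are printed, not run.

```shell
# Hypothetical executable name, assumed by analogy with faindex.cgi.
program="./msisotope.cgi"

count=0
for profile in Stick Gaussian Lorentzian; do
    cmd="$program - report_title=msisotope sequence=SAMPLER parent_charge=1"
    cmd="$cmd profile_type=$profile resolution=10000.0"
    # Illustrative output filename, one per profile type.
    cmd="$cmd results_to_file=1 output_filename=isotope_$profile"
    echo "$cmd"
    count=$((count + 1))
done
```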


FA-Index
name Default Value Valid Values
report_title "" text
output_type HTML HTML or XML
script "" text
script_type "" text
database "" valid prefixes: Genpept, gen, SwissProt, swp, Owl, owl,
Ludwignr, NCBInr, nr, dbEST, dbest, pdbEST, pdbest,
DA, DN, PA, PN, pDA, pDN, Pdefault, Ddefault, pDdefault.
results_from_file 0 0, 1
results_input_dir results_" + value of
input_program_name parameter
eg: results_msfit
directory name
input_program_name msfit msfit, mstag, mspattern, mshomology, msseq
input_filename "" text
indicies "" list of database indices (integers)
accession_numbers "" list of database accession numbers
species All valid text strings from params/species.txt or All
names None Defined list of text strings
accession_nums None Defined list of text strings
species_names None Defined list of text strings
species_remove 0 0, 1
add_accession_numbers None Defined list of text strings
prot_low_mass 1000 integer
prot_high_mass 100000 integer
full_mw_range 0 0, 1
low_pi 3.0 double
high_pi 10.0 double
full_pi_range 0 0, 1
dna_reading_frame 1 1, 2, 3, 4, 5, 6
user_protein_sequence "" protein as a text string
hide_protein_sequence 0 0, 1
accession_num L39370 text
create_database_indicies 0 0, 1
create_sub_database 0 0, 1
sub_database_id .sub text
name_field "" text
create_user_database 0 0, 1
start_index_number 0 integer
end_index_number 1 integer
all_indicies 0 0, 1
dna_to_protein 0 0, 1
delete_dna_database 0 0, 1
random_database 0 0, 1
reverse_database 0 0, 1
concat_database 0 0, 1
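
The FA-Index indexing example given earlier in this document could also be expressed with an XML parameter file, as sketched below. The file name params.xml is arbitrary, and the command is printed for inspection, not run.

```shell
# Write the parameters to an XML file in the format accepted by the -f option.
params_file="params.xml"
cat > "$params_file" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<parameters>
<database>SwissProt.11.02</database>
<create_database_indicies>1</create_database_indicies>
</parameters>
EOF

# Print the command for inspection; it is not executed here.
cmd="./faindex.cgi -f $params_file"
echo "$cmd"
```

Any name=value pair appended to this command would override the corresponding entry in the file.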