ProteinProspector Revision History
V 3.4.1 10/2000 (NT only)
- MS-Fit can now accept a list of contaminant masses as a parameter. Data peaks which are
within the tolerance of the contaminant peaks will be deleted from the data set before the
search takes place. All charge states are considered.
- The facility to list search parameters at the end of the report, rather than the
beginning, has been removed.
- Phosphorylation of tyrosine is now considered in MS-Product and MS-Tag.
- Support added for Ludwig non-redundant database.
- In MS-Tag there is now a choice between maximum % unmatched ions, maximum number
of unmatched ions or minimum number of matched ions.
- Fixed the following bugs:
- MS-Comp doesn't work if the full amino acid composition of the ion is specified.
- DB-Stat crashes if the pre-searches select no entries.
- If a selected accession number is not present in the database the program either
prints an incorrect error message or crashes.
- If the instrument specific default ion types are selected in MS-Product or MS-Tag
then the default lists of amino acids which are defined in instrument.txt for MSMS
fragmentation are not initialised properly.
- If the pre-searches select no entries it is still possible for the search programs
to get a hit.
- The name and species pre-searches don't work properly if one of the names/species
contains spaces.
- When a subsequent MS-Fit search is invoked from the MS-Fit results page to
find mixture components an obsolete name is used to describe the data format. This is
not known to cause any problems at present.
- The coverage count in MS-Fit is out by 1 if the last amino acid in the
protein is included in one of the hits.
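
A minimal sketch of the contaminant filtering described for MS-Fit in this release, not the ProteinProspector code itself. It assumes the contaminant masses are entered as singly charged MH+ values, that the tolerance is in Da, and that "all charge states are considered" means each contaminant is also checked at its multiply charged m/z values. The function names and the max_charge parameter are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def contaminant_mz_values(contaminant_mh, max_charge=3):
    """m/z values of a contaminant (given as MH+) for charges 1..max_charge."""
    return [
        (contaminant_mh + (z - 1) * PROTON_MASS) / z
        for z in range(1, max_charge + 1)
    ]


def remove_contaminants(peaks, contaminants, tolerance_da, max_charge=3):
    """Drop every data peak lying within tolerance_da of any contaminant m/z."""
    excluded = [
        mz
        for contaminant in contaminants
        for mz in contaminant_mz_values(contaminant, max_charge)
    ]
    return [
        peak
        for peak in peaks
        if all(abs(peak - mz) > tolerance_da for mz in excluded)
    ]


if __name__ == "__main__":
    # Illustrative numbers only: two contaminant masses and four data peaks.
    data = [842.51, 1045.56, 1106.03, 2211.10]
    print(remove_contaminants(data, [842.5094, 2211.1046], tolerance_da=0.05))
```

In this sketch the peak at 1106.03 is removed because it matches the doubly charged form of the 2211.1046 contaminant, which is the kind of case the charge state handling is meant to catch.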
V 3.4.0 8/2000 (NT only)
- The search programs can now search databases containing the sequence of an entire
genome.
- There is a new maximum hits parameter on MS-Tag which controls when a search is
aborted.
- The DB-Stat program can now do a pre-search based on a list of species names. The
species names used are the ones contained in the database comment lines. These vary from
database to database. The species name list can also be used as a pre-search parameter
in FA-Index. Licensees could add this pre-filter to the other search programs if required.
- The programs can now be operated directly from the command line with the search
parameters specified via command line arguments. HTML style escape characters can be
used to deal with newline characters and spaces, etc.
- MS-Digest now has an open reading frame parameter when used with DNA databases
and only gives results for that reading frame. This allows it to be used with single
entry genome databases.
- In MS-Tag/MS-Seq the parent error is reported in the tolerance units.
- The elements file has been updated. The masses were taken from Audi, G.,
Wapstra, A. H., Nucl. Phys. A, Vol. 595, pp. 409-480 (1995).
- Element definitions have been added for Deuterium, Chlorine, Fluorine,
Sodium, Potassium, Zinc, Selenium, Bromine and Iodine.
- MS-Fit can now be used as a pre-filter for MS-Seq or MS-Tag.
- The MS-Tag matching ion table now has 12 ions per row.
- The size of the memory map used by the programs can now be controlled. This leads
to faster operation in some circumstances.
- MS-Tag now uses percent unmatched ions rather than number of unmatched ions.
- There is an option on the MS-Fit results page to submit the unmatched masses to
another MS-Fit search and thus search for subsequent components.
- There is an option on the MS-Fit results page to submit a single unmatched mass
to a No enzyme MS-Tag search on the hit protein. This will find peptides with non-specific
cleavages.
- Some additional tags in the form of html comments have been added to the MS-Fit results
to facilitate cooperation with automation programs. Also the text version of the MS-Fit
report has been tidied up.
- ICAT-light, ICAT-heavy and IDEnT have been added to the cysteine modification file. However,
the results are not currently filtered to contain only cysteine-containing peptides.
- The species.js file can now be updated automatically from the FA-Index page. This means that
only the msparams/species.txt file needs to be edited when adding species definitions.
- There has been a comprehensive rewrite of the software which extracts species, accession
numbers and names from database comment lines.
- There is now a new ProteinProspector email address.
- The percent TIC, mean error, data tolerance and mean number of missed cleavages are
printed after an MS-Fit hit. If intensities aren't specified then the percent TIC value will be
the same as the percent masses matched. The mean error is useful for diagnosing systematic errors
in the results - indicating a calibration problem. The data tolerance is twice the standard
deviation of the results and is the number that should be used as a tolerance parameter in the
absence of systematic errors. This number is more reliable if there are a reasonable number of
matching peaks (say 10). Also the number is only valid if all the matched peptides are real hits.
- The species, accession number and name searches have been removed from MS-Pattern.
- The molecular weight and pI pre-searches are no longer available for DNA databases.
- Species collections can now contain other species collections (eg MAMMALS could contain
RODENTS).
- Fixed the following bugs:
- The size of the database that can be used is limited to around 2 GBytes.
- The hit information line in MS-Fit containing the percentage of hits,
size, pI and protein name now makes up the first line of the
detailed result table. Netscape and IE display the results fine but when
they are copied from IE into Word the first column of the table is formatted
to the width of the information line. This pushes everything else together
resulting in a messy table that has to be reformatted by hand.
- The column header is MH+ in MS-Digest when the data is multiply charged.
- Every 50th peak in MS-Digest doesn't get reported.
- If there is only a single base on the last line of an entry in a DNA database
then frames 5 and 6 come out the same. This only affects a small proportion of the
entries.
- The protein sequence is erroneously appended to the name for a few NCBI entries
(eg. Accession no:464430).
- MS-Tag crashes if there are more than 400 internal ions in a potential hit. This
only affects very high mass peptides.
- The Genpept accession number link doesn't work.
- The Owl accession number link doesn't work.
- MS-Tag in homology mode sometimes fails to find hits that it should find if the
number (or %) of unmatched ions is too low.
- The lowest mass peptide is not reported in MS-Digest. This isn't normally a problem
as masses below 500 Da aren't displayed by default.
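
The statistics printed after an MS-Fit hit in this release (percent TIC, mean error and data tolerance) can be illustrated with a short sketch. This is not the ProteinProspector code. It assumes the errors are the per-peptide mass errors of the matched peaks in the tolerance units, and it uses the sample standard deviation; the release note does not say whether the sample or population form is used. All function names are hypothetical.

```python
import statistics


def percent_tic(matched_flags, intensities=None):
    """Percentage of the total ion current accounted for by matched peaks.

    If no intensities are supplied every peak is given equal weight, so the
    result equals the percentage of masses matched.
    """
    if intensities is None:
        intensities = [1.0] * len(matched_flags)
    total = sum(intensities)
    matched = sum(i for i, hit in zip(intensities, matched_flags) if hit)
    return 100.0 * matched / total


def mean_error(errors):
    """Mean of the matched-peak errors; a large value suggests miscalibration."""
    return statistics.mean(errors)


def data_tolerance(errors):
    """Twice the standard deviation of the matched-peak errors.

    Assumption: sample standard deviation, which needs at least two errors.
    """
    return 2 * statistics.stdev(errors)


if __name__ == "__main__":
    errors_ppm = [10.0, 12.0, 8.0, 10.0]  # illustrative values only
    print(mean_error(errors_ppm), data_tolerance(errors_ppm))
    print(percent_tic([True, True, False, True]))
```

A mean error well away from zero with a small data tolerance points to a systematic offset, whereas a mean error near zero means the data tolerance can be used directly as the tolerance parameter.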
V 3.3.0 UCSF Internal Version
V 3.2.2 12/1999 (NT only - not released)
- Name of MS-Edman program changed to MS-Pattern.
- Pre-searches based on the name field and a list of accession numbers are now possible in MS-Fit, MS-Tag, MS-Pattern and MS-Seq.
- The name field can be used as an additional parameter in FA-Index and DB-Stat.
- A list of accession numbers can be used as an additional parameter in FA-Index.
- End terminus digesting has been added to MS-Digest.
- MS-Fit can now have zero peaks entered. This allows searches of just pI, MW, etc.
- MS-Fit results can be sorted based on pI or intact protein molecular weight.
- A default database type has been added. This allows Prospector to be used with FASTA format
databases with any comment line format. Prospector assigns the numerical position of the entry in
the database as the accession number, UNREADABLE as the species and the entire comment line as
the name.
- A more meaningful error message is printed if the database is not present on the server.
- The procedures for adding Cysteine modifications have been simplified.
- Web server users can now control the minimum peptide mass and the minimum peptide length in MS-Digest.
- Multipart form data is now supported in addition to GET and POST.
- The absence of an immonium ion can be used to denote absence of an amino acid in MS-Tag.
- Accession numbers are now saved when hits are saved to a file.
- DB-Stat prints out the total number of amino acids in the entries
matching the search conditions.
- The mass and ion type columns have been swapped in the final MS-Product table.
- A column listing the number of missed cleavages has been added to the MS-Digest report.
- More than one of a given amino acid can be specified as a partial composition in MS-Comp.
- In MS-Tag/MS-Seq the fragment error is reported in the tolerance units.
- Fixed the following bugs:
- IUPAC ambiguity codes not supported in DNA to protein conversions.
- mmu tolerance calculations incorrect.
- The programs crash if the msparams/colors.txt file is missing.
- MS-Fit MOWSE scores can be incorrectly ranked if the score overflows
what can be stored in a 4 byte integer number.
- The format of the OWL database has changed. This means that most of the
time the link from accession number in the reports doesn't work.
- After the MS-Pattern list of sequences option has found a hit in a
protein it only starts looking for new hits at the end of the peptide it has
just found whereas it should start looking at the next amino acid.
- Pepsin enzyme rules incorrect.
- Some SwissProt entries don't have accession numbers. This causes problems with the UNIX
versions of Prospector.
- The delta mass column in the MS-Pattern and MS-Tag reports can span 2 lines when viewed
on Internet Explorer.
- Next 20 entries button on the FA-Index Database Summary Report doesn't work
if seqdb directory or msparams directory has been relocated.
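
The default database type introduced in this release can be sketched as a small FASTA reader. This is an illustration under stated assumptions, not the FA-Index code: the function name and the dictionary keys are hypothetical, and numbering the entries from 1 is an assumption.

```python
def parse_default_fasta(text):
    """Parse FASTA text using the default database type rules.

    Each entry gets its position in the file as the accession number,
    UNREADABLE as the species and the whole comment line as the name.
    """
    entries = []
    comment = None
    sequence_parts = []

    def flush():
        # Store the entry collected so far, if there is one.
        if comment is not None:
            entries.append({
                "accession": len(entries) + 1,  # assumption: 1-based position
                "species": "UNREADABLE",
                "name": comment,
                "sequence": "".join(sequence_parts),
            })

    for raw_line in text.splitlines():  # handles both DOS and UNIX line ends
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            comment = line[1:].strip()
            sequence_parts = []
        elif comment is not None:
            sequence_parts.append(line)
    flush()
    return entries


if __name__ == "__main__":
    sample = ">my protein one\nMKT\nAAL\n>second entry\nGGS\n"
    for entry in parse_default_fasta(sample):
        print(entry)
```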
V 3.2.1 3/1999
- The mutation/modification feature in MS-Tag/MS-Fit now has more options. The options can also be programmed
by licensees.
- The form entries for the Max. # of mismatched AA's in MS-Edman, the Min. # matches with NO AA substitutions in MS-Fit
and the Max. # unmatched ions in MS-Tag have been changed to allow any number to be entered.
- Fixed the following bugs:
- There are sometimes problems with inputs of lists of text strings (eg. list of
sequences in MS-Edman) if the cursor does not end up at the start of the line underneath the last text string.
There are problems both when the cursor ends up directly at the end of the last text string (with no carriage
return) and when there are lines after the last text string containing only space or tab characters.
- Setting the ion types in MS-Comp to include anything except MH+, y or b+H2O
and the combination type to elemental caused the peptide mass and mass
error columns in the report to show incorrect results.
- Excluded species entries have to be placed after multiple species entries in the species.txt file for
multiple and excluded species filters to work correctly.
V 3.2.0 2/1999
- MS-Seq introduced. This is our implementation of the Mann and Wilm sequence tag algorithm.
- Database statistics program DB-Stat added.
- MS-Product now works with multiply charged ions.
- MS-Product now displays a list of the theoretical fragment ions in mass/charge order.
- One of the MS-Fit/MS-Digest Considered Modifications is now user programmable. This replaces the
Phosphorylation of S, T and Y option.
- An instrument parameter has been added to some of the programs. This currently defines mass accuracies
for reports, ammonia loss aas, water loss aas, positive charge bearing aas, maximum internal ion masses
and fragmentation ion types.
- Non-JavaScript-aware browsers are no longer supported.
- One release supports both the Internet Explorer 4 and Netscape Navigator 4 browsers.
- Licensees can now control the minimum peptide mass and the minimum peptide length in MS-Digest.
- Licensees can now enter a mixture of monoisotopic and average data into MS-Fit.
- Licensees can now control whether the search parameters in MS-Fit, MS-Edman, MS-Tag and MS-Seq appear at
the top or bottom of the output.
- Licensees can now define the number of hits which cause an MS-Tag or MS-Seq search to abort.
- FA-Index can now deal with the SwissProt database supplied by NCBI.
- MS-Comp now works with a range of fragment ion types.
- There is now only one species alias file which is used for all the databases.
- SwissProt URL changed.
- FA-Index now has an option to convert DNA databases to a protein format. This makes the searches somewhat
faster but doubles the database size.
- Licensees may now change the page colors.
- MS-Tag users can now specify a-H2O, y-H2O and c ions.
- The JavaScript files have been simplified.
- The msparams files now have .txt on the end of the file names.
- The dbEST.spl.txt file has been updated.
- The links to similar programs by other groups have been checked.
- The database summary report on FA-Index can now display protein sequences. The sequences can be pasted
into the MS-Digest user supplied protein sequence box.
- s, t and y can now be used in the MS-Digest User Protein.
- MS-Isotope now works with peptide or elemental formulae containing phosphorus.
- MS-Comp now works if Combination Type is set to Elemental.
- MS-Fit now has a column reporting the m/z submitted if any of the data is multiply charged.
- Fixed bug which caused MS-Tag to sometimes give false positives with the CNBr digest in which the
parent ion didn't match.
V 3.1.1ie 8/1998
- A version of 3.1.1 intended for use with Internet Explorer 4. It only contains JavaScript which is
compatible with Internet Explorer 4.
- The next and previous amino acids are now always displayed on the same line as the peptide sequence.
V 3.1.1njs 8/1998
- A version of 3.1.1 intended for use with older browsers. It contains no JavaScript.
- Fixed bug in MS-Fit which causes first column of the detailed report to be too wide when using the
Internet Explorer browser.
V 3.1.1 7/1998
- In MS-Edman sequence and mass mode the results have to match the sequence when the .* wildcard is used.
- MS-Fit, MS-Tag and MS-Edman search conditions are now at the end of the report. They can be
accessed via a link from the top of the report.
- MS-Digest report gives protein amino acid composition.
- MS-Isotope prints an error message rather than crashing if the monoisotopic peak abundance is too low.
- MS-Isotope limit on the length of the peptide sequence removed.
- MS-Tag reports fragment ion m/z's to two decimal places.
- Fixed bug in the MS-Tag output which caused the DNA and open reading frame results not to line up with the headers.
- Fixed bug which caused an error in the pI calculation under certain conditions.
- Copyright notices updated.
- msparams, seqdb and results directories can now have their locations defined in ucsfhtml/js/info.js.
- In the msparams3.1/*.sp files the species specified in a species collection don't have to be in the
same order as the single species specifications.
V 3.1 4/1998
- MS-Fit has an optional MOWSE style scoring system (Pappin et al, Current Biology, 1993,
Vol 3, No 6, pp. 327-332).
- MS-Fit results contain a link to a coverage map displayed in MS-Digest.
- MS-Fit searches can be constrained by pI range; MS-Edman, MS-Digest, MS-Fit and MS-Tag report
the pI of the intact protein in the search results.
- MS-Isotope introduced, as a tool for calculating and visualizing isotope patterns of
peptides and organic molecules.
- MS-Fit can search the hits saved by MS-Fit or MS-Edman.
- MS-Tag can search the hits saved by MS-Fit, MS-Edman, or MS-Tag.
- MS-Edman can search the hits saved by MS-Fit, MS-Tag, MS-Tag Unknome, or MS-Edman.
- MS-Tag has an option of specifying an average parent mass along with monoisotopic
fragment masses.
- MS-Fit, MS-Tag, and MS-Edman report hits up to the specified maximum reported hits when this
value is exceeded instead of generating an error.
- MS-Digest calculates masses of multiply-charged ions.
- MS-Comp has an elemental composition search option which includes nitrogen rule
and valency checks.
- MS-Edman has options for searching lists of names, species and accession numbers. This option
would typically be used in conjunction with FA-Index to create subset databases.
- Mixtures of 2 or more digests are possible; you can mix N and C terminal digests
and any digest with CNBr.
- MS-Fit and MS-Digest have the option of looking for fragments containing either
acrylamide modified cysteines or another form of cysteine.
- The species filtering mechanism has been expanded to allow searches limited to a
collection of species and searches which exclude certain species.
- PTC added as an N terminal group option.
- Additional options added to the DNA frame translation parameter.
- The digest cleavage rules are defined in a file. Licensees can thus add new digests
or modify existing ones.
- The list of available options for the database, terminal groups, cysteine modification,
digest and species parameters are defined in one place for all the programs, for easier updating by licensees.
- The programs can handle database files with carriage return and line feed at the
end of the line as well as just line feed (ie. both DOS and UNIX ASCII).
- Licensees can now run FA-Index from a browser page. It also has options for creating subset
databases based on species/molecular weight filters and from saved results files. Also new
entries can be added to existing databases and databases can be created from scratch. A
database summary report has also been added.
- A new file (suffix .sl) is created by the FA-Index program when it makes the index files for
a database. The file contains an alphabetic listing of the species aliases which occur more than
ten times in a given database, for easier creation of species collections by licensees.
V 3.0.4 4/1998
- Changed the way in which databases with the same accession number links and/or
species alias files can be grouped.
- Fixed bug for printing error message with faulty species alias lists.
V 3.0.3 4/1998
- Limit removed on the number of allowed user databases.
- Fixed bug whereby alphanumeric accession numbers in DA.. databases didn't work.
- The software can now deal with changes to the Owl comment line introduced after revision 30.
- Minor bug in the MS-Tag abort facility fixed.
- Updated Owl.sp file implementing improved species searching.
V 3.0.2 3/1998
- Cosmetic changes to the MS-Fit output including addition of comment tags.
- Pressing stop on client browser aborts MS-Tag and MS-Edman searches on server.
- Fixed bug in MS-Tag where automated immonium assignment altered multiply charged ion assignments.
- Fixed complement strand DNA translation so frames 4,5,6 are correct.
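
For readers unfamiliar with the frame numbering, the following is a minimal sketch of six-frame translation, not the ProteinProspector code. It assumes the common convention that frames 1 to 3 read the given strand at offsets 0, 1 and 2 and frames 4 to 6 read the reverse complement at the same offsets; the numbering used by the programs may differ. It uses the standard genetic code, reports stop codons as * and any codon containing an unrecognised base as X. The function names are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna, offset):
    """Translate complete codons starting at offset; unknown codons give X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame number (1..6) to the translated protein string."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna, offset)      # frames 1 to 3
        frames[offset + 4] = translate(reverse, offset)  # frames 4 to 6
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```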
V 3.0.1 9/1997
- MS-Fit searches can be terminated by pressing the browser STOP button.
- Fixed cosmetic bug in MS-Fit multiply charged output. Charges were displayed above table in Detailed Results,
instead of as subscripts on input masses.
V 3.0 7/1997
- Extensive overhaul of instructions for most programs, installation and
administration.
- Multiply charged ions for electrospray allowed.
- Dynamic memory allocation eliminates the restriction on maximum number of
masses that can be input to MS-Fit and MS-Tag.
- Intact Protein MW pre-filter can be bypassed in MS-Fit, MS-Tag, and MS-Edman.
- MS-Fit Homology mode implemented by adding mutation matrix routines from MS-Tag.
- MS-Fit results can be saved to a file.
- MS-Tag can search the hits saved from MS-Fit.
- MS-Tag no errors and allow error modes renamed identical and homology modes.
- MS-Tag results split into summary and detail reports.
- MS-Tag output shows mass error on fragment ion masses.
- MS-Tag allows use of monoisotopic parent mass and average fragment masses.
- MS-Tag, MS-Edman, and MS-Comp allow for modified peptide termini to be designated.
- MS-Tag allows for immonium and related ion masses to be entered directly.
- MS-Tag b+H2O ion type modified to allow occurrence when ion contains K.
- Maximum internal ion mass in MS-Tag increased from 500 to 800 Da.
- Improved implementation of de novo sequencing in MS-Tag (searching
Unknome), particularly with display of results in degenerate peptide form.
- MS-Edman modified to accept a list of sequences (expected to come from MS-Tag Unknome mode results).
- MS-Edman modified for better homology searching by allowing mismatched amino acids.
- Results now display the version number of the program in the footer.
- All input variables internally assigned default values so that automated bypassing
of the HTML input form doesn't crash the program if some input variables are not supplied.
- The URLs linked to from the peptide sequence and the MS-Digest index number are now externally modifiable by a server administrator.
- Prospector logo first used.
- DN1, DN2, DA1, DA2, PN1, PN2, PA1, and PA2 prefixes for generic FASTA formatted databases introduced.
- Fixed bug in MS-Tag homology mode so star ions are matched when only
N-terminal ion types OR only C-terminal ion types are designated.
V 2.0 2/1997
- Whole package of programs now called ProteinProspector.
- 6-frame translation added to handle dbEST.
- MS-Fit, MS-Tag output contains html comment tags to facilitate possible
automated parsing of redirected output.
- Input variable names unified across htm files for all programs to
facilitate possible automated input bypassing of HTML forms.
- MS-Comp introduced.
- Preliminary implementation of de novo sequencing in MS-Tag (searching
Unknome).
- MS-Tag allow errors mode control of mutation matrix added.
- MS-Tag allow errors mode searches speed increased.
- Fixed bug in global error handling display.
- MS-Fit, MS-Tag, MS-Edman have variable maximum number of hits displayed.
- MS-Fit calculates % of protein covered by matched peptides.
- MS-Fit, MS-Tag, MS-Edman allow for a sample ID (comment) input field.
- MS-Digest allows user specified protein sequence.
- MS-Digest, MS-Product allow user specified amino acid.
- Fixed bug pertaining to unreadable species in NCBInr database.
- faindex creates database_name.usp file for unreadable species entries.
- MS-Product: added b+H2O; a,b,y-H3PO4; b,y-SOCH4; Y; and N,C-term ladder
ion types.
- MS-Product modified to display both monoisotopic and average parent
masses.
- MS-Product prints fragment masses to 2 decimal places.
- MS-Product fixed so c ions are correctly calculated.
- Changed suffix on forms from html to htm.
- Made source code portable to Windows NT.
- Shifted software development environment to PC.
V 1.1 8/1996
- Added MS-Edman program.
- Added NCBInr database usage.
- Implemented global error handling.
- Fixed bug in MS-Tag associated with incorrect labeling of star
ion types in allow errors mode.
- Fixed bug in MS-Digest retrieval by accession number.
- Added elemental composition and amino acid composition calculation to
MS-Product.
- Cosmetic revisions to output display in MS-Fit and MS-Tag.
- Allowed comments in msparams/database.sp files.
- Separated source code libraries.
V 1.0 6/1996
- Programs available for use only on the world wide web prior to 6/1996 (starting 10/1995),
and not distributed to licensees for use locally. Work on ProteinProspector started in 04/1995.