Abstract

MAss SPECTRometry Analysis System (MASPECTRAS) is a web-based platform for the management and analysis of proteomic liquid chromatography tandem mass spectrometry (LC-MS/MS) data. It supports the Minimum Information About a Proteomics Experiment (MIAPE) guidelines.

MASPECTRAS was developed using state-of-the-art software technology and enables data to be imported from five (5) common search engines.

Analytical modules are provided along with visualization tools and PRoteomics IDEntifications (PRIDE) export as well as a module for distributing intensive calculations to a computing cluster.

MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database scheme and follows the guidelines of the Proteomic Standards Initiative (PSI).

Analytical modules include:

1) Import and parsing of the results from the 'search engines' SEQUEST, Mascot (see G6G Abstract Number 20087), Spectrum Mill, X! Tandem, and OMSSA.

The manufacturer developed parsers for the five (5) widely used 'search engines' listed above. MASPECTRAS internally manages the sequence databases used for searching with the different modules.

When results of a search engine are imported into MASPECTRAS, the system first tries to determine whether the same accession string for the same database version exists.

If that is not the case, the original sequence information is retrieved from the corresponding sequence database.

Subsequently the system tries to match the sequence against the sequences already stored in the database.

If an entry with the same sequence information but a different accession string is found, the new accession string is associated with the unique identifier of the already stored sequence.

Otherwise a new unique identifier is created and the sequence is stored with the appropriate accession strings.
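The import steps above can be sketched as a small lookup routine. This is only an illustration under stated assumptions: the class `SequenceStore`, its dictionaries, and the `fetch_sequence` callback are hypothetical names standing in for the relational tables and database access that MASPECTRAS actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceStore:
    """Toy in-memory stand-in for the sequence tables (hypothetical)."""
    # (database version, accession string) -> internal sequence id
    by_accession: dict = field(default_factory=dict)
    # sequence string -> internal sequence id
    by_sequence: dict = field(default_factory=dict)
    next_id: int = 1

    def register(self, db_version, accession, fetch_sequence):
        """Return the internal id for an accession, storing it if needed.

        fetch_sequence is a callable that retrieves the sequence from the
        source database; it is only called when the accession is unknown.
        """
        key = (db_version, accession)
        # Step 1: is this accession already known for this database version?
        if key in self.by_accession:
            return self.by_accession[key]
        # Step 2: retrieve the original sequence from the sequence database.
        sequence = fetch_sequence(accession)
        # Step 3: is the same sequence stored under another accession?
        seq_id = self.by_sequence.get(sequence)
        if seq_id is None:
            # Step 4: otherwise create a new unique identifier.
            seq_id = self.next_id
            self.next_id += 1
            self.by_sequence[sequence] = seq_id
        # Associate the accession string with the (old or new) identifier.
        self.by_accession[key] = seq_id
        return seq_id


# Made-up sequences: P1 and Q9 are two accessions for the same protein.
fasta = {"P1": "MKTAYIAK", "Q9": "MKTAYIAK", "P2": "GSHMLEDP"}
store = SequenceStore()
id_a = store.register("v1", "P1", fasta.__getitem__)
id_b = store.register("v1", "Q9", fasta.__getitem__)  # same sequence as P1
id_c = store.register("v1", "P2", fasta.__getitem__)  # new sequence
```

In this sketch the two accessions with identical sequences end up sharing one identifier, while the third sequence receives a new one.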

2) Peptide validation -- MASPECTRAS calculates a probability score for SEQUEST and Mascot results, based on the PeptideProphet algorithm.

Data re-scoring adds a further layer, which improves the specificity of the highly sensitive SEQUEST and Mascot database searches.

The statistical model incorporates a linear discriminant score based on the database search scores (for SEQUEST: XCorr, dCn, Sp rank, and mass difference) as well as the tryptic termini and missed cleavages.

After scoring, the data must pass a user-definable filter, which uses the search program's specific score to discard the most unlikely data.
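A user-definable score filter of this kind might look like the following sketch. The field names (`engine`, `xcorr`, `ion_score`) and the cut-off values are hypothetical placeholders chosen for illustration, not the defaults or the data model of MASPECTRAS.

```python
def passes_filter(hit, thresholds):
    """Keep a hit only if its engine-specific score reaches the cut-off."""
    score_name, minimum = thresholds[hit["engine"]]
    return hit[score_name] >= minimum


# Hypothetical, user-supplied cut-offs: one score name and minimum per engine.
thresholds = {
    "SEQUEST": ("xcorr", 1.5),
    "Mascot": ("ion_score", 20.0),
}

# Made-up peptide hits for demonstration only.
hits = [
    {"engine": "SEQUEST", "peptide": "LVNELTEFAK", "xcorr": 3.2},
    {"engine": "SEQUEST", "peptide": "AEFVEVTK", "xcorr": 0.9},
    {"engine": "Mascot", "peptide": "YLYEIAR", "ion_score": 41.0},
    {"engine": "Mascot", "peptide": "QTALVELLK", "ion_score": 12.5},
]

kept = [hit for hit in hits if passes_filter(hit, thresholds)]
kept_peptides = [hit["peptide"] for hit in kept]
```

Keeping the cut-offs in a plain mapping keyed by search engine lets each engine use its own score without changing the filtering code.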

3) Clustering of proteins based on Markov Clustering and multiple alignments -- In peptide fragmentation fingerprinting (PFF), search engines identify peptides, which then have to be mapped to proteins.

A single peptide often corresponds to a group of proteins.

Therefore, PFF identifies protein groups, each protein sharing similar peptides.

A grouped protein view represents the result more concisely, and proteins with a small number of identified peptides can be recognized more easily in complex samples.

The protein grouping implemented in MASPECTRAS is based on Markov clustering using Basic Local Alignment Search Tool (BLAST) and multiple alignments.
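The general idea of Markov clustering can be sketched in a few lines: a column-normalized similarity matrix is repeatedly expanded (squared) and inflated (raised element-wise to a power and re-normalized) until clusters emerge. The sketch below is a generic, pure-Python illustration and not the MASPECTRAS implementation. The hand-made similarity matrix stands in for BLAST-derived similarities, and the inflation value and iteration count are assumptions.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: matrix squaring (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: element-wise power, then column re-normalization."""
    return normalize_columns([[v ** power for v in row] for row in m])


def markov_cluster(similarity, inflation=2.0, iterations=20):
    """Return clusters (sorted lists of indices) for a similarity matrix."""
    n = len(similarity)
    # Add self-loops, as is customary for Markov clustering.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Rows that keep non-zero entries are attractors; the columns with
    # non-zero entries in such a row form one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical pairwise similarities for five proteins: proteins 0-2 are
# similar to one another, as are proteins 3-4, with no similarity between
# the two groups.
similarity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
clusters = markov_cluster(similarity)
```

Higher inflation values generally yield more, smaller clusters, so that parameter controls the granularity of the grouping.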

4) Quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio) -- For quantification of peptides the ASAPRatio algorithm has been integrated and applied.

To determine the peak area, a single ion chromatogram is reconstructed for a given m/z range by summation of ion intensities.

This chromatogram is then smoothed ten-fold by repeated application of the Savitzky-Golay smoothing filter.
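The two steps above, chromatogram reconstruction and repeated smoothing, can be sketched as follows. The source does not state the filter window or polynomial order, so the 5-point quadratic Savitzky-Golay kernel used here is an assumption, as are the function names and the edge handling (clamping to the first and last point).

```python
def ion_chromatogram(scans, mz_low, mz_high):
    """Sum the intensities of all peaks within [mz_low, mz_high] per scan.

    Each scan is a list of (m/z, intensity) pairs.
    """
    return [sum(intensity for mz, intensity in peaks
                if mz_low <= mz <= mz_high)
            for peaks in scans]


# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
# The window size is an assumption; the source does not specify it.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savitzky_golay(signal):
    """Apply one pass of the 5-point filter, clamping indices at the edges."""
    half = len(SG5) // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, coeff in enumerate(SG5):
            j = min(max(i + k - half, 0), n - 1)
            acc += coeff * signal[j]
        smoothed.append(acc)
    return smoothed


def smooth_repeatedly(signal, passes=10):
    """Smooth the chromatogram by repeated application of the filter."""
    for _ in range(passes):
        signal = savitzky_golay(signal)
    return signal


# Made-up scans for demonstration only.
scans = [
    [(500.2, 10.0), (500.3, 5.0), (600.0, 99.0)],
    [(500.25, 20.0)],
    [(700.0, 1.0)],
]
chromatogram = ion_chromatogram(scans, 500.0, 500.5)
flat = smooth_repeatedly([5.0] * 8)
```

Because the filter coefficients sum to one, a constant signal passes through unchanged, which is a quick sanity check for any smoothing kernel.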

Visualization Tools -- MASPECTRAS allows the storage and comparison of search results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA matched to different 'sequence databases' merged in a single user-definable view.

MASPECTRAS provides customizable (clustered) protein, peptide, spectrum, and chromatogram views, as well as a view for quantitative comparison.

The clustered protein view displays one representative for each protein cluster.

In the peptide-centric view, peptides with the same modifications are combined and only the representative with the highest score is displayed.
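The grouping behind such a peptide-centric view can be sketched as below. It assumes that hits are combined when they share both the peptide sequence and the same set of modifications; the dictionary keys (`sequence`, `modifications`, `score`) are hypothetical and do not reflect the MASPECTRAS data model.

```python
def peptide_centric_view(hits):
    """Return one representative hit (highest score) per peptide form."""
    best = {}
    for hit in hits:
        # Hits with identical sequence and modification set are combined.
        key = (hit["sequence"], frozenset(hit["modifications"]))
        if key not in best or hit["score"] > best[key]["score"]:
            best[key] = hit
    return list(best.values())


# Made-up hits; modifications are (position, name) pairs.
hits = [
    {"sequence": "AEFVEVTK", "modifications": [], "score": 0.81},
    {"sequence": "AEFVEVTK", "modifications": [], "score": 0.97},
    {"sequence": "AEFVEVTK", "modifications": [(3, "Phospho")], "score": 0.66},
    {"sequence": "YLYEIAR", "modifications": [], "score": 0.90},
]
view = peptide_centric_view(hits)
view_scores = sorted(hit["score"] for hit in view)
```

Here the two unmodified AEFVEVTK hits collapse into the one with the higher score, while the phosphorylated form stays as a separate entry.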

The spectrum viewer of MASPECTRAS enables manual inspection of the data by providing customizable zooming and printing features.

The chromatogram viewer allows manual definition of the peak areas. The chromatograms of all charge states of the identified peptide are displayed.

The quantitative comparison view makes it possible to compare peptides with two (2) different post-translational modifications (PTMs), or a peptide with one PTM against its unmodified version.

The calculated peaks are displayed graphically together with a regression line.

PRIDE export -- MASPECTRAS has been designed to comply with the MIAPE requirements and provides researchers with export options to other file formats (Excel, Word, and plain text).

Additionally, the export to the PRIDE XML format is possible directly from the protein and peptide views and the resulting file can be submitted to the PRIDE repository.

System Requirements

Operating system: Solaris, Linux, Windows; the JClusterService requires a UNIX/Linux system.

Other requirements: Java JDK 1.5.x, Oracle™ 9i or PostgreSQL™ 8.0.x or MySQL™ 4.1.xx/5.0.xx, server with at least 1 GB of main memory (2 GB are recommended).


