MZmine 2

Abstract -- MZmine 2 is a modular framework for processing, visualizing, and analyzing mass spectrometry (MS)-based molecular profile data. The current version of MZmine 2 is suitable for processing large batches of data and has been applied to both targeted and non-targeted metabolomic analyses.

A key concept of the MZmine 2 software design is the strict separation of core functionality from data processing modules, with an emphasis on ease of use and support for processing high-resolution spectra. Data processing modules take advantage of embedded visualization tools, which allow the effect of parameter settings to be previewed immediately.

MZmine 2 Data processing workflow -- A typical workflow for processing mass spectrometry data using MZmine 2 consists of the following steps:

(Note that some of these steps are optional and may be skipped.)

1) Raw data import;

2) Detection of chromatograms using the Chromatogram builder;

3) Deconvolution of chromatograms into individual peaks;

4) Deisotoping;

5) Identification of fragments, adducts, and peak complexes;

6) Normalization of retention time using the Retention time normalizer;

7) Alignment using the Join aligner or RANSAC aligner;

8) Gap filling using the Peak finder or Same range gap filler;

9) Normalization using the Linear normalizer or Standard compound normalizer;

10) Identification, using a custom database or online databases; and

11) Data analysis, export, and visualization.

MZmine 2 Supported file formats -- MZmine 2 can read and process both unit mass resolution and exact mass resolution (e.g. FTMS) data in both continuous and centroided modes, including fragmentation scans. Supported data formats are:

1) mzML (version 1.0 and 1.1).

2) mzXML (versions 2.0, 2.1 and 3.0) - mzXML provides a standard container for MS and MS/MS proteomics data and underpins the proteomics pipelines of the group that developed it.

3) mzData (versions 1.04 and 1.05) - mzData is a data format for capturing peak list information. Its aim is to unite the large number of existing formats (PKL, DTA, MGF, etc.) into one.

4) NetCDF - NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

5) Thermo RAW (only on Windows with Thermo Xcalibur installed).

MZmine 2 Dataset filters -- This module comprises various filters that can be applied to the entire raw data file.

Scan filters - This module comprises various filters that can be applied to the raw data scan by scan.

Peak detection -- Chromatogram builder - The Chromatogram builder is the main module for peak detection in MZmine 2. Its purpose is to create a list of unique masses which form continuous chromatograms in raw data. Later, these chromatograms may be deconvoluted into individual peaks by the Deconvolution module.

The Chromatogram builder module works in three steps: Mass detection, Filtering, and Chromatogram construction.

1) Mass detection - In the Mass detection step, individual ions are detected in each mass spectrum.

2) Filtering - Filtering is an optional operation that may reduce the number of false chromatograms by removing m/z signals that can be recognized as noise. Filtering is performed on the detected m/z peak lists; the raw data is not considered in this step.

3) Chromatogram construction - When mass detection and filtering are finished, the m/z data points from each scan must be connected to form chromatograms.
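
The chromatogram construction step can be illustrated with a minimal sketch. It is not MZmine 2's implementation: the function name, the `mz_tolerance` and `min_scans` parameters, and the greedy "most intense peak first" matching rule are all assumptions chosen for illustration. The input is assumed to be the per-scan m/z peak lists left after mass detection and filtering.

```python
from dataclasses import dataclass, field


@dataclass
class Chromatogram:
    """A run of m/z data points found in consecutive scans (hypothetical type)."""
    points: list = field(default_factory=list)  # (scan_index, mz, intensity)

    @property
    def mean_mz(self):
        return sum(p[1] for p in self.points) / len(self.points)


def build_chromatograms(scans, mz_tolerance=0.01, min_scans=3):
    """Connect detected m/z peaks scan by scan (illustrative sketch only).

    scans: list of per-scan peak lists, each a list of (mz, intensity) tuples.
    Returns the chromatograms that span at least min_scans consecutive scans.
    """
    open_chroms, finished = [], []
    for scan_index, peaks in enumerate(scans):
        extended = set()  # ids of chromatograms that received a point in this scan
        # Assumption: the most intense peaks claim a chromatogram first.
        for mz, intensity in sorted(peaks, key=lambda p: -p[1]):
            candidates = [
                c for c in open_chroms
                if id(c) not in extended and abs(c.mean_mz - mz) <= mz_tolerance
            ]
            if candidates:
                best = min(candidates, key=lambda c: abs(c.mean_mz - mz))
            else:
                best = Chromatogram()
                open_chroms.append(best)
            best.points.append((scan_index, mz, intensity))
            extended.add(id(best))
        # A chromatogram that was not continued in this scan is closed.
        still_open = []
        for c in open_chroms:
            (still_open if id(c) in extended else finished).append(c)
        open_chroms = still_open
    finished.extend(open_chroms)
    return [c for c in finished if len(c.points) >= min_scans]
```

In this sketch a chromatogram ends as soon as one scan lacks a matching m/z value, so a signal seen in fewer than `min_scans` consecutive scans is discarded as noise.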

Chromatogram deconvolution -- Following the detection of chromatograms by the Chromatogram builder, chromatograms have to be deconvoluted into individual peaks. The Deconvolution module provides several algorithms for this purpose.
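
As one illustration of what deconvolution means here, the sketch below splits a single chromatogram wherever its intensity trace passes through a local minimum. This is a deliberately simple stand-in and is not claimed to match any of the algorithms the Deconvolution module provides; the function name and the `min_height` parameter are hypothetical.

```python
def split_at_local_minima(intensities, min_height=0.0):
    """Split one chromatogram into peaks at local intensity minima (sketch).

    intensities: intensity values of the chromatogram, in scan order.
    Returns a list of (start, end) index pairs, both inclusive. Neighbouring
    peaks share the valley point that separates them.
    """
    n = len(intensities)
    if n == 0:
        return []
    # Peak boundaries: the first point, every local minimum, the last point.
    boundaries = [0]
    for i in range(1, n - 1):
        if intensities[i] < intensities[i - 1] and intensities[i] <= intensities[i + 1]:
            boundaries.append(i)
    boundaries.append(n - 1)

    peaks = []
    for start, end in zip(boundaries, boundaries[1:]):
        # Keep only segments whose apex reaches the minimum height.
        if end > start and max(intensities[start:end + 1]) >= min_height:
            peaks.append((start, end))
    return peaks
```

On noisy data this rule would split at every small dip, so a practical method would smooth the trace or require a minimum valley depth first.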

Isotopic peaks grouper -- This module attempts to find those peaks in a peak list that form an isotope pattern. When an isotope pattern is found, the information about the charge and isotope ratios is saved, and additional isotopic peaks are removed from the peak list. Only the highest isotope is kept.

Note that deisotoping is performed after the Chromatogram builder and Deconvolution steps. Therefore, MZmine does not search for isotopic peaks in individual scans, but instead tries to identify those peak list entries that together form an isotope pattern.
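
The grouping idea can be sketched as follows. This is an assumption-laden illustration rather than MZmine 2's code: peak list entries are modelled as plain dictionaries, isotopes are assumed to be spaced by the 13C/12C mass difference (about 1.00335) divided by the charge, and "highest" is read as "most intense". The tolerances and the `max_charge` parameter are hypothetical.

```python
C13_SPACING = 1.00335  # approximate 13C - 12C mass difference


def group_isotopes(peaks, mz_tol=0.005, rt_tol=0.1, max_charge=2):
    """Collapse isotope patterns in a peak list (illustrative sketch).

    peaks: list of dicts with 'mz', 'rt' and 'height' keys.
    Returns one entry per pattern: a copy of the most intense member, with
    'charge' (0 if no pattern was found) and 'isotope_heights' added.
    """
    ordered = sorted(peaks, key=lambda p: p["mz"])
    used = set()
    result = []
    for i, first in enumerate(ordered):
        if i in used:
            continue
        best_pattern, best_charge = [i], 0
        for charge in range(1, max_charge + 1):
            pattern = [i]
            expected = first["mz"] + C13_SPACING / charge
            for j in range(i + 1, len(ordered)):
                cand = ordered[j]
                # Isotopes of one compound must co-elute.
                if j in used or abs(cand["rt"] - first["rt"]) > rt_tol:
                    continue
                if abs(cand["mz"] - expected) <= mz_tol:
                    pattern.append(j)
                    expected += C13_SPACING / charge
            if len(pattern) > len(best_pattern):
                best_pattern, best_charge = pattern, charge
        used.update(best_pattern)
        members = [ordered[k] for k in best_pattern]
        representative = dict(max(members, key=lambda p: p["height"]))
        representative["charge"] = best_charge
        representative["isotope_heights"] = [p["height"] for p in members]
        result.append(representative)
    return result
```

Because the input is a peak list rather than individual scans, the retention time check is what ties the candidate isotopes to the same eluting compound.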

Identification of fragments, adducts, and peak complexes --

Normalization of retention time using the Retention time normalizer -- The Retention time normalizer attempts to reduce the deviation of retention times between peak lists by searching for peaks common to these peak lists and using them as normalization standards.
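
A minimal sketch of this idea is shown below. It assumes that a "common peak" is one with exactly one m/z and retention time match in every peak list, that each standard is moved to its average retention time, and that other peaks are corrected by linear interpolation between neighbouring standards. These rules, the function names, and the tolerances are illustrative assumptions, not a description of the module's actual algorithm.

```python
def find_standards(peak_lists, mz_tol=0.01, rt_tol=0.5):
    """Return, for each peak common to all lists, its RT in every list."""
    standards = []
    for mz, rt in peak_lists[0]:
        matched = [rt]
        for other in peak_lists[1:]:
            hits = [r for m, r in other
                    if abs(m - mz) <= mz_tol and abs(r - rt) <= rt_tol]
            if len(hits) != 1:  # require an unambiguous match
                break
            matched.append(hits[0])
        else:
            standards.append(matched)
    return standards


def _interpolate(rt, pairs):
    """Map rt using sorted (observed_rt, target_rt) pairs of the standards."""
    if not pairs:
        return rt
    if rt <= pairs[0][0]:
        return rt + (pairs[0][1] - pairs[0][0])   # constant shift before first
    if rt >= pairs[-1][0]:
        return rt + (pairs[-1][1] - pairs[-1][0])  # constant shift after last
    for (a0, t0), (a1, t1) in zip(pairs, pairs[1:]):
        if a0 <= rt <= a1:
            if a1 == a0:
                return t0
            return t0 + (rt - a0) * (t1 - t0) / (a1 - a0)
    return rt


def normalize_rt(peak_lists, mz_tol=0.01, rt_tol=0.5):
    """Normalize retention times across peak lists of (mz, rt) tuples."""
    standards = find_standards(peak_lists, mz_tol, rt_tol)
    targets = [sum(s) / len(s) for s in standards]  # average RT per standard
    normalized = []
    for k, peaks in enumerate(peak_lists):
        pairs = sorted(zip((s[k] for s in standards), targets))
        normalized.append([(mz, _interpolate(rt, pairs)) for mz, rt in peaks])
    return normalized
```

With two peak lists that share standards at m/z 100 and 200, for example, both standards end up at the same retention time in each list, and a peak eluting between them is shifted proportionally.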

Alignment using the Join aligner or RANSAC aligner --

Gap filling using the Peak finder or Same range gap filler --

Normalization using the Linear normalizer or Standard compound normalizer --

Identification using a custom database or online databases --

MZmine 2 Supported databases --

PubChem - The PubChem database contains millions of chemical compound structures.

KEGG - The KEGG database contains metabolites and other biomolecules present in natural metabolic pathways.

HMDB - The Human Metabolome Database (HMDB) contains over 7,000 known metabolites found in the human body.

METLIN - The METLIN database contains over 20,000 metabolites.

MZmine 2 Data analysis, export, and visualization --

Data analysis -

Peak list export and import -

Visualization -

