Abstract -

PRIMe, the Platform for RIKEN Metabolomics, is a website that has been designed and implemented to support research and analysis workflows ranging from metabolome to transcriptome analysis.

The site provides access to a growing collection of standardized measurements of metabolites obtained by using NMR, GC/MS, LC/MS, and CE/MS, and metabolomics tools that support related analyses (SpinAssign for the identification of metabolites by means of NMR and KNApSAcK for searches within metabolite databases).

In addition, the transcriptomics tools provide Correlated Gene Search and Cluster Cutting for the analysis of mRNA expression. Together, these tools and databases can contribute to the analysis of biological events at the level of metabolites and gene expression.

This resource provides access to publicly available metabolite databases for a range of standardized data (i.e., measured under physically and chemically constant conditions) and tools to support transcriptome analysis by allowing researchers to integrate multiple sources of data using a single and efficient interface.

PRIMe Web Applications --

PRIMe Standard Spectrum Search Database -

Chemical complexity, metabolic heterogeneity, dynamic range, and ease of extraction remain the main challenges to developing an effective metabolomics database platform.

PRIMe provides standardized metabolite measurements obtained using multi-dimensional Nuclear Magnetic Resonance (NMR) spectroscopy, Gas Chromatography/Mass Spectrometry (GC/MS), Liquid Chromatography/MS (LC/MS), and Capillary Electrophoresis/MS (CE/MS).

Note: Because the value of this data will grow as the size of the database(s) increases, the manufacturers encourage researchers to contact them for information about how to include their own data in the PRIMe resource(s).

Researchers access the standardized experimental data by searching the database(s) using the Compound Name, the PubChem Compound ID, the KEGG Compound ID, or a Molecular Formula. Thus users can search for compounds using a range of synonyms.
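The kind of lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names, and the function `search_compounds` are hypothetical and do not reflect PRIMe's actual schema or interface; the two records are included purely as sample data.

```python
# Hypothetical in-memory records; field names are illustrative, not PRIMe's schema.
RECORDS = [
    {
        "name": "L-Alanine",
        "synonyms": ["Alanine", "(S)-2-Aminopropanoic acid"],
        "pubchem_cid": "5950",
        "kegg_id": "C00041",
        "formula": "C3H7NO2",
    },
    {
        "name": "D-Glucose",
        "synonyms": ["Glucose", "Dextrose"],
        "pubchem_cid": "5793",
        "kegg_id": "C00031",
        "formula": "C6H12O6",
    },
]


def search_compounds(query, records=RECORDS):
    """Return records whose name, synonym, ID, or formula equals the query.

    Matching is exact but case-insensitive (an assumption made for this sketch).
    """
    q = query.strip().lower()
    hits = []
    for rec in records:
        keys = [rec["name"], rec["pubchem_cid"], rec["kegg_id"], rec["formula"]]
        keys.extend(rec["synonyms"])
        if any(q == key.lower() for key in keys):
            hits.append(rec)
    return hits


# A synonym and a KEGG Compound ID both resolve to a single record.
by_synonym = search_compounds("dextrose")
by_kegg = search_compounds("C00041")
```

Treating every synonym as an equally valid search key is what lets a user find the same record through any of its names or identifiers.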

The MS data has been deposited in MassBank, one of the first public repositories of mass spectra of small chemical compounds for the life sciences, so each PRIMe record is linked to the corresponding MassBank record(s).

Basic information on chemical compounds has also been obtained from PubChem and KEGG.

PRIMe SpinAssign -

NMR-based metabolomics has emerged as an important discipline that offers the advantage of reproducible, non-invasive (i.e., in vitro experiments are not required), atomic-level measurements to support research such as metabolic flux analysis.

Metabolic flux analysis - Metabolic flux analysis is the calculation and analysis of the flux distribution of the entire biochemical reaction network in the system of study.

The manufacturers' strategy aims to annotate as many metabolites as possible from the peaks of a user's 13C-HSQC (Heteronuclear Single Quantum Coherence) spectrum in a batch manner. This process is omics-oriented rather than targeted.

To support this research, the manufacturers developed a 1H and 13C standard chemical shift database of metabolites and SpinAssign, a batch-annotation system that annotates metabolites from a user's two-dimensional (2D) 13C-HSQC query peaks. This database is also implemented in the Standard Spectrum Search database (see above).

SpinAssign provides batch annotation of a large number of metabolites against user NMR peaks, based on the manufacturers' original 1H and 13C chemical-shift database. The chemical shifts in the database were collected under accurately standardized measurement conditions, which makes the database suitable for batch annotation.

SpinAssign can search the PRIMe database and display any matches with user-defined 1H and 13C chemical shifts. An example of matching real 13C-labeled plant extracts with standardized 1H and 13C NMR data is provided on the website page for the SpinAssign system.
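As a rough illustration of matching user-defined 1H and 13C chemical shifts against a reference table, the sketch below accepts a reference entry when both shifts fall within a tolerance window. The reference values, the tolerance defaults, and the function `annotate_peaks` are assumptions made for this example; SpinAssign's actual matching rules and its p-value calculation are not reproduced here.

```python
# Hypothetical reference shifts in ppm: (metabolite, 1H shift, 13C shift).
# These numbers are placeholders for illustration, not values from the PRIMe database.
REFERENCE = [
    ("metabolite_A", 3.78, 53.2),
    ("metabolite_A", 1.48, 18.9),
    ("metabolite_B", 3.24, 72.0),
]


def annotate_peaks(query_peaks, reference=REFERENCE, tol_h=0.03, tol_c=0.5):
    """For each (1H, 13C) query peak, list reference metabolites within both tolerances.

    tol_h and tol_c are assumed tolerance windows in ppm.
    """
    results = []
    for h, c in query_peaks:
        matches = [
            name
            for name, ref_h, ref_c in reference
            if abs(h - ref_h) <= tol_h and abs(c - ref_c) <= tol_c
        ]
        results.append(((h, c), matches))
    return results


# Batch annotation: every query peak is checked against the whole reference table.
annotations = annotate_peaks([(3.77, 53.0), (5.00, 100.0)])
```

A peak with an empty match list simply remains unannotated, which is the expected outcome for many peaks in an untargeted, omics-oriented run.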

Statistical indices (p-values) have also been implemented to assess SpinAssign results.

PRIMe Correlated Gene Search -

Correlated Gene Search is a Web-based application that searches a large body of gene expression data for correlated Arabidopsis genes and permits easy visualization of the results.

The expression data has been produced using Affymetrix GeneChips, and the correlation data has been released by the Arabidopsis thaliana Trans-factor and cis-Element prediction Database (ATTED-II).

ATTED-II is a database of gene co-expression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification and studies of regulatory relationships.

Visualization of co-expressed genes is useful for understanding gene-to-gene relationships. This tool provides search output in formats that can be applied to Pajek and BioLayout visualization software.
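To make the idea concrete, the sketch below computes Pearson correlations between expression profiles and writes the gene pairs above a threshold as a Pajek network (a `*Vertices` section followed by an `*Edges` section). It is a minimal illustration, not the tool's implementation: the expression values, the threshold, and the function names are assumptions, and the AGI codes serve only as sample labels.

```python
import math

# Hypothetical expression profiles (one value per experiment) keyed by AGI code.
EXPRESSION = {
    "AT1G01010": [1.0, 2.0, 3.0, 4.0],
    "AT1G01020": [2.0, 4.0, 6.0, 8.5],
    "AT1G01030": [4.0, 3.0, 2.0, 1.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def to_pajek(expression, threshold=0.9):
    """Return Pajek .net text with one edge per gene pair whose r >= threshold."""
    genes = sorted(expression)
    lines = [f"*Vertices {len(genes)}"]
    lines += [f'{i} "{gene}"' for i, gene in enumerate(genes, start=1)]
    lines.append("*Edges")
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            r = pearson(expression[genes[i]], expression[genes[j]])
            if r >= threshold:
                # Pajek vertices are numbered from 1; the third column is the edge weight.
                lines.append(f"{i + 1} {j + 1} {r:.3f}")
    return "\n".join(lines)


network = to_pajek(EXPRESSION)
```

Only positively correlated pairs are kept in this sketch; a real analysis might also retain strong negative correlations.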

PRIMe Cluster Cutting -

Cluster Cutting is a Web-based tool designed to cut out part of a hierarchical clustering tree at a specified gene locus (the Arabidopsis Genome Initiative [AGI] code).

Because it is difficult to process and display the large amounts of data contained in large cluster trees on a personal computer, the manufacturers have provided this tool to narrow down the results.

The Cluster Cutting tool uses the Java TreeView software to show a part of a cluster tree specified by the locus ID, number of genes, and depth of tree for a prepared dataset.
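The cutting step can be pictured as follows: locate the leaf for the requested locus, then climb toward the root by at most a given number of levels while the subtree stays within a gene limit. The sketch below assumes a tree stored as nested tuples with AGI codes as leaves; this representation and the function `cut_cluster` are hypothetical and are not taken from the tool itself.

```python
# Hypothetical cluster tree: inner nodes are tuples, leaves are AGI codes.
TREE = (
    (("AT1G01010", "AT1G01020"), "AT1G01030"),
    ("AT1G01040", "AT1G01050"),
)


def leaves(node):
    """Return all leaf labels under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def path_to(node, locus):
    """Return the nodes from the root down to the leaf `locus`, or None if absent."""
    if isinstance(node, str):
        return [node] if node == locus else None
    for child in node:
        sub = path_to(child, locus)
        if sub is not None:
            return [node] + sub
    return None


def cut_cluster(tree, locus, depth, max_genes):
    """Return the largest subtree around `locus` within `depth` levels and `max_genes` leaves."""
    path = path_to(tree, locus)
    if path is None:
        raise KeyError(locus)
    best = path[-1]  # start from the leaf itself
    # Ancestors ordered from nearest to farthest, limited to `depth` levels.
    ancestors = list(reversed(path[:-1]))[:depth]
    for ancestor in ancestors:
        if len(leaves(ancestor)) > max_genes:
            break
        best = ancestor
    return best


subtree = cut_cluster(TREE, "AT1G01010", depth=2, max_genes=3)
```

Because only the returned subtree needs to be rendered, the viewer never has to load the full cluster tree.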

The dataset comes from AtGenExpress, a gene expression analysis project based on the Affymetrix GeneChip Arabidopsis ATH1 and run by RIKEN and various other groups.

PRIMe Additional Tools --

DROP Met: Data Resources Of Plant Metabolomics -

DROP Met, Data Resources Of Plant Metabolomics, is a repository and distribution site for the datasets obtained from the manufacturers' multiple MS-based metabolome analyses.

Various kinds of datasets, ranging from raw fundamental data on analytical conditions to metabolic profiles of plant samples, are available to the public.

Note: The manufacturers release some datasets in raw form, hoping that users in the bioinformatics and metabolomics fields will use them to develop novel algorithms and methodologies for metabolomics.

KNApSAcK - A comprehensive species-metabolite relationships database -

KNApSAcK is a tool for the analysis of metabolites. Information on natural products has been amassed, with special emphasis on their biological origins.

You can retrieve information on metabolites by entering the organism name, the name of a metabolite, molecular weight, or molecular formula.

KNApSAcK also provides a tool for analyzing mass spectrum data.

Simple BL-SOM, an integrated analytical tool for a range of “omics” data -

In order to understand the whole mechanism of a living organism by linking a range of information from its genome, transcriptome, proteome, and metabolome, it is essential to have a platform capable of integrating a range of “omics” data so that they can be analyzed together.

Batch-learning self-organizing map (BL-SOM) is one tool that can support such integrated analyses.

The transcriptome and metabolome dataset of this tool consists of a matrix in which signal intensities are arranged in multiple columns (experimental series) and multiple rows (gene and metabolite IDs).

BL-SOM can analyze an integrated matrix of both transcriptome and metabolome data after appropriate normalization of the data and preliminary calculations, allowing researchers to visualize the correlations between elements.

Genes and metabolites are classified into clusters in a two-dimensional “feature map” based on their expression and accumulation patterns.
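The sketch below shows the general shape of such an analysis on a toy integrated matrix: each row is scaled to zero mean and unit variance, and a small self-organizing map is then trained with a batch update so that rows with similar patterns land in the same or neighboring cells. It is a simplified illustration under stated assumptions and does not reproduce the initialization or parameter choices of the BL-SOM software; the matrix values, the map size, the neighborhood schedule, and all function names are hypothetical.

```python
import math

# Hypothetical integrated matrix: rows are gene/metabolite IDs, columns are experiments.
ROW_IDS = ["gene_1", "metabolite_1", "gene_2", "metabolite_2"]
MATRIX = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 41.0],
    [4.0, 3.0, 2.0, 1.0],
    [40.0, 31.0, 20.0, 10.0],
]


def zscore_rows(matrix):
    """Scale each row to zero mean and unit variance so patterns are comparable."""
    scaled = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((v - mean) ** 2 for v in row) / len(row)) or 1.0
        scaled.append([(v - mean) / sd for v in row])
    return scaled


def batch_som(data, rows=2, cols=2, epochs=10, radius=1.0):
    """Simplified batch SOM; returns the (row, col) map cell assigned to each input."""
    dim = len(data[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # Deterministic start (an assumption): seed cells with evenly spaced input rows.
    weights = [list(data[(i * len(data)) // len(cells)]) for i in range(len(cells))]

    def best_cell(vec):
        dists = [sum((a - b) ** 2 for a, b in zip(vec, w)) for w in weights]
        return dists.index(min(dists))

    for epoch in range(epochs):
        # Neighborhood width shrinks over time but never below a small floor.
        sigma = max(radius * (1 - epoch / epochs), 0.3)
        winners = [best_cell(vec) for vec in data]
        new_weights = []
        for cell in cells:
            num, den = [0.0] * dim, 0.0
            for vec, win in zip(data, winners):
                d2 = (cell[0] - cells[win][0]) ** 2 + (cell[1] - cells[win][1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                den += h
                num = [n + h * v for n, v in zip(num, vec)]
            # Batch update: each weight is a neighborhood-weighted mean of all inputs.
            new_weights.append([n / den for n in num])
        weights[:] = new_weights
    return [cells[best_cell(vec)] for vec in data]


assignment = dict(zip(ROW_IDS, batch_som(zscore_rows(MATRIX))))
```

In a batch update every input contributes to every weight once per epoch, so the result does not depend on the order in which rows are presented. In this toy example the two rising profiles and the two falling profiles end up in different regions of the map.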

Owing to the high reproducibility and resolution provided by BL-SOM, gene-to-metabolite, gene-to-gene, and metabolite-to-metabolite correlations can be elucidated, facilitating the identification of a gene’s function.

