Babelomics

Category Genomics>Gene Expression Analysis/Profiling/Tools

Abstract Babelomics is a suite of interconnected tools oriented to the functional annotation of genome-scale experiments. One of the fields for which Babelomics is best suited is the analysis of microarray gene expression experiments.

However, Babelomics is not restricted to this type of data and has been designed to facilitate the interpretation of large-scale experiments in general.

Functional enrichment Tools in Babelomics --

FatiGO (fast assignment and transference of information) - FatiGO takes two lists of genes and converts them into two lists of Gene Ontology (GO) terms using the corresponding gene-GO association table.

Then, a Fisher's exact test for 2×2 contingency tables is used to check for the significant over-representation of GO terms in one of the sets with respect to the other one.
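The two steps above (mapping gene lists to terms, then testing a 2×2 table) can be illustrated with a minimal sketch. It is not FatiGO's own code: the function names, the gene identifiers, the label 'TERM_A' and the toy association table are all hypothetical, and a two-sided test is assumed.

```python
from math import comb

def term_table(term, list1, list2, annotations):
    """Build the 2x2 table for one term: rows are the two gene lists,
    columns are 'annotated with the term' and 'not annotated'."""
    a = sum(term in annotations.get(g, set()) for g in list1)
    c = sum(term in annotations.get(g, set()) for g in list2)
    return a, len(list1) - a, c, len(list2) - c

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x annotated genes in the first list
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical gene-term association table and gene lists
annotations = {
    "g1": {"TERM_A"}, "g2": {"TERM_A"}, "g3": {"TERM_A"}, "g4": set(),
    "g5": {"TERM_A"}, "g6": set(), "g7": set(), "g8": set(),
}
list1 = ["g1", "g2", "g3", "g4"]
list2 = ["g5", "g6", "g7", "g8"]

table = term_table("TERM_A", list1, list2, annotations)
p_value = fisher_exact_two_sided(*table)
```

In a real analysis the same test would be repeated for every term in the association table, which is why a multiple-testing correction is needed afterwards.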

A multiple-testing correction, which accounts for the many hypotheses tested (one for each GO term), is also applied.
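As an illustration of such a correction, the sketch below adjusts a list of p-values with the Benjamini-Hochberg step-up procedure, a common way of controlling the false discovery rate. That Babelomics uses this particular FDR variant is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values, one per tested term
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Terms whose adjusted p-value falls below the chosen threshold (0.05 is a common choice) would then be reported as significant.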

In addition to GO terms, FatiGO can simultaneously test for KEGG pathways, InterPro motifs, Swissprot keywords, microRNA, transcription factor binding sites (TFBSs), cis-Regulatory Element Database (cisRED) motifs and BioCarta pathways.

The distribution of any combination (or all) of the terms between the two groups of genes can be simultaneously tested by means of a Fisher exact test.

All the p-values are adjusted via the false discovery rate (FDR) procedure.

Marmite - Marmite stands for 'My Accurate Resource for MIning TExt' and its aim is to provide the research community with an easy and accurate way to analyze the association between 'gene lists' and the terms that have been found to be related in any way to their genes in PubMed documents.

Data is provided by the manufacturer 'bioalma' who generated the associations using their product AKS2 (alma Knowledge Server).

Bioentities and gene co-occurrences - Starting with a set of documents (e.g. the documents in which a certain gene or disease appears) one can define keywords as those words which are significantly over-represented compared to a standard set or background.

These words that appear with much higher frequencies than one would expect from chance alone can be considered as the 'content words' that capture the main features in this set of documents.

Marmite evaluates the differences between the gene-bioentity co-occurrence values (scores) for the two lists of genes. The manufacturer applies a Kolmogorov-Smirnov test to each pair of distributions (one per list) formed by the co-occurrence scores between a bioentity and the genes within the list.

Only genes with a score indicating co-occurrence with the bioentity are included.
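The comparison described above can be sketched as follows. The code computes only the two-sample Kolmogorov-Smirnov statistic D (the largest gap between the two empirical distribution functions), not a p-value. The score dictionary, gene identifiers and function names are hypothetical, and a gene absent from the dictionary is assumed to have no co-occurrence with the bioentity.

```python
from bisect import bisect_right

def scores_for(gene_list, scores):
    """Keep only the genes that have a co-occurrence score for the bioentity."""
    return [scores[g] for g in gene_list if g in scores]

def ks_statistic(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistic D."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest difference
    for x in s1 + s2:
        cdf1 = bisect_right(s1, x) / n1
        cdf2 = bisect_right(s2, x) / n2
        d = max(d, abs(cdf1 - cdf2))
    return d

# Hypothetical co-occurrence scores between one bioentity and some genes
scores = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0,
          "g5": 3.0, "g6": 4.0, "g7": 5.0, "g8": 6.0}
list1 = ["g1", "g2", "g3", "g4", "g9"]   # g9 has no score and is dropped
list2 = ["g5", "g6", "g7", "g8"]

d_stat = ks_statistic(scores_for(list1, scores), scores_for(list2, scores))
```

A value of D near 0 means the two score distributions look alike; a value near 1 means they barely overlap.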

Gene set enrichment Tools in Babelomics --

FatiScan - FatiScan detects blocks of functionally related genes. It implements a ‘segmentation test’ which checks for 'asymmetrical distributions' of 'biological labels' (GO, KEGG pathways, InterPro motifs, Swissprot keywords, microRNA, transcription factors and cisRED cis-regulatory elements) associated with genes ranked in a list.

FatiScan can be applied to the study of the relationship of 'biological labels' to any type of experiment whose outcome is a sorted list of genes.
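One way to picture a segmentation test of this kind is to cut the ranked list at several positions and, at each cut, compare how often a label occurs above and below it. The sketch below does this with a two-sided Fisher's exact test at evenly spaced cuts. The number and placement of cuts, the choice of test, the function names and the toy data are assumptions made for illustration, not a description of FatiScan's internals.

```python
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def segmentation_scan(ranked_genes, annotated, n_partitions=3):
    """Test one label at evenly spaced cut points along a ranked gene list.

    Returns a list of (cut_position, p_value) pairs."""
    n = len(ranked_genes)
    results = []
    for k in range(1, n_partitions + 1):
        cut = k * n // (n_partitions + 1)
        upper, lower = ranked_genes[:cut], ranked_genes[cut:]
        a = sum(g in annotated for g in upper)   # labelled genes above the cut
        c = sum(g in annotated for g in lower)   # labelled genes below the cut
        p = fisher_two_sided(a, len(upper) - a, c, len(lower) - c)
        results.append((cut, p))
    return results

# Hypothetical ranked list (g1 ranks first) and a label carried by the top genes
ranked = [f"g{i}" for i in range(1, 13)]
annotated = {"g1", "g2", "g3", "g4"}

scan = segmentation_scan(ranked, annotated)
best_cut, best_p = min(scan, key=lambda pair: pair[1])
```

Because the label is tested at several cuts and for many labels, the resulting p-values would need a multiple-testing correction before interpretation.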

Since Babelomics is linked to GEPAS (see G6G Abstract Number 20273), genes sorted by differential expression between two experimental conditions can be studied, as can genes correlated to a clinical variable (such as the level of a metabolite) or even to survival.

Moreover, lists of genes ranked by any other experimental or theoretical criterion can be studied (e.g. genes arranged by physico-chemical properties, mutability, structural parameters, etc.) in order to understand whether some biological feature (among the labels used) is related to the experimental parameter studied.

The manufacturer proposes the use of this procedure to scan ordered lists of genes in order to understand the biological processes operating behind them.

This procedure can be useful in situations in which it is not possible to obtain statistically significant differences based on experimental measurements (low prevalence diseases, etc.).

MarmiteScan - MarmiteScan applies a ‘threshold-free method’ (via FatiScan) to extract blocks of related genes from a list of genes ordered by an 'associated value'. It draws on the Marmite tool (see above), which finds differential distributions of bioentities extracted from PubMed between two groups of genes.

MarmiteScan tests whether the distributions of the scores, that is, the importance of the association to a bioentity, differ in any of the groups.

MarmiteScan tells you whether there is an enrichment of any bioentity in your list of sorted genes.

Tissue phenotype-based profiling tool(s) in Babelomics --

Tissues Mining Tool (TMT) - TMT extracts differences between the distributions of the 'expression values' of two groups of genes in a set of tissues.

To cover most of the experiments users are likely to be interested in, the manufacturer provides data from two (2) types of platforms: SAGE tags and microarray (Affymetrix) expression data.

TMT also provides two (2) possible applications. Users may compare the distributions of two lists of genes of interest, for example lists taken from a microarray experiment after a differential expression analysis, genes from two clusters, etc.

Alternatively, users may also compare a unique list of genes versus the complete set of genes with expression values in the tissues selected (genes in the user's list are excluded from this second list).

This approach may be useful to see if your genes of interest are over/under expressed or randomly mapped in the overall distribution of the tissue.
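The second application (one list versus all remaining genes) can be sketched as below. The sketch only shows how the two groups are formed, with the user's genes removed from the background, and summarizes each group by its median. The statistic TMT actually uses is not stated here, so the median is purely descriptive, and the expression values, gene identifiers and function name are hypothetical.

```python
from statistics import median

def split_by_list(expression, gene_list):
    """Split one tissue's expression values into the user's genes and the rest.

    Genes in the user's list are excluded from the background group, and
    listed genes without an expression value in this tissue are ignored."""
    wanted = set(gene_list)
    in_list = [v for g, v in expression.items() if g in wanted]
    background = [v for g, v in expression.items() if g not in wanted]
    return in_list, background

# Hypothetical expression values for one tissue
expression = {"g1": 8.0, "g2": 7.5, "g3": 2.0, "g4": 3.0, "g5": 2.5}
my_genes = ["g1", "g2", "g9"]   # g9 has no value in this tissue

in_list, background = split_by_list(expression, my_genes)
list_median = median(in_list)
background_median = median(background)
```

A list median well above the background median would suggest that the genes of interest tend to be over-expressed in that tissue, subject to a proper statistical test.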

Gene module related Tools --

GO-Graphviewer - an advanced Gene Ontology directed acyclic graph (DAG) viewer -

Interactive and highlighted Gene Ontology Graph Visualization - The DAG viewer tool generates joined gene ontology graphs (DAGs) to create overviews of the functional context of groups of sequences.
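A joined graph of this kind can be thought of as the union of the ancestor paths of every annotated term. The sketch below builds that union from a child-to-parents table. The term names and the tiny ontology are hypothetical stand-ins for real GO terms, and the function is an illustration rather than the viewer's own algorithm.

```python
def joined_dag(terms, parents):
    """Return (nodes, edges) of the union of the ancestor subgraphs of `terms`.

    `parents` maps each term to its direct parent terms; edges are
    (child, parent) pairs."""
    nodes, edges = set(), set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in nodes:
            continue          # already expanded via another path
        nodes.add(term)
        for parent in parents.get(term, ()):
            edges.add((term, parent))
            stack.append(parent)
    return nodes, edges

# Hypothetical miniature ontology (child -> direct parents)
parents = {
    "lipid_metabolism": ["metabolism"],
    "metabolism": ["process"],
    "signaling": ["process"],
    "transport": ["process"],
    "process": ["root"],
}

nodes, edges = joined_dag(["lipid_metabolism", "signaling"], parents)
```

Terms that are not ancestors of any annotated term ("transport" in this toy case) are left out, which keeps the overview focused on the functional context of the input set.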

Interactive graph visualization allows the navigation of large and unwieldy graphs often generated when trying to biologically explore large sets of sequence annotations.

Zoom and graph navigation are provided through the DAG viewer Java Web Start tool.

ID converter -

Babelomics accepts all standard gene and protein identifiers, which also makes it suitable for the analysis of proteomic experiments.

An improved tool for protein and gene ID conversion, covering a large number of species and databases, has now been implemented.

More species and gene references have been added, and the ‘converter tool’ now supports more than 10 species and more than 40 ID references for human (including SNP and orthology information).

Babelomics is currently based on Ensembl version 44.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site Babelomics

Price Contact manufacturer.

G6G Abstract Number 20274

G6G Manufacturer Number 102153