GeneMANIA
Category Cross-Omics>Knowledge Bases/Databases/Tools and Genomics>Gene Expression Analysis/Profiling/Tools
Abstract GeneMANIA is a flexible, user-friendly web interface for generating hypotheses about gene function, analyzing gene lists and prioritizing genes for functional assays.
GeneMANIA helps you predict the function of your favorite genes and gene sets.
GeneMANIA finds other genes that are related to a set of input genes, using a very large set of functional association data. Association data include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity.
You can use GeneMANIA to find new members of a pathway or complex, find additional genes you may have missed in your screen or find new genes with a specific function, such as protein kinases. Your question is defined by the set of genes you input.
If members of your gene list make up a protein complex, GeneMANIA will return more potential members of the protein complex. If you enter a gene list, GeneMANIA will return connections between your genes, within the selected datasets.
Given a query list, GeneMANIA extends the list with functionally similar genes that it identifies using available genomics and proteomics data. GeneMANIA also reports weights that indicate the predictive value of each selected data set for the query.
Seven (7) organisms are currently supported: Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Homo sapiens, Rattus norvegicus and Saccharomyces cerevisiae.
Hundreds of data sets have been collected from Gene Expression Omnibus (GEO) - (see G6G Abstract Number 20013); the Biological General Repository for Interaction Datasets (BioGRID); Pathway Commons - (see G6G Abstract Number 20366); and I2D (interologous interaction database); as well as from organism-specific functional genomics data sets.
Users can select arbitrary subsets of the data sets associated with an organism to perform their analyses and can upload their own data sets to analyze.
The GeneMANIA algorithm performs as well as or better than other gene function prediction methods on yeast and mouse benchmarks.
The high accuracy of the GeneMANIA prediction algorithm, an intuitive user interface and large database make GeneMANIA a useful tool for any biologist.
GeneMANIA Input/Output --
The input to GeneMANIA is simple -- the user enters a list of genes and, optionally, selects from a list of data sets that they wish to query.
GeneMANIA then extends the user’s list with genes that are functionally similar to, or share properties with, the initial query genes, and displays an interactive functional association network illustrating the relationships among the genes and data sets.
For example, if the user enters protein complex members, such as yeast ARP2 and ARP3, GeneMANIA will output other complex components and will weight co-expression and protein interaction data sets highly.
If the query genes are involved in disease, such as a mouse leukemia model, the Online Mendelian Inheritance in Man (OMIM) database and phenotype data sets may receive high weight, and GeneMANIA will output genes that are likely involved in the same process.
Users interested in prioritizing genes for planning a functional screen can use GeneMANIA to return ranked lists of genes likely to share phenotypes with those in the query list based on GeneMANIA’s large and diverse data collection.
Another helpful feature of GeneMANIA --
GeneMANIA assigns weights to data sets based on how useful they are for each query.
Individual data sets are represented as networks, and in the basic algorithm, each network is assigned a weight primarily based on how well connected genes in the query list are to each other compared with their connectivity to non-query genes.
However, GeneMANIA’s adaptive weighting methods also detect and down-weight redundant networks.
This network weighting feature is particularly useful for determining how genes in a gene list are connected to one another, or for determining which types of functional genomic data are most useful to collect for finding more genes like those in the query.
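The basic weighting idea can be illustrated with a toy calculation. The sketch below is not GeneMANIA's actual weighting procedure; it is a simplified heuristic, assumed here for illustration only, that scores each network by the share of query-touching edge weight that stays inside the query list and then scales the scores to sum to 100%. The function names, the network names and the placeholder genes GENE_X, GENE_Y and GENE_Z are hypothetical, and the sketch does not attempt the down-weighting of redundant networks.

```python
def connectivity_score(edges, query):
    """Share of query-touching edge weight that links two query genes.

    `edges` maps (gene_a, gene_b) pairs to non-negative edge weights.
    This is an illustrative heuristic, not GeneMANIA's published method.
    """
    within = 0.0   # weight of edges joining two query genes
    outside = 0.0  # weight of edges joining a query gene to a non-query gene
    for (gene_a, gene_b), weight in edges.items():
        a_in, b_in = gene_a in query, gene_b in query
        if a_in and b_in:
            within += weight
        elif a_in or b_in:
            outside += weight
    total = within + outside
    return within / total if total > 0 else 0.0


def normalize_weights(scores):
    """Scale non-negative scores so that they sum to 100 (percent)."""
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: 100.0 * score / total for name, score in scores.items()}


# Toy data: two hypothetical networks and a two-gene query.
query = {"ARP2", "ARP3"}
networks = {
    "physical_interaction": {("ARP2", "ARP3"): 1.0, ("ARP2", "GENE_X"): 0.5},
    "co_localization": {("ARP2", "GENE_Y"): 1.0, ("ARP3", "GENE_Z"): 1.0},
}
scores = {name: connectivity_score(e, query) for name, e in networks.items()}
weights = normalize_weights(scores)
```

In this toy case the first network links the two query genes directly and so receives all of the weight, while the second network only links query genes to non-query genes and receives none.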
GeneMANIA Data sources --
Data sets are collected from publicly available databases, including co-expression data from GEO; physical and genetic interaction data from BioGRID; predicted protein interaction data based on orthology from I2D; and pathway and molecular interaction data from Pathway Commons, which contains data from BioGRID (as stated above…);
Memorial Sloan-Kettering Cancer Center, Human Protein Reference Database (HPRD); HumanCyc - (see G6G Abstract Number 20234);
Systems Biology Center New York, IntAct (an open source resource for molecular interaction data); Molecular INTeraction database (MINT);
NCI-Nature Pathway Interaction Database - (see G6G Abstract Number 20245); and Reactome - (see G6G Abstract Number 20267).
Individual data sets relevant to specific organisms are also collected, such as protein sub-cellular localization in yeast.
Networks are produced from the data either directly (as in the case of protein or genetic interactions) or using an in-house analysis pipeline (workflow) to convert profiles to functional association networks (e.g. mRNA expression data are converted to co-expression networks).
In summary, co-expression networks are filtered to remove weak correlations, which the manufacturers have shown decreases compute time without impacting prediction accuracy.
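The conversion of expression profiles to a filtered co-expression network can be sketched as follows. This is a minimal illustration, not the in-house pipeline itself: the use of Pearson correlation, the cutoff of 0.8, the function names and the gene names are all assumptions, since the description above does not state which similarity measure or threshold is applied.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def coexpression_network(profiles, min_correlation=0.8):
    """Build edges between gene pairs whose correlation meets the cutoff.

    `profiles` maps gene name -> list of expression values.
    `min_correlation` is an assumed, illustrative threshold.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r >= min_correlation:  # weak correlations are filtered out
            edges[(gene_a, gene_b)] = r
    return edges


# Hypothetical expression profiles across four conditions.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
network = coexpression_network(profiles)
```

Only the GENE_A and GENE_B pair survives the filter here, which shows why filtering shrinks the network that later steps have to process.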
A default set of networks has been selected for each organism; however, users can choose which networks to include in their analysis in the advanced options panel.
Using this panel, users can select or deselect all networks from a given data source, or select or deselect individual networks, using a system of check boxes.
Uploading data sets --
Users can upload their own data sets in tab-delimited text format to include in their analyses. The required format for uploaded data sets is described in ‘Upload Help’.
These data sets are stored only during a user’s session, are accessible only to the user who uploaded them, and can be used seamlessly with the preloaded GeneMANIA data sets.
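Before uploading, users may want to check that a tab-delimited file parses cleanly. The sketch below assumes a three-column layout (gene, gene, score) with optional `#` comment lines; this layout is an assumption made for illustration, and the authoritative format is the one described in ‘Upload Help’. The function name and sample genes are hypothetical.

```python
import csv
import io


def parse_network(text):
    """Parse tab-delimited 'gene_a<TAB>gene_b<TAB>score' lines into edges.

    The three-column layout is assumed for illustration; consult the
    tool's own upload documentation for the required format.
    """
    edges = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_number, row in enumerate(reader, start=1):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        if len(row) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 tab-separated fields, "
                f"got {len(row)}"
            )
        gene_a, gene_b = row[0].strip(), row[1].strip()
        score = float(row[2])  # raises ValueError on a non-numeric score
        # Store each undirected edge under a sorted key.
        edges[tuple(sorted((gene_a, gene_b)))] = score
    return edges


sample = "GENE_A\tGENE_B\t0.9\nGENE_C\tGENE_A\t0.4\n"
edges = parse_network(sample)
```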
More on GeneMANIA Output --
Determining the network weights -
The composite network is constructed as a weighted sum of the individual data sources; each edge (link) in the composite network is scaled by the weight of the data source that contributes it.
The network weights are non-negative, sum to 100% and reflect the relevance of each data source for predicting membership in the query list.
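The weighted sum can be sketched in a few lines. The example below is illustrative only: the function name, the network names and the weights of 60% and 40% are hypothetical, and real networks would hold far more edges.

```python
def composite_network(networks, weights_percent):
    """Combine networks into one by a weighted sum of their edges.

    `networks` maps network name -> {(gene_a, gene_b): edge weight}.
    `weights_percent` maps network name -> non-negative weight; the
    weights are expected to sum to 100.
    """
    if any(w < 0 for w in weights_percent.values()):
        raise ValueError("network weights must be non-negative")
    if abs(sum(weights_percent.values()) - 100.0) > 1e-6:
        raise ValueError("network weights must sum to 100%")
    composite = {}
    for name, edges in networks.items():
        share = weights_percent[name] / 100.0
        for pair, value in edges.items():
            # An edge present in several networks accumulates each
            # network's weighted contribution.
            composite[pair] = composite.get(pair, 0.0) + share * value
    return composite


networks = {
    "co_expression": {("GENE_A", "GENE_B"): 1.0, ("GENE_B", "GENE_C"): 0.5},
    "physical_interaction": {("GENE_A", "GENE_B"): 0.5},
}
weights_percent = {"co_expression": 60.0, "physical_interaction": 40.0}
composite = composite_network(networks, weights_percent)
```

The GENE_A and GENE_B edge appears in both networks, so its composite weight is 0.6 x 1.0 + 0.4 x 0.5 = 0.8, while the GENE_B and GENE_C edge gets 0.6 x 0.5 = 0.3.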
Given the composite network, the manufacturers use label propagation to score all non-query genes. These scores are used to rank the genes.
The score assigned to each gene reflects how often paths that start at a given gene node end up in one of the query nodes and how long and heavily weighted those paths are.
These scores are presented to the user in a table that allows interaction with the composite network display.
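The scoring step can be illustrated with a generic label propagation sketch. This is a simplified textbook-style formulation and not necessarily the exact one GeneMANIA uses: query genes start with label 1, all other genes with label 0, and each gene's score is repeatedly updated from its own label and its neighbors' scores. The update rule, the iteration count, the function names and the placeholder genes are assumptions.

```python
def propagate_labels(edges, query, iterations=200):
    """Score genes by iterative label propagation on a weighted network.

    Each pass applies score[g] = (label[g] + sum(w * score[n])) / (1 + sum(w)),
    a Jacobi-style update that converges because every gene keeps a
    pull of strength 1 toward its initial label.
    """
    neighbors = {}
    for (gene_a, gene_b), weight in edges.items():
        neighbors.setdefault(gene_a, {})[gene_b] = weight
        neighbors.setdefault(gene_b, {})[gene_a] = weight
    for gene in query:
        neighbors.setdefault(gene, {})  # keep isolated query genes
    labels = {g: (1.0 if g in query else 0.0) for g in neighbors}
    scores = dict(labels)
    for _ in range(iterations):
        scores = {
            g: (labels[g] + sum(w * scores[n] for n, w in nbrs.items()))
            / (1.0 + sum(nbrs.values()))
            for g, nbrs in neighbors.items()
        }
    return scores


def rank_candidates(scores, query):
    """Return non-query genes ordered from highest to lowest score."""
    candidates = [g for g in scores if g not in query]
    return sorted(candidates, key=lambda g: scores[g], reverse=True)


# Toy composite network: GENE_X touches both query genes, GENE_Y is
# reachable only through GENE_X, and GENE_Z only through GENE_Y.
edges = {
    ("ARP2", "ARP3"): 1.0,
    ("ARP2", "GENE_X"): 0.8,
    ("ARP3", "GENE_X"): 0.6,
    ("GENE_X", "GENE_Y"): 0.2,
    ("GENE_Y", "GENE_Z"): 0.5,
}
query = {"ARP2", "ARP3"}
scores = propagate_labels(edges, query)
ranking = rank_candidates(scores, query)
```

Genes joined to the query by short, heavily weighted paths end up with higher scores, so GENE_X ranks ahead of GENE_Y, which in turn ranks ahead of GENE_Z.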
Saving GeneMANIA results -
GeneMANIA allows users to save the results of their analysis by clicking on ‘Save’ in the menu above the network.
By selecting options in this menu, users can generate a report that includes any or all of the following: the network image, the network weights, the recommended genes and the query parameters.
Users can also download the network weights and genes in the generated composite network as a tab-delimited text file.
System Requirements
Web-based. GeneMANIA is also accessible via a Cytoscape plug-in - (see G6G Abstract Number 20092), designed for power users.
Manufacturer
- Department of Computer Science,
- Donnelly Centre for Cellular and Biomolecular Research,
- Department of Molecular Genetics, and
- Banting and Best Department of Medical Research,
- University of Toronto
- Toronto, Ontario, Canada
Manufacturer Web Site GeneMANIA
Price Contact manufacturer.
G6G Abstract Number 20685
G6G Manufacturer Number 104262