Genes2Networks

Category Cross-Omics>Pathway Analysis/Tools and Cross-Omics>Knowledge Bases/Databases/Tools

Abstract Genes2Networks is an advanced web-based software system that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes.

This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in 'common pathways' or protein complexes.

The manufacturer's aim in developing the Genes2Networks software is to provide cell and molecular experimental biologists, as well as computational biologists, with a user-friendly tool for creating subnetworks from lists of mammalian genes or proteins by connecting these genes or proteins using known protein-protein interactions.

To accomplish this task, the manufacturer developed a large-scale, high-quality mammalian protein-protein interaction database.

This database was created by consolidating databases containing mostly low-throughput, literature-based protein interaction data extracted manually by expert biologists, but also data generated from high-throughput methods.

To develop Genes2Networks, the manufacturer consolidated ten (10) currently available mammalian protein interaction network datasets into one large dataset. To prune out interactions of low confidence, a simple filter was implemented.
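The consolidation step can be pictured with a minimal Python sketch. It is not the manufacturer's code: the record layout `(gene_a, gene_b, pubmed_id)`, the function name `merge_interactions`, the upper-casing of symbols, and the placeholder reference IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_interactions(datasets):
    """Merge interaction records from several sources into one edge map.

    datasets: dict mapping a source name to an iterable of
    (gene_a, gene_b, pubmed_id) tuples (hypothetical layout).
    Returns {(a, b): {"sources": set, "pmids": set}} with a <= b, so the
    same undirected interaction from two sources lands on one key.
    """
    merged = defaultdict(lambda: {"sources": set(), "pmids": set()})
    for source, records in datasets.items():
        for gene_a, gene_b, pmid in records:
            # Assumption: symbols are compared case-insensitively.
            a, b = sorted((gene_a.strip().upper(), gene_b.strip().upper()))
            merged[(a, b)]["sources"].add(source)
            merged[(a, b)]["pmids"].add(pmid)
    return dict(merged)


# Toy input; the reference IDs are placeholders, not real PubMed IDs.
datasets = {
    "HPRD": [("GRB2", "SOS1", "ref_1"), ("GRB2", "EGFR", "ref_2")],
    "BIND": [("sos1", "GRB2", "ref_3")],
}
merged = merge_interactions(datasets)
for edge, info in sorted(merged.items()):
    print(edge, sorted(info["sources"]), sorted(info["pmids"]))
```

Keeping the set of supporting references on each merged interaction is what makes a later reference-based confidence filter possible.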

Genes2Networks is delivered as a web interface application. This tool can be used to extract relevant subnetworks given lists of gene or protein names.

The input to the system is a list of 'Entrez gene' symbols. The system uses the merged datasets made of selected databases to find interactions between the nodes in the seed list.

The merged datasets can be filtered based on user preferences concerning the maximum number of interactions a reference can provide, and the minimum number of references required for interactions to be included.

The resultant filtered dataset serves as a reference network for exploring, by depth-first traversal, paths between the seed nodes.

Nodes that fall on paths shorter than a user-defined path length between seed nodes are included as intermediates in the output subnetwork.
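As a rough illustration of this step, the following Python sketch runs a depth-limited, depth-first search from each seed and keeps every node on a simple path that reaches another seed. It is a sketch under stated assumptions, not the tool's implementation: the names `connect_seeds` and `max_len` are hypothetical, and the length limit is read here as "at most `max_len` edges".

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def connect_seeds(edges, seeds, max_len):
    """Return (nodes, edges) on simple seed-to-seed paths of <= max_len edges."""
    adj = build_adjacency(edges)
    seeds = set(seeds)
    sub_nodes, sub_edges = set(), set()

    def dfs(path):
        node = path[-1]
        if len(path) > 1 and node in seeds:
            # Reached another seed: keep every node and edge on this path.
            sub_nodes.update(path)
            sub_edges.update(tuple(sorted(e)) for e in zip(path, path[1:]))
            return  # longer paths are covered by the search from that seed
        if len(path) - 1 == max_len:
            return  # depth limit reached without meeting a seed
        for nxt in adj.get(node, ()):
            if nxt not in path:  # simple paths only
                dfs(path + [nxt])

    for seed in seeds:
        if seed in adj:
            dfs([seed])
    return sub_nodes, sub_edges


# Toy reference network: A and B are seeds.
edges = [("A", "X"), ("X", "B"), ("A", "Y"), ("Y", "Z"),
         ("Z", "W"), ("W", "B"), ("C", "Q")]
nodes_2, edges_2 = connect_seeds(edges, ["A", "B"], max_len=2)
nodes_4, edges_4 = connect_seeds(edges, ["A", "B"], max_len=4)
print(sorted(nodes_2 - {"A", "B"}))  # intermediates at max_len=2
print(sorted(nodes_4 - {"A", "B"}))  # intermediates at max_len=4
```

Raising the path-length limit admits longer detours, so the subnetwork grows quickly on a dense reference network; a small limit keeps the output readable.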

The system's output includes a statistical analysis report and a three-color network map that highlights the seed nodes in one color, the significant intermediates in another color, and the non-significant intermediates in a third color.

The statistical analysis report lists the intermediate nodes used to connect the seed genes, sorted by how significantly specific their interactions are to nodes from the seed list.

High-quality large-scale mammalian protein interaction network --

The manufacturer used only mammalian (mouse/rat/human) interactions recorded in the following datasets:

Biomolecular Interaction Network Database (BIND), Human Protein Reference Database (HPRD), IntAct, Database of Interacting Proteins (DIP), Molecular INTeraction database (MINT), Rual et al. (Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437(7062):1173-1178), Stelzl et al. (A Human Protein-Protein Interaction Network: A Resource for Annotating the Proteome. Cell 2005, 122(6):957-968), Ma'ayan et al. (Formation of Regulatory Patterns During Signal Propagation in a Mammalian Cellular Network. Science 2005, 309(5737):1078-1083), protein-protein interaction database for PDZ-domains (PDZBase), and Protein-Protein Interaction Database (PPID).

All interactions from these databases/datasets were determined experimentally and include a PubMed reference to the primary research article that describes the experiments used to identify the interactions.

Some of the databases contain interactions that were manually extracted from the literature (e.g. HPRD); some datasets are the result of high-throughput experiments (e.g. Rual et al. and Stelzl et al.); and some databases contain both low- and high-throughput interactions (e.g. BIND, IntAct, and DIP).

Database Filtering --

Many of the interactions and components listed in the ten (10) databases that the manufacturer used are the result of high-throughput experiments such as yeast two-hybrid screens and mass spectrometry.

These interactions are considered low-quality since these techniques often report many false positives.

Thus, the manufacturer applied a simple filtering approach allowing users to exclude interactions originating from articles that provide many interactions, and/or include only interactions reported by several different papers.
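The two filter settings can be sketched in Python as follows. This is an illustrative sketch only: the function name `filter_interactions`, the parameters `max_per_ref` and `min_refs`, and the order of the two steps (discard over-productive references first, then count what remains) are assumptions, since the source does not describe the implementation.

```python
from collections import Counter


def filter_interactions(edge_refs, max_per_ref=None, min_refs=1):
    """Filter interactions by reference-based confidence.

    edge_refs: {(a, b): set of reference IDs supporting that interaction}.
    max_per_ref: ignore references that report more than this many
        interactions (None disables the check).
    min_refs: keep an interaction only if at least this many references
        still support it.
    """
    # How many distinct interactions does each reference report?
    per_ref = Counter(ref for refs in edge_refs.values() for ref in refs)
    kept = {}
    for edge, refs in edge_refs.items():
        if max_per_ref is not None:
            refs = {r for r in refs if per_ref[r] <= max_per_ref}
        if refs and len(refs) >= min_refs:
            kept[edge] = set(refs)
    return kept


# Toy data: "screen" stands for one high-throughput paper reporting 3 edges.
edge_refs = {
    ("A", "B"): {"paper_1", "paper_2"},
    ("A", "C"): {"screen"},
    ("B", "C"): {"screen", "paper_3"},
    ("C", "D"): {"screen"},
}
low_throughput_only = filter_interactions(edge_refs, max_per_ref=2)
well_supported = filter_interactions(edge_refs, min_refs=2)
print(sorted(low_throughput_only))
print(sorted(well_supported))
```

The two settings answer different questions: `max_per_ref` targets large screens as a source, while `min_refs` rewards independent confirmation regardless of source.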

Significant intermediates --

The output subnetworks produced by Genes2Networks contain nodes, mostly proteins, that were not originally provided by the user as input.

The manufacturer calls these nodes "intermediate nodes". Some intermediate nodes may be present in the output subnetwork simply because they are highly connected nodes (hubs) in the consolidated reference network used to connect the seed list.

On the other hand, some intermediate nodes may interact specifically with components of the input seed list.

If these intermediates are specific, it may be beneficial for the user to identify them as potential specific regulators and specific participants in pathways, protein complexes and modules involving the input seed list.
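One way to separate hubs from specific intermediates is to ask whether a node has more seed neighbors than its overall degree would predict by chance. The source does not name the statistical test used, so the Python sketch below swaps in a hypergeometric upper-tail probability as an assumed stand-in; the function names and the toy network are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement."""
    total = comb(population, draws)
    hits = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, min(successes, draws) + 1))
    return hits / total


def rank_intermediates(adj, seeds, intermediates):
    """Sort intermediates by how unexpectedly seed-rich their neighbors are.

    adj: {node: set of neighbors} for the whole reference network.
    Returns (node, seed_neighbors, degree, p_value) tuples, smallest p first.
    """
    seeds_in_net = set(seeds) & set(adj)
    others = len(adj) - 1  # possible partners of any single node
    rows = []
    for node in intermediates:
        nbrs = adj[node] - {node}
        k = len(nbrs & seeds_in_net)
        p = hypergeom_upper_tail(k, others, len(seeds_in_net), len(nbrs))
        rows.append((node, k, len(nbrs), p))
    return sorted(rows, key=lambda row: row[3])


# Toy network: X touches only the two seeds; H is a hub with 8 partners.
edges = [("X", "A"), ("X", "B"), ("H", "A"), ("H", "B")]
edges += [("H", f"n{i}") for i in range(1, 7)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

ranked = rank_intermediates(adj, seeds=["A", "B"], intermediates=["X", "H"])
for node, k, degree, p in ranked:
    print(node, k, degree, round(p, 4))
```

In this toy case both X and H connect the same two seeds, but only X does so with a degree small enough that the overlap is unlikely by chance, which is the distinction the three-color map is meant to convey.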

System Requirements

Web-based.

Manufacturer

Manufacturer Web Site Genes2Networks

Price Contact manufacturer.

G6G Abstract Number 20425

G6G Manufacturer Number 104054