Gene eXpression Network Analysis (GXNA)

Category Genomics>Gene Expression Analysis/Profiling/Tools and Cross-Omics>Pathway Analysis/Gene Regulatory Networks/Tools

Abstract GXNA is an innovative method for analyzing ‘gene expression’ data using ‘gene interaction networks’.

The manufacturer addresses the problem of using ‘expression data’ and prior biological knowledge to identify differentially expressed pathways or groups of genes.

Following an idea of Ideker et al. (Ideker T, et al. - Discovering regulatory and signaling circuits in molecular interaction networks; Bioinformatics (2002) 18:S233–S240), the manufacturer constructed a ‘gene interaction network’ and searched for high-scoring sub-networks.

The manufacturer makes several improvements in terms of scoring functions and algorithms, resulting in higher speed and accuracy and easier biological interpretation.

The manufacturer also assigns significance levels to their results, adjusted for multiple testing.

Their methods have been successfully applied to three (3) human microarray data sets, related to cancer and the immune system, retrieving several known and potential pathways.

This method, GXNA, is implemented in software that is publicly available and can be used on virtually any ‘microarray data’ set.

A central problem in biology is the identification of genes or pathways involved in diseases and other biological processes. The development of microarray technology and other high-throughput techniques has enabled ‘massively parallel’ approaches to this problem.

In a typical experiment, two (2) or more phenotypes are compared, with several replicates used for each phenotype. Each replicate measures ‘expression data’ for a large number of genes.

Standard analysis starts with filtering and normalizing the data, followed by the computation of test statistics for each gene, comparing expression levels in different phenotypes. Various techniques can be used to account for the large number of genes being tested.

Finally, genes are sorted in increasing order of their adjusted p-values, and the most significant genes are used to generate ‘biological hypotheses’ and/or subjected to experimental validation.
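The standard per-gene workflow described above can be sketched as follows. This is a minimal illustration, not the manufacturer's code: the choice of Welch's t statistic and of the Benjamini-Hochberg adjustment is an assumption (the text only says that "various techniques" can be used), and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for one gene; each group needs at least two replicates."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_genes(raw_pvalues):
    """Sort (gene, adjusted p-value) pairs by increasing adjusted p-value.

    raw_pvalues: dict mapping a gene name to its unadjusted p-value.
    """
    genes = list(raw_pvalues)
    adjusted = benjamini_hochberg([raw_pvalues[g] for g in genes])
    return sorted(zip(genes, adjusted), key=lambda pair: pair[1])
```

The genes at the head of the list returned by rank_genes would be the candidates for follow-up in this one-gene-at-a-time strategy.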

The power of this strategy is limited by the fact that it analyzes genes one at a time.

Real genes function in concert rather than alone: their products interact with each other and with DNA. As a result, there has been considerable interest in methods that analyze groups of genes.

One of the earliest approaches has been ‘hierarchical clustering’.

This method produces useful visualizations, but it lacks a sound statistical basis. It is also entirely driven by experimental data and does not use any prior biological knowledge about the genes of interest.

Several other methods were developed, from visualization tools such as GenMAPP (see G6G Abstract Number 20219) to algorithms such as GO (Gene Ontology) analysis. A useful way to classify them is according to the way they represent prior knowledge:

1) The Gene Ontology (GO) is a database of biological terms structured as a directed acyclic graph (DAG). A typical analysis seeks GO terms that contain a large number of differentially expressed genes.

Advantages of this method include speed, simplicity and the ability to assign p-values. However, this analysis ignores a lot of information (all the top genes are assigned equal weight, regardless of their scores) and often outputs very general GO terms that are not very useful.
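One common way to assign a p-value to a GO term is a hypergeometric (one-sided Fisher) test on the overlap between the term's genes and the list of top genes. The sketch below assumes that test; the source does not say which test a given GO tool uses, and the function and parameter names are hypothetical.

```python
from math import comb


def go_term_enrichment_p(total_genes, term_genes, top_genes, overlap):
    """Probability of seeing at least `overlap` term genes among the top genes.

    total_genes: number of genes measured on the array
    term_genes:  number of those genes annotated with the GO term
    top_genes:   number of genes called differentially expressed
    overlap:     number of top genes that carry the GO term
    """
    denominator = comb(total_genes, top_genes)
    upper = min(term_genes, top_genes)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, top_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denominator
```

Note that only membership in the top list enters the calculation, which is exactly the equal-weight limitation mentioned above.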

2) Another method starts with a predefined list of groups of genes, such as known pathways. This is used in the GSEA algorithm (see G6G Abstract Number 20266).

Every such group is assigned a score that is essentially the average of the test statistics of its member genes; groups with high scores are more likely to be differentially expressed. P-values can be obtained by permutation methods.
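A minimal sketch of group scoring with a permutation p-value follows. It is a simplification under stated assumptions: the null distribution is built from random gene groups of the same size, which is not the running-sum statistic or the phenotype-label permutation used by GSEA itself, and all names are hypothetical.

```python
import random
from statistics import mean


def group_score(stats, group):
    """Average per-gene test statistic over the genes in the group."""
    return mean(stats[g] for g in group)


def permutation_p_value(stats, group, n_perm=1000, seed=0):
    """Share of random same-size gene groups scoring at least as high as `group`.

    stats: dict mapping a gene name to its test statistic.
    """
    rng = random.Random(seed)
    observed = group_score(stats, group)
    genes = list(stats)
    hits = 0
    for _ in range(n_perm):
        random_group = rng.sample(genes, len(group))
        if group_score(stats, random_group) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```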

3) Yet another method uses a simple but advanced idea of Ideker et al. (2002): rather than using a list of known pathways, ‘prior knowledge’ is represented as an ‘interaction network’.

The nodes of the graph correspond to genes; there is an edge between two (2) nodes if their genes interact.

Various types of interactions may be considered, such as protein-to-protein, protein-to-DNA or co-expression. A group of related genes corresponds to a connected sub-graph of the interaction graph.

Each sub-graph is assigned a score (typically the sum of the scores of its ‘component genes’), and a ‘search algorithm’ is used to find sub-graphs with high scores.
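The network representation and sub-graph search can be illustrated with a small sketch. The greedy expansion used here is an assumption chosen for brevity; it is not the search procedure of Ideker et al. or of GXNA, and the function names are hypothetical. The sub-graph score is the sum of its member genes' scores, as described above.

```python
def build_network(edges):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def greedy_subgraph(adjacency, scores, seed_gene, max_size):
    """Grow a connected sub-graph from seed_gene, adding the best neighbor each step."""
    members = {seed_gene}
    while len(members) < max_size:
        frontier = {n for m in members for n in adjacency.get(m, ())} - members
        if not frontier:
            break
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(frontier), key=lambda g: scores.get(g, 0.0))
        members.add(best)
    total = sum(scores.get(g, 0.0) for g in members)
    return members, total


def best_subgraph(adjacency, scores, max_size):
    """Try every gene as a seed and keep the highest-scoring sub-graph."""
    candidates = (
        greedy_subgraph(adjacency, scores, seed, max_size)
        for seed in sorted(adjacency)
    )
    return max(candidates, key=lambda result: result[1])
```

Because every added gene is a neighbor of a current member, each result is connected by construction, matching the requirement that a group of related genes forms a connected sub-graph.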

The ‘interaction network’ is a more precise way to represent information than lists of genes or pathways, as it describes which genes are closely connected within a given pathway.

Hence, it has the potential to detect more subtle signals, such as local disturbances within ‘known pathways’, as well as within pathways that have not yet been described.

This comes at the price of increased complexity, making it more difficult to design fast algorithms and compute significance levels.

GXNA's approach --

The manufacturer essentially follows approach (3) above, but also integrates some elements of method (2) above.

The manufacturer makes several improvements in terms of design, scoring and algorithm that address the problems mentioned above.

The manufacturer briefly emphasizes the most important ones here:

1) The manufacturer focuses on finding ‘small networks’, which are easier to interpret and validate.

2) The manufacturer uses a ‘fast algorithm’ that allows one to compute p-values, adjusted for multiple testing using the FWER (family-wise error rate) method.

3) While most previous work on ‘interaction networks’ was done on simple organisms such as yeast and bacteria, the manufacturer has successfully applied their method to human data.
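To illustrate item 2), one standard way to control the FWER is a single-step max-statistic permutation adjustment: phenotype labels are shuffled, the whole search is rerun, and each observed score is compared with the best score found in each shuffle. The sketch below assumes that scheme; the source does not spell out the exact procedure GXNA uses, and score_fn is a hypothetical callback standing in for the full scoring and search step.

```python
import random


def max_null_scores(labels, score_fn, n_perm=200, seed=0):
    """Record the best score found under each random relabeling of the samples.

    score_fn(labels) must return a list of scores, one per candidate sub-network.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(score_fn(shuffled)))
    return maxima


def fwer_adjusted_p(observed_scores, null_maxima):
    """Adjusted p-value: share of permutations whose best score reaches the observed one."""
    n = len(null_maxima)
    return [
        (sum(m >= s for m in null_maxima) + 1) / (n + 1)
        for s in observed_scores
    ]
```

Since the search is repeated for every permutation, the speed of the search algorithm directly limits how many permutations are practical.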

The construction of an ‘interaction network’ is a key step in the manufacturer's algorithm.

The current implementation incorporates a broad definition of a network, including both protein-protein and DNA-protein interactions. All interactions are treated equally, regardless of type and direction.
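Treating all interactions equally amounts to collapsing typed, directed interaction records into plain unordered gene pairs. A minimal sketch, assuming a hypothetical record layout of (source, target, interaction type); dropping self-interactions is also an assumption made here for simplicity:

```python
def collapse_interactions(records):
    """Reduce (source, target, interaction_type) records to unordered gene pairs.

    Type and direction are discarded, so duplicate and reversed records merge.
    """
    edges = set()
    for source, target, _interaction_type in records:
        if source != target:  # assumption: self-interactions are ignored
            edges.add(frozenset((source, target)))
    return edges
```

For example, a protein-protein record (G1, G2) and a DNA-protein record (G2, G1) both reduce to the same single edge between G1 and G2.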

The manufacturer believes that it is remarkable that even such a relatively coarse design captures enough information to produce interesting and statistically significant results.

Refining network structure is a promising area of future research. The manufacturer is also working on increasing output quality by reducing the overlap among the top graphs.

GXNA Documentation --

The manufacturer provides a ‘GXNA Quick Start’ guide to help you get started with this software product.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site Stanford University GXNA

Price Contact manufacturer.

G6G Abstract Number 20576

G6G Manufacturer Number 104180