Abstract

GeneMerge is a versatile genomics program that can be used to analyze a wide range of functional genomic data. In particular, GeneMerge is useful for the analysis of microarray data and other large biological datasets.

GeneMerge is available in web-based and standalone versions written in Perl (a high-level, general-purpose, interpreted, dynamic programming language). It returns a range of functional and genomic data for a given set of study genes and provides statistical rank scores for over-representation of particular functions or categories in the data set.

Functional or categorical data of all kinds can be analyzed with GeneMerge, facilitating regulatory and metabolic pathway analysis, tests of population genetic hypotheses, cross-experiment comparisons, and tests of chromosomal clustering, among others.

GeneMerge can perform analyses on a wide variety of genomic data quickly and easily and facilitates both data mining and hypothesis testing.

Given a set of genes, GeneMerge can tell you, for example, whether these genes are statistically over-represented in a particular functional or biochemical class, clustered in a region of the genome, or are associated with a particular RNAi or deletion phenotype.

GeneMerge can answer questions such as the following:

Are functional pathways up-regulated during bacterial infection also up-regulated during fungal infection? What are they? Are fast-evolving genes preferentially located in areas of high recombination? Do co-regulated genes show non-random enrichment for a certain family of transcription factor binding sites?

GeneMerge returns descriptive information regarding genes under investigation and statistically based rank scores regarding over-representation of descriptors in a given set of genes.

Functional or categorical descriptive data is associated with genes in gene-association files. These text files link each gene in a genome with a particular datum of information.

For example, a gene-association file might link the name of a gene with its chromosomal location, its molecular function, or its identity as over-expressed in a particular type of cancer.

GeneMerge has four input files:

(1) Study set gene file;

(2) Population set gene file;

(3) Gene-association file;

(4) Description file.

The study set consists of the genes that are currently under investigation. The population set consists of the genes from which the study set was drawn, often a genome.

The gene-association file links gene names with a particular datum of information using a shorthand identifier (ID). Finally, the description file contains human-readable descriptions of gene-association IDs.
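As an illustration of how the four input files relate to one another, here is a minimal Python sketch. The file layouts used here (one gene per line for the gene lists, tab-separated gene and semicolon-separated IDs for associations, tab-separated ID and description) are assumptions made for this example, as are all function names, gene names, and IDs; the actual GeneMerge formats may differ.

```python
from collections import defaultdict

# Hypothetical file contents; the layouts are assumed for illustration only.
study_text = "geneA\ngeneB\n"
population_text = "geneA\ngeneB\ngeneC\n"
association_text = "geneA\tID1;ID2\ngeneB\tID2\ngeneC\tID3\n"
description_text = "ID1\tkinase activity\nID2\tDNA binding\nID3\ttransport\n"


def parse_gene_list(text):
    """Read one gene name per line, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_associations(text):
    """Map each gene name to the set of shorthand IDs linked to it."""
    assoc = defaultdict(set)
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, ids = line.split("\t")
        assoc[gene].update(i for i in ids.split(";") if i)
    return dict(assoc)


def parse_descriptions(text):
    """Map each shorthand ID to its human-readable description."""
    return dict(line.split("\t", 1) for line in text.splitlines() if line.strip())


def describe(study, assoc, desc):
    """Return the sorted human-readable descriptions for each study gene."""
    return {g: sorted(desc.get(i, i) for i in assoc.get(g, ())) for g in study}


study = parse_gene_list(study_text)
population = parse_gene_list(population_text)
annotated = describe(study,
                     parse_associations(association_text),
                     parse_descriptions(description_text))
```

In this sketch the description file is only consulted when results are presented; the association IDs carry all of the information needed for counting.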

The output file is a tab-delimited text file that can be opened in most spreadsheet programs.

It contains functional or categorical data associated with each gene in the study set and rank scores for over-represented functions/categories, as well as other pertinent data.

Rank scores for functional or categorical over-representation within the study set of genes are obtained using the hypergeometric distribution.
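To make the role of the hypergeometric distribution concrete, the following Python sketch computes the probability of seeing at least a given number of study genes in a category when the study set is drawn at random from the population. The function and parameter names are hypothetical, and the text above does not say whether GeneMerge reports this upper-tail probability directly or applies any further adjustment, so treat this as one plausible reading rather than the program's exact calculation.

```python
from math import comb


def overrepresentation_p(population_size, population_hits, study_size, study_hits):
    """Upper-tail hypergeometric probability P(X >= study_hits).

    population_size: number of genes in the population set (N)
    population_hits: population genes annotated with the category (K)
    study_size:      number of genes in the study set (n)
    study_hits:      study genes annotated with the category (k)
    """
    total = comb(population_size, study_size)
    upper = min(population_hits, study_size)
    tail = sum(
        comb(population_hits, i)
        * comb(population_size - population_hits, study_size - i)
        for i in range(study_hits, upper + 1)
    )
    return tail / total


# Example: 10 genes in the population, 4 annotated with the category,
# 3 genes in the study set, 2 of them annotated.
p = overrepresentation_p(10, 4, 3, 2)  # (36 + 4) / 120 = 1/3
```

A small probability indicates that the category appears in the study set more often than random sampling from the population would suggest.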

The main advantage of GeneMerge over similar programs is that you are not limited to analyzing your data from the perspective of a pre-packaged set of gene-association data.

You can download or create gene-association files to analyze your data from an unlimited number of perspectives.

Gene-Association files -- The following gene-association files are currently available for use with GeneMerge.

GeneMerge lists the source of the data, who prepared it, and a reference, if any. Functional genomic information is available for the following organisms:

Arabidopsis thaliana; Bacillus anthracis AMES; Bacillus subtilis; Bos taurus; Caenorhabditis elegans; Coxiella burnetii RSA 493; Danio rerio; Drosophila melanogaster; Gallus gallus; Glossina morsitans (Tsetse Fly); Homo sapiens;

Leishmania major; Oryza sativa (Rice); Plasmodium falciparum; Rattus norvegicus; Saccharomyces cerevisiae; Schizosaccharomyces pombe; Shewanella oneidensis MR-1; Trypanosoma brucei; Vibrio cholerae; Virus Bioinformatics.

Gene Name Converter -- Based on the gene names that you input, the Gene Name Converter generates an output file with standard gene names that can be used in GeneMerge.

Three output files are generated:

1) List of paired matches.

2) List of genes for which no standard gene name can be found.

3) List of standard gene names alone (for easy cut-and-paste into GeneMerge).
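The three outputs can be pictured with a short Python sketch. The lookup table, the function name, and the exact matching rule (a case-sensitive dictionary lookup) are assumptions made for illustration; the text does not describe how the Gene Name Converter matches names internally.

```python
def convert_names(input_names, synonym_to_standard):
    """Split input names into the three lists described above.

    synonym_to_standard is a hypothetical lookup table mapping any known
    gene name to its standard name.
    """
    paired = []     # 1) (input name, standard name) matches
    unmatched = []  # 2) input names with no standard name
    standard = []   # 3) standard names alone, without duplicates
    for name in input_names:
        std = synonym_to_standard.get(name)
        if std is None:
            unmatched.append(name)
        else:
            paired.append((name, std))
            if std not in standard:
                standard.append(std)
    return paired, unmatched, standard


# Hypothetical lookup table and input, for illustration only.
table = {"alias1": "STD1", "alias2": "STD1", "alias3": "STD2"}
paired, unmatched, standard = convert_names(
    ["alias1", "alias2", "unknownX", "alias3"], table
)
```

The third list is the one intended for pasting into GeneMerge as a study or population set.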

System Requirements

GeneMerge runs on any platform and has a dedicated interface for Mac OS X. It can also be used over the Web.


