Tissue-specific Gene Expression and Regulation (TiGER)

Category Genomics>Gene Expression Databases/Tools

Abstract TiGER is a database for generating comprehensive information about human tissue-specific gene regulation, including both expression and regulatory data, developed by the Bioinformatics Lab at Wilmer Eye Institute of Johns Hopkins University.

The database contains tissue-specific gene expression profiles derived from expressed sequence tag (EST) data, cis-regulatory module (CRM) data, and combinatorial gene regulation data.

Currently, the database contains tissue-specific expression profiles for ~20,000 UniGene genes, combinatorial regulation for 7,341 interacting transcription factor (TF) pairs, and 6,232 cis-regulatory modules (CRMs) for tissue-specific genes.

Data Organization --

The data is organized as a relational database, which provides three (3) different views: gene view, TF view, and tissue view. Each view has a query interface and associated underlying database tables/entities:

1) The gene view - contains three (3) major entities:

2) The TF view - contains one (1) major entity called “TF-Partner”. This entity stores all factors that interact with a given TF, the tissue in which the interaction occurs and the significance (-log (p)) of the interaction.
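As a rough illustration of the TF view, the sketch below models one "TF-Partner" record and a simple lookup over such records. The class name, field names, and the `partners_of` helper are hypothetical, chosen only to mirror the three attributes described above (interacting factor, tissue, and significance); they are not the actual TiGER schema, and the sample rows are placeholders rather than real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFPartner:
    """One row of a hypothetical TF-Partner table (field names are assumed)."""
    tf: str            # the query transcription factor
    partner: str       # a factor that interacts with `tf`
    tissue: str        # tissue in which the interaction occurs
    neg_log_p: float   # significance of the interaction, stored as -log(p)


def partners_of(rows, tf, tissue=None):
    """Return the rows for `tf`, optionally limited to one tissue,
    ordered from most to least significant."""
    hits = [
        r for r in rows
        if r.tf == tf and (tissue is None or r.tissue == tissue)
    ]
    return sorted(hits, key=lambda r: r.neg_log_p, reverse=True)


# Placeholder rows for illustration only; these are not TiGER records.
example_rows = [
    TFPartner("TF_A", "TF_B", "tissue_1", 2.0),
    TFPartner("TF_A", "TF_C", "tissue_1", 5.0),
    TFPartner("TF_A", "TF_D", "tissue_2", 3.0),
]

for row in partners_of(example_rows, "TF_A", tissue="tissue_1"):
    print(row.partner, row.tissue, row.neg_log_p)
```

Sorting by the stored -log(p) value in descending order puts the most significant interactions first, since a larger -log(p) corresponds to a smaller p-value.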

3) The tissue view - contains three (3) major entities:

Data Quality --

EST Enrichment - The manufacturer has included P-values for the enrichment scores in the EST download file, where the first column is the tissue name, the second column is the enrichment score, and the third column is the [-log10(P)] value.

The manufacturer defines a gene as tissue-specific if it satisfies two (2) criteria: the enrichment score is greater than 5, and the P-value is smaller than 10^-3.5 (equivalently, the -log10(P) value is greater than 3.5).
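The two criteria can be applied directly to the three columns described above. The following is a minimal sketch, not the manufacturer's code: it assumes the columns are tab-delimited and that any line whose second and third fields are not numeric (such as a header) can be skipped. The function name and the sample values are hypothetical, and the actual download file may carry additional structure, such as gene identifiers, that is not modeled here.

```python
import csv
import io

ENRICHMENT_CUTOFF = 5.0     # enrichment score must be greater than 5
NEG_LOG10_P_CUTOFF = 3.5    # P < 10^-3.5 is the same as -log10(P) > 3.5


def tissue_specific_rows(handle, delimiter="\t"):
    """Yield (tissue, enrichment, neg_log10_p) for rows passing both criteria.

    Assumes column 1 is the tissue name, column 2 the enrichment score,
    and column 3 the -log10(P) value.
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 3:
            continue  # blank or incomplete line
        try:
            enrichment = float(row[1])
            neg_log10_p = float(row[2])
        except ValueError:
            continue  # header or malformed line
        if enrichment > ENRICHMENT_CUTOFF and neg_log10_p > NEG_LOG10_P_CUTOFF:
            yield row[0], enrichment, neg_log10_p


# Placeholder values for illustration only; these are not TiGER data.
sample = "tissue_1\t6.2\t4.0\ntissue_2\t7.0\t3.0\ntissue_3\t4.9\t9.0\n"
for hit in tissue_specific_rows(io.StringIO(sample)):
    print(hit)
```

Because the file stores -log10(P) rather than P itself, the P-value test becomes a simple "greater than 3.5" comparison on the third column.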

TF Interaction - The manufacturer has included P-values for the TF interactions in a summary table and a download file. Because known tissue-specific interactions are scarce, the manufacturer evaluates the TF interaction results using known interactions as a positive control.

More than 40% of the known interactions are recovered, an 84-fold enrichment over the number expected.

CRM Identification - Using known regulatory regions as a positive control, the manufacturer reports a sensitivity of 12% and an enrichment of 10.
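For readers unfamiliar with these measures, the sketch below shows one common way such figures are computed. It assumes that sensitivity means the fraction of positive-control items that are recovered, and that fold enrichment means the observed recovery rate divided by the rate expected by chance; the source does not spell out the manufacturer's exact procedure, so both definitions and both function names are assumptions, and the counts used are placeholders.

```python
def sensitivity(recovered, known_total):
    """Fraction of known (positive-control) items that were recovered."""
    if known_total <= 0:
        raise ValueError("known_total must be positive")
    return recovered / known_total


def fold_enrichment(observed_rate, expected_rate):
    """Ratio of the observed recovery rate to the rate expected by chance."""
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    return observed_rate / expected_rate


# Placeholder counts for illustration only; these are not TiGER figures.
observed = sensitivity(recovered=12, known_total=100)
print(observed)
print(fold_enrichment(observed, expected_rate=0.012))
```

Under these assumed definitions, a low sensitivity can still coincide with a high enrichment when the rate expected by chance is very small.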

Data Up-To-Datedness --

The manufacturer obtained DNA sequences and annotations (such as RefSeq) from the Human May 2004 (hg17) assembly via the UCSC Genome Browser (see G6G Abstract Number 20197). Conservation score data was also downloaded from the multiple alignments of eight (8) vertebrate genomes with Human (hg17) via the UCSC Genome Browser.

The EST database was downloaded from the National Center for Biotechnology Information (NCBI) website in 2005.

As more experimental data accumulates related to the nature of TF-DNA interactions, the manufacturer plans to further develop their predictions on tissue-specific TF interactions.

The manufacturer also plans to extend their work on CRM detection by relating regulatory elements with temporal (e.g., development) and spatial (e.g., cell types) attributes.

As new predictions on tissue-specific gene regulation accumulate, the TiGER database will need to be further expanded and modified. The manufacturer will update the content of the database on a regular basis.

Distribution --

TiGER has been constructed for free access and use. The downloadable data formats include standard .txt text files and .png images.

