PhenoPred
Category Genomics>Genetic Data Analysis/Tools
Abstract PhenoPred is a web-based tool designed to detect novel gene-disease associations in humans.
It is based on known gene-disease associations, protein-protein interaction data, protein functional annotation at a molecular level and protein sequence data.
Machine learning principles are used to integrate heterogeneous data sources.
PhenoPred can be used to prioritize genes based on their likelihood to be associated with a given disease or to prioritize diseases for a given query gene.
PhenoPred uses HUGO Gene Nomenclature for gene names and the Disease Ontology (DO) for disease names. DO is based on the International Classification of Diseases (ICD-9), maintained by the World Health Organization (WHO).
PhenoPred method --
The PhenoPred method is supervised:
First, the manufacturer mapped each gene/protein onto the spaces of disease and functional terms based on the distance to all annotated proteins in the protein interaction network.
The manufacturer also encoded sequence, function, physicochemical, and predicted structural properties, such as secondary structure and flexibility.
The manufacturer then trained support vector machines (SVMs) to detect gene-disease associations for a number of terms in the Disease Ontology and provided evidence that, despite the noise and incompleteness of experimental data and the unfinished ontology of diseases, identification of candidate genes can be successful even when predictions are made for a large number of candidate disease terms simultaneously.
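The network-distance mapping in the first step can be illustrated with a minimal sketch. The 1 / (1 + d) weighting, the function names and the toy network are all assumptions made for illustration. The actual PhenoPred encoding is not specified in this abstract.

```python
from collections import deque


def shortest_path_lengths(network, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in network.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def term_features(network, annotations, gene):
    """Map a gene to one score per term from its distances to annotated proteins.

    The 1 / (1 + d) weighting is an illustrative assumption, not PhenoPred's
    actual encoding; unreachable proteins contribute nothing.
    """
    dist = shortest_path_lengths(network, gene)
    features = {}
    for term, proteins in annotations.items():
        features[term] = sum(
            1.0 / (1 + dist[p]) for p in proteins if p != gene and p in dist
        )
    return features


# Hypothetical toy interaction network (adjacency sets) and term annotations.
toy_network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": set(),
}
toy_annotations = {"term_x": {"A", "C"}, "term_y": {"E"}}
toy_features = term_features(toy_network, toy_annotations, "B")
```

In a supervised setup of the kind described, a vector such as toy_features would be one part of the input given to a per-term classifier, alongside the sequence-derived properties.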
PhenoPred web site usage --
The PhenoPred web site provides two (2) basic services:
1) For a given Disease Ontology (DO) term, PhenoPred will rank genes that are most likely to be associated with this term; and
2) For a given query gene, PhenoPred will rank the DO terms that are most likely to be associated with that gene.
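The two services are the same ranking applied in opposite directions over a table of gene-term scores. The sketch below assumes such a table is already available as a nested dictionary. The function names, gene names, term labels and scores are hypothetical and are not taken from the PhenoPred site.

```python
def rank_genes_for_term(scores, term, top=None):
    """Service 1: genes ordered by their score for one term, best first."""
    ranked = sorted(
        (
            (gene, per_term[term])
            for gene, per_term in scores.items()
            if term in per_term
        ),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top]  # top=None keeps the full ranking


def rank_terms_for_gene(scores, gene, top=None):
    """Service 2: terms ordered by their score for one query gene, best first."""
    ranked = sorted(
        scores.get(gene, {}).items(), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:top]


# Hypothetical scores: scores[gene][term] is a predicted association strength.
toy_scores = {
    "GENE1": {"TERM_A": 0.9, "TERM_B": 0.2},
    "GENE2": {"TERM_A": 0.4, "TERM_B": 0.7},
    "GENE3": {"TERM_A": 0.6},
}

genes_for_a = rank_genes_for_term(toy_scores, "TERM_A")
terms_for_gene2 = rank_terms_for_gene(toy_scores, "GENE2")
```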
PhenoPred works with 422 DO terms, which were selected using the following rules:
- a) Each DO term has to be associated with at least 10 genes;
- b) If two DO terms are associated with the same set of genes and one term is a descendant of the other, the node closer to the root of the DO is eliminated; and
- c) Nodes directly descending from the root node are eliminated on the basis that they are too general (e.g. DOID:7 is "disease of anatomical entity").
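The three selection rules can be expressed as a small filter over a term-to-genes table and a child-to-parents table. This is a sketch only: the data layout, function names and toy ontology are assumptions, and excluding the root term itself is an added assumption not stated in the rules.

```python
def ancestors(term, parents):
    """All terms reachable by following parent links upward from term."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def select_terms(term_genes, parents, root, min_genes=10):
    """Apply rules (a), (b) and (c) and return the surviving terms."""
    # Rule (b): an ancestor with exactly the same gene set as one of its
    # descendants is redundant; the term closer to the root is dropped.
    redundant = set()
    for term, genes in term_genes.items():
        for anc in ancestors(term, parents):
            if anc in term_genes and set(term_genes[anc]) == set(genes):
                redundant.add(anc)

    selected = set()
    for term, genes in term_genes.items():
        if len(set(genes)) < min_genes:  # rule (a)
            continue
        if term in redundant:  # rule (b)
            continue
        if root in parents.get(term, ()):  # rule (c)
            continue
        if term == root:  # assumption: the root itself is never a query term
            continue
        selected.add(term)
    return selected


# Hypothetical toy ontology: ROOT > GENERAL > MID > {LEAF, OTHER, SMALL}.
toy_parents = {
    "GENERAL": {"ROOT"},
    "MID": {"GENERAL"},
    "LEAF": {"MID"},
    "OTHER": {"MID"},
    "SMALL": {"MID"},
}
twelve_genes = {f"G{i}" for i in range(12)}
toy_term_genes = {
    "GENERAL": twelve_genes,
    "MID": twelve_genes,
    "LEAF": twelve_genes,
    "OTHER": {f"G{i}" for i in range(10)},
    "SMALL": {"G0", "G1", "G2"},
}
toy_selected = select_terms(toy_term_genes, toy_parents, root="ROOT")
```

In the toy data, GENERAL is removed as a direct child of the root, MID is removed because LEAF carries the same gene set, and SMALL has too few genes.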
PhenoPred Datasets --
Diseases and genes of known genetic involvement were extracted from the Online Mendelian Inheritance in Man (OMIM) database, Swiss-Prot, and the Human Protein Reference Database (HPRD).
The protein-protein interaction (PPI) map was assembled by combining the physical interaction data from HPRD, the Online Predicted Human Interaction Database (OPHID), and the studies by Rual et al. (2005), "Towards a proteome-scale map of the human protein-protein interaction network", Nature, 437, 1173-1178, and Stelzl et al. (2005), "A human protein-protein interaction network: a resource for annotating the proteome", Cell, 122, 957-968.
Collected disease names and associated genes were manually integrated into the DO.
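Combining interaction data from several sources, as described for the PPI map above, amounts to taking the union of their edge lists. The sketch below assumes each source has already been reduced to pairs of protein identifiers. The function name, the identifiers and the choice to drop self-interactions are illustrative assumptions, not details of how the PhenoPred map was built.

```python
def merge_interactions(*sources):
    """Union several edge lists into one undirected adjacency map.

    Direction and duplicate edges are ignored. Dropping self-interactions
    is an assumption made for this sketch.
    """
    network = {}
    for edges in sources:
        for a, b in edges:
            if a == b:
                continue
            network.setdefault(a, set()).add(b)
            network.setdefault(b, set()).add(a)
    return network


# Hypothetical edge lists standing in for two interaction sources.
source_one = [("P1", "P2"), ("P2", "P3")]
source_two = [("P2", "P1"), ("P3", "P4"), ("P4", "P4")]

toy_map = merge_interactions(source_one, source_two)
# Each undirected edge appears twice in the adjacency map, once per endpoint.
toy_edge_count = sum(len(neighbors) for neighbors in toy_map.values()) // 2
```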
System Requirements
Web-based.
Manufacturer
PhenoPred was developed as a collaborative effort between the Radivojac and Mooney labs at Indiana University in order to facilitate the identification of disease-associated genes and the understanding of the molecular basis of disease.
- Sean D. Mooney
- Center for Computational Biology and Bioinformatics
- Indiana University School of Medicine
- 410 W. 10th Street, Suite 5000
- Indianapolis, IN 46202
- And
- Predrag Radivojac
- School of Informatics
- Indiana University
- 901 E. 10th Street
- Bloomington, IN 47408
Manufacturer Web Site PhenoPred
Price Contact manufacturer.
G6G Abstract Number 20417
G6G Manufacturer Number 104046