PathoLogic Pathway-Prediction
Category Cross-Omics>Pathway Knowledge Bases/Databases/Tools
Abstract The PathoLogic Pathway-Prediction program supports automated creation of a Pathway/Genome Database (PGDB) and prediction of the metabolic pathway complement of an organism.
Product features/capabilities include:
1) Most of the databases (DBs) in the BioCyc collection (see G6G Abstract Number 20230) were created by PathoLogic. PathoLogic can predict the 'metabolic pathways' of an organism from its genome.
In that sense, the 'metabolic pathways' in the PathoLogic-based databases were derived computationally, in contrast to the literature-derived pathways in the EcoCyc (see G6G Abstract Number 20231) and MetaCyc (see G6G Abstract Number 20232) databases.
PathoLogic also includes an 'operon predictor' and a predictor of which genes code for the missing enzymes in predicted metabolic pathways.
The input required by PathoLogic is an 'annotated genome' for the organism, for example in the form of a GenBank entry. The output produced by PathoLogic is a new 'pathway/genome database' for the organism.
For example, the AgroCyc pathway/genome database for the bacterium Agrobacterium tumefaciens was created by SRI using the PathoLogic program.
In general, most of the metabolic pathways in a computationally-derived database were predicted computationally, but in some cases, pathways that have been observed experimentally in the organism were added manually to the database.
Additional unknown pathways are likely to be present in each organism. In some cases, there has been additional manual curation of the metabolic pathways in PathoLogic-derived databases.
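The idea of predicting pathways from an annotated genome can be illustrated with a small sketch. This is not PathoLogic's actual algorithm. It assumes a simplified rule: a pathway is predicted when at least half of its reactions have an annotated enzyme, and the reactions without one are reported as missing enzymes. The reference pathway table, the sample annotation, the function names, and the 0.5 threshold are all hypothetical; the only format detail relied on is the GenBank `/EC_number` feature qualifier.

```python
import re

# Hypothetical toy reference: pathway name -> EC numbers of its reactions.
# A real reference database is far larger; these fragments are illustrative only.
REFERENCE_PATHWAYS = {
    "glycolysis (fragment)": ["2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"],
    "TCA cycle (fragment)": ["2.3.3.1", "4.2.1.3", "1.1.1.42"],
}


def extract_ec_numbers(genbank_text):
    """Collect EC numbers from /EC_number qualifiers in GenBank-style text."""
    return set(re.findall(r'/EC_number="([^"]+)"', genbank_text))


def predict_pathways(ec_numbers, reference, min_fraction=0.5):
    """Return {pathway: (fraction of reactions covered, missing EC numbers)}.

    Assumed rule (not PathoLogic's): keep a pathway when the fraction of its
    reactions with an annotated enzyme is at least min_fraction.
    """
    predictions = {}
    for name, reactions in reference.items():
        missing = [ec for ec in reactions if ec not in ec_numbers]
        fraction = (len(reactions) - len(missing)) / len(reactions)
        if fraction >= min_fraction:
            predictions[name] = (fraction, missing)
    return predictions


# Made-up annotation excerpt in GenBank feature-table style.
SAMPLE_ANNOTATION = '''
     CDS             1..1650
                     /gene="pgi"
                     /EC_number="5.3.1.9"
     CDS             1700..2662
                     /gene="pfkA"
                     /EC_number="2.7.1.11"
     CDS             2700..3779
                     /gene="fbaA"
                     /EC_number="4.1.2.13"
     CDS             3800..5083
                     /gene="gltA"
                     /EC_number="2.3.3.1"
'''

predicted = predict_pathways(extract_ec_numbers(SAMPLE_ANNOTATION), REFERENCE_PATHWAYS)
for name, (fraction, missing) in predicted.items():
    print(f"{name}: {fraction:.0%} of reactions covered; missing enzymes: {missing}")
```

In this made-up example the glycolysis fragment is predicted with one missing enzyme, while the TCA cycle fragment falls below the threshold and is not predicted. A predicted pathway with missing enzymes is the kind of case a missing-enzyme predictor would then examine.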
2) Adoption of BioCyc Databases -- As genomic information becomes available for an ever-growing number of organisms, it becomes essential to have up-to-date knowledge resources for every one of them.
For example, how can an experimentalist hope to accurately interpret 'gene-expression microarray' results without having all available knowledge about the functions of each gene product in the organism at their fingertips?
Yet, providing a current and complete knowledge resource about an organism is a time-consuming task that involves combing the 'biomedical literature' for all available information about the organism's genes and biochemical network, and entering that information into an organism-specific DB.
No one group can curate all the world's genomes in this fashion. Therefore, we see it as essential that experts on the biology of specific organisms begin curation of 'organism-specific databases' describing the genome and biochemical networks of those organisms.
SRI and the European Bioinformatics Institute (EBI) have taken a step toward this vision by creating a large set of organism-specific databases, and making those databases available for adoption by biologists for ongoing curation and updating.
All of the Tier 3 BioCyc PGDBs, and some of the Tier 2 PGDBs, are available for adoption.
The DBs are available under an open license agreement, meaning that they are freely available to all, they may be modified, and they may be freely redistributed.
SRI's Pathway Tools software system (see G6G Abstract Number 20236) provides a variety of tools to assist in PGDB adoption and curation. An adopted BioCyc PGDB can be immediately opened and queried using the desktop version of Pathway Tools.
Pathway Tools provides a package of graphical editing tools that allow biologists to create and update database entities such as genes, proteins, metabolic pathways, metabolites, operons, and transcriptional regulatory interactions.
Scientists can also use Pathway Tools to publish their PGDBs on their own Web site, in a manner analogous to the BioCyc Web site, which is itself powered by Pathway Tools.
3) Suggested steps for adoption of PGDBs are as follows:
- a) Initiate the adoption process by contacting biocyc-support@ai.sri.com.
- b) Begin to refine the adopted PGDB by comparing its contents against known experimental knowledge for the organism. Update the PGDB to correct errors such as false-positive pathway predictions, and to add experimentally known pathways that are not in the PGDB. Update gene names and protein functions to reflect recent discoveries.
- c) Curation of PGDBs is a very big job. Enlist colleagues who are specialists in different aspects of the organism’s biology to help. A group of collaborating curators can all run Pathway Tools on separate computers to update a PGDB stored in a commonly accessed relational database manager such as MySQL.
- d) Dedicated funding for a PGDB is important to provide significant resources over an extended period. Many government funding agencies are supporting organism-specific database curation projects.
- e) Publish the PGDB on your web site. Using Pathway Tools you can establish a web site that looks just like the BioCyc.org site. Or you can ask SRI to include your PGDB within BioCyc itself.
To facilitate this process, SRI will ask you to deposit your PGDB in the database registry.
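The comparison described in step b) can be pictured as a set comparison between the pathways predicted in the adopted PGDB and those reported experimentally for the organism. The sketch below is only an illustration under that assumption: the function name, the category labels, and the pathway names are hypothetical, and it does not use the Pathway Tools API. A predicted pathway with no experimental support is only a candidate for review, not a confirmed false positive.

```python
def review_pathways(predicted, experimentally_known):
    """Sort pathway names into curation worklists by simple set comparison."""
    predicted, known = set(predicted), set(experimentally_known)
    return {
        # Predicted and supported by experimental evidence.
        "confirmed": sorted(predicted & known),
        # Predicted but unsupported: review as possible false positives.
        "check_for_false_positive": sorted(predicted - known),
        # Known experimentally but absent from the PGDB: add by hand.
        "add_manually": sorted(known - predicted),
    }


# Hypothetical pathway names, for illustration only.
worklist = review_pathways(
    predicted=["glycolysis", "TCA cycle", "chlorophyll biosynthesis"],
    experimentally_known=["glycolysis", "TCA cycle", "opine catabolism"],
)
for category, names in worklist.items():
    print(f"{category}: {names}")
```

Splitting the result into separate worklists also makes it easier to divide the review among collaborating curators, as suggested in step c).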
System Requirements
See G6G Abstract Number 20230 or BioCyc
Manufacturer
- Bioinformatics Research Group
- Dr Peter D Karp
- Director, Bioinformatics Research Group
- Artificial Intelligence Center
- SRI International
- 333 Ravenswood Avenue
- Menlo Park, CA 94025-3493
- Phone: (650) 859-2000
- ptools-info@ai.sri.com
- biocyc-support@ai.sri.com
Manufacturer Web Site PathoLogic Pathway-Prediction
Price Free for academic institutions. Fee required for commercial use. Licence required.
G6G Abstract Number 20235
G6G Manufacturer Number 102506