Biosof Protein Structure Products & Tools

Category Proteomics>Protein Structure/Modeling Systems/Tools

Abstract A) PredictProtein is one of the most widely used internet servers for protein structure prediction.

PredictProtein helps you get the most out of available lab data by extracting the more reliable predictions from high-throughput methods and extrapolating from high-accuracy results.

The system requires only a protein sequence as input. From that sequence, PredictProtein analyzes the protein’s properties, predicting how it folds and how it carries out its function in the cell.

PredictProtein returns predictions of secondary structure, solvent accessibility, transmembrane helices, multiple sequence alignments, sequence motifs, low-complexity regions, nuclear localization signals, regions lacking regular structure and globular regions, coiled-coil regions, structural switch regions, disulfide bonds, subcellular localization, and functional annotations.

Additional analysis includes fold recognition by prediction-based threading, domain assignments, and predictions of transmembrane strands and inter-residue contacts.

With PredictProtein you can:

1) Predict the most and least useful gene product targets using computational tools;
2) Discover new leads through improved interpretation of existing data;
3) Decrease expense and improve efficiency over traditional research methods; and
4) Gauge meaningful annotations from high-throughput analysis.

PredictProtein Visualization Feature(s) -- PredictProtein features data visualizations using the “jBio” interactive visualization library. jBio is a bioinformatics library that provides visualization and user interactions in a web browser.

jBio Features/capabilities: Client-side operation allowing interaction with the user and full control of the interface; specialized visualization (‘glyphs’) for common features such as secondary structure elements and BLAST results; default visualization for any biological sequence feature; detailed feature view glyphs; a sequence cursor, allowing you to focus on a region of interest; methods to facilitate binding user interaction to actions; sequence input-output via BioPerl; zooming of display area; visual effects thanks to jQuery; a completely modular, object-oriented design with multiple inheritance and polymorphism; and detailed documentation in multiple formats that can be automatically extracted from source.

PredictProtein Data Storage and Caching -- PredictProtein caches the prediction for each new query sequence it processes for quick and easy retrieval. Currently the PredictProtein cache contains over 3,584,270 annotated proteins.
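The caching idea described above can be illustrated with a minimal sketch. It is not PredictProtein's implementation: the class name `PredictionCache`, the SHA-256 key, the in-memory dictionary, and the stand-in `toy_predictor` are all assumptions made for illustration.

```python
import hashlib


class PredictionCache:
    """Hypothetical in-memory cache keyed by a digest of the normalized sequence."""

    def __init__(self, predictor):
        self._predictor = predictor  # assumed callable: sequence -> result
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def normalize(sequence):
        # Drop all whitespace and upper-case, so that formatting
        # differences do not produce distinct cache entries.
        return "".join(sequence.split()).upper()

    def predict(self, sequence):
        normalized = self.normalize(sequence)
        key = hashlib.sha256(normalized.encode("ascii")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._predictor(normalized)
        self._store[key] = result
        return result


def toy_predictor(sequence):
    # Stand-in for an expensive prediction run; reports only the length.
    return {"length": len(sequence)}


cache = PredictionCache(toy_predictor)
first = cache.predict("mkt ayi\nAKQR")   # computed and stored
second = cache.predict("MKTAYIAKQR")     # same sequence, served from the cache
```

Keying on a digest of the normalized sequence means a repeated query is recognized regardless of case or line wrapping; a production system would presumably use persistent storage rather than a dictionary.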

PredictProtein Job Management -- PredictProtein users can store their jobs for later reference.

B) SNAP -- Screening for Non-Acceptable Polymorphisms (SNAP) -- SNAP is a neural-network (NN) based method that uses in silico derived protein information (e.g. secondary structure, conservation, solvent accessibility, etc.) in order to make predictions regarding functionality of mutated proteins.

The network takes protein sequences and lists of mutants as input, returning a score for each substitution. These scores can then be translated into binary predictions of effect (present/absent) and reliability indices (RI). SNAP returns its results to the user via E-mail.
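The translation from a raw score to a binary prediction and a reliability index can be sketched as follows. The score range of -1 to 1, the threshold of 0, and the 0-9 reliability scale are assumptions chosen for illustration; the abstract does not state SNAP's actual scale or formula, and the mutant names and scores are made up.

```python
from typing import NamedTuple


class Call(NamedTuple):
    mutant: str       # e.g. "A23V": wild-type residue, position, substitution
    effect: bool      # True = effect present, False = effect absent
    reliability: int  # 0 (least reliable) to 9 (most reliable)


def interpret(mutant, score, threshold=0.0):
    """Turn one raw score into a binary call plus a reliability index.

    Assumes scores lie in [-1, 1]; this range is hypothetical.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score out of assumed range: {score}")
    effect = score > threshold
    # Distance from the threshold, mapped to an integer and capped at 9.
    reliability = min(9, int(abs(score - threshold) * 10))
    return Call(mutant, effect, reliability)


# Hypothetical network outputs for three substitutions.
scores = {"A23V": 0.82, "G45D": -0.31, "L7P": 0.04}
calls = [interpret(mutant, score) for mutant, score in scores.items()]
```

Under these assumptions a score close to the threshold, such as the one for `L7P`, yields a call with reliability 0, which signals that the binary prediction should be treated with caution.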

C) LOCtree -- LOCtree is a prediction method for the subcellular localization of proteins.

Prediction algorithm -- LOCtree is a novel system of Support Vector Machines (SVMs) that predict the subcellular localization of proteins, and DNA-binding propensity for nuclear proteins, by incorporating a ‘Hierarchical Ontology’ of localization classes modeled onto biological processing pathways.

Biological similarities are incorporated from the description of cellular components provided by the Gene Ontology consortium (GO). GO definitions have been simplified and tailored to the problem of protein sorting. Technically the ontology has been implemented using a decision tree with SVMs as the nodes.
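A decision tree with a classifier at each node can be sketched as below. This is an illustration of the architecture only: the class names, the feature names (`signal_peptide`, `basic_fraction`), and the thresholds are hypothetical, and simple rule functions stand in for the trained SVMs that LOCtree uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]


@dataclass
class Node:
    name: str
    # Classifier at this node: returns the name of the child to descend into.
    # In LOCtree this role is played by an SVM; here it is a stub rule.
    decide: Optional[Callable[[Features], str]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


def classify(root: Node, features: Features) -> Tuple[str, List[str]]:
    """Walk from the root to a leaf, recording each intermediate class."""
    node = root
    path = [node.name]
    while node.children:
        node = node.children[node.decide(features)]
        path.append(node.name)
    return node.name, path


# Hypothetical two-level hierarchy; not LOCtree's actual class layout.
tree = Node(
    "root",
    decide=lambda f: "secretory" if f["signal_peptide"] > 0.5 else "intracellular",
    children={
        "secretory": Node("secretory"),
        "intracellular": Node(
            "intracellular",
            decide=lambda f: "nuclear" if f["basic_fraction"] > 0.2 else "cytoplasmic",
            children={"nuclear": Node("nuclear"), "cytoplasmic": Node("cytoplasmic")},
        ),
    },
)

label, path = classify(tree, {"signal_peptide": 0.1, "basic_fraction": 0.3})
```

Because each node makes only one decision, the returned path exposes the intermediate classes (here `intracellular`) as well as the final leaf.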

LOCtree was extremely successful at learning evolutionary similarities among subcellular localization classes and was significantly more accurate than other traditional networks at predicting subcellular localization.

Whenever available, LOCtree also reports predictions based on the following:

1) Nuclear localization signals found by PredictNLS (see below…);
2) Localization inferred using PROSITE motifs and Pfam domains found in the protein; and
3) SWISS-PROT keywords associated with a protein.

Note: Localization is inferred in the last two (2) cases using the entropy-based LOCkey algorithm.

Comprehensive prediction of localization -- LOCtree can predict the subcellular localization and DNA-binding propensity of non-membrane proteins in non-plant and plant eukaryotes as well as prokaryotes.

LOCtree classifies eukaryotic animal proteins into one of five (5) subcellular classes, while plant proteins are classified into one of six (6) classes and prokaryotic proteins are classified into one of three (3) classes.

The novel feature of using a hierarchical architecture is the ability to make intermediate localization class predictions at much higher accuracy. Another source of improvement is the use of ‘noisy’ training data.

‘Noisy’ predictions from LOCkey (SWISS-PROT keyword-based annotations) and LOChom (annotations using sequence homology) are used to train the hierarchical SVMs.

D) PROFphd -- PROFphd is a world-renowned suite of protein structure and function analysis tools. The secondary structure and solvent accessibility predictor utilizes a specialized neural network (NN) and provides up to 75% accuracy for homology-based profiles.

E) Profisis (ISIS) -- Prediction of residues involved in external protein-protein and protein-DNA interactions -- Profisis (ISIS) is a machine learning-based method that identifies interacting residues from sequence alone.

Although the method was developed using transient protein-protein interfaces from complexes of experimentally known 3D structures, it never explicitly uses 3D information.

Instead, the manufacturer combines predicted structural features with evolutionary information. The strongest predictions of the method reached over 90% accuracy in a cross-validation experiment.

The manufacturer’s results suggest that despite the significant diversity in the nature of protein-protein interactions, they all share common basic principles and that these principles are identifiable from sequence alone.

F) PredictNLS -- PredictNLS is an automated tool for the analysis and determination of Nuclear Localization Signals (NLS) -- You submit a protein sequence or a potential NLS.

PredictNLS predicts whether your protein is nuclear, or checks whether your potential NLS is found in the manufacturer’s database.

The program also compiles statistics on the number of nuclear/non-nuclear proteins in which your potential NLS is found. Finally, proteins with similar NLS motifs are reported, and the experimental papers describing the particular NLS are cited.

If no NLS is found, you can predict the subcellular localization of the protein using LOCtree (see above).
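The two operations described for PredictNLS, locating a motif in a sequence and counting how often it occurs in nuclear versus non-nuclear proteins, can be sketched with regular expressions. The pattern `K[KR].[KR]` is the textbook monopartite consensus, used here only as an example and not taken from the manufacturer's database; the four short sequences and their labels are invented toy data, apart from the embedded SV40 large T antigen signal PKKKRKV.

```python
import re

# Illustrative pattern only (classical monopartite consensus K-K/R-X-K/R).
PATTERN = r"K[KR].[KR]"


def find_motif(sequence, pattern):
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence.upper())]


def motif_statistics(pattern, proteins):
    """Count labelled proteins that contain the motif at least once.

    `proteins` is a list of (sequence, is_nuclear) pairs.
    """
    compiled = re.compile(pattern)
    counts = {"nuclear": 0, "non_nuclear": 0}
    for sequence, is_nuclear in proteins:
        if compiled.search(sequence.upper()):
            counts["nuclear" if is_nuclear else "non_nuclear"] += 1
    return counts


# Toy labelled set; sequences and labels are made up for the example.
toy_proteins = [
    ("MAPKKKRKVED", True),   # contains the SV40 large T antigen NLS
    ("MSTKRAKLLEG", True),
    ("MGLSDGEWQLV", False),
    ("MAKKAKGGSTT", False),  # motif present although labelled non-nuclear
]

hits = find_motif("MAPKKKRKVED", PATTERN)
stats = motif_statistics(PATTERN, toy_proteins)
```

A motif that also turns up in non-nuclear proteins, as in the last toy entry, is a weaker indicator of nuclear localization, which is why the nuclear/non-nuclear counts are informative.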

G) METAdisorder (MD) -- METAdisorder is a novel META-Disorder prediction method that combines various sources of information, predominantly obtained from orthogonal prediction methods, to significantly improve performance over its constituents.

In sustained cross-validation, MD not only outperforms its constituent methods, but also compares favorably to other state-of-the-art prediction methods in a variety of tests.
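The general idea of a meta-predictor, merging per-residue outputs from several independent methods into one call, can be sketched as below. The abstract does not say how MD combines its inputs, so plain averaging is used here as a stand-in; the method names, scores, and the 0.5 threshold are hypothetical.

```python
def combine(per_method_scores, threshold=0.5):
    """Average per-residue disorder scores across methods.

    `per_method_scores` maps a method name to a list of scores in [0, 1],
    one per residue. Returns (mean scores, disordered flags).
    """
    lengths = {len(scores) for scores in per_method_scores.values()}
    if len(lengths) != 1:
        raise ValueError("all methods must score the same number of residues")
    n_methods = len(per_method_scores)
    # zip(*...) regroups the scores residue by residue.
    means = [sum(column) / n_methods for column in zip(*per_method_scores.values())]
    flags = [mean >= threshold for mean in means]
    return means, flags


# Hypothetical outputs of three constituent predictors for a 4-residue stretch.
inputs = {
    "method_a": [1.0, 0.75, 0.25, 0.0],
    "method_b": [0.5, 0.75, 0.5, 0.25],
    "method_c": [0.75, 0.0, 0.0, 0.0],
}

means, flags = combine(inputs)
```

In this toy case the second residue is flagged as disordered even though one method scores it 0.0, showing how agreement between the other methods can outweigh a single dissenting predictor.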

H) NLProt -- NLProt is a tool for finding protein names in natural-language text. It is based on Support Vector Machines (SVMs), which are trained on contextual features of named entities in scientific language.

Additionally, simple filtering rules and a protein-name dictionary are used to increase performance.

NLProt reached a precision (accuracy) of 70% at a recall (coverage) of 85% after running it on the 166 most recent abstracts of EMBL and Cell (Nov/Dec 2003). When run from the command line, NLProt takes about 1 second per abstract to finish.
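For reference, precision is the fraction of reported names that are correct, and recall is the fraction of true names that were reported. The short sketch below computes both on an invented example; the abstract indices and the tagged names are toy data, not NLProt output.

```python
def precision_recall(predicted, gold):
    """Compute precision and recall for two sets of (abstract_id, name) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy annotations: protein names a curator marked in two abstracts.
gold = {(0, "p53"), (0, "MDM2"), (1, "BRCA1"), (1, "RAD51")}

# Toy tagger output: three correct names, two false positives, one miss.
predicted = {(0, "p53"), (0, "MDM2"), (1, "BRCA1"), (0, "DNA"), (1, "PCR")}

precision, recall = precision_recall(predicted, gold)
```

Here three of the five reported names are correct (precision 0.6) and three of the four true names are found (recall 0.75).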

I) LocDB -- LocDB is an expert-curated database that collects experimental annotations for the subcellular localization of proteins in Human (Homo sapiens) and Weed (Arabidopsis thaliana).

This database also contains predictions of subcellular localization from a variety of state-of-the-art prediction methods for all proteins with experimental information.

J) NLSdb -- NLSdb is a database of nuclear localization signals (NLSs) and of nuclear proteins targeted to the nucleus by NLS motifs.

System Requirements

Contact manufacturer.

Manufacturer

Biosof LLC
138 W. 25th Street, 10th floor
New York, NY 10001 USA
Tel: +1 (917) 378-5026
E-mail: main@bio-sof.com

Manufacturer Web Site Biosof Protein Structure Products & Tools

Price Contact manufacturer.

G6G Abstract Number 20737

G6G Manufacturer Number 104323

The G6G Directory of Omics and Intelligent Software
