ONDEX Suite

Category Cross-Omics>Data/Text Mining Systems/Tools

Abstract ONDEX enables data from diverse biological data sets to be linked, integrated and visualized through graph analysis techniques.

ONDEX can be used in a number of important application areas such as transcription analysis, protein interaction analysis, data mining and text mining.

Data may be imported through parsers for a number of public-domain and other databases, such as TRANSFAC (see G6G Abstract Number 20121), TRANSPATH (see G6G Abstract Number 20121), ChEBI, Gene Ontology (GO), KEGG, DRASTIC, Enzyme Nomenclature, Pathway Tools Pathway/Genome Databases (PGDBs) (see G6G Abstract Number 20236), Plant Ontology, and Medical Subject Headings Vocabulary (MeSH), together with associated sequence data where available.

The imported data may be utilized with the data mining features and specific applications or extensions of the ONDEX framework.

The ONDEX Frontend is an example of an ONDEX application package. It displays the data as a set of linked graphs in which each node represents a data object and each edge (a link between nodes) represents a relationship between the two nodes it connects.

Filters are provided through the Frontend to select specific classes of data and relations between them to a selectable depth of linkage.
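The idea of selecting data "to a selectable depth of linkage" can be illustrated with a breadth-first search that stops after a fixed number of links. This is only a minimal sketch of the general technique, not ONDEX code: the function name `neighbourhood`, the edge-list representation, and the example identifiers are all assumptions made for illustration.

```python
from collections import deque


def neighbourhood(edges, start, max_depth):
    """Return every node within max_depth links of start.

    Hypothetical helper: edges is a list of (node, node) pairs and
    is treated as undirected.
    """
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbour in adjacency.get(node, ()):
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return set(depth)


# Made-up example data: a chain of four linked items.
edges = [("geneA", "proteinA"), ("proteinA", "enzyme1"), ("enzyme1", "pathwayX")]
print(sorted(neighbourhood(edges, "geneA", 2)))
```

With a depth of 2, the item three links away from the starting node is left out of the selection.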

In addition, an advanced filter is available to import microarray expression-level data, so that the relations between the different genes being expressed can be analyzed globally.

ONDEX is able to handle very large graphs. It can represent many hundreds of thousands of data items and examine them for potential relationships by identifying equivalent concepts (data types).

It can also filter out unattached nodes (ones with no edges or relations) for the problem under consideration, and it provides labeled, organized graphs that may also be manipulated through the user interface.
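Filtering out unattached nodes amounts to keeping only the nodes that appear in at least one edge. The sketch below shows that step on a plain node list and edge list; `drop_unattached` and the sample identifiers are hypothetical names, and the actual ONDEX filter may work differently.

```python
def drop_unattached(nodes, edges):
    """Keep only nodes that take part in at least one edge.

    Hypothetical helper: nodes is a list of identifiers and edges is
    a list of (node, node) pairs.
    """
    attached = {node for edge in edges for node in edge}
    return [node for node in nodes if node in attached]


# Made-up example data: "orphan" has no relations and is filtered out.
nodes = ["geneA", "proteinA", "orphan"]
edges = [("geneA", "proteinA")]
print(drop_unattached(nodes, edges))
```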

Potential applications for ONDEX arise in many areas of analysis, including the support of microarray analysis, metabolomics, gene annotation, database curation, and metabolic engineering, as well as modeling and simulation of biological systems.

ONDEX Features/Capabilities include:

The ONDEX Backend --

1) Data Import - import and extraction of biological networks from numerous databases and generic exchange formats.

2) Data Integration -

a) Integration of different combinations of imported databases;
b) Identify and map identical entities between different biological networks/databases;
c) Generate new relations between integrated data;
d) Keep track of equivalent data at all times; and
e) Evidence tracking where the source of integrated data as well as evidence for generated mappings between entries of different databases is tracked and always available.
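The mapping and evidence-tracking steps above can be sketched as follows. This is an assumption-laden illustration, not the ONDEX mapping method: it treats two entries as equivalent when they share an accession, and it records that accession as the evidence for the generated relation. The function `map_equivalent`, the record layout, and all identifiers are invented for the example.

```python
from collections import defaultdict


def map_equivalent(entries_a, entries_b):
    """Link entries from two sources that share an accession.

    Hypothetical record layout: each entry is a dict with keys
    'id', 'source' and 'accessions' (a set of strings).
    """
    index = defaultdict(list)
    for entry in entries_b:
        for accession in entry["accessions"]:
            index[accession].append(entry)

    mappings = []
    for entry in entries_a:
        for accession in sorted(entry["accessions"]):
            for match in index.get(accession, []):
                mappings.append({
                    "from": (entry["source"], entry["id"]),
                    "to": (match["source"], match["id"]),
                    "relation": "equivalent",
                    # Evidence travels with the generated relation.
                    "evidence": f"shared accession {accession}",
                })
    return mappings


# Made-up records from two made-up sources.
db_one = [{"id": "A1", "source": "db_one", "accessions": {"X001", "X002"}}]
db_two = [{"id": "B7", "source": "db_two", "accessions": {"X002"}},
          {"id": "B8", "source": "db_two", "accessions": {"X999"}}]
for mapping in map_equivalent(db_one, db_two):
    print(mapping)
```

Keeping the source and the evidence on every generated mapping is what lets a later reader check where an integrated relation came from.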

3) Text mining -

a) Concept based information extraction from free text;
b) Relation mining;
c) Support for basic Natural Language Processing (NLP) techniques, such as word stemming and part-of-speech (POS) tagging (QTag); and
d) Import data from PubMed.
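Concept-based information extraction can be pictured as looking up the known names and synonyms of each concept in free text. The sketch below does this with whole-word, case-insensitive matching from the Python standard library. It is a simplified stand-in rather than the ONDEX text mining implementation, and `find_concepts`, the lexicon layout, and the concept identifiers are hypothetical.

```python
import re


def find_concepts(text, lexicon):
    """Return (concept_id, matched_text, start) tuples in text order.

    Hypothetical helper: lexicon maps a concept identifier to a list
    of names and synonyms for that concept.
    """
    hits = []
    for concept_id, names in lexicon.items():
        for name in names:
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                hits.append((concept_id, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


# Made-up lexicon and sentence.
lexicon = {"C1": ["abscisic acid", "ABA"], "C2": ["drought"]}
sentence = "Drought raises abscisic acid (ABA) levels."
for hit in find_concepts(sentence, lexicon):
    print(hit)
```

Two concepts found in the same sentence are a natural starting point for relation mining, although deciding what kind of relation holds needs more than co-occurrence.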

4) Sequence analysis -

a) Use sequence similarity to identify elements in biological networks. Currently BLAST and PatternHunter (a general-purpose homology search tool) are supported.
b) Cluster genes into orthologous and paralogous groups, using an improved reimplementation of the INPARANOID method.

5) TAVERNA/GRID workflow support -

a) Run the data import, integration, and text mining processes from TAVERNA (a free software tool for designing and executing workflows); this functionality is currently being added.
b) Load integrated data from ONDEX via SOAP into TAVERNA (this functionality will be added soon).

The ONDEX Frontend --

1) Data import/export -

a) Import integrated networks from the ONDEX backend;
b) Filter data from the ONDEX backend by species, etc;
c) Import/export of networks in ONDEX Extensible Markup Language (XML);
d) Import/export of networks in XGMML (the eXtensible Graph Markup and Modeling Language);
e) Import and overlay tab-delimited experimental data (such as microarray data) onto biological networks;
f) Export integrated networks as Petri Nets (Cell Illustrator format - see G6G Abstract Number 20155R); and
g) Export visible graphs as GML [a widely used, non-XML graph exchange format] and GraphML [an XML-based file format for graphs that is compatible with Cytoscape (see G6G Abstract Number 20092) and yEd, a graph editor that can quickly generate drawings and apply automatic layouts to a range of different diagrams and networks].

2) Graph visualization -

a) Visualize large networks consisting of more than 100k elements.
b) Support for a wide range of network layouts. Currently, the manufacturer's own layouts, which exploit the semantically rich data structure of ONDEX, are supported, as are layouts from yFiles and JUNG.

3) Graph analysis -

a) Filters hide irrelevant nodes and edges. Applying filters significantly reduces the size of a graph, allowing users to separate important from unimportant information effectively;
b) Perform virtual knockout experiments to identify key elements in biological networks (knockout filter);
c) Summarize and export graph analysis results in tabular format;
d) Export statistics of certain graph properties in tabular format;
e) Annotate sequence analysis results with information derived from biological networks (e.g. metabolic pathways, signal transduction, Enzyme Nomenclature); and
f) Automated analysis: you can define, save and restore arbitrary combinations of the filter, layout and export steps. This functionality allows the analysis of different datasets in one standardized pipeline.
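One way to picture the virtual knockout experiments listed above is to remove each node in turn and measure how far the network falls apart. The sketch below counts the extra connected components that each removal creates. The listing does not describe how the ONDEX knockout filter scores nodes, so this measure, the function names, and the sample network are all assumptions made for illustration.

```python
def component_count(nodes, edges):
    """Count connected components of an undirected graph."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return count


def knockout_impact(nodes, edges):
    """Map each node to the number of extra components its removal creates."""
    baseline = component_count(nodes, edges)
    impact = {}
    for removed in nodes:
        kept_nodes = [node for node in nodes if node != removed]
        kept_edges = [(a, b) for a, b in edges if removed not in (a, b)]
        impact[removed] = component_count(kept_nodes, kept_edges) - baseline
    return impact


# Made-up network in which "hub" links three otherwise unconnected items.
nodes = ["hub", "a", "b", "c"]
edges = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(knockout_impact(nodes, edges))
```

Nodes with the highest score are the ones whose removal fragments the network most, which is one simple reading of a "key element".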

System Requirements

ONDEX requires Java SE Runtime Environment (JRE) 6 Update 10 or later to be installed (available on Windows, Linux and 64-bit Intel Macs). The minimum memory requirement is 1GB (depending on the size of datasets to be analyzed).

Manufacturer

The ONDEX Suite was developed by a large number of contributors. Most are associated with Rothamsted Research. See the complete list of contributors at the link below.

Rothamsted Research
Harpenden
Hertfordshire
AL5 2JQ
UK
Tel: + 44 (0) 1582 763 133
Fax: + 44 (0) 1582 760 981
Contributors

Manufacturer Web Site ONDEX Suite

Price Contact manufacturer.

G6G Abstract Number 20300

G6G Manufacturer Number 102307

The G6G Directory of Omics and Intelligent Software
