G6G Directory of Omics and Intelligent Software

PipeWorks

Category Cross-Omics>Workflow Knowledge Bases/Systems/Tools

Abstract PipeWorks provides biologists and genomicists with a visual environment for developing "multi-step workflows" that are rapidly processed on TimeLogic biocomputing systems.

With drag-and-drop simplicity, PipeWorks lets you stitch together sequence comparisons, data filters, and output results -- without Perl scripting.

The workflows you create can easily be saved, shared and modified by other PipeWorks users. By incorporating TimeLogic's accelerated sequence comparison methods (Tera-BLAST, Tera-Probe, Smith-Waterman, Framesearch, HMM analysis, Profilesearch and GeneDetective) into your workflows, you can process huge data sets without a computer cluster.

To minimize errors from data transfer, manual analysis and scripting, PipeWorks features "automatic format translators", "reusable input objects" (e.g., a gene family being studied by your lab), and easy creation of "custom databases".

The PipeWorks Client interface can be installed on any Windows computer in your lab (or at home), and provides easy access to search and analysis tools, stored input objects and databases installed on the CodeQuest or DeCypher system running the PipeWorks Server (see below...).

Just drag the appropriate search tools to the workflow space, connect them with filters and analysis tools, and then launch the workflow. "Visual indicators" display workflow status and projected completion time for each step.

PipeWorks' data and workflow sharing enables more effective research collaboration, and the visual workflow streamlines the design of analyses for your entire team.

Build once, deploy to many, maximize productivity --

1) Design and deploy workflows that can be updated and executed without your involvement, freeing you from repetitive tasks.

2) Researchers can add new input sequences, experiment with search parameters and databases, schedule their own workflows, and retrieve & analyze their own results without bioinformatics support.

Extend workflows with custom scripts or 3rd party applications --

1) Workflows may incorporate EMBOSS (The European Molecular Biology Open Software Suite) tools like Pepinfo for hydrophobicity plots and residue histograms.

2) Additional EMBOSS tools include CpGplot, DotMatcher, CompSeq, Prophecy, ShowSeq and OddComp.

3) To extend workflows with additional tools -- such as open source programs for next-generation sequence assembly -- the PipeWorks Script Executor Tool enables you to incorporate Perl or Python scripts into any workflow.

4) To share custom tools with all PipeWorks users, use the "Generic Wrapper" to register new tools within the PipeWorks environment.

5) The PipeWorks Application Programming Interface (API) helps you define XML inputs, outputs and data translators that are required to extend workflows with your own code.
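As an illustration of the kind of script that could be added through the Script Executor Tool, here is a minimal Python sketch that drops short sequences from a FASTA stream. It is not taken from PipeWorks documentation: how the Script Executor passes data to a script is not described here, so the sketch simply assumes FASTA text in and FASTA text out, and the function names and the 50-residue default are hypothetical.

```python
import sys
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from lines of FASTA text."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def filter_min_length(records: Iterable[Record], min_length: int) -> Iterator[Record]:
    """Keep only records whose sequence has at least min_length residues."""
    for header, seq in records:
        if len(seq) >= min_length:
            yield header, seq


def main(source: TextIO, sink: TextIO, min_length: int = 50) -> None:
    """Assumed I/O convention: FASTA read from source, FASTA written to sink."""
    for header, seq in filter_min_length(read_fasta(source), min_length):
        sink.write(f">{header}\n{seq}\n")
```

Calling main(sys.stdin, sys.stdout) would turn the sketch into a command-line filter. The same logic could be written in Perl, the other scripting language the Script Executor Tool accepts.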

PipeWorks Example Workflows --

The following sample workflows are included in PipeWorks.

1) Connect Sequential Searches for More Advanced Annotation -

In this workflow, adding a filter step enables you to route results data based upon percent identity, e-value, bit-score, gap percentage or score count. At each search step, you can view multiple results to help optimize your workflows.

a) Input sequences are processed with Tera-BLAST.
b) Sequences that generate no hits -- or score below your filter threshold -- are compared to Swiss-Prot using Smith-Waterman (S-W).
c) Sequences that are not confidently identified by BLAST or S-W are routed to an "HMM search" that attempts to link them to particular superfamilies or protein families.
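The routing decision made by the filter step can be sketched in a few lines of Python. This is only an illustration of the logic; in PipeWorks the filter is configured graphically. The Hit record, the route_queries function and the threshold values are hypothetical and do not come from the product.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Hit:
    """Best hit for one query (hypothetical record, not a PipeWorks type)."""
    query_id: str
    percent_identity: float
    evalue: float


def route_queries(
    query_ids: Iterable[str],
    best_hits: Dict[str, Hit],
    max_evalue: float = 1e-10,    # illustrative threshold
    min_identity: float = 40.0,   # illustrative threshold
) -> Tuple[List[str], List[str]]:
    """Split queries into (confidently identified, send to next search).

    A query goes to the next search stage when it has no hit at all
    or when its best hit fails either threshold.
    """
    confident, next_stage = [], []
    for qid in query_ids:
        hit = best_hits.get(qid)
        if (hit is not None
                and hit.evalue <= max_evalue
                and hit.percent_identity >= min_identity):
            confident.append(qid)
        else:
            next_stage.append(qid)
    return confident, next_stage
```

Applying the same function to the Smith-Waterman results would give the set of sequences that continue on to the HMM search.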

2) Automate and Schedule Repetitive Tasks -

a) PipeWorks Database Updater workflow downloads database sources from one or more FTP sites, and then formats them for local searching.
b) Workflows can also employ the "File Grabber" to retrieve a remote set of query sequences.
c) All workflows can be scheduled by date or frequency.

3) Merge Results, Embed Perl Scripts -

You can merge multiple sets of extracted sequences to serve as the input for additional operations. You can also embed custom Perl and Python scripts into workflows.

4) Search, Filter and Analyze -

a) Use Tera-BLASTP to compare new "protein sequences" to a custom protein database.
b) Filter the results by e-value.
c) High-confidence hits are processed by the EMBOSS tool Pepinfo. Pepinfo's hydrophobicity plots and residue histograms will be displayed for each hit.
d) Results that score below the filter threshold are routed to a more sensitive Smith-Waterman search.
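The e-value filter in step b) can be illustrated with a short Python sketch that splits tabular search results into a high-confidence set and a set to be searched again. The output format of Tera-BLASTP is not described here, so the sketch assumes the 12-column, tab-separated layout produced by NCBI BLAST+ with -outfmt 6. The function name and the 1e-5 cutoff are hypothetical.

```python
import csv
import io
from typing import Dict, List, Tuple

# Column order of NCBI BLAST+ tabular output (-outfmt 6).
# Assumption: the search results are available in this layout.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

Row = Dict[str, str]


def split_by_evalue(tabular_text: str, max_evalue: float = 1e-5) -> Tuple[List[Row], List[Row]]:
    """Return (high_confidence_rows, low_confidence_rows).

    Rows with evalue <= max_evalue are treated as high confidence;
    all other rows fall into the second list.
    """
    high, low = [], []
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) <= max_evalue:
            high.append(row)
        else:
            low.append(row)
    return high, low
```

In terms of the workflow above, the first list corresponds to the hits passed on to Pepinfo and the second to the results routed to the Smith-Waterman search.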

5) Build Hidden Markov Models (HMMs) to Expand Protein Families -

a) Generate a multiple sequence alignment from a protein family using ClustalW.
b) Build a Hidden Markov Model (HMM) for searching UniProt.

CodeQuest system--

CodeQuest is a "biocomputing workstation" that processes large genomics searches and sophisticated informatics workflows. Using its FPGA-based SeqCruncher accelerator, the quad-core CodeQuest workstation runs Tera-BLAST, Smith-Waterman, HMM and "gene modeling" searches at the speed of a mid-sized cluster.

The CodeQuest workstation combines optimized algorithms, advanced hardware acceleration, and PipeWorks visual workflow software to drive your genome exploration and drug discovery research.

With the new SeqCruncher accelerator, CodeQuest is 3-30 times faster than the model it replaces. CodeQuest is an effective solution for bioinformatics experts and novices alike.

You can build and process genomic analyses without tedious scripting and achieve compute-cluster performance, all in a desktop solution that can be shared by the entire lab.

High performance sequence analysis --

CodeQuest includes TimeLogic's accelerated "sequence analysis" applications and 1 or 2 SeqCruncher accelerators, and is available with Windows and PipeWorks or on Red Hat Linux (without PipeWorks).

Flexible access for all levels of bioinformatics expertise --

PipeWorks -- With drag-and-drop simplicity, you can connect multiple searches, filters and analysis tools. Databases, input queries and pipelines can easily be saved, modified and shared with other CodeQuest users;

An intuitive web interface for searching and target database building; and a command line interface for scripted batch searches and application integration.

Plug and play workstation saves power and is easy to maintain --

The system is a Hewlett-Packard xw8600 workstation with 2 dual-core Xeon CPUs, 4 GB of RAM, 1 Terabyte of disk space, and an HP 19" flat panel monitor.

PipeWorks and CodeQuest --

The unique combination of PipeWorks and CodeQuest enables you to process high-throughput Expressed Sequence Tag (EST) and "protein annotation" workflows, as well as HMM-heavy metagenomics projects.

Your entire lab can design and process workflows on a resource-friendly desktop system to advance their genomics projects or drug discovery efforts.

System Requirements

1) PipeWorks 1.0 including the server and five (5) supported clients requires a CodeQuest workstation or DeCypher server running Windows. CodeQuest Z1 and Z2 include 1 or 2 SeqCruncher PCIe accelerators, respectively.

2) The PipeWorks Client application may be installed on any laboratory computer, enabling you to conveniently build, launch and monitor your workflows remotely. Additional 5-named-user license contracts may be purchased separately.

3) The PipeWorks Client requires a computer running Windows XP or Windows Server 2003, a minimum of 512 MB RAM, 10 GB storage and a network connection to the TimeLogic system running PipeWorks Server.

Manufacturer

Active Motif, Inc.
1914 Palomar Oaks Way
Suite 150
Carlsbad CA 92008
Toll Free: 877-222-9543
Tel: 760-431-1263 ext. 4
Fax: 760-431-1351

Manufacturer Web Site Active Motif PipeWorks

Price Contact manufacturer.

G6G Abstract Number 20543

G6G Manufacturer Number 104158
