Abstract

NCBI Genome Workbench is an integrated application for viewing and analyzing sequence data. With Genome Workbench, you can view data in publicly available sequence databases at the National Center for Biotechnology Information (NCBI), and mix this data with your own private data.

Genome Workbench can display sequence data in many ways, including graphical sequence views, various alignment views, phylogenetic tree views, and tabular views of data. It can also align your private data to data in public databases, display your data in the context of public data, and retrieve Basic Local Alignment Search Tool (BLAST) results.

Product features and capabilities include:

Projects/Workspaces --

Genome Workbench organizes your data into Projects. A Genome Workbench Project is a collection of data that belong together semantically. Data inside a project can communicate more easily with other data inside the same project.

For example, if you are reviewing a gene, it would be best to organize the data for the gene in one project, combining references to genomic sequences, transcripts, proteins, and any local analyses done on the sequence.

Workspaces are collections of projects. Workspaces provide convenient ways to organize sets of related data, such as a set of projects covering all the genes you are studying in mouse, or a set of projects presenting different portions of data from different organisms.
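
The relationship between workspaces, projects, and data items described above can be pictured as a simple containment hierarchy. The following Python sketch is only an illustration of that idea; the class names, fields, and the all_items helper are hypothetical and are not part of Genome Workbench.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Project:
    """Hypothetical model: a named collection of data items that belong together."""
    name: str
    items: List[str] = field(default_factory=list)


@dataclass
class Workspace:
    """Hypothetical model: a named collection of projects."""
    name: str
    projects: List[Project] = field(default_factory=list)

    def all_items(self) -> List[str]:
        """Return every data item across all projects, in project order."""
        return [item for project in self.projects for item in project.items]


# One project per gene under review, grouping its related data.
gene_project = Project(
    "Example gene",
    ["genomic sequence", "transcript", "protein", "local analysis"],
)
workspace = Workspace("Mouse genes", [gene_project])
```

Grouping items under a project in this way mirrors the advice to keep all the data for one gene together, while the workspace gathers the related projects.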

Project Tree(s) --

The Project Tree is a view of your workspace and projects. It shows a hierarchical expansion of your data and allows you to group data items into folders.

Selection Inspector --

The selection inspector provides a means for evaluating all the selected objects in Genome Workbench. It has three modes of operation (Table, Brief Text, and Full Text), selectable using the icons in the right-hand corner of the view.

The modes are:

1) Table - provides a tabular list of the most common attributes of selected objects.

2) Brief Text - provides a short textual description of the selected objects.

3) Full Text - provides a verbose textual printing of selected objects; this mode implies the GenBank Flat File format for any selected features.

The strength of the selection inspector lies in aggregating selections across views. The selection inspector features a drop-down menu that indicates which view's selections are being shown. One of the options is 'show selections from all views'.

This mode allows you to combine selections from the project tree with selections from any other view.
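
The idea of aggregating selections across views can be sketched in a few lines of Python. This is a conceptual illustration only: the function name, the dictionary of views, and the choice to drop duplicate objects are assumptions made for the example, not a description of how Genome Workbench implements the feature.

```python
def aggregate_selections(selections_by_view, view=None):
    """Return the selected objects of one view, or of all views when view is None.

    selections_by_view maps a view name to a list of selected objects.
    Duplicates across views are dropped (an assumption of this sketch),
    keeping the order in which objects are first seen.
    """
    if view is not None:
        return list(selections_by_view.get(view, []))
    merged = []
    seen = set()
    for selected in selections_by_view.values():
        for obj in selected:
            if obj not in seen:
                seen.add(obj)
                merged.append(obj)
    return merged


# Hypothetical selections made in two different views.
selections = {
    "Project Tree": ["gene A", "transcript A1"],
    "Graphical View": ["transcript A1", "protein A1"],
}
```

Calling aggregate_selections(selections) corresponds to the 'show selections from all views' option, while passing a view name corresponds to choosing a single view in the drop-down menu.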

Data Mining View --

The data mining view combines many modes of searching into one interface. From the data mining view, you can search for items in the public sequence repository; search for gene records from Entrez Gene (a searchable database of genes from RefSeq genomes); search for annotations in a given view; and search for sequence patterns. The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.

Genome Workbench data loading --

1) You can import data into Genome Workbench. Importing data allows you to load accessions from GenBank (sequence database) directly. Importing also allows you to read data from files. You must decide whether you want the imported data to be loaded into a new project, or added to the current project.

2) You can load data through the Data Mining View. In the data mining view you can search several public databases in Entrez (Entrez Protein, Entrez Nucleotide, and Entrez Gene) and load data directly from your query.

Genome Workbench supported Data Formats --

1) FASTA sequence files;

2) GFF2/GTF format (Note: GFF3 support will be added soon);

3) RepeatMasker .out format;

4) Sequin-style 5-Column Feature Table format;

5) Newick-format phylogenetic trees;

6) Phrap/ACE assembly files;

7) AGP sequence assembly files; and

8) NCBI ASN.1 objects [in ASN.1 text or binary or in Extensible Markup Language (XML) format].
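
As an illustration of the simplest format in the list above, the following Python sketch reads FASTA text into (header, sequence) pairs. It is a minimal reader written for this example under the usual FASTA conventions (a '>' header line followed by one or more sequence lines); it is not the parser used by Genome Workbench, and the function name and sample records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical input with two records; the first spans two sequence lines.
example = """>seq1 hypothetical example record
ACGTACGT
ACGT
>seq2
GGCC
"""
```

Lines that appear before the first header are ignored in this sketch; a stricter reader might report them as an error instead.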

Genome Workbench Web Services --

Genome Workbench provides several facilities for interacting with the web. From inside Genome Workbench, you can link out to data hosted at NCBI, for example to view the gene report of a given gene. In addition, Genome Workbench provides some facilities for enriching web pages themselves.

General Linkout Service - The General Linkout Service provides a quick way to formulate a link to load a specific piece of data into Genome Workbench. Using these links, a web page can be set up to interact with an instance of Genome Workbench on a client computer.

Gene Linkout Service - The Gene Linkout Service provides a service specific to Entrez Gene, preparing a gene record for loading into Genome Workbench. This gene record includes a wealth of information about the sequence dependencies of a given gene, and includes references to publications available at the time of retrieval.

Contig Overlap Service - The Contig Overlap Service provides a specialized service for evaluating overlaps of abutting genomic sequences in sequence assemblies. As a part of the curation of the reference assemblies of several genomes, NCBI maintains a database of curated alignments between adjacent components in scaffold assemblies. This service provides a window into this data.

System Requirements --

Genome Workbench is built on the NCBI C++ Toolkit and uses cross-platform Application Programming Interfaces (APIs) for graphics. It runs on your local machine, and is available for Windows 2000/XP, Linux, Mac OS X, and various flavors of UNIX.


