Biskit

Category Proteomics>Protein Structure/Modeling Systems/Tools

Abstract Biskit is a modular, object-oriented Python library for structural bioinformatics research. It facilitates the manipulation and analysis of macromolecular structures, protein complexes, and molecular dynamics trajectories.

For efficient number crunching, Biskit objects integrate tightly with NumPy (Numeric Python; see below). Biskit also offers a platform for the rapid integration of external programs and new algorithms into complex workflows.

Calculations are thus often delegated to established programs (see below) such as Xplor, AMBER, Hex, ProSa, FoldX, T-Coffee, Hmmer and Modeller, and interfaces to further software can be easily added.

Moreover, Biskit simplifies the parallelization of calculations via PVM (Parallel Virtual Machine; see below).

Biskit Application examples --

The prime purpose of Biskit is to offer you a tool box for creating your own structural bioinformatics programs. In order to make the most of this package, you should program your own Python scripts around the Biskit library.

However, Biskit already contains ready-made programs for several applications, ranging from a simple script that extracts a sequence from a Protein Data Bank (PDB) file to a complex workflow for flexible protein-protein docking and a pipeline for automated homology modeling.
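
As a rough illustration of the simplest task mentioned above, the following sketch reads a one-letter sequence from PDB-format text using only the Python standard library. It does not use Biskit's own classes. The names `sequence_from_pdb` and `_atom_line` and the sample records are hypothetical, and the code assumes standard fixed-column ATOM records.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb(pdb_text, chain_id=None):
    """Return the one-letter sequence taken from the CA atoms of ATOM records.

    Hypothetical helper, not part of Biskit. Unknown residue names map to "X".
    """
    residues = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 22:
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        alt_loc = line[16]                # column 17
        res_name = line[17:20].strip()    # columns 18-20
        chain = line[21]                  # column 22
        if atom_name != "CA":
            continue                      # one CA atom per residue
        if alt_loc not in (" ", "A"):
            continue                      # skip alternate conformations
        if chain_id is not None and chain != chain_id:
            continue
        residues.append(THREE_TO_ONE.get(res_name, "X"))
    return "".join(residues)


def _atom_line(serial, atom_name, res_name, chain, res_seq):
    """Build the first 26 columns of an ATOM record (illustration only)."""
    return "ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}".format(
        serial, atom_name, res_name, chain, res_seq)


# Made-up sample records: two residues in chain A, one in chain B.
sample_pdb = "\n".join([
    _atom_line(1, " N  ", "MET", "A", 1),
    _atom_line(2, " CA ", "MET", "A", 1),
    _atom_line(3, " CA ", "GLY", "A", 2),
    _atom_line(4, " CA ", "LYS", "B", 1),
])

print(sequence_from_pdb(sample_pdb))        # all chains
print(sequence_from_pdb(sample_pdb, "A"))   # chain A only
```

Reading only CA atoms keeps the sketch short. A production script would also need to handle modified residues, missing atoms, and HETATM records.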

The manufacturer provides tutorials for some of the following main applications:

1) Automated homology modeling - Biskit.Mod homology modeling pipeline (workflow) - Biskit.Mod is a Python module for fully automatic or semi-automatic homology modeling.

2) Protein-protein docking - Regular docking: one receptor against one ligand - The manufacturer provides a text file that describes a docking run that starts from a free receptor and a free ligand PDB file and ends with a complex list. During the docking, the known structure of the bound complex is used for scoring. The use of a bound reference complex is optional.

3) Flexible protein-protein docking - How to use the Biskit.Dock flexible docking workflow - The manufacturer provides a text file that describes a flexible docking run. The run starts from a free receptor and a free ligand PDB file, from which an ensemble of receptor and ligand conformers is generated.

The receptor and ligand conformers are then docked against each other, a step later referred to as cross-docking. The next step is to collect the docking results into a complex list and then cluster the solutions based on intermolecular atom contacts.

During the docking it is possible to use the known structure of the bound complex for benchmarking purposes. The structure ensembles are generated using principal component restrained molecular dynamics (PCR-MD).
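
The idea of comparing docking solutions by their intermolecular atom contacts can be sketched with NumPy as follows. This is not the Biskit.Dock implementation. The function names, the distance cutoff, and the Jaccard-style similarity are assumptions chosen for illustration, and the coordinates are made up.

```python
import numpy as np


def contact_matrix(rec_xyz, lig_xyz, cutoff=10.0):
    """Boolean (n_rec, n_lig) matrix, True where two atoms lie within cutoff.

    Hypothetical helper; the cutoff value is an assumption.
    """
    # Broadcasting builds all receptor-ligand difference vectors at once.
    diff = rec_xyz[:, None, :] - lig_xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist < cutoff


def shared_contact_fraction(contacts_a, contacts_b):
    """Contacts present in both solutions divided by contacts in either."""
    union = np.logical_or(contacts_a, contacts_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(contacts_a, contacts_b).sum() / union)


# Made-up example: two receptor atoms and one ligand atom in two poses.
receptor = np.array([[0.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
pose_1 = np.array([[3.0, 0.0, 0.0]])    # close to the first receptor atom
pose_2 = np.array([[17.0, 0.0, 0.0]])   # close to the second receptor atom

c1 = contact_matrix(receptor, pose_1, cutoff=5.0)
c2 = contact_matrix(receptor, pose_2, cutoff=5.0)

print(shared_contact_fraction(c1, c1))  # identical poses share all contacts
print(shared_contact_fraction(c1, c2))  # poses on opposite sides share none
```

A pairwise matrix of such similarity values could then be handed to any standard clustering method to group similar solutions.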

4) Structure sampling with Xplor - Structure sampling with Xplor and runPcr.py -

The following describes how to sample conformational space for a protein using Biskit and Xplor (see below). The default setting of the Xplor structure sampling script is used to run ten (10) independent trajectories of 50 ps length each, with the structure embedded in a 9 Å layer of explicit water.

Large-scale correlated motions usually escape the sampling of molecular dynamics (MD) simulations. A second, alternative approach runs the simulation with an identical protocol, except that a weak restraint is applied to alleviate this problem. Large-scale correlated motions typically occur along small gradients in the energy landscape. They are hence slow but can, on the other hand, be boosted by small interventions.

The restraint acts on the ensemble of the ten (10) concurrent trajectories as a whole and increases the variability along the major principal components of motion. The computational cost of this principal component restrained simulation (PCR-MD) is similar to the classic approach described above, but the ensemble is considerably more diverse.
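
To make the notion of "major principal components of motion" concrete, the following NumPy sketch extracts principal components from an ensemble of structures. It illustrates the analysis only and does not reproduce the PCR-MD restraint or any Biskit or Xplor code. The function name `principal_components` and the synthetic ensemble are assumptions, and the frames are assumed to be already superposed on a common reference.

```python
import numpy as np


def principal_components(ensemble):
    """Principal components of an ensemble of shape (n_frames, n_atoms, 3).

    Returns (variances, components), sorted from largest to smallest
    variance. Hypothetical helper; frames must already be superposed.
    """
    n_frames = ensemble.shape[0]
    flat = ensemble.reshape(n_frames, -1)     # one row per frame
    centered = flat - flat.mean(axis=0)       # remove the average structure
    # SVD of the centered data yields the components as rows of the last
    # factor, with singular values in descending order.
    _, singular, components = np.linalg.svd(centered, full_matrices=False)
    variances = singular ** 2 / (n_frames - 1)
    return variances, components


# Synthetic ensemble: ten frames that move along one collective direction,
# plus a small amount of random noise.
rng = np.random.default_rng(0)
n_frames, n_atoms = 10, 4
direction = rng.normal(size=n_atoms * 3)
direction /= np.linalg.norm(direction)
amplitudes = np.linspace(-5.0, 5.0, n_frames)
noise = 0.01 * rng.normal(size=(n_frames, n_atoms * 3))
ensemble = (amplitudes[:, None] * direction + noise).reshape(n_frames, n_atoms, 3)

variances, components = principal_components(ensemble)
overlap = abs(components[0] @ direction)
print("fraction of variance in first component:", variances[0] / variances.sum())
print("overlap of first component with the imposed direction:", overlap)
```

In this synthetic case nearly all of the variance falls on the first component, which is the kind of large-scale collective motion that the restraint described above is meant to enhance.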

5) Structure sampling with AMBER - Structure sampling with AMBER and amber_ensembleMD.py - The manufacturer provides a document that describes how to set up a parallel molecular dynamics run using AMBER (see below) and amber_ensembleMD.py.

6) Modeling validation - Parallelized validation for a homology modeling project - Modeling validation test - This tutorial, provided by the manufacturer, takes all the template structures from an existing modeling project, removes them one by one, and tries to re-model each removed structure using the remaining templates.
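
The leave-one-out scheme behind this validation can be sketched in a few lines of plain Python. The generator name `leave_one_out` and the template labels are hypothetical placeholders, and the actual modeling step is deliberately left out.

```python
def leave_one_out(templates):
    """Yield (target, remaining) pairs, holding out each template in turn.

    Hypothetical helper; `templates` is any list of template identifiers.
    """
    for i, target in enumerate(templates):
        remaining = templates[:i] + templates[i + 1:]
        yield target, remaining


# Placeholder labels standing in for the templates of a modeling project.
templates = ["template_1", "template_2", "template_3"]

for target, remaining in leave_one_out(templates):
    # A real run would re-model `target` from `remaining` here and compare
    # the resulting model with the known structure of `target`.
    print(target, "<-", remaining)
```

Because each held-out case is independent of the others, the cases lend themselves to being distributed over several machines.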

NumPy -- NumPy is the fundamental package needed for scientific computing with Python. It contains among other things:

1) An advanced N-dimensional array object;

2) Sophisticated (broadcasting) functions;

3) Tools for integrating C/C++ and Fortran code; and

4) Useful linear algebra, Fourier transform, and random number capabilities.
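
A short sketch of the first, second, and fourth points, using a made-up set of coordinates: an N-dimensional array, broadcasting to center the coordinates, and a matrix product to rotate them. The variable names and values are chosen for illustration only.

```python
import numpy as np

# A 2-dimensional array: four points with x, y, z coordinates.
coords = np.array([[0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0],
                   [2.0, 2.0, 0.0],
                   [0.0, 2.0, 0.0]])

# Broadcasting: the (3,) center is subtracted from every row of the (4, 3) array.
center = coords.mean(axis=0)
centered = coords - center

# Linear algebra: rotate all points by 90 degrees around the z axis.
angle = np.pi / 2
rotation = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle),  np.cos(angle), 0.0],
                     [0.0,            0.0,           1.0]])
rotated = centered @ rotation.T

print(center)
print(np.round(rotated, 3))
```

No explicit Python loop over the points is needed, which is what makes array-based code of this kind efficient.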

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
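
As a minimal sketch of what "arbitrary data-types" means in practice, NumPy structured arrays can hold records with named, differently typed fields. The field names and atom records below are invented for illustration and do not reflect how Biskit stores atoms.

```python
import numpy as np

# A record type with two text fields and a 3-vector of floats.
atom_dtype = np.dtype([("name", "U4"), ("residue", "U3"), ("xyz", "f8", (3,))])

atoms = np.array([("N",  "MET", (0.0, 0.0, 0.0)),
                  ("CA", "MET", (1.5, 0.0, 0.0)),
                  ("CA", "GLY", (3.0, 1.0, 0.0))], dtype=atom_dtype)

# Fields are addressed by name, and boolean masks select records.
ca_atoms = atoms[atoms["name"] == "CA"]

print(ca_atoms["residue"])
print(atoms["xyz"].shape)
```

Each field behaves like an ordinary array, so the coordinate field can be used directly in numerical operations.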

X-PLOR or XPLOR-NIH -- X-PLOR or XPLOR-NIH is a software package for computational structural biology originally developed by Axel Brunger at Yale University.

It was first published in 1987 as an off-shoot of Chemistry at HARvard Macromolecular Mechanics (CHARMM) - a general and flexible software application for modeling the structure and behavior of molecular systems.

AMBER -- AMBER (an acronym for Assisted Model Building with Energy Refinement) is a family of force fields for molecular dynamics of biomolecules. AMBER is also the name of the molecular dynamics software package that implements these force fields.

The AMBER software suite provides a set of programs for applying the AMBER force fields to simulations of biomolecules.

Hex -- Hex is an interactive protein docking and molecular superposition program, written by Dave Ritchie. Hex understands protein and DNA structures in the PDB format, and it can also read small-molecule structure-data file (SDF) format files.

ProSa -- ProSa is an advanced tool for protein structure research. ProSa supports and guides your studies aimed at the determination of a protein’s native fold. It is helpful for experimental structure determinations and modeling studies.

Suppose you are confronted with the three-dimensional structure of a protein. The structure may have been determined by X-ray analysis or NMR spectroscopy, it may be the result of modeling studies, or it could be a nice-looking structure deliberately misfolded to fool specialists in protein structure theory.

Obvious questions are:

ProSa helps you to answer these questions.

FoldX -- A force field for energy calculations and protein design -

FoldX is a computer algorithm that was developed to provide a fast and quantitative estimation of the importance of the interactions contributing to the stability of proteins and protein complexes.

The predictive power of FOLDEF (which contains FoldX) has been tested on a very large set of point mutants (1,088 mutants) spanning most of the structural environments found in proteins.

FoldX uses a full atomic description of the structure of the proteins. The different energy terms taken into account in FoldX have been weighted using empirical data obtained from protein engineering experiments.

T-Coffee -- T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine the output of your favorite alignment methods (Clustal, Mafft, Probcons, Muscle, etc.) into one unique alignment (M-Coffee).

T-Coffee can align Protein, DNA and RNA sequences. It is also able to combine sequence information with protein structural information (3D-Coffee/Expresso), profile information (PSI-Coffee) or RNA secondary structures (R-Coffee).

HMMER -- HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called “profile hidden Markov models” (profile HMMs).

Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models.

In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST. The current version is HMMER 3.0 (28 March 2010).

MODELLER -- MODELLER is used for homology or comparative modeling of protein three-dimensional structures.

PVM -- PVM (Parallel Virtual Machine) is a software package that permits a heterogeneous collection of UNIX and/or Windows computers hooked together by a network to be used as a single large parallel computer.

Thus large computational problems can be solved more cost effectively by using the aggregate power and memory of many computers. The software is very portable. The source, which is available free through netlib, has been compiled on everything from laptops to CRAYs.

PVM enables users to exploit their existing computer hardware to solve much larger problems at minimal additional cost. Hundreds of sites around the world are using PVM to solve important scientific, industrial, and medical problems in addition to PVM’s use as an educational tool to teach parallel programming.

With tens of thousands of users, PVM has become the de facto standard for distributed computing world-wide.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site Biskit

Price Contact manufacturer.

G6G Abstract Number 20721

G6G Manufacturer Number 104291