PPI Spider

Category Cross-Omics>Pathway Analysis/Tools

Abstract PPI Spider is a web server-based tool for subnetwork enrichment analysis and for interpreting proteomics data in the context of protein-protein interaction (PPI) networks.

PPI Spider implements a robust statistical framework for the interpretation of protein lists in the context of a global PPI network.

The input list is translated into a 'network model' according to the topology of the PPI network.

The manufacturer's approach takes into account that the input list may be incomplete, and automatically adds to the network model the most relevant proteins that appear to be missing.

As output, the manufacturer’s procedure provides a 'model of protein interactions' that represents the most probable scenario of how proteins within the list are connected.

Statistical significance of the model is computed by a Monte Carlo simulation procedure.

The manufacturer's reasoning for creating PPI Spider --

Recent advances in experimental technologies allow for the detection of a complete cell proteome. Proteomics studies routinely report proteins that are expressed in a particular cell state or compartment, as well as proteins with 'differential expression' between cell states.

Once a list of proteins is derived, a major challenge is to interpret the identified set of proteins in the biological context.

Protein-protein interaction (PPI) data represent abundant information that can be employed for this purpose. However, these data have not yet been fully exploited due to the absence of a methodological framework that can integrate this type of information.

Here, the manufacturer proposes to infer a network model from an experimentally identified protein list based on the available information about the topology of the global PPI network.

The manufacturer also proposes to use a Monte Carlo simulation procedure to compute the statistical significance of the inferred models.

This method has been implemented as a freely available web-based tool, PPI Spider.

To support the practical significance of PPI Spider, the manufacturer collected several hundred recently published experimental proteomics studies that reported lists of proteins in various biological contexts.

The manufacturer reanalyzed them using PPI Spider and demonstrated that in most cases PPI Spider could provide statistically significant hypotheses that help in understanding the protein list.

PPI Spider Optional parameters --

Attaching a file with a list of gene/protein identifiers -

As input, PPI Spider accepts several types of gene or protein identifiers.

For example, for the human genome PPI Spider supports identifiers from Entrez Gene, UniProt/Swiss-Prot, Gene Symbol, UniGene, and Ensembl, as well as RefSeq Protein IDs, RefSeq Transcript IDs, and Affymetrix probe codes.

Number of random networks --

This parameter specifies the number of random simulations used to generate the background distribution.

The statistical significance of the inferred model is estimated based on the distribution of the model size for a random protein list.

The distribution is computed by a random simulation procedure.

The p-value is estimated as the share of random simulations in which the model inferred for a random protein list was equal to or greater in size than the model inferred for the input protein list.

Therefore, the p-value estimates the probability of inferring a model of the same or greater size from a randomly generated protein list of the same size as the input list.
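
The p-value estimate described above can be illustrated with a minimal Python sketch. It is not the PPI Spider implementation: the toy network, the function names (`model_size`, `monte_carlo_p_value`), and the choice to draw random lists uniformly from the proteins in the network are all assumptions made for illustration, and "model size" is simplified here to the largest group of listed proteins joined by direct interactions.

```python
import random

def model_size(ppi, protein_list):
    """Size of the largest group of listed proteins joined by direct interactions."""
    inputs = {p for p in protein_list if p in ppi}
    seen, best = set(), 0
    for start in inputs:
        if start in seen:
            continue
        stack, component = [start], {start}
        while stack:
            current = stack.pop()
            for neighbor in ppi[current]:
                if neighbor in inputs and neighbor not in component:
                    component.add(neighbor)
                    stack.append(neighbor)
        seen |= component
        best = max(best, len(component))
    return best

def monte_carlo_p_value(ppi, protein_list, n_random=1000, seed=0):
    """Share of random lists whose model is at least as large as the observed one."""
    rng = random.Random(seed)
    universe = sorted(ppi)  # assumption: random lists are drawn from network proteins
    observed = model_size(ppi, protein_list)
    list_size = len({p for p in protein_list if p in ppi})
    hits = sum(
        model_size(ppi, rng.sample(universe, list_size)) >= observed
        for _ in range(n_random)
    )
    return observed, hits / n_random

# Hypothetical toy network: 20 proteins, one chain of four and one extra pair.
ppi = {f"P{i}": set() for i in range(20)}
for a, b in [("P0", "P1"), ("P1", "P2"), ("P2", "P3"), ("P10", "P11")]:
    ppi[a].add(b)
    ppi[b].add(a)

observed, p_value = monte_carlo_p_value(ppi, ["P0", "P1", "P2", "P3"])
print(observed, p_value)
```

Here `n_random` plays the role of the 'Number of random networks' parameter: more simulations give a finer-grained p-value estimate at the cost of longer run time.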

PPI Spider Results --

PPI Spider provides three (3) tables as standard output -

1) Summary - The first table "Summary" provides information on the mapping of the supplied gene IDs:

The supplied gene/protein IDs are mapped to the NCBI gene IDs (Entrez).

The first row of the table specifies the number of IDs submitted by the user (space-delimited).

The second row reports the number of recognized gene/protein IDs.

The third row reports the number of NCBI gene IDs to which the user-supplied IDs were mapped.

The last row reports the number of those NCBI gene IDs that are present in the currently known PPI network.
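
The four rows of the "Summary" table can be pictured as successive counts, as in the short Python sketch below. The function name `summarize_mapping`, the small hand-made lookup table, and the set of network members are hypothetical and stand in for the server's own mapping resources.

```python
def summarize_mapping(raw_text, id_to_entrez, network_ids):
    """Compute the four 'Summary' counts for a space-delimited ID list."""
    submitted = raw_text.split()
    recognized = [i for i in submitted if i in id_to_entrez]
    # Several submitted IDs may map to the same NCBI gene ID, so use a set.
    entrez = {gene for i in recognized for gene in id_to_entrez[i]}
    in_network = entrez & network_ids
    return {
        "submitted IDs": len(submitted),
        "recognized IDs": len(recognized),
        "mapped NCBI gene IDs": len(entrez),
        "NCBI gene IDs in PPI network": len(in_network),
    }

# Hand-made lookup for illustration only (gene symbol or UniProt accession -> NCBI gene ID).
id_to_entrez = {"TP53": {"7157"}, "P04637": {"7157"}, "BRCA1": {"672"}}
network_ids = {"7157"}  # assumption: only this gene is present in the toy network

summary = summarize_mapping("TP53 P04637 BRCA1 NOT_AN_ID", id_to_entrez, network_ids)
for row, count in summary.items():
    print(f"{row}: {count}")
```

In this toy case two submitted identifiers refer to the same gene, which is why the number of recognized IDs can exceed the number of mapped NCBI gene IDs.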

2) Enriched subnetworks - The second table "Enriched subnetworks" reports inferred models:

Model 1 corresponds to a maximal subnetwork of directly connected proteins according to the reference PPI network.

Model 2 corresponds to a maximal subnetwork in which genes may also be connected via one intermediate gene according to the reference network.

The 'Number of nodes' column reports the size (the number of input proteins in the inferred network models) of Model 1 and Model 2.
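
The difference between the two models can be sketched in Python as follows. This is one possible reading of the descriptions above, not the tool's actual algorithm: the function `infer_model`, the toy network, and the rule that an intermediate must be a protein outside the input list are assumptions made for illustration.

```python
from collections import deque

def infer_model(ppi, input_ids, allow_intermediate=False):
    """Return (input proteins in the largest model, intermediates used)."""
    inputs = {p for p in input_ids if p in ppi}
    links = {p: set() for p in inputs}
    bridges = {}  # pair of input proteins -> non-input proteins joining them
    for u in inputs:
        for n in ppi[u]:
            if n in inputs:
                links[u].add(n)  # direct interaction (Model 1 and Model 2)
            elif allow_intermediate:
                for v in ppi.get(n, ()):
                    if v in inputs and v != u:
                        links[u].add(v)  # joined via one intermediate (Model 2)
                        bridges.setdefault(frozenset((u, v)), set()).add(n)
    best, seen = set(), set()
    for start in sorted(inputs):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for nxt in links[current]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    intermediates = set()
    for pair, mids in bridges.items():
        if pair <= best:
            intermediates |= mids
    return best, intermediates

# Hypothetical undirected reference network as an adjacency map.
PPI = {
    "A": {"B", "X"}, "B": {"A", "C"}, "C": {"B"}, "D": {"X"},
    "E": {"Y"}, "F": set(), "X": {"A", "D"}, "Y": {"E"},
}
protein_list = ["A", "B", "C", "D", "E", "F"]

model1, _ = infer_model(PPI, protein_list)
model2, added = infer_model(PPI, protein_list, allow_intermediate=True)
print(len(model1), sorted(model1))
print(len(model2), sorted(model2), sorted(added))
```

In this sketch the reported size counts only input proteins, matching the 'Number of nodes' definition, while any intermediate proteins are returned separately.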

The 'p-value' column reports the probability of obtaining a model of the same or greater size from a random protein list of the same size as the input list.

By clicking on Model 1 or Model 2 the user will be redirected to the visualization page.

Note: To visualize the models, you must have 'Sun's Java engine' (Version 1.5 or later) installed on your machine and available to your browser.

3) Enriched GO categories -

The third table "Enriched GO categories" reports the list of enriched Gene Ontology (GO) terms with a p-value (corrected for multiple testing by the Monte Carlo simulation procedure) below 0.05.

PPI Spider ‘Extensive Examples’ that can be viewed on the manufacturer's website --

In recent years hundreds of genomic and proteomic studies have reported lists of genes/proteins identified as present (or differentially present) under different physiological conditions of the cell.

The manufacturer automatically collected some of these studies that were published in different journals.

In each case the manufacturer collected a set of genes/proteins (or several sets) from a table originally identified and reported in the paper.

Because the data were collected automatically, the manufacturer cautions that the extracted lists might contain errors.

The manufacturer analyzed the collected protein lists via the PPI Spider software.

An extensive output table (located on the manufacturer's website) provides links to the results.

Each row of this table corresponds to one gene/protein list extracted from a table reported in a recently published paper.

The name of the paper is presented in the second column and the next column provides a link to the paper's abstract.

The next column specifies the table ID in the paper which reported the gene/protein list.

The next column specifies the type and the number of recognized protein/gene IDs from the table.

The final column includes a link to the 'PPI Spider Results' page (described above), which also includes the subnetwork model.

System Requirements

Web-based.

Manufacturer

The Institute of Bioinformatics and Systems Biology (IBI) is part of the Helmholtz Zentrum München - German Research Center for Environmental Health and hosts the Munich Information Center for Protein Sequences (MIPS).

Manufacturer Web Site PPI Spider

Price Contact manufacturer.

G6G Abstract Number 20484

G6G Manufacturer Number 101822