SNP Prioritization Online Tool (SPOT)

Category Genomics>Genetic Data Analysis/Tools

Abstract SPOT (SNP Prioritization Online Tool) is a web site for integrating biological databases into the prioritization of Single Nucleotide Polymorphisms (SNPs) for further study after a genome-wide association study (GWAS).

Typically, the next step after a GWAS is to genotype the top signals in an independent replication sample.

Investigators will often incorporate information from biological databases so that biologically relevant SNPs, such as those in genes related to the phenotype or those with potentially non-neutral effects on gene expression (for example, SNPs at splice sites), are given higher priority.

The manufacturers recently introduced the genomic information network (GIN) method for systematically implementing this kind of strategy.

The SPOT web site allows users to upload a list of SNPs and GWAS P-values, and it returns a list of SNPs prioritized using the GIN method.

Users can specify candidate genes or genomic regions with custom levels of prioritization. The results can be downloaded or viewed in a browser where users can interactively explore the details of each SNP, including graphical representations of the GIN method.

For investigators interested in incorporating biological databases into a post-GWAS SNP selection strategy, the SPOT web tool is an easily implemented and flexible solution.

Genomic Information Networks (GIN) --

As stated above, SPOT implements the GIN prioritization method using a secure, interactive web-based approach. The user uploads a list of SNPs and may optionally include P-values from statistical tests of a genotype-phenotype correlation.

The user may also upload a list of genes, genomic regions or specific SNPs with custom prioritization scores that determine how SPOT uses the GIN method to prioritize the SNPs.

The GIN prioritization method is designed to be flexible and allow a variety of biological databases to be incorporated while remaining viable and interpretable.

The results may be viewed directly in a web browser or downloaded in various formats. The GIN prioritization method is not designed to predict causal variants, and the prioritization results are not intended to be used to interpret the statistical significance of the GWAS results.

Rather, it is intended to help users incorporate a broad range of biological hypotheses into the prioritization process through a transparent method of specifying their biological priorities. The results are easily interpreted in terms of what went into the model and exactly how that information was used to prioritize the SNPs.

The final results of the GIN prioritization process may be viewed in an interactive table within a web browser (see the SPOT User’s Guide for screenshots and additional details).

The table shows the original P-values and their ranks in the column ‘Rank: p-value’, and the SPOT rankings from the GIN prioritization method in the column ‘Rank: SPOT’, which are determined by the ‘GIN weighted p-value’ column.

The ‘Rank: SPOT’ column is the most important item in the table as it reflects the priority of the SNP from the GIN prioritization method.
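
As a minimal sketch of how the two rank columns relate, the Python below ranks three made-up SNPs first by raw P-value and then by a weighted P-value. The weighting rule used here (dividing the P-value by 10 raised to a prioritization score) is an assumption made for illustration; this description does not define the formula. The SNP identifiers, scores and function names are likewise hypothetical.

```python
from typing import Dict


def rank_by(values: Dict[str, float]) -> Dict[str, int]:
    """Return 1-based ranks, smallest value first (ties broken by SNP id)."""
    ordered = sorted(values, key=lambda snp: (values[snp], snp))
    return {snp: i + 1 for i, snp in enumerate(ordered)}


def weighted_p(p_value: float, score: float) -> float:
    """ASSUMPTION: a higher prioritization score shrinks the P-value tenfold per unit."""
    return p_value / (10.0 ** score)


# Hypothetical input: SNP id -> (GWAS P-value, prioritization score).
snps = {
    "rs0000001": (1e-6, 0.0),
    "rs0000002": (5e-6, 2.0),
    "rs0000003": (2e-5, 0.5),
}

p_values = {snp: p for snp, (p, _) in snps.items()}
gin_values = {snp: weighted_p(p, s) for snp, (p, s) in snps.items()}

rank_p = rank_by(p_values)       # analogue of the 'Rank: p-value' column
rank_spot = rank_by(gin_values)  # analogue of the 'Rank: SPOT' column

for snp in snps:
    print(snp, rank_p[snp], rank_spot[snp])
```

In this toy example the second SNP has a weaker raw P-value than the first but a high prioritization score, so it moves to the top of the weighted ranking.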

A graphical column selection tool can be used to configure the output tables. The user may select which columns are displayed and their order.

This also allows the user to view additional information such as detailed gene/mapping properties, HapMap allele frequencies and commercial SNP microarray linkage disequilibrium (LD) tagging properties.

Columns labeled with an asterisk refer to the LD proxy when one is being used, and information about the original or ‘source’ SNP may be viewed using the column selection tool.

Using SPOT --

The main page of the web site consists of a web form for primary user input.

Users must first either upload a list of numeric SNP identification numbers as a file or enter the list directly into the web form.

The user may optionally provide P-values from statistical tests for genotype-phenotype correlation.
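
The kind of input described above could be prepared or checked locally before upload. The sketch below assumes a hypothetical layout of one SNP per line, with an optional P-value separated by whitespace or a comma; the formats SPOT actually accepts are documented in its User’s Guide, not here. The function name is illustrative.

```python
from typing import List, Optional, Tuple


def parse_snp_lines(text: str) -> List[Tuple[int, Optional[float]]]:
    """Parse lines of 'SNP [P-value]' into (numeric id, P-value or None) pairs."""
    records: List[Tuple[int, Optional[float]]] = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.replace(",", " ").split()
        snp = fields[0].lower()
        if snp.startswith("rs"):
            snp = snp[2:]  # keep only the numeric part of the identifier
        if not snp.isdecimal():
            raise ValueError(f"line {line_no}: not a numeric SNP id: {fields[0]!r}")
        p_value: Optional[float] = None
        if len(fields) > 1:
            p_value = float(fields[1])
            if not 0.0 <= p_value <= 1.0:
                raise ValueError(f"line {line_no}: P-value out of range: {p_value}")
        records.append((int(snp), p_value))
    return records


example = "rs123 0.001\n456\n\n789,1e-5\n"
parsed = parse_snp_lines(example)
print(parsed)
```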

In the section labeled ‘Prioritization of specific genes and other genomic regions’, the user can specify a list of genes and prioritization scores to be used in the ‘User Gene’ node of the GIN model.

A number of methods can be used to specify genes and genomic regions.
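
To make the idea of region-level prioritization concrete, the sketch below assigns a user-supplied score to any SNP whose position falls inside a user-specified region. The record layout, the rule of taking the highest score among overlapping regions, and the default of zero are all assumptions for illustration; they are not taken from SPOT itself.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A user-specified gene or genomic region with a custom prioritization score."""
    chrom: str
    start: int  # inclusive
    end: int    # inclusive
    score: float


def user_score(chrom: str, pos: int, regions: List[Region]) -> float:
    """Return the highest score among regions containing the SNP, or 0.0 if none."""
    hits = [r.score for r in regions if r.chrom == chrom and r.start <= pos <= r.end]
    return max(hits, default=0.0)


# Hypothetical regions and coordinates.
regions = [
    Region("15", 1_000, 2_000, 2.0),
    Region("15", 1_500, 3_000, 1.0),
    Region("8", 500, 900, 3.0),
]

print(user_score("15", 1_600, regions))  # inside both chromosome 15 regions
print(user_score("15", 2_500, regions))  # inside the second region only
print(user_score("1", 100, regions))     # no overlapping region
```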

SPOT’s Programmatic Architecture --

MySQL is used via the Perl DBI module to process and store the user's input and implement the GIN prioritization method.

During this execution, biological information is integrated from a local MySQL relational database. All results are stored as tables in a MySQL relational database which is then accessed by the web interface.
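
The general pattern described here (store the upload, join it against locally held annotation, and write the result to a table that the interface reads) can be sketched as follows. This is not SPOT's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL and Perl DBI, and every table name, column name and value is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_snps (rs_id INTEGER PRIMARY KEY, p_value REAL);
    CREATE TABLE snp_annotation (rs_id INTEGER PRIMARY KEY, gene TEXT);
    """
)

# Hypothetical user upload and hypothetical local annotation.
conn.executemany("INSERT INTO user_snps VALUES (?, ?)", [(101, 1e-6), (102, 5e-6)])
conn.executemany("INSERT INTO snp_annotation VALUES (?, ?)", [(101, "GENE_A")])

# Integrate annotation with the upload and store the outcome as its own table.
conn.execute(
    """
    CREATE TABLE results AS
    SELECT u.rs_id, u.p_value, a.gene
    FROM user_snps AS u
    LEFT JOIN snp_annotation AS a ON a.rs_id = u.rs_id
    """
)

# A front end would read from the stored results table.
rows = conn.execute(
    "SELECT rs_id, p_value, gene FROM results ORDER BY p_value"
).fetchall()
print(rows)
conn.close()
```

The LEFT JOIN keeps SNPs that have no annotation, so an uploaded SNP is never silently dropped from the results.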

SPOT’s web interface makes use of different technologies and programming languages.

Note: The manufacturers have created a Wiki for SPOT with detailed technical information and resources for downloading and installing a local copy of SPOT, including the underlying MySQL relational database of biological information (see "How to install SPOT").

SPOT’s Security --

The results of a GWAS are sensitive information, and SPOT takes several steps to ensure that this information is protected during a session and destroyed when the session is complete.

SPOT uses a DigiCert encryption certificate so that all communication between the user and the server is secure.

Any information the user uploads is destroyed after three (3) hours or immediately at the user’s request.
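
The retention rule above can be illustrated with a small sketch. The three-hour limit comes from the text; the in-memory dictionary, the session identifiers and the function names are hypothetical, and SPOT's actual cleanup mechanism is not described here.

```python
from datetime import datetime, timedelta
from typing import Dict, List

RETENTION = timedelta(hours=3)  # retention period stated in the text


def purge_expired(uploads: Dict[str, datetime], now: datetime) -> List[str]:
    """Delete uploads that are at least RETENTION old; return the removed session ids."""
    expired = [sid for sid, uploaded_at in uploads.items() if now - uploaded_at >= RETENTION]
    for sid in expired:
        del uploads[sid]
    return expired


def delete_now(uploads: Dict[str, datetime], session_id: str) -> bool:
    """Delete one upload immediately at the user's request; return True if it existed."""
    return uploads.pop(session_id, None) is not None


# Hypothetical sessions keyed by id, with upload times.
now = datetime(2024, 1, 1, 12, 0, 0)
uploads = {
    "session-a": now - timedelta(hours=4),     # past the limit
    "session-b": now - timedelta(minutes=30),  # still within the limit
    "session-c": now - timedelta(hours=1),     # removed on request below
}

removed = purge_expired(uploads, now)
requested = delete_now(uploads, "session-c")
print(removed, requested, sorted(uploads))
```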

SPOT’s Future Development --

Future iterations of SPOT will include additional biological information such as expression quantitative trait loci, transcription factor binding sites, micro RNA target sites and other GWAS results.

A useful feature would also be to add predefined gene sets (databases) to SPOT’s query tool, for example the NeuroSNP database, which consists of genes related to addiction-related diseases.

