SNPscan

Category Genomics>Genetic Data Analysis/Tools

Abstract SNPscan is a web-accessible tool for analyzing and visualizing high-density Single Nucleotide Polymorphism (SNP) data. SNPscan accepts data exported from the Affymetrix Copy Number Analysis Tool (CNAT) as its input.

It allows users to upload large-scale SNP datasets obtained from the Affymetrix 10K, 50K Xba, 50K Hind or 100K (combined 50K Xba/Hind) SNP array platforms.

It enables researchers

(1) to visually and quantitatively assess the quality of user-generated SNP data relative to a benchmark data set derived from a control population;

(2) to display SNP intensity and allelic call data in order to detect chromosomal copy number anomalies (duplications and deletions);

(3) to display uniparental isodisomy based on loss of heterozygosity (LOH) across genomic regions;

(4) to compare paired samples (e.g. tumor and normal); and

(5) to generate a file for viewing SNP data in the University of California, Santa Cruz (UCSC) Human Genome Browser (see G6G Abstract Number 20197).

Implementation of SNPscan --

SNPscan was written in the Perl programming language, which handles uploads of the user's data, the setting of the user's selections, and the execution of functions implemented in the R programming language.

The R code executes SNPscan's algorithm and generates graphics in several formats, which are sent back to the user through the SNPscan website, written in HTML and running under an Apache Web server.

Data input - The input data for SNPscan is a text file generated by Affymetrix CNAT software, via its export tool.

SNPscan can be used with data from any platform, provided the data are supplied in the appropriate input format.

The columns that are used by SNPscan consist of SNP identifiers, chromosomal assignment and physical map position, and five additional columns for each array sample.
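The column layout described above implies a simple consistency check on an input file. The following is a minimal sketch, not SNPscan's own code. It assumes the file is tab-delimited and that the SNP identifier, chromosome and physical position occupy exactly three leading columns. The header names in the example are hypothetical placeholders, and real CNAT export headers will differ.

```python
import csv
import io

FIXED_COLUMNS = 3        # assumed: SNP identifier, chromosome, physical position
COLUMNS_PER_SAMPLE = 5   # stated in the text above


def count_samples(header):
    """Return the number of array samples implied by a header row."""
    extra = len(header) - FIXED_COLUMNS
    if extra <= 0 or extra % COLUMNS_PER_SAMPLE != 0:
        raise ValueError(f"unexpected column count: {len(header)}")
    return extra // COLUMNS_PER_SAMPLE


def read_header(text):
    """Read the first tab-delimited row from an export held in a string."""
    return next(csv.reader(io.StringIO(text), delimiter="\t"))


# Hypothetical header for two samples; real CNAT column names will differ.
example = "\t".join(
    ["snp_id", "chrom", "position"]
    + [f"s{i}_col{j}" for i in (1, 2) for j in range(1, 6)]
)
print(count_samples(read_header(example)))
```

A check of this kind can catch a truncated or mis-exported file before it is uploaded.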

The input file for a typical SNPscan analysis, with data on ~58,000 SNPs for ten individuals, is approximately 18 megabytes in size. The manufacturer has set a default limit of 40 megabytes on the size of an uploaded file.
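The figures above allow a rough estimate of how many samples fit under the default upload limit. The sketch below assumes that file size grows linearly with the number of samples at a fixed SNP count (~58,000) and ignores the space taken by the fixed SNP columns, so it slightly overestimates the cost per sample. The function names are illustrative only.

```python
def estimate_upload_mb(n_samples, reference_mb=18.0, reference_samples=10):
    """Linear estimate of the upload size in megabytes for n_samples."""
    return reference_mb * n_samples / reference_samples


def max_samples(limit_mb=40.0, reference_mb=18.0, reference_samples=10):
    """Largest whole number of samples whose estimated size fits the limit."""
    per_sample_mb = reference_mb / reference_samples
    return int(limit_mb // per_sample_mb)


print(estimate_upload_mb(20))  # estimated size for twenty individuals
print(max_samples())           # samples that fit under the 40 megabyte default
```

Under these assumptions, roughly 22 individuals fit in a single upload at the default limit.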

Note: The availability of source code at the SNPscan website enables users to install the program locally and increase the file upload size.

There are three (3) upload web pages, one for each of the SNPscan tools (described below).

Sample data files are provided at the SNPscan website to help users become accustomed to the tools. SNPscan currently supports analyses using NCBI data.

There are three (3) main outputs of SNPscan: a web page called SNPscanPlot, a wiggle track (WIG) file that is used to visualize SNP data in a human genome browser, and a series of plots and tables that summarize SNP genotype and copy number data.

1) Data output - SNPscanPlot --

SNPscanPlot visualizes data on SNP genotype and copy number. Its combination of allelic calls with copy number is a unique feature of SNPscan. The plotting order (LOH, p value, homozygous calls, heterozygous calls, NoCalls) is also specific to SNPscan.

Together, these two design choices allow SNPscan to display, and thus help discover, a variety of chromosomal anomalies.

SNPscanPlot can also be used to visualize micro-deletions and Uniparental Disomy (UPD).

Graphical output file formats - Each SNPscanPlot result can be provided in several formats. For a quick overview, the server returns a portable network graphics (PNG) image at a 20% reduction in size so that it fits the screen better.

The user can click on the PNG figure to obtain a tagged image file format (TIFF) file displayed at the full size specified by the user. This allows a more detailed view, but is slower to download because of the larger file size.

Links are also provided to obtain a portable document format (PDF) file, as well as a PostScript (PS) file, to permit extremely detailed viewing and printing.

2) Data output - genome browser - The SNPscan program allows input data files to be converted to the WIG file format. This allows the display of continuous-valued data in a track format that is compatible with the UCSC Genome Browser.

Many dozens of additional tracks may be added to the browser, and the data may be explored from a scale of several bases to the entire length of a chromosome.
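To illustrate the kind of file involved, here is a minimal sketch that writes per-SNP values as a variableStep WIG track, the UCSC format for continuous-valued data at irregularly spaced positions. It is not SNPscan's converter. The function name, the track name and the choice of which value to plot are assumptions made for this example.

```python
from collections import defaultdict


def to_wig(records, track_name="SNP copy number"):
    """Build variableStep WIG text from (chrom, position, value) tuples.

    chrom is a bare label such as "2" or "X"; position is 1-based.
    """
    by_chrom = defaultdict(list)
    for chrom, position, value in records:
        by_chrom[str(chrom)].append((int(position), float(value)))

    lines = [f'track type=wiggle_0 name="{track_name}"']
    # Chromosome blocks are emitted in plain string order.
    for chrom in sorted(by_chrom):
        lines.append(f"variableStep chrom=chr{chrom}")
        # Positions within a block are written in ascending order.
        for position, value in sorted(by_chrom[chrom]):
            lines.append(f"{position} {value:g}")
    return "\n".join(lines) + "\n"


# Made-up values for illustration only.
print(to_wig([("2", 500, 1.9), ("1", 200, 2.1), ("1", 100, 2.0)]), end="")
```

The resulting text can be saved to a file and loaded as a custom track in the UCSC Genome Browser.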

3) Data output - plots and tables - A third feature of SNPscan is a group of graphical and tabular summary statistics to describe the genotype and copy number calls from a SNP data set.

Examples that can be displayed include a tabular listing of SNPs organized by the size of each group of consecutive homozygous calls, and a plot of the number of blocks of homozygous SNPs as a function of block length.
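The two summaries described above can be sketched as follows. This is an illustration under stated assumptions rather than SNPscan's R implementation: the genotype labels "AA", "BB", "AB" and "NoCall" are hypothetical, calls are assumed to be ordered by physical position on one chromosome, and a NoCall is treated as ending a homozygous block, which is a design choice of this sketch.

```python
from collections import Counter


def homozygous_runs(calls):
    """Return (start_index, length) for each run of consecutive homozygous calls."""
    runs = []
    start = None
    for i, call in enumerate(calls):
        if call in ("AA", "BB"):
            if start is None:
                start = i          # a new block begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None           # a heterozygous call or NoCall ends the block
    if start is not None:
        runs.append((start, len(calls) - start))
    return runs


def run_length_counts(runs):
    """Count how many blocks there are of each length."""
    return Counter(length for _, length in runs)


calls = ["AA", "BB", "AB", "AA", "AA", "AA", "NoCall", "BB", "BB"]
runs = homozygous_runs(calls)
# Tabular view: blocks listed from longest to shortest.
for start, length in sorted(runs, key=lambda r: r[1], reverse=True):
    print(start, length)
# Plot data: number of blocks for each block length.
print(sorted(run_length_counts(runs).items()))
```

Long blocks in the first listing are the candidate regions of homozygosity, and the counts give the data for the block-length plot.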

Validation of SNPscan --

The manufacturer validated SNPscan using data generated from patients with known deletions, duplications, and uniparental disomy.

The manufacturer also inspected previously generated SNP data from 90 apparently normal individuals from the Centre d'Étude du Polymorphisme Humain (CEPH) collection, and identified three cases of uniparental isodisomy, four females having an apparently mosaic X chromosome, two mislabeled SNP data sets, and one micro-deletion on chromosome 2 with mosaicism in an apparently normal female.

These previously unrecognized abnormalities were all detected using SNPscan.

The micro-deletion was independently confirmed by fluorescence in situ hybridization, and a region of homozygosity in a UPD case was confirmed by sequencing of genomic DNA.

System Requirements

Web based. Also uses UCSC Genome Browser.

Manufacturer

SNPscan was written in the laboratory of Jonathan Pevsner, Kennedy Krieger Institute, Baltimore, MD and was developed in collaboration with the Department of Biostatistics in the Johns Hopkins Bloomberg School of Public Health.

Manufacturer Web Site SNPscan

Price Contact manufacturer.

G6G Abstract Number 20318

G6G Manufacturer Number 101645