SAM (Significance Analysis of Microarrays)

Category Genomics>Gene Expression Analysis/Profiling/Tools

Abstract SAM (Significance Analysis of Microarrays) is a supervised learning software product that can be used for genomic expression data mining and analysis. SAM is a statistical technique for finding significant genes in a set of microarray experiments. It was proposed by Tusher, Tibshirani and Chu. The software was written by Balasubramanian Narasimhan and Robert Tibshirani. The input to SAM is gene expression measurements from a set of microarray experiments, together with a response variable from each experiment. The response variable may be a two-class grouping such as untreated versus treated (either unpaired or paired), a multiclass grouping (such as breast cancer, lymphoma, colon cancer, ...), a quantitative variable (such as blood pressure) or a possibly censored survival time. SAM computes a statistic d_i for each gene i, measuring the strength of the relationship between gene expression and the response variable. It uses repeated permutations of the data to determine whether the expression of any gene is significantly related to the response. The cutoff for significance is determined by a tuning parameter 'delta', chosen by the user based on the false positive rate. One can also choose a 'fold change' parameter, to ensure that called genes change by at least a pre-specified amount.
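The permutation idea described in the abstract can be illustrated with a minimal Python sketch for the two-class unpaired case. It is not the SAM software itself. The statistic is taken here as the difference in group means divided by a pooled standard error plus a small constant s0, which is the two-class form given in the Tusher, Tibshirani and Chu paper. The function names, the default values of s0 and n_perm, and the simplified calling rule (a gene is called when its observed statistic differs from the permutation average at the same rank by more than delta) are assumptions made for illustration only.

```python
import math
import random


def relative_difference(values, labels, s0=0.1):
    """Statistic d for one gene: difference in group means over (pooled SE + s0).

    `labels` holds 0 or 1 per sample. `s0` is a small positive constant
    (hypothetical default) that keeps d stable when the variance is tiny.
    """
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    n0, n1 = len(g0), len(g1)
    m0, m1 = sum(g0) / n0, sum(g1) / n1
    ss = sum((v - m0) ** 2 for v in g0) + sum((v - m1) ** 2 for v in g1)
    s = math.sqrt((1 / n0 + 1 / n1) * ss / (n0 + n1 - 2))
    return (m1 - m0) / (s + s0)


def call_significant(data, labels, delta, s0=0.1, n_perm=200, seed=0):
    """Return indices of genes whose d differs from its permutation average by > delta.

    `data` is a list of genes, each a list of expression values per sample.
    This is a simplified calling rule, not the exact rule used by SAM.
    """
    rng = random.Random(seed)
    # Observed statistics, sorted so each can be compared at its own rank.
    observed = sorted(
        (relative_difference(row, labels, s0), i) for i, row in enumerate(data)
    )
    # Average the sorted statistics over random relabelings of the samples.
    expected = [0.0] * len(data)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)
        null = sorted(relative_difference(row, perm, s0) for row in data)
        for rank, d in enumerate(null):
            expected[rank] += d / n_perm
    return sorted(i for (d, i), e in zip(observed, expected) if abs(d - e) > delta)
```

Raising delta in this sketch calls fewer genes and lowering it calls more, which mirrors the role of the adjustable threshold described in the abstract.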

Product features/capabilities include:

1) SAM is a convenient Excel Add-in.

2) SAM can be applied to data from Oligo or cDNA arrays, SNP arrays, protein arrays, etc.

3) SAM correlates expression data to clinical parameters including treatment, diagnosis categories, survival time, paired (before and after), quantitative (e.g. tumor volume) and one-class. Both parametric and non-parametric tests are offered.

4) SAM correlates expression data with time, to study time trends. The experimental units can fall into one or two classes, or be paired.

5) SAM provides automatic imputation of missing data via a nearest-neighbor algorithm (upgraded in SAM version 2.0).

6) Its adjustable threshold determines the number of genes called significant.

7) SAM uses data permutations to provide an estimate of the False Discovery Rate (FDR) for multiple testing.

8) SAM version 2.0 reports local false discovery rates and miss rates (see Note 1).

9) SAM can deal with 'blocked designs', for example, when treatments are applied within different batches of arrays.

10) SAM provides 'pattern discovery' via eigengenes (principal component analysis).

11) SAM provides 'gene lists' in Excel workbook form, exportable into TreeView, Cluster or other software systems (see Note 2).

12) Genes listed by SAM are web-linked to the Stanford SOURCE database.

13) SAM was developed at Stanford University Statistics and Biochemistry Labs.

14) SAM provides a 41-page user's guide and technical document in electronic Portable Document Format (PDF).
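Feature 5 above mentions imputation of missing data by a nearest-neighbor algorithm. The listing does not say how SAM implements it, so the following Python sketch only shows the general idea under stated assumptions: missing values are marked as None, similarity between genes is a root-mean-square distance over the samples observed in both genes, and a missing value is replaced by the average of the k closest genes that have a value for that sample. The function name knn_impute and the default k are hypothetical.

```python
import math


def knn_impute(data, k=2):
    """Return a copy of `data` (genes x samples) with None entries filled in.

    Each missing entry is replaced by the mean of that sample's values in the
    k most similar genes. This is an illustrative sketch, not SAM's own code.
    """
    filled = [list(row) for row in data]
    for i, row in enumerate(data):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        # Rank the other genes by distance over samples observed in both rows.
        neighbours = []
        for m, other in enumerate(data):
            if m == i:
                continue
            shared = [
                j for j in range(len(row))
                if row[j] is not None and other[j] is not None
            ]
            if not shared:
                continue
            dist = math.sqrt(
                sum((row[j] - other[j]) ** 2 for j in shared) / len(shared)
            )
            neighbours.append((dist, m))
        neighbours.sort()
        for j in missing:
            # Take the k nearest genes that actually have a value for sample j.
            donors = [data[m][j] for _, m in neighbours if data[m][j] is not None][:k]
            if donors:
                filled[i][j] = sum(donors) / len(donors)
    return filled
```

The input table is left unchanged, and an entry stays None if no other gene has a value for that sample.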

Note 1: Major new release 3.0, Jan 23, 2007. SAM now offers gene set analysis.

Note 2: Charlie Kim of Stanford's Falkow lab has written a neat tool called "samster". It takes the significant gene list from SAM and exports it into a file readable by either TreeView or Cluster (Mike Eisen's programs). Samster is available at Falkow lab software.

System Requirements

SAM 2.0 requires

Manufacturer

Manufacturer Web Site SAM

Price Contact manufacturer.

G6G Abstract Number 20066

G6G Manufacturer Number 102505