Abstract: MethyCancer, the database of human DNA Methylation and Cancer, was developed to study the interplay of DNA methylation, gene expression, and cancer.

It hosts both highly integrated data on DNA methylation, cancer-related genes, mutations, and cancer information from public resources, and the CpG Island (CGI) clones derived from the manufacturer’s large-scale sequencing process. The interconnections between the different data types were analyzed and are presented in the database.

The search tool Methy&Cancer and the graphical tool MethyView (both described below) were developed to help users access all the data and the connections between them, and to view DNA methylation in the context of genomic and genetic data.

MethyCancer can function as both an information resource and an analysis platform for studying CGI distribution in human genes, alterations of DNA methylation patterns in promoter CGIs, the identification of novel cancer genes altered by DNA methylation alone or in combination with genetic events, and the discovery of novel epigenetic targets.

MethyCancer Data Content and Data Integration --

There are four (4) main types of data included in MethyCancer:

1) CGI clones and global CGI predictions;

2) DNA methylation data;

3) Cancer information, genes, and mutations; and

4) Correlation among DNA methylation, gene expression, and cancer.

The DNA methylation data were mainly integrated from the DNA Methylation Database (MethDB), the Human Epigenome Project (HEP) and the Methylation Landscape of Human Genome at Columbia University (Columbia).

The MethDB data contain human DNA methylation patterns, profiles, and total methylation content for different tissues and phenotypes, drawn from thousands of experiments.

The HEP data were produced by bisulfite DNA sequencing. The data types are analysis, trace, CpG variation, and sample, where ‘analysis’ is a cluster of traces and ‘CpG variation’ refers to the position of the C of each CpG.
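To make ‘CpG variation’ concrete, the following minimal Python sketch lists the position of the C of each CpG dinucleotide in a sequence. The function name `cpg_c_positions`, the 0-based coordinates, and the `offset` parameter are illustrative assumptions; they are not taken from HEP or MethyCancer.

```python
from typing import List


def cpg_c_positions(sequence: str, offset: int = 0) -> List[int]:
    """Return the position of the C in every CpG dinucleotide.

    Positions are 0-based within `sequence` and shifted by `offset`.
    This is an assumed convention; HEP's own coordinates may differ.
    """
    seq = sequence.upper()
    # A CpG is a C immediately followed by a G on the same strand.
    return [offset + i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]
```

Passing the genomic start of the sequence as `offset` would turn the local positions into chromosome coordinates under the same 0-based assumption.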

The Columbia data include a genome-scale dataset of methylated and unmethylated domains in human brain tissues.

MethyCancer Database Usage and Access --

MethyCancer provides users with an advanced search engine to query different data types and data interactions housed in the database.

Besides the simple keyword search, MethyCancer offers advanced searches, namely Methylation Search, Gene Search, Cancer Search, Clone Search, and Repeat Search.

For example, with the Methylation Search, users can specify and combine query options such as:

Methylation type (pattern, profile, content, domain); data source [Beijing Institute of Genomics (BIG), University Health Network (UHN), MethDB, HEP, Columbia]; experimental method; sample information (tissue, sex, age, phenotype); and chromosomal position.
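The way these options combine can be sketched as a conjunctive filter over methylation records. This is a minimal illustration only: the `MethylationRecord` fields and the `methylation_search` function are hypothetical names, not the MethyCancer schema or API, and the sex, age, and phenotype options are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class MethylationRecord:
    # Hypothetical fields that mirror the query options described in the text.
    methylation_type: str  # "pattern", "profile", "content", or "domain"
    source: str            # "BIG/UHN", "MethDB", "HEP", or "Columbia"
    method: str            # experimental method
    tissue: str
    chromosome: str
    start: int             # inclusive
    end: int               # inclusive


def methylation_search(
    records: Iterable[MethylationRecord],
    methylation_type: Optional[str] = None,
    sources: Optional[Set[str]] = None,
    method: Optional[str] = None,
    tissue: Optional[str] = None,
    region: Optional[Tuple[str, int, int]] = None,
) -> List[MethylationRecord]:
    """Return the records that satisfy every criterion that is not None."""
    hits = []
    for rec in records:
        if methylation_type is not None and rec.methylation_type != methylation_type:
            continue
        if sources is not None and rec.source not in sources:
            continue
        if method is not None and rec.method != method:
            continue
        if tissue is not None and rec.tissue != tissue:
            continue
        if region is not None:
            chrom, start, end = region
            # Keep only records that overlap the requested interval.
            if rec.chromosome != chrom or rec.end < start or rec.start > end:
                continue
        hits.append(rec)
    return hits
```

Leaving an option as `None` means "no restriction", which matches the idea of specifying only the options of interest and combining them freely.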

Note: Gene Search and Cancer Search are the recommended starting points for exploring MethyCancer, as users are likely to be more familiar with gene and cancer terms.

Methy&Cancer -

Methy&Cancer is a special module dedicated to studying the relationships among DNA methylation, genes, and cancer.

Users can query the interconnections between the three (3) data types and study the methylation status of certain cancer genes in a specific tissue.

For example, a user interested in breast cancer may want to know how many breast cancer genes have supporting experimental methylation data from BIG/UHN and HEP. To find out, the user needs to:

Select the two (2) gene categories ‘Annotated/Candidate cancer genes with experimental methylation data’ from the pull-down menu, type ‘breast tumor’ in the Cancer Name search field, and select BIG/UHN and HEP as the Methylation Data Source.

Users may also restrict the query to specific expression tissues and to genes with methylation data in promoter regions. The query returns a list of genes and methylation data, and a detailed report can be produced on demand.
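The breast cancer example amounts to joining three kinds of information: gene category, cancer association, and methylation data source. The sketch below shows one way to express that join. The `CancerGene` type, its fields, and the `methy_and_cancer` function are hypothetical, and matching a gene when it has data from any of the selected sources (rather than all of them) is an assumption about how the form behaves.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class CancerGene:
    # Hypothetical fields chosen to mirror the search form in the text.
    symbol: str
    category: str                        # "annotated" or "candidate"
    cancers: FrozenSet[str]              # associated cancer names
    methylation_sources: FrozenSet[str]  # sources with experimental data


def methy_and_cancer(
    genes: Iterable[CancerGene],
    categories: Set[str],
    cancer_name: str,
    sources: Set[str],
) -> List[CancerGene]:
    """Return genes in the chosen categories that are linked to the cancer
    and have methylation data from at least one selected source."""
    wanted = cancer_name.lower()
    return [
        gene
        for gene in genes
        if gene.category in categories
        and any(wanted == name.lower() for name in gene.cancers)
        # Assumption: data from ANY selected source is enough to match.
        and not gene.methylation_sources.isdisjoint(sources)
    ]
```

Under these assumptions, the query in the example corresponds to `methy_and_cancer(genes, {"annotated", "candidate"}, "breast tumor", {"BIG/UHN", "HEP"})`, and the length of the returned list answers the "how many genes" question.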

MethyView -

MethyView is a visualization tool developed to let users browse methylation data in the context of existing genome annotations.

MethyView is composed of three (3) hierarchically arranged layers of sub-viewers, namely ChroView, MethyLoci Survey, and MethyLoci Detail View.

ChroView shows an outline of a chromosome, including statistics on cancer genes, methylation-related genes, and methylation data. ChroView allows users to center the map on a specific chromosome band and expand it for more detailed views.

MethyLoci Survey shows where the MethyLoci are located in chromosomal regions and the distribution of their source data (BIG/UHN, MethDB, HEP, Columbia, CGI predictions).

By clicking on MethyLoci or on the original methylation data from the different resources, users can switch to the MethyLoci Detail View, which displays methylation levels in the form of a matrix.

Each color-coded square in this matrix represents one CpG site, and its color indicates the methylation level: the scale runs from yellow (0% methylation) to blue (100% methylation), and gray marks CpG sites whose methylation level could not be determined.
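The color coding can be illustrated with a small mapping function. This is a sketch under stated assumptions: the source specifies only the endpoints (yellow, blue) and the gray fallback, so the linear RGB interpolation, the exact RGB values, and the name `methylation_color` are all illustrative choices and not MethyView's actual palette.

```python
from typing import Optional, Tuple

RGB = Tuple[int, int, int]

# Assumed RGB values; only the color names come from the text.
YELLOW: RGB = (255, 255, 0)  # 0% methylation
BLUE: RGB = (0, 0, 255)      # 100% methylation
GRAY: RGB = (190, 190, 190)  # level could not be determined


def methylation_color(level: Optional[float]) -> RGB:
    """Map a methylation percentage (0 to 100) to an RGB color.

    `None` stands for a CpG site whose level could not be determined.
    """
    if level is None:
        return GRAY
    if not 0.0 <= level <= 100.0:
        raise ValueError("methylation level must be between 0 and 100")
    fraction = level / 100.0
    # Interpolate each channel linearly between yellow and blue.
    red, green, blue = (
        round(low + (high - low) * fraction) for low, high in zip(YELLOW, BLUE)
    )
    return (red, green, blue)
```

One caveat of this simple scheme is that the midpoint of a linear yellow-to-blue ramp is itself a neutral gray, so a real viewer would likely use a palette that keeps intermediate levels visually distinct from the "undetermined" color.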

Clicking on a square reveals the level of methylation observed at that particular CpG site. Each row represents all the CpG sites of one methylation record shown at the bottom of the MethyLoci Detail View.

Rows of squares are grouped by data source and tissue. If the user continues to zoom in on the MethyLoci, the absolute position of each CpG site is displayed.

At each layer of MethyView, gene distribution is plotted and chromosome coordinates and transcripts are also shown.

When a specific gene is enlarged, the methylation data are shown along the precise gene structure (promoters, exons, and introns), together with the distribution of repetitive sequences.

A detailed report for each element in the visualization system can be displayed on demand.

Links to MethyView are also available in various Search result pages.

