miRBase

Category Cross-Omics>Knowledge Bases/Databases/Tools

Abstract The miRBase database is a searchable database of published miRNA sequences and annotation.

Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR).
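The relationship between a hairpin entry and its mature product can be illustrated with a minimal sketch. The class names, the entry names and the sequence below are hypothetical, and 1-based inclusive coordinates are assumed; this is not miRBase's own data model or file format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MatureLocation:
    """Position of a mature miRNA (miR) within its hairpin (mir)."""
    name: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


@dataclass(frozen=True)
class HairpinEntry:
    """A hairpin precursor entry with the locations of its mature products."""
    name: str
    sequence: str
    matures: Tuple[MatureLocation, ...]

    def mature_sequence(self, loc: MatureLocation) -> str:
        # Convert the 1-based inclusive interval to a Python slice.
        return self.sequence[loc.start - 1:loc.end]


# Hypothetical entry; the sequence is made up for illustration only.
entry = HairpinEntry(
    name="xxx-mir-0",
    sequence="GGCUAGCUAGGAUCCGAUCGAUUACGCUAGCC",
    matures=(MatureLocation("xxx-miR-0", 3, 10),),
)
mature = entry.mature_sequence(entry.matures[0])
```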

Both hairpin and mature sequences are available for searching and browsing, and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download.

miRBase is one of the primary online repositories for all microRNA sequences and annotation.

The current release (Release 19 - as of August 2012) of the database contains 21,264 entries representing hairpin precursor miRNAs, expressing 25,141 mature miRNA products, in 193 species.

Deep-sequencing technologies have delivered a sharp rise in the rate of novel microRNA discovery.

The manufacturers have mapped reads from short RNA deep-sequencing experiments to microRNAs in miRBase and developed web interfaces to view these mappings.

The user can view all read data associated with a given microRNA annotation, filter reads by experiment and count, and search for microRNAs by tissue- and stage-specific expression.

This data can serve as a proxy for relative expression levels of microRNA sequences, provide detailed evidence for microRNA annotations and alternative isoforms of mature microRNAs, and allow previous annotations to be revisited.

Main Aims of miRBase --

The main aims of miRBase are:

1) To curate a consistent nomenclature scheme by which novel microRNAs are named;

2) To act as the central repository for all published microRNA sequences, and to facilitate online searching and bulk download of all microRNA data;

3) To provide human-readable and computer-parsable annotation of microRNA sequences (for example, functional data, references, genome mappings...);

4) To provide access to the primary evidence that supports microRNA annotations; and

5) To link to and aggregate microRNA target predictions and validations.

miRBase was designed to be a focused resource --

From its inception, miRBase was designed to be a focused resource that could make and facilitate significant contributions in a rapidly growing field.

The manufacturers currently capture several data types, for example the experimental method used to identify the sequence, from user submissions and from publications that describe novel microRNAs.

Alongside the microRNA name, miRBase assigns a stable accession number to each stem-loop and mature sequence to allow tracking of improved annotations between releases.

The primary transcripts of almost all microRNAs remain unannotated, but the manufacturers aim to develop a mechanism to include annotations as they are determined.

Where genome assemblies are available, microRNAs are mapped to their locations, clusters of microRNAs are highlighted, and overlaps with annotated protein-coding genes are described.
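How clusters of microRNAs might be derived from genome coordinates can be sketched as follows. The function name, the locus names and the 10,000-nucleotide distance threshold are assumptions made for illustration; the source does not state the rule miRBase applies.

```python
from collections import defaultdict


def find_clusters(loci, max_gap=10_000):
    """Group loci on the same chromosome whose gap is at most max_gap.

    loci: iterable of (name, chromosome, start, end) tuples.
    max_gap: assumed distance threshold in nucleotides (hypothetical value).
    Returns a list of (chromosome, [names]) for groups of two or more loci.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in loci:
        by_chrom[chrom].append((start, end, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current, last_end = [], 0
        for start, end, name in sorted(by_chrom[chrom]):
            # Close the running group when the next locus is too far away.
            if current and start - last_end > max_gap:
                if len(current) > 1:
                    clusters.append((chrom, current))
                current, last_end = [], 0
            current.append(name)
            last_end = max(last_end, end)
        if len(current) > 1:
            clusters.append((chrom, current))
    return clusters


# Hypothetical loci for illustration only.
example_loci = [
    ("mir-a", "chr1", 1000, 1080),
    ("mir-b", "chr1", 1500, 1590),
    ("mir-c", "chr1", 50000, 50070),
    ("mir-d", "chr2", 200, 280),
]
example_clusters = find_clusters(example_loci)
```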

Families of microRNAs are constructed, and links are provided to entries in other databases, to predicted targets, and to the primary literature.

miRBase provides several methods to access the sequence data: by browsing, sequence similarity, genomic coordinate intervals, keyword search and bulk download.

Incorporating Data from Deep-Sequencing Experiments --

The majority of the evidence supporting microRNA annotations now comes from deep-sequencing experiments.

The manufacturers have developed an interface to view reads from RNA deep-sequencing data mapped to microRNA loci on the miRBase web-site.

Briefly, the manufacturers identify entries in the Gene Expression Omnibus (GEO) whose short-read data are associated with references reporting novel microRNAs.

The manufacturers extract the read sequences and counts from the GEO entry and map the reads to the set of miRBase hairpin sequences for a given organism using Bowtie, allowing at most two (2) mismatches between the read and the hairpin sequence.

[Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes.]
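The mismatch criterion can be illustrated with a toy ungapped scan in pure Python. This is a stand-in to show what "at most two mismatches" means, not Bowtie's algorithm or interface; the function name and the sequences are hypothetical.

```python
def map_read(read, hairpin, max_mismatches=2):
    """Return (position, mismatches) for every ungapped placement of `read`
    on `hairpin` with at most `max_mismatches` mismatches.

    Positions are 1-based. This brute-force scan is for illustration only.
    """
    hits = []
    for offset in range(len(hairpin) - len(read) + 1):
        window = hairpin[offset:offset + len(read)]
        # Count positions where read and hairpin disagree.
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((offset + 1, mismatches))
    return hits


# Made-up sequences for illustration only.
toy_hairpin = "AAAACCCCGGGGUUUU"
toy_read = "CCCCGGGG"
toy_hits = map_read(toy_read, toy_hairpin)            # up to two mismatches
exact_hits = map_read(toy_read, toy_hairpin, max_mismatches=0)
```

In this toy case the read matches exactly at position 5 and with two mismatches at the neighbouring positions 4 and 6, which shows why a mismatch filter is useful downstream.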

The first deep-sequencing data sets incorporated into miRBase are from human, Drosophila melanogaster, Arabidopsis thaliana, rice and three (3) nematode genomes.

The view of deep-sequencing data can be accessed from miRBase stem-loop and mature pages. Reads can be filtered by the number of mismatches to the hairpin sequence, the read count and by experiment.
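The three filters described above can be sketched over an in-memory list of mapped reads. The record fields, the function name and the experiment identifiers are hypothetical; the sketch does not reflect how the miRBase web interface is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedRead:
    """A read mapped to a hairpin (hypothetical record layout)."""
    sequence: str
    experiment: str
    count: int
    mismatches: int


def filter_reads(reads, max_mismatches=2, min_count=1, experiments=None):
    """Keep reads that pass the mismatch, read-count and experiment filters.

    experiments: optional collection of experiment identifiers to keep;
    None means all experiments are accepted.
    """
    allowed = set(experiments) if experiments is not None else None
    return [
        r for r in reads
        if r.mismatches <= max_mismatches
        and r.count >= min_count
        and (allowed is None or r.experiment in allowed)
    ]


# Made-up reads and experiment identifiers for illustration only.
example_reads = [
    MappedRead("CCCCGGGG", "EXP1", 120, 0),
    MappedRead("CCCCGGGU", "EXP1", 3, 1),
    MappedRead("ACCCGGGU", "EXP2", 45, 2),
]
perfect_only = filter_reads(example_reads, max_mismatches=0)
abundant_exp2 = filter_reads(example_reads, min_count=10, experiments=["EXP2"])
```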

Each experiment is annotated with species, tissue, stage, and methodology information.

These tags enable the user to search for experiments by tissue expression on the miRBase search page (located on the manufacturer’s web-site).

Read counts for mature microRNAs are commonly used as a proxy for relative expression levels. As the number of deep-sequencing experiments increases, this data provides extensive information about the expression profiles of microRNAs across tissues, stages, and organisms.
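To compare read counts between experiments of different sequencing depth, counts are often scaled to reads per million. The sketch below shows that common normalisation as an assumption; the source does not say which normalisation, if any, miRBase applies, and the names and counts are made up.

```python
def reads_per_million(counts):
    """Scale raw read counts so that they sum to one million.

    counts: dict mapping a mature microRNA name to its raw read count
    within a single experiment.
    """
    total = sum(counts.values())
    if total == 0:
        # Avoid division by zero for an experiment with no mapped reads.
        return {name: 0.0 for name in counts}
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Made-up counts for illustration only.
raw_counts = {"miR-a": 750, "miR-b": 250}
rpm = reads_per_million(raw_counts)
```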

miRBase Future developments --

Improving community contribution of, and access to, microRNA data -

miRBase is a community resource. The majority of the primary sequence data is submitted by users. The manufacturers plan to improve methods and interfaces for both data access and data submission.

For example, web-services will allow programmatic access to all miRBase data, and batch search and download tools will be made available. The manufacturers plan to allow users to add and update textual annotation in a Wiki interface.

Expansion to include microRNA candidates and predictions -

As stated above, deep-sequencing experiments provide the majority of novel microRNA annotations.

Next-generation experimental techniques also provide low-level evidence for comparatively large numbers of lower confidence sequences, often published as candidate microRNAs, which fall below the current standards and are therefore not present in miRBase.

The manufacturers plan to extend the database to include different classes of data as supplements, with associated confidence levels for each annotation.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site miRBase

Price Contact manufacturer.

G6G Abstract Number 20742

G6G Manufacturer Number 104328