miRExpress
Category Genomics>Gene Expression Analysis/Profiling/Tools and Cross-Omics>Next Generation Sequence Analysis/Tools
Abstract miRExpress is a systematic method for extracting miRNA expression profiles from sequencing reads generated by second (next)-generation sequencing.
miRExpress is one of the first stand-alone packages that contains miRNA information from miRBase and efficiently reveals miRNA expression profiles by aligning sequencing reads against the sequences of known miRNAs.
This approach can be used to determine miRNA expression profiles when genomic sequences are unavailable and can greatly reduce the time spent aligning sequencing reads against genomic sequences.
Furthermore, miRExpress can be used to find novel miRNA candidates by aligning reads with sequences of known miRNAs of various species.
To demonstrate its utility, the manufacturers used miRExpress to extract miRNA expression profiles from two (2) Illumina data sets, constructed for human and for a plant species.
miRExpress Implementation --
The process by which miRExpress constructs the miRNA expression profiles consists of three (3) main steps:
1) The first step is the preprocessing of raw data obtained by second-generation sequencing (see below...).
2) The second step is the alignment of all sequencing reads against those of known mature miRNAs (see below...).
3) The third step is the construction of miRNA expression profiles from the results of the alignment (see below...).
Step 1) In the first step, miRExpress merges the identical reads into a unique read and counts each unique read.
Then, each unique read is checked to determine whether it contains a full or a partial adaptor sequence.
In checking the full adaptor sequence, if the adaptor sequence is in the middle or at the beginning of the read, then the read is removed.
If the adaptor sequence is at the end of the read, then the adaptor sequence is trimmed from the read sequence.
In checking the partial adaptor sequence, the last bases of the 5' adaptor are used as a probe to match the first bases of the reads.
The first bases of the 3' adaptor are used as probes to match the last bases of the reads.
If the sequence identities of the matched regions are greater than 70%, these regions are eliminated.
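The preprocessing rules of Step 1 can be sketched in Python as follows. This is only an illustrative sketch, not the miRExpress source code: the function names, the probe length of 6 bases, and the example sequences are assumptions made for the illustration (the text above does not state how many bases the probes use).

```python
from collections import Counter

def collapse_reads(reads):
    """Merge identical reads into unique reads and count each one."""
    return Counter(reads)

def trim_full_adaptor(read, adaptor):
    """Apply the full-adaptor rule.

    Returns the trimmed read if the adaptor is at the end of the read,
    the unchanged read if no full adaptor is present, and None if the
    adaptor is at the beginning or in the middle (read is removed).
    """
    pos = read.find(adaptor)
    if pos == -1:
        return read
    if pos > 0 and pos + len(adaptor) == len(read):
        return read[:pos]
    return None

def identity(a, b):
    """Fraction of identical positions in two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def trim_partial_3p(read, adaptor3, probe_len=6, cutoff=0.70):
    """Match the first bases of the 3' adaptor against the read's last bases.

    probe_len is a hypothetical value chosen for this sketch.
    """
    probe, tail = adaptor3[:probe_len], read[-probe_len:]
    if len(probe) == len(tail) == probe_len and identity(probe, tail) > cutoff:
        return read[:-probe_len]
    return read

def trim_partial_5p(read, adaptor5, probe_len=6, cutoff=0.70):
    """Match the last bases of the 5' adaptor against the read's first bases."""
    probe, head = adaptor5[-probe_len:], read[:probe_len]
    if len(probe) == len(head) == probe_len and identity(probe, head) > cutoff:
        return read[probe_len:]
    return read

if __name__ == "__main__":
    # Made-up reads and adaptor, for illustration only.
    unique = collapse_reads(["ACGTACGT", "ACGTACGT", "TTGACAGA"])
    for seq, count in unique.items():
        print(seq, count)
    print(trim_full_adaptor("ACGTACGTCCCGGG", "CCCGGG"))
```

Collapsing the reads first means that each adaptor check runs once per unique sequence rather than once per raw read.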
Step 2) In the second step, each read is aligned with the sequences of known mature miRNAs.
The information for known miRNAs was obtained from the miRBase database (Release 12.0).
In miRExpress, the same sequences from different miRNAs are analyzed as a single sequence.
For example, ath-miR157a, ath-miR157b and ath-miR157c have the same sequence.
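One way to treat identical mature sequences as a single alignment target is to index the miRNA names by sequence. The following is a minimal sketch under assumptions: the function name is hypothetical, and the sequences are placeholders rather than actual miRBase entries.

```python
from collections import defaultdict

def group_by_sequence(mirnas):
    """Map each distinct mature sequence to the miRNA names that share it.

    mirnas: iterable of (name, sequence) pairs.
    """
    groups = defaultdict(list)
    for name, seq in mirnas:
        groups[seq].append(name)
    return dict(groups)

if __name__ == "__main__":
    # Placeholder sequences, NOT the real miRBase sequences.
    records = [
        ("ath-miR157a", "AAAACCCCGGGGUUUUAAAAC"),
        ("ath-miR157b", "AAAACCCCGGGGUUUUAAAAC"),
        ("ath-miR157c", "AAAACCCCGGGGUUUUAAAAC"),
        ("hypothetical-miR-1", "UUUUGGGGCCCCAAAAUUUUG"),
    ]
    for seq, names in group_by_sequence(records).items():
        print(seq, ",".join(names))
```

Each distinct sequence is then aligned once, and the result applies to every name in its group.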
The proposed alignment algorithm is based on the Smith-Waterman algorithm and is implemented using Single Instruction Multiple Data (SIMD) instructions.
When miRExpress is executed on a PC machine with an SSE3 instruction set, it can compare one miRNA with eight (8) sequencing reads simultaneously.
The software is also multiple-processor ready.
For example, miRExpress can compare one miRNA with 64 sequencing reads simultaneously on a computer with eight (8) processing cores (eight reads per core across eight cores).
Another advantage is the use of a lookup table for scoring:
The proposed algorithm can change the score or penalty for every pair of nucleotides as easily as changing the match score and mismatch penalty of only one nucleotide pair.
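The lookup-table idea can be illustrated with a plain Smith-Waterman scoring routine. This is a scalar sketch only: it does not use SIMD instructions or multiple cores, and the score values (match 2, mismatch -1, gap -2), the linear gap penalty, and the function names are assumptions made for the illustration, not the parameters used by miRExpress.

```python
def make_score_table(match=2, mismatch=-1, alphabet="ACGU"):
    """Build a lookup table with one score per ordered nucleotide pair."""
    return {(a, b): (match if a == b else mismatch)
            for a in alphabet for b in alphabet}

def smith_waterman(a, b, table, gap=-2):
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + table[(a[i - 1], b[j - 1])]
            # Local alignment: a cell score never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

if __name__ == "__main__":
    table = make_score_table()
    print(smith_waterman("ACGU", "ACCU", table))
    # Changing the score of a single nucleotide pair is one table update.
    table[("G", "C")] = 1
    print(smith_waterman("ACGU", "ACCU", table))
```

Because every pair is looked up in the same table, adjusting one pair or all sixteen requires no change to the alignment loop.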
Step 3) In the third step, miRNA expression profiles are constructed by computing the sum of read counts for each miRNA according to the alignment criteria (e.g., the length of the read equals the length of the miRNA sequence and the identity of the alignment is 100%).
Users can set the cutoff of alignment identity based on their requirements when using miRExpress.
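The profile construction can be sketched as a filtered sum over alignment results. The record layout, the function name, and the example values below are hypothetical and serve only to illustrate the criteria described above (equal lengths and a user-defined identity cutoff).

```python
def build_profile(alignments, min_identity=1.0):
    """Sum read counts per miRNA for alignments that meet the criteria.

    alignments: iterable of
        (mirna_name, read_length, mirna_length, identity, read_count)
    where identity is a fraction between 0 and 1.
    """
    profile = {}
    for mirna, read_len, mirna_len, ident, count in alignments:
        if read_len == mirna_len and ident >= min_identity:
            profile[mirna] = profile.get(mirna, 0) + count
    return profile

if __name__ == "__main__":
    # Made-up alignment records, for illustration only.
    hits = [
        ("miR-A", 21, 21, 1.00, 150),
        ("miR-A", 21, 21, 0.95, 12),
        ("miR-B", 22, 21, 1.00, 40),
    ]
    print(build_profile(hits))
    print(build_profile(hits, min_identity=0.95))
```

Lowering min_identity admits near-identical reads into the counts, which corresponds to the user-adjustable cutoff mentioned above.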
Previous studies suggest that RNA editing of miRNAs can affect their interactions with their targets and thereby regulate gene expression.
The sequence reads with high similarity to known miRNAs are valuable for further analysis of RNA editing and mutations that have occurred in miRNAs.
Hence, the manufacturers also provide a function in miRExpress which can return the sequence reads that are highly similar to known miRNAs including the nucleotide mismatch information.
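The kind of mismatch information such a function might report can be sketched as follows. This is an assumption-laden illustration, not the miRExpress output format: it assumes an ungapped comparison of a read and a miRNA of equal length, and the function name and tuple layout are hypothetical.

```python
def list_mismatches(read, mirna):
    """Return (position, mirna_base, read_base) for each differing position.

    Positions are 1-based. Assumes an ungapped, equal-length comparison.
    """
    if len(read) != len(mirna):
        raise ValueError("read and miRNA must have the same length")
    return [(i + 1, m, r)
            for i, (r, m) in enumerate(zip(read, mirna))
            if r != m]

if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    for pos, ref, obs in list_mismatches("ACGUACGU", "ACAUACGU"):
        print(f"position {pos}: {ref} -> {obs}")
```

A report of this shape makes it straightforward to tally recurrent substitutions at a given position, which is the starting point for examining candidate editing sites.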
System Requirements
Contact manufacturer.
Manufacturer
- Institute of Bioinformatics and Systems Biology
- National Chiao Tung University
- Hsin-Chu 300, Taiwan, Republic of China
Manufacturer Web Site miRExpress
Price Contact manufacturer.
G6G Abstract Number 20751
G6G Manufacturer Number 104333