Guide to RNA-seq Data Analysis

I-Hsuan Lin
The University of Manchester
RNA-seq uses sequencing technology to assay the presence and quantity of RNA molecules in a given sample. It offers many advantages over, and has largely superseded, microarray technology, which was introduced in the 1990s and widely used throughout the 2000s. These advantages include detection of both known and novel transcripts, increased specificity and sensitivity, and, given sufficient sequencing depth, identification of low-abundance transcripts and isoforms. RNA-seq has also been used to discover alternative splicing variants, chimeric RNAs resulting from fusion genes, and RNA editing sites. We can compare RNA-seq data between conditions to detect differences across groups of samples in terms of (1) gene-level expression, (2) transcript/isoform-level expression, and (3) transcript/isoform usage within a gene.
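To make the distinction between these three levels concrete, here is a minimal Python sketch. The gene name, transcript names, and abundance values are invented purely for illustration and do not come from the dataset used in this guide. The helper functions `gene_totals` and `isoform_fractions` are hypothetical, not part of any RNA-seq package. The toy gene has the same total expression in both conditions, yet its two isoforms swap in relative usage, which is the kind of change that a gene-level comparison alone would miss.

```python
# Toy transcript-level abundances (hypothetical numbers, NOT real data).
# Layout: gene -> transcript -> (abundance in control, abundance in treated)
counts = {
    "GENE1": {"tx1": (80.0, 20.0), "tx2": (20.0, 80.0)},
}


def gene_totals(transcripts):
    """Sum transcript abundances per condition to get gene-level expression."""
    control = sum(c for c, _ in transcripts.values())
    treated = sum(t for _, t in transcripts.values())
    return control, treated


def isoform_fractions(transcripts):
    """Express each transcript as a fraction of its gene's total, per condition.

    Assumes the gene total is non-zero in both conditions.
    """
    control_total, treated_total = gene_totals(transcripts)
    return {
        tx: (c / control_total, t / treated_total)
        for tx, (c, t) in transcripts.items()
    }


for gene, transcripts in counts.items():
    # (1) Gene-level expression: identical totals in both conditions here.
    print(gene, "gene-level totals (control, treated):", gene_totals(transcripts))
    for tx, (c, t) in transcripts.items():
        # (2) Transcript-level expression: each isoform changes on its own.
        print(f"  {tx} abundance: {c} -> {t}")
    for tx, (fc, ft) in isoform_fractions(transcripts).items():
        # (3) Isoform usage within the gene: the proportions switch.
        print(f"  {tx} usage fraction: {fc:.2f} -> {ft:.2f}")
```

In a real analysis these comparisons are made with dedicated statistical tools that model replicates and variance; the sketch only shows what quantity each level of comparison looks at.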
I will use a real-world Illumina paired-end RNA sequencing dataset to demonstrate a step-by-step workflow that readers can reproduce. The complete workflow includes:
  • Performing QC and trimming to pre-process RNA-seq raw data
  • Mapping trimmed reads using either alignment-free (Salmon) or alignment-based (STAR) methods
  • Quantifying gene and transcript expression
  • Performing statistical testing to identify differentially expressed genes and transcripts, and to detect expression switching between transcripts


Unless specified, this content is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). You may view a copy of this license at