Introduction

Guide to RNA-seq Data Analysis

I-Hsuan Lin

The University of Manchester

RNA-seq uses sequencing technology to assay the presence and quantity of RNA molecules in a given sample. It offers many advantages over, and has largely superseded, the microarray technology that was widely adopted in the 2000s. These advantages include the detection of both known and novel transcripts, increased specificity and sensitivity, and, given sufficient sequencing depth, the identification of low-abundance transcripts and isoforms. RNA-seq has also been used to discover alternative splicing variants, chimeric RNAs resulting from fusion genes, and RNA editing sites. We can compare RNA-seq data between conditions to detect differences across groups of samples in terms of (1) gene-level expression, (2) transcript/isoform-level expression, and (3) transcript/isoform usage within a gene.
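To make the three levels of comparison concrete, here is a minimal toy sketch in Python. The gene, the transcript names (`tx_A`, `tx_B`), and all abundance values are invented for illustration and do not come from the dataset used in this guide. It assumes gene-level expression can be approximated as the sum of a gene's transcript abundances and that isoform usage is each transcript's share of that total. Real analyses rely on dedicated statistical tools rather than this arithmetic.

```python
# Toy transcript-level abundances (e.g. TPM) for one hypothetical gene
# with two isoforms, measured in two conditions. All values are made up.
transcript_expr = {
    "control": {"tx_A": 80.0, "tx_B": 20.0},
    "treated": {"tx_A": 20.0, "tx_B": 80.0},
}


def gene_level(expr):
    """Gene-level expression: sum over the gene's transcripts."""
    return sum(expr.values())


def isoform_usage(expr):
    """Fraction of the gene's total expression contributed by each transcript."""
    total = gene_level(expr)
    return {tx: value / total for tx, value in expr.items()}


# Summarise each condition at the gene level and the isoform-usage level.
summary = {
    cond: {"gene": gene_level(expr), "usage": isoform_usage(expr)}
    for cond, expr in transcript_expr.items()
}

for cond, result in summary.items():
    print(cond, result["gene"], result["usage"])
```

In this invented example the gene-level total is identical in both conditions, while each transcript's expression and its share of the gene's output change. This is the kind of isoform switch that a gene-level comparison alone would miss.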

I will use a real-world Illumina paired-end RNA sequencing dataset to provide a step-by-step guide that readers can follow to reproduce the analysis. The complete workflow includes:

  • Performing QC and trimming to pre-process RNA-seq raw data

  • Mapping trimmed reads using either alignment-free (Salmon) or alignment-based (STAR) methods

  • Quantifying gene and transcript expression

  • Performing statistical testing to identify differentially expressed genes and transcripts, and to detect expression switching between transcripts

License

Unless otherwise specified, this content is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0). You may view a copy of this license at https://creativecommons.org/licenses/by-nc-sa/4.0/

Last updated