Trinity is one of the most popular software packages for efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. It consists of three software modules, Inchworm, Chrysalis and Butterfly, that run sequentially to process the sequencing reads.
Inchworm assembles the RNA-Seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.
Chrysalis groups the Inchworm contigs into clusters and constructs a complete de Bruijn graph for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or set of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.
Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.
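To make the de Bruijn graph idea more concrete, here is a toy Python sketch. It is not Trinity's implementation: the function name build_de_bruijn, the two made-up sequences and the choice of k=4 are all assumptions for illustration only. It shows how two sequences that share a prefix and then diverge (loosely analogous to two isoforms of one gene) produce a single graph with a branch point.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Toy de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers.

    Hypothetical helper for illustration; not part of Trinity.
    """
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Each k-mer links its (k-1)-mer prefix to its (k-1)-mer suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two made-up sequences that share "ATGGC" and then diverge.
toy_sequences = ["ATGGCGTA", "ATGGCTTA"]
graph = build_de_bruijn(toy_sequences, k=4)

# A node with more than one successor is a branch point in the graph.
branch_nodes = sorted(node for node, nxt in graph.items() if len(nxt) > 1)
print(branch_nodes)
```

In this toy graph the only branch point is the node where the two sequences stop agreeing. Tracing the alternative paths through such branches is, in spirit, the kind of work described for Butterfly above, although the real software also uses read and read-pair support to decide which paths to report.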
Materials
The Trinity developers have provided training materials, and the raw data and the software required are built into a VirtualBox image (Trinity2015.ova). I have saved a copy on ALPS1. The data are 76 bp strand-specific Illumina paired-end RNA-Seq reads derived from Schizosaccharomyces pombe (fission yeast) grown under four conditions:
logarithmic growth (Sp_log)
plateau phase (Sp_plat)
heat shock (Sp_hs)
diauxic shift (Sp_ds)
* Because of the space limitations of GitBook, the fq.gz files are not provided here. Please obtain them from the VirtualBox image [Link]
-rw-rw-r-- 1 ycl6 ycl6 5790168 Oct 27 11:35 RNASEQ_data/Sp_ds.left.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 5590326 Oct 27 11:35 RNASEQ_data/Sp_ds.right.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 5815390 Oct 27 11:35 RNASEQ_data/Sp_hs.left.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 5751383 Oct 27 11:36 RNASEQ_data/Sp_hs.right.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 2154125 Oct 27 11:36 RNASEQ_data/Sp_log.left.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 2097534 Oct 27 11:36 RNASEQ_data/Sp_log.right.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 5488286 Oct 27 11:36 RNASEQ_data/Sp_plat.left.fq.gz
-rw-rw-r-- 1 ycl6 ycl6 5238362 Oct 27 11:36 RNASEQ_data/Sp_plat.right.fq.gz
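After copying the files out of the VirtualBox image, it can be worth checking that every condition has both a left and a right file. The Python sketch below is one way to do that, assuming the files follow the RNASEQ_data/<sample>.left.fq.gz and RNASEQ_data/<sample>.right.fq.gz naming shown in the listing above. The function names pair_reads and count_reads are hypothetical helpers, not part of Trinity.

```python
import gzip
import re
from collections import defaultdict

# Matches names such as RNASEQ_data/Sp_ds.left.fq.gz (naming taken from the listing).
READ_FILE = re.compile(r"(?P<sample>[^/]+)\.(?P<side>left|right)\.fq\.gz$")


def pair_reads(paths):
    """Group file paths by sample and keep samples that have both mates."""
    by_sample = defaultdict(dict)
    for path in paths:
        match = READ_FILE.search(path)
        if match:
            by_sample[match["sample"]][match["side"]] = path
    return {
        sample: (sides["left"], sides["right"])
        for sample, sides in sorted(by_sample.items())
        if "left" in sides and "right" in sides
    }


def count_reads(path):
    """Count FASTQ records in a gzipped file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


paths = [
    "RNASEQ_data/Sp_ds.left.fq.gz", "RNASEQ_data/Sp_ds.right.fq.gz",
    "RNASEQ_data/Sp_hs.left.fq.gz", "RNASEQ_data/Sp_hs.right.fq.gz",
    "RNASEQ_data/Sp_log.left.fq.gz", "RNASEQ_data/Sp_log.right.fq.gz",
    "RNASEQ_data/Sp_plat.left.fq.gz", "RNASEQ_data/Sp_plat.right.fq.gz",
]
pairs = pair_reads(paths)
for sample, (left, right) in pairs.items():
    print(sample, left, right)
```

With the files on disk, calling count_reads on the left and right file of each pair should give the same number, since paired-end files are expected to contain one mate per read in each file.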