De novo assembly using Trinity
Trinity is one of the most popular software packages for efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. It consists of three software modules, Inchworm, Chrysalis and Butterfly, which run sequentially to process the sequencing reads.
Quote from GitHub:
Inchworm assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.
Chrysalis clusters the Inchworm contigs into clusters and constructs complete de Bruijn graphs for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or sets of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.
Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that corresponds to paralogous genes.
The Trinity developers have provided , and the raw data and the required software are built into a VirtualBox image (Trinity2015.ova). I have saved a copy on ALPS1. The data are 76 bp strand-specific, paired-end Illumina RNA-Seq reads derived from Schizosaccharomyces pombe (fission yeast) grown under four conditions:
logarithmic growth (Sp_log)
plateau phase (Sp_plat)
heat shock (Sp_hs)
diauxic shift (Sp_ds)
* Due to GitBook's space limitations, I do not provide the fq.gz files here; please obtain them from the VirtualBox image.
v2.2.0 [17 Mar 2016] - Latest version available at the time of writing and used in this exercise
v2.0.6 [13 Mar 2015] - Latest version available on ALPS1
v1.1.2 [23 Jun 2015] - Latest version available at the time of writing and used in this exercise
v1.0.1 [14 Mar 2014] - Latest version available on ALPS1
v2016-09-23 - Latest version available at the time of writing and used in this exercise
v2.5.2b [20 Aug 2016] - Latest version available at the time of writing and used in this exercise
v2.3.0e [14 Feb 2013] - Latest version available on ALPS1
v1.3.1 [22 Apr 2016] - Latest version available at the time of writing and used in this exercise
v1.2 [02 Feb 2015] - Latest version available on ALPS1
v1.3.0 [02 Oct 2016] - Latest version available at the time of writing
v1.2.31 [04 Jun 2016] - Version used in this exercise
v1.2.19 [05 Nov 2014] - Latest version available on ALPS1
Bowtie 1 (NOT Bowtie 2) is required by the Chrysalis module.
* Below is an example showing how to set up the paths. Remember to change the paths to match where these binaries are installed on your system.
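As a minimal sketch of this kind of setup (not the original commands from this exercise), the snippet below prepends two install directories to PATH. `TOOLS_DIR`, the directory names under it, and the helper `prepend_path` are all hypothetical and chosen only for illustration; substitute the real locations of your binaries.

```shell
# Hypothetical install root; replace with the real location on your system.
TOOLS_DIR="$HOME/tools"

# Hypothetical helper: prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;       # otherwise put it at the front
    esac
}

# Hypothetical directory names for the Trinity and Bowtie 1 installations.
prepend_path "$TOOLS_DIR/trinity"
prepend_path "$TOOLS_DIR/bowtie1"
export PATH
```

Guarding against duplicates keeps PATH from growing each time the snippet is sourced, for example from a shell startup file.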
You can use `echo $PATH` to check the new PATH variable.
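Because Chrysalis needs Bowtie 1 rather than Bowtie 2, it can also help to confirm which executables the shell actually resolves. The sketch below assumes the usual executable names (`bowtie` for Bowtie 1, `bowtie2` for Bowtie 2, `Trinity` for Trinity); `report_tool` is a hypothetical helper, not part of any of these packages.

```shell
# Hypothetical helper: report whether a command is reachable through PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

report_tool Trinity
report_tool bowtie     # Bowtie 1, the version Chrysalis requires
report_tool bowtie2    # Bowtie 2, which is not a substitute here
```

If `bowtie` is reported as not found while `bowtie2` is present, only Bowtie 2 is installed or on PATH, and the Bowtie 1 directory still needs to be added.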