Quantification using RSEM
As in the previous exercise, we can use RSEM to estimate the expression levels of the reconstructed transcripts under the four conditions: logarithmic growth, plateau phase, heat shock and diauxic shift. First, we align the RNA-Seq reads to the Trinity transcripts using Bowtie. Then we run RSEM to estimate the number of reads mapped to each transcript. We do not need a splice-aware aligner (such as STAR) in this case because we are mapping the reads to cDNAs rather than to a genomic sequence. The gap-free alignments produced by Bowtie are then used as the input for RSEM.
Execute
Locate util/align_and_estimate_abundance.pl in the trinityrnaseq-2.2.0 distribution, replace PATH_TO_TRINITY in the commands below with the path to that distribution, and run:
cd ~/LSLNGS2015/Trinity
bsub -q 4G -o ./RSEM_Sp_ds.std -e ./RSEM_Sp_ds.err -J RSEM_Sp_ds \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq \
--left RNASEQ_data/Sp_ds.left.fq.gz --right RNASEQ_data/Sp_ds.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_ds --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_ds"
bsub -q 4G -o ./RSEM_Sp_hs.std -e ./RSEM_Sp_hs.err -J RSEM_Sp_hs \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq \
--left RNASEQ_data/Sp_hs.left.fq.gz --right RNASEQ_data/Sp_hs.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_hs --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_hs"
bsub -q 4G -o ./RSEM_Sp_log.std -e ./RSEM_Sp_log.err -J RSEM_Sp_log \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq \
--left RNASEQ_data/Sp_log.left.fq.gz --right RNASEQ_data/Sp_log.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_log --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_log"
bsub -q 4G -o ./RSEM_Sp_plat.std -e ./RSEM_Sp_plat.err -J RSEM_Sp_plat \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq \
--left RNASEQ_data/Sp_plat.left.fq.gz --right RNASEQ_data/Sp_plat.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_plat --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_plat"
Get status with bjobs
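The four submissions above differ only in the sample name, so they can also be generated from one template. The following is a minimal sketch rather than part of the original exercise: submit_rsem and the DRY_RUN switch are hypothetical names introduced here, and PATH_TO_TRINITY is still the placeholder used above. With DRY_RUN left at 1 the script only prints the bsub commands so that they can be inspected first.

```shell
#!/usr/bin/env bash
# Sketch: build the four RSEM submissions from a single template.
# PATH_TO_TRINITY is the same placeholder as in the commands above.
# DRY_RUN is a hypothetical switch: 1 prints the commands, 0 submits them.
TRINITY_UTIL="PATH_TO_TRINITY/util"
DRY_RUN="${DRY_RUN:-1}"

submit_rsem() {
    local s="$1"   # sample name, e.g. Sp_ds
    # Assemble the align_and_estimate_abundance.pl command as one string,
    # because bsub receives it as a single quoted argument.
    local cmd="${TRINITY_UTIL}/align_and_estimate_abundance.pl --seqType fq"
    cmd+=" --left RNASEQ_data/${s}.left.fq.gz --right RNASEQ_data/${s}.right.fq.gz"
    cmd+=" --transcripts trinity_reference/Trinity.fasta"
    cmd+=" --output_prefix ${s} --est_method RSEM --aln_method bowtie"
    cmd+=" --trinity_mode --prep_reference --output_dir RSEM_${s}"

    if [ "$DRY_RUN" = "1" ]; then
        # Print the command instead of submitting it.
        printf 'bsub -q 4G -o ./RSEM_%s.std -e ./RSEM_%s.err -J RSEM_%s "%s"\n' \
            "$s" "$s" "$s" "$cmd"
    else
        bsub -q 4G -o "./RSEM_${s}.std" -e "./RSEM_${s}.err" -J "RSEM_${s}" "$cmd"
    fi
}

for sample in Sp_ds Sp_hs Sp_log Sp_plat; do
    submit_rsem "$sample"
done
```

Running the script with DRY_RUN=0 from ~/LSLNGS2015/Trinity would submit the same four jobs as the explicit commands.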
Resource usage
Job           ALPS Queue Name  CPU Time    Max Memory  Duration
RSEM_Sp_ds    4G               42.28 sec.  -           32 seconds
RSEM_Sp_hs    4G               36.40 sec.  -           26 seconds
RSEM_Sp_log   4G               14.28 sec.  -           13 seconds
RSEM_Sp_plat  4G               48.40 sec.  -           38 seconds
Estimations
Once the jobs are completed, we will find *.isoforms.results and *.genes.results in the output folders. These files contain the expected counts and normalized expression values of the Trinity transcripts (isoforms) and components (genes).
ls -la RSEM_Sp_*/*results
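Before looking at the values, it can help to confirm that every sample produced both result files. The sketch below assumes the output directories and prefixes used in the bsub commands above; check_rsem_outputs is a hypothetical helper, not part of Trinity or RSEM.

```shell
#!/usr/bin/env bash
# Sketch: report which RSEM result files exist and are non-empty.
# Takes the directory that contains the RSEM_Sp_* folders (default: current directory).
check_rsem_outputs() {
    local base="${1:-.}"
    local missing=0 s kind f
    for s in Sp_ds Sp_hs Sp_log Sp_plat; do
        for kind in isoforms genes; do
            f="${base}/RSEM_${s}/${s}.${kind}.results"
            if [ -s "$f" ]; then
                echo "ok       $f"
            else
                echo "MISSING  $f"
                missing=$((missing + 1))
            fi
        done
    done
    # Succeed only if all eight files were found.
    [ "$missing" -eq 0 ]
}

check_rsem_outputs . || echo "Some results are missing; check bjobs and the RSEM_*.err files."
```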
We can use head to examine these files. Your values may not be the same because the assembly results are not deterministic.
head RSEM_Sp_ds/Sp_ds.genes.results
head RSEM_Sp_ds/Sp_ds.isoforms.results
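head only shows the first few rows in file order. To see the most highly expressed entries instead, the results can be sorted on the TPM column. This is a minimal sketch, assuming the tab-separated RSEM results format with a header row that includes a TPM column; top_by_tpm is a hypothetical helper that looks the column up by name rather than by position.

```shell
#!/usr/bin/env bash
# Sketch: print the N entries with the highest TPM from an RSEM results file.
# Usage: top_by_tpm FILE [N]   (N defaults to 5)
top_by_tpm() {
    local file="$1" n="${2:-5}"
    awk -F'\t' '
        # Header row: remember which column is named TPM.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "TPM") col = i; next }
        # Data rows: print the identifier (first column) and its TPM.
        col     { print $1 "\t" $col }
    ' "$file" | sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}

# Only run on the real file if it is present in the current directory.
if [ -f RSEM_Sp_ds/Sp_ds.genes.results ]; then
    top_by_tpm RSEM_Sp_ds/Sp_ds.genes.results 10
fi
```

The same function works on the *.isoforms.results files, where the first column is the transcript identifier.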