Quantification using RSEM

As in the previous exercise, we can use RSEM to estimate the expression levels of the reconstructed transcripts under the four conditions: logarithmic growth, plateau phase, heat shock and diauxic shift. First, we align the RNA-Seq reads to the Trinity transcripts using Bowtie. Then we run RSEM to estimate the number of reads that map to each transcript. We do not need a splice-aware aligner (such as STAR) here because we are mapping the reads to assembled transcript (cDNA) sequences, which contain no introns, rather than to a genomic sequence. Bowtie also produces gap-free alignments, which are what RSEM takes as input.

Execute

Locate util/align_and_estimate_abundance.pl in the trinityrnaseq-2.2.0 distribution, replace PATH_TO_TRINITY in the commands below with the path to that distribution, and run

cd ~/LSLNGS2015/Trinity

bsub -q 4G -o ./RSEM_Sp_ds.std -e ./RSEM_Sp_ds.err -J RSEM_Sp_ds \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq  \
--left RNASEQ_data/Sp_ds.left.fq.gz --right RNASEQ_data/Sp_ds.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_ds --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_ds"

bsub -q 4G -o ./RSEM_Sp_hs.std -e ./RSEM_Sp_hs.err -J RSEM_Sp_hs \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq  \
--left RNASEQ_data/Sp_hs.left.fq.gz --right RNASEQ_data/Sp_hs.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_hs --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_hs"

bsub -q 4G -o ./RSEM_Sp_log.std -e ./RSEM_Sp_log.err -J RSEM_Sp_log \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq  \
--left RNASEQ_data/Sp_log.left.fq.gz --right RNASEQ_data/Sp_log.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_log --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_log"

bsub -q 4G -o ./RSEM_Sp_plat.std -e ./RSEM_Sp_plat.err -J RSEM_Sp_plat \
"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq  \
--left RNASEQ_data/Sp_plat.left.fq.gz --right RNASEQ_data/Sp_plat.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix Sp_plat --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_Sp_plat"
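The four submissions above differ only in the sample name, so they can also be generated in a loop. The following is a minimal sketch, assuming the same directory layout and the same PATH_TO_TRINITY placeholder as above; build_rsem_cmd is a hypothetical helper name introduced here for illustration. The loop only prints the commands (a dry run) so that they can be checked before anything is submitted.

```shell
#!/bin/sh
# Hypothetical helper: print the bsub command for one sample name.
# PATH_TO_TRINITY is the same placeholder as in the commands above;
# replace it with the real path before submitting anything.
build_rsem_cmd() {
    s="$1"
    printf '%s\n' "bsub -q 4G -o ./RSEM_${s}.std -e ./RSEM_${s}.err -J RSEM_${s} \
\"PATH_TO_TRINITY/util/align_and_estimate_abundance.pl --seqType fq \
--left RNASEQ_data/${s}.left.fq.gz --right RNASEQ_data/${s}.right.fq.gz \
--transcripts trinity_reference/Trinity.fasta \
--output_prefix ${s} --est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir RSEM_${s}\""
}

# Dry run: print one submission command per condition.
for sample in Sp_ds Sp_hs Sp_log Sp_plat; do
    build_rsem_cmd "$sample"
done
```

Once the printed commands look right, the output of the loop can be piped to sh to submit the jobs.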

Check the job status with bjobs

Resource usage

Job            ALPS Queue Name   CPU Time (s)   Max Memory   Duration (s)
RSEM_Sp_ds     4G                42.28          -            32
RSEM_Sp_hs     4G                36.40          -            26
RSEM_Sp_log    4G                14.28          -            13
RSEM_Sp_plat   4G                48.40          -            38

Estimations

Once the jobs have completed, each output folder contains a *.isoforms.results file and a *.genes.results file. These files contain the expected counts and normalized expression values (TPM and FPKM) of the Trinity transcripts (isoforms) and components (genes).

ls -la RSEM_Sp_*/*results
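The listing above can also be turned into an explicit check that every sample produced both result files. This is a minimal sketch, assuming the output folder and prefix names used in the commands of this exercise; check_rsem_outputs is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report any expected RSEM result file that is
# missing or empty. Returns success only if all files are present.
check_rsem_outputs() {
    missing=0
    for s in "$@"; do
        for level in genes isoforms; do
            f="RSEM_${s}/${s}.${level}.results"
            if [ ! -s "$f" ]; then
                echo "missing or empty: $f" >&2
                missing=$((missing + 1))
            fi
        done
    done
    [ "$missing" -eq 0 ]
}

if check_rsem_outputs Sp_ds Sp_hs Sp_log Sp_plat; then
    echo "all result files present"
fi
```

If a file is reported as missing, the corresponding .err file written by bsub is the first place to look.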

We can use head to examine these files. Your values may not be the same because the assembly results are not deterministic.

head RSEM_Sp_ds/Sp_ds.genes.results

head RSEM_Sp_ds/Sp_ds.isoforms.results
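head shows the rows in file order, not in order of expression. To look at the most highly expressed entries instead, the rows can be sorted on the TPM column. The sketch below assumes the tab-separated RSEM results format with a header line that contains a column named TPM; top_by_tpm is a hypothetical helper name, and the column is located from the header rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: print the header line followed by the n rows
# with the highest TPM (n defaults to 10).
top_by_tpm() {
    file="$1"
    n="${2:-10}"
    tab=$(printf '\t')
    # Find the position of the TPM column from the header line.
    col=$(head -n 1 "$file" | tr '\t' '\n' | grep -n -x 'TPM' | cut -d: -f1)
    if [ -z "$col" ]; then
        echo "no TPM column found in $file" >&2
        return 1
    fi
    head -n 1 "$file"
    # Sort the data rows numerically on TPM, largest first.
    tail -n +2 "$file" | sort -t "$tab" -k "${col},${col}nr" | head -n "$n"
}

if [ -f RSEM_Sp_ds/Sp_ds.isoforms.results ]; then
    top_by_tpm RSEM_Sp_ds/Sp_ds.isoforms.results 5
fi
```

The same function works on the genes.results files, since they also carry a TPM column.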
