Per-sample 2-pass mapping is enabled with --twopassMode Basic and the --sjdbOverhang option is set to 150 (the same value used to generate genome index here)
Alignment is run with 6 threads --runThreadN 6
The --quantMode TranscriptomeSAM option generates alignments translated into transcript coordinates in the Aligned.toTranscriptome.out.bam file necessary for downstream quantification using the Salmon/RSEM workflow
The --quantMode GeneCounts option allows STAR to count the number reads per gene while mapping and outputs the ReadsPerGene.out.tab count table
With --quantMode TranscriptomeSAM GeneCounts, STAR produces both the Aligned.toTranscriptome.out.bam and ReadsPerGene.out.tab outputs
The --quantTranscriptomeBan IndelSoftclipSingleend option (default) satisfies RSEM requirements, i.e. soft-clipping or indels are not allowed. However, it can be changed to--quantTranscriptomeBan Singleend when using other quantification software such as Salmon and eXpress.
Detailed descriptions of all the parameters and options are available in the STARmanual.pdf
With --quantMode GeneCounts, STAR outputs read counts per gene into ReadsPerGene.out.tab file with 4 columns which correspond to different strandedness options:
column 1: gene ID
column 2: counts for unstranded RNA-seq (htseq-count option --stranded no)
column 3: counts for the 1st read strand aligned with RNA (htseq-count option --stranded yes)
column 4: counts for the 2nd read strand aligned with RNA (htseq-count option --stranded reverse)
Alignment with STAR to the target genome, followed by quantification using RSEM with 6 threads. The --strandedness reverse was used to allow RSEM to quantify with the correct strandedness setting
$cd/home/USER/SSAPs$mkdirrsem_star$ declare -a runname=("ERR2675454" "ERR2675455" "ERR2675458" "ERR2675459" "ERR2675460" "ERR2675461" "ERR2675464" "ERR2675465" "ERR2675468" "ERR2675469" "ERR2675472" "ERR2675473" "ERR2675476" "ERR2675477" "ERR2675478" "ERR2675479" "ERR2675480" "ERR2675481" "ERR2675484" "ERR2675485")
for id in ${runname[@]}; do bam=star/${id}_Aligned.toTranscriptome.out.bam out=rsem_star/${id} log=rsem_star/${id}.logrsem-calculate-expression-p6--paired-end--alignments \--estimate-rspd \--calc-ci \--ci-memory32000 \--seed123456 \--no-bam-output \--strandednessreverse \ $bam /home/USER/db/refanno/gencode.v33_rsem-1.3.3 $out > $logdone
# RSEM parameters--paired-end-inputreadsarepaired-endreads--alignments-inputfilecontainsalignmentsinSAM/BAM/CRAMformat--estimate-rspd-estimatethereadstartpositiondistributionfromdata--calc-ci-calculate95%credibilityintervals (CI) and posterior mean estimates (PME)--ci-memory32000-maximummemory (MB) of the auxiliary buffer used for computing CI--seed123456-settheseedfortherandomnumbergeneratorsusedincalculatingPMEandCI--no-bam-output-donotoutputanyBAMfile--strandednessreverse-definesthestrandednessoftheRNA-Seqreads
cut -f1-8 rsem_star/ERR2675454.genes.results | head