Visualization

We can use STAR --runMode inputAlignmentsFromBAM to generate the RNA-Seq "signal" files using the coordinate-sorted BAM file (Aligned.sortedByCoord.out.bam). It produces the number of reads mapped to each genomic position. The output file (in bedGraph format) can be visualized in genome browsers such as the UCSC genomic browser or IGV browser.

Execute

cd ~/LSLNGS2015/
STAR --runMode inputAlignmentsFromBAM --runThreadN 6 \
--inputBAMfile RNASEQ_data/star_GM12878_rep1/Aligned.sortedByCoord.out.bam --outWigType bedGraph \
--outWigStrand Stranded --outWigNorm RPM --outFileNamePrefix RNASEQ_data/GM1287_rep1_
STAR --runMode inputAlignmentsFromBAM --runThreadN 6 \
--inputBAMfile RNASEQ_data/star_GM12878_rep2/Aligned.sortedByCoord.out.bam --outWigType bedGraph \
--outWigStrand Stranded --outWigNorm RPM --outFileNamePrefix RNASEQ_data/GM1287_rep2_

Resource usage

Sample

ALPS Queue Name

CPU Time

Max Memory

Duration

GM12878

16G

2223.97 sec.

7 GB

37 minutes 8 seconds

K562

16G

2703.06 sec.

7 Gb

45 minutes 8 seconds

Options

--runThreadN number of threads to run STAR

--inputBAMfile path to coordinate-sorted BAM file

--outWigType type of signal output (bedGraph or wiggle)

--outWigStrand strandedness of wiggle/bedGraph output (Stranded or Unstranded)

--outWigNorm type of normalization for the signal (RPM or none)

--outFileNamePrefix output files name prefix

Take a look at the generated signal files

ls -la ~/LSLNGS2015/RNASEQ_data/*_Signal.*.bg

-rw------- 1 u00ycl06 u00ycm02 813667524 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 848273078 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 778409908 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 817934615 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.Unique.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 787034556 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 820179772 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 753637183 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 791210099 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.Unique.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 315665648 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 319934522 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 295364571 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 300931098 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.Unique.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 444649222 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 450522698 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 415331427 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 423833653 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.Unique.str2.out.bg

How to visualize on genome browsers

You can load bedGraph into IGV. Alternatively, you can also convert the bedGraph to bigWig using bedGraphToBigWig and load bigWig to IGV or UCSC Genome Browser.

bedGraph to bigWig

Create the chrom.sizes file from FASTA sequence

samtools faidx Homo_sapiens.GRCh38.dna.primary_assembly.fa

Using bedGraphToBigWig to perform conversion

bedGraphToBigWig in.bedGraph chrom.sizes out.bw

Unfortunately, the bedGraph might not be sorted properly. If you encounter error when executing bedGraphToBigWig, such as:

GM1287_rep1_Signal.Unique.str1.out.bg is not case-sensitive sorted at line 28093639. Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C, or bedSort and try again.

You can use sort to sort the file, such as:

LC_COLLATE=C sort -k1,1 -k2,2n -o GM1287_rep1_Signal.Unique.str1.out.bg GM1287_rep1_Signal.Unique.str1.out.bg

Then re-run the conversion

bedGraphToBigWig GM1287_rep1_Signal.Unique.str1.out.bg \
../GENOME_data/Homo_sapiens.GRCh38.dna.primary_assembly.fa.fai \
GM1287_rep1_Signal.Unique.str1.out.bw

View in UCSC Genome Browser (Requires access to webserver) a bug regarding chr name

First, you need to upload/place the bigWig files at a location accessible from the World Wide Web, i.e. on a web server, for UCSC Genome Browser to use and display it.

To view the bigWig as custom tracks, we create a text file (e.g. mytracks.txt) with following information.

track type=bigWig name=GM1287_rep1_str1 smoothingWindow=4 color=255,0,0 autoScale=on viewLimits=1:200 visibility=full windowingFunction=maximum bigDataUrl=https://domain_name_or_ip/folder_name/GM1287_rep1_Signal.UniqueMultiple.str1.out.bw
track type=bigWig name=GM1287_rep1_str2 smoothingWindow=4 color=0,0,255 autoScale=on viewLimits=1:200 visibility=full windowingFunction=maximum bigDataUrl=https://domain_name_or_ip/folder_name/GM1287_rep1_Signal.UniqueMultiple.str2.out.bw

Then, in your web browser, use the below URL to call your file:

https://genome.ucsc.edu/cgi-bin/hgTracks?db=GRCh38&position=chr6:122657583-122663796&hgct_customText=https://domain_name_or_ip/folder_name/mytracks.txt