We can use STAR --runMode inputAlignmentsFromBAM
to generate the RNA-Seq "signal" files using the coordinate-sorted BAM file (Aligned.sortedByCoord.out.bam
). It produces the number of reads mapped to each genomic position. The output file (in bedGraph format) can be visualized in genome browsers such as the UCSC genomic browser or IGV browser .
Execute
Copy cd ~/LSLNGS2015/
STAR --runMode inputAlignmentsFromBAM --runThreadN 6 \
--inputBAMfile RNASEQ_data/star_GM12878_rep1/Aligned.sortedByCoord.out.bam --outWigType bedGraph \
--outWigStrand Stranded --outWigNorm RPM --outFileNamePrefix RNASEQ_data/GM1287_rep1_
STAR --runMode inputAlignmentsFromBAM --runThreadN 6 \
--inputBAMfile RNASEQ_data/star_GM12878_rep2/Aligned.sortedByCoord.out.bam --outWigType bedGraph \
--outWigStrand Stranded --outWigNorm RPM --outFileNamePrefix RNASEQ_data/GM1287_rep2_
Resource usage
Options
--runThreadN
number of threads to run STAR
--inputBAMfile
path to coordinate-sorted BAM file
--outWigType
type of signal output (bedGraph or wiggle)
--outWigStrand
strandedness of wiggle/bedGraph output (Stranded or Unstranded)
--outWigNorm
type of normalization for the signal (RPM or none)
--outFileNamePrefix
output files name prefix
Take a look at the generated signal files
ls -la ~/LSLNGS2015/RNASEQ_data/*_Signal.*.bg
Copy -rw------- 1 u00ycl06 u00ycm02 813667524 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 848273078 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 778409908 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 817934615 2015-11-16 00:03 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep1_Signal.Unique.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 787034556 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 820179772 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 753637183 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 791210099 2015-11-16 00:20 /work3/LSLNGS2015/RNASEQ_data/GM1287_rep2_Signal.Unique.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 315665648 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 319934522 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 295364571 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 300931098 2015-11-16 00:04 /work3/LSLNGS2015/RNASEQ_data/K562_rep1_Signal.Unique.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 444649222 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.UniqueMultiple.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 450522698 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.UniqueMultiple.str2.out.bg
-rw------- 1 u00ycl06 u00ycm02 415331427 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.Unique.str1.out.bg
-rw------- 1 u00ycl06 u00ycm02 423833653 2015-11-16 00:28 /work3/LSLNGS2015/RNASEQ_data/K562_rep2_Signal.Unique.str2.out.bg
How to visualize on genome browsers
You can load bedGraph into IGV. Alternatively, you can also convert the bedGraph to bigWig using bedGraphToBigWig and load bigWig to IGV or UCSC Genome Browser.
bedGraph to bigWig
Create the chrom.sizes file from FASTA sequence
samtools faidx Homo_sapiens.GRCh38.dna.primary_assembly.fa
Using bedGraphToBigWig
to perform conversion
bedGraphToBigWig in.bedGraph chrom.sizes out.bw
Unfortunately, the bedGraph might not be sorted properly. If you encounter error when executing bedGraphToBigWig
, such as:
GM1287_rep1_Signal.Unique.str1.out.bg is not case-sensitive sorted at line 28093639.
Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C, or bedSort and try again.
You can use sort
to sort the file, such as:
LC_COLLATE=C sort -k1,1 -k2,2n -o GM1287_rep1_Signal.Unique.str1.out.bg GM1287_rep1_Signal.Unique.str1.out.bg
Then re-run the conversion
Copy bedGraphToBigWig GM1287_rep1_Signal.Unique.str1.out.bg \
../GENOME_data/Homo_sapiens.GRCh38.dna.primary_assembly.fa.fai \
GM1287_rep1_Signal.Unique.str1.out.bw
View in UCSC Genome Browser (Requires access to webserver) a bug regarding chr name
First, you need to upload/place the bigWig files at a location accessible from the World Wide Web, i.e. on a web server, for UCSC Genome Browser to use and display it.
To view the bigWig as custom tracks, we create a text file (e.g. mytracks.txt) with following information.
Copy track type=bigWig name=GM1287_rep1_str1 smoothingWindow=4 color=255,0,0 autoScale=on viewLimits=1:200 visibility=full windowingFunction=maximum bigDataUrl=https://domain_name_or_ip/folder_name/GM1287_rep1_Signal.UniqueMultiple.str1.out.bw
track type=bigWig name=GM1287_rep1_str2 smoothingWindow=4 color=0,0,255 autoScale=on viewLimits=1:200 visibility=full windowingFunction=maximum bigDataUrl=https://domain_name_or_ip/folder_name/GM1287_rep1_Signal.UniqueMultiple.str2.out.bw
Then, in your web browser, use the below URL to call your file:
https://genome.ucsc.edu/cgi-bin/hgTracks?db=GRCh38&position=chr6:122657583-122663796&hgct_customText=https://domain_name_or_ip/folder_name/mytracks.txt