Add CpG Islands Co-localization Information to HMR BED Files

Here, we annotate each HMR in the BED file with CpG island (CGI) information, using intersectBed to find the HMR/CGI overlaps and groupBy to summarize them per HMR.

# Console output

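# intersectBed (after the first awk step)
# col5 = HMR length; col6 = CGI name ("." if none); col7 = CGI length; col8 = No. of bp overlapped between HMR & CGI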
chr1    554333  558188  HYPO0   3855    .       0       0
chr1    558271  560164  HYPO1   1893    .       0       0
chr1    703534  704946  HYPO2   1412    CpG|60  563     563
chr1    714082  714363  HYPO3   281     .       0       0
chr1    751975  753029  HYPO4   1054    CpG|115 1029    750

# groupBy
# col5 = HMR length; col6 = No. of CGIs; col7 = total CGI length; col8 = total No. of bp overlapped between HMR & CGIs
chr1    554333  558188  HYPO0   3855    1       0       0
chr1    558271  560164  HYPO1   1893    1       0       0
chr1    703534  704946  HYPO2   1412    1       563     563
chr1    714082  714363  HYPO3   281     1       0       0
chr1    751975  753029  HYPO4   1054    1       1029    750
cd ~/

bsub -q 16G -o stdout -e stderr "intersectBed -a /work3/NRPB1219/hg18_h1_hmr.bed -b Data/cpgIslandExt.bed.gz -wao | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$4,\$3-\$2,\$10,\$9-\$8,\$13 }' | groupBy -i - -g 1-5 -c 7,7,8 -o count,sum,sum | awk -F $'\t' 'BEGIN { OFS=FS } { if(\$7 == 0) print \$1,\$2,\$3,\$4,\$5,0,0,0; else print \$0 }' > Output/hg18_h1_hmr.cgi.bed"

bsub -q 16G -o stdout -e stderr "intersectBed -a /work3/NRPB1219/hg18_imr90_hmr.bed -b Data/cpgIslandExt.bed.gz -wao | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$4,\$3-\$2,\$10,\$9-\$8,\$13 }' | groupBy -i - -g 1-5 -c 7,7,8 -o count,sum,sum | awk -F $'\t' 'BEGIN { OFS=FS } { if(\$7 == 0) print \$1,\$2,\$3,\$4,\$5,0,0,0; else print \$0 }' > Output/hg18_imr90_hmr.cgi.bed"
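
The final awk stage in each pipeline is needed because, with -wao, an HMR that overlaps no CpG island is still reported as one row, so groupBy's count gives 1 for it (see HYPO0 in the groupBy output above). The sketch below isolates that stage and runs it on two sample rows copied from that output. The function name zero_empty_cgi is hypothetical and used only for illustration; it needs only awk, not bedtools.

```shell
# Hypothetical helper: if the summed CGI length (column 7) is 0, the HMR
# overlapped no CpG island, so print 0 for columns 6-8; otherwise pass the
# row through unchanged. This mirrors the last awk stage of the pipeline.
zero_empty_cgi() {
  awk -F '\t' 'BEGIN { OFS = FS } {
    if ($7 == 0) print $1, $2, $3, $4, $5, 0, 0, 0
    else print $0
  }'
}

# Sample rows taken from the groupBy console output (HYPO0 and HYPO2).
printf 'chr1\t554333\t558188\tHYPO0\t3855\t1\t0\t0\n' | zero_empty_cgi
printf 'chr1\t703534\t704946\tHYPO2\t1412\t1\t563\t563\n' | zero_empty_cgi
```

The HYPO0 row should come out with a CGI count of 0, while the HYPO2 row is left as it was.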

Use bjobs to check that all jobs have completed, and ls to check that the files are in the "Output" folder.

ls -la ~/Output/hg18_*_hmr.cgi.bed

# Console output

-rw------- 1 s00yao00 s00yao00 2044351 2014-12-20 18:54 /home/s00yao00/Output/hg18_h1_hmr.cgi.bed
-rw------- 1 s00yao00 s00yao00 3303689 2014-12-20 18:54 /home/s00yao00/Output/hg18_imr90_hmr.cgi.bed
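
Beyond checking that the files exist, it can be worth a quick look at their contents. The following is a minimal sketch, assuming the output files are tab-separated with the 8 columns described above; the helper name summarize_cgi_bed is hypothetical. It reports the number of HMRs, how many of them overlap at least one CGI (column 6 greater than 0), and how many rows do not have exactly 8 columns.

```shell
# Hypothetical helper: summarize one *.cgi.bed file produced above.
summarize_cgi_bed() {
  awk -F '\t' '
    NF != 8 { bad++ }        # rows without exactly 8 tab-separated columns
    $6 > 0  { with_cgi++ }   # HMRs overlapping at least one CpG island
    END { printf "total=%d with_cgi=%d malformed=%d\n", NR, with_cgi, bad }
  ' "$1"
}

# Run the summary on every output file that exists.
for f in ~/Output/hg18_*_hmr.cgi.bed; do
  if [ -e "$f" ]; then
    echo "$f"
    summarize_cgi_bed "$f"
  fi
done
```

A non-zero malformed count would suggest that a job was interrupted or that the input BED files do not have the column layout the pipeline assumes.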
