PMDs

Calculate the amount overlapped between H1 and IMR90 PMDs.

cd ~/

bsub -q 16G -o stdout -e stderr "cat /work3/NRPB1219/hg18_h1_pmd.bed /work3/NRPB1219/hg18_imr90_pmd.bed | sortBed | mergeBed -i - | intersectBed -a - -b /work3/NRPB1219/hg18_h1_pmd.bed -wao | intersectBed -a - -b /work3/NRPB1219/hg18_imr90_pmd.bed -wao | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$6-\$5,sprintf(\"%.2f\",\$10/(\$3-\$2)),\$13-\$12,sprintf(\"%.2f\",\$17/(\$3-\$2)) }' > Output/h1_imr90_pmd_coverage.bed"

The first part of the full command line above produces the merged PMD BED output. We use cat to join the PMDs from H1 and IMR90, sort with sortBed and merge feature regions using mergeBed.

cat /work3/NRPB1219/hg18_h1_pmd.bed /work3/NRPB1219/hg18_imr90_pmd.bed | sortBed | mergeBed -i - | head

# Console output

chr1    9468    29468
chr1    118468  124468
chr1    554378  560468
chr1    828757  839593
chr1    842967  869372
chr1    882748  892862
chr1    896296  906791
chr1    919108  931096
chr1    943504  955027
chr1    957031  976543

Then, we pipe the output from the first part of the command to two consecutive intersectBed to identify the overlapped PMDs between H1 and merged output, and also IMR90 and merged output.

# Console output

# intersectBed
# col1~3: merged PMD; col4~10: H1; col11~17: IMR90
chr5    102737  107812  chr5    102737  107812  PMD3128 98      +       5075    .       -1      -1      .       -1      .       0
chr5    369466  375940  chr5    369466  375940  PMD3129 123     +       6474    .       -1      -1      .       -1      .       0
chr5    524506  667718  chr5    524579  536321  PMD3130 102     +       11742   chr5    524506  667718  PMD1472 97      +       143212
chr5    524506  667718  chr5    540844  549323  PMD3131 106     +       8479    chr5    524506  667718  PMD1472 97      +       143212
chr5    524506  667718  chr5    574176  582161  PMD3132 100     +       7985    chr5    524506  667718  PMD1472 97      +       143212
chr5    722732  728932  chr5    722732  728932  PMD3133 96      +       6200    .       -1      -1      .       -1      .       0
chr5    743822  749562  chr5    743822  749562  PMD3134 104     +       5740    .       -1      -1      .       -1      .       0
chr5    836617  838832  chr5    836617  838832  PMD3135 180     +       2215    .       -1      -1      .       -1      .       0
chr5    1056949 1064091 chr5    1056949 1064091 PMD3136 112     +       7142    .       -1      -1      .       -1      .       0
chr5    1150146 1365624 .       -1      -1      .       -1      .       0       chr5    1150146 1365624 PMD1473 98      +       215478

Use head to print the first 10 lines of the file to standard output..

grep chr5 ~/Output/h1_imr90_pmd_coverage.bed | head

# Console output

chr5    102737  107812  5075    1.00    0       0.00
chr5    369466  375940  6474    1.00    0       0.00
chr5    524506  667718  11742   0.08    143212  1.00
chr5    524506  667718  8479    0.06    143212  1.00
chr5    524506  667718  7985    0.06    143212  1.00
chr5    722732  728932  6200    1.00    0       0.00
chr5    743822  749562  5740    1.00    0       0.00
chr5    836617  838832  2215    1.00    0       0.00
chr5    1056949 1064091 7142    1.00    0       0.00
chr5    1150146 1365624 0       0.00    215478  1.00

Last updated