DNA Methylation at Repeat Elements
The BED file containing the repeat element annotations (prepared in Chapter 1.3) is located in the Data folder. In the hg18 version of the RepMask 3.2.7 annotation, the repeat elements are categorized into 21 repeat classes (see Table 2). In this demonstration, we will calculate and compare the methylation levels of five common repeat classes: SINE, LINE, LTR, Satellite and DNA.
Table 2. Number of features in each repeat class in the RepMask 3.2.7 annotation (build hg18)
No. of Entries    Repeat Class
1757823           SINE
1468898           LINE
699087            LTR
454482            DNA
407205            Simple_repeat
364098            Low_complexity
6998              Unknown
6096              Satellite
4251              snRNA
3589              Other
2204              RC
1871              DNA?
1751              tRNA
1715              rRNA
1437              srpRNA
1296              scRNA
715               RNA
417               SINE?
123               LTR?
93                Unknown?
52                LINE?
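A per-class tally like Table 2 can be reproduced from the BED file itself. The following is a minimal sketch, assuming the file is tab-separated and that the repeat class sits in column 4 (as in the console output shown in this section). The function name `count_repeat_classes` is hypothetical and used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count BED records per repeat class (column 4)
# and list the most frequent class first.
count_repeat_classes() {
    awk -F '\t' '{ n[$4]++ } END { for (c in n) print n[c] "\t" c }' |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Small demonstration on five records (first four BED columns only).
printf 'chr1\t468\t1310\tSatellite\nchr1\t1540\t1643\tDNA\nchr1\t5128\t5208\tSINE\nchr1\t8769\t8911\tLINE\nchr1\t9877\t10268\tLINE\n' |
    count_repeat_classes

# On the annotation used in this chapter, the call would look like:
# zcat Data/hg18.rmskRM327.bed.gz | count_repeat_classes
```

Counting in awk avoids sorting the full annotation before tallying; only the short list of classes is sorted at the end.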
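As before, we use the intersectBed and groupBy commands to calculate the average methylation value of each repeat element in the five repeat classes. intersectBed pairs every repeat element with the CpG sites it overlaps, and groupBy then counts those sites and averages their methylation values per element. The first few output lines of each step are shown here, followed by the commands themselves.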
# Console output
# intersectBed
chr1 468 1310 Satellite 0 - telo TAR1 chr1 468 469 0
chr1 468 1310 Satellite 0 - telo TAR1 chr1 470 471 0.666667
chr1 468 1310 Satellite 0 - telo TAR1 chr1 483 484 0.5
chr1 468 1310 Satellite 0 - telo TAR1 chr1 488 489 1
chr1 468 1310 Satellite 0 - telo TAR1 chr1 492 493 0.857143
# groupBy
chr1 468 1310 Satellite 89 0.21305
chr1 1540 1643 DNA 1 0
chr1 5128 5208 SINE 1 0
chr1 8769 8911 LINE 2 0
chr1 9877 10268 LINE 8 0
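To make the grouping step concrete, the sketch below mimics in plain awk what `groupBy -g 1-4 -c 12,12 -o count,mean` computes. It is an illustration rather than a replacement for groupBy: it assumes tab-separated input, and it groups over the whole input, whereas groupBy works on consecutive runs of sorted lines. The function names `sample_rows` and `group_count_mean` are hypothetical.

```shell
#!/usr/bin/env bash
# The five intersectBed lines shown above, converted to tab-separated form.
sample_rows() {
    tr ' ' '\t' <<'EOF'
chr1 468 1310 Satellite 0 - telo TAR1 chr1 468 469 0
chr1 468 1310 Satellite 0 - telo TAR1 chr1 470 471 0.666667
chr1 468 1310 Satellite 0 - telo TAR1 chr1 483 484 0.5
chr1 468 1310 Satellite 0 - telo TAR1 chr1 488 489 1
chr1 468 1310 Satellite 0 - telo TAR1 chr1 492 493 0.857143
EOF
}

# Group on columns 1-4, then report the number of CpG sites and the
# mean of column 12 (the methylation value) for each group.
group_count_mean() {
    awk -F '\t' 'BEGIN { OFS = FS }
    {
        key = $1 OFS $2 OFS $3 OFS $4
        if (!(key in n)) order[++k] = key   # remember first-seen order
        n[key]++
        sum[key] += $12
    }
    END {
        for (i = 1; i <= k; i++)
            print order[i], n[order[i]], sprintf("%.4f", sum[order[i]] / n[order[i]])
    }'
}

sample_rows | group_count_mean
```

These five lines are only the first of the 89 CpG sites that overlap this Satellite element, so the mean printed here differs from the 0.21305 reported by groupBy for the full element.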
cd ~/
bsub -q 16G -o stdout -e stderr "intersectBed -a Data/hg18.rmskRM327.bed.gz -b /work3/NRPB1219/hg18_h1_meth.bedGraph -wa -wb | grep -Pw \"Satellite|DNA|LTR|LINE|SINE\" | groupBy -i - -g 1-4 -c 12,12 -o count,mean | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$4,\$5,sprintf(\"%.4f\",\$6) }' > Output/hg18.rmskRM327.h1.meth"
bsub -q 16G -o stdout -e stderr "intersectBed -a Data/hg18.rmskRM327.bed.gz -b /work3/NRPB1219/hg18_imr90_meth.bedGraph -wa -wb | grep -Pw \"Satellite|DNA|LTR|LINE|SINE\" | groupBy -i - -g 1-4 -c 12,12 -o count,mean | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$4,\$5,sprintf(\"%.4f\",\$6) }' > Output/hg18.rmskRM327.imr90.meth"
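One detail of the filter step is worth knowing: `grep -w` searches the whole line for the class names as words, and because `?` is not a word character, a line whose class is `DNA?`, `SINE?`, `LTR?` or `LINE?` also matches. If only the five exact classes are wanted, a stricter alternative is to test column 4 directly. The sketch below assumes tab-separated input with the class in column 4. The function names and all demo rows (coordinates, family and repeat names) are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stricter filter: keep a record only if column 4 is
# exactly one of the five classes of interest.
keep_main_classes() {
    awk -F '\t' '$4 ~ /^(Satellite|DNA|LTR|LINE|SINE)$/'
}

# Made-up demo records; only the class column matters here.
sample_classes() {
    tr ' ' '\t' <<'EOF'
chr1 100 200 DNA 0 + famA nameA
chr1 300 400 DNA? 0 - famB nameB
chr1 500 600 SINE? 0 + famC nameC
chr1 700 800 LINE 0 - famD nameD
EOF
}

# Only the DNA and LINE records pass the filter.
sample_classes | keep_main_classes
```

If this filter were placed inside the double-quoted bsub command string, `$4` would need to be written as `\$4`, in the same way the awk step in the original command escapes its field references.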
Use bjobs to check that all jobs have completed and ls to check that the files are in the "Output" folder.
ls -la ~/Output/hg18.rmskRM327.*.meth
# Console output
-rw------- 1 s00yao00 s00yao00 119358409 2014-12-20 19:57 /home/s00yao00/Output/hg18.rmskRM327.h1.meth
-rw------- 1 s00yao00 s00yao00 119358409 2014-12-20 19:57 /home/s00yao00/Output/hg18.rmskRM327.imr90.meth
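Beyond checking that the files exist, a quick per-class summary can confirm that the contents look sensible. The following is a minimal sketch, assuming the six-column tab-separated layout written by the commands above (chromosome, start, end, class, CpG count, mean methylation). The function name `class_mean` is hypothetical, and the average it reports is an unweighted mean over repeat elements, not over CpG sites.

```shell
#!/usr/bin/env bash
# Hypothetical helper: per repeat class, print the number of elements
# and the unweighted mean of their methylation values (column 6).
class_mean() {
    awk -F '\t' 'BEGIN { OFS = FS }
    { n[$4]++; s[$4] += $6 }
    END { for (c in n) print c, n[c], sprintf("%.4f", s[c] / n[c]) }' "$1" |
        LC_ALL=C sort
}

# Demonstration on the five groupBy lines shown earlier in this section.
demo_file="$(mktemp)"
tr ' ' '\t' > "$demo_file" <<'EOF'
chr1 468 1310 Satellite 89 0.21305
chr1 1540 1643 DNA 1 0
chr1 5128 5208 SINE 1 0
chr1 8769 8911 LINE 2 0
chr1 9877 10268 LINE 8 0
EOF
class_mean "$demo_file"
rm -f "$demo_file"

# On the real results, the calls would look like:
# class_mean ~/Output/hg18.rmskRM327.h1.meth
# class_mean ~/Output/hg18.rmskRM327.imr90.meth
```

Running the helper on both output files gives a first side-by-side view of the five classes in the two cell lines.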