DNA Methylation Sequencing Analysis

DNA Methylation at CpG Islands

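As in Chapter 2.1, we use the intersectBed and groupBy commands to calculate the average methylation value at each CpG island. intersectBed pairs each island with the bedGraph entries that overlap it, and groupBy then reports, for each island, the number of overlapping entries and the mean of their methylation values (column 10). Sample output from each step is shown first, followed by the commands that produce it.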

# Console output

# intersectBed 
chr1    18598   19673   CpG|116 216     +       chr1    18598   18599   0
chr1    18598   19673   CpG|116 216     +       chr1    18612   18613   0
chr1    18598   19673   CpG|116 216     +       chr1    18614   18615   0
chr1    18598   19673   CpG|116 216     +       chr1    18627   18628   0
chr1    18598   19673   CpG|116 216     +       chr1    18636   18637   0

# groupBy
chr1    18598   19673   116     0
chr1    124987  125426  30      0
chr1    317653  318092  29      0
chr1    427014  428027  84      0
chr1    439136  440407  99      0

# Commands

cd ~/

bsub -q 16G -o stdout -e stderr "intersectBed -a Data/cpgIslandExt.bed.gz -b /work3/NRPB1219/hg18_h1_meth.bedGraph -wa -wb | groupBy -i - -g 1-3 -c 10,10 -o count,mean | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$4,sprintf(\"%.4f\",\$5) }' > Output/cpgIslandExt.h1.meth"

bsub -q 16G -o stdout -e stderr "intersectBed -a Data/cpgIslandExt.bed.gz -b /work3/NRPB1219/hg18_imr90_meth.bedGraph -wa -wb | groupBy -i - -g 1-3 -c 10,10 -o count,mean | awk -F $'\t' 'BEGIN { OFS=FS } { print \$1,\$2,\$3,\$4,sprintf(\"%.4f\",\$5) }' > Output/cpgIslandExt.imr90.meth"
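
To make the groupBy step concrete, the following is a minimal sketch that reproduces the same grouping (columns 1-3), count, and mean (column 10) with plain awk on a few made-up rows. The function name summarize_islands and the sample values are illustrative assumptions, not part of the original pipeline. It also assumes, as groupBy does, that rows belonging to the same island are consecutive in the input.

```shell
# Sketch only: emulates `groupBy -g 1-3 -c 10,10 -o count,mean` followed by
# the 4-decimal formatting done by the awk step in the commands above.
# Assumes tab-separated input in which rows of the same island are consecutive.
summarize_islands() {
  awk -F '\t' 'BEGIN { OFS = FS }
    # Print the finished group: coordinates, entry count, mean to 4 decimals.
    function emit() {
      if (n > 0) print chr, start, end, n, sprintf("%.4f", sum / n)
    }
    {
      key = $1 FS $2 FS $3
      if (key != prev) {            # a new island starts on this row
        emit()
        chr = $1; start = $2; end = $3
        n = 0; sum = 0; prev = key
      }
      n++                           # count of overlapping entries
      sum += $10                    # column 10 holds the methylation value
    }
    END { emit() }'
}

# Hypothetical rows in intersectBed -wa -wb layout (values are made up).
printf 'chr1\t100\t200\tCpG|2\t10\t+\tchr1\t100\t101\t0.5\n' >  sample.intersect
printf 'chr1\t100\t200\tCpG|2\t10\t+\tchr1\t150\t151\t1.0\n' >> sample.intersect
printf 'chr1\t300\t400\tCpG|1\t10\t+\tchr1\t310\t311\t0.25\n' >> sample.intersect

summarize_islands < sample.intersect
# prints (tab-separated):
# chr1  100  200  2  0.7500
# chr1  300  400  1  0.2500
```

The fourth output column is the number of bedGraph entries that overlapped the island, and the fifth is their mean methylation value, which matches the layout of the groupBy sample output above.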

Use bjobs to check that all jobs have completed, and ls to confirm that the output files are in the "Output" folder.

ls -la ~/Output/cpgIslandExt.*.meth

# Console output

-rw------- 1 s00yao00 s00yao00 960543 2014-12-20 17:45 /home/s00yao00/Output/cpgIslandExt.h1.meth
-rw------- 1 s00yao00 s00yao00 960543 2014-12-20 17:44 /home/s00yao00/Output/cpgIslandExt.imr90.meth
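
Beyond listing the files, it can be useful to confirm that each one is non-empty and has the expected five tab-separated columns. The helper below is a minimal sketch under that assumption. The name check_meth_file is hypothetical and not part of the original workflow.

```shell
# Sketch only: sanity-check a .meth file produced by the commands above.
# Succeeds if the file exists, is non-empty, and every line has 5 tab-separated fields.
check_meth_file() {
  file=$1
  # -s is true only for an existing file with size greater than zero.
  [ -s "$file" ] || { echo "missing or empty: $file"; return 1; }
  # awk exits non-zero as soon as a line with a different field count is seen.
  awk -F '\t' 'NF != 5 { bad = 1; exit } END { exit bad + 0 }' "$file" \
    || { echo "unexpected column count: $file"; return 1; }
  echo "ok: $file"
}

# Usage with the paths from this page:
#   check_meth_file ~/Output/cpgIslandExt.h1.meth
#   check_meth_file ~/Output/cpgIslandExt.imr90.meth
```

A non-zero return status from either call suggests that the corresponding job should be inspected (for example via the stderr file named in the bsub command) before moving on.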

Last updated 5 years ago
