Perform DE analysis

We will now perform differential gene expression analysis using R.

Download the above R script, save in the LSLNGS2015 folder, unzip and run

cd ~/LSLNGS2015

Rscript --vanilla run_edgeR.R

Output shown in the terminal

Loading required package: limma
Loading required package: methods
null device
          1
null device
          1
null device
          1

View tabular results

The R script output and save the results of the analysis into 2 files: DE_analysis.gene.txt and DE_analysis.transcript.txt.

head DE_analysis.gene.txt

chr     left    right   gid     score   strand  name    source  biotype logFC   logCPM  F       PValue  FDR
1       14403   29570   ENSG00000227232 .       -       WASH7P  havana  unprocessed_pseudogene  2.89008492452335        2.97744129628027        103.370995316555        8.24483666908638e-09    3.265037563718e-08
1       89294   133723  ENSG00000238009 .       -       RP11-34P13.7    ensembl_havana  lincRNA 11.5139234165179        1.60559878078354        167.167165147925        2.10113367749614e-10    1.2064846331368e-09
1       135140  135895  ENSG00000268903 .       -       RP11-34P13.15   havana  processed_pseudogene    5.06459149590687        0.0881194042027241      42.478504740963 6.16479106039061e-06    1.38838943956985e-05
1       184922  200322  ENSG00000279457 .       -       FO538757.1      ensembl protein_coding  1.69721957446631        1.80636207182208        24.1093251919636        0.000103795172528207    0.000194091404654342
1       257863  297502  ENSG00000228463 .       -       AP006222.2      havana  lincRNA 13.0394746095367        3.07489541107415        477.124570057853        3.59556529227458e-14    4.97461947310632e-13
1       629639  630683  ENSG00000225630 .       +       MTND2P28        havana  unprocessed_pseudogene  3.13735915488644        4.44862764583624        199.762650496711        3.1768017645391e-10 1.76596074752208e-09
1       631073  632616  ENSG00000237973 .       +       MTCO1P12        havana  unprocessed_pseudogene  0.443696840509747       4.12363294925077        6.13472027306747        0.02307860917183    0.0308376479322919
1       633695  634376  ENSG00000248527 .       +       MTATP6P1        havana  unprocessed_pseudogene  1.34550546639341        5.77825431060693        3.37645905964982        0.191889704720111   0.224260991075299
1       725884  778626  ENSG00000228327 .       -       RP11-206L10.2   havana  transcribed_unprocessed_pseudogene      1.88938948596173        2.34383806326781        33.6517600212232        1.78166537736188e-05 3.74464140783175e-05

head DE_analysis.transcript.txt

chr     left    right   tid     score   strand  name    source  biotype logFC   logCPM  F       PValue  FDR
1       34553   36081   ENST00000417324 .       -       FAM138A havana  lincRNA 14.7287711778192        4.74754988064348        855.407406314551        9.10858850487461e-05    0.00433065266019644
1       110952  129173  ENST00000471248 .       -       RP11-34P13.7    ensembl_havana  lincRNA -11.6703832090145       1.69559510020159        212.015384690327        0.000721411986621592    0.0087655978717093
1       135140  135895  ENST00000494149 .       -       RP11-34P13.15   havana  processed_pseudogene    0.335112637932721       0.296069849428322       0.0990772814490298      0.768783599380141   0.826519388419821
1       165888  168767  ENST00000491962 .       -       RP11-34P13.13   havana  processed_transcript    0.653389858107978       2.37876611393755        2.69735804942855        0.176126586222385   0.266941473371366
1       257863  264733  ENST00000442116 .       -       AP006222.2      havana  lincRNA 2.10853347357073        1.82097762397731        30.5633854683598        0.00528769030910118     0.0209971307641988
1       258567  259024  ENST00000450734 .       -       RPL23AP21       havana  processed_pseudogene    1.85301838291288        3.06998220640272        13.5659404366886        0.0212839694345138  0.0554117574914843
1       266360  297502  ENST00000424587 .       -       AP006222.2      havana  lincRNA -10.8844390575141       0.915559268435426       124.294800531017        0.00158219973633488     0.0119137545202454
1       347981  348366  ENST00000458203 .       -       RPL23AP24       havana  processed_pseudogene    1.76650999454457        1.51923976505446        7.9898205742989 0.0477136055714843      0.0994641817520193
1       357382  359681  ENST00000441866 .       -       RP5-857K21.15   havana  lincRNA 0.260927151869554       0.438550143910703       0.120976329938423       0.745572526693337       0.807497648960747

Download PDFs and view plots

Multidimensional scaling (MDS) plot

MDS plot shows clear separation of the GM12878 versus K562 samples.

Mean-Difference (MD) Plot per sample

The plotMD() from limma generate mean difference plot displaying the log fold changes and average expression values for each gene (or M versus A plot or MA plot).

Gene-level plots

Transcript-level plots

Mean-Difference (MD) Plot, between 2 cell lines

The differentially expressed genes (adjusted P = 0.05) are highlighted in red.

Last updated