Obtain sequencing data

Real-world RNA-seq data

This tutorial uses the dataset published by Hannah R. Parker on sessile serrated adenomas/polyps (SSA/Ps).

Raw reads

We will retrieve the raw data from ArrayExpress, accession E-MTAB-6951.

$ cd /home/USER/SSAPs
$ mkdir fastqs
$ cd fastqs

Download download.sh, place it in the fastqs folder, and run it with sh download.sh

This downloads the paired-end FASTQ files for the 10 pairs of 0-IIa pre-lesion and adjacent normal tissue samples (20 runs, 40 files in total):

$ ls *gz
ERR2675454_1.fastq.gz  ERR2675465_1.fastq.gz  ERR2675478_1.fastq.gz
ERR2675454_2.fastq.gz  ERR2675465_2.fastq.gz  ERR2675478_2.fastq.gz
ERR2675455_1.fastq.gz  ERR2675468_1.fastq.gz  ERR2675479_1.fastq.gz
ERR2675455_2.fastq.gz  ERR2675468_2.fastq.gz  ERR2675479_2.fastq.gz
ERR2675458_1.fastq.gz  ERR2675469_1.fastq.gz  ERR2675480_1.fastq.gz
ERR2675458_2.fastq.gz  ERR2675469_2.fastq.gz  ERR2675480_2.fastq.gz
ERR2675459_1.fastq.gz  ERR2675472_1.fastq.gz  ERR2675481_1.fastq.gz
ERR2675459_2.fastq.gz  ERR2675472_2.fastq.gz  ERR2675481_2.fastq.gz
ERR2675460_1.fastq.gz  ERR2675473_1.fastq.gz  ERR2675484_1.fastq.gz
ERR2675460_2.fastq.gz  ERR2675473_2.fastq.gz  ERR2675484_2.fastq.gz
ERR2675461_1.fastq.gz  ERR2675476_1.fastq.gz  ERR2675485_1.fastq.gz
ERR2675461_2.fastq.gz  ERR2675476_2.fastq.gz  ERR2675485_2.fastq.gz
ERR2675464_1.fastq.gz  ERR2675477_1.fastq.gz
ERR2675464_2.fastq.gz  ERR2675477_2.fastq.gz
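
Because an interrupted download can leave a run with only one of its two files, it may be worth confirming that every _1 file has a matching _2 file before moving on. The following is a minimal sketch, not part of the original tutorial: check_pairs is a hypothetical helper name, and it assumes the <RUN>_1.fastq.gz / <RUN>_2.fastq.gz naming shown in the listing above.

```shell
# Hypothetical helper (not part of download.sh): count complete read pairs
# in a directory and report any run that is missing one of its two files.
# Assumes files are named <RUN>_1.fastq.gz and <RUN>_2.fastq.gz.
check_pairs() {
    dir="${1:-.}"      # directory to inspect, defaults to the current one
    complete=0
    missing=0

    # Every forward-read file should have a reverse-read mate.
    for r1 in "$dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue            # glob matched nothing
        r2="${r1%_1.fastq.gz}_2.fastq.gz"
        if [ -e "$r2" ]; then
            complete=$((complete + 1))
        else
            echo "missing mate: $r2" >&2
            missing=$((missing + 1))
        fi
    done

    # Catch reverse-read files whose forward-read mate is absent.
    for r2 in "$dir"/*_2.fastq.gz; do
        [ -e "$r2" ] || continue
        r1="${r2%_2.fastq.gz}_1.fastq.gz"
        if [ ! -e "$r1" ]; then
            echo "missing mate: $r1" >&2
            missing=$((missing + 1))
        fi
    done

    echo "$complete complete pairs"
    [ "$missing" -eq 0 ]                    # non-zero exit status if any mate is missing
}
```

Run from inside the fastqs folder, check_pairs . should report 20 complete pairs if every file in the listing above is present. This only checks that the files exist; it does not verify that they are intact.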

Clinical information

The clinical data has been simplified from the information provided in E-MTAB-6951. Download the clinical.txt file and place it in the /home/USER/SSAPs folder.

clinical.txt
ENA_RUN	individual	age	sex	disease	sampling_site	paris_classification	microscopic_appearance	clinical_information	number_of_lesions	braf_status	kras_status
ERR2675454	S1	88	female	Colon Sessile Serrated Adenoma/Polyp	pre-lesion	0-IIa	SSA/P	no dysplasia	2	BRAF mutation c.1799T>A (V600E)	wild type KRAS
ERR2675455	S1	88	female	Colon Sessile Serrated Adenoma/Polyp	normal tissue adjacent to pre-lesion	normal	normal	none	NA	normal	normal
ERR2675458	S11	57	male	Colon Sessile Serrated Adenoma/Polyp	pre-lesion	0-IIa	SSA/P	no dysplasia	2	wild type BRAF	KRAS mutation c.37G>C (G13R)
ERR2675459	S11	57	male	Colon Sessile Serrated Adenoma/Polyp	normal tissue adjacent to pre-lesion	normal	normal	none	NA	normal	normal
ERR2675460	S12	49	male	Colon Sessile Serrated Adenoma/Polyp	pre-lesion	0-IIa	SSA/P	no dysplasia	3	BRAF mutation c.1799T>A (V600E)	wild type KRAS
ERR2675461	S12	49	male	Colon Sessile Serrated Adenoma/Polyp	normal tissue adjacent to pre-lesion	normal	normal	none	NA	normal	normal
ERR2675464	S15	36	female	Colon Sessile Serrated Adenoma/Polyp	pre-lesion	0-IIa	SSA/P	no dysplasia	2	BRAF mutation c.1799T>A (V600E)	wild type KRAS
ERR2675465	S15	36	female	Colon Sessile Serrated Adenoma/Polyp	normal tissue adjacent to pre-lesion	normal	normal	none	NA	normal	normal
ERR2675468	S17	53	female	Colon Sessile Serrated Adenoma/Polyp	pre-lesion	0-IIa	SSA/P	no dysplasia	1	BRAF mutation c.1799T>A (V600E)	wild type KRAS
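
Since each individual contributes one pre-lesion run and one adjacent normal run, it can help to list the two run accessions side by side per individual. The sketch below is an illustration only: pair_runs is a hypothetical helper name, and it assumes clinical.txt is tab-separated with the column order shown in the header above (ENA_RUN first, individual second, sampling_site sixth).

```shell
# Hypothetical helper: print "individual<TAB>pre-lesion run<TAB>normal run"
# for each individual in a tab-separated clinical table.
# Assumes column 1 = ENA_RUN, column 2 = individual, column 6 = sampling_site.
pair_runs() {
    awk -F '\t' '
        NR == 1 { next }                              # skip the header row
        $6 == "pre-lesion" { lesion[$2] = $1; next }  # lesion sample
        { normal[$2] = $1 }                           # any other site is treated as normal
        END {
            for (id in lesion)
                print id "\t" lesion[id] "\t" normal[id]
        }
    ' "$1" | sort
}
```

For example, pair_runs /home/USER/SSAPs/clinical.txt would print one line per individual. An individual whose normal-tissue row is absent from the file gets an empty third column, which makes incomplete pairs easy to spot.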
