QC & trimming

QC

FastQC is run with 2 threads (-t 2) and took about 1.3 hours to complete. Results are placed in the fastqc folder.

$ cd /home/USER/SSAPs
$ mkdir fastqc

$ declare -a runname=("ERR2675454" "ERR2675455" "ERR2675458" "ERR2675459" "ERR2675460" "ERR2675461" "ERR2675464" "ERR2675465" "ERR2675468" "ERR2675469" "ERR2675472" "ERR2675473" "ERR2675476" "ERR2675477" "ERR2675478" "ERR2675479" "ERR2675480" "ERR2675481" "ERR2675484" "ERR2675485")

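# Run FastQC on both mates of each paired-end run
for id in "${runname[@]}"; do
        fq1=fastqs/${id}_1.fastq.gz
        fq2=fastqs/${id}_2.fastq.gz

        # -t 2: use two threads; --extract: unzip each report so summary.txt is readable
        fastqc -t 2 --extract -o fastqc "$fq1" "$fq2"
done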

Results can be viewed by opening the *.html files in a web browser, or by reading summary.txt and fastqc_data.txt in the extracted output folders.

fastqc/ERR2675454_1_fastqc/summary.txt
PASS    Basic Statistics        ERR2675454_1.fastq.gz
PASS    Per base sequence quality       ERR2675454_1.fastq.gz
PASS    Per tile sequence quality       ERR2675454_1.fastq.gz
PASS    Per sequence quality scores     ERR2675454_1.fastq.gz
WARN    Per base sequence content       ERR2675454_1.fastq.gz
PASS    Per sequence GC content ERR2675454_1.fastq.gz
PASS    Per base N content      ERR2675454_1.fastq.gz
PASS    Sequence Length Distribution    ERR2675454_1.fastq.gz
FAIL    Sequence Duplication Levels     ERR2675454_1.fastq.gz
PASS    Overrepresented sequences       ERR2675454_1.fastq.gz
FAIL    Adapter Content ERR2675454_1.fastq.gz

Per base sequence quality of ERR2675454_1.fastq.gz
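
With 20 paired-end runs there are 40 summary.txt files, so it can help to list only the modules that did not pass. The following is a minimal sketch, assuming the fastqc/<sample>_fastqc/summary.txt layout created by --extract and the tab-separated columns shown above (status, module, file name). The function name summarize_fastqc is a hypothetical helper, not part of FastQC.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every WARN/FAIL row from all FastQC summaries.
# Assumes summary.txt is tab-separated: status <TAB> module <TAB> file name.
summarize_fastqc() {
    local outdir=${1:-fastqc}   # FastQC output folder (default: fastqc)
    local f
    for f in "$outdir"/*_fastqc/summary.txt; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        # Reorder columns to: file name, status, module; drop PASS rows
        awk -F'\t' '$1 != "PASS" { print $3 "\t" $1 "\t" $2 }' "$f"
    done
}

summarize_fastqc fastqc
```

For the ERR2675454_1 summary shown above, this would list three rows: the WARN for Per base sequence content and the FAILs for Sequence Duplication Levels and Adapter Content.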

Adapter removal and trimming

The trimming process is run with 6 threads (threads=6) and took about 1.6 hours to complete.

The generated log files report how many reads and bases were removed and how many passed trimming.
