# Alignment-free method

## Salmon

* Here we quantify in **mapping-based mode**. There is an alternative **alignment-based mode**, in which the raw reads are first aligned with another mapper and the alignments (BAM format, in transcript coordinates) are supplied to Salmon. You can read more about this [here](https://salmon.readthedocs.io/en/latest/salmon.html#quantifying-in-alignment-based-mode).
* We are using `--libType ISR` as the sequencing libraries of this dataset were prepared following the Illumina TruSeq Stranded Total RNA protocol. Read [here](https://salmon.readthedocs.io/en/latest/salmon.html?highlight=libType#what-s-this-libtype) about choosing the appropriate `libType`.
* `salmon quant` is run with 6 threads (`--threads 6`)

```bash
cd /home/USER/SSAPs

declare -a runname=("ERR2675454" "ERR2675455" "ERR2675458" "ERR2675459" "ERR2675460" "ERR2675461" "ERR2675464" "ERR2675465" "ERR2675468" "ERR2675469" "ERR2675472" "ERR2675473" "ERR2675476" "ERR2675477" "ERR2675478" "ERR2675479" "ERR2675480" "ERR2675481" "ERR2675484" "ERR2675485")

# Quantify each paired-end sample in turn
for id in "${runname[@]}"; do
        trim1=trimmed/${id}_1.fastq.gz
        trim2=trimmed/${id}_2.fastq.gz

        salmon quant --threads 6 \
        --index /home/USER/db/refanno/gencode.v33_decoys_salmon-1.2.1 \
        --libType ISR \
        --gcBias \
        --output salmon/$id \
        --mates1 $trim1 --mates2 $trim2
done
```
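Before starting a long quantification loop like the one above, it can help to confirm that both trimmed mate files exist for every run. The following is a minimal sketch; the function name `missing_inputs` is hypothetical, and it assumes the `trimmed/<id>_1.fastq.gz` and `trimmed/<id>_2.fastq.gz` naming used in the loop.

```bash
# List trimmed FASTQ files that are missing or empty for the given run IDs.
# Usage: missing_inputs DIR ID [ID ...]; prints one line per missing file.
# (Hypothetical helper, not part of Salmon.)
missing_inputs() {
    local dir=$1 id mate
    shift
    for id in "$@"; do
        for mate in 1 2; do
            # -s is true only if the file exists and is non-empty
            [ -s "$dir/${id}_${mate}.fastq.gz" ] || echo "missing: $dir/${id}_${mate}.fastq.gz"
        done
    done
}
```

For example, `missing_inputs trimmed "${runname[@]}"` should print nothing when all inputs are in place.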

{% code title="salmon/ERR2675454/quant.sf" %}

```bash
Name	Length	EffectiveLength	TPM	NumReads
ENST00000456328.2	1657	1455.216	0.000000	0.000
ENST00000450305.2	632	468.000	0.000000	0.000
ENST00000488147.1	1351	1031.467	5.868714	134.186
ENST00000619216.1	68	9.000	0.000000	0.000
ENST00000473358.1	712	548.000	0.000000	0.000
ENST00000469289.1	535	371.000	0.000000	0.000
ENST00000607096.1	138	26.000	0.000000	0.000
ENST00000417324.1	1187	1023.000	0.000000	0.000
ENST00000461467.1	590	426.000	0.000000	0.000
```

{% endcode %}
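A quick sanity check on a `quant.sf` file is to count how many transcripts it lists and how many have a non-zero TPM. The sketch below assumes the five-column, tab-separated layout shown above (TPM in the fourth column); the function name `summarize_quant` is hypothetical.

```bash
# Summarise a Salmon quant.sf file: number of transcripts and how many have TPM > 0.
# Assumes a single header line and TPM in column 4.
summarize_quant() {
    awk -F'\t' 'NR > 1 { total++; if ($4 > 0) expressed++ }
        END { printf "transcripts=%d expressed=%d\n", total, expressed }' "$1"
}
```

For example, run `summarize_quant salmon/ERR2675454/quant.sf` once quantification has finished.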

## Salmon with bootstrap

Inspired by kallisto, Salmon can also compute bootstrapped abundance estimates. These are useful for downstream tools that can make use of such uncertainty estimates (e.g. sleuth, for differential expression analysis).

Bootstrapping is enabled by passing `--numBootstraps N`, where `N` is a positive integer that sets the number of bootstrap samples to compute. The more samples computed, the better the estimates of variance, but the more computation (and time) required.

```bash
cd /home/USER/SSAPs

declare -a runname=("ERR2675454" "ERR2675455" "ERR2675458" "ERR2675459" "ERR2675460" "ERR2675461" "ERR2675464" "ERR2675465" "ERR2675468" "ERR2675469" "ERR2675472" "ERR2675473" "ERR2675476" "ERR2675477" "ERR2675478" "ERR2675479" "ERR2675480" "ERR2675481" "ERR2675484" "ERR2675485")

# Quantify each paired-end sample with 100 bootstrap samples
for id in "${runname[@]}"; do
        trim1=trimmed/${id}_1.fastq.gz
        trim2=trimmed/${id}_2.fastq.gz

        salmon quant --threads 6 \
        --index /home/USER/db/refanno/gencode.v33_decoys_salmon-1.2.1 \
        --libType ISR \
        --gcBias \
        --numBootstraps 100 \
        --output salmon-bs/$id \
        --mates1 $trim1 --mates2 $trim2
done
```

```bash
$ ls salmon-bs/ERR2675454/aux_info
ambig_info.tsv    exp_gc.gz       observed_bias_3p.gz
bootstrap         fld.gz          observed_bias.gz
expected_bias.gz  meta_info.json  obs_gc.gz

$ ls salmon-bs/ERR2675454/aux_info/bootstrap/
bootstraps.gz  names.tsv.gz
```
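With many samples it is easy to miss a run that finished without bootstrap output. The following sketch checks for the `bootstraps.gz` file shown in the listing above; the function name `check_bootstraps` is hypothetical, and the path layout is assumed to match that listing.

```bash
# Report samples whose Salmon output lacks a non-empty bootstraps.gz file.
# Usage: check_bootstraps OUTDIR ID [ID ...]; returns non-zero if any are missing.
check_bootstraps() {
    local outdir=$1 status=0 id
    shift
    for id in "$@"; do
        if [ ! -s "$outdir/$id/aux_info/bootstrap/bootstraps.gz" ]; then
            echo "missing bootstraps: $id"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_bootstraps salmon-bs "${runname[@]}"` should print nothing if every sample has bootstrap output.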

## Kallisto with bootstrap

* `kallisto quant` is run with 6 threads (`-t 6`) and `--rf-stranded` as the appropriate library type
* Bootstrapping is enabled by passing `-b N`, where `N` is a positive integer that sets the number of bootstrap samples to compute
* Unlike Salmon, Kallisto **does not** create the top-level folder that holds the sample-specific output files, so we need to create the top-level folder `kallisto` before running `kallisto quant`

```bash
cd /home/USER/SSAPs

declare -a runname=("ERR2675454" "ERR2675455" "ERR2675458" "ERR2675459" "ERR2675460" "ERR2675461" "ERR2675464" "ERR2675465" "ERR2675468" "ERR2675469" "ERR2675472" "ERR2675473" "ERR2675476" "ERR2675477" "ERR2675478" "ERR2675479" "ERR2675480" "ERR2675481" "ERR2675484" "ERR2675485")

# Create the top-level output folder (no error if it already exists)
mkdir -p kallisto
for id in "${runname[@]}"; do
        trim1=trimmed/${id}_1.fastq.gz
        trim2=trimmed/${id}_2.fastq.gz

        kallisto quant -t 6 \
        -i /home/USER/db/refanno/gencode.v33_kallisto-0.46.2 \
        --rf-stranded -b 100 \
        -o kallisto/$id $trim1 $trim2
done
```

```bash
$ ls kallisto/ERR2675454
abundance.h5  abundance.tsv  run_info.json
```

{% code title="kallisto/ERR2675454/abundance.tsv" %}

```bash
target_id	length	eff_length	est_counts	tpm
ENST00000456328.2	1657	1493.74	3.28824	0.144435
ENST00000450305.2	632	468.87	0	0
ENST00000488147.1	1351	1187.74	67.8655	3.74895
ENST00000619216.1	68	16.6641	0	0
ENST00000473358.1	712	548.742	0	0
ENST00000469289.1	535	372.101	0	0
ENST00000607096.1	138	27.0133	0	0
ENST00000417324.1	1187	1023.74	0	0
ENST00000461467.1	590	426.87	0.5	0.0768524
```

{% endcode %}
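Because both tools report TPM per transcript, their estimates can be placed side by side for a quick comparison. The sketch below assumes the column layouts shown in the two listings above (TPM in column 4 of `quant.sf` and column 5 of `abundance.tsv`) and identical transcript IDs in both files; the function name `compare_tpm` is hypothetical.

```bash
# Print transcript ID, Salmon TPM and kallisto TPM as tab-separated columns
# for every transcript present in both files.
# Usage: compare_tpm QUANT_SF ABUNDANCE_TSV
compare_tpm() {
    awk -F'\t' -v OFS='\t' '
        FNR == 1 { next }                    # skip the header line of each file
        NR == FNR { salmon[$1] = $4; next }  # first file: store Salmon TPM by ID
        ($1 in salmon) { print $1, salmon[$1], $5 }
    ' "$1" "$2"
}
```

For example, `compare_tpm salmon/ERR2675454/quant.sf kallisto/ERR2675454/abundance.tsv | head` shows the first few shared transcripts.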

