Remapping the reads onto the transcripts or the genome
Go to the same directory as this morning

cd ~/exDay2LF

a) remapping onto a transcriptome
bwa example (or use mapbwape.sh ./mapbwape.sh setAc Trinity.fasta 350)
bwa index Trinity.fasta
bwa aln -t 3 Trinity.fasta setBc_1.fq > setBc_1.sai
bwa aln -t 3 Trinity.fasta setBc_2.fq > setBc_2.sai
bwa sampe -a 350 Trinity.fasta setBc_1.sai setBc_2.sai setBc_1.fq setBc_2.fq > Trinity.fasta.sam
samtools faidx Trinity.fasta
samtools view -b -t Trinity.fasta.fai Trinity.fasta.sam > Trinity.fasta.bam
samtools sort Trinity.fasta.bam Trinity.fasta
samtools index Trinity.fasta.bam


bowtie example
#index reference first
bowtie-build –f Trinity.fa refidx
# map reads
bowtie -n 1 -l 101 -I 250 -X 450 --best --strata –un unmapped -p 3 refidx -1
setBc_1.fq -2 setBc_2.fq > mybest.sam
!
# convert to SAM & BAM
# index reference
samtools faidx reference.fa
# convert SAM to BAM
samtools view -T reference.fa -b mybest.sam > mybest.bam
# sort BAM
samtools sort mybest.bam mybest.sorted.bam
# index BAM
samtools index mybest.sorted.bam



b) remapping onto reference genome
bwa example (or use mapbwape.sh ./mapbwape.sh setAc Celegans_genome.fa 350)
bwa index Celegans_genome.fa
bwa aln -t 3 Celegans_genome.fa setBc_1.fq > setBc_1.sai
bwa aln -t 3 Celegans_genome.fa setBc_2.fq > setBc_2.sai
bwa sampe -a 350 Celegans_genome.fa setBc_1.sai setBc_2.sai setBc_1.fq setBc_2.fq > Celegans_genome.fa.sam
samtools faidx Celegans_genome.fa
samtools view -b -t Celegans_genome.fa.fai Celegans_genome.fa.sam > Celegans_genome.fa.bam
samtools sort Celegans_genome.fa.bam Celegans_genome.fa
samtools index Celegans_genome.fa.bam


bowtie example
to come

c) Visualize bam files with IGV, start it with igv.sh
load genome or transcriptome
load gtf file (only for genome)
load BAM file

What do you see?
Are the reads well distributed over the exons? or the transcripts?

d) count reads
from bam files use htseq-count for genome

for transcriptome use this command
samtools view oases.bam | awk '{print $3}' | awk '{count[$1]++} END {for(j in count) print j"\t"count[j]}' > oases.counts

Last modified: Tuesday, 22 January 2013, 9:06 AM