Finding Heteroplasmic Sites: Two tissues in a single individual

The problem

You would like to find heteroplasmic sites in human mitochondrial DNA. This is similar to finding SNPs, although the frequency of heteroplasmic sites is unpredictable because we are dealing with a mixture of a large number of mtDNA molecules.

The data

We have four sets of Illumina reads representing a single individual. Two of these datasets represent DNA from blood and the other two from buccal cells (cheek swab sample). Why do we have two datasets for each tissue? Because enrichment of mtDNA involves amplification with PCR. The PCR procedure introduces errors, and a good way of correcting for them is to have PCR replicates that can be compared to identify the error threshold.
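
To make the replicate idea concrete, here is a minimal sketch in Python. It is not the method used in this tutorial, which runs in Galaxy. It assumes per-site base counts are already available for each replicate, and the function names, the example counts, and the 1% threshold are all hypothetical values chosen for illustration. A site is kept only if its minor allele frequency reaches the threshold in both replicates.

```python
def minor_allele_frequency(counts):
    """Return the frequency of the second most common base at one site."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    ordered = sorted(counts.values(), reverse=True)
    second = ordered[1] if len(ordered) > 1 else 0
    return second / total


def concordant_sites(rep1, rep2, threshold=0.01):
    """Return positions whose minor allele frequency meets the
    threshold in BOTH replicates (threshold is an assumed value)."""
    shared = sorted(set(rep1) & set(rep2))
    return [
        pos
        for pos in shared
        if minor_allele_frequency(rep1[pos]) >= threshold
        and minor_allele_frequency(rep2[pos]) >= threshold
    ]


# Made-up base counts per position for two PCR replicates of one tissue.
blood_rep1 = {
    100: {"A": 980, "G": 20},   # minor allele seen in both replicates
    200: {"C": 995, "T": 5},    # below the threshold
    300: {"G": 900, "A": 100},  # seen in replicate 1 only
}
blood_rep2 = {
    100: {"A": 970, "G": 30},
    200: {"C": 1000, "T": 0},
    300: {"G": 1000},
}

sites = concordant_sites(blood_rep1, blood_rep2, threshold=0.01)
print(sites)  # [100]
```

Position 300 shows a 10% minor allele in one replicate but none in the other, which is the pattern expected from a PCR error, so it is dropped. A fuller version would also check that the minor base itself is the same in both replicates.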

How this document is organized

There are four parts to this tutorial:

A. Getting things mapped
B. Estimating methodological error
C. Mapping cheek datasets
D. Finding heteroplasmy

Galaxy Histories described in this tutorial

Galaxy Workflows used in this tutorial