Heteroplasmy in humans is the condition in which an individual carries multiple, replicated versions of extra-nuclear genomic DNA, such as mitochondrial DNA (mtDNA). Everyone is mtDNA heteroplasmic, but the rate of heteroplasmy and the specific locations of SNPs vary by individual. Several disease conditions are associated with heteroplasmy, and there is active research exploring the relationship between heteroplasmy and aging.
Being heteroplasmic is not the same as being chimeric. Do you know why? What are the mechanisms of inheritance for each condition?
• Go ahead, start with Wikipedia to learn more: http://en.wikipedia.org/wiki/Heteroplasmy
Import the child and mother .fastq datasets and review the metadata for accuracy. Run FASTQC on the datasets, confirm the Illumina .fastqsanger format, and note the mean and range of the Sanger PHRED+33 quality scores (*Q20 indicates a sequencing error rate of ~1.0%, or 1/100; Q10 is ~10.0%, or 1/10; Q30 is ~0.1%, or 1/1000). Map using BWA, then filter using SAMTools for properly mapped pairs only. Convert the resulting SAM file to BAM, add read groups with Picard, then merge the two input datasets (child and mother) with Concatenate. Run the variant analysis tools Naive Variant Caller and FreeBayes. Note that FreeBayes at its defaults will miss low-frequency variants, whereas Naive Variant Caller reports all polymorphic sites (SNPs). Filter the Naive Variant Caller results with the tool Variant Annotator using the estimated sequencing error rate of 1.0%, then focus on polymorphisms present in the population at a frequency >= 0.02 (2%).
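The quality-score figures quoted above follow from the standard Phred relation, error probability = 10^(-Q/10), and a PHRED+33 quality string stores each score as the ASCII code of a character minus 33. The short Python sketch below illustrates both conversions. The function names and the example quality string are made up for illustration and are not part of FASTQC or any Galaxy tool.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Decode a Sanger (PHRED+33) quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


# Hypothetical quality string, chosen so the scores are 40, 20, 10 and 0.
scores = decode_phred33("I5+!")
for q in scores:
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```

Averaging the decoded scores over a read, or taking their minimum and maximum, gives the same kind of mean and range that the FASTQC report summarizes per position.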
• Explore the results. These are the native, potentially significant, locations of SNPs found both within and between the populations of mitochondria from the mother and child.
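To make the 2% cutoff concrete, the following Python sketch computes a minor allele frequency from per-site base counts and keeps only sites at or above the threshold. It is an illustration of the arithmetic only, not the implementation of Variant Annotator or Naive Variant Caller. The function names, the input layout, and the positions and counts are all hypothetical.

```python
def minor_allele_frequency(counts):
    """Return the frequency of the second most common base at one site."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    ordered = sorted(counts.values(), reverse=True)
    second = ordered[1] if len(ordered) > 1 else 0
    return second / total


def sites_above_threshold(site_counts, min_maf=0.02):
    """Keep sites whose minor allele frequency is at least min_maf."""
    kept = {}
    for position, counts in site_counts.items():
        maf = minor_allele_frequency(counts)
        if maf >= min_maf:
            kept[position] = maf
    return kept


# Hypothetical base counts per position (not real data).
example_sites = {
    100: {"A": 950, "G": 50},   # minor allele at 5%, kept
    200: {"C": 995, "T": 5},    # minor allele at 0.5%, below 2%, dropped
    300: {"T": 1000},           # no alternate allele, dropped
}

print(sites_above_threshold(example_sites))
```

A threshold of 0.02 sits above the estimated 1.0% sequencing error rate, which is why sites like the second one in this made-up example are treated as indistinguishable from noise.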
* Source at Illumina: http://www.illumina.com/truseq/quality_101/quality_scores.ilmn