A model of higher accuracy for the individual haplotyping problem based on weighted SNP fragments and genotype with errors. | Academic Article individual record

MOTIVATION: In genetic studies of complex diseases, haplotypes provide more information than genotypes. However, haplotyping is much more difficult than genotyping using biological techniques. Therefore effective computational techniques have been in demand. The individual haplotyping problem is the computational problem of inducing a pair of haplotypes from an individual's aligned SNP fragments. Based on various optimal criteria and including different extra information, many models for the problem have been proposed. Higher accuracy of the models has been an important issue in the study of haplotype reconstruction. RESULTS: The current article proposes a highly accurate model for the single individual haplotyping problem based on weighted fragments and genotypes with errors. The model is proved to be NP-hard even with gapless fragments. Based on the characteristics of Single Nucleotide Polymorphism (SNP) fragments, a parameterized algorithm of time complexity O(nk(2)2(k(2)) + m log m + mk(1)) is developed, where m is the number of fragments, n is the number of SNP sites, k(1) is the maximum number of SNP sites that a fragment covers (no more than n and usually smaller than 10) and k(2) is the maximum number of the fragments covering a SNP site (usually no more than 19). Extensive experiments show that this model is more accurate in haplotype reconstruction than other models. AVAILABILITY: The program of the parameterized algorithm can be obtained by sending an email to the corresponding author.

author list (cited authors)
Xie, M., Wang, J., & Chen, J.
publication date
published in
  • Base Sequence
  • Reproducibility Of Results
  • Polymorphism, Single Nucleotide
  • Genetic Linkage
  • Molecular Sequence Data
  • Chromosome Mapping
  • Models, Genetic
  • Haplotypes
  • Sensitivity And Specificity
  • Computer Simulation
  • Sequence Analysis, DNA
  • Genotype
citation count