Bhartiya Krishi Anusandhan Patrika, Volume 38, Issue 3 (September 2023): 193-202

Issues and Challenges of Imputation Techniques in Genome Wide Association Studies (GWAS): A Review

Rahul Banerjee1,*, Bharti1, Shbana Begum2, Pankaj Das1, Tauqueer Ahmad1
1ICAR-Indian Agricultural Statistics Research Institute, Library Avenue, Pusa-110 012, New Delhi, India.
2ICAR-National Institute for Plant Biotechnology, LBS Centre, Pusa-110 012, New Delhi, India.
  • Submitted: 21-09-2022

  • Accepted: 23-09-2023

  • First Online: 13-10-2023

  • doi: 10.18805/BKAP597

Cite article: Banerjee Rahul, Bharti, Begum Shbana, Das Pankaj, Ahmad Tauqueer (2023). Issues and Challenges of Imputation Techniques in Genome Wide Association Studies (GWAS): A Review. Bhartiya Krishi Anusandhan Patrika. 38(3): 193-202. doi: 10.18805/BKAP597.
A genome-wide association study (GWAS) scans DNA markers across the genomes of many individuals to find genetic variants associated with a disease. Such findings aid in disease detection, treatment and prevention. In these studies, genotype data are often missing because of quality, cost or study-design constraints, and imputation predicts the genotypes that were not typed. It is an established statistical technique that estimates unobserved genotypes by borrowing haplotype segments from a densely genotyped reference panel, which allows associations to be estimated and tested at variants that were not directly assayed. Genotype imputation is therefore vital in the analysis of genome-wide association scans, as it helps geneticists evaluate the evidence for association at untyped genetic markers. This review outlines missing-data issues and the various imputation methods.
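The idea of borrowing haplotype segments from a reference panel can be illustrated with a deliberately simplified sketch. It assumes phased haplotypes coded as 0/1 alleles, marks untyped markers as `None`, and copies the missing alleles from the single reference haplotype that agrees best at the typed markers. The function name `impute_haplotype` and the toy data are hypothetical. Practical imputation methods generally rely on probabilistic models over many reference haplotypes rather than this nearest-match rule.

```python
from typing import List, Optional


def impute_haplotype(study: List[Optional[int]],
                     panel: List[List[int]]) -> List[int]:
    """Fill untyped markers (None) by copying alleles from the reference
    haplotype with the fewest mismatches at the typed markers."""
    # Positions where the study haplotype was actually genotyped.
    typed = [i for i, allele in enumerate(study) if allele is not None]

    def mismatches(ref: List[int]) -> int:
        # Count disagreements with the study haplotype at typed markers only.
        return sum(ref[i] != study[i] for i in typed)

    # Ties are resolved in favour of the first haplotype in the panel.
    best = min(panel, key=mismatches)

    # Keep observed alleles; borrow the rest from the best-matching haplotype.
    return [best[i] if allele is None else allele
            for i, allele in enumerate(study)]


# Toy reference panel: three haplotypes typed at six markers (illustrative data).
reference_panel = [
    [0, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
]

# Study haplotype typed at markers 0, 2, 4 and 5; markers 1 and 3 are untyped.
study_haplotype = [1, None, 0, None, 0, 1]

imputed = impute_haplotype(study_haplotype, reference_panel)
print(imputed)  # [1, 0, 0, 1, 0, 1]
```

Here the second reference haplotype matches the study haplotype at every typed marker, so its alleles at markers 1 and 3 are used as the imputed values. An association test could then be run at those markers even though they were never assayed in the study sample.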

