Anthony Herzig, Bretagne Occidentale University
12 September 2023, 11:30, Amphi DE
Current methods in statistical imputation: going beyond missing genotypes
Anthony Herzig, PhD
Postdoctoral Fellow
Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
Abstract
Statistical imputation of missing genotypes using population data has become exceedingly accurate. This is largely enabled by the increasing availability of enormous reference datasets and by increased efforts to sequence population-specific panels. Highly accurate imputation allows high-resolution sequencing data to be obtained at reduced cost; resources such as the new UK Biobank imputation service carry taglines along the lines of ‘near-perfect genome imputation for less than £0.10’. Furthermore, the vogue for shallow whole-genome sequencing reflects a growing tendency to integrate imputation more directly into bioinformatics pipelines.
Here we discuss current efforts and future challenges for genotype imputation strategies in French populations, focusing on several large-scale sequencing projects. Notably, we demonstrate the impact of fine-scale population structure on imputation efficacy. Furthermore, the modelling of haplotype sharing that underpins statistical imputation is increasingly being repurposed to empower genetic association studies. We give an overview of these recent advances and introduce our recently developed method, a surrogate-family based association test (SURFBAT).
Grant: French Ministry of Research PFMG2025, ANR IA-10-LABX-0013 FranceGenRef and ANR-11-INBS-0002 Constances.
Biography
Anthony Herzig has a background in mathematics and statistics, with degrees from the University of Cambridge and the University of Southampton in the United Kingdom. In 2015, he began his PhD studying the architecture of complex traits in isolated populations. Now a postdoc with INSERM in Brest, his research continues to centre on methodological development for understanding the role of genetics in multifactorial disease in human populations. His work has often focused on practical aspects of data quality and availability, in particular on methods for imputing missing genotypic data.