• 9 October 2020
    Institut de Recherche en Santé - 8 quai Moncousu - Nantes
    Amphithéâtre Denis Escande
  • 11:30

Low-coverage sequencing imputation from large reference panels


Prof. Olivier Delaneau invited by Christian Dina (Eq I)
Department of Computational Biology
University of Lausanne
Switzerland
 

Abstract

Low-coverage whole-genome sequencing (0.5x-1x) followed by imputation has been shown to recapitulate the same signals as imputation from SNP arrays while also discovering new variants (Pasaniuc, 2012; Gilly, 2019). However, existing imputation methods are computationally expensive, and model constraints prevent them from using large reference panels. We describe Low Coverage Caller (LCC), a method for genotype imputation of low-coverage sequencing datasets. The model, based on Li and Stephens (2003), has two key features. First, it uses a linear-time sampling algorithm for haplotype configurations. Second, it reduces the state space by selecting a subset of highly confident haplotypes. Together, these features allow LCC to be efficient while leveraging information from very large reference panels of haplotypes. We use high-coverage data from the 1000 Genomes Project and run LCC and Beagle4.1 on down-sampled coverages in the range 0.1x-8.0x. We also perform imputation on 35 different SNP array models using Beagle5.1. For all experiments, we use the Haplotype Reference Consortium (HRC) panel as the reference panel. We show that our method is more accurate and orders of magnitude faster than other low-coverage sequencing imputation methods. We also show that imputation from 0.5x and 0.8x coverage outperforms imputation from the Illumina Global Screening Array and Omni2.5, respectively. This is particularly true for extremely rare variants, where accuracy improves by ~20%. LCC has a limited computational overhead and outperforms standard imputation from SNP arrays, allowing large-scale association studies to be based on low-coverage sequencing.


Biography

Olivier Delaneau was initially trained as a computer scientist and obtained a PhD in Bioinformatics from the Conservatoire National des Arts et Métiers in Paris in 2008. He then completed two successive postdocs, at the Department of Statistics of the University of Oxford (UK) and at the Department of Genetics and Medicine of the University of Geneva (Switzerland). He recently joined the Department of Computational Biology of the University of Lausanne as an assistant professor. Olivier Delaneau works mainly on two topics. First, he develops efficient statistical tools for the analysis of large-scale genomics datasets, such as shapeit for haplotype estimation, or QTLtools and FastQTL for mapping expression QTLs from RNA-seq data. Second, besides method development, he is interested in the genetic control of gene expression, a topic that he studies using integrative approaches based on multi-omics data.