Inference of host-pathogen interaction matrices from genome-wide polymorphism data.
Märkle H., John S., Metzger L., STOP-HCV Consortium, Ansari MA., Pedergnana V., Tellier A.
Host-pathogen coevolution is defined as reciprocal evolutionary change in both species, driven by genotype-by-genotype (GxG) interactions at the genetic level that determine the outcome and severity of infection. While co-analyses of host and pathogen genomes (co-GWAs) allow us to pinpoint the interacting genes, they do not reveal which host genotypes are resistant to which pathogen genotypes. Knowledge of this so-called infection matrix is important for agriculture and medicine. Building on established theories of host-pathogen interactions, we derive four novel indices that capture the characteristics of the infection matrix. These indices can be computed from full-genome polymorphism data of randomly sampled uninfected hosts, together with infected hosts and their pathogen strains. We use these indices in an Approximate Bayesian Computation method to pinpoint loci with relevant GxG interactions and to infer their underlying interaction matrix. In a combined SNP data set of 451 European humans and their infecting Hepatitis C Virus (HCV) strains and 503 uninfected individuals, we reveal a new human candidate gene for resistance to HCV and new virus mutations matching human genes. For two groups of significant human-HCV (GxG) associations, we infer a gene-for-gene infection matrix, which is commonly assumed to be typical of plant-pathogen interactions. Our model-based inference framework bridges theoretical models of GxG interactions with host and pathogen genomic data. It therefore paves the way for understanding the evolution of key GxG interactions underpinning HCV adaptation to the European human population after a recent expansion.
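To make the idea of inferring an infection matrix by Approximate Bayesian Computation more concrete, the following is a minimal sketch and not the authors' method. It assumes a single biallelic host locus and a single biallelic pathogen locus, compares two textbook matrices (gene-for-gene and matching-alleles), and uses a toy summary statistic (the share of infections in each host-pathogen genotype cell) as a stand-in for the four indices, which the abstract does not specify. Unlike the study design described above, it ignores uninfected hosts. All names, priors, and parameter values are hypothetical.

```python
import random

# Rows: host genotypes (0 = susceptible, 1 = resistant).
# Columns: pathogen genotypes (0 = avirulent, 1 = virulent).
# An entry of 1 means infection succeeds, 0 means it fails.
GENE_FOR_GENE = [[1, 1],
                 [0, 1]]
MATCHING_ALLELES = [[1, 0],
                    [0, 1]]

def simulate_counts(matrix, host_freq, path_freq, n_encounters, rng):
    """Count infected host-pathogen genotype pairs from random encounters.

    host_freq and path_freq are the frequencies of genotype 0 in the
    host and pathogen populations (illustrative parameters).
    """
    counts = [[0, 0], [0, 0]]
    for _ in range(n_encounters):
        h = 0 if rng.random() < host_freq else 1
        p = 0 if rng.random() < path_freq else 1
        if matrix[h][p] == 1:
            counts[h][p] += 1
    return counts

def summary(counts):
    """Toy summary statistic: share of all infections found in each cell."""
    total = sum(sum(row) for row in counts) or 1  # avoid division by zero
    return [c / total for row in counts for c in row]

def abc_model_choice(observed, n_sims=2000, tolerance=0.1, seed=0):
    """ABC rejection: count accepted simulations per candidate matrix."""
    rng = random.Random(seed)
    models = {"gene_for_gene": GENE_FOR_GENE,
              "matching_alleles": MATCHING_ALLELES}
    accepted = {name: 0 for name in models}
    for _ in range(n_sims):
        name = rng.choice(list(models))                 # uniform model prior
        host_freq, path_freq = rng.random(), rng.random()  # uniform priors
        sim = summary(simulate_counts(models[name], host_freq, path_freq, 200, rng))
        # Euclidean distance between simulated and observed summaries
        dist = sum((a - b) ** 2 for a, b in zip(sim, observed)) ** 0.5
        if dist < tolerance:
            accepted[name] += 1
    return accepted

if __name__ == "__main__":
    # Hypothetical observed shares, ordered as cells (0,0), (0,1), (1,0), (1,1).
    print(abc_model_choice([0.3, 0.4, 0.0, 0.3]))
```

The ratio of accepted simulations between the two candidate matrices gives a rough approximation of their relative posterior support under these assumptions. In this toy setting, infections of susceptible hosts by virulent pathogens can only arise under the gene-for-gene matrix, so an observed share in that cell well above the tolerance rules out the matching-alleles matrix.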