Finding Associations in Dense Genetic Maps: A Genetic Algorithm Approach
Clark TG., De Iorio M., Griffiths RC., Farrall M.
Large-scale association studies hold promise for discovering the genetic basis of common human disease. These studies will consist of a large number of individuals, as well as a large number of genetic markers, such as single nucleotide polymorphisms (SNPs). The potential size of the data and the resulting model space require the development of efficient methodology to unravel associations between phenotypes and SNPs in dense genetic maps. Our approach uses a genetic algorithm (GA) to construct logic trees consisting of Boolean expressions involving strings or blocks of SNPs. These blocks, which form the nodes of the logic trees, consist of SNPs in high linkage disequilibrium (LD), that is, SNPs that are highly correlated with each other due to evolutionary processes. At each generation of our GA, a population of logic tree models is modified using selection, cross-over and mutation moves. Logic trees are selected for the next generation using a fitness function based on the marginal likelihood in a Bayesian regression framework. Mutation and cross-over moves use LD measures to propose changes to the trees and facilitate movement through the model space. We demonstrate our method and the flexibility of the logic tree structure with variable nodal lengths on simulated data from a coalescent model, as well as on data from a candidate gene study of quantitative genetic variation.
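To make the generation loop described above concrete, the following is a minimal sketch in Python, not the authors' implementation. It makes several simplifying assumptions that are not in the abstract: each leaf is a single binary SNP indicator rather than a block of SNPs in high LD, mutation and cross-over moves are uniform rather than guided by LD measures, and the fitness is a stand-in (the negative residual sum of squares of a two-group mean model) rather than the marginal likelihood of a Bayesian regression. All names (`evaluate`, `fitness`, `mutate`, `crossover`, `run_ga`) are hypothetical.

```python
import random

# Hypothetical representation: a logic tree is either a leaf ("snp", index)
# or an internal node ("and" | "or", left_subtree, right_subtree).

def evaluate(tree, genotype):
    """Evaluate a Boolean logic tree on one individual's binary SNP vector."""
    if tree[0] == "snp":
        return bool(genotype[tree[1]])
    left = evaluate(tree[1], genotype)
    right = evaluate(tree[2], genotype)
    return (left and right) if tree[0] == "and" else (left or right)

def random_tree(n_snps, rng, depth=2):
    """Draw a random tree of bounded depth."""
    if depth == 0 or rng.random() < 0.3:
        return ("snp", rng.randrange(n_snps))
    return (rng.choice(["and", "or"]),
            random_tree(n_snps, rng, depth - 1),
            random_tree(n_snps, rng, depth - 1))

def fitness(tree, genotypes, phenotypes):
    """Stand-in fitness (assumption): negative residual sum of squares when
    the phenotype is predicted by the mean of the tree-true / tree-false group."""
    groups = {True: [], False: []}
    for g, y in zip(genotypes, phenotypes):
        groups[evaluate(tree, g)].append(y)
    rss = 0.0
    for ys in groups.values():
        if ys:
            mean = sum(ys) / len(ys)
            rss += sum((y - mean) ** 2 for y in ys)
    return -rss

def mutate(tree, n_snps, rng):
    """Uniform mutation (assumption): resample a leaf or flip an operator."""
    if tree[0] == "snp":
        return ("snp", rng.randrange(n_snps))
    if rng.random() < 0.5:
        return ("or" if tree[0] == "and" else "and", tree[1], tree[2])
    if rng.random() < 0.5:
        return (tree[0], mutate(tree[1], n_snps, rng), tree[2])
    return (tree[0], tree[1], mutate(tree[2], n_snps, rng))

def crossover(a, b, rng):
    """Combine two parents by grafting b's right subtree onto a."""
    if a[0] == "snp" or b[0] == "snp":
        return a if rng.random() < 0.5 else b
    return (a[0], a[1], b[2])

def run_ga(genotypes, phenotypes, pop_size=30, generations=40, seed=0):
    """Run selection, cross-over and mutation for a fixed number of generations."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    pop = [random_tree(n_snps, rng) for _ in range(pop_size)]

    def score(tree):
        return fitness(tree, genotypes, phenotypes)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; best trees survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), n_snps, rng))
        pop = parents + children
    return max(pop, key=score)
```

Calling `run_ga(genotypes, phenotypes)` with a list of binary SNP vectors and a matching list of quantitative phenotypes returns the best-scoring tree found. In the method described above, the leaves would instead be LD blocks of variable length, and the proposal moves would draw on LD measures.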