Copyright © 2007 The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics, Volume 81, Issue 2, 321-337, 1 August 2007
doi:10.1086/519497
Article
Timothy Thornton (a) and Mary Sara McPeek (a, b)
a Department of Statistics, University of Chicago, Chicago
b Department of Human Genetics, University of Chicago, Chicago
Address for correspondence and reprints: Dr. Mary Sara McPeek, Department of Statistics, University of Chicago, 5734 S. University Avenue, Chicago, IL 60637

Abstract
We consider the problem of genomewide association testing of a binary trait when some sampled individuals are related, with known relationships. This commonly arises when families sampled for a linkage study are included in an association study. Furthermore, power to detect association with complex traits can be increased when affected individuals with affected relatives are sampled, because they are more likely to carry disease alleles than are randomly sampled affected individuals. With related individuals, correlations among relatives must be taken into account to ensure validity of the test, and consideration of these correlations can also improve power. We provide new insight into the use of pedigree-based weights to improve power, and we propose a novel test, the MQLS test, which, as we demonstrate, represents an overall, and in many cases substantial, improvement in power over previous tests, while retaining a computational simplicity that makes it useful in genomewide association studies in arbitrary pedigrees. Other features of the MQLS are as follows: (1) it is applicable to completely general combinations of family and case-control designs, (2) it can incorporate both unaffected controls and controls of unknown phenotype into the same analysis, and (3) it can incorporate phenotype data about relatives with missing genotype data. The methods are applied to data from the Genetic Analysis Workshop 14 Collaborative Study of the Genetics of Alcoholism, where the MQLS detects genomewide significant association (after Bonferroni correction) with an alcoholism-related phenotype for four different single-nucleotide polymorphisms: tsc1177811 (P = 5.9 × 10^−7), tsc1750530 (P = 4.0 × 10^−7), tsc0046696 (P = 4.7 × 10^−7), and tsc0057290 (P = 5.2 × 10^−7) on chromosomes 1, 16, 18, and 18, respectively. Three of these four significant associations were not detected in previous studies analyzing these data.