Copyright © 2007 The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics, Volume 81, Issue 5, 927-938, 1 November 2007
doi:10.1086/521558
Article
Jung-Ying Tzeng and Daowen Zhang
Department of Statistics, North Carolina State University, Raleigh, NC
Address for correspondence and reprints: Dr. Jung-Ying Tzeng, Department of Statistics, North Carolina State University, Campus Box 7566, Raleigh, NC 27695
Abstract
Haplotypes provide a more informative format of polymorphisms for genetic association analysis than do individual single-nucleotide polymorphisms. However, the practical efficacy of haplotype-based association analysis is challenged by a trade-off between the benefits of modeling abundant variation and the cost of the extra degrees of freedom. To reduce the degrees of freedom, several strategies have been considered in the literature. They include (1) clustering evolutionarily close haplotypes, (2) modeling the level of haplotype sharing, and (3) smoothing haplotype effects by introducing a correlation structure for haplotype effects and studying the variance components (VC) for association. Although the first two strategies enjoy a fair extent of power gain, empirical evidence showed that VC methods may exhibit only similar or less power than the standard haplotype regression method, even in cases of many haplotypes. In this study, we report possible reasons that cause the underpowered phenomenon and show how the power of the VC strategy can be improved. We construct a score test based on the restricted maximum likelihood or the marginal likelihood function of the VC and identify its nontypical limiting distribution. Through simulation, we demonstrate the validity of the test and investigate the power performance of the VC approach and that of the standard haplotype regression approach. With suitable choices for the correlation structure, the proposed method can be directly applied to unphased genotypic data. Our method is applicable to a wide-ranging class of models and is computationally efficient and easy to implement. The broad coverage and the fast and easy implementation of this method make the VC strategy an effective tool for haplotype analysis, even in modern genomewide association studies.

Finding Haplotype Block Boundaries by Using the Minimum-Description-Length Principle
The American Journal of Human Genetics, Volume 73, Issue 2, 1 August 2003, Pages 336-354
Eric C. Anderson and John Novembre
Abstract: We present a method for detecting haplotype blocks that simultaneously uses information about linkage-disequilibrium decay between the blocks and the diversity of haplotypes within the blocks. By use of phased single-nucleotide polymorphism data, our method partitions a chromosome into a series of adjacent, nonoverlapping blocks. The partition is made by choosing among a family of Markov models for block structure in a chromosomal region. Specifically, in the model, the occurrence of haplotypes within blocks follows a time-inhomogeneous Markov process along the chromosome, and we choose among possible partitions by using the two-stage minimum-description-length criterion. When applied to data simulated from the coalescent with recombination hotspots, our method reliably situates block boundaries at the hotspots and infrequently places block boundaries at sites with background levels of recombination. We apply three previously published block-finding methods to the same data, showing that they either are relatively insensitive to recombination hotspots or fail to discriminate between background sites of recombination and hotspots. When applied to the 5q31 data of Daly et al., our method identifies more block boundaries in agreement with those found by Daly et al. than do other methods. These results suggest that our method may be useful for designing association-based mapping studies that exploit haplotype blocks.

The Power of Genomic Control
The American Journal of Human Genetics, Volume 66, Issue 6, 1 June 2000, Pages 1933-1944
Silviu-Alin Bacanu, B. Devlin and Kathryn Roeder
Abstract: Although association analysis is a useful tool for uncovering the genetic underpinnings of complex traits, its utility is diminished by population substructure, which can produce spurious association between phenotype and genotype within population-based samples. Because family-based designs are robust against substructure, they have risen to the fore of association analysis. Yet, if population substructure could be ignored, this robustness can come at the price of power. Unfortunately it is rarely evident when population substructure can be ignored. Devlin and Roeder recently have proposed a method, termed “genomic control” (GC), which has the robustness of family-based designs even though it uses population-based data. GC uses the genome itself to determine appropriate corrections for population-based association tests. Using the GC method, we contrast the power of two study designs, family trios (i.e., father, mother, and affected progeny) versus case-control. For analysis of trios, we use the TDT test. When population substructure is absent, we find GC is always more powerful than TDT; furthermore, contrary to previous results, we show that as a disease becomes more prevalent the discrepancy in power becomes more extreme. When population substructure is present, however, the results are more complex: TDT is more powerful when population substructure is substantial, and GC is more powerful otherwise. We also explore general issues of power and implementation of GC within the case-control setting and find that, economically, GC is at least comparable to and often less expensive than family-based methods. Therefore, GC methods should prove a useful complement to family-based methods for the genetic analysis of complex traits.

Multipoint Linkage-Disequilibrium–Mapping Approach Based on the Case-Parent Trio Design
The American Journal of Human Genetics, Volume 68, Issue 4, 1 April 2001, Pages 937-950
Kung-Yee Liang, Fang-Chi Hsu, Terri H. Beaty and Kathleen C. Barnes
Abstract: In the present study we propose a multipoint approach, for the mapping of genes, that is based on the case-parent trio design. We first derive an expression for the expected preferential–allele-transmission statistics for transmission, from either parent to an affected child, for an arbitrary location within a chromosomal region demarcated by several genetic markers. No assumption about genetic mechanism is needed in this derivation, beyond the assumption that no more than one disease gene lies in the region framed by the markers. When one builds on this representation, the way in which one may maximize the genetic information from multiple markers becomes obvious. This proposed method differs from the popular transmission/disequilibrium test (TDT) approach for fine mapping, in the following ways: First, in contrast with the TDT approach, all markers contribute information, regardless of whether the parents are heterozygous at any one marker, and incomplete trio data can be utilized in our approach. Second, rather than performing the TDT at each marker separately, we propose a single test statistic that follows a χ² distribution with 1 df, under the null hypothesis of no linkage or linkage disequilibrium to the region. Third, in the presence of linkage evidence, we offer a means to estimate the location of the disease locus along with its sampling uncertainty. We illustrate the proposed method with data from a family study of asthma, conducted in Barbados.