Copyright © 2003 The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics, Volume 73, Issue 2, 336-354, 1 August 2003
doi:10.1086/377106
Eric C. Anderson and John Novembre
Department of Integrative Biology, University of California, Berkeley
Address for correspondence and reprints: Dr. Eric C. Anderson, Department of Integrative Biology, University of California, Berkeley, CA 94720-3140
Abstract
We present a method for detecting haplotype blocks that simultaneously uses information about linkage-disequilibrium decay between the blocks and the diversity of haplotypes within the blocks. By use of phased single-nucleotide polymorphism data, our method partitions a chromosome into a series of adjacent, nonoverlapping blocks. The partition is made by choosing among a family of Markov models for block structure in a chromosomal region. Specifically, in the model, the occurrence of haplotypes within blocks follows a time-inhomogeneous Markov process along the chromosome, and we choose among possible partitions by using the two-stage minimum-description-length criterion. When applied to data simulated from the coalescent with recombination hotspots, our method reliably situates block boundaries at the hotspots and infrequently places block boundaries at sites with background levels of recombination. We apply three previously published block-finding methods to the same data, showing that they either are relatively insensitive to recombination hotspots or fail to discriminate between background sites of recombination and hotspots. When applied to the 5q31 data of Daly et al., our method identifies more block boundaries in agreement with those found by Daly et al. than do other methods. These results suggest that our method may be useful for designing association-based mapping studies that exploit haplotype blocks.
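The idea of choosing a block partition by description length can be illustrated with a small sketch. This is not the authors' model: it treats blocks as independent (ignoring the Markov dependence between adjacent blocks that the abstract describes) and uses a simple two-part code in which the model cost of a block is assumed to be one bit per SNP for each distinct haplotype plus (1/2) log2(n) bits per free frequency. The function names `block_cost` and `best_partition` and the toy data are hypothetical.

```python
import math
from collections import Counter


def block_cost(haplotypes, start, end):
    """Two-part description length (bits) of SNP columns [start, end) as one block.

    Assumption for illustration only: haplotypes within a block are coded
    independently of neighbouring blocks, unlike the Markov model in the paper.
    """
    n = len(haplotypes)
    counts = Counter(h[start:end] for h in haplotypes)
    # Data cost: each chromosome is coded with its block haplotype's empirical frequency.
    data_bits = -sum(c * math.log2(c / n) for c in counts.values())
    # Model cost (assumed form): spell out every distinct haplotype at 1 bit per SNP,
    # plus 0.5 * log2(n) bits for each free haplotype frequency.
    k = len(counts)
    model_bits = k * (end - start) + 0.5 * (k - 1) * math.log2(n)
    return data_bits + model_bits


def best_partition(haplotypes):
    """Dynamic programme over block end points; returns (blocks, total bits)."""
    m = len(haplotypes[0])
    best = [0.0] + [math.inf] * m   # best[e] = minimum cost of covering SNPs [0, e)
    back = [0] * (m + 1)            # back[e] = start of the last block in that optimum
    for end in range(1, m + 1):
        for start in range(end):
            cost = best[start] + block_cost(haplotypes, start, end)
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    blocks = []
    end = m
    while end > 0:                  # trace the chosen boundaries back from the right
        blocks.append((back[end], end))
        end = back[end]
    return blocks[::-1], best[m]


# Toy phased data: two 3-SNP blocks, each with two haplotypes, combined freely.
haps = ["000000"] * 10 + ["000111"] * 10 + ["111000"] * 10 + ["111111"] * 10
blocks, bits = best_partition(haps)
print(blocks, round(bits, 2))
```

On this toy data the sketch places a single boundary between the third and fourth SNPs, because describing two blocks of two haplotypes each is cheaper than describing one block of four. The dynamic programme is a generic way to search adjacent, nonoverlapping partitions; the criterion and model in the paper itself are more elaborate.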

The Power of Genomic Control
The American Journal of Human Genetics, Volume 66, Issue 6, 1 June 2000, Pages 1933-1944
Silviu-Alin Bacanu, B. Devlin and Kathryn Roeder
Abstract
Although association analysis is a useful tool for uncovering the genetic underpinnings of complex traits, its utility is diminished by population substructure, which can produce spurious association between phenotype and genotype within population-based samples. Because family-based designs are robust against substructure, they have risen to the fore of association analysis. Yet, if population substructure could be ignored, this robustness can come at the price of power. Unfortunately it is rarely evident when population substructure can be ignored. Devlin and Roeder recently have proposed a method, termed “genomic control” (GC), which has the robustness of family-based designs even though it uses population-based data. GC uses the genome itself to determine appropriate corrections for population-based association tests. Using the GC method, we contrast the power of two study designs, family trios (i.e., father, mother, and affected progeny) versus case-control. For analysis of trios, we use the TDT test. When population substructure is absent, we find GC is always more powerful than TDT; furthermore, contrary to previous results, we show that as a disease becomes more prevalent the discrepancy in power becomes more extreme. When population substructure is present, however, the results are more complex: TDT is more powerful when population substructure is substantial, and GC is more powerful otherwise. We also explore general issues of power and implementation of GC within the case-control setting and find that, economically, GC is at least comparable to and often less expensive than family-based methods. Therefore, GC methods should prove a useful complement to family-based methods for the genetic analysis of complex traits.

Transmission/Disequilibrium Test Meets Measured Haplotype Analysis: Family-Based Association Analysis Guided by Evolution of Haplotypes
The American Journal of Human Genetics, Volume 68, Issue 5, 1 May 2001, Pages 1250-1263
Howard Seltman, Kathryn Roeder and B. Devlin
Abstract
Family data teamed with the transmission/disequilibrium test (TDT), which simultaneously evaluates linkage and association, is a powerful means of detecting disease-liability alleles. To increase the information provided by the test, various researchers have proposed TDT-based methods for haplotype transmission. Haplotypes indeed produce more-definitive transmissions than do the alleles comprising them, and this tends to increase power. However, the larger number of haplotypes, relative to alleles at individual loci, tends to decrease power, because of the additional degrees of freedom required for the test. An optimal strategy would focus the test on particular haplotypes or groups of haplotypes. In this report we develop such an approach by combining the theory of TDT with that of measured haplotype analysis (MHA). MHA uses the evolutionary relationships among haplotypes to produce a limited set of hypothesis tests and to increase the interpretability of these tests. The theory of our approach, called the “evolutionary tree” (ET)–TDT, is developed for two cases: when haplotype transmission is certain and when it is not. Simulations show the ET-TDT can be more powerful than other proposed methods under reasonable conditions. More importantly, our results show that, when multiple polymorphisms are found within the gene, the ET-TDT can be useful for determining which polymorphisms affect liability.

Detection of Disease Genes by Use of Family Data. I. Likelihood-Based Theory
The American Journal of Human Genetics, Volume 66, Issue 4, 1 April 2000, Pages 1328-1340
Alice S. Whittemore and I-Ping Tu
Abstract
We present a class of likelihood-based score statistics that accommodate genotypes of both unrelated individuals and families, thereby combining the advantages of case-control and family-based designs. The likelihood extends the one proposed by Schaid and colleagues (Schaid and Sommer; Schaid; Schaid and Li) to arbitrary family structures with arbitrary patterns of missing data and to dense sets of multiple markers. The score statistic comprises two component test statistics. The first component statistic, the nonfounder statistic, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. This statistic, when applied to nuclear families, generalizes the transmission/disequilibrium test to arbitrary numbers of affected and unaffected siblings, with or without typed parents. The second component statistic, the founder statistic, compares observed or inferred marker genotypes in the family founders with those of controls or those of some reference population. The founder statistic generalizes the statistics commonly used for case-control data. The strengths of the approach include both the ability to assess, by comparison of nonfounder and founder statistics, the potential bias resulting from population stratification and the ability to accommodate arbitrary family structures, thus eliminating the need for many different ad hoc tests. A limitation of the approach is the potential power loss and/or bias resulting from inappropriate assumptions on the distribution of founder genotypes. The systematic likelihood-based framework provided here should be useful in the evaluation of both the relative merits of case-control and various family-based designs and the relative merits of different tests applied to the same design. It should also be useful for genotype-disease association studies done with the use of a dense set of multiple markers.
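
The transmission/disequilibrium test that recurs in the abstracts above can be illustrated in its basic biallelic form, a McNemar-type statistic on transmissions from heterozygous parents. This is a minimal sketch of the standard single-marker TDT only, not of the ET-TDT or the likelihood-based score statistics; the function names and the counts are hypothetical.

```python
import math


def tdt_statistic(b, c):
    """Basic TDT chi-square (1 df).

    b: heterozygous parents transmitting the candidate allele to an affected child
    c: heterozygous parents transmitting the other allele
    """
    if b + c == 0:
        raise ValueError("no informative transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical counts: 60 transmissions of the candidate allele versus 40 of the other.
stat = tdt_statistic(60, 40)
print(stat, chi2_1df_pvalue(stat))
```

Only heterozygous parents enter the statistic, which is why the test is robust to population substructure: homozygous parents carry no information about preferential transmission.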