Copyright © 2005 The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics, Volume 76, Issue 5, 780-793, 1 May 2005
doi:10.1086/429838
Daniel J. Schaid1, Shannon K. McDonnell1, Scott J. Hebbring2, Julie M. Cunningham2 and Stephen N. Thibodeau2
1 Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN
2 Department of Laboratory Medicine and Pathology, Mayo Clinic College of Medicine, Rochester, MN
Address for correspondence and reprints: Dr. Daniel J. Schaid, Department of Health Sciences Research, Harwick 7, Mayo Clinic, 200 First Street SW, Rochester, MN 55905
Abstract
The genetic basis of many common human diseases is expected to be highly heterogeneous, with multiple causative loci and multiple alleles at some of the causative loci. Analyzing the association of disease with one genetic marker at a time can have weak power, because of relatively small genetic effects and the need to correct for multiple testing. Testing the simultaneous effects of multiple markers by multivariate statistics might improve power, but such tests also lose power when there are many markers, because of the many degrees of freedom. To overcome some of the limitations of current statistical methods for case-control studies of candidate genes, we develop a new class of nonparametric statistics that can simultaneously test the association of multiple markers with disease, with only a single degree of freedom. Our approach, which is based on U-statistics, first measures a score over all markers for pairs of subjects and then compares the averages of these scores between cases and controls. Genetic scoring for a pair of subjects is measured by a “kernel” function, which we allow to be fairly general, and we provide guidelines on how to choose a kernel for different types of genetic effects. Our global statistic has the advantage of having only one degree of freedom and achieves its greatest power advantage when the contrasts of average genotype scores between cases and controls are in the same direction across multiple markers. Simulations illustrate that our proposed methods have the anticipated type I error rate and that they can be more powerful than standard methods. Application of our methods to a study of candidate genes for prostate cancer illustrates their potential merits and offers guidelines for interpretation.
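The pairwise-scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' statistic. It assumes genotypes coded as minor-allele counts (0, 1, 2), a hypothetical product kernel named `linear_kernel`, and a permutation test in place of whatever variance estimate the paper derives. All function names are illustrative.

```python
import random
from itertools import combinations


def linear_kernel(g1, g2):
    """Hypothetical kernel: product of two subjects' allele counts at one marker."""
    return g1 * g2


def mean_pair_score(genotypes, kernel):
    """Average, over all pairs of subjects, of the kernel summed across markers."""
    pairs = list(combinations(genotypes, 2))
    total = sum(sum(kernel(a, b) for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def global_contrast(cases, controls, kernel=linear_kernel):
    """Single-number contrast: mean pair score in cases minus that in controls."""
    return mean_pair_score(cases, kernel) - mean_pair_score(controls, kernel)


def permutation_p_value(cases, controls, kernel=linear_kernel, n_perm=1000, seed=0):
    """Two-sided p-value from shuffling case/control labels (an assumption here)."""
    rng = random.Random(seed)
    observed = abs(global_contrast(cases, controls, kernel))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = global_contrast(pooled[:n_cases], pooled[n_cases:], kernel)
        if abs(stat) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: each subject is a tuple of allele counts at two markers.
cases = [(2, 1), (1, 2), (2, 2)]
controls = [(0, 0), (1, 0), (0, 1)]
contrast = global_contrast(cases, controls)
p_value = permutation_p_value(cases, controls, n_perm=200)
```

Because the per-marker scores are summed before cases and controls are compared, the result is a single contrast regardless of how many markers are scored, which is the property the abstract highlights. Swapping in a different kernel only requires passing another two-argument function.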