Copyright © 2008 The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics, Volume 82, Issue 2, 444-452, 8 February 2008
doi:10.1016/j.ajhg.2007.11.004
Article
D.Y. Lin1,
,
, Y. Hu1 and B.E. Huang1
Corresponding authorAbstract
Missing genotype data arise in association studies when the single-nucleotide polymorphisms (SNPs) on the genotying platform are not assayed successfully, when the SNPs of interest are not on the platform, or when total sequence variation is determined only on a small fraction of individuals. We present a simple and flexible likelihood framework to study SNP-disease associations with such missing genotype data. Our likelihood makes full use of all available data in case-control studies and reference panels (e.g., the HapMap), and it properly accounts for the biased nature of the case-control sampling as well as the uncertainty in inferring unknown variants. The corresponding maximum-likelihood estimators for genetic effects and gene-environment interactions are unbiased and statistically efficient. We developed fast and stable numerical algorithms to calculate the maximum-likelihood estimators and their variances, and we implemented these algorithms in a freely available computer program. Simulation studies demonstrated that the new approach is more powerful than existing methods while providing accurate control of the type I error. An application to a case-control study on rheumatoid arthritis revealed several loci that deserve further investigations.
| Detection, Imputation, and Association Analysis of Small Deletions and Null Alleles on Oligonucleotide Arrays The American Journal of Human Genetics, Volume 82, Issue 6, 6 June 2008, Pages 1316-1333 Lude Franke, Carolien G.F. de Kovel, Yurii S. Aulchenko, Gosia Trynka, Alexandra Zhernakova, Karen A. Hunt, Hylke M. Blauw, Leonard H. van den Berg, Roel Ophoff, Panagiotis Deloukas, David A. van Heel and Cisca Wijmenga Abstract Copy-number variation (CNV) is a major contributor to human genetic variation. Recently, CNV associations with human disease have been reported. Many genome-wide association (GWA) studies in complex diseases have been performed with sets of biallelic single-nucleotide polymorphisms (SNPs), but the available CNV methods are still limited. We present a new method (TriTyper) that can infer genotypes in case-control data sets for deletion CNVs, or SNPs with an extra, untyped allele at a high-resolution single SNP level. By accounting for linkage disequilibrium (LD), as well as intensity data, calling accuracy is improved. Analysis of 3102 unrelated individuals with European descent, genotyped with Illumina Infinium BeadChips, resulted in the identification of 1880 SNPs with a common untyped allele, and these SNPs are in strong LD with neighboring biallelic SNPs. Simulations indicate our method has superior power to detect associations compared to biallelic SNPs that are in LD with these SNPs, yet without increasing type I errors, as shown in a GWA analysis in celiac disease. Genotypes for 1204 triallelic SNPs could be fully imputed, with only biallelic-genotype calls, permitting association analysis of these SNPs in many published data sets. We estimate that 682 of the 1655 unique loci reflect deletions; this is on average 99 deletions per individual, four times greater than those detected by other methods. Whereas the identified loci are strongly enriched for known deletions, 61% have not been reported before. Genes overlapping with these loci more often have paralogs (p = 0.006) and biologically interact with fewer genes than expected (p = 0.004). Abstract | | |