OBJECTIVE To test whether knowledge of type 2 diabetes genetic variants improves disease prediction.
RESEARCH DESIGN AND METHODS We tested 40 single nucleotide polymorphisms (SNPs) associated with diabetes in 3,471 Framingham Offspring Study subjects followed over 34 years using pooled logistic regression models stratified by age (<50 years, diabetes cases = 144; or ≥50 years, diabetes cases = 302). Models included clinical risk factors and a 40-SNP weighted genetic risk score.
RESULTS In people <50 years of age, the clinical risk factors model C-statistic was 0.908; the 40-SNP score increased it to 0.911 (P = 0.3; net reclassification improvement [NRI]: 10.2%, P = 0.001). In people ≥50 years of age, the C-statistics without and with the score were 0.883 and 0.884 (P = 0.2; NRI: 0.4%). The risk per risk allele was higher in people <50 than in those ≥50 years of age (24 vs. 11%; P value for age interaction = 0.02).
CONCLUSIONS Knowledge of common genetic variation appropriately reclassifies younger, but not older, people for type 2 diabetes risk beyond clinical risk factors.
A genetic risk score built with 18 type 2 diabetes genetic loci predicted new diabetes cases (1), though it did not add predictive value beyond common clinical diabetes risk factors that usually appear during adulthood (1–3). In recent years, the number of genetic loci convincingly associated with diabetes has doubled (4–10). Here, we test two hypotheses: that an updated genetic risk score incorporating a larger number of common diabetes-associated single nucleotide polymorphisms (SNPs) improves ∼8-year risk prediction of diabetes beyond common clinical diabetes risk factors, and that its predictive ability is better in younger subjects, in whom early preventive strategies could delay diabetes onset (11).
RESEARCH DESIGN AND METHODS
We have previously described the methods (1). We pooled data from the Framingham Offspring Study (12) into four time periods (exams 1 to 2, 2 to 4, 4 to 6, and 6 to 8) (3), extending follow-up 6 years beyond our previous report (1). We generated 11,358 person-observations for 3,471 subjects with available genetic data. We excluded subjects with prevalent diabetes at the baseline of each period. Diabetes was defined as fasting plasma glucose ≥7.0 mmol/l (≥126 mg/dl) or use of antidiabetic therapy.
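The pooling step can be illustrated with a minimal sketch. It assumes a hypothetical per-subject dictionary of exam records and treats diabetes at any exam up to the period baseline as prevalent; the field names (`fpg_mgdl`, `on_therapy`) and the handling of missing exams are illustrative assumptions, not the study's actual data structures.

```python
# Hypothetical (baseline exam, end exam) pairs mirroring the four periods in the text.
PERIODS = [(1, 2), (2, 4), (4, 6), (6, 8)]


def has_diabetes(exam):
    """Return True if fasting glucose exceeds 125 mg/dl or therapy is in use.

    `exam` is an assumed dict with keys 'fpg_mgdl' (number or None) and
    'on_therapy' (bool).
    """
    fpg = exam.get("fpg_mgdl")
    return bool(exam.get("on_therapy")) or (fpg is not None and fpg > 125)


def person_observations(subject_id, exams):
    """Build one record per period in which the subject is at risk.

    `exams` maps exam number -> exam dict. A period is skipped when either
    boundary exam is missing or when diabetes is already present at or
    before the period baseline (prevalent diabetes).
    """
    rows = []
    for start, end in PERIODS:
        if start not in exams or end not in exams:
            continue
        if any(has_diabetes(exams[e]) for e in exams if e <= start):
            continue
        incident = any(has_diabetes(exams[e]) for e in exams if start < e <= end)
        rows.append(
            {"id": subject_id, "baseline_exam": start, "incident_diabetes": incident}
        )
    return rows
```

In this sketch a subject who develops diabetes during the second period contributes two person-observations and none afterwards.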
We genotyped or imputed 40 autosomal diabetes-SNPs reported in European-origin populations (4–10), thus adding 23 new SNPs and excluding INS from our previous 18-SNP analysis (1). Genotypes were obtained from Affymetrix array data available in the Framingham Heart Study SNP Health Associate Resource dataset (13) or from de novo genotyping on the iPLEX (Sequenom) platform. Minimum call rates were 97% for Affymetrix and 96.9% for iPLEX SNPs. All SNPs were in Hardy-Weinberg equilibrium. The median variance ratio for the imputed SNPs was 0.94; only rs725210 at HNF1B had a variance ratio <0.3 (0.2).
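As an illustration of a Hardy-Weinberg check, the sketch below computes the standard 1-degree-of-freedom chi-square statistic from genotype counts. The text does not state which test was used, so this is an assumed textbook version; the function name and arguments are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations (1 degree of freedom).

    Values above about 3.84 correspond to P < 0.05.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip empty expected cells (monomorphic SNPs) to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```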
We modeled the 40 SNPs by constructing a 40-SNP weighted genetic risk score based on the published β coefficients (8,10) (see footnote, Table 1) and alternatively by entering one term per SNP in an additive model using the expected or observed number of minor alleles plus terms for sex or clinical variables. A general nonadditive genetic model was also fit for each SNP, but inclusion of a nonadditive term did not improve the fit (P > 0.043 for all SNPs). We also performed bootstrap resampling with replacement to assess the degree of statistical overestimation.
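A weighted genetic risk score of this kind can be sketched as a β-weighted sum of risk-allele counts. The SNP identifiers and weights in the usage note are placeholders, not the published coefficients, and any rescaling described in the Table 1 footnote is not reproduced here.

```python
def weighted_genetic_risk_score(allele_counts, betas):
    """Sum of beta-weighted risk-allele counts.

    `allele_counts` maps SNP id -> observed count (0, 1, 2) or expected
    dosage (0.0 to 2.0) for imputed SNPs; `betas` maps SNP id -> per-allele
    log odds ratio. Both mappings are assumptions made for illustration.
    """
    missing = set(betas) - set(allele_counts)
    if missing:
        raise ValueError(f"missing genotypes for: {sorted(missing)}")
    return sum(betas[snp] * allele_counts[snp] for snp in betas)
```

For example, with placeholder weights `{"snpA": 0.1, "snpB": 0.3}` and counts `{"snpA": 2, "snpB": 1.5}`, the function returns 0.1 × 2 + 0.3 × 1.5.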
Association tests were done after age stratification (<50 and ≥50 years) and in the sample overall. We compared the mean genetic risk score of persons who developed diabetes with that of those who did not, using mixed-effects linear models to account for family relatedness. Likewise, we used generalized estimating equations in pooled logistic regression models (14) to test associations of the genetic risk scores with diabetes onset, first adjusting for sex alone and then for simple clinical diabetes risk factors: sex, family history of diabetes (self-report that any parent had diabetes), BMI, fasting glucose and triglyceride levels, systolic blood pressure, and HDL cholesterol (3).
We evaluated model discrimination using C-statistics and net reclassification improvement (NRI) (15) (see footnote, Table 1). A two-tailed P value <0.05 indicated statistical significance. The institutional review board at Boston University approved the study, and all participants gave written informed consent.
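The two discrimination measures can be sketched as follows. This is a simplified illustration that treats observations as independent and ignores family relatedness and pooling; the risk-category cutoffs are a hypothetical argument, since the categories used in the study are given in the Table 1 footnote, which is not reproduced here.

```python
def c_statistic(risks, outcomes):
    """Proportion of case/non-case pairs in which the case has the higher
    predicted risk; ties count as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y]
    controls = [r for r, y in zip(risks, outcomes) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for rc in cases:
        for rn in controls:
            if rc > rn:
                concordant += 1.0
            elif rc == rn:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def risk_category(risk, cutoffs):
    """Index of the risk category: number of cutoffs at or below the risk."""
    return sum(1 for c in cutoffs if risk >= c)


def net_reclassification_improvement(old_risks, new_risks, outcomes, cutoffs):
    """Categorical NRI: net upward movement among cases plus net downward
    movement among non-cases."""
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = n_ctrl = 0
    for old, new, y in zip(old_risks, new_risks, outcomes):
        move = risk_category(new, cutoffs) - risk_category(old, cutoffs)
        if y:
            n_case += 1
            up_case += move > 0
            down_case += move < 0
        else:
            n_ctrl += 1
            up_ctrl += move > 0
            down_ctrl += move < 0
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one non-case")
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl
```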
RESULTS
Mean age was 36 ± 9 years at the first exam; nearly half the subjects were men, and BMI increased over follow-up (supplementary Table A1 in the online appendix available at http://care.diabetesjournals.org/cgi/content/full/dc10-1265/DC1). Over 11,358 person-observations we diagnosed 446 cases of diabetes. Few individual SNPs were significantly associated with diabetes in our sample, but for most SNPs the effects were in the same direction as in the original reports and of expected effect sizes (1.05–1.3) (supplementary Table A2). Individuals who developed diabetes had higher genetic risk scores than those who did not (20.4 vs. 19.7; P = 1.7 × 10^−10).
The 40-SNP genetic risk score significantly reclassified subjects <50 years of age in the simple clinical variables model (NRI: 10.2%; P = 0.001), although it did not improve model discrimination (P = 0.3) (Table 1). In subjects ≥50 years, the 40-SNP score improved neither model discrimination (P = 0.2) nor risk reclassification (NRI: 0.4%; P = 0.7). The relative risk per risk allele was higher in subjects <50 years of age (24%) than in those ≥50 years of age (11%) (P = 0.02 for age-interaction effect). Results for the sex-adjusted model are shown in supplementary Table A3.
In the population overall, the 40-SNP genetic risk score marginally improved risk prediction (C-statistics: 0.903 and 0.906, without and with the score; P = 0.04), whereas the 17-SNP score did not (P = 0.11) (supplementary Table A4). In the whole population, NRI with the score was lower than in subjects <50 years of age (at most, 1.8%).
Entering the 40 SNPs individually improved model discrimination beyond the 40-SNP score (C-statistics: 0.908 and 0.920 without and with individual SNPs; P = 0.02), but after bootstrap resampling, median C-statistic values dropped to 0.905 and 0.907, respectively, suggesting that the apparent gain from modeling individual SNPs was largely statistical overestimation.
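A bootstrap check of this kind can be sketched generically. The version below refits a model on each resample and evaluates it on the original data, then reports the median; the exact procedure used in the study is not specified in the text, so this is one common variant, and `fit` and `evaluate` are hypothetical caller-supplied functions. With related subjects, resampling whole families rather than individual observations would be the more defensible choice.

```python
import random
from statistics import median


def bootstrap_median_statistic(data, fit, evaluate, n_boot=200, seed=0):
    """Median of a performance statistic over bootstrap refits.

    For each replicate, draw len(data) observations with replacement,
    fit a model on the resample with `fit(sample)`, and score it on the
    original data with `evaluate(model, data)`.
    """
    rng = random.Random(seed)  # fixed seed for reproducible resampling
    values = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        values.append(evaluate(model, data))
    return median(values)
```

A statistic that falls noticeably when estimated this way, compared with its apparent value on the full sample, points to overfitting.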
CONCLUSIONS
We found that 40 SNPs selected based on the latest genetic association data improved diabetes risk reclassification after accounting for common clinical diabetes risk factors.
The 40 SNPs contributing individually had the highest discrimination ability, but this model was probably overfit. The increased prediction performance of 40 as opposed to 17 SNPs appeared to be due to additional, more comprehensively modeled genetic information rather than to longer follow-up or greater number of diabetes cases as compared to our earlier report.
Limitations include that the Framingham Offspring Study subjects are mostly white and of European ancestry. Although we did not find sufficient evidence for departure from an additive model, we cannot definitively rule out that other nonadditive models are operating. We only analyzed common genetic variants; eventual incorporation of rare variants might enhance prediction. Lastly, criticism has been raised about the somewhat arbitrary assumptions needed to estimate NRI.
In summary, diabetes risk prediction improved with 40 diabetes-associated SNPs, especially in people <50 years of age. More subjects were appropriately reclassified for diabetes risk. Genetic prediction could be useful in younger people. Nonetheless, the clinical usefulness of common genetic variants for diabetes risk prediction should be further confirmed in other samples and in randomized controlled trials.
This study was supported by the National Heart, Lung, and Blood Institute's Framingham Heart Study (contract no. N01-HC-25195), the National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK) grants R01 DK078616 and K24 DK080140 (to J.B.M.), NIDDK Research Career Award K23 DK65978 (to J.C.F.), NIDDK Grant R21 DK084527 (to R.W.G.), “Bolsa de Ampliación de Estudios” from the “Instituto de Salud Carlos III”, Madrid, Spain (2009/90071) (to J.M.D.M.Y.), and the Boston University Linux Cluster for Genetic Analysis (LinGA) funded by the National Institutes of Health National Center for Research Resources Shared Instrumentation Grant (1S10RR163736-01A1).
J.B.M. has a consulting agreement with Interleukin Genetics, Inc. No other potential conflicts of interest relevant to this article were reported.
J.M.D.M.Y. researched data and wrote the manuscript. P.S. researched data and contributed to discussion. M.J.P., J.D., R.B.D., and L.A.C. researched data, contributed to discussion, and reviewed the manuscript. C.S.F. and A.K.M. researched data and reviewed the manuscript. R.W.G. and J.C.F. contributed to discussion and reviewed the manuscript. J.B.M. contributed to discussion and wrote the manuscript.
Parts of this study were presented in poster form at the 70th Scientific Sessions of the American Diabetes Association, Orlando, Florida, 25–29 June 2010.
*MAGIC and DIAGRAM+ Investigators are listed in supplementary Table A5 in the online appendix available at http://care.diabetesjournals.org/cgi/content/full/dc10-1265/DC1.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- Received July 3, 2010.
- Accepted September 23, 2010.
- © 2011 by the American Diabetes Association.
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.