The Diabetes Risk Score

A practical tool to predict type 2 diabetes risk

  1. Jaana Lindström, MSc (1), and
  2. Jaakko Tuomilehto, MD, PhD (1,2)

  (1) Diabetes and Genetic Epidemiology Unit, Department of Epidemiology and Health Promotion, National Public Health Institute, Helsinki, Finland
  (2) Department of Public Health, University of Helsinki, Helsinki, Finland


    OBJECTIVE—Interventions to prevent type 2 diabetes should be directed toward individuals at increased risk for the disease. To identify such individuals without laboratory tests, we developed the Diabetes Risk Score.

    RESEARCH DESIGN AND METHODS—A random population sample of 35- to 64-year-old men and women with no antidiabetic drug treatment at baseline were followed for 10 years. New cases of drug-treated type 2 diabetes were ascertained from the National Drug Registry. Multivariate logistic regression model coefficients were used to assign each variable category a score. The Diabetes Risk Score was composed as the sum of these individual scores. The validity of the score was tested in an independent population survey performed in 1992 with prospective follow-up for 5 years.

    RESULTS—Age, BMI, waist circumference, history of antihypertensive drug treatment and high blood glucose, physical activity, and daily consumption of fruits, berries, or vegetables were selected as categorical variables. Complete baseline risk data were found in 4,435 subjects with 182 incident cases of diabetes. The Diabetes Risk Score value varied from 0 to 20. To predict drug-treated diabetes, the score value ≥9 had sensitivity of 0.78 and 0.81, specificity of 0.77 and 0.76, and positive predictive value of 0.13 and 0.05 in the 1987 and 1992 cohorts, respectively.

    CONCLUSIONS—The Diabetes Risk Score is a simple, fast, inexpensive, noninvasive, and reliable tool to identify individuals at high risk for type 2 diabetes.

    The prevalence of type 2 diabetes is increasing in all populations worldwide. It is a major risk factor for death and numerous nonfatal complications that place a large burden on patients, their families, and the health care system. Several recent intervention studies have shown convincingly that type 2 diabetes can be efficiently prevented by lifestyle modification in high-risk individuals (1–3). Now, the major task for public health administrations is to identify individuals who would benefit from intensive lifestyle counseling.

    Screening for blood glucose has been used or proposed as the possible tool to identify individuals with high diabetes risk or asymptomatic diabetes. There is debate regarding whether screening for fasting glucose is sufficient or whether an oral glucose tolerance test is needed for detection of asymptomatic diabetes (4). Measuring either fasting or postchallenge (postprandial) blood glucose is an invasive procedure and is costly and time consuming. Blood glucose has a large random variation and only gives information on a subject’s current glycemic status. However, the true primary prevention would be to identify high-risk subjects when they are still in a normoglycemic state and to treat them by interventions that prevent their transition from normoglycemia to impaired glucose tolerance and to overt diabetes.

    The aim of this study was to develop a simple, practical, and informative scoring system to characterize individuals according to their future risk of type 2 diabetes. Furthermore, we have evaluated the usefulness of the scoring system in detecting asymptomatic diabetes in a cross-sectional setting.


    A random sample was drawn from the National Population Register in 1987 and another independent sample was drawn in 1992 (the FINRISK Studies). The samples included 6.6% of the population aged 25–64 years and were stratified so that at least 250 subjects of each sex and 10-year age group were chosen from North Karelia, Kuopio, and South-Western Finland, as well as from the Helsinki-Vantaa region in 1992. Participation rates were 82% in the 1987 survey (n = 4,746) and 76% in the 1992 survey (n = 4,615). Baseline surveys were performed from January to April 1987 (model development data) and from February to May 1992 (model validation data). The sampling schemes and survey procedures have been described in detail elsewhere (5–9).

    The subjects received by mail a questionnaire on medical history and health behavior and an invitation to a clinical examination, which included measurements of weight (in light indoor clothes, to the nearest 100 g), height (without shoes, to the nearest 1 mm), and waist circumference (at a level midway between the lowest rib and the iliac crest, to the nearest 5 mm). BMI was calculated by dividing the weight (kg) by the height squared (m²).
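
    The BMI calculation described above can be sketched in a few lines. This is only an illustration: the function name and the example participant are hypothetical and do not come from the study data.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical participant (not study data): 81.0 kg, 1.80 m.
print(round(bmi(81.0, 1.80), 1))  # prints 25.0
```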

    The end point of follow-up was development of drug-treated diabetes. Data were collected through computer-based data linkage with the nationwide Social Insurance Institution drug register until the end of 1997. This drug register comprises information about all Finnish people who have been approved to receive free-of-charge drug treatment for certain chronic diseases, including diabetes. The subjects aged ≤34 years and those on antidiabetic drug treatment at the time of the baseline survey were excluded from the analyses.

    Logistic regression was used to compute β-coefficients for known risk factors for diabetes. Because the aim was to produce a simple risk calculator that could be conveniently used in primary care and also by individuals themselves, only parameters that are easy to assess without any laboratory tests or other clinical measurements requiring special skills were entered into the model (Table 1). The logistic regression analyses with drug-treated diabetes diagnosed during follow-up as the dependent variable were performed using the LOGISTIC procedure of SAS software (version 8.2; SAS Institute, Cary, NC). Interaction terms between the independent variables were not considered, because we wanted to keep the Diabetes Risk Score simple and easy to use. Coefficients (β) of the model were used to assign a score value for each variable, and the composite Diabetes Risk Score was calculated as the sum of those scores. The sensitivity (the probability that the test is positive for subjects who will get drug-treated diabetes in the future) and the specificity (the probability that the test is negative for subjects without drug-treated diabetes) with 95% CIs (10) were calculated for each Diabetes Risk Score level in differentiating the subjects who developed drug-treated diabetes from those who did not. Then, receiver-operating characteristic (ROC) curves were plotted for the Diabetes Risk Score; the sensitivity was plotted on the y-axis, and the false-positive rate (1 − specificity) was plotted on the x-axis. The more accurately discriminatory the test, the steeper the upward portion of the ROC curve and the higher the area under the curve (AUC), the optimal cut point being the peak of the curve (11). The trend option of the FREQ procedure was used to calculate the trend test for rates (Table 2).
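
    The ROC construction described above can be illustrated with a short sketch. It is not the SAS code used in the study: the function names are hypothetical, the toy scores and outcomes are invented for illustration, and the area is computed with the trapezoidal rule, which is an assumption about how an AUC might be obtained.

```python
def roc_points(scores, outcomes):
    """Return (false-positive rate, sensitivity) pairs for every cut point 'score >= c'.

    outcomes holds 1 for subjects who developed the end point and 0 otherwise.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = [(0.0, 0.0)]  # cut point above the highest score: nobody tests positive
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= cut and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Invented toy data (not from the cohorts): eight subjects, three incident cases.
toy_scores = [12, 10, 9, 7, 5, 3, 2, 0]
toy_outcomes = [1, 1, 0, 1, 0, 0, 0, 0]
print(round(auc(roc_points(toy_scores, toy_outcomes)), 3))
```

    In this toy example 14 of the 15 case/non-case pairs are ranked correctly by the score, so the area equals 14/15.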

    To identify prevalent diabetes in the 1987 survey, the subjects were asked to fast for at least 4 h before the scheduled examination. Blood samples for determination of fasting blood glucose levels were collected from participants aged 45–64 years. Then, the 2-h oral glucose tolerance test with a standard 75 g of glucose was administered, and the second blood sample was collected after 2 h. Venous whole blood was collected into tubes containing oxalate-fluoride and mailed to the central laboratory. Blood glucose level was determined with the hexokinase-glucose-6-phosphate dehydrogenase method as soon as the samples were received (1–2 days after the blood was collected).

    In the 1992 survey, individuals aged 45–64 years were invited to a repeat visit a few weeks after the first survey visit to undergo standard oral glucose tolerance test and for collection of blood samples for determination of fasting and 2-h plasma glucose levels. The test was completed after an overnight fast. Samples for plasma glucose determination were collected in heparinized and fluoridated tubes and centrifuged immediately. Plasma samples were mailed the same day to a central laboratory, where glucose concentration was determined with the hexokinase method.

    If fasting time was inadequate, glucose values were not accepted. In addition, if glucose solution was not consumed or if the postload blood sample was collected either 5 min too early or 5 min too late, the postload glucose value was not accepted. In such cases, subjects could only be classified as having diabetes, based on high fasting value.

    Subjects not under antidiabetic drug treatment were diagnosed as having diabetes, according to World Health Organization (WHO) 1999 criteria (12), if they had fasting plasma glucose ≥7.0 mmol/l (fasting whole blood glucose ≥6.1 mmol/l) and/or 2-h plasma glucose ≥11.1 mmol/l (2-h whole blood glucose ≥10.0 mmol/l).
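
    The classification rule above can be written as a small function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and passing None for a value stands for a measurement that was not accepted, so that the classification then rests on the remaining value alone.

```python
# Thresholds in mmol/l as quoted in the text: (fasting, 2-h) per specimen type.
THRESHOLDS = {
    "plasma": (7.0, 11.1),
    "whole_blood": (6.1, 10.0),
}


def has_diabetic_glucose(fasting=None, two_hour=None, specimen="plasma"):
    """Return True if either accepted glucose value reaches its threshold."""
    fasting_cut, two_hour_cut = THRESHOLDS[specimen]
    fasting_high = fasting is not None and fasting >= fasting_cut
    two_hour_high = two_hour is not None and two_hour >= two_hour_cut
    return fasting_high or two_hour_high


# Hypothetical subject: fasting plasma 6.4 mmol/l, 2-h plasma 11.3 mmol/l.
print(has_diabetic_glucose(fasting=6.4, two_hour=11.3))  # prints True
```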


    Of the 4,746 subjects in the 1987 survey who were not on antidiabetic drug therapy at baseline, drug-treated diabetes developed in 196 during the follow-up of ∼10 years.

    Model development

    The 10-year incidence of drug-treated diabetes during the follow-up was 4.1%. The incidence increased with increasing age, BMI, and waist circumference, the latter categorized according to the “waist action levels” suggested by Lean et al. (13).

    The baseline survey questionnaire included several questions about blood pressure. Overall, high blood pressure was associated with higher incidence of drug-treated diabetes.

    The question about history of blood pressure medication was selected into the Diabetes Risk Score because it is an unequivocal marker of clinically evident hypertension and can be determined without blood pressure measurement. The question about history of latent diabetes or diabetes covered transient or borderline elevated blood glucose and gestational diabetes, as well as diabetes treated with diet alone at baseline. A total of 35 subjects reported at baseline that they had been told they had diabetes but never had any antidiabetic drug treatment. Of these individuals, 32 had at least fasting glucose levels measured at baseline; 16 had glucose values considered diabetic. During follow-up, 21 of these individuals started using antidiabetic drugs according to the drug register data (15 of these subjects had glucose levels considered diabetic at baseline).

    The multivariable logistic regression models based on the follow-up of the 1987 survey are shown in Table 1. Statistically significant independent predictors of future drug-treated diabetes were age, BMI, waist circumference, antihypertensive drug therapy, and history of high blood glucose levels. The concise model includes only these statistically significant variables. The full model also includes physical activity and fruit and vegetable consumption. Even though these two variables did not add much to the predictive power of the statistical model, they were included in the Diabetes Risk Score to emphasize the importance of physical activity and diet in the prevention of diabetes. BMI between 25 and 30 kg/m² was not a statistically significant predictor in the multivariate models. Nevertheless, it was included in the final Diabetes Risk Score because it is the intermediate stage between normal weight and obesity, with a reasonably high impact on diabetes risk (odds ratio 2.53) even when other risk factors are in the model.

    In the multivariate model, male sex was a statistically significant predictor of drug-treated diabetes risk; the odds ratio was 1.58 (95% CI 1.15–2.18) in the concise model and 1.67 (1.19–2.34) in the full model. On the other hand, including or excluding sex changed the coefficients of the other independent variables only slightly. Therefore, we did not include sex in the final multivariate model and the final Diabetes Risk Score.

    A total of 4,595 subjects had complete baseline data for the concise model, and of these individuals, drug-treated diabetes developed in 194 during follow-up. For the full model, 4,435 subjects had complete baseline data and drug-treated diabetes developed in 182 subjects.

    The Diabetes Risk Score value (last column in Table 1) was defined using the full model, from the β-coefficient as follows: for β = 0.01–0.2, the score is 1; for β = 0.21–0.8, the score is 2; for β = 0.81–1.2, the score is 3; for β = 1.21–2.2, the score is 4; and for β >2.2, the score is 5. The lowest category of each variable was given a score of 0, except for the use of fruits and vegetables, where daily use was scored as 0, and physical activity, where “more than 4 h/week” was scored as 0. The total Diabetes Risk Score was calculated as the sum of the individual scores and varied from 0 to 20.
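
    The mapping from β-coefficients to score points can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than statements from the text: the ranges are treated as contiguous (for example, any β above 0.2 and up to 0.8 scores 2), and any β below 0.01 scores 0, as the reference category does. Only the coefficient 2.263 (history of high blood glucose) is taken from the article; the other two values are invented for illustration.

```python
def beta_to_points(beta: float) -> int:
    """Convert a logistic regression coefficient into Diabetes Risk Score points."""
    if beta > 2.2:
        return 5
    if beta > 1.2:
        return 4
    if beta > 0.8:
        return 3
    if beta > 0.2:
        return 2
    if beta >= 0.01:
        return 1
    return 0  # assumption: reference categories and coefficients below 0.01


def total_score(betas) -> int:
    """Sum the points of the categories that apply to one individual."""
    return sum(beta_to_points(b) for b in betas)


# 2.263 is quoted in the article; 0.6 and 1.1 are hypothetical coefficients.
print(total_score([0.6, 1.1, 2.263]))  # prints 10
```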

    Model validation

    Of the 4,615 subjects in the 1992 survey, drug-treated diabetes developed in 67 during ∼5-year follow-up. The Diabetes Risk Score could be calculated for each subject who had complete baseline information on the selected risk factors (n = 4,586). The 1987 and 1992 surveys had similar data, except for the intake of vegetables, fruits, or berries: in the 1992 survey, there were several frequency questions about use of raw and cooked vegetables, fruits, and berries. If the total frequency was <33/month, the individual was placed into the lower intake category. Only 15% of subjects were in this low group, compared with 52% in the 1987 survey, which may reflect a true increase in vegetable and fruit consumption during the 5 years between the surveys or may have arisen from the differences in the questions.

    The ROC curves (Fig. 1) demonstrate that the Diabetes Risk Score based on the 1987 cohort predicted drug-treated diabetes very well (AUC = 0.85). The prediction was similarly good in the 1992 cohort (AUC = 0.87). The Diabetes Risk Score value 9 was selected as the cut point for increased risk of drug-treated diabetes; it gave a sensitivity of 0.78 and a specificity of 0.77 in the 1987 cohort and a sensitivity of 0.81 and a specificity of 0.76 in the 1992 cohort. The positive predictive value (PPV), the probability of drug-treated diabetes developing during follow-up if the Diabetes Risk Score was 9 or higher, was 0.13 for the 1987 cohort (10-year follow-up) and 0.05 for the 1992 cohort (5-year follow-up). The overall incidence was lower in the 1992 cohort due to the shorter follow-up period.
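
    The dependence of the PPV on incidence can be checked with Bayes' rule. The sketch below is an illustration, not the authors' calculation: the function name is hypothetical, the 1987 incidence is taken as 182 cases among the 4,435 subjects with complete data for the full model, and the 1992 incidence is approximated as 67 cases among all 4,615 subjects because the number of cases among those with complete data is not stated.

```python
def positive_predictive_value(sensitivity, specificity, incidence):
    """PPV from sensitivity, specificity, and cumulative incidence (Bayes' rule)."""
    true_positives = sensitivity * incidence
    false_positives = (1 - specificity) * (1 - incidence)
    return true_positives / (true_positives + false_positives)


ppv_1987 = positive_predictive_value(0.78, 0.77, 182 / 4435)
ppv_1992 = positive_predictive_value(0.81, 0.76, 67 / 4615)  # approximate incidence
print(round(ppv_1987, 2), round(ppv_1992, 2))  # prints 0.13 0.05
```

    With nearly identical sensitivity and specificity, the lower cumulative incidence over the shorter follow-up is enough to account for the lower PPV in the 1992 cohort.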

    In Table 2, the men and women of both cohorts are classified into four Diabetes Risk Score categories. The incidence of drug-treated diabetes was markedly elevated in the two highest categories. In the 1987 cohort, 25% of both men and women fell into the two highest categories; in the 1992 cohort, 26% of men and 24% of women were classified in the two highest categories. Therefore, the Diabetes Risk Score cut point of 9 identified the high-risk quartile of the population, identifying >70% of the incident cases of drug-treated diabetes.

    We also analyzed the performance of the Diabetes Risk Score cross-sectionally in identifying subjects who had either fasting or 2-h glucose levels exceeding the threshold of diabetes. A total of 2,525 subjects in the 1987 cohort and 1,976 subjects in the 1992 cohort could be classified according to results of oral glucose tolerance test and had complete Diabetes Risk Score data. The crude prevalence of undiagnosed diabetes was 3.5% (n = 87) in the 1987 survey and 5.7% (n = 112) in the 1992 survey (known diabetic patients treated with antidiabetic drugs and subjects with incomplete baseline data excluded from the analyses). The ROC curves (not shown) indicated good performance of the Diabetes Risk Score also in the cross-sectional setting (AUC = 0.80 for both surveys). For cut point Diabetes Risk Score of ≥9, sensitivity was 0.77 (95% CI 0.66–0.85) and 0.76 (0.67–0.83), specificity was 0.66 (0.64–0.68) and 0.68 (0.66–0.70), PPV was 0.07 (0.06–0.09) and 0.12 (0.10–0.15), and negative predictive value (the probability of not having diabetic glucose levels if Diabetes Risk Score was <9) was 0.99 (0.98–0.99) and 0.98 (0.97–0.99) in the 1987 and 1992 oral glucose tolerance tests, respectively.


    Recent studies have shown that type 2 diabetes can be prevented in high-risk subjects with impaired glucose tolerance by lifestyle intervention (1–3). Therefore, a strong argument exists in favor of screening for subjects who are at increased risk for diabetes (14).

    Our study is unique in that it focuses on predicting future drug-treated diabetes with several factors that are easy to measure with noninvasive methods, are known to be associated with risk of type 2 diabetes, are easily comprehensible, and direct attention to modifiable risk factors of diabetes. The interpretation of the individual’s diabetes risk is easy and can be expressed as a probability relatively accurately. Drug-treated diabetes is very unlikely to develop in individuals with a low Diabetes Risk Score. Therefore, these individuals can be excluded from further procedures such as glucose testing without causing a problem of false-negative results. Defining a suitable cut point is a trade-off between sensitivity and specificity.

    We included in the analyses all subjects who were not on antidiabetic drug therapy at baseline. Therefore, patients with diabetes who were treated with diet alone were included in the prospective follow-up, where the outcome was initiation of antidiabetic drug treatment. Initiation of drug therapy indicates a deterioration of glucose homeostasis also in patients who, at baseline, may have been treated with diet alone. This approach decreased the possibility of bias because, during follow-up, it would not have been possible to ascertain diet-treated cases. It is obvious that the recent incident cases, typically treated with diet, were missed in follow-up. Therefore, the observed incidence of diabetes underestimates the true value. We are also aware of the possibility of a circular argument: subjects may be identified by the same risk factors that would prompt their physician to order blood glucose testing, so that less typical cases are missed. However, the finding that the Diabetes Risk Score performed equally well in the cross-sectional analysis attenuated this concern.

    We did not exclude people with high glucose levels at baseline because we tested the Diabetes Risk Score under the assumption that no biochemical tests are performed at that stage. As shown by our analyses in the subset in which glucose values were available at baseline, use of a high Diabetes Risk Score value as a primary screening tool would efficiently identify unrecognized diabetes. Most cases of diabetes would then be diagnosed at the subsequent oral glucose tolerance test in individuals with a high Diabetes Risk Score value.

    The Diabetes Risk Score values were derived from the coefficients of the logistic model by classifying them into five categories. A more precise method would be to sum the original coefficients or their expansions. However, the sum of the coefficients would have a wide distribution and would therefore be impractical in clinical use. If the Diabetes Risk Score is used as a computerized version, the estimated probability (p) of drug-treated diabetes (during the following ∼10 years) for any combination of risk factors can be calculated from the coefficients as follows:

    $$p = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots}}$$

    where β0 is the intercept and β1, β2, etc. represent the regression coefficients of the various categories of the risk factors x1, x2, etc. In Table 1, we have shown the coefficients for the full model that was used to formulate the Diabetes Risk Score as well as the concise model with fewer variables. We also calculated the model excluding those subjects who, at baseline, reported that they had diet-treated diabetes. The coefficient for history of high blood glucose was reduced from 2.263 to 1.860 (odds ratio from 9.61 to 6.45).
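
    A computerized version of the probability calculation could look like the sketch below. The function names are hypothetical, and the intercept of −5.0 is an invented placeholder, since the fitted intercept is not given in the text; only the coefficient 2.263 for history of high blood glucose is quoted from the article. The code uses the algebraically equivalent form 1/(1 + e^(−z)) of the logistic function, where z is the intercept plus the sum of the coefficients times the indicators.

```python
import math


def predicted_probability(intercept, coefficients, indicators):
    """Estimated probability from a logistic model.

    indicators holds 1 for each risk factor category that applies, else 0.
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, indicators))
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(beta):
    """Odds ratio corresponding to a logistic regression coefficient."""
    return math.exp(beta)


# Hypothetical intercept (-5.0); 2.263 is the coefficient quoted in the article.
without_history = predicted_probability(-5.0, [2.263], [0])
with_history = predicted_probability(-5.0, [2.263], [1])
print(with_history > without_history)  # prints True
print(round(odds_ratio(2.263), 2))     # prints 9.61
```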

    A few reports (15–20) have suggested methods of screening for undiagnosed diabetes. In these assessments, the outcome was prevalent diabetes in a cross-sectional setting. In a follow-up study (21) with a median follow-up of 8 years, BMI at baseline predicted diabetes as well as fasting or 2-h plasma glucose; in that study, no other risk factors for diabetes were analyzed. In a recent follow-up study, Stern et al. (22) developed two models to predict diabetes incidence: a clinical model including age, sex, ethnicity, fasting glucose, systolic blood pressure, HDL cholesterol, BMI, and family history of diabetes; and a full model that also included 2-h glucose, diastolic blood pressure, total and LDL cholesterol, and triglyceride. Therefore, they included in their models most of the parameters of the metabolic syndrome as defined by the WHO Consultation (12). Their finding is not surprising, because it is well known that people with signs of the metabolic syndrome have increased risk of type 2 diabetes.

    The PPVs of the reported predictive models in identifying prevalent, undiagnosed diabetes have ranged from 5.6 to 10%. The performance of our Diabetes Risk Score, with PPVs of 7 and 12% in cross-sectional settings in the 1987 and 1992 cohorts, respectively, is comparable. Therefore, our method, even though it was developed using incident drug-treated diabetes as the outcome, might also be accurate in predicting earlier stages of type 2 diabetes. This will be seen when our Diabetes Risk Score is applied in such situations in the future.

    There are a few other risk factors about which we did not have information and therefore could not include in the Diabetes Risk Score. Family history of diabetes, which reflects the genetic predisposition for the disease, is known to be an important marker for increased risk of diabetes (23,24). The genetic predisposition may be necessary but not sufficient for development of type 2 diabetes. With healthy lifestyle, even individuals with genetic susceptibility to diabetes can avoid the symptomatic phase of the disease. We propose that family history should be included in this kind of model; score values 5 and 3 would probably be appropriate for positive history in first- and second-degree relatives, respectively.

    Previous gestational diabetes is known to be a strong risk factor for future diabetes (25–27). Our question about history of glucose intolerance also covers gestational diabetes. Physical activity (28,29), the quality and quantity of dietary fat, and the intake of fiber (30–32) have been demonstrated to modify risk of diabetes. We included in our prediction model the questions on physical activity at work and/or during leisure time and on consumption of fruit, berries, and vegetables, for which we had information at baseline, to increase awareness of the importance of the modifiable risk factors for diabetes.

    The Diabetes Risk Score has been designed to be a screening tool for identifying high-risk subjects in the population and for increasing awareness of the modifiable risk factors and healthy lifestyle. Filling in the Diabetes Risk Score may encourage a person who gets a high value to have his/her blood glucose measured. In principle, however, no glucose testing is necessary to decide what should be done if the Diabetes Risk Score value is determined to be high, because such individuals will benefit from improvements in their lifestyle regardless of their glucose levels. On the other hand, many individuals with a high Diabetes Risk Score may have unrecognized, asymptomatic diabetes and, therefore, may require blood glucose testing for diagnosis, other clinical assessments, and therapy. It is known that 30–60% of individuals with diabetes in the community are undiagnosed (33,34) and that undiagnosed diabetes is associated with increased mortality and risk of cardiovascular disease (35,36); therefore, diabetes is an important public health problem. This simple, safe, and inexpensive screening test will drastically reduce the number of invasive glucose tests required at the screening phase. We believe that the public health implications of this Diabetes Risk Score are considerable. It is a cost-efficient and practical way to identify individuals at high risk for drug-treated diabetes in the general population. This strategy has been recently adopted in Finland, where a nationwide program for prevention of type 2 diabetes (37) is being launched, and one of the tools in this prevention program is the Diabetes Risk Score.

    Figure 1—

    ROC curves showing the performance of the Diabetes Risk Score in predicting diabetes in the 1987 and 1992 cohorts; follow-up of both cohorts continued until the end of 1997. The area under the 1987 curve was 0.85 and the area under the 1992 curve was 0.87. For cut point Diabetes Risk Score ≥9 (black marker), sensitivity was 0.78 (95% CI 0.71–0.84) and 0.81 (0.69–0.89), specificity was 0.77 (0.76–0.79) and 0.76 (0.74–0.77), PPV was 0.13 (0.11–0.15) and 0.05 (0.04–0.06), and negative predictive value was 0.99 (0.98–0.99) and 0.996 (0.993–0.998) in the 1987 and 1992 cohorts, respectively.

    Table 1—

    Logistic regression models with drug-treated diabetes during follow-up as the dependent variable

    Table 2—

    Diabetes incidence by Diabetes Risk Score in 1987 and 1992 cohorts during follow-up through the year 1997


    The development of the Diabetes Risk Score was partly funded by the Finnish Diabetes Association, the Academy of Finland (grants 38387 and 46558), and the Yrjö Jahnsson Foundation.


    • Address correspondence and reprint requests to Jaana Lindström, National Public Health Institute, Mannerheimintie 166, 00300 Helsinki, Finland. E-mail: jaana.lindstrom{at}

      Received for publication 15 July 2002 and accepted in revised form 18 November 2002.

      A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.

