A Risk Score for Predicting Incident Diabetes in the Thai Population

Wichai Aekplakorn, MD, PHD¹,
Pongamorn Bunnag, MD¹,
Mark Woodward, PHD²,
Piyamitr Sritara, MD¹,
Sayan Cheepudomwit, MD¹,
Sukit Yamwong, MD¹,
Tada Yipintsoi, MD³ and
Rajata Rajatanavin, MD¹

¹Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
²The George Institute for International Health, University of Sydney, Sydney, Australia
³Prince of Songkla University, Songkla Province, Thailand

Address correspondence and reprint requests to Wichai Aekplakorn, Community Medicine Center, Ramathibodi Hospital, Rama 6 Rd., Rajdevi, Bangkok 10400, Thailand. E-mail address: rawap{at}


OBJECTIVE—The objective of this study was to develop and evaluate a risk score to predict people at high risk of diabetes in Thailand.

RESEARCH DESIGN AND METHODS—A Thai cohort of 2,677 individuals, aged 35–55 years, without diabetes at baseline, was resurveyed after 12 years. Logistic regression models were used to identify baseline risk factors that predicted the incidence of diabetes; a simple model was developed that included only those risk factors that were significant (P < 0.05) when adjusted for each other. The coefficients from this model were transformed into components of a diabetes score. This score was tested in a Thai validation cohort of 2,420 different individuals.

RESULTS—A total of 361 individuals developed type 2 diabetes in the exploratory cohort during the follow-up period. The significant predictive variables in the simple model were age, BMI, waist circumference, hypertension, and history of diabetes in parents or siblings. A cutoff score of 6 out of 17 produced the optimal sum of sensitivity (77%) and specificity (60%). The area under the receiver-operating characteristic curve (AUC) was 0.74. Adding impaired fasting glucose or impaired glucose tolerance status to the model slightly increased the AUC to 0.78; adding low HDL cholesterol and/or high triglycerides barely improved the model. The validation cohort demonstrated similar results.

CONCLUSIONS—A simple diabetes risk score, based on a set of variables not requiring laboratory tests, can be used for early intervention to delay or prevent the disease in Thailand. Adding impaired fasting glucose or impaired glucose tolerance or triglyceride and HDL cholesterol status to this model only modestly improves the predictive ability.

A remarkable worldwide increase in the number of people with type 2 diabetes has been predicted (1). Prevalence rates in the developing world, particularly in the Asia-Pacific region, are already high and expected to rise more quickly than elsewhere. In Thailand, the prevalence of type 2 diabetes among the population aged ≥35 years was 9.6% in 2001, an increase of 20% over a period of 5 years (2). Cardiovascular disease (CVD) is one of the leading causes of death in Thailand (3), and individuals with diabetes have a two- to fourfold greater risk of developing CVD than those without (4). The burden of diabetes and its complications, which include other diseases besides CVD (5), imposes a massive load on the Thai health care system.

Lifestyle modification has been proven to effectively prevent and delay the development of diabetes (6–8). Therefore, early recognition of and intervention for the condition will be beneficial, particularly as cardiovascular complications set in early after the onset of diabetes (9). Delay and lack of detection of the disease are mostly due to patients being asymptomatic during the early stage of the disease; hence an accurate screening tool to identify those at high risk of developing diabetes will be of great value. Knowledge of the risk of diabetes could enhance people’s awareness, leading to lifestyle modification. Population screening for diabetes, using blood glucose tests, would not be convenient or cost-effective, especially in a resource-poor environment such as Thailand. A simple tool, using a few questions and simple measurement of anthropometric indexes, would be practical for use by the general public and in primary health care. Several diabetes risk score protocols have been developed. Some have been derived from prevalence (cross-sectional) studies, and thus identify individuals with current diabetes; others derive from cohort studies and predict who will get new (incident) diabetes. Most of the latter have been developed among Caucasians (10–12). Only one diabetes risk score has been based on an Asian population (13), and this was a prevalence study carried out in a part of Asia very different from Thailand.

Our aim in the present study was to develop and evaluate a simple, low-technology diabetes risk scoring system, for use in Thailand, that identifies individuals who are likely to develop diabetes in the near future. A secondary aim was to evaluate the extent of improvement in the score when adding information about impaired fasting glucose (IFG) or glucose intolerance and high serum triglycerides and low HDL cholesterol to the prediction model.


Data were taken from a cohort study of employees of a state enterprise, the Electric Generation Authority of Thailand. Detailed methods are described elsewhere (14). Briefly, the cohort initially included 3,499 employees aged ≥35 years, recruited from a Bangkok power plant of the Electric Generation Authority of Thailand in 1985; this cohort will be referred to here as the “exploratory cohort.” Most of the employees were urban dwellers of middle-income social class. Follow-up health interviews and examination surveys were performed after 12 years (in 1997).

Measurements of variables were collected at baseline from a questionnaire, physical examination, and blood samples obtained after a 12-h overnight fast. Oral glucose tolerance tests (OGTTs) were performed 2 h after the ingestion of a standard 75-g glucose load. Blood glucose levels were analyzed by a glucose oxidase method. Diabetes was diagnosed according to the American Diabetes Association criteria, as fasting plasma glucose level ≥126 mg/dl (7.0 mmol/l) or 2-h glucose level ≥200 mg/dl (11.1 mmol/l) (15) or a previous diagnosis of diabetes. Serum HDL cholesterol and triglycerides were measured by an enzymatic colorimetric assay. Hypertension was defined as blood pressure ≥140/90 mmHg or current prescription of blood pressure–lowering treatment. The outcome of interest in the present study was development of type 2 diabetes during the follow-up period until 1997, among those who had no diabetes at baseline. Definition of diabetes incidence was based on the results of a blood test at the follow-up survey, according to the American Diabetes Association criteria (fasting plasma glucose or OGTT), as well as diagnosis and/or receipt of diabetes medication during the follow-up period.
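The diagnostic definitions above can be written as simple threshold checks. The following Python sketch is illustrative only and is not the authors' code: the function and parameter names are hypothetical, reading "≥140/90 mmHg" as systolic ≥140 or diastolic ≥90 is an assumption, and the glucose conversion factor (about 18.016 mg/dl per mmol/l, from the molar mass of glucose) is supplied here rather than taken from the study.

```python
# Illustrative sketch; names are hypothetical and not from the study.
MG_DL_PER_MMOL_L = 18.016  # assumed glucose conversion factor (molar mass ~180.16 g/mol)


def mg_dl_to_mmol_l(glucose_mg_dl):
    """Convert a glucose concentration from mg/dl to mmol/l."""
    return glucose_mg_dl / MG_DL_PER_MMOL_L


def has_diabetes(fasting_mg_dl=None, two_hour_mg_dl=None, previous_diagnosis=False):
    """Apply the thresholds quoted in the text: fasting >=126 mg/dl,
    2-h OGTT >=200 mg/dl, or a previous diagnosis of diabetes."""
    if previous_diagnosis:
        return True
    if fasting_mg_dl is not None and fasting_mg_dl >= 126:
        return True
    return two_hour_mg_dl is not None and two_hour_mg_dl >= 200


def has_hypertension(systolic_mmhg, diastolic_mmhg, on_treatment=False):
    """Blood pressure >=140/90 mmHg (assumed: either component at or above
    its threshold) or current blood pressure-lowering treatment."""
    return on_treatment or systolic_mmhg >= 140 or diastolic_mmhg >= 90
```

Rounding the converted thresholds to one decimal place reproduces the SI values quoted in the text (7.0 and 11.1 mmol/l).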

Statistical analysis

Analyses were performed using STATA version 8. The potential risk factors for incident diabetes were age (categorized into four groups: 35–39, 40–44, 45–49, and 50–54 years), sex, BMI (normal, <23; overweight, ≥23 but <27.5; obese, ≥27.5 kg/m²), abdominal obesity (waist circumference ≥90 cm in men or ≥80 cm in women), current smoking (yes/no), alcohol consumption (nondrinker, occasional drinker [<4 times a month], and frequent drinker [≥1 time a week]), history of diabetes in either parents or siblings, and diagnosed or currently treated hypertension. A multivariable logistic regression model, including all of these variables that were significantly (P < 0.05) associated with incident diabetes, was used to estimate mutually adjusted odds ratios (ORs) for incident diabetes and corresponding β coefficients (log ORs). This is referred to as the simple model or model 1. Next, IFG status (glucose ≥100 mg/dl [5.6 mmol/l] but <126 mg/dl [7.0 mmol/l]) was added to the simple model to give model 2. Then, model 3 was created by adding a variable indicating those with impaired glucose tolerance (IGT), defined as those having 2-h values in the OGTT of ≥140 mg/dl (7.8 mmol/l) but <200 mg/dl (11.1 mmol/l) (15), to model 1. Model 4 added high triglycerides (≥200 mg/dl) to model 3, and model 5 added low HDL cholesterol level (<40 mg/dl in men and <50 mg/dl in women) to model 3. Plausible interactions between age and sex with other covariates were tested using Wald tests with a P value set at the 0.10 level, but no interactions were identified.
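As an illustration of how the categorical predictors described above could be derived from raw measurements, here is a minimal Python sketch. It is not the authors' STATA code; the function names, the category labels, and the "male"/"female" encoding of sex are hypothetical choices made for this example. Only the cut points come from the text.

```python
# Illustrative sketch; names and labels are hypothetical, cut points are from the text.

def age_group(age_years):
    """Return the 5-year age band used in the analysis, or None if outside 35-54."""
    for low in (35, 40, 45, 50):
        if low <= age_years <= low + 4:
            return f"{low}-{low + 4}"
    return None


def bmi_category(bmi_kg_m2):
    """Normal <23; overweight >=23 but <27.5; obese >=27.5 kg/m^2."""
    if bmi_kg_m2 < 23:
        return "normal"
    if bmi_kg_m2 < 27.5:
        return "overweight"
    return "obese"


def abdominal_obesity(waist_cm, sex):
    """Waist circumference >=90 cm in men or >=80 cm in women."""
    threshold = 90 if sex == "male" else 80
    return waist_cm >= threshold


def has_ifg(fasting_mg_dl):
    """Impaired fasting glucose: >=100 but <126 mg/dl."""
    return 100 <= fasting_mg_dl < 126


def has_igt(two_hour_mg_dl):
    """Impaired glucose tolerance: 2-h OGTT value >=140 but <200 mg/dl."""
    return 140 <= two_hour_mg_dl < 200
```

Each function maps one raw measurement to the indicator that would enter the logistic model as a covariate.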

For each logistic model, a receiver-operating characteristic curve was plotted, and the area under it (AUC) was calculated. Subjects were divided into equal tenths of predicted diabetes risk by the logistic function. The predicted and actual numbers of diabetes cases in each tenth were compared, and the goodness of fit was assessed by the Hosmer-Lemeshow χ² test (16). AUCs for both IFG and IGT in prediction of diabetes incidence were also calculated.
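The two evaluation measures described above can be sketched in plain Python. This is an illustrative example under stated assumptions, not the STATA routines used in the study: the AUC is computed through its rank (Mann-Whitney) interpretation, and the Hosmer-Lemeshow statistic uses the standard grouped form with subjects sorted by predicted risk. All function names are hypothetical.

```python
# Illustrative sketch; function names are hypothetical.

def auc(predicted, outcomes):
    """AUC as the probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count one half)."""
    cases = [p for p, y in zip(predicted, outcomes) if y == 1]
    noncases = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in noncases)
    return wins / (len(cases) * len(noncases))


def hosmer_lemeshow(predicted, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic over `groups` risk bands."""
    pairs = sorted(zip(predicted, outcomes), key=lambda t: t[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        band = pairs[g * n // groups:(g + 1) * n // groups]
        if not band:
            continue
        expected = sum(p for p, _ in band)   # predicted number of cases
        observed = sum(y for _, y in band)   # actual number of cases
        variance = expected * (1 - expected / len(band))
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic
```

The statistic is conventionally referred to a χ² distribution with (groups − 2) degrees of freedom to obtain a P value; that step is omitted here to keep the sketch free of external libraries.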

A scoring system was developed for the simple model; points were assigned to each variable based on the magnitude of its regression coefficient (17). A total diabetes risk score for each individual was calculated as the sum of points for each variable; this score was related to actual observed incidence. A receiver-operating characteristic curve and AUC were produced. Sensitivity and specificity were calculated for each cutoff score. The cutoff score that gave the maximum sum of sensitivity and specificity was taken as the optimum (16).
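The scoring and cutoff selection described above might look like the following Python sketch. The text does not spell out the exact rule of reference 17 for converting coefficients to points, so dividing each coefficient by a common unit and rounding to the nearest integer is an assumption, as is treating a score at or above the cutoff as a positive screen. Function names and the example values in the usage note are hypothetical and are not the study's coefficients.

```python
# Illustrative sketch; the point-assignment rule is an assumption, not the study's method.

def points_from_coefficients(coefficients, unit):
    """Convert log-OR coefficients to integer points by dividing by `unit`
    and rounding (Python rounds exact halves to the nearest even integer)."""
    return {name: round(beta / unit) for name, beta in coefficients.items()}


def total_score(points, present_factors):
    """Sum the points for the risk factors an individual has."""
    return sum(points[name] for name in present_factors)


def best_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity +
    specificity, treating score >= cutoff as a positive screen."""
    n_cases = sum(outcomes)
    n_noncases = len(outcomes) - n_cases
    if n_cases == 0 or n_noncases == 0:
        raise ValueError("need at least one case and one non-case")
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
        sens = true_pos / n_cases
        spec = true_neg / n_noncases
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best
```

For example, with made-up coefficients `{"a": 0.5, "b": 1.2}` and a unit of 0.5, `points_from_coefficients` assigns 1 and 2 points, respectively. Maximizing the sum of sensitivity and specificity is equivalent to maximizing Youden's index.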


The performance of the risk score was evaluated in another cohort (the “validation cohort”) of 2,879 employees aged 35–54 years working in four other power plants: one plant in Bangkok and one from each of the north, west, and northeast provinces of Thailand. The baseline survey of the validation cohort was conducted in 1998 and was followed by a second survey in 2003. The variables collected and methods used were similar to those for the exploratory set, except that diabetes incidence was based on measurement of fasting glucose only (≥126 mg/dl [7.0 mmol/l]). Of the 2,420 subjects in the 2003 survey who had no diabetes at baseline, 125 developed diabetes during the 5-year follow-up period. The simple diabetes risk score was calculated for each participant who had baseline data and evaluated as for the exploratory cohort.


Of the 3,254 (93.0%) participants without diabetes at baseline in the exploratory cohort, 2,667 (82%) participants took part in the 1997 survey, of whom 361 (13.5%) experienced incident type 2 diabetes. Compared with those who participated in 1997, subjects who were lost to follow-up were slightly older (mean age 44.6 vs. 42.4 years), but there were no significant differences with respect to baseline BMI (22.98 vs. 23.0 kg/m²), waist circumference (80.7 vs. 80.0 cm), or history of diabetes in parents or siblings (30.5 vs. 32.1%).

Those with incident diabetes in the exploratory cohort were more likely to be of male sex, overweight, obese, abdominally obese, and hypertensive and to have a parent or sibling with diabetes, compared with the nondiabetic group, but alcohol and smoking were not associated with diabetes status (Table 1). In the simple multivariable regression model, using all these variables except alcohol and smoking, all variables were significant predictors of development of diabetes after mutual adjustment (Table 2). IFG alone or IGT alone had less predictive power than the simple model (AUC 0.55 and 0.63, respectively, vs. 0.75). Inclusion of either IFG (model 2) or OGTT status (model 3) in the simple model improved the accuracy of prediction slightly: the AUC increased from 0.75 in the simple model to 0.76 in model 2 and to 0.78 in model 3. In model 4, high triglycerides hardly improved the predictive power (AUC = 0.79). Adding low HDL cholesterol status gave much the same result (AUC = 0.79). For the goodness-of-fit tests, all of the models showed no significant difference between predicted and observed diabetes incidence, indicating reasonable fit, except for the model with either IFG or IGT alone.

Table 3 shows the diabetes risk score points defined from the coefficients of the simple model. The total score ranged between 1 and 17. The cutoff score of 6 was optimum (sensitivity = 77%; specificity = 60%). The AUC was 0.74 (95% CI 0.71–0.78), which is almost identical to that for the corresponding model included in Table 1 (using exact, as opposed to rounded, coefficients).


In the validation cohort, 125 subjects developed diabetes from a total of 2,420 subjects who had no diabetes at baseline. Table 1 shows that relationships of variables with incident diabetes in the validation cohort were broadly similar to those in the exploratory cohort. The diabetes risk score (computed from the exploratory cohort) predicted incident cases of diabetes in the validation cohort well, with an AUC of 0.75 (95% CI 0.71–0.80). At the cutoff point of 6, the sensitivity and specificity were 84.4 and 52.5%, respectively. However, the optimum cutoff point for the validation set was 7 (sensitivity = 77%; specificity = 61.9%). Figure 1 shows the distribution of the diabetes risk score in the two cohorts. These were relatively similar, with the highest proportion of individuals at score 2 (19.5% in the exploratory set vs. 16.1% in the validation set), followed by scores 6 and 5 (11.9 and 11%) in the exploratory set and scores 5 and 7 (11.7 and 10.2%) in the validation set. The incidence of diabetes increased with increasing score (Fig. 1, downward bars) in both cohorts, with lower incidence in the validation cohort after 5 years of follow-up than in the exploratory cohort, which had 12 years of follow-up. Additional analysis was conducted to calculate the impact of adding IFG to the diabetes risk score, tested in the validation cohort. The predictive ability improved (AUC = 0.81) and, at the optimum cutoff score of 9, yielded a sensitivity of 76% and specificity of 74%.


In the present study we developed a practical tool for prediction of diabetes incidence in Thailand. The variables included are age, sex, BMI, waist circumference, history of hypertension, and history of diabetes in parents or siblings. The simple model without laboratory tests is almost as good as models that include IFG, glucose intolerance, HDL cholesterol, or triglycerides. The simple model is attractive because it is noninvasive, more convenient, and less expensive compared with the models that rely upon blood tests. The scoring rule based on the simple model performed well when it was validated in an independent cohort. The risk score method for identifying high-risk people is practical for a primary medical care setting and for a layperson to perform self-assessment. The high-risk individuals identified will benefit from receiving health education and having the opportunity to engage in healthy lifestyles at an early stage so as to prevent or delay the onset of type 2 diabetes. It is generally recommended that individuals with high risk have a fasting blood test, and, nowadays, the test is available in most primary health care settings in Thailand. Combining the information on IFG with the simple diabetes risk score increased the predictive ability in the validation cohort. This might indicate that IFG is an influential predictor in populations with a high prevalence of IFG.

Other risk scores have been developed elsewhere, but few were developed from a cohort study design in which incident rather than prevalent cases were used. Among those with a cohort design, virtually all were developed in Caucasians and contain variables that may not be readily available in other populations. For instance, the Finnish diabetes risk model (10) requires knowledge of a history of high blood glucose. The model by Stern et al. (11) includes biochemical data on blood glucose, total and LDL cholesterol, and triglycerides. The inclusion of blood test data might not be practical in countries such as Thailand, where health care resources are limited, and such tests are not easily affordable. However, the current study lacks information on certain easily measurable risk factors that may be important predictors of diabetes, such as physical activity and fruit and vegetable consumption. Results are given for both sexes combined, because of the small number of events among women (67 in the exploratory cohort and 27 in the validation cohort). This, unfortunately, precludes the chance to investigate potentially important sex differences in the optimum risk score.

Among the modifiable risk factors that played a substantial role in previous studies was obesity, measured by BMI or waist circumference. In the present study, both BMI and waist circumference were found to increase diabetes risk at cutoff points suggested for Asian populations that are lower than those used for people in Western countries (18,19). HDL cholesterol and triglycerides are components of the metabolic syndrome, which is related to an increased risk of type 2 diabetes and cardiovascular risk. We did not find that low HDL cholesterol or high triglyceride levels added predictive ability beyond that of the model with fasting blood glucose or IGT. A recent study showed that a diagnosis of the metabolic syndrome is not superior to a diabetes risk score in the prediction of diabetes (20).

Importantly, this study, unlike many previous ones, included a validation test of the diabetes prediction score in another population. Although the validation cohort was drawn from the same workforce, it contained an entirely different set of individuals drawn from a wider geographical area. The results of the validation confirmed that the risk score performs well in the prediction of diabetes. However, at the score that best defined diabetes in the exploratory cohort, it slightly overestimated the risk of diabetes in the validation set. This result might be due to the shorter follow-up in the validation cohort compared with that for the exploratory cohort. However, some degree of poorer fit is expected, because all scores are more likely to perform well in the study within which they were developed. The optimal score in the validation cohort was only 1 point higher than in the exploratory cohort, further confirming the utility of the score derived here.

The present study provides a risk score based on a specific population in Asia. The predictive performance and discriminative ability of the score are broadly comparable to those of scores developed among Caucasians (10–12). We were unable to validate the Finnish diabetes risk score or other scores with our data because certain information such as physical activity and vegetable or fruit consumption used in those studies was not available in our data. However, one study showed that those score rules yielded low accuracy when applied to another population, probably due to the differences in population characteristics (21). Nevertheless, future researchers might investigate further the generalizability of these score rules across countries.

In summary, the simple diabetes risk score developed here can be applied in primary medical care practice and by the public as a self-assessment tool to identify people at high risk of diabetes. People with a high score should be referred for further blood tests and changes to a healthier lifestyle for primary prevention.

Figure 1—

Distribution of diabetes risk score (upward bars) and diabetes incidence (downward bars) against diabetes risk score in the exploratory and validation cohorts. Note that the incidence of diabetes at score 17 was zero in the validation cohort.

Table 1—

Baseline characteristics at entry for two cohorts, by incident diabetes status

Table 2—

ORs for predictors for diabetes in the exploratory cohort

Table 3—

Diabetes risk score based on the simple model for diabetes incidence in the exploratory cohort


The authors thank the following organizations for their support: the Faculty of Medicine Ramathibodi Hospital, Mahidol University; the Electric Generating Authority of Thailand; and the Health Information System Development Office, Thai Health Foundation.


  • A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.

    • Accepted April 27, 2006.
    • Received November 4, 2005.

