Is it universal?
Abstract
OBJECTIVE—Bimodality in blood glucose (BG) distribution has been demonstrated in several populations with a high prevalence of diabetes and obesity. However, other population studies have not found bimodality, thus casting doubt on its universality. We address this question in four ethnic populations—namely Malay, Chinese, Indian, and the indigenous people of Borneo.
RESEARCH DESIGN AND METHODS—A national health survey was conducted in Malaysia in 1996. A total of 18,397 subjects aged ≥30 years had postchallenge BG measurements taken. To test whether BG was consistent with a bimodal distribution, we fitted unimodal normal and skewed distributions as well as a mixture of two normal distributions to the data by age and ethnic groups.
RESULTS—Age-specific prevalence of diabetes varied from 1.3 to 26.3%. In all ethnic/age groups, the bimodal model fitted the log BG data better (likelihood ratio tests, all P values <0.001).
CONCLUSIONS—Bimodality in BG distribution is demonstrable even in populations with a very low prevalence of diabetes and obesity. Previous studies that found unimodality may have failed to detect the second mode because of inadequate sample size, bias due to treatment of subjects with known diabetes, and inclusion of subjects with type 1 diabetes in the sample. Bimodality implies that diabetes is a distinct entity rather than an arbitrarily defined extreme end of a continuously distributed measurement.
Bimodality in blood glucose (BG) distribution has been described in several populations, including Pima Indians (1), Micronesian Nauruans (2), Polynesian Samoans (3), Mexican Americans (4), Creole and Chinese Mauritians (5), Micronesians of Kiribati (6), South African Indians (7), and the Wanigela people of Papua New Guinea (8). However, these populations have a high prevalence of diabetes and obesity, and most are relatively isolated genetically (9). The distribution of BG in most populations studied has apparently been unimodal (10,11). However, the failure to demonstrate a bimodal pattern does not necessarily mean it is absent. A recent national health survey conducted in Malaysia presented us with an opportunity to reexamine this question.
RESEARCH DESIGN AND METHODS
Sample
The national health survey was conducted in 1996. The Malaysia Department of Statistics provided the sampling frame (12). A sample of 17,995 households was drawn using a stratified two-stage cluster sampling design with a self-weighting sample. Only 13,025 (87%) households responded, yielding a sample size of 59,903 individuals. A total of 19,218 (87%) of 22,093 eligible individuals aged ≥30 years from the four major ethnic groups (Malay, Chinese, Indians, and Other Indigenous) had their BG measurements taken.
Survey procedure
During a home visit, the first hour was devoted to completing a questionnaire administered by an interviewer. Subsequently, each respondent’s weight was measured in light indoor clothing without shoes using a bathroom spring balance. Height was measured without shoes using a measuring tape. A trained nurse measured BG with a reflectance photometer (Accutrend GC; Boehringer Mannheim) in all subjects without a medical history of diabetes. Subjects with known diabetes were excluded from BG measurement on ethical and clinical grounds, except in a small subsample (n = 25). The procedure was explained, and verbal permission was obtained from the respondent before the examination. Glucose monohydrate powder (75 g) was mixed with a glass of plain water and ingested by the respondent. The respondent then fasted for 2 h, after which a blood sample was obtained by finger prick to measure BG. The precision of the reflectance photometer was deemed satisfactory for survey use. The within-run coefficient of variation was 2–4%, and correlation with measurements on plasma using a conventional laboratory method varied between 0.98 and 0.99 (unpublished data).
Definition and classification
A subject with known diabetes was defined as an individual with a medical history of diabetes who was currently on antidiabetes medication. A total of 25 subjects with diagnosed diabetes had BG measured; the geometric mean (minimum, maximum) of their BG concentrations was 17.6 mmol/l (11.3, 31.2). This result provides evidence for the validity of self-reported diabetes. The diagnostic criteria recommended by the World Health Organization (13), based on a 2-h postload BG value, were used to classify subjects without a medical history of diabetes as having diabetes, as having impaired glucose tolerance, or as normal.
Statistical methods
Probability-weighted estimation was used to calculate the prevalence of diabetes as appropriate for the sampling design (14). The sampling weights were adjusted for household nonresponse using adjustment cells formed by state and urban/rural residence. Poststratification (15) was used to adjust the weighted sample totals to known population totals for age, sex, and ethnicity based on 1996 census population projection. The age-standardized prevalence of diabetes was calculated by the direct method using Segi’s world population as standard (16).
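As a minimal sketch of the direct method mentioned above, the Python function below weights each age-specific prevalence by the share of a standard population in that age band. The function name, the age bands, the prevalences, and the weights are hypothetical placeholders; they are neither Segi's published weights nor results from this survey.

```python
def direct_standardized_rate(age_specific_rates, standard_weights):
    """Directly age-standardized rate: weighted mean of age-specific rates."""
    if len(age_specific_rates) != len(standard_weights):
        raise ValueError("need one standard weight per age band")
    total_weight = sum(standard_weights)
    weighted_sum = sum(r * w for r, w in zip(age_specific_rates, standard_weights))
    return weighted_sum / total_weight


# Hypothetical prevalences for four age bands (30-39, 40-49, 50-59, >=60)
# and hypothetical standard-population weights for the same bands.
example_rates = [0.02, 0.05, 0.10, 0.15]
example_weights = [30, 30, 20, 20]
print(round(direct_standardized_rate(example_rates, example_weights), 4))
```

Because the same weights are applied to every ethnic group, differences in the standardized figures reflect differences in age-specific prevalence rather than in age structure.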
Blood glucose values were log-transformed to reduce skewness. To test whether BG was consistent with a bimodal distribution, we fitted a unimodal normal distribution as well as a mixture of two normal distributions to the data within each ethnic/age group. The normal distribution was chosen as a reasonable first approximation, and previous studies (1–8) found that it fitted log BG data well. However, evidence for bimodality can be confounded by residual skewness in the log BG distribution (17). We therefore also fitted a unimodal skewed distribution to the data. Estimation was by the method of maximum likelihood based on an expectation-maximization (EM) algorithm as implemented by the program NOCOM (18). Parameters of the unimodal normal model are mean μ and variance σ^{2}. For the unimodal skewed distribution, an additional parameter λ is required to remove residual skewness by Box and Cox transformation (17). The bimodal model parameters are the mean μ and variance σ^{2} of each of the two component normal distributions and the mixture proportion q. The variances σ^{2} of the two components may be unequal. Hypothesis testing was carried out by comparing nested models using likelihood ratio tests to select the best-fitting model. Twice the difference in log-likelihood between the two models was assumed to be distributed as χ^{2}, with the df equal to the difference in the number of parameters between models. The bimodal distribution was accepted only if both the unimodal normal and unimodal skewed distribution hypotheses were rejected at the 1% level. A 1% significance level was chosen rather than the more conventional 5% level because the assumed χ^{2} distribution for the likelihood ratio test statistic is only approximate, and the P value obtained is liberal (19).
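The comparison of a unimodal normal model with an equal-variance bimodal model can be sketched in a few lines of Python. This is an illustrative reimplementation, not the NOCOM program: the function names, starting values, and fixed iteration count are assumptions, the skewed (Box and Cox) model is omitted, and q is taken here to be the weight of the upper component.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_unimodal(xs):
    """Maximum likelihood fit of a single normal; returns (mu, var, loglik)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    loglik = sum(math.log(normal_pdf(x, mu, var)) for x in xs)
    return mu, var, loglik


def fit_bimodal(xs, n_iter=100):
    """EM fit of a two-normal mixture with a common variance.

    Returns (mu1, mu2, var, q, loglik); q is the weight of the upper mode.
    """
    n = len(xs)
    s = sorted(xs)
    mu1, mu2 = s[n // 4], s[(19 * n) // 20]  # crude, assumed starting values
    var = fit_unimodal(xs)[1]
    q = 0.05
    for _ in range(n_iter):
        # E-step: posterior probability that each point is in the upper mode
        resp = []
        for x in xs:
            lower = (1.0 - q) * normal_pdf(x, mu1, var)
            upper = q * normal_pdf(x, mu2, var)
            total = lower + upper
            resp.append(upper / total if total > 0.0 else 0.5)
        # M-step: update means, common variance, and mixture proportion
        n2 = sum(resp)
        n1 = n - n2
        mu1 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n1
        mu2 = sum(r * x for r, x in zip(resp, xs)) / n2
        var = sum((1.0 - r) * (x - mu1) ** 2 + r * (x - mu2) ** 2
                  for r, x in zip(resp, xs)) / n
        q = n2 / n
    loglik = sum(math.log((1.0 - q) * normal_pdf(x, mu1, var)
                          + q * normal_pdf(x, mu2, var)) for x in xs)
    return mu1, mu2, var, q, loglik


def likelihood_ratio_test(xs):
    """LR statistic and chi-square (df = 2) P value, unimodal vs bimodal."""
    ll_uni = fit_unimodal(xs)[2]
    ll_bi = fit_bimodal(xs)[4]
    stat = 2.0 * (ll_bi - ll_uni)
    # For df = 2 the chi-square upper tail probability is exp(-stat / 2).
    p_value = math.exp(-stat / 2.0) if stat > 0.0 else 1.0
    return stat, p_value
```

The equal-variance mixture has four parameters against two for the single normal, which gives the df = 2 used for the P value. A usage example on simulated log BG values (the means, SD, and 10% mixing proportion are made up for illustration) would be `likelihood_ratio_test([random.gauss(2.8, 0.3) if random.random() < 0.1 else random.gauss(1.6, 0.3) for _ in range(1000)])`.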
The above analysis was performed excluding missing BG data. Because of the design of the survey, the vast majority of subjects with a medical history of diabetes (n = 821) had no BG measurement taken, and in principle their true BG concentrations before diagnosis, unmodified by treatment, are unknown. Instead of excluding such missing BG data from the analysis, we imputed their BG values based on a censored regression imputation model (20) and guided by the imputation principles described by Little (21). BG values for subjects with known diabetes were right-censored at 11.1 mmol/l. The imputation model included medical history of diabetes status, BMI, age, sex, and ethnicity. These variables are all known to be predictive of BG outcome in a population (22). The imputations were then drawn by predictive mean matching (21). Each subject without a BG value (nonrespondent) was matched against the respondents on predicted BG value. We then imputed the BG value of the respondent with the closest predicted value. In effect, imputed values were drawn from the BG distribution of subjects in the sample with undiagnosed diabetes.
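The matching step of predictive mean matching can be illustrated as follows. This sketch assumes the predicted values have already been produced by some regression model (the censored regression itself is not shown), and the function name and example numbers are hypothetical.

```python
def predictive_mean_match(donor_predicted, donor_observed, recipient_predicted):
    """Impute each recipient with the observed value of the closest donor.

    donor_predicted / donor_observed: model predictions and measured values
    for respondents; recipient_predicted: predictions for nonrespondents.
    """
    imputed = []
    for target in recipient_predicted:
        nearest = min(range(len(donor_predicted)),
                      key=lambda i: abs(donor_predicted[i] - target))
        imputed.append(donor_observed[nearest])
    return imputed


# Hypothetical example: three respondents and two nonrespondents.
print(predictive_mean_match([2.5, 2.8, 3.1], [12.0, 16.5, 22.0], [2.75, 3.3]))
```

Because every imputed value is an actually observed measurement, the imputed values stay within the range of the donors' data.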
RESULTS
Table 1 shows the characteristics of the sample. The distribution of BMI shows that obesity was not markedly prevalent in these populations. Only 5% of men in all ethnic groups had BMI ≥30 kg/m^{2}, whereas the corresponding proportion among women in the four ethnic groups varied from 6 to 12%. Indian men and women had the highest prevalence of diabetes, whereas Other Indigenous men and women had the lowest. Similar differences among ethnic and sex groups were observed in the age-standardized prevalence of diabetes.
The likelihood ratio χ^{2} test statistics (df = 2), when comparing unimodal normal model versus bimodal normal model, range from 15 to 374 (all P values <0.001). Similarly, when comparing the unimodal skewed model versus the bimodal normal model, the test statistics (df = 1) range from 14 to 331 (all P values <0.001). Hence, in all groups, the bimodal normal models fit the data better than their corresponding unimodal normal model or unimodal skewed model.
Table 2 summarizes the parameter estimates of the bimodal normal models. Only results of the better-fitting bimodal models (equal variance or unequal variance) are shown. In all groups, the equal variance model was preferred, except in Chinese subjects aged ≥60 years and Indian subjects aged 40–49 years. In practical terms, however, the parameter estimates of the equal and unequal variance bimodal models were similar. The proportion of subjects in the second mode of the bimodal model closely matches the proportion of subjects in the sample with diabetes. The mean (minimum, maximum) of the cut points between the two distributions was 12.0 mmol/l (10.9, 13.3). There was little variation in the means of the two modes among ethnic/age groups, nor was there any apparent age trend in the means of both the first and second mode. The means of the first mode varied from 1.5 to 1.7 (geometric mean 4.6–5.3 mmol/l), and those of the second mode varied from 2.6 to 3.0 (geometric mean 13.4–19.3 mmol/l). Likewise, there was little difference in the variance of the two modes among ethnic/age groups. Figure 1 shows the histograms of the log BG data, with the fitted bimodal distribution curves superimposed.
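The text does not state how the cut points were computed. One common convention, assumed here purely for illustration, takes the cut point as the value at which the two weighted component densities are equal. For the equal-variance model on the log scale, with μ_1 and μ_2 the component means, σ^{2} the common variance, and q taken as the proportion in the second mode, this gives:

```latex
\[
(1-q)\,\frac{1}{\sqrt{2\pi\sigma^{2}}}
\exp\!\left[-\frac{(x-\mu_{1})^{2}}{2\sigma^{2}}\right]
= q\,\frac{1}{\sqrt{2\pi\sigma^{2}}}
\exp\!\left[-\frac{(x-\mu_{2})^{2}}{2\sigma^{2}}\right]
\quad\Longrightarrow\quad
x_{c} = \frac{\mu_{1}+\mu_{2}}{2}
+ \frac{\sigma^{2}}{\mu_{2}-\mu_{1}}\,\ln\frac{1-q}{q}
\]
```

Under this convention the cut point in mmol/l is exp(x_c). When q is below 0.5 the logarithmic term is positive, so the cut point lies above the midpoint of the two means and moves further up as the second mode becomes smaller.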
Repeat model fitting using imputed BG for subjects with known diabetes but without BG values similarly results in the bimodal model being preferred in all ethnic/age groups. The likelihood ratio χ^{2} test statistics (df = 2), when comparing the unimodal normal model versus the bimodal normal model, range from 39 to 616 (all P values <0.0001). Similarly, when comparing the unimodal skewed model versus the bimodal normal model, the test statistics (df = 1) range from 31 to 464 (all P values <0.0001). Table 3 summarizes the parameter estimates of the bimodal normal models. The proportion of subjects in the second mode of the bimodal model closely matches the proportion of subjects in the sample with diabetes, although the latter proportion is increased as a result of including diabetic subjects without BG measurements. Other model parameters (mean and variance) were also similar to parameters obtained from modeling that excluded missing BG data, except that the cut points were slightly lower. The mean (minimum, maximum) of the cut points between the two distributions was 10.8 mmol/l (9.4, 11.8). There was similarly little variation in the means of the two modes among ethnic/age groups, nor was there any apparent age trend in the means of both the first and second mode.
CONCLUSIONS
We have demonstrated that postchallenge BG concentrations are bimodally distributed in the four ethnic populations studied. These populations were not markedly obese, and the prevalence of diabetes ranged from 3.4 to 13.9%. Even in the Other Indigenous group, aged 30–39 years, with a diabetes prevalence of only 1.3%, the bimodal pattern was still evident. Further, the parameter estimates (means and variances of the two modes) were consistent and showed little variation among ethnic/age groups. The mean cut point between the two distributions was 12.0 mmol/l, which is consistent with the cutoff value of 11.1 mmol/l currently used for diagnosing diabetes (13). These results, together with findings of bimodal BG distributions in several other populations (1–8), suggest that the bimodal pattern is universal. We now explain why we have been able to demonstrate bimodality when others have apparently found unimodality (10,11).
First, the four ethnic populations examined in this study have a relatively homogeneous type of diabetes. We may reasonably assume that the diabetes cases ascertained in this survey were almost exclusively type 2 diabetes because type 1 diabetes is rare in these populations (23). A mixture of subjects with type 2 diabetes and type 1 diabetes in a sample would increase the variance in the diabetes second mode, thus rendering the second mode less discernible. This is almost certainly true of surveys of Caucasian populations in which type 1 diabetes is far more prevalent.
Second, this was a large survey that possessed adequate power to detect the second mode in all ethnic/age groups. Previous studies may have been deficient in this respect. Where bimodality has been demonstrated, it occurred only in populations with a relatively high prevalence of diabetes, thus minimizing the sample size required to demonstrate the second mode. Given adequate statistical power, the ability of a study to detect the second mode depends on two factors (24). The first is the difference between the means of the two modes in SD units (assuming equality of variance between the two modes for simplicity). In the 16 groups studied, the average difference was 4 SD units. The second factor is the relative sizes of the two populations. When one population is relatively small (that is, a population with a low prevalence of diabetes), it could be obscured in the tail of the larger group. For a given difference in means between the two modes, there is a limit to the prevalence of diabetes below which the second mode is not discernible, no matter how large the sample size. Murphy (24) showed that for a difference in means of 4 SD units, even for a relative size of 99 (that is, 1% prevalence of diabetes), the second mode would still be discernible if the sample had adequate power. We further assessed informally the sample size requirement to detect the second mode in the Other Indigenous group aged 30–39 years with a 1.3% prevalence of diabetes. For samples of size 100, 200, 400, 600, and 800, we drew 50 random samples of each size and determined the proportion of samples in which the second mode was statistically discernible at the 1% level on a likelihood ratio test. The proportions were 38, 62, 88, 98, and 100% for sample sizes of 100, 200, 400, 600, and 800, respectively. Clearly, to have a 90% chance of detecting the second mode in that ethnic/age group, a sample size of about 400 is required.
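An informal power assessment of this kind can be sketched by simulation. The Python code below is an assumption-laden illustration rather than the procedure used in the study: it simulates from a hypothetical equal-variance mixture (component means 1.6 and 2.8 on the log scale, SD 0.3, that is, 4 SD units apart) instead of resampling the survey data, it tests only against the unimodal normal model, and the function names and EM starting values are invented for the example.

```python
import math
import random


def _normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def lr_statistic(xs, n_iter=100):
    """Twice the log-likelihood gain of a two-normal mixture over one normal."""
    n = len(xs)
    mean = sum(xs) / n
    var0 = sum((x - mean) ** 2 for x in xs) / n
    ll_uni = sum(math.log(_normal_pdf(x, mean, var0)) for x in xs)

    # Assumed starting values aimed at a small upper mode: lower mean at the
    # median, upper mean at the maximum, spread from the interquartile range.
    s = sorted(xs)
    mu1, mu2, q = s[n // 2], s[-1], 0.05
    var = ((s[(3 * n) // 4] - s[n // 4]) / 1.349) ** 2
    for _ in range(n_iter):
        resp = []
        for x in xs:
            lower = (1.0 - q) * _normal_pdf(x, mu1, var)
            upper = q * _normal_pdf(x, mu2, var)
            total = lower + upper
            resp.append(upper / total if total > 0.0 else 0.5)
        n2 = sum(resp)
        if n2 < 1e-8 or n - n2 < 1e-8:
            break  # one component has vanished; stop with current estimates
        mu1 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / (n - n2)
        mu2 = sum(r * x for r, x in zip(resp, xs)) / n2
        var = sum((1.0 - r) * (x - mu1) ** 2 + r * (x - mu2) ** 2
                  for r, x in zip(resp, xs)) / n
        q = n2 / n
    ll_bi = sum(math.log((1.0 - q) * _normal_pdf(x, mu1, var)
                         + q * _normal_pdf(x, mu2, var)) for x in xs)
    return 2.0 * (ll_bi - ll_uni)


def estimate_power(n, prevalence, n_reps=50, critical=9.21, seed=0,
                   mu_lower=1.6, mu_upper=2.8, sd=0.3):
    """Share of simulated samples whose LR statistic exceeds the critical value.

    critical=9.21 is the 1% point of chi-square with df = 2.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        xs = [rng.gauss(mu_upper, sd) if rng.random() < prevalence
              else rng.gauss(mu_lower, sd) for _ in range(n)]
        if lr_statistic(xs) > critical:
            hits += 1
    return hits / n_reps
```

Calling `estimate_power(n, 0.013)` for n in 100, 200, 400, 600, and 800 mirrors the design of the informal assessment, although the resulting proportions depend on the assumed mixture parameters and need not match the figures reported above.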
Third, bias can arise from the treatment of subjects with known diabetes or, equivalently, from the exclusion of subjects with treated diabetes from the sample. This would effectively reduce the size of the second mode. However, it did not significantly impair the ability of this large study to detect the second mode. Nevertheless, we avoided this source of bias by imputing the BG values of subjects with treated diabetes. In effect, the imputation model and method imply that, had we known the pretreatment BG values of subjects with treated diabetes, these values would resemble the distribution of BG values in subjects with undiagnosed diabetes in the sample. Such an assumption is reasonable for subjects with type 2 diabetes who are relatively asymptomatic and in whom hyperglycemia is less marked. Subjects with diabetes in our sample almost exclusively have type 2 diabetes. It is true that known diabetic subjects are likely to have more severe hyperglycemia than undiagnosed diabetic subjects (mean BG of the 25 diabetic subjects with BG measurement was greater than the mean BG of undiagnosed diabetic subjects in the sample). In that case, the imputed BG values would have been systematically lower than the true pretreatment values. Given that such a bias is likely to be present, we must point out that the means of the second mode and the cut points obtained in this study are lower than what they would have been. However, the demonstration of bimodal BG distribution remains valid.
The demonstration of bimodality in BG distribution has important implications (9). It implies that populations do segregate naturally into two groups: a normal nondiabetic group and a diabetic group. This segregation allows type 2 diabetes to be defined as a distinct disease entity. Bimodality in BG distribution also suggests that impaired glucose tolerance represents the upper part of the distribution of the first component and is not a distinct diagnostic category (25).
Acknowledgments
This work was supported by an Intensification of Research in Priority Areas (IRPA) grant (0605010060) from the Ministry of Science and Technology, Malaysia.
We thank the research groups of the Malaysian Second National Health and Morbidity Survey for providing the data for this current work.
Footnotes

Address correspondence and reprint requests to Dr. Teck-Onn Lim, Clinical Research Centre, 3rd Floor, Block Dermatology, Kuala Lumpur Hospital, Jalan Pahang 50586, Kuala Lumpur, Malaysia. E-mail: limto{at}crc.gov.my.
Received for publication 10 April 2002 and accepted in revised form 16 August 2002.
A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.
 DIABETES CARE