Association Between Fine Particulate Matter and Diabetes Prevalence in the U.S.
- John F. Pearson, BS1,2,
- Chethan Bachireddy, BS1,3,
- Sangameswaran Shyamprasad, MS1,
- Allison B. Goldfine, MD4,5 and
- John S. Brownstein, PHD1,4,6,7
- 1Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, Boston, Massachusetts,
- 2St. George's University School of Medicine, Grenada, West Indies;
- 3Yale University School of Medicine, New Haven, Connecticut;
- 4Partners Healthcare, Boston, Massachusetts;
- 5Joslin Diabetes Center, Boston, Massachusetts;
- 6Division of Emergency Medicine, Children's Hospital Boston, Boston, Massachusetts;
- 7Department of Pediatrics, Harvard Medical School, Boston, Massachusetts.
- Corresponding author: John S. Brownstein.
OBJECTIVE Recent studies have drawn attention to the adverse effects of ambient air pollutants such as particulate matter 2.5 (PM2.5) on human health. We evaluated the association between PM2.5 exposure and diabetes prevalence in the U.S. and explored factors that may influence this relationship.
RESEARCH DESIGN AND METHODS The relationship between PM2.5 levels and diagnosed diabetes prevalence in the U.S. was assessed by multivariate regression models at the county level using data obtained from both the Centers for Disease Control and Prevention (CDC) and U.S. Environmental Protection Agency (EPA) for years 2004 and 2005. Covariates including obesity rates, population density, ethnicity, income, education, and health insurance were collected from the U.S. Census Bureau and the CDC.
RESULTS Diabetes prevalence increases with increasing PM2.5 concentrations, with a 1% increase in diabetes prevalence seen with a 10 μg/m3 increase in PM2.5 exposure (2004: β = 0.77 [95% CI 0.39–1.25], P < 0.001; 2005: β = 0.81 [0.48–1.07], P < 0.001). This finding was confirmed for each study year in both univariate and multivariate models. The relationship remained consistent and significant when different estimates of PM2.5 exposure were used. Even for counties within guidelines for EPA PM2.5 exposure limits, those with the highest exposure showed a >20% increase in diabetes prevalence compared with that for those with the lowest levels of PM2.5, an association that persisted after controlling for diabetes risk factors.
CONCLUSIONS Our results suggest PM2.5 may contribute to increased diabetes prevalence in the adult U.S. population. These findings add to the growing evidence that air pollution is a risk factor for diabetes.
Over the past 15 years diabetes prevalence has more than doubled in the U.S. to ∼24 million people (1). The impact of environmental pollution on diabetes risk remains incompletely understood. Air pollution has long been recognized as detrimental to health, contributing to respiratory and cardiovascular diseases (2). In 2004, the American Heart Association concluded that short-term exposure to particulate matter contributes to increased hospital admissions and cardiovascular mortality (2). Studies suggest that diabetic patients are especially sensitive to pollution-triggered cardiovascular events (3,4). During periods of high pollution, diabetic patients demonstrate greater impairment of vascular reactivity (3) and doubled rates of hospital admission for heart disease (5).
Environmental pollution, especially particulate matter between 0.1 and 2.5 μm in size (PM2.5), may be a neglected risk factor for diabetes (6,7). As a main component of haze, smoke, and motor vehicle exhaust, PM2.5 is dangerous in part because of its small size and ability to invade critical human organs in the respiratory and vascular systems (8). Exposure to higher levels of air pollution exaggerates adipose inflammation and insulin resistance in a mouse model of diet-induced obesity. In diabetic patients, plasma inflammatory markers increase in response to higher PM2.5 exposure (4,9). The presence of a large-scale population relationship between air pollution and diabetes risk has not been reported. Thus, the objective of this study was to evaluate the association between ambient air pollution exposure and adult diabetes prevalence in the U.S. At the county level in the U.S., we assessed the relationship between PM2.5, diabetes prevalence, and diabetes risk factors for the years 2004 and 2005. We hypothesized that higher PM2.5 levels would be associated with higher diabetes prevalence, and this relationship would be independent of multiple socioeconomic and behavioral risk factors typically associated with diabetes.
RESEARCH DESIGN AND METHODS
Data for the annual mean level of PM2.5 were obtained from the U.S. Environmental Protection Agency (EPA) for 2004 and 2005. We used data for both maximum recorded annual weighted mean by county and the average annual weighted mean of all monitors in each county (10). Because ground monitor information is limited to <700 counties, we also used the EPA's more recently available statistically fused air model of PM2.5 data from the Landscape Characterization Branch (11). Data were imported into SAS (version 10.2.1; SAS Institute, Cary, NC) with which an annual mean was generated for each grid point. This value was then transformed into raster surface, generating an annual mean by county by using spatial analyst and zonal statistics within ArcGIS (version 9.3; Environmental Systems Research Institute Inc., Redlands, CA). The average of all grid cells in each county was used for the primary analysis, whereas the greatest PM2.5 grid cell by county, considered the highest annual exposure, was used for confirmatory analyses.
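The county aggregation described above (the mean of all grid cells for the primary analysis, the largest grid cell for the confirmatory analysis) can be illustrated with a minimal sketch. It assumes each grid cell has already been assigned to a county; the input layout, the function name, and the sample values are hypothetical and stand in for the zonal-statistics step that was performed in ArcGIS.

```python
from collections import defaultdict
from statistics import mean


def county_pm25_summary(grid_cells):
    """Aggregate grid-cell annual means to county level.

    grid_cells: iterable of (county_id, annual_mean_pm25) pairs, one per
    grid cell (hypothetical layout; the cell-to-county assignment is assumed
    to have been done beforehand).
    Returns {county_id: {"mean": ..., "max": ...}}.
    """
    by_county = defaultdict(list)
    for county_id, value in grid_cells:
        by_county[county_id].append(value)
    return {
        county_id: {"mean": mean(values), "max": max(values)}
        for county_id, values in by_county.items()
    }


# Illustrative values only, not EPA data.
cells = [("county_a", 11.0), ("county_a", 13.0), ("county_b", 9.5)]
summary = county_pm25_summary(cells)
```

The "mean" entry corresponds to the exposure used in the primary analysis and the "max" entry to the highest annual exposure used in confirmatory analyses.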
We used county-level prevalence values of diagnosed diabetes for 2004 and 2005 created by the National Diabetes Surveillance System at the CDC (12). These data are based on the CDC's Behavioral Risk Factor Surveillance System (BRFSS) and represent percentage of the population ≥20 years who report diagnosed diabetes. BRFSS is a monthly telephone survey of the adult U.S. population. Bayesian multilevel modeling techniques and BRFSS diagnosed diabetes data from 2003, 2004, and 2005 were used to create county-level estimates of diagnosed diabetes prevalence in 2004 and similarly for 2005 (12). The CDC used 3-year averaging to increase statistical strength and reduce bias by increasing the sample size of the population surveyed in each individual county (13).
Diabetes risk factors
To examine the impact of diabetes risk factors as potential confounders, we used county-level prevalence for obesity (defined as BMI >30 kg/m2), physical activity from adults who exercised, outside of their job, in the past month (herein denoted “physical activity”), and fast food establishment density from the CDC's BRFSS county-level data (14). We created 3-year estimates of each BRFSS-sourced covariate, so that 2004 was an assembly of 2003, 2004, and 2005 survey responses, and created similar data for 2005. This allowed us to improve the statistical strength of the model through increased county sample size and to remain consistent with the diabetes prevalence data, which also use 3-year averages. Fast food establishment density data were only available for 2006. Because of the smaller survey sample sizes of the BRFSS datasets in less populated counties, we separately analyzed only those counties with obesity and physical activity survey sample size of ≥25 respondents. For our primary analysis, we included only counties with >25 respondents, whereas for confirmatory analysis all counties were included.
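One plausible way to form a 3-year county estimate and apply a minimum sample size is sketched below. The pooling rule (summing respondents and positive responses across years) is an assumption made for illustration; the text does not specify how the three survey years were combined, and the function and parameter names are hypothetical. The default threshold follows the ≥25 respondents stated above.

```python
def pooled_estimate(yearly_counts, min_respondents=25):
    """Pool survey responses across years into one county prevalence (%).

    yearly_counts: list of (respondents, positive_responses) tuples, one per
    survey year (for example 2003, 2004, and 2005 for the 2004 estimate).
    Returns None when the pooled sample is below min_respondents, so the
    county can be left out of the primary analysis.
    """
    total_respondents = sum(n for n, _ in yearly_counts)
    if total_respondents < min_respondents:
        return None
    total_positive = sum(k for _, k in yearly_counts)
    return 100.0 * total_positive / total_respondents


# Illustrative counts only.
large_county = pooled_estimate([(10, 3), (20, 5), (10, 2)])
small_county = pooled_estimate([(5, 1), (5, 1), (5, 1)])
```

Counties returning None here would be included only in the confirmatory analysis, which uses all counties.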
For both 2005 and 2004 analyses, we used the U.S. Census American Community Survey (ACS) 1-year measures as the primary dataset because of its consistency with our analysis year. ACS is methodologically similar to Census 2000, although it is based on a smaller sample size (15). Socioeconomic covariate data were obtained from the ACS including median age, per capita income, percent male sex, percentage of population aged >25 years with a high school or general equivalency degree, and percent Hispanic, Asian, Native American, African American, and Caucasian. The census categorizes ethnicity by self-identification as well as by whether someone identifies as one group alone or more than one group. We used only those who identified as one ethnic group. Note that the census classifies mixed ethnic group categories, such as Hispanic, as one group when in reality this group comprises diverse populations including Mexican, Puerto Rican, Spanish American, and others, as do other groups. Also, note that the model used each ethnic group as its own covariate. In addition, we used health insurance data from the 2000 and 2005 versions of the Small Area Health Insurance Estimates from the U.S. Census. The Small Area Health Insurance Estimates is a two-level regression model of the uninsured population by county based on the Annual Social and Economic Supplement of the Current Population Survey as well as Internal Revenue Service tax returns, food stamp participants, Medicaid, and State Children's Health Insurance Program participants (16). We used annual U.S. census population estimates divided by the square miles per county to derive the population density of each county (17). Confirmatory analyses were performed both with 2000 Census and 2005 3-year ACS measures for the socioeconomic covariates listed above.
To assess the relationship between the micrograms per meter cubed weighted annual mean for PM2.5 exposure (independent) and diabetes prevalence (dependent), multivariate linear regression models were developed using the ordinary least squares method, controlling for socioeconomic covariates, behavioral risk factors, population density, and latitude to correct for unobserved geospatial biases. We used the 36-km PM2.5 statistically fused air model dataset for our primary analysis because it covers the entire contiguous U.S. Confirmatory analyses substituted in the 12-km and ground-level PM2.5 datasets. All datasets were normally distributed, with Kolmogorov-Smirnov results, histograms, box plots, and normal probability plots all concordant. Distribution analyses were performed in SAS. Testing of standard data transformations (including logarithmic) provided no added benefit to the models compared with the linear fitting. Finally, a risk factor analysis was performed by comparing mean diabetes prevalence in counties from the bottom quartile of PM2.5 levels with counties from the upper quartile of PM2.5 levels. To elucidate the difference in diabetes prevalence for counties with PM2.5 levels in legal compliance with EPA limits, only counties below the EPA limit of 15 μg/m3 were selected for additional evaluation including risk factor analysis (18). Because diabetes data covered all U.S. counties, analyses were limited only by the EPA data coverage and the extent of the covariates. Herein, the analyses are referred to by their EPA source data, such that the 36-km Bayesian model is denoted “36-km model” (similarly for 12-km model) and EPA surface monitor data are denoted “ground.” Statistical and exploratory analyses were performed in SYSTAT (version 12; Cranes Software International, Bangalore, India).
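The ordinary least squares fit described above can be sketched with the normal equations. This is a standard-library illustration, not the SYSTAT procedure used in the study; the two predictors (PM2.5 and obesity) and all data values are hypothetical, and a real analysis would include the full covariate set and report standard errors.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X: list of predictor rows (no intercept column); y: list of outcomes.
    Returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical counties: (PM2.5 in ug/m3, obesity %) -> diabetes prevalence %.
predictors = [(8, 20), (10, 25), (12, 22), (14, 30), (9, 28)]
prevalence = [2 + 0.1 * pm + 0.2 * ob for pm, ob in predictors]
intercept, beta_pm25, beta_obesity = ols(predictors, prevalence)
```

Because the toy outcomes are generated from an exact linear rule, the fit recovers the generating coefficients; with real county data, beta_pm25 would be the adjusted PM2.5 coefficient per 1 μg/m3.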
RESULTS
Univariate linear regression of the 36-km model resulted in a strong and significant association between mean PM2.5 levels and diabetes prevalence by county during both calendar year 2004 (β = 1.9 [95% CI 1.71–2.05]; P < 0.001; n = 3,082 counties) and 2005 (β = 1.9 [1.69–2.07]; P < 0.001; n = 3,082), with regression values interpreted as percent increase in overall diabetes prevalence per increase of 10 μg/m3. Nearly identical results were found by using the county maximum PM2.5 values. Given a hypothetical population of 1,000,000 people, our model suggests that for every 10 μg/m3 increase of PM2.5, there could be a resulting increase of ∼10,000 diagnosed cases of diabetes or an overall increase in diabetes prevalence of ∼1%/10 μg/m3.
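The hypothetical-population figure follows from simple arithmetic: a 1 percentage point rise in prevalence applied to 1,000,000 people. A short sketch (the function name is illustrative):

```python
def additional_cases(population, prevalence_increase_pct):
    """Extra diagnosed cases implied by a rise in prevalence (percentage points)."""
    return population * prevalence_increase_pct / 100


# 1,000,000 people and a ~1 percentage point increase per 10 ug/m3 of PM2.5.
cases = additional_cases(1_000_000, 1.0)
```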
Figure 1A presents a map of diabetes prevalence and PM2.5 levels by county. The association identified in univariate analysis remained statistically strong, although the magnitude of the impact lessened slightly in the multivariate model for both years (2004: β = 0.78 [95% CI 0.39–1.25]; P < 0.001; n = 241; 2005: β = 0.81 [0.48–1.07]; P < 0.001; n = 766). Analysis of the 36-km model with two sets of covariates as well as maximum PM2.5 values resulted in consistent and significant results (Table 1). Stepwise analysis in the multivariate linear regression model indicated a good model fit for our dataset with adjusted squared multiple R (ASMR) increasing from 0.73 without PM2.5 to 0.74 with the addition of the PM2.5 36-km model dataset, with consistent results for the 12-km (ASMR = 0.75) and ground (ASMR = 0.78) datasets.
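For readers unfamiliar with the fit statistic, the sketch below uses the textbook definition of adjusted R², on the assumption that the adjusted squared multiple R reported here corresponds to it. The input values are illustrative and are not taken from the study.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Textbook adjusted R^2: penalizes R^2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Illustrative: R^2 = 0.75 from 101 observations and 10 predictors.
example = adjusted_r2(0.75, 101, 10)
```

Because of the penalty term, adding a predictor raises the adjusted value only if it improves the fit by more than would be expected from an uninformative variable.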
Overall, controlling for diabetes risk covariates did not alter the significance of the relationship, although various factors did alter the magnitude of impact of PM2.5 on diabetes. Obesity was strongly related to diabetes prevalence as expected. The modest relationship between obesity and PM2.5 levels (β = 0.004, P < 0.01) had very little effect on the relationship between pollution exposure and diabetes prevalence. Likewise, physical activity did not alter the significance level and only marginally decreased the magnitude of the relationship between diabetes and PM2.5. Removing Asian, Native American, and Hispanic ethnic groups (ethnic groups at high risk for diabetes) as covariates from the model while including all counties increased the magnitude of impact of PM2.5. When physical activity and obesity were removed from the model, the impact of the ethnic groups became less pronounced. When we removed behavioral risk factors, included all counties, and used select covariates (socioeconomic factors, population density, and African American and Caucasian races), the association remained significant and its magnitude increased (2004: β = 1.27 [0.93–1.69]; P < 0.001; n = 242; 2005: β = 1.43 [1.18–1.70]; P < 0.001; n = 766 counties).
Given the clustering of minority populations in areas with greater PM2.5 exposure in the southern states and to further examine the potential confounding effects of race on the model, an analysis was performed including only those counties with a high percentage of Caucasian inhabitants. With use of ACS 1-year covariates, those counties with >90% Caucasian population (top quartile for Caucasian race) showed a consistent and significant relationship between diabetes and PM2.5 (2005: β = 0.86 [95% CI 0.22–1.64]; P = 0.008; n = 227 counties). When examining counties with 95% Caucasian inhabitants (top 10%), we noticed a clustering in the Midwest along similar latitudes, introducing geospatial bias. Thus, we eliminated latitude from the multivariate model and again demonstrated a significant association between PM2.5 and diabetes prevalence (2005: β = 1.1 [0.59–1.52]; P < 0.001; n = 188). Although we had limited data to perform this analysis for counties reporting predominantly (>97%) Caucasian inhabitants, similar nonsignificant trends of increasing PM2.5 and increasing diabetes prevalence were present.
Cursory examination of the geospatial interaction between diabetes prevalence and PM2.5 levels raised concerns that regions in the Southeastern and Central U.S. could account for the observed interrelationship. In addition to analysis of covariates associated with ethnicity, obesity, fast-food chains, and other potential confounders, using U.S. Census Divisions, we analyzed the relationship after eliminating, individually and together, the East South Central, East North Central, and South Atlantic Census Divisions (19). The relationship remained consistent and significant with ACS 1-year covariates until all three divisions were eliminated together (2005: β = 0.24 [95% CI −0.26 to 0.71]; P = 0.28; n = 393). However, to correct for sample size bias, Census 2000 covariates were used, and again we found a consistent and significant relationship between PM2.5 exposure and diabetes prevalence even when all three divisions were eliminated (2005: β = 0.86 [0.61–1.11]; P < 0.001; n = 1,014).
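The exclusion analysis above amounts to refitting the model on subsets of counties. The sketch below only builds those subsets: each division removed on its own, then all removed together. Division names follow the Census Bureau's naming; the county records and function names are hypothetical, and the refitting step is left out.

```python
def exclusion_scenarios(divisions):
    """Each division alone, followed by all of them together."""
    scenarios = [(d,) for d in divisions]
    if len(divisions) > 1:
        scenarios.append(tuple(divisions))
    return scenarios


def drop_divisions(counties, excluded):
    """Keep only counties whose Census Division is not in `excluded`."""
    return [c for c in counties if c["division"] not in excluded]


divisions = ["East South Central", "East North Central", "South Atlantic"]
# Hypothetical county records.
counties = [
    {"id": "A", "division": "East South Central"},
    {"id": "B", "division": "South Atlantic"},
    {"id": "C", "division": "Pacific"},
    {"id": "D", "division": "East North Central"},
]
remaining = {
    scenario: [c["id"] for c in drop_divisions(counties, scenario)]
    for scenario in exclusion_scenarios(divisions)
}
```

Each entry of `remaining` is the county set on which the regression would be rerun for that scenario.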
Confirmatory analysis using both the 12-km model and ground data again resulted in consistent and highly significant findings (Table 1). Analysis demonstrated only negligible differences between use of maximum PM2.5 values and average PM2.5 values by county. There were large differences in diabetes prevalence between counties (Table 2, Fig. 1B). However, a consistent increase in diabetes prevalence was observed between counties in the bottom quartile compared with those in the top quartile of PM2.5 exposure, with populations in more polluted counties having a >20% higher mean diabetes prevalence. Importantly, only counties below the EPA NAAQS limit of 15 μg/m3 were used for the quartile analysis.
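The quartile comparison can be sketched as follows. The quartile rule used here (sort by PM2.5 and take the lowest and highest quarter of counties) is an assumption, since the text does not state how quartile boundaries were computed; the function name and data values are illustrative.

```python
from statistics import mean

EPA_ANNUAL_LIMIT = 15.0  # ug/m3, the limit cited in the text


def quartile_contrast(counties, limit=EPA_ANNUAL_LIMIT):
    """Compare mean diabetes prevalence in the bottom and top PM2.5 quartiles.

    counties: list of (pm25, diabetes_prevalence_pct) tuples.
    Only counties below `limit` are used. Returns
    (bottom_mean, top_mean, relative_difference_pct).
    """
    eligible = sorted(c for c in counties if c[0] < limit)
    quarter = len(eligible) // 4
    if quarter == 0:
        raise ValueError("need at least four counties below the limit")
    bottom = mean(prev for _, prev in eligible[:quarter])
    top = mean(prev for _, prev in eligible[-quarter:])
    return bottom, top, 100.0 * (top - bottom) / bottom


# Illustrative data; the last county exceeds the limit and is excluded.
data = [(5, 6.0), (6, 6.0), (7, 7.0), (8, 7.0), (9, 7.0),
        (10, 8.0), (11, 8.0), (12, 8.0), (16, 12.0)]
bottom_mean, top_mean, relative_pct = quartile_contrast(data)
```

Note that the result is a relative difference in mean prevalence between quartiles, which is distinct from the regression coefficient expressed per 10 μg/m3.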
CONCLUSIONS
We demonstrate a strong association between PM2.5 exposure and diabetes prevalence, suggesting that ambient air pollution may contribute to the increased prevalence of diabetes in the adult U.S. population. Advances in both data collection and statistical techniques (20) permitted this first large-scale population-based analysis of the relationship between PM2.5 and diabetes prevalence. Our findings are consistent with the few studies of geographically small areas that have also suggested a relationship between diabetes and air pollution from either road traffic or industrial facilities (21–23). Our results are also consistent with previous evidence from animal models (9). A growing body of epidemiological and laboratory-based literature connects air pollution, particularly PM2.5, and deterioration of cardiovascular health (2). Therefore, although unique in scope, our study is not without precedent.
Chronic inflammation has been suggested to be a mechanism promoting increased insulin resistance in mice with diet-induced obesity after increased PM2.5 exposure (9). Sun et al. (9) demonstrated that whole-body glucose homeostasis was reduced with PM2.5 exposure, whereas proinflammatory M1 adipose tissue macrophage activity was upregulated and anti-inflammatory M2 adipose tissue macrophage activity was suppressed. Furthermore, pollutants promote catabolic inflammatory action while inhibiting anabolic responses to insulin (24). In contrast, lean mice show little change in insulin sensitivity or lipid profile in response to PM2.5 exposure (9). Thus, increasing exposure to ambient air pollution in Westernized countries may be particularly problematic in the setting of the obesity epidemic. Similarly, O'Neill et al. (4) found that obese diabetic patients demonstrated a greater inflammatory response than nonobese diabetic patients upon exposure to pollutants. Taken together, these studies suggest that obesity may play a critical permissive role in priming the body for pollution-induced inflammation and disordered metabolism. Although our study cannot provide additional insights on mechanisms underlying the association or confirm causality, we clearly demonstrated a strong relationship between PM2.5 exposure and diabetes prevalence within our modeled dataset similar to that in other studies (4,9,23,24).
Throughout our multiple datasets and models, we find a consistent and significant association between ambient air pollution PM2.5 and diabetes prevalence. Additions of behavioral, ethnic, and socioeconomic covariates only modestly alter the magnitude of the impact of PM2.5 on diabetes prevalence. In addition, removal of highly polluted regions with high diabetes prevalence did not alter the relationship in a significant manner.
Although we found that increased PM2.5 was associated with increased diabetes prevalence, our design does not allow us to conclude whether this is a causal relationship. Ecological studies assume that characteristics of a study group within a certain area represent characteristics of the entire population for that area. A potential ecological bias would be most prominent within our study for diabetes, socioeconomic, and assessed behavioral risk factor covariates because they are based on aggregate survey datasets. It is challenging to exclude potential confounding introduced earlier in time, including diabetic or pre-diabetic individuals selecting residence in relatively more polluted neighborhoods. It is also possible that the best institutions for diabetes care may be located in areas of high pollution. Additional studies are warranted to further elucidate this relationship. Importantly, the study assesses 2 years independently for confirmation, yet we are not able to draw any conclusions regarding effects of sustained pollution exposure over time or of prior exposure on incident disease. This will be possible only when data have been available over several years, permitting time-lag analysis.
Our study captures only diagnosed cases of diabetes; 2007 estimates suggest that there were 6.3 million adults in the U.S. with undiagnosed diabetes (1). In addition, our data do not distinguish between types 1 and 2 diabetes. However, type 2 diabetes accounts for 90–95% of all cases of diagnosed diabetes in U.S. adults (1).
EPA air quality reference data report the worst air quality monitor in each county. To overcome potential biases inherent in such a measurement we repeated analyses using both the worst air quality monitor in each county and the average of the annual means of all monitors in each county, with similar findings. In the EPA-modeled datasets (36-km and 12-km), some error could be introduced from the geospatial interpolation of the two datasets within ArcGIS. In addition, we could not verify the uniform distribution of pollution across our study areas because we were limited by the resolution of the EPA datasets and the placement of ground monitors.
As noted earlier, we used 2005 health insurance estimates for the 2004 analysis because reliable county-level health insurance estimates for 2004 were unavailable. Furthermore, we used 2000 Census data for 2004 and 2005 primary analyses to expand our sample size, as 2004 and 2005 covariate data were not available for all counties, whereas the 2000 Census provided covariate data for all counties. Thus, we assumed that the socioeconomic and demographic profile of the U.S. did not change dramatically between 2000 and 2004 and between 2000 and 2005. These were years of relative economic stability in the U.S. However, repeat analysis with the more limited datasets for the same year did not alter conclusions.
Although this article focuses specifically on the relationship between PM2.5 and diabetes, other pollutants not mentioned here have been reported to share a similar relationship with insulin resistance and diabetes prevalence (21). Brook et al. (21) previously observed a relationship between NO2 exposure and diabetes among patients with respiratory disease in two Canadian cities. It is possible that our analysis suffers from omitted-variable bias and that other copollutants account in part for the relationship between PM2.5 and diabetes.
In this study, we demonstrate an increase in diabetes risk even among areas that are below the EPA legal limits for PM2.5. Populations living in areas that are near, but still below, the EPA limits show a >20% higher diabetes prevalence compared with those in cleaner areas. Although EPA limits have resulted in reduced exposure to PM2.5, workers who commute experience highway levels of PM2.5 that often exceed locally measured values (22). Although outside the scope of our study, increasing commutes for U.S. workers may contribute to chronic disease through increased pollutant exposure, in addition to increased sedentary time and reduced time for physical activity (21). Outside the U.S., the risk may be far greater as air pollution limits are often not enforced or are nonexistent, with some countries, notably in Asia and Latin America, showing PM2.5 levels >10 times higher than the U.S. EPA limits (25).
Our results, although associative, demonstrate that additional research is needed to understand the role that PM2.5 plays in the inflammatory pathway or other pollution-mediated mechanisms giving rise to diabetes. Such research could lead to novel therapeutic approaches to reduce pollution-induced inflammation. Preventative measures should be considered to reduce exposure to PM2.5 for those at highest risk. Furthermore, evidence based on this study and others suggests that current limits on particulate matter exposure may not adequately mitigate the public health consequences.
This study was funded by National Institutes of Health National Center for Biomedical Computing (grant 5U54-LM-008748).
No potential conflicts of interest relevant to this article were reported.
J.F.P. designed and coordinated the project, led the analysis, wrote the manuscript, and reviewed/edited the manuscript. C.B. collected and analyzed data, wrote the manuscript, and reviewed/edited the manuscript. S.S. performed all the SAS transformations, analyzed data, wrote the manuscript, and reviewed/edited the manuscript. J.S.B. wrote the manuscript and reviewed/edited the manuscript.
We acknowledge Kiran Chinnayakanahalli of Washington State University for assistance in data analysis.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- Received April 12, 2010.
- Accepted July 5, 2010.
- © 2010 by the American Diabetes Association.
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.