Rapid Identification of Myocardial Infarction Risk Associated With Diabetes Medications Using Electronic Medical Records

  1. Isaac S. Kohane, MD, PHD1,5,10
  1. Children's Hospital Informatics Program at the Harvard–MIT Division of Health Sciences and Technology, Boston, Massachusetts;
  2. Division of Emergency Medicine, Children's Hospital Boston, Boston, Massachusetts;
  3. Department of Pediatrics, Harvard Medical School, Boston, Massachusetts;
  4. Laboratory of Computer Science, Massachusetts General Hospital, Boston, Massachusetts;
  5. Partners Healthcare, Boston, Massachusetts;
  6. Joslin Diabetes Center, Boston, Massachusetts;
  7. Division of General Medicine, Massachusetts General Hospital, Boston, Massachusetts;
  8. Department of Medicine, Harvard Medical School, Boston, Massachusetts;
  9. Diabetes Center, Massachusetts General Hospital, Boston, Massachusetts;
  10. Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts.
  1. Corresponding author: John S. Brownstein, john_brownstein{at}harvard.edu.
  1. J.S.B. and S.N.M. contributed equally to the work.


OBJECTIVE To assess the ability to identify potential association(s) of diabetes medications with myocardial infarction using usual care clinical data obtained from the electronic medical record.

RESEARCH DESIGN AND METHODS We defined a retrospective cohort of patients (n = 34,253) treated with a sulfonylurea, metformin, rosiglitazone, or pioglitazone in a single academic health care network. All patients were aged >18 years with at least one prescription for one of the medications between 1 January 2000 and 31 December 2006. The study outcome was acute myocardial infarction requiring hospitalization. We used a cumulative temporal approach to ascertain the calendar date for earliest identifiable risk associated with rosiglitazone compared with that for other therapies.

RESULTS Sulfonylurea, metformin, rosiglitazone, or pioglitazone therapy was prescribed for 11,200, 12,490, 1,879, and 806 patients, respectively. A total of 1,343 myocardial infarctions were identified. After adjustment for potential myocardial infarction risk factors, the relative risk for myocardial infarction with rosiglitazone was 1.3 (95% CI 1.1–1.6) compared with sulfonylurea, 2.2 (1.6–3.1) compared with metformin, and 2.2 (1.5–3.4) compared with pioglitazone. Prospective surveillance using these data would have identified increased risk for myocardial infarction with rosiglitazone compared with metformin within 18 months of its introduction with a risk ratio of 2.1 (95% CI 1.2–3.8).

CONCLUSIONS Our results are consistent with a relative adverse cardiovascular risk profile for rosiglitazone. Our use of usual care electronic data sources from a large hospital network represents an innovative approach to rapid safety signal detection that may enable more effective postmarketing drug surveillance.

Adverse events that occur infrequently during premarketing randomized clinical trials or are under-reported with traditional postmarketing methods of drug surveillance underscore the need for additional methodologies and data sources to monitor drug safety (1). Critical insights may be realized by monitoring large clinical databases using automated data feeds in near real time (2). Diabetes medications present an ideal paradigm to test new safety signal detection approaches because they are used frequently in large numbers of patients with type 2 diabetes, and new products have been recently launched while suitable drug comparators remain marketed. Existing concerns regarding adverse cardiovascular risk for diabetes therapies provide motivation for hypothesis-driven prospective surveillance. Adverse cardiovascular side effects have been seen with rosiglitazone (3,4). Although a recent noninferiority clinical trial has provided some evidence exonerating rosiglitazone from a risk for excess mortality (5), concern remains regarding a possible adverse risk for myocardial infarction.

We tested an automated strategy analyzing clinical data in real time to detect adverse drug-related events. Because premarketing clinical trials of diabetes therapies are currently designed primarily to evaluate efficacy for glycemic improvement and have not previously been designed to assess relatively infrequent but clinically important adverse outcomes, active surveillance may play a valuable role in assessment of risk (6). Active surveillance could provide evidence of risk earlier than postmarketing outcome trials. Furthermore, it may be cost prohibitive to conduct randomized controlled trials for each drug product toward important hard safety outcomes. Although such an analysis would not provide conclusive causal evidence, we determined whether prospective analysis of clinical data could have provided early evidence of cardiovascular risk associated with rosiglitazone that would warrant additional evaluation.


RESEARCH DESIGN AND METHODS

We identified a cohort of patients who had new prescriptions for diabetes medications within Partners Healthcare System, a large, nonprofit academic health care network including Brigham and Women's and Massachusetts General Hospitals. The source of clinical data was the Research Patient Data Registry, a centralized data warehouse including patient demographic information, dates of service, medications, diagnoses, laboratory results, and discharge summaries.

The retrospective cohort analysis included all patients aged >18 years identified by an ICD-9 code for diabetes mellitus (250.XX) or an A1C of >6.0% and at least one record of prescription of an oral diabetes medication as an outpatient or dispensation as an inpatient, between 1 January 2000 and 31 December 2006. Analyses focused on three classes of diabetes medications: sulfonylureas, the biguanide metformin, and the thiazolidinediones rosiglitazone and pioglitazone. Evidence of insulin therapy did not exclude patients but was adjusted for in multivariate models and used for stratified analysis (described below). We excluded patients receiving either metformin or a thiazolidinedione who had a diagnosis of polycystic ovaries but not diabetes. For each patient, all available associated data were extracted, including narrative notes and hospital discharge summaries. Narrative notes were used for validating coded medications and diagnoses found in medical records, permitting determination of sensitivity and specificity of events as recorded in the electronic medical record.

Patient enrollment, observation, drug exposure, and event identification

The study population does not receive health care exclusively within the Partners system, and, thus, some patients within the surveillance database may have had incomplete records. To address this issue, we used health care encounters (inpatient or outpatient) as a proxy for receipt of care at Partners over a specific observation period. We constructed 14 6-month observation periods, beginning on 1 January or 1 July between 2000 and 2006, during which a patient had at least one outpatient office visit, including psychotherapy or nutrition visits, or an inpatient encounter. Study entry was considered the first period meeting one of these criteria within the study dates.
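The observation-period construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual extraction code: the function names (`observation_periods`, `period_index`, `active_periods`) and the example encounter dates are hypothetical, and encounters are assumed to have already been filtered to qualifying outpatient visits or inpatient stays.

```python
from datetime import date

def observation_periods(first_year=2000, last_year=2006):
    """Return (start, end) date pairs for consecutive 6-month periods."""
    periods = []
    for year in range(first_year, last_year + 1):
        periods.append((date(year, 1, 1), date(year, 6, 30)))
        periods.append((date(year, 7, 1), date(year, 12, 31)))
    return periods

def period_index(day, periods):
    """Index of the period containing `day`, or None if outside the study window."""
    for i, (start, end) in enumerate(periods):
        if start <= day <= end:
            return i
    return None

def active_periods(encounter_dates, periods):
    """Sorted indices of periods with at least one qualifying encounter."""
    hits = {period_index(d, periods) for d in encounter_dates}
    hits.discard(None)
    return sorted(hits)

PERIODS = observation_periods()

# Hypothetical encounter dates for a single patient.
encounters = [date(2001, 3, 14), date(2001, 9, 2), date(2003, 1, 20)]
active = active_periods(encounters, PERIODS)

# Study entry is taken as the first period with a qualifying encounter.
entry_period = active[0] if active else None
```

With these inputs the window contains 14 periods, and the patient is active in the third, fourth, and seventh periods (indices 2, 3, and 6), so study entry falls in the first half of 2001.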

For each patient, duration of exposure to individual diabetes medications was assessed in 6-month increments during which only one of the four medications was prescribed. Patients receiving multiple medications under consideration were excluded. The study end point for each evaluable patient was first hospitalization between 1 January 2000 and 31 December 2006 for myocardial infarction (ICD-9 code 410), death (all causes), a gap in care in which there were no patient encounters in subsequent observation periods, or end of study in 2006. The ICD-9 diagnostic code for acute myocardial infarction has been validated previously (7). Events were associated with a particular medication only when the prescription or dispensation occurred within the 6 months before the documented myocardial infarction. If a patient did not have any activity for a 6-month observation period but resumed activity in the following period, then the particular 6-month observation period with no activity was excluded from analysis. Analysis was repeated considering only patients who had been prescribed just one of the four medications throughout the study, which was considered to be monotherapy. Finally, we also performed stratification of our data to analyze patients who had not received insulin as outpatient therapy.
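The rule linking an event to a medication can be illustrated as follows. This is a sketch under stated assumptions: "6 months" is approximated as a 183-day look-back (the text does not say whether the window was counted in days or in observation periods), and the names `attributed_drugs`, `single_drug`, and the example records are hypothetical.

```python
from datetime import date, timedelta

DRUGS = {"sulfonylurea", "metformin", "rosiglitazone", "pioglitazone"}
LOOKBACK = timedelta(days=183)  # assumption: 6 months approximated as 183 days

def attributed_drugs(mi_date, prescriptions):
    """Study drugs prescribed or dispensed within the look-back window before the MI.

    `prescriptions` is a list of (date, drug_name) pairs.
    """
    return {
        drug
        for day, drug in prescriptions
        if drug in DRUGS and timedelta(0) <= mi_date - day <= LOOKBACK
    }

def single_drug(drugs):
    """Return the drug name if exactly one study drug is present, else None."""
    return next(iter(drugs)) if len(drugs) == 1 else None

# Hypothetical records for one patient.
rx = [(date(2003, 1, 10), "metformin"), (date(2003, 5, 2), "rosiglitazone")]
mi = date(2003, 9, 15)

drugs = attributed_drugs(mi, rx)
attributed = single_drug(drugs)
```

Here the metformin prescription falls outside the look-back window, so the event is attributed to rosiglitazone alone. Had both prescriptions fallen inside the window, `single_drug` would return `None` and the event would not count toward a single-medication comparison.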

We conducted a manual review of outpatient notes and inpatient discharge summaries on a random sample of 200 patients to validate use of electronic medical record data to identify both drug exposure and myocardial infarction events. Review included patients identified as exposed to rosiglitazone and with myocardial infarction (n = 50) or exposed and without an event (n = 50) as well as the comparator group of patients (receiving one of the other three oral diabetes medications but not exposed to rosiglitazone) and with (n = 50) or without myocardial infarction event (n = 50). Institutional review board approval was obtained for medical record review.

Statistical analysis

The relative risk of myocardial infarction associated with therapy was calculated for rosiglitazone compared with metformin, sulfonylureas, or pioglitazone. Both crude and adjusted rate ratios with 95% CIs were estimated using generalized linear modeling, assuming a Poisson distribution for the response and set duration of time taking a particular medication (as 6-month intervals) as the offset. To account for overdispersion in the count data, extra-Poisson variability was modeled and incorporated into estimates of SEs. Parameter estimates were transformed to rate ratios.
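For the crude comparison, a Poisson model with a single binary exposure term and log exposure time as the offset reduces to the ratio of two event rates, with a Wald interval computed on the log scale. The sketch below shows that special case only; it is not the SAS procedure used in the study, and the adjusted estimates require a full regression with covariates. The function name `rate_ratio`, the `dispersion` argument (a stand-in for the extra-Poisson scale factor, 1.0 meaning no overdispersion), and the example counts are all assumptions for illustration.

```python
import math

def rate_ratio(events_a, time_a, events_b, time_b, dispersion=1.0, z=1.96):
    """Crude rate ratio (group A vs. group B) with a Wald confidence interval.

    time_a and time_b are exposure times in any common unit, for example
    6-month intervals. The standard error of the log rate ratio is inflated
    by sqrt(dispersion) to mimic a quasi-Poisson adjustment.
    """
    rr = (events_a / time_a) / (events_b / time_b)
    se = math.sqrt(dispersion * (1.0 / events_a + 1.0 / events_b))
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Hypothetical counts: 30 events in 1,000 intervals vs. 20 events in 2,000 intervals.
rr, lower, upper = rate_ratio(30, 1000, 20, 2000)

# The same data with an assumed dispersion of 2.0 gives a wider interval.
_, lower_od, upper_od = rate_ratio(30, 1000, 20, 2000, dispersion=2.0)
```

The point estimate is unchanged by the dispersion factor; only the interval widens, which is the intended effect of incorporating extra-Poisson variability into the standard errors.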

Adjustments were made for potential risk factors including age, sex, cardiovascular disease prior to enrollment (defined by billing codes for coronary artery disease, myocardial infarction, angina, congestive heart failure, cerebrovascular incident, percutaneous coronary intervention, and coronary artery bypass graft surgery), any use of hypertensive medications, lipid-lowering medications, and outpatient insulin use during study period. The model also included adjustment for underlying morbidity using an age-adjusted Charlson score. In an additional model, we evaluated potentially important factors for which we had less than complete data. These included race/ethnicity (with information available in 93% of patients), insurance coverage (commercial, Medicare, Medicaid, or uninsured) (83%), A1C (60%), and creatinine (71%) levels. Overall mean A1C and creatinine levels (<2.0 or ≥2.0 mg/dl) during the study period were considered indicators of diabetes severity. Differences in these characteristics between medication groups were identified with ANOVA and a Tukey post hoc test. Finally, because previous myocardial infarction imparts a greater risk for recurrent cardiovascular events (8) and because of the need to consider starting new medications to minimize potential prolonged effects of prior diabetes therapies on cardiovascular events, we tested a model in which all patients who had ever had a recorded inpatient stay for myocardial infarction or had been prescribed a diabetes medication in the year before entry were excluded.

Signal detection analysis

To construct a general surveillance approach to identify adverse events from clinical data, we repeated the above analysis using a cumulative temporal approach by the defined 6-month intervals. All available data from the first time period (1 January 2000–30 June 2000) were analyzed, and data were iteratively added with each subsequent 6-month period. Cumulative data were analyzed until the final period. Data were treated as cumulative with additional new patients and patient-year exposure providing increased power to the analyses. A significant risk ratio (where the lower bound of the 95% CI was >1.0) was considered to be a safety signal. All analyses were performed using SAS statistical software (version 9.0; SAS Institute, Cary, NC). Numbers of prescriptions of pioglitazone were insufficient for comparison with rosiglitazone until 1 January 2002.
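The cumulative approach can be sketched as a loop that adds one period of data at a time and checks whether the lower confidence bound has risen above 1.0. This illustration uses an unadjusted rate ratio for brevity, whereas the signals reported in this study are based on adjusted estimates. The names `crude_rate_ratio` and `first_signal` and all counts are hypothetical.

```python
import math

def crude_rate_ratio(events_a, time_a, events_b, time_b, z=1.96):
    """Unadjusted rate ratio with a Wald 95% confidence interval."""
    rr = (events_a / time_a) / (events_b / time_b)
    se = math.sqrt(1.0 / events_a + 1.0 / events_b)
    return rr, math.exp(math.log(rr) - z * se), math.exp(math.log(rr) + z * se)

def first_signal(period_counts):
    """Return (period_index, rr, lower, upper) for the first cumulative signal.

    `period_counts` is a chronological list of tuples
    (events_drug, time_drug, events_comparator, time_comparator),
    one tuple per 6-month period. Returns None if no signal occurs.
    """
    ea = ta = eb = tb = 0
    for i, (a, t_a, b, t_b) in enumerate(period_counts):
        ea, ta, eb, tb = ea + a, ta + t_a, eb + b, tb + t_b
        if ea == 0 or eb == 0:
            continue  # ratio is undefined until both groups have events
        rr, lower, upper = crude_rate_ratio(ea, ta, eb, tb)
        if lower > 1.0:  # signal definition: lower 95% bound above 1.0
            return i, rr, lower, upper
    return None

# Hypothetical per-period counts with a constant twofold rate difference.
counts = [(2, 100, 5, 500), (3, 150, 6, 600), (4, 200, 7, 700), (5, 250, 8, 800)]
signal = first_signal(counts)
```

In this example the rate ratio is 2.0 in every cumulative look, but the interval only excludes 1.0 at the fourth period (index 3), once enough events have accumulated. This mirrors the dependence of detection time on drug uptake.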


RESULTS

We identified 34,252 diabetic patients treated with at least one of the four diabetes medications between 1 January 2000 and 31 December 2006. Of the total 159,586 evaluable 6-month intervals, there were 40,695 periods of sulfonylurea therapy (17,157 patients), 48,713 periods of metformin therapy (18,162 patients), 8,707 periods of rosiglitazone therapy (4,274 patients), and 3,591 periods of pioglitazone therapy (1,800 patients). When only one of the four diabetes medications was prescribed in a 6-month period, we identified 20,233 periods for sulfonylureas (11,200 patients), 27,860 periods for metformin (12,490 patients), 2,834 periods for rosiglitazone (1,879 patients), and 1,290 periods for pioglitazone (806 patients) (Table 1). When only one of the four diabetes medications was prescribed during the entire period, we identified 7,152 patients taking sulfonylureas, 8,798 patients taking metformin, 1,028 patients taking rosiglitazone, and 418 patients taking pioglitazone. Given the large number of patients in the different treatment groups, there were statistically significant, although generally small, differences in many baseline variables. These were adjusted for in analyses to control for known baseline differences.

Table 1

Characteristics of the population

We identified 1,343 hospitalized myocardial infarction events and an overall event rate of 16.8 per 1,000 patient-years. There were 768 events associated with sulfonylureas (38.0 events per 1,000 patient-years), 406 with metformin (14.6 events per 1,000 patient-years), 133 with rosiglitazone (46.9 events per 1,000 patient-years), and 36 with pioglitazone (27.9 events per 1,000 patient-years). Manual review of 235 randomly selected patient records revealed a high level of confirmation for drug exposure to individual medications, with both sensitivity and specificity of 94%. Identification of myocardial infarction events was confirmed with a sensitivity of 93% and specificity of 74%. Lower specificity was primarily due to the presence of previous and “rule out” myocardial infarctions noted in patient records. Overall, there were no differences in specificity and sensitivity of myocardial infarction by drug type.
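As an arithmetic check on the event rates, the counts reported in this section can be combined directly. The choice of denominators below is an assumption, not something the text states: the per-drug figures are reproduced when each event count is divided by the number of single-medication 6-month periods for that drug, while the overall figure is reproduced when the 159,586 evaluable 6-month intervals are converted to years. The variable names are hypothetical.

```python
# Counts taken from the Results text.
EVENTS = {"sulfonylurea": 768, "metformin": 406, "rosiglitazone": 133, "pioglitazone": 36}
SINGLE_DRUG_PERIODS = {
    "sulfonylurea": 20233,
    "metformin": 27860,
    "rosiglitazone": 2834,
    "pioglitazone": 1290,
}
TOTAL_INTERVALS = 159586  # all evaluable 6-month intervals

def per_thousand(events, denominator):
    """Events per 1,000 units of the chosen denominator."""
    return 1000 * events / denominator

# Assumption: denominator is single-medication 6-month periods.
per_drug = {
    drug: round(per_thousand(EVENTS[drug], SINGLE_DRUG_PERIODS[drug]), 1)
    for drug in EVENTS
}

# Assumption: two 6-month intervals make one patient-year.
total_events = sum(EVENTS.values())
overall = round(per_thousand(total_events, TOTAL_INTERVALS / 2), 1)
```

Under these assumptions `per_drug` evaluates to 38.0, 14.6, 46.9, and 27.9 and `overall` to 16.8, matching the printed values, and the four event counts sum to 1,343. This is an observation about the arithmetic only; it does not establish how the rates were actually computed.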

Rosiglitazone was associated with an unadjusted rate ratio for increased myocardial infarction of 1.2 (95% CI 1.0–1.3) compared with sulfonylureas, 3.3 (2.9–3.6) compared with metformin, and 1.7 (1.3–2.1) compared with pioglitazone. After adjustment for identified risk factors (age, sex, cardiovascular disease, hypertensive medications, lipid-lowering medications, and age-adjusted Charlson score), individuals treated with rosiglitazone had an increased rate ratio for myocardial infarction risk of 1.3 (1.0–1.6) compared with sulfonylurea, 2.7 (2.2–3.4) compared with metformin, and 1.7 (1.1–2.6) compared with pioglitazone. Additional adjustments for factors with limited data in our patient population (race/ethnicity, insurance coverage, A1C, and creatinine levels) resulted in only small differences in adjusted relative risk. In the model with additional factors not available for the entire population, rosiglitazone was associated with a relative risk of myocardial infarction compared with sulfonylurea, metformin, and pioglitazone of 1.4 (95% CI 1.0–1.9), 2.4 (1.0–4.2), and 2.0 (1.0–4.2), respectively. Analyses restricted to patients without prior myocardial infarction (29,055 of 34,252) and patients with no prior diabetes medication in the 12 months before enrollment (30,142 of 34,252) had no effect on model results.

Considering only patients receiving monotherapy, rosiglitazone was associated with an unadjusted rate ratio for increased myocardial infarction of 1.1 (95% CI 1.0–1.3) compared with sulfonylureas, 3.5 (3.1–3.9) compared with metformin, and 1.9 (1.4–2.5) compared with pioglitazone. After adjustment for identified risk factors, individuals treated with rosiglitazone had an increased rate ratio for myocardial infarction of 1.2 (1.0–1.4) compared with sulfonylurea, 2.5 (2.0–3.2) compared with metformin, and 1.7 (1.3–2.2) compared with pioglitazone. In the model with additional factors not available for the entire population, rosiglitazone was associated with a relative risk of myocardial infarction compared with sulfonylurea, metformin, and pioglitazone of 1.3 (95% CI 1.1–1.6), 2.2 (1.6–3.1), and 2.2 (1.5–3.4), respectively.

After performing stratification of our data to analyze patients who had not received insulin as an outpatient therapy, we found that rosiglitazone was associated with an unadjusted rate ratio for increased myocardial infarction of 1.3 (95% CI 1.1–1.4) compared with sulfonylurea and 3.5 (3.2–3.9) compared with metformin. After adjustment for identified risk factors, individuals treated with rosiglitazone had an increased rate ratio for myocardial infarction risk of 1.3 (1.0–1.7) compared with sulfonylureas and 3.0 (2.4–3.7) compared with metformin. In the model with additional factors not available for the entire population, rosiglitazone was associated with a relative risk of myocardial infarction compared with sulfonylureas and metformin of 1.4 (95% CI 1.0–2.0) and 2.6 (1.8–3.6), respectively. No myocardial infarctions were identified among the 594 patients receiving pioglitazone without additional insulin outpatient therapy.

The iterative temporal analysis to define the earliest possible date a safety signal would have been detected (Fig. 1) demonstrates that a safety signal would have been identified for rosiglitazone compared with metformin after 18 months in July 2001 with an adjusted risk ratio of 2.1 (95% CI 1.2–3.8). Compared with sulfonylurea or pioglitazone, rosiglitazone safety signals would have been identified by January 2005 with adjusted risk ratios of 1.2 (1.1–1.8) and 1.8 (1.0–3.4), respectively.

Figure 1

Temporal analysis to ascertain the calendar date for earliest identifiable risk associated with rosiglitazone compared with other therapies is shown with each curve representing relative risk ratio of myocardial infarction for patients on rosiglitazone compared with alternatively prescribed medications (sulfonylurea, metformin, and pioglitazone).


CONCLUSIONS

A recent meta-analysis of available case-control and cohort studies derived from the rosiglitazone phase III clinical dataset suggested a 43% increased risk for cardiovascular events in patients receiving rosiglitazone (3). Many factors contribute to uncertainty regarding these findings, including availability of only summary trial-level data rather than patient-level data, heterogeneity of trial design, and absence of uniform event adjudication (9). However, review of patient-level data by the U.S. Food and Drug Administration (FDA) yielded similar relative risk findings (10). Absolute risk was low because cardiovascular event rates were sparse in these studies and statistical methods to deal with infrequent event rates yield uncertainty regarding validity of the risk (11). Likewise, phase IV studies in patients with type 2 diabetes have neither confirmed nor excluded an increased hazard ratio for rosiglitazone (12,13), and, similarly, large randomized multicenter trials in high-risk diabetic patients with substantial use of rosiglitazone neither confirm nor exclude increased risk (14,15). The recently completed phase IV Rosiglitazone Evaluated for Cardiovascular Outcomes Regulation of Glycaemia in Diabetes (RECORD) study was designed as a noninferiority study comparing rosiglitazone plus either sulfonylurea or metformin versus metformin and sulfonylurea. Although it was underpowered and treatment crossover complicated interpretation of findings, relative risk for mortality was ∼1.0; however, risk for myocardial infarction with rosiglitazone was 1.14, leaving the risk of rosiglitazone for myocardial infarction uncertain (5). In contrast, results of randomized phase IV clinical trials and meta-analyses have suggested pioglitazone to be neutral to favorable in cardiovascular risk profile (16,17).

The thiazolidinediones rosiglitazone and pioglitazone both gained FDA approval within a short time span, have similar indications for being prescribed, have similar cost, and are initially without apparent prescription bias. A comparison of these two products reduces the likelihood of comorbidities and unmeasured variables confounding findings, which might cause greater potential bias for drugs of different class, cost, or safety profiles. Thus, evaluating cardiovascular safety of approved oral diabetes therapies in a real-world setting provides context, internal model validation, and potentially valuable clinical information for health care providers.

Our results are consistent with a previously suggested protective effect for metformin (18), more neutral effect for pioglitazone (16,17), and potential relative adverse cardiovascular safety profile for rosiglitazone (19,3,4,20). In particular, our results comparing rosiglitazone with pioglitazone complement other recent findings (19,20) and are not likely to be confounded by indication, given the similar prescribing patterns. Together, these findings demonstrate that methods for medical record surveillance may provide useful adjunct methods to assess postmarketing drug safety.

It is interesting that the lower confidence limit of the relative risk either touches or is near 1.0 in all analyses for rosiglitazone compared with sulfonylureas. Sulfonylureas are established agents that have been used for the treatment of type 2 diabetes since approval in the 1950s. Our findings are consistent with those in the meta-analysis performed by the FDA (10), which suggested no increase in risk for rosiglitazone compared with this established therapeutic class.

Our results do differ somewhat from other recent studies. An analysis of potentially more robust data showed similar trends of decreased relative risk for cardiovascular disease with metformin and increased relative risk for sulfonylureas, but they did not show significant increased relative risk for rosiglitazone compared with pioglitazone (21). However, this finding contrasts with other observational studies showing increased risk with rosiglitazone (19,22). Differences among patient populations, in absolute event rates, or in methodologies may underlie differences in the magnitude of relative risks in such studies.

Importantly, combined treatments for dyslipidemia, hypertension, antithrombotic agents, and glycemia have markedly reduced event rates in patients with type 2 diabetes, and these gains are realized using strategies that include rosiglitazone (13,14). Relative risk analysis may be used to inform a provider regarding priority for selecting among treatment options, but individual patient comorbidities and tolerance must also be considered when one is choosing among specific therapeutic options, and absolute risk must be carefully considered before withholding a therapeutic option.

Our analysis does have important limitations. We do not have complete longitudinal prescription data for all individual patients, and patients may not take medication that has been prescribed. Hence we cannot confirm for all patients whether they were taking a medication at the time of myocardial infarction. Although we have derived an estimate of recent exposure, defining true exposure is currently not possible with usual clinical data. We may also have missed patients who did have exposure. Prescriptions for diabetes medications may have been obtained outside the Partners system and may therefore not have been captured. However, this situation would underestimate rather than overestimate drug risk. Our use of other diabetes medications as comparators, however, should reduce or eliminate the majority of these potential biases, although we cannot fully exclude biases introduced by physicians or patients leading to selection of specific drugs. Furthermore, there may be increased cardiovascular risk with rosiglitazone for patients using insulin, which may also be a surrogate for duration of diabetes. In addition to adjusting for insulin in our models, we performed stratification, yielding very similar results. Notably, no patients receiving pioglitazone without additional outpatient insulin were identified to have a myocardial infarction. Future analyses should consider drug combinations because concomitant use of insulin and thiazolidinediones may be particularly unfavorable (10,21). Furthermore, our low specificity for detection of myocardial infarction events of 74% is of particular concern, indicating a need for future analyses to incorporate laboratory data to verify the occurrence of myocardial infarction more accurately. Composite end points of major adverse cardiovascular events are standard measures for comparing treatments in large cardiovascular outcome studies. Our analysis included only myocardial infarction, whereas other cardiovascular events, such as sudden death and stroke, were not considered in this analysis. Finally, if there is increased health risk shortly after initiation of therapy that is abrogated with longer duration of administration, then cumulative assessments including additional new patients and patient-year exposure would tend to produce a bias toward early risk.

The control of residual confounders in observational data is an important issue. Approaches addressing this issue in medical record data include comparing risk in groups for the measured outcome before and after an exposure (23) to test whether a group was at prior higher risk, picking comparable exposures (medications in the same class) where heuristically there is no reasonable argument for differences in groups, and using global, accepted measurements of acuity (such as the Charlson score) to detect differences in underlying health of groups. We selected the last of these methods, since there was some suggestion of risk differences for the two marketed thiazolidinedione products available for study.

Although the increased risk ratio for rosiglitazone compared with other diabetes medications has been demonstrated in more robust clinical datasets with adequate longitudinal records of patients (21), the current study provides two novel and important insights. First, with the need to monitor numerous products and numerous potential events, it is increasingly difficult to develop randomized clinical trials to adequately address all potential study bias and confounding factors. From a surveillance perspective, a real-time strategy detecting risk that may require further investigation is potentially more cost-effective than numerous long-term investigations into one drug–one event relationships. Moreover, designing studies to identify relatively infrequent, but medically important, adverse events that would probably be missed by phase III clinical trials and current postmarketing voluntary reporting mechanisms would probably be expensive and curtail or delay development of new treatments. Surveillance analysis should be guided by a priori evidence (such as adverse events that did not reach statistical significance) from phase III clinical trials to limit the potential for false-positive results. It is important to note that surveillance methods work best when agents have adequate population uptake. For instance, the time to identify signals comparing rosiglitazone and pioglitazone was delayed because of low sample sizes for both drugs. Methodologies that improve detection performance, especially when drugs and events are rare, or permit all possible drug-event interactions are described elsewhere (24).

Second, our study shows how relatively simple clinical surveillance methods can be implemented in real time. With the availability of electronic datasets such as the one used herein, it is possible to perform analyses of drug-event combinations prospectively on a quarterly, monthly, or even weekly basis. In this study, we demonstrated that if these methods had been in use when thiazolidinediones were first introduced to the market, a potential hazard would have been apparent ∼18 months after the launch, in 2001, well before concerns were raised publicly in 2007 (3). This time frame is also faster than would be realized in phase IV postmarketing trials and may cause less delay than requiring cardiovascular outcome trials before FDA approval for diabetes medications that do not have adverse safety signals in aggregate phase II–III study analysis. Although these methods would not provide the same degree of information as a prospective randomized controlled trial, they might indicate caution to care providers faced with options to prescribe multiple newer medications, fulfilling clear needs for complementary approaches (25).

Our study provides a framework for implementation of future postmarketing surveillance activities with semiautomated extraction of large clinical datasets. Despite inherent limitations, these data can provide robust real-time signals of adverse drug events in the postmarketing setting. How such systems will interact with activities at the FDA requires thoughtful consideration.


This work was supported in part by the National Institutes of Health National Center for Biomedical Computing (Grant 5U54-LM-008748).

No potential conflicts of interest relevant to this article were reported.


  • The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

  • The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

    • Received August 13, 2009.
    • Accepted December 2, 2009.
  • Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.


  1. Diabetes Care vol. 33 no. 3 526-531