The Diabetes Quality Improvement Project

Moving science into health policy to gain an edge on the diabetes epidemic

  Barbara B. Fleming, MD, PhD,1 Sheldon Greenfield, MD,2 Michael M. Engelgau, MD, MS,3 Leonard M. Pogach, MD, MBA,4 Steven B. Clauser, PhD,1 and Marian A. Parrott, MD,5 for the DQIP Group

  1Health Care Financing Administration, Baltimore, Maryland
  2Primary Care Outcomes Research Institute, New England Medical Center Hospitals and Tufts University School of Medicine, Boston, Massachusetts
  3Division of Diabetes Translation, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, Georgia
  4Medical Services, VA New Jersey Health Care System, East Orange, New Jersey
  5American Diabetes Association, Alexandria, Virginia

    Background

    As the worldwide diabetes epidemic continues to unfold, some experts have asked whether the war against it is being lost (1,2,3). In the U.S., blindness, kidney failure, amputations, and cardiovascular disease resulting from diabetes not only markedly reduce quality and length of life but also cost nearly $100 billion annually (4,5,6,7). Fortunately, a vast body of research has clearly established several effective treatments and practices that can substantially reduce or prevent this burden (8,9,10,11,12,13,14,15,16,17,18,19). If broadly implemented, these interventions would allow us to use our health efforts and resources more effectively and efficiently to minimize the diabetes burden.

    Judging from the scant data available, however, the care currently delivered to populations with diabetes often falls short of producing these potential health gains (20,21). To assess more completely the level of diabetes care delivered in the U.S., we need standardized, uniform performance measures that can assess quality of care accurately and reliably. Such measures would enhance the uptake of research into practice and may ultimately improve diabetes care and clinical outcomes.

    In the early to mid-1990s, many organizations developed performance measures for diabetes care, with the result that providers were often required to collect and report many different, sometimes conflicting, measures, depending on their care delivery system. It was recognized that a national consensus on measures could streamline this process and provide a method for assessing care within and across health care settings while offering a meaningful mechanism for quality improvement (QI). In this review, we describe the Diabetes Quality Improvement Project (DQIP), which has developed and implemented a comprehensive set of national measures for evaluation and QI.

    HISTORY AND ORGANIZATIONAL STRUCTURE

    In 1995, the Health Care Financing Administration (now known as the Centers for Medicare & Medicaid Services [CMS]) and the National Institute of Diabetes and Digestive and Kidney Diseases convened ∼20 public and private health organizations with an interest in performance measurement for diabetes care (Fig. 1). The goal was to identify a strategy for developing a comprehensive measurement set for diabetes care that would be broadly accepted and widely implemented.

    Financial and logistical support were needed to create the DQIP, as was an organizational structure with broad representation from the private and public sectors as well as professional organizations, clinical and public health experts, diabetes researchers, and experts in methodology. In 1997, DQIP was founded jointly by CMS, the National Committee for Quality Assurance (NCQA), and the American Diabetes Association (ADA). CMS, which has long been committed to quality of care issues, provided financial support, and NCQA provided logistical support. An Operations Group composed of public- and private-sector organizations and agencies was created to provide general direction, and a Technical Expert Panel (TEP) was formed to develop quality performance measures. The organizational structure was designed to enhance nationwide adoption of the DQIP performance measures by giving the project scientific credibility, links to the medical community, and the commitment of organizations that pay for diabetes care.

    The Operations Group included the ADA, the Foundation for Accountability (FACCT), CMS, NCQA, the American Academy of Family Physicians (AAFP), the American College of Physicians–American Society of Internal Medicine (ACP-ASIM), the Centers for Disease Control and Prevention (CDC), and the Veterans Health Administration (VHA). Thus, the members included the majority of diabetes care providers in the U.S., the major health care organizations in which diabetes care is delivered, and the major payers for diabetes care, along with leading researchers in clinical diabetes and health services. Several of these organizations had already worked extensively on improving diabetes care, and many had developed specific quality measures. Each organization realized that joining DQIP potentially meant losing its autonomy in performance assessment, but many were swayed by the prospect of a uniform set of performance measures and by the potential confusion and rejection of QI efforts that could result from not working together.

    To ensure that the measures represented diabetes management comprehensively, the TEP included endocrinologists, internists, family physicians, dieticians, educators, nurses, epidemiologists, and experts in performance measurement. To make sure that the practical issues of implementing the measures in practice settings were considered, individuals with operational responsibilities in large group practices, managed care plans, and federal health care agencies were included as well.

    MEASURE DEVELOPMENT PROCESS

    The development of quality performance measures, not new care guidelines, was the goal of the DQIP. Although the distinction between performance measures and guidelines may be viewed as minor, there are important differences (22). Performance measures retrospectively assess the level of care delivered across an entire population with diabetes, whereas guidelines recommend the desired level of care for an individual patient. Required criteria for a performance measure include 1) a firm evidence base; 2) feasibility, reliability, and suitability for uniform application across health care systems; and 3) variability across populations so that improvement can be monitored (Fig. 2). By contrast, care guidelines define a high or even ideal standard of care individualized to each patient; they are based on evidence but also incorporate consensus that may be less rigorous than that needed for developing performance measures. Guidelines can direct care for subgroups of populations with diabetes but will not be applicable to all clinical situations, whereas performance measures are applied across entire populations.

    The TEP identified performance measures for key clinical care processes and outcomes to assess diabetes care in varied clinical settings (e.g., individual physician offices, group practices, and health plans) and for diverse diabetic subpopulations (e.g., elderly patients and minorities). The goal was to develop a set of accountability measures that was comprehensive yet parsimonious in order to minimize the burden of data collection. Candidate measures were identified by reviewing the literature, consulting with experts, and surveying organizations already using diabetes performance measures. Required criteria included credible evidence linking process measures to important clinical outcomes and modifiability of the clinical outcome measures (i.e., they could be improved) by the efforts and interventions of health care systems. Feasibility considerations included whether the measure could be collected accurately, reliably, and at a reasonable cost. Variability across health care settings ensured that there would be opportunity for improvement. Meeting these criteria established accountability measures that would be acceptable for public reporting and for comparing plans, practices, and providers.

    Another class of measures was identified. Referred to as QI measures, these met the evidence base applied to accountability measures but could not be uniformly measured across different health care settings. Thus, these measures are of value primarily for QI within a health care setting but not for external comparisons.

    After classification as either an accountability or a QI measure, each candidate measure underwent further review using large national studies conducted by CMS and the ADA, in which its feasibility was assessed and its variability, measure specifications, and inter-rater reliability were determined.

    The accountability and QI candidate measures were evaluated using evidence and information from previous studies. However, because these measures had not been tested in the aggregate, it was deemed necessary to reconfirm their previously documented variability across sites of care and to show that opportunities for improvement still existed. Thus, a new study was commissioned by CMS to evaluate the entire measure set before release. Nine geographically dispersed managed care plans with Medicare memberships of 1,000 to >30,000 participated. The plans varied significantly in their claims data systems (from less developed to well established), use of electronic records, and years of participation as Medicare managed care plans. Rates for these measures from 1999 data were consistent with findings from other major studies or surveys (21). Marked variability across plans was noted, as was an opportunity to improve care for nearly all the areas the measures assessed (Table 1). The study also found data collection to be feasible (average 20 min of record review per member). Inter-rater agreement among skilled record abstractors was high (97%). In addition, modifications of measurement specifications that improved inter-rater reliability were identified, thresholds for the measures were validated, and the elements for standardized paper and electronic data collection tools were identified.
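
The 97% inter-rater agreement reported above is a simple percent-agreement statistic. As an illustrative sketch (not the study's actual code), it can be computed from two abstractors' paired record readings like this:

```python
# Illustrative sketch: percent agreement between two chart abstractors.
# Each list holds one coded value per record (e.g., 1 = element found, 0 = not found).
def percent_agreement(abstractor_a, abstractor_b):
    """Share of records on which two abstractors recorded the same value."""
    if len(abstractor_a) != len(abstractor_b):
        raise ValueError("abstractors must review the same records")
    matches = sum(a == b for a, b in zip(abstractor_a, abstractor_b))
    return matches / len(abstractor_a)

# Example: the abstractors disagree on 1 of 10 records.
a = [1, 1, 0, 1, 1, 0, 1, 1, 1, 1]
b = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
print(percent_agreement(a, b))  # 0.9
```

Note that raw percent agreement does not correct for chance agreement; chance-corrected statistics such as Cohen's kappa are often reported alongside it.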

    The final step in the measurement development process was an extensive peer review. The measures were sent to ∼200 clinicians, researchers, or organizations with an interest in diabetes performance measurement; all comments were considered, and revisions were incorporated. In August 1998, the DQIP 1.0 accountability and QI measure set was released (Table 2). In 2000, a set of patient survey–derived measures (smoking cessation counseling, self-management, nutritional education, satisfaction, interpersonal skills of the health care team, and functional status) was added after being extensively field tested (Table 2).

    EXAMPLE OF DEVELOPING A MEASURE: HbA1c FOR GLYCEMIC CONTROL

    Reviewing the development of the HbA1c accountability measure for glycemic control illustrates the measure development process. Each of the criteria cited above (Fig. 2) is carefully considered in this example.

    Evidence

    The link between sustained hyperglycemia and the risk of microvascular complications has been well established by several studies (8,9,23,24,25,26). In these studies, HbA1c was used to monitor glycemic control, was correlated with complications, and was used to stratify patients into risk categories for microvascular complications. Thus, HbA1c has become the standard assay for managing and monitoring glycemic control.

    Multiple considerations were included in the decision to use HbA1c testing and levels as performance measures. The ADA’s clinical guidelines, for example, recommend that the HbA1c goals for any patient must “be individualized in consultation between patient and primary health care provider” (27). In addition, clinical trials have shown that there is no HbA1c threshold below which further lowering ceases to confer benefit (26). Modeling studies, however, have shown that factors such as advanced age, the presence of diabetic complications, and other comorbidities (e.g., hypertension) can diminish the benefits of aggressive glycemic control (28,29). Another consideration was the recognition that the laboratory measurement of HbA1c is not nationally standardized.

    Considering the difficulties associated with collecting data about the patient factors that influence HbA1c and the lack of laboratory standardization, it was deemed inadvisable to choose a normal or near-normal HbA1c level as an accountability measure for all patient populations across all health care settings. Thus, the TEP decided to develop an accountability measure for “poor control” that was independent of these obstacles and chose a value of ≥9.5%. At this high level, symptoms of hyperglycemia are likely to occur that may decrease the quality of life for any patient, regardless of the long-term risk of developing microvascular complications. Individuals with no HbA1c measurement within the reporting period were classified as being in poor control (i.e., having a value ≥9.5%). This decision penalizes plans with low rates of HbA1c testing and reduces the chance of spurious performance assessment when only patients with good control are tested. Finally, because the evidence for benefit from glucose control is much less compelling in patients aged ≥75 years, 74 years was set as the upper age limit for the measure (29).
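
The rule described above (HbA1c ≥9.5% is poor control, an untested patient counts as poor control, and patients over age 74 are excluded) can be sketched as a rate calculation. This is an illustrative sketch, not the DQIP specification itself; the record fields (`age`, `hba1c`) are hypothetical names for illustration:

```python
# Hedged sketch of the "poor glycemic control" accountability rate as the
# text describes it; field names are assumptions, not DQIP data elements.
def poor_control_rate(patients):
    """Fraction of eligible patients (age <= 74) whose most recent HbA1c in
    the reporting period is >= 9.5%, counting patients with no HbA1c result
    as being in poor control."""
    eligible = [p for p in patients if p["age"] <= 74]
    if not eligible:
        return 0.0
    poor = sum(1 for p in eligible
               if p.get("hba1c") is None or p["hba1c"] >= 9.5)
    return poor / len(eligible)

cohort = [
    {"age": 60, "hba1c": 7.2},   # controlled
    {"age": 72, "hba1c": 10.1},  # poor control
    {"age": 65, "hba1c": None},  # untested -> counted as poor control
    {"age": 80, "hba1c": 11.0},  # excluded: above the age limit
]
print(round(poor_control_rate(cohort), 3))  # 0.667 (2 of 3 eligible)
```

Treating a missing test as poor control ties the testing and control measures together, so a plan cannot improve its reported rate simply by testing fewer patients.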

    As a QI measure (for use within a setting but not for comparison across settings), categories from <7.0% to >10% were recommended. Because the average severity of disease at each site is unlikely to change over the short term and because the laboratory method for determination of HbA1c is also likely to remain unchanged within the site over the short term, the distribution of HbA1c can provide useful information for within-site QI. However, comparison of diverse populations across settings should be made with caution.
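
A within-site HbA1c distribution of this kind can be sketched as a simple binning step. The text specifies only the endpoints (<7.0% to >10%); the intermediate cut points below are assumptions for illustration, not the DQIP category boundaries:

```python
# Illustrative sketch of binning HbA1c values into QI distribution
# categories; only the <7.0% and >10% endpoints come from the text,
# and the intermediate cut points are assumed for illustration.
from collections import Counter

def hba1c_category(value):
    """Assign a single HbA1c value (%) to a distribution category."""
    if value < 7.0:
        return "<7.0"
    if value < 8.0:
        return "7.0-7.9"
    if value < 9.0:
        return "8.0-8.9"
    if value <= 10.0:
        return "9.0-10.0"
    return ">10.0"

def hba1c_distribution(values):
    """Count how many patients fall into each HbA1c category."""
    return Counter(hba1c_category(v) for v in values)

print(hba1c_distribution([6.5, 7.4, 8.2, 9.6, 10.5, 6.9]))
```

Tracking this distribution over time within one site shows whether patients are shifting toward better control, which a single threshold rate cannot reveal.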

    Feasibility and variability

    Several studies have documented the feasibility of collecting HbA1c from medical records or laboratory data, but the variability of the measure has not been firmly established. Thus, the measure of poor control (HbA1c ≥9.5%) was first tested in several data sets. Data from the ADA Provider Recognition Program (29 sites), from 300 office practices, and from 23 managed care plans participating in the CMS Ambulatory Care Quality Improvement Project showed major variability across sites and showed that >20% of patients had HbA1c values ≥9.5%. In addition, the CMS study found that up to 70% of patients in some offices or plans did not have an HbA1c test performed. Thus, considerable opportunity exists to increase the number of patients tested for HbA1c and to decrease the number of patients with documented poor control.

    DQIP AND NATIONAL POLICY

    The DQIP measure set is the first comprehensive performance measurement set for any single disease, diabetes or otherwise, to be widely adopted. The NCQA adopted six of the original eight measures for use in HEDIS (Health Plan Employer Data and Information Set) 2000 for reporting by managed care plans. Both the ADA Provider Recognition Program, the only national recognition program for providers of diabetes care, and the FACCT have adopted the entire set of DQIP measures. The American Medical Association Diabetes Measures Group adopted the DQIP measures for use in plans and is using a similar set of QI measures in its physician-level measure set. CMS has extensively implemented the DQIP measures. In 2000, the agency required reporting of these measures in all Medicare managed care plans, and their use for QI projects was required; in the CMS fee-for-service setting, three measures (eye examinations, HbA1c testing, and lipid testing) have been collected and reported in all 50 states (20). Finally, CMS’s Peer Review Organizations are using these measures in working with physicians and group practices to improve quality of diabetes care.

    In the fall of 1999, the Federal Quality Interagency Council asked for a federal agency–wide commitment to the DQIP measures, and thus all federal agencies have adopted the measures for their diabetes efforts.

    CONCLUSION

    The factors that have contributed to the success of the DQIP include a broad appreciation of the opportunities to reduce the diabetes burden in the U.S. and a conviction that assessment of the quality of care would lead to improvements; a concern that multiple independent efforts to develop performance measures introduce burden without added benefit; a need for standardized, uniform performance measures capable of accurately and reliably assessing the quality of care within and across health care systems; and the partnerships forged early in the project that helped the DQIP gain valuable input from both the public and private sectors.

    The collective power of the DQIP collaboration is immense. With the nation’s largest purchaser of health care (CMS), the nation’s largest direct provider of health care (VHA), the two most important developers and users of performance measurement in ambulatory care (NCQA and FACCT), an organization with great influence in the diabetes community (ADA), and the two medical specialty organizations that represent physicians who care for >80% of the diabetic patients in the country (ACP-ASIM and AAFP) working together, the result will be better translation of science into clinical practice.

    This key public and private partnership brought together groups ranging from consumers to purchasers of health care. The various constituencies and particular perspectives of these groups contributed both substantively and politically to the success of the endeavor. In addition, the collaboration with organizations or agencies prepared to implement the measures in mandatory programs or as part of voluntary accreditation was no doubt important.

    The direction of current and future DQIP activities and efforts is being shaped by the experiences with DQIP 1.0. Several issues warrant discussion. First, widespread confusion remains about the distinction between performance measures and clinical guidelines. Given the comments of some peer reviewers that the measures were inadequate guidelines or standards of care, further clarification is needed. This common misperception may have impeded acceptance of the measures.

    Second, in a national effort such as the DQIP, inclusiveness in building partnerships is critical. Although the Operations Group and the TEP had broad representation, some interested organizations with a commitment to diabetes were not initially represented. Subsequently, the Operations Group’s revisions and updates of the measures include participation from a 30-member Leadership Group. The American Association of Clinical Endocrinologists, an influential and effective advocate for quality care in diabetes, was an important addition to the Operations Group for subsequent DQIP revisions.

    A third important issue is the finding that the commitment of major health care purchasers and users of performance measures almost ensured rapid implementation of the DQIP measures. These measures were developed over a 1-year period and were broadly implemented across multiple health care settings within 1 year of their release. Thus, the intent of the DQIP to rapidly translate research into practice by developing a national performance measure set was certainly met. The success of DQIP has stimulated additional broad-based public/private sector efforts toward measure development. The SCRIPT project (Study of Clinically Relevant Indicators of Pharmacologic Therapy), a collaborative effort of 50 organizations to develop medication-related performance measures, is one example of an initiative based on the DQIP model. At a minimum, the wide implementation of the DQIP measures has intensified interest in diabetes and diabetes care nationally.

    Finally, DQIP measures provide readily available methods to evaluate interventions to improve diabetes care. For example, the availability of the DQIP standardized measure set has facilitated a case study analysis by CMS of Medicare plans reporting DQIP measures in 2000 that attempts to identify and disseminate best practices in care.

    The processes and results achieved by the DQIP have served and will continue to serve as a template for performance measurement for organizations seeking to improve care for diseases or conditions other than diabetes. For diabetes morbidity specifically, the ultimate effect of DQIP 1.0 will become apparent as data are publicly reported and used for QI. Use of the DQIP measure set will no doubt play a critical role in translating today’s clinical interventions into practice that will improve quality of life and clinical outcomes for diabetic patients and help us to gain an edge on the diabetes epidemic.

    Figure 1—

    Timeline of the DQIP. OG, Operations Group.

    Figure 2—

    Relationship between evidence, feasibility, and variability for accountability measures included in the DQIP 1.0 measure set.

    Table 1—

    Evaluation of the DQIP measures in nine managed care plans with Medicare participation

    Table 2—

    DQIP 1.0 measure set

    Acknowledgments

    Primary funding was provided by the CMS, with contributions by the ADA, CDC, and VHA. These agencies provided resources but did not direct the process or influence the outcome. Each agency had one voting member on the project’s steering committee.

    The authors thank Sherrie Kaplan, PhD, and Richard Kahn, PhD, for review of previous versions of this manuscript and Herman Jenich and Elisabeth Everitt of New York Peer Review Organization for collection and analysis of the data presented in Table 1. The data were collected under contract with the CMS.

    Footnotes

    • Address correspondence and reprint requests to Dr. Barbara Fleming, CMS, 7500 Security Blvd., S3-02-01, Baltimore, MD 21244. E-mail: bfleming@cms.hhs.gov.

      Received for publication 6 June 2001 and accepted in revised form 12 June 2001.

      A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.

    References
