© 2002 by the American Diabetes Association, Inc.
Inside Guidelines: Comparative analysis of recommendations and evidence in diabetes guidelines from 13 countries
1 Centre for Quality of Care Research, University Medical Centre Nijmegen, Nijmegen, the Netherlands
OBJECTIVE: To compare guidelines on diabetes from different countries in order to examine whether differences in recommendations could be explained by use of different research evidence.
RESEARCH DESIGN AND METHODS: We analyzed 15 clinical guidelines on type 2 diabetes from 13 countries using qualitative methods to compare the recommendations and bibliometric methods to measure the extent of overlap in citations used by different guidelines. A further qualitative analysis of recommendations and cited evidence for two specific issues in diabetes care explored the apparent discrepancy between recommendations and evidence.
RESULTS: The recommendations made in the guidelines were in agreement about the general management of type 2 diabetes, with some important differences in treatment details. There was little overlap in evidence cited by the guidelines, with 18% (185/1,033) of citations shared with any other guideline, and only 10 studies (1%) appearing in six or more guidelines. The measurable overlap in evidence between guidelines increases if multiple publications from the same study and the use of reviews are taken into account. Research originating from the U.S. predominated (40% of citations); however, nearly all (11/12) guidelines were significantly more likely to cite evidence originating from their own countries.
CONCLUSIONS: Despite the variation in cited evidence and preferential citation of evidence from a guideline's country of origin, we found a high degree of international consensus in recommendations made for the clinical care of type 2 diabetes. The influence of professional bodies such as the American Diabetes Association may be an important factor in explaining international consensus. Globalization of recommended management of diabetes is not a simple consequence of the globalization of research evidence.
Abbreviations: ADA, American Diabetes Association; AGREE, Appraisal of Guidelines for Research and Evaluation; DCCT, Diabetes Control and Complications Trial; UKPDS, U.K. Prospective Diabetes Study
Over the past 20 years, clinical guidelines have been developed to bridge the gap between research and practice (1). There has been a concerted effort to base clinical decisions on research evidence (2) and, particularly through the Cochrane collaboration, to make this evidence available globally (3). Guideline development groups aim to use the totality of relevant research evidence to formulate recommendations (4). Since bibliographic databases (for example Medline and Embase) are easily available, one might expect that this would lead to international consensus on the evidence chosen to underpin recommendations for clinical care and a consequent convergence of recommendations made in guidelines. Nevertheless, recommendations often differ in guidelines on the same topic, particularly when evidence for treatment decisions is weak. For example, Eisinger et al. (5) found substantial differences between recommendations from the U.S. and France about prophylactic mastectomy or oophorectomy in high-risk women. Differences were attributed to cultural variation in ideas about patient autonomy and involvement in health care, differing national views on esthetics of the breast, and fertility. Even where there is good trial evidence, recommendations vary. For instance, analysis of hypertension guidelines from New Zealand, U.S., Canada, U.K., and the World Health Organization showed wide variation in the criteria for blood pressure treatment decisions (6). Differences persisted between more recent editions of national hypertension guidelines, even with more systematic and transparent methods of guideline development (7). It is evident that there are disparities in recommendations in guidelines for a range of different clinical conditions. 
Investigators hypothesize that differences are due to insufficient evidence (6,8,9), differing interpretations of evidence (10), unsystematic guideline development methods (11,12), the influence of professional bodies (13), cultural factors such as differing expectations of apparent risks and benefits (5,6), socioeconomic factors, or characteristics of health care systems (14). In this study, we compared recommendations among a range of guidelines on the management of type 2 diabetes and analyzed to what extent the variation (or concordance) among recommendations was explained by the evidence cited in the guidelines.
Selection of guidelines
We applied the Institute of Medicine's definition of clinical guidelines: "systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances" (15). Systematic reviews and evidence reports that did not contain specific recommendations were not included in this study. Because of the large number of clinical issues related to diabetes, the selection of guidelines was limited to two areas: 1) ambulatory or outpatient care, excluding guidelines exclusively covering type 1 diabetes, complications of diabetes that need specialist care (retinopathy, diabetic foot, nephropathy, and neuropathy), and gestational diabetes; and 2) treatment of diabetes, excluding guidelines on prevention and diagnosis. The sample consisted of a total of 15 guidelines for the clinical care of type 2 diabetes (Table 1) representing the national guidelines of the Appraisal of Guidelines for Research and Evaluation (AGREE) collaboration. This international group of researchers has investigated variation between guidelines and guideline development models with the aim of advising the European Commission on guideline development, dissemination, and implementation. The east London guideline was chosen because there were no national English guidelines available. Two French guidelines were complementary and were analyzed as one guideline. The guidelines from Australia, New Zealand, Canada, and the U.S. were identified through a web-based search and consultation with colleagues. Four guidelines (CA, NL2, US1, and US2) were updated versions of earlier guidelines. Two guidelines (CA and DK) were funded by pharmaceutical companies, and the others were funded by national or regional government agencies (EN, FR, NZ, SC, and SP), state health care systems (AU and US2), national professional organizations (FI, IT, NL2, and US1), or hospitals (NL1 and SW).
Selection of comparable sections
Because the guidelines varied in their scope, we selected sections that covered the treatment and monitoring of hyperglycemia and cardiovascular risk. The guidelines were in seven different languages. Members of the study team translated those guidelines written in French or Dutch; relevant sections of guidelines in Finnish, Danish, and Spanish were translated by guideline developers in their respective countries. The Italian guideline was excluded from the analysis of recommendations because of its length and lack of structure.
Extraction and comparison of recommendations
Extraction and measurement of overlap of citations
Examination of link between recommendations and citations
Guidelines varied considerably in length (range 3–350 pages), format, and number of references (Table 2). Nine guidelines linked their recommendations to citations; four of these (FR, SC, CA, and US2) also used grading systems to appraise the evidence.
Guidelines varied in their coverage. For example, the Danish and Spanish guidelines allocated >10% of the text to detailed dietary recommendations, whereas the English and New Zealand guidelines only made a few general statements. Guidelines also varied in their scope; for example, the Scottish, Australian, and Dutch guidelines (NL1) did not cover drug treatment of hyperglycemia.
Comparison of recommendations
Differences between the recommendations were found in the following areas: 1) length of trial of diet and exercise before oral treatment ranged from 2 to 9 months (some guidelines recommended a longer period in obese than nonobese patients); 2) BMI used to define obesity ranged from 25 to 30 kg/m²; 3) widely varying indications were suggested for the use of
Comparison of linked citations
The largest proportion of lead authors of papers cited in the guidelines (40%) originated from the U.S. (Table 3). All guidelines, except the Australian, cited a significantly higher proportion of studies from authors of their own countries than the origin overall of citations in the database (P < 0.02). Citations in the English, Scottish, and New Zealand guidelines were predominantly from the U.K., and citations in all other guidelines, except the Dutch general practice guideline (NL2), were predominantly from the U.S.
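The own-country comparison above can be illustrated with a small calculation. The text reports P < 0.02 but does not name the statistical test, so the following is only a sketch under the assumption of a one-sided exact binomial test; the function name and all input numbers are hypothetical and are not taken from the study.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs for one guideline (illustration only, not study data):
n_cited = 50          # citations linked to recommendations in the guideline
n_own_country = 12    # of those, lead author from the guideline's own country
overall_share = 0.10  # that country's share of all citations in the database

# Probability of seeing at least this many own-country citations if the
# guideline cited countries in the same proportions as the whole database.
p_value = binomial_upper_tail(n_own_country, n_cited, overall_share)
print(f"own-country share: {n_own_country / n_cited:.0%}, one-sided P = {p_value:.4f}")
```

With these made-up inputs, the guideline's own-country share (24%) is well above the database share (10%), and the tail probability falls below the 0.02 threshold quoted in the text.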
Sixteen of the total 1,033 citations (2%) were meta-analyses, 89 (9%) were reviews or overviews (including 4 systematic reviews), and 55 (5%) were existing guidelines (including practical guides and clinical practice recommendations) or consensus statements. Twenty of these 160 secondary citations (13%) were ADA publications.
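The percentages in this paragraph can be re-derived from the counts it gives. The snippet below is a simple arithmetic check, assuming the reported figures are whole-number percentages rounded half up; the helper name `pct_half_up` is ours. Python's built-in `round` rounds halves to the nearest even number, so `round(12.5)` gives 12, which is why a separate helper is used.

```python
def pct_half_up(part: int, whole: int) -> int:
    """Whole-number percentage, rounding halves up (12.5 becomes 13)."""
    return int(100 * part / whole + 0.5)


total_citations = 1033
secondary = {
    "meta-analyses": 16,
    "reviews or overviews": 89,
    "guidelines or consensus statements": 55,
}
n_secondary = sum(secondary.values())  # 160 secondary citations in total

for label, count in secondary.items():
    print(f"{label}: {count}/{total_citations} = {pct_half_up(count, total_citations)}%")

ada_publications = 20
print(f"ADA share of secondary citations: {pct_half_up(ada_publications, n_secondary)}%")
```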
Examination of link between recommendations and citations (case studies)
Self-monitoring of blood glucose
This is the first study comparing both guideline recommendations and cited evidence across national guidelines. Our bibliometric analysis included >1,000 citations. We minimized selection and observer bias by prospective choice of inclusion criteria for recommendations and citations and independent extraction by two researchers. We found a high degree of international consensus on the clinical care of people with type 2 diabetes, despite differences in detailed recommendations. This was in contrast to what we expected, considering the range of influences on the guideline development process and the variation in organization of care and health care system among countries (23). Yet the citations linked to and presumably justifying the guideline recommendations were widely disparate. The influence of large pragmatic treatment trials (e.g. DCCT [17]) and UKPDS studies (19,24,25) was nevertheless visible in most of the guidelines and apparent even in guidelines without references. Little use was made of systematic reviews (for example Cochrane reviews), which is consistent with the findings of Silagy et al. (26). National guidelines were significantly more likely to cite research from investigators from the same country, explaining some of the variation in citations between guidelines. Others have found that local sources of evidence are overrepresented in guidelines (27) and that the results of trials conducted in the same country may be given more prominence (28). We used the case studies to generate hypotheses to explain the small degree of overlap in citations between guidelines. Recommendations for the use of metformin in obese patients drew on supportive trial and review evidence. The different studies linked to these concordant recommendations often had similar conclusions. We also observed a consensus in recommendations for the use of self-monitoring of blood glucose, despite citation of evidence that did not support this position. 
The overlap in evidence would have been larger if we had aggregated citations from the same study (e.g., UKPDS) and if we had included the primary citations made within reviews and meta-analyses. Even taking this into account, the evidence cited in type 2 diabetes guidelines largely does not overlap. Therefore, we hypothesize that there are other potential influences on guideline developers. For example, the recommendations of the ADA strongly influenced the other guidelines on diabetes, with the exception of the English and Scottish. Similarly, Littlejohns et al. (13) found that professional opinion expressed in a consensus statement from the Royal College of General Practitioners and the Royal College of Physicians influenced the recommendations made in nine U.K. guidelines for the treatment of depression in primary care. Guideline development is a social as well as technical process that is affected by access to and choice of research evidence and decisions about the interpretation of evidence and formulation of recommendations (29–31). Our study suggests that research evidence is not necessarily the most powerful influence on the content of recommendations in the current generation of guidelines on the management of type 2 diabetes. Guideline developers might first aim to achieve consensus about recommendations and then switch to the evidence as a rhetorical device to support decisions post hoc. Thus, the relationship between choice and interpretation of research evidence and the formulation of guideline recommendations is neither necessarily linear nor unidirectional. However, we are not suggesting a complete epistemological divide between evidence as represented by research papers and guideline recommendations. As Greenhalgh and McCormack (32) have argued with regard to the UKPDS study, the interpretation of results within primary research studies is also debatable, influenced by prior beliefs, and open to challenge.
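The effect of aggregating publications from the same study can be sketched with a small tally. This is a minimal illustration, not the authors' procedure: the guideline labels, citation identifiers, and the `study_key` grouping rule are all hypothetical, and a citation is counted as shared when it appears in at least two guidelines.

```python
from collections import Counter


def shared_citation_stats(guideline_citations, key=lambda c: c):
    """Return (unique items, items cited by >= 2 guidelines, shared fraction)."""
    counts = Counter()
    for cites in guideline_citations.values():
        # A set ensures an item repeated within one guideline counts once.
        counts.update({key(c) for c in cites})
    n_unique = len(counts)
    n_shared = sum(1 for n in counts.values() if n >= 2)
    return n_unique, n_shared, (n_shared / n_unique if n_unique else 0.0)


def study_key(citation: str) -> str:
    """Hypothetical rule: treat every UKPDS paper as one study."""
    return "UKPDS" if citation.startswith("UKPDS") else citation


# Toy data (invented for illustration, not from the study):
toy = {
    "G1": ["UKPDS-33", "UKPDS-34", "DCCT"],
    "G2": ["UKPDS-34", "local-trial-A"],
    "G3": ["UKPDS-38", "DCCT", "local-trial-B"],
}

per_paper = shared_citation_stats(toy)
per_study = shared_citation_stats(toy, key=study_key)
print("per publication:", per_paper)
print("per study:      ", per_study)
```

In this toy example, counting individual publications gives 2 shared items out of 6, whereas grouping the UKPDS papers gives 2 shared items out of 4, so the measured overlap rises once publications from the same study are aggregated.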
There are several sources of imprecision in our analysis. First, the guidelines were partly selected by researchers participating in the AGREE Collaboration. Therefore, the sample might be biased toward guidelines developed with more explicit and robust methods, such as systematic searching and the use of evidence grading systems. Nevertheless, the extent and format of the guidelines differed widely. Six guidelines did not link their recommendations to evidence, which complicated the data extraction. Second, we did not record the extent of initial agreement on choice of recommendations, judgement on concordance or discordance of recommendations, or linkage between citations and recommendations. However, there were few disagreements and these were easily resolved by panel consensus. Third, some of the variation in the content of the guidelines might be explained by the different publication dates of the guidelines and the rapid shift of information during the period studied. For instance, nine of the guidelines included in our study could not consider the UKPDS data that were published in 1998. In our analysis of the citations, we dealt with this confounding factor by correcting for publication dates of the guidelines and the cited evidence. Finally, analysis of shared references is a blunt instrument for exploring the relationship between guideline recommendations and evidence. High-quality and large trials should be given more weight in the analysis. That is why we included two case studies exploring in more detail the relationship between recommendations and evidence in diabetes guidelines. Other clinical issues will need this kind of analysis to test the generalizability of our findings. The process of formulating guideline recommendations and the social determinants of guidelines require further investigation. 
Decisions about the choice of evidence, and the influence of international conferences, pharmaceutical companies, and opinion-forming bodies such as the ADA on national guidelines, are not well understood. The growing availability of high-quality systematic reviews may support more uniformity in the use of research evidence in guidelines (33). Nevertheless, guidelines go beyond simple reviews of available evidence and necessarily reflect value judgements in considering all the issues relevant to clinical decision making. Transparency by guideline developers about how their judgements have been made would allow clinicians to evaluate the applicability of guideline recommendations to their own health care context and to individual patients.
The AGREE Collaboration
The following individuals participated in the AGREE Collaboration: José Asua, MD, PhD, Basque Office for Health Technology Assessment, Spain; Anne Bataillard, MD, Fédération Nationale des Centers de Lutte Contre le Cancer, Paris, France; Melissa Brouwers, PhD, McMaster University and Cancer Care Ontario, Hamilton, ON, Canada; George Browman, MD, Hamilton Regional Cancer Center, Hamilton, Canada; Jako Burgers, MD, Center for Quality of Care Research, University Medical Center Nijmegen, the Netherlands; Bernard Burnand, MD, MPH, Institut Universitaire de Médecine Sociale et Préventive, Lausanne, Switzerland; Françoise Cluzeau, MSc, PhD, St George's Hospital Medical School, London, U.K.; Isabelle Durand-Zaleski, PhD, Hôpital Henri Mondor, Cedez, France; Pierre Durieux, MD, Hôpital Européen Georges Pompidou, Paris, France; Cindy Farquhar, MD, PhD, New Zealand Guidelines Group, Auckland, New Zealand; Gene Feder, MD, FRCG, Barts and The London, Queen Mary's School of Medicine and Dentistry, University of London, U.K.; Béatrice Fervers, MD, Fédération Nationale des Centers de Lutte Contre le Cancer, Paris, France; Roberto Grilli, MD, Agenzia Sanitaria Regionale, Bologna, Italy; Jeremy Grimshaw, MB, PhD, Ottawa Health Services Research Institute, Ottawa, Canada; Richard Grol, PhD, Center for Quality of Care Research, University Medical Center Nijmegen, the Netherlands; Steven Hanna, PhD, McMaster University, Hamilton, ON, Canada; Pieter ten Have, MD, Dutch Institute for Healthcare Improvement CBO, Utrecht, the Netherlands; Rod Jackson, PhD, Effective Practice Institute, University of Auckland, New Zealand; Albert Jovell, MD, PhD, Fundacio Biblioteca Josep Laporte, Barcelona, Spain; Niek Klazinga, MD, PhD, Academic Medical Center, University of Amsterdam, the Netherlands; Finn Kristensen, MD, PhD, Danish Institute for Health Technology Assessment, Copenhagen, Denmark; Peter Littlejohns, MBBS, MD, National Institute for Clinical Excellence, London, U.K.; Pia Bruun Madsen, Danish Institute for Health Technology Assessment, Copenhagen, Denmark; Marjukka Mäkelä, MD, PhD, MSc, Finnish Office for Health Care Technology Assessment, Helsinki, Finland; Juliet Miller, MA, MBA, Scottish Intercollegiate Guidelines Network (SIGN), Edinburgh, U.K.; Günter Ollenschläger, MD, PhD, Agency for Quality in Medicine, Cologne, Germany; Camilla Palmhøj-Nielsen, Danish Institute for Health Technology Assessment, Copenhagen, Denmark; Loes Pijnenborg, MD, PhD, Dutch College of General Practitioners, Utrecht, the Netherlands; Safia Qureshi, PhD, Scottish Intercollegiate Guidelines Network (SIGN), Edinburgh, U.K.; Rosa Rico-Iturrioz, MD, MSc, Basque Office for Health Technology Assessment, Spain; Kitty Rosenbrand, MD, Dutch Institute for Healthcare Improvement CBO, Utrecht, the Netherlands; Jean Slutsky, Agency for Healthcare Research and Quality, Rockville, MD; John-Paul Vader, MD, MPH, Institut Universitaire de Médecine Sociale et Préventive, Lausanne, Switzerland; and Joost Zaat, MD, PhD, Center for Quality of Care Research, University Medical Center Nijmegen, the Netherlands.
The research was funded by a grant from the EU BIOMED2 Programme (BMH4-98-3669). We thank Jean Ramsay, Rachel Kerr, and Emily Hallgarten for entering data into the reference manager database, Sudip Nandy for piloting the study methodology, and Rob Dijkstra for contributing to the data collection of the Dutch recommendations.
Address correspondence and reprint requests to Jako S. Burgers, MD, Centre for Quality of Care Research, University Medical Centre Nijmegen, PO Box 9101, 6500 HB Nijmegen, The Netherlands. E-mail: burgersj{at}knmg.nl. Received for publication 21 April 2002 and accepted in revised form 5 August 2002. The AGREE Collaboration coordinating center is the Department of Public Health Sciences, St. George's Hospital Medical School, University of London, London, U.K. (see APPENDIX). Additional information for the article can be found in an online appendix at http://care.diabetesjournals.org. A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.