Evaluation of a system for automatic detection of diabetic retinopathy from color fundus photographs in a large population of patients with diabetes.

  1. Michael D. Abràmoff, MD, PhD (michael-abramoff@uiowa.edu) [1,2,3]
  2. Meindert Niemeijer, PhD [4,3]
  3. Maria S.A. Suttorp-Schulten, MD, PhD [5]
  4. Max A. Viergever, PhD [4]
  5. Stephen R. Russell, MD [1,2]
  6. Bram van Ginneken, PhD [4]

  1. Retina Service, Department of Ophthalmology and Visual Sciences, University of Iowa Hospitals and Clinics, 200 Hawkins Drive, Iowa City, IA 52242, USA
  2. Department of Veterans Affairs, Iowa City VA Medical Center, 601 Highway 6 West, Iowa City, IA 55242, USA
  3. Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA 52242, USA
  4. Image Sciences Institute, University Medical Center Utrecht, Heidelberglaan 100, 3584CX, Utrecht, The Netherlands
  5. Ophthalmology Service, OLVG, Oosterpark 9, 1091 AC Amsterdam, The Netherlands

    Abstract

    Objective: To evaluate the performance of a system for automated detection of diabetic retinopathy in digital retinal photographs, built from published algorithms, on a large, representative screening population.

    Research design and methods: Retrospective analysis of 10,000 consecutive patient visits (exams), each comprising four retinal photographs (two left and two right), from 5,692 unique patients in the EyeCheck diabetic retinopathy screening project, imaged with three types of cameras at ten centers. Inclusion criteria: no previous diagnosis of diabetic retinopathy, no previous visit to an ophthalmologist for a dilated eye exam, and both eyes photographed. One of three retinal specialists evaluated each exam as unacceptable quality, no referable retinopathy, or referable retinopathy. The system selected exams with sufficient image quality and, on those, determined the presence or absence of referable retinopathy. Outcome measures: area under the ROC curve (AROC), ‘number needed to miss one case’ (NNM), and type of false negative.

    Results: Total AROC was 0.84; NNM was 80 at a sensitivity of 0.84 and a specificity of 0.64. At this operating point, 7689/10000 exams had sufficient image quality, 4648/7689 (60%) were true negatives, 59/7689 (0.8%) false negatives, 319/7689 (4%) true positives, and 2581/7689 (33%) false positives. Of the false negatives, 27% contained large hemorrhages and/or neovascularizations.
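The reported sensitivity and specificity can be checked directly from the four counts above. The following is a minimal sketch, not the authors' analysis code; the function and variable names are hypothetical. The abstract does not define NNM, so the formula used here (exams the system called negative per false negative) is an assumption, chosen only because it reproduces the reported value of about 80.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of referable exams that were flagged
    specificity = tn / (tn + fp)  # fraction of non-referable exams that were passed
    return sensitivity, specificity


# Counts as reported in the Results paragraph.
tp, fn, tn, fp = 319, 59, 4648, 2581

sens, spec = screening_metrics(tp, fn, tn, fp)

# ASSUMPTION: NNM read as "system-negative exams per missed case".
# The abstract gives no definition; this is one reading that matches ~80.
nnm = (tn + fn) / fn

print(f"sensitivity = {sens:.2f}")
print(f"specificity = {spec:.2f}")
print(f"NNM (assumed definition) = {nnm:.0f}")
```

Under these assumptions the sketch gives a sensitivity of about 0.84 and a specificity of about 0.64, in line with the figures stated in the abstract.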

    Conclusion: Automated detection of diabetic retinopathy using published algorithms cannot yet be recommended for clinical practice. However, performance is such that evaluation on validated, publicly available datasets should be pursued. If the algorithms can be improved, such a system may in the future lead to improved prevention of blindness and visual loss in patients with diabetes.

    Footnotes

      • Received September 3, 2007.
      • Accepted November 8, 2007.