Breast Imaging
1 From the Department of Radiology, S092, Stanford University Medical Center, 300 Pasteur Dr, Stanford, CA 94305-5105 (R.L.B., D.M.I.); and Radiology Residency Program, Loma Linda University Medical Center, Loma Linda, Calif (P.B.). Received May 14, 2004; revision requested August 3; revision received September 22; accepted October 22. Address correspondence to R.L.B. (e-mail: rbirdwell{at}partners.org).
| ABSTRACT |
|---|
MATERIALS AND METHODS: Institutional review board approval was granted, and informed consent was waived. During a 19-month period, 8682 women (median age, 54 years; range, 33–95 years) underwent screening mammography. Each mammogram was interpreted by one of seven radiologists, followed by immediate re-evaluation of the mammogram with CAD information. Each recalled case was classified as follows: radiologist perceived the finding and CAD marked it, radiologist perceived the finding and CAD did not mark it, or CAD prompted the radiologist to perceive the finding and recall the patient. Lesion type was also recorded. Recalled patients were tracked to determine the effect of CAD on recall and biopsy recommendation rates, positive predictive value (PPV) of biopsy, and cancer detection rate. A 95% confidence interval was calculated for cancer detection rate. Pathologic examination was performed for all cancers.
RESULTS: Of 8682 patients, 863 (9.9%) with 960 findings were recalled for further work-up (Breast Imaging Reporting and Data System category 0). After further diagnostic imaging, it was recommended that biopsy or aspiration be performed for 181 of 960 findings (19%); 165 interventions were confirmed to have been performed. Twenty-nine cancers were found in this group, with a PPV for biopsy of 18% (29 of 165 findings) and a cancer detection rate of 3.3 per 1000 screening mammograms (29 of 8682 patients). CAD-prompted recalls contributed 8% (73 of 960 findings) of total recalled findings and 7% (two of 29 lesions) of cancers detected. Of 29 cancers, 17 (59%) manifested as masses and 12 (41%) as microcalcifications. Ten (34%) cancers were ductal carcinoma in situ, and the remaining cancers had an invasive component. Both cancers found with CAD manifested as masses, and both were invasive ductal carcinoma.
CONCLUSION: Prospective clinical use of CAD in a university hospital setting resulted in a 7.4% increase (from 27 to 29) in cancers detected. Both cancers were nonpalpable masses.
© RSNA, 2005
| INTRODUCTION |
|---|
If we look only at studies of missed cancers that were later detected at screening (eliminating those studies including interval cancers), most (57%–75%) of these missed lesions have some finding visible on mammograms in retrospect. Of these visible findings, however, only 27%–36% are interpreted as "actionable" (either warranting a recall of the patient or worrisome enough to consider biopsy recommendation) when evaluated in a blinded fashion (3,4). Experience has shown that prospective double reading of screening mammograms increases the cancer detection rate by 4% to 15% (5). Error rates have been shown to decrease with multiple readings, whether by humans (5–8) or by humans aided by machines (9,10).
Computer-aided detection (CAD), when used as a "spell check" type of system, has been shown to mark missed mammographic lesions later shown to be cancer in both retrospective and prospective studies. The goal of our study, therefore, was to prospectively assess the effect of CAD on the outcome of screening mammogram interpretation in an academic medical center to determine whether this outcome differs from that previously reported for community practice experience (10).
| MATERIALS AND METHODS |
|---|
The image interpretation protocol included preliminary evaluation of approximately half of the mammograms by a 2nd-, 3rd-, or 4th-year radiology resident and final interpretation by one of seven attending radiologists (two who specialized in breast imaging [R.L.B., D.M.I.] and five with special interest in breast imaging) who met the requirements of the Mammography Quality Standards Act of 1992. The years of experience for all breast image readers ranged from 10 to 30 years. The analog images were interpreted in a standard fashion, including the use of clinical history, a magnifying glass, and any available images from previous examinations. After the attending radiologist made the final assessment, the CAD marks were displayed. At this point, the radiologist reviewed the areas marked by the CAD system and assigned final Breast Imaging Reporting and Data System (BI-RADS) (11) categories for the cases.
For the purposes of this study, the detection methods for those mammograms assessed as BI-RADS category 0 (recall cases) were classified as follows: (a) the radiologist initially detected a finding and this area was marked by the CAD system on at least one of the two mammographic views (radiologist and CAD group); (b) the radiologist detected a finding and the area was not marked by the CAD system (radiologist-only group); and (c) the radiologist initially assessed the mammogram as negative but was prompted by the CAD marks to look again at an area (or areas) and then judged the finding as warranting a recall (CAD-only group). Each radiologist categorized the recall cases at interpretation into one of these three groups and recorded each finding separately into a logbook that allowed for more than one lesion per patient.
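The three-way classification described above amounts to a simple decision rule. The following is a minimal sketch in Python; the names `DetectionGroup` and `classify_recall` are hypothetical and serve only as illustration. They are not part of the logbook procedure used in the study.

```python
from enum import Enum


class DetectionGroup(Enum):
    RADIOLOGIST_AND_CAD = "radiologist and CAD"
    RADIOLOGIST_ONLY = "radiologist only"
    CAD_ONLY = "CAD only"


def classify_recall(radiologist_detected: bool, cad_marked: bool) -> DetectionGroup:
    """Assign a recalled (BI-RADS category 0) finding to one of the three groups.

    radiologist_detected: the finding was perceived before the CAD marks were shown.
    cad_marked: the CAD system marked the area on at least one of the two views.
    """
    if radiologist_detected and cad_marked:
        return DetectionGroup.RADIOLOGIST_AND_CAD
    if radiologist_detected:
        return DetectionGroup.RADIOLOGIST_ONLY
    if cad_marked:
        # Initially read as negative; the CAD mark prompted the recall.
        return DetectionGroup.CAD_ONLY
    # Neither the reader nor CAD flagged the area, so nothing was recalled.
    raise ValueError("a recalled finding must be perceived by the radiologist or marked by CAD")


print(classify_recall(radiologist_detected=False, cad_marked=True).value)
```

Because each finding is classified separately, a patient with more than one lesion can contribute findings to more than one group.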
Data and Statistical Analysis
The results of the recall cases were collected and tracked on the basis of the systems put in place because of requirements of the Mammography Quality Standards Act of 1992. Of the 8682 women examined with screening mammography and CAD, the following data were collected (R.L.B., P.B.): the number of women recalled from screening; the type of lesion prompting the recalls and how these lesions were detected (radiologist and CAD, radiologist only, CAD only); and the results of follow-up diagnostic imaging, including the number of examinations in which additional imaging resulted in a negative reading, the number of biopsies recommended, the number of women who complied with biopsy recommendations, and the number of women lost to follow-up. Also collected were the types of lesions recommended for biopsy and how these lesions were detected; biopsy methods used and information pertinent to the cancers diagnosed, including lesion type and method of detection; and cancer sizes, invasive versus in situ histologic characteristics, and grades. The callback rate was calculated and compared with that from a prior time period.
The predicted improvement in the detection rate was computed as the ratio of cancers detected with the CAD system alone over those detected without CAD. Because this ratio is necessarily greater than zero if even one cancer is detected with CAD alone, the effect is described in terms of a 95% confidence interval rather than a hypothesis test (12). All analyses were conducted by using SAS software (version 8.2; SAS Institute, Cary, NC). Our investigation was not a paired study in that we had no control group. Therefore, we performed no paired study analyses.
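The text does not state which interval formula was used in SAS. As an illustration only, the sketch below assumes a normal-approximation (Wald) interval on the log of the ratio, treating the two counts as independent Poisson counts. The function name `log_ratio_ci` is hypothetical. With two cancers detected with CAD alone and 27 detected without CAD, this assumption gives an interval of roughly 1.8% to 31%, which is close to the interval reported in this article.

```python
import math


def log_ratio_ci(cad_only: int, without_cad: int, z: float = 1.959964):
    """Return (ratio, lower, upper) for the ratio cad_only / without_cad.

    Assumption: independent Poisson counts, with the standard error of the
    log ratio approximated by sqrt(1/a + 1/b).
    """
    ratio = cad_only / without_cad
    se_log = math.sqrt(1.0 / cad_only + 1.0 / without_cad)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


ratio, lower, upper = log_ratio_ci(cad_only=2, without_cad=27)
print(f"increase: {ratio:.1%} (95% CI: {lower:.1%}, {upper:.1%})")
```

The width of the interval reflects the small number of CAD-only cancers: with a count of two, the term 1/2 dominates the standard error.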
| RESULTS |
|---|
Biopsies
Of the 181 lesions recommended for intervention (in 159 women), procedures were completed for 165 lesions (91%). Thirteen women with 16 findings (9%) did not comply with biopsy recommendations because they were lost to follow-up, they relocated, or they had insurance changes dictating care at a different facility.
Biopsy procedure methods and results included 45 fine-needle aspiration biopsies with either lesion resolution or benign findings and recommendation for 6- or 12-month follow-up, 54 core biopsies with benign findings and recommendation for 6-month imaging follow-up, and 57 core biopsies that resulted in recommendations for excisional biopsy (all cancer cases and high-risk lesions, including atypical hyperplasias and lobular carcinoma in situ, as well as lesions with discordance between imaging and histologic results). In nine cases, patients proceeded directly to excisional biopsy. The manner in which these biopsy-recommended findings were detected is outlined in Table 2.
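The counts in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the article (181 recommended interventions, 165 performed, 29 cancers); the dictionary keys are shorthand labels chosen for this illustration.

```python
recommended = 181  # lesions recommended for biopsy or aspiration
performed = 165    # interventions confirmed to have been performed
cancers = 29       # cancers found among the performed interventions

# Procedure counts as listed in the text (labels are illustrative shorthand).
procedures = {
    "fine-needle aspiration, resolved or benign": 45,
    "core biopsy, benign": 54,
    "core biopsy followed by excisional biopsy recommendation": 57,
    "direct excisional biopsy": 9,
}

total_procedures = sum(procedures.values())
completion_rate = performed / recommended
ppv_biopsy = cancers / performed

print(f"procedures tallied: {total_procedures}")
print(f"completion rate: {completion_rate:.0%}")
print(f"PPV of biopsy: {ppv_biopsy:.0%}")
```

The four procedure categories sum to the 165 completed interventions, and the PPV is computed over performed rather than recommended interventions.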
Of the 29 cancers, 21 (72%) were initially detected by the radiologist and also marked by the CAD system (radiologist and CAD group) and six (21%) were detected by the radiologist and not marked by the CAD system (radiologist-only group). In two cases (7%), a repeat evaluation by the radiologist of an area marked by the CAD system led to a recommendation for patient recall (CAD-only group). Lesion characterization resulted in a total of 17 masses (59%) and 12 calcifications (41%), with methods of detection as shown in Table 3 (Figs 1–3). The detection of two additional cancers with use of CAD represents a 7.4% (two of 27 cancers) increase in the cancer detection rate (95% confidence interval: 1.8%, 31.0%).
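The group shares and the relative increase follow directly from the three counts. The sketch below is illustrative; the variable names are assumptions, and the fractions are relative to the 29 cancers detected at screening, not to all cancers present in the screened population.

```python
cancers_by_group = {
    "radiologist and CAD": 21,
    "radiologist only": 6,
    "CAD only": 2,
}

total = sum(cancers_by_group.values())
shares = {group: n / total for group, n in cancers_by_group.items()}

# Cancers perceived by the radiologist before CAD marks were displayed.
radiologist_found = cancers_by_group["radiologist and CAD"] + cancers_by_group["radiologist only"]
# Cancers marked by the CAD system, whether or not the radiologist saw them first.
cad_marked = cancers_by_group["radiologist and CAD"] + cancers_by_group["CAD only"]

relative_increase = cancers_by_group["CAD only"] / radiologist_found

for group, share in shares.items():
    print(f"{group}: {share:.0%}")
print(f"found by radiologist: {radiologist_found} of {total}")
print(f"marked by CAD: {cad_marked} of {total}")
print(f"relative increase with CAD: {relative_increase:.1%}")
```

Note that the denominator for the relative increase is the 27 cancers found without CAD, not the full 29.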
| DISCUSSION |
|---|
CAD systems are designed to be used after independent (ie, unaided) case assessment by the interpreting radiologist who, after taking into account any CAD marks, assigns a final assessment to the case. One of the challenges inherent to the initial use of a CAD system by radiologists is to become comfortable with the number of predominantly false marks; false marks average 2.0 per four-view negative mammogram for the CAD system used in this study (13). It appeared to the interpreting radiologists that there was a learning curve when working with the CAD system marks, in that it took more time per screening mammogram interpretation early in our use of CAD than it does at the present time. Fortunately, with experience, the overwhelming majority of false CAD marks are readily dismissed.
At the time of preparing this article, there were only two published reports of prospective studies, to our knowledge, in which the effect of CAD on screening-detected breast cancer was reported. Both the study designs and results in these reports differ substantially. In 2001, Freer and Ulissey (10) reported on their prospective study, in which they used a study design similar to the one we followed, but their study was performed in a community practice setting and without the assistance of in-house staff. They interpreted 12 860 mammograms, recalled 986 women with 1026 findings, and diagnosed 49 cancers (10). Eight of the 49 cancers were detected with CAD only, which increased the cancer detection rate by 19.5% (eight additional cancers relative to the 41 detected without CAD). In 2004, Gur et al (14), in a historical comparison study, reported no statistically significant change in breast cancer detection rates when using CAD in a practice that was defined as academic rather than private. In that study, 24 radiologists interpreted screening mammograms with (n = 59 139) and without (n = 56 432) CAD and found similar breast cancer detection rates for both groups (3.55 vs 3.49 per 1000 screening examinations).
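The 19.5% figure for Freer and Ulissey is an increase relative to the cancers found without CAD (49 − 8 = 41), the same convention used for the 7.4% figure in the present study. A minimal sketch of that calculation follows; the function name `relative_increase` is hypothetical.

```python
def relative_increase(total_cancers: int, cad_only: int) -> float:
    """Increase in cancers detected, relative to those found without CAD."""
    detected_without_cad = total_cancers - cad_only
    return cad_only / detected_without_cad


freer_ulissey = relative_increase(total_cancers=49, cad_only=8)  # 8 of 41
this_study = relative_increase(total_cancers=29, cad_only=2)     # 2 of 27

print(f"Freer and Ulissey: {freer_ulissey:.1%}")
print(f"Present study: {this_study:.1%}")
```

Dividing by the total number of cancers instead (8 of 49, or 2 of 29) gives the share of cancers attributable to CAD, which is a different and smaller quantity.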
Although Gur et al (14) reported no statistically significant increase in cancer detection between those radiologists using CAD and those reading screening mammograms without CAD, Feig et al (15) noted differences when separating seven high-volume radiologists, who interpreted 71% (82 128 of 115 571) of the cases, from 17 low-volume radiologists. Although the high-volume readers recorded no statistically significant difference in the recall or cancer detection rates with CAD compared with without CAD (11.05 vs 11.62 and 3.49 vs 3.61, respectively), the cases in which images were interpreted by the low-volume readers did show differences in both recall and cancer detection rates with CAD and without CAD (12.00 vs 10.52 and 3.65 vs 3.05, respectively). The increase in the cancer detection rate for low-volume readers was approximately 19%; these increases in recall and cancer detection rates are similar to those reported by both Freer and Ulissey (10) and Cupples (16) with regard to community-based practices.
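The relative changes implied by the quoted rates can be worked out as follows. This sketch takes the value pairs exactly as quoted above, in the order (with CAD, without CAD); the units of the underlying rates are not restated here and do not affect the relative change. The names `rates` and `relative_change` are illustrative.

```python
def relative_change(with_cad: float, without_cad: float) -> float:
    """Fractional change in a rate when CAD is used, relative to no CAD."""
    return (with_cad - without_cad) / without_cad


# (with CAD, without CAD), as quoted in the text.
rates = {
    "high-volume": {"recall": (11.05, 11.62), "cancer detection": (3.49, 3.61)},
    "low-volume": {"recall": (12.00, 10.52), "cancer detection": (3.65, 3.05)},
}

changes = {
    readers: {measure: relative_change(*pair) for measure, pair in measures.items()}
    for readers, measures in rates.items()
}

for readers, measures in changes.items():
    for measure, change in measures.items():
        print(f"{readers} readers, {measure}: {change:+.1%}")
```

For the low-volume readers, this gives a relative increase in cancer detection just under 20%, in line with the approximate figure cited above, together with a relative increase in recall of about 14%.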
The only specific lesion type detection analysis in the study by Gur et al (14) was that of clustered microcalcifications alone, where they reported 1.44 per 1000 mammograms with CAD and 1.35 per 1000 mammograms without CAD. This finding differs from that of Freer and Ulissey (10), who saw an increase of 53% (80 of 150 cases) for those cases in which CAD results prompted the radiologist to detect the actionable calcifications. Differences in these two reported experiences raise the possibility that the Freer and Ulissey practice might have had a greater reliance on CAD and were more likely to recall those cases in which the CAD system marked calcifications. This may be due to the known performance of the CAD algorithm, which has consistently demonstrated higher sensitivity for calcifications (98%) than for masses (86%) (13).
Gur et al (14) suggest that our study and studies such as that by Freer and Ulissey (10) "may have been affected by the fact that our results of mammographic interpretations without and with CAD were reported on the same cases." Their concern centers on the fact that the interpreting radiologists may have had a lower level of vigilance "because they knew that computer-aided detection would be available to them for the final recommendation and that the initial interpretation did not constitute a formal clinical recommendation." However, we think it more likely that because the radiologists participating in the sequential study design used by us and by Freer and Ulissey knew that any "misses" on their part would be recorded, they were likely to be overly vigilant in their initial (pre-CAD) reading.
Our finding of a 7.4% increase in cancers detected with the use of CAD is similar to the 6.4% increase reported by Morton et al (17), but both are less than the 19.5% increase reported by Freer and Ulissey (10). We speculate that these differences in cancer detection rates might be related to practice setting, the volume of cases interpreted per radiologist, the number of radiologists dedicated to interpreting breast images and teaching about breast imaging, and the effect of interpretations made in conjunction with the in-house radiology staff (double reading) before engaging a CAD system.
We were somewhat surprised to note that both the cancers detected in the CAD-only group were masses rather than microcalcifications. CAD systems are known to be very sensitive for marking microcalcifications (9,13). Although two of 29 (7%) additional cancers diagnosed because of CAD were masses, one of 12 (8%) cancers was perceived by a radiologist and not marked by the CAD system. The result that radiologists found 27 (93%) of the 29 cancers and the CAD system marked 23 (79%) emphasizes the fact that CAD is indeed an adjunct method, a reminder to "take another look." CAD is not a first-line assessment of screening mammograms where the negative mammograms (those mammograms without CAD marks) can be ignored by the radiologist and all of the physician's attention can be focused on those mammograms with CAD marks. At the current level of performance, a CAD system cannot and should not replace the radiologist as either a first or a final look.
Possible study limitations include the fact that each interpreting radiologist recorded his or her own recall findings at the prospective interpretation. The presence of a nonbiased individual in the reading room to record all findings might have resulted in a more strict assessment of how each of the lesions was categorized as having been detected (radiologist and CAD, radiologist only, CAD only). The study also did not record the exact contribution of the preliminary evaluations by residents to the interpretations. It is also possible that the results from the 13 women with biopsy recommendations who were lost to follow-up could have had an effect on the overall results reported herein. Finally, the study was not designed for follow-up of patients into the next screening interval. Therefore, the cancer detection rate and PPV at biopsy may have been affected by those recalled patients who were returned to screening after undergoing diagnostic imaging work-up alone.
Our study was designed to assess the effect of a CAD system on the detection of breast cancer in an academic medical center. We found a 7.4% increase in the detection rate of breast cancer with CAD. Our findings did not suggest a shift toward the assistance of CAD systems in the detection of microcalcifications. Our recall rate showed a modest increase when compared with that from a similar time period when we interpreted screening mammograms without the use of CAD systems. Comparing our experience with those from the other two published analyses, we suggest that the use of CAD systems may be more or less beneficial depending on practice type, practice volume, number of dedicated breast imagers interpreting the mammograms, the addition of human-human (attending radiologist and house staff) double interpretation before engaging the CAD system, and the experience of the radiologists with the CAD system.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Abbreviations: BI-RADS = Breast Imaging Reporting and Data System; CAD = computer-aided detection; PPV = positive predictive value
Authors stated no financial relationship to disclose.
Author contributions: Guarantors of integrity of entire study, R.L.B., P.B., D.M.I.; study concepts and design, R.L.B., P.B., D.M.I.; literature research, R.L.B.; clinical studies, R.L.B., D.M.I.; data acquisition, R.L.B., P.B.; data analysis/interpretation, R.L.B., P.B., D.M.I.; statistical analysis, R.L.B.; manuscript preparation, definition of intellectual content, editing, revision/review, and final version approval, R.L.B., P.B., D.M.I.
| References |
|---|