Breast Imaging
1 From the Department of Radiology, Duke University Medical Center, South Hospital, Box 3808, Durham, NC 27710. Received April 20, 2004; revision requested May 28; revision received August 3; accepted September 2. Address correspondence to S.V.G. (e-mail: ghate001@mc.duke.edu).
| ABSTRACT |
|---|
MATERIALS AND METHODS: Institutional review board approval was obtained, and informed consent was waived. Retrospective analysis was performed for 8698 screening mammograms obtained between January 1 and October 31, 2001, which were interpreted either immediately (n = 4113) or subsequently with the batch method (n = 4585). Data were collected from the data reporting system and patient billing records. Patients with high-risk factors were excluded; 3441 patients were in the immediate group, and 3932 were in the batch group. The two groups were compared with respect to age, breast density, and availability of comparison films with the Wilcoxon rank sum test. Recall rates and cancer detection rates for each group were determined and compared with the Pearson χ² test; false-negative rates were compared with the Fisher exact test.
RESULTS: A significant difference (P < .001) was noted in recall rates between immediate (18%) and batch (14%) groups; however, no significant difference (P = .7) was noted in cancer detection rates (immediate, 0.5%; batch, 0.4%). Mean age of patients was 56.8 years (age range, 21–96 years) in the immediate group and 56.2 years (age range, 24–98 years) in the batch group (P = .02). Comparison of breast densities between groups indicated no statistically significant difference (P = .4). Comparison mammograms were available for a significantly smaller proportion of the batch group (3106 [79%]) than of the immediate group (2856 [83%]) (P < .001). There was no significant difference in false-negative rates between the immediate group (0.1%) and the batch group (0.1%) (P > .99).
CONCLUSION: Immediate interpretation of screening mammograms resulted in a statistically significant increase in recalls and additional clinical work-ups of perceived abnormalities; however, no significant difference in cancer detection rate was detected between groups.
© RSNA, 2005
| INTRODUCTION |
|---|
Immediate interpretations offer several advantages for the patients and referring clinicians, as demonstrated by previous surveys (1–4). These include the satisfaction of obtaining immediate results, reduced stress associated with waiting for further evaluation of positive findings, and convenience of completing the work-up in a single visit. For radiologists, however, the unpredictable nature of immediate interpretation makes scheduling difficult, and it can be disruptive to the flow of a busy diagnostic practice. This results in longer waiting times for patients, overall inefficiency, and decreased cost-effectiveness (4). Paradoxically, the convenience of having the patient present when images are interpreted may contribute to increased use of additional imaging. Such an increase in additional imaging may result in detection of more subtle malignancies, a higher false-positive rate, or both. To our knowledge, comparison of recall rates and cancer detection rates for immediate versus subsequent batch interpretation of screening mammography has not been assessed. Thus, the purpose of our study was to retrospectively compare recall rates and cancer detection rates between immediate and subsequent batch interpretation of screening mammography.
| MATERIALS AND METHODS |
|---|
Patients
From January 1, 2001, to October 31, 2001, 15 562 mammograms were obtained at one of our institution's four breast imaging facilities and interpreted by any one of five dedicated breast imaging radiologists, whose experience ranged from 5 to 11 years (J.A.B., E.L.R., E.I.G., R.W., M.S.S.). Mammography was performed in 8698 (56%) asymptomatic patients undergoing yearly radiographic imaging for occult breast cancer. Patients with a history of previous lumpectomy or breast prostheses, patients undergoing mammographic follow-up, and patients presenting with a palpable breast lump, breast thickening, localized pain, nipple discharge, or other symptoms relating to the breast and requiring diagnostic work-up were excluded from this group. The 8698 screening mammographic studies evaluated in our study population were divided into two groups: those that were interpreted immediately on the day of the examination while the patient waited for the results (n = 4113) and those that were interpreted subsequently in a batch reading session, with results mailed to the patient (n = 4585). For mammograms that were interpreted immediately, final interpretation, including dictation of the report, was performed while the patient waited. Results were discussed with patients with abnormal or benign findings, and letters were given to patients with normal reports. The subsequent batch-interpreted mammograms were obtained at one of three outlying mammography facilities that did not have a breast imaging radiologist on site. The five dedicated breast imaging radiologists rotated evenly between assignments for immediate and batch interpretation of mammograms.
High-risk patients were subsequently excluded from each group to match the two groups with respect to risk factors: 672 (16%) women in the immediate group and 653 (14%) women in the batch group had a personal history of breast cancer, a history of atypical ductal or lobular hyperplasia at previous biopsy, or a first-degree relative with breast cancer. Thus, there were 3441 patients in the immediate group (age range, 21–96 years; mean age ± standard deviation, 56.8 years ± 11.1) and 3932 patients in the batch group (age range, 24–98 years; mean age ± standard deviation, 56.2 years ± 11.3) (P = .02, Wilcoxon rank sum test). Breast density for both groups was compared and graded by the interpreting radiologist (M.S.S., J.A.B., E.L.R., E.I.G., R.W.) in one of four Breast Imaging Reporting and Data System (BI-RADS) density categories: category 1, breast composed almost entirely of fat; category 2, scattered fibroglandular densities that could obscure a lesion at mammography; category 3, heterogeneously dense, which may lower the sensitivity of mammography; or category 4, extremely dense, which lowers the sensitivity of mammography (5). This method of categorizing densities has been described previously (6). The two groups were also compared with respect to availability of comparison images (S.V.G.).
Recall and Cancer Detection Rates
The two groups of patients who underwent screening mammography were identified retrospectively and segregated by using our mammography database (Mammography Information System, version 7.0; PenRad Technologies, Plymouth, Minn). From each study group, the recall rate was determined by selecting from our mammography database those patients with a BI-RADS final assessment category of 0, which signifies the need for additional imaging (S.V.G.). For screening mammograms obtained at our institution, a BI-RADS final assessment category of 3, 4, or 5 is never assigned. These categories are reserved for use with subsequent diagnostic mammography performed to enable evaluation of an abnormality detected at screening. Of the patients whose mammograms were interpreted immediately, those who required additional imaging were also cross-referenced with our billing records to ensure that these records corresponded to Current Procedural Terminology code charges of "screening converted to diagnostic" used during that time (S.V.G.). The dictated reports of all of these patients were re-reviewed manually to confirm accuracy of both the database and the billing Current Procedural Terminology codes (S.V.G.). Recall rates were then compared among the groups.
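The recall-rate definition above (the share of screening studies assigned BI-RADS final assessment category 0) can be illustrated with a minimal sketch. The records below are hypothetical rows invented for illustration, not study data, and the function name `recall_rates` is an assumption rather than part of the mammography database software.

```python
from collections import Counter

# Hypothetical screening records: (interpretation group, BI-RADS final category).
# Illustrative values only; these are NOT data from the study.
records = [
    ("immediate", 0), ("immediate", 1), ("immediate", 2), ("immediate", 1),
    ("batch", 1), ("batch", 0), ("batch", 2), ("batch", 1), ("batch", 1),
]

def recall_rates(rows):
    """Return {group: fraction of screening studies with BI-RADS category 0}."""
    totals = Counter(group for group, _ in rows)
    recalled = Counter(group for group, category in rows if category == 0)
    return {group: recalled[group] / totals[group] for group in totals}

rates = recall_rates(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} recalled")
```

Because categories 3, 4, and 5 are not assigned at screening in this practice, counting category 0 alone is sufficient to capture every recall in such a tally.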
The cancer detection rate was also assessed to determine if there was any difference in the ability of radiologists to detect malignancy on the basis of two methods of interpretation. To determine cancer detection rates, all patients from each group who fell into the BI-RADS final assessment category of 4 (suspicious for malignancy) or 5 (highly suggestive of malignancy), which signifies the need for biopsy, were selected from our database (S.V.G.). Results of pathologic analysis were reviewed for all of these patients; in each group, patients with malignant findings were recorded (S.V.G.).
False-Negative Analysis
The false-negative rate for each group was assessed during retrospective review to determine if there was a difference in the number of missed cancers between the two groups. The database system was used for review of all results of pathologic analysis; 307 malignancies were identified during the 12 months (January 1–December 31, 2002) after our study period. The 307 patients with malignant results that were detected during the subsequent year were cross-referenced with patients who underwent screening mammography at one of our facilities during the study period. Of these 307 patients, 209 were referred from outside institutions and had not undergone prior mammography at our facilities; therefore, they were not part of our study population. An additional 22 patients had undergone diagnostic imaging but had not undergone screening mammography at our facilities; therefore, they were not part of our study population. A total of 76 patients had initially undergone screening mammography at our institution during the study period. Of these patients, 22 had images that were either missing from the film jacket or signed out permanently to the patient or an outside physician; therefore, these images were unavailable for review.
Mammograms obtained in these patients during the study period and their prior studies (for comparison purposes) were mixed with similar sets of 45 normal mammograms. A panel of the same five breast imaging radiologists who were blinded to the outcome (M.S.S., J.A.B., E.L.R., E.I.G., R.W.) then reviewed these mammograms to determine if malignancy had been missed during initial evaluation of the images. Cancer was considered identified if three of the five radiologists detected cancer on mammograms obtained during this study that had not been detected prospectively. Breast density was not taken into consideration at the time of the review.
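The cohort accounting and the panel rule described above can be checked with a short sketch. The counts are taken from the text; the derived figure for patients with images available for review is not stated in the article and is computed here. Reading "three of the five radiologists" as "at least three" is an assumption, and `panel_identifies_cancer` is a hypothetical helper name.

```python
# Counts stated in the text.
malignancies_following_year = 307
referred_from_outside = 209   # no prior mammography at the study facilities
diagnostic_only = 22          # diagnostic imaging but no screening mammography
images_unavailable = 22       # missing or permanently signed out

# Patients who underwent screening mammography during the study period.
screened_during_study = (
    malignancies_following_year - referred_from_outside - diagnostic_only
)

# Derived here (not stated in the article): patients whose images could be reviewed.
available_for_review = screened_during_study - images_unavailable

def panel_identifies_cancer(votes, threshold=3):
    """votes: one boolean per panel radiologist (True = cancer detected).

    Assumes "three of the five" means at least three positive readings.
    """
    return sum(votes) >= threshold

print(screened_during_study, available_for_review)
print(panel_identifies_cancer([True, True, True, False, False]))
```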
Statistical Analysis
Statistical analysis was performed by using SAS software (version 8.2; SAS Institute, Cary, NC), which was released in 2001. The Pearson χ² test was used to compare the recall and cancer detection rates and assess the availability of comparison images for the two groups. Comparison of breast density and age distribution between the two groups was performed by using the Wilcoxon rank sum test. Comparison of the false-negative rates was performed by using the Fisher exact test. A P value of less than .05 was used to indicate statistically significant differences between the groups.
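As an illustration of the Pearson chi-square comparison, the sketch below applies the standard 2 × 2 formula to the comparison-image counts reported in the abstract (2856 of 3441 in the immediate group and 3106 of 3932 in the batch group). It is a plain-Python approximation of the kind of test described, not the SAS procedure the authors ran; whether a continuity correction was applied in the original analysis is not stated, and none is used here.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, P value) for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Counts from the abstract: patients with comparison images, and group sizes.
immediate_with, immediate_total = 2856, 3441
batch_with, batch_total = 3106, 3932

chi2, p_value = pearson_chi2_2x2(
    immediate_with, immediate_total - immediate_with,
    batch_with, batch_total - batch_with,
)
print(f"chi-square = {chi2:.2f}, P = {p_value:.2g}")
```

Under these assumptions the P value falls below .001, which agrees with the significance level reported for availability of comparison images.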
| RESULTS |
|---|
False-Negative Analysis
At blinded review for false-negative analysis, three of five radiologists on our panel detected "actionable" lesions that represented three cancers from each of the study groups. These were designated as "missed," resulting in a false-negative rate of 0.1% for immediately interpreted mammograms and 0.1% for subsequent batch-interpreted mammograms (Table 2). There was no significant difference between the two groups (P > .99).
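The false-negative comparison can be reproduced approximately with a two-sided Fisher exact test written in plain Python. The sketch assumes that the denominators are the post-exclusion group sizes (3441 and 3932) and uses the common "sum of tables no more likely than the observed one" definition of the two-sided P value; the SAS procedure used by the authors may differ in detail.

```python
from fractions import Fraction
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Exact probability that the first row holds k of the col1 events.
        return Fraction(comb(row1, k) * comb(row2, col1 - k), denom)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    return float(sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= observed))

# Three missed cancers per group; group sizes assumed to be the post-exclusion totals.
missed_immediate, n_immediate = 3, 3441
missed_batch, n_batch = 3, 3932

p_value = fisher_exact_two_sided(
    missed_immediate, n_immediate - missed_immediate,
    missed_batch, n_batch - missed_batch,
)
rate_immediate = 100 * missed_immediate / n_immediate  # percent
rate_batch = 100 * missed_batch / n_batch              # percent
print(f"P = {p_value:.2f}; rates = {rate_immediate:.2f}% and {rate_batch:.2f}%")
```

With these denominators both rates round to 0.1%, as reported in the abstract, and the P value exceeds .99.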
| DISCUSSION |
|---|
Although these studies may have caused many breast imaging centers to modify their practices to allow some form of immediate interpretation to satisfy these demands, there are also potential disadvantages. For the radiologist, immediate interpretation adds time demands and scheduling disruptions. The extra time and pressure involved in interpreting unscheduled additional mammograms and sonograms or results of interventional procedures can distract from already scheduled diagnostic patients. The additional imaging examinations (particularly sonography), which in our practice are performed by radiologists, can increase wait times for both screening and diagnostic patients and lead to further anxiety and frustration.
Our results demonstrate a significant difference in recall rates for screening mammography when comparing immediate- and subsequent batch-interpretation methods. Although our recall rates of 14% for batch interpretation and 18% for immediate interpretation are within the range of reported screening recall rates (5%–20%) (7–9), our recall rate of 18% for the immediate group represented a 20% increase over the recall rate of the batch group, with no clear benefit to our patients. We found no significant difference in the cancer detection rates or false-negative rates between the two groups. We hypothesize that factors such as pressure and distraction of the diagnostic studies, in combination with the convenient presence of the screening patient, lead to use of additional imaging rather than focused decision making to analyze questionable abnormalities and resulted in the difference in additional imaging requests. For the patients in our study, additional imaging did not lead to increased cancer detection; rather, it increased radiation exposure from the additional mammographic views, increased cost, and possibly increased anxiety and inconvenience of waiting when scheduling backups occurred.
In addition, immediate interpretation is not cost-effective. In a cost analysis study performed by Raza et al (4), immediate interpretation of screening mammograms added as much as $28 to the cost of the study when factors such as additional radiologist and technologist time and possible additional space and equipment issues were taken into consideration. Although a large majority of patients favored immediate interpretation, only 11% were willing to pay this additional cost. The additional recall rate of 20% for patients with immediately interpreted mammograms, which is demonstrated by our study, would further increase the cost per patient.
Another possible disadvantage of immediate interpretation is the inability of radiologists to obtain double interpretations efficiently, whether with computer-aided detection or with consensus double reading by a second radiologist. We recognize that the additional time required for digitizing screen-film images for interpretation of computer-aided detection findings at our facilities may not be a barrier to immediate reporting at practices where digital mammography is available. Although this issue was not addressed in our study, prior studies have shown that double interpretation decreases recall rates (10) and increases sensitivity (11), while computer-aided detection increases sensitivity by as much as 20% (7,12,13). When given the choice, both patients (14) and referring clinicians (15) favored delayed batch interpretation if double interpretation was offered.
Since the time of this study, we have changed our practice pattern; we now perform immediate interpretation for only those patients who live more than an hour from our facility. As a result, we have subjectively noticed a reduction in scheduling backups and subsequent patient frustration previously seen in the immediate-interpretation setting. Batch interpretation also allows more effective use of computer-aided detection interpretation at our institution.
An important limitation of our study is the possibility of selection bias. Because this is a retrospective database review, patients were not randomized between the two groups, and demographic characteristics between the two groups could not be directly controlled. Both age and breast density have been reported as the most independent predictors of mammographic sensitivity (6). Our analysis determined that the two groups were closely matched with respect to breast density. The mean ages for the immediately interpreted group and the subsequent batch-interpreted group were 56.8 years and 56.2 years, respectively. Although the mean ages were significantly different, the absolute difference of 0.6 years (approximately 7 months) is very small, and it is likely of little clinical consequence. Another limitation is the relatively small population (17 cancers were detected in each group), which reduces statistical power.
Several prior studies have been performed to evaluate the importance of comparison with previous mammograms for quality mammographic interpretation (16–18). Sickles et al (18) reported that availability of prior mammograms may decrease the number of false-positive examinations. Bassett et al (19) disagreed and found that comparison with prior mammograms affected care in only 3% of cases; therefore, the cost and time involved in obtaining prior examinations was not always justified. We found fewer available comparison images and a lower recall rate for the batch-interpreted group of patients. If more comparison images were available, the number of recalls would likely be even lower. In our experience, there is a large number of baseline first-time mammograms in our batch-interpreted group, which is a possible explanation of the unavailable comparison images. This subset of patients was not analyzed in our study.
Finally, we determined the false-negative rate by counting the interval cancers detected within 12 months after the initial screening mammogram. An interval of 12 months from index mammography has been defined by the American College of Radiology (5). It is possible that interval cancers may have been missed in the 22 patients who were lost to follow-up; thus, we may have underestimated the number of false-negative findings.
In conclusion, immediate interpretation of screening mammograms results in higher recall rates, with no significant difference in cancer detection rates when compared with delayed subsequent batch-interpreted mammograms. Efficiency, patient preferences, and cost-effectiveness are all important issues to consider when running a successful mammography practice. The correct balance will depend on the needs and abilities of each mammography practice.
| FOOTNOTES |
|---|
Authors stated no financial relationship to disclose.
Author contributions: Guarantor of integrity of entire study, S.V.G.; study concepts, S.V.G., M.S.S., E.L.R., J.A.B.; study design, S.V.G., M.S.S.; literature research, S.V.G.; clinical studies, M.S.S., J.A.B., E.L.R., E.I.G., R.W.; data acquisition, all authors; data analysis/interpretation, S.V.G., M.S.S.; statistical analysis, S.V.G.; manuscript preparation, S.V.G., M.S.S., E.L.R.; manuscript definition of intellectual content, editing, and revision/review, S.V.G., M.S.S., E.L.R., J.A.B.; manuscript final version approval, all authors
| REFERENCES |
|---|