1 Department of Radiology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215. e-mail: fhall@bidmc.harvard.edu
Editor:
I commend Dr Kerlikowske and colleagues for their article in the March 2005 issue of Radiology (1), in which they provide evidence-based data on the controversial subject of when and whether to recommend 6-month follow-up mammography for probably benign findings (PBFs) without first obtaining a diagnostic work-up.
The authors conclude that the "absence of diagnostic work-up prior to short-interval follow-up recommendation may result in periodic surveillance of a high proportion of benign lesions" (1). This should come as no surprise because the entire concept of the primary 6-month follow-up is based on these lesions being less suspicious. When I estimate that a mammographic abnormality has a 2% (one in 50) chance of being cancer, with a range of perhaps 1:25 to 1:100, I would almost always call back the patient for immediate diagnostic imaging, perhaps followed by a 6-month follow-up examination. However, if I estimate the risk at 0.5% (one in 200), I might occasionally prefer to have an intermediate option between immediate callback diagnostic examination and annual screening. This would be particularly true if I was reasonably certain that additional imaging would not change the assessment.
I fully agree with Dr Kerlikowske and colleagues (1) that "use of short-term follow-up to monitor PBFs should be limited to as small a number of women as possible" (1). In experienced hands this number should be small because, as these authors point out, the probably benign category is generally applicable to baseline screening or to situations in which comparison images cannot be obtained. Hence, the authors' finding that approximately one-half of 6-month follow-up mammography examinations are recommended without first obtaining a callback diagnostic work-up is both surprising and disappointing. Dr Kerlikowske and colleagues (1) comment that the situation may improve with the recent publication of the fourth edition of the Breast Imaging Reporting and Data System (BI-RADS) atlas, which "provides extensive guidance on the use of PBF assessments and management recommendations for short-interval follow-up."
I suggest another, probably more forceful, remedy: Reclassify BI-RADS category 3 PBF so that a primary recommendation for 6-month follow-up is included in the calculation of recall rates. As pointed out by Dr Kerlikowske and colleagues (1), this is not currently done despite the fact that the patient is recalled in both instances, either immediately or in 6 months. The standard ubiquitous benchmark of a 5%–10% callback rate would remain unchanged. I suspect that this change would push only a small minority of mammographers over the 10% mark, and perhaps these are the individuals who should be reassessing their use of the short-interval follow-up recommendation.
Reference
Rebecca Smith-Bindman*,
Bonnie C. Yankaskas, PhD||,
Berta M. Geller, EdD#,
Patricia A. Carney, PhD** and
Karla Kerlikowske, MD
* Department of Radiology, University of California, San Francisco, Calif
Center for Health Studies, Group Health Cooperative, Seattle, Wash
Department of Radiology, University of Washington Medical Center, Seattle Cancer Care Alliance, Seattle, Wash
Department of Epidemiology and Biostatistics, University of California, San Francisco, Calif
|| Department of Radiology, University of North Carolina, Chapel Hill, NC
# Health Promotion Research, University of Vermont, College of Medicine, Burlington, Vt
** Norris Cotton Cancer Center, Dartmouth-Hitchcock Medical Center, Department of Community and Family Medicine, Dartmouth Medical School, Lebanon, NH

General Internal Medicine Section, San Francisco Veterans Affairs Medical Center, 111A1, 4150 Clement Street, San Francisco, CA 94121. e-mail: kerliko@itsa.ucsf.edu
We appreciate Dr Hall's comments and support for using evidence-based data to inform radiologists on the issue of whether to recommend 6-month follow-up mammography for PBFs (BI-RADS category 3 assessments) without first obtaining a diagnostic work-up. Dr Hall and one of us (E.A.S.) disagreed on this same issue 10 years ago, at a time when we both based our arguments on inferential assumptions and anecdote (1,2). Now, considering the substantial scientific evidence presented in our current study (3), recommendations for use of short-interval follow-up can be based on evidence rather than supposition.
Our data suggest that three groups of findings contribute to the overall low likelihood of malignancy for category 3 assessments made directly from screening images. The first is a group of findings that would have been interpreted as category 3 even if full diagnostic imaging work-up had been performed. This group of findings has been studied thoroughly and carries a likelihood of malignancy of less than 2% (4–9). The second is a relatively large group of findings that would have been assessed as category 1 (negative) or category 2 (benign) had an initial diagnostic work-up been performed, thereby providing the patient with immediate closure for what otherwise would have been a 6-month (possibly anxiety-provoking) wait for closure. This group of findings has a likelihood of malignancy that is much lower than that of traditional category 3 lesions (3). The third is a relatively small group of findings that would have been assessed as category 4 (suspicious) if an initial diagnostic work-up had been performed; some of these lesions will be malignant, and the prognosis is perhaps not as favorable as with the traditionally worked-up category 3 lesions that later prove to be malignant. This third group of findings carries a likelihood of malignancy higher than 2%, although the numbers of examinations with these findings in our study are too small to reliably indicate how high this percentage actually is (3).
Furthermore, our data show a trend toward the diagnosis of more advanced cancers (larger size, higher stage) in the category 3 screening-only group versus in the category 3 full diagnostic imaging work-up group (3). Thus, the net effect of rendering category 3 assessments directly at screening, versus the traditional approach of completing the full diagnostic imaging work-up, is to delay definitive negative or benign mammographic assessments for 6 months in a relatively large number of patients (causing delay of closure, an unfavorable outcome), and also to delay for 6 months the diagnosis of some not necessarily indolent cancers (an even less favorable outcome). There appears to be no benefit to the patient for making category 3 assessments at screening; all this assessment provides is an intermediate option for the radiologist between immediate recall and repeat mammography in 1 year. Given that scientific evidence is available, we should base our management recommendations on the evidence rather than on preferences of individual radiologists that have no benefit for the patient and may be associated with potential harms.
Our evidence-based data also negate Dr Hall's contention that including category 3 assessments in the calculation of recall rate would "push only a small minority of mammographers over the 10%" guideline ceiling cited in the BI-RADS atlas (10). As indicated in our article, the recall rate was 8.6% for first screening examinations that were followed with diagnostic imaging within the next 90 days, which is well under the guideline ceiling; however, if all patients with category 3 assessments, including those assessments made directly at screening, were considered to need recall, the recall rate for first screening examinations would have been 14.0%, which is well above the ceiling (3). Thus, changing the auditing definition of a recall at screening to include category 3 assessments would actually substantially increase recall rates.
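The effect of the audit definition on the recall rate can be illustrated with a minimal sketch. The counts below are hypothetical, expressed per 1,000 first screening examinations and chosen only so that the resulting rates match the 8.6% and 14.0% figures quoted above; they are not the study's actual counts, and the function name `recall_rate` is likewise an illustrative assumption.

```python
def recall_rate(recalled: int, screened: int) -> float:
    """Return the recall rate as a percentage of screening examinations."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return 100.0 * recalled / screened


# Hypothetical counts per 1,000 first screening examinations, chosen only
# so that the rates reproduce the 8.6% and 14.0% quoted in the text.
screened = 1000
worked_up_within_90_days = 86  # recalls under the current audit definition
additional_category_3 = 54     # hypothetical extra cases if category 3 also counts

current = recall_rate(worked_up_within_90_days, screened)
proposed = recall_rate(worked_up_within_90_days + additional_category_3, screened)

print(f"current definition: {current:.1f}%")
print(f"category 3 counted as recall: {proposed:.1f}%")
```

Under these assumed counts, the broader definition moves the rate from below the 10% guideline ceiling to above it, which is the point made in the paragraph above.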
Like Dr Hall, we believe that category 3 assessments are overused by many practicing radiologists. However, we do not share his opinion that this should be remedied by counting category 3 screening assessments as recall cases when performing a mammography audit.
First, such an approach would create a confusing and internally inconsistent situation when calculating performance measures. If the category 3 assessment made at screening is considered equivalent to prompt recall for additional imaging, then such assessments would be "positive" not only for the calculation of recall rate but also for the calculation of positive predictive values. This would create an inconsistency when comparing performance measures for screening and diagnostic mammography because screening category 3 assessments would be considered "positive" whereas diagnostic category 3 assessments would be considered "negative." (At diagnostic mammography, only assessments leading to the recommendation for biopsy are considered positive.)
Second, and more important, there likely is a better way to reduce the inappropriate use of category 3 assessments: education. This approach already has been shown to be at least partially effective in reducing the frequency of internally discordant mammography reports, that is, those in which management recommendations do not match those of the BI-RADS assessment category. By using large-scale data from the Breast Cancer Surveillance Consortium (the same data source used in our own recent study), Taplin et al (11) and Geller et al (12) have already reported on the frequency of discordant screening and diagnostic mammography reports for 1996–1997. One of us (B.M.G.) has just completed a year-by-year review of these data, also including the years 1998–2001, that shows progressive reduction in the frequency of discordant reports over the full 6-year period. This improvement in performance is likely attributable to education by the many radiologists who provide continuing medical education in mammography throughout the United States and also to improved clarity in discouraging such discordances in the third edition of the BI-RADS atlas, published in 1998 (13). A parallel situation exists for relying on education to reduce the frequency of inappropriate category 3 assessments. The fourth edition of the BI-RADS atlas, published in 2003, explicitly discourages the use of category 3 assessments for screening mammograms (10). Many of the radiologists who provide continuing medical education also have been emphasizing this change in their teaching. Finally, the publication of our evidence-based data provides the scientific support that should help to educate even more radiologists to avoid making category 3 assessments when interpreting screening mammograms.
A major goal of the BI-RADS approach is to decrease the variability in radiologists' interpretation of breast images. Educating radiologists regarding the appropriate use of category 3 assessments is an important component in achieving that goal. Our data provide scientific support for the BI-RADS recommendation that category 3 assessments should be made only after full diagnostic imaging work-up, not directly at screening mammography. We hope that the results of our study will encourage more radiologists to adjust their practice to conform to this recommendation.
References