Breast Imaging |
1 From the Screening Mammography Program of British Columbia, 8th Fl, 686 W Broadway, Vancouver, British Columbia, Canada V5Z 1G1 (L.K., I.A.O., L.J.W.B.); British Columbia Cancer Agency, Vancouver, British Columbia, Canada (L.K., I.A.O., A.J.C.); the Department of Radiology, University of California Medical Center, San Francisco (E.A.S.); and the Departments of Radiology (L.J.W.B.) and Surgery (I.A.O.), Faculty of Medicine, and the Department of Statistics, Faculty of Science (A.J.C.), University of British Columbia, Vancouver, British Columbia, Canada. From the 1998 RSNA scientific assembly. Received November 29, 1998; revision requested January 21, 1999; final revision received August 16; accepted September 3. Address correspondence to I.A.O. (e-mail: iolivott@bccancer.bc.ca).
| Abstract |
|---|
MATERIALS AND METHODS: Standardized abnormal interpretation ratios and standardized cancer detection ratios were constructed for 35 readers with at least 3 years of experience with the Screening Mammography Program of British Columbia. The ratios were used to compare individual reader performance with the mean program performance after adjustment for the age and screening history (first versus subsequent screening examinations) of the women who underwent screening.
RESULTS: The mean standardized abnormal interpretation ratio was better for readers of 2,000–2,999 (n = 8) and 3,000–3,999 (n = 9) screening mammograms per year than for those of less than 2,000 (n = 9) and 4,000–5,199 (n = 9) screening mammograms per year. Differences in the mean standardized abnormal interpretation ratios were significant (P < .05) between the readers of less than 2,000 and of 2,000–2,999 screening mammograms per year, between readers of less than 2,000 and of 3,000–3,999 screening mammograms per year, and between readers of 3,000–3,999 and of 4,000–5,199 screening mammograms per year. The mean standardized cancer detection ratio improved gradually with increasing annual volume, but the differences between groups were not statistically significant. Five of the eight readers of 2,000–2,999 mammograms were reading 2,475 or more screening mammograms per year.
CONCLUSION: Standardized abnormal interpretation ratios and standardized cancer detection ratios provide a method of comparing two important performance measures in a screening program. A minimum of 2,500 interpretations per year is associated with lower abnormal interpretation rates and average or better cancer detection rates.
Index terms: Breast neoplasms, diagnosis, 00.11, 00.30; Breast neoplasms, radiography, 00.11, 00.30; Breast radiography, quality assurance, 00.11; Cancer screening, 00.11, 00.30
| Introduction |
|---|
Abnormal interpretation rates and cancer detection rates are performance indicators used by many screening programs (3–11). The abnormal interpretation rate is the number of screening mammograms considered abnormal (ie, referred for assessment) divided by the total number of screening mammograms interpreted. The cancer detection rate is the number of cases of invasive breast cancer or ductal carcinoma in situ diagnosed subsequent to investigation of abnormal screening mammograms and is usually expressed as the number of cases per 1,000 women who underwent screening mammography. It is recognized that both measures are higher for first screening examinations than for subsequent screening examinations. It is also well recognized that the cancer detection rate increases with advancing age (9–13) because the risk of a woman developing breast cancer increases with age. Thus, the age distribution and screening history of the women undergoing screening need to be incorporated into any assessment of reader performance.
This article presents a standardization approach to adjust for the different age distributions and screening histories of women whose screening mammograms were read by individual radiologists and thus to allow comparison of performances among radiologists. We performed this study to examine the relationship between reading volume and reader performance by using standardized measures of the abnormal interpretation rate and the cancer detection rate.
| MATERIALS AND METHODS |
|---|
For each radiologist, the standardized abnormal interpretation ratio was defined as the number of screening mammograms with abnormal interpretations divided by the number of abnormal results expected. In this study, the population that underwent screening was divided into 12 subgroups: six age groups, each further divided into two subgroups on the basis of whether the screening examination was the first in the program. The age groups were women younger than 40 years, 40–49 years, 50–59 years, 60–69 years, 70–79 years, or 80 years or older.
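For each subgroup, the expected number of abnormal results for the radiologist was obtained by multiplying the expected rate for that subgroup by the number of screening mammograms in that subgroup read by that radiologist. In this study, the expected rates used to construct standardized abnormal interpretation ratios were the subgroup-specific program abnormal interpretation rates. To summarize, for an individual radiologist,

$$\text{standardized abnormal interpretation ratio} = \frac{\sum_{j=1}^{12} O_j}{\sum_{j=1}^{12} R_j N_j},$$

where $O_j$ was the number of abnormal interpretations by the radiologist in the $j$th subgroup, $R_j$ was the program abnormal interpretation rate in the $j$th subgroup, and $N_j$ was the number of screening mammograms read by the radiologist in the $j$th subgroup.

The standardized cancer detection ratio was similarly defined, where $O_j$ was the number of cancers detected by the radiologist in the $j$th subgroup and $R_j$ was the program cancer detection rate in the $j$th subgroup. $N_j$ was the same as defined in the previous paragraph.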
Table 1 illustrates how the standardized abnormal interpretation ratios would be calculated for a hypothetical reader. In this example, the total observed number of abnormal mammographic interpretations was 100, whereas the expected number of abnormal interpretations was calculated to be 75.0. Thus, the standardized abnormal interpretation ratio would be 100 divided by 75.0, or 1.33.
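The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Subgroup` class, the function name, and the two-subgroup data are hypothetical and are not the values in Table 1. The numbers were chosen solely so that the totals match the 100 observed and 75.0 expected abnormal interpretations of the hypothetical reader.

```python
from dataclasses import dataclass


@dataclass
class Subgroup:
    """One age-by-screening-history subgroup for a single reader (hypothetical structure)."""
    label: str
    observed: int        # O_j: abnormal interpretations (or cancers) by the reader
    n_read: int          # N_j: screening mammograms read by the reader
    program_rate: float  # R_j: program-wide rate for this subgroup


def standardized_ratio(subgroups):
    """Return sum(O_j) / sum(R_j * N_j) over all subgroups."""
    observed = sum(s.observed for s in subgroups)
    expected = sum(s.program_rate * s.n_read for s in subgroups)
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Invented illustration data (not taken from the article's Table 1).
reader = [
    Subgroup("first screen", observed=40, n_read=400, program_rate=0.075),
    Subgroup("subsequent screen", observed=60, n_read=1000, program_rate=0.045),
]
ratio = standardized_ratio(reader)
print(round(ratio, 2))
```

A ratio above 1 means the reader called more examinations abnormal than the program rates would predict for the same mix of women; the same function applies to the cancer detection ratio when the counts and rates are those for detected cancers.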
Women are eligible to attend the Screening Mammography Program of British Columbia if they are aged 40 years or older, are asymptomatic with respect to breast complaints, are not pregnant or lactating, do not have breast implants, and have no prior personal history of invasive or in situ breast cancer. Information on family history, hormone use, and previous breast biopsies is collected from each woman at the time of screening. These latter three variables are distributed similarly across readers and screening centers in the Screening Mammography Program of British Columbia (15).
As new radiologist readers of screening mammograms usually start with higher abnormal interpretation rates and go through a period of adjustment, six radiologists with less than 3 years of experience in the Screening Mammography Program of British Columbia as of March 31, 1997, were excluded. Another three radiologists reading more than 9,000 screening mammograms per year on average (ie, three times the program target reading volume) were also excluded. There was a difference of more than 4,000 screening mammograms between this group of three radiologists and the radiologist reading the next highest volume.
A fiscal year starts on April 1 and ends on March 31 of the following year. The mean annual reading volume over 3 fiscal years from 1994–1995 to 1996–1997 was calculated for each of the remaining 35 readers. The readers were assigned to one of four groups on the basis of their mean annual volume: 1,000–1,999, 2,000–2,999, 3,000–3,999, and 4,000–5,199 screening mammograms. Standardized abnormal interpretation ratios and standardized cancer detection ratios were calculated on the basis of the 1996–1997 screening performances.
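The grouping step can be illustrated with a short Python sketch. The function name and the example volumes are hypothetical. Treating each band as a half-open interval (for example, 2,000 up to but not including 3,000) is an assumption made here so that non-integer means fall into exactly one group; the article does not state how boundary values were handled.

```python
from statistics import fmean

# Reading volume groups from the text, expressed as half-open intervals
# [low, high). The half-open treatment is an assumption of this sketch.
VOLUME_BANDS = [
    ("1,000-1,999", 1000, 2000),
    ("2,000-2,999", 2000, 3000),
    ("3,000-3,999", 3000, 4000),
    ("4,000-5,199", 4000, 5200),
]


def volume_group(annual_volumes):
    """Assign a reader to a group from the mean of the yearly reading volumes.

    Returns None when the mean falls outside every band.
    """
    mean_volume = fmean(annual_volumes)
    for label, low, high in VOLUME_BANDS:
        if low <= mean_volume < high:
            return label
    return None


# Invented volumes for three fiscal years of one hypothetical reader.
print(volume_group([2400, 2500, 2525]))
```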
Standardized abnormal interpretation ratios and standardized cancer detection ratios were examined according to reading volume group. Analysis of variance (ANOVA) was used to evaluate the effect of reading volume on the standardized abnormal interpretation ratios and the standardized cancer detection ratios. The Bartlett test was applied to confirm equality of variances required for the use of the ANOVA model (16). Differences in standardized abnormal interpretation ratios and standardized cancer detection ratios between reading volume groups were assessed by using t tests for multiple comparisons.
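As an illustration of the ANOVA step, the following Python sketch computes the one-way ANOVA F statistic for ratios grouped by reading volume. The function name and the ratio values are invented for illustration and are not study data. Only the F statistic and its degrees of freedom are computed; obtaining a P value would additionally require an F distribution from a statistics library, which is outside this sketch.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of ratios per reading volume group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual readers around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented standardized ratios for three small hypothetical groups.
example = [[1.2, 1.3, 1.1], [0.9, 1.0, 0.8], [0.9, 0.8, 1.0]]
f_stat, df_b, df_w = one_way_anova_f(example)
print(round(f_stat, 2), df_b, df_w)
```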
| RESULTS |
|---|
ANOVA results showed that the effect of reading volume was significant for standardized abnormal interpretation ratios (P = .03) but was not significant for standardized cancer detection ratios. Multiple t tests showed that differences in the mean standardized abnormal interpretation ratios were significant at the 5% level (P < .05) between the readers of 1,000–1,999 and 2,000–2,999 screening mammograms per year, between readers of 1,000–1,999 and 3,000–3,999 screening mammograms per year, and between readers of 3,000–3,999 and 4,000–5,199 screening mammograms per year. Five of the eight readers of 2,000–2,999 mammograms per year read 2,475 or more screening mammograms per year.
| DISCUSSION |
|---|
There were no significant differences in the mean standardized abnormal interpretation ratios and standardized cancer detection ratios between the reading volume groups of 2,000–2,999 and 3,000–3,999 screening mammograms per year. Radiologists reading on average 2,000–3,999 screening mammograms per year had fewer abnormal interpretations but detected the normative number of cancers. However, five of the eight readers in the reading volume group of 2,000–2,999 mammograms per year read 2,475 or more screening mammograms per year. As a result, the Screening Mammography Program of British Columbia reset the minimum annual reading requirement at 2,500 screening mammograms per radiologist.
The standardized abnormal interpretation ratio and standardized cancer detection ratio have been developed to help examine the relationship between reading volume and reader performance. The main advantage of this standardization is that it incorporates the age distribution and the screening history of the women undergoing screening. Both age and screening history correlate with cancer detection rates (9–13), and screening history correlates with abnormal interpretation rates (10,11,17) in breast cancer screening programs.
In addition, the standardized abnormal interpretation ratio and standardized cancer detection ratio incorporate comparison target values against which performance can be measured. The overall program performance measures used in the current standardization could be replaced with some other target measures. For example, if a screening program has set standards for subgroup-specific abnormal interpretation rates, then the standards could replace the observed program abnormal interpretation rates as the reference expected values to be used in the calculation. The standardized abnormal interpretation ratios and standardized cancer detection ratios thus constructed would provide a measure for comparing the individual performance against the program standards rather than against normative values for his or her reader colleagues.
It is clear that the standardized abnormal interpretation ratio and standardized cancer detection ratio depend on the target values used in the denominator. Standardized abnormal interpretation ratios or standardized cancer detection ratios should be compared across time or between programs or individuals only when the standardized ratios are constructed by using the same target values.
Similar standardization of a rate measure has previously been applied to the invasive cancer detection rate for a comparison of individual screening programs in the National Health Services Breast Screening Programme (18). That standardized measure was called the standardized detection ratio. The statistical properties of the standardized detection ratio, standardized abnormal interpretation ratio, and standardized cancer detection ratio allow for a simple construction of confidence intervals.
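The article does not state how such intervals would be constructed. One common approach for ratios of observed to expected counts, shown here only as an assumption-laden sketch, treats the observed count as a Poisson variable and the expected count as fixed, so that the standard error of the ratio is approximately the square root of the observed count divided by the expected count. The function name is hypothetical, and the example reuses the hypothetical reader from the Materials and Methods (100 observed, 75.0 expected).

```python
from math import sqrt
from statistics import NormalDist


def ratio_confidence_interval(observed, expected, level=0.95):
    """Approximate CI for observed/expected.

    Assumes (this sketch's assumption, not the article's stated method) a
    Poisson observed count and a fixed expected count, with a normal
    approximation: ratio +/- z * sqrt(observed) / expected.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ratio = observed / expected
    half_width = z * sqrt(observed) / expected
    return ratio - half_width, ratio + half_width


lower, upper = ratio_confidence_interval(100, 75.0)
print(round(lower, 2), round(upper, 2))
```

The normal approximation is reasonable only when the observed count is fairly large; for small counts, such as the number of cancers detected by a single reader in a year, an exact Poisson interval would be a more cautious choice.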
The minimum annual reading volume of 2,500 mammograms currently adopted by the Screening Mammography Program of British Columbia is still considerably higher than the 480 mammograms per year required by the U.S. Department of Health and Human Services and is much lower than the 5,000 mammograms per year required by the National Health Services Breast Screening Programme. None of the radiologists in the Screening Mammography Program of British Columbia who participated in this study had annual volumes of less than 1,000 mammograms per year. Although organizations such as the U.S. Department of Health and Human Services and the National Health Services Breast Screening Programme have addressed the reading volume question with requirements at the extremes, to our knowledge there are no published data that indicate a scientific basis for these decisions.
We acknowledge that the standardized abnormal interpretation ratio and standardized cancer detection ratio only partially reflect the performance level of the interpreters. The rates of detecting favorable early-stage cancers and the interval cancer rates also need to be considered. Although radiologists with higher abnormal interpretation rates sometimes detect more early-stage cancers, the opposite has also been observed among radiologists with low reading volumes (4). The Screening Mammography Program of British Columbia reported an interval cancer rate of 0.6 cancers per 1,000 examinations, which is based on 4 years of data collected from April 1, 1992, to March 31, 1997 (15). Thus, many years of data are required to incorporate this criterion into the assessment of individual reader performance and the setting of minimum annual reading volumes. Using data accumulated over a long time has the potential drawback that it may not be appropriate to assume that reader performance will remain constant over the long term.
During the next few years, the Screening Mammography Program of British Columbia will assist the radiologist readers to achieve the minimum reading volume through program promotion, expansion of screening services, and realignment of the existing services. The Screening Mammography Program of British Columbia radiologist readers have supported this decision and will monitor the reading allocation within their screening centers to ensure that every reader meets the minimum reading requirement. The Screening Mammography Program of British Columbia will continue to monitor abnormal interpretation ratios and cancer detection ratios that will be standardized with respect to the 1996–1997 program results. This will allow the program to track and refine targets as more readers achieve the minimum reading requirement and greater numbers of radiologists become readers within the program.
| Footnotes |
|---|
Author contributions: Guarantors of integrity of entire study, L.K., I.A.O.; study concepts, L.K., I.A.O., A.J.C.; study design, L.K.; literature research, L.K., I.A.O.; data acquisition and analysis, L.K.; statistical analysis, L.K.; manuscript preparation, L.K.; manuscript editing, L.K., I.A.O.; manuscript review, I.A.O., L.J.W.B., A.J.C., E.A.S.