Radiology
(Radiology. 2000;217:832-840.)
© RSNA, 2000


Breast Imaging

Mammography in 53,803 Women from the New Hampshire Mammography Network¹

Steven P. Poplack, MD, Anna N. Tosteson, ScD, Margaret R. Grove, MSc, Wendy A. Wells, MD and Patricia A. Carney, PhD

1 From the Departments of Radiology (S.P.P.), Community and Family Medicine (A.N.T., M.R.G., P.A.C.), Medicine (A.N.T.), and Pathology (W.A.W.), Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr, HB 7999, Lebanon, NH 03756. From the 1999 RSNA scientific assembly. Received August 13, 1999; revision requested September 29; final revision received May 1, 2000; accepted May 22. Supported by grant DASD17-94-J-4109 from the United States Department of Defense and Cancer Center Support Grant 2P30 CA23108-22 from the National Cancer Institute. Address correspondence to S.P.P. (e-mail: steven.p.poplack@hitchcock.org).


    ABSTRACT
 
PURPOSE: To describe measures of mammography performance in a geographically defined population and evaluate the interpreter’s use of the Breast Imaging Reporting and Data System (BI-RADS).

MATERIALS AND METHODS: Mammographic data from 47,651 screening and 6,152 diagnostic examinations from November 1, 1996, to October 31, 1997, were linked to 1,572 pathologic results. Mammographic outcomes were based on BI-RADS assessments and recommendations reported by the interpreting radiologist. The consistency of BI-RADS recommendations was evaluated.

RESULTS: Screening mammography had a sensitivity of 72.4% (95% CI: 66.4%, 78.4%), specificity of 97.3% (95% CI: 97.25%, 97.4%), and positive predictive value of 10.6% (95% CI: 9.1%, 12.2%). Diagnostic mammography had higher sensitivity, 78.1% (95% CI: 71.9%, 84.3%); lower specificity, 89.3% (95% CI: 88.5%, 90.1%); and better positive predictive value, 17.1% (95% CI: 14.5%, 19.8%). The cancer detection rate with screening mammography was 3.3 per 1,000 women, with a biopsy yield of 22.4%, whereas the interval cancer rate was 1.2 per 1,000. Nearly 80% of screening-detected invasive malignancies were node negative. The recall rate for screening mammography was 8.3%. Ultrasonography was used in 3.5% of screening and 17.5% of diagnostic examinations. BI-RADS recommendations were generally consistent, except for probably benign assessments.

CONCLUSION: The sensitivity of screening mammography in this population-based sample is lower than expected, although other performance indicators are commendable. BI-RADS "probably benign" assessments are commonly misused.

Index terms: Breast neoplasms, radiography, 00.11, 00.30 • Cancer screening, 00.11, 00.30 • Diagnostic radiology, observer performance, 00.11, 00.30


    INTRODUCTION
 
The effectiveness of mammographic screening in reducing breast cancer mortality, especially in women aged 50–69 years, is well established (1–9). Authors of numerous studies have evaluated the effectiveness of screening mammography by using a variety of outcome measures (10–13), leading to Agency for Health Care Policy and Research guidelines on the interpretive performance of mammography (14). To our knowledge, most previous studies (10–13,15–18) on interpretive performance involved a limited number of mammography centers with similar characteristics. Few studies (19–22) have been published on mammographic interpretation in diverse community settings, and to our knowledge none has described the operating characteristics of both screening and diagnostic mammography or the use of breast ultrasonography (US) in a geographically defined, largely rural population.

The purpose of this study was to describe key performance measures of screening and diagnostic breast radiography in a geographically defined subject population and to evaluate the use of the American College of Radiology’s Breast Imaging Reporting and Data System (BI-RADS) (23) by interpreting radiologists. Our data are derived from a diverse group of mammography facilities, the majority of which are community based.


    MATERIALS AND METHODS
 
Background of the New Hampshire Mammography Network
The design and development of the New Hampshire Mammography Network (NHMN) are described in detail elsewhere (24,25). Briefly, the NHMN was founded in October 1994 and began collecting data May 1, 1996. The NHMN, and all study-related procedures, were approved by our committee for the protection of human subjects. All women undergoing mammography in a participating New Hampshire facility are eligible to enroll in the NHMN. Women participants, radiologists, and pathologists sign written consent forms to allow for data accrual and analysis by the NHMN. Currently, 37 (90%) of the 41 mammography facilities in New Hampshire contribute data to the NHMN. The composition of mammography facilities is diverse and includes hospital-based (54%) and clinic-based (22%) facilities, physicians’ private offices (20%), freestanding imaging centers (2%), and an academic medical center (2%) (26).

All data contained in the NHMN database are scanned from standardized forms completed by women undergoing mammography, mammography technologists, and interpreting radiologists. The NHMN does not capture examinations limited to US. Women undergoing mammography provide demographic and some breast cancer risk factor information. Mammography technologists obtain additional risk and clinical information in a face-to-face interview with women undergoing mammography. During the pilot phase of NHMN development, test-retest reliability studies were conducted on all questions used in data collection for women, including information they provide to technologists during direct interviews. Test-retest reliability on the final data collection forms exceeded 90%.

Radiologists record interpretive data by using BI-RADS terminology (23), including use of breast US, breast composition, assessment status, and recommendation for each breast. We created and distributed to participating radiologists a breast density atlas to assist and standardize coding of radiographic breast density. The atlas displays examples of borderline composition categories (fat vs scattered density vs heterogeneously dense vs extremely dense) and identifies correct density coding for each example. We also conducted quality assurance on interpretive data on 20 randomly selected cases from each facility by comparing data submitted to the NHMN by radiologists with the corresponding clinical text reports. Agreement between the NHMN project forms and the radiologists’ text reports was consistently greater than 96%.

Participating New Hampshire pathology laboratories send clinical pathology reports on all breast specimens, including fine-needle aspiration biopsy, core needle biopsy including the advanced breast biopsy instrumentation, excisional biopsy, lumpectomy, and mastectomy, to the NHMN project office. These are abstracted and entered into a separate pathology database. The most serious pathologic outcome is applied when there are multiple pathologic results for the same breast, except when a suspicious-looking cytologic specimen precedes a benign histologic specimen. Linkages between the mammography and pathology databases are performed approximately every 6 months by using a probability-based matching program (INTEGRITY; Vality Technology, Boston, Mass) with demonstrated effectiveness.

Study Population
Mammographic examinations performed between November 1, 1996, and October 31, 1997, were eligible for inclusion in these analyses. During this period, 95 radiologists representing 20 radiology groups interpreted mammograms in 36 facilities (87.8%) in New Hampshire and contributed data. We excluded 5,482 women in whom mammography was performed in six of these mammography facilities because corresponding pathologic data were not available for these facilities. We also excluded 805 women who were missing interpretive assessments for both breasts. Mammographic data in 53,803 women were complete and met our inclusion criteria. These were linked with 1,572 benign and malignant pathologic results submitted by 82% (14 of 17) of the pathology laboratories in the state of New Hampshire. For 47,651 of these women, the initial indication for their examination was screening and for 6,152 women the initial indication was diagnostic.

We defined the nature of a mammographic examination on the basis of the presenting indication. We used a hierarchy of the following three independent data sources to identify screening indications: (a) technologist form—the woman undergoing mammography reported no current breast concerns (valid breast concerns were limited to lump, nipple discharge, and skin changes) and had no record in the NHMN database of a prior mammogram of any type within 270 days; (b) radiologist form—the type of examination was recorded as screening (asymptomatic) mammography by the interpreting radiologist; and (c) patient form—routine screening examination was selected as the indication for mammography. All other examinations not meeting these criteria were considered diagnostic. The evaluation of a valid clinical breast concern, as defined at the beginning of the paragraph, and short-term (<270 days) follow-up imaging were the primary diagnostic indications. Immediate supplementary imaging, defined as occurring within 45 days after the index screening examination, was not considered a diagnostic indication but was linked to the initial screening examination (Fig 1).



 
Figure 1. Flow diagram shows examinations corresponding to the screening and diagnostic mammography designations used throughout the article. Screening Indication is a request for mammography to detect clinically unsuspected breast cancer in a woman without symptoms. Supplementary Imaging is additional mammography or US performed within 45 days after the current mammographic examination to further evaluate an abnormality and arrive at a final assessment or recommendation. Diagnostic Indication is mammography requested to address a clinical sign or symptom, pain excluded, or to evaluate a preexisting mammographic abnormality (eg, 6-month follow-up of a probably benign finding).

 
Mammographic outcome was based on the BI-RADS assessment and recommendation reported by the interpreting radiologist. Radiologists recorded assessments and recommendations for each breast, although data were analyzed for each woman by using the highest assessment category. The BI-RADS assessment category hierarchy was as follows: highly suggestive of malignancy, category 5; suspicious-looking abnormality, category 4; incomplete assessment, category 0; probably benign finding, category 3; benign finding, category 2; negative, category 1. Mammograms assessed as negative, benign, or probably benign with no recommendation for biopsy or surgical consultation were considered negative. Mammograms that were assessed as highly suggestive of malignancy, suspicious, or incomplete, or that carried a recommendation for biopsy or surgical consultation irrespective of assessment, were considered positive.
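
The classification rule in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the software used by the NHMN; the function and variable names are hypothetical, and the handling of a missing per-breast assessment is not addressed.

```python
# Rank of each BI-RADS category in the hierarchy described in the text
# (a higher rank means a more serious assessment). Names are hypothetical.
SEVERITY_RANK = {5: 6, 4: 5, 0: 4, 3: 3, 2: 2, 1: 1}


def woman_level_category(left_breast, right_breast):
    """Return the more serious of the two per-breast BI-RADS categories."""
    return max(left_breast, right_breast, key=SEVERITY_RANK.__getitem__)


def is_positive(category, biopsy_or_surgical_consult):
    """Apply the positive/negative rule stated in the text.

    Positive: category 5, 4, or 0, or any category accompanied by a
    recommendation for biopsy or surgical consultation.
    Negative: category 1, 2, or 3 without such a recommendation.
    """
    return category in (5, 4, 0) or biopsy_or_surgical_consult


print(woman_level_category(3, 0))  # incomplete (0) outranks probably benign (3)
print(is_positive(3, False))       # probably benign, no biopsy: negative
print(is_positive(2, True))        # benign but biopsy recommended: positive
```

Note that category 0 sits between categories 4 and 3 in the hierarchy, so a simple numeric maximum of the category codes would give the wrong woman-level assessment.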

We analyzed the association of specific recommendations with final assessment categories for each woman. Multiple nonroutine recommendations were reported and may have included a less serious recommendation for the contralateral breast, since recommendations were not analyzed according to laterality. However, recommendations for routine follow-up, nonroutine follow-up, and the absence of a recommendation were considered mutually exclusive.

We linked indeterminate screening mammograms, defined as incomplete assessment and/or recommendation for or inclusion of immediate additional evaluation by means of mammography and/or US, with subsequent imaging that occurred within 45 days. All linked examinations were considered screening examinations because the initial indication was screening. The outcome of screening mammography reported here reflects the final assessment and recommendation status of associated examinations (Fig 1). An incomplete assessment status implies lack of resolution within 45 days after an indeterminate screening mammogram. We limited the time between imaging examinations to 45 days after an analysis of 338 examinations with initially incomplete results (category 0) revealed that more than 98% (172 of 175) of women who underwent supplementary imaging within 120 days underwent their examination within 45 days.
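
As a minimal sketch of the 45-day linkage window described above, the following Python selects the later examinations that would be treated as supplementary imaging for an index screening examination. The function name and the example dates are hypothetical, and counting same-day imaging as falling inside the window is an assumption; the text states only that linked imaging occurred within 45 days.

```python
from datetime import date, timedelta

# Window used in the article to link supplementary imaging to the index screen.
LINK_WINDOW = timedelta(days=45)


def supplementary_exams(index_date, later_exam_dates):
    """Return the examinations falling within 45 days after the index screen.

    Assumption: day 0 (same-day imaging) and day 45 are both inside the window.
    """
    return [
        exam_date
        for exam_date in later_exam_dates
        if timedelta(0) <= exam_date - index_date <= LINK_WINDOW
    ]


index = date(1997, 1, 1)                      # hypothetical index screening date
later = [date(1997, 1, 20), date(1997, 3, 1)]  # hypothetical follow-up dates
print(supplementary_exams(index, later))       # only the January 20 examination is linked
```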

We defined the recall rate as the proportion of initial screening examinations assessed as incomplete and/or recommending or using additional imaging to arrive at a final assessment. This did not include definitively abnormal assessments (categories 4 and 5) or probably benign assessments (category 3) rendered solely on the basis of the initial screening examination.

We defined a positive cancer status as any tissue specimen, including malignant cytologic findings, revealing invasive carcinoma or ductal carcinoma in situ. We considered a malignant fine-needle aspiration biopsy outcome to reflect invasive carcinoma. We defined a negative cancer status as a benign result from tissue sampling and/or the absence of malignancy reported within the follow-up interval. Lobular carcinoma in situ, atypical epithelial proliferative disorders, and suspicious-looking cytologic findings without correlative histologic findings were considered benign in these analyses.

Statistical Methods
Summary statistics were used to describe patient and examination characteristics for screening and diagnostic mammograms separately. By using the mammographic outcome criteria and cancer status definitions described earlier, mammograms were linked with cancer outcomes to identify true-positive, true-negative, false-positive, and false-negative examinations. True-positive and false-positive status was defined as a positive mammographic interpretation with (true-positive result) or without (false-positive result) a cancer diagnosis reported within 365 days. A false-positive status was designated irrespective of whether a biopsy was performed. A true-negative result was a negative mammographic interpretation, including a probably benign assessment, with no report of cancer within the 365-day follow-up interval. Similarly, a false-negative result was defined as a negative mammographic interpretation with cancer diagnosed within the subsequent 365 days. On the basis of these classifications, sensitivity (true-positive/[true-positive + false-negative]), specificity (true-negative/[false-positive + true-negative]), positive predictive value (true-positive/[true-positive + false-positive]), and negative predictive value (true-negative/[false-negative + true-negative]) were estimated.
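
The four definitions above translate directly into code. The sketch below applies them to screening counts taken or derived from figures reported in the Results (155 screening-detected cancers, 59 cancers after 46,194 negative examinations, 47,651 screening examinations in total). The confidence interval helper uses a normal-approximation (Wald) interval, which is an assumption: the article does not state how its intervals were computed.

```python
import math


def performance(tp, fp, fn, tn):
    """Compute the four measures exactly as defined in the text."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }


def wald_ci(p, n, z=1.96):
    """Normal-approximation 95% interval (assumed method, not from the article)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Screening counts reported in, or derived from, the Results section.
total_exams = 47651
negative_exams = 46194
tp = 155                              # screening-detected cancers
fn = 59                               # cancers after a negative screen
tn = negative_exams - fn              # negative screens with no cancer
fp = total_exams - negative_exams - tp  # positive screens with no cancer

measures = performance(tp, fp, fn, tn)
for name, value in measures.items():
    print(f"{name}: {100 * value:.1f}%")

low, high = wald_ci(measures["sensitivity"], tp + fn)
print(f"sensitivity 95% CI: {100 * low:.1f}% to {100 * high:.1f}%")
```

With these counts the sketch gives a sensitivity of 72.4%, a specificity of 97.3%, and a positive predictive value of 10.6%, in agreement with the values reported for screening mammography.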

Logistic regression was used to account for the influence of varying case mix on the operating characteristics of screening mammography. The model estimated the odds of a positive mammogram after controlling for cancer status, age (less than 50 vs 50 or more years), breast density (dense vs not dense), and history of a prior mammogram (yes vs no or unknown). For purposes of analysis, "dense" breasts were categorized as those rated as heterogeneously dense or extremely dense, whereas breasts with fatty and scattered density were considered "not dense." To facilitate comparisons between sensitivity and specificity in the population in our study with those in other studies, sensitivity and specificity for women with particular characteristics were estimated by using the logistic regression model.
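
To illustrate how adjusted sensitivity and specificity can be read off such a model, the sketch below evaluates a fitted logistic model at a chosen covariate profile. Everything numeric here is hypothetical: the coefficients are placeholders chosen for illustration, not estimates from this study, and the main-effects-only structure (no interaction terms) is an assumption, since the article does not give the model specification.

```python
import math


def expit(x):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# HYPOTHETICAL coefficients on the log-odds scale, for illustration only.
COEF = {
    "intercept": -3.8,
    "cancer": 4.6,
    "age_under_50": 0.2,
    "dense": 0.3,
    "prior_mammogram": -0.4,
}


def p_positive(cancer, age_under_50, dense, prior_mammogram):
    """Modeled probability of a positive mammogram for a 0/1 covariate profile."""
    log_odds = (
        COEF["intercept"]
        + COEF["cancer"] * cancer
        + COEF["age_under_50"] * age_under_50
        + COEF["dense"] * dense
        + COEF["prior_mammogram"] * prior_mammogram
    )
    return expit(log_odds)


def adjusted_sensitivity(**profile):
    """P(positive mammogram | cancer present, profile)."""
    return p_positive(cancer=1, **profile)


def adjusted_specificity(**profile):
    """P(negative mammogram | cancer absent, profile)."""
    return 1.0 - p_positive(cancer=0, **profile)


profile = {"age_under_50": 0, "dense": 1, "prior_mammogram": 1}
print(adjusted_sensitivity(**profile), adjusted_specificity(**profile))
```

Under this reading, sensitivity is the modeled probability of a positive mammogram among women with cancer, and specificity is one minus that probability among women without cancer, each evaluated at the same age, density, and prior-mammogram profile.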


    RESULTS
 
Table 1 shows the characteristics of women who underwent screening and diagnostic mammography and were included in these analyses. The mean age of the population undergoing screening mammography was 54.5 years ± 11.8 (SD), and the median age was 53 years. Nearly 40% (18,043 of 47,651) of women undergoing screening mammography were either unsure (n = 1,191) of their menopausal status or were premenopausal (n = 16,852). The majority of women, 43,738 (92%) of 47,651, who presented for screening mammography reported a history of prior mammography.



 
TABLE 1. Characteristics of Women Undergoing Mammography
 
The recall rate of screening mammography was 8.3% (3,947 of 47,651). Twenty-three women who underwent screening mammography underwent supplementary imaging but retained an incomplete assessment status, and 516 women who underwent screening mammography had no record of additional imaging. The final BI-RADS assessments for the remaining 3,408 women were negative, 2,211 (64.9%); benign, 864 (25.3%); probably benign, 268 (7.9%); suspicious, 62 (1.8%); or highly suggestive of malignancy, three (0.1%). US was used or recommended in 3.5% (1,681 of 47,651) of women with a screening indication and 17.5% (1,074 of 6,152) of women with a diagnostic indication. Pathologic findings were available in 130 women with indeterminate results from index screening examinations, including 42 of the 516 women who had no record of supplementary imaging. Twenty-eight malignancies were reported in the group with additional imaging and three in the women with no record of supplementary imaging.
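
The proportions in the preceding paragraph can be checked with simple arithmetic. The sketch below uses only counts stated in that paragraph; the variable names are hypothetical.

```python
def percent(numerator, denominator):
    """Express a proportion as a percentage."""
    return 100.0 * numerator / denominator


screening_total = 47651
diagnostic_total = 6152

recalled = 3947            # screens assessed as incomplete or needing more imaging
still_incomplete = 23      # supplementary imaging done, assessment still incomplete
no_further_imaging = 516   # no record of additional imaging

# Women whose recalled examination reached a final assessment.
resolved = recalled - still_incomplete - no_further_imaging

recall_rate = percent(recalled, screening_total)
us_screening = percent(1681, screening_total)
us_diagnostic = percent(1074, diagnostic_total)

print(resolved)
print(round(recall_rate, 1), round(us_screening, 1), round(us_diagnostic, 1))
```

The result matches the text: 3,408 resolved examinations, a recall rate of 8.3%, and US use of 3.5% (screening) and 17.5% (diagnostic).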

Tables 2 and 3 list the frequency of final assessments with corresponding recommendations and cancer outcomes for both screening and diagnostic mammography. No recommendation accompanied the assessment in 0.5% (224 of 47,651) of women presenting for screening and in 0.8% (46 of 6,152) of women presenting for diagnostic mammography. The majority, 90.1% (n = 42,925), of screening mammograms were negative (categories 1 or 2), and 98.9% (n = 42,440) of negative screening mammograms had recommendations for routine follow-up. A smaller proportion, 68.7% (n = 4,227), of diagnostic mammograms were considered negative. Approximately 11.1% (n = 472) of negative diagnostic examinations had nonroutine recommendations. Assessments with results that were suspicious or highly suggestive of malignancy composed 1.8% (n = 842) of screening and 6.5% (n = 402) of diagnostic examinations. A recommendation for either biopsy or surgical consultation accompanied 78.6% (n = 602) of screening examinations with suspicious-looking results and 92.1% (n = 70) of those with results highly suggestive of malignancy. This pattern was also seen with diagnostic mammography.



 
TABLE 2. Recommendations and Cancer Outcomes by Assessment Status for Women with Screening Indications
 


 
TABLE 3. Recommendations and Cancer Outcomes by Assessment Status for Women with Diagnostic Indications
 
Seven percent (n = 3,345) of screening mammograms and 21.8% (n = 1,341) of diagnostic mammograms were considered probably benign. For fewer than two-thirds of the probably benign assessments (63.1%, screening; 64.1%, diagnostic), a short-interval follow-up of less than 270 days was recommended. A small minority (1.1% [539 of 47,651]) of women who underwent screening examinations had incomplete assessments despite the opportunity to resolve this status with supplementary imaging. Therefore, BI-RADS recommendations were generally consistent, except for probably benign assessments.

Tables 2 and 3 also show the frequency of malignancy associated with specific assessment categories. As expected, the frequency of malignancy increases with the severity of the assessment code. Unresolved incomplete screening assessments had a malignancy rate similar to that of the probably benign category but were more highly associated with malignancy in women with diagnostic indications. Malignancy was present in less than 2% of the probably benign assessments, which is commensurate with published results (27,28).

Screening mammography helped detect malignancy in 3.3 per 1,000 women. Diagnostic mammography was used to identify cancer in 21.5 per 1,000 patients. Malignancy was diagnosed in 59 of 46,194 women after a negative screening examination and in 37 of 5,381 women with negative results at diagnostic mammography. The interval cancer rate was 1.2 per 1,000 women for screening and 6.0 per 1,000 for diagnostic mammography.
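
The per-1,000 rates above follow from counts given in the Results: 155 screening-detected and 132 diagnostic-detected cancers (reported with the cancer characteristics), and 59 and 37 cancers after negative examinations. In the sketch below, the choice of denominator is an inference from the arithmetic rather than something the text states: the reported interval cancer rates are reproduced when all examined women, not only those with negative results, are used as the denominator.

```python
def per_thousand(events, women):
    """Rate per 1,000 women examined."""
    return 1000.0 * events / women


screening_total = 47651
diagnostic_total = 6152

rates = {
    "screening detection": per_thousand(155, screening_total),
    "screening interval cancer": per_thousand(59, screening_total),
    "diagnostic detection": per_thousand(132, diagnostic_total),
    "diagnostic interval cancer": per_thousand(37, diagnostic_total),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 1,000")
```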

Table 4 outlines sensitivity, specificity, positive predictive value, and negative predictive value of screening and diagnostic mammography. Table 5 shows estimated sensitivity and specificity of screening mammography according to mammographic history, breast density, and age (less than 50 years vs 50 years or older). Similar results (not shown) were seen in the diagnostic mammography population.



 
TABLE 4. Unadjusted Performance Indicators
 


 
TABLE 5. Adjusted Screening Sensitivity and Specificity by Breast Density, Age, and Prior Mammogram Status
 
Figure 2 illustrates the characteristics of 383 malignancies and details additional staging information in 234 of 319 invasive cancers. The biopsy yield was 22.4% (214 of 957) for screening mammography and 27.5% (169 of 615) for diagnostic mammography. Ductal carcinoma in situ accounted for 20.7% (32 of 155) of screening-detected malignancy versus 12.1% (16 of 132) of cancers identified by means of diagnostic mammography. Nearly 14% (13.6% [eight of 59]) of interval cancers after screening and 21.6% (eight of 37) of interval cancers after diagnostic mammography were ductal carcinoma in situ. The mean and median tumor sizes of 88 screening-detected invasive cancers were 16.4 mm ± 12.1 and 13 mm, respectively. Almost 80% (70 of 88) of invasive malignancies detected at screening mammography did not have axillary lymph node metastases. In contrast, the mean and median tumor sizes and axillary node negativity rate of 90 invasive cancers recognized with diagnostic mammography were 22.9 mm ± 16.1, 20 mm, and 64.4%, respectively.



 
Figure 2. Flow diagram shows characteristics of 383 cancers in 53,803 women who underwent mammography from November 1, 1996, through October 31, 1997.

 
The mean and median tumor sizes and node negativity rate of 36 interval cancers after screening mammography were 17.5 mm ± 14.3, 12.5 mm, and 72.2%, respectively. For 20 interval cancers after diagnostic mammography, the mean and median tumor sizes were 19.6 mm ± 15.7 and 16.5 mm, respectively, with a node negativity rate of 80.0%. For interval cancers, the mean time of diagnostic delay (ie, time from original examination date to pathologic examination date) was 176 days (95% CI: 147, 195 days; median, 180 days; range, 5–365 days).


    DISCUSSION
 
Our data suggest that screening mammography as practiced in a diverse community setting in New Hampshire is considerably less sensitive than the 85% guideline published by the Agency for Health Care Policy and Research. We report mammographic sensitivities of 72.4%–78.1% and specificities of 89.3%–97.3%. These sensitivity estimates are lower than most previously reported. However, there are some important methodologic differences between our study and other studies (1–13,15–19) that tend to lower sensitivity and raise specificity.

We based mammographic outcome on the prospective report of the BI-RADS assessment and recommendation encoded by the interpreting radiologist. Although BI-RADS is useful for standardization, it is not always used correctly and does not always address complex imaging and clinical circumstances. Unlike the results of most other studies, the mammographic results in our study reflect the status of completed imaging evaluations. We classified a probably benign (category 3) assessment as a negative mammographic outcome. Almost half, 46% (27 of 59), of the interval cancers after screening mammography were assessed as probably benign. Had we classified probably benign mammograms as positive, our sensitivity and specificity for screening mammography would have been 84.6% and 90.4%, respectively.

We defined false-negative results (interval cancers) on the basis of the report of cancer outcomes in the 365 days after negative mammography. Thus, negative mammograms with no pathology report were considered as true-negative results, whereas positive mammograms with no pathology report were considered false-positive results. Because women with positive examinations are more likely to have aggressive follow-up, there is a potential for underestimating the false-negative rate and overestimating the true-negative rate. This phenomenon, known as verification bias (29), may lead to overestimation of sensitivity and underestimation of specificity. We have minimized this potential bias by obtaining cancer outcomes by using breast pathology reports from participating laboratories, as well as New Hampshire tumor registry data, and by excluding data from mammography centers associated with nonparticipating laboratories.

This study may provide a more comprehensive account of positive disease outcomes than have other studies (10,13,15,16,21,30,31) that have estimated false-negative mammographic cancer outcomes or relied exclusively on tumor registry data. Our lower estimate of sensitivity compared with those of these other studies may have been the result of our more complete capture of cancer outcomes.

The timing of mammography with respect to clinical breast examination may also alter operating characteristics, especially sensitivity. Some studies (32–34) that reported lower interval cancer rates offered clinical breast examination at the time of screening mammography, which will decrease the interval cancer rate owing to coincidental detection by means of clinical breast examination of mammographically occult cancers.

These methodologic differences may help explain our sensitivity of 72.4% for screening mammography and corresponding interval cancer rate of 1.2 per 1,000 women. Although the sensitivity we report is within the range of sensitivities of 68%–88% (detection method) noted by Fletcher and colleagues (1) for seven randomized controlled trials, it is lower than those of 91%–93% in studies (10,11) from single expert centers. Our sensitivity estimate more closely approximates the rate of 79.9% for linked screening examinations noted with the New Mexico Mammography Project (21).

The interval cancer rate of 1.2 per 1,000 women in our study is also higher than that in other published studies (31,33–35). Interestingly, the tumor sizes and nodal status of the interval cancers in our study were relatively favorable, especially when compared with the staging characteristics of the malignancies identified at diagnostic mammography. We believe this may relate to the preponderance of prior mammography (92%) in the population undergoing screening mammography. We hypothesize that prior screening mammography may have been effective in leading to the extraction of larger tumors from that population, leaving smaller, less detectable cancers available for discovery at the subsequent screening examination. This may also reflect a clinical decision to perform a biopsy in the setting of a probably benign mammogram or an initially indeterminate examination that resolved to a negative status (categories 1, 2, or 3) after supplementary imaging was completed.

The cancer detection rate of 3.3 cancers per 1,000 women who underwent screening mammography in our study is comparable with that in other studies (17,35–37), given the age distribution and history of prior mammography in the population in our study. One would expect to detect two to four cancers per 1,000 women at annual follow-up screening mammography and six to 10 cancers per 1,000 women at baseline screening mammography (38).

The characteristics of the screening-detected cancers in our study compare favorably with those of other studies (3,10,11,21,37–39). Roughly 21% of our screening-detected cancers were ductal carcinoma in situ, which is within the range of 19%–27% from prior North American studies (10,11,21,37–39). Mean and median tumor sizes of the invasive cancers in our study were equal or smaller (10,11,21,37–39). The rate of axillary nodal metastases of 20% for invasive malignancy is also comparable (11,21,39), given that studies (10,38) reporting lower axillary node positivity rates have included ductal carcinoma in situ.

Other measures of screening mammographic performance, including specificity (97.3%), positive predictive value (10.6%), and recall rate (8.3%), meet the standards of the Agency for Health Care Policy and Research (14). We recognize that these estimates are somewhat inflated by our decision to base mammographic outcome on a completed imaging work-up, to consider probably benign assessment and short-term follow-up to reflect a negative interpretive outcome, and to define recall rate as we did. We defined recall rate according to the guidelines described by Linver et al (40), which differ from more inclusive abnormality rates reported by other investigators (10,19,21).

In addition to the traditional performance indicators described earlier, we also evaluated the use of BI-RADS by our interpreting physicians. It was reassuring to note that an appropriate recommendation followed the BI-RADS assessment most frequently, but there were a small number of inappropriate recommendations for all assessments for both screening and diagnostic mammography. Gross misapplications of BI-RADS, such as a recommendation for biopsy or surgical consultation in the setting of a negative or benign assessment or a recommendation for routine follow-up in the setting of an assessment that was suspicious or highly suggestive of malignancy, occurred rarely.

Although we suspect that some inappropriate recommendations represent coding errors, misclassification of screening and diagnostic indications, indecisiveness resulting in multiple recommendations, or additional nonroutine recommendations for the contralateral breast, a detailed analysis of these cases is beyond the scope of this article. Some discordance may reflect the difficulty of applying a rigid coding system to a complex and sometimes ambiguous set of clinical management alternatives. However, to some extent, the presence of discordant recommendations may indicate a lack of understanding of BI-RADS by some interpreting radiologists.

Discordant recommendations were especially evident for mammograms interpreted as probably benign, which were associated with a considerable number of routine follow-up recommendations (22%, screening; 21%, diagnostic) and a higher-than-expected rate of immediate additional imaging (14%, screening; 11%, diagnostic), predominantly US (11%, screening; 10%, diagnostic). In these instances, the interpreters appear to have misclassified benign and incomplete assessments as probably benign; this underscores the need for training mammographers in the use of BI-RADS, especially as it relates to the appropriate classification and corresponding recommendations of benign, probably benign, and incomplete assessments.

Although one strength of this study is the collection of standardized data from a diverse group of mammography facilities, we rely on many individuals to provide correctly coded data instruments, to oversee data entry, and to meticulously manage a large database. We acknowledge that, despite extensive quality assurance measures, human error, including misclassification and incorrect or incomplete coding, is a concern.

Another concern is the composition of our subject population. Approximately 98% of our study population is white. This ethnic distribution is similar to that of other population-based mammography databases in the Northeast and the Northwest (41,42) but differs from that of study populations in mammography databases in other regions of the country (21,36).

In conclusion, our data suggest that the sensitivity of screening mammography (72.4%) is lower than generally believed, although other indicators of interpretive performance, including cancer detection rate, specificity, positive predictive value for a completed imaging work-up, recall rate, and the characteristics of screening-detected cancers, satisfy or exceed standards. Part, but not all, of the reduction in sensitivity can be explained by the preponderance of prior mammographic screening in the population in our study. We also learned that roughly 8% of women presenting for screening mammography had indeterminate examination results necessitating supplementary imaging, which included US 23% of the time. Approximately 90% of screening mammograms were considered negative or definitively benign; 7%, probably benign; 2%, suspicious or highly suggestive of malignancy; and 1%, indeterminate. Appropriate recommendations followed these assessment categories most of the time, although in the setting of a probably benign finding (category 3), recommendations frequently were misapplied. Further education of radiologists in the intended use of the BI-RADS lexicon may help address this problem.
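
For readers who want to reproduce indicators of this kind from their own audit data, the following Python sketch computes them from a two-by-two table of screening results against cancer outcomes. It assumes the commonly used audit definitions, which may differ in detail from those given in Materials and Methods (for example, the positive predictive value here is the generic form, not the value for a completed imaging work-up). The function name and the counts are hypothetical and are not data from this study.

```python
def audit_metrics(tp, fp, tn, fn):
    """Compute basic screening audit indicators from outcome counts.

    tp: positive screens with cancer    fp: positive screens without cancer
    tn: negative screens without cancer fn: negative screens with cancer
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        # Assumes every positive screen leads to recall.
        "recall_rate": (tp + fp) / total,
        "cancers_detected_per_1000": 1000 * tp / total,
    }


# Hypothetical counts for 1,000 screening examinations.
metrics = audit_metrics(tp=8, fp=90, tn=900, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```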


    FOOTNOTES
 
Abbreviations: BI-RADS = Breast Imaging Reporting and Data System, NHMN = New Hampshire Mammography Network

Author contributions: Guarantor of integrity of entire study, S.P.P.; study concepts, S.P.P., A.N.T., P.A.C., W.A.W.; study design, S.P.P., A.N.T., P.A.C.; definition of intellectual content, S.P.P., A.N.T., P.A.C.; literature research, S.P.P., A.N.T., P.A.C.; clinical studies, S.P.P., P.A.C., W.A.W.; experimental studies, S.P.P., P.A.C., A.N.T., M.R.G.; data acquisition, P.A.C., M.R.G.; data analysis, M.R.G., A.N.T.; statistical analysis, M.R.G., A.N.T.; manuscript preparation and editing, S.P.P., A.N.T., M.R.G., P.A.C.; manuscript review, all authors.


    REFERENCES

  1. Fletcher SW, Black W, Harris R, Rimer BK, Shapiro S. Report of the International Workshop on Screening for Breast Cancer. J Natl Cancer Inst 1993; 85:1644-1656.
  2. Nystrom L, Rutqvist LE, Wall S, et al. Breast cancer screening with mammography: overview of Swedish randomised trials. Lancet 1993; 341:973-978.
  3. Tabar L, Fagerberg G, Duffy SW. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin North Am 1992; 30:187-210.
  4. Roberts MM, Alexander F, Anderson TJ. Edinburgh trial of screening for breast cancer: mortality at seven years. Lancet 1990; 335:241-246.
  5. Frisell J, Eklund G, Hellstrom L. Randomized study of mammography screening: preliminary report on mortality in the Stockholm trial. Breast Cancer Res Treat 1991; 18:49-56.
  6. Shapiro S, Venet W, Strax P, eds. Periodic screening for breast cancer. Baltimore, Md: Johns Hopkins University Press, 1988.
  7. Andersson I, Aspegren K, Janzon L, et al. Mammographic screening and mortality from breast cancer: the Malmo mammographic screening trial. BMJ 1988; 297:943-948.
  8. Elwood JM, Cox B, Richardson AK. The effectiveness of breast cancer screening by mammography in younger women. Online J Curr Clin Trials February 25, 1993; doc 32.
  9. Kerlikowske K, Grady D, Rubin SM. Efficacy of screening mammography: a meta-analysis. JAMA 1995; 273:149-154.
  10. Sickles EA, Ominsky SH, Sollitto RA, Galvin HB, Monticciolo DL. Medical audit of a rapid-throughput mammography screening practice: methodology and results of 27,114 examinations. Radiology 1990; 175:323-327.
  11. Bird RE. Low-cost screening mammography: report on finances and review of 21,716 consecutive cases. Radiology 1989; 171:87-90.
  12. Spring DB, Kimbrell-Wilmot K. Evaluating the success of mammography at the local level: how to conduct an audit of your practice. Radiol Clin North Am 1987; 25:983-992.
  13. Margolin FR, Lagios MD. Development of mammography and breast services in a community hospital. Radiol Clin North Am 1987; 25:973-982.
  14. Clinical practice guideline number 13: quality determinants of mammography. Rockville, Md: US Dept of Health and Human Services, Agency for Health Care Policy and Research, 1994; AHCPR publication 95-0632.
  15. Robertson CL. A private breast imaging practice: medical audit of 25,788 screening and 1,077 diagnostic examinations. Radiology 1993; 187:75-79.
  16. Wolfe JN, Buck KA, Salane M, Parekh NJ. Xeroradiography of the breast: overview of 21,057 consecutive cases. Radiology 1987; 165:305-311.
  17. Moseson D. Audit of mammography in a community setting. Am J Surg 1992; 163:544-546.
  18. Braman DM, Williams HD. ACR accredited suburban mammography center: three year results. J Fla Med Assoc 1989; 76:1031-1034.
  19. Brown ML, Houn F, Sickles EA, Kessler LG. Screening mammography in community practice: positive predictive value of abnormal findings and yield of follow-up diagnostic procedures. AJR Am J Roentgenol 1995; 165:1373-1377.
  20. Beam CA, Layde PM, Sullivan DC. Variability in the interpretation of screening mammograms by radiologists. Arch Intern Med 1996; 156:209-213.
  21. Rosenberg RD, Lando JF, Hunt WC, et al. The New Mexico Mammography Project: screening mammography performance in Albuquerque, New Mexico, 1991 to 1993. Cancer 1996; 78:1731-1739.
  22. Rosenberg RD, Hunt WC, Williamson MR, et al. Effects of age, breast density, ethnicity, and estrogen replacement therapy on screening mammographic sensitivity and cancer stage at diagnosis: review of 183,134 screening mammograms in Albuquerque, New Mexico. Radiology 1998; 209:511-518.
  23. Kopans DB, D’Orsi CJ, Adler DD, et al. Breast imaging reporting and data system. 3rd ed. Reston, Va: American College of Radiology, 1998; 93-95.
  24. Carney PA, Poplack SP, Wells WA, et al. The New Hampshire Mammography Network: the development and design of a population-based registry. AJR Am J Roentgenol 1996; 167:367-372.
  25. Ballard-Barbash R, Taplin SH, Yankaskas BC, et al. Breast Cancer Surveillance Consortium: a national mammography screening and outcomes database. AJR Am J Roentgenol 1997; 169:1001-1008.
  26. Carney PA, Goodrich ME, O’Mahony D, et al. Mammography in New Hampshire: characteristics of the women and the exams they receive. J Community Health 2000; 25:183-198.
  27. Sickles EA. Periodic mammographic follow-up of probably benign lesions: results in 3,184 consecutive cases. Radiology 1991; 179:463-468.
  28. Varas X, Leborgne F, Leborgne JH. Nonpalpable, probably benign lesions: role of follow-up mammography. Radiology 1992; 184:409-414.
  29. Begg CB, McNeil BJ. Assessment of radiologic tests: control of bias and other design considerations. Radiology 1988; 167:565-569.
  30. Sickles EA. Quality assurance: how to audit your own mammography practice. Radiol Clin North Am 1992; 30:265-275.
  31. Burhenne HJ, Burhenne LW, Goldberg F, et al. Interval breast cancers in the screening mammography program of British Columbia: analysis and classification. AJR Am J Roentgenol 1994; 162:1067-1071.
  32. Seidman H, Gelb SK, Silverberg E, LaVerda N, Lubera JA. Survival experience in the Breast Cancer Detection Demonstration Project. CA Cancer J Clin 1987; 37:258-290.
  33. Miller AB, Baines CJ, To T, Wall C. Canadian National Breast Screening Study. I. Breast cancer detection and death rates among women aged 40 to 49 years. Can Med Assoc J 1992; 147:1459-1476.
  34. Miller AB, Baines CJ, To T, Wall C. Canadian National Breast Screening Study. II. Breast cancer detection and death rates among women aged 50 to 59 years. Can Med Assoc J 1992; 147:1477-1488.
  35. Moskowitz M. Interval breast cancers in the screening mammography program of British Columbia: commentary (editorial). AJR Am J Roentgenol 1994; 162:1072-1075.
  36. Kerlikowske K, Grady D, Barclay J, Sickles EA, Eaton A, Ernster V. Positive predictive value of screening mammography by age and family history of breast cancer. JAMA 1993; 270:2444-2450.
  37. Burhenne LW, Hislop TG, Burhenne HJ. The British Columbia Mammography Screening Program: evaluation of the first 15 months. AJR Am J Roentgenol 1992; 158:45-49.
  38. Linver MN, Paster SB, Rosenberg RD, Key CR, Stidley CA, King WV. Improvement in mammography interpretation skills in a community radiology practice after dedicated teaching courses: 2-year medical audit of 38,633 cases. Radiology 1992; 184:39-43.
  39. Morrison AS, Brisson J, Khalid N. Breast cancer incidence and mortality in the Breast Cancer Detection Demonstration Project. J Natl Cancer Inst 1988; 80:1540-1546.
  40. Linver MN, Osuch JR, Brenner JR, Smith RA. The mammography audit: a primer for the mammography quality standards act (MQSA). AJR Am J Roentgenol 1995; 165:19-25.
  41. Geller BM, Worden JK, Ashley JA, Oppenheimer RG, Weaver DL. Multipurpose statewide breast cancer surveillance system: the Vermont experience. J Registry Manage 1996; 23:168-174.
  42. Thompson RS, Barlow WE, Taplin SH, et al. A population-based case-cohort evaluation of the efficacy of mammographic screening for breast cancer. Am J Epidemiol 1994; 140:889-901.


