Breast Imaging
1 From the Department of Radiology, Rhode Island Hospital, Brown Medical School, 593 Eddy St, Providence, RI 02903. From the 2004 RSNA Annual Meeting. Received December 15, 2004; revision requested February 15, 2005; revision received April 17; accepted May 31; final version accepted July 18. Address correspondence to E.L. (e-mail: elazarus{at}lifespan.org).
| ABSTRACT |
|---|
Materials and Methods: Institutional review board approval was obtained; informed consent was not required. This study was HIPAA compliant. Ninety-four consecutive lesions in 91 women who underwent image-guided biopsy comprised 59 masses, 32 calcifications, and three masses with calcification. Five radiologists retrospectively reviewed these lesions. Each observer described each lesion with BI-RADS terminology and assigned a final BI-RADS category. Interobserver variability was assessed with the Cohen κ statistic. A pathologic diagnosis was available for all 94 lesions; 30 (32%) were malignant and 64 (68%) were benign. Pathologic analysis of benign lesions was performed on tissue obtained with image-guided core-needle biopsy. In cases referred for excisional biopsy after needle biopsy because of atypia or discordance, final surgical pathologic analysis was used for correlation with imaging findings. The positive predictive value (PPV) for category 4 or 5 lesions was determined for all readers combined.
Results: For ultrasonographic (US) descriptors, substantial agreement was obtained for lesion orientation, shape, and boundary (κ = 0.61, 0.66, and 0.69, respectively). Moderate agreement was obtained for lesion margin and posterior acoustic features (κ = 0.40 for both). Fair agreement was obtained for lesion echo pattern (κ = 0.29). For mammographic descriptors, moderate agreement was obtained for mass shape, mass margin, and calcification distribution (κ = 0.48, 0.48, and 0.50, respectively). Fair agreement was obtained for calcification description (κ = 0.32). Slight agreement was obtained for mass density (κ = 0.18). Fair agreement was obtained for final assessment category (κ = 0.28). PPVs of BI-RADS category 4 and 5 assignments were as follows: category 4a, six (6%) of 102; category 4b, 17 (15%) of 110; category 4c, 48 (53%) of 91; and category 5, 71 (91%) of 78.
Conclusion: Interobserver agreement with the new BI-RADS terminology is good and validates the US lexicon. Subcategories 4a, 4b, and 4c are useful in predicting the likelihood of malignancy.
© RSNA, 2006
| INTRODUCTION |
|---|
Breast ultrasonography (US) has proved to be useful in the evaluation of masses detected with mammography or clinical examination, as US is used to distinguish cystic lesions from solid lesions and to further differentiate benign solid masses from malignant solid masses (4–6). However, standard terminology for describing lesions on breast sonograms has been lacking. Use of BI-RADS for breast US should standardize the reporting and classification of lesions detected on sonograms, thereby improving the utility of US in the work-up of breast masses. By identifying lesion descriptors that emphasize the distinctions between benign and malignant lesions on breast sonograms, the BI-RADS lexicon for US clarifies the indication for biopsy of particular lesions.
One other major addition in the fourth edition of BI-RADS involves the subcategorization of category 4 lesions. While classification of a lesion as BI-RADS category 4 indicates that a lesion has been recommended for biopsy, it provides no frame of reference for either the referring physician or the patient as to the prebiopsy risk for malignancy. Dividing category 4 lesions into those with a small (category 4a), moderate (category 4b), or substantial (category 4c) (1) likelihood of malignancy better informs the physician and patient as to the level of concern regarding the lesion and prepares both the physician and the patient for the likely biopsy findings and the potential need for follow-up.
Our study had two purposes: to retrospectively evaluate interobserver variability between breast radiologists who used the new BI-RADS terminology to characterize lesions identified on both mammograms and sonograms and to retrospectively determine the positive predictive value of the new BI-RADS categories (4a, 4b, and 4c) as they are used by radiologists performing breast imaging.
| MATERIALS AND METHODS |
|---|
Patient Lesions and Interpretation
The authors (all radiologists with subspecialty expertise in breast imaging) retrospectively evaluated 94 consecutive lesions in 91 women (mean age, 55 years; age range, 28–85 years) who underwent image-guided biopsy between August 1, 2002, and October 4, 2002. Four of the radiologists (E.L., M.B.M., L.S.L., and S.L.K.) underwent fellowship training in breast imaging and have practiced as faculty in an academic breast imaging section for 2–10 years. The other radiologist (B.S.) has been interpreting breast images for more than 30 years in her position as chief of the same breast imaging section. All five radiologists practice within the same group, and all met the standards of the Mammography Quality Standards Act as qualified interpreting physicians. The 94 lesions comprised 59 masses, 32 calcifications, and three masses with calcification; 52 lesions were evaluated by reviewing both mammograms and sonograms, 32 were evaluated by reviewing mammograms alone, and 10 were evaluated by reviewing sonograms alone. The lesions were evaluated on original mammograms and sonograms marked to indicate lesions, without the benefit of prior mammograms and sonograms for comparison. The lesions were marked so that all observers would examine the same biopsy-proved lesion, as some images showed more than one finding. Mammographic magnification views were obtained for all calcifications analyzed. All 62 masses were evaluated with US; 52 (84%) masses were evaluated with both mammography and US. Of the 94 lesions, 30 (32%) were malignant and 64 (68%) were benign.
Pathologic diagnosis was available for all lesions. Pathologic findings in benign lesions were evaluated with results from image-guided core-needle biopsy; a 14-gauge needle was used for US-guided biopsy, and a 9-gauge needle was used for vacuum-assisted stereotactic biopsy. In cases referred for excisional biopsy after needle biopsy because of atypia, positivity for malignancy, or discordance, final surgical pathologic analysis was used for correlation with imaging findings.
Each observer described each lesion by using the terminology of the fourth edition of the BI-RADS lexicon and assigned a final BI-RADS category, including the new subcategories of BI-RADS category 4. Each observer was provided a sheet containing the BI-RADS categories and descriptors for lesions seen on both mammograms and sonograms and instructed to select the most appropriate descriptors for each lesion.
While the evaluating radiologists were familiar with the "Guidance Chapter" (1) criteria for BI-RADS subcategorization, no formal training in the newest lexicon was provided. Thus, the criteria used by each evaluating radiologist were subjective and based on prior knowledge and experience. Table 1 lists the BI-RADS terminology used in this study.
The Cohen κ statistic was used to assess interreader agreement for all descriptor variables. The guidelines of Landis and Koch were followed in interpreting κ values: 0.00–0.20, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; and 0.81–1.00, almost perfect agreement (7). All κ statistics were calculated with statistical software (Stata, version 8; Stata, College Station, Tex). The positive predictive value for category 4 and 5 lesions was determined by using data from the assessments of all readers combined. To determine if the results were skewed by any readers, we assessed the reliability between pairs of readers. Data from the final BI-RADS categorization of each lesion were used to determine the Pearson product-moment correlation coefficients between each pair of readers. These statistics were calculated with the SAS system (SAS Institute, Cary, NC). Intraclass correlation was also determined for final BI-RADS categorization by using (a) all results from each reader and (b) the mean of the results from each reader (Appendix).
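The agreement statistic and the Landis and Koch bands described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Stata procedure used in the study: it computes the two-rater form of the Cohen κ, whereas the study pooled five readers, and the function names `cohen_kappa` and `landis_koch` and the toy shape ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Two-rater Cohen kappa: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of lesions given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch bands quoted in the text.

    Assumption: negative values are folded into "slight" for simplicity.
    """
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical toy ratings of mass shape by two readers (not study data).
reader_1 = ["oval", "oval", "irregular", "round"]
reader_2 = ["oval", "irregular", "irregular", "round"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)} agreement)")
```

Because κ subtracts the agreement expected by chance, two readers who agree on three of four lesions score well below 0.75 here.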
| RESULTS |
|---|
For mammographic masses, overall agreement for shape was moderate (κ = 0.48). Substantial agreement was seen when the mass shape was characterized as irregular (κ = 0.68), moderate agreement was seen when the mass shape was characterized as oval (κ = 0.46), and fair agreement was seen when the mass shape was characterized as lobular (κ = 0.24) or round (κ = 0.31) (Table 2).
Overall agreement for mass margin was moderate (κ = 0.48). The greatest agreement was seen when the margins were characterized as circumscribed (κ = 0.60) or spiculated (κ = 0.69). Poor agreement was seen when the margins were characterized as microlobulated (κ = 0.08), while fair agreement was seen when margins were characterized as indistinct (κ = 0.38) or obscured (κ = 0.27).
Overall agreement for mass density was slight (κ = 0.18). Agreement was slight when mass density was described as fat containing (κ = 0.11), equal to breast tissue (κ = 0.15), or high (κ = 0.20).
Mammographic Assessment of Calcifications
Agreement was nearly perfect when assessing the presence of calcifications (κ = 0.94) (Table 2). The five observers demonstrated overall fair agreement when they described the calcifications (κ = 0.32). Use of the terms amorphous and fine branching resulted in moderate agreement (κ = 0.45 and 0.49, respectively). Agreement was fair for use of the terms coarse heterogeneous (κ = 0.27), pleomorphic (κ = 0.21), and vascular (κ = 0.24). Use of the terms coarse (n = 3), dystrophic (n = 1), milk of calcium (n = 2), and punctate (n = 2) was uncommon in this series of lesions and led to low κ values (κ = 0.01 for these terms). Figure 1 shows an image for which reviewers agreed on the number and distribution of calcifications but disagreed on the description of calcifications.
Moderate agreement was obtained for the distribution (κ = 0.50) and number (κ = 0.48) of calcifications.
In evaluating the presence of architectural distortion, agreement was fair (κ = 0.26). Agreement between readers for the presence of associated findings and special cases could not be assessed because the readers identified such findings in too few lesions.
Sonographic Assessment
For sonographic descriptors, substantial agreement was obtained for assessment of lesion orientation (κ = 0.61), which was described as parallel or not parallel (Table 3).
Substantial agreement was obtained for lesion shape (κ = 0.66). The greatest agreement was achieved when lesion shape was described as irregular (κ = 0.70) or oval (κ = 0.71). The term round was used infrequently, and less agreement was found with its use (κ = 0.29).
Evaluation of the lesion boundary, which was described as abrupt or having an echogenic halo, yielded similarly substantial agreement (κ = 0.69).
Fair agreement was achieved for evaluation of the lesion margin (κ = 0.40). Excellent agreement was seen with lesions that were considered circumscribed (κ = 0.71). Fair agreement was seen with lesions that were considered angular (κ = 0.22), indistinct (κ = 0.22), microlobulated (κ = 0.25), or spiculated (κ = 0.26).
The many terms available to describe the US echo pattern yielded fair agreement between observers (κ = 0.29). The terms complex (κ = 0.40) and hypoechoic (κ = 0.29) were used most commonly and demonstrated the most agreement in their use. The terms anechoic (κ = 0.01) and isoechoic (κ = 0.05) were used rarely, which likely accounted for poor agreement in their use. The term hyperechoic also was used rarely and demonstrated only slight agreement (κ = 0.16).
Overall fair agreement was achieved in describing posterior acoustic features (κ = 0.40). The greatest agreement (κ = 0.66) was achieved when lesions were described as having posterior acoustic shadowing. When lesions were described as having either posterior enhancement or no change in the echo pattern, agreement was fair (κ = 0.39 and κ = 0.31, respectively). Lesions were rarely described as having combined posterior enhancement and shadowing, and agreement between observers who used this description was poor (κ = 0.09). Figure 2 shows a lesion with good agreement for lesion orientation, shape, boundary, and echogenicity but disagreement for assessment of the lesion margin and posterior acoustic features.
No κ statistics could be calculated for assessment of alterations in the surrounding tissue, presence of calcifications, lesions determined to be special cases, or lesion vascularity because the observers believed these findings were present in the lesions only on rare occasions.
Final Assessment Category
Fair agreement was achieved for the final assessment category (κ = 0.28) across all final categories (2, 3, 4a, 4b, 4c, and 5) (Table 4). The greatest agreement was found with lesions categorized as highly suspicious for malignancy (category 5) (κ = 0.56). Although few lesions were rated as benign (category 2), agreement with this category was fair (κ = 0.27). Fair agreement was obtained for category 3 and 4c lesions (κ = 0.32 and 0.26, respectively); however, there was poor agreement between observers for category 4a and 4b lesions (κ = 0.14 and 0.16, respectively). When the categories were grouped in terms of whether biopsy was required (categories 2 and 3 vs categories 4a, 4b, 4c, and 5), moderate agreement was obtained (κ = 0.45). Figure 3 shows a classic malignancy on a sonogram, with all reviewers classifying the malignancy as BI-RADS category 5.
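The grouping by biopsy recommendation amounts to collapsing the six final categories into two before agreement is measured. The sketch below illustrates why agreement rises after collapsing. It assumes that category 4c belongs in the biopsy group along with 4a, 4b, and 5, since every category 4 lesion is recommended for biopsy. The names `BIOPSY_CATEGORIES`, `collapse`, and `percent_agreement` and the two readers' category lists are hypothetical, not study data, and raw percent agreement is used here only for brevity in place of κ.

```python
# Assumption: all category 4 subcategories and category 5 imply biopsy.
BIOPSY_CATEGORIES = {"4a", "4b", "4c", "5"}


def collapse(categories):
    """Reduce final BI-RADS categories to a biopsy / no-biopsy decision."""
    return ["biopsy" if c in BIOPSY_CATEGORIES else "no biopsy"
            for c in categories]


def percent_agreement(rater_a, rater_b):
    """Fraction of lesions on which two readers give the same label."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


# Hypothetical final categories from two readers for six lesions.
reader_1 = ["3", "4a", "4b", "5", "4c", "2"]
reader_2 = ["3", "4b", "4c", "5", "4c", "3"]

raw = percent_agreement(reader_1, reader_2)
grouped = percent_agreement(collapse(reader_1), collapse(reader_2))
print(f"six categories: {raw:.0%}, biopsy vs no biopsy: {grouped:.0%}")
```

In this toy set the readers differ only on neighboring categories, so every disagreement disappears once the categories are grouped by management decision.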
| DISCUSSION |
|---|
Our results show a high degree of agreement in describing lesions on sonograms, thus demonstrating the appropriateness of the terms chosen in the newest iteration of BI-RADS. The terminology was familiar to radiologists experienced in breast imaging, and its use was generally concordant.
Agreement for lesion orientation, shape, and boundary on sonograms was slightly better than agreement for lesion margin on sonograms and mammograms and lesion shape on mammograms, likely because of the greater number of choices for describing the second set of qualifiers. The lower rate of agreement for lesion echo pattern on sonograms and mass density on mammograms suggests that observers had difficulty in making these categorizations. However, results from Stavros et al (5) and Baker et al (10) indicate that these qualifiers are not very useful in the differentiation of benign and malignant masses.
Agreement for calcification description on mammograms was somewhat lower in our study than in prior studies (11,12). The agreement may be lower because of bias introduced by the method chosen to select the cases. To have a pathologic correlation available for assessment of positive predictive value, we chose to evaluate lesions referred for biopsy in this study. As a consequence, there was rare use of typical benign descriptors, which may have better observer agreement.
The fair agreement for overall BI-RADS category reported in our study (κ = 0.28) was not much different from that reported by Berg et al (11) (κ = 0.37) in a prior study in which only mammography was used. The difference between the κ value obtained in our study and that obtained in the study of Berg et al (11) can be at least partially explained by the greater number of categories offered with inclusion of the 4a, 4b, and 4c subcategories.
Orel et al (13) showed that placement of mammographic lesions into BI-RADS categories is useful for predicting malignancy, with a positive predictive value of 30% for category 4 lesions and 97% for category 5 lesions. We obtained a similar positive predictive value of 91% for category 5 lesions.
While BI-RADS category 5 has always been used to identify lesions that are almost certainly malignant, BI-RADS category 4 historically has comprised a more heterogeneous population of lesions. Our results demonstrate that the optional subcategories of 4a, 4b, and 4c are useful in stratifying the likelihood of malignancy among the large heterogeneous group of category 4 lesions. This stratification is helpful in communicating the level of suspicion to referring physicians and patients, who may choose to use this information in management decisions (ie, which patients to refer to a breast specialist prior to biopsy). In our practice, some referring internists and gynecologists are willing to discuss the results of benign breast biopsies with their patients, but others prefer that patients receive the results of malignant breast biopsies from a breast surgeon who can counsel the patient on further intervention. The medical expenditure of an additional referral may not be necessary if the preprocedural risk assessment is low and can be clearly communicated.
For radiologists who use the subcategories, a medical audit of the positive predictive value of each category can provide additional feedback on interpretive performance.
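Such an audit reduces to dividing the number of malignant results by the number of assignments in each category. The sketch below applies that arithmetic to the counts reported in the abstract, which are reader-level assignments pooled over all five readers rather than per-lesion counts. The names `REPORTED_COUNTS`, `ppv`, and `pooled_ppv` are hypothetical.

```python
# (malignant assignments, total assignments) per category, as reported
# in the abstract for all five readers combined.
REPORTED_COUNTS = {
    "4a": (6, 102),
    "4b": (17, 110),
    "4c": (48, 91),
    "5": (71, 78),
}


def ppv(malignant, assigned):
    """Positive predictive value: malignant results / total assignments."""
    if assigned <= 0:
        raise ValueError("a category needs at least one assignment")
    return malignant / assigned


def pooled_ppv(counts, categories):
    """PPV after merging several categories into a single group."""
    malignant = sum(counts[c][0] for c in categories)
    assigned = sum(counts[c][1] for c in categories)
    return ppv(malignant, assigned)


for category, (malignant, assigned) in REPORTED_COUNTS.items():
    value = ppv(malignant, assigned)
    print(f"category {category}: {malignant}/{assigned} = {value:.1%}")

category_4 = pooled_ppv(REPORTED_COUNTS, ["4a", "4b", "4c"])
print(f"category 4 (4a-4c pooled): {category_4:.1%}")
```

Pooling the three subcategories gives 71 malignant results among 303 category 4 assignments, which shows how much of the spread between 4a and 4c a single category 4 figure would hide.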
Our study had several limitations. First, because we used only those lesions that were referred for biopsy, we had few descriptors typically associated with benign disease and few lesions that could be characterized as special cases. The small number of these cases may have decreased interobserver agreement in some areas, as Taplin et al (2) demonstrated that BI-RADS evaluation of negative findings and benign lesions is more consistent than BI-RADS evaluation of abnormalities.
Second, Berg et al (14) demonstrated that even for experienced breast imagers, BI-RADS training results in improved agreement for feature analysis and final assessment. While the guidance chapter in the fourth edition of the BI-RADS breast imaging atlas offers useful examples of lesions appropriate for each subcategory (1), the evaluating radiologists were not specifically asked to review these criteria. We did not provide specific training to the radiologists involved in this study to more accurately represent the usage of the majority of radiologists being introduced to this new version, who would likely have no formal training prior to its implementation. Additionally, this study was based on the performance of experienced breast imaging radiologists. Inconsistencies and errors in using the BI-RADS lexicon and categories among radiologists with different levels of experience may vary and should be studied.
Another limitation is that the cases had been evaluated by a radiologist prior to this study. We used cases that were assessed more than a year before the study was begun to minimize any recollection of a case that may have influenced categorization of the lesion.
In conclusion, the addition of the BI-RADS lexicon for US is helpful and can be used with good agreement among radiologists, even those without specific training in the new terminology. Additionally, use of the new optional subcategories 4a, 4b, and 4c is beneficial in stratifying the likelihood of malignancy in lesions recommended for biopsy.
| APPENDIX |
|---|
| FOOTNOTES |
|---|
Abbreviations: BI-RADS = Breast Imaging Reporting and Data System
Author contributions: Guarantors of integrity of entire study, E.L., B.S., L.S.L.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; manuscript final version approval, all authors; literature research, E.L., B.S., L.S.L.; clinical studies, all authors; statistical analysis, E.L., B.S., L.S.L.; and manuscript editing, all authors
Authors stated no financial relationship to disclose.