Breast Imaging
1 From Associated Radiologists Limited, Mesa, Ariz (K.E.M.); Department of Radiology, University of Michigan Health Center, Ann Arbor, Mich (M.A.H., C.Z., M.A.R., J.E.B., C.P., C.E.B., K.A.K., H.P.C.); and Department of Surgery, University of Pennsylvania School of Medicine, Philadelphia, Pa (S.S.S.). Received November 16, 2004; revision requested January 18, 2005; revision received April 3; accepted May 2; final version accepted November 7. Supported in part by U.S. Army Medical Research and Material Command grant DAMD 17-01-1-0326. Address correspondence to K.E.M., East Valley Diagnostic Imaging, 1125 E Southern Ave, Suite 200, Mesa, AZ 85204 (e-mail: yango{at}cox.net).
| ABSTRACT |
|---|
Materials and Methods: Institutional Review Board approval was obtained for this HIPAA-compliant study; patient informed consent requirements were waived. A fully automated MDEST computer program was used to measure breast density on digitized mammograms in 65 women (mean age, 53 years; range, 24–89 years). Pixel gray levels in detected breast borders were analyzed, and dense areas were segmented. Percentage density was calculated by dividing the number of dense pixels by the total number of pixels within the borders. Seven breast radiologists (five trained with MDEST, two not trained) prospectively assigned qualitative BI-RADS density categories and visually estimated percentage density on 260 mammograms. Qualitative BI-RADS assessments were compared with new quantitative BI-RADS standards. The reference standard density for this study was established by allowing the five trained radiologists to manipulate the MDEST gray-level thresholds, which segmented mammograms into dense and nondense areas. Statistical tests performed include Pearson correlation coefficients, Bland-Altman agreement method, κ statistics, and unpaired t tests.
Results: There was a close correlation between the reference standard and radiologist-estimated density (R = 0.90–0.95) and MDEST density (R = 0.89). Untrained radiologists overestimated percentage density by an average of 37%, versus 6% for trained radiologists (P < .001). MDEST showed better agreement with the reference standard (average overestimate, 1%; range, −15% to +18%). MDEST correlated better with percentage density than with qualitative BI-RADS categories. There were large overlaps and ranges of percentage density in qualitative BI-RADS categories 2–4. Qualitative BI-RADS categories correlated poorly with new quantitative BI-RADS categories, and 16 (6%) of 260 views were erroneously classified by MDEST.
Conclusion: MDEST compared favorably with radiologist estimates of percentage density and is more reproducible than radiologist estimates when qualitative BI-RADS density categories are used. Qualitative and quantitative BI-RADS density assessments differed markedly.
© RSNA, 2006
| INTRODUCTION |
|---|
Mammographic density is important for two main reasons: First, the sensitivity of mammography in the detection of breast carcinoma is lower in dense breasts because dense fibroglandular tissue may obscure calcifications and masses (1–3). Second, there is a direct association between increased mammographic density and increased risk of developing breast cancer (4–10). In addition, investigators who use quantitative assessment of mammographic density report higher odds ratios for the development of breast carcinoma in women with dense breasts compared with the odds ratios reported by investigators who use subjective assessment of density (7,8,11). Boyd et al (6) confirmed the importance of using precise methods to determine mammographic density: They observed a 2% increase in the relative risk of breast cancer for every 1% increase in mammographic density percentage.
There is also evidence that hormonal therapies, including estrogen and tamoxifen treatments, can change mammographic density (9,12–14) and alter the risk of breast carcinoma (15–18). Whether this relationship is causal remains to be proved. A simple and accurate method of measuring breast density would be a useful tool for investigating relationships between breast cancer risk and mammographic density.
Several methods to objectively quantitate mammographic density exist. The original method, described by Wolfe et al (11) in 1987, involved the use of manual planimetry to compute the density percentage: The dense white areas on mammograms were manually traced. However, as the authors themselves noted, this method was "tedious and time-consuming." More recent techniques have been facilitated by the advent of digital methods of acquiring and viewing mammographic data. Although these methods involve the use of computers, some of them are only partially automated (19,20). One such method was based on an ordinal ranking system rather than on a density percentage system (20). More recent computerized programs have been fully automated (10,21,22).
We developed a method in which a fully automated mammographic density estimation (MDEST) program is used to rapidly determine the perimeter of the breast and quantitate the mammographic density percentage (23). Thus, the purpose of our study was to retrospectively compare mammographic densities determined by using this MDEST program with both radiologists' estimates of density percentage and BI-RADS breast density categories.
| MATERIALS AND METHODS |
|---|
The mammograms had been acquired by using Mammography Quality Standards Act (MQSA)-approved GE DMR mammography units (GE Medical Systems, Milwaukee, Wis) with Kodak MR2000 (Kodak, Rochester, NY) screen and film systems. All images were digitized by using a LUMISYS 85 laser film scanner (Lumisys, Mountain View, Calif) with a pixel size of 0.05 x 0.05 mm and 4096 gray levels. The gray levels were linearly proportional to the optical densities, from 0.1 to approximately 4.0 optical density units. The nominal optical density range of the scanner is 0–4, with large pixel values corresponding to low optical density. Since the breast density pattern does not have to be analyzed in high spatial resolution (ie, pixel size of 0.05 mm or less), the full-spatial-resolution mammograms were first smoothed with a 16 x 16 box filter and subsampled by a factor of 16 to result in 0.8-mm pixel size images that were approximately 256 x 256 pixels in size for the analysis. This process reduced the processing time and image noise. The technical details are described elsewhere (23). However, a different software version of the density program was used for this study.
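The smoothing and subsampling step can be illustrated with a short sketch. It assumes that the subsampling grid is aligned with the 16 x 16 filter window, in which case the two operations reduce to averaging non-overlapping blocks; the function name and the use of NumPy are illustrative choices and are not taken from the MDEST software.

```python
import numpy as np


def box_smooth_and_subsample(image: np.ndarray, factor: int = 16) -> np.ndarray:
    """Average non-overlapping factor x factor blocks of a 2-D image.

    Assumption: the subsampling grid is aligned with the box-filter window,
    so smoothing followed by subsampling equals block averaging.
    """
    rows, cols = image.shape
    # Crop so that both dimensions are whole multiples of the block size.
    rows_c = rows - rows % factor
    cols_c = cols - cols % factor
    blocks = image[:rows_c, :cols_c].astype(np.float64).reshape(
        rows_c // factor, factor, cols_c // factor, factor
    )
    # Mean over the two within-block axes gives one value per block.
    return blocks.mean(axis=(1, 3))


# A 0.05-mm pixel reduced by a factor of 16 gives a 0.8-mm pixel.
reduced_pixel_mm = 0.05 * 16
```

Block averaging of this kind lowers both the pixel count and the pixel noise, which matches the stated motivation of reducing processing time and image noise.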
Mammogram Density Analysis with MDEST
The computer first tracked the breast boundary by using a gradient-based edge-tracking algorithm, which has been described previously (23). The tracking of the boundaries of a given breast started from approximately the middle of the breast image and continued both upward and downward along the boundary. The direction in which to search for a new edge point was guided by the previous edge points. The edge location was determined by using a gradient criterion along a band of pixels perpendicular to the tracking direction. The detected boundary separated the breast from other background features, including the directly exposed area, patient identification information, and lead markers, which were excluded in the subsequent analyses. Figure 1 shows examples of the breast boundaries determined on typical CC- and MLO-view mammograms. A separate edge-tracking algorithm was used to detect the edge of the pectoral muscle on the MLO-view mammograms (Fig 1). The detected edge usually is not very smooth owing to noise on the image. A second-order polynomial was fitted to the detected edge points to segment the pectoral region. The pectoral muscle on the MLO views was excluded from the subsequent gray-level histogram analyses and breast area calculations.
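The polynomial smoothing of the pectoral edge can be sketched as follows. This is a minimal illustration only: the edge points are assumed to be given as (row, column) coordinates, the muscle is assumed to lie on the low-column side of the fitted curve, and the function names are hypothetical rather than part of MDEST.

```python
import numpy as np


def fit_pectoral_edge(rows, cols) -> np.poly1d:
    """Fit col = a*row**2 + b*row + c to noisy detected edge points."""
    coeffs = np.polyfit(rows, cols, deg=2)
    return np.poly1d(coeffs)


def pectoral_mask(shape, edge_poly: np.poly1d) -> np.ndarray:
    """Boolean mask of pixels on the muscle side of the fitted edge.

    Assumption: the pectoral muscle occupies columns smaller than the edge.
    """
    r = np.arange(shape[0])[:, None]   # column vector of row indices
    c = np.arange(shape[1])[None, :]   # row vector of column indices
    return c < edge_poly(r)
```

Pixels flagged by such a mask would then be left out of the histogram analysis and the breast area calculation.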
After histogram classification, a gray-level threshold was automatically calculated to separate the fat and dense glandular tissue regions. The gray-level threshold depends on the shape (or class) of the histogram. If the histogram has a single peak, the maximum entropy principle-based method (24) is used to calculate the threshold. If the histogram has more than one peak, the discriminant analysis method (25) is used. The threshold is used to separate the pixels in the breast region into two classes: The class of pixel values above the threshold corresponds to dense tissue, and the class of pixel values below the threshold corresponds to fat tissue. This classification is represented on a binary image (ie, segmented image), on which dense pixels are represented by white and fat pixels are represented by black (Fig 2). The percent breast density is then calculated as the number of pixels in the dense area divided by the total number of pixels in the entire breast region.
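The multi-peak branch and the percent density calculation can be sketched as follows. The threshold function uses a generic between-class-variance (discriminant analysis) criterion; whether this matches the exact formulation of reference 25 is an assumption, and the maximum entropy branch for single-peak histograms is not shown. All names are illustrative.

```python
import numpy as np


def discriminant_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Gray level that maximizes the between-class variance of a histogram."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    cum_mean = np.cumsum(p * centers)
    total_mean = cum_mean[-1]
    # Candidate splits after each bin except the last (which leaves no upper class).
    w_low = np.cumsum(p)[:-1]
    w_high = 1.0 - w_low
    numer = (total_mean * w_low - cum_mean[:-1]) ** 2
    denom = w_low * w_high
    between = np.zeros_like(numer)
    np.divide(numer, denom, out=between, where=denom > 0)
    # Return the upper edge of the bin that closes the lower class.
    return float(edges[np.argmax(between) + 1])


def percent_density(image: np.ndarray, breast_mask: np.ndarray, threshold: float) -> float:
    """Dense pixels (above threshold) as a percentage of all breast pixels."""
    breast = image[breast_mask]
    return 100.0 * np.count_nonzero(breast > threshold) / breast.size
```

Only pixels inside the breast mask enter either calculation, so background, labels, and the excluded pectoral region do not affect the result.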
A graphical interface for displaying and recording the radiologists' evaluations was developed. For a given breast, the CC- and MLO-view mammograms were first displayed side-by-side on a high-spatial-resolution 22-inch Compaq AlphaStation monitor (Compaq, Palo Alto, Calif). This monitor has a display matrix size of 1280 x 1024 pixels. It is not Digital Imaging and Communications in Medicine calibrated, but it allows one to adjust contrast and brightness settings, and we adjusted these at the beginning of the study according to the subjective impressions of an experienced MQSA-certified radiologist (M.A.H.). For each mammogram, the radiologists were able to adjust the window and level settings on the display screen.
Qualitative BI-RADS density classifications. The radiologist first assigned each two-view mammogram to one of the four conventional BI-RADS qualitative density categories (eg, category 1, indicating fat tissue). This BI-RADS density assessment system does not include any quantitative classification used in the new (fourth edition) American College of Radiology BI-RADS (1), which was not published at the time of the study. Herein, the scores used in the new BI-RADS classification system are referred to as "quantitative BI-RADS categories."
Quantitative estimate of density percentage. Next, the radiologist visually estimated the density percentage on each mammogram by selecting one of the 10% density ranges displayed on the screen. Ten density percentage increments (eg, 1%–10%, 10%–20%) were used because we believed that it would be too difficult for the radiologists to visually estimate density to the nearest 1%.
Determination of reference-standard density. After the subjective radiologist evaluation, each view (CC or MLO) was displayed sequentially. The displayed material included the original mammogram, the enhanced mammogram, the histogram of the breast region in that view, and the corresponding binary image created by thresholding the histogram. The enhanced image was generated by the MDEST program during the density segmentation. This image was basically a version of the original mammogram with the contrast of structures enhanced. The radiologist was then able to manipulate the gray-level threshold by interactively moving a slider along the horizontal axis of the histogram. The binary image changed simultaneously with the chosen threshold so that the radiologist could determine whether the segmented white area corresponded to the dense white area on the mammogram. The radiologist was instructed to change the amount of segmented dense area to resemble the area that he or she would trace by hand if he or she were performing manual planimetry (11). When the radiologist determined that the segmented area was accurate, he or she clicked a button to record the gray-level threshold and density percentage for each image. Since no reference standard exists for breast density measurements, we used this value, averaged for five radiologists previously trained with the training cases, as the reference-standard density percentage for each view.
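The averaging that defines the reference standard can be sketched in a few lines. The example assumes that each reader's recorded threshold is applied to the same array of breast-region pixel values and that the per-reader density percentages are averaged with equal weight; the function name is hypothetical.

```python
import numpy as np


def reference_density(breast_pixels, reader_thresholds):
    """Mean percent density over readers, plus the per-reader values.

    Each reader's interactively chosen gray-level threshold is applied to the
    same breast pixels; pixels above the threshold count as dense.
    """
    pixels = np.asarray(breast_pixels)
    per_reader = [100.0 * float(np.mean(pixels > t)) for t in reader_thresholds]
    return float(np.mean(per_reader)), per_reader
```

Keeping the per-reader values alongside the mean makes it possible to inspect how widely the individual segmentations differ for a given view.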
The radiologist was blinded to his or her own estimated density percentage value obtained and thus could not attempt to match his or her density percentage estimate for the different views or for different breasts of the same patient. The mammogram of the contralateral breast of the same patient was then displayed and evaluated in the same way. The entire process was repeated for each patient until the imaging data of all patients in the data set were evaluated. We also recorded how long it took the radiologists to complete their evaluations of the mammograms.
During the training session for the five radiologists, both the percentage of dense area derived by the MDEST program and that determined by using interactive thresholding were presented to the radiologists so that they could compare these two percentages with their visually estimated density percentage for each image. The percent dense areas derived by using MDEST and interactive thresholding were not displayed during the actual study. To assess the effect of training, two additional breast imaging radiologists, who had not undergone training to visually estimate density percentage with the 25 training cases, evaluated the same set of study images.
Statistical Analyses
Pearson correlation coefficients were calculated to examine the associations of the qualitative BI-RADS, MDEST-estimated, and trained radiologist-estimated mammographic densities with the true (ie, reference-standard) mammographic density. To assess the agreement between the reference-standard density and both the MDEST- and the trained radiologist-estimated densities and to obtain 95% limits of agreement, the method of Bland and Altman was used (26). Interreader agreement among the radiologists was measured by using κ statistics (27). The strengths of agreement were expressed in κ values: A value of 0.20 or less indicated poor; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, good; and 0.81–1.00, very good agreement. The significance of differences in overestimations of density between the trained and untrained radiologists was estimated by using the unpaired t test. For the statistical calculations, the radiologists' density percentage estimates were expressed as the mean of the 10% range (eg, for 1%–10%, 5% was used). Software, including SAS (SAS Institute, Cary, NC) and Microsoft Excel (Redmond, Wash), was used to perform all statistical analyses.
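The agreement statistics named above can be sketched as follows. This is an illustration under stated assumptions, not the SAS or Excel procedures used in the study: the Bland-Altman limits are taken as the mean difference plus or minus 1.96 standard deviations, and the κ shown is the unweighted two-reader form, which the text does not specify. The agreement labels follow the cut points listed in the paragraph.

```python
import numpy as np


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two sets of densities."""
    return float(np.corrcoef(x, y)[0, 1])


def bland_altman(estimate, reference):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diff = np.asarray(estimate, float) - np.asarray(reference, float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohen_kappa(rater_a, rater_b) -> float:
    """Unweighted kappa for two readers (undefined if expected agreement is 1)."""
    a, b = np.asarray(rater_a), np.asarray(rater_b)
    p_observed = np.mean(a == b)
    p_expected = sum(np.mean(a == c) * np.mean(b == c) for c in np.union1d(a, b))
    return float((p_observed - p_expected) / (1.0 - p_expected))


def agreement_label(kappa: float) -> str:
    """Map a kappa value to the strength-of-agreement scale quoted in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"
```

A positive bias from `bland_altman` corresponds to overestimation relative to the reference standard, which is the sense in which overestimates are reported in this study.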
| RESULTS |
|---|
Interobserver Agreement
The interobserver agreement values observed for each density measurement method indicate that there was strong agreement among the trained radiologists. Intraclass correlation coefficients were 0.88 for the radiologists' estimates of density percentage and 0.94 for the radiologists' determinations of the reference-standard density percentage, indicating very good agreement among the radiologists' density measurements obtained with these two methods. Pairwise comparison of the radiologists' assignments of qualitative BI-RADS categories revealed good but lower agreement, with κ values (27) ranging from 0.61 to 0.76.
Time
The mean time to complete the qualitative BI-RADS density category assignments and density percentage estimations with the MDEST program was 18 seconds per view (range, 13–22 seconds), with a mean standard deviation of 8 seconds (range, 7–9 seconds).
| DISCUSSION |
|---|
Similar results were found in a comparison between the qualitative estimates based on the Wolfe parenchymal patterns and the quantitative determinations of density made by using manual planimetry (28). This is not surprising, given the subjective nature of both the qualitative BI-RADS density categories and the Wolfe parenchymal patterns. More recently, Wang et al (21) suggested that the visual density percentage estimates derived by three mammographers may have led to the same mammogram being assigned to different qualitative BI-RADS categories. With use of quantitative percent breast density determinations, one is more likely to detect subtle changes in breast density that may be masked when they are classified in the same BI-RADS category as the overall breast density.
Our results show that experienced radiologists' subjective density assessments based on qualitative BI-RADS categories may be quite different from density assessments based on quantitative BI-RADS categories. For example, none of the 650 mammographic cases judged to have qualitative category 4 density had greater than 75% breast density according to quantitative BI-RADS measures. In addition, the cases with 0%–24% quantitative BI-RADS category 1 (fatty) breast density would have encompassed a majority (370 [57%] of 650) of the mammograms, many of which were conventionally assigned to qualitative BI-RADS category 2 or 3.
A goal of the BI-RADS system is to facilitate uniformity of physician reports, and our results suggest that additional training would be necessary to enable physicians to accurately translate a visual assessment of density percentage into a quantitative assessment, as recommended by the new BI-RADS standards (1). In our study, the two untrained radiologists overestimated density by 37%. However, radiologists could be rapidly trained to estimate breast density by using a computerized density measurement program so that they could follow the new quantitative BI-RADS density classifications. Furthermore, use of the new quantitative BI-RADS assessment may lead to the "down coding" of breast density and thus the creation of nonuniformity between the old and new standards and the consequent hindering of longitudinal research. A case previously assigned to BI-RADS category 3 may now be assigned to BI-RADS category 1. Since very few breasts have greater than 75% density, the new quantified BI-RADS may functionally approach a three-density-level system, with the majority of mammographic cases assigned to categories 1 and 2.
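The mapping from a measured density percentage to a quantitative BI-RADS category can be sketched as a simple lookup. The text states only that category 1 spans 0%–24% and that category 4 means greater than 75%; the cut point at 50% between categories 2 and 3 is an assumption based on the quartile wording of the fourth-edition BI-RADS, and the function name is illustrative.

```python
def quantitative_birads(percent_density: float) -> int:
    """Map a percent density (0-100) to a quantitative BI-RADS category 1-4.

    Assumed cut points: <25 -> 1, 25-50 -> 2, 51-75 -> 3, >75 -> 4.
    """
    if not 0.0 <= percent_density <= 100.0:
        raise ValueError("percent density must lie between 0 and 100")
    if percent_density < 25.0:
        return 1
    if percent_density <= 50.0:
        return 2
    if percent_density <= 75.0:
        return 3
    return 4
```

Under these cut points, a breast with a measured density of 20% falls in category 1 even if it would conventionally have been read as category 2 or 3, which is the "down coding" effect described above.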
The MDEST breast density determinations were more accurate than the radiologists' visual estimates of breast density, as indicated by the radiologists' overestimating of breast density to a greater degree and their larger variation in density estimates (relative to the reference-standard density) compared with the MDEST measurements. We also observed good correlations between MDEST-derived density percentage and radiologist-determined reference-standard density percentage (R = 0.89). Although this correlation was slightly lower than that between the trained radiologist-estimated and reference-standard densities (R = 0.90), the MDEST-derived densities had tighter agreement with the reference-standard measurements than did the radiologists' estimates. The MDEST program overestimated density by a mean of only 1% (range, −15% to +18%), as compared with a mean overestimation of density of 6% by the trained radiologists (with a wider range: −16% to +27%) relative to the reference-standard density percentage.
Correlation coefficients for agreement on density measurement between the CC and MLO views favored the trained radiologists, although both the MDEST program and the radiologists had good correlation. The MDEST program performed better than the untrained radiologists in the estimation of percent breast density. The two untrained breast imagers tended to overestimate density percentage by approximately 37%, which was greater than the percentage of overestimation by the trained breast imagers. These findings are in contrast to those of Lee-Han et al (28): The single radiologist in their study slightly underestimated the density percentage relative to the measured area of density. This result may have been secondary to some form of density percentage estimation training received by the radiologist, although this was not specified.
Many of the computer programs previously used to evaluate breast density have been only partially automated. The density measurement methods used by Byng et al (19) and Boone et al (20) involved manual cropping of the pectoral muscle to determine the breast area on mammograms. In addition, the Byng et al method involved manual determinations of both breast edge and breast density gray-level thresholds. With the Byng et al method, it took less than a minute to evaluate each image (19). Our fully automated program automatically detects the breast edge, crops the pectoral muscle, and estimates the gray-level threshold for density segmentation. In addition, if manual interactive thresholding (the reference-standard method used in the current study) is preferred, the MDEST user interface is fast and simple to use, requiring an average of 18 seconds per view to evaluate both the BI-RADS category-based density and the density percentage. Investigators in two other studies (10,21) have described fully automated programs for determining breast density.
We found that percent breast density determinations were more accurate on CC views than on MLO views. This was true for both the MDEST densities and the radiologist visual density estimates. The MDEST program overestimated density percentage by a mean of 2.4% on the MLO views and by a mean of 0.4% on the CC views. The radiologists overestimated density by a mean of 6.3% on the MLO views and by a mean of 4.7% on the CC views. These data suggest that in the future, CC views alone may be adequate for assessing percent breast density in temporal measurements.
There were several limitations to our study. The MDEST program had technical errors, which led to a 6% case rejection rate. Technical errors included inaccurate breast border detection and gross misclassification of the gray-scale histograms. Errors in both the anterior breast border detection algorithm and the pectoral muscle detection algorithm occurred and resulted in inaccurate breast tissue area determinations. Misclassification of the gray-scale histograms resulted in improper gray-level threshold determination, inaccurate segmentation of the dense areas, and inaccurate density percentage calculations. These histogram misclassification errors occurred more often on the mammograms with extremely dense and fatty pixels. Thus, MDEST cannot yet be used as a stand-alone density measurement method. Although further development of computer visualization techniques and additional training with a large data set are needed to improve the accuracy and robustness of MDEST, the results of this study demonstrate the feasibility of our approach and the promise of using an automated or semiautomatic system like MDEST to aid future research efforts in the investigation of mammographic breast density.
Another limitation of our study was that the BI-RADS qualitative assessments were subjective and could be institutionally defined. Six (86%) of the seven radiologists received residency training at different institutions, so strong institutional bias was less likely in this study. Also, there is no reference standard for determining breast density, so there will always be some subjective difference in determining mammographic density, even when manual segmentation is used. The averaging of five radiologists' segmentations may have partially reduced this bias.
Our MDEST program calculates the area of mammographic breast density, which correlates with the area of fibroglandular tissue that is present. However, volume is a more accurate measure of the amount of breast tissue than is area. Accurate determination of the dense tissue volume requires sensitometry and scatter and beam-hardening corrections for each mammogram. A rough estimate of dense tissue volume could be determined by multiplying the breast thickness, which is recorded for each mammogram at our institution, by the area of dense tissue. Wang et al (21) described a computer-aided detection method that is more accurate for estimating dense mammographic tissue composition because it involves the use of a tissue-thickness-correction algorithm. This concept of breast tissue volume may be of importance in the study of breast cancer risk, because it is probably the volume of dense glandular breast tissue, rather than the density of breast tissue, that determines risk (9). Wei et al (22) recently observed a high correlation between our automated mammographic MDEST assessment method and volumetric fibroglandular tissue estimation at breast magnetic resonance imaging, suggesting that estimates of change in mammographic density are close surrogates for change in volumetric density. Further investigation is needed to determine whether rough estimates of dense tissue volume can improve the correlation between breast density and breast cancer risk.
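The rough volume estimate mentioned above (thickness multiplied by dense area) amounts to a one-line calculation. The sketch below assumes the 0.8-mm pixel size of the subsampled images and a compressed breast thickness given in millimeters; it applies none of the sensitometry, scatter, or beam-hardening corrections that an accurate volume would require, and the function name is hypothetical.

```python
def rough_dense_volume_cm3(dense_pixel_count: int,
                           thickness_mm: float,
                           pixel_size_mm: float = 0.8) -> float:
    """Rough dense-tissue volume: dense area x compressed thickness.

    Assumes every dense pixel spans the full recorded thickness, which
    overstates the true fibroglandular volume.
    """
    dense_area_mm2 = dense_pixel_count * pixel_size_mm ** 2
    return dense_area_mm2 * thickness_mm / 1000.0  # mm^3 -> cm^3
```

For example, 10,000 dense pixels at 0.8 mm and a thickness of 50 mm correspond to a dense area of 64 cm² and a rough volume of about 320 cm³.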
In conclusion, the MDEST-derived densities compared favorably to radiologist estimates of percent breast density and were more reproducible than radiologist estimates of the conventional qualitative BI-RADS density categories. Qualitative and quantitative BI-RADS density assessments differed markedly.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Abbreviations: BI-RADS = Breast Imaging Reporting and Data System; CC = craniocaudal; MDEST = mammographic density estimation; MLO = mediolateral oblique; MQSA = Mammography Quality Standards Act
Authors stated no financial relationship to disclose.
Author contributions: Guarantors of integrity of entire study, K.E.M., M.A.H.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; manuscript final version approval, all authors; literature research, K.E.M., M.A.H., M.A.R., C.P., C.E.B., H.P.C.; clinical studies, M.A.H., M.A.R., C.P., K.A.K.; statistical analysis, M.A.H., C.Z., C.P., S.S.S.; and manuscript editing, K.E.M., M.A.H., C.Z., M.A.R., C.P., C.E.B., K.A.K., S.S.S., H.P.C.