DOI: 10.1148/radiol.2461062072
(Radiology 2008;246:71-80.)
© RSNA, 2007
Computer-aided Detection in Full-Field Digital Mammography: Sensitivity and Reproducibility in Serial Examinations1
Seung Ja Kim, MD,
Woo Kyung Moon, MD,
Nariya Cho, MD,
Joo Hee Cha, MD,
Sun Mi Kim, MD, and
Jung-Gi Im, MD
1 From the Department of Radiology, Konkuk University Hospital, Seoul, Korea (S.J.K.); Department of Radiology and Clinical Research Institute, Seoul National University Hospital and the Institute of Radiation Medicine, Seoul National University Medical Research Center, 28 Yongon-dong, Chongno-gu, Seoul 100-744, Korea (W.K.M., N.C., S.M.K., J.G.I.); and Department of Radiology, Boramae Municipal Hospital, Seoul, Korea (J.H.C.). Received December 6, 2006; revision requested February 14, 2007; revision received April 24; final version accepted June 1. Supported by KISTEP, Ministry of Science and Technology, Korea.
Address correspondence to W.K.M. (e-mail: moonwk{at}radcom.snu.ac.kr).
ABSTRACT
Purpose: To retrospectively evaluate the sensitivity and reproducibility of a computer-aided detection (CAD) system applied to serial digital mammograms obtained in women with breast cancer, with histologic analysis as the reference standard.
Materials and Methods: This study was institutional review board approved, and patient informed consent was waived. A commercially available CAD system was applied to initial and follow-up digital mammograms obtained in 93 women with breast cancer (mean age, 52 years; age range, 32–81 years). The mean interval between mammographic examinations was 23 days (range, 7–58 days). There were 119 visible lesion components (70 masses, 49 microcalcifications). Sensitivity, false-positive mark rate, and reproducibility of the CAD system were evaluated for both sets of mammograms with the t test.
Results: Sensitivities of the CAD system at initial and follow-up digital mammography were 91% and 89%, respectively, for detection of masses. Sensitivity of the CAD system for detection of microcalcifications was 100% at both initial and follow-up digital mammography. Overall false-positive mark rates were 0.29 per image and 0.27 per image at initial and follow-up digital mammography, respectively. When craniocaudal and mediolateral oblique views were considered separately, sensitivities were 76% and 75%, respectively, for masses and 96% and 92%, respectively, for microcalcifications. The reproducibility of CAD marks was 80% for true-positive masses, 92% for true-positive microcalcifications, 9% for false-positive masses, and 8% for false-positive microcalcifications (P < .001).
Conclusion: The sensitivity of the CAD system was consistently high for detection of breast cancer on initial and short-term follow-up digital mammograms. Reproducibility was significantly higher for true-positive CAD marks than for false-positive CAD marks.
Supplemental material: http://radiology.rsnajnls.org/cgi/content/full/246/1/71/DC1
INTRODUCTION
Computer-aided detection (CAD) systems have been shown to be capable of reducing false-negative rates in the detection of breast cancer by highlighting suspicious masses and microcalcifications on mammograms. The results of a retrospective study conducted by Warren Burhenne et al (1) showed that CAD correctly highlighted 89 (77%) of 115 missed cancers on screening mammograms, with a rate of 1.0 false-positive finding per image. The results of a prospective study by Freer and Ulissey (2) showed that use of CAD with 12 860 screening mammograms resulted in a 20% (eight of 41 cancers) increase in the number of cancers detected, with an increased recall rate of 7.5% (previously 6.5%). The CAD algorithm is more sensitive in the detection of microcalcifications than in the detection of masses, with sensitivities of 86%–99% for microcalcifications and 75%–86% for masses (1–4).
The reproducibility and performance levels of CAD systems can affect radiologist confidence and willingness to rely on CAD results. The results of CAD studies that involved repeated scanning of the film mammograms obtained in patients with breast cancer have shown 39%–53% reproducibility, which is markedly lower than the 95%–99% reproducibility claimed by the manufacturers (4–7). It is likely that shifts in the position of film mammograms between sequential digitizations were the most important cause of this variability (7). Digitization can also cause false-positive marks with film-based CAD systems.
With digital mammography, CAD systems do not require a digitizer, and they display CAD marks rapidly after image acquisition. Because CAD systems use a computer algorithm, we believe that reproducibility of CAD marks with the same digital images should be 100%. CAD systems appropriate for use with full-field digital mammography are now commercially available; however, the literature contains little information on these systems (8,9), and to our knowledge, no report on the reproducibility of CAD systems with digital mammography has been published. Moreover, we are unaware of any investigation into the possible effects of repeat image acquisition in the same breast on CAD results for film or digital mammography. Thus, the purpose of our study was to retrospectively evaluate the sensitivity and reproducibility of a CAD system applied to serial digital mammograms obtained in women with breast cancer, with histologic analysis as the reference standard.
MATERIALS AND METHODS
Patient Selection
This retrospective study was conducted with an institutional review board–approved protocol. Informed consent was waived. Between July 2004 and November 2005, 946 women with known breast cancer underwent full-field digital mammography (Senographe 2000D FFDM; GE Medical Systems, Buc, France), breast ultrasonography (US), and magnetic resonance (MR) imaging 1–7 days before breast surgery at Seoul National University Hospital. One radiologist (S.J.K., 3 years of experience with breast imaging) used a picture archiving and communication system to retrospectively review all patient data. He selected the 93 consecutive women (mean age, 52 years; age range, 32–81 years) who had (a) undergone digital mammography twice within an interval of less than 2 months, (b) a visible malignant lesion in both craniocaudal and mediolateral oblique views, and (c) no changes in findings during follow-up (Fig 1). These selection criteria were chosen to investigate the possible effects of repeat image acquisition on CAD results and to minimize the problem of performing a second mammographic examination to test the reliability of a CAD system. The interval between consecutive mammographic examinations ranged from 7 to 58 days (mean interval, 23 days). Percutaneous image-guided 14-gauge core-needle biopsy was performed in 81 of the 93 women between initial and follow-up digital mammography. In these 81 patients, a second mammographic examination was performed at the clinician's request to evaluate changes after biopsy and preoperative tumor extent. In the remaining 12 patients, no intervention was performed between initial and follow-up digital mammography. 
In these 12 patients, fine-needle aspiration biopsy had been performed at an outside hospital before initial digital mammography, and follow-up mammography was performed to reevaluate tumor extent for surgical planning because more than 1 month had passed since the initial examination (n = 5) or because the patient reported new breast symptoms (n = 7).
In all women, initial and follow-up digital mammograms, including craniocaudal and mediolateral oblique views, were obtained by using an automatic exposure control method. The unit automatically selected technical factors (peak kilovoltage, radiation dose, target, and filter). Of the 93 women included in this study, two had bilateral breast cancer and four had undergone previous mastectomy; therefore, there were 95 breasts with cancer and 87 healthy breasts. In total, 190 mammograms were obtained in breasts with cancer, and 174 mammograms were obtained in healthy breasts. Findings at presentation in the 95 breasts with cancer were a palpable mass (n = 51) and bloody nipple discharge (n = 1). Forty-three asymptomatic lesions (45%) were detected at screening. Preoperative mammography, US, and MR imaging revealed no malignant findings in healthy breasts.
Mammographic and Histologic Findings
There were 119 visible lesion components in the 95 cancerous breasts. Seventy lesions were described as masses, and 49 were described as microcalcifications. Fifty-five lesions were described only as masses, 34 were described only as microcalcifications, and 15 were described as having both signs of malignancy. Because the CAD system identifies mass and microcalcification components separately, we counted mass and microcalcification components separately for malignancies that manifested as both a mass and a microcalcification cluster. The sizes of the 70 masses were 10 mm or smaller (n = 3), 11–20 mm (n = 41), 21–30 mm (n = 17), and 31–40 mm (n = 9). The sizes of the 49 microcalcification clusters were 10 mm or smaller (n = 20), 11–20 mm (n = 15), 21–30 mm (n = 10), 31–40 mm (n = 2), and larger than 40 mm (n = 2). No difference was found between initial and follow-up digital mammograms in terms of lesion size distribution.
We used the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) (10) to classify breast density. Of the 95 breasts with cancer, 14 (15%) were homogeneously fatty (BI-RADS category 1 density), 22 (23%) were fatty with scattered fibroglandular tissue (BI-RADS category 2 density), 32 (34%) were heterogeneously dense (BI-RADS category 3 density), and 27 (28%) were extremely dense (BI-RADS category 4 density).
Final surgical histologic diagnosis of malignant lesions included ductal carcinoma in situ (n = 16), invasive ductal carcinoma with or without ductal carcinoma in situ (n = 76), and lobular carcinoma (n = 3). For treatment, mastectomy was performed in 43 breasts, and lumpectomy or breast conservation therapy was performed in 52. Follow-up information was available for 92 of the 93 patients, with a mean follow-up time of 18 months (range, 7–34 months). No cancer was found in any of the remaining breasts after treatment during the follow-up period.
CAD Mark Evaluation
One of the two attending radiologists (S.J.K.) who interpreted the initial studies applied a commercially available CAD system (ImageChecker M1000-DM, version 3.1; R2 Technology, Sunnyvale, Calif) developed for use with full-field digital mammography to the two sets of digital mammograms. The two radiologists had 2–3 years of experience with breast imaging. It took 1 second to activate and display the CAD marks at a workstation (Review Workstation; GE Medical Systems, Buc, France). Digital images may or may not have CAD marks. These marks indicate areas where the detection algorithm has identified a pattern that warrants evaluation by the radiologist. An asterisk is used to indicate a pattern indicative of a mass or an area of architectural distortion. A solid triangle is used to indicate an area of clustered bright spots indicative of microcalcifications. Images with CAD marks were saved at the review workstation and forwarded to a picture archiving and communication system. We used these images for data analysis, which was performed with a 5-megapixel (2560 x 2048 pixels) liquid crystal display system (ME511L; Totoku Electric, Tokyo, Japan). The monitor was calibrated according to the manufacturer's recommendations for mammograms.
The locations of CAD marks corresponding to mammographically detected and histopathologically confirmed cancerous lesions were analyzed by two radiologists (S.J.K., N.C.) with 3–6 years of experience in breast imaging. They used all available mammographic views (including magnification views), US and MR images, and pathologic reports to determine in consensus whether CAD marks indicated the location of a malignancy. If an asterisk was located anywhere inside a true-positive mass, this mass was considered to have been identified correctly by the CAD system. Similarly, as long as a triangle overlapped any part of the microcalcification area, the CAD mark was considered to represent true-positive detection. All CAD marks that did not indicate the known malignancy were considered false-positive marks. When the CAD system highlighted typical benign microcalcifications or crossing lines, we considered these marks to be false-positive microcalcification marks.
Statistical Analysis
In the 93 women with breast cancer, the total numbers of mass marks, microcalcification marks, and true- and false-positive marks were calculated for initial and follow-up digital mammograms. Sensitivity of the CAD system was assessed on the basis of the correct marking of a true-positive lesion in at least one view (craniocaudal or mediolateral oblique). In this per-lesion analysis, there were 119 positive findings, namely 70 masses and 49 microcalcifications in the 95 breasts with cancer. In addition, CAD sensitivity for masses and for microcalcifications was analyzed for each view. In this approach, the lesion depicted in each view (either craniocaudal or mediolateral oblique) was considered an independent finding. Because all 70 masses and 49 microcalcifications were visible on both views, there were 238 positive findings (140 mass findings and 98 microcalcification findings). Sensitivity of the CAD system, defined as the number of lesions correctly marked divided by the total number of lesions, is presented with the 95% confidence interval. For initial and follow-up digital mammograms, the mean numbers of false-positive marks per image were calculated and presented with the range and median. False-positive marks were assessed for masses and microcalcifications in both breasts. We provide the false-positive mark rates per image rather than per patient because two patients had bilateral cancers and four patients had previously undergone mastectomy.
To determine reproducibility, we analyzed data in two ways: image- and mark-based reproducibility (4). Image-based reproducibility was defined as identical images in the two sets of digital mammograms, irrespective of the existence of CAD marks (ie, images with and images without CAD marks). The number of identical images divided by the total number of images is presented with the 95% confidence interval for breasts with cancer and healthy breasts. Mark-based reproducibility was defined as any mark (ie, true-positive or false-positive) within the same mass or microcalcification in the two sets of digital mammograms. Analysis of true-positive marks was performed in craniocaudal and mediolateral oblique views of breasts with cancer, whereas analysis of false-positive marks was performed in craniocaudal and mediolateral oblique views of breasts with cancer and healthy breasts. Numbers of identical CAD marks divided by the total number of CAD marks within lesions are presented with 95% confidence intervals. Methods to handle multiple lesions in the same patient (ie, clustered data) were used to construct confidence intervals for sensitivities, false-positive rates, and reproducibilities (11).
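The two reproducibility measures defined above can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the source: each CAD mark is represented by a hypothetical (kind, finding) label, and mark-based reproducibility is computed as the number of marks present in both sessions divided by the number of marks present in either session. The reported true-positive denominators are consistent with this reading (for example, 107 + 105 − 94 = 118 mass marks), but the false-positive denominators do not match it exactly, so the sketch should not be taken as the authors' exact procedure. The image labels and toy data are invented for illustration only.

```python
from typing import Dict, FrozenSet, Tuple

# A mark is (kind, finding_id); both fields are hypothetical labels for this sketch.
Mark = Tuple[str, str]
# One session maps an image id (e.g. "R-CC") to the set of marks on that image.
Session = Dict[str, FrozenSet[Mark]]


def image_based(first: Session, second: Session) -> float:
    """Fraction of images whose mark sets are identical in both sessions.

    Images with no marks in either session count as identical.
    """
    images = first.keys() | second.keys()
    same = sum(
        first.get(i, frozenset()) == second.get(i, frozenset()) for i in images
    )
    return same / len(images)


def mark_based(first: Session, second: Session) -> float:
    """Marks present in both sessions divided by marks present in either (assumed definition)."""
    a = {(img, m) for img, marks in first.items() for m in marks}
    b = {(img, m) for img, marks in second.items() for m in marks}
    union = a | b
    if not union:  # no marks at all: treat as fully reproducible
        return 1.0
    return len(a & b) / len(union)


# Toy data for one patient (invented, not from the study).
initial = {
    "R-CC": frozenset({("mass", "lesion1")}),
    "R-MLO": frozenset({("mass", "lesion1"), ("mass", "fp1")}),
    "L-CC": frozenset(),
    "L-MLO": frozenset(),
}
follow_up = {
    "R-CC": frozenset({("mass", "lesion1")}),
    "R-MLO": frozenset({("mass", "lesion1")}),
    "L-CC": frozenset(),
    "L-MLO": frozenset({("calc", "fp2")}),
}

image_rate = image_based(initial, follow_up)
mark_rate = mark_based(initial, follow_up)
print(f"image-based: {image_rate:.2f}, mark-based: {mark_rate:.2f}")
```

In the toy data, the unmarked L-CC image counts toward image-based reproducibility, which mirrors the distinction drawn in the Results between rates computed with and without images that had no CAD marks at either session.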
We classified each breast into one of two groups according to its composition (fatty [BI-RADS density categories 1 and 2] and dense [BI-RADS density categories 3 and 4]) and determined whether CAD sensitivities, false-positive rates, and CAD mark reproducibilities were related to breast composition. Sensitivities, false-positive mark rates, and reproducibilities of CAD marks in fatty and dense breasts were compared by using the unpaired t test. Sensitivities and false-positive mark rates at initial and follow-up examinations were compared by using the paired t test. The reproducibilities of true-positive and false-positive CAD marks were compared by using the unpaired t test. A P value of less than .05 was considered to indicate a significant difference. Statistical analyses were performed with software (SPSS, version 10 for Windows; SPSS, Chicago, Ill).
The problem with using a second mammographic examination to test the reliability of a CAD system is that the two images produced are likely to be slightly different because of alterations in breast position and compression. To evaluate this effect, two radiologists (W.K.M., J.H.C.) who were unaware of the CAD results retrospectively analyzed in consensus the similarity of the initial and follow-up mammograms of breasts with cancer and healthy breasts by using two 5-megapixel liquid crystal display monitors and classified them as being similar or dissimilar. The two radiologists had 6 and 13 years of experience in breast imaging. For mammograms to be classified as similar, they needed to be identical in overall breast position and compression and they needed to have identical findings for lesions (if any) in terms of size, shape, and conspicuity. Reproducibility of the CAD system was compared for these similar and dissimilar images by using the unpaired t test.
RESULTS
CAD Marks
The CAD system placed 307 marks (183 mass marks and 124 microcalcification marks) on initial craniocaudal and mediolateral oblique mammograms. Of the 183 mass marks, 107 were true-positive and 76 were false-positive. Of the 124 microcalcification marks, 94 were true-positive and 30 were false-positive. The CAD system placed 292 marks (178 mass marks and 114 microcalcification marks) on follow-up craniocaudal and mediolateral oblique mammograms. Of the 178 mass marks, 105 were true-positive and 73 were false-positive. Of the 114 microcalcification marks, 90 were true-positive and 24 were false-positive.
Sensitivity
Sensitivities of the CAD system for initial and follow-up digital mammography were 95% (113 of 119 lesions) and 93% (111 of 119 lesions), respectively. Sensitivities of the CAD system for detection of masses at initial and follow-up digital mammography were 91% (64 of 70 lesions) and 89% (62 of 70 lesions), respectively. Sensitivity of the CAD system for detection of microcalcifications was 100% (49 of 49 lesions) for both initial and follow-up digital mammography (Table E1, http://radiology.rsnajnls.org/cgi/content/full/246/1/71/DC1). The differences between initial and follow-up digital mammography were not significant (P = .292 for all lesions, P = .145 for masses, and P > .99 for microcalcifications). Of the six and eight masses missed at initial and follow-up mammography, respectively, five lesions were the same and four were different. When craniocaudal and mediolateral oblique views were considered separately, sensitivities for masses at initial and follow-up digital mammography were 76% (107 of 140 lesions) and 75% (105 of 140 lesions), respectively, whereas sensitivities for microcalcifications at initial and follow-up digital mammography were 96% (94 of 98 lesions) and 92% (90 of 98 lesions), respectively (Table E2, http://radiology.rsnajnls.org/cgi/content/full/246/1/71/DC1). These differences between the two sets of digital mammograms were not significant (P = .685 for masses and P = .103 for microcalcifications).
Sensitivities for masses in the fatty breast group were 100% (32 of 32 lesions) and 97% (31 of 32 lesions) for initial and follow-up digital mammography, respectively. Sensitivities for masses in the dense breast group were 84% (32 of 38 lesions) and 82% (31 of 38 lesions) for initial and follow-up digital mammography, respectively. These differences between the fatty and dense breast groups were significant (P = .019 for initial mammography and P = .046 for follow-up mammography). Sensitivity for microcalcifications was 100% in both groups. When craniocaudal and mediolateral oblique views were considered separately, sensitivities for masses in the fatty breast group at initial and follow-up digital mammography were 89% (57 of 64 lesions) and 91% (58 of 64 lesions), respectively, whereas sensitivities for masses in the dense breast group at initial and follow-up digital mammography were 66% (50 of 76 lesions) and 62% (47 of 76 lesions), respectively. These differences were significant (P = .001 for initial mammography and P < .001 for follow-up mammography). When the views were considered separately, sensitivities for microcalcifications at initial and follow-up digital mammography were 100% (18 of 18 lesions) and 83% (15 of 18 lesions), respectively, in the fatty breast group and 95% (76 of 80 lesions) and 94% (75 of 80 lesions), respectively, in the dense breast group. These differences were not significant (P = .338 for initial mammography and P = .148 for follow-up mammography).
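The per-lesion sensitivities reported above follow directly from the definition in the Statistical Analysis section (lesions correctly marked divided by total lesions). The following sketch recomputes them from the reported counts and attaches a simple Wilson score interval. The Wilson interval is an assumption made for illustration: the study constructed its confidence intervals with methods for clustered data (11), which are not reproduced here, so the intervals below will not necessarily match those in the supplemental tables.

```python
from math import sqrt
from typing import Tuple


def sensitivity(marked: int, total: int) -> float:
    """Number of lesions correctly marked divided by the total number of lesions."""
    return marked / total


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (ignores clustering within patients)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts taken from the Results text: (lesions marked, total lesions).
reported = {
    "all lesions, initial": (113, 119),
    "all lesions, follow-up": (111, 119),
    "masses, initial": (64, 70),
    "masses, follow-up": (62, 70),
}

for label, (marked, total) in reported.items():
    low, high = wilson_interval(marked, total)
    print(f"{label}: {sensitivity(marked, total):.0%} "
          f"(Wilson 95% CI {low:.0%} to {high:.0%})")
```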
False-Positive Results
For initial mammography, the rate of false-positive marks was 0.29 mark per image (range, 0–3 marks; median, 0 marks), with rates of 0.21 mark per image (range, 0–2 marks; median, 0 marks) for masses and 0.08 mark per image (range, 0–3 marks; median, 0 marks) for microcalcifications (Table E1, http://radiology.rsnajnls.org/cgi/content/full/246/1/71/DC1). For follow-up mammography, the rate of false-positive marks was 0.27 mark per image (range, 0–3 marks; median, 0 marks), with rates of 0.20 mark per image (range, 0–2 marks; median, 0 marks) for masses and 0.07 mark per image (range, 0–3 marks; median, 0 marks) for microcalcifications. Differences in the rates of false-positive mass and microcalcification marks were not significant for the two sets of digital mammograms (P = .802 for false-positive mass marks and P = .461 for false-positive microcalcification marks).
For initial and follow-up digital mammography, false-positive mass mark rates were 0.18 and 0.19 mark per image, respectively, in the fatty breast group and 0.23 and 0.21 mark per image, respectively, in the dense breast group. These differences were not significant (P = .882 in the fatty breast group and P = .686 in the dense breast group). For initial and follow-up digital mammography, false-positive microcalcification mark rates were 0.10 and 0.08 mark per image, respectively, in the fatty breast group and 0.08 and 0.06 mark per image, respectively, in the dense breast group. These differences were not significant (P = .603 in the fatty breast group and P = .602 in the dense breast group).
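The overall false-positive mark rates can be checked against the mark counts given under CAD Marks. The sketch below assumes that the denominator is all 364 images per examination (190 images of breasts with cancer plus 174 images of healthy breasts); the source does not state the denominator explicitly, and the variable names are illustrative.

```python
# Images per examination: 95 breasts with cancer and 87 healthy breasts, two views each.
IMAGES_PER_EXAM = 190 + 174

# False-positive mark counts from the CAD Marks subsection.
false_positive_marks = {
    "initial": {"mass": 76, "microcalcification": 30},
    "follow_up": {"mass": 73, "microcalcification": 24},
}


def marks_per_image(n_marks: int, n_images: int) -> float:
    """Mean number of marks per image."""
    return n_marks / n_images


rates = {}
for exam, counts in false_positive_marks.items():
    rates[exam] = {
        kind: round(marks_per_image(n, IMAGES_PER_EXAM), 2)
        for kind, n in counts.items()
    }
    # Overall rate uses the summed counts, not the sum of rounded rates.
    rates[exam]["overall"] = round(
        marks_per_image(sum(counts.values()), IMAGES_PER_EXAM), 2
    )

for exam, by_kind in rates.items():
    print(exam, by_kind)
```

Under this assumption, the recomputed values agree with the rates reported above for both examinations.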
Reproducibility
In 95 breasts with cancer, image-based reproducibility was 53% (100 of 190 mammograms) (Fig 2; Table E3, http://radiology.rsnajnls.org/cgi/content/full/246/1/71/DC1). When we excluded 10 images with no CAD marks at either session, the reproducibility of the remaining 180 images was reduced to 50% (90 of 180 images). In 87 healthy breasts, image-based reproducibility was 60% (105 of 174 mammograms); when we excluded 98 mammograms with no CAD marks at either session, this reproducibility was reduced to 9% (seven of 76 mammograms) (Table E3, http://radiology.rsnajnls.org/cgi/content/full/246/1/71/DC1). Image-based reproducibility with and without CAD marks was similar in the fatty (58% [76 of 132 mammograms]) and dense (56% [129 of 232 mammograms]) breast groups (P = .776).

Figure 2: (a) Initial craniocaudal (left) and mediolateral oblique (right) digital mammograms of a 42-year-old woman with invasive ductal carcinoma seen as a 1.0-cm ill-defined mass (arrows) at screening mammography. (b) Screen-capture images of computer monitor display of CAD system output of initial mammograms. The CAD system marked the mass (*) correctly in both craniocaudal (left) and mediolateral oblique (right) views. There is a false-positive mass mark (arrow) in the central portion of the breast. (c) Screen-capture images of computer monitor display of CAD system output of follow-up mammograms obtained 31 days later. The CAD system marked the mass (*) correctly in both craniocaudal and mediolateral oblique views. There is a false-positive mass mark (arrow) in the central portion of the breast. The CAD marks in b and c are identical. CC = craniocaudal, MLO = mediolateral oblique.
Overall mark-based reproducibility for true-positive marks was 85% (182 of 214 marks), with 80% reproducibility (94 of 118 marks) for true-positive mass marks and 92% reproducibility (88 of 96 marks) for true-positive microcalcification marks (Table 1, Fig 3). Reproducibility of a true-positive mass mark was 89% (54 of 61 marks) in the fatty breast group and 70% (40 of 57 marks) in the dense breast group (P = .013). Reproducibility of a true-positive microcalcification mark was 83% (15 of 18 marks) in the fatty breast group and 94% (73 of 78 marks) in the dense breast group (P = .159). Overall mark-based reproducibility for false-positive marks was 9% (17 of 189 marks), with 9% (13 of 137 marks) reproducibility for false-positive mass marks and 8% (four of 52 marks) reproducibility for false-positive microcalcification marks (Table 2). The reproducibilities of true-positive CAD marks for masses and microcalcifications were significantly (P < .001) higher than the reproducibilities of false-positive marks for masses and microcalcifications (85% [182 of 214 marks] vs 9% [17 of 189 marks]).
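The comparison between true-positive and false-positive mark reproducibility can be approximated from the reported counts alone. The sketch below is an illustration under stated assumptions, not the SPSS analysis used in the study: it treats each mark as an independent 0/1 outcome (reproduced or not), computes a pooled-variance unpaired t statistic, and approximates the two-sided P value with the normal distribution, which is reasonable only because the degrees of freedom are large. It ignores clustering of marks within patients.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def pooled_t_for_proportions(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, int]:
    """Unpaired (pooled-variance) t statistic for two groups of 0/1 outcomes.

    x = number of reproduced marks, n = total marks in the group.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Sample variance of n binary values with proportion p is n/(n-1) * p * (1-p).
    v1 = n1 / (n1 - 1) * p1 * (1 - p1)
    v2 = n2 / (n2 - 1) * p2 * (1 - p2)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = sqrt(pooled * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, df


# Reported counts: 182 of 214 true-positive marks vs 17 of 189 false-positive marks.
t_stat, df = pooled_t_for_proportions(182, 214, 17, 189)

# Normal approximation to the t distribution (assumption; acceptable for large df).
p_approx = 2 * NormalDist().cdf(-abs(t_stat))

print(f"t = {t_stat:.1f}, df = {df}, approximate two-sided P < .001: {p_approx < 0.001}")
```

The resulting statistic is far beyond any conventional threshold, which is consistent with the reported P < .001.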
Similarity of Images and CAD Reproducibility
Of 190 pairs of initial and follow-up craniocaudal and mediolateral oblique images of 95 breasts with cancer, 140 (74%) pairs of images were deemed similar and 50 (26%) pairs of images were deemed dissimilar. In 87 healthy breasts, 141 (81%) of 174 image pairs were deemed similar, whereas 33 (19%) pairs were deemed dissimilar. Among breasts with cancer, similar images were found in 75% (124 of 166 image pairs) of image pairs from breasts in which biopsy had been performed previously, whereas similar images were found in 67% (16 of 24 image pairs) of image pairs from breasts in which biopsy had not been performed previously (P = .315).
In 95 breasts with cancer, image-based reproducibility was 55% (77 of 140 mammograms) in the similar image group and 46% (23 of 50 mammograms) in the dissimilar image group (P = .276). In 87 healthy breasts, image-based reproducibility was 63% (89 of 141 mammograms) in the similar image group and 48% (16 of 33 mammograms) in the dissimilar image group (P = .123).
DISCUSSION
In our study, a CAD system was applied to initial and short-term follow-up digital mammograms of 93 women with breast cancer. Sensitivities of the CAD system at initial and follow-up digital mammography were 91% (64 of 70 lesions) and 89% (62 of 70 lesions), respectively, for masses. Sensitivity of the CAD system at both initial and follow-up mammography was 100% (49 of 49 lesions) for microcalcifications. Overall false-positive mark rates were 0.29 mark per image and 0.27 mark per image at initial and follow-up digital mammography, respectively. Identical images with and without CAD marks were obtained for 53% (100 of 190 mammograms) of mammograms of breasts with cancer and 60% (105 of 174 mammograms) of mammograms of healthy breasts. However, reproducibility was significantly higher for true-positive CAD marks than for false-positive marks (85% [182 of 214 marks] vs 9% [17 of 189 marks], P < .001). We investigated the effect of repeat image acquisition on CAD results by using initial and short-term follow-up digital mammograms of breasts with cancer and healthy breasts that were obtained in women with breast cancer. Our study differs from other studies in which CAD reproducibility was evaluated (4–7) in that the variability in CAD marks in the present study was caused mainly by differences in image acquisition at two time points rather than by repeated digitization of film mammograms.
Zheng et al (4) examined the reproducibility of CAD by scanning 100 positive mammograms (four views each) three times. In their study, sensitivities for mass detection ranged from 67% (64 of 96 masses) to 71% (68 of 96 masses) and sensitivities for microcalcification detection ranged from 96% (48 of 50 microcalcifications) to 100% (50 of 50 microcalcifications), with a false-positive mark rate of 0.50–0.52 mark per image. Identical images with and without CAD marks were found for 213 (53%) of the 400 mammograms. In currently available CAD systems used to analyze film or digital mammograms, a binary threshold is typically used to generate detection marks. Each marked region has a computed threshold; hence, lesions with computed values that are near the thresholds are vulnerable to small changes and may be detected on one image but missed on another (7). Our findings and the findings of other researchers in previous studies (4,5) show that the reproducibility of false-positive CAD marks is lower than that of true-positive CAD marks. We believe that positional changes between initial and short-term follow-up examinations are primarily responsible for the low reproducibility of false-positive CAD marks.
CAD performance may depend on background breast density. In our study, mass sensitivity was significantly higher for fatty breasts than for dense breasts, whereas microcalcification sensitivities were unaffected by breast density. This result is consistent with the findings of a previous study (12), in which researchers found that the sensitivity of a CAD system for lesions manifesting as masses was influenced by breast parenchymal density. In our study, all lesions missed by the CAD system (false-negative findings) manifested as masses. The early detection of breast cancer in the absence of microcalcifications, particularly in dense breasts, appears to be a difficult task, both for radiologists and for CAD systems (13–15). Several CAD algorithms used for detection of masses on digital mammograms have been reported to offer improved detection rates (16,17).
Our study had some limitations. To test the reliability of a CAD system, the object being tested should remain unchanged. Our use of images obtained during a second mammographic examination was a problem because alteration of the relative position and compression of the breast likely caused the image obtained in the second examination to be slightly different from that obtained in the first examination. This difference likely affected the CAD algorithm; however, image-based reproducibility was not significantly different between the similar-image and dissimilar-image groups when we compared the similarity of the initial and follow-up mammographic images of each breast and the CAD results. In our study, image-guided core-needle biopsy was performed in 81 (87%) of the 93 women between initial and follow-up mammography. This could have affected CAD reproducibility in breasts with cancer; however, retrospective analysis revealed that similar images were obtained at least as often in breasts that underwent biopsy as in breasts that did not (75% [124 of 166 image pairs] vs 67% [16 of 24 image pairs]). Because we selected breasts with cancer in which lesions were visible on both mammograms and included some palpable cancers, the sensitivity and reproducibility of the CAD system are likely to have been overestimated. Therefore, our results are not comparable with CAD sensitivity at screening. In addition, the sample size used to assess false-positive marks was small. This may explain why no differences in false-positive mark rates were found between dense and fatty breasts. We did not confirm that all false-positive CAD marks were actually negative regions with long-term (2–3-year) follow-up; however, no cancer was found in any remaining breasts after treatment for a mean follow-up period of 18 months.
In conclusion, when a CAD system was applied to initial and short-term follow-up digital mammograms, sensitivities were, respectively, 91% and 89% for masses and 100% and 100% for microcalcifications; overall false-positive mark rates were 0.29 mark per image and 0.27 mark per image at initial and follow-up digital mammography, respectively.
ADVANCES IN KNOWLEDGE
- Computer-aided detection (CAD) marked 91% (64 of 70 lesions) and 89% (62 of 70 lesions) of malignant masses on initial and follow-up mammograms, respectively, and 100% (49 of 49 lesions) of malignant microcalcifications on both initial and follow-up mammograms, with false-positive mark rates of 0.29 mark per image and 0.27 mark per image on initial and short-term follow-up digital mammograms, respectively.
- Identical images with and without CAD marks were found in only 53% (100 of 190 mammograms) of breasts with cancer and in 60% (105 of 174 mammograms) of healthy breasts. The reproducibility of true-positive CAD marks, however, was significantly higher than that of false-positive CAD marks (85% [182 of 214 mammograms] vs 9% [17 of 189 mammograms], P < .001).
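The percentages quoted in the two points above can be checked directly from the reported counts. The following is a minimal sketch, not the authors' analysis: the helper names `pct` and `two_proportion_p` are hypothetical, and the naive two-proportion z test used here treats every CAD mark as independent, which ignores the clustering of marks within the same patient.

```python
from math import erfc, sqrt


def pct(k: int, n: int) -> int:
    """Return k of n as a percentage rounded to the nearest integer."""
    return round(100 * k / n)


def two_proportion_p(k1: int, n1: int, k2: int, n2: int) -> float:
    """Two-sided P value of a pooled two-proportion z test.

    Assumption: observations are independent. This is a simplification
    chosen for illustration and is not the test reported in the paper.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    return erfc(abs(z) / sqrt(2))


# Counts taken from the summary points above.
reported = {
    "mass sensitivity, initial": (64, 70),
    "mass sensitivity, follow-up": (62, 70),
    "microcalcification sensitivity": (49, 49),
    "identical images, breasts with cancer": (100, 190),
    "identical images, healthy breasts": (105, 174),
    "reproducible true-positive marks": (182, 214),
    "reproducible false-positive marks": (17, 189),
}

for label, (k, n) in reported.items():
    print(f"{label}: {k} of {n} = {pct(k, n)}%")

p_value = two_proportion_p(182, 214, 17, 189)
print("true-positive vs false-positive reproducibility: P < .001 is", p_value < 0.001)
```

Even under this simplified independence assumption, the gap between the two reproducibility rates is far too large to be explained by chance; an analysis that accounts for clustered data would be needed to reproduce the published P value properly.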
IMPLICATION FOR PATIENT CARE
- When CAD marks are seen in the same area of the breast on initial and follow-up mammograms, the possibility of the presence of breast cancer should be carefully evaluated because true-positive CAD marks are significantly more reproducible than false-positive CAD marks.
FOOTNOTES
Abbreviations: BI-RADS = Breast Imaging Reporting and Data System, CAD = computer-aided detection
Guarantor of integrity of entire study, W.K.M.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; manuscript final version approval, all authors; literature research, S.J.K., W.K.M.; clinical studies, all authors; statistical analysis, S.J.K., W.K.M.; and manuscript editing, all authors.
Authors stated no financial relationship to disclose.
References
1. Warren Burhenne LJ, Wood SA, D'Orsi CJ, et al. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000;215:554–562.
2. Freer TW, Ulissey MJ. Screening mammography with computer-aided detection: prospective study of 12,860 patients in a community breast center. Radiology 2001;220:781–786.
3. Helvie MA, Hadjiiski L, Makariou E, et al. Sensitivity of noncommercial computer-aided detection system for mammographic breast cancer detection: pilot clinical trial. Radiology 2004;231:208–214.
4. Zheng B, Hardesty LA, Poller WR, Sumkin JH, Golla S. Mammography with computer-aided detection: reproducibility assessment—initial experience. Radiology 2003;228:58–62.
5. Baker JA, Lo JY, Delong DM, Floyd CE. Computer-aided detection in screening mammography: variability in cues. Radiology 2004;233:411–417.
6. Malich A, Azhari T, Bohm T, Fleck M, Kaiser WA. Reproducibility: an important factor determining the quality of computer-aided detection (CAD) systems. Eur J Radiol 2000;36:170–174.
7. Taylor CG, Champness J, Reddy M, Taylor P, Potts HW, Given-Wilson R. Reproducibility of prompts in computer-aided detection (CAD) of breast cancer. Clin Radiol 2003;58:733–738.
8. Baum F, Fischer U, Obenauer S, Grabbe E. Computer-aided detection in direct digital full-field mammography: initial results. Eur Radiol 2002;12:3015–3017.
9. Kim SJ, Moon WK, Cho N, et al. Computer-aided detection in digital mammography: comparison of craniocaudal, mediolateral oblique, and mediolateral views. Radiology 2006;241:695–701.
10. American College of Radiology. Breast Imaging Reporting and Data System: BI-RADS atlas. 4th ed. Reston, Va: American College of Radiology, 2003.
11. Obuchowski NA. On the comparison of correlated proportions for clustered data. Stat Med 1998;17:1495–1507.
12. Brem RF, Hoffmeister JW, Rapelyea JA, et al. Impact of breast density on computer-aided detection for breast cancer. AJR Am J Roentgenol 2005;184:439–444.
13. Ikeda DM, Birdwell RL, O'Shaughnessy KF, Sickles EA, Brenner RJ. Computer-aided detection output on 172 subtle findings on normal mammograms previously obtained in women with breast cancer detected at follow-up screening mammography. Radiology 2004;230:811–819.
14. Brem RF, Rapelyea JA, Zisman G, Hoffmeister JW, Desimio MP. Evaluation of breast cancer with a computer-aided detection system by mammographic appearance and histopathology. Cancer 2005;104:931–935.
15. Malich A, Sauner D, Marx C, et al. Influence of breast lesion size and histologic findings on tumor detection rate of a computer-aided detection system. Radiology 2003;228:851–856.
16. Li L, Clark RA, Thomas JA. Computer-aided diagnosis of masses with full-field digital mammography. Acad Radiol 2002;9:4–12.
17. Wei J, Sahiner B, Hadjiiski LM, et al. Computer-aided detection of breast masses on full field digital mammograms. Med Phys 2005;32:2827–2838.