Radiology


     


DOI: 10.1148/radiol.2373041418
Erratum: v242, p320
(Radiology 2005;237:834-840.)
© RSNA, 2005


Breast Imaging

Robustness of Computerized Lesion Detection and Classification Scheme across Different Breast US Platforms¹

Karen Drukker, PhD, Maryellen L. Giger, PhD and Charles E. Metz, PhD

¹ From the Department of Radiology MC2026, University of Chicago, 5841 S Maryland Ave, Chicago, IL 60637. Received August 13, 2004; revision requested October 21; revision received December 17; accepted January 20, 2005. Supported in part by United States Public Health Service grant CA89452 and U.S. Army Medical Research and Materiel Command grant 97-2445. Address correspondence to K.D. (e-mail: kdrukker@uchicago.edu).


    ABSTRACT
 
PURPOSE: To evaluate the performance of a computerized detection and diagnosis method with breast ultrasonographic (US) images obtained with US equipment from two different manufacturers.

MATERIALS AND METHODS: Two independent clinical breast US databases were used in this performance study. Data collection and database use were HIPAA-compliant and followed institutional review board–approved protocols, with waiver of informed consent. One database consisted of 1740 images obtained in 458 women with Philips US equipment. The other database consisted of 151 images obtained in 151 women with Siemens US equipment. The testing protocols included independent testing and round-robin analysis. The computerized scheme detects potential lesions, calculates imaging features for all candidate lesions, and subsequently classifies candidate lesions into different categories. Two separate classification tasks were evaluated: distinction between all actual lesions and false-positive detections and distinction between actual cancers and all other detected lesion candidates. Statistical analysis was performed by using both receiver operating characteristic (ROC) and free-response ROC methods.

RESULTS: For the distinction between all actual lesions and false-positive detections, area under the ROC curve (Az) values ranged between 0.87 and 0.95 for different testing protocols. In two instances, the difference in performance between databases was significant (P < .01), but it was shown that this was due to the difference in size of the databases. In the distinction of cancer from all other detections, the Az values ranged between 0.80 and 0.86. No statistically significant difference was found among the different testing protocols in this instance.

CONCLUSION: These results indicate that the performance of this fully automated computerized lesion detection and classification method, which demonstrated robustness over the different US equipment used, is promising.



    INTRODUCTION
 
Screening for breast cancer with mammography is widely acknowledged to have reduced breast cancer mortality (1–5). The limitations of mammography in the detection of cancer in dense breasts, however, have stimulated investigation of the potential screening effectiveness of other imaging modalities. The most important alternatives to mammography are ultrasonography (US) and magnetic resonance imaging (6–8). Multi-institutional American College of Radiology Imaging Network projects are currently being conducted to assess each of these imaging modalities for screening.

US is second only to mammography as a breast imaging technique and has been used in breast imaging for several decades. In the 1970s and 1980s, fear of radiation-induced breast cancer stimulated a search for methods for breast cancer screening other than mammography. At that time, automated prone and supine US scanners were developed that generated numerous compound images of breast tissue. In effect, use of these automated scanners yielded studies that were repeatable, comprehensive, of low risk, widely available (9), and suitable for use in screening (10,11). Since then, however, mammography has become the screening method of choice because the quality of mammography improved while the radiation doses attendant to its use decreased. The expensive automated US systems became less desirable and finally were removed from clinical use. At present, US functions largely as a diagnostic rather than screening examination. In addition to its use in guiding interventional procedures, handheld US has been used predominantly to characterize masses identified mammographically or with palpation, thereby providing specificity in the evaluation of lesions detected with such methods (12–14).

However, with recent technical advances in US system performance, interest has grown in the use of US as a screening method, particularly for women with dense breasts, in whom mammography is inherently limited (6). In particular, reports of individual studies (15–18) indicate that survey US has found three to six additional cancers per thousand that were not seen on mammograms in patients with dense breasts.

It has been reported that for interpretation of mammograms, double reading (either by radiologists or with application of a computer-aided detection algorithm) can increase the sensitivity of breast cancer detection by 7%–10% (19). Analogously, for breast US, application of computer-aided detection algorithms (20–28) might also result in improved sensitivity and/or specificity. We have previously used a computerized scheme (developed in house) that detects suspicious areas on sonograms on the basis of expected lesion shape and margin characteristics and then analyzes the suspicious findings (22–24). In those studies (22–24), this computerized scheme was trained and tested in extensive independent databases collected at different institutions, and the results were promising. However, the equipment used to collect the images for each database was from the same manufacturer. Thus, the purpose of our current study was to evaluate the performance of the computerized scheme with images obtained by using US scanners from two different manufacturers.


    MATERIALS AND METHODS
 
Databases
Two retrospective databases of US images were used in this study. Data collection and database use were compliant with the terms of the Health Insurance Portability and Accountability Act and followed institutional review board–approved protocols, with waiver of informed consent. Patients had been referred for diagnostic US for various reasons, including young age, dense breast parenchyma with microcalcifications, the need for US-guided core biopsy, and the need to distinguish between cystic and solid lesions among abnormalities detected with mammography. The present study dealt only with mass and cystic lesions; microcalcifications were not analyzed.

Database A was collected in the Department of Radiology at the University of Chicago, Chicago, Ill. For this database, images were obtained with an HDI 3000 US unit (Philips, Best, the Netherlands) and a 12–5-MHz linear array probe and were captured from the video signal. The number of images available per patient varied from one to 20; three to five images were available for most patients. Most images had a vertical dimension of 3–4 cm and a similar horizontal dimension. The pixel size varied; average pixel size was 96 µm. All lesions were outlined by a radiologist with more than 25 years of experience in breast imaging.

The images in database B had been obtained with a Sonoline Elegra US unit (Siemens Medical Solutions, Malvern, Pa); this database was provided by Siemens. For this database only, a single image per patient—reconstructed from radiofrequency data used in another study—was available, and the pixel size was fixed at 85.6 µm. All lesions were outlined by several experienced radiologists. We verified that the overlap between lesion outlines drawn by different radiologists was similar to the overlap between manually drawn and computer-determined lesion outlines. It is important to note that interradiologist variability with respect to lesion outlining was not expected to play a major role in the assessment of the performance of our scheme because computer detection points were expected to be near lesion centers rather than the more variable lesion borders.

The initial database (database A) consisted of 1740 US images obtained in 458 women. There were 64 patients with one or more complex cysts (258 images), 110 with simple cysts (520 images), 53 with benign solid lesions (210 images), 23 with malignant solid lesions (87 images), 20 with other benign breast disease such as an abscess (87 images), and 231 with 578 "normal" images—that is, images that showed normal breast parenchyma and were devoid of abnormalities (some women had more than one lesion or more than one kind of image in the database). There were a total of 1281 lesion outlines in this database, and some images showed up to four lesions (usually small cysts). The 231 patients in whom the normal images were obtained (average age, 51 years; age range, 17–76 years) included both healthy female patients and female patients with an abnormality in a different location not shown by the image. The only case selection criterion for this database was that biopsy proof was available for all lesions (except obvious simple cysts). Images deemed normal at the time of examination were verified as such a posteriori by an independent radiologist.

Database B consisted of data for 151 patients and included a single image per patient and a single lesion on each image that depicted an abnormality. There were 14 patients with a cyst, 66 with benign solid lesions, 45 with malignant solid lesions, 20 with other benign breast disease, and six with normal images.

An overview of both databases is given in Table 1. A "case" is defined here as a patient. Some patients in database A had multiple lesions, most of which were cysts occurring in the same region of the breast. Difficulties in the identification of multiple lesions in different views of the same breast region prohibited a "by actual lesion" analysis. In database B, a single lesion was present per patient, except in patients in whom the US image did not depict an abnormality.



 
TABLE 1. Overview of Database Composition

 
Lesion Detection, Segmentation, and Classification
Our computerized image analysis method involves an automated three-stage process. The first stage is the detection of lesion candidates, the second stage is the segmentation of those candidates from the parenchymal background, and the third stage is the classification of lesion candidates into two categories. In the detection stage, the computer automatically searches images for lesionlike shapes and identifies them as lesion candidates (22,26). Subsequently, all candidate lesions are segmented and image features are calculated (20,21,25). In the classification stage, a Bayesian neural network evaluates the candidate lesions on the basis of the extracted image features (27). The computer scheme requires no human input and has been described extensively elsewhere (22–24).

Performance Evaluation
So that we could evaluate the performance of the computerized method, we investigated two modes of image analysis, each consisting of detection, segmentation, and classification stages. The detection and segmentation stages for both analysis modes were identical, resulting in the same lesion candidates. The classification stages differed, however, in that the first classifier was trained to distinguish actual lesions from false-positive detections (referred to as the true-positive vs false-positive classification task) and the second classifier was trained to distinguish malignancies from all other candidate lesions (the cancer vs all other lesion candidates [including false-positive detections, which are not lesions] task). The latter task involves simultaneously distinguishing cancerous lesions from both false-positive detections and all benign lesions. The distinction between findings of cancer and all other findings is of interest because, ultimately, the goal in clinical practice is to detect all cancers while keeping the number of unnecessary biopsies to a minimum. These classification tasks were considered independently, and both classifiers were trained and tested separately.

For classification, four image features were used at any given time because the use of too many features is known to result in overtraining and hence limits performance in the testing of an independent database (28). For each of the two classification tasks, the computer selected a subset of four image features from a set of 46 features that characterized each lesion candidate and the local environment (surrounding tissue). The four-feature subsets were computer-selected for the training database for each classification task in a stepwise fashion by using performance optimization based on Wilks λ (29). The subset of features for the first classification task mathematically described the shape, margin, texture, and posterior acoustic characteristics of the candidate lesions (four features) (21). The subset of features for the second classification task mathematically described the margin sharpness and spiculation (two features) and texture (two features) of the candidate lesion (24).
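
As an illustration only, stepwise selection driven by Wilks λ can be sketched as a greedy forward search that, at each step, adds the feature giving the smallest ratio of within-group to total scatter determinants. This is a simplified sketch, not the selection procedure of the study: it only adds features (no removal step or F-to-enter criterion), and the function names and the toy data layout (rows of feature values plus a 0/1 label list) are assumptions made for the example.

```python
def _det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


def _scatter(rows, cols):
    """Sum-of-squares-and-cross-products matrix about the mean for the chosen columns."""
    n = len(rows)
    means = [sum(r[c] for r in rows) / n for c in cols]
    return [[sum((r[ci] - mi) * (r[cj] - mj) for r in rows)
             for cj, mj in zip(cols, means)]
            for ci, mi in zip(cols, means)]


def wilks_lambda(X, y, cols):
    """Wilks lambda = det(within-group scatter) / det(total scatter)."""
    total = _scatter(X, cols)
    within = [[0.0] * len(cols) for _ in cols]
    for label in set(y):
        group = [x for x, l in zip(X, y) if l == label]
        s = _scatter(group, cols)
        for i in range(len(cols)):
            for j in range(len(cols)):
                within[i][j] += s[i][j]
    dt = _det(total)
    return _det(within) / dt if dt else 1.0


def forward_select(X, y, n_features=4):
    """Greedy forward selection: repeatedly add the feature that minimizes Wilks lambda."""
    chosen = []
    remaining = list(range(len(X[0])))
    while remaining and len(chosen) < n_features:
        best = min(remaining, key=lambda c: wilks_lambda(X, y, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

A smaller λ means the chosen features separate the two groups better; in this sketch `n_features=4` mirrors the four-feature subsets described above.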

Statistical Analysis
The ability of each classifier to classify the detected lesion candidates was assessed with receiver operating characteristic (ROC) analysis (30–32). Values for the area under the ROC curve (Az) were used as figures of merit. The overall performance (that is, how well actual lesions were detected and classified) was assessed with free-response ROC analysis (33). Both ROC and free-response ROC curves were obtained by varying the threshold value for the output unit of the classifiers. All statistical work and evaluation were performed by one author (K.D.) with verification by the other two authors (C.E.M. and M.L.G.).
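
The threshold sweep that generates an ROC curve can be sketched as follows. This is a minimal, hedged example: it computes an empirical (trapezoidal) area under the curve, which is a nonparametric stand-in and not necessarily the curve-fitting method behind the Az values reported here. The function names and the toy scores are assumptions for illustration.

```python
def roc_points(scores, labels):
    """Sweep the decision threshold over the classifier outputs.

    Returns a list of (false-positive fraction, true-positive fraction)
    pairs, starting at (0, 0). Labels are 1 for positive, 0 for negative.
    """
    pos = sum(1 for l in labels if l == 1)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the empirical ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy classifier outputs (hypothetical values, not study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = trapezoid_auc(roc_points(scores, labels))
```

Each distinct classifier output is used once as a threshold, so lowering the threshold moves the operating point toward the upper right corner of the curve.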

For the first classifier, which was designed to distinguish between actual lesions and false-positive findings, all actual lesions that were correctly identified (for a given threshold value for the classifier output) were counted as true-positive results, and all detection points that did not represent actual lesions were counted as false-positive results. For the second classifier, designed to identify cancer, true-positive results were defined as correctly identified cancerous lesions, and all other detection points (including those that represented actual benign lesions) were counted as "false-positive cancers". The decision as to whether a lesion was considered to be correctly identified was based on whether the computer detection point lay within the radiologist's lesion outline. If a given lesion was detected multiple times in a single image (ie, the computer identified multiple possible lesion centers), only one true-positive result was counted for that lesion; the other detection points within that lesion outline were not counted. The numbers of false-positive detections were summed.
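
The counting rule for a single image can be sketched as below, assuming detection points that have already passed the classifier threshold and lesion outlines stored as polygons of (x, y) vertices. The ray-casting test and all names are assumptions for the example; the source does not state how point-in-outline membership was computed.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: True if p lies inside the closed polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from p to the right cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def score_image(detections: Sequence[Point],
                outlines: Sequence[Sequence[Point]]) -> Tuple[int, int]:
    """Return (true_positives, false_positives) for one image.

    A lesion counts once no matter how many detection points fall inside
    its outline; every point outside all outlines adds one false positive.
    """
    hit = [False] * len(outlines)
    false_positives = 0
    for d in detections:
        matched = [i for i, poly in enumerate(outlines)
                   if point_in_polygon(d, poly)]
        if matched:
            for i in matched:
                hit[i] = True  # repeat detections of the same lesion are not re-counted
        else:
            false_positives += 1
    return sum(hit), false_positives


# One square outline, two points inside it, one point outside.
outline = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
result = score_image([(5.0, 5.0), (6.0, 6.0), (20.0, 20.0)], [outline])
```

For the cancer classifier, the same routine would apply with `outlines` restricted to malignant lesions, so that detections inside benign outlines fall through to the false-positive count.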

ROC analysis was performed on a "by region of interest" basis. Each image was considered as a separate entity, and, for images that depicted multiple lesions, each lesion outline was considered to be a separate region of interest. For each region of interest, only a single true-positive detection was possible, while false-positive detections were additive. Free-response ROC analysis was also performed "by region of interest."

So that we could evaluate robustness across image acquisition platforms (ie, manufacturer systems), we first trained our classifier on database B and then tested it on database A, without adjusting any of the scheme's parameter values. We subsequently reversed the roles of the databases and applied the scheme to database A as the training set and to database B as the testing set, again without adjusting any of the parameter values. We compared this training and testing protocol with a protocol in which a single database was used for both training and testing by employing a round-robin (leave-one-out) approach. This was performed for both classification tasks. It should be noted that the feature subsets used for classification differed only for each classification task and did not depend on the testing protocol.
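
The two testing protocols can be contrasted in a short sketch. The classifier here is a deliberately trivial stand-in (a threshold at the midpoint of the class means on one feature), not the Bayesian neural network of the study, and the unit left out in the round-robin loop is a single sample, which is an assumption; the text does not say whether images or patients were left out.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (feature value, label: 1 = positive, 0 = negative)


def train_threshold(samples: List[Sample]) -> float:
    """Toy stand-in classifier: midpoint between the two class means.

    Requires at least one sample of each class.
    """
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def independent_test(train_db: List[Sample], test_db: List[Sample]) -> List[float]:
    """Train on one database, then score every sample of the other."""
    thr = train_threshold(train_db)
    return [x - thr for x, _ in test_db]


def round_robin(db: List[Sample]) -> List[float]:
    """Leave-one-out: each sample is scored by a model trained on all the others."""
    outputs = []
    for i, (x, _) in enumerate(db):
        rest = db[:i] + db[i + 1:]
        outputs.append(x - train_threshold(rest))
    return outputs


# Hypothetical toy databases, not study data.
db_a = [(0.0, 0), (1.0, 0), (3.0, 1), (4.0, 1)]
db_b = [(0.5, 0), (3.5, 1)]
scores_b_from_a = independent_test(db_a, db_b)  # train on A, test on B
scores_a_rr = round_robin(db_a)                 # train and test within A
```

In both protocols no sample is scored by a model that saw it during training; the difference is whether the training data come from the same database or from the other one.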

The classifier performance for both databases was compared by calculating P values and 95% confidence intervals for differences in Az values (32). Statistical significance was determined by using the Holm test (34). The statistical analysis of classification performance consisted of three parts: First, the performance on each of the full databases was compared. It is known, however, that for data sets with multiple images per patient, the P values tend to be underestimated because ROC analysis assumes independence of all observations. This makes it more difficult to show that there are no statistically significant differences. Hence, the second part of the analysis involved obtaining an estimate for the upper bound of the P values by randomly selecting a single image per patient for database A in the evaluation. The third part of the analysis assessed the influence of data set size and consisted of comparing performance by using 10 randomly selected subsets of database A that had the same size as database B (and included a single image per patient).
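
Two steps of this analysis lend themselves to a short sketch: the Holm step-down procedure for multiple comparisons and the random selection of one image per patient. The significance level `alpha=0.05`, the fixed random seed, and the (patient, image) tuple layout are assumptions for the example and are not taken from the study.

```python
import random


def holm(p_values, alpha=0.05):
    """Holm step-down procedure.

    Returns a list of booleans, True where the null hypothesis is rejected.
    The smallest p value is compared with alpha/m, the next with
    alpha/(m-1), and so on; testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p values are also not rejected
    return reject


def one_image_per_patient(images, seed=0):
    """Pick one image at random for each patient.

    `images` is a sequence of (patient_id, image_id) pairs; the result
    maps each patient_id to the single image_id that was kept.
    """
    rng = random.Random(seed)
    by_patient = {}
    for patient_id, image_id in images:
        by_patient.setdefault(patient_id, []).append(image_id)
    return {pid: rng.choice(ids) for pid, ids in by_patient.items()}


# Hypothetical p values for four pairwise comparisons.
decisions = holm([0.01, 0.04, 0.03, 0.005])
subset = one_image_per_patient([("p1", "a"), ("p1", "b"), ("p2", "c")])
```

Restricting the data to one image per patient removes the within-patient correlation that makes the P values from the full database too small.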


    RESULTS
 
Example images from both databases, along with the center points of lesion candidates identified by the computer scheme, are shown in Figure 1.



Figure 1. Example images for both databases. (a, b) Database A. (a) Transverse US image of malignant lesion (pixel size, 108 µm). (b) Same image as a with computer detection points (+). (c, d) Database B. (c) Transverse US image of malignant lesion (pixel size, 85.6 µm). (d) Same image as c with computer detection point (+). Note that one of the lesion candidates identified by the computer in b is a false-positive detection.

 
The classification performance varied; Az values ranged from 0.80 to 0.95 (Fig 2, Tables 2 and 3). As compared with database A, database B was more difficult with regard to the true-positive versus false-positive classification task but easier with respect to the cancer versus all other lesions classification task (Table 2). In addition, database B represented the population from which database A was drawn well for the cancer versus all other lesions classification task but not as well for the true-positive versus false-positive classification task. Database A, on the other hand, apparently represented the population from which database B was drawn well for both classification tasks.



Figure 2. ROC curves for the two classification tasks in the independent testing (IT) and round-robin (RR) evaluation protocols when the indicated database (A or B) was used for testing. (a) ROC curves for classification of true-positive (TP) findings (that is, actual lesions) versus false-positive (FP) detections. (b) ROC curves for classification of cancer (CA) versus all other lesion candidates (OT). Curves pertaining to testing of database A are represented by solid lines, whereas curves for testing of database B are represented by dashed lines. Note that in a, the curves for database B are very similar, while the discrepancy between the curves for database A is slightly larger. In b, the curves for both databases are very similar.

 


 
TABLE 2. Comparison of Performance of the Two Classifiers

 


 
TABLE 3. Results of Statistical Assessment of Differences in the Performance of the Two Classifiers

 
In most instances, no statistically significant difference in performance was found among the different testing protocols for the two databases (Table 3). In these instances, we could not distinguish between the performance of a classifier trained on one database and tested on an independent database obtained with a different US system and the performance of a classifier trained and tested on a single database. Significant differences in performance for the true-positive versus false-positive classification task were found, however, between round-robin analysis of database A and independent testing of both database A and database B. Note that independent testing of database B yielded a higher Az value than round-robin analysis of database B. Apparently, database A better represents the population from which database B was drawn, probably owing to its much larger size.

So that we could obtain an upper bound for the P values, the above analysis was repeated for a randomly selected subset of database A, limiting the number of images per patient to one. Only the difference in performance between round-robin analysis of database A and independent testing of the same database remained statistically significant. This indicates that the other initially observed significant difference may have been caused by the underestimation of P values resulting from application of ROC methods to a data set containing multiple images per patient.

So that we could evaluate to what extent the difference in size between databases A and B was likely to be the cause of differences in performance, we performed analysis with randomly selected subsets of database A that had the same size as that of database B. We used 10 randomly selected subsets from database A that contained a single image per patient in round-robin analysis and in the independent testing protocol. No attempts were made to mimic the prevalence of different lesion types in database B. The resulting Az values were all very similar, with no statistically significant differences (Table 4).



 
TABLE 4. Performance of the Two Classifiers in the TP/FP Task when Subsets of Database A Were Used

 
Whereas ROC analysis deals with the classification step only (that is, how well the classifier is able to distinguish between two classes within the detections), free-response ROC analysis assesses the overall process, including the detection of lesion candidates (Fig 3, Table 5). In our analysis, the false-positive rate was given at an operating point of 80% per-region-of-interest sensitivity for the free-response ROC curves for the two classification tasks (Table 5). The maximum false-positive detection rate for database B was approximately half of the rate for database A, while database B also attained a slightly higher maximum sensitivity for the detection of actual lesions (Fig 3).
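
Reading a false-positive rate off a free-response ROC curve at a fixed sensitivity can be sketched as follows. The data layout is an assumption for the example: each detection is a (classifier output, region-of-interest identifier) pair, with `None` marking a detection that lies outside every lesion outline. The toy values are hypothetical.

```python
def froc_points(detections, n_lesions, n_images):
    """Sweep the classifier threshold and return FROC operating points.

    Each point is (false positives per image, per-region sensitivity).
    A region of interest counts as detected once, however many
    detections above threshold fall inside it.
    """
    points = []
    for t in sorted({s for s, _ in detections}, reverse=True):
        kept = [(s, roi) for s, roi in detections if s >= t]
        hit = {roi for _, roi in kept if roi is not None}
        n_fp = sum(1 for _, roi in kept if roi is None)
        points.append((n_fp / n_images, len(hit) / n_lesions))
    return points


def fp_rate_at_sensitivity(points, target=0.80):
    """False positives per image at the first point reaching the target sensitivity."""
    for fp_per_image, sensitivity in points:
        if sensitivity >= target:
            return fp_per_image
    return None  # the curve never reaches the target sensitivity


# Hypothetical detections over 4 images containing 4 lesions in total.
dets = [(0.9, "L1"), (0.8, None), (0.7, "L2"), (0.6, "L2"),
        (0.5, None), (0.4, "L3"), (0.3, "L4")]
curve = froc_points(dets, n_lesions=4, n_images=4)
rate = fp_rate_at_sensitivity(curve, target=0.80)
```

Because the detection stage can miss lesions entirely, the curve may level off below 100% sensitivity, which is why the function returns `None` when the target is never reached.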



Figure 3. Free-response ROC curves for the two combined detection, segmentation, and classification schemes (in which the detection and segmentation stages were identical). (a) Curves for scheme performance in the true-positive (TP) versus false-positive (FP) classification task. (b) Curves for scheme performance in the cancer (CA) versus all other lesions (OT) classification task. Note that in b, the maximum false-positive rate per image for database B is 1.04, whereas the maximum rate for database A is 1.43.

 


 
TABLE 5. Results of Free-Response ROC Analysis: Overall Performance of Computer Scheme

 
The higher rate of false-positive detections per image for database A was probably caused by differences in image characteristics between the two databases. Database A contained many images with posterior acoustic shadowing caused by a variety of factors, including normal patient anatomy (ribs, lungs) and scanning technique. Many of these dark areas were identified in the detection stage of the computerized scheme as lesion candidates. On the other hand, the images in database B had little shadowing and a more diffuse appearance and thereby limited the identification of artifacts as lesionlike shapes. Interestingly, the free-response ROC curves for the detection and classification of cancers appear to be more similar for both databases than the free-response ROC curves for the detection and identification of all actual lesions.


    DISCUSSION
 
We have presented a computerized detection and classification scheme and tested its performance across different US systems. The detection and segmentation stages of the computerized scheme were identical in the tests reported herein and had been developed and calibrated in earlier investigations (22). Computerized methods for analyzing breast US images are being explored by several other research groups (35–38). To date, however, work has focused on lesion classification (given a known lesion location), while lesion detection has received much less attention. The Az values for classification reported in the present article are comparable to those observed in other recent investigations (35–38).

We previously demonstrated the robustness of our scheme across institutions by using similar US equipment from the same manufacturer (24). In the present study, classifiers trained on one database were tested on the other database, without adjusting any of the values for parameters of the computer algorithm.

Significant differences in classifier performances were observed only in the task of distinguishing true-positive detections (actual lesions) from false-positive detections. No significant differences were observed, however, when randomly selected smaller subsets of database A were used in the same classifier training and testing protocols. Hence, the differences in performance that were initially found were probably due to the fact that the much smaller database B did not adequately represent the population in database A. In other words, database B did not have as wide a range of lesion candidate characteristics as its much larger counterpart database A, and the classifier trained on database B was not optimally able to characterize the candidates in database A. In the task of distinguishing cancer from all other candidates, however, the limited size of database B was most likely counterbalanced by the higher prevalence of cancer in it.

Performance of the computerized detection scheme in terms of false-positive rates was probably affected by differences in image characteristics between the two data sets. These image characteristics depend not only on the scanning equipment used but also on the operators performing the examinations. It was not the aim of this study to optimize performance for each data set individually. In practice, performance for data sets obtained with different equipment, or even by different users, can—and probably should—be further optimized.

There were several study limitations. The first limitation—pertaining to lesion candidate classification—was the difference in size between the two databases. The second limitation—pertaining to lesion candidate detection—was the use of still images rather than of the streaming data acquired in clinical breast examinations by using handheld US scanners. The third limitation—pertaining to the statistical analysis—was the way in which we included multiple images per patient in the performance analysis.

Future work will include, first, evaluation of performance with larger data sets by using a statistical analysis method that incorporates image multiplicity. Second, we are currently working on evaluating performance in a prospective clinical study rather than on the retrospectively collected images used in the present investigation. Ultimately, our goal is to evaluate our system in real time in a clinical setting.

In conclusion, we have shown that our scheme performed consistently and well in the detection and classification of breast lesions in the US data sets used in this study—data sets that were acquired not only at different institutions but also by several different operators, with equipment from two vendors, and in different patient populations.


    ACKNOWLEDGMENTS
 
Special thanks to Michael S. Stern, BA, and the Ultrasound Division of Siemens Medical Solutions for their help in collecting the US databases and to Lorenzo L. Pesce, PhD, for helpful discussions on statistical analysis.


    FOOTNOTES
 

Abbreviations: Az = area under ROC curve • ROC = receiver operating characteristic

M.L.G. and C.E.M. are shareholders in R2 Technology (Los Altos, Calif).

Author contributions: Guarantors of integrity of entire study, K.D., M.L.G.; study concepts/study design or data acquisition or data analysis/interpretation, K.D., M.L.G., C.E.M.; manuscript drafting or manuscript revision for important intellectual content, K.D., M.L.G., C.E.M.; approval of final version of submitted manuscript, K.D., M.L.G., C.E.M.; literature research, K.D.; statistical analysis, K.D., C.E.M.; and manuscript editing, K.D., M.L.G.


    References

  1. Smart CR, Byrne C, Smith RA, et al. Twenty-year follow-up of the breast cancers diagnosed during the breast cancer detection demonstration project. CA Cancer J Clin 1997;47:134–149.
  2. Tabar L, Vitak B, Chen HH, Yen MF, Duffy SW, Smith RA. Beyond randomized controlled trials: organized mammographic screening substantially reduces breast carcinoma mortality. Cancer 2001;91:1724–1731.
  3. Tabar L, Vitak B, Chen HH, et al. The Swedish two-county trial twenty years later: updated mortality results and new insights from long-term follow-up. Radiol Clin North Am 2000;38:625–651.
  4. Feig SA. Effect of service screening mammography on population mortality from breast carcinoma. Cancer 2002;95:451–457.
  5. Humphrey LL, Helfand M, Chan BK, Woolf SH. Breast cancer screening: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med 2002;137:347–360.
  6. Kolb TM, Lichy J, Newhouse JH. Occult cancer in women with dense breasts: detection with screening US—diagnostic yield and tumor characteristics. Radiology 1998;207:191–199.
  7. Kerlikowske K, Grady D, Barclay J, Sickles EA, Ernster V. Effect of age, breast density, and family history on the sensitivity of first screening mammography. JAMA 1996;276:33–38.
  8. Kolb TM, Lichy J, Newhouse JH. Comparison of the performance of screening mammography, physical examination, and breast US and evaluation of factors that influence them: an analysis of 27,825 patient evaluations. Radiology 2002;225:165–175.
  9. Dempsey PJ. Breast sonography: historical perspective, clinical applications and image interpretation. Ultrasound Q 1998;6:69–90.
  10. Mendelson EB. The breast. In: Rumack CM, Wilson SR, Charboneau JW, eds. Diagnostic ultrasound. 2nd ed. St Louis, Mo: Mosby, 1998; 751–789.
  11. Jackson VP, Kelly-Fry E, Rothschild PA, Holden RW, Clark SA. Automated breast sonography using a 7.5 MHz PVDF transducer: preliminary clinical evaluation. Radiology 1986;159:679–684.
  12. Rahbar G, Sie AC, Hansen GC, et al. Benign versus malignant solid breast masses: US differentiation. Radiology 1999;213:889–894.
  13. American College of Radiology. Standard for the performance of the breast ultrasound examination. Reston, Va: American College of Radiology, 2002; 593–595.
  14. Stavros AT, Thickman D, Rapp CL, Dennis MA, Parker SH, Sisney GA. Solid breast nodules: use of sonography to distinguish benign and malignant lesions. Radiology 1995;196:123–134.
  15. Gordon PB, Goldenberg SL. Malignant breast masses detected only by ultrasound: a retrospective review. Cancer 1995;76:626–630.
  16. Berg WA, Gilbreath PL. Multicentric and multifocal cancer: whole-breast US in preoperative evaluation. Radiology 2000;214:59–66.
  17. Kaplan SS. Clinical utility of bilateral whole-breast US in the evaluation of women with dense breast tissue. Radiology 2001;221:641–649.
  18. Buchberger W, Niehoff A, Obrist P, DeKoekkoek-Doll P, Dunser M. Clinically and mammographically occult breast lesions: detection and classification with high-resolution sonography. Semin Ultrasound CT MR 2000;21:325–336.
  19. Karssemeijer N, Otten JD, Verbeek AL, et al. Computer-aided detection versus independent double reading of masses on mammograms. Radiology 2003;227:192–200.
  20. Horsch K, Giger ML, Venta LA, Vyborny CJ. Automatic segmentation of breast lesions on ultrasound. Med Phys 2001;28:1652–1659.
  21. Horsch K, Giger ML, Venta LA, Vyborny CJ. Computerized diagnosis of breast lesions on ultrasound. Med Phys 2002;29:157–164.
  22. Drukker K, Giger ML, Horsch K, Kupinski MA, Vyborny CJ, Mendelson EB. Computerized lesion detection on breast ultrasound. Med Phys 2002;29:1438–1446.
  23. Drukker K, Giger ML, Vyborny CJ, Schmidt RA, Mendelson EB, Stern M. Computerized detection and classification of lesions on breast ultrasound. In: Sonka M, Fitzpatrick JM, eds. Proceedings of SPIE: medical imaging 2003—image processing. Vol 5032. Bellingham, Wash: International Society for Optical Engineering, 2003; 106–110.
  24. Drukker K, Giger ML, Vyborny CJ, Mendelson EB. Computerized detection and classification of cancer on breast ultrasound. Acad Radiol 2004;11:526–535.
  25. Kupinski MA, Giger ML. Automated seeded lesion segmentation on digital mammograms. IEEE Trans Med Imaging 1998;17:510–517.
  26. Kupinski MA, Giger ML, Bahr AE. Computerized detection of mass lesions in digital mammography using radial gradient index filtering (abstr). Radiology 1999; 213(P):566.
  27. Kupinski MA, Edwards DC, Giger ML, Metz CE. Ideal observer approximation using Bayesian classification neural networks. IEEE Trans Med Imaging 2001;20:886–899.
  28. Kupinski MA, Giger ML. Feature selection with limited datasets. Med Phys 1999;26:2176–2182.
  29. Lachenbruch PL. Discriminant analysis. London, England: Hafner, 1975.
  30. Metz CE. Basic principles of ROC analysis. Semin Nucl Med 1978;8:283–298.
  31. Metz CE, Herman BA, Shen JH. Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously-distributed data. Stat Med 1998;17:1033–1053.
  32. Metz CE, Herman BA, Roe CA. Statistical comparison of two ROC curve estimates obtained from partially-paired datasets. Med Decis Making 1998;18:110–121.
  33. Bunch PC, Hamilton JF, Sanderson GK, Simmons AH. A free-response approach to the measurement and characterization of radiographic-observer performance. J Appl Photographic Eng 1978;4:166–171.
  34. Holm S. A simple sequentially rejective multiple test procedure. Scand J Stat 1979;6:65–70.
  35. Joo S, Yang YS, Moon WK, Kim HC. Computer-aided diagnosis of solid breast nodules: use of an artificial neural network based on multiple sonographic features. IEEE Trans Med Imaging 2004;23:1292–1300.
  36. Sehgal CM, Cary TW, Kangas SA, et al. Computer-based margin analysis of breast sonography for differentiating malignant and benign masses. J Ultrasound Med 2004;23:1201–1209.
  37. Piliouras N, Kalatzis I, Dimitropoulos N, Cavouras D. Development of the cubic least squares mapping linear-kernel support vector machine classifier for improving the characterization of breast lesions on ultrasound. Comput Med Imaging Graph 2004;28:247–255.
  38. Sahiner B, Chan HP, Roubidoux MA, et al. Computerized characterization of breast masses on three-dimensional ultrasound volumes. Med Phys 2004;31:744–754.




