Thoracic Imaging
1 From the Department of Radiology, Kurt Rossmann Laboratories for Radiologic Image Research, University of Chicago, 5841 S Maryland Ave, MC2026, Chicago, IL 60637 (J.S., H.A., R.E., H.M., K.D.); and Department of Intelligent Systems, Faculty of Information Sciences, Hiroshima City University, Japan (M.A.). Received April 30, 2002; revision requested June 21; revision received July 26; accepted September 23. Supported by Public Health Service grant CA62625. Address correspondence to J.S. (e-mail: junji@thymus.bsd.uchicago.edu).
| ABSTRACT |
|---|
MATERIALS AND METHODS: Fifty-three chest radiographs that depicted 31 primary lung cancers and 22 benign nodules were used. The likelihood measure of malignancy for each nodule was determined by using an automated computerized scheme. Sixteen radiologists (nine attending radiologists and seven radiology residents) participated in an observer study in which cases were interpreted first without and then with use of the scheme. The radiologists' performance was evaluated with receiver operating characteristic analysis.
RESULTS: The mean area under the best-fit binormal receiver operating characteristic curve plotted in the unit square (Az) for radiologists who interpreted images without and with the scheme was 0.743 and 0.817, respectively. The performance of radiologists improved significantly when the scheme was used (P = .002). However, the performance of the computer alone (Az = 0.889) exceeded both of these values. The average change in radiologists' confidence level between interpretation without and with the scheme was highly correlated (r = 0.845) with the likelihood measure of malignancy, which was presented as computer output.
CONCLUSION: This scheme for computer-aided diagnosis has the potential to improve radiologists' accuracy in the classification of benign and malignant solitary pulmonary nodules.
© RSNA, 2003
Index terms: Computers, diagnostic aid; Diagnostic radiology, observer performance; Lung, nodule, 60.281; Receiver operating characteristic (ROC) curve
| INTRODUCTION |
|---|
We investigated the potential usefulness of a scheme for computer-aided diagnosis (CAD) for classification of benign and malignant nodules (12,13), and our results indicated that the performance of our computerized scheme was superior to that of radiologists (12) in this task. Recently, we developed an automated computerized scheme, with which the outline of a nodule was segmented automatically once the location of the nodule was indicated by a radiologist and/or computer on a chest radiograph. Then, a likelihood measure for malignancy was determined by using linear discriminant analysis applied to a number of features that included two clinical parameters (age and sex) and 75 image features (13).
The purpose of our current study was to evaluate the effect of a CAD scheme on radiologists' performance in distinguishing between benign and malignant pulmonary nodules on chest radiographs.
| MATERIALS AND METHODS |
|---|
Database
The cases used in this study were selected on the basis of confirmation with CT findings. The selection criteria for nodules included the following: They were solitary, no larger than 3 cm, and had no calcification and no scarlike linear opacities. In addition, only nodules of primary lung cancer were selected for malignant cases. As a result of these criteria, 31 cancers and 22 benign nodules were selected from our existing clinical image database (12,13). The final diagnoses of the 31 primary bronchogenic carcinomas (size range, 11–25 mm; mean size, 16 mm), including 26 adenocarcinomas, three squamous cell carcinomas, one small cell carcinoma, and one carcinoid tumor, were determined with pathologic examination. Two carcinomas of unknown subtype, which were part of our previous studies that included 33 cases of malignancy (12,13), were eliminated. The diagnoses of benign nodules, including 12 granulomas, seven inflammatory lesions, two pulmonary hamartomas, and one pulmonary infarct, were determined by means of pathologic examination (n = 7) or observation of no change (n = 8) or a decrease in nodule size (size range, 8–20 mm; mean size, 14 mm) during an interval of 2 years (n = 7).
The 53 radiographs were obtained in 33 women and 20 men (age range, 24–86 years; mean age, 58 years). The location of a nodule was determined with the consensus of two chest radiologists who did not participate in this observer performance study.
We used original radiographs, which had been obtained with screen-film systems (Lanex Medium/OC; Kodak, Rochester, NY) at the University of Chicago Hospitals. With a laser scanner (model 2905; Abe Sekkei, Tokyo, Japan), we digitized the original radiographs with 2,000 x 2,000 matrix size (pixel size, 0.175 mm) and a 10-bit gray scale.
We also used five chest radiographs (three that depicted primary lung cancers and two that depicted benign nodules) as training cases, which were shown before the observer test but were not part of it. These five chest radiographs were selected from the Japanese Standard Digital Image Database that was developed by the Japanese Society of Radiological Technology (14).
Computerized Scheme
Our automated computerized scheme (13) for distinction between benign and malignant solitary pulmonary nodules on digital chest radiographs included several steps (Fig 1). First, the location of a nodule was identified by a radiologist. The nodule was then segmented automatically by using analysis of contour lines of the gray-level distribution that was based on the polar-coordinate representation, which was produced by means of the difference image technique (15). Seventy-five image features were determined from the outline and the image analysis for inside and outside regions of the segmented nodules. Finally, linear discriminant analysis was applied to seven features for determination of the likelihood measure of malignancy for each nodule. The features were selected from two clinical parameters (age and sex) and 75 image features as an optimal combination of parameters for the linear discriminant analysis. The seven selected features included age, root-mean-square value of the power spectrum, overlap measure on histograms, full width at half maximum for the outside region of the segmented nodule on the background-corrected image, degree of irregularity, full width at half maximum for the inside region of the segmented nodule on the original image, and contrast of the segmented nodule on the background-corrected image. The computer output indicated the likelihood measure of malignancy in terms of a percentage. The likelihood measure of malignancy for the five training cases was determined with the same parameters as were used for the test cases. The performance of the computerized scheme was evaluated with a round-robin (leave-one-out) test, in which training was performed with all cases in the database except one, and the case left out of training was then tested with the trained computerized scheme. This procedure was repeated until every case in the database had been used once for testing.
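For illustration only, the round-robin (leave-one-out) evaluation of a linear discriminant can be sketched in Python as follows. This is a minimal sketch and not the implementation used in the study: it assumes two hypothetical features (age and a degree-of-irregularity value) instead of the seven selected features, invented toy values, and a logistic mapping of the discriminant score to a percentage, which the text does not specify.

```python
import math

def mean_vec(rows):
    """Column-wise mean of a list of equal-length feature tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov(benign, malignant):
    """Pooled within-class covariance matrix (2 x 2 for two features)."""
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (benign, malignant):
        m = mean_vec(rows)
        for r in rows:
            for i in range(2):
                for j in range(2):
                    cov[i][j] += (r[i] - m[i]) * (r[j] - m[j])
    dof = len(benign) + len(malignant) - 2
    return [[c / dof for c in row] for row in cov]

def train_lda(features, labels):
    """Fit a two-class, two-feature linear discriminant; returns (weights, bias)."""
    benign = [f for f, y in zip(features, labels) if y == 0]
    malignant = [f for f, y in zip(features, labels) if y == 1]
    mb, mm = mean_vec(benign), mean_vec(malignant)
    s = pooled_cov(benign, malignant)
    d = [mm[0] - mb[0], mm[1] - mb[1]]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    # w = S^-1 (mean_malignant - mean_benign), solved with Cramer's rule
    w = [(d[0] * s[1][1] - s[0][1] * d[1]) / det,
         (s[0][0] * d[1] - s[1][0] * d[0]) / det]
    # Boundary sits midway between class means, shifted by the log prior ratio
    mid = [(mb[i] + mm[i]) / 2 for i in range(2)]
    bias = -(w[0] * mid[0] + w[1] * mid[1]) + math.log(len(malignant) / len(benign))
    return w, bias

def likelihood_percent(model, x):
    """Assumed mapping: logistic function of the discriminant score, in percent."""
    w, bias = model
    score = w[0] * x[0] + w[1] * x[1] + bias
    return 100.0 / (1.0 + math.exp(-score))

def leave_one_out(features, labels):
    """Train on all cases but one, test the held-out case, repeat for every case."""
    outputs = []
    for k in range(len(features)):
        train_x = features[:k] + features[k + 1:]
        train_y = labels[:k] + labels[k + 1:]
        outputs.append(likelihood_percent(train_lda(train_x, train_y), features[k]))
    return outputs

# Hypothetical toy cases: (age in years, degree of irregularity); 0 = benign, 1 = malignant
features = [(45, 0.20), (52, 0.35), (38, 0.25), (60, 0.30),
            (66, 0.55), (71, 0.70), (58, 0.60), (75, 0.50)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

outputs = leave_one_out(features, labels)
for y, p in zip(labels, outputs):
    print("malignant" if y else "benign", f"{p:.1f}%")
```

Because each case is scored by a model that never saw it during training, the resulting outputs give a less optimistic estimate of performance than scoring the training cases themselves.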
Both the original radiograph and the digitized image on the monitor were presented to a radiologist, first without the computer output. After the radiologist marked the initial level of confidence, the computer output for the likelihood measure of malignancy was shown on the monitor. The radiologist was then asked to mark his or her confidence level again if he or she wished to change the initial result. Before the training and the test, radiologists were given the following instructions: They were told that the purpose of this experiment was to evaluate the potential benefit of using CAD schemes to classify nodules as benign or malignant on chest radiographs. They also were told that the images displayed on the screen were digitized images and that the original film images were displayed on the view box adjacent to the screen. In addition, they were told that 53 chest radiographs were to be shown in random order and that approximately 60% of the nodules were malignant. They also were told that the accuracy of the computer output was about 80% when a threshold of 50% was used for the likelihood measure of malignancy. They were asked to try to use the rating scale consistently and uniformly.
We provided training for the radiologists before the test so that they could learn how to operate the observer interface and how to take the computer output into account in their decision making. The distribution of likelihood measures of malignancy for the five training cases was similar to that for the test cases; two of the three malignant cases had a likelihood measure of malignancy greater than 50%, and the two benign cases had a likelihood measure of malignancy less than 50%. In the training session, the actual diagnosis (ie, benign or malignant) was indicated on the monitor after the radiologist had made a final decision with the computer output.
ROC analysis was used for comparison of the radiologists' performance without and with computer output for distinction between benign and malignant solitary pulmonary nodules on chest radiographs. A binormal ROC curve was fitted to each radiologist's confidence rating data from the two reading conditions with quasi-maximum likelihood estimation (18). A computer program (LABROC5; Charles E. Metz, University of Chicago, Ill) was used for obtaining binormal ROC curves from the ordinal-scale rating data (18). The area under the best-fit ROC curve plotted in the unit square (Az) was calculated for each fitted curve. The statistical significance of the difference between the ROC curves obtained without and with the computer output was tested by using a computer program (LABMRMC; Charles E. Metz, University of Chicago), and the difference was estimated by using analysis of variance of pseudovalues of Az calculated from all rating scores of all radiologists (20). For examination of the relationship between the confidence level and the computer output, correlation coefficients were determined among four data sets as follows: (a) the rating score for each nodule determined without computer output as indicated by each radiologist, (b) the rating score for each nodule determined with computer output as indicated by each radiologist, (c) the change in rating scores for each nodule between those determined without computer output and those determined with computer output as indicated by each radiologist, and (d) the likelihood measure of malignancy for each nodule.
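To make the area-under-the-curve comparison concrete, the following Python sketch computes an empirical (nonparametric) area under the ROC curve for one hypothetical reader under the two reading conditions. It is a simplified stand-in, not the binormal fit performed by LABROC5 and not the multireader analysis performed by LABMRMC: the empirical area is the fraction of malignant-benign pairs in which the malignant case received the higher rating, with ties counted as one half. The ratings and the 0 to 1 scale are invented for illustration.

```python
def empirical_auc(benign_ratings, malignant_ratings):
    """Empirical ROC area: probability that a malignant case is rated
    higher than a benign case, with ties counted as one half."""
    wins = 0.0
    for m in malignant_ratings:
        for b in benign_ratings:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(benign_ratings) * len(malignant_ratings))

# Hypothetical confidence ratings (0 = definitely benign, 1 = definitely malignant)
benign_without = [0.2, 0.4, 0.6, 0.5]
malignant_without = [0.5, 0.7, 0.3, 0.9]
benign_with = [0.1, 0.3, 0.6, 0.4]
malignant_with = [0.6, 0.8, 0.5, 0.9]

auc_without = empirical_auc(benign_without, malignant_without)
auc_with = empirical_auc(benign_with, malignant_with)
print(auc_without)  # 0.71875
print(auc_with)     # 0.90625
```

A binormal fit generally yields a smoother curve and a somewhat different area than this empirical estimate, so values from the two approaches are not interchangeable.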
| RESULTS |
|---|
| DISCUSSION |
|---|
The potential usefulness of CAD in differential diagnosis has been investigated in the fields of chest radiography (12,21,22) and mammography (23). Although the effect of the CAD scheme on radiologists' decision making has been demonstrated (22), it is still unclear how radiologists would use the computer output in their decision making for a differential diagnosis. If radiologists were affected directly by the output of the CAD scheme, the magnitude of the change in their decisions between readings without and with the computer output would depend on the accuracy of the computer output. Therefore, it is important to note that the average percentage change in radiologists' confidence level between interpretation without and with the computer output was strongly related to the likelihood measure of malignancy (Fig 3). In other words, the decision making of radiologists was clearly influenced by the computer output, which was shown as a second opinion. We believe that this beneficial effect can be realized when the performance of the CAD scheme is very high and the radiologists can trust the computer output. Therefore, it will be important to introduce CAD schemes into the clinical environment only when the performance of the CAD scheme is at a level that is acceptable to radiologists.
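The relationship described above, between the per-nodule change in confidence and the computer output, is a correlation coefficient between two paired lists. The Python sketch below shows the calculation, assuming a Pearson product-moment coefficient (the text does not name the type of coefficient) and using invented values for five hypothetical nodules.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two paired lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for five nodules
computer_output = [10, 30, 50, 70, 90]   # likelihood measure of malignancy (%)
confidence_change = [-8, -3, 0, 4, 9]    # average change in confidence rating

r = pearson_r(computer_output, confidence_change)
print(round(r, 3))
```

A coefficient near 1 in such a calculation indicates that readers moved their ratings toward malignancy when the computer output was high and toward benignity when it was low.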
With regard to detection tasks, such as detection of nodules on chest radiographs (16,19,24), of interstitial opacities on chest radiographs (25), and of microcalcifications on mammograms (26), it has been shown that the gain in diagnostic accuracy with use of a CAD scheme was generally greater for radiologists with limited experience. In our observer study, we also found that the benefit of a CAD scheme in terms of the improvement in the average Az value was greater for radiology residents (increase in Az, 0.079) than it was for attending radiologists (increase in Az, 0.071). However, the correlation coefficient between the likelihood measure of malignancy and the average percentage change in confidence level between interpretation without and with computer output was greater for attending radiologists than it was for radiology residents, which appears to indicate that the effect of the CAD scheme on attending radiologists was greater than its effect on radiology residents. This result suggests that the value of a CAD scheme for differential diagnosis depends on the diagnostic skill of the radiologist. The lack of confidence in decision making among less experienced radiologists might cause a larger variation in their decisions, even if correct (or incorrect) information was provided to them with the CAD scheme.
In the future, CAD schemes such as the one evaluated in this study are likely to be used in clinical situations, and radiologists will become skilled in the use of CAD schemes as they become familiar with the relationship between their own decision-making process and the computer output. The results in Table 2 indicate the potential usefulness of a CAD scheme in the improvement of radiologists' performance; four additional benign cases from a total of 22 benign cases could be identified correctly when a CAD scheme was used, whereas only five cases were identified correctly as benign by radiologists without the computer output. It should be noted, however, that the computer alone identified 16 of the 22 benign cases correctly and identified all of the malignant cases correctly. Therefore, it is apparent that there is considerable room for further improvement in how the computer output is used by radiologists as a second opinion. In addition, it is important to note that we are not proposing that the CAD scheme can be used to determine which nodules are benign and eliminate the need for further work-up, but it seems that the CAD scheme has the potential to help avoid unnecessary CT examinations. Although this accuracy was obtained by using a round-robin test, such results depend on the individual radiologists who participate in the test and also on the number of cases used. Therefore, further studies with larger numbers of cases will be necessary prior to the use of this CAD scheme in actual clinical situations.
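The counts quoted above can be turned into fractions with a short calculation. The Python sketch below is a worked example under one stated assumption: that the passage means radiologists correctly identified five of the 22 benign cases without the computer output and 5 + 4 = 9 with it. The decision criterion behind those counts comes from Table 2, which is not reproduced here.

```python
BENIGN_TOTAL = 22
MALIGNANT_TOTAL = 31

def fraction(correct, total):
    """Fraction of cases identified correctly."""
    return correct / total

# Benign cases identified correctly (counts taken from the text; 9 = 5 + 4 is an assumption)
benign_without_cad = fraction(5, BENIGN_TOTAL)
benign_with_cad = fraction(9, BENIGN_TOTAL)
benign_computer = fraction(16, BENIGN_TOTAL)

# Computer alone: all 31 malignant cases and 16 benign cases correct
computer_overall = fraction(16 + 31, BENIGN_TOTAL + MALIGNANT_TOTAL)

print(f"radiologists without CAD: {benign_without_cad:.1%}")  # 22.7%
print(f"radiologists with CAD:    {benign_with_cad:.1%}")     # 40.9%
print(f"computer alone (benign):  {benign_computer:.1%}")     # 72.7%
print(f"computer alone (overall): {computer_overall:.1%}")    # 88.7%
```

The gap between 9 of 22 and 16 of 22 is the "room for further improvement" referred to above.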
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Abbreviations: Az = area under the best-fit ROC curve plotted in the unit square, CAD = computer-aided diagnosis, ROC = receiver operating characteristic
Author contributions: Guarantors of integrity of entire study, J.S., K.D.; study concepts, K.D.; study design, J.S., K.D.; literature research, J.S., M.A.; clinical studies, H.A., H.M.; experimental studies, J.S., R.E., H.A.; data acquisition, J.S., R.E.; data analysis/interpretation, J.S., K.D.; statistical analysis, J.S.; manuscript preparation, J.S.; manuscript definition of intellectual content, H.A., M.A., R.E.; manuscript editing, K.D.; manuscript revision/review and final version approval, K.D., H.M.
| REFERENCES |
|---|