Radiology
(Radiology. 2000;214:823-830.)
© RSNA, 2000


Computer Applications

Computerized Analysis of the Likelihood of Malignancy in Solitary Pulmonary Nodules with Use of Artificial Neural Networks 1

Katsumi Nakamura, MD 2, Hiroyuki Yoshida, PhD, Roger Engelmann, MS, Heber MacMahon, MD, Shigehiko Katsuragawa, PhD, Takayuki Ishida, PhD, Kazuto Ashizawa, MD 3 and Kunio Doi, PhD

1 From the Department of Radiology, Kurt Rossmann Laboratories for Radiologic Image Research, University of Chicago, 5841 S Maryland Ave, Chicago, IL 60637. Received November 25, 1998; revision requested December 29; final revision received August 30, 1999; accepted September 2. Supported in part by United States Public Health Service grants CA24806 and CA62625. Address reprint requests to K.D. (e-mail: k-doi@uchicago.edu).


    Abstract
PURPOSE: To develop a computer-aided diagnostic scheme by using an artificial neural network (ANN) to assist radiologists in the distinction of benign and malignant pulmonary nodules.

MATERIALS AND METHODS: Fifty-six chest radiographs of 34 primary lung cancers and 22 benign nodules were digitized with a 0.175-mm pixel size and a 10-bit gray scale. Eight subjective image features were evaluated and recorded by radiologists in each case. A computerized method was developed to extract objective features that could be correlated with the subjective features. An ANN was used to distinguish benign from malignant nodules on the basis of subjective or objective features. The performance of the ANN was compared with that of the radiologists by means of receiver operating characteristic (ROC) analysis.

RESULTS: Performance of the ANN was considerably greater with objective features (area under the ROC curve, Az = 0.854) than with subjective features (Az = 0.761). Performance of the ANN was also greater than that of the radiologists (Az = 0.752).

CONCLUSION: The computerized scheme has the potential to improve the diagnostic accuracy of radiologists in the distinction of benign and malignant solitary pulmonary nodules.

Index terms: Computers, neural network • Computers, diagnostic aid • Diagnostic radiology, observer performance • Lung neoplasms, diagnosis, 60.11, 60.31, 60.321 • Lung, nodule, 60.281 • Receiver operating characteristic (ROC) curve


    Introduction
Although a solitary pulmonary nodule is a common finding on a chest radiograph, the differential diagnosis of a pulmonary nodule is often difficult for radiologists (1–8). Because a noncalcified solitary pulmonary nodule may be the first sign of lung cancer, especially in its early stage, most patients undergo further diagnostic evaluation with computed tomography (CT) (1). About 20% of solitary pulmonary nodules in the population are malignant (9), whereas the majority of radiographically detected pulmonary nodules are benign (3,8,10–14). According to findings from our recent survey of five medical centers (Appendix), 64 (48.1%) of the 133 patients in whom CT examinations were performed as part of the investigation had benign nodules. Some of these CT examinations could have been avoided if the benign nodules were identified as such on chest radiographs.

A computerized scheme that is capable of providing objective information may aid radiologists in the classification of pulmonary nodules. Various computerized schemes have been investigated for the characterization of pulmonary nodules. In most of these studies, however, radiographic features were extracted manually, and the computer was used only to merge the image features by rule-based or discriminant analysis for the determination of the likelihood of malignancy.

Swensen et al (15) estimated the probability of malignancy in radiologically indeterminate solitary pulmonary nodules by use of multivariate logistic regression. They concluded that three clinical characteristics (age, cigarette-smoking status, and history of cancer) and three radiologic characteristics (diameter, spiculation, and location in the upper lobe) were independent predictors of malignancy. Cummings et al (16) estimated the probability of malignancy of pulmonary nodules by use of Bayesian analysis based on the diameter of a solitary pulmonary nodule, patient's age, history of cigarette smoking, and prevalence of malignancy in solitary pulmonary nodules. Gurney (17) and Gurney et al (18) also used Bayesian analysis to calculate the probability of malignancy, which was compared with the findings of the radiologists.

Other investigators have used computer-extracted features to differentiate malignant and benign pulmonary nodules. Sherrier et al (19) applied gradient analysis in the distinction of benign and malignant nodules, and they reported that benign calcified granulomas showed a gradient number that was greater than that of malignant nodules. Sasaoka et al (20) extracted nodule features by use of a computerized method. However, the extracted features, such as density gradient and density entropy, were not directly correlated with specific radiologic findings. This lack of correlation makes it difficult to understand the importance of their findings.

Recently, artificial neural networks (ANNs) have been used in diagnostic radiology as potentially powerful classification tools (21–27). Gurney and Swensen (28) reported that use of the Bayesian method was better than use of an ANN in the prediction of the probability of malignancy in pulmonary nodules in which radiographic features were extracted manually. Despite their considerable efforts, no practical computerized schemes were developed.

Our purpose in this study was to develop a practical computer-aided diagnostic scheme to assist radiologists in the objective distinction of benign and malignant pulmonary nodules.


    MATERIALS AND METHODS
Database
The chest radiographs used in this study were selected from cases that were used in the development of computerized schemes for the detection of pulmonary nodules (29,30). Radiographs of solitary pulmonary nodules larger than 3 cm were excluded. No nodules had evidence of calcification on CT images or scarlike linear opacities.

The final diagnosis was established at pathologic examination, and, for some benign nodules, a presumed diagnosis was made on the basis of an absence of change or a decrease in nodule size during a 2-year period. The 34 primary bronchogenic carcinomas included adenocarcinomas (n = 26), squamous cell carcinomas (n = 3), small cell carcinoma (n = 1), carcinoid tumor (n = 1), and tumors of unknown subtype (n = 3). The 22 benign lesions were classified as pulmonary hamartomas (n = 2), granulomas (n = 12), inflammatory lesions (n = 7), and pulmonary infarction (n = 1). Fifty-six radiographs were obtained in 33 women and 23 men (age range, 24–86 years; mean age, 58.4 years). Chest radiographs were digitized by use of a laser scanner (Abe Sekkei, Tokyo, Japan) with a pixel size of 0.175 mm and a 10-bit gray scale (1,024 gray levels).

Subjective Feature Extraction by the Radiologists
Eight subjective radiologic features of the pulmonary nodules were quantitatively recorded. These included nodule size, nodule shape, marginal irregularity, spiculation, border definition, lobulation, nodule density (contrast), and homogeneity. The observers used a ruler to measure nodule size. The variation in measured nodule sizes was mainly due to variation in subjective judgments of the edges of the nodule. Nodule shapes ranged from round to elongated, marginal irregularity ranged from smooth to irregular, spiculation ranged from nonspiculated to spiculated, border definition ranged from well defined to poorly defined, lobulation ranged from nonlobulated to lobulated, nodule density ranged from low to high, and homogeneity ranged from homogeneous to inhomogeneous.

Seven radiologists (four attending radiologists [including K.A.] and three radiology residents) used a score sheet with a scale from 1 to 5 to independently extract the features of each nodule. The score sheet included pictorial diagrams of two extreme examples, such as a nonspiculated nodule and a spiculated nodule, to serve as a guide. Table 1 shows examples of a radiologist's ratings for the eight radiologic findings in two pulmonary nodules (Fig 1). To eliminate bias, all radiologists rated each nodule without knowledge of the correct diagnosis.


TABLE 1. Subjective Ratings of Image Features in Two Nodules
 


Figure 1. Posteroanterior chest radiographs show (a) a malignant nodule (adenocarcinoma) in a 72-year-old woman and (b) a benign nodule (inflammatory granuloma) in a 67-year-old man.

 
Objective Feature Extraction by Use of the Computer
Twelve features of the pulmonary nodule were extracted from digitized chest radiographs by use of computerized methods. These features were selected on the basis of their expected correlation with the subjective features that were used in the radiologists' ratings. These objective features were determined from the outline (or contour) of the nodule that was manually extracted by one radiologist (K.A.) (Fig 2). Another radiologist independently produced a second set of nodule outlines to examine the effect of variation in the outlines on feature extraction by use of the computer. The radiologists were not informed about the correct diagnosis to avoid bias in their drawing of the outlines.



Figure 2. Posteroanterior chest radiographs show the hand-drawn margins and subsequently fitted ellipses for (a) a malignant nodule and (b) a benign nodule.

 
The effective diameter of a nodule (31,32) was defined by the diameter of a circle with the same area as that of the nodule. The shape of a nodule was examined by the degree of circularity, which was defined by the ratio of the area of the nodule that overlapped the equivalent circle to the total area of the nodule (31,32). The degree of ellipticity was another measure used in the evaluation of the shape of the nodule. The degree of ellipticity was calculated in the same manner as the degree of circularity, except that an ellipse was fitted to the nodule outline (Fig 2) (33,34).
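The two area-based measures above can be illustrated with a minimal sketch, assuming the hand-drawn outline has already been filled into a binary mask. The function names, the use of NumPy, and the choice to center the equivalent circle on the nodule centroid are assumptions made for illustration; the text does not specify where the circle is placed.

```python
import numpy as np

def effective_diameter(mask, pixel_mm=0.175):
    """Diameter (mm) of the circle whose area equals the nodule area."""
    area_mm2 = np.asarray(mask, dtype=bool).sum() * pixel_mm ** 2
    return 2.0 * np.sqrt(area_mm2 / np.pi)

def degree_of_circularity(mask):
    """Fraction of the nodule area that lies inside the equal-area circle.

    Assumption: the circle is centered on the nodule centroid.
    """
    mask = np.asarray(mask, dtype=bool)
    area = mask.sum()
    rows, cols = np.nonzero(mask)
    cy, cx = rows.mean(), cols.mean()          # centroid of the nodule
    radius = np.sqrt(area / np.pi)             # equal-area radius in pixels
    yy, xx = np.indices(mask.shape)
    circle = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return (mask & circle).sum() / area
```

A round mask yields a circularity close to 1, whereas an elongated mask yields a clearly smaller value. The degree of ellipticity would follow the same pattern with a fitted ellipse in place of the circle.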

We believe that the definition of marginal irregularity contains two independent factors, namely, the magnitude and the coarseness (or fineness) of the irregular edge pattern, which was defined here as the distance from the nodule outline to the fitted ellipse (Fig 3). The irregular edge pattern was analyzed by means of Fourier transformation. The root-mean-square variation and the first moment of the power spectrum (35) were calculated as measures of the magnitude and the coarseness, respectively, of marginal irregularity. The degree of irregularity (32), which was used as another measure of marginal irregularity, was defined as 1 minus the ratio of the perimeter (circumference) of the ellipse to the length of the contour.
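As a rough illustration of these three measures, the sketch below assumes the edge pattern is available as a one-dimensional array of outline-to-ellipse distances sampled uniformly around the contour. The function names are hypothetical, and the normalization of the first moment (frequency weighted by power, divided by total power, with the constant term excluded) is an assumption; the exact definition is given in reference 35.

```python
import numpy as np

def edge_irregularity(distance):
    """Return (rms variation, first moment of the power spectrum)."""
    d = np.asarray(distance, dtype=float)
    d = d - d.mean()                        # remove the constant offset
    rms = np.sqrt(np.mean(d ** 2))          # magnitude of the irregularity
    power = np.abs(np.fft.rfft(d)) ** 2     # one-sided power spectrum
    freqs = np.arange(power.size)           # cycles per full contour
    total = power[1:].sum()
    # Power-weighted mean frequency: larger values mean finer irregularity.
    first_moment = (freqs[1:] * power[1:]).sum() / total if total > 0 else 0.0
    return rms, first_moment

def degree_of_irregularity(ellipse_perimeter, contour_length):
    """1 minus the ratio of the ellipse perimeter to the contour length."""
    return 1.0 - ellipse_perimeter / contour_length
```

With this normalization, a pattern dominated by a few broad undulations has a low first moment, and a pattern with many fine undulations has a high one.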



Figure 3. Graphs show the one-dimensional representation of the nodule margins as the distance from a fitted ellipse to the hand-drawn outline for (a) the malignant nodule shown in Figure 2a, and (b) the benign nodule shown in Figure 2b. The graph for the malignant nodule has a larger amplitude and higher frequency than the graph for the benign nodule. These findings appear to agree with the characteristics of two nodules; namely, the malignant nodule contained margins that were more irregular than those of the benign nodule.

 
Border definition was quantified by use of the mean gradient, which was obtained from the mean edge gradients over a selected border region. The radial gradient index was used to measure the spiculation of a nodule. The radial gradient index was defined by the mean absolute value of the radial edge gradients projected along the radial direction (36). The tangential gradient index was another measure of the spiculation of a nodule. The tangential gradient index was obtained by use of a tangential component of the edge gradient, which was projected perpendicularly to the radial direction. The line enhancement index was another measure of the spiculation of a nodule; it indicated the magnitude of the line pattern components obtained by means of the line enhancement filter (37), in a direction that was within 45° of the radial direction.
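The radial and tangential gradient indexes can be sketched as follows, assuming a gray-scale image, a boolean mask that marks the border region, and a nodule center given as (row, column). These inputs, the function name, and the use of central differences from `np.gradient` are illustrative assumptions; the line enhancement filter of reference 37 is not reproduced here.

```python
import numpy as np

def gradient_indices(image, border_mask, center):
    """Mean absolute radial and tangential edge-gradient components."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))  # d/drow, d/dcol
    rows, cols = np.nonzero(border_mask)
    ry, rx = rows - center[0], cols - center[1]           # radial vectors
    norm = np.hypot(ry, rx)
    keep = norm > 0                                       # skip the center pixel
    uy, ux = ry[keep] / norm[keep], rx[keep] / norm[keep] # unit radial direction
    g_y = gy[rows[keep], cols[keep]]
    g_x = gx[rows[keep], cols[keep]]
    radial = g_y * uy + g_x * ux            # projection along the radius
    tangential = -g_y * ux + g_x * uy       # projection perpendicular to it
    return np.abs(radial).mean(), np.abs(tangential).mean()
```

For a smooth, radially symmetric blob the tangential index is close to zero, since almost all of the edge gradient points along the radius.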

The mean pixel value was a measure of the optical density of a nodule. The SD of the pixel values over the nodule was a measure of the homogeneity of the nodule.

Artificial Neural Networks
Three-layer, feed-forward ANNs with back-propagation algorithms (38) were used. Two clinical parameters (patient's age and sex) and eight radiologic findings extracted by radiologists or physical measures obtained from the computer analysis were used as input data for the ANNs. The basic structure of the ANN included 10 input units, five hidden units, and one output unit. The number of hidden units was empirically determined, as is generally done in ANN applications. Input data obtained from clinical parameters, subjective ratings by radiologists, and physical measures obtained by use of the computer were normalized to range from 0 to 1. The output of the ANN represented the likelihood of malignancy (0 = benign, 1 = malignant).
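A network of this shape can be sketched in a few lines of NumPy. The learning rate, number of epochs, weight initialization, sigmoid activations, squared-error loss, and batch updates below are all assumptions chosen for illustration; the text specifies only the layer sizes, the 0 to 1 input scaling, and the meaning of the output.

```python
import numpy as np

def minmax_scale(X):
    """Rescale each input column to the range 0 to 1."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ann(X, y, n_hidden=5, lr=0.5, epochs=2000, seed=0):
    """Train a three-layer feed-forward network with batch back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    t = np.asarray(y, dtype=float).reshape(-1, 1)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)                # hidden-unit activations
        out = sigmoid(h @ W2 + b2)              # likelihood of malignancy
        d_out = (out - t) * out * (1 - out)     # output delta (squared error)
        d_hid = (d_out @ W2.T) * h * (1 - h)    # delta propagated backward
        W2 -= lr * h.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_hid / len(X)
        b1 -= lr * d_hid.mean(axis=0)
    return W1, b1, W2, b2

def predict_ann(params, X):
    """Return one output in (0, 1) per case: 0 = benign, 1 = malignant."""
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel()
```

With 10 input columns and the default `n_hidden=5`, this reproduces the 10-5-1 structure described above.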

The training and testing of the ANN were performed by means of a round-robin (or leave-one-out) method (24). With this method, all of the cases in the database but one were used for training, and the case not used was applied in the testing of the trained ANNs. This procedure was repeated until every case in the database was used once for testing. The performance of the ANNs was evaluated on a per-patient basis (24) for individual radiologists and for all of the radiologists together by means of receiver operating characteristic (ROC) curves (39). The LABROC4 program (40) was used to fit the ROC curves, and the area under the ROC curve, Az, was used as an index of performance in the distinction of benign and malignant nodules.
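The round-robin procedure can be sketched generically. Here `fit` and `predict` are placeholders for any classifier, and the nearest-centroid stand-in is hypothetical, included only so that the example is self-contained. The area computed below is the nonparametric (Mann-Whitney) estimate, which is an assumption made for simplicity; it is not the binormal Az that LABROC4 fits.

```python
import numpy as np

def leave_one_out_scores(X, y, fit, predict):
    """Score each case with a model trained on all of the other cases."""
    scores = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        train = np.arange(len(y)) != i          # every case except case i
        model = fit(X[train], y[train])
        scores[i] = predict(model, X[i:i + 1])[0]
    return scores

def empirical_auc(scores, y):
    """Nonparametric area under the ROC curve; ties count as one half."""
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

# Hypothetical stand-in classifier: compare distances to the class centroids.
def fit_centroids(X, y):
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict_centroids(model, X):
    c0, c1 = model
    # Positive when a case lies closer to the malignant centroid.
    return np.linalg.norm(X - c0, axis=1) - np.linalg.norm(X - c1, axis=1)
```

Because each case is scored by a model that never saw it during training, the resulting area is not inflated by testing on training data.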

Observer Performance
The seven radiologists participated in the evaluation of their performance in the classification of pulmonary nodules. Each observer was presented with a chest radiograph and two clinical parameters (patient's age and sex) and was asked to mark his or her confidence level regarding the likelihood of malignancy by using an analog continuous rating scale with a line-checking method (29). Confidence ratings of definitely benign and definitely malignant were marked above the left and the right ends of the line, respectively. Radiographs were presented in random order. ROC analysis was used for the comparison of the performance of observers with that of the computerized methods in the distinction of benign and malignant nodules.


    RESULTS
Performance of the ANN with Subjective Features Extracted by the Radiologists
In the determination of the performance of the ANN, we first used the round-robin method with the data sets that were separately extracted by each radiologist. Table 2 shows the performance of the ANN with the subjective ratings by each radiologist. The Az values varied widely and ranged from 0.579 to 0.834, which indicated a considerable variation among the image features extracted by radiologists. The performance of the ANN with features extracted by the radiology residents was much lower than the performance with features extracted by the attending radiologists.


 
TABLE 2. Az Values
 
It is generally desirable for ANNs to achieve a high performance, even when only a small number of essential input units are applied. Thus, we attempted to reduce the number of input units because the initial selection of subjective features may have included redundant and nondiscriminating features that could have degraded the performance of the ANN in the distinction of benign and malignant pulmonary nodules.

We selected six features for input into the ANN on the basis of the performance obtainable with each independent feature and with the radiologists' knowledge and experience. These features included patient's age, nodule size, marginal irregularity, border definition, spiculation, and homogeneity.

The mean Az value of 0.761 for all seven radiologists with the six selected subjective features was significantly greater than the Az value for all 10 subjective features (Az = 0.710; P < .001). The mean Az value for the ANN with the six features selected by either the attending radiologists or the residents was greater than the Az value for all 10 features. Figure 4 shows the comparison of the performances of the radiologists and the ANN with the selected subjective features. The mean performance (Az = 0.790) of the ANN with selected features extracted by attending radiologists was slightly greater than the mean performance (Az = 0.774) of the attending radiologists. The mean performance (Az = 0.722) of the ANN with selected subjective features extracted by residents was lower than the mean performance (Az = 0.744) of the residents.



Figure 4. ROC curves show that the mean performance of the ANN was (a) slightly better than that of the attending radiologists, with features extracted by the attending radiologists; (b) worse than that of the residents, with features extracted by the residents; and (c) slightly better than that of all of the radiologists, with features extracted by all of the radiologists. These findings indicate that radiologists with less experience could not extract the nodule features with sufficient accuracy. Consequently, the ANN could not learn well the specific patterns between the input and output data.

 
We selected another subset of six subjective features. These included patient's age, nodule size, marginal irregularity, border definition, nodule density, and homogeneity. The ANN also showed a higher performance with this set of features than it did with all 10 features. These results indicated that the ANN appeared to be able to learn and to generalize better with the selected subjective features than with all subjective features.

We also evaluated the performance of the ANN by using the features extracted by all of the radiologists as a group. The performance of the ANN with the features extracted by all radiologists is also shown in Table 2. With all 10 of the subjective features, the Az value of 0.747 for the ANN with the features extracted by all radiologists was comparable to the mean Az value of 0.742 for the ANN with the features extracted by each attending radiologist.

However, when the ANN was used with the six selected subjective features, the Az value of 0.754 for the ANN with the features extracted by all radiologists was lower than the mean Az value of 0.790 for the ANN with the features extracted by each attending radiologist. This result seems to indicate that, despite the larger number of input data for the ANN with all radiologists, the ANN did not learn patterns of input data well in the distinction of benign and malignant nodules; this may have been due to the variations among radiologists' subjective ratings.

Computerized Analysis of Objective Features
To evaluate the usefulness of physical measures obtained by means of the computerized analysis, we first compared the physical measures with the radiologists' subjective ratings. Figure 5 shows the relationship between the root-mean-square value and the subjective ratings of marginal irregularity and the relationship between the SD of pixel values and the subjective ratings of homogeneity. These results indicate that physical measures corresponded well to the radiologists' subjective ratings. Table 3 shows the correlation coefficients between the physical measures and subjective features; in general, the coefficients indicated that most of the physical measures correlated well with the corresponding subjective features.
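The comparison between a physical measure and a subjective rating can be sketched as below. The text does not state which correlation coefficient was used, so the Pearson coefficient is an assumption, and the function name is hypothetical. The per-level mean and SD correspond to the kind of summary plotted in Figure 5.

```python
import numpy as np

def rating_summary(measure, rating):
    """Pearson r, plus (mean, SD) of the measure at each rating level."""
    measure = np.asarray(measure, dtype=float)
    rating = np.asarray(rating)
    r = np.corrcoef(measure, rating)[0, 1]      # assumed: Pearson correlation
    per_level = {
        int(k): (measure[rating == k].mean(), measure[rating == k].std())
        for k in np.unique(rating)
    }
    return r, per_level
```

A coefficient near 1 or near -1 indicates that the physical measure tracks the 1 to 5 rating closely; the sign depends on how the rating scale is oriented.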



Figure 5. Graphs depict the relationships between (a) the root-mean-square (rms) value and the subjective rating of marginal irregularity and (b) the SD of pixel value and the subjective rating of homogeneity. Physical measures corresponded well to the radiologists' subjective ratings. Error bars indicate the SD; ■ indicates the mean.

 

 
TABLE 3. Correlation between Subjective and Objective Features
 
Figure 6 shows the relationships between two selected image features of the pulmonary nodules. Although a considerable overlap between the malignant and benign pulmonary nodules was observed, a trend in the distribution of the two groups allowed discrimination of malignant and benign nodules. For example, malignant nodules tended to have a larger effective diameter and a smaller degree of ellipticity than did benign nodules (Fig 6a). This result appeared to agree with the general characteristics of pulmonary nodules; namely, malignant nodules are larger than benign nodules, and benign nodules tend to be round. Malignant nodules tended to have a larger mean edge gradient and a larger degree of elliptical irregularity than did benign nodules; this finding indicated the potential for the discrimination of benign and malignant nodules (Fig 6b). This result also agreed with the general observation that malignant nodules tended to contain spiculations and irregular margins.



Figure 6. Scatterplots show the relationships between (a) the degree of ellipticity and the effective diameter and (b) the degree of elliptical irregularity and the mean edge gradient for benign and malignant nodules. Although a considerable overlap in the values for malignant and benign pulmonary nodules is observed, there are also trends in the distributions that could be used in the discrimination between malignant and benign nodules.

 
Performance of the ANN with Objective Features Obtained from Computer Analysis
Although all 12 features could have been used in the distinction of benign and malignant nodules, it was desirable for the ANN to achieve a high performance with a smaller number of essential input units, as was demonstrated in the analysis of ANN performance with subjective features. Therefore, we attempted to select certain features from the 12 physical measures and two clinical parameters that could generate a high performance. A genetic algorithm (41) was used to automatically generate combinations of features and to select a set of features that would yield a high performance when used as inputs to the ANN.
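A genetic search over feature subsets can be sketched as follows. This is a generic illustration, not the algorithm of reference 41: the population size, truncation selection, uniform crossover, bit-flip mutation rate, and elitism are all assumptions. The `fitness` callable is a placeholder that, in a setting like this one, would return the round-robin Az of the ANN for a given feature mask, and `fixed` lists features that are always retained.

```python
import numpy as np

def ga_select(n_features, fitness, fixed=(), pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask that maximizes fitness(mask)."""
    rng = np.random.default_rng(seed)
    fixed = np.asarray(fixed, dtype=int)
    pop = rng.random((pop_size, n_features)) < 0.5     # random initial masks
    pop[:, fixed] = True                               # always-included features
    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[:pop_size // 2]]           # keep the fitter half
        pa = parents[rng.integers(len(parents), size=pop_size)]
        pb = parents[rng.integers(len(parents), size=pop_size)]
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)  # uniform crossover
        children ^= rng.random(children.shape) < mutation_rate   # bit-flip mutation
        children[:, fixed] = True
        children[0] = parents[0]                       # elitism: best mask survives
        pop = children
    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmax(scores)]
```

Because every fitness evaluation would involve a full round-robin training run, keeping the population and the number of generations small is a practical trade-off rather than a tuned choice.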

It should be noted, however, that we always included the patient's age and nodule size because these two features are considered to be among the most important features in the differentiation of pulmonary nodules (15,16). Table 4 shows combinations of computer-extracted features that were used to achieve a high performance with the ANN (Az > 0.830). Figure 7 shows a comparison of the ROC curves for the performances of the ANN with the computer-extracted features, of the ANN with six subjective ratings by selected radiologists, and of the radiologists. The results indicated that the performance of the ANN with the selected computer-extracted features was better than that of the radiologists or the ANN with subjective features.


 
TABLE 4. Performance of the ANN in the Distinction between Benign and Malignant Nodules with Image Features Extracted by Computer
 


Figure 7. ROC curves show the performance of the radiologists, the ANN with objective features extracted by the computer, and the ANN with subjective features extracted by the radiologists in the distinction of benign and malignant nodules. The performance of the ANN with computer-extracted features was better than that of the radiologists or the ANN with subjective features.

 

    DISCUSSION
The performance of the ANN with subjective ratings extracted by the radiologists was better than the performance of the radiologists in the distinction of benign and malignant pulmonary nodules. Similar results were reported in studies on the differential diagnosis of breast cancer (22,23) and interstitial lung diseases (42). These findings may be explained because radiologists do not effectively use all of the image features in their differential diagnosis of solitary pulmonary nodules. For example, radiologists, because of their knowledge and experience, may strongly depend on a limited number of conspicuous features in their decision making. Also, in general, radiologists tend not to consider all of the features systematically. On the other hand, the ANN is consistently and comprehensively affected by all of the data. In addition, ANNs are superior to radiologists in merging large amounts of data.

The ANN has a unique ability to learn specific patterns between input and output data if it is repeatedly trained with examples. However, this ability strongly depends on the quality of the input data. If the input data are randomly selected and if they have no correlation with the output data, the ANN cannot learn any specific pattern; this would result in poorer performance. In this study, the performance of the ANN with use of subjective ratings by each radiologist varied considerably. This seemed to reflect the large variation in the subjective ratings made by the radiologists. Although we provided pictorial diagrams to help the radiologists to improve their consistency in extracting subjective image features, the ratings were highly subjective and could have been strongly affected by an individual radiologist's knowledge and experience.

In general, it is desirable to train the ANN with a large database that contains a wide spectrum of data. In this study, therefore, we applied the round-robin test by combining all of the data provided by the seven radiologists. Although the performance of the ANN improved slightly with use of all of the data, compared with the mean performance of the ANN with each radiologist's data for all features, the performance decreased when selected features were used. This might have been caused by the variation among the subjective ratings by the radiologists. The data from some radiologists might have had a negative influence on the ANN in learning the pattern of the data from the other radiologists.

Another limitation of using the subjective ratings as input data for the ANN is that the quality of these subjective ratings depends on the ability of a radiologist to extract nodule features. In our study, the performance of the ANN with subjective features extracted by radiology residents was much lower than its performance with features extracted by attending radiologists. This indicated that radiologists with less experience could not extract the nodule features with sufficient accuracy. Consequently, the ANN could not learn the specific patterns between input data and output data well enough. Therefore, computer-aided diagnostic schemes that can be used to extract nodule features automatically, objectively, and reproducibly are highly desirable.

Our results showed that, with features extracted by the computer, the ANN performed better than the radiologists (mean value); it even performed better than the ANN with subjectively extracted features. Although physical measures were initially selected on the basis of their expected correlation with the subjective features, these physical measures may have contained additional and possibly useful information. This may have contributed to the distinction of the benign from the malignant pulmonary nodules.

If more useful parameters for the differentiation of pulmonary nodules (eg, history of smoking and tumor markers) were available as input data, a better performance with the ANN would be expected. However, in this study, we used a smaller number of essential features in our attempt to develop an ANN scheme for use in more practical clinical situations. We chose the patient's age and sex because they are available in almost all clinical situations. In contrast, smoking history and history of cancer are not routinely available and were, therefore, not incorporated into the analysis.

In this study, the computer analysis was performed on the basis of nodule outlines drawn by a radiologist. This introduced a subjective element that is a potential limitation of this study. Therefore, we examined the nodule outlines drawn by another radiologist and confirmed that the ANN, with another set of computer-extracted features, provided a comparable or better performance (Az = 0.920). However, the difference in the performance of the ANN with the nodule outlines drawn by each of the two radiologists was large. This could have been due to the variation between the two outlines, although the radiologists were not informed about the correct diagnosis. Therefore, the development of a computerized method for the automatic extraction of a nodule outline is desirable.
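To make the comparison of Az values between two sets of outlines concrete, the sketch below estimates the area under an ROC curve for two hypothetical sets of ANN outputs. Az conventionally denotes the area under a fitted (binormal) ROC curve; the code swaps in the simpler nonparametric estimate, the fraction of malignant-benign pairs that are ranked correctly, so its values would not match a fitted Az exactly. All scores and the names `empirical_auc`, `outline_a`, and `outline_b` are assumptions for illustration only.

```python
def empirical_auc(malignant_scores, benign_scores):
    """Nonparametric area under the ROC curve: the fraction of
    malignant/benign pairs ranked correctly, with ties counted as one half."""
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical ANN outputs obtained with outlines from two different readers.
outline_a = {"malignant": [0.9, 0.8, 0.6, 0.4], "benign": [0.5, 0.3, 0.2, 0.1]}
outline_b = {"malignant": [0.9, 0.8, 0.7, 0.6], "benign": [0.5, 0.3, 0.2, 0.1]}

auc_a = empirical_auc(outline_a["malignant"], outline_a["benign"])
auc_b = empirical_auc(outline_b["malignant"], outline_b["benign"])
print(f"outline A: {auc_a:.4f}, outline B: {auc_b:.4f}, difference: {auc_b - auc_a:.4f}")
```

In this toy example a single malignant nodule scored lower under outline A accounts for the whole difference, which illustrates how sensitive the area can be to outline-dependent features when the number of cases is small.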

Because the definition of the morphology of pulmonary nodules is limited on chest radiographs, a CT image must be obtained in most patients who have noncalcified solitary pulmonary nodules. However, CT is expensive and exposes the patient to radiation. The aim of our computerized classification scheme for solitary pulmonary nodules was to reduce the number of patients with benign nodules who are referred for further diagnostic evaluation. This scheme achieved a higher performance than did the radiologists (mean value) in the study, although the computer was provided with a nodule outline drawn by a radiologist. It appears that the computerized classification method has the potential to be a useful aid to radiologists in the differentiation of benign and malignant pulmonary nodules; in the future, it may reduce the number of unnecessary CT examinations that are performed.


TABLE A1. Data from the Survey on the Final Diagnosis of Solitary Pulmonary Nodules Examined at Chest CT
 


    APPENDIX
 
Although CT has become a major diagnostic method for the differentiation of pulmonary nodules in recent years, a large number of CT examinations were performed for patients suspected of having malignant nodules who actually had benign nodules. We conducted a survey to estimate the relative numbers of malignant and benign solitary pulmonary nodules that were investigated at chest CT. The survey was performed at the following five medical institutions: the University of Chicago Hospitals; the University of Occupational and Environmental Health Hospital, Kitakyushu, Japan; Nagasaki University Hospital, Japan; Iwate Prefectural Central Hospital, Morioka, Japan; and Tokyo Metropolitan Hospital, Japan. At each institution, the records of consecutive patients who underwent chest CT for evaluation of a suspicious pulmonary nodule on chest radiographs were reviewed for the clinical diagnosis before CT, final diagnosis, age, and sex. The final diagnosis was established at pathologic examination or clinical follow-up.

One hundred thirty-three patients (83 men, 50 women; age range, 25–85 years; mean age, 62.9 years) were identified at the five institutions. The clinical diagnosis before CT consisted of suspected lung cancer (n = 43), pulmonary nodule and/or mass (n = 70), abnormal shadow (n = 10), suspected pulmonary metastasis (n = 6), and benign diseases (suspected pulmonary tuberculosis [n = 1], abscess [n = 2], or aspergillosis [n = 1]). Of the patients who, before CT, were suspected of having a pulmonary nodule and/or mass, some might have been suspected of having benign disease. However, we assumed that most of these patients were suspected of having malignant disease.

Table A1 summarizes the data from this survey. Fifty-five (41.4%) of the 133 patients had malignant nodules, which included primary lung cancer and pulmonary metastases. Sixty-four patients (48.1%) had benign conditions, which included benign diseases and negative findings with no apparent pulmonary abnormality at CT. Fourteen patients did not have conclusive final diagnoses. The results obtained in this survey showed that a large fraction of the patients who underwent chest CT were identified as having benign conditions. This finding appears to indicate that some of these studies could have been avoided if the benign cases had been confidently diagnosed on chest radiographs.
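The arithmetic behind these survey figures can be checked with a few lines. The counts are taken from the text above; the dictionary keys and variable names are hypothetical, and the benign share among conclusive diagnoses is a derived value that is not reported in the survey itself.

```python
# Pre-CT clinical diagnoses reported in the survey (counts from the text).
pre_ct = {
    "suspected lung cancer": 43,
    "pulmonary nodule and/or mass": 70,
    "abnormal shadow": 10,
    "suspected pulmonary metastasis": 6,
    "suspected tuberculosis": 1,
    "abscess": 2,
    "aspergillosis": 1,
}

# Final diagnoses reported in the survey (counts from the text).
final = {"malignant": 55, "benign": 64, "inconclusive": 14}

total = sum(final.values())
percent = {name: round(100 * n / total, 1) for name, n in final.items()}

# Derived value (not stated in the survey): benign share among conclusive cases.
conclusive = final["malignant"] + final["benign"]
benign_of_conclusive = round(100 * final["benign"] / conclusive, 1)

print(total, sum(pre_ct.values()))
print(percent)
print(benign_of_conclusive)
```

Both tallies add up to the 133 patients reported, and the computed percentages agree with the 41.4% and 48.1% quoted above.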


    Acknowledgments
 
The authors are grateful to John J. Fennessy, MD, Carl J. Vyborny, MD, PhD, Shunji Tsukuda, MD, Ibraham Syed, MD, Christine Colton, MD, and Joe Montalbano, MD, for participating as observers; to Charles E. Metz, PhD, for his useful suggestions and discussions; and to Elizabeth Lanzl for improving the manuscript.


    Footnotes
 
2 Current addresses: Department of Radiology, University of Occupational and Environmental Health School of Medicine, Kitakyushu, Japan.

3 Department of Radiology, Nagasaki University School of Medicine, Japan.

H. M. and K. D. are shareholders of R2 Technology, Los Altos, Calif. It is the policy of the University of Chicago that investigators publicly disclose actual or potential substantial financial interests that may appear to be affected by the research activities.

Abbreviations: ANN = artificial neural network, ROC = receiver operating characteristic

Author contributions: Guarantors of integrity of entire study, study concepts and design, and definition of intellectual content, K.N., K.D.; literature research, K.N., H.M.; clinical studies, K.N., H.M., K.A.; experimental studies, T.I., S.K.; data acquisition, H.Y., R.E., S.K.; data analysis, H.Y., K.N., S.K.; manuscript preparation, K.N., K.D.; manuscript editing, H.M.; manuscript review, H.Y., R.E., S.K., T.I., K.A.




