Thoracic Imaging
1 From the Department of Radiology, Stanford University Medical Center, 300 Pasteur Dr, S072A, Stanford, CA 94305-5105. Received February 23, 2000; revision requested April 9; revision received June 12; accepted July 25. Supported in part by a grant from Fuji Medical Systems. Address correspondence to A.N.L. (e-mail: aleung@stanford.edu).
| ABSTRACT |
|---|
MATERIALS AND METHODS: One hundred sixty patients who underwent dedicated computed tomography (CT) of the thorax were prospectively recruited into the study. Posteroanterior and lateral computed radiographs of the chest were acquired in each patient and printed in 2K and 4K formats. Six radiologists independently analyzed the hard-copy images and scored the presence of parenchymal (opacities ≤2 cm, opacities >2 cm, and subtle interstitial), mediastinal, and pleural abnormalities on a five-point confidence scale. With CT as the reference standard, observer performance tests were carried out by using receiver operating characteristic (ROC) analysis.
RESULTS: Analysis of averaged observer performance showed 2K and 4K images were equally effective in detection of all three groups of abnormalities. In the detection of the three subtypes of parenchymal abnormalities, there were no significant differences in averaged performance between the 2K and 4K formats (area below ROC curve [Az] values: opacities ≤2 cm, 0.62 ± 0.056 [standard error] and 0.59 ± 0.045; opacities >2 cm, 0.86 ± 0.025 and 0.85 ± 0.030; subtle interstitial abnormalities, 0.73 ± 0.041 and 0.72 ± 0.041). Averaged performance in detection of mediastinal and pleural abnormalities was equivalent (Az values: mediastinal, 0.70 ± 0.046 and 0.73 ± 0.033; pleural, 0.85 ± 0.032 and 0.86 ± 0.033).
CONCLUSION: Observer performance in detection of parenchymal, mediastinal, and pleural abnormalities was not significantly different on 2K and 4K storage phosphor chest radiographs.
Index terms: Radiography, comparative studies, 60.1215 • Radiography, digital, 60.1215 • Radiography, storage phosphor, 60.1215 • Receiver operating characteristic (ROC) curve • Thorax, radiography, 60.1215
| INTRODUCTION |
|---|
Authors of previous studies (2–5) attempted to define optimum matrix (pixel) size for chest radiography by selectively assessing for high-spatial-frequency findings such as pneumothorax or interstitial-type abnormalities with use of digitized radiographs converted from screen-film originals in a limited number of patients. The aim of this study was to compare observer performance in the detection of parenchymal, mediastinal, and pleural abnormalities on 1,760 x 2,140 matrix (2K) and 3,520 x 4,280 matrix (4K) digital storage phosphor chest radiographs in patients who underwent computed tomography (CT) of the thorax for clinical indications.
| MATERIALS AND METHODS |
|---|
CT Technique and Analysis
All CT examinations were performed during suspended full inspiration (HiSpeed CT/i; GE Medical Systems, Milwaukee, Wis). In 140 patients, helical CT data sets extending from the thoracic inlet to the lung bases were acquired during a single 25–35-second breath hold by using a pitch of 1.0–2.0 and section thicknesses of 7 mm (n = 133), 5 mm (n = 4), and 3 mm (n = 3). Additional 1–2-mm thin-section images were often obtained, as clinically indicated, in regions of focal abnormality. In 20 patients, noncontiguous 1-mm CT sections were acquired at 10-mm intervals from the level of the thoracic inlet to the lung bases. All images were reconstructed with a bone reconstruction kernel; intravenous contrast medium (Omnipaque 300; Nycomed Amersham, Princeton, NJ) was administered in 85 patients.
To establish a reference standard for interpretation of the digital radiographs, CT scans were reviewed by two thoracic radiologists (S.P.M.M., A.N.L.) in consensus. The observers evaluated for the presence or absence of parenchymal, mediastinal, and pleural abnormalities. Parenchymal abnormalities were further subdivided into three categories: opacities of 2 cm or smaller; opacities larger than 2 cm; and subtle interstitial abnormalities that were defined to include ground-glass opacities, micronodules (<7 mm), reticular opacities, cysts, bronchiectasis, and platelike atelectasis. Any other type of parenchymal abnormality, such as nodules, masses, granulomas, or consolidation, was categorized on the basis of size.
Mediastinal abnormalities were defined as hilar or mediastinal lymphadenopathy (nodes >1 cm in short-axis diameter on CT images), calcified lymph nodes, aortic diameter greater than 4 cm, calcified cardiac valves, and mediastinal or hilar masses, including hiatal hernias. Pleural abnormalities were defined as pleural thickening, calcification, effusion, or pneumothorax. CT images were viewed on a 12 x 16-inch cathode-ray-tube monitor with 2,048 x 2,560 x 8-bit frame memory. The monitor had a maximum brightness level of 80 foot lamberts and operated at 71 Hz in a noninterlaced mode to eliminate flicker. All CT images were viewed with mediastinal (level, 40 HU; width, 400 HU) and lung (level, -700 HU; width, 1,500 HU) window settings.
Computed Radiographic Acquisition, Processing, and Printing
The mean interval between CT and digital radiographic examinations was 52 minutes (range, 15 minutes to 14 hours); in 148 patients, CT and digital radiographic studies were performed within 1 hour of each other. Posteroanterior and lateral computed radiographs of the chest of each patient were acquired (FCR 9501 HQ; Fuji Photo Film, Tokyo, Japan) by using 130 kVp, 1.25-mm nominal focus, 72-inch film-focus distance, 12:1 Bucky grid, and phototimed exposure (Advantax; GE Medical Systems) to simulate a 400-speed system. Storage phosphor plates (ST-V; Fuji), which were 35 x 43 cm, were read by using a laser diode (CR-IR 327; Fuji) at a sampling rate of 10 pixels per millimeter and density resolution of 10 bits per pixel.
Computed radiographs were processed with a sigmoid, long-contrast Hurter and Driffield curve and slight edge enhancement at higher frequencies. The enhancement factor was 0.5, with a frequency range of greater than about 0.35 cycle per millimeter. The gradation processing parameters were as follows: rotation amount, 0.9; gradation type, E; rotation center, 1.6; and gradation shifting amount, -0.1.
Posteroanterior and lateral computed radiographic studies of each patient were printed at two different matrix sizes, each at a density resolution of 10 bits per pixel. The 2K images, which had an effective pixel size of approximately 0.2 mm, were printed at two-thirds size reduction onto 25.7 x 36.4-cm film (CR 780-H; Fuji); the 4K images, which had an effective pixel size of 0.1 mm, were printed at full size onto 35 x 43-cm film (LI-LM DL; Fuji).
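The relationship between sampling rate, matrix size, and pixel size described above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the plate dimensions, sampling rates, and two-thirds print reduction come from the text, whereas the function names and the derived on-film pixel pitch are assumptions introduced here. The nominal matrices it yields (3,500 x 4,300 and 1,750 x 2,150) are close to, but not identical with, the reported 3,520 x 4,280 and 1,760 x 2,140 matrices, presumably because the actual read area differs slightly from the nominal plate size.

```python
# Illustrative arithmetic only. Plate size and sampling rates are from the text;
# the helper names below are hypothetical and not part of any vendor software.

PLATE_MM = (350, 430)  # 35 x 43-cm storage phosphor plate, in millimeters


def matrix_size(plate_mm, pixels_per_mm):
    """Nominal (columns, rows) for a plate sampled at the given rate."""
    return tuple(round(side * pixels_per_mm) for side in plate_mm)


def pixel_size_mm(pixels_per_mm):
    """Effective pixel size in millimeters at the detector plane."""
    return 1.0 / pixels_per_mm


def printed_extent_mm(n_pixels, pixel_mm, print_scale):
    """Length on film of n_pixels after scaling by print_scale."""
    return n_pixels * pixel_mm * print_scale


nominal_4k = matrix_size(PLATE_MM, 10)  # (3500, 4300)
nominal_2k = matrix_size(PLATE_MM, 5)   # (1750, 2150)

# Reported 2K matrix printed at two-thirds size: does it fit 25.7 x 36.4-cm film?
width_2k = printed_extent_mm(1760, pixel_size_mm(5), 2 / 3)
height_2k = printed_extent_mm(2140, pixel_size_mm(5), 2 / 3)
fits_small_film = width_2k <= 257 and height_2k <= 364

# Derived (not stated in the text): on-film pixel pitch of the reduced 2K print,
# which is about 0.13 mm.
pitch_2k_on_film = pixel_size_mm(5) * 2 / 3
```

Under these assumptions, the reduced 2K print occupies roughly 235 x 285 mm, which is consistent with the smaller film format described.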
Reading Methods
The 160 computed radiographic studies at each matrix size were divided into four sets of 40. The resultant eight study sets of 2K and 4K images were reviewed independently by six radiologists (G.D.R., Y.H.C., S.T.K., R.E.M., P.S., L.W.), each with at least 10 years of radiology experience, who were blinded to patient clinical information and results of CT analysis. All but one reviewer (R.E.M.) were subspecialty trained in either cardiac or thoracic radiology; neither of the two radiologists (S.P.M.M., A.N.L.) who interpreted the CT scans participated in the radiographic analysis. To avoid reading-order bias, the six radiologists were divided into two groups of three. Reading order for the group who began with a 2K set was: 2K, 4K, 4K, 2K, 2K, 4K, 4K, and 2K; the converse order was followed by the other group. The mean interval between reading sessions was 12 days (range, 3–40 days). All digital radiographic studies were interpreted by using a conventional multiviewer; movable black border shutters (6) were used to block extraneous light when viewing the smaller 2K images.
When analyzing the computed radiographs, observers were asked to determine the presence of parenchymal, mediastinal, and pleural abnormalities by using the same criteria as previously defined for CT analysis according to a five-level scale of confidence: 1, absent; 2, probably absent; 3, indeterminate; 4, probably present; 5, definitely present. Prior to formal analysis of the digital radiographic studies obtained in patients enrolled in this study, the six reviewers observed three radiographic studies of patients not in the study population and assigned scores in consensus to ensure consistency of readings. No time limit was imposed for reading the computed radiographic study sets.
Statistical Analysis
Observer performance for detection of parenchymal, mediastinal, and pleural abnormalities on the 2K and 4K computed radiographs was tested according to receiver operating characteristic (ROC) analysis of individual and averaged reader data. Detection accuracy was measured according to the area below the ROC curve (Az) value. We used a multireader-multicase ROC approach with use of the jackknife method (LABMRMC software; Metz CE et al, University of Chicago, Ill; available at ftp://random.bsd.uchicago.edu/roc/; accessed October 1998) to allow for generalization to the population of readers and cases (7,8). This method has the advantage of accounting for matrix size–cases, matrix size–readers, and matrix size–readers–cases interactions in its final estimates of the Az value (7). Statistical significance of the results was reported as 95% CIs for the mean difference of Az values for observer performance with 2K and 4K storage phosphor radiographs (9). Mean differences were regarded as statistically significant at the 5% level when the corresponding CI did not encompass zero (9).
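To make the link between five-point confidence ratings and an area-under-the-curve value concrete, the sketch below computes a nonparametric (Mann-Whitney) estimate of the area from ratings of CT-positive and CT-negative cases, and applies the stated decision rule for a 95% CI. This is a simplified stand-in and not the method used in the study: the LABMRMC software fits a parametric ROC model and uses a jackknife across readers and cases, which this example does not attempt. The ratings and function names are hypothetical.

```python
# Simplified illustration; NOT the LABMRMC jackknife analysis used in the study.
# All ratings below are invented for demonstration.


def empirical_auc(ratings_abnormal, ratings_normal):
    """Mann-Whitney estimate of the area under the ROC curve.

    Each (abnormal, normal) pair scores 1 if the abnormal case received the
    higher confidence rating, 0.5 if the ratings are tied, and 0 otherwise.
    """
    total = 0.0
    for a in ratings_abnormal:
        for n in ratings_normal:
            if a > n:
                total += 1.0
            elif a == n:
                total += 0.5
    return total / (len(ratings_abnormal) * len(ratings_normal))


def significant_at_5_percent(ci_lower, ci_upper):
    """Decision rule from the text: significant if the 95% CI excludes zero."""
    return ci_lower > 0 or ci_upper < 0


# Hypothetical five-point ratings (1 = absent ... 5 = definitely present).
abnormal_by_ct = [5, 4, 4, 3, 2]
normal_by_ct = [1, 2, 3, 1, 2]

auc = empirical_auc(abnormal_by_ct, normal_by_ct)  # 22.5 of 25 pairs -> 0.9
```

With a hypothetical CI for the 2K minus 4K difference of (-0.05, 0.08), `significant_at_5_percent(-0.05, 0.08)` returns `False` because the interval spans zero.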
Although assessment of the degree of interobserver variability is an integral component of the software package, it can also be estimated by using a statistic such as the Cohen weighted κ statistic (10). This statistic was computed for each of 15 possible pairings of six observers by pooling together all of their observations.
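A minimal sketch of a pairwise weighted κ computation on a five-level scale follows. The weighting scheme is an assumption: linear disagreement weights are used here, because the text does not state which weights were applied. The reader labels, ratings, and function names are hypothetical.

```python
from itertools import combinations


def weighted_kappa(ratings_a, ratings_b, n_levels=5):
    """Cohen weighted kappa with linear disagreement weights (an assumption).

    Ratings are integers from 1 to n_levels. The statistic is undefined
    (division by zero) if the expected weighted disagreement is zero, for
    example when both readers give every case the same single score.
    """
    n = len(ratings_a)
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a - 1][b - 1] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = abs(i - j) / (n_levels - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    return 1.0 - observed_disagreement / expected_disagreement


def pairwise_kappas(ratings_by_reader):
    """Weighted kappa for every unordered pair of readers."""
    return {
        (r1, r2): weighted_kappa(ratings_by_reader[r1], ratings_by_reader[r2])
        for r1, r2 in combinations(sorted(ratings_by_reader), 2)
    }


# Six hypothetical readers yield 6 * 5 / 2 = 15 pairings, as stated in the text.
n_pairs = len(list(combinations(range(6), 2)))
```

Averaging the 15 values returned by `pairwise_kappas` would give a single summary of interobserver agreement comparable in form to the mean weighted κ reported in the Results.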
| RESULTS |
|---|
Comparison of Az values for detection of parenchymal, mediastinal, and pleural findings on 2K and 4K computed radiographs on the basis of averaged observer performance showed no statistically significant differences in detection of any type of abnormality (P > .05) (Table). Specifically, no significant difference (P > .05) in averaged reader (Fig 1) detection of any of the three subtypes of parenchymal abnormalities was found (Fig 2). ROC curves for averaged reader performance in detection of mediastinal and pleural abnormalities are shown in Figure 3.
As a measure of interobserver variability among all six observers, the mean weighted κ value was 0.41 (range, 0.32–0.48; standard error, 0.045), a level of mean agreement considered to be "fair to good" (11).
| DISCUSSION |
|---|
In this prospective study, case selection bias was minimized by enrollment of consecutive patients who presented to our institution for CT evaluation of the thorax because of clinical indications. CT was chosen as the objective reference standard for interpretation of the digital radiographs to avoid observer bias that may have been introduced by consensus panel review of the 2K and 4K digital radiographs, which are easily differentiated from one another owing to differences in film size. Since CT is a more sensitive technique than radiography for detection of thoracic abnormalities, its use as the reference standard is expected to result in lower observer performance than would have occurred with a less sensitive radiographic reference standard. However, adherence to this more rigorous reference standard does not invalidate a meaningful comparison between observer performance with 2K and 4K computed radiographs.
In our study, we found no statistically significant differences in averaged observer performance for detection of any type of parenchymal, mediastinal, or pleural abnormality on the 2K and 4K computed radiographs. These results agree with theoretic predictions that indicate the detection of large, predominantly low-spatial-frequency objects such as pulmonary nodules, mediastinal abnormalities, or pleural effusion and thickening may be more strongly dependent on the noise properties than on the spatial resolution properties of an imaging system (14). In an ROC study by Lams and Cocklin (4), detection of solitary pulmonary nodules (mean diameter, 12 mm) by using a cathode-ray-tube monitor display did not significantly (P > .05) improve at effective pixel sizes less than 0.8 mm. Effective pixel sizes for the 2K and 4K computed radiographs of the chest used in this study were approximately 0.2 and 0.1 mm, respectively.
Supported by both theoretic considerations (14,15) and empiric data (4,5,12), some investigators (16) have postulated that a 0.2-mm pixel size may provide adequate resolution for visual detection of necessary detail at digital chest radiography. However, because this resolution is inferior to that of standard screen-film systems, decreased detection of fine patterns of interstitial lung disease may occur owing to obscuration of these high-frequency image components by the larger pixels.
The results of our study suggest that diagnostic accuracy in detection of the subtle interstitial type of abnormalities is not significantly different between 0.2-mm pixel (2K) and 0.1-mm pixel (4K) computed radiographs. In contrast to our results, MacMahon and colleagues (2) found that for digital images acquired with screen-film systems, diagnostic accuracy for mild interstitial infiltrates was substantially improved on the higher resolution 0.1-mm pixel images. These differing results likely relate to technical differences between the types of digital imaging systems being evaluated in the two studies. The greater dynamic range of storage phosphor detectors in combination with edge enhancement postprocessing techniques can result in optical resolution on computed radiographs equivalent to that on screen-film images despite a lower intrinsic spatial resolution (17). On the basis of physical measures, Flynn et al (18) found that effective resolution on 2K and 4K computed radiographs was similar; this equivalence in resolution despite differences in output pixel size may relate to an intrinsic limitation of the phosphor detector or result from retention of high-frequency information on 2K images that were initially sampled at 10 pixels per millimeter (4K) before data output occurred at 5 pixels per millimeter (2K).
A confounding factor in our study was a difference in size between the 2K and 4K computed radiographs. Although Schaefer and colleagues (19) have shown that reduction in hard-copy size can negatively affect observer performance, particularly in detection of small, low-contrast objects such as fine lines or micronodules, no significant difference in detection of any type of abnormality on 2K and 4K images was found in our study despite the use of 2K images that were reduced to two-thirds of conventional image size. Primarily because of the increased complexity of data acquisition and analysis, we elected not to use free-response methodology (20) in our study design; although the additional requirement for localization of all detected abnormalities may have reduced the number of false-positive findings, we believe that this more rigorous methodology would not have had a major effect on the results of this comparative study.
The diagnostic accuracies of individual and averaged reader results in the detection of all types of abnormalities assessed in our study on both the 2K and 4K computed radiographs were uniformly lower than those reported in prior ROC studies (2,12,13,21,22). We attribute our lower diagnostic accuracy to differences in our study design, which likely resulted in inclusion of more subtle cases that were interpreted under more difficult reading conditions. Unlike patients in prior studies (2,12,13,21,22), the patients entered into our study were not selected on the basis of their radiographic findings. Because a recent CT examination was the required entry criterion and because CT served as the reference standard, in some cases, abnormalities may have been radiographically occult. Also, rather than limiting assessment to a particular type of abnormality within a limited area of the radiographs (2,12,22), our observers were required to analyze for the presence or absence of parenchymal, mediastinal, and pleural abnormalities on the entirety of the digital chest radiographs; these more rigorous reading conditions more closely simulate daily clinical practice.
In conclusion, observer performance in detection of parenchymal, mediastinal, and pleural abnormalities was not significantly different between 2K and 4K digital storage phosphor chest radiographs. Our results suggest that hard-copy display of computed radiographs of the chest at 2K resolution may be adequate for most clinical applications. It should be emphasized that this study's findings cannot be applied to resolution requirements either for nonstorage phosphor digital radiographic systems that have different optophysical properties or for soft-copy display, because interpretation on cathode-ray-tube monitors introduces additional variables that can influence diagnostic accuracy.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
3 Current address: Department of Radiology, Dankook University College of Medicine, Seoul, Korea.
4 Current address: Department of Radiology, University of California, San Diego.
Abbreviations: Az = area below the ROC curve, ROC = receiver operating characteristic, 2K = 1,760 x 2,140 matrix, 4K = 3,520 x 4,280 matrix
Author contributions: Guarantors of integrity of entire study, S.P.M.M., A.N.L.; study concepts, S.P.M.M., A.N.L.; study design, S.P.M.M., A.N.L., G.D.R., S.K.P.; definition of intellectual content, S.P.M.M., A.N.L.; literature research, S.P.M.M., A.N.L.; clinical studies, S.P.M.M., A.N.L.; data acquisition, S.P.M.M., A.N.L., G.D.R., Y.H.C., S.T.K., R.E.M., L.W., P.S.; data analysis, S.P.M.M., A.N.L., B.J.B.; statistical analysis, S.P.M.M., A.N.L., S.K.P., B.J.B.; manuscript preparation, S.P.M.M.; manuscript editing, A.N.L., G.D.R.; manuscript review and final version approval, all authors.
| REFERENCES |
|---|