Thoracic Imaging
1 From the Mallinckrodt Institute of Radiology, Washington University School of Medicine, 510 S Kingshighway Blvd, St. Louis, MO 63105 (D.S.G., T.K.P.); Westat Corporation, Rockville, Md (M.F.); the National Cancer Institute, Bethesda, Md (R.M.F.); the University of Minnesota Medical School, Minneapolis (T.R.C.); the University of Alabama at Birmingham School of Medicine (H.N.); the University of Colorado Health Sciences Center, Denver (K.G.); and the University of Pittsburgh School of Medicine, Pa (D.C.S.). From the 2005 RSNA Annual Meeting. Received December 8, 2006; revision requested February 20, 2007; revision received April 4; accepted May 4; final version accepted June 1. Supported by the National Cancer Institute Contract N01-CN-25516. Address correspondence to D.S.G. (e-mail: gieradad{at}wustl.edu).
Purpose: To evaluate agreement among radiologists on the interpretation of pulmonary findings at low-dose computed tomographic (CT) screening examinations for lung cancer.
Materials and Methods: Institutional review board approval and informed consent were obtained. HIPAA guidelines were followed. Sixteen radiologists from the 10 National Lung Screening Trial screening centers of the National Cancer Institute's Lung Screening Study network reviewed image subsets from 135 baseline low-dose screening CT examinations in 135 trial participants (89 men, 46 women; mean age, 62.7 years ± 5.4 [standard deviation]). Interpretations were classified into one of the following four categories: noncalcified nodule 4 mm or larger in greatest transverse dimension (positive screening result); noncalcified nodule smaller than 4 mm in greatest transverse dimension (negative screening result); calcified, benign nodule (negative screening result); or no nodule (negative screening result). A recommendation for follow-up evaluation was obtained for each case. Interobserver agreement was evaluated by using the multirater κ statistic and by using response frequencies and descriptive statistics.
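As a rough illustration of the two agreement measures named above, the following Python sketch computes a multirater κ and the average percentage of reader pairs in agreement per case. It assumes the Fleiss formulation of multirater κ, which the abstract does not specify; the function names and the toy readings are hypothetical and are not data from the study.

```python
from itertools import combinations


def fleiss_kappa(ratings, categories):
    """Multirater kappa, assuming the Fleiss formulation.

    ratings: list of cases; each case is a list of category labels,
             one per reader, with the same number of readers per case.
    categories: list of all possible labels.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of readers assigning case i to category j
    counts = [[case.count(c) for c in categories] for case in ratings]
    # Observed agreement per case: share of reader pairs that agree
    p_i = [
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_cases
    # Chance agreement from the overall category proportions
    p_j = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    # Undefined (division by zero) if every reading falls in one category
    return (p_bar - p_e) / (1 - p_e)


def mean_pairwise_agreement(ratings):
    """Average, over cases, of the fraction of reader pairs in agreement."""
    per_case = []
    for case in ratings:
        pairs = list(combinations(case, 2))
        per_case.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(per_case) / len(per_case)


# Hypothetical toy readings: 4 cases, 3 readers, positive/negative result
toy_ratings = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neg"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "pos"],
]
print(fleiss_kappa(toy_ratings, ["pos", "neg"]))
print(mean_pairwise_agreement(toy_ratings))
```

Under this formulation, the observed-agreement term inside κ equals the mean pairwise agreement, so κ can be read as percentage agreement corrected for the agreement expected by chance.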
Results: Multirater κ values ranged from 0.58 (for agreement among all four classifications; 95% confidence interval: 0.55, 0.61) to 0.64 (for agreement on classification as a positive or negative screening result; 95% confidence interval: 0.62, 0.65). The average percentage of reader pairs in agreement on the screening result per case (percentage agreement) was 82%. There was wide variation in the total number of abnormalities detected and classified as pulmonary nodules, with differences of up to more than twofold among radiologists. For cases classified as positive, multirater κ for follow-up recommendations was 0.35.
Conclusion: Interobserver agreement was moderate to substantial; potential for considerable improvement exists.
© RSNA, 2007
Clinical trial registration no. NCT00047385 [ClinicalTrials.gov]