Gastrointestinal Imaging
1 From the Mallinckrodt Institute of Radiology (E.G.M., T.K.P., R.A.M., C.V.S., P.W.B., J.P.H., D.M.B.) and Department of Internal Medicine, Gastroenterology Division (L.B.W., E.P.T.), Washington University School of Medicine, 510 S Kingshighway Blvd, St Louis, MO 63110; Department of Diagnostic Radiology, Yale University School of Medicine, New Haven, Conn (J.A.B.); and Department of General Internal Medicine, University of Vermont, Burlington (B.L.). Received October 2, 2001; revision requested November 16; final revision received June 6, 2002; accepted June 7. Supported in part by National Cancer Institute PLCO Cancer Screening Trial (N01-CN-25516), General Electric, Association of University Radiologists (GERRAF, EGM), and Washington University Siteman Cancer Center. Address correspondence to E.G.M. (e-mail: mcfarlandb@mir.wustl.edu).
| ABSTRACT |
|---|
MATERIALS AND METHODS: A cohort of 70 patients suspected of having polyps was examined with spiral computed tomographic (CT) colonography, with colonoscopy performed the same day. After air insufflation per rectum, supine and prone images were obtained with single-detector row CT (5-mm collimation, 8-mm table increment, 2-mm reconstruction interval). Images were analyzed independently by four experienced abdominal radiologists using two-dimensional multiplanar reformation followed by selective use of three-dimensional endoscopic volume-rendered images. Data were analyzed both per polyp and per patient.
RESULTS: Analysis per polyp demonstrated a pooled sensitivity of 0.68 for lesions 10 mm or larger (n = 40), with 75% agreement among the four readers. Analysis per patient demonstrated improved detection and agreement, with a pooled sensitivity of 0.88 for patients with polyps or cancers 10 mm or larger (n = 28), with 94% agreement. When sensitivity and receiver operating characteristic analyses were analyzed per polyp size threshold, results among readers converged and peaked at polyp diameters of approximately 10 mm.
CONCLUSION: In this patient cohort, diagnostic performance and interobserver agreement with single-detector row CT colonography were sufficient for detection of patients with lesions 10 mm or larger, with more variable results for smaller polyps.
© RSNA, 2002
Index terms: Colon, CT, 75.12115, 75.12117 Colon neoplasms, 75.311 Computed tomography (CT), image processing, 75.12115, 75.12117 Diagnostic radiology, observer performance Images, analysis, 75.12115, 75.12117
| INTRODUCTION |
|---|
Preliminary estimates of the diagnostic performance of CT colonography have been promising but variable. Sensitivity for detection of polyps 10 mm and larger has been reported to be 50%–90% in series with variability in sample size, case mix, image acquisition, and image-display techniques (8–20). In 2001, authors of the largest published prospective series (19) reported excellent sensitivity (90%) of single-detector row CT for polyps 10 mm and larger (n = 88) in 300 patients. To date, reader performance has been analyzed in very few studies (21,22).
Our purpose was to prospectively evaluate multiobserver diagnostic performance and reader agreement for colorectal polyp detection in a well-characterized cohort of polyp-rich patients (that is, patients with an increased number of polyps compared with average-risk patients), with colonoscopy as the reference standard.
| MATERIALS AND METHODS |
|---|
Data Acquisition
On the day before the CT and colonoscopic examinations, all patients underwent a standard bowel cleansing preparation (phospho-soda, 24-hour Fleet 1 preparation, Fleet Pharmaceuticals, Lynchburg, Va; or polyethylene glycol electrolyte solution, GoLytely, Braintree Laboratories, Braintree, Mass), with the recommended dietary restriction. Each patient underwent the CT examination just prior to the colonoscopic examination.
For the CT examination, air was insufflated in each patient manually per rectum by an abdominal radiologist (E.G.M., R.A.M.). Images were acquired with the patients in the supine position, directly followed by acquisition in the prone position. Patients were initially positioned in the right lateral decubitus position, followed by the supine position, with a total instillation of air of approximately 30–40 puffs (30 mL per puff). At the initiation of the study, glycopyrrolate (Robinol; Robins, Richmond, Va) was given intravenously to decrease colonic spasm in 22 of the patients; however, it was discontinued owing to lack of consistent improvement of colonic distention with bowel relaxants (24). An anteroposterior CT topogram was acquired to confirm adequate insufflation for each position, and further air was instilled per rectum as needed.
Images of the abdomen and pelvis were obtained with a single-detector row spiral CT unit (Somatom Plus 4, Siemens Medical Systems, Iselin, NJ; or Proscan, GE Medical Systems, Milwaukee, Wis) by using 5-mm collimation, 8-mm table increment, 200 mAs, 120 kV, and a 2-mm reconstruction interval. Image acquisition times ranged from approximately 45 to 60 seconds for each position. The transverse source images were transferred to a dedicated 3D workstation (Silicon Graphics 02; Silicon Graphics, Mountain View, Calif) with 1-Gbyte RAM by using dedicated 3D software (Vitrea 1.2; Vital Images, Minneapolis, Minn) for the analysis of the colon. For the majority of data sets, further linear interpolation of the section data was performed, resulting in nearly isotropic voxel dimensions (in-plane of 0.6–0.8 mm² and through-plane of 0.6–2.0 mm).
After the CT examinations, the patients directly underwent colonoscopy. The colonoscopic examinations were performed (without knowledge of the CT findings) by board-certified gastroenterologists (L.B.W., E.P.T.; average experience, 9 years; range, 5–13 years) in the gastrointestinal division of the Department of Internal Medicine at Washington University (St Louis, Mo). A dedicated research coordinator was present during each colonoscopic examination to facilitate recording of the size and location of each polyp. For each polyp that was localized, the polyp was measured in vivo by the gastroenterologist using a measuring device (Olympus M2-3U; Olympus of America, Lake Success, NY), which was calibrated with 2-mm markers. The gastroenterologists were instructed to measure the long axis of the base or head of the polyp, with exclusion of the stalk if present. Digital images of the polyp were obtained, and the entire examination was recorded on a VHS tape. A summary of the polyp sizes and locations was recorded at the end of each examination. For each lesion, pathologic specimens were analyzed. However, in several cases in which multiple lesions were submitted in the same container, individual identification of each lesion was not possible.
CT Image Analysis and Reader Protocol
Four experienced fellowship-trained abdominal radiologists (E.G.M., J.A.B., J.P.H., D.M.B.) with a mean of 12.3 years (range, 5–19 years) of faculty-level body CT experience participated in this study. Prior experience with CT colonography consisted of initial training with a teaching library of five CT colonographic data sets, followed by completion of a previous test and a retest study of 30 colonic segments containing 22 lesions (22). All readers were facile with the software and two-dimensional (2D) and 3D image-display techniques.
To preserve patient confidentiality, the digital imaging and communications in medicine headers of the patient names were electronically stripped in all 140 acquisitions (70 supine, 70 prone) and a random code number ranging from 100 to 999 was assigned for each supine and prone pair. The order of the cases was randomly assigned, and each reader read the cases in approximately the same order.
Each radiologist independently evaluated the 70 patient data sets (supine and prone acquisitions read together for each patient). Initially, the colon was surveyed by using 2D multiplanar reformation (MPR), which uses interactive viewing of the transverse, coronal, and sagittal images of the entire volumetric CT data at two fixed window level settings. We predominantly used a high-contrast window display setting (width of 1,500 HU and level of -200 HU) for polyp detection. This window display setting was chosen empirically to provide high contrast for polyp detection while preserving some soft-tissue detail to discriminate extraluminal findings, including diverticula, lipomas, and wall abnormalities. A soft-tissue window display setting (width of 400 HU and level of 10 HU) was used occasionally for further wall or fat characterization.
For each focal finding detected with 2D MPR, the x, y, and z coordinates were recorded and the greatest long-axis dimension of the focal finding was measured. For polypoid lesions, the polyp head but not the stalk was measured. Each focal finding detected with 2D MPR was further characterized with 3D endoscopic view by using perspective volume rendering (PVR) with color. A nonlinear opacity function, which assigns a percentage of opacity across each attenuation value of the frequency histogram, and a mucosal color map of the red-green-blue assignments were used (25).
By using the combined approach of 2D MPR to detect and the 3D endoscopic view to further characterize each focal finding, a five-point rated response was used to assign a level of confidence for the presence of a polyp or cancer: 1, definitely not present; 2, probably not present; 3, possibly present; 4, probably present; and 5, definitely present. Each detected lesion was first scored with 2D MPR alone, then an additional score was given after further evaluation with 3D PVR. The readers were instructed that for the sensitivity and specificity analyses, scores of 1 and 2 would be collapsed to negative and scores of 3, 4, and 5 would be collapsed to positive. At the end of the entire case, a similar five-point response was also used for the patient score. This score represented each reader's overall assessment of the patient, including presence or absence of focal finding(s), lesion size and morphology, and the reader's confidence. The duration of the reading session for each patient, defined as the time from the start of the evaluation of the images at the 3D workstation to the completion of an Internet-posted report page of labeled images for each detected lesion, was recorded for each reader.
After the reader study was completed, three investigators (C.V.S., R.A.M., E.G.M.) together determined the supine and prone CT colonographic coordinates for lesions 4 mm and larger found at colonoscopy. On the basis of the numbering system of lesions found at colonoscopy, the results of the four radiologists were then correlated with the CT findings of the true-positive lesions. Correlation of the false-positive CT findings was also performed among the four readers on a per-lesion basis.
Image quality grading scores for tortuosity, colonic collapse (degree of nondistention), fluid retention, and motion artifact for all of the supine and prone segments (rectum, sigmoid colon, descending colon, transverse colon, and ascending colon or cecum, with defined CT criteria for the flexural transitions) were also determined collectively by the three investigators. At the beginning of the qualitative evaluation, the investigators reviewed several data sets that represented the range of the different grades of tortuosity, collapse, fluid retention, and motion artifact. This process helped to calibrate the qualitative scoring process, and good agreement among the investigators was present. Tortuosity and collapse were graded with a three-point grading scale: 1, mild; 2, moderate; and 3, severe. Fluid retention was also graded with a three-point grading scale: 1 for mild, fluid level one-fourth or less of lumen; 2 for moderate, fluid level ranging from more than one-fourth to less than or equal to one-half of lumen; and 3 for severe, fluid level more than one-half of lumen. For tortuosity, collapse, and fluid evaluations, the dominant grade per segment was determined qualitatively, since variation within a given segment was often present. Motion artifact (breathing or cardiac) was quantified for each patient (on each supine and prone data set). Four qualitative grades were defined: grade 0, no artifact; grade 1, slight artifact, with no diagnostic quality degradation; grade 2, mild to moderate artifact, with appreciable diagnostic quality degradation; and grade 3, severe artifact, with marked diagnostic quality degradation. The total percentage of grades 2 and 3 artifact was measured on the coronal MPR per patient (eg, craniocaudal distance of the summed artifact divided by total craniocaudal distance of the colon).
Statistical Analysis
Colonoscopic findings were the reference standard for the presence of colorectal lesions. Polyp size determined at colonoscopy was used as the reference for calculation of sensitivity and specificity by different size categories. Since calculation of positive predictive value involved counting false-positive findings at CT with no colonoscopic correlate, each radiologist's size estimate of the lesions from CT was used.
Readers rated both individual polyps and overall patient scores with a five-point scale conveying certainty of diagnosis. For analysis per polyp, sensitivity and positive predictive value were calculated by collapsing the five-point rated responses into a binary diagnosis, with the two points that represented strong negative and weak negative diagnoses becoming absolute negative diagnoses and the three points that represented strong positive, weak positive, and uncertain diagnoses becoming absolute positive diagnoses.
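The binary collapse and the per-polyp measures described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the example ratings are hypothetical, and a polyp that a reader never reported is represented here as `None` and counted as a miss.

```python
from typing import Optional, Sequence


def is_positive(score: Optional[int]) -> bool:
    """Collapse a five-point rating: 1-2 are negative, 3-5 are positive.

    None marks a polyp the reader never reported (assumed to be a miss).
    """
    return score is not None and score >= 3


def sensitivity(ratings_for_true_polyps: Sequence[Optional[int]]) -> float:
    """Fraction of colonoscopy-proven polyps called positive at CT."""
    hits = sum(is_positive(s) for s in ratings_for_true_polyps)
    return hits / len(ratings_for_true_polyps)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of positive CT calls that had a colonoscopic correlate."""
    return true_pos / (true_pos + false_pos)


# Hypothetical reader: five proven polyps, one of them never reported.
example_ratings = [5, 3, 2, None, 4]
print(sensitivity(example_ratings))

# 27 true-positive calls among 52 positive calls, as in one reported result.
print(round(positive_predictive_value(27, 25), 2))
```

Note that a score of 3 ("possibly present") counts as positive, so the collapse favors sensitivity over positive predictive value.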
For analysis per patient, patients were assigned into a category according to the size of the largest polyp found at colonoscopy. Calculation of sensitivity was performed on the basis of the overall patient-rated responses of each reader. Calculation of specificity was performed on the basis of true-negative findings in patients defined as having polyp(s) 5 mm or smaller in diameter at colonoscopy. To evaluate for suitability of pooling, individual sensitivity (per polyp and per patient) and specificity (per patient) results were examined for heterogeneity with a simple χ² test. Results were pooled only when differences among readers were not significant (P > .05). Receiver operating characteristic analyses were also performed with a software program (MRMC 1.70; available at: ftp://perception.radiology.uiowa.edu/public/MRMC32) (26) by using the responses from the five-point scale. No pooling of receiver operating characteristic results was performed.
Both sensitivity and receiver operating characteristic analyses were also performed at multiple polyp size thresholds. For calculation of sensitivity at different lower limits of polyp size, only polyps equal to or larger than the threshold or patients whose largest polyp was equal to or larger than the threshold were evaluated at each tested size threshold. In the receiver operating characteristic analyses, all 70 patients were included at each threshold; however, the threshold used determined whether the reference standard for each patient was counted as negative or positive. For instance, a patient whose largest polyp was 6 mm at colonoscopy would be counted as a "positive patient" when using thresholds of 4, 5, and 6 mm, but the patient would be counted as "negative" at higher threshold values.
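The threshold-dependent reference standard can be illustrated with a short Python sketch. The helper names and the example cohort are hypothetical, and a patient with no polyps is assumed to be encoded with a largest-polyp size of 0 mm.

```python
from typing import Dict, List


def reference_positive(largest_polyp_mm: float, threshold_mm: float) -> bool:
    """Reference-standard label for ROC analysis at a given size threshold."""
    return largest_polyp_mm >= threshold_mm


def patients_for_sensitivity(largest_by_patient: Dict[str, float],
                             threshold_mm: float) -> List[str]:
    """Patients kept when sensitivity is computed at this lower size limit."""
    return [pid for pid, size in largest_by_patient.items()
            if size >= threshold_mm]


# The worked case from the text: a patient whose largest polyp is 6 mm.
labels = {t: reference_positive(6, t) for t in range(4, 11)}
print(labels)

# Hypothetical three-patient cohort (0 mm means no polyp at colonoscopy).
cohort = {"A": 0, "B": 6, "C": 12}
print(patients_for_sensitivity(cohort, 10))
```

For the receiver operating characteristic analyses every patient stays in the data set and only the label changes, whereas for sensitivity the denominator shrinks as the threshold rises.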
The percentage agreement among the four readers was determined both per patient and per polyp. Percentage agreement was performed on the basis of binary collapse of the rated responses (scores 1 and 2 were negative, and scores 3, 4, and 5 were positive). Percentage agreement among patients was determined for all patients, patients with lesions 10 mm and larger, and patients with 6–9-mm lesions. The calculation of percentage agreement per patient was straightforward, since a response was required for each patient score. The calculation of percentage agreement per polyp, however, was more complex, since readers could fail to detect individual polyps and therefore have no response to compare. Thus, when percentage agreement was calculated per polyp, all polyps found at colonoscopy were counted, and agreements of both nonresponses and responses at CT were calculated. The high case mix of this patient cohort can create bias with statistical measures of agreement (27), and we chose percentage agreement because we believed it would be the most simple and clear.
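One way to compute percentage agreement per polyp, with nonresponses counted as negative calls, is sketched below in Python. The text does not state how agreement was aggregated across the four readers, so the averaging over all reader pairs used here is an assumption, and the function names and ratings are hypothetical.

```python
from itertools import combinations
from typing import Optional, Sequence


def collapse(score: Optional[int]) -> bool:
    """Scores 3-5 are positive; scores 1-2 and nonresponses (None) are negative."""
    return score is not None and score >= 3


def pairwise_percent_agreement(
        ratings: Sequence[Sequence[Optional[int]]]) -> float:
    """ratings[i][j] is reader j's score for item i (None = no response).

    Returns the percentage of reader pairs, summed over items, whose
    collapsed calls match. Averaging over pairs is an assumption.
    """
    agree = 0
    total = 0
    for item in ratings:
        calls = [collapse(s) for s in item]
        for a, b in combinations(calls, 2):
            agree += int(a == b)
            total += 1
    return 100.0 * agree / total


# Hypothetical ratings of three colonoscopy-proven polyps by four readers.
polyp_ratings = [
    [5, 4, 3, 5],           # all four readers positive
    [None, 2, 1, None],     # all four negative (two nonresponses)
    [4, None, 5, 2],        # readers split two against two
]
print(round(pairwise_percent_agreement(polyp_ratings), 1))
```

Treating a missed polyp as a negative call lets every colonoscopy-proven polyp contribute to the denominator, which matches the counting rule described above.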
| RESULTS |
|---|
≥10-mm lesion). Among the 62 patients with lesions 4 mm and larger, the average number of polyps per patient was 2.5 (range, 1–14); 60% (37 of 62) of these patients had distal rectosigmoid lesions only, 35% (22 of 62) had both proximal and rectosigmoid polyps, and 5% (three of 62) had proximal lesions only. Nine patients had proximal lesions that were 10 mm or larger in diameter; six of these patients had an associated distal lesion 10 mm or larger.
Image Quality Scores
Table 2 summarizes the overall CT visualization scores of tortuosity, fluid retention, and collapse per colonic segment (eg, the arithmetic mean of the three-point grading scales for the three parameters determined for each colonic segment). The majority of segments demonstrated excellent visualization, with the sigmoid colon having the lowest visualization scores (16% of sigmoid segments were considered poorly visualized).
For motion artifact, 41% (29 of 70) of patients had grade 0 artifact and 16% (11 of 70) of patients had grade 1 artifact. In the remaining 43% (30 of 70) of patients, the longitudinal extent of the grades 2 and/or 3 motion artifact (mean ± SD) was present in 16% ± 0.19 (mean ratio of 59 mm to 360 mm) of supine and 12% ± 0.12 of prone (mean ratio of 43 mm to 380 mm) scanning data.
Diagnostic Performance of CT Colonography
Analysis per polyp. For the 4–5-mm polyps, individual sensitivity ranged from 0.06 (four of 72) to 0.38 (27 of 72) (significant differences were present among readers, P < .01) and positive predictive value ranged from 0.52 (27 of 52) to 1.0 (four of four). Table 3 demonstrates the sensitivity and positive predictive values calculated for the more clinically relevant 6–9-mm and 10-mm and larger polyps among the four readers using 2D MPR alone and 2D MPR with the additional use of 3D PVR. The following are the results of 2D MPR with 3D PVR. For polyps 6 mm and larger, there was considerably more individual variability among readers for 6–9-mm polyps (P = .07) than for lesions 10 mm or larger (P = .29). Pooled results of sensitivity were 0.36 for 6–9-mm polyps and 0.68 for lesions 10 mm or larger. Individual positive predictive values ranged from 0.35 to 0.79 (6–9-mm polyps, P = .01) and from 0.64 to 0.86 (≥10-mm lesions, P = .06). As would be expected, there were general trends for readers with higher sensitivity to also have more false-positive findings; however, trends in positive predictive value were less consistent among readers. Figures 1–4 demonstrate individual cases that exemplify our reader results.
Analysis per patient. Sensitivity, categorizing patients by the size of their largest lesion at colonoscopy, was more similar among readers per patient category than per polyp category (Table 4). Pooled results of sensitivity per patient category were 0.48 (27 of 56 pooled readings, P = .30 for χ² analysis among readers) for patients with 4–5-mm polyps, 0.71 (P = .14) for patients with 6–9-mm polyps, and 0.88 (P = .81) for patients with lesions 10 mm and larger.
Specificity, categorizing patients as negative with polyps 5 mm or smaller in diameter (n = 22), was 0.60 (53 of 88 pooled responses, P = .06). Individual reader results of specificity were 0.68 (15 of 22), 0.64 (14 of 22), 0.36 (eight of 22), and 0.73 (16 of 22).
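The pooled specificity and the heterogeneity check can be reproduced from the four reader counts above. The Python sketch below assumes that the "simple χ² test" was a Pearson χ² test on a readers-by-outcome table with 3 degrees of freedom; the exact form of the test is not stated in the text, and the function names are hypothetical.

```python
import math
from typing import Sequence


def chi_square_heterogeneity(successes: Sequence[int],
                             trials: Sequence[int]) -> float:
    """Pearson chi-square statistic for a k x 2 (reader by outcome) table."""
    pooled = sum(successes) / sum(trials)
    stat = 0.0
    for s, n in zip(successes, trials):
        expected_s = n * pooled
        expected_f = n * (1 - pooled)
        stat += (s - expected_s) ** 2 / expected_s
        stat += ((n - s) - expected_f) ** 2 / expected_f
    return stat


def chi2_upper_tail_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of
    freedom (closed form, valid for four readers only)."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


true_negatives = [15, 14, 8, 16]   # per reader, from the text
negative_patients = [22, 22, 22, 22]

pooled_specificity = sum(true_negatives) / sum(negative_patients)
statistic = chi_square_heterogeneity(true_negatives, negative_patients)
p_value = chi2_upper_tail_df3(statistic)

print(round(pooled_specificity, 2))  # 53 of 88
print(round(statistic, 2))           # about 7.35
print(round(p_value, 2))             # about 0.06
```

Under these assumptions the result is consistent with the reported P = .06, which sits just above the .05 cutoff used to permit pooling.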
Analysis per polyp size thresholds. Results of sensitivity calculated per polyp and per patient units of analyses are plotted in Figure 5, analyzed at each lower limit of polyp size. Two trends were evident. First, sensitivity increased and converged as the size threshold increased. Second, sensitivity was higher and converged more strongly among results categorized per patient. Results above a threshold of 10 mm showed little change as the sample size continued to decrease.
≥10-mm lesions), the pooled agreement (detected and undetected) was 75%.
≥10-mm lesions) resulted in two large false-negative lesions owing to fluid retention. Of the 45 6–9-mm lesions, 26 false-negative lesions were present among three or more of the readers (14 in the rectosigmoid, seven in the transverse colon, three in the descending colon, and two in the ascending colon or cecum). The relationship of false-negative rates among readers and two image quality scores (fluid retention and collapse) was separately evaluated for the 40 lesions that were 10 mm or larger. The prone and supine scores were combined by using the best score in either position. The rationale for this was that readers would use the best-visualized image to make their judgments. For the evaluation of collapse scores, 31 of 40 segments had a best score of 1 (mild collapse), nine of 40 had a best score of 2 (moderate collapse), and none had a score of 3. There was no relationship between false-negative rates and collapse scores. For the evaluation of fluid retention, 33 of 40 segments had a best score of 1 (fluid level occupying one-fourth or less of the lumen), seven of 40 had a best score of 2, and none had a score of 3. Among the seven of 40 segments that had a best score of 2 (fluid level occupying more than one-fourth to one-half of the lumen), there was a slight trend of increasing false-negative findings among all four readers.
The sizes and distributions of the false-positive lesions were assessed among the readers. There were 21 different false-positive lesions (15 patients) that were reported by three or four of the readers, with 12 (six in the transverse colon, four in the rectosigmoid, one in the descending colon, one in the cecum) of 21 false-positive lesions being 10 mm and larger in size and nine (three in the rectosigmoid, three in the descending colon, two in the ascending colon, and one in the transverse colon) of 21 of the false-positive lesions being smaller than 10 mm in size.
Reader Times
The average duration of the individual reader sessions (image evaluation and Internet posting of results) was 20 minutes (range, 15–23 minutes). The two readers with the highest sensitivity (per polyp analysis) had longer average reader times (21 and 25 minutes), while the two readers with lower sensitivity had shorter average reader times (14 and 15 minutes). Less consistent trends were seen between reader times and the number of false-positive findings among readers.
| DISCUSSION |
|---|
Within the six-tiered hierarchic model of technology assessment (28,29), the current assessment of CT colonography is at level 2, defined as diagnostic-accuracy efficacy. At this level, evaluation of well-characterized cohorts by more experienced readers is performed to predict preliminary diagnostic performance of an evolving technology. The following are the important input variables (case mix, reference standard, CT acquisition protocol) and the subsequent analysis (reader protocol, statistical methods, interobserver agreement) that influence the current diagnostic performance and may play a role in future investigations.
Case mix (the proportion of positive and negative cases in the patient cohort) imparts a major influence in the evaluation of diagnostic performance and needs to be appropriate for each evolving level. At the beginning of the recruitment for this study, published investigations (8–13) of CT colonography had reported a range of sensitivity results for a small number of large polyps (8–20 of the 10-mm and larger polyps). In our investigation, a total of 40 lesions (≥10 mm in diameter) were evaluated in an enriched cohort to rigorously test sensitivity and reader agreement. Although an enriched cohort is appropriate at this stage, spectrum bias can be imparted. This bias is defined as the error imparted by the lack of a broad spectrum of cases with disease and those without disease (30). Although a high case mix can often lead to falsely increased sensitivity, the high number of polyps per patient in our study (eg, 2.5 polyps per patient; range, 0–14) may have decreased sensitivity at a per-polyp level of analysis due to satisfaction of search.
To date, case mix has greatly varied in reported investigations, with sensitivity to detect polyps 10 mm and larger ranging from 50% to 90%. Fenlon et al (13) reported a sensitivity of 90% for detection of large polyps (n = 22) in a cohort of 100 patients (118 lesions total) recruited for being at risk of colorectal neoplasia, with 49% of patients having abnormal findings at colonoscopy. In contrast, Rex et al (12) reported a sensitivity of 50% for detection of large lesions (n = 14) in a cohort of 46 asymptomatic patients (91 lesions total) recruited for colorectal screening, with fluid retention and collapse noted to decrease the sensitivity of three large flat adenomas. In 2001, Yee et al (19) extended these findings in the largest reported prospective series, demonstrating a sensitivity of 90% for detection of lesions 10 mm and larger (n = 88) in a cohort of 300 symptomatic and asymptomatic patients. Future cohorts will need to include screening cohorts to estimate the false-positive rates in the general population, as well as surveillance cohorts.
The use of colonoscopy as the reference standard has an important influence on the validation. However, prior investigations have demonstrated that colonoscopy is not a perfect reference standard (31,32). In our investigation, we specifically define the reference unit of analysis as focal lesions found at colonoscopy, with histologic proof of individual polyps provided when possible. Although predominantly polypoid focal morphologies were evaluated in this investigation, a total of three advanced wall lesions were also included to provide a more comprehensive prospective evaluation of focal colorectal lesion detection. Discrimination between morphologic and histologic findings at colonoscopy is important. Namely, for noninvasive colorectal screening examinations such as CT, barium enema, or flexible sigmoidoscopy, the goal of the test is to detect focal morphologies to better characterize those patients who would benefit from therapeutic colonoscopy. To change the denominator of results on the basis of histologic findings could be misleading. Finally, polyp size measurement is an important influence that affects the categorization of results and clinical management. Polyp size measurement at colonoscopy can vary with different techniques, with the most accurate results obtained with an in vivo linear probe (33). Our reference standard did use the in vivo linear probe, which has not been used widely in validation studies of CT colonography to date. We preliminarily have reported differences in polyp size measurement between CT and colonoscopy (McDermott RA, McFarland EG, Brink JA, et al. Evaluation of polyp size measurement in multiobserver prospective study of CT colonography. Presented at the Society of Computed Body Tomography and Magnetic Resonance Scientific Session, Miami, Fla, March, 2001). Further investigation and protocol definition of size measurement will be important.
The CT acquisition protocol represents a challenging input variable owing to the rapid flux of technologic advancements in spatial and temporal resolution. The threshold of 10 mm for convergence of our readers' diagnostic performance is consistent with the acquisition capabilities imparted by use of 5-mm collimation with the single-detector row CT used in this study. To date, single-detector row acquisition capabilities have shown very good results of sensitivity for intermediate and large polyps acquired at 3–5-mm collimation, pitch of 1.25–2.0, and 1.5–2.0-mm reconstruction interval (13,19). In our study, section data were reconstructed at 2.0-mm intervals, followed by further linear interpolation of section data to submillimeter voxel dimensions. Reconstructing 400 sections from the CT device (for 40-cm coverage for both the prone and supine acquisitions), compared with the 800 sections that would result from use of a 1.0-mm reconstruction interval, is efficient for data management.
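The section counts quoted above follow from simple arithmetic, sketched here in Python. The function name is hypothetical, and the pitch calculation assumes that the 8-mm table increment is the table travel per gantry rotation, which the text does not state explicitly.

```python
def reconstructed_sections(coverage_mm: float, interval_mm: float,
                           positions: int = 2) -> int:
    """Number of reconstructed sections for supine plus prone acquisitions."""
    return int(coverage_mm / interval_mm) * positions


# 40-cm coverage, supine and prone, at the two reconstruction intervals.
print(reconstructed_sections(400, 2.0))  # 400
print(reconstructed_sections(400, 1.0))  # 800

# Pitch for single-detector row spiral CT: table travel per rotation
# divided by collimation (assumes the increment is per rotation).
pitch = 8 / 5
print(pitch)  # 1.6
```

A pitch of 1.6 under this assumption falls within the 1.25–2.0 range cited for prior single-detector row studies.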
The improved volumetric capabilities with multidetector row CT now permit rapid, dose-efficient, thin-collimation data acquisition, with marked improvement in partial volume effects. Decreased respiratory artifacts and enhanced temporal resolution have been reported with use of 5-mm effective section thickness (18), and improved spatial resolution with excellent dose efficiency has been demonstrated with 1-mm effective section thickness at 50 mAs (effective) (20). As acquisition capabilities continue to improve the depiction of smaller anatomic structures, optimized protocols are needed to improve detection of clinically important lesions, without increasing false-positive rates or data management burdens.
In addition to inherent spatial resolution influences of the CT protocol, the patient-related CT acquisition factors (eg, fluid retention, collapse, stool retention) are important to discriminate. In this study, overall image quality scores for collapse and fluid retention were predominantly of diagnostic quality. The lowest visualization scores were present in the sigmoid colon, which had the highest number of false-negative findings. Although there were individual false-negative cases associated with higher fluid retention scores, these trends were weak. Future optimization of patient bowel preparation and insufflation may prove to be more important than continual improvements in spatial resolution capabilities.
Image-display protocols are also in rapid evolution and strongly influence reader performance. Consistent with current practice, this study used 2D MPR as the primary screening modality. This display method imparts high sensitivity for detection of focal lesions and provides ease of navigation from an extraluminal orientation. Interactive use of 3D endoscopic views was used to improve characterization of lesions in our current study design. Future developments in 3D image processing, such as automated flight paths, extended field-of-view image-display techniques (34,35), stool tagging and subtraction (36,37), and computer-aided diagnosis (38), will require further evaluation.
The statistical method commonly used to determine diagnostic performance of CT colonography has mainly been sensitivity, stratified by polyp size. Results per patient are of greater clinical importance, since patients are the unit of analysis most germane to clinical decision making. We demonstrated excellent sensitivity results per patient (in contrast to the variability of results per individual polyp size), with a high pooled sensitivity of 0.88 among patients with polyps 10 mm and larger in diameter. Any patient analysis can be positive based on individual false-positive response(s). We evaluated this influence, recognizing that our polyp-rich cohort increased the likelihood of a positive bias. Only 2% of patients with polyps 10 mm and larger were categorized as true-positive due to individual false-positive responses, but this increased to 24% in patients with 6–9-mm polyps. Estimates of specificity per patient will also be increasingly important as authors of studies begin to assess cost-effectiveness in screening cohorts. Of note, definition of the lower limit of lesion size, which constitutes a negative patient, is important to delineate. Although our sample size of negative patients was not designed to adequately estimate specificity, we document specificity to be 60% on the basis of patients defined as negative with polyp(s) 5 mm or smaller in diameter.
Unlike previous studies that relied on traditional analyses of accuracy, we also analyzed our results at different thresholds of polyp diameter. We found that reader results optimized and converged at polyp diameters of approximately 10 mm. This approach may be particularly important in disease states in which the lower size limit of clinically important lesions for screening purposes is not well defined. For colorectal polyps, there are well-established clinical guidelines for management of lesions 10 mm and larger in diameter (39,40). However, there is more variability in clinical management of smaller lesions (41–43). In addition, the clinical relevance and sensitivity of detection of flat adenomas are controversial (44). Future evaluations will require multidisciplinary agreement to define the lower limits of polyp sizes and morphologies that are clinically important.
In addition to diagnostic performance, reader agreement in CT colonography is important but has been less well studied to date. In a prior study of 50 patients with two teams of readers (gastroenterologist and radiologist), variable interobserver agreement was attributed to learning-curve effects (20). In a retrospective evaluation of 30 selected colonic segments, interobserver agreement among three abdominal radiologists was variable (κ = 0.53–0.80) with 2D MPR, whereas intraobserver agreement was higher (κ = 0.6–1.0) with different 2D and 3D image-display techniques (21). In our current prospective study of four experienced abdominal radiologists, we found that overall interobserver agreement for all patients was 79%, which increased to 94% for patients with lesions 10 mm and larger in diameter. The high case mix of this polyp-rich cohort may have accentuated differences in reader persistence following identification of at least one positive finding. Two readers tended to persist longer in polyp detection in cases of either high polyp number or suboptimal CT image quality. These readers had longer reading times and higher sensitivity per polyp; however, all four readers had similar sensitivity per patient. More variable trends were seen in individual false-positive findings; however, three readers demonstrated similar specificity per patient. Definition of training protocols for new readers will be an important next step to aid implementation in community settings.
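As an illustration of the agreement measures mentioned above, the following sketch computes raw percent agreement and Cohen's kappa for pairs of readers making binary per-patient calls. This section does not state how agreement among the four readers was computed, so averaging over reader pairs is an assumption made here for illustration; the function names and the toy reader calls are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List


def percent_agreement(a: List[int], b: List[int]) -> float:
    """Fraction of patients on which two readers gave the same binary call."""
    if len(a) != len(b) or not a:
        raise ValueError("readers must rate the same non-empty set of patients")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: List[int], b: List[int]) -> float:
    """Cohen's kappa for two readers with binary (0/1) calls."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    p_a = sum(a) / n  # proportion of positive calls by reader a
    p_b = sum(b) / n  # proportion of positive calls by reader b
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if p_expected == 1.0:
        return math.nan  # kappa is undefined when chance agreement is complete
    return (p_observed - p_expected) / (1 - p_expected)


def mean_pairwise_agreement(readers: Dict[str, List[int]]) -> float:
    """Average percent agreement over all reader pairs (an assumed summary)."""
    pairs = list(combinations(readers.values(), 2))
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical per-patient calls (1 = positive) for illustration only.
toy_readers = {
    "reader_1": [1, 1, 0, 0],
    "reader_2": [1, 0, 0, 0],
    "reader_3": [1, 1, 0, 1],
}

if __name__ == "__main__":
    print(cohen_kappa(toy_readers["reader_1"], toy_readers["reader_2"]))
    print(mean_pairwise_agreement(toy_readers))
```

Percent agreement is easy to interpret but does not correct for agreement expected by chance, which is why kappa is often reported alongside it; in a polyp-rich cohort with many positive patients, the two measures can diverge noticeably.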
Although our study results are promising, each study represents only one stage of assessment along a curve of continual change in acquisition and image-display techniques. Readers with less expertise will need to be trained in more generalizable patient cohorts for the appropriate detection and characterization of various colorectal morphologies. However, the definition of test expectation will greatly affect future evaluations and implementation. Should the expectation be to detect every polyp or every "clinically important" polyp? Multidisciplinary collaboration will be useful to develop clinical management algorithms, which could help define the lower limits of polyp sizes that are clinically relevant in different patient cohorts, stratified by age, a priori risk, and comorbidity. Evidence-based outcome analyses can then provide a framework for implementing efficient study designs and help assess cost-benefit considerations. Ultimately, future evaluations will determine the role of CT colonography among diagnostic tools intended to reduce colorectal cancer mortality.
| FOOTNOTES |
|---|
Abbreviations: MPR = multiplanar reformation, PVR = perspective volume rendering, 3D = three dimensional, 2D = two dimensional
Author contributions: Guarantors of integrity of entire study, E.G.M., T.K.P., J.A.B., B.L.; study concepts, E.G.M., T.K.P., J.A.B., B.L.; study design, E.G.M., T.K.P., J.A.B., J.P.H., D.M.B., B.L., E.P.T., L.B.W.; literature research, E.G.M., R.A.M., C.V.S., P.W.B.; clinical studies, all authors; data acquisition, E.G.M., J.P.H., D.M.B., L.B.W., E.P.T.; data analysis/interpretation, E.G.M., T.K.P., J.A.B., R.A.M., C.V.S., P.W.B., J.P.H., D.M.B., B.L.; statistical analysis, T.K.P., E.G.M., B.L., J.A.B.; manuscript preparation and definition of intellectual content, E.G.M., T.K.P., J.A.B., B.L.; manuscript editing, E.G.M., T.K.P., J.A.B., R.A.M., J.P.H., D.M.B., L.B.W., E.P.T., B.L.; manuscript revision/review and final version approval, all authors.