Thoracic Imaging
1 From the Department of Radiology, University of Pittsburgh, Imaging Research, Suite 4200, 300 Halket St, Pittsburgh, PA 15213 (B.E.C., D.G.); and Department of Radiology, Weill Medical College of Cornell University, the New York Hospital-Cornell Medical Center, New York, NY (D.F.Y., C.I.H.). Received January 9, 2004; revision requested March 10; revision received March 29; accepted April 28. Address correspondence to B.E.C. (e-mail: chapmanbe@upmc.edu).
| ABSTRACT |
|---|
MATERIALS AND METHODS: Monte Carlo simulations of lung cancer screening programs were performed in subjects at high risk for developing cancer. The effects of detection probabilities, symptomatic presentation of tumors, tumor volume doubling time, and time between screenings were examined. Computed tomography (CT) and chest radiography models were used.
RESULTS: For imperfect detection probabilities, the percentage of subjects with cancers detected with repeated screenings decreased to a steady-state value. The transition period was the period during which screenings were performed and detection rates decreased. At steady-state repeat screening, the proportion of subjects with cancers diagnosed at screening or by means of symptomatic presentation was determined by the annual probability of developing cancer and not by the sensitivity of the screening modality. The sensitivity of the screening technique did affect detected cancer size, number of interval cancers, and total number of cancers observed. CT was used to detect more total cancers over the course of the screening program and cancers with a smaller average size; moreover, fewer interval cancers were observed with CT screening than with chest radiography screening.
CONCLUSION: Lung cancer screening with imperfect detection has a transition period between baseline screening and steady-state behavior of annual screenings. Advantages of CT screening include a decrease in the average cancer size at detection, a decrease in the number of observed interval cancers, and an increase in the total number of cancers observed. Steady-state behavior indicates that long-term trials of screening may not be necessary.
© RSNA, 2005
| INTRODUCTION |
|---|
A perfect screening modality would be able to depict any cancer with a probability of one. In reality, all screening modalities have imperfect sensitivities, in that they can only depict cancers larger than some action size (ie, the minimum tumor diameter for which the radiologist will record a positive or semipositive finding [13]). Tumors that are larger than the action size require different follow-up than tumors that are smaller than the action size, and the probability of detecting cancers larger than the action size is less than one. These detection imperfections may be due to the technology, the observer, or both. Imperfections affect the temporal dynamics of a cancer screening program, including the number of cancers detected at baseline screening, the duration of the transition period (ie, the period between baseline screening and steady-state screening during which the percentage of screening subjects with detected cancers at each screening is decreasing), and the size (hence, the potential curability) of cancers observed.
Lung cancer screening can be viewed as consisting of four phases. The first phase consists of disease development prior to screening. The second phase consists of baseline screening, when asymptomatic subjects are imaged. The third phase consists of transition screenings, when the percentage of subjects with detected cancers gradually decreases during repeated screening until a steady state is reached. The fourth phase consists of a steady-state condition, during which any repeated screening behaves largely the same. These phases are described in more detail in the Appendix.
Mathematic modeling of cancer screening has been presented by other researchers (14–16). On the basis of results of screening programs conducted in the 1980s, Flehinger and Kimmel (16) modeled lung cancer screening and focused on the effect on mortality. To our knowledge, the temporal dynamics of a lung cancer screening program based on a more sensitive test, such as multidetector row helical CT, have not been described in detail. These dynamics are of interest in the design of screening trials in which the number of positive cases during the study must be estimated.
Thus, the purpose of our study was to use a mathematic model to demonstrate the effect of imperfect detection on the temporal dynamics of radiologic lung cancer screening programs.
| MATERIALS AND METHODS |
|---|
)^(1/3). Published values for VDT vary. Usuda et al (20) reported a mean VDT of 163.7 days. By contrast, Hasegawa et al (19) reported a mean VDT for adenocarcinomas of up to 452 days; however, their study included subtypes that were not typical solid tumors. In a review of the Mayo Lung Project and the Memorial Sloan-Kettering Cancer Center lung screening data, Yankelevitz et al (21) reported mean VDT values of 101 and 144 days, respectively. They suggested that reported mean VDT values of more than 400 days could be due to overdiagnosis. In our simulation, we chose a middle ground between the Mayo Lung Project and Memorial Sloan-Kettering studies; thus, we modeled VDT as a Gaussian distribution with a population mean of 125 days (standard deviation, 30 days). The standard deviation was chosen to approximate the minimum doubling times observed in these studies. We did not attempt to model the effect of a skewed tail resulting from slow-growing cancers. Because we used a Gaussian function, there is a very small probability that a cancer will have a computed negative VDT value. We do not explicitly address this issue; however, we assume that these few cancers are actually initiated but never grow.
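The growth assumptions described above can be sketched numerically. The following Python fragment is only an illustration and is not the simulation code used in the study: it assumes exponential volume growth, so that diameter scales with the cube root of volume, and it borrows an initial diameter of 2 µm from the lower end of the range examined in the sensitivity analysis. All function and constant names are hypothetical.

```python
import math

D0_MM = 0.002  # assumed initial diameter (2 µm), borrowed from the sensitivity analysis

def diameter_mm(t_days, vdt_days, d0_mm=D0_MM):
    """Diameter after t_days if volume doubles every vdt_days (diameter ~ volume**(1/3))."""
    return d0_mm * 2.0 ** (t_days / (3.0 * vdt_days))

def days_to_reach(d_mm, vdt_days, d0_mm=D0_MM):
    """Days needed to grow from d0_mm to d_mm under the same assumption."""
    return 3.0 * vdt_days * math.log2(d_mm / d0_mm)

def prob_nonpositive_vdt(mean=125.0, sd=30.0):
    """P(VDT <= 0) for a Gaussian VDT, computed from the normal CDF."""
    return 0.5 * (1.0 + math.erf((0.0 - mean) / (sd * math.sqrt(2.0))))

# Time for a tumor with the mean VDT to reach the 40-mm symptomatic threshold.
years_to_40mm = days_to_reach(40.0, 125.0) / 365.0
# Probability that the Gaussian draw yields a nonpositive doubling time.
p_negative = prob_nonpositive_vdt()
```

Under these assumptions, a tumor with the mean VDT of 125 days would take roughly 14.7 years to reach 40 mm, and the probability of drawing a nonpositive VDT is on the order of 1 in 100 000, which is consistent with the statement that such cases are very rare.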
When cancers reached a critical diameter, they were assumed to become symptomatic; patients with such cancers were removed from the screening population. The actual size when a cancer becomes symptomatic is not always well defined. Cancers that are detected between screenings in symptomatic subjects are referred to as interval cancers. Not all interval cancers are symptomatic, however, because interval cancers also include cancers detected incidentally during other diagnostic studies that are not part of the screening. In the chest radiography screening study performed at Memorial Sloan-Kettering Cancer Center (New York Screening Program), Heelan et al (22) found that interval cancers have a mean size of 39 mm (range, 11–93 mm). These interval cancers included all incidental cancers found with radiologic studies performed for any reason and all cancers in patients who presented with symptoms. By contrast, the mean size of cancers detected with the actual screening program was 24 mm. For this simulation, we chose a threshold of 40 mm for symptomatic presentation. We did not model the effect of smaller cancers being detected incidentally during other imaging studies. We did not model other causes of death or removal from the screening program; however, we assume that these would be equally distributed throughout all patients (ie, both those with and those without cancer) in the study and would not affect the proportions we are examining.
Cancer detection. To simplify the model, we assumed that cancer detection was independent from year to year; namely, the availability of results of prior studies did not affect detectability. We modeled cancer detection by using several different functions. Cancer detection with CT was based on the experience at Weill Medical College of Cornell University, New York, NY, in the Early Lung Cancer Action Project (Henschke CI, unpublished data, 1999). We used the following tumor diameters to define CT detection probability (pdet):
[Equation 1]
From cancer detection with chest radiography in the same data set, we modeled the chest radiography detection function as follows:
[Equation 2]
Finally, for the purpose of illustration, we examined a simple detection function:
[Equation 3]
For these simulations, we set the action diameter to be 2 mm, which was the size of the smallest detectable cancer in the CT model.
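As an illustration of the simple detection function, the sketch below assumes a step function that is zero below the action diameter and constant at or above it. The value 0.5 is an arbitrary example rather than a parameter reported in this study, and the function names are hypothetical.

```python
import random

ACTION_DIAMETER_MM = 2.0  # action diameter stated in the text

def constant_detection(p_above):
    """Return a detection function: 0 below the action diameter, p_above at or above it."""
    def p_det(diameter_mm):
        return p_above if diameter_mm >= ACTION_DIAMETER_MM else 0.0
    return p_det

def is_detected(diameter_mm, p_det, rng):
    """Draw one independent screening outcome for a tumor of the given diameter."""
    return rng.random() < p_det(diameter_mm)

p_det = constant_detection(0.5)   # 0.5 is an arbitrary illustrative value
rng = random.Random(0)            # fixed seed so the example is repeatable
# Fraction of 10 000 independent screenings that detect a 10-mm tumor.
hits = sum(is_detected(10.0, p_det, rng) for _ in range(10_000))
```

Because each call to `is_detected` draws a fresh random number, repeated screenings of the same tumor are independent, which mirrors the independence assumption stated above.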
When cancer was observed in a subject, that subject was removed from the screening population. Thus, as the course of the screening program progressed, the number of screening subjects decreased.
Initial conditions. Initial conditions for the screening population were created by simulating an initially cancer-free population for 20 years without a screening intervention. The initial population size was 100 000 patients, and the annual probability of developing cancer was 0.005. In all the screening simulations, multiple cancers within a patient were not modeled. Patients who developed cancers larger than the symptomatic size were removed from the population. At the end of 20 years, 10 000 subjects were randomly selected from the remaining pool of patients without symptoms to undergo screening; baseline screening was performed to initiate the screening program. By 20 years, both the population cancer burden and the average cancer size had leveled off, which indicated that a reasonable approximation of a steady-state value had been reached (Fig 1). The population tumor burden continued to increase slowly, however, since symptomatic subjects were not replaced with tumor-free subjects.
Summary of model assumptions. We have summarized some of the more important assumptions of the models. First, we assumed sharp thresholds for both the action size and the size when a cancer becomes symptomatic. Second, we assumed that each screening was independent (ie, results of prior studies were not examined). Third, we assumed that there were no deaths or reasons to be removed from the screening program other than detection of or manifestation of lung cancer. Fourth, we assumed that cancer incidence remained constant over the course of the screening program. Fifth, we assumed a Gaussian distribution of VDT. Our model was intended to be a simple first-order approximation.
Monte Carlo Simulations
The simulation was written with Interactive Data Language, version 5.6 (Research Systems, Boulder, Colo). For each case, 3000 runs of the model were computed. The reported values represent the mean value across the 3000-run output for each set of input parameters. We performed baseline simulations for the constant, CT, and chest radiography detection functions. For these baseline simulations, we assumed an incidence of 0.005, a Gaussian-distributed VDT with a mean of 125 days ± 30 (standard deviation), and a critical diameter of 40 mm.
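A much-reduced version of such a simulation can be sketched in Python. This is not the Interactive Data Language code used in the study, and it makes several simplifying assumptions that should be read as such: a constant detection probability above a 2-mm action diameter stands in for the CT and chest radiography functions, the initial diameter is taken to be 2 µm, screening is annual, every asymptomatic survivor of the start-up period is screened (rather than a random sample of 10 000), and a single run replaces the 3000-run average. All names are hypothetical.

```python
import math
import random

def simulate_screening(n_subjects=20_000, burn_in_years=20, n_screens=8,
                       incidence=0.005, vdt_mean=125.0, vdt_sd=30.0,
                       d0_mm=0.002, d_action_mm=2.0, d_symptomatic_mm=40.0,
                       p_detect=0.5, seed=0):
    """Return per-round counts of screen-detected and interval cancers."""
    rng = random.Random(seed)
    rounds = [{"detected": 0, "interval": 0} for _ in range(n_screens)]
    horizon = burn_in_years + n_screens
    for _ in range(n_subjects):
        # Find the year in which this subject's single cancer starts, if any.
        onset = None
        for year in range(horizon):
            if rng.random() < incidence:
                onset = year + rng.random()  # uniform start time within the year
                break
        if onset is None:
            continue
        vdt = rng.gauss(vdt_mean, vdt_sd)
        if vdt <= 0:
            continue  # treated as initiated but never growing
        years_per_doubling = 3.0 * vdt / 365.0  # years for the diameter to double
        t_action = onset + years_per_doubling * math.log2(d_action_mm / d0_mm)
        t_sympt = onset + years_per_doubling * math.log2(d_symptomatic_mm / d0_mm)
        if t_sympt < burn_in_years:
            continue  # presented with symptoms before screening began
        for k in range(n_screens):
            t = burn_in_years + k  # time of screening round k
            if t_sympt < t:
                rounds[k - 1]["interval"] += 1  # symptomatic since the previous round
                break
            if t >= t_action and rng.random() < p_detect:
                rounds[k]["detected"] += 1
                break
        else:
            if t_sympt < horizon:
                rounds[-1]["interval"] += 1  # symptomatic after the last round
    return rounds

perfect = simulate_screening(p_detect=1.0)    # idealized detection
imperfect = simulate_screening(p_detect=0.3)  # arbitrary imperfect detection
```

In this sketch, round 0 is the baseline screening, and the interval count for round k refers to the period between round k and round k + 1. Comparing `perfect` with `imperfect` should show a larger baseline yield and fewer interval cancers when detection is perfect; the absolute counts depend on the simplifications listed above and are not expected to match the reported values.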
Sensitivity Analysis
We performed sensitivity analysis to assess the dependence of our model on assumed parameter values for critical diameter, VDT, screening interval, detection probabilities, initial cancer size, and cancer incidence.
We varied critical diameter between 30 and 60 mm to examine the dependence of the model on the value of critical diameter. This range is narrower than the range reported by Heelan et al (22) because we modeled only the average symptomatic group and because we did not include incidental cancers. We varied the mean VDT from 100 to 150 days and kept the standard deviation of VDT constant at 30 days. We varied the screening intervals from one screening every 4 months to one screening every 2 years. We examined the effect of different detection probabilities on behavior during the steady state. We adjusted the detection functions by shifting the detection probabilities for the cancer diameter by 1 mm. That is, we modified Equations (1) and (2) to use the given detection probability for a cancer with a smaller or larger diameter. We also varied the initial cancer diameter (2–20 µm). Finally, we varied the underlying cancer incidence (0.3%–0.7%). For each of these analyses, we used baseline values for the fixed parameters.
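One way to read the 1-mm shift of the detection functions is as a change of argument: the shifted function returns, for a given diameter, the probability that the original function assigned to a diameter 1 mm larger or smaller. The sketch below illustrates that reading with a hypothetical step function; it does not reproduce Equations (1) or (2), and the sign convention is an assumption.

```python
def shift_detection(p_det, shift_mm):
    """Return a detection function that applies p_det to (diameter + shift_mm).

    shift_mm > 0 makes the modality behave as if it were sensitive to tumors
    that much smaller; shift_mm < 0 as if sensitive only to larger tumors.
    """
    def shifted(diameter_mm):
        return p_det(diameter_mm + shift_mm)
    return shifted

def example_p_det(diameter_mm):
    """Hypothetical step function used only to demonstrate the shift."""
    return 0.8 if diameter_mm >= 2.0 else 0.0

more_sensitive = shift_detection(example_p_det, +1.0)  # threshold effectively at 1 mm
less_sensitive = shift_detection(example_p_det, -1.0)  # threshold effectively at 3 mm
```

Wrapping the function rather than editing its thresholds keeps the shape of the detection curve unchanged, so only its position along the diameter axis varies.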
| RESULTS |
|---|
At the initiation of the screening program, the expected tumor burden (ie, percentage of screening population with cancer of any size) was 7.1% (708 of 10 000 patients). These cancers had an average diameter of 4 mm, an average duration of 7.8 years, and an average VDT of 132 days (Fig 2). The shift in the VDT distribution away from the population mean of 125 days is due to more aggressive cancers becoming symptomatic earlier and being removed from the start-up population prior to the initiation of screening. Note that the histograms in Figure 2 include all cancers present (ie, detected and undetected) in the population.
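The baseline values above can be cross-checked with a back-of-envelope calculation, which is an illustration and not part of the reported simulation. It assumes a steady state with constant incidence, exponential growth from an initial diameter of 2 µm (borrowed from the sensitivity analysis), and removal at 40 mm. Under those assumptions a cancer's time in the population is proportional to its VDT, so slow-growing cancers are over-represented among prevalent cancers in proportion to their VDT.

```python
import math

MU, SD = 125.0, 30.0                        # VDT mean and standard deviation (days)
DOUBLINGS = 3.0 * math.log2(40.0 / 0.002)   # volume doublings from 2 µm to 40 mm

# Length-biased mean VDT among prevalent cancers: E[V^2] / E[V] = mu + sd^2 / mu
prevalent_mean_vdt = MU + SD ** 2 / MU

# Mean time since initiation among prevalent cancers: E[T^2] / (2 E[T]),
# where the asymptomatic lifetime is T = DOUBLINGS * V
prevalent_mean_age_years = DOUBLINGS * prevalent_mean_vdt / 2.0 / 365.0

# Steady-state tumor burden: annual incidence times mean asymptomatic lifetime (years)
tumor_burden = 0.005 * DOUBLINGS * MU / 365.0
```

With these inputs the fragment gives a prevalent mean VDT of about 132 days and a mean duration of about 7.8 years, in line with the simulated values. The estimated burden of about 7.3% is slightly above the simulated 7.1%, which might be expected if the 20-year start-up period had not fully reached steady state.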
For constant detection functions (Fig 3), we observed a decrease in the number of cancers detected at baseline and an increase in the duration of the transition period as the detection probability decreased (Fig 3a). There was also a slight decrease in the percentage of patients with cancers detected with radiology during the steady state. Ultimately, however, the total percentage of subjects with observed cancers (ie, interval cancers or cancers detected with screening) is identical, regardless of detection probability (Fig 3b). As detection probability decreases, the mean diameter of detected cancers increases noticeably (Fig 4).
VDT. As the mean VDT varied from 100 to 150 days, the expected percentage of interval cancers observed during steady state ranged from 0.51% (0.228 of 44.958 cancers) to 0.0045% (0.002 of 44.632 cancers) for CT and from 26.2% (11.9023 of 45.4323 cancers) to 11.1% (5.0140 of 45.3240 cancers) for chest radiography. Over the same range of VDT values, the expected percentage of subjects with cancers that were detected at baseline ranged from 1.5% (152.94 of 10 000 patients) to 2.1% (212.87 of 10 000 patients) for CT and from 0.47% (46.79 of 10 000 patients) to 0.61% (60.62 of 10 000 patients) for chest radiography. As VDT increases, mean size of detected cancer in the steady state decreases from 5.3 to 3.8 mm for CT and from 22.5 to 20.9 mm for chest radiography. The change in expected size of the cancers detected at baseline screening was very small.
Screening interval. When screening was performed at 4-month intervals rather than at yearly intervals, the expected percentage of observed cancers that were interval cancers decreased from 0.075% (0.0333 of 44.6133 cancers) to 0.0082% (0.0013 of 15.8013 cancers) with CT and from 16.8% (7.599 of 45.109 cancers) to 0.75% (0.1203 of 16.0703 cancers) with chest radiography. When the screening interval was extended to 2 years, 1.49% (1.2143 of 81.5743 cancers) of cancers observed with CT were interval cancers, while 46.4% (38.5103 of 83.0803 cancers) of cancers observed with chest radiography were interval cancers. Over the range of screening intervals, the mean size of detected cancers ranged from 2.6 to 7.7 mm for CT and from 15.5 to 23.5 mm for chest radiography. The mean tumor duration of detected cancers ranged from 10.56 to 11.86 years with CT and from 13.13 to 14.43 years with chest radiography.
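The dependence on screening interval can be made concrete by counting how many screening rounds fall, on average, between the time a tumor reaches the action size and the time it becomes symptomatic. The sketch below is illustrative only; it assumes exponential growth, the 2-mm action diameter of the CT model, and the 40-mm symptomatic threshold, and the function name is hypothetical.

```python
import math

def screens_in_window(vdt_days, interval_days, d_action_mm=2.0, d_symptomatic_mm=40.0):
    """Average number of screening rounds between action size and symptomatic size."""
    window_days = 3.0 * vdt_days * math.log2(d_symptomatic_mm / d_action_mm)
    return window_days / interval_days

# Mean VDT of 125 days, three screening intervals from the sensitivity analysis.
for label, interval in [("4 months", 365.0 / 3), ("1 year", 365.0), ("2 years", 730.0)]:
    print(f"{label}: {screens_in_window(125.0, interval):.1f} rounds")
```

For a tumor with the mean VDT, this gives roughly 13 rounds at 4-month intervals, about 4.4 rounds at yearly intervals, and about 2.2 rounds at 2-year intervals. A modality that becomes sensitive only at larger diameters, as chest radiography does, would have a shorter window and correspondingly fewer opportunities.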
Detection probability. When the CT scanner was made sensitive to cancers 1 mm smaller, the average diameter of a detected cancer decreased from 4.3 to 4.2 mm, and the expected proportion of observed cancers that were interval cancers decreased to 0.06% (0.0263 of 45.0363 cancers). When the CT scanner was made sensitive to cancers 1 mm larger, the average diameter of a detected cancer increased to 6.2 mm, and the expected percentage of observed cancers that were interval cancers increased to 0.1% (0.0477 of 44.9677 cancers). When chest radiography was made sensitive to cancers 1 mm smaller, the average diameter of a detected cancer decreased from 21.8 to 20.4 mm, and the expected percentage of observed cancers that were interval cancers decreased to 14.5% (6.567 of 45.187 cancers). When chest radiography was made sensitive to cancers 1 mm larger, the average diameter of a detected cancer increased to 23.0 mm, and the expected percentage of interval cancers increased to 19.3% (8.7707 of 45.3907 cancers).
Initial cancer size. As the initial cancer diameter increased from 2 µm to 10 µm and 20 µm, the mean duration of a cancer detected with CT during the steady state decreased from 11.0 to 8.7 and 7.7 years, respectively. For chest radiography, the mean duration decreased from 14.0 to 11.5 and 10.5 years, respectively. There were no changes in the number of cancers detected at baseline screening, the average size of a detected cancer at steady state, or the proportion of interval cancers at steady state.
Cancer incidence. The expected percentage of patients with cancers detected with baseline screening ranged from 1.1% (113.67 of 10 000 patients) to 2.6% (262.54 of 10 000 patients) for CT and from 0.34% (34.31 of 10 000 patients) to 0.80% (79.72 of 10 000 patients) for chest radiography. Neither the expected proportion of observed cancers that were interval cancers nor the average size of detected cancers changed during the steady state.
| DISCUSSION |
|---|
The percentage of subjects with observed cancer in the steady state is ultimately determined by the incidence of new cancers that develop each year and is independent of the screening modality. By using constant detection functions, we demonstrated that lower detection probabilities resulted in increased size and, hence, increased stage of detected cancers, as the cancers had additional screening cycles to grow.
This behavior was also illustrated with the CT and chest radiography detection probabilities. The total percentage of observed steady-state cancers in the screening population is the same for CT and chest radiography. Because chest radiography tends to detect larger cancers than does CT, however, a greater percentage of steady-state cancers are interval cancers. Also, because CT detects more cancers than does chest radiography at baseline screening, the screening population is smaller in the steady state, which reduces the actual number of cancers observed during each screening interval.
These simulations showed that CT screening has the following advantages over chest radiography: First, more total cancers will be observed with a CT-based screening program than with a chest radiography screening program. This difference in observed cancers is a result of the fact that more cancers are detected with CT than with chest radiography at baseline screening. Even though the same percentage of screening subjects who undergo chest radiography and CT have observed cancers in the steady state, chest radiography never catches up with CT. Second and more important, because CT is more sensitive than chest radiography in the detection of small cancers, CT detects cancers at an earlier stage of their development; this results in a substantial reduction in the number of interval cancers and potentially reduces the mortality associated with this disease. Because of its high sensitivity, CT can result in almost no symptomatic presentation of cancers if the screening interval is short enough relative to the tumor VDT; this was demonstrated in the sensitivity analysis for both VDT and the screening interval.
There are three important limitations to this study. First, we have presented a simple model, and our results are dependent on our model assumptions. It was not our intent to replicate the precise values observed in various screening programs. Rather, we wanted to use a simple numeric model to illustrate the oft-overlooked dynamics of a screening process. For example, our threshold for symptomatic cancers is an oversimplification. Published results show a large overlap between the average sizes of screening-detected cancers and interval cancers. Hence, we would not expect to achieve a close numeric match. We described cancer growth as a simple exponential function. Cancer growth may be modeled more accurately by using Gompertzian functions, which account for saturated growth rates demonstrated in larger tumors (23–25). We do not believe that accounting for the saturated growth rates would affect the temporal dynamics, but we would see a change in the proportion of interval cancers. We also used a simple symmetric distribution for VDT; thus, we did not capture the effects of extremely slow-growing cancers. The detection probabilities we used were based on our own experience and may not reflect detection probabilities observed by others. Since we assumed that the performance and availability of prior studies did not affect detection probabilities in subsequent examinations, we probably underestimated detection during follow-up screenings. In reality, detection will vary with observer skill and the operating point selected for balancing sensitivity and specificity, as will be discussed later. We implicitly assume that the sensitivities are representative of the skill and operating point of the typical radiologist. The simulation is most dependent on detection probabilities, VDT, and critical diameter values. As these values decrease, the number of cancers detected at baseline screening decreases, and the number of interval cancers and the mean size of detected cancers increase.
A second limitation of this study is that we did not account for mortality from other causes; therefore, we could not address overdiagnosis (21,26). The commonly presented view is that overdiagnosis occurs when cancer is diagnosed, but the subject dies of other causes before he or she dies of cancer. Overdiagnosis occurs most often at baseline and transition screenings, when very slow-growing cancers have grown to a detectable size prior to initiation of the screening program. With the common viewpoint, however, even fast-growing cancers can be considered overdiagnosed if the subject dies of other causes. Thus, overdiagnosis can occur throughout the screening program. Assuming that mortality rates are approximately constant throughout the screening program, however, overdiagnosis of a typical cancer will occur uniformly throughout the screening program and will not affect the relative proportions we have presented. Techniques such as CT, which are sensitive to small cancers, can potentially produce more cases of overdiagnosis than can techniques such as chest radiography, which are less sensitive to small cancers, because the cancers are detected earlier, thus increasing the temporal interval between diagnosis and death due to cancer. This increased temporal interval increases the probability that death will occur because of causes other than cancer after cancer has been detected.
A third limitation of this study is that we did not address the effect of specificity on the screening program. As recently pointed out by Hillman (27), the cost of unnecessary biopsies and other diagnostic tests as a result of false-positive findings can potentially exceed the cost of screening, thus reducing and potentially eliminating the benefits of screening. False-positive CT findings are believed to be of particular interest because CT is highly sensitive to small structures; hence, the potential cost-effectiveness of CT screening for lung cancer is controversial. In two recent reports in which this problem was evaluated (28,29), the researchers reached opposite conclusions. A model that allows lung cancer screening programs to be more fully evaluated would need to include the presence of benign lung nodules and account for the costs associated with procedures and interventions subsequent to screening.
The temporal dynamics we have illustrated have implications in the design of studies intended to demonstrate the benefit of lung cancer screening. In particular, our results indicate that with imperfect detection, the steady state should not be assumed at the first repeated screening. As the screening program proceeds, the presence of a steady state can be appreciated by plotting either the percentage of screening subjects with observed cancers or the average diameter of screening-detected cancers or by using the approximate mathematic technique, as described in the Appendix. The steady state demonstrated with this model indicates that long-term trials of screening may not be necessary because repeated screenings produce nearly identical results after 2 or 3 years.
| APPENDIX |
|---|
Phase One: Disease Development
Prior to the initiation of screening, cancers develop in patients. As cancers become symptomatic, patients seek treatment and are no longer part of the potential screening population.
Phase Two: Baseline Screening
Once screening is initiated, we can view the asymptomatic cancers as being in one of three categories. The first category includes cancers that are smaller than the action size. These cancers either would not be observed or would be observed, but the radiologist would not recommend any action regarding the cancer. The second category includes cancers that are larger than the action size and are detected with the screening modality. The third category includes cancers that are larger than the action size but are missed with the screening modality. These categories persist throughout the screening program. If the probability of detection of cancers larger than the action size is one, then there are no missed cancers. The actual detection probability of the imaging modality is dependent on the physical characteristics of the imaging system, the criteria for interpreting the images, and the performance of the reader.
Phase Three: Transition Screenings
If the probability of detection (ie, sensitivity) for cancers larger than the action size was one, the screening program would achieve a steady-state condition immediately after baseline screening. If the imaging modality and observer have a sensitivity of less than one in the detection of cancers larger than the action size, however, some larger cancers will be missed with baseline screening. These cancers will persist in subjects for subsequent screening periods. The presence of these additional cancers results in a greater number of cancers that are observed during screening periods immediately after baseline screening than during later screening periods. For a given probability pd of detecting a cancer larger than the action size, the probability of a cancer being missed at a single screening is 1 − pd. Assuming a uniform detection probability for all cancers larger than the action size and assuming that each screening is independent of prior screenings, after a number (N) of screenings, the probability of a cancer being missed at every screening is (1 − pd)^N. Although this probability is nonzero for all finite N and pd less than one, it eventually approaches a negligible value, or the missed cancers become symptomatic and are removed from the screening population. The duration of the transition period can be estimated by computing which value of N leads to a negligible value for (1 − pd)^N. In reality, pd increases as cancers become larger; this results in the probability of N misses approaching zero more rapidly than described previously, with a consequently shorter transition phase.
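The estimate of the transition period can be illustrated with a short calculation. The sketch below assumes a uniform detection probability and independent screenings, as in the text; the threshold of 0.05 for "negligible" is an arbitrary choice made for the example, and the function names are hypothetical.

```python
import math

def miss_probability(p_d, n_screens):
    """Probability that a tumor above the action size is missed at all n_screens rounds."""
    return (1.0 - p_d) ** n_screens

def transition_rounds(p_d, negligible=0.05):
    """Smallest N for which (1 - p_d)**N does not exceed the 'negligible' threshold."""
    if not 0.0 < p_d <= 1.0:
        raise ValueError("p_d must be in (0, 1]")
    if p_d == 1.0:
        return 1  # a single round detects every tumor above the action size
    return math.ceil(math.log(negligible) / math.log(1.0 - p_d))

# Transition length for three illustrative detection probabilities.
estimates = {p_d: transition_rounds(p_d) for p_d in (0.3, 0.5, 0.9)}
```

With this threshold, a detection probability of 0.9 gives a transition of 2 screenings, 0.5 gives 5, and 0.3 gives 9. As noted above, these figures are upper bounds in practice because pd rises as tumors grow and because missed tumors eventually present with symptoms.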
Phase Four: Steady-State Screenings
Assuming that the annual incidence of cancer remains constant in the screening population, after cancers that are present in the population before the initiation of the screening are detected (either at baseline screening or transition screenings) or present symptomatically (interval cancers), the percentage of screening subjects with cancers detected with periodic screening will approach a steady-state condition where an equal number of cancers are detected during each screening. Cancers that become symptomatic will be detected clinically during the interval between screenings (interval cancers). The percentage of screening subjects with interval cancers also becomes a constant for each screening period.
| FOOTNOTES |
|---|
Authors stated no financial relationship to disclose.
Author contributions: Guarantor of integrity of entire study, B.E.C.; study concepts, all authors; study design, B.E.C., D.G.; literature research, B.E.C., D.F.Y., C.I.H.; data acquisition, B.E.C.; data analysis/interpretation, all authors; manuscript preparation, B.E.C.; manuscript definition of intellectual content, all authors; manuscript editing, B.E.C.; manuscript revision/review and final version approval, all authors
| REFERENCES |
|---|