Opinion
1 From the Program for the Assessment of Radiological Technology (ART Program), Departments of Radiology (M.G.M.H., G.K.) and Epidemiology and Biostatistics (M.G.M.H.), Erasmus University Medical Center Rotterdam, Dr Molewaterplein 50, Room EE2140, 3015 GE Rotterdam, the Netherlands; and the Department of Health Policy and Management, Harvard School of Public Health, Boston, Mass (M.G.M.H.). From the 2000 RSNA scientific assembly. Received January 18, 2001; revision requested February 26; revision received July 3; accepted July 16. Address correspondence to M.G.M.H. (e-mail: hunink@epib.fgg.eur.nl).
| ABSTRACT |
|---|
© RSNA, 2002
Index terms: Opinions; Radiology and radiologists, research; Radiology and radiologists, socioeconomic issues
| INTRODUCTION |
|---|
The goal of developing and assessing new diagnostic imaging technology is presumably to implement valuable and affordable new technology in a timely fashion to attain our fundamental goal of improving health. To achieve this goal we feel the need to perform thorough scientifically based assessment studies of new diagnostic imaging technology prior to its implementation, to demonstrate that the technology provides value for money (13). At the same time, we feel the need to rapidly implement new diagnostic imaging technology, which is why we tend to base our judgment of the value of new technology on subjective experience with a limited number of cases. As demonstrated in Figure 2, these considerations give rise to two apparently disparate pathways to our goal (14,15). The diagram represents a closed circle, which symbolizes the conflict that exists between the two pathways. As long as the circle remains closed, that is, as long as the conflict remains unresolved, tension will persist. So, how do we resolve this conflict?
| CHALLENGING THE UNDERLYING ASSUMPTIONS |
|---|
(A) Only Thorough Hierarchical Assessment Will Provide the Necessary and Relevant Information
Traditionally, the evaluation of new diagnostic imaging technologies has focused on determining pairs of sensitivity and specificity (or a receiver operating characteristic curve) values in comparison to a reference standard. This entails performance of both the new test and the reference standard test in all patients in a cohort study and determination of the probability of abnormal and normal findings conditional on disease or no disease (sensitivity and specificity, respectively) or, alternatively, the predicted probability of disease conditional on the test result. The reference standard, however, is not flawless in helping distinguish individuals with disease from those without. For example, intraarterial x-ray angiography (an invasive procedure involving arterial catheterization) is generally considered to be the reference standard for vascular disease but has been shown to result in missed patent runoff vessels (16,17). Because computed tomographic (CT) angiography and MR angiography (18-20) provide three-dimensional information, they may surpass intraarterial angiography. A comparison of CT angiography or MR angiography with intraarterial angiography as the reference standard will, therefore, lead to underestimation of the sensitivity and specificity of these new tests because intraarterial angiography is defined as the reference standard, or perfect, test. Furthermore, for some innovative technologies such as molecular imaging, an appropriate reference standard simply does not exist because the index test is an attempt to diagnose a condition that is unidentifiable with existing technology (eg, cancer in a very early phase).
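The underestimation described above can be illustrated with a small calculation. The sketch below is not taken from the cited studies; it assumes that the new test and the reference standard err independently of each other given the true disease status, and the function name and all numbers are hypothetical.

```python
def apparent_accuracy(prevalence, se_new, sp_new, se_ref, sp_ref):
    """Sensitivity and specificity of a new test as measured against an
    imperfect reference standard.

    Assumption: the two tests err independently given true disease status.
    """
    # Probability that both tests are positive, and that the reference is positive
    both_pos = (prevalence * se_new * se_ref
                + (1 - prevalence) * (1 - sp_new) * (1 - sp_ref))
    ref_pos = prevalence * se_ref + (1 - prevalence) * (1 - sp_ref)

    # Probability that both tests are negative, and that the reference is negative
    both_neg = (prevalence * (1 - se_new) * (1 - se_ref)
                + (1 - prevalence) * sp_new * sp_ref)
    ref_neg = prevalence * (1 - se_ref) + (1 - prevalence) * sp_ref

    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical case: a flawless new test judged against a 90%/90% reference
apparent_se, apparent_sp = apparent_accuracy(0.5, 1.0, 1.0, 0.9, 0.9)
print(apparent_se, apparent_sp)
```

In this hypothetical case the flawless new test appears to have a sensitivity and specificity of about 90% each, because every disagreement with the reference standard is counted as an error of the new test.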
In addition, if the reference standard examination is invasive we cannot, on ethical grounds, perform it in all cases. This implies that the new test will be verified in selected cases only, and this, too, may lead to biased estimates of sensitivity and specificity (21-23). For example, in potential living kidney donors, surgical findings may be considered the best reference standard, but surgical findings will be available only for those kidneys removed for transplantation. Commonly, we revert to using a combination of findings as the reference standard (24,25). Such an approach inevitably leads to biases related to the mixed reference standard and a reference that is not independent of the tests being evaluated (21-23). Alternatively, we can use a mathematical correction method that adjusts for the potential bias resulting from selected verification (26-30). This method does, however, require the assumption that the predictive value of a test is unaffected by verification bias.
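One widely used correction of this kind, usually attributed to Begg and Greenes, relies on exactly the stated assumption: the probability of disease given the test result is taken to be the same in verified and unverified patients. Whether this is the specific method used in the cited references is an assumption here, and the counts below are invented for illustration.

```python
def corrected_accuracy(n_pos, n_neg, ver_pos_dis, ver_pos_nodis,
                       ver_neg_dis, ver_neg_nodis):
    """Verification-bias-corrected sensitivity and specificity.

    n_pos, n_neg: all patients with a positive / negative new test.
    ver_*: counts among verified patients only, by test result and disease.
    Assumption: P(disease | test result) is unaffected by verification.
    """
    # Predictive values estimated from the verified subset
    p_dis_given_pos = ver_pos_dis / (ver_pos_dis + ver_pos_nodis)
    p_dis_given_neg = ver_neg_dis / (ver_neg_dis + ver_neg_nodis)

    # Expected diseased patients among ALL tested patients
    dis_pos = n_pos * p_dis_given_pos
    dis_neg = n_neg * p_dis_given_neg

    sensitivity = dis_pos / (dis_pos + dis_neg)
    specificity = (n_neg - dis_neg) / ((n_neg - dis_neg) + (n_pos - dis_pos))
    return sensitivity, specificity


# Hypothetical study: 100 positive and 400 negative tests, of which
# 80 positives and only 40 negatives were verified.
naive_se = 60 / (60 + 4)      # uses verified patients only
naive_sp = 36 / (36 + 20)
se, sp = corrected_accuracy(100, 400, 60, 20, 4, 36)
print(naive_se, naive_sp, se, sp)
```

Because positive results are verified far more often than negative ones in this invented data set, the uncorrected sensitivity is too high and the uncorrected specificity too low; the correction moves both in the opposite direction.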
Furthermore, although sensitivity and specificity may be useful performance parameters in the initial evaluation of a new test, they seldom provide the information we need to decide whether the new diagnostic strategy should be implemented. A result indicating that sensitivity and specificity are both, say, 95% is difficult to translate into a meaningful clinical decision. To decide whether the new diagnostic strategy should actually replace the current strategy requires studies to evaluate the effect on decision making, patient outcomes, and costs.
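One reason a result of 95% for both parameters is hard to act on is that its clinical meaning depends on how common the disease is in the tested population. The sketch below applies Bayes' theorem to two hypothetical prevalences; the function name and the prevalence values are assumptions chosen for illustration.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Positive and negative predictive value from Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    false_neg = prevalence * (1 - sensitivity)
    true_neg = (1 - prevalence) * specificity
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test (95% / 95%) in a low-prevalence and a high-prevalence population
for prevalence in (0.01, 0.50):
    ppv, npv = predictive_values(prevalence, 0.95, 0.95)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At a prevalence of 1% most positive results are false positives, whereas at 50% a positive result is right 95% of the time, so the same accuracy figures can support quite different decisions.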
Decision analysis is a useful tool for estimating the effect on cost and effectiveness outcomes and for evaluating diverse strategies (31-33). Advantages of decision analysis are the ability to integrate all the available evidence and values, model a wide range of strategies, and explore the effect of uncertainty on the decision. Decision analysis does, however, have limitations. It is time-consuming to perform properly, it relies on data from multiple heterogeneous sources, it requires assumptions to make the problem tractable, and it has limited impact on everyday clinical practice. A large randomized controlled trial (RCT) with long-term follow-up of quality-of-life and survival measures would be scientifically rigorous, but, because the differences in quality of life and survival across diagnostic strategies are generally small (34), such studies must be large, take a long time to perform, and require substantial resources. One can question whether the additional information obtained from such trials justifies the research resources used (35).
(B) Relevant Information Will Lead to Implementation of Optimal Diagnostic Imaging Technology
Even if we have the resources to perform a thorough evaluation of a new diagnostic imaging technology, implementation of the study results can pose a problem. After performing a cohort study in which both the old test and the new test are performed in all patients, we subsequently often find ourselves performing both tests because that is what clinicians have become familiar with. Both physicians and patients assume that more imaging studies will lead to better diagnostic information. Furthermore, more tests yielding the same findings increase confidence in the diagnosis. Thus, physicians will tend to want to use all the possible diagnostic imaging technologies available. Four of five times, implementation of new technology implies that instead of the new technique replacing the old technique, the new technique is performed in addition to the old one (36).
(C) New Imaging Technology Provides More and Better Diagnostic Information That Will Lead to Better Therapeutic Choices and Outcomes
The tremendous increase in the use of imaging procedures suggests that physicians and patients believe that more and better diagnostic imaging technology leads to more and better diagnostic information, which in turn leads to optimal therapeutic choices and outcomes. The connection between more diagnostic imaging information and patient outcomes, however, is difficult to demonstrate because of the multiple intervening paths and steps (36). Although the choice of treatment is commonly influenced by imaging findings, a change from a currently used diagnostic strategy to one that uses a newer technology will generally yield only small benefits to the patient (34). New imaging technology can, however, reduce the convalescence period through the use of procedures, both diagnostic and therapeutic, that are less invasive.
(D1) Rapid Advances in New Diagnostic Imaging Technology Are Hard to Keep Up With
Rapid technologic advances are a fact. Not only do we accept this fact, but we are also generally content with the technologic advances, since they bring with them the possibility of better health care and a higher standard of living. New technology is becoming more widely available, there is a continual increase in the expertise of those using new technology, and there is continual development of the technology. The pace of the advances is mind-boggling, and we must do everything we can to cope with the technologic changes. In effect, this requires efficient methods of research, which implies a focus on providing the information relevant to the clinical decision-making process.
(D2) Current Technology Assessment Studies Do Not Provide Relevant Information in a Timely Fashion
In this respect we should ask ourselves: What exactly is the relevant information in this context? To decide whether a new technology is valuable and affordable, do we really need to go through the whole gamut of the test's reproducibility; sensitivity; specificity; receiver operating characteristic analysis; effect on diagnostic, therapeutic, and prognostic thinking; patient benefit from use of the test; and cost-effectiveness from a societal perspective? This is probably not so. In fact, the only evidence we really need is that use of the new diagnostic strategy facilitates the clinical decision-making process without compromising patient outcomes and that we can afford to pay for it.
Sometimes a new technology is so obviously better, simpler, less risky, and less expensive than the established technology that an extensive assessment is unnecessary: A balance sheet of the pros and cons, taking into account the various dimensions of the decision, can suffice (31). For example, head CT did not need to be extensively assessed in comparison to pneumoencephalography to demonstrate its superiority. At the same time, statements of obvious superiority must be interpreted with caution: The advocate of a new technology is usually someone with his or her own biases and agenda. Disillusionment frequently follows an initially optimistic introduction of new technology. Currently, new technology is generally only marginally better than what we have available already, and the associated price tag may not be justified. Commonly, the new diagnostic strategy has both advantages and disadvantages in comparison with the established tests, in which case an assessment of its effect on the clinical process is warranted.
(E) Evaluation of New Diagnostic Imaging Technology Must Be Performed before Implementation
With the hierarchical approach, evaluation of a new technology must precede its implementation. Although this appears to make good sense, one can question its necessity and desirability. In fact, there may be a study design that allows simultaneous evaluation of the new technology and implementation of the optimal diagnostic strategy. In other words, performance of the study ideally not only helps assess the new technology but at the same time improves clinical practice through some form of a self-organizing feedback system (37). In a self-organizing feedback system, adjustments of physician ordering behavior, of the technology used, and of the interpretation take place as a result of feedback loops.
The above list of problems and questionable assumptions suggests that we should think of a study design for the evaluation of new diagnostic imaging technology that gets away from our focus on cohort validation studies to evaluate sensitivity and specificity. So, is there a study design that can address the issues listed?
| CRITERIA FOR THE ASSESSMENT AND IMPLEMENTATION OF NEW DIAGNOSTIC IMAGING TECHNOLOGY |
|---|
| STUDY DESIGN: AN EMPIRICAL RCT |
|---|
Empirical Study
Our goal of timely implementation of valuable new diagnostic imaging technology could potentially be achieved if we strive for concurrent implementation and assessment of the new technology. That is, the process of performing the trial should lead not only to evaluation of the new diagnostic strategy in comparison with the old strategy but also to implementation of the new strategy if it is better. The trial design should, in our opinion, be empirically based and pragmatic; that is, the trial should be integrated into clinical practice rather than be implemented in a strictly controlled, but probably unrealistic, experimental setting.
In general, current clinical practice would be used as the control strategy for the study. We recognize that current clinical practice is, in many instances, unstructured and that documentation of relevant information may be lacking. Clearly, sloppy clinical practice will not suffice as the control strategy. Ideally, performance of the study will help structure and streamline clinical practice, initiate collection of relevant information, and enhance communication between physicians mutually and between physicians and their patients.
Randomization across Diagnostic Strategies
As with all studies, the inclusion and exclusion criteria should guarantee selection of a representative group of patients for the diagnostic problem under consideration. The tradeoff is between recruiting a fairly homogeneous group of patients that is clearly identifiable versus ensuring that the study is sufficiently generalizable (38).
After recruiting patients that fulfill the inclusion criteria, we propose random assignment of patients to the new diagnostic strategy or to the old strategy. For example, if we want to evaluate CT angiography versus intraarterial angiography for peripheral arterial disease, we would randomly assign patients with peripheral arterial disease to undergo either initial CT angiography or intraarterial digital subtraction angiography (DSA) (Fig 3). In this empirical design, we would allow for further evaluation with other imaging technology should the need arise. For example, if CT angiography does not provide sufficient diagnostic information to establish a therapeutic plan, the patient would undergo (selective) intraarterial angiography.
For example, in this clinical setting the test result is not a dichotomous yes/no result but rather a treatment choice out of a range of possible treatments, including bypass surgery with various possible proximal and distal anastomotic sites, percutaneous transluminal angioplasty or stent placement at various sites, combinations of surgical and percutaneous procedures, or no revascularization procedure. The treatment choice is further influenced by local preferences and expertise and by the patient's preferences. Thus, what constitutes a discrepancy in the test results is not easily defined. Furthermore, if both tests have been performed, information from the test that is not supposed to influence management will invariably become known to the treating physicians in a proportion of cases, in part indirectly because the treating physician will know that the other test yielded a discrepant result (otherwise this patient would not have been randomly assigned) and in part because information that is withheld for research reasons can always be requested on ethical grounds. Finally, interpretation of the results may be hampered by the fact that the diagnostic work-up itself may influence not only the choice of treatment but also the manner in which the treatment is performed (eg, if a CT angiographic image is available to guide catheterization, the procedure may be performed differently than if no prior noninvasive imaging information is available).
Randomization between Providing Test Results versus Not Providing Test Results
An alternative to randomizing across diagnostic strategies would be to use the new diagnostic strategy in all patients but only provide the test results to the treating physician and patient for those randomly assigned to the new diagnostic strategy. This approach facilitates recruitment of patients because all patients will be offered the new technology, which tends to increase their willingness to participate. Furthermore, this approach could potentially reduce a biased follow-up if patients are more willing to respond to questionnaires after participation in a new diagnostic strategy. The authors of one published study (39) used this approach to determine the value of MR imaging for help in assessing fetal-pelvic proportions in breech presentation and demonstrated that the information obtained with MR imaging increased the number of successful vaginal deliveries and reduced the number of emergency cesarean sections.
The main caveat of this approach is similar to that of randomization of discrepant test results. Information from the new test will invariably become known to the treating physicians in a proportion of cases in the group where it was not supposed to influence management according to the study protocol. Part of the problem is the logistics of keeping test results secret, but this can be overcome with the correct procedures in place. More important, information withheld for research reasons can be requested for an individual patient on ethical grounds, and if the diagnostic test has been performed already it is a small step for the treating physician to insist on getting the information. Finally, if the new test is associated with a risk, even a small one, it may be considered unethical to perform such a test when the test result does not have any influence on the care of the patient.
A Nonrandomized Design
Although in general we would recommend an RCT design, some situations may justify a modified design. Sometimes the potential study population is so small that randomization does not seem feasible within a limited time frame. Sometimes the new diagnostic strategy is far safer or far less burdensome to the patient than the currently used diagnostic test. Sometimes preliminary reports from other institutions have already demonstrated good results with the new diagnostic strategy, making randomization unethical or impossible because of patient preferences and physician bias.
A nonrandomized design would imply that, first, the new test would be performed in addition to the current work-up to "fine tune" the technical details of performing the examination and to familiarize physicians (both radiologists and clinicians) with the new diagnostic imaging technology. During this phase the current work-up should be established in protocol form and documented. Approximately 10-20 cases would probably suffice for this first phase. Next, the new diagnostic strategy should be implemented as the initial test. Physicians are given the option of requesting the previously used test if they believe it is necessary to complete the work-up, but such requests are recorded and tracked. The process is documented over time as physicians become familiar with the new test.
Outcome Measures and Trends over Time
As mentioned, we consider the trial to be a vehicle to implementation of the best diagnostic imaging technology. We propose evaluating the implementation process by including outcome measures that reflect the clinical decision-making process based on the imaging information and acceptance of the new test. Because physicians' experience with and opinions of the test change over time, all outcome measures would have to be tracked over time. Crucial to our proposed study design is to measure outcomes that reflect the learning curve and acceptance of the new diagnostic strategy over time. Thus, taking into account the dimension "time" is an essential feature of our proposed study design.
Furthermore, the outcome measures should focus on the goal of performing the diagnostic trial and should therefore measure the clinical decision-making process. Outcome measures that reflect the goal of the trial include the probability that additional studies would be ordered, the costs of the diagnostic work-up and treatment, the physicians' confidence in therapeutic decision making, patient outcome measures related to the clinical problem at presentation, and, in the event of a randomized trial, the recruitment rate.
Percentage (and type) of additional requested tests and trends over time. One would expect that this percentage is practically constant over time for the control (ie, currently used) diagnostic work-up strategy and variable over time for the experimental strategy. For example, in the course of an implementation study of CT angiography compared with intraarterial DSA for peripheral arterial disease, one would expect the percentage of requested intraarterial DSA procedures following CT angiography to be high initially (probably about 90%) and to decrease over time as interventional radiologists and vascular surgeons become more familiar with this technology for this particular indication. In other words, the proportion of intraarterial DSA procedures requested would reflect the learning curve of performing and interpreting CT angiographic images. Furthermore, one would expect that after initial CT angiography, a requested angiogram could be limited to a selective study. It is even conceivable that with the introduction of CT angiography, physicians will request CT angiography after intraarterial DSA to solve specific diagnostic problems. The latter would demonstrate a shift in the control strategy over time that would support the use of the new technology.
Costs of diagnostic work-up and treatment and trends over time. Resource utilization and the associated costs of the diagnostic work-up and treatment should be recorded, and the trends over time should be calculated (Fig 4). As indicated, it is likely that crossovers will occur, in that patients undergoing the experimental strategy may still need additional work-up with (part of) the control strategy. In addition, patients undergoing the control strategy may require additional work-up, which may, over time, be performed more and more with the experimental strategy. If the diagnostic strategy influences the treatment approach, this effect would need to be estimated. For example, the availability of CT angiographic images prior to a combined intraarterial DSA-angioplasty procedure could potentially lead to a more focused selective diagnostic procedure and a more direct approach to the interventional procedure.
Confidence in therapeutic decision making and trends over time. The diagnostic and therapeutic decisions made and the physicians' confidence in their therapeutic decision making are additional useful outcome measures (Fig 5). The physicians' confidence may be measured by using a rating scale or a visual analogue scale (40,41). One would expect that their confidence is constant over time for the control (currently used) diagnostic work-up strategy. For the experimental strategy one would expect their confidence initially to be lower than that of the control strategy and to increase over time as they become more familiar with the new technology. Confidence could, however, decrease over time if initial expectations of the new technique are not fulfilled as the limitations of the technology become apparent.
The recruitment rate as a function of time since the start of the study. One of the outcome measures in our proposed study design is the percentage of eligible patients actually recruited for the study as a function of time since the start of the study (Fig 6). As long as the two strategies are considered to be equivalent, physicians will be comfortable with random assignment of patients across the diagnostic work-up strategies. Any trend identified in the proportion of patients recruited should be interpreted in the context of all factors that may influence recruitment. For example, during the course of the trial physicians may become reluctant to recruit patients without palpable femoral pulses and may instead request CT angiography. Despite strict inclusion and exclusion criteria, a subtle change in those considered eligible for the trial may occur. Furthermore, patients may pick up subtle nonverbal information during the informed consent procedure and, as a result, be unwilling to participate.
Data Collection
The data that would need to be collected in the context of the described trial for peripheral arterial disease include the following:
1. Date of presentation, date of completion of the diagnostic work-up (ie, date of last imaging study), diagnostic tests performed, date and type of the treatment decision, date of treatment, date that normal activities are resumed, date of recurrence of symptoms and severity of those symptoms, and type of events.
2. All patients with peripheral arterial disease evaluated by the vascular surgery staff would need to be registered and baseline characteristics recorded to enable the evaluation of generalizability and the proportion eligible and proportion randomly assigned.
3. Patient characteristics and diagnostic test results should be documented, including age, sex, symptoms (according to the Rutherford classification), and findings on CT and/or intraarterial DSA images (including lesion severity, lesion length, lesion location, and runoff). These variables are used in evaluation of the randomization process and to help analyze temporal changes in clinical decisions made.
4. If reproducibility of interpretation is an issue, the images should be interpreted by at least two independent observers.
5. Diagnostic and therapeutic decisions made must be tracked for each patient during the vascular conference, as must confidence in the decision. The latter can be recorded on a rating scale from zero to 10, with zero indicating no confidence and 10 indicating full confidence in the decision (41).
6. Quality of life should be measured at baseline, 6 weeks, 3 months, and 6 months, with a general descriptive instrument (eg, MOS 36-Item Short-Form Health Survey), a disease-specific descriptive instrument (eg, the VascuQol), and an evaluative instrument providing general population values (eg, the EuroQol) (45-48).
7. Resource utilization of diagnostic procedures must be tracked and a detailed cost analysis should be performed for all procedures that are important (eg, CT and DSA) by recording the time required to perform the procedure, the personnel involved, and the materials needed. Other costs per unit of resource can be analyzed independently of patient data collection.
8. Resource utilization for therapeutic procedures should be tracked with case record forms and complemented with data from the hospital information system and the patient questionnaires.
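The data items listed above could be captured in a per-patient record along the following lines. This is only a sketch of a subset of the items; the class name, field names, and example values are hypothetical and are not taken from the study's case record forms.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PatientRecord:
    """One patient's trial record (illustrative field names only)."""
    patient_id: str
    assigned_strategy: str            # "CTA" or "DSA", fixed at randomization
    presentation_date: date
    age: int
    sex: str
    rutherford_category: int          # symptom severity at presentation
    tests_performed: List[str] = field(default_factory=list)
    workup_complete_date: Optional[date] = None   # date of last imaging study
    treatment_decision: Optional[str] = None
    decision_confidence: Optional[int] = None     # 0 = none, 10 = full
    treatment_date: Optional[date] = None

    def __post_init__(self):
        # Reject confidence ratings outside the zero-to-10 scale
        c = self.decision_confidence
        if c is not None and not 0 <= c <= 10:
            raise ValueError("decision_confidence must be between 0 and 10")

    def workup_days(self) -> Optional[int]:
        """Days from presentation to the last imaging study, if known."""
        if self.workup_complete_date is None:
            return None
        return (self.workup_complete_date - self.presentation_date).days

    def additional_test_requested(self) -> bool:
        """True if any test other than the assigned initial test was done."""
        return any(t != self.assigned_strategy for t in self.tests_performed)


# Hypothetical patient assigned to CT angiography who later also had DSA
record = PatientRecord("P001", "CTA", date(2001, 3, 1), 67, "M", 3,
                       tests_performed=["CTA", "DSA"],
                       workup_complete_date=date(2001, 3, 8),
                       decision_confidence=8)
print(record.workup_days(), record.additional_test_requested())
```

Storing the assigned strategy separately from the tests actually performed keeps crossovers visible, which matters both for tracking additional test requests and for the analysis by assigned group.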
Data Analysis
Analogous to the intention-to-treat principle for analysis of therapeutic RCTs, the data of a diagnostic RCT should be analyzed on an intention-to-diagnose-and-treat basis. This implies that once a patient has been randomly assigned, he or she will remain in the assigned group for the analysis irrespective of whether crossover occurred to the other strategy and of whether follow-up was complete or not. The underlying concept is that patients that cross over and patients lost to follow-up are part and parcel of the strategy.
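In practice, the intention-to-diagnose-and-treat principle means that outcomes are grouped by the assigned strategy and never by the tests a patient actually underwent. A minimal sketch follows, with hypothetical patients and costs; the dictionary keys are assumptions made for illustration.

```python
from collections import defaultdict


def mean_by_assigned_arm(patients, outcome):
    """Mean of an outcome per randomized arm, ignoring crossovers."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for patient in patients:
        arm = patient["assigned"]      # grouping key is the assignment only
        totals[arm] += patient[outcome]
        counts[arm] += 1
    return {arm: totals[arm] / counts[arm] for arm in totals}


# Hypothetical data: the second CTA patient crossed over to DSA as well,
# so that patient's work-up cost includes both examinations.
patients = [
    {"assigned": "CTA", "received": ["CTA"], "cost": 300.0},
    {"assigned": "CTA", "received": ["CTA", "DSA"], "cost": 1200.0},
    {"assigned": "DSA", "received": ["DSA"], "cost": 1000.0},
    {"assigned": "DSA", "received": ["DSA"], "cost": 1000.0},
]
print(mean_by_assigned_arm(patients, "cost"))
```

The cost of the crossover stays with the CT angiography arm, which reflects the idea that crossovers are part and parcel of the strategy being evaluated.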
Essential in the analysis of data from a study that involves concurrent implementation and evaluation of a new diagnostic strategy is determination of trends among outcome measures over time. By including time since the start of the study in the analysis, we have a surrogate for the learning curve of physicians and we can adjust for this effect. The analysis can be performed pragmatically by using multivariable regression analysis to model costs of the diagnostic work-up, costs of the diagnostic work-up plus treatment, and physicians' confidence in the therapeutic decision. Multivariable logistic regression analysis can be used to model the probability of additional examinations requested and the recruitment rate. In both models, the independent variables include the diagnostic strategy; time since the start of the study; an interaction term between these two; and covariates such as symptoms, age, and sex. In a randomized design, the focus of the analysis would be to demonstrate a difference between the two groups, taking into account the trend over time. In the absence of a concurrent randomized control group, the analysis of trends over time is the primary focus of the analysis.
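The structure of such a model can be sketched as an ordinary least-squares fit of cost on strategy, time, and their interaction. The example below is a simplified illustration in plain Python: it omits the covariates (symptoms, age, sex), uses noise-free synthetic data with invented coefficients, and solves the normal equations directly, which a real analysis would leave to a statistics package.

```python
def design_row(strategy, months):
    """Intercept, strategy (0 = control, 1 = experimental), time, interaction."""
    return [1.0, float(strategy), float(months), float(strategy) * float(months)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data: control cost is flat at 1000; the experimental strategy
# starts 300 higher and falls by 50 per month (all values hypothetical).
rows = [design_row(s, t) for s in (0, 1) for t in range(12)]
costs = [1000.0 + 300.0 * r[1] + 0.0 * r[2] - 50.0 * r[3] for r in rows]
beta = fit_ols(rows, costs)
print(beta)
```

The coefficient of the interaction term (the last element of `beta`) is the quantity of interest: it estimates how much faster the outcome changes per unit of time under the experimental strategy than under the control strategy, which is one way to quantify the learning curve. The same design row could feed a logistic model for the binary outcomes.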
Randomization and selected use of the traditional test do not preclude the possibility of obtaining estimates of the sensitivity and specificity of a new test in comparison with those of the traditional test as reference standard. Because the test results are verified only in selected cases, we would need to use a method that adjusts for the associated possible verification bias (26-29). This requires development of a logistic regression model to predict the probability of verification as a function of age, sex, disease severity, and time since start of the study. The probability of verification is then used to adjust the data and calculate corrected estimates of sensitivity and specificity.
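One way to use a predicted probability of verification is inverse probability weighting, in which each verified patient is counted as standing in for 1 / (probability of verification) similar patients. Whether the cited methods take exactly this form is an assumption; the sketch below also assumes the verification probabilities have already been estimated, and all data are invented.

```python
def weighted_accuracy(verified):
    """Sensitivity and specificity weighted by inverse verification probability.

    verified: iterable of (test_positive, disease_present, p_verify) tuples,
    one per patient who actually underwent the reference standard.
    """
    tp = fn = tn = fp = 0.0
    for test_positive, disease_present, p_verify in verified:
        weight = 1.0 / p_verify        # rarely verified patients count more
        if disease_present and test_positive:
            tp += weight
        elif disease_present:
            fn += weight
        elif test_positive:
            fp += weight
        else:
            tn += weight
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical verified patients: positive new-test results were verified
# with probability 0.8, negative results with probability 0.2.
verified = (
    [(True, True, 0.8)] * 40 + [(True, False, 0.8)] * 8
    + [(False, True, 0.2)] * 2 + [(False, False, 0.2)] * 10
)
sensitivity, specificity = weighted_accuracy(verified)
print(sensitivity, specificity)
```

In this invented data set the unweighted sensitivity among verified patients would be 40/42, whereas the weighted estimate is lower because the few verified patients with negative test results each represent five patients.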
| DISCUSSION |
|---|
Randomization also has disadvantages, however. The inherent limitations of randomization are reduced generalizability of study results due to the selected patient population and a perceived necessity to strictly control the experimental setting. Generalizability of trial results to other settings can be ensured by including a wide range of patients in the trial. Furthermore, by using an empirical pragmatic design we in fact study daily clinical practice, which increases generalizability.
Ethical considerations are frequently mentioned as an argument against randomization. If there are good reasons to believe that one diagnostic imaging modality is superior to another, then randomization is clearly unethical. In such situations, the inferior test should not be performed because of the possibility of misleading results, risks, and costs, and a trial is unnecessary. In fact, if the available evidence demonstrates that a new diagnostic strategy is superior, the results of the trial may even be misleading, or, at the very least, the research resources used are not justified relative to the scientific information obtained (35). Furthermore, if the tests are very different in their usage and goal (eg, D-dimer and spiral CT for help in diagnosing pulmonary embolism), then randomization between these two tests is neither useful nor feasible. Randomization between two full diagnostic strategies that include these tests, however, is feasible. In evaluating new, less invasive tests in the context of existing screening programs, randomization between the reference standard and the new test is ethical only if the available evidence suggests that the risk of missing the disease is more or less an equivalent impediment to the patient as are the risk and discomfort of performing the reference standard. For example, in screening patients with a history of premalignant colon polyps by using colonoscopy, randomization between colonoscopy and virtual CT colonoscopy would be ethical only if the reduced burden to the patient of virtual colonoscopy is on the same order of magnitude as the burden to the patient of potentially missing a polyp. Finally, in the very early stages of development of a diagnostic technology and if missing the diagnosis would be life threatening, randomization between the new test and the reference standard is unacceptable.
More often than not, however, we truly do not know which diagnostic strategy is best, and diagnostic strategies have both potential advantages and disadvantages. Even obtaining all possible imaging information can have disadvantages, in that risks and burden are involved and results can potentially be misleading. In the example mentioned earlier, CT angiography may very well turn out to be diagnostically superior to intraarterial DSA, because multiple views can be obtained to evaluate the vessels. Furthermore, CT angiography is less risky and less burdensome than intraarterial DSA, which favors the use of the former. At the same time, however, physicians interpreting CT angiographic images may still be uncomfortable with the images, and artifacts may not be recognized as such. By allowing the use of an alternative diagnostic technique in cases where diagnostic uncertainty persists, valuable technology is not withheld from the patient. The premise is that physicians will request additional diagnostic examinations until they have sufficient diagnostic information to formulate a treatment plan. Furthermore, the patient may be examined with the alternative test if symptoms persist or recur during follow-up. Finally, if we truly do not know which test strategy is better, randomization will at least ensure that 50% of patients will be assigned to the better strategy.
Costs are also frequently brought up as an argument against randomization. Randomization between two diagnostic tests, however, is less expensive than performing both tests in all patients. In fact, in the example of intraarterial DSA versus CT angiography, we claim that performing the trial may actually reduce the expense of the diagnostic imaging work-up for peripheral arterial disease in comparison with the expense of not performing the trial! If the trial had not been initiated, the work-up would continue to be performed with intraarterial DSA in all patients. Through randomization, half of all patients with peripheral arterial disease are initially evaluated with CT angiography, which, as the initial test, is clearly less expensive than intraarterial DSA. Depending on how often intraarterial DSA is performed after CT angiography, this can lead to a cost reduction. Furthermore, after CT angiography, an additional intraarterial DSA examination can often be limited to selective views, and an interventional procedure can usually be planned concurrently, which can lead to additional cost savings. The only real research expenses are those associated with data collection, data analysis, and reporting of the results, but that is a cost that will be incurred irrespective of whether the trial is designed as a randomized or a nonrandomized trial. The proposed trial design will likely be funded by agencies interested in funding technology assessment, implementation, and translational research. We have been successful in obtaining funding from such an agency to perform the study described in this article.
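The cost argument above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' analysis: the function names and all cost figures are hypothetical, and it assumes 1:1 randomization and a fixed probability that a patient in the CT angiography arm goes on to an additional intraarterial DSA examination.

```python
def expected_cost_per_patient(cost_dsa, cost_cta, p_additional_dsa,
                              cost_additional_dsa=None):
    """Return (cost without trial, expected cost with trial) per patient.

    Without the trial every patient undergoes intraarterial DSA.
    With the trial half of the patients start with CT angiography, and a
    fraction p_additional_dsa of those still undergo DSA afterwards.
    All inputs are illustrative assumptions, not data from the article.
    """
    if cost_additional_dsa is None:
        # Conservative assumption: a follow-up DSA costs as much as a full one.
        cost_additional_dsa = cost_dsa
    cost_without_trial = cost_dsa
    cost_cta_arm = cost_cta + p_additional_dsa * cost_additional_dsa
    cost_with_trial = 0.5 * cost_dsa + 0.5 * cost_cta_arm
    return cost_without_trial, cost_with_trial


def break_even_probability(cost_dsa, cost_cta, cost_additional_dsa=None):
    """Probability of additional DSA below which the trial lowers imaging cost."""
    if cost_additional_dsa is None:
        cost_additional_dsa = cost_dsa
    return (cost_dsa - cost_cta) / cost_additional_dsa


# Hypothetical figures in arbitrary currency units.
without_trial, with_trial = expected_cost_per_patient(
    cost_dsa=1000.0, cost_cta=400.0, p_additional_dsa=0.2)
threshold = break_even_probability(cost_dsa=1000.0, cost_cta=400.0)
print(without_trial, with_trial, threshold)
```

With these made-up inputs the expected imaging cost per patient falls from 1000 to 800, and the trial remains cheaper as long as fewer than 60% of patients in the CT angiography arm need an additional DSA examination. If the additional examination is limited to selective views, as the text suggests, `cost_additional_dsa` would be lower and the threshold correspondingly higher.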
An empirically based pragmatic study protocol may be frowned on by researchers: it refutes the notion that research should be performed in a strictly controlled setting. An empirical pragmatic study protocol may be influenced by subjective experience and individuals' opinions. Through feedback, we allow subjective experience to affect the postprocessing methods of the imaging study and the interpretation of images. Furthermore, we allow feedback to influence requests for additional examinations. The question is whether this empirical pragmatic approach invalidates the results of the trial. Our premise in allowing feedback to influence the clinical process under evaluation is that it will happen anyway: if not explicitly, it will happen implicitly. It is impossible to prevent an increase in experience and knowledge among technologists and physicians using the new techniques, and changes will take place, although sometimes in a subtle way. In general, a scientist cannot play the role of a detached objective observer but becomes involved in the world he or she observes (49).
Similarly, we can question whether we want to separate the clinical process and research. Rather than struggle to create a "pure" experimental setting, we question whether creating such a setting is desirable. One of the frequently expressed complaints about research is the delay between the reporting of results and the implementation thereof. Then why not integrate the clinical process and the research that is meant to improve that process? Why not make it into one interwoven whole? The clinical researcher is intricately involved in the clinical process, and, therefore, it would make eminent sense to integrate clinical research and clinical practice into one interwoven process (37).
Our chosen outcome measures are, in part, subjective and may therefore be prone to bias. Although this certainly represents a limitation of our approach, one should recognize that whatever we measure is determined and colored by our perception of the problem. For example, if we choose to measure the ankle-brachial blood pressure index (an objective outcome measure) to determine the severity and progression of peripheral arterial disease, we will observe hemodynamic changes due to obstruction in the blood vessels to the legs. The fact that the patient's symptoms may be influenced by the ability of muscle tissue to adapt to ischemia, the ability of vessels to develop collateral circulation, the patient's general fitness, and the patient's psychologic ability to cope with pain will remain unobserved. By using the ankle-brachial index as a measure of disease severity, we have defined the problem of peripheral arterial disease as analogous to the problem encountered by a plumber. Thus, our choice of outcome measure is in itself subjective. We perceive the world from our own viewpoint, and our observations probably say more about how we have defined the problem and categorized the possible outcomes than about the actual phenomena we observe (49).
Ideally, one would want to observe phenomena from multiple perspectives. Putting all the different perspectives together will give an overall holistic picture. In choosing outcome measures for our trial design, we have attempted to capture the perspective of the patient by measuring health-related quality of life and events during follow-up, the perspective of the physician by measuring his or her confidence in the diagnosis and the decision whether or not to request further diagnostic examinations, and the perspective of the health care system by measuring the costs related to the diagnostic work-up and therapy. Depending on the clinical problem, it may be relevant to measure other outcomes as well.
Most important of all, we propose using time as a dimension in the analysis of outcomes. Instead of measuring cumulative outcomes, the key feature of our proposed study design is to measure trends in outcomes over time. There is always a learning curve to get acquainted with new technology. Even after implementation of a new technology, adjustments are constantly made to the technique used, and interpretation skills keep developing. Thus, the development of new technology continues as we use it, and it should continue to develop! If we force ourselves to adhere to the exact same imaging protocol as the one with which we initially started, we will very quickly be using an outdated technique. Apart from changes in the technique used, it is impossible (and undesirable) to prevent interpreters from learning from their experience. Thus, instead of forcing the imaging technique and interpretation to remain stable and stationary, we propose using time as an explanatory variable in the data analyses, which will model the learning curve, technical developments, and increasing interpretation skills over time.
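One simple way to use time as an explanatory variable is to fit a separate trend line per trial arm and compare the slopes. The sketch below is an illustration under stated assumptions, not the analysis the authors performed: the helper names, the linear form of the trend, the outcome (a physician confidence score), and the synthetic records are all hypothetical. A real learning curve may well be nonlinear, and a full analysis would also need to handle noise and repeated measures.

```python
def fit_line(times, outcomes):
    """Ordinary least-squares fit of outcome = intercept + slope * time."""
    n = len(times)
    if n < 2 or n != len(outcomes):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_y = sum(outcomes) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, outcomes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


def trend_by_arm(records):
    """Fit one trend line per arm.

    records: iterable of (arm, months_since_trial_start, outcome).
    Returns a dict mapping arm -> (intercept, slope).
    """
    grouped = {}
    for arm, month, outcome in records:
        times, outcomes = grouped.setdefault(arm, ([], []))
        times.append(month)
        outcomes.append(outcome)
    return {arm: fit_line(ts, ys) for arm, (ts, ys) in grouped.items()}


# Synthetic, noise-free records (hypothetical): confidence in the new test
# rises by 0.5 points per month, while the established test stays flat.
records = [("CTA", m, 6.0 + 0.5 * m) for m in range(6)]
records += [("DSA", m, 8.0) for m in range(6)]

trends = trend_by_arm(records)
for arm, (intercept, slope) in sorted(trends.items()):
    print(arm, round(intercept, 3), round(slope, 3))
```

A positive slope in the new-technology arm alongside a flat slope in the established arm is the pattern one would expect from a learning curve. Comparing cumulative averages instead would understate how the new test performs at the end of the study period.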
Our proposed approach is not a panacea: It has both advantages and limitations. As with every technique, model, and theory, this design will have to find its place among the multitude of study designs available. The approach seems especially useful when available evidence based on results from small clinical studies, exploratory decision- and cost-effectiveness analyses, and clinical experience suggests that two diagnostic strategies are similar, that both have advantages and disadvantages, that defining a reference standard proves to be difficult, or that the new technology could potentially surpass the existing reference standard. Furthermore, we would suggest the integration of structured data collection into daily clinical practice for the purpose of assessment of imaging technology and randomization between diagnostic strategies, rather than performance of new imaging tests in addition to the currently used tests. This enables assessment of new imaging technologies as they become available for clinical use.
In conclusion, for the development, assessment, and implementation of new diagnostic imaging technology, we propose use of a randomized empirical trial design based on a pragmatic study protocol that interweaves research and clinical practice. Outcome measures should include measures related to the clinical decision-making process, costs, and patient health outcomes. The key feature of our approach is to measure the trends over time in the outcome measures.
| FOOTNOTES |
|---|
Abbreviations: DSA = digital subtraction angiography, RCT = randomized controlled trial
| REFERENCES |
|---|
|
|
|---|