Fundamentals of Clinical Research for Radiologists
Department of Radiology, University of Washington, Harborview Medical Center, 325 Ninth Ave., Box 359960, Seattle, WA 98104.
Received November 17, 2004; accepted after revision November 23, 2004.
Series editors: Nancy Obuchowski, C. Craig Blackmore, Steven Karlik, and
Caroline Reinhold.
The introduction of noninvasive angiography using MRI or MDCT to replace catheter angiography provides one of several examples in which diagnostic imaging advances have the potential to simultaneously reduce costs and benefit patients [6, 7]. However, this will not always be the case. Newer imaging technology may increase costs for any of the following reasons: if it is an adjunct rather than a replacement for existing imaging methods; if it has a higher unit cost than existing imaging; or if, by making the imaging process more convenient, the threshold for imaging is lowered [8]. In these situations the onus will continue to be on radiologists to provide evidence that newer imaging techniques improve diagnostic and therapeutic decision making and thereby benefit patients.
This article has three objectives: first, to identify factors that, in combination, make radiology cost and outcomes studies unique; second, to review standard methods for measuring the cost and outcomes of diagnostic imaging; and third, to describe emerging methods that will help radiologists conduct and interpret cost and outcomes studies in future years.
What Are the Factors That Make Cost and Outcomes Research in Radiology Unique?
The Gap Between Diagnosis and Outcome
The fundamental distinction between outcomes research in radiology and
other areas of medicine, such as surgery and pharmaceutics, is the distance
between cause and effect, that is, the chain of events that separates the
immediate aim of radiology, which is to make an accurate diagnosis, from the
ultimate goal, which is to improve patient health and life expectancy at an
affordable cost. The links in this chain have been formalized in the hierarchy
originally developed by Fineberg et al. [9] and adapted by others [10, 11]. The first two levels of
this hierarchy depend on the capability of the imaging technology to depict
normal and abnormal anatomy and function (level 1) and the ability of
radiologists to use the images to make accurate diagnoses (level 2). Beyond
these initial two levels, the value of diagnostic imaging is dictated by
factors that are not under the control of radiology. The referring clinician
must be convinced by the imaging results to change the working diagnosis
(level 3) and therapy (level 4) for the patient. Effective therapeutic options
must be available if the change in therapy is to benefit patients (level 5).
Finally, the net cost of diagnosis and treatment must be justified by
improvements in patients' health (level 6). Failure at any one of the latter
four levels will undermine the value of even the most accurate diagnostic
test.
The Size of the Study
One upshot of this hierarchy of events is that imaging, particularly when
used to screen asymptomatic populations, is likely to directly benefit only a
small subgroup of recipients. This is in contrast to therapeutic
interventions, in which all patients have the potential to benefit. For
example, in many breast cancer screening programs, fewer than 1% of mammograms
result in a confirmed case of cancer
[12]. The health of the
remaining 99% of women is unlikely to be directly affected beyond reassurance
provided by a negative result or anxiety raised by false-positive findings.
Consequently, most studies of screening are large trials recruiting thousands
of patients, or decision analyses based on hypothetical models of diagnostic
accuracy and therapeutic effectiveness. Large trials are needed to detect, with
adequate statistical power, health effects in the small proportion of the population
with the disease.
The Intrinsic Value of Diagnostic Information
Even diagnostic imaging of symptomatic patients may not radically alter
treatment for many recipients. For example, in a study comparing MRI and
arthrography for patients with shoulder pain and suspected full-thickness
rotator cuff tears, Blanchard et al.
[13] found that preimaging
management plans changed in 36% and 25% of patients, respectively. Although
imaging may not always trigger a change in therapy, diagnostic information may
still have intrinsic value. In 1994, Mushlin et al.
[14] found that patients with
suspected multiple sclerosis became less anxious after a positive MRI
diagnosis, even though they faced a chronic disease with, at that time, few
therapeutic options. A negative test result may also be beneficial if it
reassures the patient that nothing is seriously wrong. However, this is not a
predictable effect; indeed, in some patients, negative test findings can
heighten anxiety about the cause of ongoing symptoms
[15]. These intrinsic effects
emphasize the importance of assessing patients' perceptions of their physical
and mental health after imaging.
Standard Methods in Cost and Outcomes Research
The diverse nature of cost and outcomes research makes it difficult to be prescriptive in defining best practice. However, as research methods have evolved there have been a number of landmark publications that have defined a methodologic blueprint for research. The Consolidated Standards of Reporting Trials (CONSORT) statement provides a checklist of items considered essential for the clear presentation of RCT results [16]. Similar guidelines have been developed for nonrandomized studies [17], economic evaluations [18], and decision analysis models [19]. In addition, a number of excellent articles apply general cost and outcomes methods to radiology [20, 21].
The purpose of this section is to briefly recapitulate the standard methodologic issues, with the expectation that readers who require more detail will turn to the citations listed in the text.
Study Design
Blackmore et al. [22]
identified 238 radiology cost and outcome studies conducted over a 40-year
period. Most studies presented primary data from observational cohort or
case-control studies (59%) or RCTs (18%), and the remaining studies used
secondary data available in the medical literature to build decision analysis
models. RCTs are thought to be the best method of providing unbiased evidence
on the costs and effectiveness of alternative imaging technologies
[23]. The process of randomly
allocating patients to receive one of the two or more putative technologies
makes it probable that any differences observed in subsequent outcomes will be
truly due to the imaging strategy and not caused by the myriad of individual
patient characteristics that confound the interpretation of nonrandomized
studies. However, RCTs do have drawbacks and are not necessary to answer all
radiology outcomes research questions
[24]. Most notably, rigorous
RCTs require a substantial commitment of time and money. Moreover, often only
a select subset of patients enrolls in trials, making the extrapolation of
trial results to real-world clinical practice problematic. Despite these
caveats, for the most important questions, RCTs should continue to spearhead
the push toward the rational use of diagnostic imaging.
Choosing the Perspective of the Study
Innovations in imaging rarely affect all elements of society, such as
physicians and insurers, equally. The value of imaging will depend on the
viewpoint, or perspective, of the analyst. By stating the perspective of the
study, the researcher predetermines the relevant costs that ought to be
included in the analysis. For example, a recent trial compared the cost of
abdominal CT with 120 mL of nonionic contrast versus the same technique with
100 mL of the same contrast material pushed with 40 mL of saline
[25]. From the perspective of
the hospital and society as a whole, the small cost reduction of the saline
flush method is relevant because it might generate substantial savings in the
long run. However, from the perspective of third-party insurers, who pay a
fixed reimbursement rate for contrast-enhanced CT, the cost reduction is of no
immediate relevance or value. Therefore, an explicit statement of the
perspective of the study is a vital, although often overlooked
[26], part of a cost and
outcomes study.
Current guidelines recommend that the default study perspective should be societal [18]. This is the broadest perspective and includes the costs borne by individuals and public and private organizations within society.
Measuring Costs
Table 1 provides examples of
the costs and costing methods that might be used for diagnostic technology
assessment from the point of view of four commonly encountered perspectives.
Importantly, the cost of medical care to society is not equivalent to the
charge billed by the provider. Charges incorporate both costs and a profit
margin. From the perspective of society, profit merely represents the transfer
of money from one member of society (the payer) to another (the provider); no
resources are depleted, and society as a whole is neither richer nor poorer.
Therefore, charges tend to overestimate cost.
Costs can either be calculated directly using activity-based costing (ABC) methods or indirectly using proxies for cost based on third-party insurer reimbursement rates or cost-to-charge ratios. The ABC method, also referred to as microcosting, is the more accurate and laborious. It is usually reserved for elements of cost likely to be most influential for the study results. Nisenbaum et al. [27] used ABC methods to calculate the costs of 17 CT procedures performed at a university hospital. Each element of resource use is identified, measured, and valued. For example, the CT machine cost per examination is a function of the purchase cost, maintenance and upgrade costs, machine life expectancy, yearly hours of machine operation, and the number of minutes spent imaging each patient. Using this detailed approach, a cost for all elements of the CT procedure, including consumables (e.g., contrast material and film) and radiologist, technologist, administrative, and overhead (e.g., rent) costs, is developed.
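The equipment component of an ABC calculation can be sketched in a few lines. This is a minimal illustration, not the method used by Nisenbaum et al.: it assumes straight-line depreciation with no interest on capital, and the function name and every dollar figure are hypothetical.

```python
def machine_cost_per_exam(purchase_cost, annual_maintenance, life_years,
                          hours_per_year, minutes_per_exam):
    """Equipment cost attributed to one examination.

    Assumes straight-line depreciation over the machine's life and
    ignores the opportunity cost of capital (a simplification).
    """
    annual_cost = purchase_cost / life_years + annual_maintenance
    cost_per_minute = annual_cost / (hours_per_year * 60)
    return cost_per_minute * minutes_per_exam


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example_cost = machine_cost_per_exam(
    purchase_cost=1_000_000,    # dollars
    annual_maintenance=100_000, # dollars per year, including upgrades
    life_years=5,
    hours_per_year=2_000,
    minutes_per_exam=20,
)
print(f"Machine cost per examination: ${example_cost:.2f}")
```

Because yearly operating hours sit in the denominator, halving machine utilization doubles the equipment cost assigned to each examination.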
The intricate ABC approach is not always feasible, and simpler methods are often sufficient. For example, the Centers for Medicare and Medicaid Services has made extensive efforts to implement a resource-based relative value scale (RBRVS) of reimbursement. This system provides reimbursement for each radiology procedure based on the perceived complexity and resource utilization required to perform that procedure. One advantage of this system is that it is standardized at a national level. Nevertheless, recent work has indicated that substantial inaccuracies may still exist in reimbursement rates, resulting in poorly (e.g., radiography and interventional) and favorably (e.g., sonography, MR, and CT) reimbursed techniques [28]. Other authors have used cost-to-charge ratios to estimate cost by removing the element of profit in the charges billed for medical procedures [29]. The cost-to-charge ratio is the ratio of annual departmental expenditure to revenue. However, because the profit margin may vary widely among imaging examinations, the devaluation of charges based on uniform departmental-level cost-to-charge ratios provides only a crude estimate of the cost of individual imaging examinations. Therefore, overreliance on reimbursement rates or cost-to-charge ratios may distort the cost analysis. In practice, there is a trade-off between the accuracy and the feasibility of costing methods. Many studies use a combination of ABC methods for key cost elements, such as the initial imaging, and cost proxies for other costs, such as subsequent medications and inpatient and outpatient care.
All cost data should be standardized and updated to reflect current costs. Often, because of the scarcity of cost information, analysts draw on cost data from several years. In these circumstances, historical cost data are inflated to current values using the medical care component of the consumer price index. On a similar theme, current U.S. guidelines recommend that future costs, savings, and health outcomes be discounted at a rate of 3% per year [18]. Therefore, a screening test in 2004 that prevented $1,000 of treatment costs in 2006 would receive credit for saving only $943 (i.e., $1,000 / [1 + 0.03]^2). The rationale for discounting is based on evidence that people prefer to have resources now rather than in the future for several reasons, including the opportunity to profitably invest current funds. Controversially, discounting lowers the estimated efficiency of screening interventions, in which costs occur immediately but benefits are delayed.
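The two adjustments described above, inflating historical costs and discounting future ones, can be written as short functions. This is a sketch only: the function names are illustrative, and the price index values in the usage line are hypothetical rather than real consumer price index figures.

```python
def inflate(amount, index_then, index_now):
    """Restate a historical cost in current dollars using a price index."""
    return amount * index_now / index_then


def present_value(amount, years_ahead, rate=0.03):
    """Discount a future cost, saving, or health outcome to today's value."""
    return amount / (1 + rate) ** years_ahead


# The example from the text: $1,000 of treatment costs avoided 2 years ahead.
saving_2006 = present_value(1_000, years_ahead=2)
print(f"Discounted saving: ${saving_2006:.0f}")

# Hypothetical index values, used only to show the call.
cost_today = inflate(100, index_then=200, index_now=250)
print(f"Inflated historical cost: ${cost_today:.2f}")
```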
Choosing the Type of Economic Evaluation and Measuring Outcomes
Although there are four types of economic evaluation commonly defined in
the literature (Table 2), most
health care studies can be classified as one of two types. Currently, the most
prevalent method is cost-effectiveness analysis, accounting for more than 80%
of published analyses [30].
The distinguishing feature of cost-effectiveness analysis is that the outcome
measure used reflects only a limited aspect of health. This primary outcome
can be a clinical measure such as mortality, bone density, or exercise
tolerance, or a patient-reported measure such as pain or quality of life. For
example, in an RCT comparing coronary interventions guided by intravascular
sonography or angiography, Mueller et al.
[31] used 2-year major cardiac
event-free survival to determine whether either imaging method improved
patient outcomes. Cost-effectiveness analysis works well in situations in
which imaging is expected to improve one predominant aspect of health.
However, if imaging is likely to affect more than one element of health or
longevity, then the more inclusive quality-adjusted life year (QALY) outcome
measure used in cost-utility analysis is recommended.
Cost-utility analysis measures outcomes by weighting years of life by a factor (Q) that represents the patient's health-related quality of life. Q is anchored at 1 (perfect health) and 0 (a health state considered to be as bad as death) and is estimated for all health states between these extremes. A QALY is simply the number of years that a patient spends in each health state multiplied by the quality of life weight, Q, of that state. For example, a patient who spends 2 years in an imperfect health state, where Q = 0.75, would achieve 1.5 QALYs (0.75 × 2). The quality weight, Q, can be elicited directly from patients using methods such as the visual analogue scale, time trade-off, and standard gamble; these methods have been described in detail elsewhere [21, 32]. Alternatively, in an increasing number of studies, Q is estimated indirectly via a quality of life questionnaire such as the EQ-5D [33] or the Health Utilities Index [34]. The questionnaire asks the patient to categorize current health in various dimensions, for example, physical functioning, pain, and mental health. Every possible combination of questionnaire responses is associated with a quality weight, Q, from a catalog or algorithm provided by the questionnaire creators. The weights in this catalog are based on prior surveys of the general public's preferences for the health states described by the questionnaire. This indirect approach to estimating Q is currently being used in a trial comparing duplex sonography with clinical surveillance after femoral vein bypass [35]. In this trial, imaging influences medical therapy for ischemia or surgical decisions to amputate and therefore affects several aspects of health, including mobility, self-care, and pain. These researchers chose the EQ-5D questionnaire, which incorporates all of these dimensions of health.
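The QALY arithmetic can be expressed as a sum over health states. The sketch below follows the text in restricting Q to the range 0 to 1; the function name and the second, two-state patient history are hypothetical.

```python
def qalys(health_states):
    """Sum quality-adjusted life years over (years, Q) pairs.

    Q is restricted to 0..1 here, following the anchoring described
    in the text; this is a simplification for illustration.
    """
    total = 0.0
    for years, q in health_states:
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"quality weight out of range: {q}")
        total += years * q
    return total


# The example from the text: 2 years in a state with Q = 0.75.
print(qalys([(2, 0.75)]))

# Hypothetical history: 2 years at Q = 0.75, then 3 years in perfect health.
print(qalys([(2, 0.75), (3, 1.0)]))
```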
The QALY provides a universal outcome measure that could be used in all clinical trials. Therefore, the efficiency of femoral vein sonography from the trial just described could, in theory, be compared with any other medical intervention in which cost-utility analysis data are available. For this reason, current guidelines favor cost-utility analysis as the most useful method for policy makers [18]. However, some authors are skeptical of the QALY method [36], and it is likely that cost-effectiveness analysis will remain a popular method of economic evaluation in the near future.
The benefits of screening, diagnosis, and preventive treatment may influence the entire course of patients' lives. Therefore, cost and outcomes studies should strive to measure the lifetime impact of imaging. However, in prospective studies it is not practical to follow up patients indefinitely. Therefore, analysts often report the primary results after the first few years of follow-up and extrapolate any differences in cost and outcomes data over the remaining life expectancy of patients [37].
Analysis Methods
The incremental cost-effectiveness ratio (ICER) is conventionally used to
summarize the relative efficiency of medical procedures. The ICER is
calculated as follows:
\[
\mathrm{ICER} = \frac{\bar{C}_1 - \bar{C}_2}{\bar{E}_1 - \bar{E}_2} = \frac{\Delta C}{\Delta E}
\]
where \(\bar{C}_1\), \(\bar{C}_2\) and \(\bar{E}_1\), \(\bar{E}_2\) are the mean cost and effectiveness of the
two imaging strategies being compared, and \(\Delta C\) and \(\Delta E\) are the difference between the mean costs and mean
effectiveness of the two strategies, respectively. Therefore, a screening strategy that increases costs by an average of $500 per patient and improves life expectancy by an average of 0.04 QALYs per patient, has an ICER of $12,500 per QALY saved. Typically, less cost-effective imaging strategies will have higher, positive, ICER values. However, no consensus exists on an exact threshold that would distinguish efficient from inefficient health care interventions. In reality, this threshold will vary over time and according to many other factors, including the amount of money available to fund health care.
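As a quick check on the screening example, the ratio can be computed directly from the two mean differences. The function name and argument names below are illustrative.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: extra cost per unit of effect."""
    if delta_effect == 0:
        raise ZeroDivisionError("ICER is undefined when effectiveness is equal")
    return delta_cost / delta_effect


# The example from the text: +$500 and +0.04 QALYs per patient.
screening_icer = icer(delta_cost=500, delta_effect=0.04)
print(f"ICER: ${screening_icer:,.0f} per QALY")
```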
The ICER statistic has several weaknesses. Most important, the meaning of a negative ICER statistic is ambiguous and open to misinterpretation. For example, an efficient imaging strategy that is both cheaper ($1,000) and more effective (0.1 QALYs) than the strategy with which it is being compared has an ICER of −$10,000. Likewise, an inefficient imaging strategy that is both more expensive ($500) and less effective (0.05 QALYs) than the strategy with which it is being compared also has the same ICER value, −$10,000. The policy implications of these two scenarios are diametrically opposed, yet the ICER is identical. Furthermore, merely presenting the ICER estimate without quantifying the surrounding confidence interval is of limited value. Unfortunately, however, the ICER has an undefined variance; this complicates even simple statistical tasks such as hypothesis testing and confidence interval calculation [38].
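The ambiguity is easy to reproduce. In the sketch below, both scenarios from the paragraph above give a ratio of minus $10,000 per QALY, and only the signs of the two differences tell them apart. The labels "dominant" and "dominated" are common health economics terms that do not appear in the text, and the function names are illustrative.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost per unit of incremental effect."""
    return delta_cost / delta_effect


def quadrant(delta_cost, delta_effect):
    """Classify a comparison by the signs of the cost and effect differences."""
    if delta_cost < 0 and delta_effect > 0:
        return "dominant"    # cheaper and more effective
    if delta_cost > 0 and delta_effect < 0:
        return "dominated"   # more expensive and less effective
    return "trade-off"       # the ICER must be judged against a threshold


scenarios = {
    "cheaper and more effective": (-1_000, 0.1),
    "more expensive and less effective": (500, -0.05),
}
for label, (d_cost, d_effect) in scenarios.items():
    ratio = icer(d_cost, d_effect)
    print(f"{label}: ICER = {ratio:,.0f} per QALY, {quadrant(d_cost, d_effect)}")
```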
In recognition of these weaknesses, newer methods are emerging, such as the net benefit statistic and cost-effectiveness acceptability curves, and are being used to complement or supplant the ICER statistic in economic analyses. These emerging methods will be discussed in the final section of this article.
It is often difficult to generalize cost-effectiveness results observed in one imaging center to other settings. For example, a survey of 26 Canadian MRI centers concluded that the average operating time per week was 64 hr (range, 25–113 hr) [39]. It would be unreasonable to assume that the cost of MRI equipment per examination is identical for centers at opposite ends of this spectrum. Therefore, sensitivity analysis is frequently used to judge whether study conclusions might be reversed by plausible deviations in parameters, such as the intensity of MRI machine utilization, that underpin cost and efficacy estimates. In the example given, the sensitivity analyst might vary the mean capital cost of MRI by ± 60% to simulate the plausible variation in operating hours and to judge whether a particular application of MRI is likely to be efficient even in centers with low patient throughput. Sensitivity analysis takes many forms, including one-way, multiway, and threshold analyses. These methods have been described in detail in a previous article in this series [40].
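A one-way sensitivity analysis of the kind described can be sketched by scaling a single input and recomputing the result. Everything numeric here is hypothetical, including the base-case costs, the QALY gain, and the $50,000 threshold, which is used only to show how a conclusion might reverse.

```python
def one_way_sensitivity(capital_cost, other_cost, delta_effect, multipliers):
    """Return (multiplier, ICER) pairs as the capital cost is scaled."""
    rows = []
    for m in multipliers:
        delta_cost = capital_cost * m + other_cost
        rows.append((m, delta_cost / delta_effect))
    return rows


# Hypothetical base case: MRI adds $250 of capital cost and $150 of other
# costs per patient, and gains 0.01 QALYs relative to the comparator.
THRESHOLD = 50_000  # hypothetical willingness to pay per QALY
rows = one_way_sensitivity(250, 150, 0.01, [0.4, 0.7, 1.0, 1.3, 1.6])
for m, ratio in rows:
    verdict = "below" if ratio <= THRESHOLD else "above"
    print(f"capital cost x {m:.1f}: ICER ${ratio:,.0f} per QALY ({verdict} threshold)")
```

With these made-up inputs the ratio crosses the threshold only at the highest capital cost, which is the sort of finding a low-throughput center would want to know about.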
Emerging Analytic Methods
Evaluating the Imaging Process from the Patient's Perspective
In many clinical applications there are now a multitude of highly accurate
imaging alternatives available. It is frequently impossible to differentiate
between two imaging techniques purely on the basis of their impact on patient
health or medical care costs. In these circumstances, researchers have begun
to formally assess patients' views on the desirability of competing imaging
procedures. For example, Blanchard et al.
[41] found that 26% of
patients undergoing shoulder MRI reported it to be unpleasant or extremely
unpleasant compared with 7% undergoing arthrography, although most patients
would allow either test to be repeated
[41]. Swan et al.
[42] developed a method for
further quantifying the strength of patient preferences. They report that, on
average, patients with peripheral vascular disease would be willing to wait an
extra 6 weeks for imaging results and treatment if they could avoid the
discomfort and risk of X-ray angiography. By comparison, patients would wait
just more than 2 weeks to avoid the MR angiography procedure
[42].
Net Benefits
Presenting cost-effectiveness results using the net benefit statistic
resolves many of the problems associated with incremental cost-effectiveness
ratios [43]. The net benefit
statistic is calculated as follows:
\[
\mathrm{Net\ benefit} = (\lambda \times \Delta E) - \Delta C
\]
where \(\Delta E\) and \(\Delta C\) are, as before, the differences in mean effectiveness and mean cost, and \(\lambda\) is the amount that society is willing to pay for an improvement
in health. Therefore, continuing the previous example, if society is willing to pay $100,000 per QALY gained, then our hypothetical screening strategy that increased mean QALYs by 0.04 and increased mean costs by $500 would have a net benefit of $3,500 ([$100,000 × 0.04] - $500). Unlike the ICER, the interpretation of the net benefit statistic is clear-cut; a positive value indicates a cost-effective imaging strategy in which the net costs are more than justified by the net benefits, whereas a negative value indicates the opposite. The larger the net benefit statistic, the more cost-effective the imaging strategy and the more highly it should be prioritized. Furthermore, in large samples the mean net benefit statistic is normally distributed; therefore, hypothesis testing and confidence interval calculation are straightforward [43].
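The net benefit calculation, and the way it separates the two scenarios that share a negative ICER, can be sketched as follows. The function and variable names are illustrative, and the $100,000 willingness to pay is the value assumed in the text's example.

```python
def net_benefit(wtp, delta_effect, delta_cost):
    """Monetary net benefit: value of the health gain minus the extra cost."""
    return wtp * delta_effect - delta_cost


WTP = 100_000  # willingness to pay per QALY, as assumed in the text's example

# The screening example: +0.04 QALYs and +$500 per patient.
screening_nb = net_benefit(WTP, delta_effect=0.04, delta_cost=500)
print(f"Screening strategy: ${screening_nb:,.0f}")

# The two scenarios with identical negative ICERs now have opposite signs.
efficient_nb = net_benefit(WTP, delta_effect=0.1, delta_cost=-1_000)
inefficient_nb = net_benefit(WTP, delta_effect=-0.05, delta_cost=500)
print(f"Cheaper and more effective: ${efficient_nb:,.0f}")
print(f"More expensive and less effective: ${inefficient_nb:,.0f}")
```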
One potential limitation of the net benefit approach is that \(\lambda\), the
value that society is willing to pay for improved health, must be explicitly
quantified and embedded in the net benefit calculation. In general, \(\lambda\)
is not accurately known and will vary from setting to setting. To address this
limitation, many authors now present their results across the spectrum of
\(\lambda\) values. These values range from $0, implying that society cannot
afford or is not willing to pay anything for improved health and will simply
choose the cheapest option, through to millions of dollars, implying that
society wishes and is able to pay handsomely for even the most meager health
improvements. Using resampling or simulation methods
[44], the probability that the
net benefit statistic is positive (i.e., the intervention is cost-effective)
can be calculated for each value of \(\lambda\) and presented as a
cost-effectiveness acceptability curve (CEAC).
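One way to obtain these probabilities is a nonparametric bootstrap of patient-level costs and outcomes. The sketch below is an assumption-laden illustration rather than the procedure of any cited study: the function name, the number of resamples, and the tiny data set are all hypothetical, and the bootstrap is only one of the resampling or simulation methods that could be used.

```python
import random
from statistics import mean


def ceac(costs_a, effects_a, costs_b, effects_b, wtp_values,
         n_boot=1000, seed=0):
    """Probability that strategy A has a positive net benefit versus B.

    Patients are resampled with replacement within each arm, keeping each
    patient's cost and effect together. Returns (wtp, probability) pairs.
    """
    rng = random.Random(seed)
    arm_a = list(zip(costs_a, effects_a))
    arm_b = list(zip(costs_b, effects_b))
    replicates = []
    for _ in range(n_boot):
        boot_a = rng.choices(arm_a, k=len(arm_a))
        boot_b = rng.choices(arm_b, k=len(arm_b))
        d_cost = mean(c for c, e in boot_a) - mean(c for c, e in boot_b)
        d_effect = mean(e for c, e in boot_a) - mean(e for c, e in boot_b)
        replicates.append((d_cost, d_effect))
    curve = []
    for wtp in wtp_values:
        positive = sum(wtp * de - dc > 0 for dc, de in replicates)
        curve.append((wtp, positive / n_boot))
    return curve


# Hypothetical patient-level data (cost in dollars, effect in QALYs).
curve = ceac(
    costs_a=[2400, 1900, 3100, 2200, 2600], effects_a=[0.70, 0.82, 0.65, 0.78, 0.74],
    costs_b=[2000, 2300, 1800, 2500, 2100], effects_b=[0.72, 0.69, 0.80, 0.66, 0.75],
    wtp_values=[0, 10_000, 50_000, 100_000],
)
for wtp, prob in curve:
    print(f"willingness to pay ${wtp:>7,}: P(cost-effective) = {prob:.2f}")
```

Fixing the seed makes the curve reproducible; in practice the number of resamples would be chosen large enough for the plotted probabilities to be stable.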
Cost-Effectiveness Acceptability Curves
The CEAC describes the probability that an imaging intervention is
cost-effective at different willingness-to-pay thresholds.
Figure 1 shows the information
provided by the CEAC from a randomized trial comparing rapid MRI with
radiography as the initial imaging test in patients with lower back pain
[45]. The primary finding of
this trial was that costs were slightly ($300), but not statistically
significantly, higher in patients initially imaged with rapid MRI and that
there was no clinically or statistically important difference in physical
function outcomes. In this trial, the ICER alone is difficult to interpret
because it is negative and has an undefined confidence interval. The CEAC
provides more useful information. In this case, the curve crosses the
y-axis, where society places no value on improvements in back-related
function, at 0.16 (Fig. 1).
This confirms that, on the basis of the trial data, a 16% probability still
exists that rapid MRI is the cheapest strategy. Therefore, more data are
required to state with certainty that the rapid MRI strategy is more expensive
than radiography. As we move right along the x-axis, the probability
that rapid MRI is cost-effective increases. This reflects the fact that the
more society is willing to pay for improvements in physical function, the more
likely it is that the extra cost of rapid MRI will be justified by small
improvements in function. However, in this example, the probability curve
flattens quickly and never rises above 0.50. This happens because the trial
data provide no substantive evidence that the rapid MRI strategy is either
more or less effective than radiography. Therefore, even if society is willing
to pay excessively for improved health, a 50% probability still exists that
rapid MRI is not the most effective strategy. This graph informs the decision
maker that it is probable, but not certain, that rapid MRI is currently not a
cost-effective initial imaging tool for improving the function of patients
with lower back pain.
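The two landmarks read off the curve above, its starting point on the y-axis and the level at which it flattens, follow directly from the net benefit formula: with a willingness to pay of zero, net benefit is positive only when the new strategy is cheaper, and as willingness to pay grows the sign is decided by the difference in effectiveness. A minimal sketch, using hypothetical resampled differences rather than the trial's data:

```python
def ceac_limits(replicates):
    """Return (P(cheaper), P(more effective)) from resampled differences.

    replicates: list of (delta_cost, delta_effect) pairs, for example
    bootstrap replicates of the new strategy minus the comparator.
    """
    n = len(replicates)
    p_cheaper = sum(d_cost < 0 for d_cost, _ in replicates) / n
    p_more_effective = sum(d_effect > 0 for _, d_effect in replicates) / n
    return p_cheaper, p_more_effective


# Ten hypothetical replicates (cost difference in dollars, effect difference).
replicates = [
    (350, 0.2), (280, -0.1), (-40, 0.3), (410, -0.4), (300, 0.1),
    (-20, -0.2), (320, -0.3), (260, 0.5), (390, -0.1), (310, 0.05),
]
intercept, plateau = ceac_limits(replicates)
print(f"Curve starts at {intercept:.2f} and levels off near {plateau:.2f}")
```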
Conclusions
This article provides a starting point for radiologists and allied health professionals who have an interest in conducting or applying the results of health services research. By its very nature, health services research is multispecialty research because the diagnostic information provided by radiology must be combined with the therapeutic expertise of other clinical specialties to improve the health of patients. This fact, coupled with the large sample sizes needed to provide a definitive answer to some screening questions, can make this type of research seem daunting. However, there are now numerous examples where simple observational studies [12, 13] and compact randomized trials [25, 45] have been used to elucidate the links between diagnostic imaging and the ultimate goal of better health for patients. It seems inevitable that the frequency and importance of these cost and outcomes studies will continue to increase in the future.
Acknowledgments
The author thanks Jeffrey G. Jarvik, MD, MPH, for his useful comments on an earlier version of this manuscript.