Health Care Policy and Quality
Original Research
Radiologist Report Turnaround Time: Impact of Pay-for-Performance Measures
OBJECTIVE. Expedited finalized radiologist report turnaround times (RTAT) are considered an important quality care metric in medicine. This study was performed to evaluate the impact of a radiologist pay-for-performance (PFP) program on reducing RTAT.
MATERIALS AND METHODS. A radiologist PFP program was used to assess its impact on RTAT for all departmental reports from 11 subspecialty divisions. Study periods were 3 months before (baseline period) and immediately after (immediate period) the introduction of the program and 2 years later after the program had terminated (post period). Three RTAT components were evaluated for individual radiologists and for each radiology division: examination completion (C) to final signature (F), C to preliminary signature (P), and P to F.
RESULTS. Eighty-one radiologists met the inclusion criterion for the study and performed a final signature on 99,959 reports during the baseline period, 104,673 reports during the immediate period, and 91,379 reports during the post period. Mean C–F, C–P, and P–F for all reports decreased significantly from baseline to immediate to post period (p < 0.0001), with the largest effect on the P–F component. Similarly, divisional C–F, C–P, and P–F also significantly decreased (p < 0.0001) for all divisions except the C–F for nuclear and neurovascular radiology from baseline to immediate period and the C–P component from baseline to post period for cardiac radiology.
CONCLUSION. A radiologist PFP program appears to have a marked effect on expediting final report turnaround times, which continues after its termination.
Keywords: pay-for-performance, radiologist reporting, report turnaround time
The landmark article “To Err Is Human: Building a Safer Health System,” released by the Institute of Medicine in 1999, estimated that up to 100,000 Americans die unnecessarily each year due to variable and substandard medical care [1]. This disturbing statistic set off a wave of debates on the quality and safety of American medicine, and since then there has been intense focus among health policy makers from all sides to create health delivery systems that improve quality, contain costs, reduce waste, eliminate inefficiency, and enhance worker productivity and responsibility [2]. To achieve these goals, the Institute of Medicine went further and recommended the introduction of financial incentives, primarily designed to motivate providers to deliver better quality care [3, 4].
From the patients' and payers' perspectives, these principles seem to be self-evident and long overdue [5]. One increasingly popular trend, used both by the Centers for Medicare & Medicaid Services and by third-party payers, attempts to address variation in quality and performance through the use of pay-for-performance (PFP) programs. Individuals or organizations entering PFP agreements are essentially compensated through one of two mechanisms: a straight bonus that rewards providers with additional payments for achieving stipulated performance targets or placement of a percentage of contracted provider revenue that is directly at risk if these targets are not met [6]. Each has advantages and disadvantages to individuals or organizations. Although there are preliminary data available that support the validity of PFP programs in general [7, 8], there are scant published data on the outcomes of specific PFP programs in radiology [7]. In fact, some have argued that there are a number of barriers and obstacles to overcome before PFP in radiology becomes meaningful [4, 7, 9, 10]. Others have argued that its implementation can easily be abused or mismanaged [5].
There is also debate as to what PFP initiatives are relevant to radiology and radiologists [6, 10]. Some departments have introduced utilization targets for selected imaging studies [6]. Others have suggested customer satisfaction surveys [11] or peer-review programs, among others [6]. However, as part of a general strategy aimed at improving the quality of clinical care (a major PFP driver), the Massachusetts General Physicians Organization (MGPO) at the Massachusetts General Hospital (MGH) recently initiated a hospitalwide department-specific PFP initiative. The goal was to develop a program that was easily measurable; likely to succeed; and, most important, likely to have a positive impact on the quality of patient care. It was therefore decided by the chief of radiology in collaboration with the MGPO to establish a PFP program that measured radiologist report turnaround time (RTAT). This was largely driven by complaints from referring physicians that many preliminary radiology reports (those initially dictated by residents or fellows) were left unfinalized by staff radiologists for unnecessarily long periods of time.
We therefore decided to evaluate the impact of introducing this PFP program specifically aimed at reducing staff radiologist RTAT. The goal was to reduce radiologist RTAT to specified targets, whereupon a semiannual financial bonus would be paid to individual radiologists who met these guidelines. We also evaluated whether there was any negative impact on RTAT once the PFP program was terminated.
The radiology reporting algorithm for the department of radiology at the MGH was standard to most academic radiology departments. The examination completion time (C) for any radiologic procedure was recorded in the radiology information system (RIS) (IDXRAD, IDX Corp.) by the performing technologist. All patient images then resided within the departmental PACS (IMPAX, Agfa-Gevaert) and were available for interpretation. A subsequent preliminary radiology report (P) was then generated by a resident or fellow after image interpretation under supervision by a subspecialty staff radiologist. All preliminary dictations were performed using speech-recognition software (RadWhere, Commissure Communications) using preloaded standardized reporting templates, which required the resident or fellow to overlay the relevant positive and negative findings. The preliminary report then resided on the RIS until it was finalized (F) by the same staff radiologist who initially interpreted the study with the resident or fellow. This required the staff radiologist to log on to the RIS and electronically sign each report after any necessary editing. This function could be performed from any computer within the hospital electronic firewall, including by remote access via a virtual private network (VPN). Exact C, P, and F times were recorded and stored on the RIS and were electronically extracted for the purposes of this study.
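The three intervals analyzed in this study follow directly from the C, P, and F timestamps stored on the RIS. The following is a minimal sketch of that calculation, assuming each report is available as three Python datetime values. The function name `rtat_components` and the sample timestamps are hypothetical and do not reflect the actual RIS export format.

```python
from datetime import datetime


def rtat_components(completed, preliminary, final):
    """Return (C-F, C-P, P-F) in hours for a single report."""
    def hours(start, end):
        return (end - start).total_seconds() / 3600.0

    return (
        hours(completed, final),        # C-F: completion to final signature
        hours(completed, preliminary),  # C-P: completion to preliminary report
        hours(preliminary, final),      # P-F: preliminary report to final signature
    )


# Hypothetical report: completed in the morning, preliminary report that
# afternoon, final signature the next morning.
c = datetime(2007, 1, 8, 9, 0)
p = datetime(2007, 1, 8, 15, 30)
f = datetime(2007, 1, 9, 8, 0)
c_f, c_p, p_f = rtat_components(c, p, f)
print(c_f, c_p, p_f)
```

By construction, the C–F interval for any single report is the sum of its C–P and P–F intervals.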
The primary goal of the PFP program was to provide radiologists with an incentive to reduce the P–F component of the RTAT cycle because this was directly under the control of the participating radiologist. The program was first announced to staff radiologists in November 2006, when they were informed that all radiologists would be eligible for the program beginning January 2007. To qualify for the financial bonus from the PFP program, each staff radiologist was initially required to meet a mean 24-hour or less P–F time for all reports signed over the course of a randomly assigned (and blind to the radiologists) 1-month period. Bonus payments (over and above base salary) of $2,500 were to be made in 6-month intervals (total $5,000 annually) if the radiologists met their P–F RTAT goals. Reports that were more than 3 weeks old were excluded for the purposes of the incentive program because it was assumed that most of these reports had some reason, other than radiologist tardiness, that meant they could not be signed (e.g., awaiting additional clinical information from the referring physician or the report was incorrectly assigned to the wrong radiologist). In January 2008, the P–F RTAT target to qualify for the PFP bonus was reduced for all radiology divisions to 8 hours, except for neurointerventional radiology and neuroradiology, which remained at 24 hours. Additionally the bonus payment was reduced to $2,500 from $5,000 annually. The radiologist RTAT PFP program was then terminated on January 1, 2009. During the course of the study, each monthly mean C–F and P–F RTAT for every radiologist was available for review by any radiologist online at any time.
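The qualification rule described above can be sketched as a short function. This is an illustration only: the name `qualifies_for_bonus` is hypothetical, and it assumes that the "more than 3 weeks old" exclusion can be applied to each report's P–F time, which the text does not state explicitly. It also assumes that a radiologist with no eligible reports in the measured month does not qualify.

```python
MAX_AGE_HOURS = 21 * 24  # assumption: the 3-week exclusion is applied to the P-F time


def qualifies_for_bonus(pf_hours, target_hours=24.0):
    """Check whether the mean P-F time for the measured month meets the target.

    pf_hours: P-F times (in hours) for reports signed during the measured month.
    target_hours: 24 hours initially; 8 hours for most divisions from January 2008.
    """
    eligible = [h for h in pf_hours if h <= MAX_AGE_HOURS]
    if not eligible:
        return False  # assumption: no eligible reports means no bonus
    return sum(eligible) / len(eligible) <= target_hours


# Hypothetical month: one report sat unsigned for 25 days and is excluded.
sample = [2.0, 10.0, 30.0, 600.0]
print(qualifies_for_bonus(sample))                    # against the 24-hour target
print(qualifies_for_bonus(sample, target_hours=8.0))  # against the 8-hour target
```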
To assess any effect of the PFP program, three RTAT components (C–F, C–P, and P–F) were evaluated over a 3-month period before, immediately after initiation, and after termination of the program. The first period was chosen to represent a period (July–September 2006) before institution of the PFP program and before any radiologist was informed of the program. This period was used to reflect the baseline performance of staff radiologists across the department. The second period (January–March 2007) was chosen to assess the immediate impact of the program. The third, or post period (January–March 2009) was chosen to assess the longer-term impact of the PFP program after its termination.
A total of 234 clinical radiologists had final signing privileges at any time during the course of the study (July 2006–March 2009). This number included many fellows and some staff who either had a temporary assignment within the department or left the department at some point during the course of the study and were therefore not present during all three periods of the study (baseline, immediate, and post). To obtain an unbiased measure of the impact of the incentive program, it was necessary to include only those radiologists present during the entire duration of the study. There were a total of 81 staff radiologists who met this criterion and were therefore included in the study. These 81 staff radiologists uniquely worked within 11 clinical radiology divisions including musculoskeletal (n = 8), cardiac (n = 1), chest (n = 7), neurointerventional (n = 4), emergency (n = 6), gastrointestinal and genitourinary (n = 19), breast (n = 7), neuroradiology (n = 15), nuclear (n = 5), pediatric (n = 4), and vascular (n = 5). Vascular radiology refers to the division performing both nonneurologic vascular interventional procedures and vascular cross-sectional imaging.
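The inclusion criterion amounts to keeping only those radiologists who signed reports in all three study periods. A minimal sketch of that filter follows, with a check that the divisional counts listed above sum to 81. The function name and the radiologist identifiers are hypothetical.

```python
def radiologists_in_all_periods(signers_by_period):
    """Return the IDs present in every period.

    signers_by_period: dict mapping a period name to an iterable of radiologist IDs.
    """
    groups = [set(ids) for ids in signers_by_period.values()]
    return set.intersection(*groups) if groups else set()


# Hypothetical IDs: only "r1" and "r3" signed reports in all three periods.
example = {
    "baseline": ["r1", "r2", "r3"],
    "immediate": ["r1", "r3", "r4"],
    "post": ["r1", "r3"],
}
included = radiologists_in_all_periods(example)

# Divisional counts as reported in the text.
division_counts = {
    "musculoskeletal": 8, "cardiac": 1, "chest": 7, "neurointerventional": 4,
    "emergency": 6, "gastrointestinal and genitourinary": 19, "breast": 7,
    "neuroradiology": 15, "nuclear": 5, "pediatric": 4, "vascular": 5,
}
total_included = sum(division_counts.values())
print(sorted(included), total_included)
```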
The mean C–F, C–P, and P–F for all reports generated by the 81 radiologists during the specified time periods were initially evaluated. A further subanalysis (mean C–F, C–P, and P–F) for each of the 11 subspecialty divisions was also performed to determine the effect of the incentive program at a divisional, rather than a departmental level.
The Kruskal-Wallis test was used with SAS statistical software, version 9.1 (SAS Institute), to compare the mean times for the three periods (baseline, immediate, and post) with a post hoc pairwise comparison of these periods when the overall test was significant (p ≤ 0.05). To determine whether differences in the number of reports per radiologist affected the results, we calculated the mean times for each radiologist for each period. These mean values were then subjected to a repeated measures analysis of variance. Because the results were unchanged in terms of the significance for between-period differences, we present only the more robust nonparametric p values in this article.
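For readers unfamiliar with the Kruskal-Wallis test, the following sketch shows how its H statistic is computed from pooled ranks. It is not the SAS procedure used in this study. The function name `kruskal_wallis` is hypothetical, the p value relies on the usual chi-square approximation, and the closed form used here applies only to three groups (2 degrees of freedom), which matches the three study periods.

```python
import math


def kruskal_wallis(*groups):
    """Return (H, p) for three groups, with tie correction."""
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("this sketch expects exactly three non-empty groups")

    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                # number of tied observations
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction

    # Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    p = math.exp(-h / 2.0)
    return h, p


# Hypothetical turnaround times (hours) for three periods.
h_stat, p_value = kruskal_wallis([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h_stat, p_value)
```

Because the test operates on ranks rather than raw times, it is less sensitive to a small number of very long turnaround times than a comparison of means would be.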
The 81 radiologists who met the inclusion criterion for the analysis performed a final signature on 99,959 reports during the baseline period, 104,673 reports during the immediate period, and 91,379 reports during the post period. The mean RTATs (C–F, C–P, and P–F) for all reports for the baseline, immediate, and post periods are listed in Table 1. The mean C–F times for all radiologists significantly decreased from the baseline (42.7 hours) to the immediate period (31.6 hours) to the post period (16.3 hours) (p < 0.0001). Similarly, the mean C–P time also declined across all three periods, from 20.0 hours at baseline to 19.0 hours at the immediate period to 11.9 hours during the post period (p < 0.0001). This trend was also observed when evaluating the P–F component, with a reduction from the baseline (22.7 hours) to the immediate period (12.6 hours) to the post period (4.4 hours) (p < 0.0001). At an individual level, seven of 81 radiologists did not meet the P–F performance target during the first PFP measurement period (January–March 2007), and 10 of 81 radiologists did not meet the target during the second period (January–March 2009). Five of these radiologists met neither the first target (5 of 7) nor the second target (5 of 10).
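As a simple arithmetic check on the figures above, the mean C–F time in each period should equal the sum of the mean C–P and mean P–F times, provided every report contributes to all three means. The sketch below applies that check to the reported values and computes the relative reduction from baseline to post period. The function names and the rounding tolerance are assumptions made for illustration.

```python
# Mean turnaround times in hours, as reported in the text.
mean_hours = {
    "baseline":  {"C-F": 42.7, "C-P": 20.0, "P-F": 22.7},
    "immediate": {"C-F": 31.6, "C-P": 19.0, "P-F": 12.6},
    "post":      {"C-F": 16.3, "C-P": 11.9, "P-F": 4.4},
}


def components_add_up(row, tol=0.1):
    """True if C-P + P-F matches C-F within a rounding tolerance (hours)."""
    return abs(row["C-P"] + row["P-F"] - row["C-F"]) <= tol


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


all_consistent = all(components_add_up(row) for row in mean_hours.values())
pf_drop = percent_reduction(mean_hours["baseline"]["P-F"], mean_hours["post"]["P-F"])
cf_drop = percent_reduction(mean_hours["baseline"]["C-F"], mean_hours["post"]["C-F"])
print(all_consistent, round(pf_drop, 1), round(cf_drop, 1))
```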
From a divisional perspective, there was also a significant (p < 0.0001) reduction in mean C–F time from the baseline to immediate period for all divisions except for two divisions with the smallest number of reports: nuclear radiology (p = 0.32) and neurovascular radiology (p = 0.04) (Table 2). All divisions showed a significant reduction (p < 0.0001) from the immediate to post period. Similarly, there was a significant reduction (p < 0.0001) in the C–F time for all divisions from the baseline to post period (Table 2).
There was also a significant reduction in the mean P–F times (Table 3) for all divisions from the baseline to immediate period, from the immediate to post period, and from the baseline to post period (p < 0.0001). However, there was a variable divisional C–P response (Table 4), with some divisions showing a significant (p < 0.0001) decrease from the baseline to immediate period (cardiac, neurovascular, emergency radiology, gastrointestinal and genitourinary, breast, and neuroradiology) and other divisions showing an actual increase (musculoskeletal, chest, nuclear, pediatric, and vascular radiology). Most divisions, however, showed a significant (p < 0.0001) drop in C–P times from the baseline to post period; the exception was cardiac radiology.
Some have described the increasing trend toward PFP measures in medicine as a tsunami or freight train that will not be sidelined [6, 10]. Across medicine as a whole, PFP initiatives are now becoming embedded into clinical practice as organizations struggle to increase the value and quality of their medical care in a cost-effective manner [6, 12–18]. For radiology, there have been sporadic attempts by payers to reduce imaging utilization through PFP programs, but radiologists to date have been relatively unaffected by PFP initiatives [4, 6, 11, 19]. However, there is increasing acceptance by radiologists that PFP measures are here to stay and will steadily infiltrate the practice of radiology [4, 6, 7, 11–18].
As part of this nationwide trend, the MGPO recently embarked on a wide-ranging quality incentive program, designed to improve key quality metrics within departments, with the ultimate goal of improving patient care. The physician's organization chose the straight financial PFP bonus model, targeted directly at the individual physician level. In other words, physicians who individually met the stipulated goals would qualify for the bonus. For the radiology department, the MGPO chose individualized staff radiologist P–F RTAT performance to qualify for the bonus. Other targets could have been chosen (for instance, compliance to standardized reporting), but the RTAT was considered to be an important patient care quality metric, which was easily measurable with accepted benefits and was under direct control of the participating radiologists. Indeed, it has been proposed that radiologists create limited value until a final signed report is available to the ordering physician [20–24].
In competitive private practice radiology markets, RTAT has long been considered a significant service and quality differentiator between competing radiology practices [21]. Referring physicians will often choose to direct patient imaging toward those practices with the shortest RTAT [21, 23, 24]. However, in relatively noncompetitive environments, particularly large academic teaching hospitals, there has traditionally been less emphasis on minimizing the RTAT. This could be because the primary differentiator for academic practices is subspecialization, and radiologists may therefore perceive that expedited RTATs are of less value to their referring physicians. On the other hand, given that residents or fellows will usually generate a preliminary report (in conjunction with staff radiologists) within relatively short timelines, some stakeholders, including staff radiologists, may believe that the necessity to finalize the preliminary report expeditiously is less critical.
To meet the demands for improved RTAT, there has been an increasing trend toward the use of speech-recognition software (SR) for report dictation. This has been shown to reduce the finalized RTAT because radiologists can edit their reports in real-time and provide a finalized version at the time of the initial dictation [25]. Speech recognition offers the opportunity to reduce the time from study completion to final report significantly; however, there remains an additional rate-limiting step in the reporting algorithm for academic departments. The report will usually only be finalized after the staff radiologist has electronically accessed the preliminary report (initially transcribed by the resident or fellow into preliminary status) and made the necessary edits before electronically signing it. The time for the report to move from preliminary to finalized status could take days, even weeks occasionally, depending on the staff radiologist's motivation or travel schedule. This was considered unacceptably long by our organization, hence the PFP program that was based on the hypothesis that RTAT could be reduced to acceptable levels through the use of financial incentives.
As shown in this study, the PFP program appeared to achieve the desired effects, with mean C–F time significantly reduced from 42.7 hours during the baseline period to 31.6 hours during the immediate period. Interestingly, the downward trend continued even after termination of the PFP program, with a mean C–F time of 16.3 hours during the post period. The effect was most profound, however, for that part of the reporting algorithm most under direct control of the staff radiologist, notably the P–F component. Perhaps this is predictable because the incentive was primarily directed at changing individual staff radiologist behavior. Mean P–F was reduced dramatically from 22.7 hours during baseline to 12.6 hours during the immediate evaluation period. Similarly, P–F RTAT remained significantly reduced at a mean of 4.4 hours during the post period.
Interestingly, the mean RTAT for that part of the reporting algorithm not directly influenced by the PFP incentive (the C–P component) was also reduced during the course of the study (from 20 hours at baseline to 19 hours during the immediate period to 11.9 hours during the post period), although the effect was not nearly as profound as observed for the P–F component. It could be argued that the PFP program would not be expected to reduce C–P times because no fundamental changes or incentives (either financial or nonfinancial) were made to this component of the reporting workflow. This could imply that there might be some other factor that contributed to the reduction in C–P times during the PFP period. However, the effect on the C–P component was not uniform across all divisions, with some divisions actually showing an increase in the C–P RTAT, particularly from the baseline to immediate period. Further study is required to determine why there was a nonuniform reduction in the C–P component across different divisions and to what extent any decrease was attributable to the PFP program.
It also should be noted that there was a substantial variation in the divisional P–F response to the PFP program. For instance, as might be expected, the P–F RTATs from the emergency radiology division were substantially lower than other divisions throughout the study, given the pressure to deliver emergency reports in a timely fashion. Elsewhere, the mean chest divisional P–F time reduced from 10.6 to 1.3 hours from the baseline to post period, whereas the neurovascular radiologists reduced their P–F times from 3 days to 16.6 hours. Some of this might be explained by the complexity of the examinations that the radiologists in different divisions perform and dictate. In the neuroradiology example, a neurointerventional procedure may have taken hours to perform, and the report may have required substantial editing by the staff radiologist because of the complexity of the study and findings. Both of these factors could limit the ability of the staff radiologists to expeditiously attend to their final signature queue.
On the other hand, a chest CT examination may have taken a few minutes to evaluate by a chest radiologist, and the report may be relatively straightforward, leaving the staff radiologist with the perception that the signing of their reports was a less burdensome process. However, other factors are probably at work because the mean gastrointestinal and genitourinary division P–F (which arguably has a similar workload to the chest division) reduced from 38.6 hours at baseline to 5.6 hours during the post period, a P–F time substantially longer than the 1.3 hours noted for chest radiologists. Presumably the chest radiologists were more diligent about signing their final reports in the first place, given that their mean P–F was 3.6 times shorter than that of gastrointestinal and genitourinary radiologists at baseline. This suggests that the directive and culture within the chest division was to encourage staff radiologists to sign their preliminary reports expeditiously even before the PFP program. This factor therefore could result in a more effective reduction in RTAT than a PFP program alone.
Although it is highly likely that the PFP incentive did directly impact staff physician behavior in this study, particularly for the P–F component, it cannot be definitively stated that the significant reduction in RTAT was, in fact, due to the PFP incentive alone. Indeed, although some studies suggest that financial incentive programs positively alter physician behavior with a consequent improvement in the quality of patient care, other studies are less clear cut [14–18]. Other factors, which this study did not measure, also may have influenced the radiologists' behavior. For instance, simply discussing the importance of an expedited RTAT to the radiologists in a more formalized and consistent manner (as was the case in this study) may have influenced some radiologists to be more attentive and diligent in signing their reports from a preliminary to a finalized status. Indeed, although it was not evaluated in this report, there was a slight reduction in RTAT (although not significant) in the month (December 2006) immediately after the PFP initiative was announced (November 2006) and before the PFP program started (January 2007).
Furthermore, both mean C–F and P–F times for each radiologist from every division were available on an ongoing basis for electronic review each month by all participating radiologists. Therefore, a radiologist within a certain division could observe not only the performance of peers within that division but also across the department as a whole. It is therefore probable that some of the PFP program effect could have been due to peer pressure, whether through direct communication between radiologists or simply not wanting to be perceived to be a poor performer.
Aside from these limitations to the study, we did not directly question individual radiologists to determine if they believed the financial incentive affected their behavior. Although the data would have been subjective, they might have provided further evidence of the positive effect of the program if the radiologists had stated that it did indeed influence them to sign their reports more expeditiously. We also did not study the longer-term effects of terminating the PFP program. Although the RTAT remained low immediately after terminating the PFP program, we do not know whether RTAT will trend upward at a later date.
On the other hand, it is interesting to note that the effect of the PFP program was more profound 2 years after its introduction than during the immediate period after its introduction. This suggests that it can take some months (or possibly years) for the PFP program to deliver its full effect. Further studies also are required to determine the optimal financial incentive necessary to effect the necessary change in radiologist behavior to meet RTAT requirements. It is interesting to note in this study that the incentive was relatively small compared with the overall radiologists' salaries (between 1% and 2%) and that relatively small incentives can have a major impact on clinical performance metrics.
Finally, this study assessed only an academic radiology department, whereby radiology reports are initially transcribed into preliminary status by residents or fellows. On the other hand, many private practice radiology groups still use third-party transcriptionists to initially transcribe radiologist reports into a preliminary status, with or without the use of speech recognition software. It is unknown, however, whether a PFP program for this group of radiologists would cause any significant reduction in the P–F RTAT, particularly given the nature of private practice, whose competitive environment appears to be a strong motivator to expedite RTAT. Furthermore, we could not assess any impact that a PFP program might have for those private practice radiologists who dictate reports directly into a finalized status. With speech recognition software, there is an increasing trend to avoid the use of third-party transcriptionists, thereby obviating reports residing in a preliminary status.
In summary, rapid completion and ready availability of final radiology reports are considered an essential clinical quality metric. PFP incentives designed to motivate academic radiologists to alter their behavior and perform expeditious finalized signature of preliminary reports appear to be successful. Although the effect appears to be retained even after terminating the program, it is yet to be determined whether this effect is permanent.
Address correspondence to G. W. L. Boland ([email protected]).