Diagnostic Per-Patient Accuracy of an Abbreviated Hepatobiliary Phase Gadoxetic Acid–Enhanced MRI for Hepatocellular Carcinoma Surveillance

OBJECTIVE. The purpose of this study is to evaluate the per-patient diagnostic performance of an abbreviated gadoxetic acid-enhanced MRI protocol for hepatocellular carcinoma (HCC) surveillance. MATERIALS AND METHODS. A retrospective review identified 298 consecutive patients at risk for HCC enrolled in a gadoxetic acid-enhanced MRI-based HCC surveillance program. For each patient, the first gadoxetic acid-enhanced MRI was analyzed. To simulate an abbreviated protocol, two readers independently read two image sets per patient: set 1 consisted of T1-weighted 20-minute hepatobiliary phase and T2-weighted single-shot fast spin-echo (SSFSE) images; set 2 included diffusion-weighted imaging (DWI) and images from set 1. Image sets were scored as positive or negative according to the presence of at least one nodule 10 mm or larger that met the predetermined criteria. Agreement was assessed using Cohen kappa statistics. A composite reference standard was used to determine the diagnostic performance of each image set for each reader. RESULTS. Interreader agreement was substantial for both image sets (κ = 0.72 for both) and intrareader agreement was excellent (κ = 0.97-0.99). Reader performance for image set 1 was sensitivity of 85.7% for reader A and 79.6% for reader B, specificity of 91.2% for reader A and 95.2% for reader B, and negative predictive value of 97.0% for reader A and 96.0% for reader B. Reader performance for image set 2 was nearly identical, with only one of 298 examinations scored differently on image set 2 compared with set 1. CONCLUSION. An abbreviated MRI protocol consisting of T2-weighted SSFSE and gadoxetic acid-enhanced hepatobiliary phase has high negative predictive value and may be an acceptable method for HCC surveillance. The inclusion of a DWI sequence did not significantly alter the diagnostic performance of the abbreviated protocol.

ity for the detection of small and early HCCs is poor, as low as 60-63% according to two meta-analyses [8,9]. Because of the low per-patient sensitivity of ultrasound for the detection of early-stage HCC, many centers in the United States perform dynamic contrast-enhanced MRI for HCC surveillance instead. Although dynamic contrast-enhanced MRI has a relatively high reported sensitivity for HCC detection (81%) and has an established role as a recall procedure in patients with positive surveillance results, its routine use in surveillance is limited by high cost and the relatively low frequency of true-positive examinations [9,10].
The high cost of individual dynamic contrast-enhanced MRI examinations can be attributed to equipment costs as well as relatively long examination times. We hypothesize that an abbreviated MRI protocol consisting only of a gadoxetic acid-enhanced hepatobiliary phase acquisition, a single-shot fast spin-echo (SSFSE) T2-weighted acquisition, and a diffusion-weighted imaging (DWI) acquisition could provide adequate per-patient sensitivity and negative predictive values for HCC surveillance. The basis of this hypothesis is that most HCCs, including early HCCs, are hypointense to the liver parenchyma in the hepatobiliary phase after gadoxetic acid administration and so are potentially detectable even in the absence of dynamic images [11-14]. The T2-weighted acquisition is included in the abbreviated examination to help differentiate HCC nodules from benign cysts and hemangiomas on the basis of their signal intensity on T2-weighted images [15-17]. The DWI acquisition is included to potentially detect HCC nodules that are not perceived or are underreported on the other acquisitions [17-19]. The potential benefit of an abbreviated examination consisting of fewer sequences and without dynamic imaging is that it could be completed more rapidly and potentially at a lower cost, thereby making it more suitable for HCC surveillance.
The purpose of this study is to evaluate the diagnostic performance of an abbreviated MRI protocol that includes gadoxetic acid-enhanced hepatobiliary phase as a potentially lower-cost alternative to conventional MRI for HCC surveillance in patients at risk for HCC.

Materials and Methods

Patients
This dual-center retrospective cross-sectional study was approved by the institutional review boards at the University of California, San Diego (UCSD), and Duke University, and the requirement for informed consent was waived at both institutions. Patients enrolled in an MRI-based HCC surveillance program who were imaged from October 23, 2008, through January 31, 2012, at the two participating institutions were eligible for this study (n = 580). Only the first available examination was included for each patient to avoid duplicates.
The inclusion criteria included a history of cirrhosis or other risk factors for HCC without having   [20]. The exclusion criteria, as shown in Figure 1, included no follow-up examinations (n = 252), inadequate follow-up examinations or procedures to meet reference standard criteria such as nonmultiphasic or unenhanced follow-up studies (n = 29), and a known malignancy other than HCC with liver metastases (n = 1). Thus, a total of 298 patients were included in the study. The population characteristics of the included patients are summarized in Table 1.

Liver Imaging Protocol
Imaging was performed using various 1.5- and 3-T systems at UCSD (Signa HDx 1.5 T, Discovery MR750w 3 T, and Signa HDxt 3 T, all from GE Healthcare) and at Duke University (Magnetom Avanto 1.5 T and Trio 3 T, both from Siemens Healthcare). For contrast-enhanced dynamic MRI sequences, the standard dose of gadoxetic acid was 0.025 mmol/kg at UCSD and 10 mL for all patients at Duke University. Detailed pulse sequence parameters are listed in Table 2. Numerous sequences other than those listed in Table 2 (including T1-weighted in-phase and out-of-phase images, fat-suppressed T2-weighted images, and vascular phase images acquired dynamically before and in the arterial, portal venous, and transitional phases after gadoxetic acid injection) were acquired for clinical care as part of a complete MRI protocol; to simulate an abbreviated surveillance protocol, these images were not analyzed in this study.

Image Analysis
For all patients who met the eligibility criteria, the first gadoxetic acid-enhanced MRI for each patient was identified as the index MRI. Two readers, one from each site (both board-certified radiologists with 1 year of experience in abdominal imaging fellowships), were blinded to the clinical data, clinical reports, reference standard, and each other's interpretations. Reader A was from UCSD, and reader B was from Duke University.
To simulate an abbreviated MRI protocol, each reader independently reviewed two image sets per patient. Set 1 consisted only of T1-weighted gadoxetic acid-enhanced hepatobiliary phase axial images and T2-weighted SSFSE images. Set 2 consisted of set 1 images and a free-breathing DWI sequence with two b values. DWI was used to either upgrade a lesion seen in set 1 or to detect new lesions. The readers were not able to view any other images acquired as part of the complete MRI examination.
For set 1, each nodule 10 mm or larger in maximal diameter was categorized according to its signal characteristics on T2-weighted SSFSE and T1-weighted hepatobiliary phase sequences by the rules shown in Figure 2. Nodules 20 mm or larger in maximal diameter were upgraded by one category.
For set 2, nodule categorization was adjusted according to the signal characteristics on DWI. On the higher b value sequence, mildly hyperintense nodules were upgraded by one category, and very hyperintense nodules, comparable to the brightness of the spleen, were upgraded to category 4 or 5. When signal hyperintensity on the higher b value image was interpreted as T2 shine-through, the nodule was not upgraded.
The final score for each examination was considered to be the highest final score of any nodule on that examination. The definitions of final imaging scores are shown in Figure 3. A final imaging score of 1-3 was interpreted as a negative examination, with the understanding that in clinical practice this would correspond to a recommendation to return to routine surveillance. A final imaging score of 1 meant that there were no nodules 10 mm or larger on the examination. A final imaging score of 4 or 5 was interpreted as a positive examination, with the understanding that in clinical practice this would correspond to a recommendation for a complete dynamic contrast-enhanced liver MRI or CT for more definitive evaluation, diagnosis, and staging.
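The examination-level rule described above can be sketched in a few lines of Python. This is an illustration only, not the software used in the study: the function names are hypothetical, and the per-nodule categories are taken as given inputs because the categorization rules of Figure 2 are not reproduced in the text.

```python
from typing import Iterable


def final_exam_score(nodule_categories: Iterable[int]) -> int:
    """Return the examination score: the highest category of any nodule.

    `nodule_categories` holds the final category (after any size or DWI
    upgrade) of each nodule 10 mm or larger. An examination with no such
    nodules receives a score of 1.
    """
    return max(nodule_categories, default=1)


def is_positive_exam(final_score: int) -> bool:
    """Scores of 4 or 5 are positive (recall); scores of 1-3 are negative."""
    return final_score >= 4


# Hypothetical examination with three nodules 10 mm or larger.
score = final_exam_score([2, 4, 3])
print(score, is_positive_exam(score))
```

Taking the maximum over nodules means that a single category 4 or 5 nodule is enough to trigger a recall, which matches the per-patient (rather than per-lesion) focus of the analysis.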

Reference Standard
A composite reference standard that incorporated the best available information was used patient by patient. The results of the complete MRI examination and the results and dates of all follow-up MRI and multiphasic CT studies were recorded for each patient. These examinations were classified using a modified version of LI-RADS version 2013.1 [20] that adds hepatobiliary phase hypointensity as an ancillary feature favoring HCC. In LI-RADS, category scores are assigned to individual observations detected at CT or MRI according to the probability of representing HCC: LI-RADS category 1 is definitely benign, LI-RADS category 2 is probably benign, LI-RADS category 3 is intermediate probability of HCC, LI-RADS category 4 is probably HCC, and LI-RADS category 5 is definitely HCC. For all pathology results, the specimen type (biopsy, resection, or explant), date, and result were recorded. In addition, the date and type of any ablative or embolic treatment without biopsy according to the clinical report of the complete index MRI were recorded. Follow-up studies, both MRI and multiphasic CT, and pathology results were recorded through June 1, 2012, to ensure that the largest number of studies with the proper reference standard could be included.
For all index MRI examinations in which the clinical radiology report described at least one LI-RADS category 3 or higher observation, the complete MRI examination was reviewed by a fellowship-trained abdominal radiologist with 4 years of experience in interpreting gadoxetic acid-enhanced MRI. From the complete examination, the number and size of LI-RADS category 3, 4, and 5 observations were recorded on the basis of the abdominal radiologist's review. This radiologist also reviewed the last complete MRI or multiphasic CT within 13 months of the index MRI for all index cases with a LI-RADS category 3-5 observation and all follow-up studies for any included patient with a LI-RADS category 3, 4, or 5 observation on the clinical report.
The reference standard was considered positive for HCC if pathologic examination within 13 months of the index MRI identified at least one HCC 10 mm or larger; follow-up imaging within 13 months identified at least one LI-RADS category 4 (probably HCC) or 5 (definitely HCC) observation 10 mm or larger; or ablative or embolic treatment without biopsy was rendered on the basis of the clinical report of the complete initial MRI. Ablative or embolic treatment without biopsy was considered a positive reference standard for the presence of HCC because the integrated imaging and clinical data were sufficiently compelling to justify treatment without biopsy, as is the standard of practice at our institutions.
The reference standard was considered negative if follow-up imaging 6-13 months after the index MRI identified no LI-RADS category 4 or 5 observation 10 mm or larger (in the absence of ablative or embolic treatment) or if explant pathologic analysis at any time after the index MRI verified the absence of an HCC 10 mm or larger.
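The composite reference standard amounts to a short decision rule, sketched below in Python. This is a hedged illustration rather than the study's actual procedure: the `FollowUp` record and its field names are hypothetical, the order in which the criteria are checked (positive criteria first) is an assumption, and a return value of `None` stands for a patient whose follow-up is inadequate to assign a reference standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowUp:
    """Hypothetical per-patient summary of follow-up information."""
    pathology_hcc_ge_10mm_within_13mo: bool = False
    imaging_lr4_or_lr5_ge_10mm_within_13mo: bool = False
    treated_without_biopsy: bool = False
    adequate_imaging_at_6_to_13mo: bool = False
    explant_shows_no_hcc_ge_10mm: bool = False


def reference_standard(f: FollowUp) -> Optional[bool]:
    """Return True (HCC), False (no HCC), or None (cannot be determined)."""
    # Positive: any one of the three criteria is sufficient.
    if (f.pathology_hcc_ge_10mm_within_13mo
            or f.imaging_lr4_or_lr5_ge_10mm_within_13mo
            or f.treated_without_biopsy):
        return True
    # Negative: explant pathology verifies absence of HCC 10 mm or larger.
    if f.explant_shows_no_hcc_ge_10mm:
        return False
    # Negative: adequate imaging 6-13 months later with no LI-RADS 4 or 5
    # observation 10 mm or larger and no ablative or embolic treatment
    # (both already ruled out by the first check).
    if f.adequate_imaging_at_6_to_13mo:
        return False
    return None


print(reference_standard(FollowUp(treated_without_biopsy=True)))
print(reference_standard(FollowUp(adequate_imaging_at_6_to_13mo=True)))
print(reference_standard(FollowUp()))
```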

Data Analysis
Statistical analysis was performed using statistical computing software (R version 2.15.1, The R Foundation for Statistical Computing) by a biostatistical analyst under the supervision of a faculty biostatistician. For analysis of intraobserver variability between the image sets and interobserver variability, a Cohen kappa analysis with equal weight was applied, with κ = 0.6-0.8 considered substantial and κ > 0.8 considered excellent. For each image set and for each reader, the per-patient sensitivity, specificity, accuracy, positive predictive value, and negative predictive values were obtained with binomial 95% CIs and bootstrap-based CIs for the averages of reader A and B. Bootstrap-based comparisons of the diagnostic performance parameters of image set 1 versus image set 2 were made.
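For readers unfamiliar with the agreement statistic, the following Python sketch shows how an unweighted Cohen kappa can be computed from two sets of ratings and mapped to the thresholds stated above. It is not the R code used in the study; the function names are hypothetical and the example ratings are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen kappa for two raters scoring the same cases."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of cases with identical ratings.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories.
    count_a, count_b = Counter(a), Counter(b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa: float) -> str:
    """Map kappa to the thresholds used in this study."""
    if kappa > 0.8:
        return "excellent"
    if kappa >= 0.6:
        return "substantial"
    return "less than substantial"


# Invented positive (1) / negative (0) calls for four examinations.
reader_a = [1, 1, 0, 0]
reader_b = [1, 0, 0, 0]
k = cohen_kappa(reader_a, reader_b)
print(round(k, 2), agreement_label(k))
```

For a two-category outcome such as positive versus negative, weighted and unweighted kappa coincide, so the weighting scheme matters only when the ordinal scores are compared directly.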

Results

Inter- and Intrareader Agreement
Intrareader agreement between image sets 1 and 2 was excellent for both readers (κ = 0.97 for reader A and κ = 0.99 for reader B). Interreader agreement was substantial for both image sets (κ = 0.72 for both). Figure 4 shows representative images from a woman with HCC who was empirically treated with transarterial chemoembolization.

Per-Patient Diagnostic Performance
For image set 1, the mean per-patient sensitivity and negative predictive values were 82.6% (95% CI, 70.9-90.7%) and 93.2% (95% CI, 90.0-95.6%), respectively. For image set 2, the mean per-patient sensitivity and negative predictive values were 83.7% (95% CI, 71.7-90.9%) and 93.2% (95% CI, 90.0-95.6%), respectively. Other per-patient diagnostic performance parameters for sets 1 and 2 are summarized in Table 3. A comparison of the performance parameters between the averages of image set 1 and image set 2 showed no statistically significant difference between the two image sets and is summarized in Table 4.

False-Negative and False-Positive Examinations
There were 11 false-negative examinations (one for reader A, four for reader B, and six for both). In seven cases, one or both readers scored the examinations as category 1, 2, or 3, whereas the complete MRI examination showed a single LI-RADS category 4 observation (probably HCC) measuring 11-20 mm in diameter. In one case, one reader scored the examination as category 1 and the other as category 3, whereas a follow-up MRI examination 7 months later showed a single LI-RADS category 4 observation (probably HCC) measuring 24 mm. In one case, one reader missed an 83-mm biopsy-proven HCC, an error of observation possibly attributable to severe motion artifact. A 24-mm T2-isointense hepatobiliary phase-hyperintense lesion was scored as categories 1 and 3 by the readers, respectively, on the abbreviated examination and as LI-RADS category 2 (probably benign) on the clinical index examination but was upgraded to LI-RADS category 4 (probably HCC) on a subsequent MRI 7 months later because of new hypervascularity. These and the remaining false-negative examinations are summarized in Appendix 1.
There were 26 false-positive examinations (14 by reader A, four by reader B, and eight by both). In 16 of these 26 cases, no lesion was present on the complete index or follow-up examinations. Most of these were attributable to perceived T2-hyperintense DWI-negative hepatobiliary phase-hypointense observations smaller than 20 mm in diameter assigned scores of category 4 or 5 by one or both readers. In all cases, follow-up MRI 6-13 months later was negative for HCC (i.e., no LI-RADS category 4 [probably HCC] or 5 [definitely HCC] observations). The false-positive examinations are summarized in Appendix 2.
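The false-negative and false-positive counts above can be tied back to the per-reader performance values with a short calculation, sketched here in Python. Two points are assumptions rather than statements from the text: the counts are taken to refer to image set 1, and the number of reference-positive patients (49 of 298) is inferred from the reported percentages. The function names are hypothetical.

```python
N_TOTAL = 298
N_POSITIVE = 49  # assumption: inferred from the reported sensitivities
N_NEGATIVE = N_TOTAL - N_POSITIVE


def reader_counts(fn_only: int, fp_only: int,
                  fn_both: int = 6, fp_both: int = 8) -> dict:
    """Build a 2x2 table for one reader from the error counts in the text."""
    fn = fn_only + fn_both
    fp = fp_only + fp_both
    return {"tp": N_POSITIVE - fn, "fp": fp, "fn": fn, "tn": N_NEGATIVE - fp}


def per_patient_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard per-patient diagnostic performance parameters."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


reader_a = per_patient_metrics(**reader_counts(fn_only=1, fp_only=14))
reader_b = per_patient_metrics(**reader_counts(fn_only=4, fp_only=4))

for name, metrics in (("A", reader_a), ("B", reader_b)):
    summary = ", ".join(f"{k} {100 * v:.1f}%" for k, v in metrics.items())
    print(f"Reader {name}: {summary}")
```

Under these assumptions, the computed sensitivity, specificity, and negative predictive value for each reader round to the values reported for image set 1.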

Discussion
Ultrasound, the method of HCC surveillance recommended by the American Association for the Study of Liver Diseases and other guidelines, has limited sensitivity for the detection of HCC. In a large study comparing ultrasound, CT, and MRI, per-patient sensitivities for detecting HCC were 64%, 76%, and 85%, respectively [21]. Thus, many North American medical centers, including ours, perform surveillance with MRI rather than ultrasound for many patients, especially patients with advanced cirrhosis.
Despite the higher diagnostic accuracy of MRI compared with ultrasound and CT, its greater expense, long examination times, and relatively limited availability remain challenges to its use in HCC surveillance programs. We sought to determine the per-patient sensitivity and negative predictive value of an abbreviated MRI examination protocol that potentially could be used for surveillance of patients at risk for HCC. We simulated the performance of such an abbreviated examination by focusing on a subset of sequences (hepatobiliary phase, T2-weighted, and DWI).
We found that the sensitivity and negative predictive value of the abbreviated examination were high, and higher than those reported for ultrasound. Two meta-analyses of the sensitivity of ultrasound for diagnosing HCC reported per-patient sensitivities of 60-63% [8,9]. The per-patient sensitivity of the abbreviated MRI without DWI was 82.6% in this study. Interreader agreement was substantial, suggesting that such a method might be generalizable, even though our readers were trained at different institutions. Notably, the inclusion of DWI led to only one of 298 examinations being scored differently.
We found that per-patient sensitivity for the detection of HCC with an abbreviated MRI examination was similar to the reported sensitivity for the complete study with gadoxetic acid. Bashir et al. [22] found the per-lesion sensitivity of gadoxetic acid-enhanced MRI without the hepatobiliary phase for all lesions to be 78.3% and that with the hepatobiliary phase to be 90.0%. The sensitivity for our abbreviated MRI was also higher than that described for ultrasound and CT in a number of other works. Di Martino et al. [23] found a per-lesion sensitivity of 85% for MRI and 69% for triphasic CT for all sizes of HCC. In a study of contrast-enhanced ultrasound, the sensitivity for lesions smaller than 2 cm was 51.7% [24]. In a prospective study, the per-patient sensitivity for HCC of triphasic CT was 59% [25]. Comparison of the sensitivity values for the abbreviated MRI against these prior studies should be interpreted cautiously, however, because our study focused on per-patient diagnosis of HCCs 10 mm or larger, whereas many of the prior studies focused on per-lesion diagnosis of HCC of any size, and the standards of reference in those other studies were variable.
We also found that DWI did not significantly increase the performance of the abbreviated examination for detecting HCC. Several published studies have suggested similar findings. Kim et al. [26] found no added benefit for diagnostic accuracy and sensitivity for HCC up to 2.8 cm with DWI. Miller et al. [27] found that the apparent diffusion coefficient values of solid benign liver lesions were similar to those of malignant lesions. Another study found that the sensitivity for detecting HCC using DWI alone on a gadoxetic acid-enhanced MRI was 67% [28].
Most of our false-negative examinations were attributed to underscoring or nonvisualization of lesions 20 mm or smaller. Several studies have shown a decrease in sensitivity for HCCs smaller than 20 mm on gadoxetic acid-enhanced MRI [29,30]. Many of our false-positive examinations were attributable to observations smaller than 20 mm that were perceived as hypointense in the hepatobiliary phase on the index examination and that either disappeared or did not progress on follow-up examinations.
We found that an abbreviated MRI examination protocol using only two sequences (T2-weighted SSFSE and T1-weighted hepatobiliary phase) has high sensitivity and a high negative predictive value for HCC and may be appropriate for surveillance of patients at risk for HCC. The idea of using the abbreviated MRI for surveillance is conceptually similar to breast cancer screening according to BI-RADS, which advocates a two-view mammogram for screening and recalls patients for a full diagnostic mammogram or another modality of imaging if further imaging is deemed necessary. In our model, if the abbreviated MRI was positive, the patient would be recalled for a complete dynamic MRI study.
An abbreviated MRI could be implemented for surveillance at a reduced cost relative to a complete dynamic contrast-enhanced MRI. A complete liver MRI can require 20-40 minutes depending on the number and types of sequences and the contrast agent used. An abbreviated MRI protocol as described could be performed in approximately 5 minutes of scanning time, in a 15-minute time slot. Given the call-back rate in our study of 19.3% (115/596 readings), and assuming that two abbreviated MRIs could be performed in the same time as one full MRI (and were charged at half the rate), a total cost savings of 30.7% could be realized in such a surveillance program by using the abbreviated MRI protocol. We envision that patients would receive an injection of gadoxetic acid outside the MRI scanner. About 20 minutes later, after allowing enough time to achieve adequate hepatocellular uptake of the agent, patients would be scanned, and T2-weighted and T1-weighted hepatobiliary phase images would be acquired. Because patients would receive their contrast injection outside the MRI scanner room, the scanner resource would be occupied for only a short duration. Also, because dynamic vascular phase images are not acquired, contrast media could be administered via a one-time hand injection, which may further reduce costs and facilitate scanning in patients with poor IV access. To fully realize the cost reduction, special billing codes for a less-expensive abbreviated MRI examination may be required.
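The cost estimate in this paragraph follows from simple arithmetic, sketched below in Python. The function name is hypothetical, and the model is the one stated in the text: every patient undergoes an abbreviated MRI charged at half the rate of a complete MRI, and each recalled patient additionally undergoes one complete MRI at the full rate.

```python
def relative_cost_savings(callback_rate: float,
                          abbreviated_cost_fraction: float = 0.5) -> float:
    """Savings relative to giving every patient a complete MRI (cost 1.0).

    Expected cost per patient is the abbreviated examination plus a
    complete examination for the fraction of patients who are recalled.
    """
    expected_cost = abbreviated_cost_fraction + callback_rate * 1.0
    return 1.0 - expected_cost


callback_rate = 115 / 596  # positive readings / total readings
savings = relative_cost_savings(callback_rate)
print(f"call-back rate {100 * callback_rate:.1f}%, savings {100 * savings:.1f}%")
```

The estimate is sensitive to both inputs: a higher call-back rate or a smaller price difference between the abbreviated and complete examinations reduces the savings proportionally.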
An important limitation of this study is its retrospective design; consequently, the abbreviated MRI was simulated from a complete dynamic MRI with gadoxetic acid. A large number of examinations were excluded because of a lack of adequate follow-up imaging, which could have introduced a selection bias. Also, we did not attempt to score observations smaller than 10 mm. This is in keeping with current American Association for the Study of Liver Diseases guidelines: only nodules 10 mm or larger detected at surveillance imaging require recall with complete multiphasic CT or MRI. The rationale is that because these patients are in a regular surveillance program, subcentimeter HCCs are unlikely to metastasize or become untreatable before the next surveillance examination 6 months to 1 year later. Also, the Organ Procurement and Transplantation Network and LI-RADS do not allow subcentimeter observations to be considered definite HCC [20,31]. Another limitation is that the dose of gadoxetic acid was not standardized between the two institutions. We also used an SSFSE sequence, which has relatively poor T2 weighting, rather than a fast spin-echo sequence for our T2-weighted imaging; nonetheless, the proposed method performed well. Finally, in the retrospective setting we could not optimize the flip angles of our T1-weighted sequences, an optimization that has been shown to improve detection of liver lesions and may be useful in the abbreviated MRI protocol [32-34].
Accepting these limitations, we conclude that an abbreviated MRI with a T2-weighted SSFSE and a gadoxetic acid-enhanced hepatobiliary phase may be an acceptable lower-cost alternative to a standard dynamic MRI for HCC surveillance in the setting of chronic liver disease at centers that rely on MRI for surveillance. DWI did not improve the performance of the abbreviated MRI. Further studies are needed to confirm our findings. At centers that currently adhere to the American Association for the Study of Liver Diseases guidelines and use ultrasound as the primary screening modality, further research will be needed to determine the relative cost-to-benefit ratio of abbreviated MRI versus ultrasound for surveillance.