For patients with biopsy-proven nonalcoholic fatty liver disease (NAFLD), the liver fibrosis stage, but neither steatosis nor the nonalcoholic steatohepatitis activity score, is associated with overall and disease-specific mortality, liver-related outcomes, and transplantation [1, 2]. There is an exponential increase in the risk of liver-related mortality with increasing fibrosis stage [3]. Furthermore, advanced liver fibrosis stage is also associated with extrahepatic malignancies and vascular events [4]. Thus, quantifying liver fibrosis in patients with NAFLD to determine prognosis and guide treatment decisions is an important part of the clinical workup.
Because using biopsy to determine the presence of liver fibrosis is invasive and associated with significant cost, patient discomfort, sampling variability, and potential risks, several alternative approaches have been developed, including hematologic and biochemical tests [5–9] and liver stiffness measurement (LSM). The LSM is a promising surrogate biomarker of liver fibrosis stage, and several elastography techniques are currently available, including transient elastography (TE), ultrasound-based 2D shear wave elastography (2D SWE), and MR elastography (MRE) [10–14]. Despite this intense interest in alternative approaches, no consensus exists regarding the optimal approach for noninvasive clinical assessment of liver fibrosis in NAFLD [15]. Although prior studies investigated the diagnostic accuracy of LSMs obtained using MRE and TE [16, 17], there is limited research on the accuracy of LSMs obtained with 2D SWE for the noninvasive quantification of liver fibrosis in patients with NAFLD [18, 19].
Therefore, given the paucity of data on head-to-head comparison of these elastography techniques, the aim of the present study was to compare the performance of 2D SWE, TE, and MRE in the noninvasive assessment of liver fibrosis in patients with biopsy-proven NAFLD.
Subjects and Methods
Study Cohort
The institutional review board approved this HIPAA-compliant study. All subjects provided written informed consent.
This study was a prospective, cross-sectional cohort analysis conducted at a single institution. From October 2015 to December 2017, adult patients referred to the University of Pittsburgh Medical Center NAFLD clinic who had undergone a clinically indicated liver biopsy and had histologic findings characteristic of NAFLD identified within the prior 12 months were invited to participate in the study. Diagnosis of NAFLD was established by the absence of significant alcohol intake (defined as ≤ two drinks per day for men and ≤ one drink per day for women); negative findings of serologic and biochemical workups performed at the initial evaluation for other causes of chronic liver disease, as per the guidelines of the American Association for the Study of Liver Diseases; and findings characteristic of NAFLD identified during histologic analysis of liver biopsy specimens.
Each subject underwent 2D SWE, TE, and MRE. Subjects fasted for at least 4 hours before undergoing scanning. Electronic health records were reviewed to collect relevant clinical and laboratory data from the time point closest to study enrollment.
Two-Dimensional Shear Wave Elastography
Two-dimensional SWE was performed using a Logiq E9 system (GE Healthcare). This relatively new method uses comb-push and time-aligned sequential tracking techniques that allow the generation of a large elasticity map superimposed on the gray-scale image obtained using a conventional ultrasound scanner [13, 14, 20]. Subjects underwent imaging in a supine position with the right arm extended above the head. The LSM was obtained during suspended respiration with use of a convex probe placed in a lower right intercostal space. The analysis box was placed in the right hepatic lobe (liver segment V or VIII) at least 1 cm from the liver capsule and no more than 6 cm from the skin surface, to include a homogeneous area of liver parenchyma and to exclude large vessels and artifacts. A circular ROI was then placed within the analysis box over a homogeneous portion of the color-coded map. Up to 12 measurements expressed in kilopascals were obtained for each subject, and the median value was used for statistical analysis. For each study, the ratio of the interquartile range (IQR) to the median was calculated. A measurement was considered invalid (denoting technical failure) when less than 50% of the acquisition ROI was filled with the color-coded map [20]. The study was considered invalid if fewer than three valid measurements were obtained [13] and unreliable if the IQR over the median value was greater than 30% [13, 20, 21]. All 2D SWE measurements were obtained by one of two trained abdominal radiologists blinded to the results of pathologic analysis.
Transient Elastography
TE was performed using the FibroScan 502 Touch system (Echosens), with use of either the 3.5-MHz M probe (Echosens) or the 2.5-MHz XL probe (Echosens), with the choice of probe guided by the system computer. The median value of at least 10 valid LSMs expressed in kilopascals was used for analysis. The study was considered invalid if there were fewer than 10 valid LSMs and unreliable if the IQR over the median value was greater than 30% [13, 21]. All TE scans were obtained by one of two trained operators, each of whom had previous experience in obtaining more than 100 TE scans and each of whom was blinded to the liver biopsy results.
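As an illustration only, the validity and reliability rules described above for 2D SWE (at least three valid measurements) and TE (at least 10 valid measurements), each with an IQR-over-median limit of 30%, can be expressed as a short Python sketch. The function name, its parameters, and the quartile method are assumptions made for this example; they are not part of the study protocol, and scanner software may compute the IQR differently.

```python
from statistics import median, quantiles


def classify_lsm_study(measurements_kpa, min_valid=3, max_iqr_ratio=0.30):
    """Classify one elastography study from its valid measurements (kPa).

    Returns (status, median_kpa, iqr_over_median), where status is
    "invalid", "unreliable", or "reliable".
    Assumes min_valid >= 2 (quantiles needs at least two values).
    """
    if len(measurements_kpa) < min_valid:
        # Too few valid measurements: the study counts as a technical failure.
        return ("invalid", None, None)

    med = median(measurements_kpa)
    if med <= 0:
        raise ValueError("stiffness values must be positive")

    # Quartile convention is an assumption; the source does not specify one.
    q1, _, q3 = quantiles(measurements_kpa, n=4, method="inclusive")
    ratio = (q3 - q1) / med

    status = "unreliable" if ratio > max_iqr_ratio else "reliable"
    return (status, med, ratio)


if __name__ == "__main__":
    # Hypothetical readings, not study data.
    print(classify_lsm_study([5.0, 5.2, 5.4, 5.6, 5.8]))          # 2D SWE rule
    print(classify_lsm_study([6.1] * 9, min_valid=10))            # TE rule
```

Passing `min_valid=10` applies the TE rule, and the default of 3 applies the 2D SWE rule; the reported LSM is the returned median in both cases.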
MR Elastography
MRE was performed using a 1.5-T scanner, with a 2D gradient-echo sequence acquired as previously described elsewhere [22]. Axial images with a slice thickness of 10 mm were obtained in four different locations with an interslice gap of 8 mm. The acquired images were processed at the scanner (with use of an inversion algorithm) to produce quantitative maps of liver stiffness known as elastograms. The elastograms with 95% confidence maps were chosen to obtain LSMs [21]. At each location, a large ROI was traced to cover the largest liver surface excluding artifacts (i.e., wave interference), large vessels, the gallbladder, and fissures. The mean of the four measurements was used for statistical analysis. The study was considered invalid if no liver parenchyma was available for measurement on the elastograms with confidence maps. All MRE measurements were obtained by a radiologist with 3 years of experience in interpreting MRE scans.
Pathologic Confirmation of Nonalcoholic Fatty Liver Disease
Percutaneous liver biopsy was obtained for clinical indications before study enrollment. The liver biopsy findings were interpreted by experienced board-certified pathologists in accordance with the Nonalcoholic Steatohepatitis Clinical Research Network nonalcoholic steatohepatitis activity scoring system. Fibrosis was scored as follows: F0 denoted no fibrosis; F1, perisinusoidal or portal fibrosis; F2, perisinusoidal and either portal or periportal fibrosis; F3, septal or bridging fibrosis; and F4, cirrhosis. Significant fibrosis was defined by a fibrosis score of F2 or higher and advanced fibrosis by a score of F3 or higher. The NAFLD nonalcoholic steatohepatitis activity score was also calculated and ranged from 0 to 8 as a sum of the scores for steatosis (0–3), lobular inflammation (0–3), and hepatocellular ballooning (0–2) [23].
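The scoring definitions above reduce to simple arithmetic, sketched here in Python for clarity. The function names are hypothetical and chosen for this example; the component ranges and fibrosis thresholds are those stated in the preceding paragraph.

```python
def nas_score(steatosis, lobular_inflammation, ballooning):
    """Sum the three histologic components into the activity score (0-8)."""
    components = (
        ("steatosis", steatosis, 3),
        ("lobular_inflammation", lobular_inflammation, 3),
        ("ballooning", ballooning, 2),
    )
    for name, value, upper in components:
        # Reject scores outside the range defined for each component.
        if not 0 <= value <= upper:
            raise ValueError(f"{name} must be between 0 and {upper}")
    return steatosis + lobular_inflammation + ballooning


def fibrosis_categories(stage):
    """Map a fibrosis stage (0-4, i.e., F0-F4) to the two dichotomies used."""
    if not 0 <= stage <= 4:
        raise ValueError("fibrosis stage must be between 0 and 4")
    return {
        "significant": stage >= 2,  # F2 or higher
        "advanced": stage >= 3,     # F3 or higher
    }


if __name__ == "__main__":
    print(nas_score(2, 1, 1))
    print(fibrosis_categories(2))
```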
Statistical Analysis
The diagnostic performance of 2D SWE, TE, and MRE was assessed by calculating the area under the ROC curve (AUROC) values and the 95% CIs. The analysis was performed for two dichotomized groupings of fibrosis stage: F0 and F1 versus F2–F4 (denoting significant fibrosis) and F0–F2 versus F3 and F4 (denoting advanced fibrosis). For each method, the following cutoff values (expressed in kilopascals) were determined: the best cutoff value associated with the Youden index, a cutoff value for sensitivity of 90% or higher, and a cutoff value for specificity of 90% or higher. Sensitivity, specificity, and positive and negative predictive values were assessed for each cutoff value. The AUROC values of 2D SWE, MRE, and TE for significant and advanced fibrosis were compared (in pairwise comparison) using the DeLong test (and the roccomp command in Stata software [version 14, StataCorp]), with subjects with valid LSMs obtained using all three modalities included. The threshold for statistical significance was set as p < 0.05. A sample size of 62 subjects was derived during study planning for a four-method comparison (i.e., biopsy, MRE, 2D SWE, and TE), under the assumption of correlation of these methods of 0.5, an alpha level of 0.05, power of 0.80, and an effect size of 0.15.
Statistical analysis was performed using Stata software (version 14, StataCorp) and MedCalc software for Windows (version 17.1, MedCalc Software).
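For readers less familiar with these metrics, the following Python sketch shows one way the AUROC and the Youden-index cutoff can be computed from LSM values and a biopsy-based reference label. It is not the analysis code used in this study, which relied on Stata and MedCalc. The function names and the example values are hypothetical, a positive test is assumed to mean an LSM at or above the cutoff, and the CIs and the DeLong comparison are not reproduced.

```python
def _split(values, has_disease):
    """Separate LSM values by the reference (biopsy) label."""
    pos = [v for v, d in zip(values, has_disease) if d]
    neg = [v for v, d in zip(values, has_disease) if not d]
    if not pos or not neg:
        raise ValueError("both diseased and nondiseased subjects are required")
    return pos, neg


def auroc(values, has_disease):
    """AUROC as the probability that a diseased subject has the higher LSM.

    Uses the rank-based (Mann-Whitney) formulation; ties count as one half.
    """
    pos, neg = _split(values, has_disease)
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def youden_cutoff(values, has_disease):
    """Return (cutoff, J, sensitivity, specificity) maximizing J.

    J = sensitivity + specificity - 1; the lowest cutoff wins on ties.
    """
    pos, neg = _split(values, has_disease)
    best = None
    for cutoff in sorted(set(values)):
        sensitivity = sum(v >= cutoff for v in pos) / len(pos)
        specificity = sum(v < cutoff for v in neg) / len(neg)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


if __name__ == "__main__":
    # Hypothetical LSM values (kPa) and advanced-fibrosis labels, not study data.
    lsm = [2.1, 2.5, 2.8, 3.0, 3.4, 3.9, 4.4, 5.2]
    advanced = [False, False, False, True, False, True, True, True]
    print(auroc(lsm, advanced))
    print(youden_cutoff(lsm, advanced))
```

Cutoffs targeting a sensitivity or specificity of 90% or higher can be found with the same loop by filtering on the corresponding value instead of maximizing J.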
Discussion
In the context of the increasing worldwide prevalence of NAFLD and the strong clinical need to find alternatives to liver biopsy for the noninvasive staging of liver fibrosis in patients with NAFLD, the present study provides several insights. First, we found that 2D SWE had diagnostic accuracy similar to that of TE and MRE for the detection of significant fibrosis as well as advanced fibrosis in patients with NAFLD. Second, all three approaches have greater accuracy in diagnosing advanced fibrosis (F3–F4) compared with significant fibrosis (F2–F4). Third, all three modalities had a similar but small failure rate, suggesting that no one diagnostic approach may be universally applicable to all patients with NAFLD. Overall, the present study suggests that for the diagnostic goal of determining stage F3–F4 (advanced) fibrosis, LSM may be a viable strategy using any of these modalities.
Prior studies comparing MRE and TE found the accuracy of the two approaches to be similar in diagnosing advanced fibrosis, although MRE performed better in distinguishing mild from significant fibrosis. In one study in which the mean BMI of the cohort was 28, MRE performed better than TE for the classification of fibrosis with a METAVIR score of F2 or higher (AUROC value, 0.91 vs 0.82; p = 0.001), although there was no significant difference in the AUROC values of MRE and TE for the distinction of F3 and F4 fibrosis versus F0–F2 fibrosis (AUROC value, 0.89 vs 0.88; p = 0.426) [16]. In another recent study in which the mean BMI of the cohort was 30, MRE was more accurate than TE for diagnosing any grade of fibrosis (F1–F4 vs F0: AUROC value, 0.82 vs 0.67; p = 0.01) [17]. However, there was no significant difference between the two elastography methods for diagnosing any other dichotomized stage of fibrosis, including stage F2 or higher (AUROC value, 0.89 vs 0.86; p = 0.4596) and stage F3 or higher (AUROC value, 0.87 vs 0.80; p = 0.1942). Consistent with the findings of these prior studies, the present study found that the diagnostic accuracy of MRE and TE was comparable in distinguishing F3–F4 fibrosis from F0–F2 fibrosis. However, in contrast to these studies, our study also found no difference in the diagnostic accuracy between MRE and TE in distinguishing F0 and F1 fibrosis from F2–F4 fibrosis. It is possible that our findings may have been affected by the much higher mean BMI (34) of our cohort or a type II error resulting from the smaller sample size of our cohort, because MRE was favored over TE for distinguishing significant fibrosis from mild fibrosis (p = 0.052).
Although prior studies compared the diagnostic performance of MRE and TE, there has been limited research on the diagnostic accuracy of 2D SWE in NAFLD. Supersonic shear imaging, a first-generation shear wave elastography technology, showed an AUROC value of 0.86 and 0.89 for detecting significant fibrosis and advanced fibrosis, respectively [18]. Supersonic shear imaging performed better than another elastography technique, acoustic radiation force impulse elastography, for the diagnosis of significant fibrosis, whereas the accuracy of supersonic shear imaging, acoustic radiation force impulse elastography, and TE in the diagnosis of advanced fibrosis was similar. A novel aspect of our study is the evaluation of a relatively new 2D SWE method that allows the creation of a large ROI for quantification of shear wave speed in a conventional ultrasound scanner [13, 14]. In a recent study, Lee et al. [20] showed that the LSM obtained with this method highly correlates with the degree of hepatic fibrosis. In the present study, 2D SWE showed good diagnostic accuracy for the detection of significant fibrosis (AUROC value, 0.80) and advanced fibrosis (AUROC value, 0.89). As previously noted, we found no difference in the accuracy of 2D SWE and MRE for the diagnosis of significant and advanced fibrosis.
How might our findings help clinicians who are taking care of patients with NAFLD make better management decisions? Because advanced liver fibrosis is associated with increased mortality and cardiovascular- and liver-related morbidity, an important diagnostic goal in the initial workup of patients with NAFLD is the identification of patients with advanced fibrosis who are at the highest risk of adverse outcomes [4]. Our results suggest that all three noninvasive modalities evaluated in the present study have acceptable diagnostic accuracy in identifying the highest risk patients (i.e., those with advanced fibrosis). We also found that all three technologies had invalid results for a small but similar number of patients. In the case of MRE, the most common reason for failure was claustrophobia, whereas in the case of TE and 2D SWE, it was failure to obtain a valid measurement. Of interest, 2D SWE and TE did not fail to provide a valid result for the same patients, suggesting that if one modality fails to provide a valid result, it may be possible for a patient to undergo noninvasive fibrosis assessment using an alternative modality to avoid liver biopsy. Because there is wide variability in the cost, expertise, and availability of these modalities, no one approach is clearly superior to the others, and clinicians could consider any available modality to be an acceptable approach for initial risk stratification. Indeed, a spirit of collaboration (not competition) and the need for more than one diagnostic approach for NAFLD have previously been proposed, and our results support that viewpoint [24]. Further research is needed to investigate the diagnostic performance of combinations of noninvasive biochemical indexes (e.g., the NAFLD fibrosis score and Fibrosis-4 index for liver fibrosis) and LSM modalities for better risk stratification in NAFLD.
Some limitations of the present study should be noted. The study was performed at a single institution and included patients who were referred to a tertiary care clinic, and it may not represent the population seen in other clinical settings. Because of technical failures, only results for 54 of the 62 enrolled patients could be used to compare the diagnostic accuracy of LSMs obtained using different elastography methods. The lack of a significant difference in accuracy may therefore be affected by a type II error, in particular for the comparison of LSMs obtained with TE and MRE (p = 0.052), which approached the threshold p value set for statistical significance. The preliminary results obtained in our relatively small population should be validated in a larger and preferably multicenter study. There was a mean interval of 105 days between the liver biopsy and the time that acquisition of LSM scans was completed in our study. However, prior studies have suggested that liver fibrosis progresses slowly in patients with NAFLD [25]. When the study was first conceived, the MRI scanners at our institution did not have proton density fat fraction capability for estimation of hepatic fat content. Therefore, we exclusively focused on the estimation of liver fibrosis and could not determine the performance of the three techniques in determining the severity of hepatic steatosis.
The present study also has several strengths. To our knowledge, this is the first study to simultaneously compare the diagnostic performance of 2D SWE with that of MRE and TE in patients with biopsy-proven NAFLD. It was a prospective study in which most of the scans were completed the same day. All scans were performed, interpreted, or both performed and interpreted by trained operators, minimizing variations resulting from operator inexperience or variability. There was a range of patients with varying body weights and BMIs as well as liver fibrosis stages, as would be expected in a real-world clinical setting.
In conclusion, LSM obtained with 2D SWE, TE, and MRE performed well when compared with liver biopsy for the diagnosis of advanced fibrosis. The pairwise comparisons of the three modalities did not find a difference in their performance in diagnosing either significant or advanced fibrosis, although there was a trend in favor of using MRE rather than TE to distinguish between significant fibrosis and mild fibrosis. Further research is needed to define the role of combinations of modalities for fibrosis detection and to better distinguish intermediate stages of fibrosis.