Hepatocellular carcinoma (HCC) is the fourth most common cause of cancer death worldwide and has a poor prognosis, with a 5-year survival rate less than 20% [1]. Screening and surveillance for HCC in high-risk patients improves early-stage diagnosis, curative treatment, and mortality [2–5]. HCC screening and surveillance is primarily performed using ultrasound. The American College of Radiology (ACR) Ultrasound LI-RADS version 2017 (US LI-RADS) provides three possible categories for summarizing findings on ultrasound examinations performed for HCC screening and surveillance and for guiding appropriate follow-up [6–8]. A category of US-1 negative is assigned for examinations with no observations and is associated with a recommendation to repeat the surveillance ultrasound in 6 months. A category of US-2 subthreshold is assigned for examinations with a focal observation smaller than 10 mm and is associated with a recommendation to repeat the surveillance ultrasound in 3–6 months; if the observation does not show growth after 2 years of follow-up, then it is considered benign and the patient returns to routine surveillance. A category of US-3 positive is assigned for examinations with a focal observation 10 mm or greater, a focal area of parenchymal distortion, or new venous thrombus, and is associated with a recommendation for multiphasic CT, MRI, or contrast-enhanced ultrasound (CEUS) for definitive characterization.
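The category rules described above can be sketched as a small decision function. This is only an illustration of the thresholds stated in this paragraph, not an implementation of the full US LI-RADS document; the names `UltrasoundFindings`, `us_lirads_category`, and `RECOMMENDATION` are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UltrasoundFindings:
    """Hypothetical container for the findings named in the text."""
    largest_observation_mm: Optional[float] = None  # None: no focal observation
    parenchymal_distortion: bool = False
    new_venous_thrombus: bool = False


def us_lirads_category(f: UltrasoundFindings) -> str:
    """Map findings to a category using only the thresholds given in the text."""
    # Parenchymal distortion or new venous thrombus is positive regardless of size.
    if f.parenchymal_distortion or f.new_venous_thrombus:
        return "US-3"
    if f.largest_observation_mm is None:
        return "US-1"
    # Focal observation: 10 mm or greater is positive, smaller is subthreshold.
    return "US-3" if f.largest_observation_mm >= 10 else "US-2"


# Follow-up recommendation associated with each category, as summarized above.
RECOMMENDATION = {
    "US-1": "repeat surveillance ultrasound in 6 months",
    "US-2": "repeat surveillance ultrasound in 3-6 months",
    "US-3": "multiphasic CT, MRI, or CEUS for definitive characterization",
}
```

For example, `us_lirads_category(UltrasoundFindings(largest_observation_mm=7))` falls in the subthreshold category that is the subject of this study.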
Methods
Patient Cohort
This single-center retrospective study was approved by the institutional review board (IRB) of the Stanford University School of Medicine and was compliant with HIPAA. The IRB waived the requirement for written informed consent. At our institution, adult patients who qualify for HCC screening and surveillance are referred by hepatologists for a surveillance ultrasound examination every 6 months; the surveillance ultrasound examinations are interpreted using a standardized reporting template. If a US-2 observation is identified, the template recommends performing a follow-up surveillance ultrasound in 3–6 months. Hepatologists typically follow US-2 observations for 2 years by both serial ultrasound examinations and serum α-fetoprotein (AFP) levels. Occasionally, at the discretion of the hepatologist, a single surveillance ultrasound examination may be replaced by a multiphasic CT or MRI examination, with a return to ultrasound surveillance if the CT or MRI does not show an observation with category LR-3 or higher that correlates with the initial US-2 observation. During surveillance, hepatologists monitor patients clinically, and new symptoms or a rise in AFP (> 20 ng/mL) triggers ordering a multiphasic CT or MRI.
A fellowship-trained abdominal radiologist (J.R.T., in the 1st year of practice after fellowship) searched the university's informatics database to identify patients who underwent ultrasound examinations performed for HCC surveillance from January 2017 to January 2020, yielding 4287 patients. The US LI-RADS categories of patients' first surveillance ultrasound examination performed during the study period (baseline surveillance examination) were retrieved from the report impressions. A total of 3740 patients were excluded because of a baseline category of US-1 negative, and 320 patients were excluded because of a baseline category of US-3 positive. These exclusions resulted in 227 patients with a baseline category of US-2 subthreshold according to the report impression. The investigator reviewed the images and reports of the baseline examinations to confirm the presence on the images of a US-2 observation; no examinations were excluded on the basis of this step. The investigator then reviewed the medical records to identify surveillance ultrasound examinations and multiphasic CT or MRI examinations that the patient underwent as follow-up to the baseline ultrasound. As a result of this review, 31 patients were excluded because of the lack of any imaging follow-up, and 21 patients were excluded because of insufficient imaging follow-up (sufficient follow-up was defined as either a multiphasic CT or MRI examination performed at any point after the baseline ultrasound or a surveillance ultrasound examination performed at least 2 years after the baseline ultrasound). This process resulted in a final study sample of 175 patients who underwent a surveillance ultrasound with an observation assigned category US-2 and who underwent follow-up imaging either by multiphasic CT or MRI or by ultrasound at 2 years or more.
Figure 1 summarizes the patient selection process. For all included patients, the investigator recorded patient age, sex, AFP level, and cause of chronic liver disease. The investigator also reviewed the medical records to identify whether patients underwent liver transplant within 2 years after the baseline surveillance ultrasound examinations, and if so, recorded whether the pathology report for the liver explant described the presence of a dysplastic nodule, HCC, or other malignancy.
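The selection arithmetic can be checked directly from the counts reported above. The sketch below also encodes the follow-up criterion as it reads from the description of the final study sample (any multiphasic CT or MRI after baseline, or ultrasound at 2 years or later); the function and variable names are hypothetical.

```python
from typing import Optional


def has_sufficient_followup(had_ct_or_mri_after_baseline: bool,
                            longest_us_followup_years: Optional[float]) -> bool:
    """Follow-up criterion as read from the text (an interpretation, not study code)."""
    if had_ct_or_mri_after_baseline:
        return True
    return longest_us_followup_years is not None and longest_us_followup_years >= 2.0


# Counts taken from the text.
screened = 4287
category_exclusions = {"US-1 negative": 3740, "US-3 positive": 320}
followup_exclusions = {"no imaging follow-up": 31,
                       "insufficient imaging follow-up": 21}

baseline_us2 = screened - sum(category_exclusions.values())
final_cohort = baseline_us2 - sum(followup_exclusions.values())

print(f"Baseline US-2: {baseline_us2}; final cohort: {final_cohort}")
```

The two subtractions reproduce the 227 patients with a baseline US-2 category and the final sample of 175.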
Surveillance Ultrasound Examinations
All surveillance ultrasound examinations were performed by sonographers certified by the American Registry for Diagnostic Medical Sonography. The ultrasound images were reviewed immediately after acquisition by a board-certified fellowship-trained abdominal radiologist or by a radiology trainee under the radiologist's supervision. Images were assessed to ensure that all views recommended by US LI-RADS were obtained, including at a minimum gray-scale and color Doppler longitudinal and transverse views of the right lobe, left lobe, and dome, showing the hepatic veins, main portal vein, and bile ducts, acquired using a curvilinear transducer (mean frequency, 1-9 MHz). Examinations also assessed the peripheral subcapsular parenchyma and liver surface nodularity using a high-frequency linear transducer (mean frequency, 5-12 MHz). The examinations were reported using structured reporting according to the US LI-RADS lexicon, with inclusion in the report impression of the US LI-RADS category (US-1 negative, US-2 subthreshold, or US-3 positive) [
6]. Sonographers routinely viewed the reports and images of prior imaging examinations to guide the reidentification of previously reported observations, for example, according to the observation's location (approximate segment and depth from the liver capsule) and the transducer that best visualized the observation on a prior surveillance ultrasound.
Multiphasic CT and MRI Examinations
Multiphasic CT and MRI examinations fulfilled ACR and Organ Procurement and Transplantation Network (OPTN) technical recommendations [10, 11]. CT examinations were performed using 64- or 128-detector row scanners (Lightspeed VCT or Revolution, GE Healthcare; Force or Flash, Siemens Healthineers) and included dynamic imaging with IV administration of iodinated contrast material (including late arterial, portal venous, and delayed phases) with axial, sagittal, and coronal reformations. MRI examinations were performed using a 1.5-T or 3-T scanner with a torso phased-array coil and included unenhanced T1-weighted in-phase and opposed-phase images, T2-weighted images with and without fat saturation, and DWI, as well as dynamic T1-weighted images with IV administration of an extracellular gadolinium-based contrast agent (including unenhanced, late arterial, portal venous, and delayed phases).
Review of Baseline and Follow-Up Imaging Examinations
Two abdominal radiologists (K.N.B. and L.S., in their 3rd and 4th years of practice after fellowship, respectively) independently reviewed the baseline ultrasound examinations and recorded the presence of a US-2 observation, transducer type used to visualize the US-2 observation (linear vs curvilinear), echogenicity of the US-2 observation relative to liver parenchyma (hyperechoic, isoechoic, or hypoechoic), and ACR US LI-RADS visualization score (A, B, or C).
For patients who underwent 2 or more years of follow-up ultrasound surveillance, the radiologists also reviewed the final available surveillance ultrasound examination. No follow-up ultrasound examination was reviewed in patients without 2 or more years of follow-up ultrasound. The originally assigned US-2 observation was classified on the final ultrasound examination as having no correlate if it was no longer visualized, as stable if it was again visualized and remained subcentimeter in size, or as progressed (i.e., to US-3) if it was again visualized and increased in size to 10 mm or greater [
6].
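The three-way classification of the baseline observation on the final ultrasound can be written as a short rule. This is a minimal sketch of the definitions in the preceding paragraph; the function name and argument names are hypothetical.

```python
from typing import Optional


def classify_final_ultrasound(visualized: bool,
                              size_mm: Optional[float] = None) -> str:
    """Classify a baseline US-2 observation on the final follow-up ultrasound."""
    if not visualized:
        return "no correlate"
    if size_mm is None:
        raise ValueError("size_mm is required when the observation is visualized")
    # 10 mm or greater meets the US-3 size threshold; smaller remains subthreshold.
    return "progressed" if size_mm >= 10 else "stable"
```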
The radiologists also reviewed the final multiphasic CT or MRI available in each patient (unless the patient did not undergo any such follow-up examinations), whether performed before or after the 2-year follow-up point, and whether or not the patient also underwent 2 years of follow-up ultrasound. The radiologists recorded whether the multiphasic CT or MRI showed a correlate for the US-2 observation on the baseline ultrasound examination. If the CT or MRI showed a correlate, then the radiologists assigned the correlate a category using ACR CT/MRI LI-RADS version 2018 [
10]. The radiologists also reviewed the CT and MRI examinations for any LR-4, LR-5, LR-TIV (LR tumor in vein), or LR-M (definite or probable malignancy, not specific for HCC) observations unrelated to the baseline US-2 observations, defined as being in an unequivocally separate location of the liver from the baseline US-2 observation (e.g., baseline US-2 observation detected in the left lobe but LR-5 observation detected in the right hepatic lobe, or baseline US-2 observation detected in a subcapsular location using a high-frequency linear transducer but LR-5 observation clearly visualized in deeper liver parenchyma).
Disagreements between the two radiologists were resolved by consultation with a third abdominal radiologist (L.Y., with 14 years of postfellowship experience).
Statistical Analysis
Interrater agreement for the various assessments by the two radiologists who performed the retrospective image reviews was calculated using Cohen kappa coefficients for binary measures and weighted Cohen kappa coefficients for ordinal measures. Continuous variables were summarized as median and IQR. Categoric variables were summarized as total numbers with percentages. Outcomes were stratified in terms of 2 years or later follow-up imaging examinations and earlier than 2-year follow-up imaging examinations. Associations between presence of a correlate on follow-up imaging and the transducer initially detecting the US-2 observation were assessed using Fisher exact tests. The association between a baseline visualization score of B or C and the subsequent development of HCC was also assessed using Fisher exact test. For purposes of analysis, LI-RADS categories on follow-up CT or MRI examinations were grouped as LR-1 or LR-2, LR-3, LR-4 or LR-5, LR-TIV, and LR-M. The p values were considered statistically significant at a two-sided value of < .05. Statistical analyses were performed using GraphPad Prism Windows version 6.07 (GraphPad Software).
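For readers less familiar with the agreement statistics named above, the following is a minimal pure-Python sketch of Cohen kappa for two raters, with an optional weighted form for ordinal categories. The study used GraphPad Prism; this code is not the authors' implementation. Linear disagreement weights are an assumption (the text does not state the weighting scheme), and the ratings in the example are invented for illustration only.

```python
def cohen_kappa(rater1, rater2, categories, weighted=False):
    """Cohen kappa for two raters; `categories` must be in ordinal order if weighted."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)

    # Observed joint proportions and the marginal proportions of each rater.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0/1 if unweighted, linear in rank distance if weighted
        # (linear weighting is an assumption made for this sketch).
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    d_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if d_exp == 0:
        return 1.0  # both raters used a single identical category
    return 1.0 - d_obs / d_exp


# Invented visualization scores for four examinations (illustration only).
reader1 = ["A", "B", "C", "A"]
reader2 = ["A", "B", "C", "C"]
unweighted = cohen_kappa(reader1, reader2, ["A", "B", "C"])
linear_weighted = cohen_kappa(reader1, reader2, ["A", "B", "C"], weighted=True)
```

Writing kappa as one minus the ratio of observed to chance-expected disagreement lets the same function cover both the binary and the ordinal case.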
Results
Patient Characteristics
Clinical and demographic features of the patient cohort are summarized in Table 1. The 175 patients included 70 women and 105 men (median age, 59 years; IQR, 50–59 years). The most common causes of chronic liver disease were hepatitis B virus (n = 113; 65%), hepatitis C virus (n = 26; 15%), and nonalcoholic steatohepatitis (n = 12; 7%). On the baseline ultrasound examinations detecting the US-2 observation, the visualization score was A in 153 (87%), B in 18 (10%), and C in four (2%). A total of 141 (81%) US-2 observations were hyperechoic, and 34 (19%) were hypoechoic. A total of 102 (58%) US-2 observations were detected using a high-frequency linear transducer, and 73 (42%) were detected using a curvilinear transducer.
Interrater Agreement
Table 2 summarizes interrater agreement between the two radiologists for the various qualitative findings. The kappa coefficient ranged from 0.801 to 1.000 for all assessed measures aside from a kappa coefficient of 0.699 for the visualization score on the baseline ultrasound.
Outcomes of US-2 Subthreshold Observations in Patients With 2 Years or Longer Imaging Follow-Up
A total of 138 patients had imaging follow-up at 2 years or more (median, 2.7 years; IQR, 2.1–3.6 years), including 111 by ultrasound. According to review of the final available ultrasound in patients with 2 or more years of ultrasound follow-up, 43/111 (39%) US-2 observations showed no correlate at follow-up, 68/111 (61%) US-2 observations were stable at follow-up, and no US-2 observation progressed (i.e., grew to ≥ 10 mm to become a US-3 positive observation). Of the 111 patients, 42 also underwent a multiphasic CT or MRI at less than 2 years of follow-up; these CT and MRI examinations were performed at a median follow-up of 0.5 years (IQR, 0.2–0.7 years) after the baseline US-2 observation. Of these 42 CT or MRI examinations, 35 did not show a correlate for the baseline US-2 observation. Of the examinations without a correlate, the baseline US-2 observation was reidentified on the subsequent final ultrasound examination in 15/35 (43%) and showed no correlate on the subsequent final ultrasound examination in 20/35 (57%). The seven remaining CT or MRI examinations showed a correlate for the baseline US-2 observation, which in all cases was classified as LR-1 or LR-2; of these, the baseline US-2 observation could be reidentified on the subsequent final ultrasound examination in 6/7 (86%) and showed no correlate on the subsequent final ultrasound examination in 1/7 (14%).
The remaining 27 patients underwent 2-year or longer imaging follow-up by only multiphasic CT or MRI. A total of 20/27 (74%) of these examinations showed no correlate. Of the seven correlates on CT or MRI, five (19%) were classified as LR-1 or LR-2 and two (7%) were classified as LR-3. Both LR-3 observations were subcentimeter in size and were characterized as a focus of nonrim arterial phase hyperenhancement without washout or other ancillary features. No observations were classified as LR-4 or higher on multiphasic CT or MRI.
Figure 2 shows a representative US-2 observation with 2-year imaging follow-up.
Outcomes of US-2 Subthreshold Observations in Patients With Less Than 2 Years of Imaging Follow-Up
A total of 37 (21%) patients had less than 2 years of ultrasound follow-up but underwent diagnostic characterization of the US-2 observation before 2 years by multiphasic CT or MRI. The median interval between the baseline US-2 observation and the follow-up CT or MRI was 0.8 years (IQR, 0.3–1.2 years). Of these follow-up CT or MRI examinations, 26/37 (70%) showed no correlate for the baseline US-2 observation and 11/37 (30%) showed a correlate that was classified as LR-1 or LR-2. No correlate was classified as LR-3 or higher.
Three of the 37 patients also underwent orthotopic liver transplant (OLT) after the multiphasic CT or MRI, performed before the 2-year follow-up point. The CT or MRI examinations in these three patients were among the 26 with no correlate. The pathology reports for the liver explants did not comment on the presence of dysplastic nodule, HCC, or other malignancy in any of these three patients.
Summary of Outcomes of US-2 Subthreshold Observations
Table 3 summarizes the outcomes for US-2 observations, stratified by follow-up length and modality. According to the final available follow-up imaging examination in each patient, 173/175 (99%) US-2 observations either were stable on follow-up ultrasound at 2 years or later (n = 68); showed no correlate on follow-up ultrasound, CT, or MRI (n = 88); or showed a correlate on CT or MRI that was classified as LR-1 or LR-2 (n = 17). The remaining two (1%) observations showed a correlate on CT or MRI that was classified as LR-3. No observation progressed to US-3 on follow-up at 2 years or later, and none was classified as LR-4 or higher on follow-up CT or MRI. In addition, no dysplastic nodule or evidence of malignancy was identified at OLT (n = 3).
A total of 106 patients underwent multiphasic CT or MRI at any time point (79 before 2 years [42 in patients with and 37 in patients without additional ≥ 2-year ultrasound follow-up]; 27 after 2 years). Among all 106 patients who underwent multiphasic CT or MRI (whether or not representing the patient's final available follow-up examination), the CT or MRI showed a correlate in 25 (24%) and no correlate in 81 (76%). Of the 25 correlates on CT or MRI, 23 were LR-1 or LR-2 and two were LR-3.
Table 4 summarizes these findings.
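The pooled CT and MRI counts can be rebuilt from the three subgroups reported in the preceding sections. The sketch below only re-adds the published numbers; the group labels and variable names are hypothetical.

```python
from collections import Counter

# CT/MRI findings per subgroup, taken from the counts reported in the Results.
groups = {
    "CT/MRI before 2 years, with >=2-year ultrasound follow-up":
        {"no correlate": 35, "LR-1 or LR-2": 7},
    "CT/MRI at 2 years or later, without ultrasound follow-up":
        {"no correlate": 20, "LR-1 or LR-2": 5, "LR-3": 2},
    "CT/MRI before 2 years, without >=2-year ultrasound follow-up":
        {"no correlate": 26, "LR-1 or LR-2": 11},
}

pooled = Counter()
for counts in groups.values():
    pooled.update(counts)

total = sum(pooled.values())
with_correlate = total - pooled["no correlate"]


def pct(k: int, n: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * k / n)


print(total, with_correlate, pct(with_correlate, total), dict(pooled))
```

The subgroup sizes (42, 27, and 37) add to the 106 patients with any multiphasic CT or MRI, and the pooled correlate counts match those stated above.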
Of patients who underwent ultrasound follow-up at 2 years or later, initial detection of the US-2 observation by the linear versus curvilinear transducer was not associated (p > .99) with presence of a correlate on the final follow-up ultrasound (linear: 40/66 [61%]; curvilinear: 28/45 [62%]). Of all patients who underwent a subsequent CT or MRI (whether or not representing the patient's final available follow-up examination), initial detection by the linear versus curvilinear transducer was not associated (p = .36) with presence of a correlate on the CT or MRI (linear: 12/61 [20%]; curvilinear: 13/45 [29%]).
Development of Hepatocellular Carcinoma Unrelated to Baseline US-2 Subthreshold Observations
Eight (5%) patients developed HCC during follow-up after initial detection of the US-2 observation. These patients are summarized in Table 5. All eight HCCs were deemed unrelated to the baseline US-2 observations because the HCC was in an unequivocally different location than the baseline US-2 observation. In addition, in all eight patients, no correlate for the baseline US-2 observation was identified on the follow-up surveillance ultrasound that immediately preceded the imaging examination that detected the HCC. The frequency of a visualization score of B or C on the baseline ultrasound was significantly higher (p = .009) in these eight patients (4/8; 50%) than in remaining patients who did not develop HCC during follow-up (18/167; 11%).
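The Fisher exact comparisons reported in the Results (the two transducer comparisons and the visualization-score comparison) can be re-derived from the published 2 × 2 counts. The sketch below is a standard two-sided Fisher exact test written with the Python standard library; it is not the GraphPad Prism routine used in the study, so small rounding differences from the reported p values are possible. The function name and table labels are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the tables that are no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Each table: (group 1 with, group 1 without, group 2 with, group 2 without).
tables = {
    "transducer vs correlate on final ultrasound": (40, 26, 28, 17),
    "transducer vs correlate on CT or MRI": (12, 49, 13, 32),
    "visualization score B or C vs later HCC": (4, 4, 18, 149),
}
p_values = {name: fisher_exact_two_sided(*t) for name, t in tables.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Only the third comparison falls below the .05 threshold stated in the Statistical Analysis section.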
The eight patients developed HCC at a median of 2.0 years after detection of the baseline US-2 observation. In three patients, HCC was detected during routine surveillance imaging, performed by ultrasound in two (manifesting as a US-3 observation in both, prompting subsequent CT or MRI) and by MRI in one (Fig. 3). In four patients, HCC was detected by multiphasic CT or MRI performed after an acute rise in AFP. In the remaining patient, HCC was detected by CT performed after the new onset of abdominal distention, which showed infiltrative HCC with extensive tumor in vein (i.e., LR-TIV) (Fig. 4). The HCC was assigned a LI-RADS category on CT or MRI of LR-5 in seven and LR-TIV in one. Aside from the one patient with LR-TIV, early-stage HCC was diagnosed in the remaining seven patients.
Discussion
In this study, we assessed outcomes of subcentimeter observations detected by screening and surveillance ultrasound examinations (i.e., US-2 subthreshold observations) in high-risk patients. No US-2 observation became HCC during follow-up, and the observations commonly had no correlate on follow-up imaging, whether performed by ultrasound or by multiphasic CT or MRI. Among patients who underwent follow-up ultrasound at 2 years or later, no observation progressed to US-3 positive (i.e., grew to ≥ 10 mm). Among patients who underwent further characterization by multiphasic CT or MRI, correlates (if detected) were most commonly LR-1 or LR-2; only two correlates were classified as LR-3, and none were classified as LR-4 or higher. Eight patients developed HCC over the course of surveillance, but all HCCs were unrelated to the baseline US-2 observation, with none showing a correlate for the original US-2 observation on a follow-up ultrasound performed before HCC detection.
The US LI-RADS surveillance algorithm recommends a follow-up ultrasound examination at 3–6 months after initial detection of US-2 observations [
6]. Focal subcentimeter observations cannot be definitively characterized as HCC by imaging alone and thus in general do not alter immediate clinical management [
8]. A short follow-up interval of 3–6 months is recommended to detect growth in the event that the subcentimeter observation reflects a lesion on the HCC spectrum. An increase in size to 10 mm or greater on follow-up ultrasound would prompt further definitive characterization and potentially allow early HCC diagnosis.
The initial management of subcentimeter observations varies across professional society recommendations, ranging from the lack of any recommendation for a short-term repeat surveillance examination to immediate diagnostic characterization. Relatively intensive strategies are adopted by the Asian Pacific Association for the Study of the Liver [12] and Japanese Society of Hepatology [13], which recommend immediate diagnostic characterization by multiphasic CT or MRI (or CEUS in certain cases) to potentially allow prompt definitive HCC diagnosis. If the observation cannot be characterized as HCC by CT or MRI, then a short-interval surveillance ultrasound is recommended. This intensive approach may be motivated by the region's relatively low rate of liver transplantation, leading to a larger role for other early treatments for subcentimeter lesions, as well as the high incidence of HCC in patients with hepatitis B (a major cause of hepatocellular disease in Asia) [2]. On the other hand, the Korean Liver Cancer Study Group and National Cancer Center recommends neither immediate diagnostic characterization nor short-term ultrasound surveillance and instead recommends that patients continue routine ultrasound surveillance every 6 months [14]. The European Association for the Study of the Liver (EASL) recommends follow-up ultrasound every 4 months for 12 months for subcentimeter observations. EASL is unique in grading the level of supporting evidence and the strength of the recommendation, classifying the level of evidence as low (i.e., “any estimate of effect is uncertain”) and the recommendation as weak [15].
Although most subcentimeter observations are nonmalignant, the true cause of most such observations remains unknown. A prospective study by Forner et al. [16] performed before the introduction of US LI-RADS evaluated subcentimeter observations using fine-needle aspiration as the reference standard, but had a small sample of only 13 subcentimeter observations (two HCCs, three hemangiomas, and eight regenerative or dysplastic nodules). In another study conducted before the release of US LI-RADS, 45% of subcentimeter observations were not visualized or were considered indeterminate on follow-up imaging [17]. To our knowledge, the current study is the first to evaluate the outcomes of liver imaging findings that were clinically characterized as US-2 in a standardized manner in accordance with US LI-RADS.
The choice of follow-up imaging modality may also affect the rate of reidentification of US-2 observations at follow-up. In the present cohort, a correlate was identified on 61% of follow-up ultrasound examinations but on only 24% of follow-up multiphasic CT or MRI examinations. Furthermore, among observations without a correlate on CT or MRI, in 43% a correlate could be identified on a later follow-up ultrasound examination after the CT or MRI.
Because of the low progression rate, our study suggests that an extended period of intense follow-up for sonographically identified subcentimeter observations may not be routinely warranted. Indeed, intensive ultrasound surveillance at 3-month intervals compared with 6-month intervals has been shown to increase the likelihood of identifying subcentimeter focal observations without improving HCC detection or survival [17]. Furthermore, the development in eight patients of HCCs unrelated to the baseline US-2 observations likely reflects the expected incidence of HCC in high-risk patients (ranging from 0.3% to 8% per year) rather than an actual additional increase in HCC risk associated with subthreshold observations [2].
Society guidelines vary not only in terms of the initial management of subcentimeter observations, but also in terms of the total duration of required follow-up. If a subcentimeter observation is no longer visualized at initial follow-up, the Japanese Society of Hepatology recommends a return to routine surveillance, whereas EASL recommends 1-year follow-up [13, 15]. However, US LI-RADS and the American Association for the Study of Liver Diseases (AASLD) are stringent in requiring 2-year stability to characterize a subcentimeter observation as benign [2, 6]. If a subcentimeter observation shows no correlate on a follow-up ultrasound performed before completing 2 years of follow-up, US LI-RADS and AASLD are unclear as to whether to reclassify the findings as US-1 negative with a return to routine surveillance intervals or to continue surveillance examinations at 3- to 6-month intervals until reaching a full 2 years of follow-up. Because of this ambiguity, we evaluated outcomes after 2 years of imaging follow-up to provide a rigorous assessment of the risk of HCC associated with subcentimeter observations. However, given the low likelihood of recategorizing subcentimeter observations as US-3 at ultrasound follow-up or as HCC at diagnostic characterization, a shorter interval for intensive follow-up may be acceptable. Future iterations of clinical guidelines may thus indicate that a US-2 observation that is stable on a follow-up ultrasound performed earlier than 2 years after baseline can be reclassified as US-1, without requiring continued short-interval follow-up examinations for a full 2-year interval.
Our study had limitations. First, this was a single-institution retrospective study. A multicenter prospective design would help validate our findings. Second, pathologic findings were unavailable for most patients, and the cohort had heterogeneous follow-up by ultrasound, CT, or MRI, as well as by OLT in several patients. Nonetheless, a multiphasic CT or MRI was performed during the course of surveillance in most patients, despite US LI-RADS not recommending that CT or MRI be performed for follow-up of US-2 observations. Third, this study was performed in the United States, and a large proportion of patients had hepatitis B. Patient populations vary across institutions and geographic regions, and the prevalence and nature of focal observations may differ according to the population's distribution of risk factors (e.g., alcoholic cirrhosis, steatohepatitis) and exposures (e.g., aflatoxins) [2]. Thus, findings may differ in centers in which patients undergoing HCC screening and surveillance have a higher prevalence of other causes of chronic liver disease, such as alcoholic cirrhosis or steatohepatitis [18, 19]. Finally, radiologist and sonographer experience may affect ultrasound interpretations and the ability to reidentify observations detected on earlier examinations [20]; the design of the current study did not allow assessment of this potential association.
In conclusion, US-2 subthreshold observations are unlikely to progress or become HCC. At 2-year ultrasound follow-up, all US-2 subthreshold observations either showed no correlate or were stable. HCC was diagnosed in eight patients at a median of 2.0 years after the baseline examination, in all cases at a location in the liver separate from the baseline US-2 observation; no HCC arose from progression of a US-2 observation detected as part of surveillance imaging. On multiphasic CT or MRI, most US-2 observations either had no correlate or were characterized as LR-1 or LR-2. Our data suggest that most US-2 subthreshold observations are clinically insignificant and that an extended period of intensive follow-up, as recommended by multiple professional societies, may not be warranted.