Lung cancer screening with low-dose CT has shown favorable patient outcomes in national and international trials with strong evidence to support expansion of lung CT screening into clinical practice for appropriate patients at high risk for developing lung cancer [
1–
5]. As lung CT screening becomes more widely available, optimizing techniques for low-dose imaging protocols [
6–
8] is critical to minimize radiation exposure in the screening population while preserving diagnostic accuracy for both solid and sub-solid nodules. Current imaging guidelines for low-dose CT screening by the American College of Radiology (ACR) recommend a volume CT dose index of less than 3.0 mGy for a “standard-sized patient” [
9].
Recent studies have shown that further radiation dose reduction is possible. According to Mettler et al. [
10], the mean effective dose for posteroanterior and lateral radiographs of the chest is 0.1 mSv (range, 0.05–0.24 mSv). The effective dose of microdose chest CT examinations varies from 0.06 to 0.16 mSv [
6,
8]. At this dose level, chest CT can depict pulmonary nodules with an effective radiation dose comparable with that of a two-view chest radiograph [
11]. Microdose technique introduces inherent image degradation (increased noise levels); however, these limitations can be mitigated by different processing techniques to facilitate noise reduction and nodule detection.
Maximum-intensity-projection (MIP) reconstruction is a relatively simple postprocessing method applied to chest CT data that has been shown to increase nodule detection rates and temporal detection efficiency in both standard- and low-dose chest CT [
12]. In addition, many vendors offer software solutions for assisting the radiologist with CT screening for lung cancer by automated lesion detection. Computer-aided detection (CAD) systems rely on different computational algorithms with varying accuracy in lesion detection rates [
13,
14]. Multiple studies have investigated the utility of CAD in CT screening for lung cancer and have found that CAD software improves diagnostic sensitivity and specificity of screening nodule detection [
14,
15].
Several studies have detailed the effects of radiation dose, reconstruction kernel, MIP, and CAD on lung CT image quality and nodule detection [
16–
18]. However, to our knowledge, no comprehensive investigation has included all of these multivariate factors and examined their effect on diagnostic accuracy. The purpose of our study was to evaluate the lung nodule detection rate on standard-dose and microdose chest CT using two different CAD systems (SyngoCT-CAD, VA 20, Siemens Healthcare [CAD1]; Lung CAD, IntelliSpace Portal DX Server, Philips Healthcare [CAD2]) and MIP-processed images as well as the effect of variable reconstruction kernels.
Materials and Methods
Local ethics committee approval was waived because data were collected solely from anthropomorphic chest phantoms (chest phantom N1, Kyoto Kagaku).
Figure 1 shows the basic study design. The phantom represents an anatomically accurate life-sized model of a male human torso. The phantom features a synthetic heart, trachea, mediastinum, and lung architecture. The soft-tissue structures and the incorporated synthetic osseous structures are comparable with human tissues, providing similar absorption rates on CT. The lungs contained embedded nodules that varied in size and density. Four nodule diameters were used: 5, 8, 10, and 12 mm. Lesion attenuation simulated solid and ground-glass nodules measuring 100 HU and −630 HU, respectively (
Fig. 2). Solid and ground-glass nodules were randomly assigned to different bilateral lung segments and phantoms. In total, 133 ground-glass and 133 solid nodules were placed in 55 phantoms. Each phantom contained between 0 and 8 nodules. The allocation of nodules was recorded on a standardized spreadsheet by an independent investigator not involved in image interpretation.
Each phantom was then scanned with both standard-dose and microdose CT on a 128-MDCT scanner (Somatom Definition Flash, Siemens Healthcare) featuring iterative reconstruction algorithms (IRIS, Siemens Healthcare) and an integrated detector system (Stellar, Siemens Healthcare). The following scan parameters were applied for the microdose CT scan: helical acquisition with 80 kVp and 6 mAs. To achieve maximum dose reduction, automated dose reduction features were disabled on the scanner. A high pitch factor of 2.2 was chosen for further dose minimization. FOV was kept constant at 32 cm. Slice reconstruction thickness was 1 mm. Standard-dose chest CT scans were performed with 100 kVp and 100 Quality Reference mAs. All other imaging parameters were the same as those used for the microdose scan.
For raw dataset processing, level three iteration was applied; axial slices were reconstructed with three different kernels (i30, i50, and i70) at 1-mm slice thickness for interpretation; multiplanar reconstructions were not performed. Radiation dose was represented by the dose-length product (DLP) calculated by the scanner for a standardized body phantom with a diameter of 32 cm and a constant scan length (
z-axis) of 35 cm to image the entire phantom. Effective dose for both the standard and microdose CTs was calculated by multiplying DLP by the organ-specific weighting factor published by the International Commission on Radiological Protection [
19].
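The dose calculation described above amounts to a single multiplication, which the following minimal sketch illustrates. The conversion coefficient of 0.014 mSv/(mGy·cm) is an assumption (a commonly cited adult chest value); the text does not state which coefficient was applied. The function name and the DLP values are hypothetical and are not taken from the study.

```python
# Chest conversion coefficient in mSv per mGy*cm.
# ASSUMPTION: 0.014 is a commonly cited adult-chest value; the study does
# not report the coefficient it actually used.
ASSUMED_CHEST_K = 0.014


def effective_dose_msv(dlp_mgy_cm: float, k: float = ASSUMED_CHEST_K) -> float:
    """Estimate effective dose (mSv) as DLP (mGy*cm) times a conversion coefficient."""
    if dlp_mgy_cm < 0 or k <= 0:
        raise ValueError("DLP must be non-negative and k must be positive")
    return dlp_mgy_cm * k


# Hypothetical DLP values, for illustration only (not measured in the study).
for dlp in (10.0, 120.0):
    print(f"DLP {dlp:6.1f} mGy*cm -> {effective_dose_msv(dlp):.3f} mSv")
```

Because the estimate is linear in DLP, any change in the assumed coefficient scales both dose levels equally and leaves their ratio unchanged.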
Image Analysis
After data acquisition and processing, images were transferred to the PACS server (PACS IDS7, Sectra). Simultaneously, the imaging data were sent to CAD1 and CAD2. Four independent readers with between 3 and 5 years' experience in chest imaging read the MIP images on the PACS workstation. The readers were blinded to the location, distribution, composition, size, and number of embedded lung nodules. All scans were interpreted in a standard lung window setting (window center level, −600; window width, 1500) and with a MIP slab of 8 mm and a slice thickness of 2 mm. The image presentation sequence for each dose level (standard vs microdose) was randomly assigned. Furthermore, the radiologists were not aware of the CAD results at the time of MIP interpretation. They recorded the size and slice position at the maximum diameter for each suspected nodule. A fifth investigator, blinded to the distribution of the artificial lung nodules and the other readers' MIP interpretation results, performed the automated CAD1 and CAD2 analysis. The true-positive, false-positive, and false-negative results for each reader and interpretation technique were calculated.
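For readers unfamiliar with MIP processing, the following minimal sketch shows the principle of a sliding-slab maximum intensity projection on a stack of axial slices. It is not the PACS implementation used in the study. The function name is hypothetical, and the mapping of the reported settings onto the parameters (an 8-mm slab as 8 consecutive 1-mm slices, advanced 2 slices at a time) is an assumption about how those settings translate to slice indices.

```python
from typing import List

Slice = List[List[float]]  # one axial slice: rows of attenuation values in HU


def sliding_mip(volume: List[Slice], slab: int, step: int) -> List[Slice]:
    """Return MIP slabs: each output pixel is the maximum HU over `slab`
    consecutive slices, and successive slabs start `step` slices apart."""
    if slab < 1 or step < 1:
        raise ValueError("slab and step must be positive")
    if not volume:
        return []
    rows, cols = len(volume[0]), len(volume[0][0])
    slabs = []
    # If the volume is thinner than one slab, a single slab covers all slices.
    for start in range(0, max(len(volume) - slab, 0) + 1, step):
        block = volume[start:start + slab]
        slabs.append([[max(s[r][c] for s in block) for c in range(cols)]
                      for r in range(rows)])
    return slabs


# Toy volume: 10 slices of 2 x 2 "air" (-1000 HU) with one bright voxel
# (100 HU) in slice 4. All values are hypothetical.
toy = [[[-1000.0, -1000.0], [-1000.0, -1000.0]] for _ in range(10)]
toy[4][0][1] = 100.0
mips = sliding_mip(toy, slab=8, step=2)
print(len(mips), mips[0][0][1])
```

Because each slab keeps the brightest voxel along the projection direction, a small high-attenuation structure remains visible in every slab that contains it, whereas elongated vessels appear as continuous tubular structures.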
Statistical Analysis
Sensitivities for both CAD and MIP nodule detection on standard-dose and microdose CT were calculated for each reconstruction kernel. The total number of false-positive results as well as the false-positive rate in a phantom-based analysis were calculated and recorded. Dose, kernel, device, and nodule-based comparisons of sensitivities and false-positive rates were performed using the test of comparison of proportions (z-test). Because there were four readers for the MIP images, the mean sensitivities were pooled to 452 nodules. Dose levels were analyzed with the comparison of two means test. The level of significance was set to p < 0.05. MedCalc (version 7.6.0.0, MedCalc Software) was used for statistical analysis.
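As an illustration of the comparison of proportions, the sketch below computes sensitivity from true-positive and false-negative counts and applies a two-sided, pooled two-proportion z-test without continuity correction. This is a generic textbook form and is assumed here for illustration; the exact variant implemented in the statistical software may differ (for example, by applying a continuity correction), so p values from this sketch need not match those reported. Function names and counts are hypothetical.

```python
import math
from typing import Tuple


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity = TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Two-sided z-test for the difference of two proportions.

    Uses the pooled standard error and no continuity correction.
    Returns (z statistic, two-sided p value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        return 0.0, 1.0  # both proportions are 0 or 1: no evidence of a difference
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value


# Hypothetical counts: 90 of 100 nodules detected with one setup,
# 75 of 100 with another.
z, p = two_proportion_z_test(90, 100, 75, 100)
print(f"sensitivities {sensitivity(90, 10):.2f} vs {sensitivity(75, 25):.2f}, "
      f"z = {z:.2f}, p = {p:.4f}")
```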
Results
Nodule Detection With Computer-Aided Detection and Maximum Intensity Projection
Nodule detection sensitivities of the two different CAD systems and MIP at varying dose levels are outlined in
Table 1. At a standard dose, CAD1 showed similar overall detection sensitivities ranging from 96.0% to 97.3% (
p > 0.6) for both solid and ground-glass nodules at variable reconstruction kernels. Likewise, the best CAD performance on microdose CT was similar to that of standard-dose CT, with overall best sensitivities from each dose group of 96.0% and 97.3%, respectively (
p = 0.61).
In fact, no significant difference was seen in CAD1 performance between standard and microdose with one exception: The sensitivity for ground-glass nodules on microdose CT reconstructed with a hard lung window kernel (i70) was much lower when compared with standard-dose CT (78.8% vs 92.9%, respectively; p = 0.0044).
CAD2 performance was more variable than that of CAD1. The sensitivities of CAD2 and CAD1 for solid lung nodules were overall comparable for standard and microdose scans (CAD1, 98.2–99.1%; CAD2, 95.6–100%). However, CAD2 sensitivities for ground-glass nodules were significantly lower than what was seen with CAD1. By way of comparison, the sensitivity for ground-glass nodules with microdose CT and an i50 kernel on CAD1 was 92%; CAD2 had a sensitivity of only 12.4% (
p < 0.0001,
Table 1).
Sensitivity range for all nodules, doses, and reconstruction kernels was 88.9–97.3% for CAD1 and 49.6–73.9% for CAD2. The corresponding false-positive rates are shown in
Table 1. In the per-phantom analysis, false-positive test results with both CAD systems were highest with microdose CT and soft-tissue kernels (CAD1, 8.3; CAD2, 64.7). In comparing both CAD systems, CAD2 delivered consistently higher false-positive rates relative to CAD1 (
Table 1).
Highest sensitivity for lung nodule detection was achieved with the combination of standard-dose CT, i50 kernel, and CAD1 (97.3%). Of note, a similar sensitivity reaching 96% was achieved with the combination of microdose CT using i30 or i50 kernels and CAD1.
Using the MIP series as a stand-alone interpretive modality yielded sensitivity comparable with that of CAD1 for both solid and ground-glass nodules across variable dose and reconstruction kernels (
Table 1). As was the case with CAD1, MIP series were superior to CAD2 in detecting ground-glass nodules. However, MIP interpretation resulted in significantly fewer false-positives relative to CAD1 in the phantom-based analysis (MIP range: 0.04–0.16 per phantom; CAD1 range: 0.7–8.3 per phantom;
p < 0.0001). Comparing the best MIP result at standard dose level (99.1% sensitivity with i50 kernel) with the best CAD result (97.3% sensitivity on CAD1 with i50 kernel), there was no statistically significant difference in performance (
p = 0.14). At microdose level, the best MIP sensitivity (97.6% with i30 kernel) was comparable with the best CAD sensitivity (96.0% with i30 kernel;
p = 0.36).
Performance of Microdose CT
Mean phantom dose for microdose CT was significantly less than that for standard-dose CT (0.1323 mSv vs 1.65 mSv, respectively; p < 0.0001). Using the best MIP results for nodule detection, no statistically significant difference was seen in diagnostic accuracy and sensitivities between standard-dose and microdose CT (99.1% vs 97.6%, respectively; p = 0.1313). Likewise, the best CAD performance on microdose CT was similar to that of standard-dose CT, with overall best sensitivities from each dose group of 96.0% and 97.3%, respectively (p = 0.48).
Different Combinations of Maximum Intensity Projection, Computer-Aided Detection, and Reconstruction Kernel
MIP with standard-dose CT provided best performance for solid nodules with a soft-tissue reconstruction kernel (i30) compared with the medium (i50) and hard (i70) reconstruction algorithms (99.1% vs 97.4%, p < 0.0411). In the microdose examinations, i30 was also the most favorable reconstruction kernel for MIP interpretation when compared with i50 and i70 kernels (i30 vs i50, p < 0.0582; i30 vs i70, p < 0.0001). For the CAD analysis in microdose CT, i30 provided the best performance among the reconstruction algorithms (i30 vs i70, p < 0.0074).
With respect to the CAD-kernel relation, no statistically significant differences were seen in sensitivity between the three kernel reconstructions (p > 0.8) within each CAD. There were also no significant differences in sensitivity between CAD systems and kernels for solid nodules; however, as previously noted, CAD2 performed significantly worse than CAD1 for ground-glass nodules at each kernel reconstruction level.
Influence of Nodule Size and Composition
The influence of nodule size on lung lesion sensitivity relative to CT dose and kernel reconstruction is shown in
Table 2. In general, sensitivity increased with increasing nodule size for both CAD1 and MIP modalities with both standard and microdose CT and for all kernel reconstructions. The application of CAD1 performed as well for small nodules as it did for large nodules in standard and microdose technique (sensitivities between 5-mm and 12-mm nodules,
p = 0.077–0.1697). Also, no statistically significant difference was seen in CAD1 performance for nodule detection between standard and microdose CT. In comparing CAD systems, CAD1 performed significantly better than CAD2 for all nodule sizes irrespective of dose or kernel. Of note, CAD2 sensitivities were inversely related to nodule size.
The impact of kernel reconstruction on nodule detection is noted in
Tables 1 and
2. For both CAD and MIP with standard-dose CT, differences in sensitivities between kernels were not statistically significant. However, when applied to microdose CT, kernel reconstructions had a more appreciable impact on nodule detection relative to size. For example, an i30 kernel was significantly better than an i70 kernel for detection of nodules that measured 5 mm when using MIP reconstructions for microdose CT (sensitivities of 94.8% vs 62.9%, respectively;
p < 0.0331). No statistically significant difference was seen between kernels for larger nodules.
Overall, MIP and CAD performed similarly (
p = 0.0776–0.3563) regardless of nodule size or CT dose protocol. MIP performance was independent of nodule composition (solid or ground-glass) for both standard and microdose CT (
p = 0.15–0.24). When operated with standard-dose CT, the CAD1 system showed unimpaired performance, regardless of nodule composition (
p = 0.7). When tested with a hard lung kernel (i70) on microdose CT, CAD1 was significantly more sensitive for detecting solid nodules than for detecting ground-glass nodules (
p < 0.042). CAD2 was less reliable for detecting ground-glass nodules regardless of dose and kernel employed (
Table 1). Superior nodule perception was achieved when MIP and a soft-tissue kernel (i30) were used (
Table 1).
Satisfaction of Search in Maximum-Intensity-Projection Readings
To address the relationship of the number of missed nodules and the number of nodules embedded in the phantom, we separately analyzed sensitivity on the basis of nodules per phantom (
Table 3). A substantial satisfaction-of-search effect was seen after readers identified the first nodule in phantoms equipped with two lesions. However, this effect did not apply in phantoms with more than two lesions (three to eight nodules,
Table 3).
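The per-phantom stratification described above can be sketched as follows: reads are grouped by the number of nodules embedded in the phantom, and detections are pooled within each group. The function name, the input layout, and the example counts are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sensitivity_by_nodule_count(
    readings: Iterable[Tuple[int, int]],
) -> Dict[int, float]:
    """Pool sensitivity by number of nodules per phantom.

    `readings` holds one (nodules_present, nodules_detected) pair per read.
    Phantoms without nodules are skipped because sensitivity is undefined.
    """
    present = defaultdict(int)
    found = defaultdict(int)
    for n_present, n_found in readings:
        if n_present == 0:
            continue
        present[n_present] += n_present
        found[n_present] += n_found
    return {k: found[k] / present[k] for k in sorted(present)}


# Hypothetical reads: two phantoms with 2 nodules, two with 5, one with none.
example = [(2, 1), (2, 2), (5, 5), (0, 0), (5, 4)]
print(sensitivity_by_nodule_count(example))
```

A lower pooled sensitivity in the two-nodule group than in the other groups would be the pattern expected from a satisfaction-of-search effect.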
Discussion
Lung cancer is the leading cause of cancer-related deaths in the United States and worldwide [
20]. Screening with low-dose CT has been shown to reduce lung cancer-specific mortality through early detection [
2]. Low-dose CT screening for lung cancer has been endorsed by nearly every major international medical society and other agencies including the ACR, American Cancer Society, American College of Chest Physicians, American Thoracic Society, National Comprehensive Cancer Network, U.S. Preventive Services Task Force, and Centers for Medicare & Medicaid Services. Growing acceptance of low-dose CT screening has increased interest among patients who are at high risk for lung cancer, their physicians, and insurers, many of whom offer coverage for annual low-dose CT screening examinations and anticipate growing use of this important service. Given the potential impact on a large patient population, low-dose CT screening must be properly performed and optimized.
In this context, a discussion about the costs of CAD systems is warranted, because they may contribute to the overall cost-effectiveness of low-dose CT screening. The costs of CAD systems vary widely depending on the vendor and institutional contracts. Numerous large and small enterprises offer affordable solutions, with many scanner manufacturers offering CAD solutions embedded in their workspaces. With respect to effectiveness, CAD as a second reader increases sensitivity and reduces false-negatives at lower cost than double reading by a second radiologist [
21,
22]. However, increased sensitivity with CAD comes at a cost of increased reading times. Beyer and colleagues [
21] found a mean increase in reading time of 45 seconds when using CAD as a second reader compared with readings without CAD. Notwithstanding the increase in time for interpretation, current European lung cancer screening guidelines recommend the general use of CAD systems to improve quality, outcomes, and cost-effectiveness [
22].
Many parameters influence the accuracy and efficiency of pulmonary nodule detection on chest CT. This study compared the detection rate of solid and ground-glass pulmonary nodules of variable size across different CT dose levels and reconstruction kernels, with interpretation by one of two CAD systems or by MIP processing. The data show a significant advantage of CAD1 over CAD2, with the former exhibiting higher sensitivities in both the standard-dose and microdose spectra. The difference was most pronounced for ground-glass lesions, for which CAD1 significantly outperformed CAD2. Notably, readers using MIP alone showed better detection rates than those using CAD2. CAD2 performed best with standard-dose CT and a hard lung kernel (i70), which likely reflects the most common imaging parameters for routine clinical diagnostic chest CT, but the performance of CAD2 on lower dose examinations and datasets with softer reconstruction kernels was suboptimal.
When we compared the performance of both CAD systems, an inverse relationship between nodule size and sensitivity for CAD2 was evident. Our findings indicate that CAD2 sensitivity decreases with increasing lesion diameter (
Table 2). Since both CAD systems were tested with the same scans, this issue is most likely attributable to the software algorithm applied in CAD2. CAD detection sensitivity has been reported to typically decrease with decreasing nodule size [
23,
24]. In a study by Yuan et al. [
25], the CAD system tested missed seven large (> 10 mm) nodules out of 27 lesions. Yuan and colleagues were able to verify that these large nodules were missed because of immediate continuity with normal anatomic structures such as fissures, pleura, or lung vessels. We hypothesize that the CAD2 system used in this study suffered from similar segmentation errors.
Radiation dose minimization places further demands on CAD systems. Systems must perform well not only with standard-dose CT; current imaging guidelines and expectations also demand strong performance in the low-dose imaging range. In our study, CAD1 and the MIP reconstructions met these demands (
Table 1). When the sensitivity of the two CAD systems was compared across image reconstruction kernels, microdose CT performed better in combination with the soft-tissue kernel (i30) on both systems. Conversely, the performance of both CAD systems on microdose CT decreased with i50 and i70 kernels. This trend is likely a result of variations in the signal-to-noise ratio. Because tube voltage and tube current are minimized in microdose CT, image noise increases; the noise level also depends directly on the applied convolution kernel, with higher (harder) kernels producing inferior signal-to-noise ratios compared with standard-dose CT. Using soft-tissue reconstruction kernels can improve the accuracy of nodule detection on microdose CT examinations when used in conjunction with CAD or MIP processing. In a study by Ebner et al. [
6], readers favored soft-tissue kernels over standard lung kernels for microdose CT. In contrast, standard-dose chest CT was more accurate with a hard kernel (i70). The present data indicate that soft-tissue kernels increase sensitivity and, in turn, the false-positive rate in microdose examinations.
Despite the influence of a satisfaction-of-search effect in phantoms containing two lesions, our findings suggest that MIP reconstructions might deliver similar sensitivities in a screening population and thus be able to serve as a substitute for CAD systems. Furthermore, MIP reconstructions are feasible without requiring more expensive CAD software solutions.
The two CAD systems we studied exhibited distinct differences in their false-positive rates in a phantom-based analysis. The false-positive results of CAD2 were more than 10 times higher than those of CAD1. Although CAD2 delivered markedly higher false-positive results, pooled sensitivity remained low. These discrepant results are not dose- or kernel-dependent and may be due to differences in software algorithms. Similar to the dose-kernel relationship, false-positive findings were minimized in standard-dose CT scans to which CAD1 was applied.
The current study shares inherent limitations related to the nature of phantom experimentation, such as lack of anatomic detail, interindividual demographic variation, and imaging artifacts. These limitations were minimized as much as possible by using a lifelike replica of a human torso, but the results will require validation in prospective trials with human subjects. An additional consideration in using a phantom is that the artificial lung parenchyma and synthetic nodules result in images with sharp contrast compared with the air-attenuating background, which may explain the excellent performance of all stand-alone reading devices, especially CAD1. Another point to consider is that the CAD software of both vendors is primarily designed to detect solid lung lesions. Although the vendors claim that their systems can also detect subsolid lesions, a lack of optimization for ground-glass nodules may explain the inferior performance of CAD2. Finally, a general limitation for CAD systems is the high false-positive rate (CAD1, maximum of eight false-positives per study; CAD2, maximum of 64.7 false-positives per study), which resulted in reduced specificity. Furthermore, we recognize that both dose and CT interpretation in a phantom study may not accurately reflect conditions encountered in clinical practice. The actual performance of CAD and MIP systems in a clinical setting may vary to a substantial degree not only because of the inherent differences of each system but also because of the nodule prevalence in a screening cohort. The present evaluation showed a high frequency of lung nodules (mean, 4.84 per phantom); only five phantoms held no lesions, which may have introduced a selection bias.