Original Research
1 Department of Radiology, Medical College of Wisconsin, 9200 W Wisconsin Ave.,
Milwaukee, WI 53226-3596.
2 Present address: Department of Radiology, Hacettepe University Faculty of
Medicine, Sihhiye, Ankara 06100, Turkey.
3 Present address: Department of Radiology, University of Maryland Medical
System, Baltimore, MD 21201.
4 GE Healthcare, Waukesha, WI 53188.
Received December 2, 2004;
accepted after revision February 10, 2005.
Supported by a grant from GE Healthcare, Milwaukee, WI.
Abstract
MATERIALS AND METHODS. Fifty nodules, less than 20 mm in diameter, in 29 patients were scanned with 1.25-mm collimation using MDCT (time 1 = T1). During the same session, two additional scans, using identical technique, were obtained through each nodule (T2, T3). Three observers working independently then obtained volumetric measurements using a semiautomated volumetric nodule-sizing software package. Qualitative nodule characterization was also performed. The Bland-Altman method for assessing measurement agreement was used to calculate the 95% limits for agreement for nodule volumes at T1, T2, and T3.
RESULTS. Automated nodule segmentation was successful in 438 (97%) of 450 measurements. Forty-three nodules were available for final evaluation. Twenty-six nodules had well-defined edges, and 17 had irregular or spiculated margins. Seventeen were freestanding, 16 were juxtapleural, and 10 were juxtavascular in location. Average nodule volume was 345.5 mm³ (range, 49.3–1,434 mm³). The mean interobserver variability (repeatability) was 0.018% (SD = 0.73%), and the SD of the mean for the three contemporaneous scans (reproducibility) was 13.1% (confidence limits, ± 25.6%). SD and confidence limits narrowed as volumes increased.
CONCLUSION. Volumetric measurements show minimal interobserver variability (0.018%) but an interscan SEM of 13.1% (confidence limits, ± 25.6%). Repeatability and reproducibility of volumetric measurements are better than those of linear measurements reported in the literature.
Keywords: chest imaging CT CT image processing CT technique lung lung cancer lung nodule
Introduction
With the introduction of multidetector scanners, thinner slices are more routinely available and better z-axis resolution renders nodule volume measurements more accurate. GE Healthcare has recently developed a software program (Advanced Lung Analysis [ALA]) that can semiautomatically determine nodule volume. This program segments and separates a nodule from adjacent structures and then sums the weighted voxels to determine volume. Before such a system is clinically useful, it must show repeatable and reproducible measurements in vivo. Measurement reproducibility is more critical than assessing absolute nodule volume because proportional change in volume over time is more clinically relevant than the absolute change in volume. If successful, nodule volume determination software should permit physicians to determine growth rates (doubling time) more accurately to assess the need for intervention and assess response to chemotherapy. It will also provide a method for determining doubling times of various primary tumors, metastatic nodules, and benign nodules, thereby improving our understanding of growth characteristics of many entities.
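To illustrate how a doubling time could be derived from two volume measurements, the following minimal sketch assumes simple exponential growth between scans. The exponential model and the function name `doubling_time_days` are assumptions for illustration and are not taken from the software evaluated in this study.

```python
import math


def doubling_time_days(v1_mm3, v2_mm3, interval_days):
    """Estimate volume doubling time, assuming exponential growth.

    v1_mm3 and v2_mm3 are nodule volumes at the first and second scan;
    interval_days is the time between the two scans.
    """
    if v1_mm3 <= 0 or v2_mm3 <= 0:
        raise ValueError("volumes must be positive")
    if interval_days <= 0:
        raise ValueError("interval must be positive")
    if v1_mm3 == v2_mm3:
        return math.inf  # no measured change, so no finite doubling time
    # V2 = V1 * 2 ** (t / DT)  =>  DT = t * ln 2 / ln(V2 / V1)
    return interval_days * math.log(2) / math.log(v2_mm3 / v1_mm3)
```

A negative result indicates a nodule that shrank over the interval. Because the estimate depends on the ratio of two measured volumes, interscan variability in either measurement carries directly into the computed doubling time.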
Materials and Methods
Data Acquisition
The initial CT scan was obtained in a manner appropriate for the patient's
medical condition. The patient remained supine throughout the study. After a
10- to 20-min pause, the nodule or nodules in question were rescanned twice on
separate breath-holds (T2, T3). In eight patients, IV contrast material was
given at the time of the initial diagnostic scan because it was clinically
indicated. No additional contrast material was administered at T2 or T3. All
CT scans were obtained on an 8- or 16-MDCT scanner (LightSpeed Ultra or
LightSpeed 16, GE Healthcare) with 120 kVp, 200–400 mA, 0.5–0.8
sec, pitch of 1.35–1.375, and 5 mm (1.25 mm x 4 thickness) as a
standard technique. Patients were instructed to "breathe deeply and hold
your breath." No attempt was made to standardize lung volumes. Scans
were viewed at 1.25-mm section thickness and high-resolution (bone) algorithm.
For each patient, the scanning parameters were identical at T1, T2, and
T3.
Fifty nodules in 29 patients were scanned over a 5-month period. No patient was rescanned at more than two levels, and no patient had more than three nodules included in this study. Patient-identifying information was removed, and cases were randomized before measurements were made.
Nodule Analysis
The CT data were transferred electronically to a workstation (Advantage
Windows, GE Healthcare). One investigator chose the nodules to be evaluated
and assigned each patient's three scans (T1, T2, T3) randomly into one of
three groups (A, B, or C) for interpretation. Working independently, two
experienced thoracic radiologists (7 and 30 years' experience) and an
untrained biomedical engineering intern measured each nodule (T1, T2, T3)
using the volumetric (3D) analysis software (Figs. 1A and 1B). If automated segmentation
failed, it was attempted a second time. After two failures, a nodule was
eliminated from the study. At least 2 weeks passed between analyses of images
in groups A, B, and C.
At T1, both radiologists, using a checklist of agreed criteria, characterized nodule location (lobe), lung position (juxtapleural, juxtavascular, or interparenchymal), and lung background (normal, chronic obstructive pulmonary disease [COPD], fibrosis, or ground-glass opacification). They also rated nodule edge characteristics (smooth, lobular, irregular, or spiculated) and perceived nodule density (solid, totally calcified, or mixed). A third radiologist resolved differences in categorization between the two radiologists involved in scoring.
Interobserver agreement of nodule volume was determined on T1 readings (n = 150). Interscan agreement was evaluated on the basis of three independent reviewers evaluating three contemporaneous scans (T1, T2, T3) (n = 450).
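The interscan agreement calculation can be sketched as follows. This is a simplified illustration of Bland-Altman-style 95% limits, assuming each scan's volume is expressed as a percentage deviation from that nodule's mean over its repeated scans; the exact computation used in the study is not given in the text, and the function name `interscan_limits` is hypothetical.

```python
import statistics


def interscan_limits(volumes_by_nodule):
    """Return (bias, sd, (lower, upper)) in percent.

    volumes_by_nodule is a sequence of per-nodule volume lists,
    e.g. [[T1, T2, T3], ...] in mm^3.
    """
    deviations = []
    for scans in volumes_by_nodule:
        mean_volume = statistics.fmean(scans)
        # Percentage deviation of each scan from the nodule's own mean.
        deviations.extend(100.0 * (v - mean_volume) / mean_volume for v in scans)
    bias = statistics.fmean(deviations)
    sd = statistics.stdev(deviations)  # simplification: pooled sample SD
    # 95% limits of agreement: bias +/- 1.96 SD
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)
```

Under this convention the limits are 1.96 times the SD, which is consistent with the pairing of the values reported here (1.96 × 13.1% is about 25.7%, close to the stated ± 25.6%).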
Segmentation and volume calculation were automatic. To begin the segmentation and sizing routine, the user first places a cursor on any point in a nodule of interest in an axial view. The nodule bookmark location is optimized and repositioned automatically so that nodule volume computation always begins from the same seed point. After the seed point is repositioned, watershed segmentation is performed to separate nodule components from the lung parenchyma. Finally, a model-based shape analysis is performed to determine anatomic characteristics of various nodule types. This permits unique handling of the different anatomic presentations of lung nodules, including separation from adjacent chest wall or mediastinal structures and segmentation of vascular connections before nodule volume estimation. The nodule volume calculation uses a weighted sum of voxel volume on the border of the nodule based on a priori knowledge of the CT scanner point-spread function impact on nodule edges.
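The final step, a weighted sum of voxel volumes, can be sketched as below. This is only an illustration of the idea: the per-voxel weights used by the actual software come from its point-spread function model and are not described here, so the `weights` array (1.0 for interior voxels, fractional values on the border, 0.0 outside) and the function name are assumptions.

```python
def weighted_volume_mm3(weights, spacing_mm):
    """Sum per-voxel weights and scale by the volume of one voxel.

    weights is a nested list indexed [slice][row][column] with values
    between 0.0 and 1.0; spacing_mm is (dx, dy, dz) in millimetres.
    """
    dx, dy, dz = spacing_mm
    voxel_volume = dx * dy * dz
    total_weight = sum(w for plane in weights for row in plane for w in row)
    return voxel_volume * total_weight
```

For example, with hypothetical 0.7 × 0.7 mm pixels and 1.25-mm sections, one fully included voxel plus two half-weighted border voxels contribute the volume of two voxels.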
Results
The population of 43 patients included 27 women and 16 men. The average age was 60.2 years (range, 35–84 years). The CT characteristics and locations of the nodules are listed in Tables 1 and 2. Seventeen nodules were freestanding in the parenchyma, 16 were juxtapleural, and 10 were juxtavascular.
The average nodule volume was 345.5 mm³ (SD = 361.1 mm³) with a range of 49.3–1,434 mm³. The distribution of nodule volumes is displayed in Figure 4 along with equivalent diameters estimated from the calculated volumes as though the nodules were spherical. Thirty-five (81%) of the 43 nodules were 10.5 mm or smaller.
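The conversion between a measured volume and its equivalent spherical diameter follows from the volume of a sphere, V = πd³/6. A minimal sketch, with hypothetical function names:

```python
import math


def equivalent_diameter_mm(volume_mm3):
    """Diameter of the sphere whose volume equals the measured volume."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


def sphere_volume_mm3(diameter_mm):
    """Volume of a sphere with the given diameter."""
    return math.pi * diameter_mm ** 3 / 6.0
```

Applied to the average volume of 345.5 mm³, this gives an equivalent diameter of roughly 8.7 mm.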
The sample was then subdivided to compare subgroups of nodules. No significant differences were found between these relatively small subgroups (Table 2). The SD of the mean for 27 smooth and lobulated nodules was 13.5% (95% confidence limits, ± 26.4%) and for 17 irregular and spiculated nodules was 12.8% (95% confidence limits, ± 25.2%). The SD of the mean for eight calcified nodules was 11.6% (95% confidence limits, ± 22.7%); for eight contrast-enhanced nodules, 16.6% (95% confidence limits, ± 32.7%); and for the remaining 28 nodules, 12.6% (95% confidence limits, ± 24.8%). Note that one nodule was both calcified and contrast-enhanced. When unenhanced noncalcified nodules were compared with contrast-enhanced or calcified nodules or with contrast-enhanced calcified nodules, the SD of the mean was 12.6% (95% confidence limits, ± 24.8%) and 14.5% (95% confidence limits, ± 28.4%), respectively. We compared 30 nodules surrounded by normal lung tissue with 10 found in diffusely abnormal lung background and found an SD of 13.1% (95% confidence limits, ± 25.6%) and 13.5% (95% confidence limits, ± 27.1%), respectively.
The sample was subdivided into nodules according to effective diameter,
which is the diameter of a sphere with the measured nodule volume. The
effective diameter ranges were chosen to evenly distribute the 43 nodules over
three size categories for statistical purposes: < 6 mm (n = 13), 6
to < 9 mm (n = 16), and 9–19 mm (n = 14). The SD of
the mean was 14.1% (± 27.67%), 16.0% (± 31.4%), and 9.1%
(± 17.9%), respectively. Note that for larger nodules (≥ 9 mm) the
standard deviation of the mean was lower and the CI narrowed.
Figure 5 depicts the 95%
confidence limits for nodules above a given volume, illustrating that
confidence limits narrow as nodule volume increases.
Discussion
In daily practice, linear measurements in the axial plane are commonly used to assess size changes. The World Health Organization uses a bidimensional cross product to follow nodule size (the largest diameter x the perpendicular length), whereas the Response Evaluation Criteria in Solid Tumors (RECIST) protocol uses the largest dimension [13, 14]. If nodules were perfectly spherical, a change in diameter would accurately reflect overall changes in volume. However, nodules are frequently lobular, so judgments based on long- and short-axis measurements are necessarily subjective. In addition, nodule margins may be spiculated or indistinct, making it difficult to define borders precisely.
If important clinical decisions are to be made based on serial scans, the measurement tools must be observer-independent and reproducible from scan to scan. A 10% (or 1 mm) error in the diameter measurement of a 10-mm nodule could result in a perceived 39% change in volume, a potentially clinically significant error.
The literature on accuracy of measurements of CT-detected nodules is limited and difficult to summarize because methodology varies from study to study and results are expressed differently by different authors.
For example, Wormanns et al. [6] assessed agreement between linear measurements of pulmonary nodules (diameter, 2–40 mm). Using hard-copy images, the reviewers had to classify a nodule as smaller than 5 mm, 5–10 mm, or larger than 10 mm. They found an interobserver correlation of 0.91 and 0.89 for nodules reconstructed at 3 and 5 mm, respectively. For the study conducted by Revel et al. [7], three radiologists measured 54 solid nodules on a PACS workstation three times in one session. Those authors concluded that to ensure a true increase in volume on serial scans (95% CI) when measurements were made by the same radiologist, the nodule diameter would have to increase by approximately 1.6 mm. With more than one reviewer, the nodule would have to increase by 1.7 mm to diagnose growth with certainty. Erasmus et al. [15] showed considerable intraobserver variability and even more interobserver variability (five radiologists, two readings) in both one-dimensional and 2D measurements in 33 patients with lung tumors (diameter, 1.8–8 cm). Intraobserver variability would have led to misclassification of growth in 9.5% of unidimensional measurements and 21% of bidimensional measurements. Misclassification due to interobserver variability was 30% and 43%, respectively.
Schwartz et al. [3] studied three types of tumors measured with handheld calipers, electronic calipers, and automated perimeter contour detection. Hand calipers and electronic calipers had an interobserver coefficient of variation of 0.19 and 0.17, respectively. Using automated perimeter contour detection, the coefficient of variation fell to 0.09.
These limitations of linear measurements and the nonspherical shape of most nodules have caused investigators to focus on the use of volumetric computerized methods. Yankelevitz et al. [9] introduced semiautomated nodule volume calculation software and assessed it in a phantom using both spherical and deformable silicone nodules (diameter, 3.9–11 mm). Phantom nodule volumes could be measured accurately to within ± 3%. Two recent studies have focused specifically on the problems of repeatability and reproducibility of volumetric measurement in vivo. Wormanns et al. [16] obtained two whole-lung CT scans within 10 min on 10 patients with multiple nodules. Using 50 nodules (diameter, 2–20 mm), they assessed intra- and interobserver agreement between two reviewers and then interscan agreement. Automated volumetric software was used for all measurements. Intraobserver variability was 0.5% (95% CI, 0.2–1.6%) and interobserver variability was 0.5% (95% CI, -3.0% to 1.4%). Interscan variability increased to approximately ± 20%.
Revel et al. [17] reported on 54 solid nodules evaluated three times by three reviewers during the same session using the same software that we used in our experiment. Segmentation was successful in 96%. Intraobserver variability ranged from 2.4% to 3.1%. Interobserver agreement was perfect in 35 patients (67%). The small sample size of the remaining nodules limited further statistical analysis.
Variables that may affect the accuracy of volume measurements have also been studied. Ko et al. [18] showed, in a phantom study of nodules (< 5 mm diameter), that higher precision computerized volumetric measurements could be obtained with the use of a high-frequency algorithm and diagnostic CT technique (120 mAs) rather than low amperage. Ground-glass nodules and small size were associated with increased measurement variability. In a phantom study, Winer-Muram et al. [8] showed that nodule volume was overestimated more on thick-section CT images than on thin-section CT images.
The design of our study incorporated the use of a high-frequency algorithm, thin-section images (1.25 mm), and a diagnostic CT technique. The technique was held constant on three consecutive scans. These parameters were designed to minimize variability of measurement related to the scan technique itself. One variable that was not held constant was lung volume, which could have been controlled more precisely by the use of a spirometer. Instead, all CT scans were obtained with instructions for deep inspiration, which is consistent with actual clinical practice.
The first question posed by our study investigated interobserver repeatability. Variation among three reviewers was extremely low (0.018%) for nodules between approximately 4 and 20 mm in diameter. Thus, for a given nodule, regardless of shape and edge characteristics, three observers obtained almost identical volumes. This finding is similar to those of the in vitro studies of Yankelevitz et al. [9] and Ko et al. [18] and the in vivo studies of Wormanns et al. [16] and Revel et al. [17].
The second question examines variations in measurement when the same nodule is scanned three times over 20 min. The SD of the mean was 13.1% (CI, ± 25.6%). This is better than literature reports for linear measurements on the same scan but still high for clinical work. Even for nodules over 9 mm in diameter, the SEM was 9.1% and confidence limits were ± 17.9%.
This study departs from previous work, except the recent work of Wormanns et al. [16], in its assessment of the same nodule scanned more than once. Because there is negligible interobserver variability, the nodule itself and its margins with surrounding lung parenchyma must vary from scan to scan. It is hypothesized that a number of changes may occur between serial scans that may affect results of both manual linear and automated volume measurements. Physiologic changes such as lung volume, phase of cardiac cycle, microatelectasis, or patient position on the table may occur, and technical changes such as slice registration and selection may lead to varying amounts of volume averaging. Volume averaging in our study was minimized by using 1.25-mm axial images. Boll et al. [10], using cardiac gating, recently showed that small nodules near the heart show as much as 34% volume change during the cardiac cycle.
This study was designed to assess all types of nodules rather than one specific type. The ability to analyze the subgroups of nodules is severely limited. Only a small percentage of nodules were of ground-glass or mixed attenuation. These are clinically considered to be the nodules most suspicious for cancer [19]. These nodules are probably more difficult to measure reproducibly than solid nodules. We did not have a sufficiently large sample of such nodules to examine this question. The inclusion of eight completely calcified nodules also represents a limitation. Dense or complete calcification, except in certain sarcomas, is considered a benign characteristic, one not requiring measurement. It is possible that the automated volumes are more reliable in calcified nodules, which have sharp edges; thus, our study results appear more favorable than if only noncalcified nodules had been evaluated. However, the SD of the mean for our eight calcified nodules was 11.6% (± 22.7%) versus 12.9% (± 25.2%) for the noncalcified unenhanced group.
Does IV contrast material change the density of the nodule and its volumetric measurement? This is a frequent clinical scenario because the initial scan is often obtained with contrast enhancement and the follow-up, without contrast enhancement. Eight contrast-enhanced nodules were included in this study. The volume between the first and third scans decreased 7.2%, well within the confidence limits. The SD of the mean of the eight contrast-enhanced nodules was 16.6% (CI, ± 32.7%) versus 12.6% (± 24.8%) for the noncalcified unenhanced group. The role of IV contrast enhancement requires further study.
Segmentation failed in six (12%) of 50 nodules. In four nodules in which segmentation failed, a vessel was included in the automated segmentation. This is easily visible to the operator and could be electronically cropped. In our study, we chose not to allow the operator to alter the bounding box defining the nodule and its immediate surroundings before the volumes were calculated. Thus, some failed segmentations might have been rescued with minor intervention.
Results of this study suggest that the overall variability of volume measurements is considerably less with the automated software than with manual measurements. This is an important result and has immediate implications. In clinical practice, it suggests that high-quality volumetric measurements, where available, are less variable than linear measurements and should improve assessment of nodule stability. However, caution is still required in applying this tool because the overall variability between scans in vivo is still substantial, with an SD of the mean of 13.1% and wide confidence limits (± 25.6%). Because one observer in our study was a nonphysician graduate student, it appears that a trained radiologist is not required to produce consistent measurements. Nonetheless, six nodules did not segment properly and a vague ground-glass nodule gave unreliable results, so volume measurements must still be overseen by a trained observer.
Acknowledgments
We thank Sylvia Bartz for her secretarial assistance. For their technical
support, we are grateful to Beth Heckel and Saad Sirohey of GE Healthcare and
Maureen Levenhagen and Mary Thielke of the Medical College of Wisconsin.