Original Research
1 Kurt Rossmann Laboratories for Radiologic Image Research, Department of
Radiology, University of Chicago, Chicago, IL.
2 Present address: R & D Center, Konica Minolta Medical and Graphic, 2970
Ishikawa-machi, Hachioji City, Tokyo, 192-8505, Japan.
Received September 1, 2007;
accepted after revision January 8, 2008.
Supported by USPHS grants CA62625 and CA98119.
Abstract
MATERIALS AND METHODS. We evaluated posteroanterior and lateral chest images of 21 patients with vertebral fractures, 31 patients with lung nodules, and 12 persons acting as controls. The total number of subjects was 60 because both lesions were present in four patients. Eighteen radiologists were asked to detect vertebral fractures and nodules simultaneously on posteroanterior and lateral images. The radiologists indicated their confidence level ratings regarding the presence or absence of lesions and the most likely location of each lesion on either posteroanterior or lateral images, first without and then with CAD output. The observers' performance was evaluated with use of receiver operating characteristic (ROC) and jackknife free-response ROC curves.
RESULTS. With the CAD scheme, the average area under the ROC curve for detection of vertebral fractures improved from 0.906 to 0.951 (p = 0.002). That for lung nodules also improved, but the improvement was not statistically significant (0.804–0.816, p = 0.297). The figure-of-merit values obtained with the jackknife free-response ROC program improved from 0.585 to 0.680 (p < 0.001) for vertebral fractures and from 0.622 to 0.650 (p = 0.017) for nodules, both results having statistical significance. Average sensitivity in the detection of lesions improved from 59.8% to 69.3% for vertebral fractures and from 64.9% to 67.6% for nodules.
CONCLUSION. In the detection of vertebral fractures and lung nodules on chest images, diagnostic accuracy among radiologists improves with the use of CAD.
Keywords: chest radiography, computer-aided diagnosis, lung nodules, observer study, vertebral fractures
Introduction
Chest radiography with posteroanterior and lateral views is one of the most frequent examinations in radiology. Vertebral fractures can often be detected on lateral chest radiographs. According to some results [7–10], however, only 15–60% of vertebral fractures detected on lateral chest radiographs are mentioned in radiology reports. The others are underreported or underdiagnosed. To assist radiologists with image interpretation, we developed a computer-aided diagnosis (CAD) scheme for detection of vertebral fractures on lateral chest images. We reported [11] that the sensitivity of our method for detection of fractures was 75% (24 of 32 cases) with a false-positive rate of 1.03 (33/32) fractures per image in a validation test. Because of those results, we wanted to evaluate whether our computerized scheme would help radiologists in detecting vertebral fractures on lateral chest images.
Because lateral chest radiographs usually are obtained for purposes other than fracture detection, such as evaluation of cardiovascular and pulmonary diseases, radiologists are generally not alerted to evidence of vertebral fractures, and it is difficult to perform an observer study for detection of these fractures on these radiographs. Another reason vertebral fractures on lateral chest radiographs are not mentioned in radiology reports may be that these lesions are not considered important, even though radiologists are aware of their presence. If they are prompted to pay special attention to vertebral fractures in an unusual situation, such as an observer study, radiologists may be able to detect all vertebral fractures.
Because the presence of nodules may reduce radiologists' attention to the detection of vertebral fractures, we conducted an observer study to simultaneously evaluate three computerized methods for detection of vertebral fractures on lateral chest images [11] and lung nodules on posteroanterior [12] and lateral chest images [13]. In an observer study of nodule detection, Kobayashi et al. [14] evaluated a CAD scheme for detection of lung nodules on posteroanterior images. They found that such a scheme can assist radiologists in the detection of lung nodules on posteroanterior images. To our knowledge, however, a CAD scheme for detection of lung nodules on lateral chest radiographs has not been evaluated in an observer study. The purpose of this study was to evaluate retrospectively whether our CAD schemes for detection of vertebral fractures and lung nodules on lateral chest radiographs can help radiologists in image interpretation.
Materials and Methods
Database
We used two databases, a fracture database and a nodule database. The
fracture database consisted of 1,000 cases in which posteroanterior and
lateral chest radiographs had been obtained with a computed radiography system
with the patient in the upright position. The radiographs were obtained
consecutively from January 2005 to May 2005 in our department of radiology.
The 437 men and 563 women were 65 years old or older (mean age, 76 years).
Each image had a matrix size of 1,760 × 2,140, 1,760 × 1,760, or 2,140 × 2,140 with 1,024 gray levels and a pixel size of 0.2 mm. We used a subjective judgment proposed by Genant et al. [15] for classifying the severity of vertebral fractures. A grade for a case was determined according to the grade of the most severe fracture in the case. A radiologist initially classified the 1,000 lateral chest radiographs into cases with no fracture (grade 0) or fracture of mild deformity (grade 1), moderate deformity (grade 2), or severe deformity (grade 3). In a second step, 520 of the cases, including all grade 2 and grade 3 cases and some grade 0 and 1 cases, were selected. To reduce the time for radiologists to select the cases to use in the study, we excluded 480 cases: the other grade 0 and 1 cases and 66 cases (6.6% of the original 1,000) that met the exclusion criteria [11]. The 520 cases (all grades 2 and 3 and 100 grade 0 and 1 cases selected from the latest examinations in the study period) were classified by the aforementioned radiologist and by a second radiologist independently using the same guidelines. We evaluated the effectiveness of our computerized method for differentiating cases of severe vertebral fractures (grade 3) from normal cases (grade 0). To avoid the effect of bias, we believe that all cases in all grades should be included in future studies.
The nodule database consisted of 426 cases of lung nodules on posteroanterior and lateral chest images obtained with the patient in the upright position. The images were acquired from January 1999 to July 2004 at the same institution as the vertebral fracture images. The 426 lung nodules were classified by three radiologists according to degree of subtlety for visual detection. The five degrees of subtlety in the detection of a lung nodule were defined for all nodule cases as extremely subtle, very subtle, subtle, relatively obvious, and obvious. Images in the nodule database were obtained with the same computed radiography system used for the fracture database with the same image size and image processing.
CAD Scheme
We used three computerized schemes for this observer study: detection of
vertebral fractures on lateral chest images and of lung nodules on
posteroanterior and lateral chest images. Our computerized method
[11] for detection of
vertebral fractures on lateral chest radiographs was based on depiction of
upper and lower vertebral edges by use of a multiple-threshold technique
followed by feature analysis. A curved search area, which included a number of
vertebral end plates, was first extracted automatically and then was
straightened so that vertebral end plates became oriented horizontally. Edge
candidates were enhanced with a horizontal line-enhancement filter on the
straightened image. A multiple-threshold technique followed by feature
analysis was used for identification of the vertebral end plates. Feature
values such as area of a candidate, angle between an estimated vertebral
centerline and the candidate, and the distance between the estimated vertebral
centerline and a centroid of the candidate were used for elimination of
false-positive edges. The height of each vertebra was determined from
locations of identified vertebral end plates. Vertebral fractures were
detected by comparison of the measured vertebral height with the expected
height.
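The final step, comparing each measured vertebral height with an expected height, can be sketched as follows. This is an illustration only: the input format, the way the expected height is estimated (median of neighboring vertebrae), and the 0.6 ratio threshold are assumptions made for the example and are not taken from the method in [11]. Only the 0.2-mm pixel size comes from this study.

```python
from statistics import median

PIXEL_SIZE_MM = 0.2  # pixel size stated for the images in this study


def vertebral_heights_mm(endplate_rows):
    """Convert end-plate positions to vertebral heights in millimeters.

    endplate_rows: list of (upper_row, lower_row) pixel rows, one pair per
    vertebra, on the straightened image (hypothetical input format).
    """
    return [(lower - upper) * PIXEL_SIZE_MM for upper, lower in endplate_rows]


def flag_fractures(heights_mm, ratio_threshold=0.6, window=2):
    """Flag vertebrae whose height is well below the expected height.

    The expected height is ASSUMED to be the median height of up to
    `window` neighbors on each side; ratio_threshold is also illustrative.
    """
    flags = []
    for i, height in enumerate(heights_mm):
        neighbors = heights_mm[max(0, i - window):i] + heights_mm[i + 1:i + 1 + window]
        if not neighbors:
            # A single vertebra has no reference height to compare against.
            flags.append(False)
            continue
        expected = median(neighbors)
        flags.append(height < ratio_threshold * expected)
    return flags


# Made-up end-plate rows for five vertebrae; the third is compressed.
rows = [(100, 200), (215, 318), (333, 385), (400, 505), (520, 628)]
heights = vertebral_heights_mm(rows)
fractured = flag_fractures(heights)
```

With these made-up rows, only the third vertebra (about 10.4 mm against neighbors near 21 mm) is flagged.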
Our computerized method [12] for nodule detection on posteroanterior views consisted of the following steps: segmentation of lung fields on the basis of visualized ribcage; anatomic classification of the lung area into the four regions apical, peripheral, hilar, and diaphragm and heart; calculation of image feature values in regions of interest, such as geometric image features, gray-level image features, background image features, and edge gradient image features; and nodule detection through sequential application of three artificial neural networks.
Our computerized method [13] for nodule detection on lateral views consisted of the following steps: segmentation of a lung area; nodule enhancement with an average radial gradient filter (determined by the magnitude of edge gradient and angle between the radial direction and the orientation of the edge gradient) and a gaussian filter; nodule detection by use of a multiple-threshold technique followed by feature analysis, including effective diameter, average pixel value, average and SD of edge gradient, irregularity, contrast, and difference in pixel value between inside and outside; and elimination of false-positive nodule candidates by use of rule-based tests and an artificial neural network.
Observer Study
The cases in this study were selected from the fracture and nodule
databases with the following inclusion and exclusion criteria. Common
inclusion criteria for vertebral fractures and lung nodules were patient in
upright position, 65 years old or older, and one image pair per patient. The
first common exclusion criterion was very poor image quality, which indicates
extremely low contrast due to scatter or large patient size, high noise due to
underexposure, and existing devices overlapped with vertebral bodies. The
second exclusion criterion was technical error due to incorrect
positioning.
For fracture cases, we selected cases with a severe vertebral fracture (grade 3) from the fracture and nodule databases. The presence of a severe vertebral fracture was confirmed by consensus of two radiologists or by two consistent identifications by one radiologist at readings more than 1 year apart. Cases of multiple severe fractures and grade 2 fractures were excluded. With use of the inclusion and exclusion criteria, 21 fracture cases were selected. All cases of severe vertebral fractures that did not meet the exclusion criteria were used in this study. Twelve of the 21 fracture cases were selected by consensus of two radiologists; the other nine were identified twice consistently by one radiologist. Six of the 21 vertebral fractures were below the diaphragm.
We selected nodule cases as follows: lung nodule location confirmed with CT, lung nodule included in a case, largest nodule size 30 mm on both views, and extreme subtlety in comparison with relatively obvious nodules on either view. As a result, 21 fracture cases and 31 nodule cases (both abnormalities were included in four cases) were selected. The mean size of lung nodules was 19.7 mm (range, 8.3–29.6 mm) on the posteroanterior views and 21.6 mm (range, 12.9–29.8 mm) on the lateral views. Lung nodules were visible on both views in 15 of the 31 nodule cases and only on posteroanterior views in the other 16 cases. There was no case in which lung nodules were visible on lateral views only.
We randomly chose 12 control cases without fractures or nodules from the fracture database. The absence of fractures on radiographs was confirmed by consensus of two radiologists; the absence of nodules was confirmed with CT. Among the total of 60 cases in this study, some cases included other abnormalities, such as diffuse lung disease (four cases). These abnormalities did not overlap with vertebral fractures or lung nodules.
Eighteen radiologists participated in this observer study (average years of experience, 15.8; range, 3–31 years). The six chest radiologists and 12 general radiologists were given the following instructions before the study: the purpose of this study is to evaluate the usefulness of CAD for detection of grade 3 vertebral fractures and lung nodules on posteroanterior and lateral chest radiographs; the total number of cases is 60; the observers are blinded to the number of abnormalities and the computer performance levels.
The first reading was conducted without CAD. The radiologists were asked to click on bars to assess their confidence level about the presence or absence of a lung nodule smaller than 30 mm or of a fracture. The radiologists then clicked on the most likely location of a lung nodule or a vertebral fracture. If a nodule was present and visible on both views, the observers were asked to mark on either view the location in which the nodule was more obvious. The second reading was conducted with CAD. The observers were asked to record their confidence level and the most likely locations or to choose no change in confidence level. The radiologists were allowed to see an illustration for classifying the severity of vertebral fractures in an article by Genant et al. [15]. The radiologists also were asked to try to use the rating scale consistently and uniformly. There was no time limit. The average reading time for the 60 cases was 47 minutes (range, 32–73 minutes; 47 seconds per case).
Posteroanterior and lateral views of the same patient were shown on a liquid-crystal display (LCD) color monitor (1,600 x 1,200). Two images, one posteroanterior and one lateral view, were shown on the monitor. Radiologists were able to change window width and level by scrolling with the middle button of a mouse. They could see part of a magnified image by clicking a button on the user interface and could pan by dragging with the right mouse button depressed. They also could see inverted images (black bone images). The computer outputs for vertebral fractures and lung nodules were marked with light blue arrowheads and yellow triangles, respectively (Fig. 1A, 1B).
Statistical Analysis
The confidence levels attained by each radiologist in each case without and
with CAD were converted to continuous values (0.0–1.0) and were used for
statistical analysis. Receiver operating characteristic (ROC) analysis with
and without localization was used for comparison of radiologists' performance
in the detection of vertebral fractures and lung nodules without and with the
computer output. A binormal ROC curve determined with use of the
"proper" binormal model
[16] was fitted to each
radiologist's confidence level ratings obtained without and with CAD for each
lesion. The Dorfman-Berbaum-Metz method
[17], which includes both
reader variation and case sample variation in an analysis of variance
approach, was used for testing the statistical significance of the differences
in areas under the ROC curves (AUCs) for observer readings without and with
CAD for all radiologists.
For ROC analysis, we used the computer program MRMC 2.1 (DBM). We determined localization ROC curves [18] for observers without and with CAD. These curves also were fitted with use of the proper binormal model [16]. Statistical significance of the difference in localization ROC curves and in sensitivities without and with computerized schemes for the detection of vertebral fractures and lung nodules was estimated with a Student's two-tailed t test. Because the statistical power of the Student's two-tailed t test is not high, we also used jackknife free-response ROC (FROC) software [19, 20], which is used for estimating statistically significant differences between two averaged FROC curves, to evaluate localization ROC data for vertebral fractures and lung nodules [21].
We determined that a location marked by a radiologist was correct when the distance between the location marked by a radiologist and the center of a lesion was less than 8 mm vertically and 15 mm horizontally for vertebral fractures and less than 15 mm for lung nodules. The distance criterion for vertebral fractures was based on the average size of a fractured vertebral body in this study [22]. The distance criterion for lung nodules was chosen because our database contained lesions with diameters less than 30 mm on both views.
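The distance criteria above can be written as a small check. This is a sketch under stated assumptions: marks and lesion centers are taken to be (x, y) pixel coordinates on the same image, the 0.2-mm pixel size from the Database section is used to convert to millimeters, and the nodule distance is assumed to be Euclidean, which the text does not specify. The function name and lesion labels are hypothetical.

```python
import math

PIXEL_SIZE_MM = 0.2  # pixel size stated for the images in this study


def is_correct_mark(mark_px, center_px, lesion_type):
    """Return True if a radiologist's mark counts as a correct localization.

    mark_px, center_px: (x, y) pixel coordinates (assumed input format).
    lesion_type: "fracture" or "nodule" (hypothetical labels).
    """
    dx_mm = abs(mark_px[0] - center_px[0]) * PIXEL_SIZE_MM
    dy_mm = abs(mark_px[1] - center_px[1]) * PIXEL_SIZE_MM
    if lesion_type == "fracture":
        # Less than 8 mm vertically and 15 mm horizontally.
        return dy_mm < 8.0 and dx_mm < 15.0
    if lesion_type == "nodule":
        # Less than 15 mm from the lesion center (Euclidean distance ASSUMED).
        return math.hypot(dx_mm, dy_mm) < 15.0
    raise ValueError(f"unknown lesion type: {lesion_type!r}")


# Made-up coordinates for illustration.
fracture_hit = is_correct_mark((500, 530), (450, 500), "fracture")   # 10 mm, 6 mm
fracture_miss = is_correct_mark((450, 545), (450, 500), "fracture")  # 9 mm vertical
nodule_hit = is_correct_mark((500, 540), (450, 500), "nodule")       # about 12.8 mm
nodule_miss = is_correct_mark((510, 560), (450, 500), "nodule")      # about 17.0 mm
```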
We evaluated radiologist performance by use of ROC analysis without and with localization, and we assessed sensitivity in correct detection of a lesion. When our scheme for detection of vertebral fractures was evaluated, cases with vertebral fractures were used as positive cases, and the other cases were controls. In the same way, for evaluation of our computerized scheme of nodule detection, cases with lung nodules were categorized as positive cases and the others as control cases.
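The way cases are split into positives and controls for each lesion type can be illustrated with a simple area calculation. Note that this sketch substitutes the nonparametric (Mann-Whitney) AUC for the proper binormal fit and Dorfman-Berbaum-Metz analysis used in this study, so it will not reproduce the reported values. The case records and ratings below are made up.

```python
def split_by_lesion(cases, lesion):
    """Split one reader's ratings into positive and control groups.

    A case is positive for `lesion` if it contains that lesion; every other
    case, including cases with only the other lesion type, is a control.
    """
    positives = [c["rating"][lesion] for c in cases if lesion in c["lesions"]]
    controls = [c["rating"][lesion] for c in cases if lesion not in c["lesions"]]
    return positives, controls


def empirical_auc(positive_ratings, control_ratings):
    """Nonparametric AUC: fraction of positive/control pairs ranked correctly,
    with ties counted as one half. NOT the binormal model used in the study."""
    score = 0.0
    for p in positive_ratings:
        for c in control_ratings:
            if p > c:
                score += 1.0
            elif p == c:
                score += 0.5
    return score / (len(positive_ratings) * len(control_ratings))


# Hypothetical confidence ratings (0.0-1.0) from one reader.
cases = [
    {"lesions": {"fracture"}, "rating": {"fracture": 0.9, "nodule": 0.1}},
    {"lesions": {"nodule"}, "rating": {"fracture": 0.2, "nodule": 0.7}},
    {"lesions": {"fracture", "nodule"}, "rating": {"fracture": 0.6, "nodule": 0.4}},
    {"lesions": set(), "rating": {"fracture": 0.3, "nodule": 0.4}},
]

auc_fracture = empirical_auc(*split_by_lesion(cases, "fracture"))
auc_nodule = empirical_auc(*split_by_lesion(cases, "nodule"))
```

In this toy data the nodule-only case acts as a control in the fracture analysis, which mirrors the partition described above.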
Results
Radiologist Performance
With CAD, the average AUC for detection of vertebral fractures improved
from 0.906 to 0.951 (p = 0.002)
(Fig. 2). The average AUC for
lung nodules also improved, from 0.804 to 0.816, but the change was not
statistically significant (p = 0.297)
(Fig. 3). The average AUC for
the localization ROC curves improved from 0.590 to 0.690 for vertebral
fractures (p < 0.001) and from 0.611 to 0.634 for lung nodules
(p < 0.001) with use of the computer output.
Table 1 shows the AUC values
for each radiologist for detection of vertebral fractures and lung nodules.
The figure-of-merit values obtained with the jackknife FROC program improved
from 0.585 to 0.680 (p < 0.001) for vertebral fractures and from
0.622 to 0.650 (p = 0.017) for nodules. The average sensitivity in
the detection of vertebral fractures and lung nodules improved from 59.8%
(226/378 observations) to 69.3% (262/378 observations) for vertebral fractures
(p < 0.001) and from 64.9% (362/558 observations) to 67.6%
(377/558 observations) for lung nodules (p = 0.001). The average
sensitivity in the detection of lung nodules for cases in which lung nodules
were detected with CAD on lateral chest radiographs only (9/31) improved from
67.9% (110/162 observations) to 71.6% (116/162 observations) (p =
0.010). The average number of most likely positions that each radiologist
marked on lateral chest radiographs increased from 2.8 (50/558 observations)
without CAD to 3.3 (59/558 observations) with CAD (p = 0.015). Because the
radiologists were asked to mark the most likely position of nodules on either
the posteroanterior or the lateral view, the number of most likely positions
could be used as an index for the utility of each view in terms of the
detection of lung nodules.
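The observation counts and percentages reported above follow from the study design (18 radiologists, 21 fracture cases, 31 nodule cases). The short check below only reproduces that arithmetic from the reported counts; the helper name is hypothetical.

```python
N_READERS = 18
N_FRACTURE_CASES = 21
N_NODULE_CASES = 31


def sensitivity_pct(correct, observations):
    """Sensitivity as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / observations, 1)


# One observation per reader per positive case.
fracture_obs = N_READERS * N_FRACTURE_CASES  # 378
nodule_obs = N_READERS * N_NODULE_CASES      # 558
lateral_only_obs = N_READERS * 9             # 162 (nine lateral-only CAD detections)

fracture_without = sensitivity_pct(226, fracture_obs)
fracture_with = sensitivity_pct(262, fracture_obs)
nodule_without = sensitivity_pct(362, nodule_obs)
nodule_with = sensitivity_pct(377, nodule_obs)

# Lateral-view marks per reader: total marks divided by the number of readers.
lateral_marks_without = round(50 / N_READERS, 1)
lateral_marks_with = round(59 / N_READERS, 1)
```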
The beneficial effects of CAD in the detection of vertebral fractures were confirmed mainly in cases in which a vertebral fracture was below the diaphragm and in the upper lung area. In terms of vertebral fractures below the diaphragm, which amounted to 29% (6/21) of the fracture cases, four fractured vertebrae in the six cases were correctly localized by the three radiologists with CAD but were not detected without CAD. For vertebral fractures, detrimental effects on the observers' performance were negligible in the localization and the confidence level ratings in this study. In one case, moderate vertebral fractures were adjacent to a severe vertebral fracture, and the severe vertebral fracture was not detected with CAD, but the moderate vertebral fractures were detected. Although one radiologist marked the severe vertebral fracture correctly without CAD, the most likely location with CAD was changed incorrectly to one of the moderate vertebral fractures marked with computer assistance.
The beneficial effects on observer performance in terms of the most likely locations for lung nodules were confirmed in three cases: a lung nodule overlapping a vertebral body on a lateral chest image, a lung nodule overlapping a clavicle on a posteroanterior view, and a lung nodule very close to the heart on a posteroanterior view. For lung nodules, the detrimental effects on observer performance were negligible in terms of localization but were present in the confidence level ratings. When the detrimental effect was defined as more than 0.2 detrimental change in the confidence level ratings between no CAD and CAD, detrimental effects were found in eight nodule cases by one radiologist and in three nodule cases by two radiologists. Lung nodules in six of the eight cases and all of the three cases were not detected with CAD on posteroanterior views. Nodules in two of the eight cases were visible on lateral chest images and were detected correctly with CAD only on lateral views.
Discussion
We used only severe vertebral fractures in this observer study because the radiologists' agreement on a subjective judgment about severe vertebral fractures has been much better than that on mild and moderate fractures [11]. This finding implies that it is more difficult and less reliable to establish a reference standard if mild and moderate fractures are included in observer studies. Furthermore, in our previous study [11], 55% (11/20) of severe vertebral fractures were not mentioned in radiology reports. We therefore believe that an observer study of detection of severe vertebral fractures performed with cases with only one severe vertebral fracture, confirmed by consensus of two radiologists or identified twice consistently by one radiologist, would be helpful for evaluation of image interpretation. We should, however, be able to expand our target to include moderate vertebral fractures.
At the beginning of the experimental design of this observer study, we were concerned whether radiologists would be able to detect all of the vertebral fractures in the study if they were asked to focus on detection of vertebral fractures only. However, the radiologists did not detect all of the vertebral fractures in this study. For example, without CAD six radiologists missed a severe vertebral fracture in the upper lung area, but all radiologists were able to detect the fracture with CAD. This example shows that CAD was useful in helping radiologists avoid underdiagnosis of vertebral fractures.
The second purpose of this study was to evaluate the usefulness of CAD for detection of lung nodules on lateral chest radiographs, which has not been studied previously, to our knowledge. This part of the study was difficult because a computerized scheme for detection of lung nodules on lateral chest radiographs should be used together with a scheme that includes posteroanterior views. In this study, therefore, we first evaluated the usefulness of two methods for posteroanterior and lateral chest images in one CAD scheme for detection of lung nodules. We then separated the results to confirm that our nodule detection scheme for lateral views alone was also useful in assisting radiologists' interpretations. The prevalence of lung nodules visible on lateral chest images was generally small, 48% (15/31) of the cases in this study. Austin et al. [28] reported that 17% of missed lung nodules were seen better on lateral views than on posteroanterior views. If a computerized scheme had been used, the radiologists might have been able to detect these lung nodules on lateral chest images.
We believe the ROC analysis showed a lack of statistical significance in detection of lung nodules with CAD partly because of the relatively low sensitivity (51.6%, 16/31) of the computerized scheme for nodule detection on posteroanterior views. This low sensitivity was probably caused by the relative subtlety of the lung nodules selected on the basis of the inclusion and exclusion criteria for this study. Because our priority was to evaluate the usefulness of CAD in the detection of vertebral fractures and lung nodules on lateral chest radiographs, the selection of proper cases for lung nodules on posteroanterior views was limited.
The use of CAD with lateral chest radiographs can improve radiologists' image interpretation in the detection of vertebral fractures and lung nodules.
Acknowledgments
We are grateful to Hiroyuki Abe, Daniel Appelbaum, Youglin Pu, Akiko
Shimauchi, Christopher Straus, Kazuto Ashizawa, Shoji Kido, Xin Liu, Kiyoshi
Mori, Tomonori Murakami, Katsumi Nakamura, Norihisa Nitta, Akitoshi Saito,
Shuji Sakai, Masayuki Suzuki, Shin Tsutsui, Tetsuji Yamaguchi, and Heber
MacMahon for participating as observers; to Roger Engelmann for development of
the computer interface for the observer study; to Lorenzo Pesce for the use of
new software for the proper binormal model of ROC and localization ROC curves;
to Elisabeth Lanzl for improving the manuscript; and to Qiang Li and Chisako
Muramatsu for their valuable discussions.