OBJECTIVE. The aim of this study was to evaluate the usefulness of a new commercially available computer-aided diagnosis (CAD) system with an automated method of detecting nodules due to lung cancers on chest radiographs.
MATERIALS AND METHODS. For patients with cancer, 45 cases with solitary lung nodules up to 25 mm in diameter (nodule size range, 8–25 mm in diameter; mean, 18 mm; median, 20 mm) were used. For healthy patients, 45 cases were selected on the basis of confirmation on chest CT. All chest radiographs were obtained with a computed radiography system. The CAD output images were produced with a newly developed CAD system, which consisted of an image server including CAD software called EpiSight/XR. Eight radiologists (four board-certified radiologists and four radiology residents) participated in observer performance studies and interpreted both the original radiographs and CAD output images using a sequential testing method. The observers' performance was evaluated with receiver operating characteristic analysis.
RESULTS. The average area under the curve value increased significantly from 0.924 without to 0.986 with CAD output images. Individually, the use of CAD output images was more beneficial to radiology residents than to board-certified radiologists.
CONCLUSION. This CAD system for digital chest radiographs can assist radiologists and has the potential to improve the detection of lung nodules due to lung cancer.
Lung cancer has become the primary cause of cancer-related deaths in the world, and its early detection and diagnosis are important for improving chances of survival. Chest radiography is currently the most frequently used screening procedure for lung cancer because it is economical and easy to use. However, radiologists may fail to detect lung cancers that are visible on chest radiographs in retrospect, and the detection of small lung cancers is especially difficult [3–5].
The purpose of computer-aided diagnosis (CAD) is to direct the radiologist's attention by identifying and indicating suspected focal opacities that may represent cancer on a radiograph. Several previous studies have reported that CAD systems with automated detection methods for lung nodules could significantly improve diagnostic accuracy for identifying lung nodules on chest radiographs [6–8]. However, the CAD systems used in previous studies were laboratory products. In July 2001, the United States Food and Drug Administration (FDA) approved RapidScreen RS-2000 (DEUS Technologies, Rockville, MD) for marketing as a device that assists in the detection of early-stage lung cancer [9–11]. This approval marked the first time that the FDA sanctioned the marketing of a device specifically targeted for the detection of lung nodules. Recently, we have developed a new CAD system with an automated detection method for lung nodules, which has become commercially available, but this new system has not been fully evaluated.
The purpose of our study was to evaluate the usefulness of this new CAD system for digital radiography, in terms of accuracy in detecting lung nodules due to cancer, using an observer performance study.
Materials and Methods
All chest radiographs were exposed at 100 kV with a 10:1 grid and were obtained using a computed radiography system (FCR 9501, Fuji Photo Film, Tokyo, Japan). The imaging plate (ST-V, Fuji Photo Film) was 35 × 43 cm (matrix size, 1,760 × 2,140; gray level, 10 bit; pixel size, 200 μm). The output image was printed on a 23.5 × 23.5 cm film (67% reduction) using a laser printer (FL-IMD, Fuji Photo Film). Image reviewing of the computed radiography data was performed with an automatic exposure data-recognition method. Computed radiographs were processed with a sigmoid, long-contrast Hurter and Driffield curve and slight edge enhancement at higher frequencies. The enhancement factor in the unsharp masking was 0.2, with a frequency range of greater than approximately 0.35 cycle per millimeter. The gradation processing parameters were as follows: rotation amount, 0.9; gradation type, E; rotation center, 1.6; and gradation-shifting amount, –0.2.
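As a rough illustration of the edge enhancement described above, unsharp masking adds a scaled copy of the high-frequency residual back to the image. The sketch below applies it to a one-dimensional intensity profile. The enhancement factor of 0.2 is taken from the text; the moving-average blur, its 15-pixel width, and the sample gray levels are assumptions made for illustration and do not represent the vendor's actual filter.

```python
def box_blur(profile, width):
    """Moving-average blur with edge replication (assumed kernel, not the vendor's)."""
    half = width // 2
    n = len(profile)
    blurred = []
    for i in range(n):
        # Clamp indices so that border pixels are replicated.
        window = [profile[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def unsharp_mask(profile, factor=0.2, width=15):
    """Return profile + factor * (profile - blurred profile)."""
    blurred = box_blur(profile, width)
    return [p + factor * (p - b) for p, b in zip(profile, blurred)]


# Hypothetical step edge: 20 pixels at gray level 400, then 20 pixels at 600.
step = [400.0] * 20 + [600.0] * 20
enhanced = unsharp_mask(step)
```

In this sketch a uniform region is left unchanged, whereas the two sides of the step overshoot slightly, which is the intended visual effect of slight edge enhancement.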
Computerized Scheme for Automated Detection of Lung Nodules
The unprocessed computed radiography image data were transferred to a newly developed CAD system (Truedia, Mitsubishi Space Software, Tokyo, Japan). This CAD system consisted of an image server including CAD software called EpiSight/XR (Mitsubishi Space Software, Pentium III processor [Intel, Santa Clara, CA]: 930 MHz; RAM [random access memory], 768 MB; hard disk drive, 363 GB) and a diagnostic workstation computer (Pentium 4 processor [Intel]: 1.5 GHz; RAM, 512 MB; hard disk drive, 28 GB), which were placed side by side with the viewbox. It took approximately 15 sec to produce a CAD output image from a computed radiography image. Technical details of this automated lung nodule–detection method have been published previously [7, 12–14]. The CAD output images were produced in four basic steps (Fig. 1): (1) a difference image technique, in which a nodule-suppressed image is subtracted from a nodule-enhanced image to suppress the complex anatomic structures in the chest radiograph as much as possible; (2) multiple gray-level thresholding, which identifies initial nodule candidates on the basis of the histogram of the lung field in the difference image; (3) extraction of image features from both the difference image and the original chest image to distinguish true nodules from false-positive candidates, including geometric features (i.e., effective diameter, circularity, irregularity, aspect ratio) derived from the region-growing technique and features based on the symmetry of the right and left lung fields; and (4) a rule-based analysis and an artificial neural network, in which the previously derived image features are used to reduce the number of false-positive nodules.
This CAD system was first preliminarily applied to a large database, consisting of 274 radiographs with 323 lung nodules. Two board-certified radiologists independently classified the radiographs with lung nodules according to the subtlety of each lung nodule with a 5-point scale: 1 (extremely subtle), 2 (very subtle), 3 (subtle), 4 (relatively obvious), and 5 (obvious). The subtlety of the nodules was calibrated beforehand using a large number of training cases included in a digital image database for chest radiographs, which is publicly available. The radiologists reached consensus about classification of the subtlety of the nodules when their ratings differed. Among the 323 lung nodules, there were 27 (8%) with subtlety 1; 100 (31%) with subtlety 2; 112 (35%) with subtlety 3; 54 (17%) with subtlety 4; and 29 (9%) with subtlety 5. The number of cases in which their ratings actually differed was only 47 (14.6%) of 323 nodules. Using this database, we achieved a detection sensitivity of 73% with an average of 4.0 false-positive detections per image.
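Two of the geometric features named above can be illustrated with a short sketch. The definitions used here are assumptions, because the text does not spell them out: effective diameter is taken as the diameter of the circle whose area equals that of the candidate region, and circularity as the fraction of the region that falls inside that equal-area circle centered on the region's centroid. The pixel size of 0.2 mm is the value stated for the imaging plate; the candidate shapes are hypothetical.

```python
import math

PIXEL_MM = 0.2  # pixel size stated for the imaging plate (200 um)


def effective_diameter_mm(region, pixel_mm=PIXEL_MM):
    """Diameter of the circle whose area equals the region's area (assumed definition)."""
    area_mm2 = len(region) * pixel_mm ** 2
    return 2.0 * math.sqrt(area_mm2 / math.pi)


def circularity(region):
    """Fraction of region pixels inside the equal-area circle centered
    on the region's centroid (assumed definition)."""
    n = len(region)
    cx = sum(x for x, _ in region) / n
    cy = sum(y for _, y in region) / n
    radius_sq = n / math.pi  # squared radius, in pixels, of the equal-area circle
    inside = sum(1 for x, y in region if (x - cx) ** 2 + (y - cy) ** 2 <= radius_sq)
    return inside / n


def disk(radius):
    """Pixel coordinates of a filled disk (hypothetical nodule-like candidate)."""
    r = range(-radius, radius + 1)
    return {(x, y) for x in r for y in r if x * x + y * y <= radius * radius}


round_candidate = disk(40)
# Hypothetical elongated candidate, such as a strip along a rib edge.
elongated_candidate = {(x, y) for x in range(200) for y in range(10)}
```

With these assumed definitions, a round candidate scores close to 1 and an elongated one scores much lower, which is the kind of contrast a rule-based stage could exploit when rejecting false-positive candidates.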
Evaluation of False-Positive Detections with CAD Output Image
Before the observer performance study, two board-certified radiologists subjectively evaluated the frequency and characteristics of false-positive detections with this CAD system for 100 radiographs with normal findings, which were selected on the basis of chest CT confirmation of no existing abnormalities. These 100 patients consisted of 60 women and 40 men who ranged in age from 24 to 82 years (mean, 62.8 years). The false-positive detections were classified into the following three groups on the basis of the consensus of these board-certified radiologists, according to the object detected by the CAD system: false-positives due to normal anatomic structures that are easily recognized (pulmonary vessel, clavicle, rib, scapula, aortic arch, or nipple); false-positives outside the lung fields, which are easily recognized as false-positives; or other false-positives, which are unrelated to the normal anatomic structures.
Observer Performance Study
No institutional review board approval was required for this study. Consecutive cases with solitary lung cancer were selected from a surgical file between January 1998 and February 2002. Two board-certified radiologists further selected 45 cases with nodules up to 25 mm in diameter for which chest radiographs and CT scans were available (nodule size range, 8–25 mm in diameter; mean, 18 mm; median, 20 mm). The 45 cases included three nodules of 0–10 mm, 11 nodules of 11–15 mm, 21 nodules of 16–20 mm, and 10 nodules of 21–25 mm. We excluded cases in which the nodule was inadequately visible on the chest radiographs even though it was identified on CT scans. In the observer performance study, we used posteroanterior radiographs, together with lateral radiographs when they were available. Thirty-five (78%) of the 45 nodule cases had lateral radiographs available. Pathologic proof was obtained from surgical resection (32 adenocarcinomas, seven squamous cell carcinomas, three large cell carcinomas, and three small cell carcinomas). There were 16 women and 29 men who ranged in age from 43 to 80 years (mean, 66.8 years). All the patients with normal and abnormal findings were Japanese. The subtlety ratings of the nodules were classified by two board-certified radiologists in the same way as that used for the classification of the database. There were 10 (22%) with subtlety 1; 17 (38%) with subtlety 2; 15 (33%) with subtlety 3; three (7%) with subtlety 4; and none with subtlety 5.
The nodules were located in all six lung areas as follows: 13 on right upper lung (two overlapped the clavicle); 12 on right middle lung (three overlapped the hilum); six on right lower lung; seven on left upper lung (one overlapped the clavicle); three on left middle lung; and four on left lower lung (one overlapped the heart).
For healthy patients, 45 were selected on the basis of CT findings. Twenty-eight (62%) of 45 healthy cases had lateral radiographs available. There were 24 women and 21 men who ranged in age from 40 to 84 years (mean, 61.2 years). Four cases with strandlike opacities, three with marked cardiomegaly, and two with calcifications were included.
Eight radiologists, including four board-certified radiologists and four radiology residents with 2–5 years' training, participated as observers. Before the test, 40 training cases (30 cases with normal findings, 10 nodule cases) were presented, and observers underwent sufficient training on CAD output images to be familiar with the typical patterns of false-positives. The observers were then informed of the fraction (50%) of lung nodule cases included in the study and the definition of a nodule that measured 8–25 mm in diameter.
For the observer performance study, the radiographs were mounted on a conventional viewbox. Next to the viewbox, the CAD output image was displayed on a liquid crystal display monitor (1280 × 1024 line; 15-inch [38-cm] diagonal screen, Mitsubishi RDT-154H, Mitsubishi, Tokyo, Japan). The observers were permitted to manipulate the monitor brightness and contrast with a track ball or push buttons. The room was under uniformly low ambient light conditions.
For these 45 nodule and 45 healthy cases, a sequential test method was used. The 45 nodule cases were intermixed with the 45 healthy cases using a computer randomization method for each observer. Only one case was shown at a time, and the reviewing time was not limited. A continuous rating scale with a line-marking method was used to record each observer's confidence level regarding the presence or absence of a lung nodule. The radiographs were first shown for conventional interpretation, and the observer marked his or her confidence level with a black pencil on a 7-cm line. The radiographs were evaluated using a lateral view and a conventional posteroanterior view. The CAD output image was then displayed on a monitor. The observer then viewed the CAD output image and the radiographs and marked his or her confidence level with a red pencil on the same line, if different from the first confidence level.
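The per-observer randomization of case order can be sketched as follows. The text states only that a computer randomization method was used for each observer; the case identifiers and the use of the observer number as a random seed are assumptions made so that the example is reproducible.

```python
import random

# Hypothetical case identifiers: 45 nodule cases and 45 healthy cases.
CASES = [("nodule", i) for i in range(45)] + [("healthy", i) for i in range(45)]


def reading_order(observer_id, cases=CASES):
    """Return an independently shuffled case order for one observer."""
    rng = random.Random(observer_id)  # assumed seeding scheme, for reproducibility only
    order = list(cases)               # copy, so the master list stays untouched
    rng.shuffle(order)
    return order


# One reading order for each of the eight observers.
orders = {observer: reading_order(observer) for observer in range(1, 9)}
```

Each observer sees all 90 cases exactly once, with nodule and healthy cases intermixed in a different order.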
Receiver operating characteristic (ROC) analysis was used for comparison of observer performance in the detection of lung nodules without and with CAD output images. Estimates of the area under the ROC curve (Az) and SDs were computed using the computer program LABROC5 provided by Metz et al. The observers were not required to indicate the location of a nodule; this procedure is the standard for ROC analysis. The statistical significance of the difference in Az values without and with CAD output images was estimated using the Dorfman-Berbaum-Metz method, in which both interobserver variation and case sample variation are considered. We calculated the difference between the confidence levels without (first rating) and with (second rating) CAD output images and assumed that a clinically relevant change in the confidence levels occurred only when the difference calculated in this way was greater than 30 units on a 0–100 confidence rating scale. The full length of the 7-cm line in the confidence rating scale corresponded to 100 units. A shift of more than 30 units in the direction of a correct diagnosis was assumed to indicate that the use of a CAD output image was beneficial. Similarly, a shift of more than 30 units in the direction of an incorrect diagnosis was assumed to indicate that the use of a CAD output image was detrimental. The difference between the average number of cases affected beneficially and those affected detrimentally by CAD output images was analyzed using a two-tailed paired Student's t test.
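The rule for a clinically relevant change can be written out as a small sketch. The 7-cm line length, the 100-unit scale, and the 30-unit threshold come from the text. The orientation of the line (0 cm meaning "nodule definitely absent" and 7 cm meaning "nodule definitely present") and the function names are assumptions made for illustration.

```python
LINE_LENGTH_CM = 7.0     # full length of the rating line, equal to 100 units
THRESHOLD_UNITS = 30.0   # a shift must exceed this to count as clinically relevant


def to_units(mark_cm):
    """Convert a mark position (cm from the assumed 'absent' end) to 0-100 units."""
    return 100.0 * mark_cm / LINE_LENGTH_CM


def classify_change(first_cm, second_cm, has_nodule):
    """Label the effect of the CAD output image on one case for one observer."""
    shift = to_units(second_cm) - to_units(first_cm)
    if abs(shift) <= THRESHOLD_UNITS:
        return "no relevant change"
    # A shift toward "present" is correct for nodule cases,
    # and a shift toward "absent" is correct for healthy cases.
    toward_correct = (shift > 0) == has_nodule
    return "beneficial" if toward_correct else "detrimental"
```

For example, moving the mark from 2 cm to 5 cm is a shift of about 43 units: `classify_change(2.0, 5.0, has_nodule=True)` is labeled beneficial, whereas the same shift on a healthy case is labeled detrimental.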
Results
Evaluation of False-Positive Detections with CAD Output Image
There were 315 false-positive detections on 100 radiographs with normal findings with an average of 3.15 false-positives per image. Overall, 235 (75%) of the 315 false-positive detections were easily recognized as normal anatomic structures. Specifically, 155 (49%) of the 315 false-positives were recognized as pulmonary vessels, 44 (14%) as the junction of first rib and clavicle, 18 (6%) as the sternoclavicular joint, seven (2%) as the aortic arch, six (2%) as the scapula, two (1%) as the nipple, two (1%) as the rib edge, and one (0.3%) as calcification of costal cartilage. Seventeen (5%) of the 315 false-positives were outside the lung fields, located along the edge of the lung field and easily recognized as false-positives. Sixty-three (20%) of the 315 false-positives were unrelated to the normal anatomic structures. These 63 false-positives were located in all six parts of the lung as follows: 10 on right upper lung, 18 on right middle lung (four overlapped the hilum), 16 on the right lower lung, two on left upper lung, eight on left middle lung (four overlapped the hilum), and nine on left lower lung (two overlapped the diaphragm and one overlapped the heart).
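The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
TOTAL_IMAGES = 100  # radiographs with normal findings

# False-positives attributed to normal anatomic structures, as reported.
anatomic = {
    "pulmonary vessel": 155,
    "junction of first rib and clavicle": 44,
    "sternoclavicular joint": 18,
    "aortic arch": 7,
    "scapula": 6,
    "nipple": 2,
    "rib edge": 2,
    "calcification of costal cartilage": 1,
}

# The three groups defined in Materials and Methods.
groups = {
    "normal anatomic structures": sum(anatomic.values()),
    "outside the lung fields": 17,
    "unrelated to normal anatomy": 63,
}

total = sum(groups.values())
per_image = total / TOTAL_IMAGES
shares = {name: round(100 * count / total) for name, count in groups.items()}
```

The three groups add up to the reported total, and the rounded shares reproduce the percentages quoted in the text.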
Observer Performance Study
Average ROC curves showed a significant improvement in the accuracy of lung nodule detection with CAD output images (Fig. 2). The Az values for all eight observers are shown in Table 1. The average Az value increased significantly from 0.924 without CAD output images to 0.986 with the CAD output images (p < 0.001, Dorfman-Berbaum-Metz method). Individually, the use of CAD output images was beneficial to all observers. There was no difference between residents and board-certified radiologists in the level of improvement in diagnostic accuracy with the CAD output images.
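To give an intuition for what an area under the ROC curve measures, the sketch below computes a nonparametric (empirical) area from confidence ratings. This is a deliberately simpler technique than the one used in the study: it is not the fitted binormal estimate produced by LABROC5, and it does not perform the Dorfman-Berbaum-Metz significance test. All ratings shown are hypothetical.

```python
def empirical_auc(nodule_ratings, healthy_ratings):
    """Probability that a randomly chosen nodule case is rated higher than a
    randomly chosen healthy case, counting ties as one half."""
    wins = 0.0
    for pos in nodule_ratings:
        for neg in healthy_ratings:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(nodule_ratings) * len(healthy_ratings))


# Hypothetical confidence ratings (0 to 100 units) for one observer.
healthy = [10, 40, 25, 5, 30, 15]
nodule_without = [82, 35, 91, 60, 20, 75]  # first rating, radiographs only
nodule_with = [88, 70, 95, 72, 55, 80]     # second rating, after viewing CAD output

auc_without = empirical_auc(nodule_without, healthy)
auc_with = empirical_auc(nodule_with, healthy)
```

In this made-up example the area rises when the ratings of nodule cases move toward the "present" end of the scale, which mirrors the direction of the change in Az reported above.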
TABLE 1 Comparison of Az Values Without and With CAD Output Images
Figures 3 and 4 show the results of clinically relevant change in confidence levels for each observer. Among the 45 nodule cases, the average number of cases affected beneficially was significantly greater than the number affected detrimentally (mean number of cases, 8.13 [18%] vs 0.13 [0.3%], p = 0.002). Among the nodule cases, the number of cases affected beneficially was greater for the subgroup of residents than for board-certified radiologists. Among the 45 healthy cases, the average numbers of cases affected beneficially and detrimentally were two (4%) and 0.25 (0.6%), respectively (p = 0.004). The average number of cases affected beneficially among healthy cases was significantly less than that among nodule cases (two [4%] vs 8.13 [18%], p = 0.009).
Discussion
In this study, we evaluated a new CAD system with an automated method of detecting lung nodules, which has recently become commercially available. Our results confirm that this CAD system is useful for the detection of lung nodules on chest radiographs.
The criterion to use lung nodules up to 25 mm in diameter was chosen to examine the potential of this CAD system. If nodules that could have easily been detected with chest radiographs had been chosen, the improvement in diagnostic accuracy with CAD output images would have been less than that observed in this study. In clinical practice, chest radiographs are usually obtained in posteroanterior and lateral projections. Therefore, in this study, we used lateral and conventional posteroanterior radiographs. Some nodules that were not appreciable in one projection were more conspicuous in the other projection. If the lateral chest radiographs had not been included in this study, the diagnostic accuracy obtained with and without CAD output images would have been less than that obtained in this study.
We analyzed the effect of CAD output images separately in nodule and healthy cases. In nodule cases, the use of CAD output images was significantly beneficial. This result indicates that the use of CAD output images can prevent observers from missing lung nodules and that this effect of CAD output images was more beneficial for residents than for board-certified radiologists. Although the results of our observer study showed a significant improvement in the accuracy of lung nodule detection with CAD output images, our CAD performance level of 73% sensitivity is still not adequate to avoid overlooking lung nodules. We believe that further technologic developments in the CAD system will make it possible to improve the sensitivity.
The average number of cases affected beneficially among healthy cases was clearly less than that among nodule cases. One possible cause for this result seems to be that the observer's confidence level in true-negative findings on radiographs was influenced by the false-positives on CAD output images. In this study, we evaluated the frequency of false-positive detections with CAD using chest radiographs with normal findings obtained from daily clinical practice. The false-positive detection rate was an average of 3.15 false-positives per image. This false-positive rate was not good compared with those reported in previous experimental studies [6, 7, 14]. A large number of false-positive detections on CAD output images may overwhelm the observer so that true-positive detections indicated by the CAD system are overlooked or false-positive detections are overinterpreted. However, observers in our study were not detrimentally influenced by CAD output images. One reason seems to be that most of the false-positives were easy to recognize. We evaluated the characteristics of false-positive detections with CAD output images to determine whether observers can distinguish them. The results indicated that the 235 (75%) of the 315 false-positives due to normal anatomic structures (Fig. 5A, 5B, 5C) and the 17 (5%) outside the lung fields were easily recognized as false-positives. Thus, it was not difficult for observers to distinguish the false-positives from suspected focal opacities indicated on CAD output images.
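The paired comparison of beneficial and detrimental case counts can be sketched as follows. The per-observer counts below are hypothetical: they were chosen only so that their means over eight observers match the reported 8.13 and 0.13 for nodule cases, and they are not the study data. The sketch computes the paired t statistic and its degrees of freedom; the two-tailed p value would then be obtained from a Student's t distribution with that many degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical per-observer counts for the 45 nodule cases (eight observers).
beneficial = [5, 6, 7, 8, 8, 9, 10, 12]
detrimental = [0, 0, 0, 0, 0, 0, 1, 0]


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired Student's t test."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample SD / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


t_stat, dof = paired_t(beneficial, detrimental)
```

Pairing by observer removes between-observer variation from the comparison, which is why a paired rather than an unpaired test suits this design.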
The other reason may be that observers underwent sufficient training on CAD output images to become familiar with typical patterns of false-positives. The extent to which automated detection of lung nodules is used to assist radiologists in the interpretation of radiographs ultimately depends on the degree to which radiologists understand the performance of the CAD systems. In clinical settings, we believe that it is necessary for radiologists to have sufficient training on CAD output images to use them effectively as a second opinion. The average of 3.15 false-positive detections per image with the 100 radiographs with normal findings was smaller than the average of 4.0 false-positive detections per image obtained in the preliminary evaluation with our database. The 100 healthy cases in the evaluation of false-positive detections were selected on the basis of chest CT confirmation of no existing abnormality. On the other hand, the 274 radiographs in the database included other pulmonary diseases such as pleural abnormality, scarring, heart failure, and emphysema in addition to lung nodules. The false-positive detections seem to be increased by these preexisting diseases.
Our study has several limitations. The first limitation is that only lung cancer was evaluated. The automated lung nodule–detection method presented in this study has the potential to identify any nodular opacity. Because lung cancers represent the most clinically important subset of such nodular opacities, we evaluated only lung cancer. However, metastatic tumors, benign granulomas, and other diseases can present as nodules, and the recognition of such conditions is also important.
The second limitation of our study design was that our observers were informed in advance of the diameters of the nodules for which they were looking. This situation may have caused underestimation of the false-positive rate in this study because their attention was specifically focused on the task of detecting solitary lung nodules.
The third limitation is the use of our own image database for chest radiographs with lung nodules. A common digital image database is useful for comparing performance among different CAD systems. Although we achieved a detection sensitivity of 73% with an average of 4.0 false-positive detections per image in this study, our results cannot be compared directly with those obtained in previous reports because of the differences in the databases [9–11]. Additional studies seem to be needed to further validate the performance of our CAD system by comparing it with other systems. For this purpose, it is necessary to share the CAD testing databases so that we can certify and compare different approaches.
The fourth limitation is that this study is retrospective. In many cases, we could not ensure medical histories, case ascertainments, and enrollment procedures of our cases, although surgical notes and pathology reports in each case were reviewed to select consecutive cases with solitary lung cancer. However, this limitation was unavoidable because we wanted to evaluate this new CAD system in a short period of time on a rather large number of nodules that were confirmed lung cancers.
The temporal subtraction technique is another CAD method in which a previous chest radiograph is subtracted from a current radiograph so that interval changes are enhanced. The usefulness of the temporal subtraction technique was previously reported [18, 19], and the technique is currently commercially available [20, 21]. However, this technique is not applicable if a previous radiograph is not available. Our new CAD system with an automated method of detecting lung nodules can be used even if only one radiograph is available. We believe that this automated lung nodule–detection method can complement the temporal subtraction technique in this situation, and both techniques can augment the diagnostic accuracy of chest radiographs.
We are grateful to Koji Kamada, Hiroko Okazaki, Takayuki Ohguri, Syuichiro Yamamoto, Katsuya Yahara, Hiromi Sato, Tomoaki Morioka, and Yutoku Son for participating as observers.
Freedman MT, Osicka T, Lo SCB, et al. Methods for identifying changes in radiologists' behavioral operating point of sensitivity-specificity trade-offs within an ROC study of the use of computer-aided detection of lung cancer. Proc SPIE 2001; 4324:184–194
Freedman MT, Lo SCB, Osicka T, et al. Computer-aided detection of lung cancer on chest radiographs: effect of machine CAD false-positive locations on radiologists' behavior. Proc SPIE 2002; 4684:1311–1319
Shiraishi J, Katsuragawa S, Ikezoe J, et al. Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists' detection of pulmonary nodules. AJR 2000; 174:71–74
Johkoh T, Kozuka T, Tomiyama N, et al. Temporal subtraction for detection of solitary pulmonary nodules on chest radiographs: evaluation of a commercially available computer-aided diagnosis system. Radiology 2002; 223:806–811