Original Research
1 Department of Radiology, University of Occupational and Environmental Health
School of Medicine, 1-1 Iseigaoka, Yahatanishi-ku, Kitakyushu 807-8555,
Japan.
2 Department of Health Sciences, Kyushu University, Fukuoka, Japan.
3 Department of Diagnostic Radiology, Graduate School of Medical Sciences,
Kumamoto University, Kumamoto, Japan.
4 Kurt Rossmann Laboratories for Radiologic Image Research, Department of
Radiology, The University of Chicago, Chicago, IL.
Received December 18, 2006;
accepted after revision September 10, 2007.
CAD technologies developed in the Kurt Rossmann Laboratories have been
licensed to companies including R2 Technology, Deus Technologies, Riverain
Medical Group, Mitsubishi Space Software Company, Median Technologies, GE
Healthcare Corporation, and Toshiba Corporation. It is the policy of The
University of Chicago that investigators disclose publicly actual or potential
significant financial interests that may appear to be affected by research
activities.
Abstract
MATERIALS AND METHODS. Fifty maximum-intensity-projection MR angiograms in 50 patients (16 intracranial aneurysms and 34 negative cases) were used for this observer performance study. Sixteen radiologists—eight neuroradiologists and eight less experienced radiologists—participated in the observer studies and interpreted the MR angiograms without and with CAD output images using an independent test method. The reading times without and with CAD were compared separately for the aneurysm and negative cases. The observers' performances were evaluated using receiver operating characteristic (ROC) analysis. We analyzed separately the data obtained from neuroradiologists and from less experienced radiologists.
RESULTS. For all observers, the mean area under the ROC curve (Az) with CAD was improved compared with that without CAD (0.903 vs 0.851, respectively; p = 0.109), and the mean reading time per case was reduced significantly by 18.1 seconds (28.5%) (from 63.4 to 45.3 seconds, p < 0.05). When CAD output images were available, the mean Az for the less experienced radiologists was significantly improved (0.911 vs 0.787, p < 0.05), but not for the neuroradiologists. The mean reading time of the less experienced radiologists with CAD was significantly shorter than that of the neuroradiologists without CAD (39.8 vs 54.5 seconds, p < 0.05).
CONCLUSION. The use of a CAD system for the detection of intracranial aneurysms on MR angiography can shorten the reading time while improving diagnostic performance for less experienced radiologists.
Keywords: computer-aided diagnosis, intracranial aneurysms, MR angiography, observer performance, reading time, receiver operating characteristic
Computer-aided diagnosis (CAD) is now considered to be one of the approaches that may improve the productivity of radiologists in interpreting images. Using a CAD system, the radiologist considers the information provided by the computer as a second opinion, and the final diagnostic decision is based on the expertise of the radiologist [5, 6]. Several previous studies have shown that CAD systems can significantly improve the diagnostic accuracy and productivity of radiologists for breast cancer screening by mammography and for lung cancer screening by chest radiography [7–11].
Recently, CAD systems with automated methods for the detection of intracranial aneurysms at screening MR angiography have been developed [12], and the findings from an observer study showed that the use of a CAD system substantially improved the accuracy of detecting intracranial aneurysms both for neuroradiologists and for general radiologists [13]. In that report, however, the effect of the CAD system on reading time was not evaluated.
We hypothesized that the use of a CAD system can shorten reading time while maintaining a high diagnostic accuracy of radiologists in the detection of intracranial aneurysms on MR angiograms. The purpose of our study was to verify this hypothesis using an observer performance study with and without CAD output images.
The technical details about the CAD method have been published previously [12–14]. First, the isotropic images from 3D TOF MR angiography were processed using selective enhancement filters [14] for the detection and characterization of intracranial aneurysms, vessels, and vessel walls. Second, initial candidates were identified using a multiple gray-level thresholding technique in dot-enhanced images and by finding short branches in skeleton images. Third, image features related to aneurysms were determined based on candidate regions segmented by use of a region-growing technique. Furthermore, additional features of small protrusions or small aneurysms were extracted using a shape-based difference technique [15]. Finally, many false-positives were removed by means of rule-based schemes and linear discriminant analysis on various 3D localized image features.
Database
The study was approved by our institutional review board, and written
informed consent was not required. Two experienced neuroradiologists
retrospectively reviewed the MR angiography files of examinations performed
between October 1999 and March 2004 and selected 53 consecutive patients with
61 unruptured intracranial aneurysms (diameter range, 3–26 mm; mean, 6.6
mm). In the 61 intracranial aneurysms, the diagnosis was proved with CT
angiography in 20 and digital subtraction angiography in 12, and the diagnosis
of the remaining 29 aneurysms was made based on MRI findings alone in
consensus review by two neuroradiologists. At retrospective review of the same
MR angiography files of examinations performed between May 2003 and January
2004, two neuroradiologists also selected 62 consecutive cases without
intracranial aneurysms as normal cases. The diagnosis of no intracranial
aneurysms was proved with consensus reading of MR angiograms by these two
neuroradiologists; source images obtained with MR angiography were also used
for diagnosis.
This database of 53 patients with 61 intracranial aneurysms and 62 normal cases was used for evaluation of the performance of the computerized scheme. The performance of the computerized scheme was evaluated by a round-robin (leave-one-out) test, where training was performed for all cases except one in the database, and the one excluded case was applied for testing with the trained computerized scheme. This procedure was repeated until every case in the database was used once. With the use of this database, the performance of our CAD scheme achieved a sensitivity of 84.0% in the detection of intracranial aneurysms at an average false-positive rate of 2.50 per case.
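The round-robin (leave-one-out) procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the nearest-centroid classifier is a simple stand-in for the rule-based schemes and linear discriminant analysis used in the study, and the feature values and names (`train_centroids`, `leave_one_out`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# (feature vector, label): 1 = aneurysm, 0 = non-aneurysm. Hypothetical layout.
Sample = Tuple[Sequence[float], int]


def train_centroids(samples: List[Sample]) -> Dict[int, List[float]]:
    """Stand-in classifier: store one mean feature vector per class."""
    model: Dict[int, List[float]] = {}
    for label in sorted({y for _, y in samples}):
        rows = [x for x, y in samples if y == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict(model: Dict[int, List[float]], x: Sequence[float]) -> int:
    """Assign the class whose centroid is closest in squared distance."""
    def sq_dist(center: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(model, key=lambda label: sq_dist(model[label]))


def leave_one_out(samples: List[Sample]) -> List[int]:
    """Train on all samples except one, test on the excluded one, repeat."""
    predictions = []
    for i, (features, _) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        model = train_centroids(training)
        predictions.append(predict(model, features))
    return predictions


def sensitivity(samples: List[Sample], predictions: List[int]) -> float:
    """Fraction of true aneurysm samples that were predicted as aneurysm."""
    hits = [p for (_, y), p in zip(samples, predictions) if y == 1]
    return sum(1 for p in hits if p == 1) / len(hits)


# Toy, made-up feature vectors (not data from the study).
toy = [([1.0, 1.0], 0), ([1.2, 0.8], 0), ([0.9, 1.1], 0),
       ([3.0, 3.0], 1), ([3.2, 2.9], 1), ([2.8, 3.1], 1)]
toy_predictions = leave_one_out(toy)
print(toy_predictions, sensitivity(toy, toy_predictions))
```

Because each sample is scored by a model that never saw it during training, the resulting sensitivity is an estimate of performance on unseen data rather than a consistency check on the training set.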
Case Selection for Observer Performance Study
From our database of 53 patients with 61 intracranial aneurysms, we further
selected 16 aneurysm cases for the observer performance study on the basis of
the following selection criteria: First, they were consecutive cases studied
between January 2002 and March 2004; and, second, each of these patients had
only one unruptured intracranial aneurysm. In the 16 aneurysm cases, the
diagnosis was proved with CT angiography in eight and digital subtraction
angiography in two, and the diagnosis of the remaining six aneurysms was based
on consensus review of MR angiograms by two experienced neuroradiologists.
Table 1 lists the intracranial
aneurysms by location and size. Two patients with intracranial stenoocclusive
disease were included among the 16 aneurysm cases. There were 10 women and six
men who ranged in age from 26 to 82 years (mean, 59.0 years).
From our database of the 62 normal cases, the same two neuroradiologists also selected 34 cases as negative cases, which were consecutive cases studied between September 2003 and January 2004. There were 17 women and 17 men who ranged in age from 40 to 77 years (mean, 67.0 years). Four patients with intracranial stenoocclusive disease were included among the 34 negative cases. For the 50 cases (16 aneurysm cases and 34 negative cases) used in the observer study, the performance of our CAD scheme indicated a sensitivity of 81.0% in the detection of intracranial aneurysms at an average false-positive rate of 2.70 per case.
For reduction of learning effects, the interval between reading sessions was at least 3 months. During the first session, half of the observers (four neuroradiologists and four less experienced radiologists) interpreted MR angiograms without CAD output images and the other half interpreted them with CAD output images. During the second session, the observers interpreted images under a condition that was different from the first session. The 16 cases with intracranial aneurysms were intermixed with the 34 negative cases using a computer randomization method. A continuous rating scale with a line-marking method [17] was used to record each observer's confidence level regarding the presence or absence of an intracranial aneurysm. In each series, the observer marked his or her confidence level on a line. Then the observers located the most likely position of intracranial aneurysms on each maximum-intensity-projection MR angiogram.
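The counterbalanced reading design above can be illustrated with a short sketch. The text does not say how observers were allocated to the two halves or which randomization routine was used, so the random allocation, the seed values, and the function names here are assumptions made only for illustration.

```python
import random
from typing import Dict, List, Tuple


def assign_sessions(neuroradiologists: List[str],
                    less_experienced: List[str],
                    seed: int = 0) -> Dict[str, Tuple[str, str]]:
    """Within each group, half read without CAD first and half with CAD first.

    The second session always uses the condition not used in the first.
    Random allocation within a group is an assumption of this sketch.
    """
    rng = random.Random(seed)
    plan: Dict[str, Tuple[str, str]] = {}
    for group in (neuroradiologists, less_experienced):
        shuffled = list(group)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for i, observer in enumerate(shuffled):
            if i < half:
                plan[observer] = ("without CAD", "with CAD")
            else:
                plan[observer] = ("with CAD", "without CAD")
    return plan


def randomized_case_order(n_aneurysm: int = 16, n_negative: int = 34,
                          seed: int = 0) -> List[str]:
    """Intermix aneurysm and negative cases in a random presentation order."""
    cases = ["aneurysm"] * n_aneurysm + ["negative"] * n_negative
    random.Random(seed).shuffle(cases)
    return cases


# Hypothetical observer identifiers.
neuro = [f"N{i}" for i in range(1, 9)]
junior = [f"J{i}" for i in range(1, 9)]
plan = assign_sessions(neuro, junior)
order = randomized_case_order()
```

Balancing the order within each experience group keeps any residual learning effect from favoring one reading condition.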
An interface program was created for an image display without and with the computer output for the observer performance test. The observers viewed the maximum-intensity-projection MR angiograms displayed on a gray-scale monitor (2007 WFP, Dell) with a spatial resolution of 1,600 x 1,050. The monitor screen was divided into two areas: one area in which MR angiograms were displayed in z-axis, x-axis, and y-axis rotations and another area in which observers indicated their confidence levels in the observer performance study. On the monitor, observers could see 19 MR angiograms ranging from –90° to 90°, at 10° intervals, in the z-axis rotation and nine MR angiograms ranging from –40° to 40°, at 10° intervals, in the x-axis and y-axis rotations.
The observers were allowed to select the direction of the MR angiograms and the magnification of the images on the monitor and also to adjust the image window level and width. The observers were informed that the reading time was not limited, but the actual reading time for each observer was recorded in each case. The reading time was automatically measured from the moment that the radiologist first viewed the images (when the MR angiograms were displayed) to the moment that the radiologist marked his or her confidence level [11].
Data Analysis
The reading times without and with the CAD output images were compared
separately for the aneurysm and negative cases. The statistical significance
of differences in reading time was determined using a two-tailed paired
Student's t test.
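As a minimal sketch of the paired comparison described above, the t statistic can be computed from per-observer differences in mean reading time. The reading times below are made-up placeholders, not the study data, and `paired_t_statistic` is a hypothetical helper name.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t_statistic(without_cad: Sequence[float],
                       with_cad: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for a paired Student's t test."""
    if len(without_cad) != len(with_cad):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(without_cad, with_cad)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD, n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical mean reading times (seconds) for 16 observers.
without_cad = [62, 70, 55, 48, 81, 66, 59, 73, 50, 64, 77, 58, 69, 61, 53, 68]
with_cad = [47, 52, 50, 44, 49, 46, 41, 55, 45, 43, 51, 40, 48, 46, 42, 47]

t_value, dof = paired_t_statistic(without_cad, with_cad)
# Two-tailed critical value at alpha = 0.05 for 15 degrees of freedom.
T_CRITICAL_DF15 = 2.131
print(t_value, dof, abs(t_value) > T_CRITICAL_DF15)
```

Pairing by observer removes between-observer variation in reading speed, so the test responds only to the within-observer change between conditions.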
Observer performance in the detection of intracranial aneurysms without and with the CAD output image was evaluated by a receiver operating characteristic (ROC) analysis, in which only observer confidence level was used. The area under the ROC curve (Az) was calculated using the computer program LABROC5 provided by Metz et al. [18]. The statistical significance of the difference in the average Az values without and with the CAD output images was estimated using the Dorfman-Berbaum-Metz method [19].
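To make the idea of an area under the ROC curve concrete, the sketch below computes the empirical (nonparametric) area from confidence ratings. This is not the LABROC5 algorithm, which fits a binormal model to obtain Az; it is a simpler estimate shown only for illustration, and the ratings are invented.

```python
from typing import Sequence


def empirical_auc(ratings_aneurysm: Sequence[float],
                  ratings_negative: Sequence[float]) -> float:
    """Probability that a random aneurysm case is rated above a random
    negative case, counting ties as one half (Mann-Whitney estimate)."""
    score = 0.0
    for pos in ratings_aneurysm:
        for neg in ratings_negative:
            if pos > neg:
                score += 1.0
            elif pos == neg:
                score += 0.5
    return score / (len(ratings_aneurysm) * len(ratings_negative))


# Hypothetical confidence levels on a continuous 0-1 scale.
aneurysm_ratings = [0.9, 0.8, 0.4]
negative_ratings = [0.5, 0.3, 0.2, 0.1]
auc = empirical_auc(aneurysm_ratings, negative_ratings)
print(auc)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of aneurysm and negative cases by confidence rating.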
In this study, observers were required to identify the most likely position of the intracranial aneurysm on each MR angiogram. Localizations within a true lesion were scored as true-positive events, and all other events were scored as false-positive; the data obtained in this way were used only for determination of sensitivity and specificity. The differences in the sensitivity and the specificity for localization with and without the CAD output images were analyzed with a Student's t test. For all tests used, a p value of less than 0.05 was considered to indicate a statistically significant difference.
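The localization-based scoring can be sketched as below. The text does not specify how a reading was classed as a positive call, so the `called_positive` flag, the `Reading` record, and its field names are assumptions of this example rather than the study's actual scoring rules.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Reading:
    has_aneurysm: bool     # reference standard for the case
    called_positive: bool  # observer judged an aneurysm present (assumed rule)
    mark_in_lesion: bool   # observer's mark fell within the true lesion


def localization_sens_spec(readings: List[Reading]) -> Tuple[float, float]:
    """Sensitivity counts only correctly localized positive calls;
    specificity counts negative cases that were not called positive."""
    aneurysm = [r for r in readings if r.has_aneurysm]
    negative = [r for r in readings if not r.has_aneurysm]
    true_pos = sum(1 for r in aneurysm if r.called_positive and r.mark_in_lesion)
    true_neg = sum(1 for r in negative if not r.called_positive)
    return true_pos / len(aneurysm), true_neg / len(negative)


# Made-up readings for one observer.
readings = [
    Reading(True, True, True),
    Reading(True, True, False),   # positive call, wrong location: not a hit
    Reading(True, False, False),
    Reading(True, True, True),
    Reading(False, False, False),
    Reading(False, False, False),
    Reading(False, True, False),  # false-positive call
    Reading(False, False, False),
    Reading(False, False, False),
]
sens, spec = localization_sens_spec(readings)
print(sens, spec)
```

Requiring correct localization means that a positive call at the wrong site does not count toward sensitivity.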
Evaluation of Reading Time
The mean reading times per case without and with CAD output images for all
16 observers are shown in Table
3. For all observers, the mean reading time per case was reduced
significantly by 8.8 seconds (17.4%) for the aneurysm cases (from 50.6 to 41.8
seconds, p < 0.05) and by 22.2 seconds (32.0%) for the negative
cases (from 69.4 to 47.2 seconds, p < 0.05). For the
neuroradiologists, the total mean reading time per case with CAD output images
was reduced from 54.5 to 50.7 seconds, but this difference was not significant
(p = 0.084). For the negative cases, the mean reading time per case
with the CAD output image was reduced significantly by 6.8 seconds (11.3%)
(from 60.4 to 53.6 seconds, p = 0.021); for aneurysm cases, however,
the mean reading time per case with the CAD output images was longer than that
without the CAD output images (increasing from 41.9 to 44.6 seconds).
When CAD output images were available to the less experienced radiologists, the mean reading time per case was reduced significantly by 20.4 seconds (34.4%) for the aneurysm cases (from 59.3 to 38.9 seconds, p < 0.05) and by 38.2 seconds (48.7%) for the negative cases (from 78.4 to 40.2 seconds, p < 0.05). The mean reading time of the less experienced radiologists with CAD output images was significantly shorter than that of the neuroradiologists without CAD output images (p < 0.05).
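The reported reductions can be checked with simple arithmetic. The sketch below recomputes the percentage reductions from the quoted means and, assuming the overall per-case mean is a case-weighted average over the 16 aneurysm and 34 negative cases (an assumption, since the text does not state how the overall mean was formed), recovers two of the overall figures quoted in the abstract.

```python
def percent_reduction(before: float, after: float) -> float:
    """Reduction in reading time as a percentage of the time without CAD."""
    return (before - after) / before * 100.0


def case_weighted_mean(aneurysm_mean: float, negative_mean: float,
                       n_aneurysm: int = 16, n_negative: int = 34) -> float:
    """Overall per-case mean, weighting each subgroup by its number of cases."""
    total_cases = n_aneurysm + n_negative
    return (n_aneurysm * aneurysm_mean + n_negative * negative_mean) / total_cases


# All observers: aneurysm cases 50.6 -> 41.8 s, negative cases 69.4 -> 47.2 s.
print(round(percent_reduction(50.6, 41.8), 1))   # 17.4
print(round(percent_reduction(69.4, 47.2), 1))   # 32.0

# Less experienced radiologists: 59.3 -> 38.9 s and 78.4 -> 40.2 s.
print(round(percent_reduction(59.3, 38.9), 1))   # 34.4
print(round(percent_reduction(78.4, 40.2), 1))   # 48.7

# Case-weighted overall means under the stated assumption.
print(round(case_weighted_mean(50.6, 69.4), 1))  # 63.4, all observers without CAD
print(round(case_weighted_mean(38.9, 40.2), 1))  # 39.8, less experienced with CAD
```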
Evaluation of Diagnostic Accuracy
The diagnostic performance of each observer in this study is summarized in
Table 4. For all observers, the
observer performance with CAD output images (mean Az =
0.903) was improved compared with that without CAD output images (mean
Az = 0.851), although a statistically significant
difference was not found (p = 0.109). For the neuroradiologists, the
observer performance with CAD output images (mean Az =
0.895) was not improved compared with that without CAD output images (mean
Az = 0.914). For the less experienced radiologists,
however, the mean Az value increased significantly from
0.787 without to 0.911 with CAD output images (p = 0.022,
Dorfman-Berbaum-Metz method). The mean Az value for the
less experienced radiologists using CAD output images was almost equal to that
for the neuroradiologists without CAD output images (0.911 vs 0.914,
respectively).
For all observers, sensitivity and specificity in the detection of intracranial aneurysms with CAD output images (sensitivity = 0.809, specificity = 0.886) were improved compared with those without CAD output images (sensitivity = 0.758, specificity = 0.827), and there was a significant difference in specificity (p < 0.05). For the neuroradiologists, sensitivity and specificity with CAD output images (sensitivity = 0.805, specificity = 0.857) were not significantly improved compared with those without CAD output images (sensitivity = 0.821, specificity = 0.842). For the less experienced radiologists, there were significant differences in the sensitivity (0.813 with CAD, 0.695 without CAD; p < 0.05) and the specificity (0.915 with CAD, 0.813 without CAD; p < 0.05).
For all observers, performance with CAD output images was improved compared with that without CAD output images, although the difference was not statistically significant. We analyzed separately the mean Az obtained from the neuroradiologists and the less experienced radiologists to determine whether the effect of the CAD output images on observer performance depended on clinical experience. Although the mean Az for the neuroradiologists did not improve with the use of CAD output images, that for the less experienced radiologists improved significantly. We found that when less experienced radiologists used the CAD system, their diagnostic accuracy rose to the same level as that of neuroradiologists who performed the assessment with or without CAD output images. For the less experienced radiologists, the use of CAD output images significantly increased the sensitivity for aneurysm detection. This result indicates that CAD output images can help prevent observers from missing intracranial aneurysms.
The use of CAD output images also significantly increased specificity for aneurysm detection. This result means that the use of CAD output images decreased the number of false-positives compared with the interpretation of MR angiograms alone. Moreover, the less experienced radiologists had significantly shorter reading times with the CAD system, and the mean reading time of the less experienced radiologists with CAD output images was shorter than that of neuroradiologists without CAD output images (Table 2). Therefore, this CAD system can improve radiologists' productivity and the diagnostic accuracy of less experienced radiologists, such as general radiologists and residents.
In this study, the use of a CAD system did not improve the diagnostic accuracy of neuroradiologists in the detection of intracranial aneurysms. There are several explanations for this result. First, the data sets presented in our observer performance study may have been relatively easy to diagnose for the neuroradiologists and may have been relatively difficult for the less experienced radiologists. Second, the CAD system used in this study may not be good enough to improve neuroradiologists' performance.
Hirai et al. [13] have reported that neuroradiologists' performance for the detection of intracranial aneurysms with MR angiography was significantly improved using a CAD scheme, which achieved a sensitivity of 100% with 0.55 false-positive detections per patient in a consistency test. In contrast, our CAD scheme achieved a sensitivity of 81.0% in the detection of intracranial aneurysms at a false-positive rate of 2.70 per patient for the test cases. This difference in CAD performance could be caused mainly by the difference in the methods used to determine CAD performance: the round-robin test we used in our study versus the consistency test Hirai et al. used in their study. For the consistency test, cases used for training were also used for testing; therefore, generalization from the training samples was not involved in this test. With the round-robin test, all candidates except one were used for training, and the one excluded candidate was used for testing with two rule-based schemes and the linear discriminant function. This method is well established in the pattern-recognition literature as a statistically valid technique for estimating classifier performance in an unknown population [18–22]. We believe that further developments in CAD systems will improve CAD performance.
Third, source images obtained with MR angiography were not included in this observer performance study. In aneurysm cases, the results of clinically relevant changes in confidence levels for the neuroradiologists showed that the average number of cases affected detrimentally was larger than that affected beneficially. This result may indicate that some of the false-positive detections in the CAD output images confused the neuroradiologists, so that true-positive detections indicated by the CAD system may have been overlooked. The source images obtained with MR angiography would likely have been helpful in the diagnosis of intracranial aneurysms, especially for the neuroradiologists. If source images had been used together with the maximum-intensity-projection MR angiograms in this observer performance study, the neuroradiologists would likely have had less difficulty distinguishing the true-positives from suspicious false-positives indicated by the CAD output images. Fourth, the neuroradiologists may not have undergone sufficient training on CAD output images to become familiar with typical patterns of true-positives and false-positives. The extent to which automated detection of intracranial aneurysms is used to assist radiologists in the interpretation of MR angiograms ultimately depends on the degree to which radiologists understand the performance of the CAD system. Some neuroradiologists might not have been able to use the CAD output images effectively as a second opinion.
There were several limitations to our study. First, in most aneurysm cases, the diagnosis could not be proven with digital subtraction angiography, which may be the gold standard for diagnosis of an intracranial aneurysm. Although some false-positive lesions might have been included in our cases, we believe that our careful interpretation based on MRI findings minimized the possibility of false-positives.
Second, in this study a single database was used for both training and testing, with the use of a "leave-one-out" method to yield output for the observer studies. In a clinical setting, however, neither radiologists nor a trained computer system would know the outcomes of the cases presented for interpretation. Therefore, this study design, which has been used in several previous studies, might not closely simulate the likely eventual clinical application of a CAD system. We believe that additional studies are needed to further validate the performance of our CAD system by using independent cases that are different from the training cases of the classification algorithm.
The third limitation of our study design is that only one intracranial aneurysm was evaluated in the analysis of observer performance. Therefore, the radiologists' attention was focused specifically on the task of detecting one focal abnormality. In clinical situations, however, radiologists often must consider possible additional intracranial vascular lesions, such as arteriovenous malformations and stenoocclusive diseases. This situation may have caused the effect of CAD output images to be overestimated regarding reading time in the diagnosis of intracranial aneurysms in this study.
In conclusion, we performed an observer study to evaluate the effect of a CAD system for the detection of intracranial aneurysms on reading time and diagnostic accuracy in the interpretation of MR angiograms. The use of the CAD system significantly shortened reading time and improved diagnostic accuracy for less experienced radiologists. However, the CAD system was not beneficial for neuroradiologists: Reading time and diagnostic accuracy did not improve with the use of CAD output images. Our future challenge is to improve the performance level of this CAD system to a level comparable to, or higher than, the performance level of neuroradiologists.
Acknowledgments
We are grateful to Yoshiko Hayashida, Chiaki Asao, Masayuki Yamura, Mika
Kitajima, Tomoko Okuda, Takeshi Sugahara, Norihiro Ohnari, Takatoshi Aoki,
Koji Kamada, Yoshiko Mishima, Hiroko Okazaki, Shoko Kawano, Hiromi Sato,
Takayuki Ohguri, Junji Moriya, and Yutoku Son for participating as observers;
Roger Engelmann for development of the software used in this observer study;
and Elizabeth Lauzl for English wording.