AJR 2004; 183:297-305
© American Roentgen Ray Society


Application of an Artificial Neural Network to High-Resolution CT: Usefulness in Differential Diagnosis of Diffuse Lung Disease

Aya Fukushima¹, Kazuto Ashizawa¹, Tetsuji Yamaguchi¹, Naohiro Matsuyama¹, Hideyuki Hayashi¹, Isao Kida¹, Yoshihiro Imafuku¹, Akiko Egawa¹, Seigo Kimura¹, Kenji Nagaoki¹, Sumihisa Honda², Shigehiko Katsuragawa³, Kunio Doi⁴ and Kuniaki Hayashi¹

1 Department of Radiology and Radiation Oncology, Division of Radiological Science, Nagasaki University Graduate School of Biomedical Sciences, 1-7-1, Sakamoto, Nagasaki 852-8501, Japan.
2 Department of Radiation Epidemiology, Atomic Bomb Disease Institute, Nagasaki University School of Medicine, 1-12-4 Sakamoto, Nagasaki 852-8501, Japan.
3 General Research Center, Nippon Bunri University, Ichiki 1727, Oita 870-0397, Japan.
4 Department of Radiology, Kurt Rossmann Laboratories for Radiologic Image Research, The University of Chicago, 5841 S Maryland Ave., Chicago, IL 60637.

Received September 15, 2003; accepted after revision February 4, 2004.

 
Supported by a grant-in-aid for scientific research to K. Ashizawa from the Ministry of Education in Japan (no. 12670886) and by a grant from the U.S. Public Health Service (no. CA62625).

S. Katsuragawa and K. Doi are shareholders in R2 Technology, Los Altos, CA, and K. Doi is a shareholder in Deus Technologies, Rockville, MD.

Address correspondence to K. Ashizawa (ashi{at}net.nagasaki-u.ac.jp).


Abstract
 
OBJECTIVE. The purpose of our study was to evaluate the diagnostic performance of an artificial neural network (ANN) in differentiating among certain diffuse lung diseases using high-resolution CT (HRCT) and the effect of ANN output on radiologists' diagnostic performance.

MATERIALS AND METHODS. We selected 130 clinical cases of diffuse lung disease. We used a single three-layer, feed-forward ANN with a back-propagation algorithm. The ANN was designed to differentiate among 11 diffuse lung diseases by using 10 clinical parameters and 23 HRCT features. Therefore, the ANN consisted of 33 input units and 11 output units. Subjective ratings for 23 HRCT features were provided independently by eight radiologists. All clinical cases were used for training and testing of the ANN by implementing a round-robin technique. In the observer test, a subset of 45 cases was selected from the database of 130 cases. HRCT images were viewed by eight radiologists first without and then with ANN output. The radiologists' performance was evaluated with receiver operating characteristic (ROC) analysis with a continuous rating scale.

RESULTS. The average area under the ROC curve for ANN performance obtained with all clinical parameters and HRCT features was 0.956. When the radiologists used the ANN output based on their own feature ratings, the diagnostic performance of the four chest radiologists increased from 0.986 to 0.992 (p = 0.071) and that of the four general radiologists increased from 0.958 to 0.971 (p < 0.001).

CONCLUSION. The ANN can provide a useful output as a second opinion to improve general radiologists' diagnostic performance in the differential diagnosis of certain diffuse lung diseases using HRCT.


Introduction
 
The differential diagnosis of diffuse lung disease is an important subject in chest radiology. Recently, a large number of CT examinations have been performed and high-resolution CT (HRCT) has become a major radiologic diagnostic method for differentiation of diffuse lung diseases [17]. However, the differential diagnosis of diffuse lung disease is often difficult and requires much experience and knowledge because of the variety of radiologic features in each disease and the overlap of radiologic findings among many diseases.

An artificial neural network (ANN), which is a computational model based on neurons in the human brain, has recently been applied to a variety of pattern recognition and data classification tasks in medical imaging such as chest radiography, chest CT, and mammography [8–16]. In differential diagnosis, an ANN can merge information, such as radiologic and clinical findings, and can learn the relationship between input data and output data from the different patterns found in a large number of clinical cases. Grenier et al. [17] established a baseline for development of a computer-aided diagnosis (CAD) system to specifically diagnose chronic diffuse infiltrative lung diseases with Bayesian analysis. In their article, they assessed the diagnostic value of clinical data, chest radiography, and chest CT for the differential diagnosis of chronic diffuse infiltrative lung diseases; their results showed the supplementary contribution of CT to clinical and radiographic data for the diagnosis. However, the effect of this diagnostic tool as CAD on radiologists' performance was not evaluated.

In this study, we applied an ANN to the differential diagnosis of diffuse lung disease on HRCT, and we evaluated the diagnostic performance of the ANN by using receiver operating characteristic (ROC) analysis. We also evaluated the effect of the ANN output on radiologists' diagnostic performance.


Materials and Methods
 
ANN Scheme
We used a single three-layer, feed-forward ANN with a back-propagation algorithm that was previously used for differential diagnosis of diffuse lung disease on chest radiographs [8–11] and for differentiation between benign and malignant solitary pulmonary nodules on chest radiographs [14] and HRCT images [15]. Figure 1 shows the basic structure of the ANN used in this study. The ANN was designed to differentiate among 11 diffuse lung diseases by using 10 clinical parameters and 23 HRCT features. Therefore, the ANN consisted of 33 input units, 22 hidden units, and 11 output units corresponding to the 11 types of diffuse lung disease. The number of hidden units was determined empirically, as is generally done in ANN applications. The output values obtained from the ANN ranged from 0 to 1 and represented the likelihood of each of the 11 types of diffuse lung disease in each case.
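The layer sizes described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the sigmoid activation, the bias terms, the random weight initialization, and all function names are assumptions chosen only because they yield outputs between 0 and 1, and the back-propagation training step is omitted.

```python
import math
import random

# Layer sizes stated in the text: 33 inputs, 22 hidden units, 11 outputs.
N_INPUT, N_HIDDEN, N_OUTPUT = 33, 22, 11


def sigmoid(x: float) -> float:
    """Logistic activation (assumed); maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in: int, n_out: int, rng: random.Random) -> list[list[float]]:
    """One weight row per unit; the last entry of each row is an assumed bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights: list[list[float]], inputs: list[float]) -> list[float]:
    """Weighted sum plus bias for every unit, followed by the sigmoid."""
    outputs = []
    for row in weights:
        total = row[-1] + sum(w * x for w, x in zip(row[:-1], inputs))
        outputs.append(sigmoid(total))
    return outputs


def forward(hidden_w: list[list[float]], output_w: list[list[float]],
            inputs: list[float]) -> list[float]:
    """Input layer -> hidden layer -> output layer."""
    return layer_forward(output_w, layer_forward(hidden_w, inputs))


rng = random.Random(0)
hidden_w = init_layer(N_INPUT, N_HIDDEN, rng)
output_w = init_layer(N_HIDDEN, N_OUTPUT, rng)

# Placeholder input vector in [0, 1]; real inputs would be the normalized
# clinical parameters and HRCT feature ratings.
example_input = [rng.random() for _ in range(N_INPUT)]
likelihoods = forward(hidden_w, output_w, example_input)
```

Each of the 11 values in `likelihoods` plays the role of one output unit; with untrained random weights the values carry no diagnostic meaning.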



 
Fig. 1. Diagram shows basic structure of artificial neural network (ANN). Although only 13 hidden units are shown for illustration, ANN actually consists of 22 hidden units. HRCT = high-resolution CT, BOOP = bronchiolitis obliterans with organizing pneumonia, CMV = cytomegalovirus.

 

The 11 diffuse lung diseases selected for differential diagnosis were sarcoidosis, diffuse panbronchiolitis, nonspecific interstitial pneumonia, lymphangitic carcinomatosis, usual interstitial pneumonia, silicosis, bronchiolitis obliterans with organizing pneumonia (BOOP) and chronic eosinophilic pneumonia, pulmonary alveolar proteinosis, miliary tuberculosis, lymphangiomyomatosis, and Pneumocystis carinii pneumonia and cytomegalovirus pneumonia. These 11 diseases were selected because they include relatively common diffuse lung diseases. We placed BOOP and chronic eosinophilic pneumonia into one group because the HRCT findings for these two diseases are similar and overlap, and differentiation between them using HRCT is difficult [18, 19]. Both P. carinii pneumonia and cytomegalovirus pneumonia are opportunistic infections in an immunocompromised host and often occur simultaneously [20]. In addition, the HRCT findings for both diseases are similar [21]; thus, we also placed them into one group.

The 10 clinical parameters included the patient's age and sex; duration and severity of symptoms; temperature; immune status; underlying malignancy; and history of smoking, dust exposure, and drug treatment. The HRCT features were classified into four categories: the extent, distribution, and characteristics of the pulmonary lesion and other thoracic abnormality. The distribution of the pulmonary lesion included upper or lower, right or left, central or peripheral, and dorsal or ventral predominance. The characteristics of the pulmonary lesion were grouped into four subtypes using a modification of the classification described by Webb et al. [22] as follows: linear opacity (peribronchovascular interstitial thickening, interlobular septal thickening, centrilobular branching opacity, intralobular reticular opacity, and nonseptal line), nodular opacity (centrilobular and subpleural small nodules, random nodules, and nodules or masses), decreased lung opacity (bronchiectasis, honey-combing, lung cysts, and cavitary lesions), and increased lung opacity (ground-glass opacity and consolidation). Other thoracic abnormalities included pleural effusion, lymphadenopathy, and heart size. The reason for using these 10 clinical parameters and 23 HRCT features for the ANN was that the chest radiologists considered them important as input data for the differential diagnosis of diffuse lung diseases in the clinical work.

Database
We selected 130 actual clinical cases of diffuse lung disease for training and testing the ANN. These patients had undergone chest HRCT and received a definite diagnosis between January 1997 and December 2000. All 130 patients had only one disease entity; patients with two or more disease entities were excluded. The study group was composed of 57 men and 73 women who ranged in age from 18 to 85 years (mean, 52 years). The number of cases for each disease ranged from four to 28. There were 28 cases of sarcoidosis, 17 of diffuse panbronchiolitis, 16 of nonspecific interstitial pneumonia, 15 of lymphangitic carcinomatosis, 12 of usual interstitial pneumonia, 11 of silicosis, 11 of BOOP or chronic eosinophilic pneumonia, seven of pulmonary alveolar proteinosis, five of miliary tuberculosis, four of lymphangiomyomatosis, and four of P. carinii pneumonia or cytomegalovirus pneumonia. All cases were diagnosed on the basis of clinical criteria: pathologic proof for the pulmonary lesions (for all patients with sarcoidosis, nonspecific interstitial pneumonia, usual interstitial pneumonia, BOOP, and lymphangiomyomatosis; and for three patients with chronic eosinophilic pneumonia, three with diffuse panbronchiolitis, and three with pulmonary alveolar proteinosis), bacteriologic proof (all patients with miliary tuberculosis, Pneumocystis carinii pneumonia, and cytomegalovirus pneumonia), or detailed clinical correlation (all patients with lymphangitic carcinomatosis and silicosis; and the remaining patients with chronic eosinophilic pneumonia, diffuse panbronchiolitis, and pulmonary alveolar proteinosis).

An example of ratings for each clinical parameter is shown in Table 1. All clinical parameters in each case were available in our medical records. The absolute value was used for age, duration of symptoms, temperature, and history of smoking. The severity of symptoms was classified into five grades on the basis of the Hugh-Jones classification as described by Xu et al. [23] (Appendix 1). Sex, immune status, underlying malignancy, history of dust exposure and of drug treatment were defined as 0 or 1.



 
TABLE 1 Example of 10 Clinical Parameters Used as Input Data for the Case of Sarcoidosis Shown in Figure 2

 


 
APPENDIX 1. Hugh-Jones Classification

 



 
Fig. 2. CT images of 22-year-old man with sarcoidosis. These images can serve as examples of images used in this study. Two radiologists' ratings for this case are shown in Table 2.

 

Subjective ratings for the 23 HRCT features were provided independently by eight radiologists: four chest radiologists with 13, 10, 6, and 5 years of experience and four general radiologists with 9, 4, 3, and 3 years of experience. To eliminate bias, they were not informed of the correct diagnosis or of any clinical parameters except patient age and sex. After two chest radiologists confirmed that lung lesions were distributed diffusely in all cases, six images (three HRCT images and three images obtained using mediastinal window settings, at the level of the aortic arch, the tracheal carina, and 2 cm above the right hemidiaphragm) were selected for the ratings to minimize the observers' review time, as shown in Figure 2. Several authors have reported that an accurate diagnosis could be made by using a limited number of slices for HRCT images as well as HRCT images at 1-cm intervals in the differential diagnosis of diffuse lung disease [24, 25]. In addition to these three anatomic levels, when a pulmonary lesion existed predominantly in the apex or lung base, we also showed the HRCT image of these levels. Table 2 shows examples of two radiologists' ratings for 23 HRCT features for the patient with sarcoidosis shown in Figure 2. All subjective ratings for HRCT features except the distribution of the pulmonary lesion were rated on a scale of 0 to 4. The distribution of the lesion was rated from 1 to 5 (1, upper >> lower; 2, upper > lower; 3, upper = lower; 4, upper < lower; and 5, upper << lower). All input data for the ANN were normalized to a range from 0 to 1.
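The text states only that inputs were normalized to a range from 0 to 1, not how. A minimal sketch, assuming simple linear (min-max) rescaling, might look like the following; the function name and the age bounds are hypothetical, while the 0-to-4 and 1-to-5 scales come from the text.

```python
def min_max(value: float, low: float, high: float) -> float:
    """Linearly rescale value from [low, high] to [0, 1], clamping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


# HRCT feature rated on the 0-to-4 scale described in the text.
feature_rating = min_max(3, 0, 4)        # 0.75

# Lesion distribution rated on the 1-to-5 scale described in the text.
distribution_rating = min_max(4, 1, 5)   # 0.75

# Age as an absolute value; the bounds of 0 and 100 years are an assumption.
age_input = min_max(22, 0, 100)          # 0.22
```

Binary parameters such as sex or immune status are already 0 or 1 and would pass through unchanged under this scheme.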



 
TABLE 2 Examples of Two Radiologists' Subjective Ratings for 23 High-Resolution CT Features for the Case of Sarcoidosis Shown in Figure 2

 

Evaluation of ANN Performance
We implemented a round-robin (or leave-one-out) method for training and testing the ANN by using all clinical cases. With this method, ratings for all the cases in the database except one were used for training, and ratings for the omitted case were applied to the testing with the trained ANN. The ANN was trained on a combination of the input data obtained from clinical parameters and subjective ratings for HRCT features for each case from the eight radiologists and was tested with the input data from each radiologist's feature ratings. In this method, we did not use the eight radiologists' ratings of the same case independently because of the potential correlation among them. If we had used one radiologist's ratings for training and the other radiologists' ratings of the same case for testing the ANN, this overlap could have produced a positive bias in the evaluation of ANN performance. Therefore, the ANN was trained and tested on a per-case basis. This procedure was repeated until every case in the database was used once as a testing case.
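The per-case round-robin split described above can be sketched as follows. This is one reading of the procedure, under the assumption that every (case, radiologist) pair contributes one rating vector; the function name and the index representation are hypothetical.

```python
from typing import Iterator

Sample = tuple[int, int]  # (case index, radiologist index)


def leave_one_case_out(n_cases: int, n_raters: int) -> Iterator[tuple[list[Sample], list[Sample]]]:
    """Yield (train, test) index lists, holding out all ratings of one case at a time.

    Holding out every radiologist's ratings of the omitted case keeps
    correlated ratings of the same case out of the training set.
    """
    for held_out in range(n_cases):
        train = [(c, r) for c in range(n_cases) if c != held_out
                 for r in range(n_raters)]
        test = [(held_out, r) for r in range(n_raters)]
        yield train, test


# 130 cases and eight radiologists, as stated in the text.
folds = list(leave_one_case_out(130, 8))
first_train, first_test = folds[0]
```

Under this reading each fold trains on 129 × 8 = 1,032 rating vectors and tests on the eight rating vectors of the omitted case.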

The ANN performance was evaluated with ROC analysis. Binormal ROC curves for the diagnosis of diffuse lung disease were estimated by using the LABROC1 algorithm developed by Metz [26–28]. An ROC curve for detecting each particular disease in the presence of the other 10 diseases was obtained by examining the output values from the single output unit that corresponded to the disease in question and by considering cases of that disease as "actual positives" and cases of any other disease as "actual negatives." To assess ANN performance for each disease (disease-specific classification), we calculated the areas under each of these 11 ROC curves (i.e., the Az values) for the eight radiologists. We also evaluated the ANN performance for each radiologist by calculating the Az values for each of these eight ROC curves.
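The one-disease-versus-the-rest labeling can be illustrated with a short sketch. Note the substitution: the study fits binormal ROC curves with LABROC1, whereas the code below computes the empirical (Mann-Whitney) area under the curve as a simpler stand-in, so its values would not match Az exactly. The function names and the three-class toy data are hypothetical.

```python
def empirical_auc(positives: list[float], negatives: list[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count half)."""
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(outputs: list[list[float]], labels: list[int], disease: int) -> float:
    """Use only the output unit for `disease`; other diseases are actual negatives."""
    pos = [o[disease] for o, y in zip(outputs, labels) if y == disease]
    neg = [o[disease] for o, y in zip(outputs, labels) if y != disease]
    return empirical_auc(pos, neg)


# Toy data with three output units instead of 11 (illustrative values only).
toy_outputs = [[0.9, 0.1, 0.2], [0.7, 0.3, 0.1], [0.2, 0.8, 0.1], [0.8, 0.2, 0.9]]
toy_labels = [0, 0, 1, 2]
auc_disease_0 = one_vs_rest_auc(toy_outputs, toy_labels, 0)  # 0.75
```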

Observer Test
An observer test was performed 6 months after the eight radiologists had provided subjective ratings for the HRCT features in 130 cases. For the observer test, a limited subset of 45 cases was selected from the database of 130 cases by two chest radiologists who did not participate in the observer test. The reason for reducing the number of cases for the observer test was to decrease the time required for observers. The 45 cases included 17 men and 28 women who ranged in age from 19 to 85 years (mean, 52 years). For these 45 cases, the ANN performance (Az value) was comparable to that obtained for all 130 cases. In addition, the distribution of disease categories in the subset was similar to that of the complete set of 130 cases (10 cases of sarcoidosis, six of diffuse panbronchiolitis, six of nonspecific interstitial pneumonia, five of lymphangitic carcinomatosis, four of usual interstitial pneumonia, four of silicosis, four of BOOP or chronic eosinophilic pneumonia, two of miliary tuberculosis, two of pulmonary alveolar proteinosis, one of lymphangiomyomatosis, and one of P. carinii pneumonia or cytomegalovirus pneumonia).

Eight radiologists who provided subjective ratings for HRCT features in advance participated in the observer test. The observers were told that the subset of 45 cases used for the observer test had been selected from the 130 cases for which they had extracted HRCT features 6 months earlier; that only one of the 11 possible diseases was the correct diagnosis for each case and normal cases were not included; and that the ANN outputs presented to observers were obtained by using their own feature ratings as input data for the ANN. The observers were not informed of the distribution of each disease category in the subset of 45 cases.

Before the test, three training cases were shown to the observers to familiarize them with the rating method and with the use of the ANN output as a second opinion. Initially, each observer was presented HRCT images and clinical parameters and rated the likelihood of each of the 11 diffuse lung diseases. The observer's confidence level was represented on an analog continuous-rating scale with a line-checking method [10, 15, 29]. Observers marked their confidence levels along the 11 lines on the score sheet. Ratings of definitely absent and definitely present were marked above the left and the right ends of the line, respectively. Subsequently, the ANN outputs were presented to the observer. Figures 3A and 3B show examples of graphs of the ANN output used in this observer test. In the second interpretation, observers used a red pencil to mark their confidence levels along the same 11 lines if they changed their confidence levels as a result of the ANN outputs.



 
Fig. 3A. Artificial neural network (ANN) output obtained by two radiologists' ratings of features for case shown in Figure 2. NSIP = nonspecific interstitial pneumonia, UIP = usual interstitial pneumonia, BOOP = bronchiolitis obliterans with organizing pneumonia, CEP = chronic eosinophilic pneumonia, ca = carcinomatosis, Tb. = tuberculosis, DPB = diffuse panbronchiolitis, PAP = pulmonary alveolar proteinosis, LAM = lymphangiomyomatosis, PCP = Pneumocystis carinii pneumonia, CMV = cytomegalovirus. Graphs show that largest output values among 11 diseases correspond to correct diagnosis for observer B, a chest radiologist (A), and observer G, a general radiologist (B).

 


 
Fig. 3B. Artificial neural network (ANN) output for observer G, a general radiologist, obtained from that radiologist's ratings of features for case shown in Figure 2. Abbreviations and description as in Fig. 3A.

 

Data Analysis
For data analysis, the confidence level was scored by measurement of the distance from the left end of the line to the marked point and converting the measurement to a scale from 0 to 100.
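The conversion from a mark on the line to a 0-to-100 score is a simple proportion. In the sketch below the line length is a parameter because the text does not give it; the 50-mm value in the usage line and the function name are hypothetical.

```python
def confidence_score(distance_mm: float, line_length_mm: float) -> float:
    """Convert the distance from the left end of the line to a 0-100 confidence score."""
    if line_length_mm <= 0:
        raise ValueError("line length must be positive")
    if not 0 <= distance_mm <= line_length_mm:
        raise ValueError("mark must lie on the line")
    return 100.0 * distance_mm / line_length_mm


# A mark 35 mm from the left end of a hypothetical 50-mm line.
score = confidence_score(35, 50)  # 70.0
```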

The radiologists' diagnostic performance without and with ANN output was evaluated by ROC analysis [26]. We defined confidence ratings for the correct diagnosis as actual positives and those for any other disease as actual negatives. For each observer and each interpretation condition (with and without ANN output), we used a maximum-likelihood estimation to fit a binormal ROC curve to the confidence ratings data for all 11 possible diseases in the 45 cases [27]. We combined data for all diseases because of the small number of cases of each disease. The Az value was then calculated for each fitted ROC curve. The statistical significance of differences between ROC curves for each interpretation condition was determined by applying a two-tailed Student's t test for paired data to the observer-specific Az values. Average ROC curves were generated to represent the overall performance of each group of observers (four chest radiologists and four general radiologists) by averaging the two binormal parameters of their individual ROC curves [28]. We also calculated the sensitivity and specificity for each of the eight radiologists using confidence ratings data. A case in which the correct diagnosis received the highest confidence rating was counted as one true-positive and 10 true-negative findings. A case in which the correct diagnosis received the second-highest or a lower confidence rating was counted as one false-negative, one false-positive, and nine true-negative findings.
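The counting rule for sensitivity and specificity can be sketched as below. The function name and the two toy cases are hypothetical, and the handling of tied top ratings (the first disease in the list wins) is an assumption, since the text does not say how ties were resolved.

```python
def tally(cases: list[tuple[list[float], int]]) -> tuple[int, int, int, int]:
    """Return (TP, FN, FP, TN) using the per-case counting rule in the text.

    Each case is (confidence ratings for every disease, index of the correct disease).
    """
    tp = fn = fp = tn = 0
    for ratings, truth in cases:
        top = max(range(len(ratings)), key=ratings.__getitem__)  # first maximum on ties
        others = len(ratings) - 1
        if top == truth:
            tp += 1
            tn += others        # 10 true negatives when there are 11 diseases
        else:
            fn += 1             # correct disease was not ranked first
            fp += 1             # a wrong disease was ranked first
            tn += others - 1    # 9 true negatives when there are 11 diseases
    return tp, fn, fp, tn


# Two toy cases with 11 ratings each; disease 0 is correct in both.
case_a = ([90.0] + [5.0] * 10, 0)        # correct disease ranked first
case_b = ([20.0, 80.0] + [0.0] * 9, 0)   # correct disease ranked second
tp, fn, fp, tn = tally([case_a, case_b])

sensitivity = tp / (tp + fn)   # 0.5
specificity = tn / (tn + fp)   # 0.95
```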

Another indication of observer performance was the number of correctly diagnosed cases for which the observer's ranking was changed by the ANN output. We used four rankings: 1, 2, 3, and 4 and more, where 1 corresponded to a case that the observer diagnosed correctly with the highest confidence rating, 2 corresponded to a case diagnosed correctly with the second highest confidence rating, and so on. If a ranking was improved, such as a change from 2 to 1, by the ANN output, the ANN affected the diagnostic performance beneficially; the opposite indicated a detrimental effect. The statistical significance of the difference between the number of cases affected beneficially and that affected detrimentally was analyzed by a two-tailed t test for paired data.
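A minimal sketch of the ranking comparison follows. The four ranking levels come from the text; the function names, the toy ratings, and the tie rule (a tie with the correct diagnosis does not lower its rank) are assumptions.

```python
def capped_rank(ratings: list[float], truth: int, cap: int = 4) -> int:
    """Rank of the correct diagnosis by confidence; ranks of 4 or more are reported as 4."""
    higher = sum(1 for r in ratings if r > ratings[truth])
    return min(higher + 1, cap)


def ann_effect(before: list[float], after: list[float], truth: int) -> str:
    """Classify the change in ranking between the first and second interpretation."""
    rank_before = capped_rank(before, truth)
    rank_after = capped_rank(after, truth)
    if rank_after < rank_before:
        return "beneficial"
    if rank_after > rank_before:
        return "detrimental"
    return "unchanged"


# Toy example with three diseases; disease 0 is the correct diagnosis.
without_ann = [40.0, 60.0, 10.0]   # correct diagnosis ranked 2
with_ann = [70.0, 60.0, 10.0]      # correct diagnosis ranked 1
effect = ann_effect(without_ann, with_ann, 0)  # "beneficial"
```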


Results
 
ANN Performance
Figure 4 shows the average and the range of Az values by eight radiologists for each disease. There is a relatively large variation among Az values (0.78–1.00) for the 11 diseases. The Az values for silicosis, lymphangitic carcinomatosis, diffuse panbronchiolitis, and lymphangiomyomatosis are very high because of a strong correspondence between these diseases and certain clinical parameters (e.g., silicosis with dust exposure and lymphangitic carcinomatosis with underlying malignancy) or HRCT features (e.g., diffuse panbronchiolitis with centrilobular small nodules and lymphangiomyomatosis with lung cysts).



 
Fig. 4. Graph illustrates range (black line) and average (white bar) of area under receiver operating characteristic curve (Az) values of artificial neural network (ANN) performance by eight radiologists for each disease. NSIP = nonspecific interstitial pneumonia, UIP = usual interstitial pneumonia, BOOP = bronchiolitis obliterans with organizing pneumonia, CEP = chronic eosinophilic pneumonia, ca. = carcinomatosis, Tb. = tuberculosis, DPB = diffuse panbronchiolitis, PAP = pulmonary alveolar proteinosis, LAM = lymphangiomyomatosis, PCP = Pneumocystis carinii pneumonia, CMV = cytomegalovirus.

 

Table 3 shows the ANN performance for each radiologist based on their own feature ratings. The Az values for the eight radiologists ranged from 0.946 to 0.969 (average Az = 0.956). These Az values show a relatively high performance.



 
TABLE 3 Performance of Artificial Neural Network for Each Radiologist Based on Their Own Feature Ratings

 

When we evaluated the ANN performance, we used not only all 33 features, but also the 10 clinical parameters alone or the 23 HRCT features alone as input units for the ANN. The average Az value for the eight radiologists obtained with HRCT features alone was 0.910, and the Az value obtained with clinical parameters alone was 0.884. The ANN performance with all 33 features was superior to that with 10 clinical parameters alone or 23 HRCT features alone.

Observer Test
Table 4 shows the Az values for ROC curves of eight radiologists obtained without and with ANN output. The performance of each of the two groups of observers (i.e., four chest radiologists and four general radiologists) is illustrated by the average Az values in Figures 5A and 5B. The average Az value for the four chest radiologists without and with ANN output was 0.986 and 0.992, respectively (Fig. 5A). The improvement did not reach statistical significance for the chest radiologists (p = 0.071). The average Az value for the four general radiologists without and with ANN output was 0.958 and 0.971, respectively (Fig. 5B). The Az value for the general radiologists' subgroup increased significantly when the ANN output was available (p < 0.001). The sensitivity and specificity for each of eight radiologists without and with ANN output are shown in Table 5. The average values for both sensitivity and specificity for general radiologists were improved significantly with the use of the ANN output (p < 0.05), whereas the average sensitivity and specificity for chest radiologists were not increased significantly (p = 0.25 and p = 0.50, respectively).



 
TABLE 4 Az Values for Diagnostic Accuracy of Eight Radiologists Without and With Output of Artificial Neural Network (ANN)

 


 
Fig. 5A. Average areas under receiver operating characteristic curves (Az) for observers with and without artificial neural network (ANN) output. Graph shows Az values for chest radiologists.

 


 
Fig. 5B. Average areas under receiver operating characteristic curves (Az) for observers with and without artificial neural network (ANN) output. Graph shows Az values for general radiologists. Note that observer performance with artificial neural network (ANN) output improved significantly.

 


 
TABLE 5 Sensitivity and Specificity of Eight Radiologists Without and With Output of Artificial Neural Network (ANN)

 

The number of cases affected either beneficially or detrimentally by the ANN output for each radiologist is shown in Figure 6. The number of cases in which the observers changed their ranking for correct diagnosis was 34 of 360 (45 × 8) cases cumulatively. The observers changed their responses in 2.2–15.6% of the 45 cases. The confidence level was affected beneficially in 29 cases and was affected detrimentally in five cases. The average numbers of cases affected beneficially and detrimentally by ANN output for all radiologists were 3.6 and 0.6, respectively; this difference was statistically significant (p < 0.05).
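The figures in this paragraph can be cross-checked with a few lines of arithmetic. The totals and averages are taken from the text; reading the 2.2–15.6% range as per-observer counts of one to seven cases out of 45 is an inference from the percentages, not something the text states.

```python
N_CASES, N_OBSERVERS = 45, 8
beneficial, detrimental = 29, 5

total_readings = N_CASES * N_OBSERVERS         # 360 case readings in total
changed = beneficial + detrimental             # 34 readings with a changed ranking

mean_beneficial = beneficial / N_OBSERVERS     # 3.625, reported as 3.6
mean_detrimental = detrimental / N_OBSERVERS   # 0.625, reported as 0.6

# Percentages implied by 1 and 7 changed cases out of 45 (inferred counts).
pct_low = 100 * 1 / N_CASES                    # about 2.2
pct_high = 100 * 7 / N_CASES                   # about 15.6
```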



 
Fig. 6. Graph shows number of correctly diagnosed cases for which observers' rankings changed because of artificial neural network (ANN) output. Black bars indicate number of cases in which ANN output was beneficial; white bars, number of cases in which ANN output was detrimental. ANN output clearly improved performance of observers.

 


Discussion
 
The differential diagnosis of diffuse lung disease on HRCT requires two steps: the extraction of HRCT features and the subsequent merging of these features and the clinical parameters available. In this study, we selected 10 clinical parameters and 23 HRCT features as input data, and a high level of performance of the ANN (average Az = 0.956) was obtained. This result shows that an ANN can consistently merge a large amount of information on clinical parameters and HRCT features as input data and can learn the relationship between input data and output data. Similar results have been reported in studies of ANN in the differential diagnosis of breast cancer on mammograms [16], interstitial lung disease on chest radiographs [8–11], and solitary pulmonary nodules on chest radiographs [13, 14] and HRCT images [15].

In this study, we used a round-robin (or leave-one-out) method for training and testing the ANN. With this method, the ANN was trained on a combination of the input data obtained from HRCT feature ratings by eight radiologists with various years of experience and was tested with the input data from each radiologist's feature ratings. When we use the ANN in an actual clinical situation, each radiologist will likely be required to extract features and make a differential diagnosis on the basis of their own extracted features. Therefore, this study simulated the potential application of the ANN in the clinical situation. In previous studies on the differential diagnosis of diffuse lung disease on chest radiographs or solitary pulmonary nodules on chest radiographs and HRCT images, however, feature ratings as input data were given by experienced radiologists, and an ANN trained on the basis of these data had a good performance and significantly improved the diagnostic performance of observers who did not extract features [10, 14, 15]. Therefore, it would be clinically more feasible to develop an ANN that is trained with input data provided by experienced radiologists in the future.

Because each radiologist extracted features subjectively, interobserver variation exists. In particular, less experienced radiologists could not always extract features consistently. We examined the effect of interobserver variation based on correlation coefficients on subjective ratings for 23 HRCT features among the eight radiologists. The median of the correlation coefficients was 0.566, showing that the correlation was not strong. However, the Az values of the ANN for each radiologist ranged from 0.946 to 0.969, which shows a relatively high performance. These results seem to indicate that the ANN can learn certain specific patterns even if interobserver variation for HRCT feature ratings exists to some degree, although 10 clinical parameters were also included as input data.
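The interobserver comparison can be sketched as a median over pairwise correlation coefficients. The text does not name the coefficient, so Pearson's r is an assumption here, as are the function names and the toy ratings; with eight radiologists there would be 28 reader pairs.

```python
import math
from itertools import combinations
from statistics import mean, median


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long rating vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / math.sqrt(sxx * syy)


def median_pairwise_correlation(ratings_by_reader: list[list[float]]) -> float:
    """Median of the correlation coefficients over all pairs of readers."""
    return median(pearson(a, b) for a, b in combinations(ratings_by_reader, 2))


# Toy ratings on the 0-to-4 scale for three hypothetical readers.
toy_readers = [
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
    [4, 3, 2, 1, 0],
]
toy_median = median_pairwise_correlation(toy_readers)
n_pairs_for_eight_readers = len(list(combinations(range(8), 2)))  # 28
```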

Researchers have reported that an accurate diagnosis could be made with clinical parameters such as age and sex and HRCT features only in the differentiation of diffuse lung disease [7]. However, Grenier et al. [17] reported that a higher diagnostic performance was obtained after HRCT features were used with clinical parameters and radiographic findings. Therefore, we evaluated ANN performance by using not only all 33 features, but also the 10 clinical parameters alone or the 23 HRCT features alone as input data. Compared with the diagnostic performance of the ANN using all 33 input data (Az = 0.956), the Az values with 23 HRCT features alone and 10 clinical parameters alone were 0.910 and 0.884, respectively. These results indicate that the ANN performance with all or some of the clinical parameters in addition to HRCT features was higher than that with HRCT features alone. However, the ANN with all input data used in this study may not necessarily be suitable for differential diagnosis of diffuse lung disease. For clinical application of the ANN, it is desirable that only a small number of essential input data would be applied while maintaining a high Az value for diagnostic performance. Therefore, further study is needed for examining the minimum number of input units required. Although all clinical parameters were used as input data for the ANN in the present study, these data are not always available in actual clinical settings. In the future, it may be necessary to design an ANN in which only the clinical parameters available can be used as input data.

Because training of the ANN depends strongly on the database, a comprehensive database that covers a wide distribution of patterns for each disease is desirable. It is impractical to select many types of diffuse lung diseases including uncommon diseases as output units. In addition, collecting a sufficiently large number of clinical cases at one institution, especially for less common diseases, would be difficult. Thus, we selected 11 types of relatively common diffuse lung diseases for differential diagnosis, and 130 cases for which HRCT was performed and a definite diagnosis was established at our institution in a certain period were available as a database. Although other differential diagnoses may be seen in some situations, these 11 disease entities account for most of the diffuse lung diseases that we encounter in our daily work. Therefore, the use of this ANN is helpful in providing the radiologists with a list as well as a likelihood of these 11 types of disease. Although the number of clinical cases was not large in this study, the number of cases for each disease correlated relatively well with the actual incidence of clinical cases. We need to increase the number of cases to represent a wide distribution of radiologic patterns for each disease in the future.

The effect of the ANN on radiologists' performance in the differentiation of diffuse lung disease was evaluated by an observer test. The gain in Az value from without to with ANN output was larger for general radiologists than for chest radiologists, and the gain for general radiologists was statistically significant (p < 0.001). It should be noted that the ANN outputs shown to observers were based on their own feature ratings, which seems to simulate an actual clinical situation. The average Az value for general radiologists with ANN output was relatively close to that for chest radiologists without ANN output. This result indicates that the ANN may be useful in future clinical settings for helping radiology residents or less experienced radiologists who do not specialize in chest radiology to make a correct diagnosis.

The diagnostic performance of the ANN alone was lower than that of radiologists without ANN output. Nevertheless, the diagnostic performance of general radiologists using the ANN was significantly improved. Similar results were reported in studies on the detection of abnormalities such as lung nodules and interstitial opacities on chest radiographs and microcalcifications on mammograms [29-31]. This finding can be interpreted as follows: compared with chest radiographs, HRCT images allow radiologists alone to make a correct diagnosis in the differentiation of diffuse lung disease with relatively high performance. However, less experienced radiologists may fail to recognize important findings; in these situations, the ANN output could prompt radiologists to carefully reconsider the HRCT features and clinical parameters together, resulting in a correct diagnosis. This interpretation is supported by the fact that the number of cases affected beneficially was significantly higher than that affected detrimentally (p < 0.05).
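
A comparison of beneficially and detrimentally affected cases can be sketched as an exact two-sided sign test, shown below in Python. The statistical test actually used is not named in this passage, so the choice of a sign test is an assumption; the function name `sign_test_p` and the counts are hypothetical and are not the study's data.

```python
from math import comb


def sign_test_p(beneficial, detrimental):
    """Exact two-sided sign test p-value under H0: both directions equally likely."""
    n = beneficial + detrimental
    k = max(beneficial, detrimental)
    # Probability of k or more changes in one direction out of n, with p = 0.5.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical counts: 15 cases changed for the better, 5 for the worse.
p_value = sign_test_p(15, 5)
print(f"p = {p_value:.4f}")
```

Cases whose rating did not change with the ANN output are ignored by this test, which is the usual convention for a sign test.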

In conclusion, the ANN can differentiate among certain diffuse lung diseases using HRCT, and it can provide a useful second opinion that improves the diagnostic accuracy of general radiologists.


Acknowledgments
 
We thank E. Lanzl for editing this manuscript.


References

  1. Muller NL. Clinical value of high-resolution CT in chronic diffuse lung disease. AJR 1991;157:1163-1170
  2. Nishimura K, Izumi T, Kitaichi M, Nagai S, Itoh H. The diagnostic accuracy of high-resolution computed tomography in diffuse infiltrative lung diseases. Chest 1993;104:1149-1155
  3. Grenier P, Valeyre D, Cluzel P, Brauner MW, Lenoir S, Chastang C. Chronic diffuse interstitial lung disease: diagnostic value of chest radiography and high-resolution CT. Radiology 1991;179:123-132
  4. Remy-Jardin M, Remy J, Deffontaines C, Duhamel A. Assessment of diffuse infiltrative lung disease: comparison of conventional CT and high-resolution CT. Radiology 1991;181:157-162
  5. Swensen SJ, Aughenbaugh GL, Douglas WW, Myers JL. High-resolution CT of the lungs: findings in various pulmonary diseases. AJR 1992;158:971-979
  6. Murata K, Khan A, Herman PG. Pulmonary parenchymal disease: evaluation with high-resolution CT. Radiology 1989;170:629-635
  7. Swensen SJ, Aughenbaugh GL, Myers JL. Diffuse lung disease: diagnostic accuracy of CT in patients undergoing surgical biopsy of the lung. Radiology 1997;205:229-234
  8. Asada N, Doi K, MacMahon H, et al. Potential usefulness of an artificial neural network for differential diagnosis of interstitial lung diseases: pilot study. Radiology 1990;177:857-860
  9. Ashizawa K, Ishida T, MacMahon H, Vyborny CJ, Katsuragawa S, Doi K. Artificial neural networks in chest radiography: application to the differential diagnosis of interstitial lung disease. Acad Radiol 1999;6:2-9
  10. Ashizawa K, MacMahon H, Ishida T, et al. Effect of an artificial neural network on radiologists' performance in the differential diagnosis of interstitial lung disease using chest radiographs. AJR 1999;172:1311-1315
  11. Abe H, Ashizawa K, Katsuragawa S, MacMahon H, Doi K. Use of artificial neural network to determine the diagnostic value of specific clinical and radiologic parameters in the diagnosis of interstitial lung disease on chest radiographs. Acad Radiol 2002;9:13-17
  12. Gross GW, Boone JM, Greco-Hunt V, Greenberg B. Neural networks in radiologic diagnosis. 2 Interpretation of neonatal chest radiographs. Invest Radiol 1990;25:1017-1023
  13. Gurney JW, Swensen SJ. Solitary pulmonary nodules: determining the likelihood of malignancy with neural network analysis. Radiology 1995;196:823-829
  14. Nakamura K, Yoshida H, Engelmann R, et al. Computerized analysis of the likelihood of malignancy in solitary pulmonary nodules with use of artificial neural networks. Radiology 2000;214:823-830
  15. Matsuki Y, Nakamura K, Watanabe H, et al. Usefulness of an artificial neural network for differentiating benign from malignant pulmonary nodules on high-resolution CT: evaluation with receiver operating characteristic analysis. AJR 2002;178:657-663
  16. Wu Y, Giger ML, Doi K, Vyborny CJ, Schmidt RA, Metz CE. Artificial neural networks in mammography: application to decision making in the diagnosis of breast cancer. Radiology 1993;187:81-87
  17. Grenier P, Chevret S, Beigelman C, Brauner MW, Chastang C, Valeyre D. Chronic diffuse infiltrative lung disease: determination of the diagnostic value of clinical data, chest radiography, CT and Bayesian analysis. Radiology 1994;191:383-390
  18. Izumi T, Kitaichi M, Nishimura K, Nagai S. Bronchiolitis obliterans organizing pneumonia: clinical features and differential diagnosis. Chest 1992;102:715-719
  19. Arakawa H, Kurihara Y, Niimi H, Nakajima Y, Johkoh T, Nakamura H. Bronchiolitis obliterans with organizing pneumonia versus chronic eosinophilic pneumonia: high-resolution CT findings in 81 patients. AJR 2001;176:1053-1058
  20. Webb WR, Muller NL, Naidich DP. High-resolution CT of the lung, 2nd ed. Philadelphia, PA: Lippincott Williams and Wilkins, 1996:193-225
  21. Primack SL, Muller NL. High-resolution computed tomography in acute diffuse lung disease in the immunocompromised patient. Radiol Clin North Am 1994;32:731-744
  22. Webb WR, Muller NL, Naidich DP. High-resolution CT of the lung, 2nd ed. Philadelphia, PA: Lippincott Williams and Wilkins, 1996:41-108
  23. Xu X, Tajima H, Ishioh M, et al. Study on the treatment of tracheobronchial stenosis using expandable metallic stent. J Nippon Med Sch 2001;68:318-320
  24. Leung AN, Staples CA, Muller NL. Chronic diffuse infiltrative lung disease: comparison of diagnostic accuracy of high-resolution and conventional CT. AJR 1991;157:693-696
  25. Oh E, Arbor A, Kazerooni EA, et al. Accuracy of HRCT with variable sampling frequency in the differential diagnosis of diffuse lung disease. Radiology 1999;213[suppl]:342
  26. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986;21:720-733
  27. Metz CE, Herman BA, Shen JH. Maximum-likelihood estimation of receiver operating characteristic (ROC) curves from continuously distributed data. Stat Med 1998;17:1033-1053
  28. Metz CE. Some practical issues of experimental design and data analysis in radiological ROC studies. Invest Radiol 1989;24:234-245
  29. Kobayashi T, Xu XW, MacMahon H, Metz CE, Doi K. Effect of a computer-aided diagnosis scheme on radiologists' performance in detection of lung nodules on radiographs. Radiology 1996;199:843-848
  30. Monnier-Cholley L, MacMahon H, Katsuragawa S, Morishita J, Ishida T, Doi K. Computer-aided diagnosis for detection of interstitial opacities on chest radiographs. AJR 1998;171:1651-1656
  31. Chan H-P, Doi K, Vyborny CJ, et al. Improvement in radiologists' detection of clustered microcalcifications on mammograms: the potential of computer-aided diagnosis. Invest Radiol 1990;25:1102-1110
