Original Research
1 Department of Radiology, University of Michigan Health System, Cardiovascular
Center, Rm. 5481, 1500 E Medical Center Dr., Ann Arbor, MI 48109-5868.
2 Department of Internal Medicine, University of Michigan Health System, Ann
Arbor, MI.
3 Present address: Valley Radiologists/SDI, Ltd., Glendale, AZ.
4 Department of Radiology, Vancouver General Hospital, University of British
Columbia, Vancouver, BC, Canada.
5 Present address: Innovative Analytics, Kalamazoo, MI.
Received September 17, 2007;
accepted after revision May 7, 2008.
Address correspondence to B. Sundaram
(sundbask{at}umich.edu).
Abstract
MATERIALS AND METHODS. The cases of 100 patients with diffuse lung disease who underwent high-resolution CT and tissue diagnosis were studied. Three thoracic radiologists reviewed high-resolution CT images blindly and independently for patterns of abnormality, listing their three main diagnoses and level of confidence in the first choice. The effect of the findings on accuracy was analyzed.
RESULTS. For honeycombing, the accuracy of the main diagnosis was 96.6%, 92.2%, and 92.3% for the three readers, and that of the three main diagnoses was 96.6%, 96.1%, and 92.3%. For cysts, the accuracy of the main diagnosis was 88.9%, 80%, and 81.8% and of the three main diagnoses was 100%, 90%, and 90.9%. For bronchovascular thickening, the accuracy of the main diagnosis was 91.7%, 87.5%, and 90.9% and of the three main diagnoses was 91.7%, 100%, and 90.9%. For ground-glass opacification (GGO), the accuracy of the main diagnosis was 75.5%, 55%, and 44.2% and of the three main diagnoses was 89.8%, 75%, and 65.4%. Only combining honeycombing with GGO improved the accuracy of GGO. Anatomic craniocaudal distribution improved reader accuracy when GGO was predominantly present in the lower part of the lung. Interobserver agreement on the presence of major findings was a mean kappa value of 0.45 for honeycombing, 0.74 for lung cysts, 0.63 for bronchovascular thickening, and 0.56 for GGO. Agreement for the craniocaudal distribution of major findings was a mean kappa value of 0.48 for honeycombing, 0.52 for bronchovascular thickening, and 0.32 for GGO.
CONCLUSION. The predominant findings of honeycombing and bronchovascular thickening are associated with more than 90% accuracy in the first-choice diagnosis of diffuse lung disease; the finding of lung cysts has 80–89% accuracy. GGO as a predominant pattern had unreliable accuracy, but the accuracy improved when GGO was combined with either honeycombing or lower-lung distribution.
Keywords: accuracy, diffuse lung disease, high-resolution CT, predominant findings
Introduction
A multidisciplinary team composed of pulmonologists, radiologists, and pathologists is important in making the final correct diagnosis of diffuse lung disease. To arrive at the final diagnosis, it is critical for all members of the team to realize and appreciate their teammates' degrees of confidence and agreement in their prospective diagnoses. It has been proven [5] that team members, surprisingly even histopathology experts, change their initial impressions on the basis of other team members' analyses. It is critical that all members of a multidisciplinary team know their level of trust in their personal findings. The purpose of our study was to investigate the influence of a predominant pattern of abnormality on high-resolution CT scans on diagnostic accuracy across a group of readers and to determine whether the distribution or the combination of a finding with additional findings can be used to increase diagnostic accuracy.
Materials and Methods
Image Interpretation
High-resolution CT scans were retrospectively reviewed independently by
three fellowship-trained thoracic radiologists, each with more than 10 years
of thoracic radiology experience. The readers were blinded to all clinical
information about each specific case and to the type and frequency of each
disease in the study population. The definitions of high-resolution CT
findings were from the Fleischner Society glossary of terms for
high-resolution CT of the lungs
[6] and are illustrated in Figures 1, 2, 3, and 4.
Statistical Methods
The accuracy of each reader's first-choice diagnosis and three main
diagnoses was calculated with exact binomial 95% CI. Interobserver agreement
for the presence of predominant findings was evaluated with kappa statistics,
standard error, and 95% CI. For each pair of readers, exact binomial 95% CI
was used for the proportions. The McNemar test was used to test whether the
proportion of predominant findings was constant. Interobserver agreement for
the distribution of predominant findings was calculated with the Bowker test,
which is an extension of the McNemar test. A value of p < 0.05 was
deemed statistically significant.
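
As an illustration of the exact binomial 95% CI mentioned above, the following is a minimal sketch of one common construction, the Clopper-Pearson interval, computed by bisection on binomial tail probabilities. The source does not state which software or exact method was used, so the function names and the choice of Clopper-Pearson are assumptions. The example counts (28 correct of 29) are hypothetical values chosen only because they are consistent with a reported accuracy of 96.6%; they are not quoted from the study tables.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(successes, n, alpha=0.05, tol=1e-10):
    """Clopper-Pearson (exact) confidence interval for a proportion.

    Assumption: this is one standard definition of an "exact binomial CI";
    the article does not specify its implementation.
    """

    def solve(f, target):
        # f is monotonically decreasing in p on [0, 1]; find p with f(p) = target.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= successes) equals alpha / 2.
    if successes == 0:
        lower = 0.0
    else:
        lower = solve(lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2)

    # Upper bound: the p at which P(X <= successes) equals alpha / 2.
    if successes == n:
        upper = 1.0
    else:
        upper = solve(lambda p: binom_cdf(successes, n, p), alpha / 2)

    return lower, upper


# Hypothetical counts for illustration only (28 of 29 is about 96.6%).
low, high = exact_binomial_ci(28, 29)
print(f"accuracy = {28 / 29:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With small denominators such as these, the exact interval is wide, which is why overlapping CIs between readers are unsurprising.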
Results
When GGO was a predominant finding, reader accuracy of the principal diagnosis was 58.2% and of the three main diagnoses was 76.8%. There was no statistically significant variation in reader accuracy for either the first choice or the three main diagnoses when GGO was recorded as a predominant finding, as evidenced by the overlapping CIs on most occasions.
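
The mean accuracies quoted here can be checked against the per-reader GGO accuracies given in the abstract (75.5%, 55%, and 44.2% for the first-choice diagnosis; 89.8%, 75%, and 65.4% for the three main diagnoses). The sketch below is plain arithmetic on those rounded percentages. The variable names are illustrative, and the result for the three main diagnoses may differ from the reported 76.8% in the last digit, presumably because the published mean was computed from unrounded values.

```python
# Per-reader accuracies (%) for GGO as a predominant finding, from the abstract.
first_choice = [75.5, 55.0, 44.2]
three_main = [89.8, 75.0, 65.4]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print(f"mean first-choice accuracy: {mean(first_choice):.1f}%")
print(f"mean three-main accuracy:   {mean(three_main):.1f}%")
```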
To determine whether combining GGO with another finding would increase the accuracy of the finding of GGO, accuracy was calculated for GGO as the sole predominant finding and for GGO associated with other findings (Table 3). When the sole predominant finding was GGO, mean reader accuracy of the first-choice diagnosis was 40.2% and of the three main diagnoses was 70.3%. When reticulation coexisted with GGO, reader accuracy improved in most instances, the mean accuracy of the first-choice diagnosis being 63.3% and of the three main diagnoses being 73.3%. When honeycombing coexisted with GGO as a predominant finding, mean reader accuracy improved in most instances, the mean accuracy of the first-choice diagnosis being 81.4% and of the three main diagnoses being 86.6%.
The accuracy of readers 2 and 3 improved when GGO was diffusely distributed in the craniocaudal direction. The 95% CI of each reader overlapped for first-choice and all-choice accuracy, as did the 95% CIs among readers' principal and combined accuracy rates. When GGO was predominantly present in the lower part of the lungs as opposed to the upper part, mean first-choice diagnostic accuracy was 62.9% and that of the three main diagnoses was 83.1%. The use of axial distribution in combination with the predominant findings had no effect on diagnostic accuracy.
Using a confidence scale of 1–3 (1, highly confident; 3, least confident), the readers were most confident (mean, 1.39 ± 0.59) when all three agreed (57 of 100 cases) and least confident (mean, 2.24 ± 0.74) when they did not all agree (43 of 100 cases). The kappa values for accuracy of the principal diagnosis ranged from 0.58 to 0.66 (mean, 0.62). Pairwise comparison of the top diagnoses between readers revealed no statistically significant disagreement except between readers 1 and 3 (Table 4). Each reader reported, in one case each (three different cases in total), only a first-choice diagnosis with a confidence level of definite; in each of these three cases, the first-choice diagnosis was correct.
Paired kappa values for agreement on identification of predominant findings ranged from 0.25 to 0.56 (mean, 0.45) for honeycombing, 0.71 to 0.78 (mean, 0.74) for cysts, 0.56 to 0.75 (mean, 0.63) for bronchovascular thickening, and 0.46 to 0.70 (mean, 0.56) for GGO. Paired kappa values for agreement for identifying the presence of a finding (either major or minor) ranged from 0.51 to 0.71 (mean, 0.64) for honeycombing, 0.56 to 0.76 (mean, 0.65) for cysts, 0.51 to 0.68 (mean, 0.62) for bronchovascular thickening, and 0.41 to 0.55 (mean, 0.47) for GGO (Table 5). Kappa values for the distribution of these major findings in both craniocaudal and axial dimensions are shown in Table 6. In the craniocaudal dimension, the interobserver distribution agreement for honeycombing ranged from 0.4 to 0.62 (mean, 0.48); for bronchovascular thickening, 0.43 to 0.66 (mean, 0.52); and for GGO, 0.2 to 0.51 (mean, 0.32). In the axial dimension, the kappa values for honeycombing ranged from 0.46 to 0.65 (mean, 0.57) and for GGO from 0.37 to 0.49 (mean, 0.42). The kappa values of lung cysts in both dimensions and the axial dimension of bronchovascular thickening could not be statistically calculated because the readers' observations were the same.
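
For readers unfamiliar with the paired kappa statistic reported above, the following is a minimal sketch of Cohen's kappa for two readers and a binary finding (present or absent). The function name and the 2 × 2 counts in the example are hypothetical and are not taken from the study tables. The sketch also shows one way in which kappa can be undefined: when both readers place every case in the same category, expected chance agreement equals 1 and the denominator vanishes.

```python
def cohen_kappa(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two readers rating a binary finding.

    both_yes: cases in which both readers recorded the finding
    only_a:   cases in which only reader A recorded it
    only_b:   cases in which only reader B recorded it
    both_no:  cases in which neither reader recorded it
    """
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n

    # Chance agreement from each reader's marginal rate of "finding present".
    a_yes = (both_yes + only_a) / n
    b_yes = (both_yes + only_b) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)

    if expected == 1.0:
        raise ValueError("kappa is undefined: the readers' ratings do not vary")
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
print(f"kappa = {cohen_kappa(20, 5, 10, 15):.2f}")
```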
Although there were moderate and substantial agreements for honeycombing as major and major-minor findings, respectively, there was statistically significant variability between readers. There also was moderate agreement for GGO as major and major-minor findings, and there was statistically significant variability between readers. There was substantial interobserver agreement, however, with no significant interobserver variability for lung cysts and bronchovascular thickening as major findings. The interobserver agreement for craniocaudal and axial distribution of honeycombing was moderate; for GGO it was fair and moderate. The craniocaudal distribution of bronchovascular thickening had moderate agreement. These agreement values had no significant differences. Other mean agreement values could not be calculated because some reader observations were the same.
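
The verbal labels used here (fair, moderate, substantial) match the widely used Landis and Koch bands for kappa. The source does not name the scale in this section, so that mapping is an assumption; the sketch below simply encodes those bands and applies them to mean kappa values reported in the text.

```python
def agreement_label(kappa):
    """Map a kappa value to a Landis and Koch style label (assumed scale)."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Mean kappa values reported in the text for findings recorded as predominant.
reported = {
    "honeycombing": 0.45,
    "lung cysts": 0.74,
    "bronchovascular thickening": 0.63,
    "GGO": 0.56,
}
for finding, kappa in reported.items():
    print(f"{finding}: kappa {kappa:.2f} -> {agreement_label(kappa)}")
```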
Readers varied significantly in their reporting of honeycombing as a predominant finding (p < 0.0001 for each pairwise comparison). Specifically, readers 1, 2, and 3 found honeycombing in 29%, 13%, and 51% of high-resolution CT examinations, respectively. Readers 2 and 3 differed significantly (p = 0.014) in recording GGO as a predominant finding (52% vs 40%).
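
Pairwise comparisons of this kind rest on the McNemar test, which considers only the discordant cases, that is, examinations in which one reader recorded the finding and the other did not. The following is a minimal sketch of the exact (binomial) form of the test. The source does not say whether the exact or the chi-square form was used, and the discordant counts in the example are hypothetical rather than taken from the study.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p value.

    b: cases in which only reader A recorded the finding
    c: cases in which only reader B recorded the finding
    Under the null hypothesis, each discordant case is equally likely
    to fall in either cell, so min(b, c) ~ Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant cases, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for illustration only.
print(f"p = {mcnemar_exact(3, 15):.4f}")
```

The Bowker test named in the statistical methods generalizes the same idea to findings with more than two categories, such as upper, lower, or diffuse distribution.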
Discussion
In a study involving 129 patients with proven interstitial pneumonitis, Johkoh et al. [9] found honeycombing in 71% of cases of UIP, 39% of cases of DIP, 30% of cases of AIP, 26% of cases of NSIP, and 13% of cases of bronchiolitis obliterans with organizing pneumonia. Lynch et al. [10] found a high prevalence of honeycombing in patients with idiopathic pulmonary fibrosis. Using morphometric analysis to compare the morphologic features of surgical lung biopsy specimens with findings at high-resolution CT, Schettino et al. [11] found honeycombing on high-resolution CT to have strong histologic correlation in the cases of 25 patients with the diagnosis of idiopathic pulmonary fibrosis (p = 6 × 10⁻⁵). The reported sensitivity and specificity of high-resolution CT in the diagnosis of UIP in the presence of honeycombing range from 72% to 90% and 78% to 86% [1, 2], respectively, with a predictive value of 90% [12]. In our study, honeycombing as a predominant finding had an accuracy of more than 90% (range, 92.2–96.6% for a first-choice diagnosis and 92.3–96.6% within the three main diagnoses).
Whereas Lynch et al. [10] reported high interobserver agreement for the presence of honeycombing at high-resolution CT, the readers in our study, despite the significant interobserver variability, had moderate agreement (mean kappa value, 0.45) for honeycombing as a predominant finding and substantial agreement (mean kappa value, 0.64) as a predominant or minor finding. The readers also achieved moderate agreement for the distribution of honeycombing (mean kappa values for craniocaudal and axial distribution, 0.48 and 0.57) with no significant variability. The reliability of honeycombing in the diagnosis of UIP has gained acceptance among practicing physicians. In a 2005 survey conducted by the American College of Chest Physicians [13], 67% of the 230 respondents accepted a high-resolution CT diagnosis of UIP or idiopathic pulmonary fibrosis and did not require histologic proof of diagnosis in treating their patients. This finding reinforces the importance of accurate establishment of a specific diagnosis of UIP [14].
Diseases in which thin-walled lung cysts are the predominant finding include Langerhans cell histiocytosis, lymphangioleiomyomatosis, and, less commonly, Pneumocystis carinii pneumonia. Our study participants were outpatients undergoing evaluation for diffuse lung disease. One or more of the readers considered lung cysts a major finding in lymphangioleiomyomatosis, eosinophilic granuloma, UIP, sarcoidosis, and DIP. Bonelli et al. [3] reported the accuracy of high-resolution CT in the diagnosis of diseases producing cystic air spaces in comparison with emphysema. That series included 10 patients with Langerhans cell histiocytosis, nine patients with lymphangioleiomyomatosis, 10 patients with emphysema, and five healthy controls. The diagnostic accuracy was 84% for Langerhans cell histiocytosis and 79% for lymphangioleiomyomatosis.
Among 92 patients with chronic cystic lung disease, including UIP, DIP, respiratory bronchiolitis interstitial lung disease, lymphocytic interstitial pneumonia, emphysema, lymphangioleiomyomatosis, and Langerhans cell histiocytosis, who underwent high-resolution CT, Koyama et al. [15] found an overall mean accuracy of 74% (range, 72–100%). Among the 57% of patients in whom the first-choice diagnosis was reported as confidently made, the mean accuracy was 93% (range, 88–100%). This finding is similar to our reader accuracy of 80–88.9% for the first-choice diagnosis and 90–100% for three main diagnoses when lung cysts were a predominant finding. Our readers also had substantial interobserver agreement for identifying lung cysts as a predominant finding and as either a predominant or minor finding (mean kappa values, 0.74 and 0.65) with insignificant interobserver variability. These findings suggest that a reliable high-resolution CT diagnosis can be made when lung cysts are a major finding, although the interobserver agreement for the distribution of lung cysts is still not clearly known.
Peribronchovascular thickening has been considered a nonspecific finding associated with many diseases, including sarcoidosis [16], polymyositis and dermatomyositis [17], infection [18], idiopathic interstitial pneumonitis [19], and lymphoproliferative disorders [20]. For example, in a study by Bergin et al. [21] involving 20 patients with sarcoidosis, at least one of two readers found prominent diffuse bronchovascular bundle thickening in 75% of the patients. Reittner et al. [22] reported on 28 patients with Mycoplasma pneumoniae; 82% of those patients had bronchovascular thickening with interobserver agreement of 0.7. Johkoh et al. [23] found that most of their patients with lymphocytic interstitial pneumonia had peribronchovascular thickening (19 of 22 patients, 86%). In a study by Emoto et al. [24], bronchovascular thickening was most frequently found in lymphoma (100%), leukemia (70%), tuberculosis (75%), and bacterial pneumonia (76.9%). In a retrospective review, Tanaka et al. [25] found striking thickening of the bronchovascular bundles in 81.8% of patients with leukemic lung infiltration. This finding had a positive predictive value of 75% and a negative predictive value of 90.5%. The diseases present when bronchovascular thickening was recorded as a predominant finding by at least one reader in that series were sarcoidosis, lymphoma, silicosis, and lymphangitis carcinomatosis.
With bronchovascular thickening a predominant finding in the setting of suspected diffuse interstitial lung disease, all patients being outpatients at the time of high-resolution CT, the readers in our study had an accuracy of 87.5–100% for both their first-choice (mean, 90%) and three main (mean, 94%) diagnoses. The readers also had substantial interobserver agreement in identifying bronchovascular thickening as a predominant finding (mean kappa value, 0.63) and as either a predominant or minor finding (mean kappa value, 0.62). The readers had moderate agreement on the distribution of bronchovascular thickening for craniocaudal distribution (mean kappa value, 0.52) with insignificant interobserver variability. Although peribronchovascular interstitial thickening is seen in many diseases, when it was present as a predominant high-resolution CT finding, this sign had high diagnostic accuracy with substantial interobserver agreement.
GGO is a common finding at high-resolution CT and is seen in a wide range of diagnoses, making it nonspecific. Diagnoses associated with GGO include UIP, NSIP, DIP, hypersensitivity pneumonitis, infection, edema, sarcoidosis, respiratory bronchiolitis, vascular disease with pulmonary hemorrhage, pulmonary edema, and chronic pulmonary hypertension [26]. At least one of our readers considered GGO a major finding in UIP, sarcoid, hypersensitivity pneumonitis, DIP, eosinophilic granuloma, lymphocytic interstitial pneumonia, respiratory bronchiolitis–interstitial lung disease, bronchiolitis obliterans, and lymphoma. GGO has been reported [27] to perform poorly in differentiation of underlying causes even when its distribution at the lobular level is taken into consideration.
If diffuse lung disease is clinically suspected and GGO is the predominant high-resolution CT finding, lung biopsy usually is warranted in the absence of a history of collagen vascular disease or in the absence of acute pulmonary symptoms in which NSIP and hypersensitivity pneumonitis are the predominant disease processes. GGO on high-resolution CT may correspond histologically to either inflammation or fibrosis. In a series of 26 patients with the predominant or exclusive finding of GGO on high-resolution CT, Remy-Jardin et al. [28] found that 54% of the patients had fibrosis and 65% had inflammation. In our study, when GGO was a predominant finding, readers had a wide range (44.2–75.5%) of accuracy of first-choice diagnoses; the accuracy range within the three main diagnoses was 65.4–89.8%. The accuracy of GGO as a sole or predominant finding improved when GGO was combined with honeycombing. Lower-lung distribution also improved the accuracy of GGO as a predominant finding. Remy-Jardin et al. reported good interobserver agreement for scoring GGO on high-resolution CT scans. Daniloff et al. [29], in a study involving patients with chronic beryllium disease, found moderate interobserver agreement on the identification of GGO.
Our readers had moderate (mean kappa value, 0.56) interobserver agreement on identifying GGO as a predominant finding and as either a predominant or minor finding (mean kappa value, 0.47). The readers also achieved fair (mean kappa value, 0.32) and moderate (mean kappa value, 0.42) agreement on the distribution of GGO in the craniocaudal and axial distributions. Although the readers moderately agreed on the presence of GGO, this sign remains a nonspecific finding when it is the predominant finding at high-resolution CT, unless honeycombing also is present or GGO is mainly distributed in the lower part of the lungs.
In our study, the diagnostic accuracy of the finding of GGO alone or in combination with its distribution or the presence of honeycombing or reticulation might have had limitations because the radiologic and pathologic definitions of NSIP were not available during the study period. Even with a clear definition of NSIP, however, considerable overlap continues to exist between NSIP and UIP [30]. Although no expiratory images were obtained in our study, the number of patients with disease in which the finding of air trapping might have been helpful was small, and this factor probably had little effect. In addition, the readers were blinded to all clinical information and unaware of symptoms, results of pulmonary function tests, and findings with soft-tissue windows that may have altered the primary or three main diagnoses and influenced accuracy.
We recognize that the accuracy of expert thoracic radiologists who have considerable experience with high-resolution CT may not be generalizable to all radiologists. For reference standard verification, only histologically proven disease entities were included in the study, resulting in an unusually high number of cases of rare forms of diffuse lung disease. Although the diseases studied may not reflect the common scenario, the choice of cases probably did not have a major negative effect on transferability of the findings to routine clinical practice. Appreciation of a particular high-resolution CT finding and realization of that finding as major might have been variable among our readers and would have affected diagnosis.
The predominant finding of honeycombing, lung cysts, or peribronchovascular thickening on high-resolution CT scans independently results in consistently high diagnostic accuracy with satisfactory interobserver agreement. We found further evidence of the reliability of these three of the four major findings. Although honeycombing on high-resolution CT reliably suggests the histologic diagnosis, if lung cyst or peribronchovascular thickening is present as a dominant finding, the question whether a confident diagnosis at high-resolution CT renders histologic confirmation unnecessary remains unanswered. The answer to that question was beyond the scope of our study. The accuracy of GGO as a predominant finding, which was low, improved in the presence of honeycombing or reticulation as a coexisting finding, as did the accuracy of diffuse GGO in the craniocaudal dimension. When GGO is a predominant finding, lung biopsy often is warranted to establish the final diagnosis of diffuse lung disease. Multiple variables can confuse the clinical features of a patient with diffuse lung disease. It is important to perform an unbiased initial analysis based on the major findings and the distribution of the findings at high-resolution CT and to refine the radiologic interpretation on the basis of the entire clinical scenario to arrive at the correct final diagnosis of diffuse lung disease.
Acknowledgments
We sincerely thank Shokoufeh Khalatbari and Jamie D. Myles of the Michigan
Institute for Clinical Health Research for their help in statistical
analysis.