DOI:10.2214/AJR.04.1225
AJR 2005; 185:973-978
© American Roentgen Ray Society


Original Research

Pulmonary Nodule Detection with Low-Dose CT of the Lung: Agreement Among Radiologists

Joseph K. Leader¹, Thomas E. Warfel¹, Carl R. Fuhrman¹, Sara K. Golla¹, Joel L. Weissfeld², Ricardo S. Avila³, Wesly D. Turner³ and Bin Zheng¹

¹ Department of Radiology, Imaging Research Division, University of Pittsburgh, 300 Halket St., Ste. 4200, Pittsburgh, PA 15213.
² Departments of Medicine and Epidemiology and University of Pittsburgh Cancer Institute, University of Pittsburgh, Pittsburgh, PA 15213.
³ General Electric Global Research Center, One Research Cir., Niskayuna, NY 12309.

Received August 2, 2004; accepted after revision November 19, 2004.

 
Supported in part by GE Healthcare, Waukesha, WI, and by grant number P50 CA90440 from the National Cancer Institute, Specialized Program of Research Excellence (SPORE) in Lung Cancer at the University of Pittsburgh, National Institutes of Health, Bethesda, MD.

Address correspondence to J. K. Leader (leaderjk@upmc.edu).


Abstract
OBJECTIVE. The purpose of our study was to assess relative intra- and interobserver agreement in detecting pulmonary nodules when interpreting low-dose chest CT screening examinations.

MATERIALS AND METHODS. Two hundred ninety-three selected low-dose CT examinations of the lung were independently interpreted by three radiologists to detect and classify pulmonary nodules. The data set selected was enriched with examinations depicting pulmonary nodules. A subset of 30 examinations was interpreted twice. All pulmonary nodules greater than 1.0 mm were marked. All nodules greater than 3.0 mm were marked, measured, and scored as to their probability of being benign or malignant. Nodule-based and examination-based relative reviewer agreements were evaluated using percentage of agreement and kappa statistics. Similar assessments were performed on the subset of examinations interpreted twice.

RESULTS. The three radiologists identified a total of 470, 729, and 876 pulmonary nodules of which 395, 641, and 778 were rated as noncalcified with some level of suspicion for being malignant. Nodule-based interobserver agreement among the radiologists was poor (highest kappa value in a paired comparison, 0.120). Examination-based agreement was higher (highest kappa value in a paired comparison, 0.458). Intraobserver agreement was higher than interobserver agreement for examination-based agreement (highest κ = 0.889) but lower for nodule-based agreement (highest κ = -0.035). Agreement improved as the suspicion of malignancy increased.

CONCLUSION. Unaided intra- and interobserver agreement in detecting pulmonary nodules in low-dose CT of the lung is relatively low. Computer-assisted detection may provide the consistency that is needed for this purpose.


Introduction
Lung cancer is one of the most prevalent cancers, with an estimated 173,770 new cases and 160,440 deaths attributed to the disease in 2004 in the United States alone [1]. Early detection of lung cancer has been associated with improved outcome [2-4]. Lung cancer screening may ultimately enable earlier detection and improve outcome. However, lung cancer screening outside of research protocols has been controversial and to date has not been recommended by any major health care organization [5-8]. One of the concerns regarding screening for the early detection of lung cancer is the possibility of unwarranted, potentially harmful management of false-positive detections.

Lung cancer had been commonly detected and diagnosed clinically or on chest radiography, but since the early 1990s X-ray CT has been reported to improve detection and characterization of both benign and malignant pulmonary nodules [9-11]. Lung cancer screening is currently implemented using low-dose CT examinations, which are generally defined as scanning techniques that use less than 100 mAs [12-14]. There are several methodologic issues regarding the optimal practice for low-dose CT screening (e.g., tube current, pitch, section thickness, reviewing format) [15-20]. In addition, the general desire to reduce motion artifacts and improve spatial resolution by rapid image acquisition with thinner image sections has resulted in advances in CT technology (e.g., multidetector scanners). Hence, the typical examination generates large-volume data sets. These large data sets challenge both the display systems and the interpreting radiologist.

Interobserver agreement for the detection of individual pulmonary nodules is reported to be relatively poor [15, 21]; one study reported a large number of missed nodules on retrospective review [22]. However, one study reported excellent interobserver agreement for examination-based interpretations, that is, whether any nodule was visible in a complete examination of the lung [10]. Reports describing interobserver agreement for sizing nodules have been mixed [21, 23, 24]. To our knowledge, there are no reports of intraobserver agreement for either examination-based interpretation or detection of individual nodules. Computer-assisted detection (CAD) schemes and nodule characterization algorithms are being developed to aid the radiologist during the interpretation of chest CT examinations [25-32]. These tools have the potential to improve radiologists' performance [25-29]. Because "ground truth" (i.e., benign vs malignant finding) is unknown for those pulmonary nodules that are not biopsied or resected, studies typically use the consensus of expert radiologists as the reference when evaluating the performance of radiologists (without and with CAD support).

In this study we assess the relative intra- and interobserver agreement for pulmonary nodule detection when using low-dose, thin-section CT examinations for the early detection of lung cancer. The term "relative intra- or interobserver agreement" indicates that observer agreement was evaluated against the other observers or themselves and not against a consensus panel or verified truth (outcome).


Materials and Methods
Subjects
We used chest CT examinations of 293 subjects participating in a lung cancer screening program that were obtained under an institutional review board-approved protocol (consent was obtained). The selected examinations were a subset from those ascertained as part of a larger study (Specialized Program of Research Excellence [SPORE] in Lung Cancer) designed to evaluate lung cancer screening with low-dose CT examinations. Examinations for this study were selected by the institutional principal investigator of the SPORE project, who assembled a limited data set enriched with examinations originally reported as depicting pulmonary nodules. The mean age of the subjects whose examinations were included in this study was 60.9 years (range, 50-80 years). All patient information was removed from the examinations and image data were made anonymous and given an examination number. A separate file containing the participant identifiers, participant-related information, and the original report was provided to an "honest broker" for generating summary tables. The honest broker maintained the participants' protected health information and shielded it from the investigators and observers.

MDCT Data Acquisition
The CT examinations were performed using LightSpeed Plus 4-MDCT (n = 282) or LightSpeed Ultra 8-MDCT (n = 11) scanners (GE Healthcare). The helical CT scans were contiguous (nonoverlapping) volume scans encompassing the entire lung area acquired with 2.5-mm section thickness in the axial plane. Images were reconstructed with 512 x 512 pixel matrices using the GE Healthcare lung reconstruction kernel. The low-dose CT acquisition protocol varied slightly depending on patient size: tube voltage range, 120-140 kVp; mean tube current, 29.7 ± 10.7 (SD) mAs; and range of pixel dimensions, 0.60-0.98 mm. The CT examinations were acquired with an end-inspiratory breath-holding protocol.

Participating Radiologists
Three board-certified radiologists with 3, 21, and 24 years of experience in interpreting chest examinations (radiography and CT) participated as observers in the study. Two of the three specialize in thoracic imaging and the third is a general radiologist who routinely interprets chest imaging examinations, among others. All observers participated in the past in different observer performance studies and are familiar with our general procedures in performing these undertakings. They performed the interpretations at their own pace as time permitted, and all interpretations were completed during a 23-week period. After a minimum separation of 13 weeks, the same radiologists again interpreted (reinterpreted) a subset of 30 selected cases during an 8-week period. The project leader selected 30 cases that were a representative sample of the entire data set in terms of the number, size, and clinical importance of pulmonary nodules.

Interpretation Protocol
A GE Healthcare Advantage Workstation running Advanced Lung Analysis 1 (ALA) software was used to review and rate the CT examinations. The workstation was placed in the main thoracic radiology reading room for convenience of the participating radiologists, and reviewers were notified by the project leader if they fell substantially behind in the planned interpretation schedule. The full functionality of the ALA software was available to the participating radiologists (e.g., window and level settings, zoom, cine mode, and maximum intensity projection [MIP]). The radiologist reviewed the CT images and identified (marked) suspected pulmonary nodules. For each marked nodule, a computerized scoring form regarding the nodule's size and characteristics was completed using the computer mouse and keyboard (Table 1). Features recorded on the scoring form included the location, size, presence or absence of calcification, density, surface, cavitation, fat, pleural attachment, and clinical importance in terms of a 5-category ordinal scoring system as to whether the nodule in question was likely to be benign or malignant (Table 1). Nodule size was measured using the digital calipers in the ALA software. The presence or absence of calcification could be determined by changing the window and level values.


TABLE 1: Categories for Scoring Pulmonary Nodules

 

Before beginning the interpretations, observers received a detailed Instructions to Observers form that described the task at hand and specifically identified the primary and secondary questions to be answered in each study. Observers were then trained in the use of the workstation for the study. The purpose of the study and the nature of the examinations to be reviewed were explained in general terms, but the mix of positive and negative examinations was not provided. Observers were told that we used an "enriched data set."

Solid nodules were defined as any pulmonary (or pleural) lesion represented on a chest CT image (displayed on lung windows) as sharply defined, discrete, and nearly circular soft-tissue-density opacity with a diameter measuring between 1.0 and 30.0 mm. Nonsolid nodules (e.g., ground-glass opacity) were defined in the same manner except that the density of the opacity was not solid or soft-tissue attenuation but less than soft tissue, allowing visualization of background structures (e.g., blood vessels). Partially solid nodules (i.e., mixed) were defined as a combination of solid and nonsolid nodules. Reviewers were asked to mark the location of all three nodule types larger than 1.0 mm and to provide characterization information only for those larger than 3.0 mm. However, the question regarding calcification (calcified or noncalcified) was responded to for all marked nodules regardless of size.

Data and Statistical Analysis
All pulmonary nodules detected by at least one of the three radiologists were tabulated and analyzed. This was done because there was no verified outcome for most of the cases (and nodules). A consensus score of the three reviewers was determined for nodule size, calcification, and clinical importance. Individual reviewer's measured nodule size was defined as the maximum of the length and width on one individual section depicting the nodule. The consensus nodule size was computed as the average reported size as indicated by those reviewers detecting (marking) the nodule in question. If a reviewer did not score the length and width of an identified nodule (i.e., for nodules < 3.0 mm), his or her reporting was ignored during the computation of a consensus size. A nodule was defined as a noncalcified nodule (NCN) when all reviewers rated it as noncalcified. If one radiologist rated a nodule as calcified, the consensus rating was defined as calcified. The density consensus rating was the minimum density rating among the radiologists detecting the specific nodule. In other words, if one radiologist rated a nodule nonsolid, the assigned consensus rating was nonsolid. The hierarchy went from solid as the highest assigned density to nonsolid as the lowest assigned density. The consensus scoring as to whether an individual nodule represented cancer was defined as the highest scoring assigned to the nodule among the reviewers detecting that nodule (Table 1).
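The consensus rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' software: the `Rating` record, its field names, and the numeric density ordering (with partially solid placed between nonsolid and solid) are hypothetical, and the example values are invented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed ordinal ranking of density: nonsolid is lowest, solid is highest.
DENSITY_RANK = {"nonsolid": 0, "partially solid": 1, "solid": 2}


@dataclass
class Rating:
    """One reviewer's record for one marked nodule (illustrative field names)."""
    length_mm: Optional[float]  # None when the reviewer did not measure the nodule
    width_mm: Optional[float]
    calcified: bool
    density: str                # a key of DENSITY_RANK
    clinical_importance: int    # ordinal score, 0 (definitely not cancer) to 4


def consensus(ratings: Sequence[Rating]) -> dict:
    """Combine the ratings of the reviewers who marked the same nodule."""
    if not ratings:
        raise ValueError("at least one reviewer must have marked the nodule")
    # A reviewer's size is the larger of length and width; unmeasured
    # ratings are ignored when averaging.
    sizes = [
        max(r.length_mm, r.width_mm)
        for r in ratings
        if r.length_mm is not None and r.width_mm is not None
    ]
    return {
        "size_mm": sum(sizes) / len(sizes) if sizes else None,
        # Calcified if any reviewer said so; noncalcified only if all agree.
        "calcified": any(r.calcified for r in ratings),
        # Lowest density rating among the reviewers.
        "density": min((r.density for r in ratings), key=lambda d: DENSITY_RANK[d]),
        # Highest suspicion score among the reviewers.
        "clinical_importance": max(r.clinical_importance for r in ratings),
    }


# Invented example: three reviewers marked the same nodule, one did not measure it.
result = consensus([
    Rating(6.0, 4.0, False, "solid", 1),
    Rating(8.0, 5.0, False, "partially solid", 3),
    Rating(None, None, False, "solid", 2),
])
print(result)
```

Note the asymmetry in these rules: size is averaged, whereas calcification, density, and clinical importance each take the most extreme rating, so a single reviewer can determine the consensus category.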

Descriptive statistics were tabulated for all nodules and for all NCNs with a consensus "clinical importance" equal to or greater than 1. Each group (all nodules, all NCNs) was further stratified by size as follows: less than 3.0 mm, equal to or greater than 3.0 mm but less than 10.0 mm, and equal to or greater than 10.0 mm but less than 30.0 mm. If all nodules observed (marked) by a reviewer in an examination were discounted during the categorization of group 2 (i.e., calcified nodules or those scored "definitely not cancer" [clinical importance = 0]), the examination was considered to be negative (no marked nodules). The number and percentage of missed (not marked) nodules by at least one reviewer were calculated and stratified by clinical importance, size, and density. To evaluate reviewer variability in measuring nodule size, we also calculated the fraction of nodules with differences in reported size that were greater than 3.0 mm among the reviewers and the absolute difference between two reviewers' reported sizes as a percentage of their mean reported size.
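The size strata and the two size-variability measures described above can be written out as follows. This is a minimal sketch: the function names are hypothetical, the example measurements are invented, and the `inclusive` switch is an added convenience, not something the study defines.

```python
from typing import Sequence, Tuple


def size_stratum(size_mm: float) -> str:
    """Assign a consensus nodule size to one of the three strata."""
    if size_mm < 3.0:
        return "<3.0 mm"
    if size_mm < 10.0:
        return "3.0 to <10.0 mm"
    if size_mm < 30.0:
        return "10.0 to <30.0 mm"
    raise ValueError("sizes of 30.0 mm or more fall outside the strata")


def percent_difference(size_a: float, size_b: float) -> float:
    """Absolute difference between two reported sizes as a percentage of their mean."""
    mean = (size_a + size_b) / 2.0
    return abs(size_a - size_b) / mean * 100.0


def fraction_exceeding(
    pairs: Sequence[Tuple[float, float]],
    threshold_mm: float = 3.0,
    inclusive: bool = False,
) -> float:
    """Fraction of paired measurements whose difference exceeds the threshold.

    With inclusive=True a difference equal to the threshold also counts.
    """
    if not pairs:
        raise ValueError("no paired measurements")
    if inclusive:
        hits = sum(abs(a - b) >= threshold_mm for a, b in pairs)
    else:
        hits = sum(abs(a - b) > threshold_mm for a, b in pairs)
    return hits / len(pairs)


# Invented measurements (mm) for nodules marked by both reviewers of a pair.
pairs = [(6.0, 8.0), (4.0, 9.0), (12.0, 12.5), (5.0, 8.0)]
print(size_stratum(7.0))
print(percent_difference(6.0, 8.0))
print(fraction_exceeding(pairs))
```

Only nodules measured by both reviewers of a pair can enter these calculations, which is why the denominators for the pairwise comparisons are smaller than the total nodule count.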

Relative reviewer agreement was evaluated for NCNs with clinical importance equal to or greater than 1 as follows: nodule-based, for individual nodules and negative examinations equally weighted; and examination-based, for positive examinations with one to six nodules and negative examinations with no marked nodules or those with more than six marked nodules. In the former, interobserver agreement was based on all nodules observed by any of the three observers; hence, a reviewer not involved in specific paired analyses could influence the measured agreement. In other words, when reviewer 1 marked a specific nodule that was not marked by either reviewer 2 or reviewer 3, reviewers 2 and 3 were considered to be in agreement. Percentage of agreement and kappa values for paired reviewers were used as measures of intra- and interobserver agreement in each of the analyses for the different categories. Finally, the positive predictive value of one reviewer scoring an NCN with clinical importance equal to or greater than 3 predicting another reviewer scoring the same nodule with clinical importance equal to or greater than 1 was computed for the six reviewer pairings. Specifically, if a reviewer was concerned about a detected nodule, what was the probability that another reviewer detected the same nodule and assigned it a clinical importance equal to or greater than 1?
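To make these agreement measures concrete, the sketch below computes percentage of agreement, a kappa value, and the positive predictive value for one reviewer pair. It assumes that the kappa statistic is the standard two-rater Cohen's kappa, which the text does not state explicitly. All function names and data are hypothetical.

```python
from typing import Optional, Sequence


def exam_is_positive(n_marked: int) -> bool:
    """Examination-based rule: positive with one to six marked nodules,
    negative with none or with more than six."""
    return 1 <= n_marked <= 6


def percent_agreement(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of items (nodules or examinations) on which two readings agree."""
    if len(a) != len(b) or not a:
        raise ValueError("readings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two binary readings: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    pa, pb = sum(a) / n, sum(b) / n
    # Chance agreement: both positive by chance plus both negative by chance.
    p_e = pa * pb + (1 - pa) * (1 - pb)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


def ppv_of_concern(
    scores_a: Sequence[Optional[int]], scores_b: Sequence[Optional[int]]
) -> float:
    """Among nodules reviewer A scored >= 3, the fraction that reviewer B
    also marked and scored >= 1. None means the nodule was not marked."""
    b_for_concerning = [
        sb for sa, sb in zip(scores_a, scores_b) if sa is not None and sa >= 3
    ]
    if not b_for_concerning:
        raise ValueError("reviewer A scored no nodule >= 3")
    hits = sum(sb is not None and sb >= 1 for sb in b_for_concerning)
    return hits / len(b_for_concerning)


# Invented pooled list of eight nodules, each marked by at least one of the
# three reviewers; True means this reviewer marked the nodule.
r1 = [True, True, False, False, True, False, False, True]
r2 = [True, False, True, False, True, False, True, False]
print(percent_agreement(r1, r2), cohen_kappa(r1, r2))

# Invented clinical-importance scores for five nodules.
print(ppv_of_concern([3, 4, 3, 1, None], [1, None, 0, 2, 3]))
```

In this toy example the pair agrees on half of the nodules, which is exactly what chance predicts from their marking rates, so kappa is 0. The two positions where both entries are False correspond to nodules marked only by the third reviewer, which is how a reviewer outside the pair contributes to the pair's measured agreement.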


Results
The three reviewers identified a total of 1,317 pulmonary nodules in 293 CT examinations, with 16 examinations rated as negative by all three reviewers (Table 2). The size distribution of these nodules as measured by the radiologists and classified by consensus was as follows: 21.3% (280/1,317) were sized as less than 3.0 mm, 63.7% (839/1,317) were equal to or greater than 3.0 mm but smaller than 10.0 mm, and 15.0% (198/1,317) were equal to or greater than 10.0 mm but smaller than 30.0 mm. There were 1,136 NCNs identified and scored for clinical importance equal to or greater than 1 with 547, 509, 75, and 5 of these scored with a clinical importance of 1, 2, 3, and 4, respectively. This classification resulted in 20 examinations rated as negative by all three reviewers. The size distribution of these nodules was 17.2% (195/1,136), 66.8% (759/1,136), and 16.0% (182/1,136) for nodules less than 3.0 mm, equal to or greater than 3.0 mm but smaller than 10.0 mm, and equal to or greater than 10.0 mm but smaller than 30.0 mm, respectively. The number of these nodules rated as solid, partially solid, and nonsolid density were 727, 305, and 104, respectively. Reviewers 1, 2, and 3 marked at least one NCN with clinical importance equal to or greater than 1 (but not more than six NCNs) in 195 (66.6%), 216 (73.7%), and 230 (78.5%) of the 293 CT examinations, respectively.


TABLE 2: Nodules and Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1)

 

The interpretation sessions were not monitored or timed; however, anecdotally, the reviewers reported that they typically serially scrolled through each examination image by image and occasionally used the MIP feature, with a typical interpretation time of 4-10 min per case. The most common tactic reported by the reviewers was to review one lung at a time.

Measurements of nodule size were inconsistent among the three reviewers, with relatively large differences reported. Fifty-nine percent of all nodule size measurement pairings for the three reviewers had a size difference equal to or greater than 1.0 mm. The percentages of measured size differences equal to or greater than 3.0 mm were 14.5% (39/269), 11.1% (29/261), and 13.7% (57/415) for nodules reported by both reviewers for the pairings of reviewers 1 and 2, 1 and 3, and 2 and 3, respectively. The mean absolute percentages of differences (percentage of the mean size) between the reported sizes for the pairing of reviewers 1 and 2, 1 and 3, and 2 and 3 were 27.0% ± 23.2%, 16.3% ± 16.3%, and 30.0% ± 25.9%, respectively.

Intraobserver agreement was poor in the 30 repeated examinations for the detection of individual nodules (highest κ = -0.035) but was good to excellent in the examination-based evaluation (i.e., all examinations with one or more detected nodules) (Table 3). Reviewer 1 had the highest intraobserver agreement for the detection of individual nodules, but that reviewer also detected the lowest number of nodules. Conversely, reviewer 3 had the highest intraobserver agreement in the examination-based evaluation (κ = 0.889) and detected the highest number of nodules.


TABLE 3: Intraobserver Agreement for Marked Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1) and Negative Examinations (n = 30) That Were Interpreted Twice

 

Interobserver agreement was poor for the detection of individual nodules and marginal for examination-based evaluation (i.e., for examinations with one or more detected nodules) (Table 4). The agreement between any pair of radiologists was less than 55% for the detection of individual NCNs with a clinical importance equal to or greater than 1 based on all nodules detected by the three reviewers. The interobserver agreement among the three reviewers was 18.9%. The agreement between reviewers 1 and 2 was the highest (κ = 0.120); those two reviewers detected the lowest number of nodules. With respect to the group, interobserver agreement for the detection of individual nodules improved with increasing nodule size, but agreement between pairs of observers was not consistent as a function of nodule size. Examination-based interobserver agreement was relatively constant between pairs of observers. In addition, examination-based interobserver agreement improved as a function of scored clinical importance (data not shown). Specifically, agreement increased as the suspicion of malignancy increased.


TABLE 4: Interobserver Agreement for Marked Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1) and Negative Examinations

 

Positive predictive value for NCNs of concern (clinical importance ≥ 3) to one reviewer ranged from 59.5% to 85.4% for the six comparisons. In other words, when one reviewer scored a nodule with a clinical importance of 3 or greater, most of these were detected and scored with at least a clinical importance score of 1 by another reviewer.

The fraction (percentage) of nodules missed (not marked) by the three reviewers was relatively high for all levels of clinical importance, sizes, and densities (Table 5). The fraction of missed nodules decreased as clinical importance and size of the nodules increased. There was no correlation between missed nodules and nodule density, and the fraction of missed nodules was greater for solid nodules than for either partially solid or nonsolid nodules. The fraction and type of the missed nodules during the individual reviewer's first and second interpretations (intraobserver analysis) were similar to those found for the interobserver missed nodules (data not shown).


TABLE 5: Distribution of Missed (Not Marked) Noncalcified Nodules (NCNs) with Some Suspicion of Malignancy (Clinical Importance ≥ 1) by Rated Clinical Importance, Size, and Density

 


Discussion
Detection of pulmonary nodules depicted on low-dose, thin-section CT examinations of the chest may become an important part of the daily tasks of many radiologists, both those who specialize in thoracic imaging and general radiologists. We observed a large intra- and interradiologist variability in the detection of individual pulmonary nodules among the three reviewers in terms of relative reviewer agreement. As expected, the more nodules detected by a reviewer, the greater the variability and the lower the agreement with oneself during repeated interpretations and with other radiologists. Intraobserver agreement for examination-based interpretation was reasonably high, but agreement between reviewers in examination-based interpretation was low and perhaps marginal in terms of clinical acceptability. Although positive predictive value was not perfect, the fact that most NCNs considered probably or definitely cancer as determined by one radiologist were at least detected and considered to be of some clinical importance by the other radiologists (as determined by the positive predictive value) is encouraging. Our results, although somewhat surprising in terms of the relatively low agreement we found, are optimistic in that all participating radiologists recognized the potential problems we face and were quite interested in understanding the issues that affect their differences in detection and reporting.

If screening for early detection of lung cancer proves to be efficacious, the volume of examinations and image data to be interpreted will increase rapidly, and therefore interpretation efficiency and providing consistent results will be important goals. One important diagnostic step that needs to be better understood is the variability among radiologists interpreting chest CT examinations in lung cancer screening programs, and there are few data in this regard. Two smaller studies on examinations with high prevalence and a relatively large number of pulmonary nodules per examination suggest that agreement among radiologists in the detection of individual pulmonary nodules is poor [15, 21]. However, a larger study reported that examination-based agreement (i.e., does an examination depict any nodules?) is quite high, but the CT images in that study were reviewed at a thickness of 5.0 mm and therefore targeted larger nodules than our study [10]. Our results are in general agreement with the reported agreement levels of these studies in regard to both individual nodule detection and examination-based evaluation. As expected, both intra- and interobserver agreement was much better in examination-based interpretations than in the detection of individual pulmonary nodules. More important perhaps is the increasing level of agreement with increasing suspicion that a finding represents cancer.

However, one must remember that these results are directly affected by the number and type of pulmonary nodules depicted in each chest CT examination. The fact is that reviewer agreement per examination may not be a clinically relevant index of performance in lung cancer screening, particularly after the initial screening examination. Our study suggests that even in a laboratory experiment when the task at hand is well defined, agreement among radiologists per individual pulmonary nodule is relatively poor, and the fraction of missed (not marked) nodules is high across all sizes and types (i.e., clinical importance and density) of nodules, but especially in smaller nodules and nodules rated as having low clinical importance. Both of these findings highlight the need for training and standardization of reporting in this area. These observations lead to the idea that even if a CAD scheme does not perform at an extremely high level, its use could help reduce variability and improve agreement within and among reviewers. In addition to possible improvement in reviewer consistency, the use of CAD may also improve consistency and possibly accuracy in the estimates of the size of pulmonary nodules, which will be an increasingly important parameter for assessing change over time in repeated examinations.

In clinical protocols that use a nodule size threshold to initiate "action" (or no action), even a relatively small difference in size measurement may be clinically significant in that different management decisions may result. We observed inconsistent measurements of nodule size and relatively large measurement differences, findings that agree with other studies [21, 23, 24]. Although the magnitude of the size differences reported by Wormanns et al. [21] is similar to our findings and those of Revel et al. [23], Wormanns et al. reported good agreement based on Pearson's correlation coefficients. For our study, we defined nodule size as maximum length, but there are a number of methods (e.g., linear, area, volumetric) to do so [24, 33, 34]. The most reliable method for indicating or predicting malignancy has yet to be determined [24, 33, 34].

Our study has several limitations. First, we used chest CT examinations performed at one institution and interpreted by a group of reviewers who are largely academic radiologists. Additional data are needed from other institutions and for other types of radiologists, but we suspect that the results will not differ substantially. Perhaps more training might have reduced the reviewer variability, but we believe that observers who routinely interpret chest CT examinations should be proficient at detecting pulmonary nodules. This laboratory experiment may or may not be generalized to the clinical environment. However, it will take several years before actual data on performance in the clinical environment are available, in particular for examinations with verified outcome. We focused on relative reviewer agreement in the detection of pulmonary nodules, but the ultimate performance index of interest will be the agreement in diagnosis and recommendation for follow-up. Unfortunately, a number of the pulmonary nodules detected in our study are currently being followed and the verified diagnostic outcome (pathology or otherwise) for most patients is unavailable. A consensus panel could have been used to define a gold standard (reference) for each case, but it would not have affected relative agreement (or lack thereof) among reviewers. Because of the breadth of missed (not detected) nodules, we do not believe a consensus review would reveal a common cause for missed nodules in this study. Our low-dose screening protocol, which used a somewhat lower dose than is typically used, may have resulted in an increased variability. Data entry errors due to the multiple uses of a computer mouse and keyboard may have also contributed to the variability, but we do not believe they substantially affected any of the results or conclusions presented here. Finally, because of the asymmetric distribution of examinations (i.e., the number of examinations without nodules was low), often one cell on the diagonal in the contingency tables for measuring agreement was relatively small, particularly in the intraobserver analysis. Therefore, kappa values may not have been the most appropriate measure of agreement.
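A worked example shows why a small diagonal cell makes kappa hard to interpret. The counts below are hypothetical and are not taken from the study's tables. They assume the standard two-rater kappa and a 2 x 2 table for one reviewer's two readings of n = 30 examinations.

```latex
% Hypothetical counts (not study data), n = 30 examinations read twice:
%   a = 26  positive on both readings
%   b = 2   positive on the first reading only
%   c = 2   positive on the second reading only
%   d = 0   negative on both readings (the small diagonal cell)
\begin{align*}
p_o &= \frac{a + d}{n} = \frac{26 + 0}{30} \approx 0.867 \\
p_e &= \frac{(a + b)(a + c) + (c + d)(b + d)}{n^2}
     = \frac{28 \cdot 28 + 2 \cdot 2}{900} \approx 0.876 \\
\kappa &= \frac{p_o - p_e}{1 - p_e}
        = \frac{780/900 - 788/900}{1 - 788/900}
        = \frac{-8}{112} \approx -0.071
\end{align*}
```

In this constructed case the two readings agree on about 87% of the examinations, yet kappa is slightly negative because nearly all of that agreement is already expected by chance when almost every examination is positive.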

In conclusion, our study indicates that detecting pulmonary nodules depicted in large-volume CT examinations is a daunting task that requires vigilance and diligence. Although intraobserver agreement was reasonably good in the examination-based analysis, intraobserver agreement was poor for the detection of individual nodules. This suggests that there may be a need for the development of consistent search criteria and standardized reporting practices. If nothing else, this preliminary study clearly suggests that there are significant observer-related issues that cannot be ignored regarding the use of low-dose chest CT examinations for the early detection of pulmonary nodules and lung cancer.


References

  1. Jemal A, Tiwari RC, Murray T, et al. Cancer statistics, 2004. CA Cancer J Clin 2004; 54:8-29
  2. Flehinger BJ, Kimmel M, Melamed MR. The effect of surgical treatment on survival from early lung cancer: implications for screening. Chest 1992; 101:1013-1018
  3. Sobue T, Suzuki T, Matsuda M, Kuroishi T, Ikeda S, Naruke T. Survival for clinical stage I lung cancer not surgically treated: comparison between screen-detected and symptom-detected cases. The Japanese Lung Cancer Screening Research Group. Cancer 1992; 69:685-692
  4. Greenwald HP, Polissar NL, Borgatta EF, McCorkle R, Goodman G. Social factors, treatment, and survival in early-stage non-small cell lung cancer. Am J Public Health 1998; 88:1681-1684
  5. Humphrey LL, Teutsch S, Johnson M. Lung cancer screening with sputum cytologic examination, chest radiography, and computed tomography: an update for the U.S. Preventive Services Task Force. Ann Intern Med 2004; 140:740-753
  6. Smith RA, Cokkinides V, Eyre HJ. American Cancer Society guidelines for early detection of cancer, 2004. CA Cancer J Clin 2004; 54:41-52
  7. Aberle DR, Gamsu G, Henschke CI, Naidich DP, Swensen SJ. A consensus statement of the Society of Thoracic Radiology: screening for lung cancer with helical computed tomography. J Thorac Imaging 2001; 16:65-68
  8. Stanley RJ. Inherent dangers in radiologic screening. AJR 2001; 177:989-992
  9. Tsubamoto M, Kuriyama K, Kido S, et al. Detection of lung cancer on chest radiographs: analysis on the basis of size and extent of ground-glass opacity at thin-section CT. Radiology 2002; 224:139-144
  10. Henschke CI, McCauley DI, Yankelevitz DF, et al. Early Lung Cancer Action Project: overall design and findings from baseline screening. Lancet 1999; 354:99-105
  11. Sone S, Takashima S, Li F, et al. Mass screening for lung cancer with mobile spiral computed tomography scanner. Lancet 1998; 351:1242-1245
  12. Henschke CI, Yankelevitz DF, McCauley DI, Libby DM, Pasmantier MW, Smith JP. Guidelines for the use of spiral computed tomography in screening for lung cancer. Eur Respir J 2003; 39[suppl]:45S-51S
  13. Rusinek H, Naidich DP, McGuinness G, et al. Pulmonary nodule detection: low-dose versus conventional CT. Radiology 1998; 209:243-249
  14. Gartenschlager M, Schweden F, Gast K, et al. Pulmonary nodules: detection with low-dose vs conventional-dose spiral CT. Eur Radiol 1998; 8:609-614
  15. Gruden JF, Ouanounou S, Tigges S, Norris SD, Klausner TS. Incremental benefit of maximum-intensity-projection images on observer detection of small pulmonary nodules revealed by multidetector CT. AJR 2002; 179:149 -157[Abstract/Free Full Text]
  16. Itoh S, Ikeda M, Mori Y, et al. Lung: feasibility of a method for changing tube current during low-dose helical CT. Radiology 2002;224 : 905-912[Abstract/Free Full Text]
  17. Diederich S, Lentschig MG, Overbeck TR, Wormanns D, Heindel W. Detection of pulmonary nodules at spiral CT: comparison of maximum intensity projection sliding slabs and single-image reporting. Eur Radiol 2001; 11:1345 -1350[CrossRef][Medline]
  18. Diederich S, Lentschig MG, Winter F, Roos N, Bongartz G. Detection of pulmonary nodules with overlapping vs non-overlapping image reconstruction at spiral CT. Eur Radiol 1999;9 : 281-286[CrossRef][Medline]
  19. Wright AR, Collie DA, Williams JR, Hashemi-Malayeri B, Stevenson AJ, Turnbull CM. Pulmonary nodules: effect on detection of spiral CT pitch. Radiology 1996;199 : 837-841[Abstract/Free Full Text]
  20. Buckley JA, Scott WW Jr, Siegelman SS, et al. Pulmonary nodules: effect of increased data sampling on detection with spiral CT and confidence in diagnosis. Radiology 1995;196 : 395-400[Abstract/Free Full Text]
  21. Wormanns D, Diederich S, Lentschig MG, Winter F, Heindel W. Spiral CT of pulmonary nodules: interobserver variation in assessment of lesion size. Eur Radiol 2000;10 : 710-713[CrossRef][Medline]
  22. Hartman TE, Swensen SJ. Lung cancer screening with low-dose computed tomography. Semin Roentgenol2003; 38:34 -38[CrossRef][Medline]
  23. Revel MP, Bissery A, Bienvenu M, Aycard L, Lefort C, Frija G. Are two-dimensional CT measurements of small noncalcified pulmonary nodules reliable? Radiology 2004;231 : 453-458[Abstract/Free Full Text]
  24. Winer-Muram HT, Jennings SG, Tarver RD, et al. Volumetric growth rate of stage I lung cancer prior to treatment: serial CT scanning. Radiology 2002;223 : 798-805[Abstract/Free Full Text]
  25. McCulloch CC, Kaucic RA, Mendonca PR, Walter DJ, Avila RS. Model-based detection of lung nodules in computed tomography exams: thoracic computer-aided diagnosis. Acad Radiol2004; 11:258 -266[CrossRef][Medline]
  26. Awai K, Murao K, Ozawa A, et al. Pulmonary nodules at chest CT: effect of computer-aided diagnosis on radiologists' detection performance. Radiology 2004;230 : 347-352[Abstract/Free Full Text]
  27. Brown MS, Goldin JG, Suh RD, McNitt-Gray MF, Sayre JW, Aberle DR. Lung micronodules: automated method for detection at thin-section CT— initial experience. Radiology 2003;226 : 256-262[Abstract/Free Full Text]
  28. Armato SG 3rd, Li F, Giger ML, MacMahon H, Sone S, Doi K. Lung cancer: performance of automated lung nodule detection applied to cancers missed in a CT screening program. Radiology2002; 225:685 -692[Abstract/Free Full Text]
  29. Gurcan MN, Sahiner B, Petrick N, et al. Lung nodule detection on thoracic computed tomography images: preliminary evaluation of a computer-aided diagnosis system. Med Phys2002; 29:2552 -2558[CrossRef][Medline]
  30. Ko JP, Betke M. Chest CT: automated nodule detection and assessment of change over time—preliminary experience. Radiology 2001;218 : 267-273[Abstract/Free Full Text]
  31. Erasmus JJ, Connolly JE, McAdams HP, Roggli VL. Solitary pulmonary nodules. I. Morphologic evaluation for differentiation of benign and malignant lesions. RadioGraphics 2000;20 : 43-58[Abstract/Free Full Text]
  32. McNitt-Gray MF, Hart EM, Wyckoff N, Sayre JW, Goldin JG, Aberle DR. A pattern classification approach to characterizing solitary pulmonary nodules imaged on high resolution CT: preliminary results. Med Phys 1999; 26:880 -888[CrossRef][Medline]
  33. Jennings SG, Winer-Muram HT, Tarver RD, Farber MO. Lung tumor growth: assessment with CT—comparison of diameter and cross-sectional area with volume measurements. Radiology2004; 231:866 -871[Abstract/Free Full Text]
  34. Yankelevitz DF, Reeves AP, Kostis WJ, Zhao B, Henschke CI. Small pulmonary nodules: volumetrically determined growth rates based on CT evaluation. Radiology 2000;217 : 251-256[Abstract/Free Full Text]
