DOI:10.2214/AJR.05.2207
AJR 2007; 188:377-384
© American Roentgen Ray Society


Original Research

Effect of Computer-Aided Detection on Independent Double Reading of Paired Screen-Film and Full-Field Digital Screening Mammograms

Per Skaane1, Ashwini Kshirsagar2, Sandra Stapleton2,3, Kari Young1 and Ronald A. Castellino2

1 Department of Radiology, Breast Imaging Center, Ullevaal University Hospital, Kirkeveien 166, N-0407 Oslo, Norway.
2 R2 Technology, 2585 Augustine Dr., Santa Clara, CA.
3 Present address: 1720 Holt Ave., Los Altos, CA 94024.

Received December 22, 2005; accepted after revision May 25, 2006.

 
A. Kshirsagar, S. Stapleton, and R. A. Castellino are (or were at the time of our study) employees of R2 Technology, which makes the CAD system discussed herein.

Address correspondence to P. Skaane (per.skaane{at}ulleval.no).


Abstract
OBJECTIVE. The purpose of this study was to evaluate the performance and potential contribution of computer-aided detection (CAD) to independent double reading of paired screen-film and full-field digital screening mammograms.

MATERIALS AND METHODS. The cases of 3,683 women who underwent both screen-film mammography and full-field digital mammography (FFDM) with independent double reading for each technique were followed for 2 years to include cancers detected in the interval between screening rounds and cancers detected at the next screening round. Fifty-five biopsy-proven cancers were diagnosed. A baseline screening mammogram was defined as having positive findings if at least one of the two independent readers scored it 2 or higher on a 5-point rating scale. The baseline mammograms of interval (n = 10) or second-round (n = 16) cancers were retrospectively classified as overlooked (n = 2), minimal sign actionable (n = 8), minimal sign nonactionable (n = 5), and normal (n = 11). The baseline mammograms of these cases of cancer were evaluated with a CAD system, and the CAD results were compared (McNemar's test for paired proportions) with the findings at prospective independent double reading of mammograms obtained with each technique.

RESULTS. For FFDM, CAD sensitivity was 95% (37/39) compared with 64% (25/39) for double reading (p = 0.006), and for screen-film mammography, CAD sensitivity was 85% (33/39) compared with 77% (30/39) for prospective double reading (p = 0.57) of radiographically visible lesions in baseline mammograms. CAD correctly marked five (13%) of 39 cancers on screen-film mammography and 14 (36%) of 39 cancers on FFDM not detected at prospective independent double reading.

CONCLUSION. CAD showed the potential to increase the cancer detection rate for FFDM and for screen-film mammography in breast cancer screening performed with independent double reading.

Keywords: breast • breast cancer • computer-aided detection • digital images • mammography • screening


Introduction
The success of screening mammography depends on the detection of small and subtle lesions. Radiologists differ substantially in their interpretations of mammograms, and wide variability in the analysis of screening mammograms has been reported [1, 2]. Double reading can result in detection of additional cancers compared with single reading, and double reading with consensus or arbitration may achieve a higher cancer detection rate and a reduced recall rate [3].

Conventional screen-film mammography has been the technique of choice for screening programs. Full-field digital mammography (FFDM), however, offers several advantages, and its benefits are probably best realized with soft-copy display and interpretation of images. In three prospective studies comparing screen-film mammography and FFDM with soft-copy reading in screening populations, fewer cancers were detected with FFDM compared with screen-film mammography in the Colorado-Massachusetts study [4] and the Oslo I study [5], but more cancers were detected with FFDM in the Oslo II study [6]. The differences, however, were not statistically significant. The results of the large American College of Radiology Imaging Network trial [7] showed that the diagnostic accuracy of screen-film mammography was similar to that of FFDM. The accuracy of FFDM, however, was significantly higher than that of screen-film mammography among women younger than 50 years, women with heterogeneously dense and extremely dense breasts, and premenopausal and perimenopausal women.

Computer-aided detection (CAD) is designed to help radiologists increase the cancer detection rate and rate of detection of early-stage cancers and to decrease interobserver variability [8-11]. Both retrospective [8-11] and prospective [12-16] clinical studies have yielded evidence of the benefit of CAD in screen-film mammography. Two sequential-reading studies [8, 12] (in which the images are interpreted without and then with CAD input) from a community practice and an academic setting showed 19.5% and 7.4% increases, respectively, in the cancer detection rate. A large historical controlled study [14] (in which cancer detection rates were compared in pre-CAD and CAD periods) did not show an overall significant increase in cancer detection. Subsequent analysis [15] of the reported data showed a 19.7% increase in cancer detection by 17 of the 24 radiologists in the study who were in the low-volume category. In a study with a similar historical control design, Cupples et al. [16] found CAD was associated with a 16.1% increase in the cancer detection rate and, of importance, a 164% increase in the detection of invasive cancers 1.0 cm or smaller.

CAD systems are beneficial when they show malignant lesions that are visible and actionable on the image but that are overlooked by the radiologist (observational oversight) and when radiologists recognize and act on the missed or overlooked cancers identified with the CAD system. The former scenario is typically evaluated with retrospective studies, the ideal CAD system showing all cancers not identified by radiologists. The latter scenario can only truly be evaluated with prospective studies.

Although CAD with screen-film mammography has been evaluated in both retrospective and prospective studies, the performance of CAD with FFDM has not been widely established, even with retrospective studies, despite a general belief [7] that one key advantage of FFDM is easier implementation of CAD. One challenge with evaluating CAD with FFDM has been the lack of availability of databases from FFDM screening programs. The database from the Oslo I trial provides data on known cancers consecutively detected with FFDM screening. Because the Oslo I study was a paired screen-film mammography and FFDM trial with follow-up, the database contains cancers classified as missed or overlooked. This database can be used to evaluate the performance of a CAD system in a set of known cases of cancer and to evaluate the amount of correlation between cases of cancer the CAD system marks and the cases of cancer identified by radiologists.

The aim of our study was to evaluate the effect of a CAD system on mammograms acquired with FFDM and to compare the prospective independent double reading of screen-film mammography and FFDM in the paired Oslo I study with the findings of retrospective analysis of the cancer cases with a commercially available CAD system.


Materials and Methods
Study Population
The paired Oslo I study group consisted of 3,683 women who underwent both screen-film mammography and FFDM with soft-copy reading between January 3 and June 22, 2000. All 3,683 women in our study population were informed about the study beforehand, and their participation in the project was voluntary. The regional ethics committee approved the study.

The Norwegian Breast Cancer Screening Program protocol has a 2-year interval between screening rounds. In this analysis, the study population was evaluated for more than 2 years to include all cancers detected in the interval between screening rounds and cases of cancer found at the second screening round. The diagnostic evaluations of women recalled were performed at the breast imaging center of our institution within 2 weeks after the consensus meeting. All cytologic and histologic examinations were performed in the department of pathology. The baseline interpretation results, results from diagnostic evaluation of recalled patients, cytologic and histologic findings for patients undergoing surgery, and histologic findings for patients with interval cancers and cancers detected at the second screening round were sent to a central database of the Norwegian Breast Cancer Screening Program located at the Cancer Registry of Norway. All malignant tumors diagnosed in Norway must be reported to the Cancer Registry of Norway, and this database is linked to the mammographic screening database for each county. This system enables complete surveillance of all women in the screening program, including women whose interval cancers might have been diagnosed at other institutions.

Imaging
The 3,683 women (mean age, 58.2 years) in the study population underwent both screen-film mammography and FFDM. All screen-film mammograms were acquired on one of two mammography units (Mammomat 300, Siemens Medical Solutions) with Kodak Min-R 2000 film and Min-R 2190 screens (Eastman Kodak). The FFDM images were acquired on a Senographe 2000D unit (GE Healthcare). Mammograms for both imaging techniques consisted of the two standard views (craniocaudal and mediolateral oblique) of each breast. Both examinations were performed on the same day within minutes of each other in the screening unit in downtown Oslo. Screen-film mammography was always performed first in case the woman, for whatever reason, chose not to participate in the double examination.

Eight radiologists, all with more than 4 years of screening mammographic experience, were divided into two teams of four. In a given week, one team interpreted screen-film mammograms and the other team interpreted FFDM images; the teams alternated weekly between the two techniques. Within each team, two radiologists independently interpreted the screen-film mammographic or FFDM examinations.

The findings of prospective independent double reading of the baseline screen-film and FFDM screening images were recorded in the central database at the Cancer Registry of Norway. A 5-point rating scale for probability of cancer was used: 1 = normal or definitely benign; 2 = probably benign; 3 = indeterminate; 4 = probably malignant; 5 = malignant. If at least one of the two readers categorized a mammographic finding 2 or higher (hereafter called positive), the case was reviewed at a baseline screening consensus meeting. The baseline screening consensus meeting could dismiss cases with a low abnormal mammographic score (rating score of 2). However, recall for diagnostic evaluation was mandatory for cases categorized 3 or higher by at least one of the two original readers. The reading protocol has been previously published [5].
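The scoring rules in the preceding paragraph can be written as a short decision sketch. This is only an illustration in Python: the class and function names are hypothetical and are not taken from the study or from the screening program's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoubleReading:
    """Scores (1-5) given by the two independent readers for one case.

    Hypothetical container; the field names are assumptions for illustration.
    """
    reader_a: int
    reader_b: int

    def __post_init__(self) -> None:
        for score in (self.reader_a, self.reader_b):
            if score not in range(1, 6):
                raise ValueError("scores must be on the 5-point scale (1-5)")

    @property
    def highest(self) -> int:
        return max(self.reader_a, self.reader_b)


def is_positive(case: DoubleReading) -> bool:
    """Positive: at least one reader scored the case 2 or higher."""
    return case.highest >= 2


def recall_mandatory(case: DoubleReading) -> bool:
    """Recall is mandatory when at least one reader scored 3 or higher."""
    return case.highest >= 3


def consensus_may_dismiss(case: DoubleReading) -> bool:
    """Only positive cases whose highest score is 2 can be dismissed."""
    return is_positive(case) and not recall_mandatory(case)
```

For example, a case scored 1 and 2 is positive and goes to the consensus meeting, where it may be dismissed; a case scored 1 and 3 must be recalled.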

The baseline screening mammograms of women in whom an interval cancer developed or cancer was detected in the second screening round were retrospectively reviewed in a meeting of the radiologists taking part in the study. The baseline screening mammograms were retrospectively classified into four categories as follows: Normal indicated that a later-developing malignant tumor could not be seen, even when the location and mammographic appearance of the subsequently developed cancer were known. Minimal sign nonactionable indicated that although minor changes were seen at the location of the later-developing cancer, these mammographic features were so minimal and nonspecific that a true-positive CAD prompt would likely have no influence on decision making regarding recall. Minimal sign actionable indicated that suspicious mammographic features were present that should have initiated a recall if prompted by CAD findings. Overlooked (missed) cancer indicated that obvious malignant mammographic features were present and the woman should have been recalled in the baseline screening round if the abnormality had been detected by the radiologists or was prompted by CAD findings.

The mammographic findings of the cancers detected at baseline screening and of the subsequent cancers visible and actionable in retrospect (i.e., minimal sign actionable and overlooked categories) were classified as one of the following: ill-defined mass, spiculated mass, distortion and asymmetric density, microcalcifications only, and density with calcifications.


Fig. 1 —Flowchart shows results of computer-aided detection (CAD) and prospective independent double reading of screen-film mammograms (SFM) and full-field digital mammograms (FFDM) with soft-copy reading of 29 cases of cancer (after exclusion of two cases) in baseline screening round.

 
Outcome of Double Reading
Cancers detected at baseline screening—In the study population of 3,683 women, a total of 31 breast cancers (0.84%) were diagnosed in the baseline screening round. The mean age of these 31 women was 59.2 years (range, 50-68 years). Two cancers were excluded from further analysis: One ductal carcinoma in situ was an incidental finding on sonographic examination of a woman recalled at FFDM because of a presumed simple cyst. One invasive ductal carcinoma detected on screen-film mammography was not included at FFDM because of positioning failure. Thus, a total of 29 baseline cancers were included in the analysis. Independent double reading of screen-film mammograms resulted in the diagnosis of 27/29 (93%) cases of cancer, and reading of FFDM images, in the diagnosis of 22/29 (76%) cases of cancer.

Interval cancers—Ten interval cancers were diagnosed in the study population. Six interval cancers were interpreted as normal by all four radiologists during the baseline screening round and were classified either as minimal sign nonactionable or as normal in the retrospective review of the baseline mammograms. Four interval cancers had a true-positive score (3 on screen-film mammography and 1 on FFDM) in the baseline screening round but were dismissed at the baseline screening consensus meeting. Histologic examination of the three cancers with a true-positive screen-film mammographic score revealed two invasive ductal carcinomas and one ductal carcinoma in situ. The interval cancer with a true-positive FFDM score was invasive lobular carcinoma.

Cancers detected at second screening round—Sixteen cases of cancer in 15 women (one woman had bilateral breast cancer) were diagnosed in the screening round 2 years after the baseline. Three of these cases of cancer (two invasive ductal carcinomas and one invasive lobular carcinoma) had a true-positive FFDM score at baseline interpretation 2 years earlier but were dismissed at the baseline screening consensus meeting. During the retrospective review meeting on the baseline mammograms of the 26 subsequently detected cases of cancer (10 interval cases and 16 cases detected in the second round), 11 were classified as normal, five as minimal sign nonactionable, eight as minimal sign actionable, and two as overlooked. The 26 interval cancers and cancers detected at the second round are hereafter called subsequent cancers.

CAD
The baseline screen-film mammograms of the 55 cases of cancer were analyzed with the latest screen-film mammographic version of a commercially available CAD system (ImageChecker version 8.0, R2 Technology, pending U.S. Food and Drug Administration approval at the time of the study) at an operating point where the average number of false marks per normal four-view case was 2.2, as measured in 345 clinically confirmed normal cases from different institutions [17]. The baseline FFDM images were analyzed with the FFDM version of the same CAD system, which produced an average of 1.9 false-positive marks per normal four-view case measured in 97 consecutively evaluated normal cases from the Oslo I study [18]. This CAD system displays two types of marks. An asterisk indicates a mass or area of architectural distortion, and a triangle indicates an area suggestive of microcalcifications.

Scoring and Statistical Analysis
The areas marked by the CAD system were assessed by the consensus panel to determine whether the location and characterization (mass or microcalcification mark) of the CAD marks corresponded to the mammographically detected cancer. Each case was classified as having true-positive or false-negative findings on the basis of biopsy-proven ground truth. A CAD prompt was considered true-positive if the center of the CAD mark was within the confines of the cancer in at least one of the two standard views.
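The true-positive criterion for a CAD prompt can be sketched as a simple geometric check. In the study the consensus panel made this judgment visually; the Python code below is only an illustration that assumes, hypothetically, that the cancer is outlined by an axis-aligned bounding box in each view and that each CAD mark is reduced to its center point. It does not model the panel's assessment of mark type (mass versus microcalcification).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned outline of the cancer in one view."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, point: Point) -> bool:
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def cad_true_positive(
    lesion_by_view: Dict[str, Optional[Box]],
    mark_centers_by_view: Dict[str, List[Point]],
) -> bool:
    """True if a mark center lies within the cancer in at least one view.

    A lesion entry of None means the cancer is not visualized in that view
    (for example, because of positioning failure), so no mark can match there.
    """
    for view, box in lesion_by_view.items():
        if box is None:
            continue
        if any(box.contains(c) for c in mark_centers_by_view.get(view, [])):
            return True
    return False
```

A case marked correctly on only the mediolateral oblique view still counts as true-positive under this rule, which is why the results distinguish cancers marked on both views from those marked on one view.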

The cancer detection rates reported in this study for CAD and for the radiologists were calculated on the basis of true-positive scores on images in the baseline interpretation session, even if the cancer was diagnosed subsequently. A rating score of 2 or higher on the 5-point rating scale for probability of cancer by at least one of the two independent readers was defined as a true-positive score in estimates of the cancer detection rates of the radiologists. The number of cancers detected in prospective independent double reading of the baseline screen-film and full-field digital screening mammograms was determined. The number of baseline detected cancers correctly marked with the CAD system was determined, as was the number of cancers judged at the retrospective review to be minimal sign actionable or overlooked on the baseline mammograms of the subsequently diagnosed cancers. McNemar's test for paired proportions (Epi Info, version 6, Centers for Disease Control and Prevention) was used for statistical analysis. A p value less than 0.025 was considered to indicate a statistically significant result.


Results
Performance of CAD
Cancers detected with baseline screening—CAD correctly marked 93% (27/29) of the cancers detected on baseline screen-film mammograms and FFDM images. Figure 1 shows the flowchart of the results of CAD on these cases for both techniques. The CAD system correctly marked all seven cancers missed in double reading of the FFDM cases. The one malignant lesion with a true-positive CAD mark missed at double reading of screen-film mammograms was marked by CAD on one view. Of the seven cases of cancer missed at double reading of FFDM images and correctly marked by CAD, four were correctly marked by CAD on both views and three on one view. The two cases of cancer not detected with CAD in the screen-film mammograms and FFDM images were the same ones. One lesion, a 9-mm invasive ductal carcinoma manifesting as a small, ill-defined mass, also was missed by three of the four independent readers. The other tumor, a 25-mm invasive lobular carcinoma manifesting as minimal distortion on mammography, was overlooked by two of the four readers. Table 1 summarizes the results of prospective independent double reading of screen-film mammograms and FFDM images for 29 baseline screen-detected cancers and the retrospective performance of the CAD system in this case set.


TABLE 1: Histologic Findings, Mammographic Features, Interpretation Scores at Prospective Independent Double Reading, and CAD Results in Baseline Screening-Detected Cancers in Oslo I Study [5]

 

Subsequent cancers—In 10 of the 26 cases of subsequently diagnosed cancer, the cancer was judged in retrospect to be visible and actionable—that is, overlooked or minimal sign actionable categories. On the baseline mammograms, CAD marked 60% (6/10) of these cases on screen-film mammograms and 100% (10/10) on FFDM images. Of the four cancers with false-negative findings at double reading of screen-film mammograms and a true-positive score at CAD, two cases were correctly marked by CAD on both views and two cases on one view. However, one of the latter two cases revealed a positioning failure, the lesion not being visualized on the craniocaudal view. Of the seven cases of overlooked lesions and suspicious findings with false-negative interpretation on FFDM, CAD correctly marked three cases on both views and four cases on one view. The malignant lesion not seen on the craniocaudal screen-film mammogram because of positioning failure also was outside the image on the FFDM craniocaudal view (i.e., positioning failure). Whether this lesion would have been correctly marked by CAD on both views if there had been no positioning failure is an open question. In a clinical setting, these cases had the potential for earlier detection with the assistance of CAD. CAD marked both cancers overlooked with both screen-film mammography and FFDM. Although CAD correctly marked 5/5 (100%) and 2/5 (40%) of the five screen-film mammography and FFDM cases, respectively, that were retrospectively judged minimal sign nonactionable, a true-positive CAD mark would likely have had no influence on the decision to recall the patient. The mammographic changes were minimal and nonspecific and therefore were not considered further in this analysis. Figure 2 shows a flowchart of the results of CAD in these 15 cases.


Fig. 2 —Flowchart shows results of computer-aided detection (CAD) and prospective independent double reading of screen-film mammograms (SFM) and full-field digital mammograms (FFDM) with soft-copy reading of 15 of 26 cases of subsequent cancer visible in retrospect. Seven cases of cancer with true-positive score at baseline interpretation session were dismissed at consensus conference.

 

Comparison of CAD Results and Double Reading Findings
Comparison of the screen-film mammographic and FFDM double reading findings with the CAD results for 39 cases of cancer (29 baseline and 10 subsequently diagnosed actionable lesions) showed a slightly higher true-positive score for CAD with screen-film mammography (33/39 vs 30/39 true-positive findings) and a remarkably higher score for CAD with FFDM (37/39 vs 25/39 true-positive findings), especially for spiculated masses (12/12 vs 5/12 true-positive findings) and microcalcifications (10/10 vs 6/10 true-positive findings).

The standalone sensitivity of CAD was determined by summing the data presented in Tables 1 and 2 for screen-film mammography and FFDM. In FFDM, CAD correctly marked 27/29 of the baseline malignant lesions and 10/10 of the subsequent visible and actionable lesions, for an FFDM CAD sensitivity of 95% (37/39). In screen-film mammography, CAD correctly marked 27/29 of the baseline malignant lesions and 6/10 of the subsequent visible and actionable lesions, for a screen-film mammography CAD sensitivity of 85% (33/39).
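The pooling of baseline and subsequent cancers can be checked with a few lines of arithmetic. The Python sketch below uses only the counts stated in this paragraph; the function name is hypothetical.

```python
def pooled_sensitivity(baseline_tp, subsequent_tp, baseline_n=29, subsequent_n=10):
    """Pool baseline and subsequent actionable cancers into one sensitivity.

    Returns (true positives, total cancers, sensitivity in percent).
    """
    tp = baseline_tp + subsequent_tp
    n = baseline_n + subsequent_n
    return tp, n, 100 * tp / n


# CAD true-positive counts reported in the text (baseline, subsequent).
cad = {
    "FFDM": pooled_sensitivity(27, 10),
    "SFM": pooled_sensitivity(27, 6),
}

for technique, (tp, n, pct) in cad.items():
    print(f"{technique}: {tp}/{n} = {pct:.1f}%")
```

Unrounded, the pooled proportions are 37/39 = 94.9% for FFDM and 33/39 = 84.6% for screen-film mammography.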


TABLE 2: Histology, Mammographic Features, Retrospective Classification of Previous Screening Findings, Interpretation Scores at Prospective Independent Double Reading, and CAD Results of Subsequent Cancers in Oslo I Follow-Up Study [28]

 

If all cancer cases given a score of 2 or more by at least one of the two independent readers had been acted on, the standalone sensitivity of independent double reading for the 29 baseline cases of cancer and 10 subsequent visible and actionable lesions would have been 64% (22/29 + 3/10 = 25/39) for FFDM and 77% (27/29 + 3/10 = 30/39) for screen-film mammography. Thus, for FFDM, CAD sensitivity was 95% (37/39) compared with 64% (25/39) for double reading, and for screen-film mammography, CAD sensitivity was 85% (33/39) compared with 77% (30/39) for double reading. In a two-by-two-table analysis of cancer detection rate, McNemar's test showed no significant difference (p = 0.57) between independent double reading and CAD of screen-film mammograms. The comparison did show a statistically significant difference (McNemar's test p = 0.006) for interpretation of FFDM images.
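As an illustration of how McNemar's test operates on paired data such as these, the Python sketch below works through the FFDM comparison. Two points are assumptions rather than statements from the study: the discordant counts are derived from the reported totals (CAD marked all 14 cancers not detected at double reading, which implies that the 2 cancers CAD did not mark were detected by the readers), and the continuity-corrected chi-square form of the test is used, because the text does not say which variant Epi Info applied.

```python
import math
from typing import Tuple


def mcnemar_corrected(b: int, c: int) -> Tuple[float, float]:
    """Continuity-corrected McNemar test on the two discordant counts.

    b: cases positive with method 1 only; c: cases positive with method 2 only.
    Returns (chi-square statistic, two-sided p value with 1 degree of freedom).
    """
    if b + c == 0:
        raise ValueError("McNemar's test is undefined without discordant pairs")
    chi2 = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# FFDM, 39 cancers: 14 marked by CAD only, 2 detected by double reading only
# (derived counts; see the assumptions stated above).
chi2_ffdm, p_ffdm = mcnemar_corrected(14, 2)
print(f"FFDM: chi2 = {chi2_ffdm:.2f}, p = {p_ffdm:.3f}")
```

Under these assumptions the FFDM comparison gives p of about 0.006, in line with the value reported above and below the 0.025 threshold used in this study. Only the discordant pairs enter the calculation; the 23 cancers found by both CAD and double reading do not affect the statistic.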

Overall comparison for the 39 actionable cancers showed that CAD marked 5/39 (13%) of lesions on screen-film mammograms that were not recalled in a double-reading environment (Table 3). Thus, it is possible that detection on screen-film mammography could have increased by 13 percentage points, from 77% to a combined detection rate of 90%. An interesting finding was that on FFDM, CAD marked all 14 malignant lesions (14/39, 36%) not recalled at independent double reading, potentially increasing the cancer detection rate by 36 percentage points, from 64% to a combined detection rate of 100% (Table 3) for these 39 cancers.
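The combined detection rates follow directly from the counts given above. The short Python sketch below reproduces them and, as an added calculation that is not reported in the study, also expresses the CAD-only cancers as a relative increase over double reading alone. The names are hypothetical.

```python
TOTAL_CANCERS = 39

# Counts stated in the text: cancers detected at double reading, and
# cancers marked by CAD that double reading did not detect.
double_reading_tp = {"SFM": 30, "FFDM": 25}
cad_only_tp = {"SFM": 5, "FFDM": 14}


def combined_detection(technique: str) -> dict:
    """Combine double reading with the additional CAD-only cancers."""
    dr = double_reading_tp[technique]
    extra = cad_only_tp[technique]
    combined = dr + extra
    return {
        "combined": combined,
        "combined_pct": 100 * combined / TOTAL_CANCERS,
        "gain_points": 100 * extra / TOTAL_CANCERS,  # share of all 39 cancers
        "relative_gain_pct": 100 * extra / dr,       # relative to double reading
    }


for technique in ("SFM", "FFDM"):
    r = combined_detection(technique)
    print(
        f"{technique}: {r['combined']}/{TOTAL_CANCERS} combined "
        f"({r['combined_pct']:.0f}%), +{r['gain_points']:.0f} points"
    )
```

The 13% and 36% figures are shares of all 39 cancers. Relative to the cancers found by double reading alone, the same counts correspond to larger proportional increases (5/30 and 14/25).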


TABLE 3: Results of Prospective Independent Double Reading and Computer-Aided Detection for 39 Cancers in Oslo I Follow-Up Study [28]

 


Discussion
A mammographic CAD system is designed to direct attention to features that may be associated with breast cancer, thereby reducing the number of false-negative mammographic findings in breast cancer screening [9, 19-21]. Prospective clinical studies [12, 13, 15, 16] have shown an increase as high as 19% in the cancer detection rate with CAD. All of these studies were performed with CAD systems based on screen-film mammography.

Double reading is another method of reducing the false-negative rate of mammographic screening, with reported increases in cancer detection as high as 15% [22]. Because of practicality and cost-effectiveness concerns, however, double reading has not been widely adopted, except in European countries with population-based screening programs in which double reading is mandated. Therefore, another goal for CAD is to produce increases in cancer detection rates similar to those of double reading without the increased costs and complexities that arise when two radiologists review the same examinations. CAD consequently is considered an alternative to double reading rather than an addition to it. In an experimental study design [23], CAD was found to be almost as sensitive as simulated double reading.

The few reports in the literature on CAD performance in FFDM indicate that the results would be equivalent to the results reported for CAD analysis of secondarily digitized images [24-26]. This finding is not surprising, because previous experimental retrospective studies comparing screen-film mammography and FFDM showed that FFDM is comparable with screen-film mammography in detectability, conspicuity, and characterization of microcalcifications [27].

To our knowledge, few studies have been conducted to evaluate CAD in a double reading environment. In a retrospective study, Destounis et al. [11] found that CAD had the potential to decrease the false-negative rate at double reading by more than one third. In our study, after exclusion of minimal sign nonactionable lesions, CAD correctly marked five cases of cancer on screen-film mammograms and 14 cases on FFDM; these cases had been missed at baseline independent double reading.

We found a 36% potential benefit of CAD in FFDM with soft-copy reading. In the Oslo I study, the lower cancer detection rate with FFDM compared with screen-film mammography can be explained in part by a learning curve effect and suboptimal reading environments with FFDM soft-copy review [28]. This explanation is confirmed by the higher cancer detection rate for FFDM in the Oslo II study. Oslo II was performed with the same radiologists as in Oslo I, but they had gained more experience in FFDM soft-copy reading and had improved reading environments [6]. Our study was performed after completion of the Oslo I study, and we used the database from that study. Oslo II was in progress during the data collection and analysis phases of our study. Nevertheless, our results show that CAD has the potential to increase the cancer detection rate even in mammographic screening programs with double reading. Our results also indicate that CAD with FFDM may have special benefit for radiologists inexperienced with FFDM and soft-copy reading.

Many breast cancers detected at screening are in retrospect visible on the previous mammograms. The rate of missing detectable cancers was estimated to be 29% in one study [29], and other studies have shown that approximately 50-67% of malignant tumors are visible in retrospect on previous mammograms [9, 30]. The judgment whether a lesion is visible but not actionable in retrospect versus when a lesion should be considered minimal sign actionable or missed (overlooked) is subjective and depends to a large extent on whether the readers are blinded or informed [31].

Previous screening mammograms of patients with interval cancer have been classified into four categories: screening error, minimal sign present, occult, and occult also at diagnosis [32]. According to this classification, 13% of previous findings are screening errors and 38% are minimal sign present; that is, approximately one half of findings identified on previous mammograms are actionable [31].

Ikeda et al. [21] discussed a subset of cancers that have perceptible but nonspecific mammographic findings marked with CAD technology even when the findings do not warrant recall. We therefore separated the minimal sign group into actionable (warrant recall) and nonactionable (nonspecific findings that probably would not warrant recall, even if prompted, in daily practice). After the latter are excluded from analysis, there is still an additional potential benefit of CAD in a double reading setting. Of the 39 radiographically visible and actionable cancers in our study, 14 (36%) of the cases of cancer interpreted as normal after double reading were correctly marked by CAD in at least one view on FFDM. This number was five (13%) for screen-film mammography. We believe these patients would have had a high probability of being recalled in the baseline screening round if CAD had been used.

Three limitations of this study need to be addressed. First, the study was a retrospective CAD analysis, which can provide insight into the potential effect of CAD on reader sensitivity. A prospective study is necessary to determine the actual benefit of CAD in daily practice. Our analysis was focused on the degree of correlation between CAD findings and clinical decisions to measure the potential benefit from CAD. Although other investigators have retrospectively measured this potential benefit of CAD, we are not aware of studies that included an analysis of cancers detected with FFDM screening or that were conducted with cases collected from a screening program such as the Oslo I trial. We believe a strength of our study was that our data set consisted not only of cancers detected with consecutive baseline screening with both screen-film mammography and FFDM but also of baseline mammograms of interval and subsequently detected cancers.

In a retrospective analysis such as that performed in this study, the measured effect of CAD is a potential rather than an actual benefit, because one cannot know whether a particular true-positive CAD mark would change a radiologist's decision in daily practice. We took a conservative approach in our retrospective classification of the missed cancers. We did not include all retrospectively visible cancers but included only cancers visible in retrospect that were also judged to be actionable, that is, those classified as overlooked and minimal sign actionable. Thus, if a retrospective review of the baseline mammograms were to reveal only a subtle nonspecific finding in the location of the subsequent cancer, it is likely that the radiologist in daily practice would (and should) discard such a prompt, assuming it is a false CAD mark. Acting on such nonspecific findings (although "correctly" marked by CAD) would result in an unacceptable increase in the recall rate, as shown by Ikeda et al. [21]. We believe that inclusion of such a CAD mark in an analysis as a true-positive CAD mark would lead to an overestimate of the potential benefit of CAD in daily practice. Therefore, we subgrouped minimal-sign lesions into nonactionable and actionable and included the latter only when marked by CAD as a true-positive finding and one that radiologists would likely act on.

A second limitation of this study was the small number of cases of cancer and consequently the lack of statistical power. The low number of cases of cancer may explain the 100% potential performance of radiologists plus CAD in FFDM mode. This finding is likely spurious. It seems unlikely that with the use of CAD, all patients with lesions marked by CAD would have been recalled in the baseline round, even though the lesions were visible and actionable. However, these cancers would have had a higher probability of being recalled in the baseline screening round with the use of CAD.

Third, this study did not show whether the CAD false-marker rate would lead to an increase in recall rate greater than the increase in cancer detection rate. Experience from single interpretation of screen-film mammograms suggests that the recall rate increases with CAD but at a rate comparable with or less than the increase in cancer detection rate [12-16].

Our findings clearly show the potential benefit of CAD input for radiologists interpreting FFDM screening mammograms, even in screening programs with independent double reading. To our knowledge, our study is unique in that we compared CAD with the results of prospective independent double reading with a 5-point rating scale as used in daily practice and in that we used two imaging techniques (screen-film mammography and FFDM). Furthermore, we believe it was important to measure the correlation between the radiologists' findings and CAD performance on FFDM and thereby to estimate the potential benefit of CAD in FFDM.

In conclusion, our results indicate that CAD has the potential for increasing the cancer detection rate, even in breast cancer screening programs using independent double reading. Furthermore, our results indicate that CAD may be of particular value when FFDM with soft-copy reading is introduced into breast cancer screening programs.


References

  1. Beam CA, Layde PM, Sullivan DC. Variability in the interpretation of screening mammograms by US radiologists. Arch Intern Med 1996; 156:209-213
  2. Elmore JG, Wells CK, Lee CH, Howard DH, Feinstein AR. Variability in radiologists' interpretation of mammograms. N Engl J Med 1994; 331:1493-1499
  3. Dinnes J, Moss S, Melia J, Blanks R, Song F, Kleijnen J. Effectiveness and cost-effectiveness of double reading of mammograms in breast cancer screening: findings of a systematic review. Breast 2001; 10:455-463
  4. Lewin JM, Hendrick RE, D'Orsi CJ, et al. Comparison of full-field digital mammography with screen-film mammography for cancer detection: results of 4,945 paired examinations. Radiology 2001; 218:873-880
  5. Skaane P, Young K, Skjennald A. Population-based mammography screening: comparison of screen-film and full-field digital mammography with soft-copy reading: Oslo I study. Radiology 2003; 229:877-884
  6. Skaane P, Skjennald A. Screen-film mammography versus full-field digital mammography with soft-copy reading: randomized trial in a population-based screening program—the Oslo II study. Radiology 2004; 232:197-204
  7. Pisano ED, Gatsonis C, Hendrick RE, et al. Diagnostic performance of digital versus film mammography for breast-cancer screening. N Engl J Med 2005; 353:1773-1783
  8. Birdwell RL, Ikeda DM, O'Shaughnessy KF, Sickles EA. Mammographic characteristics of 115 missed cancers later detected with screening mammography and the potential utility of computer-aided detection. Radiology 2001; 219:192-202
  9. Burhenne LJW, Wood SA, D'Orsi CJ, et al. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000; 215:554-562
  10. Jiang Y, Nishikawa RM, Schmidt RA, Toledano AY, Doi K. Potential of computer-aided diagnosis to reduce variability in radiologists' interpretations of mammograms depicting microcalcifications. Radiology 2001; 220:787-794
  11. Destounis SV, DiNitto P, Logan-Young W, Bonaccio E, Zuley ML, Willison KM. Can computer-aided detection with double reading of screening mammograms help decrease the false-negative rate? Initial experience. Radiology 2004; 232:578-584
  12. Freer TW, Ulissey MJ. Screening mammography with computer-aided detection: prospective study of 12,860 patients in a community breast center. Radiology 2001; 220:781-786
  13. Birdwell RL, Bandodkar P, Ikeda DM. Computer-aided detection with screening mammography in a university hospital setting. Radiology 2005; 236:451-457
  14. Gur D, Sumkin JH, Rockette HE, et al. Changes in breast cancer detection and mammography recall rates after the introduction of a computer-aided detection system. J Natl Cancer Inst 2004; 96:185-190
  15. Feig SA, Sickles EA, Evans WP, Linver MN. Re: Changes in breast cancer detection and mammography recall rates after the introduction of a computer aided detection system. J Natl Cancer Inst 2004; 96:1260-1261
  16. Cupples TE, Cunningham JE, Reynolds JC. Impact of computer aided detection in a regional screening mammography program. AJR 2005; 185:944-950
  17. Zeng X, Medeiros S, Wood SA, et al. Computer-aided detection for mammography: improved algorithm performance with operator determined points characterized by new metrics. In: Lemke HU, Vannier MW, Inamura K, Farman AG, Doi K, Reiber JHC, eds. Computer assisted radiology and surgery, CARS 2004. International Congress Series 1268. Philadelphia, PA: Elsevier, 2004:855-860
  18. Skaane P, Kshirsagar A, Stapleton S. False positive CAD marks on FFDM acquired normal screening mammograms. In: Pisano E, ed. Proceedings of the 7th International Workshop on Digital Mammography. Chapel Hill, NC: Biomedical Research Imaging Center, 2005:559-564
  19. Baker JA, Rosen EL, Lo JY, Gimenez EI, Walsh R, Soo MS. Computer-aided detection (CAD) in screening mammography: sensitivity of commercial CAD systems for detecting architectural distortion. AJR 2003; 181:1083-1088
  20. Evans WP, Burhenne LJ, Laurie L, O'Shaughnessy KF, Castellino RA. Invasive lobular carcinoma of the breast: mammographic characteristics and computer-aided detection. Radiology 2004; 231:564-570
  21. Ikeda DM, Birdwell RL, O'Shaughnessy KF, Sickles EA, Brenner RJ. Computer-aided detection output on 172 subtle findings on normal mammograms previously obtained in women with breast cancer detected at follow-up screening mammography. Radiology 2004; 230:811-819
  22. Thurfjell EL, Lernevall KA, Taube AA. Benefit of independent double reading in a population-based mammography screening program. Radiology 1994; 191:241-244
  23. Ciatto S, Del Turco MR, Burke P, Visioli C, Paci E, Zappa M. Comparison of standard and double reading and computer-aided detection (CAD) of interval cancers at prior negative screening mammograms: blind review. Br J Cancer 2003; 89:1645-1649
  24. Baum F, Fischer U, Obenauer S, Grabbe E. Computer-aided detection in direct digital full-field mammography: initial results. Eur Radiol 2002; 12:3015-3017
  25. Li L, Clark RA, Thomas JA. Computer-aided diagnosis of masses with full-field digital mammography. Acad Radiol 2002; 9:4-12
  26. O'Shaughnessy K, Castellino R, Muller S, et al. Computer-aided detection (CAD) on 90 biopsy-proven breast cancer cases acquired on a full-field digital mammography (FFDM) system. (abstr) Radiology 2001; 221[suppl]:471
  27. Fischer U, Baum F, Obenauer S, et al. Comparative study in patients with microcalcifications: full-field digital mammography vs screen-film mammography. Eur Radiol 2002; 12:2679-2683
  28. Skaane P, Skjennald A, Young K, et al. Follow-up and final results of the Oslo I study comparing screen-film mammography and full-field digital mammography with soft-copy reading. Acta Radiol 2005; 46:679-689
  29. Yankaskas BC, Schell MJ, Bird RE, Desrochers DA. Reassessment of breast cancers missed during routine screening mammography: a community-based study. AJR 2001; 181:687-693
  30. Broeders MJ, Onland-Moret NC, Rijken HJ, Hendriks JH, Verbeek AL, Holland R. Use of previous screening mammograms to identify features indicating cases that would have a possible gain in prognosis following earlier detection. Eur J Cancer 2003; 39:1770-1775
  31. Hofvind S, Skaane P, Vitak B, et al. Influence of review design on percentages of missed interval breast cancers: retrospective study of interval cancers in a population-based screening program. Radiology 2005; 237:437-443
  32. van Dijck JA, Verbeek AL, Hendriks JH, Holland R. The current detectability of breast cancer in a mammographic screening program. Cancer 1993; 72:1933-1938




