Breast Imaging
Rates and Causes of Disagreement in Interpretation of Full-Field Digital Mammography and Film-Screen Mammography in a Diagnostic Setting
OBJECTIVE. This study was performed to determine the rates and causes of disagreements in interpretation between full-field digital mammography and film-screen mammography in a diagnostic setting.
SUBJECTS AND METHODS. Patients undergoing diagnostic mammography were invited to participate in the digital mammography study. Three views, selected by the radiologist interpreting the film-screen mammography, were obtained in both film-screen mammography and digital mammography. Radiologists independently assigned a Breast Imaging Reporting and Data System (BI-RADS) category to the film-screen mammography and the digital mammography images. The BI-RADS categories were grouped into the general categories of agreement, partial agreement, or disagreement. A third and different radiologist reviewed all cases of disagreement, reached a decision as to management, and determined the primary cause of disagreement.
RESULTS. Six radiologists reviewed digital mammography and film-screen mammography diagnostic images in a total of 1147 breasts in 692 patients. Agreement between digital mammography and final film-screen mammography assessment was present in 937 breasts (82%), partial agreement in 159 (14%), and disagreement in 51 (4%), for a kappa value of 0.29. The primary causes of disagreement were differences in management approach of the radiologists (52%), information derived from sonography or additional film-screen mammograms (34%), and technical differences between the two mammographic techniques (10%).
CONCLUSION. Significant disagreement between film-screen mammography and digital mammography affecting follow-up management was present in only 4% of breasts. The most frequent cause of disagreement in interpretation was a difference in management approach between radiologists (interobserver variability). This source of variability was larger than that due to differences in lesion visibility between film-screen mammography and digital mammography.
Variability in the interpretation of film-screen mammography has been measured in a number of studies [1,2,3,4]. Elmore et al. [1] compared the results of 10 radiologists, each of whom independently interpreted the same set of 150 cases that included 27 cases of biopsy-proven cancer. The rate of disagreement in the recommendation for immediate follow-up was 15% of the total number of comparisons or 27% of the maximum number of possible disagreements. In 1996, Beam et al. [2] described the testing of 108 radiologists from 50 American College of Radiology-accredited centers, each of whom interpreted an enriched set of 79 screening mammograms that included 45 cases of cancer. The average radiologist had a sensitivity of 79%, a specificity of 89%, and an area under the receiver operating characteristic curve of 0.85. Although the fraction of disagreements was not reported, the authors reported that sensitivity ranged from 47% to 100%, specificity ranged from 36% to 99%, and areas under the receiver operating characteristic curve based on BI-RADS (Breast Imaging Reporting and Data System) categories [5] ranged from 0.74 to 0.95 [2]. Kerlikowske et al. [3] determined intraobserver and interobserver variability by having two radiologists interpret 356 screening mammograms twice and each independently interpret 2578 screening mammograms once. These researchers found a 10% intraobserver disagreement rate and a 16% interobserver disagreement rate in terms of a finding or no finding on all cases, and a 10% intraobserver disagreement rate and a 14% interobserver disagreement rate on cancer cases [3].
Disagreement in screening interpretations of full-field digital mammography and film-screen mammography has been reported in interim results of an ongoing study [6]. In that study, the overall rate of disagreement in terms of positive or negative interpretations for different radiologists interpreting digital mammograms and film-screen mammograms was 17% (494/2985 cases) and the rate of disagreement for cancer cases was 58% (11/19 cases).
One digital mammography system (Senographe 2000D; General Electric Medical Systems, Milwaukee, WI) has been approved by the United States Food and Drug Administration for clinical use. Other digital mammography systems are likely to follow over the next year or two. Studies seeking to compare the cancer detection capability of digital mammography with that of film-screen mammography need to take interpretational differences into account. The aim of this article is to quantify the rate of disagreement between digital and film-screen mammography in interpretation of diagnostic mammograms.
Recruitment for this study was conducted between November 1998 and November 1999 under institutional review board approval. A woman scheduled for diagnostic mammography was eligible for the study if she met the following selection criteria: she was at least 40 years old, she was not pregnant or suspected of being pregnant, her breasts could be imaged completely on an 18 × 24 cm image receptor, she was mobile, and she was able to give informed consent.
Diagnostic mammography is scheduled routinely in our breast center for patients with one or more of the following clinical concerns: clinical symptoms such as abnormal findings on physical examination of the breast, palpable abnormality, or spontaneous nipple discharge; abnormal findings on screening mammography requiring further evaluation (BI-RADS category 0); follow-up of probably benign lesions (BI-RADS category 3); personal history of breast cancer (i.e., lumpectomy or mastectomy); and prior placement of a breast implant. Anxious patients or those traveling a significant distance for their mammography are also referred for diagnostic mammography by their physicians to avoid the inconvenience of being called back for further imaging.
Research assistants who had no knowledge of the clinical history or prior mammographic findings recruited patients to the study. The assistants were free to recruit any patient waiting for diagnostic examination who met the selection criteria.
Digital mammography was performed on a prototype Senographe 2000D. Film-screen mammography was performed on one of eight American College of Radiology-accredited and Food and Drug Administration-certified mammography units. The film-screen units consisted of Mammomat 300 and Mammomat 3000 (Siemens Medical Systems, Iselin, NJ) or 600T, 800T, and DMR units (General Electric Medical Systems).
Digital interpretations were performed using both soft-copy and printed images. Soft-copy images were displayed on an Advantage Workstation (General Electric Medical Systems) with two high-resolution (2K × 2.5K), high-luminance (140 footlamberts) monitors, and hard-copy images were printed using an Agfa LR5200 laser imager (Agfa-Gevaert, Brussels, Belgium).
Breast sonography was performed using dedicated breast sonographic units (HDI 3000 and HDI 5000; Advanced Technology Laboratories, Bothell, WA) with 10-MHz linear array transducers (L10-5 38-mm broadband and CL 10-5 22-mm Ertos; Advanced Technology Laboratories).
After review of the history sheet and any prior mammograms, a maximum of three images were selected by the radiologist managing the patient. The images were chosen to best evaluate the clinical or mammographic findings of concern. Imaging in the same views was performed in both digital and film-screen mammography by the same mammography technologist. Possible mammographic views obtained included standard craniocaudal, mediolateral, mediolateral oblique, exaggerated craniocaudal, implant-displaced, spot compression with or without geometric magnification, and open magnification views.
The radiologist interpreting the film-screen mammogram at the time of the examination decided the clinical management of the patient before the patient left the facility. Additional film-screen mammography and breast sonography were performed when the radiologist managing the patient deemed it necessary. To limit the breast radiation dose, film-screen mammograms obtained beyond the original three images acquired with both techniques were not repeated with digital mammography.
In each case, one radiologist independently reviewed the digital mammograms, and another radiologist independently reviewed the film-screen mammograms. The radiologist interpreting the film-screen images did not have access to the digital images, and the radiologist interpreting the digital mammograms did not have access to the film-screen images. Each of six participating radiologists interpreted approximately the same number of digital and film-screen mammograms. Digital and film-screen interpretations were made under similar circumstances, with access to all pertinent clinical history and prior mammograms for comparison. When applicable, the recent film-screen screening mammogram with abnormal findings that prompted the diagnostic examination was also available for review to both the radiologist interpreting the digital and the radiologist interpreting the film-screen mammograms.
All interpretations of digital mammograms resulted in a BI-RADS classification [5]. Film-screen interpretations using BI-RADS categories were recorded at several points in time. An “initial” film-screen mammography BI-RADS classification was recorded after evaluation of the first three film-screen mammograms. These images were identical to those obtained with digital mammography, although the digital images were not available to the radiologist interpreting the film-screen mammograms. A “total” film-screen mammography BI-RADS interpretation was recorded after all additional (after the initial three) film-screen images were obtained. A “final” film-screen mammography BI-RADS interpretation was recorded after the entire examination was completed and included all film-screen images plus breast sonograms.
Interpretations obtained after review of digital images were compared with those obtained after initial, total, and final film-screen mammography interpretations. Agreement in interpretation occurred when both radiologists assigned the same BI-RADS classification, when one assigned BI-RADS 1 and the other BI-RADS 2, or when one assigned BI-RADS 4 and the other BI-RADS 5. A partial agreement in interpretation occurred when one radiologist assigned BI-RADS 1 or 2 and the other, BI-RADS 3. Disagreement in interpretation occurred when one radiologist assigned BI-RADS 1, 2, or 3 and the other assigned 4 or 5.
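The grouping rules above can be expressed as a small function. The following is a minimal Python sketch, not the software used in the study; the function name `classify_pair` is hypothetical, and it assumes that only BI-RADS categories 1 through 5 are compared, because the stated rules do not cover category 0.

```python
def classify_pair(birads_a: int, birads_b: int) -> str:
    """Group two BI-RADS assessments (1-5) of the same breast.

    Hypothetical helper that illustrates the rules stated in the text.
    """
    for category in (birads_a, birads_b):
        if category not in (1, 2, 3, 4, 5):
            raise ValueError(f"unsupported BI-RADS category: {category}")

    # Collapse each category into a management tier:
    # 0 = BI-RADS 1 or 2, 1 = BI-RADS 3, 2 = BI-RADS 4 or 5 (biopsy).
    tier = {1: 0, 2: 0, 3: 1, 4: 2, 5: 2}
    tier_a, tier_b = tier[birads_a], tier[birads_b]

    if tier_a == tier_b:
        return "agreement"
    if 2 in (tier_a, tier_b):
        # One reader recommends biopsy and the other does not.
        return "disagreement"
    return "partial agreement"


print(classify_pair(1, 2))  # agreement
print(classify_pair(2, 3))  # partial agreement
print(classify_pair(3, 4))  # disagreement
```

Collapsing the categories into tiers first keeps the three rules in one place: equal tiers agree, any pairing that involves the biopsy tier disagrees, and the only remaining case is BI-RADS 1 or 2 against BI-RADS 3.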
Cases of disagreement between initial film-screen mammography and digital mammography interpretations were reviewed by a third radiologist who was not familiar with the case. This third radiologist reviewed all digital, film-screen, and sonographic images and reached a decision on the BI-RADS recommendation. The decision radiologist determined the final treatment decision for the patient and determined the primary cause of disagreement.
Data were maintained using Access software (Microsoft, Redmond, WA) and were analyzed using SAS software (version 6.12) (SAS, Cary, NC). Descriptive statistics were used to summarize the findings. The chi-square test was performed for statistical significance. Kappa statistics were used to compare differences between the initial, total, and final film-screen mammography classifications across BI-RADS categories and differences between film-screen mammography and digital mammography interpretations across BI-RADS categories [7, 8].
Of the 991 patients invited to participate in the study, 692 (70%) consented. In the study population of 692 patients, we examined 1147 breasts using diagnostic mammography. We also used sonography to examine 235 breasts in these 692 patients.
The patients ranged in age from 40 to 88 years old (average, 53 years). One hundred forty-five patients (21%) had a personal history of breast cancer, 153 (22%) had a first-degree relative with breast cancer, 141 (20%) had a second-degree relative with breast cancer, and 292 (42%) had a history of late childbearing (30 years old or older) or were nulliparous.
Table 1 summarizes the indications for diagnostic mammography in the study population. Mammography was defined as diagnostic if the patients had a clinical or mammographic finding precipitating the diagnostic examination, or if they had implants, lumpectomy, or mastectomy. These criteria have been previously recommended for diagnostic mammographic examination [9]. Mammography was defined as screening if the patients were asymptomatic—that is, lacking both mammographic and clinical concern. For example, a patient presenting with a palpable lump in the right breast and having both breasts examined would have a right breast examination classified as diagnostic and a left breast examination classified as screening. Of the 1147 breasts examined, 609 were asymptomatic and 538 were diagnostic. Note that a breast could have more than one reason to be considered diagnostic (e.g., a palpable abnormality in a breast treated with prior lumpectomy).
Twelve technologists participated in the study and obtained 7613 mammograms in both digital and film-screen mammography. Standard mammographic views were most frequently obtained, including 960 mediolateral oblique, 943 craniocaudal, and 838 mediolateral views obtained on both film-screen mammography and digital mammography. Other views obtained on both film-screen and digital mammography included 323 spot magnification, 98 exaggerated craniocaudal, 62 implant-displaced, and 47 spot nonmagnification views. Images obtained on film-screen mammography in addition to the maximum of three initial images obtained with both techniques included 479 spot magnification, 216 exaggerated craniocaudal, 87 mediolateral, 79 mediolateral oblique, 77 craniocaudal, and 70 spot nonmagnification views.
Six radiologists participated in the study and interpreted 1147 mammograms independently. The interpreting radiologist for each technique recorded the density of the breast, the lesion location, and the type of lesion present, and assigned a BI-RADS category to each mammogram. Table 2 summarizes the breast density encountered in the study population as recorded in the film-screen interpretation and in the digital interpretation. Complete agreement in BI-RADS classification of breast densities between film-screen mammography and digital mammography was seen in 476 (69%) of the 692 cases (κ = 0.46). Results on digital mammography were shifted to less dense classifications compared with film-screen mammography.
Table 3 summarizes the mammographic findings by technique among the 1147 breasts examined. The most common findings were microcalcifications (45% of breasts on digital mammography, 36% on film-screen mammography), followed by masses (16% of breasts on digital mammography, 14% on film-screen mammography). Digital mammography revealed more clustered microcalcifications than film-screen mammography (183 vs. 146, p < 0.0002), more scattered calcifications (329 vs. 254, p < 0.0001), and more masses (188 vs. 160, p < 0.005). The numbers of architectural distortions (44 on both) and focal asymmetric densities (55 vs. 58, p = 0.336) detected were comparable for the two techniques.
Table 4 compares the BI-RADS categories for film-screen mammography with the BI-RADS categories for digital mammography after review of the three initial images of each technique. These initial film-screen mammography interpretations are directly comparable to the digital interpretation, because the radiologist interpreting film-screen mammography had exactly the same information as the radiologist interpreting digital mammography. The 5 × 5 table yielded a kappa score of 0.20. Note that 47 breasts (4.1%) were considered suspicious enough to warrant a recommendation for biopsy (BI-RADS categories 4 and 5) based on interpretation using the three initial film-screen mammograms. The same number of breasts were recommended for biopsy on the basis of digital mammography, but only 22 breasts were rated BI-RADS category 4 or 5 on both techniques.
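To illustrate how a kappa statistic is obtained from a contingency table, the sketch below applies unweighted Cohen's kappa to the 2 × 2 biopsy-recommendation table implied by the counts above (22 breasts recommended for biopsy on both techniques, 25 on film-screen only, 25 on digital only, and 1075 on neither). This is an assumption-laden illustration: the text does not state which kappa variant was used, the function name `cohen_kappa` is hypothetical, and the dichotomized value is not a result reported in the study. It should not be expected to match the 5 × 5 kappa, because the categories are collapsed.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts.

    table[i][j] = number of breasts placed in category i by one
    technique and in category j by the other.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


# Counts taken from the paragraph above.
total, film_pos, digital_pos, both_pos = 1147, 47, 47, 22

# Rows: film-screen (biopsy, no biopsy); columns: digital (biopsy, no biopsy).
table = [
    [both_pos, film_pos - both_pos],
    [digital_pos - both_pos, total - film_pos - digital_pos + both_pos],
]

kappa = cohen_kappa(table)
print(f"dichotomized kappa = {kappa:.2f}")
```

Because most breasts are negative on both techniques, raw agreement in this table is high while chance agreement is also high, which is why kappa can be modest even when few breasts are discordant.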
In 581 breasts, only the initial BI-RADS categories were obtained because neither additional images nor sonography was needed. In 471 breasts, additional images were obtained; and in 235 breasts, sonography was performed. A change in BI-RADS category (from initial to total) occurred after obtaining additional film-screen images in 39 breasts (8.3%). A change in BI-RADS category (from total to final) occurred after obtaining breast sonograms in 69 breasts (29.4% of those in which sonography was performed). The rates of agreement, partial agreement, and disagreement in BI-RADS categories between digital mammography and all three film-screen mammography BI-RADS interpretations were similar, but agreement improved slightly with additional film-screen mammograms and sonograms (Table 5). The rate of full agreement increased slightly from 79% to 82%, and the rate of disagreement stayed constant at 4%.
Table 6 compares the rate of agreement in BI-RADS categories between film-screen mammography and digital mammography for different lesion types. The lowest rate of agreement was for masses (58%). Significantly higher rates of agreement occurred between the two techniques for calcifications (p < 0.001) and for architectural distortions (p < 0.001) than for masses. The agreement rate for masses was not significantly different from that for focal asymmetry (p = 0.096).
Table 7 illustrates the agreement rates between digital mammography and film-screen mammography based on the indication for mammography. As previously defined, screening mammograms were those of asymptomatic breasts, and diagnostic mammograms met at least one of the criteria for diagnostic mammography. Screening mammograms had significantly higher agreement rates and lower disagreement rates between film-screen mammography and digital mammography than diagnostic mammograms (p < 0.0001). Diagnostic mammograms were further subdivided into studies performed to evaluate abnormal findings identified in recent prior screening mammograms within the last 3 months (“callback mammograms”) and the remaining diagnostic mammograms, which had no prior abnormal findings on screening mammography within the last 3 months (“noncallback mammograms”). Because the recent film-screen mammograms with abnormal findings were available at the time of both digital mammography and film-screen mammography interpretations for callback mammograms, the radiologists had more mammograms to review at the time of BI-RADS classification, modifying the interpreting conditions compared with those for noncallback mammograms. A significantly lower agreement rate between film-screen and digital mammography was noted in callback mammograms than in noncallback mammograms (p < 0.0001) (Table 7).
Fifty-five breasts were recommended for biopsy on the basis of either final film-screen mammography (i.e., after all films and sonography were obtained) or digital mammography interpretation. This recommendation resulted from a BI-RADS 4 or 5 category based on final recommendations from film-screen mammography in 52 of the 55 breasts. Twenty-seven biopsies were recommended on the basis of digital mammography findings, three of which were not recommended for biopsy on the basis of film-screen mammography findings. Biopsy was not performed in five patients for the following reasons: two patients refused biopsy, one biopsy was cancelled after review of prior films from another institution, and two were cancelled at the time of biopsy because the lesion was no longer identified. Histologic examination of the remaining 50 cases that underwent biopsy showed 32 benign and 18 malignant findings. Of the 18 malignant findings, 11 were ductal carcinoma in situ and seven were invasive carcinoma. The positive biopsy rate for final film-screen mammography was 36% (17 malignancies found in 47 biopsies recommended by film-screen mammography and performed) and for digital mammography was 45% (10 malignancies found in 22 biopsies recommended by digital mammography and performed).
Eighteen cases of cancer were found in the study group. Table 8 shows the comparison of BI-RADS categories for initial film-screen mammography and digital mammography of the 18 cancers found. For cancer, the agreement rate between initial film-screen mammography and digital mammography was 67% (12/18). Of the 18 cancers, initial film-screen mammography correctly showed 16 as BI-RADS 4 or 5 lesions (89% sensitivity), whereas digital mammography correctly identified 13 (72% sensitivity). This difference in sensitivity between techniques was not statistically significant (p = 0.20).
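The sensitivity figures above, and the positive biopsy rates reported earlier, are simple proportions. A minimal Python sketch of the arithmetic follows, using only counts stated in the text; the helper name `proportion` is hypothetical.

```python
def proportion(numerator: int, denominator: int) -> float:
    """Return numerator / denominator after a basic sanity check."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Sensitivity: cancers rated BI-RADS 4 or 5 among the 18 proven cancers.
sensitivity_film = proportion(16, 18)
sensitivity_digital = proportion(13, 18)

# Positive biopsy rate: malignancies among biopsies recommended and performed.
biopsy_rate_film = proportion(17, 47)
biopsy_rate_digital = proportion(10, 22)

print(f"sensitivity: film-screen {sensitivity_film:.0%}, digital {sensitivity_digital:.0%}")
print(f"positive biopsy rate: film-screen {biopsy_rate_film:.0%}, digital {biopsy_rate_digital:.0%}")
```

The printed values (89% and 72% for sensitivity, 36% and 45% for positive biopsy rate) reproduce the percentages given in the text.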
Overall disagreement between both initial and final film-screen mammography and digital mammography existed in only 4% of breasts (50/1147 for initial film-screen mammography, 51/1147 for final film-screen mammography). The consensus radiologist agreed with the film-screen mammography interpretation in 32 cases (64%) and with the digital mammography interpretation in 18 cases (36%) (p < 0.033).
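The text does not name the test behind this p value. As one possible reading, offered purely as an assumption, the 32-versus-18 split can be compared with an even split using an exact one-sided binomial (sign) test. The sketch below is illustrative only; the function name `binomial_upper_tail` is hypothetical, and the authors may have used a different test.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 32 of the 50 reviewed disagreements were resolved in favor of film-screen.
p_one_sided = binomial_upper_tail(32, 50)
print(f"one-sided exact p = {p_one_sided:.3f}")
```

With p = 0.5 the distribution is symmetric, so a two-sided version of this test would simply double the tail probability.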
The causes of disagreement between film-screen mammography and digital mammography interpretations of diagnostic mammograms were differences in management approach between radiologists (52%), information from additional film-screen mammography or sonographic workup (34%), and technical differences in the examinations (10%), as shown in Table 9. Although film-screen mammography and digital mammography were performed by the same technologists within minutes of each other, technical differences still accounted for a different management recommendation in five cases. These technical differences consisted of slight positioning differences between digital mammography and film-screen mammography in four cases and better image contrast in one case. In one case a cancerous lesion was detected as a cluster of microcalcifications seen on digital mammography only (Fig. 1A,1B), and in another case a cancerous lesion was perceived as a mass on film-screen mammography only (Fig. 2A,2B,2C).
Fig. 1A. 49-year-old woman who presented to diagnostic center with no clinical complaints. Film-screen mammogram (not shown) was interpreted as having normal findings. Photographic magnification of full-field digital mediolateral mammogram after electronic magnification reveals faint cluster of indeterminate microcalcifications (arrow) in lower breast. Cluster was not visible, even retrospectively, on film-screen mammography. Patient was called back on basis of full-field digital mammography results and underwent biopsy for microcalcifications with stereotactic guidance.
Fig. 1B. 49-year-old woman who presented to diagnostic center with no clinical complaints. Film-screen mammogram (not shown) was interpreted as having normal findings. Radiograph shows multiple microcalcifications in several core specimens. Pathology found ductal carcinoma in situ, cribriform morphology, 3.5 mm, grade 1 of 3.
Fig. 2A. 47-year-old woman who presented for examination of indurated area around right nipple. Mediolateral full-field digital mammogram reveals normal-appearing tissue.
Fig. 2B. 47-year-old woman who presented for examination of indurated area around right nipple. Craniocaudal digital mammogram reveals focal nodularity medial to nipple (arrow). This mammogram was interpreted as showing normal findings.
Fig. 2C. 47-year-old woman who presented for examination of indurated area around right nipple. Magnified spot compression film-screen mammogram in craniocaudal projection confirms spiculated mass. Pathology found infiltrating lobular carcinoma, 2.5 cm, grade 2 of 3.
The introduction of digital mammography to the field of breast imaging and its recent Food and Drug Administration approval for clinical mammography by at least one manufacturer have generated enthusiasm and hope for the improved detection of early breast cancer while raising important questions about the differences, advantages, and disadvantages that digital mammography brings to breast imaging. Variations in mammographic interpretations among radiologists have been reported to be substantial [1,2,3,4] and to be responsible for potential delays in breast cancer diagnosis [6]. This variability in interpretation results in background “noise” that confounds observer studies designed to separate the diagnostic differences between digital mammography and film-screen mammography.
The American College of Radiology introduced and developed a lexicon of terminology (BI-RADS) with definitions to provide standardized language, a reporting structure, and a decision-oriented approach to the assessment of mammograms in an effort to provide clear and accurate reports that are understandable and decisive [5]. Still, studies have documented variability in the use of BI-RADS among radiologists [3]. As we set out to investigate the differences and possible advantages that digital mammography offers, it is evident that an assessment of the contribution of variability in mammographic interpretation to the results of comparative studies is needed.
The BI-RADS lexicon and categorization offers a standardized method of classifying mammographic interpretations, so that interpretations can be grouped and agreement compared. The definition of “agreement” in mammographic interpretation needs to consider treatment implications to the patient. Clearly, no clinical impact results from a discordant interpretation if one radiologist assigns BI-RADS 1 (normal) and another assigns BI-RADS 2 (benign findings), because both result in a recommendation for yearly mammography. Clinical impact is minimal if categorization of BI-RADS 1 or 2 and BI-RADS 3 (probably benign) is discordant, because only the timing of the follow-up examination would be changed (from 1 year to 6 months). We defined this discrepancy as “partial agreement” to reflect the reduced impact on clinical care (Table 1).
Significantly different treatment recommendations result when one interpretation recommends biopsy of a finding suspicious for carcinoma (BI-RADS 4 or 5) and the other interpretation categorizes the finding as negative or amenable to mammographic follow-up (BI-RADS 1, 2, or 3). The level of disagreement observed in our study between digital mammography and film-screen mammography interpretations of diagnostic mammograms was lower (4%) than expected when compared with interobserver disagreement rates of 15-27% found by Elmore et al. [1] for a biopsy recommendation among radiologists interpreting the same set of two-image film-screen mammograms.
Several reasons may exist for the lower disagreement rate in BI-RADS categories measured in our study. Possible explanations include different indications for mammographic examination than those previously reported (i.e., diagnostic instead of screening mammography), the number and type of mammograms obtained in this study, an unusual distribution of breast parenchymal density among our study population, or a more uniform set of radiologists dedicated to breast imaging.
To elucidate the effect of different indications for mammography on the rate of disagreement among BI-RADS categories, it is useful to consider diagnostic mammograms separately from screening mammograms in our study group. By doing so, a difference becomes apparent in BI-RADS categories between asymptomatic breasts and breasts with clinical symptoms or mammographic areas of concern. Screening mammograms had a higher rate of agreement between film-screen mammography and digital mammography than diagnostic mammograms (87% vs. 70%) and a lower rate of disagreement (2% vs. 7%) (p < 0.0001). Our disagreement rate for screening mammograms (2%) was considerably lower than that found in the screening study of Lewin et al. [6] (17%). This difference may be attributable in part to the presence of one extra image with each technique in our study, a difference in the patient population, or differences in radiologists between the two studies. In our study, although the radiologists varied in years of experience (1-11 years of dedicated breast imaging work), they all practiced in the same breast center, and their radiology practice consisted only of breast imaging. The lower rate of disagreement in our study is not likely to be attributable to a more difficult case mix in the screening studies than in our study, because the difference persists when only screening mammograms are considered.
The purpose of further separating the indications for diagnostic mammography into those resulting from abnormal findings on a recent screening study (callback mammography) and those resulting from other diagnostic examinations was to evaluate the effect that the availability of a recent film-screen mammography screening study (and thus, additional mammographic images) might have on BI-RADS categorization. The screening study being available for both digital mammography and film-screen mammography interpretations could affect agreement among callback and noncallback diagnostic cases. Indeed, our results indicate a difference in agreement among diagnostic mammograms based on whether they were callback cases, with a higher rate of agreement for noncallback cases (p < 0.0001). We speculate that a case selection difference may be responsible for these results. That is, the callback cases based on screening may represent a subset of cases in which the findings are more questionable mammographically, and thus more susceptible to variability in interpretation, than diagnostic mammograms for other indications.
A second possible contributing factor to disagreement in BI-RADS categories is that the diagnostic workup in our study consisted of three images in 1010 (88%) of 1147 breasts and two images in 125 breasts (11%), instead of the standard two images per breast obtained in other studies [4]. The additional information added by a third projection on both techniques, which included at least one geometric magnification view in 28% of breasts (Table 3), may have contributed to the higher rate of agreement in this study.
Finally, breast density can affect mammographic interpretation because mammograms with dense breast parenchymal patterns are more difficult to interpret than those with predominantly fatty tissue [10,11,12,13]. The breast parenchymal density distribution in our study (i.e., 4% fatty, 51% mixed, 43% dense, and 2% extremely dense) is similar to that reported in other studies [6] and does not account for our higher rate of agreement between techniques.
An interesting result of this study was the lower agreement rate and higher disagreement rate between film-screen mammography and digital mammography for masses than for calcifications or architectural distortions (Table 6). This difference was highly significant (p < 0.001) and to our knowledge has not been reported previously. The differences in agreement rates among other lesion types were not significant (p = 0.243).
The general trend of having lower agreement rates between film-screen mammography and digital mammography for cancer was true in our study as in others (11/18 or 61% agreement for this diagnostic study, vs. 8/19 or 42% agreement for the screening study of Lewin et al. [6]). However, the number of cancer cases in each study was too small for the difference in agreement rates between screening and diagnostic cohorts to be statistically significant (p = 0.199).
Although the overall disagreement rate was low (4%), the primary cause of disagreement was differences in treatment decisions by the radiologists (interobserver variability). This was the cause of disagreement in 26 (52%) of the 50 cases in which disagreement occurred. Information gained from supplementary film-screen mammography images (those obtained beyond the three diagnostic images that matched those obtained in digital mammography) and from sonography of suspicious lesions was a secondary cause of disagreement (in 17/50 or 34% of disagreement cases).
On the basis of these results, it appears unlikely that the clinical implementation of digital mammography in the diagnostic setting will result in a significant change in the technical ability of mammography to reveal breast cancer.
In conclusion, a high level of agreement was observed between film-screen mammography and digital mammography in the diagnostic evaluation of breast cancer. The primary causes of disagreement were differences in the treatment approaches of the radiologists and the information provided by supplementary film-screen mammography and sonography of suspicious lesions. These differences resulted in greater variation than did differences in lesion visualization between film-screen mammography and digital mammography.
Supported by a grant from the Lynn Sage Cancer Research Foundation and through a research agreement between Northwestern University Medical School and General Electric Medical Systems, Inc.
Address correspondence to L. A. Venta.





