AJR InPractice
AJR 2000; 175:1573-1576
© American Roentgen Ray Society


Detection of Masses and Clustered Microcalcifications on Data Compressed Mammograms

An Observer Performance Study

Walter F. Good1, Jules H. Sumkin, Marie Ganott, Lara Hardesty, Brenda Holbert, Christopher M. Johns and Amy H. Klym

1 All authors: Department of Radiology, University of Pittsburgh, A451 Scaife Hall, 3550 Terrace St., Pittsburgh, PA 15261-0001.

Received February 28, 2000; accepted after revision May 15, 2000.

 
Supported in part by grants CA62800, CA66594, and CA67947 from the National Cancer Institute.

Address correspondence to W. F. Good.


Abstract
OBJECTIVE. We performed an observer performance study to evaluate observers' ability to detect breast masses and clustered microcalcifications depicted on data compressed mammograms.

MATERIALS AND METHODS. Eight observers assessed 60 mammographic images obtained in six modes, ranging from noncompressed to a maximum data compression level of 101:1. Observers were asked to rate the images on a scale of 0 to 100 for the likelihood of the presence of a mass and also independently for the likelihood of the presence of clustered microcalcifications. In addition, observers were asked to rate their subjective assessment of the quality of each image for the detection of a mass and separately for the detection of microcalcifications. Receiver operating characteristic analyses were performed.

RESULTS. The average area under the receiver operating characteristic curve, Az, for the detection of clustered microcalcifications decreased significantly at the highest data compression level when compared with the noncompressed and two lowest levels of data compression (p < 0.01), and a trend test of the average area under the receiver operating characteristic curve for all observers was statistically significant (p < 0.05). No statistically significant differences were detected among or between any of the data compression levels for the detection of masses.

CONCLUSION. At a high level of mammogram data compression, observer performance was degraded for the detection of clustered microcalcifications. Detection of masses was not affected by the data compression methods and levels used in this study.


Introduction
Investigators in the field of digital imaging believe that some form of image data compression will be necessary for the efficient handling of image data in modern radiology departments [1,2,3]. Mammography continues to be the most commonly used imaging modality for early detection of breast cancer. It is believed that digital acquisition of mammograms, in conjunction with digital management and the ability to transmit and display mammograms remotely, will ultimately increase the utility, efficiency, and efficacy of this procedure in the screening environment [4]. There is hope that remote review and consultation will enhance the value of screening mammography, particularly in underserved areas. Because of the diagnostic impact of errors that may result from the use of transmitted images, observer performance must not decrease when compressed images are used for this purpose. Several previous studies on the effects of data compression in chest and mammography procedures failed to conclusively show any significant differences in diagnostic accuracy at different levels of data compression [5,6,7,8]. For this reason, we conducted an observer performance study to evaluate mammographic images compressed to increasingly higher ratios using a scheme that is compatible with the Joint Photographic Experts Group (JPEG) compression standard. This is a frequency-dependent data compression scheme that has been used extensively in imaging in general and in medical imaging in particular [9].


Materials and Methods
Film Preparation
Sixty single-view mammograms were used in this study. Of these, 27 mammograms depicted clusters of microcalcifications, 26 depicted masses, and nine depicted both. The abnormalities revealed in 14 of the 60 images were malignant. All mammograms with positive findings were verified from pathology reports. The depicted masses ranged from 6 to 18 mm (larger dimension) with a mean size of 11.6 mm (±SD, ±4.0 mm). Eight of the masses were spiculated. Nine of the masses were rated as "easy" to detect, and 18 were rated as "subtle" (n = 4) or "extremely subtle" (n = 4) by an experienced observer who knew the true diagnosis (the presence of a mass and its location). The size of individual microcalcifications was not measured. However, of the 29 distinct clusters visualized on 27 images, 16 were rated subjectively as subtle or extremely subtle. The number of depicted microcalcifications varied significantly, with most mammograms (n = 20) depicting between four and nine. For the images with negative findings, a follow-up examination obtained at least 1 year after the mammogram with negative findings used in this study verified the examination in question. The images were subjectively rated as "fatty" (n = 3), "mostly fatty" (n = 29), "heterogeneously dense" (n = 22), and "very dense" (n = 6) by an experienced observer. We did not use the original (conventional) nondigitized images in this study, because these could be readily recognized. To stress the diagnostic system compared with the typical clinical environment, a larger fraction of the images selected were rated as subtle (difficult) either to detect the abnormality in question or to correctly identify the image as showing negative findings. All images were acquired at our institution on a DMR mammographic unit (General Electric Medical Systems, Schenectady, NY) and a Min R screen and MRE-1 film (Eastman Kodak, Rochester, NY).

The digitization process used a high-resolution, high-contrast sensitivity modified Lumisys-150 laser film digitizer (Lumisys, Sunnyvale, CA) that produces a scan matrix of 4096 x 5120 pixels for an 8 x 10 inch (18 x 24 cm) film by digitizing at a 50-µm sampling interval. This pixel resolution results in a Nyquist spatial frequency of 10 cycles/mm, which preserves much of the signal resolution of the analog film. The modulation transfer function at the Nyquist frequency is 30% in the fast-scan direction and 33% in the slow-scan direction. The 16-bit analog-to-digital converter of the digitizer permits a density measurement with a root mean square error of less than 0.01.
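As a quick check of the sampling arithmetic in the preceding paragraph, the following minimal Python sketch recomputes the Nyquist frequency and the physical scan extent from the stated 50-µm sampling interval and 4096 x 5120 matrix. The constant and function names are illustrative only and are not taken from the digitizer software.

```python
# Sampling arithmetic for a film digitizer (values taken from the text).
SAMPLING_INTERVAL_MM = 0.050   # 50-µm sampling interval
SCAN_MATRIX = (4096, 5120)     # pixels in each direction


def nyquist_cycles_per_mm(sampling_interval_mm: float) -> float:
    """Highest spatial frequency that can be sampled without aliasing."""
    return 1.0 / (2.0 * sampling_interval_mm)


def scan_extent_mm(matrix, sampling_interval_mm):
    """Physical size covered by the scan matrix, in millimetres."""
    return tuple(n * sampling_interval_mm for n in matrix)


nyquist = nyquist_cycles_per_mm(SAMPLING_INTERVAL_MM)
extent = scan_extent_mm(SCAN_MATRIX, SAMPLING_INTERVAL_MM)
print(f"Nyquist frequency: {nyquist:.1f} cycles/mm")
print(f"Scan extent: {extent[0]:.1f} mm x {extent[1]:.1f} mm")
```

The computed extent is close to the 8 x 10 inch film dimensions, and the Nyquist value agrees with the 10 cycles/mm quoted above.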

Each digitized image was compressed at five different compression levels. The JPEG compression algorithm used in this study is a 12-bit version of the standard. Before the JPEG algorithm was applied, each digitized mammogram was segmented to identify image areas corresponding to tissue as opposed to background, and background pixels were set to a constant value. The actual procedures used in this process have been described in detail elsewhere [9]. Because it is the degree of quantization rather than the compression ratio itself that determines the degradation of an image by compression, a measure of quantization, not the actual compression ratio, was used as the independent variable. Specifically, we produced quantization matrices by scaling a base matrix that had been designed to preserve slightly more accuracy for the lowest frequency components (i.e., the zero-frequency component q00 = 0.5; low-frequency components q01 = q10 = q11 = 0.7; and all other entries, those with i >= 2 or j >= 2, qij = 1.0). The five constant scaling factors we applied to this base matrix were 128, 176, 240, 304, and 400, which produced mean compression ratios of 24:1, 34:1, 52:1, 70:1, and 101:1, respectively. With this scheme, increased image noise will reduce compressibility for the same quantization factors, and digitization at different resolutions will result in different spatial frequencies in the image being affected by each quantization factor. In our data set, the highest level of quantization produced a mean compression ratio of 101:1, which is considerably higher than what is required to permit the efficient handling of mammographic images with current digital technology, and it is in the range in which studies of other algorithms applied to different image types have shown significant deterioration in image quality [9, 10]. 
Note that these particular quantization factors are appropriate only for images digitized at the resolution and with the noise characteristics under consideration in this study.
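The construction of the quantization matrices can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software used in the study: it assumes the standard 8 x 8 JPEG block, leaves the scaled entries unrounded (the text does not say how they were rounded), and the helper `quantize_dequantize` is a hypothetical name used only to show how a larger step coarsens a coefficient.

```python
# Sketch: scale a base quantization matrix by the five factors in the text.
BLOCK_SIZE = 8                              # standard JPEG DCT block size
SCALING_FACTORS = (128, 176, 240, 304, 400)  # from the text


def base_matrix(n: int = BLOCK_SIZE):
    """Base matrix: 0.5 at q00, 0.7 at q01, q10, q11, and 1.0 elsewhere."""
    q = [[1.0] * n for _ in range(n)]
    q[0][0] = 0.5
    q[0][1] = q[1][0] = q[1][1] = 0.7
    return q


def scaled_matrix(scale: float):
    """Multiply every entry of the base matrix by one constant factor."""
    return [[scale * value for value in row] for row in base_matrix()]


def quantize_dequantize(coefficient: float, step: float) -> float:
    """Uniform quantization: divide by the step, round, multiply back."""
    return round(coefficient / step) * step


for factor in SCALING_FACTORS:
    q = scaled_matrix(factor)
    # Show the DC step, a low-frequency step, and a generic step.
    print(factor, q[0][0], q[0][1], q[2][3])

# An arbitrary example coefficient loses more precision at a larger step.
print(quantize_dequantize(900.0, 128.0), quantize_dequantize(900.0, 400.0))
```

Because the low-frequency entries of the base matrix are smaller than 1.0, those components always receive a finer quantization step than the rest of the block at any given scaling factor.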

All images used in this study, both the data compressed versions as well as the original (noncompressed) digitized data versions, were laser-printed onto film with a specially modified Kodak Ektascan 2180 laser film printer (Eastman Kodak). This laser printer was modified to print a 43-µm pixel size. Laser-printed films were used for all image comparisons. The noise contributed by the laser printer is substantially lower than image noise in the original film.

The data sent to the printer are first passed through a specially designed lookup table. The first portion of the table, which corresponds to the brightest pixels, is quite flat (low contrast). The second portion of the table is steeper at the values that correspond to most of the tissue pixels in the image. The third part is relatively flat at the values corresponding to the darkest pixels in the image. This three-segment lookup table compensates for the laser printer's lower dynamic range (as compared with the original film) while preserving much of the contrast and brightness of the original film in the optical density range of 0.3 to 3.2. No enhancement processing was performed on the images (e.g., unsharp masking).
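The shape of such a three-segment lookup table can be illustrated with a piecewise-linear sketch in Python. The breakpoints (800 and 3300), the slope of the flat segments (0.4), and the 12-bit table size are hypothetical values chosen only for illustration; the actual table used with the printer is not given in the text, and the correspondence between table index and film brightness is left abstract here.

```python
# Sketch of a three-segment (flat, steep, flat) lookup table.
# All numeric parameters below are hypothetical, not the study's values.
def three_segment_lut(size=4096, lo_break=800, hi_break=3300, flat_slope=0.4):
    """Return (table, mid_slope) for a monotonic piecewise-linear mapping."""
    max_out = size - 1
    lo_out = flat_slope * lo_break                        # output at first break
    hi_out = max_out - flat_slope * (max_out - hi_break)  # output at second break
    # The middle segment must cover whatever output range the flat ends leave.
    mid_slope = (hi_out - lo_out) / (hi_break - lo_break)
    table = []
    for value in range(size):
        if value < lo_break:
            out = flat_slope * value
        elif value < hi_break:
            out = lo_out + mid_slope * (value - lo_break)
        else:
            out = hi_out + flat_slope * (value - hi_break)
        table.append(round(out))
    return table, mid_slope


lut, mid_slope = three_segment_lut()
print(f"middle-segment slope: {mid_slope:.3f}")
print("sample entries:", lut[0], lut[800], lut[3300], lut[4095])
```

The middle segment ends up steeper than the two flat segments, which is the property the text describes: contrast is concentrated in the range that holds most of the tissue pixels.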

Observer Performance Study
Each image was labeled with a number. At the start of each observer performance session, the participating mammographer, who was unaware of the study's purpose, started a computer program that produced a list of 30 random numbers in one mode (not identified to the participant). The observer pulled the film images from an ordered file in the order given on the printed list and started a computer program that provided a unique scoring screen for each image in the same order as shown on the list; the image number was displayed on each scoring screen to ensure that the answers entered into the computer were correctly matched with the image. Each screen was identical. The observer viewed the film image and used a mouse to move a slider on a scale of 0-100 (0, definitely absent; 100, definitely present) to indicate confidence in the presence of clustered microcalcifications and used a separate slider for the presence of a mass. Observers used similar sliding scales to assess the image quality for the detection of clustered microcalcifications and separately for the detection of a mass. Only when all the information was entered for a given image could the observer access the next screen. The time the answer form remained open on the computer, in seconds, was recorded in the answer file for that image as an approximation of the time it took to review the accompanying image.

Data Analysis
The area under the estimated receiver operating characteristic curve (Az) was calculated for each observer, each compression level, and each of the abnormalities with the use of the ROCKIT program (Charles E. Metz, University of Chicago), which allows the option of calculating the Az for continuous data [11]. This calculation resulted in a total of 96 Az estimates (eight observers times six modes times two abnormalities).
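For readers unfamiliar with ROC areas, the idea can be illustrated with a nonparametric (Wilcoxon/Mann-Whitney) estimate computed directly from continuous ratings. This is a substitute technique shown only for illustration: ROCKIT estimates Az from a fitted curve, so its values will generally differ from the empirical area below. The ratings in the example are made up.

```python
# Sketch: empirical area under the ROC curve from 0-100 confidence ratings.
# This is NOT the ROCKIT fitted-curve estimate used in the study.
def empirical_auc(abnormal_ratings, normal_ratings):
    """Probability that an abnormal case is rated above a normal case.

    Ties count as one half, which equals the trapezoidal ROC area.
    """
    wins = 0.0
    for a in abnormal_ratings:
        for n in normal_ratings:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal_ratings) * len(normal_ratings))


# Hypothetical ratings for one observer, one mode, one abnormality type.
abnormal = [90, 75, 60, 40]
normal = [10, 30, 60, 20]
print(empirical_auc(abnormal, normal))

# Number of area estimates in the study design described above.
N_OBSERVERS, N_MODES, N_ABNORMALITIES = 8, 6, 2
print(N_OBSERVERS * N_MODES * N_ABNORMALITIES)
```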

The data were analyzed with the use of a one-way analysis of variance and a trend analysis to test for any significant differences in observer performance due to data compression. The trend test was a regression analysis, for which the statistical inference was derived from the slope estimate and the corresponding t statistic.
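A regression-based trend test of this kind can be sketched as an ordinary least-squares fit followed by a t statistic for the slope. The sketch below is a generic textbook formulation, not the authors' analysis code; the use of the mode index (0 through 5) as the predictor and the example Az values are assumptions made for illustration.

```python
# Sketch: least-squares slope and its t statistic for a trend test.
import math


def slope_t_statistic(x, y):
    """Return (slope, t, degrees_of_freedom) for a simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    standard_error = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / standard_error, n - 2


# Hypothetical mean Az values for the six modes (made-up numbers).
modes = [0, 1, 2, 3, 4, 5]
mean_az = [0.90, 0.89, 0.88, 0.86, 0.85, 0.82]
slope, t, dof = slope_t_statistic(modes, mean_az)
print(f"slope = {slope:.4f}, t = {t:.2f}, df = {dof}")
```

The t statistic would then be compared with a t distribution on n - 2 degrees of freedom to obtain the p value; a negative slope with a large negative t indicates a decreasing trend.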

For masses and clustered microcalcifications, image quality ratings were averaged for each observer, and the mean and standard error of the averages were calculated for each mode. Similarly, the estimates of the observer times, in seconds, were averaged for each observer, and the mean and standard error of those averages were computed for each mode. Times exceeding 300 sec (5 min) were excluded from the analysis in an effort to discard times that were excessively long because of external factors such as telephone calls and discussions. In this study, approximately 4% of the interpretations were excluded from the computation.
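The exclusion rule and the two-stage averaging described above can be sketched as follows. The timing values are invented for illustration, and the function names are hypothetical; only the 300-sec cutoff comes from the text.

```python
# Sketch: drop long reading times, average per observer, then summarize.
import math
from statistics import mean, stdev

MAX_SECONDS = 300  # times exceeding this were excluded (from the text)


def observer_average(times):
    """Average one observer's reading times after excluding long ones."""
    kept = [t for t in times if t <= MAX_SECONDS]
    return mean(kept)


def mean_and_standard_error(values):
    """Mean and standard error of the per-observer averages."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical reading times (seconds) for three observers in one mode.
times_by_observer = {
    "observer_a": [55, 70, 640, 61],   # 640 s is excluded
    "observer_b": [80, 75, 90],
    "observer_c": [45, 50, 52, 310],   # 310 s is excluded
}
averages = [observer_average(t) for t in times_by_observer.values()]
print(mean_and_standard_error(averages))
```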


Results
Tables 1 and 2 summarize the areas under the receiver operating characteristic curves and the corresponding standard deviations by observer and compression level for detection of clustered microcalcifications and masses. Averages were obtained for the eight observers who completed the study, and the mean and standard error were computed for each compression level. The one-way analysis of variance result, when the variability due to observers and modes is considered, was statistically significant for the detection of clustered microcalcifications (p < 0.01) but not for the detection of a mass (p = 0.76). For clustered microcalcifications, Tukey's test was applied to determine which specific pairwise differences among the average areas of the modes were significant. This method established a statistically significant difference for only the following three pairwise comparisons of compression ratios: noncompressed versus 101:1, 24:1 versus 101:1, and 34:1 versus 101:1 (p < 0.05). All other pairwise comparisons were not statistically significant. The trend test for the microcalcification data showed a statistically significant decrease in the average areas as the compression level increased (p < 0.05). The trend test remained significant for clustered microcalcifications (p < 0.05) after omitting the results of the highest compression level (101:1).


TABLE 1 Observer-Specific Area Under the Receiver Operating Characteristic Curve Values (Az) for Detection of Clustered Microcalcifications After Data Compression to Different Compression Levels

 

TABLE 2 Observer-Specific Area Under the Receiver Operating Characteristic Curve Values (Az) for Detection of Masses After Data Compression to Different Compression Levels

 

The means of the image quality ratings for each successive data compression level, starting with noncompressed, were 55, 53, 53, 53, 50, and 50 for the detection of a mass and 49, 47, 47, 44, 38, and 33 for the detection of clustered microcalcifications. The results of a trend test showed a statistically significant decrease in the perceived image quality rating for both abnormalities as data compression levels increased (p < 0.01). For the detection of microcalcifications, the perceived image quality was judged as poorer and decreased faster with increasing data compression levels. The mean interpretation times (±SD) in seconds, from noncompressed through increasing data compression levels, were 67 (±15), 65 (±10), 58 (±17), 62 (±17), 62 (±21), and 59 (±14). These results were not significantly different for any mode (p >= 0.12), and a trend test was also not significant (p = 0.79).
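The difference in how quickly the two sets of quality ratings fall can be made concrete with simple arithmetic on the reported means. The sketch below only restates the twelve mean ratings given above; it is a descriptive summary and not a reproduction of the trend test.

```python
# Sketch: total and relative decline of the reported mean quality ratings,
# from the noncompressed mode to the highest compression level.
MASS_QUALITY = [55, 53, 53, 53, 50, 50]        # reported means
MICROCALC_QUALITY = [49, 47, 47, 44, 38, 33]   # reported means


def decline(ratings):
    """Return (absolute drop, drop as a percentage of the first rating)."""
    drop = ratings[0] - ratings[-1]
    return drop, 100.0 * drop / ratings[0]


for label, ratings in (("mass", MASS_QUALITY),
                       ("microcalcifications", MICROCALC_QUALITY)):
    drop, percent = decline(ratings)
    print(f"{label}: drop of {drop} points ({percent:.1f}% of the initial rating)")
```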


Discussion
The results of this study indicate that the JPEG data compression, as used here, affects observer performance in the detection of clustered microcalcifications at high levels of compression. This lower performance may be partially because of the increased noise that may mimic the microcalcification clusters. The trend test indicated that the Az value decreases consistently with increasing levels of compression. The trend test remained significant after the exclusion of the highest level from the analysis. Hence, levels of compression substantially below 101:1 may also prove to affect the detection of clustered microcalcifications in digital mammography.

The relevance of this study is limited somewhat by the fact that we used only a single view, whereas multiple views (two at a minimum) are used in the clinical environment. However, relative detection performance levels and trends in performance as a function of levels of data compression are the main interest here rather than absolute detection rates. Therefore, we believe that the study results and conclusions are scientifically valid even if direct clinical relevance is reduced. As indicated, we did not use the conventional nondigitized film in our study because it was readily recognizable. Because of its higher quality (mainly resolution > 10 lp/mm) when compared with the laser-printed film, one would expect potentially better performance with conventional film. However, the quality of the noncompressed digital images was quite high, and the detection performance is expected to be similar to that with conventional film.

In a previous multipoint rank-order study performed by our group [12], a subset of the cases used in this study was presented to a different group of mammographers who were asked to rank the images in each case from best to worst with respect to their usefulness in making the correct diagnosis of masses and clustered microcalcifications. Observers were able to order the films in approximately the order of the compression level even though they were unaware of the study's exact purpose. Although subjective, these results are consistent with the objective performance-based findings presented here.

Assuming that there is acceptable performance in interpreting mammograms at relatively high levels of data compression, the subjective ratings of image quality, which decrease steadily with increasing levels of data compression, suggest that subjective comfort of radiologists rather than objective degradation in performance may ultimately determine the maximum acceptable data compression level. We want to emphasize that objectively measured detection performance levels, rather than subjective ratings of perceived image quality, should ultimately determine what is acceptable (or not) in the clinical environment, but in practical terms this ideal is not always realized. When objective and subjective measures largely concur, as in this case, one should seriously evaluate the implications for both diagnosis and individual preferences.

In conclusion, high levels of data compression seem to affect observer performance in the detection of clustered microcalcifications. As important, perhaps, is that subjective assessment of image quality seems to indicate that comfort level of the observer decreases as well. The consistent trends we have observed suggest that there may also be an effect at lower compression levels, but measuring such effects would require a prohibitively large sample size. Hence, although possibly real, these effects may not be relevant to the clinical environment.


Acknowledgments
 
We thank Jill King and David Gur for their participation in the study and data analysis as well as for their help with manuscript preparation.


References

  1. Egashira K, Nakata H, Watanabe H, et al. Clinical evaluation of irreversible data compression for computed radiography of the chest. J Digit Imaging 1998;11:176-181
  2. Aberle DR, Fergus G, Sayre JW, et al. The effect of irreversible image compression on diagnostic accuracy in thoracic imaging. Invest Radiol 1993;28:398-403
  3. Chan H, Lo SB, Niklason LT, Ikeda DM, Lam KL. Image compression in digital mammography: effects on computerized detection of subtle microcalcifications. Med Phys 1996;23:1325-1336
  4. Pisano ED. Current status of full-field digital mammography. Radiology 2000;214:26-28
  5. MacMahon H, Doi K, Sanada M, et al. Data compression: effect on diagnostic accuracy in digital chest radiography. Radiology 1991;178:175-179
  6. Kido S, Ikezoe J, Kondoh H, et al. Detection of subtle interstitial abnormalities of the lungs on digitized chest radiographs: acceptable data compression ratios. AJR 1996;167:111-115
  7. Good WF, Maitz G, King J, Gennari R, Gur D. Observer performance assessment of JPEG-compressed high-resolution chest images. Proc SPIE 1999;3663:8-13
  8. Adams CN, Aiyer A, Betts BJ, et al. Evaluating quality and utility of digital mammograms and lossy compressed digital mammograms. In: Doi K, Giger M, Nishikawa RM, Schmidt RA, eds. Proceedings of the Third International Workshop on Digital Mammography. Amsterdam: Elsevier Science, 1996:169-176
  9. Good WF, Maitz GS, Gur D. Joint Photographic Experts Group (JPEG) compatible data compression of mammograms. J Digit Imaging 1994;7:123-132
  10. Rockette HE, Johns CM, Weissman JL, et al. Relationship of subjective ratings of image quality and observer performance. Proc SPIE 1997;3036:152-159
  11. Metz CE, Herman BA, Roe CA. Statistical comparison of two ROC-curve estimates obtained from partially paired datasets. Med Decis Making 1998;18:110-121
  12. Good WF, Sumkin JH, Dash N, et al. Observer sensitivity to small differences: a multipoint rank-order experiment. AJR 1999;173:275-278
