1
Department of Biomedical Engineering, ND20, Agnes Christine Roberts Breast
Imaging Laboratory, The Cleveland Clinic Foundation, 9500 Euclid Ave.,
Cleveland, OH 44195.
2
Department of Biostatistics and Epidemiology, The Cleveland Clinic Foundation,
Cleveland, OH 44195.
3
Division of Radiology, The Cleveland Clinic Foundation, Cleveland, OH
44195.
Received September 30, 1999;
accepted after revision December 8, 1999.
Supported by institutional research grant RPC-3528 from the American Cancer
Society.
Abstract
MATERIALS AND METHODS. Four image processing algorithms (adaptive unsharp masking, contrast-limited adaptive histogram equalization, adaptive neighborhood contrast enhancement, and wavelet-based enhancement) were applied to one secondarily digitized mammogram from each of 40 cases (10 each of benign and malignant masses and 10 each of benign and malignant microcalcifications). The four enhanced images and the one unenhanced image were displayed in random positions across three high-resolution monitors. Four expert mammographers ranked the unenhanced and the four enhanced images from 1 (best) to 5 (worst).
RESULTS. For microcalcifications, the adaptive neighborhood contrast enhancement algorithm was the most preferred in 49% of the interpretations, the wavelet-based enhancement in 28%, and the unenhanced image in 13%. For masses, the unenhanced image was the most preferred in 58% of cases, followed by the unsharp masking algorithm (28%).
CONCLUSION. Appropriate image enhancement improves the visibility of microcalcifications. Among the different algorithms, the adaptive neighborhood contrast enhancement algorithm was preferred most often. For masses, no significant improvement was observed with any of these image processing approaches compared with the unenhanced image. Different image processing approaches may need to be used, depending on the type of lesion. This study has implications for the practice of digital mammography.
Only a few of these algorithms have undergone any kind of clinical testing [5, 12, 15]. Also, in most cases, the results were printed back to film. However, with the advent of full-field digital mammography, it is important that the performance of these enhancement algorithms be evaluated for soft-copy display. It is also important to compare these algorithms with each other because algorithms in each category represent different image-processing methodologies. Such a comparison will help to decide which algorithm will be truly useful to clinicians and to evaluate the role of mammographic image enhancement in a soft-copy display environment. Thus, the purpose of this study was to determine the preference of mammographers among several of these algorithms in a clinical soft-copy display setting and to establish a promising set of algorithms for use with mammograms. These algorithms could then be evaluated in a diagnostic accuracy study with a larger number of cases.
All mammography at our institution was performed on American College of Radiology-accredited mammography units, using 25-30 kVp with phototiming and molybdenum targets and filters. The film-screen combination was Min-R screens coupled with Min-R E film (Eastman Kodak, Rochester, NY). The units meet Mammography Quality Standards Act specifications. Film used to record mammograms generally comes in two sizes: 8 x 10 inch (20 x 25 cm) films for smaller breasts and 10 x 12 inch (25 x 30 cm) films for larger breasts.
All digital images were cropped to a size of 800 x 2000 pixels for analysis. In the case of small films (8 x 10 inches [20 x 25 cm]), the entire breast was included with minimal background. In the case of the larger films (10 x 12 inches [25 x 30 cm]), cropping deleted a small portion of the breast area. However, in these cases, care was taken to ensure that the entire region on the mammogram containing the abnormality was included.
Image Enhancement
One algorithm each from the categories of adaptive unsharp masking [17], contrast-limited adaptive histogram equalization [7], adaptive neighborhood contrast enhancement [11], and wavelet-based enhancement [13] was selected for analysis. (Because the fuzzy modeling [16] mammographic enhancement algorithm is still preliminary, it was not considered for this study.) These algorithms were implemented in the C programming language and run on an Iris Indigo XZ workstation (Silicon Graphics, Mountain View, CA). The 40 images in our data set (20 with microcalcifications [Fig. 1A-1E] and 20 with masses [Fig. 2A-2E]) were processed with each of the four algorithms to obtain four enhanced images per case. For all algorithms except the adaptive neighborhood contrast enhancement, certain parameters had to be fixed before running the algorithm. The parameters for each algorithm were chosen within the range suggested in the original references. Brief descriptions of the selected algorithms follow. Please refer to the original references for details.
Adaptive unsharp masking. Unsharp masking is the process of subtracting a blurred image from an original and is often used as an image enhancement tool in photography. For a digital image I, an unsharp masked image, I_UNS, can be obtained by first creating low-pass filtered and high-pass filtered versions of I, namely, I_L and I_H. I_UNS is then given by:

$$I_{UNS} = I_L + G \cdot I_H \qquad (1)$$

where G is the gain applied to the high-pass component. In the adaptive algorithm [17], G is based not only on the local characteristics of the image but also on two important properties of the human visual system. The first property is that the threshold contrast in a region is a function of the spatial frequency in that region. The second is the increased sensitivity of the human eye to noise in "busy" or detailed areas in which the spatial activity is higher than in smoother regions.

The algorithm consists of first creating three images from I: I_L, a low-pass filtered image obtained using a moving average filter; I_H, a high-pass filtered image obtained by subtracting I_L from I; and I_G, a gradient image obtained using two orthogonal Prewitt standard gradient operators. I is also divided into contiguous regions that are blocks with 50% overlap. I_H is used to obtain the local contrast in every region. I_G is used to classify the regions in I into smooth and detail regions, and to obtain the spatial frequency and thus the threshold contrast over every detail region. For a smooth region, G is equal to 1. For a detail region, G is greater than 1 only if the local contrast is less than the threshold contrast. The G value for each region is applied to the central pixel in the region. G at any other pixel is obtained by bilinear interpolation of the contrast gains of the pixel's four neighboring blocks. The final image is then given by equation 1.
For the unsharp masking procedure, the original algorithm includes a global enhancement step before the actual calculation of the unsharp mask image, but we found that this step was not required. Two parameters had to be specified: p_T, a threshold to separate smooth and detail regions, was set to 0.1, and JND_0, the just noticeable difference at low spatial frequencies, was set to 50.
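The core of the unsharp masking step can be illustrated with a minimal sketch. It assumes that equation 1 has the form I_UNS = I_L + G * I_H, which is consistent with a gain of 1 leaving smooth regions unchanged. The function names, the 3 x 3 moving average window, and the per-pixel gain map passed in as an argument are illustrative assumptions; the classification into smooth and detail regions and the interpolation of G described above are not reproduced.

```python
def moving_average(image, radius=1):
    """Low-pass filter: mean over a (2*radius+1)^2 window, truncated at the borders."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out


def unsharp_mask(image, gain, radius=1):
    """Return I_L + G * I_H, with I_H = I - I_L (assumed form of equation 1).

    `gain` is a hypothetical per-pixel map of G with the same shape as `image`.
    """
    low = moving_average(image, radius)
    rows, cols = len(image), len(image[0])
    return [
        [low[r][c] + gain[r][c] * (image[r][c] - low[r][c]) for c in range(cols)]
        for r in range(rows)
    ]
```

With a gain of 1 everywhere the output equals the input, and a gain above 1 amplifies only the high-pass detail, which is why the adaptive scheme can restrict enhancement to low-contrast detail regions.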
Contrast-limited adaptive histogram equalization. Histogram equalization [18] is a standard technique used for the enhancement of images and is performed by replacing every pixel in a given image by the integral of the histogram of the given image up to the value of the pixel. With adaptive histogram equalization, a local histogram is calculated and used to obtain the final value. With standard and adaptive histogram equalization, there is a danger of overenhancing the image because of the enhancement of noise. High peaks in the histogram, caused by nearly uniform regions, lead to large values in the final image because of integration. This problem can be corrected by limiting the amount of contrast enhancement at every pixel, which is achieved by clipping the original histogram to a limit. This is the central idea behind the contrast-limited adaptive histogram equalization algorithm proposed by Pizer et al. [7]. Because histogram clipping requires redistribution of those pixels that are above the contrast limit, an iterative binary search procedure is used to perform this redistribution. Thus, the procedure consists of obtaining a local histogram at every pixel, clipping this histogram to the specified clipping limit, modifying the histogram by redistributing pixels using the binary search procedure, and integrating the histogram up to the value of the pixel to obtain the final value.

For our analysis, the contrast limit was selected as 0.5. Although the algorithm does not require binning of the histogram, this step was found to be necessary. We selected a bin size of 100. At every pixel, a neighborhood size of 20 x 20 pixels was selected. According to the algorithm in the study by Pizer et al., the original image can be divided into blocks, with the histogram modification procedure being performed only for the central pixel in each block, so that bilinear interpolation can be used to obtain the remaining pixel values. However, this method was found to give poor results. Hence, the histogram modification procedure was applied to every pixel, which was computationally expensive.
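The per-pixel procedure (local histogram, clipping, redistribution, integration) can be sketched as follows. This is a simplified illustration, not the implementation used in the study: the excess counts are redistributed uniformly in a single pass rather than with the iterative binary search described above, and `clip_limit` is expressed as a maximum count per bin, which is an assumed unit that need not match the contrast limit of 0.5 quoted in the text. The function name is hypothetical.

```python
def clahe_pixel(neighborhood, pixel_value, levels, clip_limit):
    """Map one pixel through the clipped, re-integrated histogram of its neighborhood.

    neighborhood: flat list of integer gray levels in [0, levels).
    clip_limit:   maximum count allowed in any histogram bin (assumed unit).
    """
    # Local histogram of the neighborhood.
    hist = [0] * levels
    for value in neighborhood:
        hist[value] += 1

    # Clip the peaks and collect the counts that were cut off.
    excess = sum(max(0, h - clip_limit) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]

    # Simplification: spread the excess uniformly over all bins in one pass.
    share = excess / levels
    clipped = [h + share for h in clipped]

    # Integrate the modified histogram up to the value of the pixel.
    cumulative = sum(clipped[: pixel_value + 1])
    return round((levels - 1) * cumulative / len(neighborhood))
```

For a perfectly uniform neighborhood, the unclipped histogram maps the pixel to the top of the gray scale, whereas clipping pulls the output back toward the middle, which is the overenhancement of nearly uniform regions that the contrast limit is meant to prevent.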
Adaptive neighborhood contrast enhancement. In the algorithms already discussed, the local neighborhood at every pixel is a rectangular region, which may not adequately represent the local characteristics of a pixel. Thus, at every pixel it is important to define a contextual region, appropriate to that particular pixel, that defines the local neighborhood of that pixel. This is done with adaptive-neighborhood or region-based image processing.
We used the adaptive-neighborhood contrast enhancement algorithm described by Morrow et al. [11] for our analysis. The algorithm consists of the following: Every pixel in the image is a seed pixel for a region aggregation procedure. Starting at the seed pixel, the local region around the pixel is formed by adding neighboring pixels to the region if they are within a specified gray-level deviation, k, from the seed pixel. Once the local region (called the foreground) is formed, a background (having approximately the same number of pixels as the foreground) is grown around the foreground. If f and b are the mean gray-level values of the foreground and background, then the contrast, C, of the region is given by:

$$C = \frac{|f - b|}{f + b} \qquad (2)$$

This equation is similar to Weber's ratio, W, which defines the ratio of luminance difference between a just noticeable object and its background, to its background luminance [18]. W is approximately 0.02 for simple objects viewed under standard lighting conditions, which implies that regions that are less than 2% in luminance difference from their background will be indistinguishable from their background. The minimum contrast, C_min, that a region has with respect to its background is related to the gray-level deviation k by C_min ≈ k / 2 [11]. Hence, for the region or feature to be just distinguishable from its background, k has to be at most 0.04.

Once the foreground and background are grown for a given seed pixel, C is calculated using equation 2. An empirically defined look-up table has been described [11], by which a pixel whose region has a contrast C from its background is given a new value to increase the contrast. The look-up table has been constructed so that regions with high contrast are not affected, and only regions with low to moderate contrast are enhanced. Equation 2 can be rearranged to obtain the new gray level at the seed pixel. All pixels in a foreground region having the same value as the seed pixel (called redundant pixels) are also updated to the same new value, thus saving computational time.
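The region aggregation and contrast calculation can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from Morrow et al. [11]: k is treated as a deviation relative to the seed value, the foreground uses 4-connectivity, the background is grown in one-pixel rings until it holds at least as many pixels as the foreground, and the contrast is taken as |f - b| / (f + b). The empirical look-up table is not reproduced, and all function names are hypothetical.

```python
from collections import deque


def grow_foreground(image, seed, k):
    """4-connected region of pixels within a relative deviation k of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    tolerance = k * seed_value
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < rows and 0 <= cc < cols and (rr, cc) not in region:
                if abs(image[rr][cc] - seed_value) <= tolerance:
                    region.add((rr, cc))
                    queue.append((rr, cc))
    return region


def grow_background(image, foreground):
    """Add one-pixel rings around the foreground until the sizes roughly match."""
    rows, cols = len(image), len(image[0])
    background = set()
    covered = set(foreground)
    while len(background) < len(foreground):
        ring = set()
        for r, c in covered:
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    p = (r + dr, c + dc)
                    if 0 <= p[0] < rows and 0 <= p[1] < cols and p not in covered:
                        ring.add(p)
        if not ring:  # the image is exhausted
            break
        background |= ring
        covered |= ring
    return background


def region_contrast(image, foreground, background):
    """Assumed contrast measure |f - b| / (f + b); returns 0.0 without a background."""
    if not background:
        return 0.0
    f = sum(image[r][c] for r, c in foreground) / len(foreground)
    b = sum(image[r][c] for r, c in background) / len(background)
    return abs(f - b) / (f + b)
```

For a single pixel of value 110 on a uniform background of 100 with k = 0.04, the foreground is the seed alone and the contrast is 10 / 210, or about 0.048, which would place the region in the low-contrast range that the look-up table is designed to enhance.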
Wavelet-based enhancement. A multiresolution or wavelet decomposition of an image approximately divides the image into several subbands containing features at different scales. The advantage of a multiresolution decomposition of mammograms is that small features like microcalcifications will be prominent in one subband, whereas larger features like masses will be dominant in a different subband. We have used the dyadic wavelet enhancement algorithm [13, 19] for our analyses. The given image I is first decomposed into a set of subband images using appropriate analysis filters F. The image can be reconstructed or synthesized from its subband images using synthesis filters G. An L-level M-orientation decomposition and reconstruction of I is given by:

[Equation 3]
After dyadic wavelet decomposition, we used the multiscale adaptive gain procedure [13] to enhance each subband image. This procedure consists of suppressing pixel values of very low amplitude and enhancing those that are higher than a certain threshold within each level. A gray level y in the original subband image is converted to f(y) by:

$$f(y) = a\,[\,\mathrm{sigm}(c(y - b)) - \mathrm{sigm}(-c(y + b))\,] \qquad (4)$$

$$a = \frac{1}{\mathrm{sigm}(c(1 - b)) - \mathrm{sigm}(-c(1 + b))} \qquad (5)$$

$$\mathrm{sigm}(y) = \frac{1}{1 + e^{-y}} \qquad (6)$$

In this step, b and c control the threshold and rate of enhancement, respectively, and must be specified. We chose three levels for the decomposition, with b = 0.2 and c = 20. Although all subband images are enhanced in the procedure described by Laine et al. [13], we noted that results were much better when only the first level was enhanced.
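The effect of the threshold b and rate c can be seen in a short sketch of the gain function. It assumes the sigmoid-based form published by Laine et al. [13], f(y) = a[sigm(c(y - b)) - sigm(-c(y + b))], and assumes that subband coefficients have been normalized to the range [-1, 1]; the normalization and the function names are assumptions made for illustration.

```python
import math


def sigm(y):
    """Logistic sigmoid, 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))


def adaptive_gain(y, b=0.2, c=20.0):
    """Map a normalized subband coefficient y in [-1, 1] to its enhanced value.

    b sets the threshold and c the rate of enhancement; the defaults are the
    values quoted in the text. The scale factor a makes f(1) equal to 1.
    """
    a = 1.0 / (sigm(c * (1.0 - b)) - sigm(-c * (1.0 + b)))
    return a * (sigm(c * (y - b)) - sigm(-c * (y + b)))
```

With b = 0.2 and c = 20, a coefficient of 0.05 is mapped to roughly 0.04 (suppressed), whereas a coefficient of 0.3 is mapped to roughly 0.88 (enhanced). The function is odd, so negative coefficients are treated symmetrically.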
Clinical Review
An outline image was prepared for each case that showed the outline of the
breast and indicated the location of the lesion. A Motif X-Windows display
program (X/Motif version 1.2; Open Software Foundation, Woburn, MA) was
developed to display six images (four enhanced images with the original and
outline) across three 2K high-resolution monitors (Megascan, Billerica, MA)
driven by an MD2K graphics display card (DOME Imaging Systems, Waltham, MA)
with two images displayed per monitor. The contrast settings and luminance
output on all monitors were adjusted to be the same and were verified with a
photometer. Once the contrast settings were fixed, they remained unchanged
during the entire study.
A randomization protocol was written so that the enhanced images and an unenhanced original image appeared in random positions across the three monitors, with the outline image always being shown on the left side of the leftmost monitor. Radiologists were shown the 20 images with microcalcifications first, and then the 20 images with masses. The order of cases in each category was randomized, so that benign and malignant lesions in each category were shown in random order. The display software was written so that once the first case was pulled up, subsequent cases could be loaded one after the other by simply pushing a button. The radiologists were asked to rank the five images from 1 (best) to 5 (worst) with regard to the visibility and characterization of the lesion. The pathology and location of the lesion were available to them in each case. When the observers judged two images to be equal, those two were given the same rank, and the rank immediately below was skipped. A study design such as this, in which all the images from a case are displayed side by side and the observer rank-orders them from best to worst, has been used and advocated recently by Good et al. [20]. They used the term "multipoint rank ordering" to describe the study design.
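The tie-handling rule described above (tied images share a rank and the next rank is skipped) can be written as a short sketch. The input format, an ordered list of tie groups from best to worst, and the image labels are hypothetical and serve only to illustrate the rule.

```python
def ranks_from_groups(groups):
    """Assign ranks to images given tie groups ordered from best to worst.

    Images in the same group share a rank; the following rank or ranks are
    skipped so that the worst possible rank stays equal to the image count.
    """
    ranks = {}
    next_rank = 1
    for group in groups:
        for name in group:
            ranks[name] = next_rank
        next_rank += len(group)
    return ranks
```

For example, if an observer judged the wavelet and original images to be equal in second place, `ranks_from_groups([["ANCE"], ["wavelet", "original"], ["unsharp"], ["CLAHE"]])` assigns rank 2 to both, skips rank 3, and gives the remaining images ranks 4 and 5.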
Four experienced board-certified radiologists participated in the study. They were initially trained to use the display software with eight cases (not part of the study set), consisting of two cases each of benign and malignant microcalcifications and masses. Once the training session was completed for all radiologists, they observed all 40 cases in one session and were not allowed to use any aids (e.g., a magnifying glass). Every observer maintained approximately the same lighting conditions during the session. They were also not allowed to change the monitor settings in any way.
Statistical Analysis
We used the Friedman two-way nonparametric test to test the null hypothesis that all five images would be equally preferred, versus the alternative hypothesis that there would be at least one difference in observer preference among the five. If the null hypothesis was rejected at a significance level of 0.05, then we used Wilcoxon's signed rank test to assess all pairwise comparisons of the five images. To control the type I error rate of these pairwise analyses, we adjusted the p values using the method of Holm [21]. A separate analysis was performed for each of the four observers.
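The Holm adjustment applied to the pairwise p values can be sketched as follows. This is the standard step-down procedure written in plain Python for illustration, not the software used in the study; the function name is hypothetical. With five images there are 10 pairwise comparisons per observer.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values in the original order.

    The smallest p value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each value is
    capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for position, index in enumerate(order):
        candidate = min(1.0, (m - position) * p_values[index])
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted
```

A comparison is then declared significant when its adjusted p value is below the chosen level (0.05 in this study), which controls the familywise type I error rate across all pairwise tests.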
With a sample size of 20 microcalcifications (and 20 masses), we have power of 0.80 to detect a difference between two algorithms if one is preferred in 85% or more of cases.
This sample size estimate is based on Wilcoxon's signed rank test with a type I error rate of 0.05 for a two-tailed test [22].
For microcalcifications, the adaptive neighborhood contrast enhancement algorithm had the lowest average ranking overall (highest preference), followed by the wavelet enhancement algorithm. Of the 80 interpretations (20 cases x 4 observers), the adaptive neighborhood contrast enhancement algorithm was chosen as the most preferred in 49% (39/80) of the interpretations, the wavelet in 28% (22/80), and the unenhanced image in 13% (10/80).
For masses, the unenhanced image had the lowest average ranking, followed by the unsharp masking algorithm. Of the 79 interpretations, the unenhanced image was the most preferred in 58% (46/79) of cases, followed by the unsharp algorithm in 28% (22/79).
Statistically significant differences were noted among the five images (Friedman test; p < 0.001 for all four observers). Tables 3 and 4 summarize the results of the pairwise analyses for calcifications and masses, respectively.
For microcalcifications, the adaptive neighborhood contrast enhancement and wavelet algorithms were preferred over contrast-limited adaptive histogram equalization by all four observers (p ≤ 0.011 and p ≤ 0.030, respectively); adaptive neighborhood contrast enhancement was preferred over the original by two of the four observers; the wavelet was preferred over the original by one of the four observers. Adaptive neighborhood contrast enhancement and wavelet were preferred over unsharp by one of the four observers. The differences between adaptive neighborhood contrast enhancement and wavelet did not reach statistical significance (p ranged from 0.135 to 1.0). The original and unsharp images were preferred to contrast-limited adaptive histogram equalization by three of the four observers (p = 0.001 and p ≤ 0.010, respectively).

For masses, the original was preferred by all four observers over contrast-limited adaptive histogram equalization (p ≤ 0.017), adaptive neighborhood contrast enhancement (p ≤ 0.013), and wavelet (p ≤ 0.016). The original was preferred over the unsharp by one of the four observers. The unsharp was preferred over the wavelet by one observer, over contrast-limited adaptive histogram equalization by two observers, and over adaptive neighborhood contrast enhancement by no observers. Adaptive neighborhood contrast enhancement was preferred over contrast-limited adaptive histogram equalization by two observers.
From the preferences established in this study, it is clear that algorithms that change the image appearance drastically (like contrast-limited adaptive histogram equalization) are the least preferred by radiologists, whereas algorithms that try to preserve the appearance of the original images (like adaptive neighborhood contrast enhancement) are the most preferred by radiologists. It is reasonable to assume that this preference for enhanced images that maintain the appearance of the original images is applicable to direct digital images as well.
In our case, parameters for each algorithm were chosen in the range suggested in the original references. However, those ranges were established by trial and error on a small set of cases in the original references, and hence it is possible that slightly different enhancements could be obtained with a different choice of parameters. It would thus be useful to establish the optimal set of parameters for each algorithm before use in a clinical setting. Puff et al. [23] proposed a technique to establish parameters for mammogram enhancement algorithms. The method consists of performing a large number of experiments using mammograms with normal findings with embedded simulated masses and microcalcifications. The mammograms are enhanced with different parameter settings, and optimal parameters are established on the basis of observer preferences. However, because parameter selection is based on simulated structures, it is not clear whether the same parameter combination would be appropriate for use with real mammograms. It would thus be useful to perform a study similar to the one by Puff et al. using several real cases of abnormal findings. Such a study would also be required to establish the optimal set of parameters for use with direct digital images. We believe that with direct digital images, although parameter tuning may be required, the actual preferred algorithms may not change.
The radiologists' preference for a particular enhancement algorithm applied to a lesion of known pathology, as established in this study, does not directly give any indication of the usefulness of the same algorithm in the diagnosis of unknown lesions. However, it still gives an indication of the more promising algorithms that could be used in a truly diagnostic setup.
Although we focused on enhancement of masses and microcalcifications, there are other significant features of interest in mammograms (e.g., architectural distortions). In practice, the number of cases with masses and microcalcifications far outweighs the number of cases with architectural distortions. Also, in many cases architectural distortions are associated with other findings. However, it would be useful to include such cases as part of a future study.
The main objective of this study was to establish a set of promising enhancement algorithms that could then be evaluated with blinded cases in a receiver operating characteristic curve study. Our results indicate that, for microcalcifications, the adaptive neighborhood contrast enhancement and wavelet algorithms are the most promising. We propose to conduct a diagnostic accuracy study with mammograms containing both benign and malignant microcalcifications to evaluate these two algorithms. The diagnostic accuracy study will be designed to test three questions: whether wavelet enhancement improves the accuracy and call-back rates obtained with an unenhanced image, whether adaptive neighborhood contrast enhancement improves the accuracy and call-back rates obtained with an unenhanced image, and whether the accuracy and call-back rates obtained with wavelet enhancement differ from the accuracy obtained with adaptive neighborhood contrast enhancement.
Because none of these methods of enhancement was appropriate for mass enhancement and because it would be desirable to have one algorithm that works equally well for both masses and microcalcifications, we propose to develop a new method based on the combination of the wavelet and adaptive neighborhood contrast enhancement approach [11, 13] to be applied to both masses and microcalcifications. Developing such a new method will be the focus of future research.
In conclusion, the performance of four image enhancement algorithms on secondarily digitized mammograms containing benign and malignant microcalcifications and masses of known pathology was compared on soft-copy display. For microcalcifications, adaptive neighborhood contrast enhancement was the most preferred algorithm, followed by wavelet enhancement. For masses, no enhancement was significantly preferred over the unenhanced image. Different image processing approaches may need to be used, depending on the type of lesion. Future work should concentrate on developing an appropriate algorithm that will enhance both masses and microcalcifications.
Acknowledgments
We thank Christine Quinn, Philip F. Murphy, and Diedre M. Coll for their review of mammograms. We also thank R. M. Rangayyan of the University of Calgary, Alberta, Canada, for providing the code for the adaptive neighborhood contrast enhancement algorithm.