1 All authors: Department of Radiology, College of Medicine, and H. Lee Moffitt Cancer Center and Research Institute at the University of South Florida, 12901 Bruce B. Downs Blvd., Box 17, Tampa, FL 33612-4799.
Received March 11, 2003;
accepted after revision September 12, 2003.
Address correspondence to M. Kallergi
(kallergi{at}moffitt.usf.edu).
Abstract
MATERIALS AND METHODS. A wavelet algorithm was designed to attenuate the image spectral characteristics responsible for the long-range image correlation that often interferes with digital display. The algorithm was evaluated with a localization response operating characteristic (LROC) experiment with 500 negative, benign, and cancer cases with masses and calcification clusters. Three observers reviewed the original and wavelet-enhanced images on a 5-Mpixel monitor using a custom-made workstation user interface.
RESULTS. Performance indexes were estimated for four different case combinations, each observer, and each interpretation mode. Wavelet enhancement improved the performance of all observers in all case combinations. Detection accuracy ranged from 0.678 to 0.827 for the unprocessed original data and from 0.709 to 0.871 for the enhanced cases. Localization accuracy ranged from 0.547 to 0.785 for the original images and from 0.568 to 0.847 for the enhanced cases, yielding increases of 5–15%. The difference between enhanced and original performances was statistically significant at the 0.10 level and in a few combinations at the 0.05 level.
CONCLUSION. Soft-copy digitized mammography could replace standard film mammography under appropriate display parameters and conditions. The optimization of the soft-copy quality is expected to require more advanced processing techniques than standard gray-scale adjustments. Wavelet-based algorithms, such as the one proposed here, offer better soft-copy quality than the originals and a better starting point for additional manual gray-scale adjustments or automated postprocessing.
Despite technical limitations, recent reports on full-field digital mammography have clearly shown equivalency between soft-copy full-field digital and standard film mammography [4, 5]. Few receiver operating characteristic (ROC) studies have been conducted to show the equivalency between film and soft-copy digitized mammography [6–10]. Results from the latter studies are promising and suggest that soft-copy digitized mammography is equivalent to standard film mammography; our own ROC experiments showed no statistically significant difference between the two techniques [6, 8, 9]. These studies, however, are not always in agreement and are not consistent across observers or cases or both. Even when no statistically significant difference is found between the two techniques, soft-copy interpretation of digitized mammograms yields mostly lower performance indexes than film interpretation [6, 8, 9]. Key factors in showing equivalency seem to be the digitization parameters, including pixel size (spatial resolution) and depth (dynamic range); the soft-copy "hanging" protocol (that is, the sequence with which images are shown to the observer); the speed of the user display interface; and the quality of the digitized image (that is, the appearance and distribution of the gray values) [6–10]. Our studies have shown that conventional linear and nonlinear methods do not fully use the range of pixel values of the digitized images and have limited success [6, 8]. Hence, there is a need for advanced techniques that allow a balanced representation of fat and dense areas in a digitized mammogram as well as maximization of the information presented in the display.
Advanced image-processing techniques may include a thickness equalization process [11] or image enhancement or both [12–17]. The former is a computationally intensive process, difficult to generalize and apply to all digitized images independent of digitizer and digitization conditions. Proposed enhancement techniques include global or local aspects, and most involve a set of parameters, the optimal values of which are determined empirically. Global image-enhancement techniques are easier to implement and adapt to the various images and could play a significant role in the optimization of soft-copy display. However, traditional approaches such as brightness and contrast adjustments (window level and width adjustments) or gamma correction are not as sufficient for digitized mammograms as they may be for MRI or CT. Digitized mammograms require more sophisticated approaches for reaching the optimal image quality and display, as suggested by the ROC experiments [6–10]. Reasons include the variety of signals in the breast image (e.g., fat, dense tissue, calcifications, cysts, solid tumors, lymph nodes, and arteries) and significant nonuniformity due in part to the long-range varied (not well-defined) correlation properties and to the various components of the imaged object (e.g., thickness differences from the nipple to the chest wall) [12–17]. Image-processing techniques designed to enhance specific features such as calcifications or masses are of particular interest because of their greater potential to improve detection and diagnosis than general enhancement algorithms [18]. Such techniques, however, are better applied locally and after the initial image display and global inspection.
Our initial attempts to improve the display of digitized mammograms were based on standard techniques such as gamma correction and window level and width adjustments performed either manually or automatically [6, 8]. Results showed that the quality of the images needed to be improved significantly if unambiguous equivalency between film and soft-copy digitized mammography was to be achieved [6]. Hence, we focused on the design of an enhancement algorithm that would satisfy at least two requirements: produce an image that would be similar to film and not appear foreign to the radiologist and integrate monitor characteristics with fundamental image properties derived from a statistical understanding of the image. The first requirement aimed at attaining the mammographer's acceptance and eliminating display bias. The second condition aimed at attaining consistency at the enhancement output in a global sense. Our goal was a method that yields robust but balanced performance across images as opposed to a method that works well on some images but poorly on others or yields different outcomes across image sets.
$$f_0 = d_1 + d_2 + \cdots + d_j + f_j \qquad (1)$$
A detailed description of this technique is provided elsewhere [19–21]. Briefly, the 2D wavelet processing is performed by extending the conventional one-dimensional approach. The wavelet coefficients are derived by repeated convolutions with high-pass and low-pass filter kernels corresponding to the mother wavelet and scaling functions, respectively. The 2D wavelet basis functions are formed by the direct product of the one-dimensional functions. There are many wavelet bases to choose from. For this work, we have used a 12-coefficient basis that is nearly symmetric (orthogonal wavelet theory precludes true symmetry). The mother wavelet has a large, almost symmetric, center lobe that resembles the profile of the average calcification to some degree. Our choice followed from this similarity.
Equation 1 represents the expansion in the image domain. Each expansion image may be found by setting all the wavelet coefficients equal to zero except those corresponding to the level of detail associated with a particular $d_j$ image and inverting the transform, which is also implemented by repeated convolutions. The $d_j$ images, referred to as detail images, represent the raw image, $f_0$, at various resolutions (these images represent an octave sectioning of the frequency information as viewed in the Fourier domain). The small j values indicate finer detail (higher frequency information), and $f_j$ is a low-resolution (lower frequency information) version of the raw image, in which the level of coarseness depends on the magnitude of j. For a 60-µm pixel, this expansion is performed with j equal to 5. The detail images are not intensity biased (zero average). Adding the $d_j$ images together can be considered as the detail representation, $d_D$, and results in an alternative form of equation 1 given by
$$f_0 = d_D + f_j \qquad (2)$$
The enhancement application defined in equation 2 is accomplished in one operation by stopping the forward transform at a particular level of j and damping the low-resolution wavelet component, followed by inversion, as opposed to the operation defined in equation 1.
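The expansion described by equations 1 and 2 can be illustrated with a short sketch. It is not the implementation used in this work: it assumes the Haar basis in place of the 12-coefficient basis and a one-dimensional profile in place of a 2D image, because for Haar the low-resolution version at level j is simply the mean over blocks of 2^j samples. The function names `lowres` and `detail_images` and the test profile are hypothetical.

```python
def lowres(signal, level):
    """Haar low-resolution version f_level: block means over 2**level samples."""
    block = 2 ** level
    if len(signal) % block:
        raise ValueError("signal length must be a multiple of 2**level")
    out = []
    for start in range(0, len(signal), block):
        mean = sum(signal[start:start + block]) / block
        out.extend([mean] * block)
    return out


def detail_images(signal, levels):
    """Return ([d_1, ..., d_levels], f_levels), where d_i = f_(i-1) - f_i."""
    details = []
    previous = list(signal)            # f_0 is the raw signal
    for i in range(1, levels + 1):
        current = lowres(signal, i)    # f_i
        details.append([p - c for p, c in zip(previous, current)])
        previous = current
    return details, previous


# Illustrative profile (assumption): a slow ramp plus a few sharp spikes.
f0 = [100.0 + 0.5 * n + (8.0 if n % 11 == 0 else 0.0) for n in range(64)]
details, f_j = detail_images(f0, levels=3)

# Equation 1: the detail images and the low-resolution version sum to f_0.
rebuilt = [sum(d[n] for d in details) + f_j[n] for n in range(len(f0))]

# Equation 2: d_D is the sum of the detail images, so f_0 = d_D + f_j.
d_D = [sum(d[n] for d in details) for n in range(len(f0))]
```

In this sketch each detail image has zero average, and all of the intensity bias is carried by the low-resolution term, which mirrors the description of the detail images above.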
A useful analogy facilitates the understanding and importance of equation 2. The image may be considered as a surface or a landscape. For mammograms in particular, the landscape can be interpreted as a series of valleys, rolling hills, and occasional cliffs. As with any landscape, from a distance, one observes the basic land topography (changing elevations and contours) that obscures the smaller details on the surface, such as rocks of various sizes (scales), grass, and other small-scale elements of the terrain. Equation 2 represents a division of the fine and coarse detail from the surface, on which the smaller details are represented by the first component as projected onto a flat surface, and the second component contains the general landscape (changing elevations) topography without significant detail. Our analysis shows that the power distribution in the image increases as j increases or as the level of detail decreases [22–25]. In fact, our work shows to a good approximation that the images are well represented in the frequency domain as a 1/f process [23–25]. In short, the analysis shows that most of the image power resides in the lower frequency information. In terms of equation 2, this observation implies that considerably more power resides in the lower resolution term than in the total detail representation. The large power at low resolution gives rise to long-range image correlation and the irregular (often called "lumpy") image appearance [24, 25]; it is also responsible for the less than optimal appearance of the raw image on the monitor display. As an interesting aside, theoretic arguments developed in the appendix of an earlier work [25] show the reasons for the wide dynamic variation (instability) of images that are statistically similar to mammograms.
In previous research [19],
we formed the premise that for soft-copy display, the mammographer should be
provided with the option to view the image at any desired level of detail in
accord with equation 1. Bringing this idea to fruition while considering our
guiding doctrine of preserving both components in equation 2, we found that
maintaining detail and reducing contrast may be achieved by a slight
adjustment of equation 2 as follows:
$$f_{enh} = d_D + k f_j \qquad (3)$$
Initial experimentation indicated that when k equals 0.5, the lower resolution power of the original image was reduced to 25% and gave an appearance that paralleled our design goals. In particular, the original details and features were maintained in the enhanced image ($f_{enh}$), but the background "terrain" was reduced in presence, resulting in more abrupt changes between dense and nondense (fatty) areas compared with the original image. In essence, equation 3 resulted in an image with lower dynamic range but with a sharper appearance than the original. The increased contrast and sharpening effect of the wavelet enhancement could be readily appreciated on both low- (Fig. 1A, 1B) and high- (Fig. 2A, 2B) density mammograms showing calcifications or masses. In addition to the abnormal findings, linear structures and edges were also significantly sharper on the wavelet-processed images than on the originals.
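The relationship between k and the low-resolution power can be checked with a minimal sketch. As an assumption made only for illustration, it uses a Haar low-resolution version (block means) on a one-dimensional profile rather than the 12-coefficient 2D basis of this work; `lowres`, `enhance`, and `power` are hypothetical names. Because the low-resolution component is scaled by k, its power is scaled by k squared, which for k = 0.5 gives the 25% quoted above.

```python
def lowres(signal, level):
    """Haar low-resolution version: block means over 2**level samples."""
    block = 2 ** level
    if len(signal) % block:
        raise ValueError("signal length must be a multiple of 2**level")
    out = []
    for start in range(0, len(signal), block):
        mean = sum(signal[start:start + block]) / block
        out.extend([mean] * block)
    return out


def enhance(signal, level, k=0.5):
    """Keep the total detail d_D unchanged and scale the low-resolution f_j by k."""
    f_j = lowres(signal, level)
    d_D = [s - lo for s, lo in zip(signal, f_j)]
    return [d + k * lo for d, lo in zip(d_D, f_j)]


def power(values):
    """Mean squared value, used here as a simple measure of signal power."""
    return sum(v * v for v in values) / len(values)


# Illustrative profile (assumption): a slow ramp plus a few sharp spikes.
f0 = [100.0 + 0.5 * n + (8.0 if n % 11 == 0 else 0.0) for n in range(64)]
f_enh = enhance(f0, level=3, k=0.5)

# Low-resolution power after enhancement relative to the original (k**2).
ratio = power(lowres(f_enh, 3)) / power(lowres(f0, 3))
```

The detail term is identical before and after enhancement in this sketch; only the background level changes, which is the behavior the landscape analogy describes.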
Evaluation Experiment
Statistical approach. Visual signal detection experiments
are based on two main psychophysiology techniques. The most popular in medical
imaging is the ROC method, in which a specified signal or disease may or may
not be present in the image and the observer is asked to rank his or her
confidence as to signal or disease presence or absence on a rating scale (5
point, 10 point, continuous, or other)
[27,
28]. ROC experiments deal with
signal likelihood but not with signal location. In contrast, the localization
response operating characteristic (LROC) approach involves both signal
likelihood and signal location tasks and, hence, offers a more complete
analysis of observer performance
[29–31].
An LROC experiment was performed in this work to evaluate the impact of the
proposed wavelet enhancement technique on the ability of the observers to
detect breast cancer or suspicious lesions on soft-copy digitized
mammography.
Image database. Five hundred single-view mammograms were used for the evaluation of the proposed enhancement technique. Of the 500 cases, 250 had no findings, 131 had benign findings, and 119 had malignant findings. A total of 375 findings were present in the benign and cancer cases, 182 of which were masses (98 benign and 84 cancer) and 193 were calcification clusters (100 benign and 93 cancer). Negative cases were selected from negative mammograms with at least 2 years of negative follow-up. Negative images matched the images with positive findings (benign and malignant) in terms of breast size, parenchymal density, and density pattern. Films were digitized at 30 µm and 16 bits per pixel with an Image Clear R3000 scanner (DBA, Melbourne, FL). The images were then resized by a factor of 2 to generate images of 60 µm per pixel using mathematic interpolation. Our study was approved by the institutional review board as a research study using existing medical records and exempted from individual patient consent requirements. The patient identifiers were obliterated from all images.
Workstation and user interface. Digitized mammograms from the negative, benign, and cancer cases in this study were presented to each observer, who sat at a computer workstation developed for this study that included two CRT monitors. Case selection and reporting were performed on a 1,280 × 1,024 pixel Sun Microsystems monitor (Santa Clara, CA), whereas interpretation was performed on a high-resolution DR110 monitor (Data-Ray, Westminster, CO) with an Md5/SBX board (Dome, Waltham, MA) in an UltraSparc 2 workstation (Sun Microsystems). The DR110 monitor provided a 2,048 × 2,560 pixel display with an 8-bit digital-to-analog converter.
A user interface was developed in-house for the LROC experiment. The interface software was implemented in the C programming language. The code made use of the Sun XIL image-processing library and Sunvision (vff) file format (Sun Microsystems). The user interface allowed the selection of an image from a dialog box on the low-resolution monitor and the display of the full image on the high-resolution monitor. Digitized images were reduced by a factor of approximately 2 in each dimension to fit the DR110 screen. The resolution reduction was performed in real time while the image was loaded for display. Display speed was not fully optimized for this experiment because it was beyond the scope of this work. It is estimated that a 1,800 × 3,300 pixel image was reduced to a 900 × 1,650 pixel image in approximately 10 sec. Other features of the interface included the option of manual adjustment of the image gray scales using window level, width, and gamma correction functions; selection of the location of the finding on the screen using the computer mouse; and an interpretation module that allowed rating of each finding with options to review and modify previously selected locations and ratings. The x and y coordinates and corresponding ratings of the detected findings were automatically recorded in a text file for subsequent data analysis.
Interpretation protocol. Three board-certified mammographers with different degrees of experience in interpreting soft-copy mammography participated in the study. The observers reviewed all 500 images in both formats (original and enhanced) over the course of 10 sessions. One hundred images were presented per session (randomly mixed negative, benign, and cancer cases in original or enhanced format). Each session lasted approximately 30 min and was conducted in a light-controlled environment in which the ambient light level was maintained below 50 lux. Session volume and duration were selected to eliminate or significantly reduce potential memory bias and fatigue associated with soft-copy interpretation. Images and formats were presented in a random order that was different for each observer.
The observers were told that half of the cases were negative and half were benign or malignant; the latter cases had at least one biopsied lesion (calcification cluster or mass) with a maximum of five. For each suspicious finding, each observer specified its location by a computer-mouse click at the center of the finding and rated the likelihood of it being breast cancer. A 5-point scale was used for ratings in which 1 corresponded to low likelihood for cancer (or high likelihood for the finding to be normal tissue or a benign lesion) and 5 corresponded to high likelihood for the finding to be cancer (or low likelihood for benign lesion). This rating was applied to both the cancer and the benign cases. For the images with no abnormalities (i.e., the negative cases) or for those in which no abnormalities could be detected, the observer was asked to identify a single most suspicious area in the image and assign a low rating (forced localization choice) [29].
Data Analysis
The x and y coordinates and corresponding rating of the
detected findings were recorded electronically for each observer. In the first
stage of the analysis, the selected x and y coordinates were
compared with a gold standard file to determine the number of correct and
incorrect localizations. The gold standard file was established by an
independent expert mammographer who reviewed the patients' files, including
radiology and pathology reports, and determined the locations of the findings
and their pathology. Coordinates of each true finding were recorded both in
pixels and in millimeter distance from a reference point. A finding for which
the Euclidean distance between the position specified by the observer and the
known position of the lesion in the gold standard file was less than 200 times
the distance between adjacent pixels was scored as correctly localized.
Otherwise, the finding was scored as incorrectly localized.
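The localization criterion above can be written as a short sketch. It assumes that both positions are expressed in pixel units on the same grid, so that "200 times the distance between adjacent pixels" becomes a radius of 200 pixels; the function name and the coordinates are hypothetical and are not taken from the study data.

```python
import math


def is_correctly_localized(click_xy, lesion_xy, tolerance_pixels=200):
    """True if the click lies strictly within the tolerance radius of the lesion.

    Both positions are (x, y) pairs in pixel units on the same image grid.
    """
    dx = click_xy[0] - lesion_xy[0]
    dy = click_xy[1] - lesion_xy[1]
    return math.hypot(dx, dy) < tolerance_pixels


# Hypothetical gold standard position and two observer clicks.
lesion = (1200, 2100)
near_click = (1290, 2220)    # offset (90, 120): 150 pixels from the lesion
edge_click = (1400, 2100)    # offset (200, 0): exactly 200 pixels away

near_ok = is_correctly_localized(near_click, lesion)
edge_ok = is_correctly_localized(edge_click, lesion)
```

With the strict "less than" comparison stated in the text, a click exactly 200 pixels from the lesion is scored as incorrectly localized.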
The LROC program, 1998 version, was used in the analysis of the data. This program was developed by Swensson [29] and Swensson et al. [31] and may be accessed at the University of Arizona's Department of Radiology Web site [32]. ROC- and LROC-fitted curves were generated including estimates of the areas under the ROC curve and their standard errors (SEs). The highest-rated report of a finding on each image was used as the summary rating that represented the entire image in the analysis process [29, 31]. Two performance indexes were primarily considered and compared in this study: the detection accuracy, which corresponds to the area under the ROC curve (AROC), and the localization accuracy, which corresponds to the ordinate of the LROC curves [29].
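The summary-rating step can be sketched as follows. The area estimate in this sketch is a swapped-in nonparametric (Mann-Whitney) estimate of the area under the empirical ROC curve, not the fitted-curve estimate produced by the LROC program, and the ratings are hypothetical values chosen for illustration.

```python
def summary_rating(ratings):
    """Highest rating reported on one image (the per-image summary rating)."""
    return max(ratings)


def empirical_auc(nontarget, target):
    """Mann-Whitney estimate of the area under the empirical ROC curve.

    Fraction of (nontarget, target) image pairs in which the target image
    has the higher rating, with ties counted as one half.
    """
    wins = 0.0
    for n in nontarget:
        for t in target:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(nontarget) * len(target))


# Hypothetical 5-point ratings: one inner list of reported findings per image.
negative_reports = [[1], [2], [1], [3]]
cancer_reports = [[2, 4], [5], [3, 1], [4]]

negatives = [summary_rating(r) for r in negative_reports]
cancers = [summary_rating(r) for r in cancer_reports]
auc = empirical_auc(negatives, cancers)
```

Taking the maximum rating per image reduces each case to a single number, which is what allows a conventional ROC area to be computed alongside the localization index.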
Analysis was performed for four data combinations as follows: First, negative and cancer case interpretations were analyzed. In this combination, the responses from the 250 negative images with no lesions were compared with the responses from the 119 cases with cancerous lesions. This comparison allowed us to evaluate the ability of the observers to detect and diagnose cancer in the digitized images before and after wavelet enhancement. Second, the benign and cancer case interpretations were analyzed. In this combination, the 131 benign cases were considered as nontarget or nondisease images and were compared with the 119 cancer cases. This comparison allowed us to evaluate the ability of the observers to differentiate benign and malignant cases on soft-copy display before and after wavelet enhancement. Third, the combined negative and benign cases were analyzed relative to the cancer cases. In this combination, the responses from the 381 combined negative and benign cases (nontarget) were compared with those from the 119 cancer cases (target). This comparison may be closer to a clinical setup in which the observers are asked to differentiate negative, benign, and cancer cases to focus on the latter. Finally, the negative and benign case interpretations were analyzed. In this combination, the responses regarding 250 negative images with no lesions were compared with those regarding the 131 cases with benign lesions. This comparison allowed us to evaluate the ability of the observers to detect, correctly localize, and diagnose lesions associated with benign disease before and after wavelet enhancement. For this analysis, the benign case ratings were entered in the LROC analysis program as the target data.
The statistical significance of the differences between the performance indexes before and after enhancement was tested with the paired t test at the 90% and 95% CI levels. The t statistic, probabilities, and CIs were estimated. Results showed that almost all differences were statistically significant at the 0.10 level (Table 1); the detection accuracy estimated from the analysis of the benign versus cancer cases was the only exception. At the 95% CI, only two differences were statistically significant: the localization accuracy for the negative versus the cancer cases (Table 1) and the detection accuracy for the negative versus the benign cases. All statistically significant differences favored the interpretation of the wavelet-enhanced digitized mammograms. All observers seemed to perform better in the detection of cancer when only negative cases were considered (Table 1) and worse in the detection of cancer when the benign cases were part of the analysis, independent of image format (original or enhanced).
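The paired comparison across observers can be sketched as follows. The per-observer values are hypothetical and are not the indexes reported in Table 1, and the use of two-sided critical values is an assumption, because the text does not state whether the test was one- or two-sided. With three observers the test has 2 degrees of freedom.

```python
import math


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    return mean / math.sqrt(variance / n), n - 1


# Two-sided critical values of Student's t for 2 degrees of freedom.
T_CRIT_DF2 = {0.10: 2.920, 0.05: 4.303}

# Hypothetical per-observer indexes, NOT the values reported in this study.
original = [0.70, 0.75, 0.80]
enhanced = [0.76, 0.80, 0.87]

t, df = paired_t(original, enhanced)
significant = {alpha: abs(t) > crit for alpha, crit in T_CRIT_DF2.items()}
```

The jump in the critical value from 2.920 to 4.303 between the 0.10 and 0.05 levels shows how demanding the stricter level is when only three paired observations are available.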
The consideration of clinically relevant operating points on the ROC curves provided some insight on the potential practical impact of the proposed enhancement technique. For example, at a fixed true-positive rate of 90%, the false-positive rate decreased for all observers. Specifically, the analysis of the negative versus the cancer cases showed that the average of the three observers' false-positive rates was 0.46 for the original images compared with 0.31 for the wavelet-enhanced cases. Similarly, at a fixed false-positive rate of 20%, the true-positive rate of all observers increased. The average of the three observers' true-positive rates was 0.65 for the original images compared with 0.80 for the wavelet-enhanced cases.
All observers showed an increase in the number of correctly localized benign and cancer cases and a corresponding decrease in the number of incorrectly localized cases with wavelet enhancement. The three-reviewer averages for correctly localized values per finding and pathology before and after wavelet enhancement are presented in Figure 3. Observers did not differ significantly in their performance. The increase in correctly localized findings ranged from 8% to 23%, depending on the finding and the pathology. The largest increase was observed for the benign clusters and the smallest for the malignant clusters. The decrease in incorrectly localized findings ranged from 15% to 37%. The largest decrease was observed for the benign clusters and the smallest decrease for the malignant masses. Overall, observers localized findings more accurately in the enhanced benign cases than in the enhanced malignant cases.
The enhancement approach outlined in this article is analogous to unsharp masking but lends itself to a more general framework. The wavelet expansion images are not linearly correlated (that is, their linear correlation coefficient is zero) and may be generated rapidly. These properties have advantages when compared with the generic low-pass filtering applied in the unsharp masking approach, in which the higher frequency enhanced image is generally given by
[Equation not recoverable from the source] (4)
Comparing our wavelet-enhancement approach with peripheral equalization, we noted that peripheral equalization techniques, as applied to mammography, reduce dynamic range to compensate for variations in the thickness of the compressed breast in the area near the breast edge and off-breast background. This process can interfere with the overall digital display because the border region projection tends to be overexposed. Our wavelet technique attempts to compensate for breast thickness nonuniformities in the border by adjusting the low-pass or constant bias in the region. In our approach, we have lessened the constant bias across the entire image.
Negative, benign, and cancer cases were used in the evaluation of the proposed enhancement technique. Four case combinations were studied in an effort to evaluate performance under conditions that may represent different perspectives in the observers' performance. ROC and LROC performance indexes suggested that the proposed enhancement technique can significantly improve detection, lesion localization, and diagnostic performance on soft-copy mammography, and it is definitely a step in the right direction. Localization accuracy increased with enhancement for both the benign and the cancer cases. Localization accuracy was greater for the benign than for the cancer cases and greater for the calcification clusters than for the masses. In contrast, when localization was combined with the observer's rating of cancer likelihood, performance improved more on the enhanced cancer cases than on the enhanced benign cases. These outcomes are not necessarily contradictory. Localization is a detection task that is related to image properties and lesion characteristics. A review of the size and contrast characteristics of the benign and cancer lesions showed that benign lesions tend to be larger and have more contrast than malignant ones. Hence, benign lesions may be better enhanced by our technique than cancerous ones because of their larger size and contrast. However, assessment of cancer likelihood is not likely to change significantly with an enhancement technique and may continue to reflect the low negative predictive value of mammography.
Almost all performance differences between original and enhanced interpretations were found to be statistically significant at the 0.1 level, and all favored enhancement. Significance was not as extensive at the 0.05 level. The latter result may be the outcome of the small number of observers (n = 3) used in this study. Six radiologists actually participated in this work, but three could not complete all cases. Their data were not part of the analysis presented here, but a review of their responses and relative comparisons suggests the same pattern of improvement with the enhanced mammograms relative to the original images. One may view this pattern as partial evidence that the proposed enhancement technique could have yielded significant differences even at the 0.05 level.
In summary, our study showed that the proposed wavelet approach could significantly improve the detection of abnormalities in soft-copy mammography with the potential of offering a robust and generally applicable technique independent of film-digitization conditions or digitizer. The technique could achieve even better results by modifying the algorithm to address areas of weakness (e.g., the cases of low breast density) or by combining it with other postprocessing techniques. For example, after this study, we perceptually linearized the display of each image using the Digital Imaging and Communications in Medicine standard [33]. Namely, image pixel values were first transformed to luminance values of the monitor on the basis of the monitor's characteristic curve. Display luminance was then transformed to the brightness response of the human eye following Barten's model. An initial evaluation of the new data indicated that as much as 20% improvement in performance may be possible with the addition of the relatively simple perceptual linearization process [33].
Our work and results impact the implementation of soft-copy digitized mammography as opposed to soft-copy direct digital mammography. Other studies have shown the equivalency of soft-copy digital mammography to film mammography, but the same is not true for soft-copy digitized mammography. The latter is expected to be a major component of a filmless mammography and radiology environment, and its acceptance and success as an equivalent alternative to standard film interpretation depend greatly on the outcome of studies like ours and the development of workstations and digital displays that yield performance similar to, if not better than, that of film mammography.
Acknowledgments
We thank William Gross for his assistance in programming and image
processing; Jihai J. Kim, Lisa Hooper, and Catherine Nolte for their valuable
comments and assistance in the LROC experiments; and Eleni Tsalla, Mugdha
Tembey, and Anand Manohar for their assistance in the analysis of the
data.