Original Research
1 Department of Radiology, University of Munich, Campus Downtown, Nussbaum-Str.
20, D 80336 Munich, Germany.
2 Department of Obstetrics and Gynaecology, University of Munich, Munich,
Germany.
Received December 26, 2007;
accepted after revision June 30, 2008.
Address correspondence to H. F. Boehm
(holger.boehm{at}med.uni-muenchen.de).
Abstract
MATERIALS AND METHODS. One hundred digital mammograms were classified by consensus of two experienced readers. A topologic parameter extracted from the Minkowski functional spectra was obtained for retromammilar image sections (512 x 512 pixels). From the gray-level histogram of each of these samples, the 20th percentile, median, and mean were determined. Discriminant analysis was used to assess the predictive value of the methods with respect to correct categorization.
RESULTS. The mean gray-level intensity of normal breast tissue was 90 ± 9, and the 20th percentile was 68 ± 18. The mean gray-level intensity was 84 ± 7 for involution and 90 ± 8 for fibrosis; the 20th percentile was 75 ± 6 for involution and 73 ± 10 for fibrosis. The results of discriminant analysis showed that use of the gray-level histogram parameters led to correct classification in 66% of cases. Use of topologic analysis with Minkowski functionals increased the rate of correct classification to 83%. When a combined model of histogram-derived parameters and Minkowski functionals was used, 89% of cases were categorized correctly.
CONCLUSION. Topologic analysis of x-ray attenuation patterns on digital mammograms obtained with Minkowski functionals is simple and robust, and the results agree with radiologists' ratings. Because correct classification is significantly higher than with use of density features, our technique may be an objective and quantitative alternative in the evaluation of the parenchymal structure of the breast.
Keywords: breast tissue classification, digital mammography, image processing, Minkowski functionals
Introduction
The diagnostic potential of mammography in the detection of lesions of the breast and for recognition of multiple focal lesions strongly depends on the characteristics of the tissue surrounding a lesion [5, 6]. In a breast predominantly made up of atrophic tissue, identification and interpretation of focal lesions are relatively trivial tasks with all diagnostic techniques, including palpation and mammography. At the other extreme, the diagnostic success of palpation and mammography of breasts dominated by fibroglandular tissue is limited or even compromised, and additional diagnostic procedures (e.g., sonography and MRI) are needed to increase sensitivity. Furthermore, screening intervals may have to be adjusted, especially for patients with a genetic predisposition to breast cancer [7].
Several studies have shown the relation between risk of development of breast cancer and general parenchymal properties. Women with dense breast tissue have been reported to be at higher risk of cancer than women with fatty breast tissue [8–11]. At least two intuitive explanations for this phenomenon exist. The first is that on mammograms, dense breast tissue is likely to obscure lesions [12] that later become malignant. However, the relation between breast density and cancer risk continues to exist even after the masking effect is taken into account [9]. The other popular explanation is based on the fact that breast cancer most frequently arises from the epithelial lining of the ductal and lobular glands and regions [12]. Thus, the risk of cancer is proportional to the concentration of fibroglandular tissue, which correlates with the breast density assessed at mammography.
Classification of parenchymal attenuation patterns on mammograms can assist in clinical studies of the role of mammographic appearance as a factor in breast cancer. Mammographic pattern analysis can be used for monitoring density alterations due to hormone replacement therapy, which can lead to changes in breast tissue, and for evaluation of the need for detailed interpretation of certain mammograms or adjustment of screening intervals. However, when classification criteria are based on subjective estimation of tissue composition, interobserver and intraobserver variability leads to an ambiguous outcome. Several studies [13–16] of the assessment of breast cancer risk in postmenopausal women showed no clear correlation between the disease and tissue density. Even in studies showing evidence of such a relation, the magnitude of the association varied.
It is desirable to develop objective methods for classification of breast density depicted on digital mammograms. Among the limitations of digital mammography is the lack of linearity of the characteristic curve of film and the potential lack of linearity of the image digitization process in combination with the lack of uniformity of thickness of breast tissue during compression [17]. These factors influence the accuracy of the relation between the mammographic finding of apparently dense breast tissue and the x-ray attenuation of breast tissue. Objective classification methods can be of practical use in a number of aspects of screening for breast cancer and for improvement of computer-aided detection systems. Use of these methods may help to optimize image display, breast parenchymal segmentation results, and risk assessment.
In clinical practice, mammograms are evaluated visually by application of classification criteria based on subjective estimation of the tissue composition. Radiologists assess breast density not only according to opacity on a mammogram but also on the basis of the distribution of parenchymal and breast tissue patterns, that is, textures. It can be expected that the inclusion of quantitative textural features in the classification of mammograms removes intraobserver and interobserver variability and leads to an objective judgment. Several techniques of computer-assisted pattern classification have evolved whereby the mammographic appearance of breast tissue is categorized into two to four groups [8, 17–22]. The spectrum of methods ranges from segmentation of dense breast tissue to classification based on fractal dimensions.
In previous studies, we [23, 24] found great potential of Minkowski functionals in quantitative assessment of the textural properties of various tissues. In particular, topologic features based on Minkowski functionals have been used successfully for assessment of risk of fractures of trabecular bone and for identification and classification of lung lesions. Minkowski functionals are a set of topologic descriptors that can be used universally irrespective of the origin, dimensionality, and scale of the data sets. In this study, in a natural extension of the method, we applied analytic techniques based on Minkowski functionals to mammograms. We tested the hypothesis that the spatial distribution of x-ray attenuation values on digital mammograms can be analyzed quantitatively with topologic techniques based on Minkowski functionals. The goal of including quantitative textural features in the classification of mammograms is to remove intraobserver and interobserver variability in order to obtain an objective assessment. The results are compared with those obtained with techniques based on evaluation of gray-level histograms.
Materials and Methods
Acquisition of Mammograms
For each woman, mammograms of each breast were acquired in the craniocaudal
and mediolateral oblique projections with a molybdenum anode and molybdenum or
rhodium filtration in combination with a digital system (CR 75, Agfa-Gevaert).
The mammographic unit (Mammomat 3000 Nova unit, Siemens Medical Solutions) was
equipped with 18 x 24 cm and 24 x 30 cm phosphor storage plates.
Image acquisition was performed in automatic exposure mode with a tube voltage
of 26–30 kV.
Preprocessing of Image Data
The mammographic data displaying the craniocaudal projections of the right
breast were exported from the PACS storage device in JPEG format (255 gray
levels, 28 pixels/cm) and transferred to a PC-based system for further
processing and quantitative analysis. With a dedicated 2D navigation tool
coded in IDL (version 6.3, ITT Visual Information Solutions), the principal
investigator positioned a square region of interest (ROI) with an edge length
of 512 pixels on each of the mammograms in an area containing representative
parenchymal tissue. We chose a region of retromammilar parenchymal tissue in
which the density pattern was more or less homogeneous within the predefined
ROI. Each of these virtual probes was stored separately for further
analysis.
Classification
For collection of ground-truth data, the probes were classified visually by
two experienced mammogram readers in consensus. The classification scheme
consisted of three categories: normal, involution atrophy, and fibrosis. A
probe was referred to as normal if a diffuse even soft-tissue density of
glandular tissue was present and fatty structures were organized in a
relatively regular way. The estimated areal contribution of dense tissue
ranged from 25% to 50%. If the ratio of opaque to translucent tissue was
visually appreciated to be less than 25%, the probe was categorized as
atrophic; greater than 50% was considered fibrotic.
Quantitative Image Analysis
Density-based measurements—For each of the data sets, we
determined the mean, median, and 20th percentile values of gray-level
intensity as described by Biernacki et al.
[25].
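The density-based measurements described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' IDL code; the function name `histogram_features` and the synthetic ROI are assumptions made for the example.

```python
import numpy as np

def histogram_features(roi):
    """Return the mean, median, and 20th percentile of the gray levels in an ROI."""
    values = np.asarray(roi, dtype=float).ravel()
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "p20": float(np.percentile(values, 20)),
    }

# Synthetic 512 x 512 ROI with 8-bit gray levels (illustrative data only).
rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(512, 512))
features = histogram_features(roi)
```

All three quantities are computed from the pooled pixel values, so they are unaffected by how the gray levels are arranged spatially within the ROI.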
Topologic gray-level analysis—As explained in great detail
in an earlier publication
[26], mathematic topology
entails use of Minkowski functionals to quantify the shape and connectivity of
binary convex excursion sets. For geometric 2D interpretation of the Minkowski
functionals obtained for the set of pixels of a binary image pattern, the
first functional is proportional to the area (i.e., the total number of white
pixels), the second functional represents the perimeter, and the third
functional corresponds to the so-called Euler-Poincaré number (χ),
which is calculated as number of connected components – number of
cavities.
Michielsen et al. [26]
described a convenient way to determine the Minkowski functionals from a
number of pattern features obtained at pixel scale, that is, the number of
object pixels, open edges, and vertices
(Fig. 1). These values can be
inserted into the following elementary formulas for calculating Minkowski
functionals:
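$$A = n_s, \qquad U = -4 n_s + 2 n_e, \qquad \chi = n_s - n_e + n_v$$

where $n_s$ is the number of object pixels, $n_e$ the number of edges, and $n_v$ the number of vertices; $A$ is the area, $U$ the perimeter, and $\chi$ the Euler-Poincaré number.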
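The pixel-counting scheme can be sketched in Python with NumPy. The sketch assumes the standard formulas of Michielsen et al. (area = n_s, perimeter = -4 n_s + 2 n_e, Euler number = n_s - n_e + n_v), in which every object pixel is a closed unit square and edges and vertices shared by neighboring pixels are counted once. The function name `minkowski_2d` is hypothetical and this is not the authors' implementation.

```python
import numpy as np

def minkowski_2d(binary):
    """Area, perimeter, and Euler number of a binary pattern by pixel counting.

    Each object pixel is treated as a closed unit square; an edge or vertex
    shared by several object pixels is counted only once.
    """
    b = np.asarray(binary, dtype=bool)
    p = np.pad(b, 1, constant_values=False)  # background border around the pattern

    n_s = int(b.sum())  # number of object pixels (squares)

    # An edge exists if at least one of the two pixels it separates is set.
    horizontal = p[:-1, 1:-1] | p[1:, 1:-1]
    vertical = p[1:-1, :-1] | p[1:-1, 1:]
    n_e = int(horizontal.sum() + vertical.sum())

    # A vertex exists if at least one of the four pixels touching it is set.
    vertices = p[:-1, :-1] | p[:-1, 1:] | p[1:, :-1] | p[1:, 1:]
    n_v = int(vertices.sum())

    area = n_s
    perimeter = -4 * n_s + 2 * n_e
    euler = n_s - n_e + n_v
    return area, perimeter, euler

# A single pixel: area 1, perimeter 4, one component without cavities.
single = minkowski_2d([[1]])  # → (1, 4, 1)

# A 3 x 3 ring: one connected component enclosing one cavity.
ring = minkowski_2d([[1, 1, 1], [1, 0, 1], [1, 1, 1]])  # → (8, 16, 0)
```

Because pixels that touch only at a corner share a vertex, this counting treats diagonal neighbors as connected.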
Data Analysis
Receiver operating characteristic analysis—The receiver
operating characteristic (ROC) analysis was based on a moving-threshold
procedure [27]. Given a set of
Minkowski functional fractions, a threshold value allows division of the data
set into two groups. Two quantities are then evaluated, namely, specificity
and sensitivity, as follows:
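$$\text{specificity} = \frac{TN}{TN + FP}, \qquad \text{sensitivity} = \frac{TP}{TP + FN}$$

where $TP$, $TN$, $FP$, and $FN$ are the numbers of true-positive, true-negative, false-positive, and false-negative classifications at the given threshold.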
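A moving-threshold ROC computation can be sketched as follows. This is an illustrative Python version that assumes higher feature values indicate the positive class and that both classes are present; the function names are hypothetical and the data are made up.

```python
import numpy as np

def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when cases with score >= threshold are called positive."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    predicted = scores >= threshold
    tp = np.sum(predicted & labels)
    fn = np.sum(~predicted & labels)
    tn = np.sum(~predicted & ~labels)
    fp = np.sum(predicted & ~labels)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """Area under the ROC curve obtained by sweeping the threshold over all observed scores."""
    scores = np.asarray(scores, dtype=float)
    # Start above the largest score (nothing called positive), then move downward.
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(scores))[::-1]))
    points = [sensitivity_specificity(scores, labels, t) for t in thresholds]
    tpr = np.array([sens for sens, _ in points])
    fpr = np.array([1.0 - spec for _, spec in points])
    # Trapezoidal integration of sensitivity over (1 - specificity).
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))

# Illustrative scores and ground-truth labels (not study data).
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
auc = roc_auc(example_scores, example_labels)  # → 0.75
```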
Discriminant analysis—We used discriminant analysis to estimate the predictive potential of the histogram-derived factors and Minkowski functionals in 2D. Discriminant analysis is a well-established method of modeling the value of a dependent categoric variable on the basis of its relation to one or more predictors. It is used to find linear combinations, so-called discriminant functions, of a set of independent variables that best separate the groups of cases. Discriminant analysis entails use of Wilks lambda tables for estimating how well the discriminant model works as a whole. Wilks lambda is a measure of how well each function separates cases into groups. It is equal to the proportion of the total variance in the discriminant scores not explained by differences among the groups. Smaller values of lambda indicate greater discriminatory ability of the function.
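As an illustration of the quantity described above, the overall Wilks lambda can be computed as the ratio of the determinant of the within-group scatter matrix to that of the total scatter matrix. The sketch below is a plain NumPy version with a hypothetical function name. It does not reproduce the SPSS procedure used in the study, which also reports lambda for individual discriminant functions.

```python
import numpy as np

def wilks_lambda(features, groups):
    """Overall Wilks lambda: det(within-group scatter) / det(total scatter)."""
    X = np.asarray(features, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single predictor as one column
    g = np.asarray(groups)

    centered_total = X - X.mean(axis=0)
    total = centered_total.T @ centered_total

    within = np.zeros_like(total)
    for label in np.unique(g):
        members = X[g == label]
        centered = members - members.mean(axis=0)
        within += centered.T @ centered

    return float(np.linalg.det(within) / np.linalg.det(total))

# Two well-separated groups give a lambda close to 0 (illustrative values).
separated = wilks_lambda([0.0, 2.0, 10.0, 12.0], [0, 0, 1, 1])
# Groups with identical means give a lambda of 1.
overlapping = wilks_lambda([0.0, 2.0, 0.0, 2.0], [0, 0, 1, 1])
```

For a single predictor the ratio reduces to the within-group sum of squares divided by the total sum of squares, which matches the interpretation of lambda as unexplained variance.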
Cross validation—We performed statistical validation of the discrimination results. Cross validation is used to estimate the generalization ability of a statistical classifier [28]. The SDs for two or more parameters (in this study, the mean AUC of both the histogram-derived textural measures and the Minkowski functional parameters) are calculated. The simplest and most extreme form of cross validation is leave-one-out validation, which is used to estimate the performance of statistical models when the number of data sets is limited. Each element serves once as the single item of testing data while all remaining elements serve as training data. The result of leave-one-out analysis is an estimate of statistical stability. If the SDs of the mean AUCs of both methods are similar, it can be concluded that the methods compared are equally stable or robust.
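The leave-one-out principle can be sketched as follows. For brevity, the example uses a nearest-centroid classifier as a stand-in for the discriminant model and reports accuracy rather than AUC. Both choices, and the function name, are assumptions made for illustration and are not the authors' procedure.

```python
import numpy as np

def leave_one_out_accuracy(features, labels):
    """Fraction of cases classified correctly when each case is held out once."""
    X = np.asarray(features, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(labels)

    correct = 0
    for i in range(len(y)):
        train = np.ones(len(y), dtype=bool)
        train[i] = False  # case i is the single test item in this round

        # Fit the stand-in classifier: one centroid per class from training data only.
        classes = np.unique(y[train])
        centroids = np.array([X[train & (y == c)].mean(axis=0) for c in classes])

        # Assign the held-out case to the class with the nearest centroid.
        distances = np.linalg.norm(centroids - X[i], axis=1)
        predicted = classes[np.argmin(distances)]
        correct += int(predicted == y[i])

    return correct / len(y)

# Illustrative one-dimensional feature with two clearly separated classes.
accuracy = leave_one_out_accuracy([0, 1, 2, 10, 11, 12], [0, 0, 0, 1, 1, 1])  # → 1.0
```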
Statistical tests—Means and SDs of the topologic and histogram-derived parameters were calculated for normal, atrophic, and fibrotic cases. Statistical significance of the differences was assessed with a Student's t test at a significance level of 5%. Correlations between the topologic measure Minkowski functionals in 2D and the parameters obtained by evaluation of intensity histograms were assessed with Pearson's coefficient of correlation and a two-tailed Student's t test of significance. The statistical significance of the difference in AUC values of ROC curves of different parameters was assessed with the DeLong test [29]. All statistical computations were performed with SPSS version 10.0.7 and IDL version 6.3 software. Visualization and further processing of imaging data were accomplished with software developed at our institution in IDL and C++ (Dev-C++ version 4, Bloodshed Software) environments.
Precision error—For ethical reasons and because of the design of the study, we did not consider assessing the effect of positioning, mechanical compression, and other parameters associated with mammographic imaging. Because the digital evaluation steps were applied to a set of virtual tissue probes, no human interaction was necessary. Because the study population was healthy women, the ground truth of classification was defined by the readers' consensus diagnosis rather than the results of histologic evaluation of the breast tissue.
Results
ROC analysis revealed that normal parenchymal probes were differentiated from atrophic and fibrous probes on the basis of features extracted from the gray-level histograms with AUCs of 0.67, 0.63, and 0.58 for median, mean, and 20th percentile gray-level intensities, respectively. The best results in identifying specific entities were found for atrophic breast tissue probes, which can best be differentiated from normal and fibrous ROIs with the parameter median gray-level intensity (AUC, 0.72).
At discriminant analysis we found that with the parameters obtained from
the gray-level histograms, only 37–66% of cases were classified
correctly (λ = 0.482–0.95) (Table 2). The highest
agreement with ground-truth data was found for the probes representing normal
breast tissue (73.7%), followed by cases with fibrosis (68.6%).
Topologic Texture Analysis
The Minkowski functional in 2D for normal cases was –2.5 × 10⁷ ±
2.1 × 10⁶ and for atrophic cases was –1.5 × 10⁷ ± 1.4 × 10⁶.
The Minkowski functional in 2D for fibrotic tissue probes was
5.4 × 10⁷ ± 7.6 × 10⁶. No statistically
significant relation existed between Minkowski functionals in 2D and the three
parameters derived from the densitometric analysis, R² ranging from 0.006 to
0.214 (p ≥ 0.44). ROC analysis showed that normal ROIs were
identified with an AUC of 0.93, atrophic ROIs with an AUC of 0.81, and
fibrotic ROIs with an AUC of 0.68 (Figs. 3A, 3B, and 3C). The statistical
significance of the differences in AUC values between Minkowski functionals in
2D and the densitometric parameters was p < 0.001 for the normal,
p = 0.088 for the atrophic, and p = 0.112 for the fibrotic
data sets.
Discriminant models of the spectral information of one of the three
Minkowski functionals or a combination of all of them exhibited a rate of
correct classification of 76–83% (λ = 0.227–0.154). The
rates of detection of atrophic (74.1–81.5%) and normal (84.2–92.1%)
parenchymal patterns were generally higher than those for the mammographic
images showing fibrosis (68.6–74.3%). In a combined statistical model
comprising all histogram-derived and topologic quantities, the overall rate of
correct classification rose to 89% (λ = 0.101), the highest result
being for the category normal tissue (94.7%) (Table 2).
Cross Validation
In tests of the stability of the Minkowski functionals–based textural
feature, leave-one-out cross validation resulted in mean AUC values of 0.67,
0.81, and 0.92 for fibrotic, normal, and atrophic cases, the average SD being
0.021. Performance of the parameters based on gray-level histograms was
significantly lower than that of the parameters based on Minkowski functionals
in 2D. Among these parameters, mean gray-level intensity was best suited to
correct differentiation of the three classes of breast tissue, having mean AUC
values of 0.61, 0.55, and 0.51 for fibrotic, normal, and atrophic cases and an
average SD of 0.022.
Discussion
Topologic Analysis with Minkowski Functionals
In this study, we used an automatic method of classification of parenchymal
patterns based on results of topologic analysis of breast tissue depicted at
digital mammography. The procedure extracts an objective textural quantity,
the Minkowski functional in 2D, from the imaging data and allows reproduction
of expert radiologists' ratings [23, 24, 26]. Compared with measures
based on histograms of x-ray attenuation values, results with our method were
in far better agreement with the ground-truth data, probably because density
measurements average over the image irrespective of the underlying
structural patterns. This finding suggests that the technique may be useful as
an objective and quantitative alternative for breast density assessment. It is
automatic, easily reproducible, and removes observer variability.
Our textural measurements were based on Minkowski functionals, a universal nonlinear tool for data analysis in general and for image analysis in particular that has its roots in the study of complex systems. Because of their universal character, Minkowski functionals have been applied successfully to problems in wide-ranging fields of study irrespective of the origin of the data and to various techniques in biomedical image processing.
In various fields of science concerned with the study of large amounts of data, for example, cosmology, image processing, and pattern recognition, it is known that for the analysis of complex systems, standard (i.e., linear) techniques are of limited use in extracting essential information [30]. A more sophisticated approach is needed to reduce such systems to their essential components and to derive their characteristics by measurement of corresponding properties [31]. Without further confinement, we can assume that the texture of breast tissue depicted at mammography has linear and nonlinear properties and as such can be considered a complex system. In this study, we focused on development of an analytic technique for extracting nonlinear textural information from digital mammograms. The nature of Minkowski functionals sufficiently justifies their application in characterizing the parenchymal texture depicted on mammograms. Measurements based on Minkowski functionals integrate image information from all gray levels and operate independently of binarization of the data to extract structural properties.
Limitations
The study was designed to test the feasibility of the proposed
image-processing method. The classification scheme was based on the ratio of
opaque to translucent tissue estimated by experienced human readers. We used a
three-category system to analyze the topologic properties of breast
attenuation. We realize that BI-RADS requires four categories, but a
substantially larger number of training cases would have to be processed.
Because this study included healthy screening patients, rather than performing biopsies on all women, which would have been highly unreasonable ethically, we used as the reference standard readers' ratings reproduced with a computer-aided detection system. The limited resolution of our diagnostic scheme in its present form probably compromised the detection of a statistically significant relation between the composition estimates and the risk of breast cancer. Therefore, we did not choose the latter as our primary research goal. Our goal was development of an accurate method of estimating tissue composition with digital mammography. Ultimately, such methods have to be calibrated and tested against an objective reference standard, such as volumetric attenuation measurements of cadaveric specimens imaged with CT or in vivo findings on MR images of the breast.
The size of ROI used in our study was a limitation because the ROI represented different percentages of breast area for women with different breast sizes. Because it is important to incorporate breast size in the analysis, in a future population study we plan to vary the sizes of ROIs used for different sizes of breasts. Use of the entire breast area as the ROI is ideal for evaluating percentage attenuation of a breast. In this preliminary study we decided to measure parenchymal attenuation in the single ROI used for topologic classification.
Some limitations were related to the comparatively small number of patients or, rather, data sets in our study. The study population was not large enough to be divided into a training set and a testing set. Leave-one-out cross validation allowed us to mitigate this problem. With this technique, we found that the variation in AUC for the topologic textural measurement and for the quantities obtained from the attenuation histogram had the same order of magnitude, which indicates that our method is stable and the results are reproducible.
Comparison with Other Studies
Karssemeijer [22] applied a
K–nearest neighbor classifier to features calculated from gray-level
histograms in selected regions of 615 digitized mammograms. Agreement with the
subjective rating of a radiologist was achieved in 67–80% of the cases.
Sivaramakrishna et al. [32]
evaluated 32 digitized mammograms by introducing variation images to enhance
the contrast between dense and fatty areas. Kittler's optimal threshold was
used to segment the densities with high correlations (R =
0.92–0.95) between a human reader and the automated procedure.
Sheshadri and Kandaswamy [33] used the four-category scheme of BI-RADS [34] to classify on the basis of textural features on gray-level histograms 320 mammographic images obtained from the mini–Mammogram Image Analysis Society database. In 80% of cases, the classification results of the reading radiologist and the algorithm were in concordance. Martin et al. [35] used an automated segmentation procedure (mammographic density estimation) to evaluate digitized mammograms of 65 women. Using a threshold-based gray-level histogram, the investigators estimated the proportional area of glandular tissue in relation to the whole area occupied by breast tissue. Using a classification scheme related to BI-RADS, they observed a close correlation between the ratings of the human reader and the mammographic density estimation (R = 0.89).
Our algorithm is different from those in the foregoing studies in the sense that the original information contained in the imaging data is not cropped by segmentation or thresholding of glandular tissue. All gray-level information enters the analysis. In the methods oriented to gray-level histograms, the textural information is widely lost owing to the averaging nature of this type of analytic tool. Findings in our previous studies [23, 24] of complex systems showed that nonlinear analytic tools were best suited for description and evaluation of the data.
Future Studies
In general, the results of topologic analysis are presented on a continuous
scale. If the scale can be calibrated against an objective, quantitatively
accurate standard, such correlations may be available in the future. Such an
objective standard may be obtained experimentally with volumetric analysis of
breast-tissue signal intensity from MRI data in vivo or, possibly, from CT
data on cadaveric breast specimens.
We designed our study strictly to evaluate whether breast tissue can be characterized on the basis of topologic properties of attenuation patterns on digital mammograms. We consider our findings preliminary in the sense that the results are evidence of the general feasibility of the method itself, that is, that the ratings of a panel of human readers with respect to predefined categories can be closely reproduced. Our findings should clear the path for a larger project with parameters obtained with optimization procedures from our data sets.
The data used for the statistical experiments consisted of a set of ROIs obtained at approximately the same anatomic region of the craniocaudal mammographic projection of the right breasts of 100 subjects. Because of the limited number of samples used in this preliminary study, it was not possible for statistical reasons to increase the number of categories. A logical extension of our initial approach would be to apply the analysis to the whole breast, preferably using both craniocaudal and mediolateral oblique projections.
An effective way to conduct objective density measurements based on average areal x-ray attenuation and on topologic analysis with Minkowski functionals, which by definition are additive, would be to apply a fully automatic procedure to segmentation of breast tissue and placement of the ROI. For future studies, in the interest of greater precision and because of the large amount of data involved, it will be desirable to use an automated segmentation procedure. The proposed algorithm may then become more precise in the sense that we will be able to, for example, express the results in terms of areal fractions of the different categories that will lead to the diagnosis.
Future activities of our group will include application of topologic textural analysis to quantification of mammographic alterations, such as those due to hormone replacement therapy, at follow-up examinations. Another interesting topic will be characterization and automated detection of masses by use of topologic properties.
Conclusion
The proposed method based on topologic analysis of the gray-level
distribution of digital mammograms with Minkowski functionals is simple and
robust, and the results are in good agreement with radiologists' ratings with
respect to the three categories, normal tissue, atrophy, and fibrosis. The
rate of correct classification is significantly higher than that of
characteristics derived from gray-level histograms. This finding suggests that
our technique may be useful for objective and quantitative evaluation of
breast parenchymal structure. The technique is automatic and easily
reproducible and removes observer variability.
As the next step, in-depth evaluation of the efficiency of our proposal in conjunction with larger databases is necessary. Future applications of the approach may include quantification of textural alterations with follow-up examinations and characterization and automated detection of breast lesions.