Original Research
1 Department of Radiology, The Elizabeth Wende Breast Clinic, 170 Sawgrass Dr.,
Rochester, NY 14620.
2 Department of Radiology, Roswell Park Cancer Institute, Buffalo, NY.
3 Ovation Research Group, Highland Park, IL.
4 Digital Radiography Systems Division, Analogic Corporation, Peabody, MA.
5 UCD School of Medicine and Medical Sciences, Dublin, Ireland.
6 Department of Mathematics and Statistics, University of New Hampshire, Durham,
NH.
Received November 5, 2004;
accepted after revision September 18, 2005.
Address correspondence to M. L. Zuley.
Abstract
MATERIALS AND METHODS. Five radiologists evaluated normal anatomy and features of 61 abnormalities in 48 full-field digital mammograms. A 9-point Likert scale was used to compare images on two identical soft-copy review workstations, one equipped with two 5-megapixel CRTs and the other with two 5-megapixel LCDs. Outcomes were evaluated using a random-effects analysis of variance model. Means and SEs were reported. Ninety-five percent confidence intervals and p values were calculated.
RESULTS. The two systems were equivalent for most features. The LCDs were rated better for the sharpness of mass margins (p = 0.011) and mass conspicuity (p = 0.050). For calcium features, the LCDs were rated better than the CRTs for lesion conspicuity (p = 0.010) and number of calcifications (p = 0.043). For architectural distortions, there was no statistically significant difference between the monitors in any of the features evaluated. For display characteristics, the LCDs were better for luminance (p = 0.021). The CRTs were significantly better for image noise (p = 0.001). In the overall ratings, there was no statistically significant difference between the two displays.
CONCLUSION. The 5-megapixel monochrome active-matrix LCD is equivalent to and in some respects better than the 5-megapixel CRT display for full-field digital mammograms over a range of normal and abnormal findings.
Keywords: breast, breast cancer, mammography, PACS
Eye fatigue is also a problem because CRTs require the screen to be
constantly refreshed, or repainted: the image on the screen relies on
light emitted from phosphors that fade quickly. Despite improvements in
refresh rates from 60 to 75 Hz under Video Electronics Standards
Association (VESA) standards, eye fatigue is still an issue for the
radiologist. In addition, the emissive nature of the CRT results in blooming
of the focused beam at the periphery of the monitor, which degrades resolution
in these areas. This emissive nature also creates veiling glare (unwanted
scattered light), which results in degradation of the overall contrast
resolution. CRTs are heavy (40 lb [18 kg] each); they have a large,
cumbersome footprint; and they have a high heat output, which requires
additional air conditioning. For these and other reasons, liquid crystal
display (LCD) monitors are coming into favor.
LCD monitors are lightweight (< 15 lb [7 kg]), have a small footprint, and are becoming less cost-prohibitive. They have a long life expectancy, and replacing the fluorescent lamp backlight has minimal cost impact. Refresh rates are not a concern with LCD monitors because they virtually hold a charge until updated. The design of the LCD does not require a focused beam to produce an image. Instead, an electric current is applied to a thin-film transistor (TFT), rendering the entire surface of the LCD a uniform resolution. Moreover, the flat-panel design provides a better overall resolution for a given display matrix, as illustrated in a study comparing the clinical impact of a 3-MP LCD with a 5-MP CRT for lung nodules (Siegel E et al., presented at the 2002 annual meeting of the International Society for Optical Engineering). Monitor luminance has been shown to be at least as important as monitor resolution [1-3], and with superior luminance of approximately 700 cd/m² (nearly double the luminance of the CRT), LCDs are an attractive alternative to CRTs. To date, some studies have shown equivalence between LCDs and CRTs in the display of radiographic abnormalities [4].
Our study was designed to determine whether the display of full-field digital mammograms on a 5-MP LCD monitor was at least equivalent to the display of the same images on a 5-MP CRT monitor. This study was performed in the context of trying to determine the optimal display of full-field digital mammography images and represents just one of many small steps necessary to integrate the display, storage, and retrieval of digital mammography into PACS, which so many of us already use.
Materials and Methods
Case Selection
The reports of all screening mammograms performed on one full-field digital
mammography unit (SenoScan, Fischer Imaging) from March 2003 to November 2003
were reviewed (n = 2,500). The hardcopy images of all cases that were
given a BI-RADS category of 2 or 0 were evaluated (n = 331) by the
study coordinator in medical record number order to obtain as random a sample
as possible. We then enriched the data set with missing lesion types and
breast densities. This was done with the intention of representing the
spectrum and frequency of mammographic abnormalities to include mass; calcium;
mass with calcium; and architectural distortion in dense, heterogeneously
dense, scattered, and adipose-replaced tissue types. The resultant data set
consisted of 48 cases containing 61 abnormalities including 30 mass lesions,
21 calcium lesions, and 10 architectural distortions. Asymmetric densities
were grouped in the mass category. Two of the 48 cases had a calcified mass.
The calcium and mass features for these two cases were evaluated separately by
the reviewers, and the results were included in the mass and calcium numbers.
Tables 1 and 2 show the distribution of
lesion features, breast density, and size. In addition, five normal cases were
chosen across the four tissue types.
APPENDIX 1: Mammographic Features Evaluated on the Likert Scale
The Likert scale design was based on the mammographic features detailed and defined by the American College of Radiology BI-RADS [5], which is familiar to all radiologists and quite complete in detailing mammographically found features. Five test cases were then reviewed by two board-certified radiologists who are fellowship trained in mammography. Based on this test cohort, the scale was adjusted for the full study group analysis.
Reviewer Training
Before starting the evaluations, the three radiologists who were not
involved in the Likert scale development participated in a training session to
become familiar with the scale, the definitions, and the image display
protocols. The test set of five mammograms that were initially used to refine
the Likert scale was also used for this reviewer training session.
Case Analysis
The study group was evaluated independently by the five radiologists. The
reviewers had an average of 8 years (range, 5-12 years) of experience in
screen-film mammography and 1 year of experience in soft-copy reviewing of
digital mammography. Each radiologist interprets approximately 30,000
mammograms per year in our practice. The evaluations were done with the cases
displayed in random order as to tissue type and abnormality. Because the study
was not done to evaluate each radiologist's interpretation skills, and to
avoid the possibility of inadvertent evaluation of the wrong lesion, the
radiologist was directed to the lesion of interest with a lesion-specific data
form that indicated the type and geographic location of the lesion of
interest. Four mass features were assessed: shape, margin sharpness, density,
and conspicuity. The calcium features analyzed were number, shape, and
sharpness of edges; distribution; and conspicuity. Four architectural
distortion features were evaluated: spiculation, density, parenchymal edge
distortion, and conspicuity. For all cases, including the five normal cases,
the skin was evaluated for thickness, subcutaneous tissue visibility, and
overall conspicuity. In addition, display characteristics, including
luminance, dynamic range, image sharpness, background homogeneity, image
distortion, display noise, and image size, were compared (see
Appendix 2 for definitions).
Last, for each mammogram, the reviewers were asked to provide an overall
assessment of the CRTs and LCDs from zero to 100% better for either monitor,
with zero being equivalent. This overall assessment was not tied to any word
anchors, allowing the reviewer to choose from the full range of the scale.
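The overall assessment described above can be stored as a single signed number per reading. The following is a minimal sketch, not the study's actual data-entry procedure: the function name and the sign convention (positive favors the LCDs, negative favors the CRTs) are assumptions made for illustration.

```python
def signed_overall_rating(preferred, percent_better):
    """Encode an overall assessment as one value in [-100, 100].

    preferred: "LCD", "CRT", or None when the monitors were judged equivalent.
    percent_better: 0 to 100, how much better the preferred monitor was rated.
    Sign convention (an assumption): positive favors LCD, negative favors CRT.
    """
    if not 0 <= percent_better <= 100:
        raise ValueError("percent_better must be between 0 and 100")
    if preferred is None or percent_better == 0:
        return 0  # zero means the two monitors were rated equivalent
    if preferred == "LCD":
        return percent_better
    if preferred == "CRT":
        return -percent_better
    raise ValueError("preferred must be 'LCD', 'CRT', or None")
```

With this encoding, a mean near zero across readings corresponds to equivalence, and the sign of the mean indicates which monitor was favored.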
APPENDIX 2: Definitions and Guidelines Used in the Likert Scale Evaluation
The Monitors
Two identical soft-copy review workstations were used to conduct the study.
One was equipped with and optimized for two 5-MP CRTs (MGD521M, Barco) with
Dome R5 (5-MP) display controllers (Planar Systems), and the other was
equipped with and optimized for two 5-MP monochrome active-matrix LCDs (Dome
C5i with Dome DX [5-MP] display controllers, Planar Systems). Images were sent
simultaneously to both workstations for display. Both monitor systems were set
to accept 8-bit images and display at 8 bits. Target monitor luminance was
chosen by the vendor and calibrated as would be typical in the clinical
setting, with the CRT pair set at 300 cd/m², and the LCD pair set
at 550 cd/m². Before each session, quality assurance testing was
performed as specified by the vendor.
Full-Field Digital Mammograms
The full-field digital mammography images were acquired using the SenoScan.
This system acquires images at 12 bits and then performs a 12- to 8-bit
transformation on the data before sending the images to the soft-copy review
workstation for display. This process is standard for this unit. The
acquisition and display bit depth were not altered for this study.
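The vendor's 12- to 8-bit transformation is not described here, so the following is only an illustrative sketch of the general idea, assuming a simple linear window-and-level mapping. The function name and default parameters are hypothetical and do not represent the SenoScan processing.

```python
def rescale_12_to_8(pixel, window_center=2048, window_width=4096):
    """Map a 12-bit pixel value (0-4095) to an 8-bit value (0-255).

    Assumes a linear window/level mapping: values at or below the window
    are clipped to 0, values at or above it are clipped to 255.
    """
    low = window_center - window_width / 2
    high = window_center + window_width / 2
    if pixel <= low:
        return 0
    if pixel >= high:
        return 255
    # Linear interpolation inside the window, rounded half up.
    return int((pixel - low) / (high - low) * 255 + 0.5)


# With the full-range window, the 4,096 input levels collapse onto 256 outputs.
distinct_outputs = len({rescale_12_to_8(p) for p in range(4096)})
```

Narrowing the window (for example, `window_center=1000, window_width=200`) spreads a smaller range of acquired values over the same 256 display levels, which is the effect a reader obtains when adjusting window and level at the workstation.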
Data Collection
Two 1-hour reviewing sessions were arranged for each radiologist to
evaluate the 53 cases (five normal and 48 with abnormalities). The reviewers
were blinded to the other reviewers' results. During each session, the
research assistant filled in all the data on the Likert scales. The cases were
reviewed in acquisition-date order so that they were random with respect to
tissue and lesion type. The images were viewed on paired monitors, with the
5-MP LCDs in the center immediately adjacent to each other and the 5-MP CRTs
flanking the LCDs (Fig. 1). The
radiologists had the ability to put any pair of monitors in black-screen mode
if the light was limiting evaluation of the other pair. The images were always
first displayed with identical window and level parameters on both sets of
monitors; however, the radiologist could change the window and level of any of
the images during the study. The cases were interpreted in a standard
reviewing room without ambient light. A definition and guideline sheet of each
item to be assessed was prepared and provided to the reviewers
(Appendix 2). Once the session
was over, the radiologist could not change any responses.
Hanging Protocol
The hanging protocols between comparison components were matched. The cases
were displayed on each set of monitors with the following hanging protocol:
first, all four views of the four-view mammogram were hung on one monitor;
next, the bilateral craniocaudal (CC) views were hung simultaneously, one each
on the right and left monitor of each set, followed by the bilateral
mediolateral oblique (MLO) images displayed in the same way. Finally, the two
views of each
breast (CC and MLO views) were displayed again at full resolution, one on each
monitor. The monitors were all turned slightly toward the midline so that the
reviewer had as close to a perpendicular viewing angle to the face of each
monitor as possible.
Statistical Analysis
All outcomes were evaluated using a random effects analysis of variance
model in which both reviewer and case are treated as random effects. The
advantage of this model is that the p values and confidence intervals
are calculated in a way that takes into account the likely clustering of
ratings for the same case reviewed by multiple reviewers and the likely
clustering of ratings for the same reviewers in their reviews of multiple
cases. In this way, the results may be projected to a new reviewer
interpreting a new case. For each outcome, the mean and model-based SEs were
reported. A 95% confidence interval and the p value were calculated
for each characteristic for which all the reviewers did not rate the monitors
as equivalent.
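The exact model specification is not given in the text. One common way to write a random-effects model with crossed reviewer and case effects, offered here as an assumed illustration rather than the authors' formulation, is the following, where the symbols are introduced only for this sketch:

```latex
% y_{ij}: rating difference between displays given by reviewer i for case j
\[
  y_{ij} = \mu + r_i + c_j + \varepsilon_{ij}, \qquad
  r_i \sim N(0, \sigma_r^2), \quad
  c_j \sim N(0, \sigma_c^2), \quad
  \varepsilon_{ij} \sim N(0, \sigma^2)
\]
% 95% confidence interval for the mean difference \mu
\[
  \hat{\mu} \pm t^{*} \cdot \mathrm{SE}(\hat{\mu})
\]
```

Under this form, the reported mean corresponds to the estimate of \(\mu\), the p value tests the hypothesis \(\mu = 0\) (no difference between displays), and the reviewer and case variance terms widen the SE to reflect clustering of ratings within reviewers and within cases.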
Interobserver variability for the five reviewers was reported when the difference between systems was statistically significant. In these instances, the range of individual reviewer averages was reported. Intraobserver variability could not be truly evaluated because each reviewer interpreted each case only once. However, intraobserver variability was addressed by presenting ranges of individual case averages across reviewers. This is the case variability, which is defined as the average preference that the reviewers had for each case given any particular feature. This is important to illustrate that the results were not skewed by any one case but also shows the extent to which the level of preference did differ over cases.
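The reviewer and case ranges described above are simple summaries of per-reviewer and per-case averages. A minimal sketch follows, assuming ratings are stored as (reviewer, case, score) tuples with positive scores favoring the LCD; the function name, data layout, and example values are hypothetical and are not study data.

```python
from collections import defaultdict
from statistics import mean


def average_ranges(ratings):
    """Summarize ratings given as (reviewer, case, score) tuples.

    Returns the overall mean plus the (min, max) range of the
    per-reviewer averages and of the per-case averages.
    """
    by_reviewer = defaultdict(list)
    by_case = defaultdict(list)
    for reviewer, case, score in ratings:
        by_reviewer[reviewer].append(score)
        by_case[case].append(score)
    reviewer_means = [mean(scores) for scores in by_reviewer.values()]
    case_means = [mean(scores) for scores in by_case.values()]
    return {
        "overall_mean": mean(score for _, _, score in ratings),
        "reviewer_range": (min(reviewer_means), max(reviewer_means)),
        "case_range": (min(case_means), max(case_means)),
    }


# Made-up example: two reviewers, two cases.
example = [("R1", "A", 0), ("R1", "B", 10), ("R2", "A", 5), ("R2", "B", 25)]
summary = average_ranges(example)
```

A narrow reviewer range alongside a wider case range would indicate that reviewers agreed on the direction of preference while the size of the preference varied from case to case.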
Results
Calcium Features
The LCD monitors were either equivalent to or better than the CRTs with
respect to calcium features (Table 4). In particular, reviewers favored the
LCD for conspicuity (6.2%
better, p = 0.010; variation in reviewer averages, 0-10.7% better;
variation in case averages, 5.0% worse to 25.0% better). Reviewers also
favored the LCDs for number of calcifications (2.4%, p = 0.043;
variation in reviewer averages, 0-6.0% better; variation in case averages,
0-15.0% better). The LCD and CRT displays did not differ with respect to
shape, sharpness of edges, or distribution of calcifications.
Architectural Distortion Features
The two displays did not differ significantly for architectural distortion
features (Table 5). The
observed differences favoring the LCD system fell well short of statistical
significance (p = 0.588 for spiculation, p = 0.323 for
parenchymal edge distortion, and p = 0.802 for conspicuity). A rating
of "no difference" was given by every reviewer for every case for
the density feature of architectural distortion, so no SE or inferential
statistics could be computed; however, the consistent rating of zero is
clearly strong evidence in favor of equivalence.
Evaluation of Display Features for All Cases, Including the Normal Set
In the evaluation of display features, which included both the normal cases
and the cases containing abnormalities
(Table 6), the LCDs were
significantly better with respect to luminance (14.3%, p = 0.021).
Individual reviewer averages ranged from 2.2% to 26.5% better for luminance.
The CRTs had a significant advantage for image noise (2.8%, p <
0.001; variation in reviewer averages, 0.5-4.8% better; variation in case
averages, 0-10% better). The two displays did not significantly differ for
dynamic range, skin thickness, subcutaneous tissue, conspicuity of the skin
and subcutaneous tissues, image distortion, display noise, or image size.
Overall
In addition to the feature-by-feature rating, an overall rating was
provided on a continuous -100% to +100% scale (Table 7). The LCD and CRT
displays were both statistically and clinically equivalent for evaluation of
all features included.
Discussion
This study showed that the 5-MP LCD display is equivalent to and in some respects better than the 5-MP CRT display for full-field digital mammograms over a range of mammography cases. We found that the LCD monitors showed improved calcium conspicuity. Likely, these results are at least in part due to the fixed matrix of the LCDs. This fixed matrix creates a crisper image compared with a CRT because each pixel in the LCD matrix remains constant over time in both location and size. In contrast, each pixel in the matrix of the CRT varies over time in both location and size, producing an inherent slight blurring of pixel edges. The light source of the CRT is fired thousands of times per second to create the resultant image. This constant refreshing of each pixel is minutely variable in location rather than fixed. Further, because the technology is based on a fired light source, there is slight divergence of the light beam as it travels through space, producing a slight blur.
These two aspects of the CRT result in a slightly smoother display than the LCD, but one that is less sharp. The fixed matrix of the LCD produces a more grainy display. Image noise was the only characteristic that was found to be statistically superior on the CRT. However, in our study this image noise, or graininess, did not affect the radiologists' ability to evaluate lesion features. The structured noise inherent in the design of the TFTs likely accounts for the noise seen on the LCDs in this study. Newer TFT designs are eliminating this problem. Background homogeneity was also rated superior on the CRTs but fell short of statistical significance. The reduction of background homogeneity on the LCDs is due to the limitations of the viewing angle and luminance falloff from off-axis viewing with the LCDs. One way to limit this problem is to carefully maintain a viewing angle as close to 0° as possible (straight-on viewing or perpendicular to the screen). Further, CRTs typically have a blacker background than LCDs. It is our opinion that the increased luminance and wider dynamic range of the LCDs more than compensate for this difference and that this difference did not affect our ability to assess lesion features.
Among the lesion features for which differences were identified between systems, there was limited interreviewer variability. In these instances, all five reviewer averages showed either no preference or a preference for the same system. Case variability was somewhat greater. However, there was no outlier case that skewed the data in favor of one system or another for any of the features. For example, the CRT was not preferred for any single case or by any reviewer for number of calcifications, and the LCD was not preferred for any single case or by any reviewer for image noise. Finally, in the overall assessment of the two monitors, no statistically significant difference was noted.
There are several limitations of this study. The comparison was done with an 8-bit display system using images that are intended to be displayed at 8 bits. Some full-field digital mammography manufacturers display at 10 bits. The industry standard for display of digital radiography has historically been set at 8 bits based on research showing that the human eye can realistically see approximately 256 shades of gray (or 8 bits). The monitors evaluated in this study have an 8-bit control card and 8-bit display. It has not yet been shown that 8 bits is the optimal bit depth for the display of digital mammography. In fact, there is some controversy among digital mammography manufacturers regarding the optimal bit depth for acquisition or display of digital mammograms. Full-field digital mammography images are typically acquired at a predetermined bit depth (12-14 bits) and then undergo processing to transform the images to 8-12 bits for transfer. Finally, the images that arrive at the soft-copy review workstation may require that the display control card perform another transformation for final display at 8-10 bits. Users should understand the acquisition bit depth, the hardware that performs this processing, and the display bit depth of the system they are using. The work to determine the optimal bit depth for display of full-field digital mammography still needs to be done.
The physical setup of placing the LCDs in the center flanked by the CRTs may also have presented a potential bias. This setup was chosen to eliminate problems with off-angle viewing on the LCDs. We tested the arrangement of the LCDs flanking the CRTs before starting the study. The luminance falloff and "rainbow" effect resulting from this arrangement caused a significant loss in perceptible image quality of the LCDs, so this arrangement was not used in the trial. Conversely, because all of the reviewers had significant previous experience interpreting full-field digital mammograms on CRTs, a potential bias existed toward that with which the radiologists were already comfortable, namely, the CRTs. Unfortunately, in this study design, it is impossible to eliminate all bias because it is readily apparent which monitor is which.
We considered these potential biases unavoidable for a direct side-by-side comparison of the images. An alternative design would have been to evaluate the monitors in different reviewing sessions, but because we were using the CRTs as the gold standard and the details of mammographic lesions are so subtle, we thought the most accurate comparison would be side by side so that these subtle differences could be detected and evaluated.
In summary, we found that 5-MP flat-panel monitors are at least equivalent to and in some aspects superior to 5-MP CRTs in the display of full-field digital mammographic images.