Original Research
1 Department of Radiology, Seoul National University Bundang Hospital, Seoul
National University College of Medicine, Institute of Radiation Medicine, and
Seoul National University Medical Research Center, 300 Gumi-dong, Bundang-gu,
Seongnam-si, Gyeonggi-do, Seoul 463-707, Korea.
2 Department of Radiation Applied Life Science, Seoul National University
College of Medicine, Seoul, Korea.
3 Department of Computer Science, University of British Columbia, Vancouver, BC,
Canada.
Received November 25, 2007;
accepted after revision February 4, 2008.
This work was supported by the Korea Research Foundation Grant funded by
the Korean Government (MOEHRD) (KRF-2006-311-D00168).
Abstract
MATERIALS AND METHODS. Twenty images were compressed reversibly and irreversibly to 6:1–30:1. To analyze two regions separately (the lungs, and the chest wall and mediastinum), the compressed pixels outside each tested region were replaced with the original pixels. By comparing the compressed and original images, three radiologists independently rated the compression artifacts as grade 0, none (indistinguishable); 1, barely perceptible; 2, subtle; or 3, significant. At each compression level, the two regions were compared for the readers' responses, peak signal-to-noise ratio (PSNR), and HDR-VDP results. Wilcoxon's signed rank tests and exact tests for paired proportions were used with a p value threshold of 0.05.
RESULTS. Artifacts were rated as lower grades in the lungs than in the chest wall and mediastinum, with statistically significant differences at 10:1–20:1 for reader 1, 8:1–15:1 for reader 2, and 8:1–20:1 for reader 3. Grade 0 was more frequent in the lungs, with statistically significant differences at 10:1 for reader 1 and at 8:1–10:1 for readers 2 and 3. The PSNR results indicated greater artifacts in the lungs (p < 0.001), whereas the HDR-VDP results indicated fewer artifacts in the lungs (p < 0.001).
CONCLUSION. Although compression artifacts are mathematically greater in the lungs than in the chest wall and mediastinum, radiologists' artifact perceptions are the opposite, which can be successfully reproduced by HDR-VDP.
Keywords: artifacts, CT, data compression, image quality metric, low-dose CT, lung cancer screening, visually lossless threshold
In the vision science fields, computer-based image quality metrics modeling the human visual system have been developed to predict human perceptions of image distortions [9]. If such a perceptual metric can accurately reproduce the aforementioned regional variance in compression artifacts perceived by radiologists, it can potentially be used for region-based adaptive compression. Theoretically, the compression level for a given image can be adjusted and optimized for the region of main clinical interest in that image, or multiple regions in an image can be compressed to different levels so that minimal perceptual artifacts are evenly distributed throughout the image. Low-dose chest CT would be a promising target for such intelligent compression because a chest CT image can be largely divided into two distinct regions (the lungs, and the chest wall and mediastinum) and the diagnostic task is usually focused on the lungs.
Study Sample
A chest radiologist with 5 years of clinical experience compiled 20
consecutive examinations in which he detected lung nodules during his clinical
work for a week in May 2007. From each examination, he selected a transverse
CT image that represented the lung nodule most clearly. The patients were 16
men and four women, ranging in age from 33 to 77 years (mean, 56.4 years). The
20 selected images contained 25 nodules of 2–28 mm (median, 4 mm) in
diameter. More than one nodule appeared in five images. The nodules were solid
(n = 13), calcified (n = 9), or ground-glass opacity
(n = 3) and were located within the peripheral third (n =
16) or central two thirds (n = 9) of a lobe.
CT
Either a 16-MDCT (n = 9) or a 64-MDCT (n = 11) scanner
(Brilliance, Philips Healthcare) was used. Image acquisition parameters were
as follows: detector collimation, 1.5 or 0.625 mm; gantry rotation time, 0.75
or 0.5 second; tube potential, 120 kVp; effective mAs, 24–32, using the
automatic tube current modulation; pitch, 1.19–1.25; reconstruction
thickness, 5 mm; reconstruction filter, medium-sharp (type C); matrix, 512
× 512; and field of view, 263 and 385 mm.
Image Compression
Each image had a bit depth of 12 bits/pixel, and each pixel was packed on a
2-byte boundary with four padding bits. Using a JPEG 2000 algorithm (PICTools,
version 2.00.543, Pegasus Imaging Company), each original image was compressed
to seven different levels (ratio of original 16 bits/pixel to compressed size
in bits/pixel) [12]:
reversible and irreversible 6:1, 8:1, 10:1, 15:1, 20:1, and 30:1. The JPEG
2000 encoder was set to default settings: reversible 5–3 wavelet filter
or irreversible 9–7 wavelet filter; single tile; wavelet decomposition
level, 6; code-block, 64 × 64; size of precinct, 32,768 × 32,768;
and a single layer.
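The compression ratio defined above (original 16 bits/pixel divided by the compressed size in bits/pixel) can be turned into a target bit rate with simple arithmetic. The following is a minimal sketch of that conversion, not the encoder's interface; the function names and the 512 × 512 default are illustrative assumptions taken from the matrix size reported for these images.

```python
# Irreversible compression ratios tested in the study (ratio:1).
RATIOS = [6, 8, 10, 15, 20, 30]

# 12-bit pixel data packed on a 2-byte boundary, so the ratio is relative to 16 bits.
STORED_BITS_PER_PIXEL = 16


def target_bits_per_pixel(ratio, stored_bits=STORED_BITS_PER_PIXEL):
    """Bits/pixel of the compressed stream for a given ratio (ratio:1)."""
    return stored_bits / ratio


def target_file_bytes(ratio, rows=512, cols=512, stored_bits=STORED_BITS_PER_PIXEL):
    """Approximate compressed size in bytes for a rows x cols image."""
    return rows * cols * stored_bits / ratio / 8


# Hypothetical lookup table: ratio -> bits/pixel.
rates = {ratio: target_bits_per_pixel(ratio) for ratio in RATIOS}
```

Under these assumptions, an 8:1 compression corresponds to 2 bits/pixel, or about 64 KB for a single 512 × 512 image that occupies 512 KB uncompressed.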
On each original image, a chest radiologist marked the lung and thorax silhouettes by carefully drawing lines along the pleura and skin surface, respectively, using a graphic tablet (Graphire Pen Tablet, Wacom Technology). The lung silhouettes were then superimposed over corresponding compressed images, where the pixels outside the lungs were replaced with the corresponding pixels from the original images. Therefore, each of the 140 resulting images (20 images x 7 compression levels) had the pixels from the compressed images inside the lungs, and the pixels from the original images outside the lungs. Likewise, we prepared 140 additional images, of which the pixels inside the chest wall and mediastinum (between the lung silhouettes and the thorax silhouette) were from the compressed images and the remainder were from the original images (Fig. 1). A bit depth of 12 was maintained throughout this procedure. Hereafter, we refer to these 280 masked compressed images as "compressed images." For subsequent analyses, window level and width were set at a lung setting, –600 and 1,500 H.
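The masking step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the software used in the study: images and masks are represented as flat sequences of equal length, and the names `composite_roi` and `chest_wall_mask` are hypothetical.

```python
def chest_wall_mask(thorax_mask, lung_mask):
    """Chest wall and mediastinum: inside the thorax silhouette but outside the lungs."""
    return [bool(t) and not bool(l) for t, l in zip(thorax_mask, lung_mask)]


def composite_roi(original, compressed, roi_mask):
    """Keep compressed pixels inside the ROI and original pixels everywhere else."""
    if not (len(original) == len(compressed) == len(roi_mask)):
        raise ValueError("inputs must have the same number of pixels")
    return [c if inside else o
            for o, c, inside in zip(original, compressed, roi_mask)]


# Tiny made-up example with four pixels (values are not real CT data).
original = [1, 2, 3, 4]
compressed = [9, 8, 7, 6]
lung = [True, False, False, False]
thorax = [True, True, True, False]

lung_image = composite_roi(original, compressed, lung)
wall_image = composite_roi(original, compressed, chest_wall_mask(thorax, lung))
```

Because pixel values are copied without rescaling, the bit depth of the inputs is preserved in the composite, in line with the 12-bit depth maintained in the procedure above.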
Human Visual Analysis
Three board-certified body radiologists with 11, 8, and 7 years of clinical
experience participated. Each of the 280 compressed images was paired with its
original. The 140 image pairs with compressed lung regions were assigned to
seven reading sessions, and the repetition of any patient's images in a
session was avoided. Likewise, the 140 image pairs with the compressed chest
wall and mediastinum region were assigned to seven reading sessions. The order
of the 14 reading sessions was randomized for each reader. Sessions were
separated by a minimum of 3 weeks. At the beginning of each session, readers
were informed which region—either lungs or chest wall and
mediastinum—should be analyzed for possible artifacts.
Each image pair was alternately displayed on a single monitor, while the order of the original and compressed images was randomized. The reader selectively toggled between the two images, returning to the first image as desired. Each reader, blinded to the compression levels, independently determined if the two images were indistinguishable (grade 0) or distinguishable. Any perceived differences between the two compared images were regarded as perceptual artifacts. If an image pair was rated as distinguishable, the readers were asked to grade the image difference (or compression artifacts) as follows: grade 1, barely perceptible; 2, subtle and would not affect the diagnosis; or 3, significant and potentially affecting diagnosis. When making comparisons for the lung, the readers were asked to pay attention to the lung nodules as well as to other structural details (i.e., the small airways, pulmonary vessels, interlobular septa, and interlobar fissures) and the texture of the pulmonary parenchyma. To help the readers to identify nodules, hard-copy images with marks for nodules were provided during the review. When making comparisons for the chest wall and mediastinum, readers were asked to pay attention to the texture of the chest wall and mediastinum and the structural details in the ribs, vertebrae, mediastinal lymph nodes, heart, and great vessels.
Image review was conducted at each reader's convenience without time constraints. The reading distance was limited to a range of 41–56 cm by aiming a laser beam in front of each reader's forehead to a ruler perpendicular to the monitor. The readers' habitual viewing distances had been measured during 30 minutes of their clinical work. Limiting the reading distance was meant to reproduce our clinical practice because reading from too close or too far a distance would artificially enhance or degrade the readers' sensitivity to compression artifacts [8].
PSNR
After converting the images to 8-bit images by applying the lung window
setting, we calculated PSNR (in decibels [dB]) for each ROI in the
irreversible compressions, as follows:
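$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^{2}}{\mathrm{MSE}}\right)$$
$$\mathrm{MSE} = \frac{1}{N}\sum_{(i,j)\in \mathrm{ROI}} \left[f(i,j) - g(i,j)\right]^{2}$$
where f(i, j) and g(i, j) are the 8-bit pixel values of the original and compressed images, respectively, and N is the number of pixels in the ROI.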
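As a rough illustration of the PSNR step, the sketch below applies a lung window (level, –600 H; width, 1,500 H) to obtain 8-bit values and then computes PSNR over the ROI pixels only. It assumes a linear window mapping and the conventional PSNR definition with a peak value of 255; the function names are hypothetical and the study's exact implementation is not given in the text.

```python
import math


def apply_window(hu, level=-600.0, width=1500.0):
    """Map a CT number to an 8-bit display value with a linear window (assumed)."""
    low = level - width / 2.0
    scaled = (hu - low) / width * 255.0
    clipped = min(255.0, max(0.0, scaled))
    return int(clipped + 0.5)  # round half up; the value is never negative here


def roi_psnr(original_8bit, compressed_8bit, roi_mask, peak=255.0):
    """PSNR in dB computed only over pixels where roi_mask is true."""
    errors = [(o - c) ** 2
              for o, c, inside in zip(original_8bit, compressed_8bit, roi_mask)
              if inside]
    if not errors:
        raise ValueError("ROI is empty")
    mse = sum(errors) / len(errors)
    if mse == 0:
        return math.inf  # identical pixels inside the ROI
    return 10.0 * math.log10(peak ** 2 / mse)
```

A smaller PSNR means a larger mean squared difference inside the ROI, which is how the text interprets "greater mathematic artifacts."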
Perceptual Metric
HDR-VDP is a publicly available [11] perceptual metric that
simulates human perception mechanisms by taking into account local adaptation,
the photoreceptors' nonlinear response, contrast sensitivity, and
frequency-selective channels in the human visual system. Unlike other
perceptual metrics [9], HDR-VDP
can cover the full visible range of luminance (high dynamic range). Because
modern medical display systems offer higher dynamic range and are
significantly brighter than older cathode ray tube displays, HDR-VDP is likely
the most suitable perceptual metric for our application. HDR-VDP takes two
images as input and then outputs a probability-of-detection map in which the
pixel value indicates the probability that a human observer viewing the two
images can detect a difference at that pixel location
[10].
The model prediction was performed on each ROI in the irreversibly
compressed images after transforming each 8-bit image to a high-dynamic-range
luminance format, according to the display function of our display
system. The same viewing conditions (matrix size, display size, reading
distance range, and maximum luminance) as those for the human observers were
used. The Minkowski metric
[14] with a summation
parameter (β) of 2.4 was used to summarize the probability-of-detection
map in a single value [9].
Similar to PSNR, the Minkowski metric was modified to consider only the ROI,
as follows:
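$$M = \left(\frac{1}{N}\sum_{(i,j)\in \mathrm{ROI}} \left|p(i,j)\right|^{\beta}\right)^{1/\beta}$$
where p(i, j) is the probability of detection at pixel (i, j), N is the number of pixels in the ROI, and β = 2.4.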
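The ROI-restricted Minkowski summation can be sketched as below. This is an assumption-laden illustration: it normalizes by the number of ROI pixels (by analogy with the mean squared error in PSNR), which the text does not state explicitly, and `minkowski_roi` is a hypothetical name rather than part of HDR-VDP.

```python
def minkowski_roi(prob_map, roi_mask, beta=2.4):
    """Pool a probability-of-detection map into one value over the ROI pixels.

    Assumes a normalized Minkowski summation: (mean(|p| ** beta)) ** (1 / beta).
    """
    values = [abs(p) for p, inside in zip(prob_map, roi_mask) if inside]
    if not values:
        raise ValueError("ROI is empty")
    return (sum(v ** beta for v in values) / len(values)) ** (1.0 / beta)
```

With β greater than 1, pixels with a high probability of detection contribute more to the pooled value than they would in a plain average, so a few clearly visible artifacts are not diluted by a large artifact-free area.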
Statistical Analysis
Interobserver agreement was measured using kappa statistics
[15]. At each compression
level, we compared the lungs and the chest wall and mediastinum for each
reader's grading, PSNR, and HDR-VDP results using Wilcoxon's matched pair
signed rank tests. In addition, the percentage of indistinguishable pairs
(grade 0) was compared using exact tests for paired proportions
[16]. A p value of < 0.05 was considered to indicate statistical significance.
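For the comparison of grade 0 percentages between the two regions, one common exact test for paired proportions is the exact McNemar (binomial) test on the discordant pairs. The sketch below assumes that variant; the text cites reference [16] without naming the specific procedure, and the counts in the example are hypothetical, not study data.

```python
from math import comb


def exact_paired_proportions_p(b, c):
    """Two-sided exact McNemar p value.

    b: pairs rated indistinguishable in region A only
    c: pairs rated indistinguishable in region B only
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # One-sided binomial tail with success probability 0.5, then doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical counts for illustration only.
p_value = exact_paired_proportions_p(0, 6)
```

With six discordant pairs all favoring one region, the two-sided p value is 2/64, or about 0.031, which would fall below the 0.05 threshold used here.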
Regarding the radiologists' binary responses (i.e., distinguishable or indistinguishable), kappa statistics were 0.70 and 0.67 for the lungs and the chest wall and mediastinum, respectively. The percentage of indistinguishable pairs (grade 0) also tended to be greater for the lungs than for the chest wall and mediastinum (Table 2). Although no significant difference was observed at the reversible and 6:1 compressions, the difference became statistically significant at 10:1 for reader 1 and at 8:1–10:1 for readers 2 and 3. At higher compressions, the statistical significance disappeared because all the readers rated most image pairs as distinguishable for both the lungs and the chest wall and mediastinum.
At each irreversible compression level, the PSNR was smaller, indicating greater mathematic artifacts, in the lungs than in the chest wall and mediastinum (p < 0.001) (Fig. 5), whereas HDR-VDP results were smaller, indicating fewer perceptual artifacts, in the lungs than in the chest wall and mediastinum (p < 0.001) (Fig. 6).
10:1) [1, 4–6],
our readers' response patterns for the lungs at a tested compression level
(e.g., 10:1) were very similar to those for the chest wall and mediastinum at
the adjacent lower compression level (e.g., 8:1). In other words, with the
same level of artifacts, the lung regions were more compressible than the
chest wall and mediastinum regions. Although the difference between 6:1 and
8:1 or between 8:1 and 10:1 is seemingly unimportant, the difference becomes
significant for the cumulative data of repetitive examinations of screening
populations. These results raise a previously unnoticed issue regarding compressing low-dose chest CT images: Based on the premise that minor artifacts outside the lungs are less significant, an entire image, which originally aims at pulmonary nodule detection, can be compressed to a higher level according to the acceptable threshold for the lungs. Alternatively, from a more conservative viewpoint, a lower compression level might be needed than the previously reported acceptable thresholds [1, 4–6] to avoid potential diagnostic inaccuracy for extrapulmonary abnormalities. An ideal situation would be that different compression levels could be applied to the lungs and the chest wall and mediastinum within a single image using ROI-coding techniques [17].
These results corroborate our previous observation of the regional difference in compression artifacts in standard-dose chest CT images [6]. However, apart from the different radiation dose, which is one of the important factors affecting the compression artifacts in CT images [7], a methodologic advance in this study should be noted. In this study, we analyzed the lungs and the chest wall and mediastinum separately after removing compression artifacts outside each ROI. Otherwise, the compression artifacts outside an ROI might have affected the visual analysis of the ROI. Such interference has been mentioned also by Ringl et al. [4] who used rectangular collimation to cover the chest wall in measuring an acceptable compression level for lung CT images. We filled the region outside the ROI with pixels from the original images instead of a constant value (e.g., gray). This was to reproduce all the viewing conditions, including luminance adaptation in the readers' visual system, in clinical practice. We believe this masking procedure enabled a more accurate measurement of the regional differences in compression artifacts.
Radiologists' perceptions of compression artifacts are affected, in large part, by two factors: One is an absolute change of pixel values that occurs during the quantization step of image compression (mathematic factor) and the other is how the human visual system perceives such distorted pixels displayed on a monitor (perceptual factor). It is well known that minute compression artifacts introduced by a very low level compression are usually imperceptible [3].
In our results, the PSNR was smaller in the lungs than in the chest wall and mediastinum, indicating more mathematic artifacts in the lungs. Most irreversible compression techniques, including JPEG 2000, exploit the fact that the human visual system is less sensitive to distortions in high-frequency signals than to those in low-frequency signals. They typically compress an image by transforming it from a spatial domain into a frequency domain and then discarding more of the high-frequency coefficients than of the low-frequency coefficients [18]. The sharp edges of the pulmonary vessels and bronchial wall, forming the pulmonary texture, would be represented as high-frequency coefficients, whereas the homogeneous grainy area in the chest wall and mediastinum would correspond to low-frequency coefficients. This would be a reasonable explanation for why more mathematic artifacts occurred in the lungs than in the chest wall and mediastinum.
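A toy example can illustrate why discarding high-frequency coefficients hurts sharply varying content more than smooth content. The sketch below uses a one-level Haar transform on a one-dimensional signal, which is a deliberate simplification: JPEG 2000 uses the 5–3 or 9–7 wavelet filters with quantization rather than outright removal, and the signals are made-up values, not CT data.

```python
def haar_discard_details_mse(signal):
    """MSE after a one-level Haar transform with all detail coefficients zeroed."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    squared_error = 0.0
    for k in range(0, len(signal), 2):
        a, b = signal[k], signal[k + 1]
        average = (a + b) / 2.0  # low-frequency (approximation) coefficient
        # With the detail coefficient set to zero, both samples
        # reconstruct to the pair average.
        squared_error += (a - average) ** 2 + (b - average) ** 2
    return squared_error / len(signal)


# Made-up signals: sharp alternation (texture-like) versus a gentle ramp.
edgy = [0, 100, 0, 100, 0, 100, 0, 100]
smooth = [50, 52, 54, 56, 58, 60, 62, 64]

edgy_mse = haar_discard_details_mse(edgy)
smooth_mse = haar_discard_details_mse(smooth)
```

In this toy case the alternating signal loses far more (mean squared error of 2,500 versus 1), which mirrors the direction of the PSNR difference between textured and homogeneous regions, although it says nothing about how visible the errors are.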
Contrary to the PSNR results, the radiologists responded that the lungs showed fewer artifacts than the chest wall and mediastinum. This discrepancy might be attributable to multiple factors that are not completely understood from this study. One plausible explanation is that the visual masking effect on the compression artifacts was more significant (that is, the visibility of the compression artifacts was lower) in textured areas (lungs) than in homogeneous areas (chest wall and mediastinum) [9]. Whatever the reason was, our results suggest that the perceptual factors can outweigh the mathematic pixel-wise distortion in the actual appearance of compression artifacts; therefore, perceptual factors relating to the display system and human visual system need to be stressed in medical image compression.
PSNR has been widely used to measure compressed image quality because of its computational simplicity; however, PSNR is known to correlate poorly with human artifact perception across a wide range of image content [9]. To overcome this limitation, several perceptual metrics [9] have been proposed and introduced into medical fields [19–23]. Among these metrics, HDR-VDP can cover a full visible luminance range [10] and, therefore, is probably the most suitable for medical applications using brighter displays. Recent studies [20, 23] showed promising results of HDR-VDP in predicting radiologists' perceptions of compression artifacts and its potential for automatically calculating an acceptable threshold of a given CT image. Although those studies were limited to per-image analysis, we extended the analysis to the regional difference within the same image. In our results, HDR-VDP successfully reproduced the regional difference in the radiologists' responses for the compression artifacts, as opposed to the PSNR results. This implies that the perceptual metric has a greater potential than PSNR to be adopted in a per-region adaptive compression technique.
In grading the compression artifacts, we relied on the readers' subjective decisions. Whether the artifacts rated as grade 1 or 2 are acceptable for clinical interpretation is debatable. These minute artifacts probably correspond, in part, to a denoising effect, which is known to be one of the first perceivable changes in an image as the compression level increases [3]. However, the denoising effect is inevitably accompanied by some degree of blurring at the same JPEG 2000 compression level, thereby altering inherent organ textures. To determine whether such minute artifacts can hinder diagnosis, larger studies using receiver operating characteristic analysis are required; however, such a study covering a broad range of potential abnormalities in the chest is likely unrealistic.
Therefore, to be more conservative, we additionally analyzed the readers' responses for the presence of any perceivable artifacts. We believe this analysis was also less subjective, although the readers' sensitivities still varied in artifact perception. If a compressed image is indistinguishable from the original, there is no basis for arguing that this compression hinders diagnostic accuracy [24]. This "visually lossless" criterion has been rapidly gaining support as a conservative and practical guideline for medical image compression [4, 8, 23–25]. Despite the aforementioned individual variation and subjectivity in our visual analysis, our results regarding the regional difference in the presence and relative magnitude of perceptual artifacts remain valid.
This study has other limitations. First, we tested images only with the lung window setting even for the chest wall and mediastinum. Whether the regional artifact difference would occur at other window settings remains uncertain. However, identifying the effects of different window settings on the perceptual artifacts was not the purpose of this study. We aimed to show the regional artifact difference in a given image—that is, a low-dose chest image with a lung window setting. Second, we did not analyze the artifacts according to different nodule characteristics. This subgroup analysis was precluded because of our small study sample. Such analysis has been performed by Ko et al. [1] who focused on nodule detection. Third, although we finally conducted 280 image comparisons (20 images x 7 compression levels x 2 ROIs), the number of tested original images was only 20. However, this small study sample size was large enough to show statistical significance for the difference between the lung and the chest wall and mediastinum.
In conclusion, although JPEG 2000 compression introduces greater mathematic artifacts in the lungs than in the chest wall and mediastinum in low-dose chest CT images with a lung window setting, radiologists perceive fewer artifacts in the lungs than in the chest wall and mediastinum. The tested perceptual image quality metric (HDR-VDP) can successfully reproduce such a regional difference in radiologists' perceptions of the compression artifacts.