1 Section of Cardiovascular Imaging, Division of Radiology/ Hb6, The Cleveland
Clinic Foundation, 9500 Euclid Ave., Cleveland, OH 44195.
2 Department of Biostatistics and Epidemiology, The Cleveland Clinic Foundation,
Cleveland, OH 44195.
Received September 26, 2003;
accepted after revision June 30, 2004.
Address correspondence to S. S. Halliburton
(hallibs{at}ccf.org).
Abstract
SUBJECTS AND METHODS. Fifty-six patients were imaged twice using an identical prospectively ECG-triggered sequential scanning protocol. The Agatston, volume, and mass scores were computed by two observers independently. In addition, a patient's total Agatston score was referenced to an age- and sex-stratified database to determine a percentile ranking. Interscan, interobserver, and intraobserver variability and the resultant impact on patients' risk stratifications were assessed.
RESULTS. Significant interscan differences were found for all mean coronary calcium scores (Wilcoxon's signed rank test, p < 0.0001). Although the median percentage of interscan variability was low for all scoring methods, the interquartile range was wide, indicating significant variability in the data. Median interscan variability (lower quartile–upper quartile) for observers 1 and 2, respectively, was as follows: Agatston, 5% (0–79%) and 6% (0–83%); volume, 12% (0–51%) and 12% (0–57%); and mass, 14% (0–57%) and 14% (0–58%). Interobserver and intraobserver differences between mean calcium scores were not significant, and consequently, lower interobserver and intraobserver variabilities (narrow interquartile ranges of 0–5%) were observed for all scores. Despite significant interscan differences in calcium scores, the percentile ranking assigned to the two scans differed in only 13% of patients. Interobserver differences resulted in a change in the percentile ranking in 7–9% of patients, whereas intraobserver differences caused a change in only 5% of patients.
CONCLUSION. The accuracy of sequential MDCT for coronary calcium quantification is sufficient in most cases for stratification of patient risk.
Additional studies have evaluated the reproducibility of calcium scoring with MDCT. The interscan variability in the measurement of coronary calcium from images obtained using a sequential scanning mode on a 4-slice MDCT scanner with a 500-msec rotation time was investigated in clinical studies [7]. Hong et al. [7] performed repeated sequential examinations in patients with mild to severe calcified plaque burden and reported a mean interscan variability in the aggregate calcium score in all vessels of 20.4% for Agatston, 13.9% for volume, and 9.3% for mass scoring. Decreased interscan variability was reported for MDCT images obtained with the same scanner using the helical scanning mode in both phantom and clinical studies [8, 9]. In the clinical study of a population with mild to severe plaque burden conducted by Ohnesorge et al. [9], the mean interscan variability was 12% for Agatston, 8% for volume, and 8–10% for mass scoring.
The clinical significance of the accuracy of a given patient's calcium score depends on the interpretation of results in clinical practice. Often prognostic information is obtained by referencing a patient's total calcium score to age- and sex-stratified databases [10–12]. A patient is assigned to a percentile range of risk on the basis of his or her total calcium score; the percentile range is defined by flexible thresholds that take into account the independent effects of age and sex on the amount of total coronary calcium. Therefore, to have a significant clinical impact, changes in a patient's total calcium score for a given examination must be great enough to change the patient's risk stratification.
The purposes of this study were to determine interscan, interobserver, and intraobserver variability in the quantification of coronary calcification with sequential MDCT using Agatston, volume, and mass scoring algorithms and to determine the potential clinical significance of the measured variability on patients' risk stratification.
Image Acquisition
All patients were imaged on a 4-slice MDCT scanner (SOMATOM Sensation 4,
Siemens Medical Solutions) using prospectively ECG-triggered sequential
techniques. Scanning parameters were as follows: gantry rotation time, 500
msec; tube voltage, 140 kV; tube current, 100 mA; slice thickness, 2.5 mm; and
temporal resolution, 250 msec. These parameters resulted in an effective dose
of 0.8 mSv for women and 0.7 mSv for men. After the initial scanning, patients
were asked to move around for 5 min, then they were repositioned on the table,
and the identical scanning protocol was repeated.
In addition to scanning patients, we performed monthly scanning of a calibration phantom (Anthropomorphic Cardio Phantom, Institute of Medical Physics, and QRM GmbH) containing a calcification insert with a known hydroxyapatite density and a water insert using the same scanning protocol developed for the patient studies. Calibration measurements from the phantom were used to determine a calibration factor and calculate absolute values of calcium mass.
Data Analysis
All images were transferred to a stand-alone workstation (NetraMD,
ScImage). Two cardiac radiologists who were unaware of the acquisition
technique used independently evaluated scans 1 and 2 obtained in each patient,
allowing at least 2 weeks between the interpretations of each scan. Scan 1 was
evaluated twice, with at least 2 weeks elapsing between the interpretation
sessions. Calcified lesions in the left main, left anterior descending, left
circumflex, and right coronary arteries were identified from images with a
threshold of 130 H.
Calcium was quantified by calculating the Agatston score, the volume score, and the absolute mass score. The Agatston score for each lesion in each coronary artery was computed in the standard way as the product of the area of each lesion and a weighting factor assigned according to the maximum attenuation value of the lesion [5]. The sum of the scores for all lesions in all coronary arteries yielded the total Agatston score. The volume score for each lesion was simply estimated as the product of the number of voxels containing calcium and the volume of one voxel [13]. The scores for all lesions in all coronary arteries were added to obtain the total volume score.
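The Agatston and volume definitions above can be illustrated with a minimal Python sketch. It is not the workstation software used in the study. The `LesionSlice` structure, the function names, and the example numbers are hypothetical. The weighting bands (1 for 130–199 H, 2 for 200–299 H, 3 for 300–399 H, 4 for 400 H or more) are the conventional Agatston weights and are assumed here, because the text refers to them only as the "standard way."

```python
from dataclasses import dataclass

THRESHOLD_H = 130  # detection threshold stated in the text


@dataclass
class LesionSlice:
    """One calcified lesion as seen on one slice (hypothetical structure)."""
    area_mm2: float   # lesion area on this slice
    max_h: float      # maximum attenuation value inside the lesion
    n_voxels: int     # number of voxels at or above the threshold


def agatston_weight(max_h: float) -> int:
    """Conventional Agatston weighting bands (assumed, not quoted from the source)."""
    if max_h < THRESHOLD_H:
        return 0
    if max_h < 200:
        return 1
    if max_h < 300:
        return 2
    if max_h < 400:
        return 3
    return 4


def total_agatston(lesions: list[LesionSlice]) -> float:
    """Sum of area x weight over all lesions in all arteries."""
    return sum(l.area_mm2 * agatston_weight(l.max_h) for l in lesions)


def total_volume(lesions: list[LesionSlice], voxel_volume_mm3: float) -> float:
    """Number of calcium-containing voxels times the volume of one voxel."""
    return sum(l.n_voxels for l in lesions) * voxel_volume_mm3


# Made-up example values, for illustration only.
example = [LesionSlice(4.0, 250, 10), LesionSlice(2.0, 450, 5)]
print(total_agatston(example))     # 4.0*2 + 2.0*4 = 16.0
print(total_volume(example, 0.5))  # 15 voxels * 0.5 mm^3 = 7.5
```

Because the Agatston weight jumps at fixed attenuation bands, a small change in the maximum attenuation of a lesion can change its score stepwise, whereas the volume score changes only with the voxel count.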
The calcium burden was also quantified in terms of absolute calcium mass [7, 14, 15]. The calcium mass score in a given lesion was calculated as the product of a calibration factor, the volume of each lesion, and the mean attenuation value of the lesion. The calibration factor ($C_{HA}$) was calculated as follows:

$$C_{HA} = \frac{\rho_{HA}}{CT_{HA} - CT_{water}}$$

where $\rho_{HA}$ is the density of the known calcification, $CT_{HA}$ is the mean attenuation value of the known calcification, and $CT_{water}$ is the mean attenuation value of water. The calibration factor was determined from monthly calibration measurements. The total mass score was the sum of the scores for all lesions in all coronary arteries.

All statistics were computed for the aggregate, or total, calcium scores summed over all coronary arteries. To assess the reproducibility of calcium scores determined from two scans, we computed the mean absolute difference between scores of the same type and compared it with zero, using Wilcoxon's signed rank test with a p value of 0.05 set as the level of significance. In addition, 95% confidence intervals (CIs) were constructed to determine the range of calculated values believed to encompass the actual value. The data were also examined for any pattern of bias by computing the Spearman's correlation coefficient between the mean value of a given calcium score from scan 1 and scan 2 and the absolute value of the difference in scores between scan 1 and scan 2. A correlation coefficient significantly different from zero suggests that the bias is linearly related to the magnitude of measurement.
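The bias check described above can be sketched in plain Python as follows. This is an illustrative reading of the method, not the authors' statistical code: the function names and sample scores are hypothetical, only the mean absolute difference and the Spearman coefficient are computed, and the significance tests (the signed rank test and the p value for the correlation) are deliberately left out.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bias_check(scan1, scan2):
    """Return (mean absolute difference, Spearman r of pair mean vs. absolute difference)."""
    means = [(a + b) / 2 for a, b in zip(scan1, scan2)]
    abs_diffs = [abs(a - b) for a, b in zip(scan1, scan2)]
    return sum(abs_diffs) / len(abs_diffs), spearman(means, abs_diffs)


# Made-up total scores for four patients, for illustration only.
mad, r = bias_check([10, 100, 400, 50], [12, 110, 460, 45])
print(mad, r)
```

In this made-up sample the absolute difference grows in step with the pair mean, so the coefficient is at its maximum of 1; a coefficient near zero would indicate no such pattern.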
In addition, percentage change of interscan variability was calculated for each scoring method as the mean of the absolute differences between scores from scan 1 and scan 2 divided by their mean value multiplied by 100%. The distribution of data is described using percentiles. The xth percentile is the value below which x percent of the data represented on a continuum lies and above which the remainder of the data lies. The 50th percentile is known as the median, and the 25th and 75th percentiles are known as the lower and upper quartiles, respectively [16]. The correlation between heart rate and the absolute difference in total calcium measurements from each scan was also evaluated. A correlation coefficient that is significantly different from zero suggests that the interscan differences depend on heart rate.
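As a minimal sketch of this calculation, the Python below computes the percentage of variability per patient as the absolute difference divided by the mean of the two scores, then summarizes the values with the median and quartiles. The per-patient reading, the choice to report 0% when both scores are zero, the function names, and the sample pairs are assumptions made for illustration. Quartile conventions differ between statistics packages, and the "inclusive" method used here is only one of them.

```python
import statistics


def percent_variability(score1: float, score2: float) -> float:
    """|s1 - s2| / mean(s1, s2) * 100, with 0% when both scores are zero."""
    mean = (score1 + score2) / 2
    if mean == 0:
        return 0.0  # no calcium on either scan
    return abs(score1 - score2) / mean * 100


def summarize(pairs):
    """Return (median, lower quartile, upper quartile) of the percent variability."""
    values = [percent_variability(a, b) for a, b in pairs]
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return median, q1, q3


# Made-up (scan 1, scan 2) score pairs, for illustration only.
pairs = [(0, 0), (0, 10), (90, 110), (100, 100), (50, 150)]
print([percent_variability(a, b) for a, b in pairs])  # [0.0, 200.0, 20.0, 0.0, 100.0]
print(summarize(pairs))                               # (20.0, 0.0, 100.0)
```

With this definition, a patient with calcium on only one scan always yields 200%, the upper limit of the measure, whatever the size of the nonzero score.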
We also assessed the impact of interscan variability on the prognostic value of Agatston calcium scores. A patient's total Agatston score was referenced to a database of age- and sex-stratified scores to determine the patient's percentile ranking [12]. Patients were assigned to the 10th, 25th, 50th, 75th, or 90th percentile. The number of patients for which interscan differences in the Agatston score changed the percentile ranking was calculated.
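The counting step can be sketched as below. The reference database itself is not reproduced in the text, so the threshold values are invented placeholders, and the assignment rule (the lowest listed percentile whose threshold the score does not exceed, capped at the 90th) is an assumption; the function names are hypothetical as well.

```python
from bisect import bisect_left

PERCENTILES = [10, 25, 50, 75, 90]  # categories named in the text


def percentile_rank(score: float, thresholds: list[float]) -> int:
    """Map a total Agatston score to a percentile category.

    `thresholds` holds the ascending scores at the 10th..90th percentiles
    for the patient's age and sex stratum (placeholder values, not real data).
    """
    idx = bisect_left(thresholds, score)  # first threshold >= score
    return PERCENTILES[min(idx, len(PERCENTILES) - 1)]


def count_rank_changes(scores1, scores2, thresholds_per_patient) -> int:
    """Number of patients whose category differs between the two scans."""
    return sum(
        percentile_rank(a, t) != percentile_rank(b, t)
        for a, b, t in zip(scores1, scores2, thresholds_per_patient)
    )


# Invented thresholds for one stratum, reused for three made-up patients.
stratum = [1, 5, 20, 80, 200]
changes = count_rank_changes([0, 18, 150], [3, 20, 250], [stratum] * 3)
print(changes)  # only the first patient moves (10th -> 25th)
```

The third made-up patient shows why a large score difference need not matter clinically: scores of 150 and 250 differ by 50% yet fall in the same category.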
Interobserver and intraobserver variability were evaluated for the Agatston, volume, and mass scores using the same statistical methods: the mean absolute difference between scores of the same type was computed and compared with zero, CIs were constructed, and the percentage of variability was calculated. Patients were again assigned percentile rankings on the basis of the stratified Agatston scores, and the resulting differences between observers and between the ratings of an individual observer were computed to assess the clinical impact of inter- and intraobserver variability, respectively.
Comparison of the aggregate coronary calcium scores derived from scan 1 and scan 2 by both observers showed that the mean Agatston, volume, and mass scores were statistically significantly different (Wilcoxon's signed rank test, p < 0.0001) (Table 1). A pattern of bias between the scores from scan 1 and scan 2 (Spearman's correlation, r ≥ 0.92, p < 0.0001) was also detected for both observers, suggesting that the absolute differences between scores of the same type for each scan increased with the magnitude of measurement (Table 1).
Interscan differences resulted in a high percentage of variability in the estimation of Agatston, volume, and mass scores (Fig. 1). Although the median interscan variability in the determination of the Agatston score was low (5% for observer 1 and 6% for observer 2), the interquartile range for both observers was wide (0–79% for observer 1 and 0–83% for observer 2), with an upper data limit of 200% (indicating that some calcium was detected on one scan but no calcium was detected on the other) for both observers. Therefore, the interscan variability was high in a significant number of cases. The interquartile range was narrower for the volume and mass scores, indicating high variability in fewer cases (volume, 0–51% for observer 1 and 0–57% for observer 2; mass, 0–57% for observer 1 and 0–58% for observer 2), but the median was higher (volume, 12% for observers 1 and 2; mass, 14% for observer 1 and 12% for observer 2) such that little improvement in interscan variability was achieved with the alternative scoring methods.
The correlation between interscan differences in calcium measurement and heart rate ranged from −0.06 to 0.04, depending on the particular observer and scoring algorithm (p ≥ 0.65). Therefore, we found insufficient evidence that interscan differences in the measurement of calcium are heart-rate dependent.
The specific clinical impact of interscan differences is reflected in resulting differences in the assignment of a percentile ranking based on age- and sex-stratified scores [9]. In this study, interscan differences in the estimation of Agatston score resulted in a change in the percentile ranking for seven patients (13%) according to observers 1 and 2. Therefore, despite high interscan variability, changes in a patient's total calcium score were clinically significant in only a relatively small percentage of cases (Fig. 2A, 2B).
Interobserver and intraobserver measurement of the Agatston, volume, and mass scores was less variable than interscan measurement. Although the mean absolute differences between scores of the same type obtained using all methods of quantification were statistically significantly different from zero, all differences were very close to zero and still well within the range of clinically tolerable variability (Tables 2 and 3). For example, the results showed with a 95% CI that the mean absolute difference in the Agatston score determined by the observers from scan 1 was between 0.40 and 2.90. Even a difference in the Agatston score as great as 2.90 is not clinically significant.
A pattern of bias between scores from each observer (Spearman's correlation, r ≥ 0.49, p < 0.0001) was shown but suggested that the interobserver differences increased only moderately with the magnitude of measurement (Table 2). A similar pattern of bias was seen in the intraobserver measurement of all scores by observer 1 using scan 1 (Spearman's correlation, r ≥ 0.45, p < 0.001) (Table 3). The pattern of intraobserver bias for observer 2 was less significant (Spearman's correlation, r ≥ 0.32, p < 0.015).
In addition, interobserver and intraobserver percentages of variability were low for all scoring methods (Figs. 3 and 4). The interquartile ranges for interobserver variability were extremely narrow (0–5%), with few outliers (Fig. 3). The 25th, 50th, and 75th percentiles for intraobserver variability in the Agatston, volume, and mass scores were all equal to zero for both observers (indicating that the calculated variability in at least 75% of scans equaled zero) but with a greater number of outliers (Fig. 4).
Although inter- and intraobserver variabilities were low overall, the variability was great enough in some patients to change their risk stratification: interobserver variability led to a change in percentile ranking for five patients (9%) according to the evaluation of scan 1 and for four patients (7%) according to evaluation of scan 2. Intraobserver variability had less of a clinical impact: assignment of percentile ranking changed for three patients (5%) with both observers.
Significant differences in calcium scores measured on two identically acquired prospectively ECG-triggered sequential scans obtained in the same patient were observed. Variability was highest with Agatston scoring and showed slight improvement with volume and mass scoring. However, assignment of an age- and sex-stratified percentile ranking based on a patient's Agatston score differed between the two scans in only 13% of patients. In addition, the results of the study provided evidence of low interobserver and intraobserver variability in the measurement of coronary calcium. However, interobserver differences were significant enough to effect a change in percentile ranking for 7–9% of patients (depending on the scan), and intraobserver differences did so for only 5%. These results suggest that standardized image evaluation techniques may be as important as the acquisition method and the scoring algorithm for reproducibility in the quantification of coronary artery calcium and thus deserve further consideration.
In contrast to similar studies of reproducibility [7], our study included patients with a calcium score of 0 on one or both examinations in the analysis because it is in this setting that calcium scores have the greatest potential predictive value. An absent or low score (e.g., Agatston score < 10) may indicate low risk for the development of coronary heart disease [18, 19]. Therefore, particularly in young patients, the difference between the absence of calcium and the presence of calcium could be clinically significant. In fact, seven patients (13%) showed no calcium on one scan and measurable calcium on the other scan. The interscan differences in the Agatston score were significant enough in most of those cases to change the percentile ranking assigned to a patient (four of the seven for observer 1 and five of the seven for observer 2). Because of the inclusion of patients with calcium scores of 0 on one or both scans and the resulting significant number of patients with 0% variability (due to the absence of calcium on both scans) or 200% variability (due to the absence of calcium on only one scan), the percentage of variability was described using medians and interquartile ranges rather than the more common descriptors of means and SDs that might be artificially low or high, given the extreme values in the data.
In addition, the interscan variability between the two scans obtained in the same patient was higher than that reported in previous studies. This difference may be attributable in part to the fact that in our study, patients were removed from the patient table and repositioned between the two examinations, whereas in previous studies [7] patients remained stationary on the table between the two acquisitions. This practice presumably increases the contribution of partial volume averaging to interscan differences. Changes in a patient's position between examinations introduce a change in slice position relative to the calcified lesion such that changing degrees of partial volume averaging may be observed.
Previous studies, primarily those using electron beam CT, have found the greatest variability in the measurement of smaller amounts of calcium [20, 21]. However, in our study, a pattern of bias between scores from scan 1 and scan 2 (Spearman's correlation, r ≥ 0.92, p < 0.0001) was detected, suggesting that the absolute difference between scores of the same type from each scan increased with the magnitude of measurement. The previous finding that variability decreases with increases in calcium does not necessarily conflict with the finding in our study. The measured calcium burden was classified as absent in most patients whom we evaluated, with the calcium burden in the remainder of the patients primarily classified as minimal, mild, or moderate. Only two to four patients (4–7% of the sample population, depending on the particular scan and observer) displayed severe amounts of calcium. Therefore, although the trend within this group with milder plaques suggests that increases in variability accompany increases in magnitude, the trend over a broader range of calcium values could be different.
One limitation of our study is that the assessment of the variability in patient risk stratification was based only on the Agatston score because of the lack of reference data for the volume and mass scores. Results based on the volume and mass scores presumably would have been comparable to the results based on the Agatston score because the variability of all scoring methods was equivalent.
The results of this study indicate that the accuracy of the sequential MDCT technique for coronary calcium quantification is sufficient for patient risk stratification in most cases. Although significant interscan differences were observed for repeated sequential examinations, patient risk stratification based on calcium scores varied between the two scans in only 13% of patients. Therefore, the actual clinical impact of differing calcium scores is much less than the high percentage of interscan variability suggests. Approaches such as helical scanning should be pursued to reduce this clinical impact. However, decreased variability in the interpretation of calcium measurements, not decreased variability in the actual measurement alone, must be shown to justify the clinical use of a technique that exposes patients to a higher level of radiation.