Original Research
1 Department of Radiology, Orthopedic University Hospital Balgrist, Zurich,
Switzerland.
2 Department of Diagnostic Radiology, University Hospital Zurich, CH-8091
Zurich, Switzerland.
3 Present address: Department of Radiology, Stanford University, 300 Pasteur
Dr., Room S072B, Stanford, CA 94305-5105.
4 Department of Radiology, Waid Hospital, Zurich, Switzerland.
5 Department of Orthopedic Surgery, Orthopedic University Hospital Balgrist,
Zurich, Switzerland.
Received December 5, 2004;
accepted after revision February 7, 2005.
Address correspondence to J. E. Roos
(justus.roos{at}stanford.edu).
Abstract
MATERIALS AND METHODS. An experienced musculoskeletal radiologist (reviewer 1), a fellow in musculoskeletal radiology (reviewer 2), and a junior staff member in orthopedic surgery (reviewer 3) evaluated MR images displayed on PACS monitors and hard copies independently and in a blinded fashion with regard to the presence or absence of meniscal tears. Seventy-one patients (mean age, 45.4 years; range, 16–80 years) were consecutively included if they had undergone both MRI of the knee and arthroscopy within 4 months. Arthroscopy was the standard of reference. Evaluation time and the reviewer's confidence in his or her diagnosis (Visual Analogue Scale, possible values of 0–100) were determined.
RESULTS. Accuracies, sensitivities, and specificities in diagnosing meniscal tears were 80–87%, 63–85%, and 87–93%, respectively, for soft copies and 82–85%, 64–76%, and 87–94.0%, respectively, for hard copies. Intrareviewer differences between PACS and hard copies were not significant for any of the three reviewers (McNemar tests). Reviewer 3 was less sensitive but more specific in the diagnosis of meniscal tears than reviewers 1 and 2. This difference was significant for both the PACS and hard copies. The reviewers' confidence in their diagnoses and evaluation times were not significantly different for PACS and hard copies (analysis of variance with Bonferroni post hoc analysis).
CONCLUSION. Differences in the diagnostic performance of suspected meniscal tears depend on reviewer experience rather than on the type of documentation.
Keywords: knee, meniscal tears, MRI, musculoskeletal imaging, PACS
Introduction
Several studies have evaluated the effect of PACS on radiologists' productivity and accuracy for thoracoabdominal CT [7, 9], mammography [10, 11], conventional chest radiography [12], and conventional skeletal radiography [13–15]. To date, little has been published on PACS-based image interpretation of MR images, especially in the field of musculoskeletal MRI [16].
The purpose of this study was to compare diagnostic performance, reviewer confidence, and time requirements in the MRI diagnosis of meniscal tears for three types of reviewers and two types of image documentation (soft copy [PACS] vs hard copy).
Materials and Methods
In accordance with applicable state law, the hospital's institutional review board has issued a general permit for retrospective review of image data based on the hospital's policy protecting patient privacy, which includes the patients' right to reject the use of their image data for scientific purposes.
MRI
MRI was performed on either a 1.0-T magnet (Expert, Siemens Medical
Solutions; n = 30 patients) or a 1.5-T magnet (Symphony, Siemens
Medical Solutions; n = 41 patients) on the basis of the availability
of the scanners. A dedicated circularly polarized, send-receive
extremity coil was used on both systems.
The following MR sequences were acquired with the 1.0-T scanner: sagittal intermediate-weighted (TR/TE, 3,800/16) and T2-weighted (3,800/98) turbo spin-echo (section thickness, 3 mm; field of view, 156 x 250 mm; matrix, 170 x 512), coronal T1-weighted (608/20) spin-echo (section thickness, 4 mm; field of view, 140 x 160 mm; matrix, 224 x 512), transverse 3D double-echo steady-state (30/9) gradient-echo images (section thickness, 2.7 mm; field of view, 140 x 140 mm; matrix, 256 x 256), and coronal T2-weighted (4,500/96) turbo spin-echo sequence with fat suppression (section thickness, 4 mm; field of view, 135 x 180 mm; matrix, 210 x 512). The acquisition times for each sequence varied between 2 min 18 sec and 4 min 35 sec.
The MRI protocol on the 1.5-T scanner included the following sequences: sagittal intermediate-weighted (3,760/14) and T2-weighted (3,760/95) turbo spin-echo (section thickness, 3 mm; field of view, 143 x 180 mm; matrix, 204 x 512), coronal T1-weighted (450/14) spin-echo (section thickness, 3 mm; field of view, 138 x 170 mm; matrix, 208 x 512), transverse 3D double-echo steady-state (multiecho data image [MEDIC]) (466/26) gradient-echo images (section thickness, 2 mm; field of view, 170 x 170 mm; matrix, 256 x 512), and coronal STIR (5,550/35; inversion time, 160 msec) (thickness, 3 mm; field of view, 135 x 170 mm; matrix, 203 x 512). The acquisition times for these sequences varied between 2 min 57 sec and 4 min 38 sec.
Image Analysis
Three reviewers with varying experience in musculoskeletal MRI and with the
PACS system evaluated the examinations. Reviewer 1 was a staff radiologist who
had been specializing in musculoskeletal radiology, including MRI, for 5 years
and who had routinely worked with the PACS system used in this study for 1
year. Reviewer 2 was a fellow in musculoskeletal radiology with 1 year of
experience in MRI, including 4 months of specific training in musculoskeletal
MRI and 4 months of experience with the PACS system used in this study.
Reviewer 3 was a junior staff member in orthopedic surgery with 5 years of
practical experience with MRI of the knee and 1 year of experience with PACS
(although he was accustomed to the Web-based PACS viewer, the screen design
and image handling features of which were in part different from the
workstation-based version used in this study). The three reviewers were aware
that patients had undergone arthroscopy or surgery after MRI. However, they
were blinded with regard to the prospective MRI diagnosis and the clinical and
intraoperative findings. To reduce recognition and learning bias, the order of
image interpretation (hard copy and soft copy) was changed after 20, 40, and
60 patients. Within each group of 20 examinations, soft copies and hard
copies were evaluated at least 1 week apart.
Hard-copy images were printed in a 12-on-1 format for sagittal images and in a 16-on-1 format for axial and coronal images. The images were printed on 35 x 43 cm film (Drystar 1000B, Agfa-Gevaert) on a Drystar 3000 printer (Agfa-Gevaert). The MRI technicians chose window and level settings that allowed differentiation of even minor signal abnormalities within the meniscal substance but that were still useful for the evaluation of all relevant structures of the knee. The MRI technicians were highly experienced in musculoskeletal MRI, with their daily routine including more than 95% musculoskeletal imaging. The hard copies were placed on two viewboxes arranged at eye level to provide all MR sequences at a single glance.
The image interpretation on the PACS system (Image Devices, ID.Station report, version 5.2) was based on the stack mode display. This feature displays all MR sequences in an image stack with the images linked according to slice position. A two-monitor viewing workstation was available for review (BarcoView, 2 MegaPixel [1,280 x 1,600 resolution] black-and-white display).
Both menisci were separately evaluated with regard to the presence or absence of a meniscal tear. A meniscal tear was diagnosed when a signal abnormality unequivocally reached the articular surface of the meniscus on at least two adjacent images. The following types of meniscal tears were initially differentiated but not further evaluated for this study: horizontal or oblique partial-thickness tears, radial tears, vertical or complex full-thickness tears, and tears with displaced meniscal fragments [17–20].
The reviewers indicated their level of confidence in their diagnosis on a Visual Analogue Scale (VAS) [21]. They placed a mark on a line with the right anchor representing extremely low confidence in the diagnosis and the left anchor representing very high confidence in the diagnosis. The distance between the anchors was 100 mm. One of the authors who was not involved in the evaluation of MR images measured the distance between the left anchor and the reviewer's mark to the nearest millimeter.
In addition, each reviewer individually noted the length of time between the start of the evaluation on the PACS workstation or the hard copies and the time at which he or she wrote the diagnosis on the evaluation sheet. The retrieval times of the MR images on the PACS workstation and of hanging of the hard copies on the viewbox were not included in the evaluation time.
Intraoperative findings were used as the standard of reference. An error analysis was performed by two experienced musculoskeletal staff radiologists who were not involved in the initial evaluation of the images. All examinations for which at least one reviewer made an incorrect diagnosis were reviewed and compared with the surgery reports. The following categories of disagreement were used by the two reviewers: image quality problems (such as motion or ringing artifacts, inadequate window setting on hard copies, inadequate magnification factors on hard copies), reviewer error, overinterpretation of equivocal findings (typically in the presence of ill-demarcated meniscal signal), underinterpretation of equivocal findings (such as in meniscal degeneration potentially obscuring signal abnormalities reaching the meniscal surface or small radial tears), and unexplained discrepancy between the MRI diagnosis and intraoperative findings.
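With the intraoperative finding as the standard of reference, sensitivity, specificity, and accuracy follow directly from the paired calls for each meniscus. The short Python sketch below illustrates the arithmetic only. The function name `diagnostic_performance` and the example values are hypothetical and are not study data, and the sketch assumes that the reference contains at least one torn and one intact meniscus.

```python
def diagnostic_performance(mri_calls, reference):
    """Return sensitivity, specificity, and accuracy for paired binary calls.

    mri_calls: reviewer's MRI diagnosis per meniscus (True = tear present).
    reference: arthroscopic finding per meniscus (True = tear present).
    Assumes both classes occur in the reference (no division by zero).
    """
    if len(mri_calls) != len(reference):
        raise ValueError("sequences must have equal length")
    pairs = list(zip(mri_calls, reference))
    tp = sum(1 for m, r in pairs if m and r)          # true-positive calls
    tn = sum(1 for m, r in pairs if not m and not r)  # true-negative calls
    fp = sum(1 for m, r in pairs if m and not r)      # false-positive calls
    fn = sum(1 for m, r in pairs if not m and r)      # false-negative calls
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Made-up illustration: 10 menisci, 4 torn at arthroscopy.
reference = [True, True, True, True, False, False, False, False, False, False]
mri_calls = [True, True, True, False, False, False, False, False, False, True]
result = diagnostic_performance(mri_calls, reference)
print(result)
```

In this made-up example, one missed tear and one false-positive call give a sensitivity of 3/4, a specificity of 5/6, and an accuracy of 8/10.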
Statistical Analysis
Differences in diagnostic performances (sensitivity, specificity, and
accuracy) were evaluated using the McNemar test. Differences in the confidence
in the diagnosis and in evaluation times were assessed with analysis of
variance using the Bonferroni test for post hoc comparisons
[22]. Values for p of
less than 0.05 were considered to be statistically significant. Interobserver
agreement was assessed using kappa statistics. According to Landis and Koch
[23], the agreement was rated
as follows: a kappa value of 0–0.20 indicated slight agreement;
0.21–0.40, fair agreement; 0.41–0.60, moderate agreement;
0.61–0.80, substantial agreement; and 0.81 or greater, excellent
agreement. Absolute agreement would be 1.00. We used SPSS for Windows
(Microsoft) software (SPSS), version 10.
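For readers unfamiliar with the McNemar test, the following Python sketch shows one common form, the exact binomial version, which uses only the discordant pairs (examinations diagnosed correctly with one viewing method but not the other). This is an illustration under stated assumptions: the text does not specify which variant of the test was computed in SPSS, and the function name `mcnemar_exact` and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: pairs correct on soft copy only; c: pairs correct on hard copy only.
    Under the null hypothesis, b follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as uneven as the observed one (one tail)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up counts: 3 examinations correct on PACS only, 5 on hard copy only.
p_value = mcnemar_exact(3, 5)
print(p_value)  # 0.7265625
```

Concordant pairs do not enter the calculation, which is why the test suits a paired design in which the same reviewer reads the same examinations with both methods.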
Results
The reviewers' confidence in making the diagnosis of a meniscal tear was
not significantly different among the three reviewers and the two viewing
methods. Time for image interpretation was significantly shorter for reviewer
3 (mean time for PACS vs hard copies, 1 min 4 sec vs 1 min 8 sec) in
comparison with reviewers 1 (1 min 30 sec vs 1 min 26 sec) and 2 (1 min 20 sec
vs 1 min 32 sec), independent of the viewing method (p
0.024).
Intertechnique differences were not significant for any of the three reviewers. The
kappa values for the interobserver agreement
(Table 4) ranged from
substantial to excellent for the PACS (0.69–0.83) and were substantial
for the hard copies (0.72–0.76).
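As an illustration of how such kappa values are obtained and rated, the Python sketch below computes Cohen's kappa for two reviewers' binary calls and maps it to the agreement categories listed under Statistical Analysis. The function names and the example calls are hypothetical and are not study data. The label "poor" for values below 0 comes from the original Landis and Koch scale and is not part of the categories quoted in this article.

```python
def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two raters' binary calls (True = tear present)."""
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    p_a = sum(calls_a) / n  # proportion of positive calls by rater A
    p_b = sum(calls_b) / n  # proportion of positive calls by rater B
    # Agreement expected by chance if the two raters were independent
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Map a kappa value to the agreement categories used in this article."""
    if kappa < 0:
        return "poor"  # assumption: below the range quoted in the text
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


# Made-up calls for 10 menisci by two reviewers.
reviewer_a = [True, True, True, True, False, False, False, False, False, False]
reviewer_b = [True, True, True, False, False, False, False, False, False, True]
kappa = cohen_kappa(reviewer_a, reviewer_b)
print(round(kappa, 3), landis_koch(kappa))
```

The sketch does not guard against the degenerate case in which chance agreement equals 1 (both reviewers making the same call for every meniscus), where kappa is undefined.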
Error analysis was performed in a total of 34 examinations in the soft-copy and 31 examinations in the hard-copy cohort. None of the errors in the PACS cohort was attributed to image quality problems such as motion or ringing artifacts. In 11 of the 34 examinations with diagnostic problems, the reviewers thought that the error was a reviewer error. In eight examinations, overinterpretation of equivocal findings was thought to be present. In 10 examinations, underinterpretation of equivocal findings was most probably the cause of the incorrect diagnosis. In five examinations, error analysis indicated that the MRI diagnosis differed from intraoperative findings even in retrospect (two false-negative and three false-positive findings).
During review of the hard copies, again no error was attributed to image quality. In four examinations, a reviewer error was the probable cause of discrepancy. In 10 examinations, overinterpretation, and in 13, underinterpretation, of equivocal MRI findings was thought to be present. In four cases the discrepancy between MRI and surgery could not be resolved.
Discussion
In 1991, Brown et al. [16] compared the diagnostic performance of multiscreen digital workstations with that of conventional hard copies for meniscal and anterior cruciate tears. The area under the receiver operating characteristic curve for two reviewers revealed no significant diagnostic difference. However, interpretation time was significantly longer when a digital workstation was used instead of hard copies. On the basis of these data, the authors concluded that from the point of view of diagnostic accuracy, the digital workstation documentation might serve as an alternative to conventional hard-copy interpretation but not from the perspective of productivity.
Because the speed and functionality of PACS workstations have improved since the publication of the study by Brown et al. [16], the impact of PACS-based image interpretation needs to be reevaluated. Our study has confirmed that from a diagnostic point of view, PACS is comparable to hard copies. The difference in evaluation time found in the study by Brown et al. could not be confirmed: mean evaluation times did not differ significantly between the two viewing methods. Our results also indicate that reviewer experience and preferences are more relevant than the type of documentation. Reviewer 3 (junior staff member in orthopedic surgery) was significantly less sensitive but more specific and faster in the diagnosis of meniscal tears than reviewers 1 and 2 using both soft-copy and hard-copy image documentation. The sensitivities reached by both radiologists (reviewers 1 and 2) are within the range of the published data in the literature dealing with fast spin-echo MRI of meniscal tears (> 80% sensitivities) [30, 31]. The sensitivity of reviewer 3 was below these values. However, he compensated for it with superior specificity. All reviewers made their diagnosis of meniscal tears with the same confidence using either soft-copy or hard-copy image interpretation.
Apparently, soft copies did not influence the most probable cause of diagnostic errors, which is underinterpretation of equivocal findings. The presumed reasons were similar for the two types of image documentation. This is interesting because a number of potential differences exist, such as suboptimal window and level settings on hard copies or loss of spatial resolution on soft copies.
A study by Reiner et al. [9] has shown that the use of workstation tools such as window level adjustment, zoom, and magnification can enhance CT interpretation by radiologists, particularly when evaluating confined areas with relatively low differences in contrast, such as the mediastinal and hilar regions in the chest. Although software tools, including window and level settings and magnification, were systematically used by our reviewers, the results of this study do not show a significant difference in diagnostic performance between soft-copy and hard-copy documentation. These results are indirectly supported by a study evaluating hard copies with narrow versus standard window and level settings (Buckwalter et al. [32]). No difference was found between the two types of window and level settings in meniscal tear detection.
On the other hand, the use of workstation tools did not add evaluation time for soft-copy images, in accordance with previous studies concentrating on other radiology techniques [13, 15]. This finding may be explained by the facts that software tools have become more user-friendly, that relatively simple tools suffice for evaluation of the menisci, and that reviewers are increasingly trained in the use of workstations.
In conclusion, the results of our study indicate that differences in the diagnostic performance in suspected meniscal tears depend on reviewer experience rather than on the type of documentation (PACS soft copy vs hard copy).