1 Department of Radiology, University of Iowa Hospitals and Clinics, 200 Hawkins Dr., Iowa City, IA 52240.
2 College of Medicine, University of Iowa Hospitals and Clinics, Iowa City, IA 52240.
3 Department of Orthopaedic Surgery, University of Iowa Hospitals and Clinics, Iowa City, IA 52240.
4 Department of Biomedical Engineering, University of Iowa Hospitals and Clinics, Iowa City, IA 52240.
Received September 1, 1999;
accepted after revision November 3, 1999.
Presented at the annual meeting of the American Roentgen Ray Society, New
Orleans, May 1999.
Abstract
MATERIALS AND METHODS. During a 12-month period, we retrospectively reviewed 556 consecutive ankle radiographic studies consisting of anteroposterior, mortise, and lateral views. One hundred twenty patients with at least one ankle fracture were paired with 140 healthy control subjects. Each image in the three-view examination was separated and sorted by view and studied independently; all images were reviewed by two skeletal radiologists and two orthopedic surgeons. Each radiograph was evaluated for fracture of the medial, lateral, and posterior malleoli and the foot using a five-point confidence rating. Performance of each view and modeled two- and three-view combinations of views was evaluated with modified receiver operating characteristic analysis.
RESULTS. The data provide little support for preferring either two-view combination (anteroposterior-lateral or mortise-lateral) for any type of fracture. The three-view combination does detect significantly more fractures than some two-view combinations in some locations, and there is a statistically significant cost in diagnostic accuracy for eliminating the anteroposterior or mortise view.
CONCLUSION. Reducing the ankle radiographic series from three to two views would result in a small but significant decrease in the detection of fractures of the ankle and foot. Both two-view combinations are equivalent for fracture detection.
Introduction
The standard radiographic examination of the ankle for fracture consists of the anteroposterior, mortise, and lateral views. Multiple images are obtained because some fractures are subtle or invisible on some views (Figs. 1A,1B,1C and 2A,2B,2C). Several investigators have suggested that two views can replace the standard three-view series without adversely affecting patient outcome [1, 3, 8, 9]. Some investigators suggest that the anteroposterior and lateral two-view combination should be used [1, 8], whereas others recommend using the mortise and lateral views [3, 9]. In most of these studies, the two-view combination that was tested against the standard three-view series was selected on the basis of investigator preference; to our knowledge, no study has assessed both two-view combinations against the standard three-view series for fracture detection. In some studies, a two-view series was evaluated and then the third view placed on the viewbox, with assessment of whether this view added information. However, choice of viewing order may introduce bias. In addition, the samples in some of these studies consisted primarily of cases with fractures and had few healthy control subjects, thus differing from the typical clinical setting in which only 29% of ankle studies show a fracture [1].
How many views are necessary to evaluate the ankle for possible fracture is an important question. We wanted to answer this question with a systematic analysis of the contribution of each view to fracture detection. We designed a study to compare the performance of individual views in the diagnosis of fractures at the ankle and hindfoot, and then combined the results of individual views to compare the performance of two- and three-view ankle studies. Our aim was to determine whether any two-view combination was similar to three views for fracture detection. We also wanted to determine whether one two-view combination was superior to the other.
Materials and Methods
To determine the presence or absence of fractures, the interpretation of the original radiologist was compared with a gold standard interpretation. The gold standard consisted of a two-step interpretation process: an initial review of the original interpretation of each ankle series by a musculoskeletal radiologist with the patient's medical record and a review of all follow-up imaging studies. If the original interpretation and the reinterpretation were in agreement, the original interpretation was considered correct. If the original interpretation and the reinterpretation were discrepant, four reviewers, including musculoskeletal radiologists and orthopedic surgeons specializing in foot and ankle trauma, further evaluated these cases. All additional interpretations were performed independently; the results were pooled for a final consensus on the discrepant cases that served as the gold standard interpretation.
From this set of radiographs, 120 patients had at least one fracture present. To obtain a similar number of normal studies, every other examination with normal findings was excluded on the basis of the randomly assigned last two digits of the patient's hospital number. We then further excluded normal examinations using a random number table until 140 examinations remained, yielding a total of 260 ankle series in the study population. The fracture group consisted of 51 female patients (average age, 43 years; range, 14-75 years) and 69 male patients (average age, 37 years; range, 18-82 years), compared with 68 female subjects (average age, 31 years; range, 15-74 years) and 72 male subjects (average age, 28 years; range, 6-61 years) in the group of healthy control subjects. Each three-view radiographic series was separated into individual images, and identifying features were masked. As a result, 780 separate images made up the study sample. Four reviewers who were not part of the gold standard group viewed each image. Each reviewer worked independently. Images were sorted by projection into anteroposterior, mortise, and lateral views and were presented to the four observers individually in batches of 50 images to avoid fatigue. The order of image presentation was randomized so that for any given patient both the initial view studied by each reviewer and the order of presentation for the other two views varied.
For each image, the reviewer decided whether possible or definite fractures were seen on the image. If no fractures were noted, the observer marked "normal" and moved to the next case. When fractures or possible fractures were detected, the reviewer noted the location of each detected fracture and assigned a confidence rating for that fracture. The rating for confidence of fracture detected was defined on a four-point scale. Possibly abnormal (1) was used to indicate that a finding was noted, but the reviewer was not sure if the finding constituted an abnormality or merely represented overlapping shadows or normal structures. Absent confirmation on other views, the reviewer would not mention the finding in a dictated report. Inconclusive (2) was used to indicate slightly more certainty that a finding constituted a real abnormality, but the reviewer still would need other views to confirm the finding. If included in the dictated report, the significance of the finding would be questioned. Fairly certain abnormal (3) was used when the reviewer would mention the finding in a dictated report as a probable fracture but would look to other views to confirm the diagnosis. Definitely abnormal (4) was used when the reviewer was certain that the finding represented a fracture; no other films were needed to confirm the finding, and the finding would be mentioned in the dictated report as a definite fracture.
Locations of the ankle joint at which the reviewers searched for fractures included the medial, lateral, and posterior malleoli. For fractures of the foot visible on the ankle radiographic series, the reviewer noted which bone was abnormal. The reviewers did not attempt to determine the severity, displacement, or causative force of any fractures. The presence or absence of soft-tissue swelling was not scored. The only questions answered by the reviewers were confidence rating for fracture and location of the possible fracture.
For statistical analysis, each location at the ankle was analyzed separately; all foot locations were combined into a single "foot" group. The data for the performance of individual views were analyzed with conventional receiver operating characteristic (ROC) curve analysis using the conventional improper binormal model [10], and the proper bigamma model [11]. An improper ROC model can produce ROC curves that inappropriately cross the chance line, whereas a proper ROC model produces ROC curves that never cross the chance line. Unfortunately, these conventional ROC models do not cope well with ROC points that have zero or almost zero rates of false-positive interpretation. All observers had data points at zero or near zero probability of false-positive diagnosis, and conventional methods estimated areas under the ROC curves as near perfect performance of 1.0. Although consistent differences were noted among the probability of true-positive [p(TP)] rates at these near zero probability of false-positive [p(FP)] rates between projections across observers, these differences were not captured by the ROC areas.
Therefore, evaluating differences among the different views was not
possible using standard ROC techniques. Instead, we compared p(TP)
for each experimental condition at a common value of p(FP), the
approach of McNeil and Hanley
[12]. To do so, we compared
p(TP) at zero p(FP) by fitting a linear ROC curve to each
ROC point and the upper right corner of the graph. The linear ROC curves were
used to estimate p(TP) at p(FP) = 0, the
y-intercept, for each ROC point, and those values of estimated
p(TP) were averaged across the ROC points of that observer and view.
These indexes of diagnostic performance were then tested across views using
the multiobserver method of Dorfman et al.
[13]. Their method tests for
differences in diagnostic performance across the three views taking into
account both observer and image variation. The method of Dorfman et al. was
applied to pairs of views, and the Bonferroni inequality for multiple
comparisons (for α = 0.05, the corrected α = 0.05 / 3 = 0.017) was
applied [14].
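The index described above can be illustrated with a minimal sketch. It assumes ratings are coded 0 (normal) through 4 (definitely abnormal) and that each confidence threshold yields one ROC point; the function names and the toy ratings are hypothetical and are not taken from the study data. A straight line through an ROC point (p(FP), p(TP)) and the upper right corner (1, 1) has a y-intercept of (p(TP) - p(FP)) / (1 - p(FP)), and the intercepts are averaged over the ROC points of one observer and view. The multiobserver test of Dorfman et al. is not reproduced here.

```python
from statistics import mean


def roc_points(ratings_fracture, ratings_normal, thresholds=(1, 2, 3, 4)):
    """Return one (p_fp, p_tp) point per confidence threshold.

    A reading counts as positive when its rating is >= the threshold.
    Ratings are assumed to be coded 0 (normal) to 4 (definitely abnormal).
    """
    points = []
    for t in thresholds:
        p_tp = sum(r >= t for r in ratings_fracture) / len(ratings_fracture)
        p_fp = sum(r >= t for r in ratings_normal) / len(ratings_normal)
        points.append((p_fp, p_tp))
    return points


def tp_at_zero_fp(p_fp, p_tp):
    """y-intercept of the line through (p_fp, p_tp) and the corner (1, 1)."""
    if not 0.0 <= p_fp < 1.0:
        raise ValueError("p_fp must lie in [0, 1)")
    return (p_tp - p_fp) / (1.0 - p_fp)


def mean_tp_at_zero_fp(points):
    """Average the estimated p(TP) at p(FP) = 0 over all ROC points."""
    return mean(tp_at_zero_fp(p_fp, p_tp) for p_fp, p_tp in points)


# Toy ratings for one observer and one view (hypothetical values).
fracture_ratings = [4, 3, 0, 2]
normal_ratings = [0, 0, 1, 0]
points = roc_points(fracture_ratings, normal_ratings)
index = mean_tp_at_zero_fp(points)

# Bonferroni-corrected significance level for three pairwise comparisons.
corrected_alpha = 0.05 / 3
```

When p(FP) is already zero, the intercept equals the observed p(TP), which is why this index still separates views whose ROC points all sit at or near the left edge of the graph.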
To estimate the performance of two-view combinations of films against the three-view series, a set of simulation models was constructed for each fracture category (medial malleolus, lateral malleolus, posterior malleolus, foot). These simulation models combined the confidence ratings from the individual image interpretations to yield a new confidence rating based on the combined views (Appendix). We weighted these models to reflect the perceived relative weights for each combined view for the possible fracture. The matrices for the models are presented in Figure 3. For each fracture location, we included models for the two-view combinations of anteroposterior with lateral and mortise with lateral, as well as the standard three-view series. The same methods that were used to analyze the single-view data were also used to analyze the simulated multiviews, namely, analysis of estimated p(TP) at p(FP) = 0 by the method of Dorfman et al. [13].
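The general mechanism of such a model, a lookup matrix that maps a pair of single-view ratings to one combined rating, can be sketched as follows. The weighted matrices of Figure 3 are not reproduced in this text, so the matrix below is a purely hypothetical stand-in that takes the higher of the two ratings; it is not the authors' model, and the function names are illustrative only. Ratings are assumed to be coded 0 (normal) through 4 (definitely abnormal).

```python
RATINGS = range(5)  # 0 = normal, 1-4 = confidence ratings


def build_placeholder_matrix():
    """Hypothetical 5 x 5 combination matrix: combined rating = higher rating.

    This is a stand-in only; the study used weighted matrices (Figure 3)
    that differed by fracture location and are not reproduced here.
    """
    return [[max(a, b) for b in RATINGS] for a in RATINGS]


def combine_two_views(matrix, rating_first, rating_second):
    """Look up the combined confidence rating for two single-view ratings."""
    return matrix[rating_first][rating_second]


def combine_series(matrix, first_view_ratings, second_view_ratings):
    """Apply the lookup to paired ratings from the same set of patients."""
    return [
        combine_two_views(matrix, a, b)
        for a, b in zip(first_view_ratings, second_view_ratings)
    ]


matrix = build_placeholder_matrix()
mortise = [0, 1, 3, 0]   # hypothetical single-view ratings
lateral = [0, 2, 1, 4]
combined = combine_series(matrix, mortise, lateral)
```

Replacing the placeholder with a location-specific weighted matrix would change only `build_placeholder_matrix`; the combined ratings can then be analyzed in the same way as single-view ratings.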
Results
Analysis of Single Views
Given a rating of 3 (fairly certain abnormal) as the standard for saying
that a fracture has been reported, for the 58 fractures of the medial
malleolus, observers would miss an average of 14.5 fractures with just the
anteroposterior view available, 11.0 fractures with just the mortise view, and
28.0 fractures with just the lateral view. For the 80 fractures of the lateral
malleolus, observers would miss an average of 13.75 fractures with just the
anteroposterior view available, 7.0 fractures with just the mortise view, and
17.75 fractures with just the lateral view. For the 34 fractures of the
posterior malleolus, observers would miss an average of 31.25 fractures with
just the anteroposterior view available, 27.75 fractures with just the mortise
view, and 13.25 fractures with just the lateral view. For the 30 fractures of
the foot, observers would miss an average of 24.5 fractures with just the
anteroposterior view available, 23.75 fractures with just the mortise view,
and 16.5 fractures with just the lateral view. These findings mirror the ROC
area findings we discuss next: for individual views, the mortise view appears
to provide somewhat more information than the anteroposterior view, with the
anteroposterior and mortise views being more important for evaluating the
medial and lateral malleolus, and the lateral view contributing more to the
evaluation of the posterior malleolus and foot.
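The missed-fracture counts above can be restated as detection rates with a short calculation. This is a sketch for illustration; the dictionary layout and function names are assumptions, while the counts are those quoted in the preceding paragraph. These rates use a rating of 3 as the reporting threshold and are not the same quantity as the estimated p(TP) at p(FP) = 0.

```python
# Number of fractures at each location (from the text).
FRACTURES = {
    "medial malleolus": 58,
    "lateral malleolus": 80,
    "posterior malleolus": 34,
    "foot": 30,
}

# Average number of fractures missed per observer with a single view.
MISSED = {
    "medial malleolus": {"anteroposterior": 14.5, "mortise": 11.0, "lateral": 28.0},
    "lateral malleolus": {"anteroposterior": 13.75, "mortise": 7.0, "lateral": 17.75},
    "posterior malleolus": {"anteroposterior": 31.25, "mortise": 27.75, "lateral": 13.25},
    "foot": {"anteroposterior": 24.5, "mortise": 23.75, "lateral": 16.5},
}


def detection_rates(location):
    """Fraction of fractures detected by each single view at a location."""
    total = FRACTURES[location]
    return {view: 1.0 - missed / total for view, missed in MISSED[location].items()}


def best_view(location):
    """Single view with the fewest missed fractures at a location."""
    return min(MISSED[location], key=MISSED[location].get)


for location in FRACTURES:
    rates = detection_rates(location)
    summary = ", ".join(f"{view} {rate:.3f}" for view, rate in rates.items())
    print(f"{location}: {summary} (best: {best_view(location)})")
```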
ROC Analysis
For detection of fractures of the medial malleolus with a p(FP) =
0, estimated p(TP) was 0.742 for the anteroposterior view, 0.805 for
the mortise view, and 0.515 for the lateral view. Analysis of variance on
pairs of views showed that the mortise view detected more fractures than
either the anteroposterior view (F[1,197] = 5.788, p =
0.0171) or the lateral view (Satterthwaite F[1,37] = 19.105,
p = 0.0001), and that the anteroposterior view detected more
fractures than the lateral view (Satterthwaite F[1,26] = 11.683,
p = 0.0021). All pairwise tests were statistically significant.
For detection of fractures of the lateral malleolus with a p(FP) = 0, estimated p(TP) was 0.823 for the anteroposterior view, 0.905 for the mortise view, and 0.772 for the lateral view. Analysis of variance on pairs of views showed that the mortise view detected more fractures than either the anteroposterior view (F[1,219] = 7.712, p = 0.0060) or the lateral view (Satterthwaite F[1,26] = 13.387, p = 0.0011), but the anteroposterior view did not detect significantly more fractures than the lateral view (Satterthwaite F[1,18] = 1.001, p = 0.3304).
For detection of fractures of the posterior malleolus with a p(FP) = 0, estimated p(TP) was 0.095 for the anteroposterior view, 0.178 for the mortise view, and 0.608 for the lateral view. Analysis of variance on pairs of views showed that the lateral view detected more fractures than either the anteroposterior view (F[1,5] = 30.955, p = 0.0033) or the mortise view (Satterthwaite F[1,16] = 25.130, p = 0.0001), but the mortise view did not detect significantly more fractures than the anteroposterior view (Satterthwaite F[1,173] = 3.179, p = 0.0763).
For detection of fractures of the foot with a p(FP) = 0, estimated p(TP) was 0.181 for the anteroposterior view, 0.224 for the mortise view, and 0.434 for the lateral view. Analysis of variance on pairs of views showed no statistically significant difference among the three views (lateral view versus anteroposterior view, F[1,169] = 5.508, p = 0.0201; lateral view versus mortise view, Satterthwaite F[1,16] = 3.800, p = 0.0688; anteroposterior view versus mortise view, Satterthwaite F[1,18] = 0.284, p = 0.6004).
The results for interpretation of single views show a clear superiority of the anteroposterior and mortise views over the lateral view for the detection of fractures of the medial and lateral malleolus, and a clear superiority of the lateral view over the anteroposterior and mortise views for the detection of fractures of the posterior malleolus. For the foot, the same superiority of the lateral view over the anteroposterior and mortise views was obtained, but the pairwise comparison was not statistically significant. Therefore, a two-view combination for detection of all these fractures must include the lateral view and either the mortise or the anteroposterior view. For medial and lateral malleolar fractures, the mortise view was superior to the anteroposterior view, but the mortise view was not superior to the anteroposterior view for detection of fractures of the posterior malleolus and the foot (Table 1).
Simulated Two- and Three-View Combinations
Given a rating of 3 (fairly certain abnormal) as the standard for saying
that a fracture has been reported, for the 58 fractures of the medial
malleolus, observers would miss an average of 8.5 fractures with all three
views available, 10.5 fractures without the anteroposterior view, and 10.75
fractures without the mortise view. Therefore, for 58 fractures of the medial
malleolus, two more fractures are missed without the anteroposterior view and
2.25 more fractures are missed without the mortise view. For the 80 fractures
of the lateral malleolus, observers would miss an average of 4.75 fractures
with all three views available, 6.5 fractures without the anteroposterior
view, and 8.0 fractures without the mortise view. Therefore, for 80 fractures
of the lateral malleolus, 1.75 more fractures are missed without the
anteroposterior view and 3.25 more fractures are missed without the mortise
view. For the 34 fractures of the posterior malleolus, observers would miss an
average of 13.25 fractures with all three views available, 13.75 fractures
without the anteroposterior view, and 13.5 fractures without the mortise view.
Therefore, for 34 fractures of the posterior malleolus, 0.5 more fractures are
missed without the anteroposterior view and 0.25 more fractures are missed
without the mortise view. For the 30 fractures of the foot, observers would
miss an average of 10.75 fractures with all three views available, 13.75
fractures without the anteroposterior view, and 12.5 fractures without the
mortise view. Therefore, for 30 fractures of the foot, three more fractures
are missed without the anteroposterior view and 1.75 more fractures are missed
without the mortise view. These results also mirror the ROC area findings that
loss in performance will result from leaving out a view, with greater loss
without the mortise view for the medial and lateral malleolus, and greater
loss without the anteroposterior view for the foot.
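The additional misses quoted above follow directly from the three-view and two-view counts, as the following sketch shows. The data structure and function name are assumptions made for illustration; the counts are those given in the preceding paragraph.

```python
# (fractures, missed with three views, missed without the anteroposterior
#  view, missed without the mortise view), all taken from the text.
MISSED_COMBINED = {
    "medial malleolus": (58, 8.5, 10.5, 10.75),
    "lateral malleolus": (80, 4.75, 6.5, 8.0),
    "posterior malleolus": (34, 13.25, 13.75, 13.5),
    "foot": (30, 10.75, 13.75, 12.5),
}


def extra_missed(location):
    """Additional fractures missed when one view is dropped from the series."""
    total, three_view, without_ap, without_mortise = MISSED_COMBINED[location]
    return {
        "without anteroposterior": without_ap - three_view,
        "without mortise": without_mortise - three_view,
        "fractures": total,
    }


for location in MISSED_COMBINED:
    result = extra_missed(location)
    total = result["fractures"]
    for key in ("without anteroposterior", "without mortise"):
        share = 100.0 * result[key] / total
        print(f"{location}, {key}: {result[key]:.2f} more missed ({share:.1f}% of {total})")
```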
ROC Analysis
For detection of fractures of the medial malleolus with a p(FP) =
0, estimated p(TP) was 0.863 for the three-view combination, 0.829
for the mortise-lateral combination, and 0.816 for the anteroposterior-lateral
combination. Analysis of variance on pairs of combinations showed that the
three-view combination detected more fractures than the mortise-lateral
combination (F[1,197] = 9.588, p = 0.0021) but not the
anteroposterior-lateral combination (F[1,197] = 3.156, p =
0.0772), and that neither the anteroposterior-lateral combination nor the
mortise-lateral combination detected more fractures than the other
(F[1,197] = 0.245, p = 0.6210).
For detection of fractures of the lateral malleolus with a p(FP) = 0, estimated p(TP) was 0.940 for the three-view combination, 0.916 for the mortise-lateral combination, and 0.905 for the anteroposterior-lateral combination. Analysis of variance on pairs of combinations showed that the three-view combination detected more fractures than the anteroposterior-lateral combination (F[1,219] = 6.891, p = 0.0093) and the mortise-lateral combination (F[1,219] = 5.753, p = 0.0173), but that neither the anteroposterior-lateral combination nor the mortise-lateral combination detected more fractures than the other (F[1,219] = 0.435, p = 0.5102).
For detection of fractures of the posterior malleolus with a p(FP) = 0, estimated p(TP) was 0.614 for the three-view combination, 0.610 for the mortise-lateral combination, and 0.610 for the anteroposterior-lateral combination. Analysis of variance on pairs of combinations also failed to yield any statistically significant difference in estimated p(TP) for any pair of simulated combinations (three-view versus mortise-lateral combination, F[1,173] = 1.924, p = 0.1660; three-view versus anteroposterior-lateral combination, Satterthwaite F[1,3] = 0.996, p = 0.3918; mortise-lateral combination versus anteroposterior-lateral combination, Satterthwaite F[1,3] = 0.000, p = 0.9971). This result was not unexpected because the lateral view performed far better than the other views for detection of posterior malleolus fractures, and this view was present in all combinations.
For detection of fractures of the foot with a p(FP) = 0, estimated p(TP) was 0.631 for the three-view combination, 0.533 for the mortise-lateral combination, and 0.563 for the anteroposterior-lateral combination. Analysis of variance on pairs of combinations showed that the three-view combination detected more fractures than the mortise-lateral combination (F[1,169] = 5.932, p = 0.0159) but not more than the anteroposterior-lateral combination (Satterthwaite F[1,7] = 2.672, p = 0.1446), and neither the anteroposterior-lateral combination nor the mortise-lateral combination detected more fractures than the other (Satterthwaite F[1,35] = 0.295, p = 0.5906).
The results of comparing the anteroposterior-lateral combination and the mortise-lateral combination provided little support for preferring one two-view combination over the other for any type of fracture. The results also suggest that the three-view combination did detect significantly more fractures than some two-view combinations in some locations, but these differences tended to be small. Such statistically significant differences were not always observed. There does, however, appear to be some cost in diagnostic accuracy for eliminating the anteroposterior or the mortise view (Table 2).
Discussion
A number of strategies to reduce the expenditure of resources for examination of patients with suspected ankle injury have been discussed and tested. Five groups of investigators have studied how many views are necessary to detect ankle trauma [1, 3, 8, 9, 20]. Worldwide, the average number of films obtained for a standard ankle evaluation is 2.5; in the United States, this number is 2.9. In Europe, the average number of radiographs is 2.5 per examination, and it is 2.0 in developing countries [1].
In the United States, the standard ankle radiographic series consists of anteroposterior, mortise, and lateral radiographs. Some investigators have noted that a two-view series can replace the three-view series without a significant decrease in fracture detection. Cockshott et al. [1] noted that in a review of 242 ankle radiographic examinations, the anteroposterior and lateral views detected all the fractures and that a mortise view added information in no cases. Wallis [8] studied 945 ankle radiographs and found 128 fractures (13.5%). Wallis studied the anteroposterior and lateral views together and reviewed the mortise view to determine whether additional information was present. Despite finding that 4.7% of cases showed fractures not seen on the anteroposterior and lateral views, Wallis determined that the mortise view was not needed because the missed fractures were fibular avulsions, and that treatment would not have been altered.
Vangsness et al. [3] examined 123 cases with fracture and 10 cases of healthy control subjects, comparing the detection of fractures using the mortise and lateral views with detection using all three views. These researchers determined that "within 95% accuracy" the mortise and lateral combination performed as well as the three-view series. The choice of mortise and lateral over the anteroposterior and lateral was "authors' opinion." Brage et al. [9] studied interobserver and intraobserver agreement of classification of 99 ankle fractures using two or three views and found that agreement was better with two views. They recommended using the mortise and lateral combination to examine patients with possible ankle fractures.
Three two-view combinations exist: anteroposterior with mortise, anteroposterior with lateral, and mortise with lateral. From our data on detection of fractures of the foot and posterior malleolus using single views, any two-view combination must include the lateral view, eliminating the two-view combination of anteroposterior with mortise. Most previous studies have not compared both the remaining two-view combinations with the standard three-view series. It appears that investigator preference played a large role in which combination was evaluated. In some of these studies, the number of healthy control subjects was small, or controls were not included. In clinical practice, however, the actual number of "normal" ankles is quite high. Cockshott et al. [1] noted that fractures were present in only 29% of patients radiographed. Similarly, Auletta et al. [17] noted a 30% fracture rate in studies that they deemed were appropriately ordered. We found a 28% fracture prevalence in our study.
Our analysis indicates that both two-view combinations are similar in diagnostic performance for detection of fractures, and we could not find a reason to choose one over the other. This is somewhat surprising, given that from the single-view analysis, the mortise view performed better than the anteroposterior view for malleolar fractures and equivalent to the anteroposterior view at the remaining locations. One would expect that the mortise and lateral combination would perform better than the anteroposterior with lateral; however, in the combined view analysis, no significant difference was detected. This suggests that the information in the three views may be partially redundant, so that although the single mortise view may be superior to the single anteroposterior view for detection of malleolar fractures, when each of these views is combined with the lateral view, the difference disappears.
More important, we found little justification for eliminating either the anteroposterior or the mortise view, because accuracy decreased from three to two views. De Smet et al. [20] recently studied 344 ankle radiographs and found that using the anteroposterior and lateral radiographs alone would result in 1.5% of fractures being missed compared with the three-view radiographic series. Although our methodology differs from that of De Smet et al., we show a similar decrease in accuracy when the examination is reduced from three views to two. At both the medial and the lateral malleoli, the three-view series performed better than both two-view combinations, although the effect was small. Lack of difference at the posterior malleolus is expected because the lateral view is present in all combinations evaluated. Lack of difference at the foot may be because most foot fractures are best seen on the lateral view. This is true for fractures of the calcaneus, talus, and base of the fifth metatarsal, the three most common foot fractures in our series.
One could argue that other information is present on one view more often than another that would alter treatment decisions. For example, widening of the medial clear space of the mortise, implying rupture of the deep deltoid ligament, might be better seen on a mortise view. The study by Brage et al. [9] concentrated on classification of fractures and not detection of fractures. Our study focused only on fracture detection and did not address whether three views are needed to classify these fractures or to decide on treatment.
Our two- and three-view combinations are models based on the application of decision rules for noting a possible fracture on two or three individual views. All radiologists apply decision rules to process abnormal or possibly abnormal shadows seen on one view by looking at other views in the study. Other radiologists may have a different set of decision rules for how they combine individual images to determine the likelihood that a fracture is present, and this could alter the relative importance of a particular view. In addition, our study design precluded the natural situation in which a radiologist sees a possible abnormality on one film and looks to other views for confirmation. The radiologist then returns to the original image with knowledge and expectations gained from the other images to guide reinterpretation, in effect allowing multiple iterations of the combination matrices to reach a final decision. We believe that our study design, simulating a single iteration of the decision rules, may decrease the diagnostic performance for all fractures and at all fracture locations. However, the single iteration limitation would apply equally to the two- and three-view combinations and therefore would still allow comparison of two- and three-view combinations.
In conclusion, as health care providers attempt to reduce expenditure of resources for examining patients with possible ankle fractures, reducing the number of radiographic projections seems a logical option, and some authors suggest that this would not adversely affect patient care. We have tried to address this question using a systematic analysis of all possible combinations of views on a consecutive series of patients seen in clinical practice, balancing the number of patients with positive findings with a similar number of healthy control subjects. We designed our study to avoid bias from order of film interpretation.
Of the anteroposterior-lateral combination and the mortise-lateral combination, our results show little support for preferring either two-view combination over the other for any type of fracture. Our data suggest that a small but statistically significant cost exists in accuracy of detection for fractures when the three-view standard radiographic series is replaced by any two-view series.
APPENDIX: Models for Combining Single Radiographic View
Interpretation into Two- and Three-View Combinations