Genitourinary Imaging
Original Research
Comparative Accuracy of Renal Duplex Sonographic Parameters in the Diagnosis of Renal Artery Stenosis: Paired and Unpaired Analysis
OBJECTIVE. The purpose of this study was to evaluate the test performance of duplex sonographic parameters in screening for hemodynamically significant renal artery stenosis, which occurs in approximately 5% of persons with hypertension.
MATERIALS AND METHODS. A comprehensive literature search was conducted to find studies on the diagnosis of renal artery stenosis in which duplex sonography and intraarterial angiography were compared and in which sensitivity and specificity were calculated. MEDLINE (1966-2005), EMBASE (1988-2005), and reference lists were searched and the authors contacted. Data were subjected to meta-analysis according to the hierarchical summary receiver operating characteristic curve model. Heterogeneity in test performance relating to population and design features was investigated.
RESULTS. From 1,357 titles, 88 studies involving 9,974 arteries in 8,147 patients were included. The following four parameters were evaluated: peak systolic velocity (21 studies), acceleration time (13 studies), acceleration index (13 studies), and renal-aortic ratio (13 studies). The corresponding diagnostic odds ratios (ORs) were 60.9 (95% CI, 28.3-131.2), 28.9 (95% CI, 7.1-117.2), 16.0 (95% CI, 5.1-50.6), and 29.3 (95% CI, 12.7-67.7). Results based on studies in which parameters were directly compared showed that peak systolic velocity had greater accuracy than renal-aortic ratio (relative diagnostic OR, 1.8; p = 0.03; nine studies) and acceleration index (relative diagnostic OR, 5.3; p < 0.001; five studies). Acceleration time versus acceleration index showed no evidence of a difference in accuracy (relative diagnostic OR, 1.1; p = 0.65; nine studies). Analysis of peak systolic velocity used in combination with other parameters compared with peak systolic velocity alone (seven studies) showed evidence of a shift in test positivity (p < 0.001) but only weak evidence of improvement in accuracy (relative diagnostic OR, 1.6; p = 0.09).
CONCLUSION. Sonography is a moderately accurate screening test for renal artery stenosis. The single measurement, peak systolic velocity, has the highest performance characteristics, with an expected sensitivity of 85% and specificity of 92%. Additional measurements do not increase accuracy.
Keywords: color Doppler sonography, digital subtraction angiography, renal disease, screening
Renal artery stenosis (RAS) is the most common underlying medical condition causing hypertension and is present in 1-5% of all cases of hypertension [1]. RAS is also the indication for renal replacement therapy in 5-15% of cases [2, 3]. Most cases of RAS are due to atherosclerosis [4], the incidence of which increases with age [5]. As the general population ages, it is likely that RAS will be detected more frequently. Identification of patients with RAS-induced hypertension has important clinical implications because correction of RAS with angioplasty and stenting can improve blood pressure control in as many as 64% of patients with hypertension resistant to medical treatment [6]. In addition, improvement or stabilization of renal function occurs in 79% of patients in whom RAS is associated with deteriorating renal function [7]. Intraarterial angiography currently is the reference standard for the diagnosis of RAS. However, the morbidity associated with this technique from bleeding, anaphylaxis, and contrast material-induced nephropathy has largely limited its use for verification of RAS in patients in whom the condition is strongly suspected either on clinical grounds or because of a positive screening test result.
There is no universally accepted screening or triage test for RAS. Clinical assessment to identify patients in whom RAS is a likely cause of hypertension or of deteriorating renal function has not proved sufficiently accurate for routine use; however, clinical prediction algorithms for the detection of RAS have achieved a sensitivity of 65% and a specificity of 87% [8]. Physiologic studies for assessing the renin-angiotensin system have been largely abandoned for screening because of their low predictive accuracy, especially among elderly patients [9, 10]. Radionuclide studies for assessing differential blood flow have several limitations. The results are difficult to interpret in patients with impaired renal function, and the technique cannot be used to identify RAS if the patient has bilateral disease or if only one kidney is present. The accuracy of radionuclide studies in the diagnosis of RAS has ranged from very high [11] to very low [12].
Use of direct imaging of renal vessels for triage in the detection of RAS so that only patients with positive results undergo verification with the reference standard, intraarterial angiography, has attracted much interest in recent years. The techniques used are duplex sonography, CT with administration of contrast material, and MRI [13]. Duplex sonography has several advantages over the other two techniques: it is widely available, noninvasive, and inexpensive. Duplex sonography, however, is not a simple test. Multiple measurements of several indexes of renal blood flow are possible. The absolute and relative accuracy rates of these sonographic parameters, which have been used to diagnose hemodynamically significant RAS, are not well defined. The primary aims of this study were to assess the comparative performance of duplex sonographic parameters in the diagnosis of hemodynamically significant RAS, to determine which test is most discriminating in the detection of disease, to examine threshold effects, and to assess the gain in using tests in combination. Secondary aims were to use population and study design features to explain heterogeneous results among studies.
We included studies in which duplex sonography was compared with renal angiography in examinations of patients with hypertension. Articles were excluded if echo-enhancing agents were used or data were insufficient to complete a two-by-two contingency table from which sensitivity and specificity could be calculated. Studies involving echo-enhancing agents were excluded because the aim of the study was to assess noninvasive triage tests, and use of an echo-enhancing agent constitutes an invasive procedure. Pairs of reviewers independently extracted data from each article. Disagreements were resolved by a third reviewer extracting the data and through subsequent discussion. Reviewers were not blinded to details of authorship. Duplicate publication was identified after data extraction, and the article with the most complete data was included in the review.
Studies were identified through MEDLINE (1966-March 2005), EMBASE (1988-March 2005), review of all references in review articles or eligible articles, and contact with investigators. The medical subject headings renal artery obstruction, ultrasonography, Doppler, color, angiography, hypertension renovascular, hypertension renal, sensitivity and specificity, receiver operating characteristic, and hypertension were combined with the text words renal artery stenosis, renal arteriography, sensitivity, specificity and predictive value, and renal angiography. Details of search strategies appear in Appendix 1. All titles were reviewed online by two of the authors. Abstracts of these articles were reviewed, and those appearing relevant were obtained for full assessment. Articles not in English were translated from the following languages: French, German, Italian, Spanish, Czechoslovakian, Slovak, Portuguese, and Russian.
| MEDLINE No. | MEDLINE Search History | MEDLINE No. of Results | EMBASE No. | EMBASE Search History | EMBASE No. of Results |
|---|---|---|---|---|---|
| 1 | renal artery obstruction/ | 6,895 | 1 | kidney artery stenosis/ | 3,310 |
| 2 | renal artery stenosis.tw | 2,738 | 2 | renal artery stenosis.tw | 2,093 |
| 3 | renal artery obstruction.tw | 40 | 3 | 1 or 2 | 3,789 |
| 4 | hypertension renovascular.tw | 99 | 4 | kidney artery stenosis/di [Diagnosis] | 1,084 |
| 5 | hypertension renovascular.mp [mp=ti, ab, rw, sh] | 5,123 | 5 | Kidney angiography/ | 747 |
| 6 | Renovascular hypertension.mp [mp=ti, ab, rw, sh] | 3,620 | 6 | Kidney arteriography/ | 1,209 |
| 7 | Or/1-6 | 11,798 | 7 | Doppler echography/ | 6,249 |
| 8 | Renal angiography.tw | 623 | 8 | Duplex Doppler.tw | 837 |
| 9 | Renal arteriography.tw | 541 | 9 | Colo?r Doppler.tw | 4,951 |
| 10 | Renal artery/ and angiography/ | 1,652 | 10 | Doppler pulse.tw | 42 |
| 11 | (dop$ adj25 ultra$).mp [mp=ti, ab, rw, sh] | 13,530 | 11 | Doppler ultrasonography.tw | 2,187 |
| 12 | Ultrasonography Doppler/ | 3,058 | 12 | Or/4-11 | 14,769 |
| 13 | Ultrasonography Doppler color/ | 4,160 | 13 | 3 and 12 | 1,354 |
| 14 | Ultrasonography Doppler duplex | 1,581 | 14 | Diagnostic accuracy/ | 51,669 |
| 15 | Ultrasonography Doppler pulse | 522 | 15 | Receiver operating characteristic/ | 1,395 |
| 16 | Or/8-15 | 21,748 | 16 | “sensitivity and specificity”.tw | 16,778 |
| 17 | 7 and 16 | 952 | 17 | (sensitive$ or specific$ or predictive value).tw | 440,614 |
| 18 | 17 not animal.mp [mp=ti, ab, rw, sh] | 917 | 18 | Or/14-17 | 475,202 |
| | | | 19 | 13 and 18 | 348 |
| | | | 20 | Limit 19 to human | 334 |
Design and population characteristics were assessed for each study retrieved. Items assessed included participant details (clinical spectrum, prevalence of stenosis), duplex sonographic method (method described or not, proportion of unsuccessful tests, inclusion of failed tests, inclusion of occluded arteries), angiographic method (method described or not, operator specified, views specified, stenosis measurement, results reviewed by two operators, percentage stenosis severity, threshold of positive test result), and study design items (year of publication, prospective or retrospective data collection, consecutive patient enrollment, interpreters of sonographic results blinded to angiographic results, interpreters of angiographic results blinded to sonographic results, interpreters of both types of images blinded to clinical information, number of patients undergoing both sonography and angiography, method of inclusion of occluded arteries, inclusion of accessory arteries in data). Most population and design characteristics were coded yes, no, or not stated. Some characteristics, such as threshold, prevalence of stenosis, percentage of unsuccessful tests, and percentage of patients undergoing both tests, were treated as continuous variables in the metaregression. RAS severity was classified according to angiographic appearance, most commonly 50% or 60%. Sonographic parameters evaluated were peak systolic velocity, renal-aortic ratio (RAR), acceleration time (AT), acceleration index (AI), combinations that included peak systolic velocity, combinations not including peak systolic velocity, and other single parameters. Peak systolic velocity and RAR are usually extrarenal measurements, and AI and AT are intrarenal measurements.
When results were reported for more than one threshold for a given sonographic parameter, the threshold most commonly used by other authors was chosen to make study designs as similar as possible. Threshold values ranged from 100 to 200 cm/s in studies in which peak systolic velocity was evaluated and from 1.8 to 3.5 (ratio) in studies in which RAR was measured; one article did not state the threshold. AT threshold values were > 0.1 to > 0.7 m/s, and AI threshold values were 3-4.5 m/s²; one article did not state the threshold used.
In articles that reported results for more than one level of severity of angiographically defined RAS, the data set with the most frequently used threshold in other studies was chosen. This precaution was used to ensure the most appropriate comparison across articles. We attempted to extract data for both patients and arteries, but because only 28 of the 88 eligible articles reported data on patients, we reported only data on individual arteries.
For each study, sensitivity and specificity were calculated at the chosen threshold, and the diagnostic odds ratio (OR) was computed. This value is a single summary measure of test accuracy for each study, taking into account both sensitivity and specificity at a threshold. A high diagnostic OR indicates high test accuracy. A test that performs no better than chance in discriminating presence and absence of disease has a diagnostic OR of 1. In instances in which five or more studies assessed the same Doppler parameter, a summary receiver operator characteristic (SROC) curve was estimated. A preliminary exploratory analysis was conducted with the SROC regression method of Moses et al. [14], Littenberg and Moses [15], and Irwig et al. [16]. Regression diagnostics were examined to identify outliers and potentially influential studies.
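The per-study calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the example cell counts, and the 0.5 continuity correction applied when a cell is zero are all assumptions introduced here.

```python
def diagnostic_summary(tp, fp, fn, tn, correction=0.5):
    """Return (sensitivity, specificity, diagnostic OR) for one 2x2 table.

    Assumption: if any cell is zero, `correction` is added to every cell so
    that the odds ratio stays finite. The article does not say how zero
    cells were handled.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)      # true positives among diseased arteries
    specificity = tn / (tn + fp)      # true negatives among nondiseased arteries
    dor = (tp * tn) / (fp * fn)       # odds of a positive test, diseased vs nondiseased
    return sensitivity, specificity, dor


# Hypothetical study: 100 stenosed and 100 normal arteries.
sens, spec, dor = diagnostic_summary(tp=85, fp=8, fn=15, tn=92)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} DOR={dor:.1f}")
```

A diagnostic OR of 1 falls out of this definition whenever sensitivity equals 1 - specificity, which is the chance-level case mentioned in the text.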
The hierarchical SROC curve model described by Rutter and Gatsonis [17, 18] was used at the next stage of the analysis. This method has the advantage of taking into account uncertainty in estimates of both sensitivity and specificity within each study and includes a random effect for both test accuracy and positivity criterion (threshold), thereby taking into account unexplained heterogeneity between studies. The model also allows test accuracy to vary with threshold through the inclusion of a scale (shape) parameter that provides for asymmetry in the SROC curve. This shape parameter is assumed to be constant across studies (fixed effect). The model can be referred to as a mixed model because it includes both random and fixed effects. Empiric Bayes estimates of model parameters were obtained with PROC NLMIXED (SAS) [19]. Appendix 2 shows a detailed specification of the model.
Fig. 1. Flow chart of identified and included studies.
An SROC curve was constructed after selection of a range of values of 1 - specificity (consistent with the observed data) and use of the estimated model parameters to compute the predicted values for sensitivity. A function of the estimated model parameters was used to obtain the expected operating point on the SROC curve [17] that gave summary estimates of sensitivity and 1 - specificity and the corresponding positive and negative likelihood ratios. Covariates were added to the model for assessment of whether test accuracy, the positivity criterion, or the shape of the SROC curve varied with population and design characteristics. The level chosen for statistical significance was 5%. Only significance test results that were robust to the removal of an influential observation were reported. Both the SROC curve regression and hierarchical SROC curve methods gave similar results; therefore, only the results of the hierarchical model are reported.
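As a rough illustration of how predicted sensitivities follow from a chosen range of 1 - specificity, the sketch below assumes a symmetric SROC curve, for which logit(sensitivity) = log(diagnostic OR) + logit(1 - specificity). The helper names are hypothetical, the diagnostic OR of 60.9 and the operating point of 85% sensitivity and 92% specificity are the summary values reported for peak systolic velocity, and the full hierarchical model is not reproduced.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def sroc_sensitivity(fpr, dor):
    """Sensitivity on a symmetric SROC curve at false-positive rate `fpr`.

    Assumption: the shape parameter is zero, so the diagnostic OR is
    constant along the curve.
    """
    return expit(math.log(dor) + logit(fpr))


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios at one operating point."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Trace the curve over an illustrative range of 1 - specificity.
curve = [(fpr, sroc_sensitivity(fpr, dor=60.9)) for fpr in (0.02, 0.05, 0.08, 0.10, 0.20)]
for fpr, sens in curve:
    print(f"1 - specificity = {fpr:.2f} -> sensitivity = {sens:.3f}")

lr_pos, lr_neg = likelihood_ratios(0.85, 0.92)
```

Under these assumptions the curve passes close to, but not exactly through, the reported operating point, because the summary sensitivity and specificity come from the hierarchical model rather than from this simplified formula.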
Test accuracy can vary across studies owing to differences in patient characteristics, disease spectrum, and study design characteristics. Because of missing or poorly reported information, it is difficult to adequately adjust for this confounding effect. We thus focused on paired studies conducted with direct comparisons of two or more parameters against the same reference standard, angiography, within the same study.
Among 1,357 publications reviewed, 88 eligible articles with data on 9,974 arteries were identified and included in this study (Fig. 1). Nine duplicate publications were excluded. The following four Doppler sonographic parameters were assessed: peak systolic velocity (21 studies, 2,785 arteries), acceleration time (AT) (13 studies, 1,927 arteries), acceleration index (AI) (13 studies, 1,299 arteries), and renal-aortic ratio (RAR) (13 studies, 1,347 arteries). Fifty-one (58%) of the articles contributed no data to our analyses because they reported on parameters or combinations of parameters that fewer than five authors examined, such as delta resistive index (n = 4), peak systolic velocity and delta resistive index (n = 2), RAR or spectral broadening (n = 1), and peak systolic velocity, AT, and AI (n = 1). Seventeen studies contributed one data set to our analyses, three studies contributed two data sets, nine studies contributed three data sets, and five studies provided four or more data sets for the analyses. Dual data extraction reached complete agreement between data extractors in all four cells summarizing test performance (true-positive, true-negative, false-positive, false-negative) in 72% of the included studies. Extracted data in an article that did not agree completely were reextracted by a third author and, in the case of one article, a fourth author.
The study characteristics are shown in Table 1. Studies were published between 1984 and 2004. The number of patients in each study ranged from 17 to 550. Of the articles that reported sex ratios, all but one [20] included both male and female patients. Only three articles [21-23] reported results for children younger than 12 years, the others reporting a wide range of ages of adults. The clinical spectrum included hypertension alone, hypertension with chronic renal failure, peripheral vascular disease, treatment by kidney transplantation, and potential kidney donation. It was not possible to separate patients with hypertension and chronic kidney failure and patients with hypertension and peripheral vascular disease from the other groups because patients were usually mixed, and separated data were not provided. The prevalence of RAS varied widely, from 6% to 90% with a mean of 37% (median, 33%). The thresholds used for diagnosis of clinically significant stenosis in the analyzed data were almost equal at 50% and 60%.
Peak systolic velocity was measured in the main renal artery in all studies in the analysis, and 17 of the 21 studies detailed an angle-adjusted measurement. RAR was analyzed as a ratio between measurement in the main renal artery relative to aortic measurements. Eleven of 13 studies described an angle-adjusted measurement. AI was measured intrarenally in nine of 13 studies, three articles did not state where AI was measured, and in one study, AI was measured in the main renal artery. In two of 14 studies, calculation of AI was adjusted according to ultrasound frequency, but the effect on outcome was inconsistent with highly variable diagnostic ORs (239 and 59). AT was measured intrarenally in 10 of 13 studies, the reports of three studies not stating where measurements were made. Angle adjustment was stated in eight of 13 articles.
Figure 2A shows the SROC curves for all four individual duplex sonographic tests: peak systolic velocity, AT, AI, and RAR. The shape parameter reached significance in only one analysis, but this result was due to a single influential study. Hence, symmetry was assumed for all SROC curves. The estimated summary diagnostic OR and corresponding 95% CI based on all studies for each parameter are shown in the insets in Figures 2A-2E. Peak systolic velocity had the highest accuracy, followed by RAR, AT, and AI. The corresponding areas under the SROC curves were computed from the estimated log diagnostic OR [24] to be 0.95, 0.91, 0.91, and 0.87 for peak systolic velocity, RAR, AT, and AI, respectively. After exclusion of the study identified earlier as influential for AT (Fig. 2A), the diagnostic OR increased from 28.9 to 42.2 (95% CI, 13.1-136.7). The corresponding estimate of the area under the curve was 0.93. Estimates of expected sensitivity, 1 - specificity, positive likelihood ratio, and negative likelihood ratio for each test, based on all studies, are shown in Table 2.
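The link between a diagnostic OR and the area under a symmetric SROC curve can be checked numerically. The sketch below uses the closed-form expression commonly quoted for a symmetric curve, AUC = DOR / (DOR - 1)² × [(DOR - 1) - ln DOR]; that this is exactly the computation in reference [24] is an assumption, and the function name is illustrative.

```python
import math


def sroc_auc(dor):
    """Area under a symmetric SROC curve with constant diagnostic OR `dor`.

    Assumption: closed form for the symmetric case; a DOR of 1 (chance)
    is handled separately to avoid division by zero.
    """
    if dor == 1:
        return 0.5
    return dor / (dor - 1) ** 2 * ((dor - 1) - math.log(dor))


# Summary diagnostic ORs reported in the text.
reported = {"PSV": 60.9, "RAR": 29.3, "AT": 28.9, "AI": 16.0, "AT (influential study excluded)": 42.2}
for name, dor in reported.items():
    print(f"{name}: DOR = {dor} -> AUC = {sroc_auc(dor):.2f}")
```

With the reported diagnostic ORs as input, this expression gives areas that round to the values stated in the text, which suggests the assumption of symmetry is doing the work here.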
Fig. 2A. Summary receiver operator characteristics curves. DOR = diagnostic odds ratio, RDOR = relative DOR. Peak systolic velocity (PSV), renal-aortic ratio (RAR), acceleration time (AT), and acceleration index (AI).
Fig. 2B. Summary receiver operator characteristics curves. DOR = diagnostic odds ratio, RDOR = relative DOR. Studies in which both PSV and RAR were examined separately.
Fig. 2C. Summary receiver operator characteristics curves. DOR = diagnostic odds ratio, RDOR = relative DOR. Studies in which AI and AT were examined separately.
Fig. 2D. Summary receiver operator characteristics curves. DOR = diagnostic odds ratio, RDOR = relative DOR. Studies in which PSV was examined alone and in combination with other parameters.
For the nine studies in which peak systolic velocity and RAR were directly compared for the same patients (Fig. 2B), the diagnostic OR was higher for peak systolic velocity than for RAR in almost all studies (relative diagnostic OR, 1.8; p = 0.03). In nine studies, investigators assessed both AT and AI (Fig. 2C). Analysis of paired studies showed no significant difference in accuracy between AT and AI (relative diagnostic OR, 1.1; p = 0.65). Excluding the influential study, the relative diagnostic OR increased to 1.4, which was not significant (p = 0.17), but there was a shift in test positivity (p = 0.01), AI having a higher expected sensitivity but lower expected specificity than AT. Analysis of the seven studies in which peak systolic velocity alone was compared with peak systolic velocity in combination with other parameters (Fig. 2D) showed only weak evidence of improvement in accuracy with the use of combined tests (relative diagnostic OR, 1.6; p = 0.09). Three of the studies, however, showed an increase in sensitivity but a decrease in specificity for the combined tests relative to peak systolic velocity alone. These findings suggest that an “either positive” rule was applied when test results were combined. Three of the other four studies showed negligible difference in either sensitivity or specificity, and one showed no difference. These results suggest that using tests in combination results in a shift in test positivity (p < 0.001) but provide only weak evidence of an improvement in discrimination (diagnostic OR). Analysis of the five studies in which peak systolic velocity and AI were assessed (Fig. 2E) showed that peak systolic velocity was more accurate than AI (relative diagnostic OR, 5.3; p < 0.001), but the 95% CI was very wide.
The estimated relative diagnostic ORs for the comparisons based on the subset of paired studies were consistent with the results for individual parameters shown in Figure 2A except for AT versus AI, in which the paired studies showed no difference in accuracy.
Fig. 2E. Summary receiver operator characteristics curves. DOR = diagnostic odds ratio, RDOR = relative DOR. Studies in which PSV and AI were examined separately.
The population and design characteristics are shown in Table 3. Eighty-seven (99%) of the articles reported the number of patients who underwent both duplex sonography and angiography, the average proportion being 75% (range, 3-100%) and the median being 100% (48 articles reported 100%). Seventy-seven articles reported the proportion of patients in whom sonography failed to give a result, which occurred in 10% of cases on average (median, 6%). The minimum proportion of failures reported was zero (22 studies), and the maximum, 54%.
To assess whether test accuracy or test positivity cut point was associated with study design or reporting, each population and design characteristic was included as a covariate in the hierarchical SROC curve model. For peak systolic velocity, the approach to failed sonographic examinations was associated with the cut point for test positivity but not with test accuracy. Studies explicitly showing no peak systolic velocity failures had a higher expected sensitivity (0.95) and, hence, a lower expected specificity (0.76) than the following study categories, which had similar expected sensitivity (0.81) and specificity (0.93): peak systolic velocity failures excluded, peak systolic velocity failures included, and no indication of what the investigators did with failed peak systolic velocity (p = 0.004). For AI, test accuracy increased as the test threshold increased. For every 0.5-m/s² increase in test threshold, the diagnostic OR increased an average of 3.8 times (relative diagnostic OR, 3.8; 95% CI, 1.4-10.5; p = 0.01). Other population and study design characteristics had no significant effect on test performance.
We found that renal duplex sonography is a moderately accurate test for screening patients with hypertension for hemodynamically significant RAS, but important differences in accuracy exist among the parameters measured. Peak systolic velocity was the most accurate test parameter with values of sensitivity, specificity, and diagnostic OR of 85%, 92%, and 60.9, respectively. A diagnostic OR of 61 means that the odds of a test having a positive result (for peak systolic velocity) are 61 times greater in someone with RAS than in someone without RAS. AI was the least accurate parameter analyzed, the OR being 16.0. Between these extremes were RAR and AT, which had similar accuracy, with diagnostic ORs of 29.3 and 28.9, respectively. Studies in which different parameters were evaluated in the same patient group were useful for comparing the parameters while controlling for variation within patient group. On the basis of the findings of these paired studies, peak systolic velocity had superior discrimination compared with RAR (relative diagnostic OR, 1.8) and AI (relative diagnostic OR, 5.3). No significant difference in accuracy was identified for AI and AT. Analysis of peak systolic velocity alone compared with peak systolic velocity in combination with other parameters showed only weak evidence of improvement in discrimination for the combination over peak systolic velocity alone but suggested a shift in test positivity criterion. Overall, findings from paired analyses were consistent with those of individual, unpaired analyses.
There was considerable heterogeneity in the accuracy of the same test parameters among studies, and this finding was largely unexplained. An operator effect might have led to variation in test performance, with the most experienced operators producing the best test performance. It was not possible, however, to analyze this effect because publications did not give details about experience level or number of operators involved in a study, nor were data reported separately for individual operators. The numbers of reported sonographic failures varied widely across publications (0-54%), suggesting substantial variation in study design. Failure rates were not higher in the oldest studies and did not appear to be related to which sonographic machine was used or the country in which the study was performed. A possible explanation for variability is referral filter bias. Few articles provided details on how patients were enrolled in the study; it was therefore impossible to determine how many difficult or likely-to-fail cases were excluded from the reports. Evidence of patient exclusion after study commencement was a discrepancy between reported numbers of patients in the assessment and results tallied. Variation in reported test performance may be a result of authors' selecting favorable patients for the studies. Such selection bias may affect the ability to generalize results to the broader group of patients who may ordinarily be considered for Doppler sonographic screening for RAS. Thus, application of these results to an unselected clinical group may result in differing test performance.
In one systematic review [25], investigators evaluated renal duplex sonography in the diagnosis of RAS. The objective was to compare all available noninvasive or minimally invasive techniques currently used for detection of RAS. Thus, the focus was not exclusively on Doppler sonography. The authors used area under the curve to compare diagnostic techniques across studies and concluded that CT angiography and gadolinium-enhanced 3D MRI, which had an area under the curve of 0.99, performed better than other studies, including duplex sonography (area under the curve, 0.93). The authors, however, evaluated renal Doppler sonography as one test rather than determining which of the available measurements was the most accurate. The authors of the review included only 24 of the 88 studies identified in the current study, did not analyze each sonographic test separately, and did not evaluate study population and design characteristics. Given the variety of Doppler parameters that can be measured and our evidence of important differences in accuracy, summary statements about renal duplex sonography in general are probably unhelpful to clinicians and radiologists who need to decide which parameter should be measured.
This study was a comprehensive review of all available data and was conducted with recently developed methods for finding relevant studies, evaluating their quality, and synthesizing results. We depended, however, on the quality of design and reporting of the primary studies, which in many cases was problematic. This finding was not novel, and methods are being developed to improve the design and reporting of studies of diagnostic tests. We hope that the recently published Standards for Reporting of Diagnostic Accuracy [26] will improve future reports of studies of diagnostic tests.
It is accepted that for many diagnostic tests, accuracy relies on operator expertise; however, evidence to support this belief is lacking. None of the 88 studies analyzed provided sufficient details to allow exploration of this theory. Considerable evidence of patient selection bias was apparent in many of the studies. In using peak systolic velocity as a triage test for RAS, it may be worthwhile to consider both who will be performing the test and who will be the subjects.
In the clinical setting, duplex sonography is used primarily as a screening or triage test for RAS to prevent all patients with suspected RAS from undergoing the reference standard test, renal angiography. Of the four possible classifications based on test and target condition status, the proportion of patients with true-positive results (those who would still need angiography) is less important than the proportion with true-negative results (those who avoid angiography) and the proportion of those with false-positive findings (those who undergo unnecessary angiography). Based on the summary sensitivities and specificities determined in this metaanalysis, Table 4 shows the expected number of persons in each of the four test and target condition cells with each duplex sonographic parameter. The data were calculated for two different prevalences of disease (5% and 45%), reflecting the prevalence found in unselected hypertensive populations and in selected populations of patients with hypertension and peripheral vascular disease.
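The arithmetic behind such a table of expected counts can be sketched as follows. The cohort size of 1,000 persons and the function name are assumptions made for illustration, and the output is not claimed to reproduce Table 4; the sensitivity (85%) and specificity (92%) are the summary values reported for peak systolic velocity.

```python
def expected_counts(prevalence, sensitivity, specificity, n=1000):
    """Expected numbers in the four test-by-disease cells for a cohort of n.

    Assumption: n = 1000 is a hypothetical cohort size chosen for readability.
    """
    diseased = n * prevalence
    nondiseased = n - diseased
    tp = diseased * sensitivity          # still need angiography
    fn = diseased - tp                   # missed stenoses
    tn = nondiseased * specificity       # avoid angiography
    fp = nondiseased - tn                # unnecessary angiography
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


for prevalence in (0.05, 0.45):
    counts = expected_counts(prevalence, sensitivity=0.85, specificity=0.92)
    rounded = {cell: round(value, 1) for cell, value in counts.items()}
    print(f"prevalence {prevalence:.0%}: {rounded}")
```

At low prevalence the false-positive cell outnumbers the true-positive cell, which is why the proportion of unnecessary angiographic examinations matters most in unselected hypertensive populations.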
Our analysis highlighted and confirmed previous findings that showed studies of diagnostic tests are poorly reported. Further high-quality research that yields data on individual duplex sonographic parameters separately and specifies combinations of parameters is needed. In particular, to explore further the association between expertise and experience and test accuracy, authors should report the experience of the test operators and observers. Further studies are needed to evaluate the comparative accuracy of peak systolic velocity and AT with paired data cross-classified by RAS status.
In conclusion, renal duplex sonography performed with the parameter peak systolic velocity is accurate in triage of many patients with suspected RAS, and unnecessary renal arteriography is avoided. The noninvasive character, ready availability, and low cost of duplex sonography make it an attractive option as a screening test for RAS.
In the hierarchical summary receiver operating characteristic model [17, 18], within-study variability for the ith study is modeled as

$$\operatorname{logit}\left[P(Y_{ij} = 1)\right] = (\theta_i + \alpha_i D_{ij})\,\exp(-\beta D_{ij}),$$

where Yij = 1 represents a positive test result for subject j, and Dij represents that subject's true disease status as defined by the reference standard (coded as -0.5 for nondiseased and 0.5 for diseased). Between-study variability is taken into account by allowing each study (i) to have its own cut point for test positivity (θi) and diagnostic accuracy (αi). The cut point for test positivity is assumed to be on an underlying (latent) continuous scale for test results. The random effects for cut point and accuracy are assumed to be independent (uncorrelated) and normally distributed with

$$\theta_i \sim N(\Theta, \sigma_\theta^2) \quad \text{and} \quad \alpha_i \sim N(\Lambda, \sigma_\alpha^2).$$

Hence, Λ gives the expected accuracy (log diagnostic odds ratio) at a given cut point. The modeling takes into account the sampling variability in estimates of sensitivity and specificity (and the correlation between them) within each study in estimation of random effects. The fixed-effect parameter β is used to assess whether test accuracy depends on test threshold. When β = 0, test accuracy is constant across thresholds, resulting in a summary receiver operating characteristic curve that is symmetric about the diagonal line where sensitivity equals specificity, and Λ then is a global estimate of test accuracy (log diagnostic odds ratio) that is constant across thresholds. The model was fitted with PROC NLMIXED (SAS) [19]. Functions of the model parameters were used to estimate the expected sensitivity, specificity, likelihood ratios, and confidence intervals.
Address correspondence to G. J. Williams ([email protected]).
Supported by National Health and Medical Research Council Program grant no. 211205 (Australia).
The authors have no financial interests in this article.
