1 Department of Radiology, University of Wisconsin, 600 Highland Ave., Madison,
WI 53792.
2 Lehigh Valley Hospital, Allentown, PA.
3 Department of Radiology, Box 3808, Duke University Medical Center, Durham, NC
27710.
Received August 12, 2004;
accepted after revision September 23, 2004.
Address correspondence to M. A. Kliewer.
Abstract
MATERIALS AND METHODS. At the AJR, manuscript reviews are rated by the journal editors on a subjective scale from 1 (lowest) to 4, on the basis of the value, thoroughness, and punctuality of the critique. We obtained all scores for AJR reviewers and determined the average score for each reviewer. We also sent a questionnaire to 989 reviewers requesting specific information regarding the age, sex, radiology subspecialty, number of years serving as a reviewer, academic rank, and practice type of the reviewer. The demographic profiles were correlated with the average quality score for each reviewer. Statistical analysis included correlation analysis and analysis of variance modeling. Reviewer quality scores were also correlated with the scoring of individual reviews and ultimate disposition of 196 manuscripts sent to the AJR during the same period.
RESULTS. Responses to the questionnaire were obtained from 821 reviewers (83.0%), of whom 714 (87.0%) had quality scores available. Correlation analysis showed that the quality score of reviewers correlated strongly with younger age (p = 0.001). A statistically significant correlation between quality score and practice type was seen (p = 0.008), with reviewers from academic institutions receiving higher scores. No significant correlation was found between quality score and sex (p = 0.72), years of reviewing (p = 0.26), academic rank (p = 0.10), or the ultimate disposition of the manuscript (p = 0.40). The quality score of the reviewers showed no variation by subspecialty (p = 0.99).
CONCLUSION. The highest-rated AJR reviewers tended to be young and from academic institutions. The quality of peer review did not correlate with the sex, academic rank, or subspecialty of the reviewer.
Introduction
Despite the central importance of reviewers to the peer review process, only four studies have attempted to identify the characteristics of a good reviewer, and to our knowledge no study has examined the radiology literature [1-4]. Therefore, we undertook a study to look for correlations between the professional and demographic profile of reviewers and the quality of their reviews. We hypothesized that reviewer performance might be related to a number of characteristics, including age, sex, subspecialty, number of years reviewing, academic rank, and type of practice (academic or private). The goal of the study was to determine if particular reviewer attributes tend to predict the quality of manuscript review.
Materials and Methods
The AJR database at that time listed the following information: reviewer name, institution, average length of time for reviews, length of time since last review, and quality scores for all manuscript reviews provided by either the editor in chief, one of the two associate editors, or individuals serving an internship in the AJR office by virtue of having been awarded the ARRS Melvin M. Figley Fellowship in Radiology Journalism (all of whom are hereafter referred to as editors). These quality scores are average ratings of reviewer performance by the AJR editors. For every review received by the journal, reviewer performance is evaluated by one of the AJR editors and rated on a scale of 1-4, with 4 being the highest score. Scores are based on the level of sophistication of the commentary, the quality of the suggestions for manuscript improvement, the amount of detail, and the punctuality of the review [5-7]. These scores are subjective in nature and are not based on well-defined criteria. Editors are not blinded to reviewer identity or previous scores. Using these AJR data, we calculated the average quality score for each reviewer. For the database constructed specifically for this project, reviewers were identified only by a number code: reviewer names were purged to protect the confidentiality and privacy of the reviewers. Only the editor in chief and associate editors had access to the list correlating reviewer name and quality score.
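The per-reviewer averaging described above can be sketched as follows. This is a minimal illustration, not the procedure actually used by the AJR office: the function name, the `(reviewer_code, score)` input layout, and the sample values are assumptions made for the example; only the 1-4 rating scale and the use of anonymous number codes come from the text.

```python
from collections import defaultdict
from statistics import mean


def average_quality_scores(ratings):
    """Return {reviewer_code: mean editor rating}.

    `ratings` is an iterable of (reviewer_code, score) pairs, one pair per
    review, where score is the editor's rating on the 1-4 scale.
    Reviewers are identified only by a number code, never by name.
    """
    scores_by_reviewer = defaultdict(list)
    for reviewer_code, score in ratings:
        if not 1 <= score <= 4:
            raise ValueError(f"score {score!r} is outside the 1-4 scale")
        scores_by_reviewer[reviewer_code].append(score)
    return {code: mean(scores) for code, scores in scores_by_reviewer.items()}


# Hypothetical ratings, for illustration only (not study data).
example_ratings = [(101, 4), (101, 3), (102, 2)]
example_averages = average_quality_scores(example_ratings)
```

Keeping the name-to-code lookup outside this table mirrors the confidentiality arrangement described above, in which only the editor in chief and associate editors could link a code to a reviewer.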
Over a contemporaneous 6-month interval, 196 major papers submitted to the AJR were collected and entered into a database in which were recorded the reviewer scores of the manuscripts, the demographic profiles of the assigned reviewers, and the ultimate disposition of the manuscripts. The quality score of each reviewer was correlated with the overall score given by the reviewer to individual manuscripts and also with the ultimate disposition of the manuscript using Kendall tau correlation analysis. The overall score is the rating of the manuscript on a 10-point scale, with 1-4 recommending rejection, 5-6 recommending rejection with the opportunity to revise, and 7-10 recommending acceptance. Similarly, the ultimate disposition of a manuscript was coded as rejection, rejection with the opportunity to revise, or acceptance.
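The mapping from the 10-point overall score to a recommendation can be written out explicitly. The sketch below follows the cut points given above (1-4, 5-6, 7-10); the function name, the category labels, and the integer codes used to order the categories for rank correlation are assumptions introduced for illustration.

```python
# Ordinal codes for the three outcomes; the numeric values are an
# assumption, chosen only so that the categories have a natural order.
DISPOSITION_CODES = {
    "reject": 0,
    "reject_with_opportunity_to_revise": 1,
    "accept": 2,
}


def recommendation_from_overall_score(score):
    """Map a reviewer's 10-point overall manuscript score to a recommendation."""
    if not 1 <= score <= 10:
        raise ValueError(f"overall score {score!r} is outside the 1-10 scale")
    if score <= 4:
        return "reject"
    if score <= 6:
        return "reject_with_opportunity_to_revise"
    return "accept"
```

Coding the ultimate disposition with the same three ordered categories lets both variables be treated as ordinal data, which is the setting in which a rank correlation such as Kendall tau applies.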
All statistical tests were considered significant at p values of 0.05 or less.
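For readers unfamiliar with Kendall tau, the following sketch computes the tie-adjusted coefficient (tau-b) directly from its definition and applies the significance threshold stated above. It is an illustration under stated assumptions: the text does not say which variant of tau or which software was used, the function names are hypothetical, and the p value itself would in practice come from a statistics package (for example, `scipy.stats.kendalltau`) rather than from this code.

```python
from itertools import combinations
from math import sqrt

ALPHA = 0.05  # threshold from the text: significant at p values of 0.05 or less


def kendall_tau_b(xs, ys):
    """Kendall tau-b rank correlation between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pair tied on both variables contributes nothing
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_pairs = concordant + discordant
    denominator = sqrt((untied_pairs + tied_x_only) * (untied_pairs + tied_y_only))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


def is_significant(p_value, alpha=ALPHA):
    """Apply the stated rule: significant when p is alpha or less."""
    return p_value <= alpha
```

A coefficient near 0 indicates little monotonic association between the two rankings, and values near 1 or -1 indicate strong agreement or disagreement, respectively.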
Results
The average quality score for the composite group ranged from 1 to 4 (mean, 3.4; median, 3.5). The number of reviews on which the average quality score was based ranged from 1 to 13 manuscript reviews (mean, 3.9 reviews; median, 4.0).
The average quality score of these reviewers strongly correlated with age (p = 0.001); older reviewers generally received lower quality scores. The decline of quality score for older reviewers is seen in Figure 1. Further, we found a statistically significant correlation between quality score and practice type (p = 0.008); reviewers from academia typically rated higher than those in private practice. The mean quality score of reviewers with academic affiliations was 3.41, and the mean quality score of reviewers in private practice was 3.26. We found no significant correlation between quality score and sex (p = 0.72), years of reviewing (p = 0.26), or academic rank (p = 0.10). Although the quality score of AJR reviewers did not correlate with years of reviewing when the entire data set was analyzed, we found a notable difference in the quality scores of reviewers with more than 25 years of service (mean quality score, 3.19) when compared with the quality scores of reviewers with shorter periods of service (mean quality score, 3.37-3.51). However, the number of reviewers with at least 25 years of reviewing experience was small (n = 7). Last, the quality score of reviewers showed no variation by subspecialty (p = 0.99).
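A comparison of mean quality scores across groups, such as academic versus private practice, is the kind of question that analysis of variance addresses. The sketch below computes a one-way ANOVA F statistic from first principles. It is illustrative only: the text does not specify the model that was fitted, the function name is hypothetical, and the sample groups are made-up numbers, not study data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean(value for g in groups for value in g)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical scores for two groups, for illustration only.
toy_f = one_way_anova_f([[1, 2, 3], [2, 3, 4]])
```

The F statistic is then referred to an F distribution with k - 1 and N - k degrees of freedom to obtain a p value, a step that a statistics package would normally handle.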
The average quality score of a reviewer was not significantly correlated with either the overall score of the manuscript (r = 0.05, p = 0.33) or the ultimate disposition of the manuscript in the peer review process (r = 0.06, p = 0.40).
Discussion
Several conceivable explanations can be ventured for our finding that reviewer age and quality of review are inversely correlated. Younger reviewers may bring more enthusiasm to a review, constructing longer, more detailed reviews that draw comparisons with the existing medical literature. Younger reviewers may more avidly seek recognition and validation from well-known and influential editors. Or perhaps, younger reviewers may be more recently schooled in issues of experimental design, statistics, physics, or emerging imaging techniques. Conversely, perhaps as reviewers grow older, a gradual flagging of enthusiasm ensues; or perhaps older reviewers simply become more jaded, believing that they have heard or read some such before; or perhaps experience gained by reviewers over time does not compensate for mounting demands on their time and energy. It is also possible that older reviewers tend to be more laconic and less prone to provide exhaustive critiques or extensive advice, which would tend to result in lower scores by the editorial staff. Furthermore, older reviewers may rely more on their perceived personal authority [1]. Older reviewers may conceivably be more entrenched in their opinions, tending to harbor harsher views toward perspectives that do not coincide with their own beliefs and experiences. This phenomenon has been referred to as "confirmatory bias" [8, 9].
Whatever the explanation, the value of contributions from reviewers of younger years and more junior rank should be recognized. That this group can overcome what are, perhaps, more limited and nascent perspectives and produce thoughtful critiques of real substance is a testament to the value of youthful enthusiasm and dedication.
It is perhaps less mysterious why reviewers from academic practice might produce more compelling reviews than those from private practice. By virtue of less work intensity and protected academic time, academic radiologists tend to have a greater opportunity during the working day to prepare manuscript critiques. Academic radiologists are perhaps more likely to participate in regular journal clubs, in which the techniques of cogent criticism and close reading are taught and reinforced. And finally, academics may have greater resources from which to draw: these could be both material (university and departmental libraries, teaching files, computer and Internet databases) and intellectual (statisticians, subspecialist colleagues, physicists).
Some of our results have precedent in the medical literature. First, our finding that younger reviewers tend to produce more highly regarded reviews has also been described by other researchers outside the discipline of radiology [1-4]. Those studies, like ours, found no other characteristics of reviewers to be consistently associated with higher quality reviews. One study did find that male reviewers were more likely than female reviewers to give extreme scores on manuscripts, but this study also found that such differences between the sexes did not influence the ultimate disposition of a manuscript in the peer review process [10]. A different study from Scandinavian researchers in which reviewers rated fictitious manuscripts failed to find a systematic pattern of manuscript rating that could be attributed to either reviewer subspecialty or sex [3].
The variability between reviewers in our study further attests to the important intermediary role played by editors in the peer review process [11-14]. Editors must monitor the variation in quality and scoring tendencies of reviewers to mitigate the effects of a biased or deviant review. For many years now, the editors at the AJR have used a subjective system of rating individual reviews [5-7]. Such a system has been used by other journals and has been shown to be moderately reliable and moderately well correlated with a reviewer's ability to report manuscript flaws [15]. Editors at the AJR use the reviewer quality score as a tool to monitor reviewer performance and as a safeguard against the mishandling of a manuscript by reviewers who might be less careful than is required. Editors must ensure that every major paper, particularly one advancing complex, unexpected, or highly original interpretations, receives at least one fair and careful reading by an accomplished reviewer. They must match specific manuscripts with reviewers with particular expertise, knowledge, and skills and scrutinize the reviews for balance, persuasiveness, and clarity.
Our study has several potential limitations. First, editors were not blinded to the identity of reviewers. This lack of blinding could have created biases in various directions. Editors might hold private opinions (good or bad) of a reviewer from prior personal or professional encounters. Feasibly, because journal editors and associate editors tend to hold their positions in mid or late career, they might be expected to be more favorably disposed toward reviewers known to them and of their peer group. Similarly, Figley fellows may prefer the perspective of their younger peers. Unfortunately, the sources of the reviewer scores were not recorded, so the scoring tendencies of specific editors could not be ascertained. Second, the criteria used to judge reviews were neither standardized nor objective. The average quality score is itself based on a subjective assessment and is therefore subject to the inherent biases of the editorial staff. Editors could harbor predetermined views about a person, an institution, an imaging technique, or a field of inquiry. Editors might tend to rate more highly those reviews that coincide with their own opinions of the manuscript. Third, a selection bias doubtless exists in the evolution of the reviewer database. Over time, the editors will retire reviewers who consistently provide poor reviews or refuse to participate enthusiastically in the review process. Arguably, though, such attrition would tend to artificially enrich the ranks of older reviewers. This selection process would tend to obscure differences that might exist between different types of reviewers if all potential reviewers from the radiology community were studied. Fourth, our choice of subspecialty categories was, in retrospect, less apt than it might have been. For instance, a more accurate and inclusive categorization scheme would probably have used "musculoskeletal" rather than "bone" and "breast imaging" rather than "mammography." Considering how such terms are understood in the argot of our profession, however, it is unlikely that this misstep substantially influenced our results. Fifth, we did not control the analysis for type of manuscript. Reviews of shorter manuscripts, such as case reports, may tend to be more succinct (and receive lower scores) than reviews of longer, more detailed manuscripts. However, shorter manuscripts tend to be assigned more frequently to younger reviewers gaining experience with the review process, and this might actually skew younger reviewer scores downward. And finally, because this is a cross-sectional collection of data, one cannot be certain that it is simply age that accounts for the recorded differences in rated performance. A longitudinal study showing a progressive decline in review quality would be more definitive, but such a study is not currently available.
In summary, we found that the best reviewers at the AJR tend to be younger individuals from academic institutions. Of equal importance is the recognition that reviewers showed no significant variation in skill when compared by subspecialty or sex. Clearly, superb reviewers are found throughout the discipline of radiology. The onus to keep the peer review process in good working order falls squarely on the shoulders of the editors. An essential part of their job is tracking reviewer performance and protecting authors from reviewers who tend to be deviant or extreme [11]. We believe that the editors' subjective quality rating of peer reviews of manuscripts serves as a useful tool for monitoring reviewer performance.
What role mentoring might have in improving peer review is as yet unexplored, but it is enticing to imagine the synergy that might be created if the experience of an older mentor were wedded to the enthusiasm of a neophyte. We might soon be able to find out: the AJR staff has recently implemented a program to develop young reviewers under the benign tutelage of seasoned academics. Established reviewers have been asked to identify junior staff and trainees with potential and to help them hone their critical skills. One added benefit of such a program might be greater homogeneity of reviewer scores across age groups.
Acknowledgments
We gratefully acknowledge the assistance of Carrie Poole in the preparation
of the manuscript; the AJR staff, especially Charles Jenkins; and Lee
Rogers, who patiently supported the project. Three authors were recipients of
the Melvin M. Figley Fellowship in Radiology Journalism and therefore wish to
thank the ARRS for this enriching training and experience.