AJR 2004; 183:1545-1550
© American Roentgen Ray Society

Peer Review at the American Journal of Roentgenology: How Reviewer and Manuscript Characteristics Affected Editorial Decisions on 196 Major Papers

Mark A. Kliewer (1), David M. DeLong (2), Kelly Freed (3), Charles B. Jenkins (4), Erik K. Paulson (2) and James M. Provenzale (2)

1 Department of Radiology, University of Wisconsin, 600 Highland Ave., Madison, WI 53792-3252.
2 Department of Radiology, Duke University Medical Center, Box 3808, Durham, NC 27710.
3 Lehigh Valley Hospital, Allentown, PA.
4 Arthroscopy Journal, CompRehab Plaza, Winston-Salem, NC 27103.

Received February 17, 2004; accepted after revision May 21, 2004.

 
Address correspondence to M. A. Kliewer.


Abstract
OBJECTIVE. The objective of this study was to examine the relative influence of manuscript characteristics and peer-reviewer attributes in the assessment of manuscripts.

MATERIALS AND METHODS. Over a 6-month period, all major papers submitted to the American Journal of Roentgenology (AJR) were entered into a database that recorded manuscript characteristics, demographic profiles of reviewers, and the disposition of the manuscript. Manuscript characteristics included reviewer ratings on five scales (rhetoric, structure, science, import, and overall recommendation); the subspecialty class of the paper; the primary imaging technique; and the country of origin. Demographic profiles of the reviewers included age, sex, subspecialty, years of reviewing, academic rank, and practice type. Statistical analysis included correlation analysis, ordinal logistic regression, and analysis of variance.

RESULTS. A total of 335 reviewers produced 445 reviews of 196 manuscripts. Of the 196 submitted manuscripts, 20 (10.2%) were accepted, 106 (54.1%) were rejected, and 70 (35.7%) were rejected with the opportunity to resubmit. Regarding manuscript characteristics, we found that the country of origin, score on the science scale, and score on the import scale were statistically significant variables for predicting the final disposition of a manuscript. Of the reviewer attributes, greater reviewer age and higher academic rank were each significantly associated with lower scores on the import scale. Reviewer concordance was higher for the structure, science, and overall scores than for the rhetoric and import scores. More of the variability in the overall scoring of papers could be attributed to the reviewer than to the manuscript, but both factors combined explained only 23% of the total variability.

CONCLUSION. At the AJR, manuscript acceptance was most strongly associated with reviewer scoring of the science and import of a major paper and also with the country of origin. Reviewers who were older and of higher academic rank tended to discount the importance of manuscripts.


Introduction
Peer review is the thresher that attempts to separate manuscripts worthy of publication from those that are not. It is, however, an imperfect process that relies largely on the skill, discernment, and judgment of expert reviewers. Anyone who has regularly submitted manuscripts for review can tell stories of bewildering rejections and windfall acceptances. It is a commonplace that luck and chance play some role in the outcome of manuscript review.

Manuscript review is a complex cognitive process involving elements of problem solving, close reading, intuition, and expert perspective. But exactly how reviewers approach their task and how they arrive at their decisions are poorly understood. For many reviewers, the primary function is to detect and describe flaws [1]. Some editors have theorized that reviewers scrutinize a document and record potential flaws until the reviewers reach a rejection threshold [1]. A rejection threshold doubtless reflects both the tolerance of the reviewer and the seriousness and number of errors in the study. The exact level of cumulative error at which reviewers reach this rejection threshold may be largely idiosyncratic, different from one reviewer to the next. But are there discernible patterns of decision-making shared by reviewers? Do reviewers weigh the relative gravity of various faults equally, or are certain aspects of a paper especially pivotal for reviewers when reaching a final recommendation on a manuscript?

We have found few studies of peer review in the radiology literature [2–4]. Other studies of the peer-review process in the medical literature have tended to examine the effects of either reviewer characteristics [2, 5–13] or manuscript characteristics [3, 14]. To our knowledge, no study has examined the interaction of reviewer characteristics and manuscript characteristics in a single analysis.

Therefore, we undertook a study to examine how the characteristics of a manuscript interact with the professional and demographic profiles of reviewers in the ultimate disposition of manuscripts. Manuscripts were categorized using the following attributes: ratings of manuscript by reviewers on five different scales (rhetoric, structure, science, import, and overall recommendation), imaging techniques, subspecialty classification, and country of origin of the corresponding author. Reviewers were classified according to age, sex, subspecialty, number of years reviewing, and academic rank. The goals of our study were to determine which attributes of a manuscript tend to predict acceptance or rejection; which attributes of a reviewer tend to predict their ratings of a manuscript; how much concordance of opinion exists between reviewers; whether there is evidence of a mediating role played by the editors; and the relative importance of reviewer and manuscript in determining the final outcome of a paper. More broadly, this study examines the extent to which peer review of the American Journal of Roentgenology (AJR) is a reliable, reproducible, and objective measure of the relative value of submitted manuscripts.


Materials and Methods
Over a 6-month interval (October 1997 through March 1998), all major papers submitted to the AJR were collected and entered into a database in which characteristics of the manuscript, demographic profiles of assigned referees, and the eventual disposition of the manuscript through the peer-review process were recorded. At the AJR, major papers are scientific reports based on original research. Other types of manuscripts at the AJR not studied included perspectives, reviews, original reports, opinions or commentaries, pictorial essays, case reports, technical innovations, and letters to the editor. All manuscripts were reviewed by at least two AJR reviewers, all of whom worked at universities or practices in North America. The opinion of a third or even fourth reviewer was occasionally solicited if the views of the initial two reviewers were divergent or if the editor thought that one of the reviews was inadequate.

Manuscript Characteristics
Manuscript characteristics included ratings of reviewers, subspecialty of the research, primary imaging technique, and the country of residence of the corresponding author. Reviewers rated each manuscript on five standardized 10-point scales that measured: the use of rhetoric and language; the structure and organization of the manuscript; the quality of the science of the investigation; the clinical or academic import of the work; and, finally, the overall recommendation of the reviewer regarding the suitability of the manuscript for publication in the AJR (Fig. 1). On each 10-point scale, higher numbers indicate superior ratings. The subspecialty class of the manuscript was chosen by the category of the table of contents of the journal to which the manuscript best conformed. Specifically, these categories included chest imaging, breast imaging, gastrointestinal imaging, genitourinary imaging, obstetrics and gynecology, musculoskeletal imaging, head and neck imaging, pediatric imaging, cardiovascular imaging, computers in radiology, economics, physics, and miscellaneous. The primary technique used in the study was identified as sonography, CT, MRI, nuclear medicine, unenhanced radiography, or other. If two or more techniques were used by the researchers, the technique that was the primary focus of the study was used to categorize the paper. If the primary technique could not be determined because two or more techniques were equally important, then the paper was categorized as "other." The country of residence of the corresponding author of a paper was classified as the United States, Europe, Canada, Asia, the Middle East, India, Turkey, Australia or New Zealand, Africa, or Latin America. Finally, we determined whether the paper originated from an academic department or a private practice using the address of the corresponding author.



Fig. 1. Score sheet shows four rating scales and overall recommendation used by peer reviewers at the American Journal of Roentgenology to assess relative merits of manuscripts.

 

The disposition of the paper was categorized as rejection, rejection with the opportunity to revise and resubmit, or acceptance. These categories correspond to the overall recommendation on the manuscript review form (Fig. 1) supplied to AJR reviewers. On that form, overall recommendation is a graduated scale: a score of 1–4 indicates rejection; a score of 5 or 6, rejection with the opportunity to revise and resubmit; and a score of 7–10, acceptance. Those manuscripts that required a third review were also included in the data set. The scores of all reviewers and the attributes of each reviewer were entered in our database and analyzed.
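
The score bands on the review form can be written out explicitly. The following is a minimal sketch in Python; the function name and category labels are illustrative assumptions rather than part of the AJR review form, and the handling of fractional (mean) scores follows the bands reported for mean overall scores in the Results (7 or higher; greater than 4 but less than 7; 4 or less).

```python
def recommendation_category(overall_score: float) -> str:
    """Map an overall score on the 10-point scale to a recommendation category.

    Hypothetical helper: integer scores of 1-4 map to rejection, 5-6 to
    rejection with the opportunity to revise and resubmit, and 7-10 to
    acceptance. Fractional mean scores fall into the same bands.
    """
    if not 1 <= overall_score <= 10:
        raise ValueError("overall score must lie between 1 and 10")
    if overall_score >= 7:
        return "accept"
    if overall_score > 4:
        return "reject with opportunity to revise and resubmit"
    return "reject"
```

Applied to the mean of two reviewers' overall scores, a helper of this kind gives the recommendation that can then be compared with the editors' actual decision.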

The relationship between the particular characteristics of a manuscript and its ultimate disposition was evaluated for each variable individually and for the entire set of variables as a group. The importance of the country of origin as a predictor of acceptance was studied with a likelihood ratio chi-square test. The importance of the content subspecialty and of the technique was studied using the Mantel-Haenszel test based on table scores. To assess the relative importance of the four rating scales, we generated Kendall tau rank correlation coefficients to characterize the relationship between each of the four scales (rhetoric, structure, science, import) and the overall recommendation and ultimate disposition. Finally, we generated an ordinal logistic regression model using a stepwise fitting procedure to assess the relative importance of each of the variables within the universe of all variables and the independence of one variable from another.
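
For readers unfamiliar with the Kendall tau statistic, the sketch below shows how a rank correlation between one rating scale and the overall recommendation could be computed. It is illustrative only: the study's analyses were run in SAS, the text does not state which tau variant was used (the tau-b variant, which corrects for ties, is assumed here), and the ratings in the example are hypothetical, not study data. No significance test is included.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation for two equal-length sequences of scores."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x = tied_y = pairs = 0
    # Compare every pair of observations once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1  # pair tied on the first variable
        if dy == 0:
            tied_y += 1  # pair tied on the second variable
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    denominator = sqrt((pairs - tied_x) * (pairs - tied_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical ratings for six reviews (NOT data from the study).
science = [8, 6, 3, 7, 5, 9]
overall = [7, 6, 2, 8, 4, 9]
tau = kendall_tau_b(science, overall)
```

A value near 1 indicates that reviews ranking a paper higher on one scale also tend to rank it higher on the other; a value near 0 indicates little monotonic association.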

Reviewer Attributes
During this same 6 months of data collection on major papers at the AJR, we sent a questionnaire to all the reviewers for the Journal. The reviewers were asked to specify their age, sex, principal subspecialty, number of years reviewing for the AJR, academic rank, and practice type (academic or private practice).

The relationship between reviewer attributes (age, sex, years of reviewing, academic rank, and practice type) and these reviewers' ratings of manuscripts on the five scales (rhetoric, structure, science, import, and overall recommendation) was analyzed using Kendall tau rank correlation coefficients. The relationship between the subspecialty of the reviewer and the reviewer scoring of the various ratings scales was characterized using the Mantel-Haenszel test. We then tabulated scores to determine the most extreme scores in the data set and to correlate these outliers with reviewer attributes. Finally, a repeated measures analysis of variance (ANOVA) model was used to assess the relationship among the overall recommendation on a manuscript and the ratings on the four other scales, the age of the reviewer, years of reviewing, and academic rank.

Reviewer Concordance
The ratings of the reviewers who reviewed a particular paper were compared for degree of concordance. The absolute value of the difference between reviewers' ratings for each of the rating scales was also computed, and descriptive statistics were generated to illustrate the degree of agreement.
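
The absolute-difference measure described above can be sketched as follows. The function names are assumptions made for illustration, and the score pairs are hypothetical, not study data.

```python
from statistics import median


def absolute_differences(score_pairs):
    """Absolute difference between the two reviewers' scores for each manuscript."""
    return [abs(first - second) for first, second in score_pairs]


def share_within(score_pairs, points):
    """Fraction of manuscripts whose two scores differ by at most `points`."""
    differences = absolute_differences(score_pairs)
    return sum(d <= points for d in differences) / len(differences)


# Hypothetical overall scores from two reviewers on six manuscripts
# (NOT data from the study).
overall_pairs = [(7, 5), (4, 4), (9, 1), (6, 3), (8, 7), (5, 6)]

differences = absolute_differences(overall_pairs)
median_difference = median(differences)
agreement_within_2 = share_within(overall_pairs, 2)
```

Statements such as "reviewers agreed within 2 points 50% of the time" correspond to the value of `share_within` at a given threshold, computed separately for each rating scale.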

The concordance between reviewers was assessed on the various rating scales using Kendall tau rank correlation coefficients as well as canonical correlation and regression analysis of the overall recommendation. The correlations among the five scales (rhetoric, structure, science, import, and overall recommendation) and the relative importance of each scale to the overall disposition of the manuscript were assessed using correlation analysis and Kendall tau rank correlation coefficients.

The Authority of the Editor
Evidence for the influence of the editor was sought by comparing the overall scores from the reviews with the ultimate disposition of the manuscript. Mean overall scores were calculated for each disposition category: accept, reject with the opportunity to revise and resubmit, and reject. Further, mean overall scores for each manuscript were calculated and compared with the corresponding review recommendation (Fig. 1). When the recommendation suggested by the mean overall score did not correspond to the ultimate disposition of the manuscript, the data about these particular manuscripts were examined.

The Relative Importance of Reviewer and Manuscript
To assess the relative importance of the manuscript and the reviewer, we used a mixed linear model ANOVA test. This statistical model estimates the random effects of manuscript and reviewer on the overall recommendation. The relative contribution of both the manuscript and the reviewer to the variability of the overall score of a manuscript was calculated.

All statistical tests were performed using SAS software (SAS Institute). Tests were considered statistically significant at p values of 0.05 or less.


Results
We collected and analyzed reviews of 196 manuscripts. One hundred forty-six manuscripts were assessed by two reviewers, 47 manuscripts by three reviewers, and three manuscripts by four reviewers for a total of 445 reviews. Of the 335 reviewers who provided these 445 reviews, 263 (78.5%) reviewed one manuscript, 64 (19.1%) reviewed two manuscripts, 15 (4.5%) reviewed three manuscripts, one (0.3%) reviewed four manuscripts, and one (0.3%) reviewed five manuscripts.

Manuscript Characteristics
In our sample, the country of residence of the corresponding author and number of manuscripts from there were United States (n = 71, 36.2%), Europe (n = 55, 28.1%), Asia (n = 52, 26.5%), Canada (n = 8, 4.1%), India (n = 1, 0.5%), the Middle East (n = 1, 0.5%), Turkey (n = 1, 0.5%), and uncategorized (n = 7, 3.6%). All but two manuscripts originated from academic departments. Manuscripts were categorized according to subspecialty: gastrointestinal (n = 52, 26.5%), chest (n = 24, 12.2%), musculoskeletal (n = 24, 12.2%), head and neck (n = 22, 11.2%), cardiovascular (n = 14, 7.1%), genitourinary (n = 14, 7.1%), breast (n = 14, 7.1%), pediatrics (n = 10, 5.1%), obstetrics and gynecology (n = 7, 3.6%), computers in radiology (n = 6, 3.1%), economics (n = 3, 1.5%), physics (n = 1, 0.5%), and other (n = 5, 2.6%). The primary imaging technique used in these studies included MRI (n = 51, 26.0%), CT (n = 48, 24.5%), sonography (n = 38, 19.4%), unenhanced radiography (n = 34, 17.3%), nuclear medicine (n = 10, 5.1%), and miscellaneous (n = 15, 7.7%).

Of the 196 reviewed major manuscripts, 20 (10.2%) were accepted, 106 (54.1%) were rejected, and 70 (35.7%) were rejected with the opportunity to revise and resubmit. The site of origin was a strongly significant factor (p = 0.001) in the disposition of these manuscripts. For the purposes of our analysis, manuscripts from Canada and the United States were classified as North America, and manuscripts from Turkey, India, and the Middle East were combined with those from Asia. With this categorization, we calculated the percentage of manuscript acceptance to be 23.5% from North America, 12.8% from Europe, and 2.5% from Asia. The percentage of major manuscripts rejected with the opportunity to revise was 39.2% from North America, 28.2% from Europe, and 17.5% from Asia. Finally, the percentage of major manuscripts rejected was 37.2% from North America, 59% from Europe, and 80% from Asia.

No significant effect could be seen for either of the following variables: content subspecialty (p = 0.71) or primary imaging technique (p = 0.54).

Correlation analysis showed that the score for overall recommendations of the reviewers was correlated with the ultimate disposition of the manuscript (r = 0.46, p < 0.0001). These overall scores, in turn, were significantly correlated with the scores on the four scales (rhetoric, structure, science, and import): p values were 0.0001 for all four scales. That said, the scores on the science and import scales were more strongly correlated with the overall recommendation (r = 0.73 and 0.68, respectively) than were the scores on the rhetoric or structure scales (r = 0.46 and 0.58, respectively). Therefore, the reviewer ratings on the science and import scales were more important to manuscript acceptance than were the scores on the rhetoric and structure scales.

The ordinal logistic regression model that used the final disposition of the manuscript as the outcome variable showed statistical significance for the country of origin (p < 0.001), score on the science scale (p < 0.0001), and score on the import scale (p < 0.0001). With these factors included in the model, scores on the rhetoric scale, the structure scale, subspecialty classification, and primary imaging technique were not statistically significant (p > 0.05).

Reviewer Attributes
We found no statistically significant correlation between a reviewer's subspecialty and scoring on the rhetoric scale (p = 0.19), structure scale (p = 0.18), science scale (p = 0.24), import scale (p = 0.20), or overall recommendation (p = 0.19). Correlation analysis showed no statistically significant effect for sex, years of reviewing, or academic or private practice on the ratings given for any of the scales (rhetoric, structure, science, import, and overall recommendation). However, we did find that both the age of a reviewer (p = 0.03) and the academic rank of a reviewer (p = 0.05) were significantly correlated with the rating of manuscripts on the import scale. Reviewers who were older or higher in academic rank tended to rate papers lower on the import scale. Further, we saw a tendency for reviewers who had reviewed longer to rate papers lower on the import scale (p = 0.06).

Reviewer Concordance
We found that two reviewers agreed within 2 points 50% of the time on all the scales shown in Figure 1. For 75% of the reviews, we found agreement within 3 points on the structure, science, and overall scales, and within 4 points on the rhetoric and import scales. For 90% of reviews, we found agreement within 4 points for the structure and overall scales and within 5 points for the rhetoric, science, and import scales.

Reviewers giving the most extreme scores (either very high or very low) on the five rating scales were identified. This subset of reviewers comprised the broad scope of all subspecialties, ages, both sexes, academic ranks, and years of reviewing. We found no correlation between reviewer attributes and scores that were either unusually high or unusually low (p > 0.05). This analysis was complicated, however, by the relatively small number of reviewers with multiple extreme entries in the database.

The Authority of the Editor
The mean overall score was 7.42 (SD, 0.99) for accepted manuscripts, 6.09 (SD, 1.06) for rejected manuscripts with opportunity to revise and resubmit, and 4.41 (SD, 1.41) for rejected manuscripts. Of 38 manuscripts receiving a mean overall score of 7 or higher from reviewers (indicating an accept recommendation), 17 (46%) were accepted by the editors, 15 (40%) were placed in the reject with opportunity to resubmit category, and six (16%) were rejected. Of 103 manuscripts receiving a mean overall score greater than 4 but less than 7, three (3%) were accepted by the editors, 51 (50%) were placed in the reject with opportunity to revise category, and 49 (47%) were rejected by the editors. Of 55 manuscripts receiving a mean overall score of 4 or less, none (0%) was accepted by the editors, four (7%) were placed in the reject with opportunity to revise category, and 51 (93%) were rejected.

Two manuscripts with low initial mean overall scores (5, 5.5) were elevated by the editors to the accept category. In both cases, the scores from the initial two reviews were widely discrepant (9 vs 1 and 7 vs 4), and a third review, which was favorable (scores of 9 and 7, respectively), was obtained.

Two manuscripts with high initial mean overall scores (8, 8) were rejected by the editors. One of these manuscripts had only a single reviewer. The other manuscript had two discrepant reviews (overall scores of 10 and 6), and a subsequent third review that was more negative (score of 4) than either of the original two reviews.

The Relative Importance of Reviewer and Manuscript
With the mixed linear model using overall score as the outcome variable, we estimated the relative contribution of the manuscript and the reviewer to the variability of the overall score. The covariance parameter estimate was 0.29 for manuscript, 0.68 for the reviewer, and 3.26 for the residual. Thus, an estimated 7% of the variability of the overall score could be attributed to the manuscript, 16% could be attributed to the reviewer, and 77% was unexplained by either factor.

These covariance parameters indicate the degree to which the overall score of a manuscript varies with each of the predictor variables (the particular manuscript, the particular reviewers). The residual component estimates the degree to which the overall score is not explained by either the manuscript or reviewer factor. The large residual value indicates either that there is a substantial amount of randomness or that there are important factors in operation that are not included in the model. The actions of an editor may be one such uncontrolled factor.
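
The percentages reported for the mixed model follow directly from the covariance parameter estimates: each estimate is divided by the sum of all three. A short Python sketch of that arithmetic, using the estimates given above (the variable names are illustrative assumptions):

```python
# Covariance parameter estimates reported for the mixed linear model.
covariance_estimates = {"manuscript": 0.29, "reviewer": 0.68, "residual": 3.26}

# Total variance of the overall score is the sum of the components.
total_variance = sum(covariance_estimates.values())

# Share of the total variance attributable to each component.
variance_shares = {
    source: estimate / total_variance
    for source, estimate in covariance_estimates.items()
}

# Rounded to whole percentages: manuscript 7, reviewer 16, residual 77.
percentages = {source: round(100 * share) for source, share in variance_shares.items()}
```

The manuscript and reviewer shares together come to 23%, which is the combined figure quoted in the Abstract.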


Discussion
The peer review of manuscripts is essential to maintaining the vitality and quality of a scientific journal, but it is a complex process that combines elements of subjective critique and objective standards. Our study shows that relatively few factors seem to have a systematic predictive influence on the editorial decision for any given major manuscript at the AJR. Recognizing that peer review is an interaction between the reviewer and the manuscript, we sought to explore and dissect that interaction to discover recurring patterns that might help to partially predict outcome. One might expect that a manuscript would have a better chance of publication if it describes newer or more sophisticated imaging techniques or focuses on subspecialties that have popular appeal. One could also suppose that particular types of reviewers would be nicer (or harsher) in their critiques or that reviewers would tend to be swayed as much by style as substance.

Our study indicates that at the AJR three characteristics of a major manuscript were most strongly associated with acceptance: the quality of the science, the perceived importance of the study, and the geographic origin of the manuscript. This is not to imply that the rhetorical weight and the eloquence of the writing or the structure and organization of the manuscript were not important to AJR reviewers, but that these qualities appeared to be less influential in the reviewers' recommendations for publication. It is interesting, however, that manuscripts from North America had a greater chance of success even when their science and import scores matched those of manuscripts from other parts of the world. This may be evidence of an underlying rhetorical or stylistic effect: manuscripts from outside North America may have been hobbled by nonstandard styles of writing and argument. Such an effect, though, is likely to be small. Correlation analysis clearly shows the paramount importance of the scores on the science and import scales and the relative weakness of the rhetoric and structure scores on the decision for acceptance. This result may be reassuring to investigators outside North America: language and grammar issues do not seem to be insuperable barriers to publication provided the science of a study is solid and its importance is transparent.

It is equally important to recognize those features of a manuscript that did not seem to influence its ultimate disposition: the subspecialty of the content of the paper and the primary imaging technique used in the study. Clearly, scientific investigations of value and merit are produced in all subspecialties of radiology and use any number of imaging techniques.

Regarding reviewer attributes, we found that both the age and the professional rank of reviewers were significantly correlated with the scoring of manuscripts on the import scale. Older reviewers and reviewers at higher academic ranks tended to score the import of manuscripts lower.

The overriding weight given by reviewers to rankings on the import scale and the tendency of older reviewers to be more stingy when scoring this scale may suggest confirmatory bias. Confirmatory bias is the tendency to favor manuscripts with views and data that coincide with one's own beliefs and experiences [12, 13]. Younger reviewers in our study may have held less firmly entrenched views and therefore were more receptive to manuscripts that espoused ideas and perspectives that lay outside the scope of their particular interest or experience.

We found no clear demographic profile for reviewers who tended to give extreme scores. There appeared to be no identifiable subgroup in the AJR reviewers who were either excessively caustic or excessively lenient in their reviews of manuscripts. Admittedly, our analysis is flawed by the limited number of repeat reviews. It is possible that some pattern might be discernible with a larger sampling of manuscripts and reviewers. Nonetheless, we found no clear and strong pattern in our data for the systematic assignment of extreme scores by a particular subgroup. Reviewing styles appeared to be more idiosyncratic, reflecting more the tendencies of individuals than of subgroups.

Our study further showed that these AJR reviewers tended to agree with each other. On the five different rating scales of rhetoric, structure, science, import, and overall recommendation, we found that reviewer scores agreed within two points on a 10-point scale for at least half of all reviews. For 75% of reviews, reviewer scores agreed within 3 points for structure, science, and overall recommendation. Interestingly, on the decisive import scale, we found a greater divergence of opinion. Potential authors might see this as a way to increase their chances of acceptance by arguing as effectively and persuasively as they can about the importance of their studies. In the end, the acceptance of major papers at the AJR depended crucially on the alignment of the reviewers' ideas about what is important with the authors' ideas about what is important.

Finally, our study showed that who reviews a paper is at least as important as the quality of the paper itself. We found that for any combination of two AJR reviewers and one major manuscript, the change in the overall score was likely to be greater if one of the reviewers was exchanged for another reviewer than if the manuscript itself was exchanged for another manuscript. Even so, the manuscript factor accounted for only 7% of the variability of the overall score of the AJR manuscripts, and the reviewer factor accounted for 16%. Both of these factors were dwarfed by the amount of unexplained variability (77%) in the overall score. Most manuscripts were seen by only two reviewers, and most reviewers rated a limited number of manuscripts. Therefore, these variability estimates are likely to be somewhat imprecise. Even so, a similar result has been described in an earlier study in which the characteristics of reviewers explained only 8% of the variation in the quality of the reviews that these reviewers produced [6]. Clearly, the complex process of peer review remains largely unexplained, at least to the extent that quantitative statistical models can capture it.

Our finding that manuscripts from North America tended to have a greater chance of acceptance has also been documented in the gastroenterology literature [14]. For the AJR, chances of acceptance may well be changing with the growing internationalization of the Journal [3] and with the opening of the review process to reviewers beyond North America with the advent of an Internet-based review process.

The moderate amount of disagreement between reviewers in our study has also been described in the literature of other medical subspecialties [1, 7–9]. Such reviewer disagreement provides further support for the important intermediary role played by editors in the peer-review process [2, 15–18]. Editors must monitor the variation in quality and scoring tendencies of reviewers to mitigate the effects of a biased or deviant review. Indeed, we found evidence for the active agency of the editors at the AJR: the overall recommendations of the reviewers were strongly, but not perfectly, correlated with the final disposition of the manuscript. For 25% of manuscripts, the editors solicited the opinions of a third or fourth reviewer because the views of the initial two reviewers were widely discrepant or because the editor thought that one of the initial reviews was inadequate. Ultimately, reviewers serve as consultants to editors. Editors are constantly challenged to evaluate reviews for their persuasiveness, mediate differences of reviewer opinion and standards, and recognize potential reviewer bias. Moreover, editors must make decisions to maintain the vitality, originality, relevance, readability, and balance of their journal. As a result, a few good manuscripts will be passed over and a few marginal ones will be published.

The principal limitation of this study is the relatively small sample size and the limited number of repeat reviews by individual reviewers. Some of the factors that were not found to be statistically significant might feasibly become significant if a larger sample were studied. Nonetheless, the effect size of these factors would likely be small.

Peer review of manuscripts is an exercise with subjective and unpredictable components, and yet our study shows that reviewers generally agree on the quality of a manuscript and that good reviewers are found throughout our profession. Overall, the process tends to be reliable and valid: the general concordance of blinded reviewers indicates that peer review provides a reasonable estimate of the objective worth of a study to the general radiology audience. We did find, however, that reviewers tend to become harder to impress as they grow older, which seems intuitively plausible. The central lesson of our study may be that the clinical or academic import of a concept and the tenets of sound science are of paramount importance to reviewers. Recognizing that establishing the importance of an investigation is the primary function of the Introduction section of a manuscript and that establishing the quality of the science is the primary function of the Materials and Methods section of a manuscript, young investigators might profit by devoting particular attention and care to the development of these sections when drafting their papers.


Acknowledgments
 
We thank Carrie Poole for assistance in the preparation of the manuscript; Mitzi Chambers for data entry; and the staff of the Editorial Office of the AJR in Winston-Salem, NC, for the collection of data.


References

  1. Kassirer JP, Campion EW. Peer review: crude and understudied, but indispensable. JAMA 1994;272:96–97
  2. Siegelman SS. Assassins and zealots: variations in peer review. Radiology 1991;178:637–642
  3. Chen MY, Jenkins CB, Elster AD. Internationalization of the American Journal of Roentgenology: 1980–2002. AJR 2003;181:907–912
  4. Friedman DP. Manuscript peer review at the AJR: facts, figures, and quality assessment. AJR 1995;164:1007–1009
  5. Nylenna M, Riis P, Karlsson Y. Multiple blinded reviews of the same two manuscripts. JAMA 1994;272:149–151
  6. Black N, van Rooyen S, Godlee F, Smith R, Evans S. What makes a good reviewer and a good review for a general medical journal? JAMA 1998;280:231–233
  7. Cullen DJ, Macaulay A. Consistency between peer reviewers for a clinical specialty journal. Acad Med 1992;67:856–859
  8. Ingelfinger FJ. Peer review in biomedical publication. Am J Med 1974;56:686–692
  9. Scharschmidt BF, DeAmicis A, Bacchetti P, Held MJ. Chance, concurrence, and clustering. J Clin Invest 1994;93:1877–1880
  10. Callaham ML, Baxt WG, Waeckerle JF, Wears RL. Reliability of editors' subjective quality ratings of peer reviews of manuscripts. JAMA 1998;280:229–231
  11. Gilbert JR, Williams ES, Lundberg GD. Is there gender bias in JAMA's peer review process? JAMA 1994;272:139–142
  12. Ernst E, Resch KL. Reviewer bias: a blinded experimental study. J Lab Clin Med 1994;124:178–182
  13. Mahoney MJ. Publication prejudices: an experimental study of confirmatory bias in the peer review system. Cognit Ther Res 1977;1:161–175
  14. Link AM. US and non-US submissions: an analysis of reviewer bias. JAMA 1998;280:246–247
  15. Relman AS. Peer review in scientific journals: what good is it? West J Med 1990;153:520–522
  16. Relman AS, Angell M. How good is peer review? N Engl J Med 1989;321:827–829
  17. Chew FS. Manuscript peer review: general concepts and the AJR process. AJR 1993;160:409–411
  18. Polak JF. The role of the manuscript reviewer in the peer review process. AJR 1995;165:685–688




