Opinion
1 Department of Radiology, Massachusetts General Hospital and Harvard Medical School, 55 Fruit St., Boston, Massachusetts 02114.
Received April 10, 2002;
accepted after revision August 27, 2002.
Address correspondence to D. B. Kopans.
Introduction
I believe that this most recent controversy was not the result of legitimate scientific questions but rather was due to the analysis of trials by individuals who did not fully understand screening trials, the violation of the peer review principles by a respected medical journal, and the recent trend toward tabloid reporting in previously well-respected newspapers with the goal of creating sensational stories.
The Times article was factually inaccurate. It suggested that this was a new study, when, in fact, Gotzsche and Olsen had published identical concerns a year earlier in The Lancet [2]. This second review was actually a reanalysis of the same data using the same approaches to the data. Furthermore, what The Times reported on and what had appeared in The Lancet was actually not the "study," but rather a "research letter" that summarized Gotzsche and Olsen's conclusions (the study was later posted on The Lancet Web site). The urgency of the situation seemed exaggerated by the front-page placement of the story by The Times in light of the fact that the research letter had actually been published 2 months earlier [3].
Ignoring the fact that the arguments that Gotzsche and Olsen had raised had been previously refuted and by implying this study was new, The Times article cast major doubts on the efficacy of mammographic screening. Not only was the story emphasized by The Times, but also the idea that mammographic screening was ineffectual was reinforced by several follow-up articles [4, 5], op-ed pieces [6, 7], and editorials [8, 9] in the paper. Doubts about the benefits of screening were reinforced, while The Times denied publication to most opposing views and criticisms of its articles. The Times even refused publication of a letter to the editor from the American Cancer Society that had been signed by nine other organizations (including the American College of Obstetricians and Gynecologists, American College of Preventive Medicine, American College of Surgeons, Cancer Research Foundation of America, National Alliance of Breast Cancer Organizations, Oncology Nursing Society, and Society of Gynecological Oncology) that reinforced their support for mammographic screening. The same letter was also denied op-ed publication in The Times (Smith R, personal communication). When the American Cancer Society finally paid to publish the letter as an advertisement in the newspaper [10], The Times responded with an editorial attacking the cancer establishment [8]. The American Cancer Society was criticized for not accepting the review by Gotzsche and Olsen despite the fact that the society was well aware of the review and did not accept its conclusions because they were not scientifically sound.
Any time there is a front-page article in The New York Times, it gains the attention of the other media, and the story spreads rapidly. Most of the media had ignored the original articles when they were first published, but once The Times showed interest, other stories followed quickly with little apparent effort by most media to try to understand the facts. Concerns were raised around the world. Not only were women and their physicians confused, but the situation triggered at least four major reviews of mammographic screening.
By-Passed Peer Review and Tabloid Media Behavior Generate a Controversy
...to support the United Kingdom's National Health Service. Funds were provided to establish a 'Cochrane Centre,' to collaborate with others in the UK and elsewhere, and to facilitate systematic reviews of randomized controlled trials across all areas of health care.
This initiative was followed by the opening of Cochrane Centers in several countries around the world that collaborate closely with one another and agree to use rigorous review methods to evaluate the evidence behind various medical interventions.
Gotzsche and Olsen were members of this well-respected collaboration, and they chose to review the mammography screening trials. The two reviewers first published their analysis in the British journal The Lancet in 2000 [2]. In that review they decided that five of the seven randomized controlled trials were not performed properly, so they discarded the results from those trials (which happen to show a clear benefit of mammographic screening). They argued that only the Malmö trial and the National Breast Screening Study (NBSS) of Canada were fairly well performed and that only their results were valid. Because they interpreted those two trials as showing that mammographic screening had no benefit, Gotzsche and Olsen concluded that there was no benefit from mammographic screening for women at any age.
Not much attention was paid to this initial publication because the randomized controlled trials of breast cancer screening had been reviewed repeatedly over the past decades, and almost everyone agreed that mammographic screening could reduce the death rate by 25-30% [12]. The major controversy in the 1990s had been over the age at which the screening benefit begins. Furthermore, a number of the trialists had already answered the concerns raised by Gotzsche and Olsen's review.
I believe that the Gotzsche and Olsen analyses [2, 3] showed a lack of understanding about breast cancer screening trials and data analyses from these trials. When multiple trials are being evaluated, one cannot selectively eliminate the results that the reviewer doesn't like. For example, the data from the NBSS of Canada, which evaluated screening for women ages 40-49, run counter to all the other results (they had more deaths in the screened group than in the control group). The results of the study were almost certainly due to the poor quality of the mammography [13, 14] and the fact that the trialists violated the basic rules of a randomized controlled trial. They first examined all the women before randomization and then allocated the participants to the screened and control groups on open lists, which is a major error that allows for the possible untraceable compromise of the allocation process. Blinded randomization is at the heart of a properly executed trial. The Canadian trials grossly violated this basic principle, which was compounded by including women with palpable, advanced breast cancers in their trial of screening [15]. It is fairly clear that in their trial of women 40-49 years old, the trialists allocated a significantly greater number of these women to the screened group, causing the study to be imbalanced from the start [16]. Despite the major flaws in the NBSS, the trial nevertheless must still be included in the analyses of screening trials to avoid bias. The fact that Gotzsche and Olsen excluded five of the trials from their assessment is counter to appropriate trial analysis.
Having had their first analysis repudiated, the two authors persisted and "redid" their review and arrived at the same conclusions, in the same fashion. The authors noted [3] that their reassessment differed from the earlier study in that they
...paid close attention to the standard dimensions of methodological quality of trials: the randomization method, baseline comparability, exclusions after randomization, and unbiased assessment of outcome.... We noted whether early introduction of screening in the control group had occurred...[and]...classified the quality of the available trial data into four groups: high, medium, poor, and flawed.
In my experience, it is unusual to have a second chance to defend an analysis in a publication when the conclusions of the first publication had been rejected by the scientific community. It is not entirely clear what went on at The Lancet that permitted Gotzsche and Olsen to have this second chance, but it would appear that the peer review process may not have been followed. At the Global Summit on Mammographic Screening that was held in Milan, June 3-5, 2002, Peter Boyle, the lead biostatistician at the European Institute of Oncology, revealed that he and at least two other peer reviewers of the Gotzsche and Olsen resubmission rejected the material as scientifically flawed. Despite this fact, the editor of The Lancet, Richard Horton, decided to publish the summary of the paper in the journal and place the detailed paper on The Lancet Web site. The unfortunate ramifications of this highly questionable publication policy by a scientific journal were obvious in the headline that it generated [17]. Because this "new analysis" had not actually been published, it was initially impossible to assess its conclusions. Many journals, such as The Lancet, have adopted a tabloid approach by sending summaries of controversial papers to the media ahead of the journal publication, while preventing the media from disseminating the information until the publication date of the article. As a result, a controversial paper receives immediate publicity with no opportunity for the medical or scientific community to review the data and make informed comments. This policy results in sensational media reports, as in this case, with no way for commentators to provide some balance.
Fortunately, most of the media ignored this rehash of the same analysis until December 2001, when The New York Times broke this year-old story on the front page [1]:
A new study published in a British medical journal has stirred passionate debate among doctors in Europe and the United States by asserting that mammograms do not prevent women from dying of breast cancer or help them avoid mastectomies.
The article summarized interviews from a number of individuals whose answers reflected the fact that most were not well informed about the issues and the data. The responses of those who had the real information and an understanding of the issue were relegated to a few paragraphs. The initial Times article was followed by a second front-page story on screening in general, in which doubts were raised again about mammography [4]. Attempts to publish the counterarguments were thwarted by The Times. Other media picked up the story from The Times, and the controversy swelled.
The Gotzsche and Olsen Analysis
These analyses showed that Gotzsche and Olsen [2, 3] had been concerned with small issues that were of little consequence but which they amplified in importance. There is some irony in their conclusions because the Malmö trials actually show a statistically significant mortality reduction of approximately 35% [20] for women who were invited to be screened before the age of 50 years, which the authors failed to mention. A subsequent analysis of the Malmö trials, also published in The Lancet, showed a benefit from screening older women as well [21], but Gotzsche and Olsen incorrectly concluded that there was no benefit. It is also unclear why they had used outdated information from Malmö in their review.
For a detailed review of the concerns raised by Gotzsche and Olsen [2, 3], I would direct the reader to The Netherlands report [19]. The following are some of the issues that were raised in the report.
Gotzsche and Olsen's basic premise, namely, that the trials showing a benefit were significantly flawed and biased, was highly questionable. These trials have probably undergone greater scrutiny than any group of scientific studies. The Health Insurance Plan (HIP) of New York study, for example, began in 1963 [22], yet Gotzsche and Olsen raised some basic questions about its design and execution. One must wonder why they took so long to raise these concerns. The principal investigators have both died. Nevertheless, Gotzsche and Olsen rejected results from the HIP trial because the data showed a larger number of women in the screened group than in the control group who were removed from the analysis because they had been diagnosed with breast cancer before the start of the trial. This imbalance would seem to suggest a bias in the allocation of the women, but it actually reflects a failure by Gotzsche and Olsen to understand the way the HIP study was performed. In the HIP study, 62,000 women in the insurance plan were randomly assigned to the screened group or the unscreened control group. The unscreened control subjects were not even informed that they were in the study. The group that was offered screening was invited to attend a screening session. Obviously, when these women attended a screening session it would be easy to determine whether they had already been diagnosed with breast cancer and to exclude them from the trial. Because the control group was not invited to attend a screening session, the fact that a woman had been diagnosed with breast cancer before the date of allocation would not be known until her death was reviewed, revealing the date of her prior diagnosis. Because most women do not die from their breast cancers, fewer women who had a prior diagnosis in the control group than in the screened group would have come to the attention of the reviewers.
Of course, because these women in the screened group also did not die from breast cancer, they would not skew the results. Not only is this information explained in the monograph of the HIP [22], but, according to the director of the National Cancer Institute, Andrew von Eschenbach, it could also have been confirmed if Gotzsche and Olsen had spoken to a "trial expert who worked on this trial" [23]. Gotzsche and Olsen should have been more rigorous in their review.
The results from the Gothenburg trial and the Two-County Trial in Sweden [24] were rejected by Gotzsche and Olsen because the average ages of the women in the trials were not perfectly balanced, leading Gotzsche and Olsen to believe that the randomization must have been biased. They pointed out that in the Gothenburg trial, there was a statistically significant difference of 0.09 years in the ages of the women in the screened group versus those in the control group [2, 3]. This significant difference amounts to 32 days. It is doubtful that this had any influence on the results of the trial, and it does not suggest a problem with randomization as implied by the authors. Furthermore, as has been pointed out, if 20 parameters are evaluated, it is likely that one will be found to be statistically significant on the basis of chance alone [25]. Minor asymmetries such as this do not invalidate the trial results [26].
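The point about 20 parameters can be checked with a short calculation. The sketch below is illustrative only: it assumes the baseline comparisons are statistically independent and each tested at the conventional 5% level, neither of which is stated in the reviews, and the function names are hypothetical.

```python
# Chance of at least one spurious "significant" baseline difference when
# several characteristics are each tested at significance level alpha.
# Assumption: the tests are independent (real baseline variables may not be).

def prob_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests is significant by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of chance findings among n tests."""
    return n_tests * alpha


if __name__ == "__main__":
    n = 20
    # About 0.64: more likely than not that something looks "significant".
    print(f"P(at least one of {n}) = {prob_any_false_positive(n):.2f}")
    # On average one chance finding in 20 comparisons.
    print(f"Expected count = {expected_false_positives(n):.1f}")
```

Under these assumptions, a single nominally significant imbalance among 20 baseline comparisons is the expected outcome of sound randomization, not evidence against it.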
Gotzsche and Olsen rejected the results from the Two-County Trial because it too had slight differences in age for the women in the screened group relative to the women in the control group. What Gotzsche and Olsen failed to understand was that these differences are to be expected when cluster randomization is used [27], as was the case in the Two-County Trial, and that the differences in age in no way invalidate the conclusions of the trial.
Gotzsche and Olsen concluded that the allocation of the causes of death in the screening trials must have been biased because there were fewer deaths attributed to breast cancer in the screened groups than in the control groups. Because most of the trials involved reviewers who were unaware of the patient data to determine the causes of death, it is unclear how the allocation process was compromised. Furthermore, the obvious answer to Gotzsche and Olsen's concerns is that there were fewer deaths from breast cancer in the screened group because screening reduced the rate of deaths from breast cancer.
A major point made by Gotzsche and Olsen was that the trials incorrectly used as the measure of benefit a reduction in deaths from breast cancer between the screened group and the control group. Gotzsche and Olsen insisted that the only true measure of benefit was a decrease in deaths from all causes [2, 3] and that measuring breast cancer deaths alone would lead to bias. In their correspondence in The Lancet [3], Gotzsche and Olsen correctly pointed out that the trials would not be expected to show a difference in the rate of deaths from all causes. Paradoxically, they then proceeded to argue that the trials were not well done because they did not show a decrease in deaths from all causes. The concerns of Gotzsche and Olsen were based on the fact that bias could occur if those who assigned the cause of death wanted to make screening appear beneficial. Those assessors could then attribute more deaths to breast cancer among the unscreened group while assigning fewer deaths to breast cancer among the screened group. It is unclear how this bias would happen if those assigning the cause of death did not know which women had been in the screened group and which women had been in the control group. Gotzsche and Olsen correctly argued that this type of bias can be avoided by comparing deaths from all causes, not just those from breast cancer, which is the way treatment trials are performed. The treated group is compared with the control group by counting deaths from all causes, eliminating any possible bias introduced by trying to determine why an individual died (whether it was from the disease being studied or for some other reason).
All-cause mortality is a legitimate end point in treatment trials in which everyone has the disease and most deaths are from the disease, but it shows a lack of understanding to think that it can be used in screening trials in which only a small percentage of the participants have the disease and few of the total deaths that occur during the trial are from the disease. Because the number of deaths from other causes in screening trials is much greater than that from breast cancer, extremely large trials would be needed to show that saving lives from breast cancer lowered the total number of deaths in the population under study. The required trial would be too large to be feasible. We have estimated that if breast cancer were to account for 10% of deaths each year, then 3 million women would have to participate in a trial to show that a 25% decrease in breast cancer deaths would have a significant effect on all causes of mortality [28]. Because breast cancer actually accounts for only approximately 3% of deaths each year, a trial would have to involve even more women, numbering in the millions, and would be impossible to perform.
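The scale of the problem can be illustrated with a standard two-proportion sample size formula. This is a rough sketch and does not reproduce the estimate cited above [28], whose assumptions are not given here. The 10% cumulative all-cause death risk, the 5% two-sided significance level, and the 80% power are assumptions chosen for illustration; the 3% breast cancer share and the 25% reduction come from the text. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control: float, p_screened: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Women needed in each arm to detect p_control vs p_screened
    (normal approximation for comparing two proportions)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_screened * (1 - p_screened)
    return ceil(z ** 2 * variance / (p_control - p_screened) ** 2)


def compare_endpoints(all_cause_risk: float, breast_cancer_share: float,
                      reduction: float) -> tuple[int, int]:
    """Return (n per arm for breast cancer deaths, n per arm for all-cause deaths)."""
    bc_control = all_cause_risk * breast_cancer_share
    bc_screened = bc_control * (1 - reduction)
    # Screening is assumed to change only the breast cancer deaths.
    all_screened = all_cause_risk - (bc_control - bc_screened)
    return (n_per_arm(bc_control, bc_screened),
            n_per_arm(all_cause_risk, all_screened))


if __name__ == "__main__":
    # Assumed: 10% of participants die of any cause during follow-up.
    n_bc, n_all = compare_endpoints(all_cause_risk=0.10,
                                    breast_cancer_share=0.03,
                                    reduction=0.25)
    print(f"Breast cancer mortality end point: {n_bc:,} women per arm")
    print(f"All-cause mortality end point:     {n_all:,} women per arm")
```

The absolute difference in deaths is the same for both end points, but for all-cause mortality it is buried in the much larger variation of deaths from other causes, which is why the required trial size grows so sharply.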
Gotzsche and Olsen were also skeptical of the trial data because the trials did not take into account possible cardiovascular deaths from breast irradiation. They neglected to realize that cardiovascular deaths occurred in at most 5% of those irradiated [29, 30] and that these deaths would apply to both screened and control groups, so that any difference would be trivial. Furthermore, death from cardiovascular disease is no longer an issue because modern treatment fields are designed to avoid cardiac irradiation.
Gotzsche and Olsen also argued that screening leads to more mastectomies based on data that were gathered from the trials. They overlooked the fact that mastectomy was the standard treatment for all cancers at the time these trials were performed, which is no longer the case. Clearly, modern mammography has resulted in a marked reduction in the need for mastectomy.
The credibility of Gotzsche and Olsen's analysis is also diminished by their emphasis on the NBSS as an example of a properly performed and unbiased trial when the facts clearly show that the Canadian trial was significantly compromised. As noted earlier, the NBSS violated basic and critical rules of randomized controlled trials. To avoid bias, the allocation process must be blinded, and nothing can be known about individual participants before the blinded, random allocation. Not only did all the patients in the NBSS have a clinical breast examination before allocation, but the allocation was also on open lists. Furthermore, women with clinically obvious cancers, some of which were advanced, were permitted to participate in this trial of screening. It would have been simple and undetectable to have inadvertently shifted a few women with advanced cancers into the screened group (i.e., skip a line on the list), causing a major bias in the cancer allocations and trial results without having any effect on the overall demographics. At least for the women between 40 and 49 years old, a statistically significant excess of women with advanced breast cancer was allocated to the screening arm [16]. Gotzsche and Olsen suggest that a review cleared the Canadian randomization process, but they failed to point out that the review was essentially worthless because the reviewers failed to interview the women who performed the randomization to determine whether the randomization process had been compromised [31]. The flaws in the NBSS are indefensible and represent textbook errors in trial design and execution. Furthermore, the quality of the mammography in this trial of screening was poor [13, 14], even prompting their own physicist to state that it was "far below state of the art, even for that time [early 1980s]" [32]. 
It is inexplicable why Gotzsche and Olsen, who based their arguments on the claim that they were experts in reviewing randomized trials, would point to the NBSS as an example of a properly designed and executed trial.
Despite the scientific facts, The New York Times and their reporters continued to try to buttress their position. They stressed the fact that a committee called the Physicians Data Query (PDQ) agreed with the Gotzsche and Olsen review. The PDQ was portrayed as an objective committee of the National Cancer Institute (NCI) [33]. The PDQ is actually a group of individuals who are loosely connected to the NCI. They were not chosen in any objective fashion, but were chosen by the committee chairman, Barnett Kramer, a long-time opponent of recommending screening for women 40-49 years old. Nine of the other 11 members of the PDQ are also established opponents of screening for women 40-49 years old (they have all lectured and published articles in opposition). I believe that the PDQ is one of the most biased groups that could be assembled to evaluate this issue, which The Times knew (I personally wrote to them including references written by the PDQ members). Nevertheless, The Times continued to mislead their readers into thinking that this was an objective group, implying that it voiced the opinion of the NCI. The new director of the NCI personally reassured me that the PDQ "does not determine NCI policy" (von Eschenbach A, personal communication).
The New York Times, continuing to ignore the scientific evidence, persistently attacked anyone who argued in support of screening. When Secretary Tommy G. Thompson of the United States Department of Health and Human Services came out in support of screening on the basis of the report of the United States Preventive Services Task Force, The New York Times published an editorial criticizing him [9]. They also dismissed the United States Preventive Services Task Force as an unimportant group when, in fact, the organization is quite well respected, even by those of us who have disagreed with it in the past. The Times went on to dismiss the analysis because the task force, under pressure from the Gotzsche and Olsen critique, had downgraded the level of evidence from the trials from an "A" to a "B." This downgrading was clearly a concession to politics. A conference in Sweden in mid-February confirmed the quality of the trials and the lack of scientific support for the Gotzsche and Olsen review [34]. All trials have flaws, but most of the trials were well done, and the few flaws do not negate the results. Certainly, the Swedish trials were carefully thought out and executed, with rigorous follow-up, and they showed a clear benefit from mammographic screening, which was recently reaffirmed with an updated review of those trials [35].
Evidence Supporting the Efficacy of Mammographic Screening
The benefit of service screening was confirmed in a more recent review of seven counties in Sweden. This study, encompassing approximately one third of the Swedish population, found that breast cancer deaths had decreased by as much as 44% for women who underwent screening [37]. The death rate from breast cancer in the United States, which had remained unchanged for more than 50 years, began an abrupt downturn around 1989-1990, and the decrease has continued for more than 10 years [38]. This sudden downturn in deaths clearly mirrors the start of screening in the United States that occurred around 1983-1985, as indicated by the abrupt increase in incidence that took place at that time [39], which is exactly the time frame that would be expected. Better therapies may be helping, but most of the benefit is clearly a result of earlier detection.
In February 2002, a meeting was held in Stockholm to permit Gotzsche and Olsen to present their concerns and to permit the members of the trials to answer their critics. Apparently, it was clear that the concerns of Gotzsche and Olsen were unfounded and their rejection of the benefit of screening unsupported [34]. In March, the Swedish and Dutch governments rejected Gotzsche and Olsen's analysis and reaffirmed the value of mammographic screening. The updated analysis of the Swedish trials showed a 21% statistically significant mortality reduction for "invitation" to mammographic screening [34].
Many analysts and clinicians overlook the fact that these trials measured the benefit of women having been "invited" to the screening. The trials do not take into account noncompliance and "contamination." Women who died of breast cancer in the screened group who had refused screening are still counted as having been screened, and women in the control group whose lives were saved by mammography they underwent outside of the trial are still counted as unscreened control subjects. Thus, the true mortality reduction is likely even greater than that reported by the trials.
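The dilution described above can be sketched with a simple model. It assumes that the benefit accrues only to women who are actually screened and that attenders and non-attenders share the same underlying risk of dying from breast cancer. Both are simplifying assumptions, and the attendance and contamination rates used below are hypothetical, not values taken from the trials. Only the 21% figure for invitation comes from the text.

```python
def observed_relative_risk(true_reduction: float, attendance: float,
                           contamination: float) -> float:
    """Relative risk seen when comparing invited vs control groups.

    true_reduction: mortality reduction among women actually screened
    attendance:     fraction of the invited group that is screened
    contamination:  fraction of the control group screened outside the trial
    """
    invited = 1.0 - attendance * true_reduction
    control = 1.0 - contamination * true_reduction
    return invited / control


def implied_true_reduction(observed_reduction: float, attendance: float,
                           contamination: float) -> float:
    """Invert the model: reduction among screened women implied by the
    reduction observed for invitation."""
    return observed_reduction / (
        attendance - (1.0 - observed_reduction) * contamination
    )


if __name__ == "__main__":
    # 21% is the reduction for invitation reported in the text;
    # 75% attendance and 15% contamination are hypothetical.
    r = implied_true_reduction(0.21, attendance=0.75, contamination=0.15)
    print(f"Implied reduction among women actually screened: {r:.0%}")
```

Under this model, any attendance below 100% or any contamination above zero makes the reduction among women who are actually screened larger than the reduction measured for invitation.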
Medical journals should end their tabloid policies of sending press releases about embargoed information. There is rarely a reason for a medical finding to be released as an emergency. Controversial issues should be analyzed at a scientific level, and a rush to involve the media should be avoided. Given the pressures to be the first with a story, it is clear that many in the media have forsaken the search for truth in favor of more publicity. Medicine and science should not be lured down the same path.