Seminar #61: Controversies in randomized trials
There are many controversies in the use of randomized trials to develop new research models. We will examine some of the strengths and weaknesses of the randomized trial and highlight research situations where a randomized trial is unethical, impractical, or unrealistic.
In this class you will learn how to:
- understand the ethical constraints that prevent randomization;
- highlight the special problems with conducting randomized trials of alternative medicine; and
- explain the importance of temporality in establishing a cause and effect relationship.
This class qualifies for 1 IRB Education Credit (IRBEC).
Contents
- Abstract
- Where can you find this handout?
- Criticisms of randomized clinical trials
- Temporality of causes
- Constraints on randomization
Where can you find this handout?
This handout and the handouts that I use for all of my seminars and training classes are a compilation of individual web pages at www.childrensmercy.org/stats. I use the "Include Page" feature of Microsoft FrontPage to combine these into a single page. You can always find the most recent version of this compilation by going to the web address listed at the bottom of this page. Links for the handouts for other seminars and classes appear at www.childrensmercy.org/stats/training.asp.
Why don't I use PowerPoint?
I stopped using PowerPoint for my presentations in the mid-1990s. This was based on Edward Tufte's advice that presenting information in a paper handout is more effective than presenting the information on a projected screen. I found this to be excellent guidance. I enjoy talking when I don't have to wrestle with a laptop computer. I look at my audience more and interact with them better. I elaborate on this at www.childrensmercy.org/stats/weblog2004/powerpoint.asp.
Criticisms of Randomized Clinical Trials (April 7, 2004)
While surfing the web, I found out about a book, Fiction and Fantasy in Medical Research: The Large-Scale Randomised Trial by James Penston (2003, The London Press, London, England. ISBN: 0-9544636-1-7). I picked up a copy but have not read it yet. The blurb on the back cover reads:
"Every day, millions of patients throughout the world take treatment which is based on the results of large-scale randomised trials. But, how much do we really know about these studies? This book exposes the serious flaws in this method of medical research. Although making vast profits for the pharmaceutical industry, large-scale randomized trials do little to improve the lives of patients and are responsible for an enormous waste of scarce health care resources."
Wow! That's quite an indictment. When I am done reading the book, I'll post a short review on this page. But Dr. Penston is not the first critic in this area.
The Association for Human Resource Protection printed a stinging critique of randomized clinical trials (RCTs) in the study of psychiatric drugs. An article in the June 2001 issue of the Journal of Clinical Epidemiology ([Medline]) asks whether the RCT is the gold standard or the golden calf. I myself discuss the strengths and weaknesses of the RCT ([Medline]) and demote it from a gold standard to a silver standard.
Perhaps the sharpest criticisms of RCTs, however, come from proponents of complementary and alternative medicine (CAM). They consider RCTs to be "reductionist" because they fail to look at the whole patient and reduce that patient to a single dimension. Mason et al [Medline] give a balanced perspective on this controversy. They point out that:
"...many practitioners argue that research methods dissect their practice in a reductionist manner and fail to take into account complementary medicine's holistic nature."
They argue that RCTs have to be adapted to the special features of CAM. In particular, the tendencies of RCTs and CAM are often in conflict.
- RCTs focus on a single disease, but CAM is used for more general problems and conditions.
- RCTs require tightly standardized treatment regimens, but CAM tailors the treatment to individual patients.
- RCTs attempt to remove practitioner effects from the design, but CAM relies on the relationship between the patient and the practitioner.
- RCTs focus on a single intervention, but CAM uses multiple interventions simultaneously.
- RCTs focus on easily quantifiable outcomes, but CAM tries to produce more subtle effects such as spiritual change or personal growth.
- RCTs focus on short term changes, but CAM aims for long term healing.
Note that these are tendencies. Some RCTs focus on more than one disease, but the tendency is to focus on a single disease. Some types of CAM are standardized, but the tendency is to offer individualized therapies.
It's not just CAM that exhibits these conflicts, though. The Medical Research Council wrote a report in April 2000 ([pdf]) that discusses the evaluation of complex interventions where it is difficult to isolate the individual components of the intervention. They mention several examples.
Does a physiotherapist contribute significantly to the management of knee injuries? This role goes beyond a simple sequence of exercises.
The package of care to treat a knee injury may be quite straightforward and easily definable - and therefore reproducible: “This series of exercise in this order with this frequency for this long, with the following changes at the following stages”. However, the physiotherapist may have, in addition to the exercises, a psychotherapy role in rebuilding the patient's confidence, a training role teaching their spouse how to help with care or rehabilitation, and potentially significant influence via advice on the future health behaviour of the patient. Each of these elements may be an important contribution to the effectiveness of a physiotherapy intervention.
How does a stroke unit improve the quality of care for stroke patients? The concept of a stroke unit is difficult to standardize.
For example, although research suggests that stroke units work, what, exactly, is a stroke unit? What are the active ingredients that make it work? The physical set-up? The mix of care providers? The skills of the providers? The technologies available? The organizational arrangements?
How does cognitive behavioral therapy work? This approach is highly individualistic.
Does success depend on the personality of the therapist? The personality, health status, social status, or other characteristic of the patient? The content of the therapy? The way it is delivered? The frequency of contact? The location of contact? The duration and the timing? What other components count?
Rather than arguing that RCTs need to be adapted to the special needs of CAM, perhaps RCTs should be adapted to meet the special needs of many types of medical interventions.
Furthermore, the claim that a practice is holistic should not be used as an excuse to blithely disregard evidence from an overly simplistic RCT. Perhaps the RCT can get to the heart of the issue by focusing on a single key dimension of the problem. A fourth grade student, Emily Rosa, evaluated Therapeutic Touch (TT) for a science fair project. This project was highlighted on the Public Broadcasting Service show "Scientific American Frontiers", was published in the April 1, 1998 issue of JAMA ([Medline]), and received a lot of press coverage (CNN has a very nice story).
Therapeutic Touch is a therapy to improve health through the manipulation of the human energy field. There apparently is no physical touching. The official website on therapeutic touch describes it as:
"...an intentionally directed process of energy exchange during which the practitioner uses the hands as a focus to facilitate the healing process. It is a contemporary interpretation of several ancient healing practices. Therapeutic Touch is a scientifically-based practice founded on the premise that the human body, mind, emotions and intuition form a complex, dynamic energy field. The human energy field is governed by pattern and order. In health, the field is balanced, however in disease, the energy is characterized by imbalance and disorder."
Emily Rosa's experiment was very simple, perhaps too simple. If practitioners of Therapeutic Touch are able to manipulate energy fields, they must first be able to detect energy fields. She would hold her hand above either the left or right hand of the practitioner and ask him/her to tell which hand. The choice of hand was randomly determined by a coin flip. A screen with two holes in it prevented the practitioner from seeing what was going on.
Emily Rosa got 21 experienced practitioners to agree to the test. They were right only 44% of the time. Did this simple experiment disprove the healing power of TT? Perhaps not. TT is a complex intervention and this experiment only looked at a single aspect of it.
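As a rough check on the 44% figure, here is a minimal sketch of an exact one-sided binomial test against coin-flip guessing. The handout does not state how many trials were run, so the trial count below (ASSUMED_TRIALS) is an illustrative assumption, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# ASSUMPTION: the handout reports only "44% correct", not the number of trials.
# 280 is an illustrative value, not a figure taken from the text above.
ASSUMED_TRIALS = 280
correct = round(0.44 * ASSUMED_TRIALS)

# One-sided test: do practitioners do BETTER than guessing (p = 0.5)?
p_value = binomial_upper_tail(correct, ASSUMED_TRIALS)
print(f"{correct}/{ASSUMED_TRIALS} correct, one-sided p-value = {p_value:.3f}")
```

A large p-value here means the data give no support to the claim that practitioners can detect the hand better than a coin flip would. It does not, by itself, prove that they cannot.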
The experiment does shift the burden of proof, however. Detection of energy fields is a fundamental aspect of TT that all other aspects of this therapy rely on. How can practitioners of TT manipulate energy fields that they cannot even detect? Any further research should be discontinued until practitioners of TT can demonstrate the ability to detect energy fields in a rigorous blinded study.
Larry Sarner (Emily Rosa's step-father) makes much the same point in an article on the Quackwatch web site that responds to criticisms of the Rosa study. In particular, he responds to the criticism of reductionism:
[Critical comment #5] This was not a test of TT, but a parlor game. What the practitioners were required to do during the experiment invalidated its applicability to TT, especially since TT is a holistic process and can't be validly analyzed in parts.
Emily's test was not of efficacy or technique (or "healing"), but of raw ability. It's very much like testing a surgeon to see if he can tell, without looking, in which hand the scalpel is being held. In any event, there was some movement. Emily presented her hand after each coin flip, which required relative movement between her hands and the subject's. Both subjects and Emily had at least small movements of their hands during the trials, and some practitioners even wiggled their fingers or hands. Previous descriptions of the sensations of feeling an HEF state that the field itself is constantly in motion, and the literature states that such motion can be easily felt. Significantly, all of Emily's subjects agreed to the protocol and none voiced any concern that the test setup would pose a problem in demonstrating their ability. The argument about TT being "holistic" is a thinly disguised attempt to get back to "outcome" (i.e., clinical) testing, where it is easier to obfuscate, ignore negative results, or explain away nonconforming data. There have been numerous clinical trials on outcomes using TT. The results are highly mixed. Some tests do not have statistically significant results, others revealed slight positive effects (though statistically significant), and several actually reported statistically significant effects, but negative (i.e., the control group did better than the TT group). Holistic practitioners' prejudice against what they call "reductionism" (analyzing things in parts) is not shared by others in scientific medicine.
There is, by the way, a huge financial incentive to demonstrate the ability to detect energy fields. The James Randi Education Foundation offers a one million dollar prize to anyone who can show, under carefully controlled conditions, evidence of any paranormal, supernatural, or occult power or event. James Randi himself says that TT as well as several other alternative medicine therapies (Iridology, Reiki, Homeopathy and Applied Kinesiology) would qualify for the challenge.
This webpage was written by Steve Simon and was last modified on 07/08/2008. Category: Randomized trials
Temporality of Causes (April 7, 2004).
One of the nine criteria that Sir Austin Bradford Hill offers to establish a cause and effect relationship is temporality. In order for A to cause B, A must precede B in time. That seems logical enough, but every once in a while, you find someone who ignores temporality.
The classic joke along these lines was a statistician who was studying fire department records and concluded that the more fire engines you sent to a fire, the more damage they caused.
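The joke can be made concrete with a small simulation. This is only a sketch with invented numbers: the size of the fire comes first and drives both the number of engines sent and the damage, while the engines themselves have no effect on damage at all. All parameter values and function names are assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_fires(n_fires=2000, seed=61):
    """Return (size, engines, damage) tuples; all numbers are made up."""
    rng = random.Random(seed)
    fires = []
    for _ in range(n_fires):
        size = rng.uniform(1, 10)            # the fire happens first
        engines = size + rng.gauss(0, 1)     # dispatch responds to fire size
        damage = 10 * size + rng.gauss(0, 5) # damage depends on size only
        fires.append((size, engines, damage))
    return fires


fires = simulate_fires()
overall = pearson([f[1] for f in fires], [f[2] for f in fires])

# Compare only fires of similar size: the apparent "effect" of engines fades.
similar = [f for f in fires if 5.0 <= f[0] < 5.5]
within = pearson([f[1] for f in similar], [f[2] for f in similar])

print(f"engines vs damage, all fires:        r = {overall:.2f}")
print(f"engines vs damage, similar-size only: r = {within:.2f}")
```

Across all fires the correlation between engines and damage is strong, even though the simulated engines do nothing to the damage. Among fires of similar size it should be much weaker.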
When scientists were first establishing that smoking causes lung cancer, some people offered the counter argument that cancer causes smoking. It's a difficult argument to make, but it was put forth with perfect seriousness. We all know that smoking preceded cancer, usually by several decades. The argument was that there were genetic tendencies towards cancer and perhaps these same tendencies also were related to the tendency to become addicted to nicotine. It's pretty easy to demolish this argument, of course.
Victor Stenger highlights another example in an article in the March 2004 issue of Skeptical Briefs. There was a cute article in the 2001 year-end issue of the British Medical Journal titled: "Beyond Science? Effects of remote, retroactive intercessory prayer on outcomes in patients with bloodstream infection: randomized controlled trial." ([Medline]). The author, Leonard Leibovici, found the records of 3,393 adult patients with bloodstream infection at Rabin Medical Center, randomly divided those records into two groups, and then randomly selected one group to pray over. Note that this is a RETROSPECTIVE study. The outcomes were determined 4 to 10 years prior to the start of the study. After praying over one group, the author evaluated mortality, stay in the hospital, and duration of fever. Although there was no statistically significant difference in mortality (p=0.40), there was a reduction in the length of stay (p=0.01) and in duration of fever (p=0.04).
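The two-step allocation described above (split the records at random, then pick at random which group receives the intervention) might look something like the following. This is a sketch of the general idea, not the procedure the author actually used; the record identifiers, the even split, and the function name are all assumptions.

```python
import random


def allocate(record_ids, seed=None):
    """Randomly split records into two groups, then randomly pick which
    group is the intervention group. Returns (intervention, control)."""
    rng = random.Random(seed)
    shuffled = list(record_ids)
    rng.shuffle(shuffled)                 # step 1: random division
    half = len(shuffled) // 2
    group_a, group_b = shuffled[:half], shuffled[half:]
    if rng.random() < 0.5:                # step 2: "coin flip" picks the group
        return group_a, group_b
    return group_b, group_a


# Hypothetical record identifiers for the 3,393 patients mentioned above.
record_ids = range(1, 3394)
intervention, control = allocate(record_ids, seed=2001)
print(len(intervention), len(control))
```

Fixing the seed makes the allocation reproducible, which is helpful if an independent reviewer needs to audit how the groups were formed.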
It is dangerous to speculate on the motives of the author, but this article was published in a year-end issue that is traditionally reserved for light-hearted articles.
Brian Olshansky and Larry Dossey treated the paper with quite a bit more seriousness, and in an article in the December 20, 2003 issue of BMJ argue for various physical mechanisms that could explain this result. They first argue that our understanding of how the world truly works is incomplete. For example, there is a classic result in physics where particles separated by vast distances of space can influence one another. There is no good explanation for how this occurs, therefore we shouldn't be too upset if there is no good explanation for how retrospective prayer would work.
They then cite obscure theoretical models in physics (bosonic string quantum mechanics and Calabi-Yau space) and research at the Institute of Transpersonal Psychology and the William James Center for Consciousness Studies that show retroactive influences in 10 out of a series of 19 experiments.
In my opinion, Olshansky and Dossey are making much too big a fuss about an article that was not intended to be serious research. This article was published in the same spirit as the Canadian Medical Association article, Celestial Determinants of Success in Research ([Medline]), that demonstrated an association between certain Zodiac signs and the likelihood of having received the Nobel Prize in Medicine and Physiology. Rather than demonstrating the validity of Astrology, this article was intended to illustrate that
foraging through databases, using contrived study designs in the absence of biological mechanistic data, sometimes yields spurious results.
The Leibovici article was clearly an effort to show the limitations of research methods by an example where a randomized controlled trial (RCT) shows a result that is truly bizarre. The results of this study should force us to confront the weaknesses of the RCT rather than the limitations of bosonic string quantum mechanics.
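One way a properly randomized trial can still produce a bizarre result is simple chance spread across several outcomes. The sketch below is an illustration only, not an analysis from the article: it assumes that the three outcomes listed earlier were the only ones tested and that the tests are independent, and it applies a Bonferroni correction, a technique that the discussion above does not mention.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among n independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_significant(p_values: dict, alpha: float = 0.05) -> dict:
    """Flag each p-value against the Bonferroni threshold alpha / n_tests."""
    threshold = alpha / len(p_values)
    return {name: p <= threshold for name, p in p_values.items()}


# p-values as reported in the discussion of the Leibovici article.
reported = {"mortality": 0.40, "length of stay": 0.01, "duration of fever": 0.04}

print(f"Chance of a spurious 'hit' in 3 tests: {familywise_error_rate(3):.3f}")
for outcome, significant in bonferroni_significant(reported).items():
    print(f"{outcome}: {'significant' if significant else 'not significant'}")
```

Under these assumptions, a treatment that does nothing has roughly a one in seven chance of showing at least one "significant" outcome out of three.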
Furthermore, Olshansky and Dossey ignore a far simpler explanation of the unusual results: fraud. I have no reason to believe that Leonard Leibovici would manufacture these findings for personal gain, but quite frankly, the field of paranormal research, as a whole, has a quite troubling reputation for fraudulent claims. For that matter, the field of medicine has a quite troubling reputation for fraudulent claims. I recently received a recommendation for a book, Betrayers of the Truth by William Broad and Nicholas Wade that highlights that research fraud is a pervasive problem.
The best protection against fraud, of course, is replication by an independent group. This replication has not been done, perhaps because most people recognized that the Leibovici article was not intended to be serious research.
Olshansky and Dossey call on all of us to be open minded about the possibility of retrospective effects of prayer. But if they are willing to abandon commonly accepted principles about the temporality of causes and effects, without first asking for independent replication of this unusual result, perhaps they themselves are being a bit too open minded.
Some additional comments about the retrospective prayer article are worth noting. In the Rapid Response e-letters section of BMJ, Martin Bland raised the ethical issue of treating the control group.
According to Clause 30 of the latest revision of the Declaration of Helsinki: At the conclusion of the study, every patient entered into the study should be assured of access to the best proven prophylactic, diagnostic and therapeutic methods identified by the study. To meet this ethical standard, the prayer should now be said for the control group. If the treatment is effective, this should have the effect of removing the difference between the groups. I await the results with interest.
In separate e-letters, Eugenio Pucci and Christopher Price raise a different issue: informed consent. The author did not ask consent from the patients before conducting this experiment. Consent is often waived in retrospective studies because the research would otherwise be impractical. Surprisingly, informed consent has been waived for most PROSPECTIVE studies of prayer, where consent is actually quite easy to obtain. The rationale for waiving informed consent was to avoid volunteer bias, and perhaps this was also tempered by the belief that most people would not be offended at the thought of prayers being offered on their behalf.
As another aside, I was involved with a similar study (prospective, not retrospective). We planned this study using a one-sided hypothesis (remote prayer has a positive effect on health). The Institutional Review Board suggested changing this to a two-sided hypothesis (remote prayer has either a positive or a negative effect on health). Thankfully, we did not observe an outcome in the opposite tail as that would have been very difficult to explain.
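The difference between the two hypotheses can be sketched with a standard normal test statistic. This is a generic illustration, not the analysis we used in that study; the z values are invented and the function names are hypothetical.

```python
from math import erfc, sqrt


def one_sided_p(z: float) -> float:
    """P(Z >= z): counts only evidence of a positive effect."""
    return 0.5 * erfc(z / sqrt(2))


def two_sided_p(z: float) -> float:
    """P(|Z| >= |z|): counts evidence of an effect in either direction."""
    return erfc(abs(z) / sqrt(2))


# Invented test statistics: one in the expected direction, one opposite.
for z in (1.96, -2.5):
    print(f"z = {z:+.2f}: one-sided p = {one_sided_p(z):.4f}, "
          f"two-sided p = {two_sided_p(z):.4f}")
```

With a one-sided hypothesis, a large result in the opposite tail produces a p-value near 1 and is simply "not significant." With a two-sided hypothesis, the same result is significant and has to be explained.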
This webpage was written by Steve Simon and was last modified on 07/08/2008. Category: Corroborating evidence
Constraints on Randomization (March 23, 2004).
I received an inquiry from our Institutional Review Board about a study that was undergoing continuing review. This was a non-randomized comparison of two surgical techniques.
The first question you might ask is: "Why didn't they randomize the treatments?" In general, it is difficult to randomize in a surgery trial. In a 2002 BMJ article [Medline], McCulloch offers a wide range of reasons.
- Unlike drug trials, there are no large commercial sponsors of surgical research.
- Emergency surgery often makes consent and randomization impossible.
- Many surgical techniques are rare and cannot accumulate sufficient sample sizes.
- The learning curve for new techniques makes it impractical to use two different surgical methods in the same practice.
- Outcomes may be more strongly related to the skill of the particular surgeon rather than the technique used.
- Variations in an operation may occur during the operation itself and these variations can often influence the outcome.
- While it is possible to monitor the purity of a drug, it is difficult to assess the technical quality of an operation.
- Patients may be unwilling to allow randomization to choose the type of surgery they will receive.
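To give a feel for why rare procedures struggle to accumulate sufficient sample sizes, here is a minimal sketch using the common normal-approximation formula for comparing two proportions. The complication rates, significance level, and power are hypothetical values chosen for illustration, not figures from McCulloch's article.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed per group to detect a difference between
    two proportions with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical complication rates: 10% with the old technique, 5% with the new.
n = n_per_group(0.10, 0.05)
print(f"About {n} patients per group")
```

If a procedure is performed only a few dozen times a year at any one center, a requirement of several hundred patients per arm means many years of recruitment or a large multi-center collaboration.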
This last point was also emphasized in a 1994 World J Surg article [Medline], where Plaisier noted that most patients did not want to participate in a randomized study comparing extracorporeal shock wave lithotripsy with open cholecystectomy, and that a newer technique, laparoscopic cholecystectomy, was so popular that it would probably be impossible to ever run a randomized trial for this surgery.
In a 1984 NEJM article [Medline], Taylor noticed poor enrollment in a randomized comparison of mastectomy approaches and asked physicians why. Some of the reasons offered included:
(1) concern that the doctor-patient relationship would be affected by a randomized clinical trial (73 per cent);
(2) difficulty with informed consent (38 per cent);
(3) dislike of open discussions involving uncertainty (22 per cent);
(4) perceived conflict between the roles of scientist and clinician (18 per cent);
(5) practical difficulties in following procedures (9 per cent); and
(6) feelings of personal responsibility if the treatments were found to be unequal (8 per cent).
In a 1995 BMJ editorial [Medline], Russell argues that most of the constraints on randomization in surgical trials can be overcome with sufficient effort.
The use of placebos in randomized trials of surgery is extremely controversial. I talk about this at length on my page about placebos, and will not repeat the discussion here.
Lots of other areas of medicine have trouble with recruiting patients for randomized trials (would you volunteer for a randomized trial of birth control methods, especially if you knew that one of the arms of the study was a placebo arm?). Still, the problems are thoughtfully documented for surgery trials, and this offers us a good example of some of the constraints on randomization.
