Miscellaneous resources for "Statistical Evidence."
This is a list of references that helped me in writing my book, "Statistical Evidence."
evidence >> apples >> adjustment (16)
Are sex and death related? Study failed to adjust for an important confounder [letter; comment]. David Batty. British Medical Journal 1998: 316(7145); 1671; discussion 1672. Abstract not available. [Full text]
Conditions for confounding of the risk ratio and of the odds ratio. J. F. Boivin, S. Wacholder. American Journal of Epidemiology 1985: 121(1); 152-8. There are disagreements in the literature about the criteria to be used to ascertain whether or not a measure of association is confounded. The authors postulate the general principle that a crude unconfounded measure of association is structured as a weighted average of the stratum-specific values of the measure. They examine the relationships between stratum-specific measures of association, crude overall measures, and weighted averages of stratum-specific measures, and indicate how these relationships may be used to define criteria for the assessment of confounding in cohort studies in which the exposure, disease, and stratification variables are classified dichotomously. The criteria presented differ for the risk ratio and for the disease-odds ratio. In other words, one can reach different conclusions about the confounding effect of a given extraneous variable, depending on which measure of association is chosen. This view differs from that of Miettinen and Cook (Confounding: essence and detection. Am J Epidemiol 1981;114:593-603) who postulated one set of criteria for the assessment of confounding, which was applicable to both measures of association. These different approaches may lead to different conclusions about the presence or absence of confounding. [Medline]
Maternal smoking and Down syndrome: the confounding effect of maternal age. C. L. Chen, T. J. Gilbert, J. R. Daling. Am J Epidemiol 1999: 149(5); 442-6. Inconsistent results have been reported from studies evaluating the association of maternal smoking with birth of a Down syndrome child. Control of known risk factors, particularly maternal age, has also varied across studies. By using a population-based case-control design (775 Down syndrome cases and 7,750 normal controls) and Washington State birth record data for 1984-1994, the authors examined this hypothesized association and found a crude odds ratio of 0.80 (95% confidence interval 0.65-0.98). Controlling for broad categories of maternal age (<35 years, ≥35 years), as described in prior studies, resulted in a negative association (odds ratio = 0.87, 95% confidence interval 0.71-1.07). However, controlling for exact year of maternal age in conjunction with race and parity resulted in no association (odds ratio = 1.00, 95% confidence interval 0.82-1.24). In this study, the prevalence of Down syndrome births increased with increasing maternal age, whereas among controls the reported prevalence of smoking during pregnancy decreased with increasing maternal age. There is a substantial potential for residual confounding by maternal age in studies of maternal smoking and Down syndrome. After adequately controlling for maternal age in this study, the authors found no clear relation between maternal smoking and the risk of Down syndrome.
Look before You Leap: Stratify before You Standardize. Bernard C.K. Choi. American Journal of Epidemiology 1999: 149(12); 1087-1095. ABSTRACT: This paper presents a mathematical model to show the conditions in which age standardization can be used to summarize age-specific rates for comparison purposes over calendar time. It shows that the conditions for valid comparison depend on the type of measure used for comparison, that is, difference, ratio, or percent change. If the measure for comparison is a difference of the standardized rates at two time points, then the age-specific rates need to maintain a constant rate difference over time for the comparison to be valid. If the measure for comparison is a ratio or percent change of the standardized rates at two time points, then the age-specific rates need to maintain a constant rate ratio over time for the comparison to be valid. Since in reality, as shown by our Canadian empirical data, age-specific rates do not always maintain a consistent pattern over time, it is recommended that one should always stratify the data to look at patterns of age-specific rates before applying age standardization.
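(A minimal sketch of the point made in the Choi abstract above, using direct standardization. The age weights and age-specific rates are invented for illustration and are not taken from the paper; the function name is likewise hypothetical. It shows that a constant age-specific difference carries over to the difference of standardized rates, a constant age-specific ratio carries over to the ratio, and a mixed pattern can hide opposite trends behind a small overall change.)

```python
def direct_standardized_rate(age_specific_rates, standard_weights):
    """Weighted average of age-specific rates, using standard population weights."""
    total_weight = sum(standard_weights)
    weighted_sum = sum(r * w for r, w in zip(age_specific_rates, standard_weights))
    return weighted_sum / total_weight


# Hypothetical standard population weights for three age groups (assumption).
weights = [0.5, 0.3, 0.2]

# Hypothetical age-specific rates per 100,000 at time 1 (assumption).
rates_t1 = [10.0, 40.0, 100.0]

# Time 2 under three scenarios.
rates_const_diff = [r + 5.0 for r in rates_t1]   # same absolute change in every age group
rates_const_ratio = [r * 1.2 for r in rates_t1]  # same relative change in every age group
rates_mixed = [5.0, 40.0, 120.0]                 # young fall, old rise

std_t1 = direct_standardized_rate(rates_t1, weights)
std_diff = direct_standardized_rate(rates_const_diff, weights)
std_ratio = direct_standardized_rate(rates_const_ratio, weights)
std_mixed = direct_standardized_rate(rates_mixed, weights)

# Constant age-specific difference: the standardized difference matches it.
print("difference:", std_diff - std_t1)
# Constant age-specific ratio: the standardized ratio matches it.
print("ratio:", std_ratio / std_t1)
# Mixed pattern: one summary number conceals opposite age-specific trends.
print("mixed change:", std_mixed - std_t1)
```

In the mixed scenario the standardized rate moves only slightly even though the youngest group's rate halves and the oldest group's rate rises by a fifth, which is the reason the paper recommends inspecting the strata first.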
Presenting statistical uncertainty in trends and dose-response relations. S Greenland, KB Michels, JM Robins, C Poole, WC Willett. AJE 1999: 149(12); 1077-86. ABSTRACT: When one estimates the effects of a polytomous exposure, it is common practice to express all effects relative to a baseline or reference level. Certain authors have challenged this practice and proposed alternatives, which we review here. One alternative, the "floating absolute risk" method, can supply useful statistics and trend graphs, but it does not yield valid confidence intervals for relative risks. All categorical methods have further shortcomings when the exposure is continuous, however. These shortcomings can be addressed by plotting or tabulating confidence limits for points on a flexible curve fitted to the uncategorized data.
Patient volume, staffing, and workload in relation to risk-adjusted outcomes in a random stratified sample of UK neonatal intensive care units: a prospective evaluation. The UK Neonatal Staffing Study Group. Lancet 2002: 359; 99-107. BACKGROUND: UK recommendations suggest that large neonatal intensive-care units (NICUs) have better outcomes than small units, although this suggestion remains unproven. We assessed whether patient volume, staffing levels, and workload are associated with risk-adjusted outcomes, and with costs or staff wellbeing. METHODS: 186 UK NICUs were stratified according to volume of patients, nursing provision, and neonatal consultant provision. Primary outcomes were hospital mortality, mortality or cerebral damage, and nosocomial bacteraemia. We studied 13 515 infants of all birthweights consecutively admitted to 54 randomly selected NICUs. Multiple logistic regression analyses were done with every primary outcome as the dependent variable. Staff wellbeing and stress were assessed by anonymous mental health index (MHI)-5 questionnaires. FINDINGS: Data were available for 13 334 (99%) infants. High-volume NICUs treated the sickest infants and had highest crude mortality. Risk-adjusted mortality and mortality or cerebral damage were unrelated to patient volume or staffing provision; however, nosocomial bacteraemia was less frequent in NICUs with low neonatal consultant provision (odds ratio 0·65, 95% CI 0·43-0·98). Mortality was raised with increasing workload in all types of NICUs. Infants admitted at full capacity versus half capacity were about 50% more likely to die, but there was wide uncertainty around this estimate. Most staff had MHI-5 scores that suggested good mental health. INTERPRETATION: The implications of this report for staffing policy, medicolegal risk management, and ethical practice remain to be tested. Centralisation of only the sickest infants could improve efficiency, provided that this does not create excessive workload for staff. Assessment of increased staffing levels that are closer to those in adult intensive care might be appropriate.
Causal Knowledge as a Prerequisite for Confounding Evaluation: An Application to Birth Defects Epidemiology. Miguel A. Hernán, Sonia Hernández-Díaz, Martha M. Werler and Allen A. Mitchell. Am. J of Epidemiology 2002: 155(2); 176-184. Common strategies to decide whether a variable is a confounder that should be adjusted for in the analysis rely mostly on statistical criteria. The authors present findings from the Slone Epidemiology Unit Birth Defects Study, 1992–1997, a case-control study on folic acid supplementation and risk of neural tube defects. When statistical strategies for confounding evaluation are used, the adjusted odds ratio is 0.80 (95% confidence interval: 0.62, 1.21). However, the consideration of a priori causal knowledge suggests that the crude odds ratio of 0.65 (95% confidence interval: 0.46, 0.94) should be used because the adjusted odds ratio is invalid. Causal diagrams are used to encode qualitative a priori subject matter knowledge.
Socioeconomic status and health in blacks and whites: the problem of residual confounding and the resiliency of race. J. S. Kaufman, R. S. Cooper, D. L. McGee. Epidemiology 1997: 8(6); 621-8. A large number of epidemiologic studies have focused on racial/ethnic differences, particularly between blacks and whites. Because health endpoints and racial categorizations are associated with socioeconomic status, investigators generally adjust for socioeconomic indicators. The intention is usually to control for confounding, thereby making groups comparable and excluding socioeconomic status as an alternative explanation to hypotheses of innate physiologic differences. A threat to the validity of these analyses is therefore the presence of residual confounding. We identify four potential sources of residual confounding in this analytical design: categorization of socioeconomic status variables, measurement error in socioeconomic indicators, use of aggregated socioeconomic status measures, and incommensurate socioeconomic indicators. Using simulations and examples from the literature, we demonstrate that the effect of residual confounding is to bias interpretation of data toward the conclusion of independent racial/ethnic group effects. Investigators often refer to possible "genetic" differences on the basis of models that control for socioeconomic status. We propose that such conclusions on the basis of this analytical strategy are generally unwarranted. Racial/ethnic differences in disease are a pressing public health concern, but the current approach does not often provide a basis for inference about putative biological factors in the etiology of this disparity.
Dose-specific Meta-Analysis and Sensitivity Analysis of the Relation between Alcohol Consumption and Lung Cancer Risk. Jeffrey E. Korte, Paul Brennan, S. Jane Henley, Paolo Boffetta. Am. J of Epidemiology 2002: 155(6); 496-506. Alcohol drinking increases the risk of several types of cancer, but studies of the relation between alcohol and lung cancer risk are complicated by smoking. The authors carried out meta-analyses for four study designs and conducted sensitivity analyses to assess the results. Pooled smoking-unadjusted relative risks (RRs) for brewery workers and alcoholics were 1.17 (95% confidence interval (CI): 0.99, 1.39) and 1.99 (95% CI: 1.66, 2.39), respectively, relative to population rates. For cohort and case-control studies, the authors conducted dose-specific meta-analyses for ethanol consumption of 1–499, 500–999, 1,000–1,999, and ≥2,000 g/month, relative to nondrinking. Smoking-adjusted RRs for ascending dose groups in cohort studies were 0.98 (95% CI: 0.79, 1.21), 0.92 (95% CI: 0.81, 1.04), 1.04 (95% CI: 0.88, 1.22), and 1.53 (95% CI: 1.04, 2.25), respectively. Smoking-adjusted odds ratios for ascending groups in case-control studies were 0.63 (95% CI: 0.51, 0.78), 1.30 (95% CI: 0.98, 1.70), 1.13 (95% CI: 0.46, 2.75), and 1.86 (95% CI: 1.39, 2.49), respectively. Elevated odds ratios were seen for hospital-based case-control studies but not for population-based case-control studies. Sensitivity analyses indicated that smoking explained the elevated RRs in studies of alcoholics and that strong misclassification of smoking status could produce an elevated smoking-adjusted RR in cohort and case-control studies. Overall, evidence for a smoking-adjusted association between alcohol and lung cancer risk is limited to very high consumption groups in cohort and hospital-based case-control studies. At lower levels, any associations observed appear to be explained by confounding.
How do risk factors work together? Mediators, moderators, and independent, overlapping, and proxy risk factors. H. C. Kraemer, E. Stice, A. Kazdin, D. Offord, D. Kupfer. Am J Psychiatry 2001: 158(6); 848-56. OBJECTIVE: The authors developed a methodological basis for investigating how risk factors work together. Better methods are needed for understanding the etiology of disorders, such as psychiatric syndromes, that presumably are the result of complex causal chains. METHOD: Approaches from psychology, epidemiology, clinical trials, and basic sciences were synthesized. RESULTS: The authors define conceptually and operationally five different clinically important ways in which two risk factors may work together to influence an outcome: as proxy, overlapping, and independent risk factors and as mediators and moderators. CONCLUSIONS: Classifying putative risk factors into these qualitatively different types can help identify high-risk individuals in need of preventive interventions and can help inform the content of such interventions. These methods may also help bridge the gaps between theory, the basic and clinical sciences, and clinical and policy applications and thus aid the search for early diagnoses and for highly effective preventive and treatment interventions.
Mediators and moderators of treatment effects in randomized clinical trials. H. C. Kraemer, G. T. Wilson, C. G. Fairburn, W. S. Agras. Arch Gen Psychiatry 2002: 59(10); 877-83. (Covariate adjustment is important, even in randomized trials and can identify important subgroups and mechanisms of action.) Randomized clinical trials (RCTs) not only are the gold standard for evaluating the efficacy and effectiveness of psychiatric treatments but also can be valuable in revealing moderators and mediators of therapeutic change. Conceptually, moderators identify on whom and under what circumstances treatments have different effects. Mediators identify why and how treatments have effects. We describe an analytic framework to identify and distinguish between moderators and mediators in RCTs when outcomes are measured dimensionally. Rapid progress in identifying the most effective treatments and understanding on whom treatments work and do not work and why treatments work or do not work depends on efforts to identify moderators and mediators of treatment outcome. We recommend that RCTs routinely include and report such analyses.
Baseline imbalance in randomised controlled trials. C Roberts, DJ Torgerson. British Medical Journal 1999: 319(7203); 185. Abstract not available yet. [Medline] [Full text] [PDF]
Sex and death: are they related? Findings from the Caerphilly cohort study. GD Smith, S Frankel, J Yarnell. British Medical Journal 1997: 315(7123); 1641-1644. ABSTRACT: OBJECTIVE: To examine the relation between frequency of orgasm and mortality. STUDY DESIGN: Cohort study with a 10 year follow up. SETTING: The town of Caerphilly, South Wales, and five adjacent villages. SUBJECTS: 918 men aged 45-59 at time of recruitment between 1979 and 1983. MAIN OUTCOME MEASURES: All deaths and deaths from coronary heart disease. RESULTS: Mortality risk was 50% lower in the group with high orgasmic frequency than in the group with low orgasmic frequency, with evidence of a dose-response relation across the groups. Age adjusted odds ratio for all cause mortality was 2.0 for the group with low frequency of orgasm (95% confidence interval 1.1 to 3.5, test for trend P = 0.02). With adjustment for risk factors this became 1.9 (1.0 to 3.4, test for trend P = 0.04). Death from coronary heart disease and from other causes showed similar associations with frequency of orgasm, although the gradient was most marked for deaths from coronary heart disease. Analysed in terms of actual frequency of orgasm, the odds ratio for total mortality associated with an increase in 100 orgasms per year was 0.64 (0.44 to 0.95). CONCLUSION: Sexual activity seems to have a protective effect on men's health. [Medline] [Abstract] [Full text]
Clinical trials in acute myocardial infarction: Should we adjust for baseline characteristics? Ewout W. Steyerberg, Patrick M.M. Bossuyt, Kerry L. Lee. American Heart Journal 2000: 139(5); 745-751. ABSTRACT: BACKGROUND: Clinical trials concerning acute myocardial infarction often evaluate short-term death. Several baseline characteristics are predictors of death, most notably age. Adjustment for one or more predictors in a multivariable analysis may be considered to correct the estimate of the treatment effect for any imbalance that by chance may have occurred between the randomized groups. Moreover, adjustment results in a stratified estimate of the effect of treatment. METHODS AND RESULTS: The effects of adjustment (correction for imbalance and stratification) were studied with logistic regression analysis in the Global Use of Strategies to Open Occluded Coronary Arteries (GUSTO)-I trial. The primary end point was 30-day death, which occurred in 6.3% of 10,348 patients randomly assigned to tissue plasminogen activator and 7.3% of 20,162 patients randomly assigned to streptokinase thrombolytic therapy. This is equivalent to an unadjusted odds ratio of 0.853. No significant imbalance had occurred for any of 17 baseline characteristics considered, including well-known demographic, presenting, and history characteristics. Adjusted for age, the odds ratio was 0.829, which is an 18% increase in estimated effect on the logistic scale. When adjusted for 17 characteristics, the odds ratio was 0.820, an increase of 25%. The increase in effect estimate was largely explained by the stratification effect and only partly by imbalance of predictors. CONCLUSIONS: Adjustment for predictive baseline characteristics, even when largely balanced, may lead to clearly different estimates of the treatment effect on mortality rates. Adjustment for important predictors such as age is recommended in clinical trials studying patients with acute myocardial infarction.
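(The arithmetic in the Steyerberg abstract above can be checked with a short sketch. It uses only the figures quoted in the abstract; the function names are my own. Because the event rates are quoted as rounded percentages, the recomputed unadjusted odds ratio comes out near 0.854 rather than exactly 0.853. The "increase in estimated effect on the logistic scale" is taken here to mean the relative change in the log odds ratio, which is an interpretation on my part, though it does reproduce the quoted 18% and 25%.)

```python
import math


def odds_ratio(p_treated, p_control):
    """Odds ratio computed from two event proportions."""
    odds_treated = p_treated / (1 - p_treated)
    odds_control = p_control / (1 - p_control)
    return odds_treated / odds_control


def pct_increase_on_log_scale(or_adjusted, or_unadjusted):
    """Percent change in the log odds ratio after adjustment."""
    return 100 * (math.log(or_adjusted) / math.log(or_unadjusted) - 1)


# 30-day death: 6.3% with tissue plasminogen activator, 7.3% with streptokinase.
unadjusted = odds_ratio(0.063, 0.073)
print(f"unadjusted OR from rounded rates: {unadjusted:.3f}")

# Adjusted odds ratios quoted in the abstract, compared with the quoted 0.853.
age_adjusted_change = pct_increase_on_log_scale(0.829, 0.853)
fully_adjusted_change = pct_increase_on_log_scale(0.820, 0.853)
print(f"adjusted for age: {age_adjusted_change:.0f}% larger effect on the log scale")
print(f"adjusted for 17 characteristics: {fully_adjusted_change:.0f}% larger effect on the log scale")
```

The odds ratios move only in the second decimal place, yet on the log odds scale (the scale on which logistic regression estimates the treatment effect) the change is a sizeable fraction of the whole effect.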
Research Methods: Why Covariance? A Rationale for Using Analysis of Covariance Procedures in Randomized Studies. Matthew J. Taylor. Journal of Early Intervention 1993: 17(4); 455-466. Abstract not available yet.
A comparison of direct adjustment and regression adjustment of epidemiologic measures. T. C. Wilcosky, L. E. Chambless. J Chronic Dis 1985: 38(10); 849-56. Although regression adjustment can provide a useful alternative to direct adjustment, especially when data are sparse, many researchers are unaware that adjusted summary measures can be easily derived from regression coefficients. In a non-technical discussion with examples, the direct adjustment procedure is compared with three methods of regression adjustment based on analysis of covariance models: the conditional prediction method, the stratified prediction method, and the marginal prediction method. Both the stratified prediction and direct adjustment methods yield summary measures that are weighted averages of stratum-specific measures, while adjusted measures from the conditional prediction method are similar to stratum-specific estimates. In contrast to the other adjustment procedures, which can use internal or external weights, the marginal prediction method always gives an internally adjusted measure. Under certain conditions, the three regression adjustment procedures produce identical results. Major advantages of direct adjustment include computational simplicity and relatively few statistical assumptions. Regression adjustment, however, is more convenient for statistical tests for interactions and group differences, and often precludes the need to categorize continuous variables, so that problems with empty strata are avoided.
evidence >> apples >> casecontrol (13)
Reye's syndrome in the United States from 1981 through 1997. E. D. Belay, J. S. Bresee, R. C. Holman, A. S. Khan, A. Shahriari, L. B. Schonberger. New England Journal of Medicine 1999: 340(18); 1377-82. BACKGROUND: Reye's syndrome is characterized by encephalopathy and fatty degeneration of the liver, usually after influenza or varicella. Beginning in 1980, warnings were issued about the use of salicylates in children with those viral infections because of the risk of Reye's syndrome. METHODS: To describe the pattern of Reye's syndrome in the United States, characteristics of the patients, and risk factors for poor outcomes, we analyzed national surveillance data collected from December 1980 through November 1997. The surveillance system is based on voluntary reporting with the use of a standard case-report form. RESULTS: From December 1980 through November 1997 (surveillance years 1981 through 1997), 1207 cases of Reye's syndrome were reported in patients less than 18 years of age. Among those for whom data on race and sex were available, 93 percent were white and 52 percent were girls. The number of reported cases of Reye's syndrome declined sharply after the association of Reye's syndrome with aspirin was reported. After a peak of 555 cases in children reported in 1980, there have been no more than 36 cases per year since 1987. Antecedent illnesses were reported in 93 percent of the children, and detectable blood salicylate levels in 82 percent. The overall case fatality rate was 31 percent. The case fatality rate was highest in children under five years of age (relative risk, 1.8; 95 percent confidence interval, 1.5 to 2.1) and in those with a serum ammonia level above 45 microg per deciliter (26 micromol per liter) (relative risk, 3.4; 95 percent confidence interval, 1.9 to 6.2). 
CONCLUSIONS: Since 1980, when the association between Reye's syndrome and the use of aspirin during varicella or influenza-like illness was first reported, there has been a sharp decline in the number of infants and children reported to have Reye's syndrome. Because Reye's syndrome is now very rare, any infant or child suspected of having this disorder should undergo extensive investigation to rule out the treatable inborn metabolic disorders that can mimic Reye's syndrome. [Abstract] [Full text] [PDF]
A case-control study of HIV seroconversion in health care workers after percutaneous exposure. Centers for Disease Control and Prevention Needlestick Surveillance Group. D. M. Cardo, D. H. Culver, C. A. Ciesielski, P. U. Srivastava, R. Marcus, D. Abiteboul, J. Heptonstall, G. Ippolito, F. Lot, P. S. McKibben, D. M. Bell. N Engl J Med 1997: 337(21); 1485-90. BACKGROUND: The average risk of human immunodeficiency virus (HIV) infection after percutaneous exposure to HIV-infected blood is 0.3 percent, but the factors that influence this risk are not well understood. METHODS: We conducted a case-control study of health care workers with occupational, percutaneous exposure to HIV-infected blood. The case patients were those who became seropositive after exposure to HIV, as reported by national surveillance systems in France, Italy, the United Kingdom, and the United States. The controls were health care workers in a prospective surveillance project who were exposed to HIV but did not seroconvert. RESULTS: Logistic-regression analysis based on 33 case patients and 665 controls showed that significant risk factors for seroconversion were deep injury (odds ratio= 15; 95 percent confidence interval, 6.0 to 41), injury with a device that was visibly contaminated with the source patient's blood (odds ratio= 6.2; 95 percent confidence interval, 2.2 to 21), a procedure involving a needle placed in the source patient's artery or vein (odds ratio=4.3; 95 percent confidence interval, 1.7 to 12), and exposure to a source patient who died of the acquired immunodeficiency syndrome within two months afterward (odds ratio=5.6; 95 percent confidence interval, 2.0 to 16). The case patients were significantly less likely than the controls to have taken zidovudine after the exposure (odds ratio=0.19; 95 percent confidence interval, 0.06 to 0.52). 
CONCLUSIONS: The risk of HIV infection after percutaneous exposure increases with a larger volume of blood and, probably, a higher titer of HIV in the source patient's blood. Postexposure prophylaxis with zidovudine appears to be protective. [Abstract] [Full text] [PDF]
Reye's syndrome. M. Casteels-Van Daele, C. Van Geet, C. Wouters, E. Eggermont. Lancet 2001: 358(9278); 334. Abstract not available yet.
Risk of testicular cancer in subfertile men: case-control study. H. Moller, N. E. Skakkebaek. British Medical Journal 1999: 318(7183); 559-62. OBJECTIVE: To evaluate the association between subfertility in men and the subsequent risk of testicular cancer. DESIGN: Population based case-control study. SETTING: The Danish population. PARTICIPANTS: Cases were identified in the Danish Cancer Registry; controls were randomly selected from the Danish population with the computerised Danish Central Population Register. Men were interviewed by telephone; 514 men with cancer and 720 controls participated. OUTCOME MEASURE: Occurrence of testicular cancer. RESULTS: A reduced risk of testicular cancer was associated with paternity (relative risk 0.63; 95% confidence interval 0.47 to 0.85). In men who before the diagnosis of testicular cancer had a lower number of children than expected on the basis of their age, the relative risk was 1.98 (1.43 to 2.75). There was no corresponding protective effect associated with a higher number of children than expected. The associations were similar for seminoma and non-seminoma and were not influenced by adjustment for potential confounding factors. CONCLUSION: These data are consistent with the hypothesis that male subfertility and testicular cancer share important aetiological factors.
Testicular cancer risk in relation to use of disposable nappies. H. Moller. Arch Dis Child 2002: 86(1); 28-9. Information on the use of disposable nappies in childhood was available for 296 testicular cancer cases and 287 population controls in Denmark. No association was found between disposable nappy use and the subsequent risk of testicular cancer in adulthood.
The disappearance of Reye's syndrome--a public health triumph. A. S. Monto. N Engl J Med 1999: 340(18); 1423-4. Abstract not available.
Hospital controls versus community controls: differences in inferences regarding risk factors for hip fracture. D. J. Moritz, J. L. Kelsey, J. A. Grisso. Am J Epidemiol 1997: 145(7); 653-60. In case-control studies using cases identified from persons admitted to hospitals, two types of controls are most often used: persons from the communities served by the hospitals and persons admitted to the same hospitals as those to which the cases were admitted. It is often unclear which is the more appropriate choice, and whether the use of one or the other type of control group will lead to biased conclusions. The purpose of the present analysis was to determine whether the choice of hospital controls versus community controls would influence conclusions regarding risk factors for hip fracture. Cases (n = 425), hospital controls (n = 312) and community controls (n = 454) were drawn from a case-control study of risk factors for hip fracture in women. Study participants were white and black women aged 45 years or older and living in New York City or Philadelphia, Pennsylvania, who were selected between September 1987 and July 1989. Using community controls but not hospital controls, investigators would have concluded that having a fall during the previous 6 months, current smoking, and moving during the previous year were associated with an increased risk of hip fracture. Associations of hip fracture risk with stroke and prior use of ambulatory aids were stronger using community controls, but associations with estrogen use and body mass index were not influenced by choice of control group. Community controls were quite similar to representative samples of community-dwelling elderly women, whereas hospital controls were somewhat sicker and more likely to be current smokers. The authors conclude that community controls comprise the more appropriate control group in case-control studies of hip fracture in the elderly.
Case-control studies: research in reverse. K. F. Schulz, D. A. Grimes. Lancet 2002: 359; 431-434. Epidemiologists benefit greatly from having case-control study designs in their research armamentarium. Case-control studies can yield important scientific findings with relatively little time, money, and effort compared with other study designs. This seemingly quick road to research results entices many newly trained epidemiologists. Indeed, investigators implement case-control studies more frequently than any other analytical epidemiological study. Unfortunately, case-control designs also tend to be more susceptible to biases than other comparative studies. Although easier to do, they are also easier to do wrong. Five main notions guide investigators who do, or readers who assess, case-control studies. First, investigators must explicitly define the criteria for diagnosis of a case and any eligibility criteria used for selection. Second, controls should come from the same population as the cases, and their selection should be independent of the exposures of interest. Third, investigators should blind the data gatherers to the case or control status of participants or, if impossible, at least blind them to the main hypothesis of the study. Fourth, data gatherers need to be thoroughly trained to elicit exposure in a similar manner from cases and controls; they should use memory aids to facilitate and balance recall between cases and controls. Finally, investigators should address confounding in case-control studies, either in the design stage or with analytical techniques. Devotion of meticulous attention to these points enhances the validity of the results and bolsters the reader's confidence in the findings.
Selection of controls in case-control studies. I. Principles. S. Wacholder, J. K. McLaughlin, D. T. Silverman, J. S. Mandel. Am J Epidemiol 1992: 135(9); 1019-28. A synthesis of classical and recent thinking on the issues involved in selecting controls for case-control studies is presented in this and two companion papers (S. Wacholder et al. Am J Epidemiol 1992; 135:1029-50). In this paper, a theoretical framework for selecting controls in case-control studies is developed. Three principles of comparability are described: 1) study base, that all comparisons be made within the study base; 2) deconfounding, that comparisons of the effects of the levels of exposure on disease risk not be distorted by the effects of other factors; and 3) comparable accuracy, that any errors in measurement of exposure be nondifferential between cases and controls. These principles, if adhered to in a study, can reduce selection, confounding, and information bias, respectively. The principles, however, are constrained by an additional efficiency principle regarding resources and time. Most problems and controversies in control selection reflect trade-offs among these four principles.
Selection of controls in case-control studies. II. Types of controls. S. Wacholder, D. T. Silverman, J. K. McLaughlin, J. S. Mandel. Am J Epidemiol 1992: 135(9); 1029-41. Types of control groups are evaluated using the principles described in paper 1 of the series, "Selection of Controls in Case-Control Studies" (S. Wacholder et al. Am J Epidemiol 1992; 135:1019-28). Advantages and disadvantages of population controls, neighborhood controls, hospital or registry controls, medical practice controls, friend controls, and relative controls are considered. Problems with the use of deceased controls and proxy respondents are discussed.
Selection of controls in case-control studies. III. Design options. S. Wacholder, D. T. Silverman, J. K. McLaughlin, J. S. Mandel. Am J Epidemiol 1992: 135(9); 1042-50. Several design options available in the planning stage of case-control studies are examined. Topics covered include matching, control/case ratio, choice of nested case-control or case-cohort design, two-stage sampling, and other methods that can be used for control selection. The effect of potential problems in obtaining comparable accuracy of exposure is also examined. A discussion of the difficulty in meeting the principles of study base, deconfounding, and comparable accuracy (S. Wacholder et al. Am J Epidemiol 1992; 135:1019-28) in a single study completes this series of papers.
Design issues in case-control studies. S. Wacholder. Stat Methods Med Res 1995: 4(4); 293-309. The most difficult and most important considerations in planning the protocol of a case-control study are ascertainment of cases, selection of controls and the quality of the exposure measurement. Plans to ensure careful field work are equally important; without attention to data collection, the protocol will be meaningless. In most case-control studies, the measurement problem is magnified because one cannot implement the collection of exposure information at the beginning of follow-up, and instead must rely on interviews, existing records or extrapolation into the past. Consideration of a case-control study as an efficient way to study a cohort helps to resolve some design issues.
Are risk factors for sudden infant death syndrome different at night? S. M. Williams, E. A. Mitchell, B. J. Taylor. Arch Dis Child 2002: 87(4); 274-8. AIMS: To determine whether the risk factors for SIDS occurring at night were different from those occurring during the day. METHODS: Large, nationwide case-control study, with data for 369 cases and 1558 controls in New Zealand. RESULTS: Two thirds of SIDS deaths occurred at night (between 10 pm and 7:30 am). The odds ratio (95% CI) for prone sleep position was 3.86 (2.67 to 5.59) for deaths occurring at night and 7.25 (4.52 to 11.63) for deaths occurring during the day; the difference was significant. The odds ratio for maternal smoking for deaths occurring at night was 2.28 (1.52 to 3.42) and that for the day 1.27 (0.79 to 2.03); that for the mother being single was 2.69 (1.29 to 3.99) for a night time death and 1.25 (0.76 to 2.04) for a daytime death. Both interactions were significant. The interactions between time of death and bed sharing, not sleeping in a cot or bassinet, Maori ethnicity, late timing of antenatal care, binge drinking, cannabis use, and illness in the baby were also significant, or almost so. All were more strongly associated with SIDS occurring at night. CONCLUSIONS: Prone sleep position was more strongly associated with SIDS occurring during the day, whereas night time deaths were more strongly associated with maternal smoking and measures of social deprivation.
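As a rough illustration of how two odds ratios like those in the Williams et al. abstract can be compared, the sketch below recovers each standard error from the reported 95% confidence interval and tests the difference on the log scale. This is not the authors' method (they fitted interaction terms), and it assumes the two estimates are independent, which is only approximately true here because the night and day analyses share controls. The function names are hypothetical.

```python
import math

def se_log_or(lower, upper, z_crit=1.96):
    """Standard error of log(OR), recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def compare_odds_ratios(or1, ci1, or2, ci2):
    """Ratio of two odds ratios with a z test, assuming independent estimates."""
    diff = math.log(or2) - math.log(or1)
    se = math.hypot(se_log_or(*ci1), se_log_or(*ci2))
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return math.exp(diff), z, p

# Prone sleep position: night 3.86 (2.67 to 5.59), day 7.25 (4.52 to 11.63)
ratio, z, p = compare_odds_ratios(3.86, (2.67, 5.59), 7.25, (4.52, 11.63))
print(f"ratio of odds ratios = {ratio:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the day/night difference for prone sleeping comes out nominally significant, in line with the abstract's statement, but the shared controls mean the p-value is only indicative.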
evidence >> apples >> cluster (1)
Extending the CONSORT statement to cluster randomized trials: for discussion. D. R. Elbourne, M. K. Campbell. Stat Med 2001: 20(3); 489-96. The need for clear reporting of randomized controlled trials has been emphasized recently. The CONSORT Statement has made evidence-based suggestions for a checklist and a patient flow diagram. Adapting this for cluster randomized controlled trials presents particular challenges. Simple changes in the checklist and diagram for the completely randomized two level cluster randomized trials are suggested for discussion. An example taken from an unpublished trial demonstrates that these changes are less simple to implement, although extensions to electronic publications may be helpful. These suggestions should be formally evaluated. Further work is required to consider the cases of more levels and of stratified or pair-matched cluster randomized trials.
evidence >> apples >> cohort (1)
Cigarette smoking and diabetes mellitus: evidence of a positive association from a large prospective cohort study. J. C. Will, D. A. Galuska, E. S. Ford, A. Mokdad, E. E. Calle. Int J Epidemiol 2001: 30(3); p540-6. OBJECTIVE: Only a few prospective studies have examined the relationship between the frequency of cigarette smoking and the incidence of diabetes mellitus. The purpose of this study was to determine whether greater frequency of cigarette smoking accelerated the development of diabetes mellitus, and whether quitting reversed the effect. METHODS: Data were collected in the Cancer Prevention Study I, a prospective cohort study conducted from 1959 through 1972 by the American Cancer Society where volunteers recruited more than one million acquaintances in 25 US states. From these over one million original participants, 275,190 men and 434,637 women aged > or = 30 years were selected for the primary analysis using predetermined criteria. RESULTS: As smoking increased, the rate of diabetes increased for both men and women. Among those who smoked > or = 2 packs per day at baseline, men had a 45% higher diabetes rate than men who had never smoked; the comparable increase for women was 74%. Quitting smoking reduced the rate of diabetes to that of non-smokers after 5 years in women and after 10 years in men. CONCLUSIONS: A dose-response relationship seems likely between smoking and incidence of diabetes. Smokers who quit may derive substantial benefit from doing so. Confirmation of these observations is needed through additional epidemiological and biological research.
evidence >> apples >> concealed (5)
Bias in treatment assignment in controlled clinical trials. TC Chalmers, P Celano, HS Sacks, H Jr Smith. N Engl J Med 1983: 309(22); 1358-61. ABSTRACT: Controlled clinical trials of the treatment of acute myocardial infarction offer a unique opportunity for the study of the potential influence on outcome of bias in treatment assignment. A group of 145 papers was divided into those in which the randomization process was blinded (57 papers), those in which it may have been unblinded (45 papers), and those in which the controls were selected by a nonrandom process (43 papers). At least one prognostic variable was maldistributed (P less than 0.05) in 14.0 per cent of the blinded-randomization studies, in 26.7 per cent of the unblinded-randomization studies, and in 58.1 per cent of the nonrandomized studies. Differences in case-fatality rates between treatment and control groups (P less than 0.05) were found in 8.8 per cent of the blinded-randomization studies, 24.4 per cent of the unblinded-randomization studies, and 58.1 per cent of the nonrandomized studies. These data emphasize the importance of keeping those who recruit patients for clinical trials from suspecting which treatment will be assigned to the patient under consideration.
Randomised trials, human nature, and reporting guidelines. K. F. Schulz. Lancet 1996: 348(9027); 596-8. Abstract not available.
Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. KF Schulz, I Chalmers, RJ Hayes, DG Altman. JAMA 1995: 273(5); 408-12. ABSTRACT: OBJECTIVE--To determine if inadequate approaches to randomized controlled trial design and execution are associated with evidence of bias in estimating treatment effects. DESIGN--An observational study in which we assessed the methodological quality of 250 controlled trials from 33 meta-analyses and then analyzed, using multiple logistic regression models, the associations between those assessments and estimated treatment effects. DATA SOURCES--Meta-analyses from the Cochrane Pregnancy and Childbirth Database. MAIN OUTCOME MEASURES--The associations between estimates of treatment effects and inadequate allocation concealment, exclusions after randomization, and lack of double-blinding. RESULTS--Compared with trials in which authors reported adequately concealed treatment allocation, trials in which concealment was either inadequate or unclear (did not report or incompletely reported a concealment approach) yielded larger estimates of treatment effects (P < .001). Odds ratios were exaggerated by 41% for inadequately concealed trials and by 30% for unclearly concealed trials (adjusted for other aspects of quality). Trials in which participants had been excluded after randomization did not yield larger estimates of effects, but that lack of association may be due to incomplete reporting. Trials that were not double-blind also yielded larger estimates of effects (P = .01), with odds ratios being exaggerated by 17%. CONCLUSIONS--This study provides empirical evidence that inadequate methodological approaches in controlled trials, particularly those representing poor allocation concealment, are associated with bias. Readers of trial reports should be wary of these pitfalls, and investigators must improve their design, execution, and reporting of trials.
Allocation concealment in randomised trials: defending against deciphering. K. F. Schulz, D. A. Grimes. Lancet 2002: 359; 614-618. Proper randomisation rests on adequate allocation concealment. An allocation concealment process keeps clinicians and participants unaware of upcoming assignments. Without it, even properly developed random allocation sequences can be subverted. Within this concealment process, the crucial unbiased nature of randomised controlled trials collides with their most vexing implementation problems. Proper allocation concealment frequently frustrates clinical inclinations, which annoys those who do the trials. Randomised controlled trials are anathema to clinicians. Many involved with trials will be tempted to decipher assignments, which subverts randomisation. For some implementing a trial, deciphering the allocation scheme might frequently become too great an intellectual challenge to resist. Whether their motives indicate innocent or pernicious intents, such tampering undermines the validity of a trial. Indeed, inadequate allocation concealment leads to exaggerated estimates of treatment effect, on average, but with scope for bias in either direction. Trial investigators will be crafty in any potential efforts to decipher the allocation sequence, so trial designers must be just as clever in their design efforts to prevent deciphering. Investigators must effectively immunise trials against selection and confounding biases with proper allocation concealment. Furthermore, investigators should report baseline comparisons on important prognostic variables. Hypothesis tests of baseline characteristics, however, are superfluous and could be harmful if they lead investigators to suppress reporting any baseline imbalances.
Generation of allocation sequences in randomised trials: chance not choice. K. F. Schulz, D. A. Grimes. Lancet 2002: 359; 515-519. The randomised controlled trial sets the gold standard of clinical research. However, randomisation persists as perhaps the least-understood aspect of a trial. Moreover, anything short of proper randomisation courts selection and confounding biases. Researchers should spurn all systematic, non-random methods of allocation. Trial participants should be assigned to comparison groups based on a random process. Simple (unrestricted) randomisation, analogous to repeated fair coin-tossing, is the most basic of sequence generation approaches. Furthermore, no other approach, irrespective of its complexity and sophistication, surpasses simple randomisation for prevention of bias. Investigators should, therefore, use this method more often than they do, and readers should expect and accept disparities in group sizes. Several other complicated restricted randomisation procedures limit the likelihood of undesirable sample size imbalances in the intervention groups. The most frequently used restricted sequence generation procedure is blocked randomisation. If this method is used, investigators should randomly vary the block sizes and use larger block sizes, particularly in an unblinded trial. Other restricted procedures, such as urn randomisation, combine beneficial attributes of simple and restricted randomisation by preserving most of the unpredictability while achieving some balance. The effectiveness of stratified randomisation depends on use of a restricted randomisation approach to balance the allocation sequences for each stratum. Generation of a proper randomisation sequence takes little time and effort but affords big rewards in scientific accuracy and credibility. Investigators should devote appropriate resources to the generation of properly randomised trials and reporting their methods clearly.
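The two sequence generation approaches named in the Schulz and Grimes abstract can be sketched in a few lines. This is a minimal illustration, not a trial-grade allocation system: the function names, the two arm labels, the candidate block sizes (4, 6, 8), and the seed are all assumptions chosen for the example.

```python
import random

def simple_randomisation(n, rng):
    """Unrestricted randomisation: each assignment is an independent fair coin toss."""
    return [rng.choice("AB") for _ in range(n)]

def blocked_randomisation(n, block_sizes, rng):
    """Permuted blocks with randomly varied (even) block sizes, truncated to n."""
    sequence = []
    while len(sequence) < n:
        size = rng.choice(block_sizes)                 # vary the block size at random
        block = ["A"] * (size // 2) + ["B"] * (size // 2)
        rng.shuffle(block)                             # random order within the block
        sequence.extend(block)
    return sequence[:n]

rng = random.Random(2002)  # fixed seed only so the example is reproducible
simple = simple_randomisation(40, rng)
blocked = blocked_randomisation(40, [4, 6, 8], rng)
print("simple: ", simple.count("A"), "A vs", simple.count("B"), "B")
print("blocked:", blocked.count("A"), "A vs", blocked.count("B"), "B")
```

Simple randomisation can leave the groups unequal in size, which the abstract says readers should expect and accept. The blocked sequence stays balanced after every complete block, so its imbalance here can never exceed half the largest block size.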
evidence >> apples >> ecologic (5)
Modeling treatment effects on binary outcomes with grouped-treatment variables and individual covariates. S. C. Johnston, T. Henneman, C. E. McCulloch, M. van der Laan. Am J Epidemiol 2002: 156(8); 753-60. During evaluation of treatment effects in observational studies, confounding is a constant threat because it is always possible that patients with a better prognosis, not adequately characterized by measured covariates, are chosen for a specific therapy. Ecologic analyses may avoid confounding that would be present in analysis at the individual level because variations in regional or hospital practice may be unrelated to prognosis. The authors used simulated data with an excluded confounder to evaluate the reliability and limitations of the grouped-treatment approach, a method of incorporating an ecologic measure of treatment assignment into an individual-level multivariable model, similar to the instrumental variable approach. Estimates based on the grouped-treatment approach were closer to the true value than those of standard individual-level multivariable analysis in every simulation. Furthermore, confidence intervals based on the grouped-treatment approach achieved approximately their nominal coverage, whereas those based on individual-level analyses did not. The grouped-treatment approach appears to be more reliable than standard individual-level analysis in situations where the grouped-treatment variable is unassociated with the outcome except via the actual treatment assignment and measured covariates.
The Semi-individual Study in Air Pollution Epidemiology: A Valid Design as Compared to Ecologic Studies. Nino Kunzli, Ira B. Tager. Environmental Health Perspectives 1997: 105(10); 1078-1083. ABSTRACT: The assessment of long-term effects of air pollution in humans relies on epidemiologic studies. A widely used design consists of cross-sectional or cohort studies in which ecologic assignment of exposure, based on a fixed-site ambient monitor, is employed. Although health outcome and usually a large number of covariates are measured in individuals, these studies are often called ecological. We will introduce the term semi-individual design for these studies. We review the major properties and limitations with regard to causal inference of truly ecologic studies, in which outcome, exposure, and covariates are available on an aggregate level only. Misclassification problems and issues related to confounding and model specification in truly ecologic studies limit etiologic inference to individuals. In contrast, the semi-individual study shares its methodological and inferential properties with typical individual-level study designs. The major caveat relates to the case where too few study areas, e.g., two or three, are used, which render control of aggregate level confounding impossible. The issue of exposure misclassification is of general concern in epidemiology and not an exclusive problem of the semi-individual design. In a multicenter setting, the semi-individual study is a valuable tool to approach long-term effects of air pollution. Knowledge about the error structure of the ecologically assigned exposure allows consideration of the impact of ecologically assigned exposure on effect estimation. Semi-individual studies, i.e., individual level air pollution studies with ecologic exposure assignment, more readily permit valid inference to individuals and should not be labeled as ecologic studies.
Ecologic studies in epidemiology: concepts, principles, and methods. H. Morgenstern. Annu Rev Public Health 1995: 16; 61-81. An ecologic study focuses on the comparison of groups, rather than individuals; thus, individual-level data are missing on the joint distribution of variables within groups. Variables in an ecologic analysis may be aggregate measures, environmental measures, or global measures. The purpose of an ecologic analysis may be to make biologic inferences about effects on individual risks or to make ecologic inferences about effects on group rates. Ecologic study designs may be classified on two dimensions: (a) whether the primary group is measured (exploratory vs analytic study); and (b) whether subjects are grouped by place (multiple-group study), by time (time-trend study), or by place and time (mixed study). Despite several practical advantages of ecologic studies, there are many methodologic problems that severely limit causal inference, including ecologic and cross-level bias, problems of confounder control, within-group misclassification, lack of adequate data, temporal ambiguity, collinearity, and migration across groups.
Medicine and the Media: Did Monica really say that? Hugh Tunstall-Pedoe. British Medical Journal 1998: 317; 1023. Abstract not available yet. [Full text]
Ecological study for reasons for sharp decline in mortality from ischaemic heart disease in Poland since 1991. WA Zatonski, AJ McMichael, JW Powles. British Medical Journal 1998: 316(7137); 1047-1051. ABSTRACT: OBJECTIVE: To investigate the reasons for the decline in deaths attributed to ischaemic heart disease in Poland since 1991 after two decades of rising rates. DESIGN: Recent changes in mortality were measured as percentage deviations in 1994 from rates predicted by extrapolation of sex and age specific death rates for 1980-91 for diseases of the circulatory system and selected other categories. Available data on national and household food availability, alcohol consumption, cigarette smoking, socioeconomic indices, and medical services over time were reviewed. MAIN OUTCOME MEASURES: Age specific and age standardised rates of death attributed to ischaemic heart disease and related causes. RESULTS: The change in trend in mortality attributed to diseases of the circulatory system was similar in men and women and most marked (> 20%) in early middle age. For ages 45 to 64 the decrease was greatest for deaths attributed to ischaemic heart disease and atherosclerosis (around 25%) and less for stroke (< 10%). For most of the potentially explanatory variables considered, there were no corresponding changes in trend. However, between 1986-90 and 1994 there was a marked switch from animal fats (estimated availability down 23%) to vegetable fats (up 48%) and increased imports of fruit. CONCLUSION: Reporting biases are unlikely to have exaggerated the true fall in ischaemic heart disease; neither is it likely to be mainly due to changes in smoking, drinking, stress, or medical care. Changes in type of dietary fat and increased supplies of fresh fruit and vegetables seem to be the best candidates. [Medline] [Abstract] [PDF]
evidence >> apples >> example (8)
Influence of maternal age at delivery and birth order on risk of type 1 diabetes in childhood: prospective population based family study. Bart's-Oxford Family Study Group. P. J. Bingley, I. F. Douek, C. A. Rogers, E. A. Gale. British Medical Journal 2000: 321(7258); 420-4. OBJECTIVES: To examine the influence of parental age at delivery and birth order on subsequent risk of childhood diabetes. DESIGN: Prospective population based family study. SETTING: Area formerly administered by the Oxford Regional Health Authority. PARTICIPANTS: 1375 families in which one child or more had diabetes. Of 3221 offspring, 1431 had diabetes (median age at diagnosis 10.5 years, range 0.4-28.5) and 1790 remained non-diabetic at a median age of 16.1 years. MAIN OUTCOME MEASURES: Disease free survival and hazard ratios for the development of type 1 diabetes in all offspring, assessed by Cox proportional hazard regression. RESULTS: Maternal age at delivery was strongly related to risk of type 1 diabetes in the offspring; risk increased by 25% (95% confidence interval 17% to 34%) for each five year band of maternal age, so that maternal age at delivery of 45 years or more was associated with a relative risk of 3.11 (2.07 to 4.66) compared with a maternal age of less than 20 years. Paternal age was also associated with a 9% (3% to 16%) increase for each five year increase in paternal age. The relative risk of diabetes, adjusted for parental age at delivery and sex of offspring, decreased with increasing birth order; the overall effect was a 15% risk reduction (10% to 21%) per child born. CONCLUSIONS: A strong association was found between increasing maternal age at delivery and risk of diabetes in the child. Risk was highest in firstborn children and decreased progressively with higher birth order. The fetal environment seems to have a strong influence on risk of type 1 diabetes in the child.
The increase in maternal age at delivery in the United Kingdom over the past two decades could partly account for the increase in incidence of childhood diabetes over this period. [Medline] [Abstract] [Full text] [PDF]
Statistical Inquiries into the Efficacy of Prayer. Sir Francis Galton. Fortnightly Review 1872: 12; 125-135. (This article was originally published in 1872 and is reproduced by the Pictures of Health Web Site.) An eminent authority has recently published a challenge to test the efficacy of prayer by actual experiment. I have been induced, through reading this, to prepare the following memoir for publication, nearly the whole of which I wrote and laid by many years ago, after completing a large collection of data, which I had undertaken for the satisfaction of my own conscience. [Full text] [PDF]
Lack of effect of long-term supplementation with beta carotene on the incidence of malignant neoplasms and cardiovascular disease. C. H. Hennekens, J. E. Buring, J. E. Manson, M. Stampfer, B. Rosner, N. R. Cook, C. Belanger, F. La Motte, J. M. Gaziano, P. M. Ridker, W. Willett, R. Peto. N Engl J Med 1996: 334(18); 1145-9. BACKGROUND. Observational studies suggest that people who consume more fruits and vegetables containing beta carotene have somewhat lower risks of cancer and cardiovascular disease, and earlier basic research suggested plausible mechanisms. Because large randomized trials of long duration were necessary to test this hypothesis directly, we conducted a trial of beta carotene supplementation. METHODS. In a randomized, double-blind, placebo-controlled trial of beta carotene (50 mg on alternate days), we enrolled 22,071 male physicians, 40 to 84 years of age, in the United States; 11 percent were current smokers and 39 percent were former smokers at the beginning of the study in 1982. By December 31, 1995, the scheduled end of the study, fewer than 1 percent had been lost to follow-up, and compliance was 78 percent in the group that received beta carotene. RESULTS. Among 11,036 physicians randomly assigned to receive beta carotene and 11,035 assigned to receive placebo, there were virtually no early or late differences in the overall incidence of malignant neoplasms or cardiovascular disease, or in overall mortality. In the beta carotene group, 1273 men had any malignant neoplasm (except nonmelanoma skin cancer), as compared with 1293 in the placebo group (relative risk, 0.98; 95 percent confidence interval, 0.91 to 1.06). There were also no significant differences in the number of cases of lung cancer (82 in the beta carotene group vs. 88 in the placebo group); the number of deaths from cancer (386 vs. 380), deaths from any cause (979 vs. 968), or deaths from cardiovascular disease (338 vs. 313); the number of men with myocardial infarction (468 vs. 489); the number with stroke (367 vs. 382); or the number with any one of the previous three end points (967 vs. 972). Among current and former smokers, there were also no significant early or late differences in any of these end points. CONCLUSIONS. In this trial among healthy men, 12 years of supplementation with beta carotene produced neither benefit nor harm in terms of the incidence of malignant neoplasms, cardiovascular disease, or death from all causes.
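The relative risk reported in the Hennekens et al. abstract can be approximated directly from the counts it gives (1273 of 11,036 versus 1293 of 11,035 men with any malignant neoplasm). The sketch below uses the standard large-sample interval on the log scale; it is a crude calculation, not the authors' analysis, so the published interval, which may come from an adjusted model, can differ slightly. The function name is hypothetical.

```python
import math

def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z_crit=1.96):
    """Crude relative risk with a large-sample 95% CI computed on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z_crit * se_log_rr)
    upper = math.exp(math.log(rr) + z_crit * se_log_rr)
    return rr, lower, upper

# Any malignant neoplasm: beta carotene 1273/11036, placebo 1293/11035
rr, lower, upper = relative_risk(1273, 11036, 1293, 11035)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The crude estimate lands very close to the published relative risk of 0.98, and the interval comfortably spans 1, which matches the abstract's conclusion of neither benefit nor harm.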
Dietary fat intake and the risk of coronary heart disease in women. F. B. Hu, M. J. Stampfer, J. E. Manson, E. Rimm, G. A. Colditz, B. A. Rosner, C. H. Hennekens, W. C. Willett. N Engl J Med 1997: 337(21); 1491-9. BACKGROUND: The relation between dietary intake of specific types of fat, particularly trans unsaturated fat, and the risk of coronary disease remains unclear. We therefore studied this relation in women enrolled in the Nurses' Health Study. METHODS: We prospectively studied 80,082 women who were 34 to 59 years of age and had no known coronary disease, stroke, cancer, hypercholesterolemia, or diabetes in 1980. Information on diet was obtained at base line and updated during follow-up by means of validated questionnaires. During 14 years of follow-up, we documented 939 cases of nonfatal myocardial infarction or death from coronary heart disease. Multivariate analyses included age, smoking status, total energy intake, dietary cholesterol intake, percentages of energy obtained from protein and specific types of fat, and other risk factors. RESULTS: Each increase of 5 percent of energy intake from saturated fat, as compared with equivalent energy intake from carbohydrates, was associated with a 17 percent increase in the risk of coronary disease (relative risk, 1.17; 95 percent confidence interval, 0.97 to 1.41; P=0.10). As compared with equivalent energy from carbohydrates, the relative risk for a 2 percent increment in energy intake from trans unsaturated fat was 1.93 (95 percent confidence interval, 1.43 to 2.61; P<0.001); that for a 5 percent increment in energy from monounsaturated fat was 0.81 (95 percent confidence interval, 0.65 to 1.00; P=0.05); and that for a 5 percent increment in energy from polyunsaturated fat was 0.62 (95 percent confidence interval, 0.46 to 0.85; P=0.003).
Total fat intake was not significantly related to the risk of coronary disease (for a 5 percent increase in energy from fat, the relative risk was 1.02; 95 percent confidence interval, 0.97 to 1.07; P=0.55). We estimated that the replacement of 5 percent of energy from saturated fat with energy from unsaturated fats would reduce risk by 42 percent (95 percent confidence interval, 23 to 56; P<0.001) and that the replacement of 2 percent of energy from trans fat with energy from unhydrogenated, unsaturated fats would reduce risk by 53 percent (95 percent confidence interval, 34 to 67; P<0.001). CONCLUSIONS: Our findings suggest that replacing saturated and trans unsaturated fats with unhydrogenated monounsaturated and polyunsaturated fats is more effective in preventing coronary heart disease in women than reducing overall fat intake.
Risk factors for lung cancer and for intervention effects in CARET, the Beta-Carotene and Retinol Efficacy Trial. G. S. Omenn, G. E. Goodman, M. D. Thornquist, J. Balmes, M. R. Cullen, A. Glass, J. P. Keogh, F. L. Meyskens, Jr., B. Valanis, J. H. Williams, Jr., S. Barnhart, M. G. Cherniack, C. A. Brodkin, S. Hammar. Journal of the National Cancer Institute 1996: 88(21); 1550-9. BACKGROUND: Evidence has accumulated from observational studies that people eating more fruits and vegetables, which are rich in beta-carotene (a violet to yellow plant pigment that acts as an antioxidant and can be converted to vitamin A by enzymes in the intestinal wall and liver) and retinol (an alcohol chemical form of vitamin A), and people having higher serum beta-carotene concentrations had lower rates of lung cancer. The Beta-Carotene and Retinol Efficacy Trial (CARET) tested the combination of 30 mg beta-carotene and 25,000 IU retinyl palmitate (vitamin A) taken daily against placebo in 18314 men and women at high risk of developing lung cancer. The CARET intervention was stopped 21 months early because of clear evidence of no benefit and substantial evidence of possible harm; there were 28% more lung cancers and 17% more deaths in the active intervention group (active = the daily combination of 30 mg beta-carotene and 25,000 IU retinyl palmitate). Promptly after the January 18, 1996, announcement that the CARET active intervention had been stopped, we published preliminary findings from CARET regarding cancer, heart disease, and total mortality. PURPOSE: We present for the first time results based on the pre-specified analytic method, details about risk factors for lung cancer, and analyses of subgroups and of factors that possibly influence response to the intervention. METHODS: CARET was a randomized, double-blinded, placebo-controlled chemoprevention trial, initiated with a pilot phase and then expanded 10-fold at six study centers. 
Cigarette smoking history and status and alcohol intake were assessed through participant self-report. Serum was collected from the participants at base line and periodically after randomization and was analyzed for beta-carotene concentration. An Endpoints Review Committee evaluated endpoint reports, including pathologic review of tissue specimens. The primary analysis is a stratified logrank test for intervention arm differences in lung cancer incidence, with weighting linearly to hypothesized full effect at 24 months after randomization. Relative risks (RRs) were estimated by use of Cox regression models; tests were performed for quantitative and qualitative interactions between the intervention and smoking status or alcohol intake. O'Brien-Fleming boundaries were used for stopping criteria at interim analyses. Statistical significance was set at the .05 alpha value, and all P values were derived from two-sided statistical tests. RESULTS: According to CARET's pre-specified analysis, there was an RR of 1.36 (95% confidence interval [CI] = 1.07-1.73; P = .01) for weighted lung cancer incidence for the active intervention group compared with the placebo group, and RR = 1.59 (95% CI = 1.13-2.23; P = .01) for weighted lung cancer mortality. All subgroups, except former smokers, had a point estimate of RR of 1.10 or greater for lung cancer. There are suggestions of associations of the excess lung cancer incidence with the highest quartile of alcohol intake (RR = 1.99; 95% CI = 1.28-3.09; test for heterogeneity of RR among quartiles of alcohol intake has P = .01, unadjusted for multiple comparisons) and with large-cell histology (RR = 1.89; 95% CI = 1.09-3.26; test for heterogeneity among histologic categories has P = .35), but not with base-line serum beta-carotene concentrations. CONCLUSIONS: CARET participants receiving the combination of beta-carotene and vitamin A had no chemopreventive benefit and had excess lung cancer incidence and mortality. 
The results are highly consistent with those found for beta-carotene in the Alpha-Tocopherol Beta-Carotene Cancer Prevention Study in 29133 male smokers in Finland.
Observational Studies. PR Rosenbaum (1995) New York: Springer-Verlag.
The effect of vitamin E and beta carotene on the incidence of lung cancer and other cancers in male smokers. The Alpha-Tocopherol, Beta Carotene Cancer Prevention Study Group. NEJM 1994: 330(15); 1029-35. ABSTRACT: BACKGROUND. Epidemiologic evidence indicates that diets high in carotenoid-rich fruits and vegetables, as well as high serum levels of vitamin E (alpha-tocopherol) and beta carotene, are associated with a reduced risk of lung cancer. METHODS. We performed a randomized, double-blind, placebo-controlled primary-prevention trial to determine whether daily supplementation with alpha-tocopherol, beta carotene, or both would reduce the incidence of lung cancer and other cancers. A total of 29,133 male smokers 50 to 69 years of age from southwestern Finland were randomly assigned to one of four regimens: alpha-tocopherol (50 mg per day) alone, beta carotene (20 mg per day) alone, both alpha-tocopherol and beta carotene, or placebo. Follow-up continued for five to eight years. RESULTS. Among the 876 new cases of lung cancer diagnosed during the trial, no reduction in incidence was observed among the men who received alpha-tocopherol (change in incidence as compared with those who did not, -2 percent; 95 percent confidence interval, -14 to 12 percent). Unexpectedly, we observed a higher incidence of lung cancer among the men who received beta carotene than among those who did not (change in incidence, 18 percent; 95 percent confidence interval, 3 to 36 percent). We found no evidence of an interaction between alpha-tocopherol and beta carotene with respect to the incidence of lung cancer. Fewer cases of prostate cancer were diagnosed among those who received alpha-tocopherol than among those who did not. Beta carotene had little or no effect on the incidence of cancer other than lung cancer.
Alpha-tocopherol had no apparent effect on total mortality, although more deaths from hemorrhagic stroke were observed among the men who received this supplement than among those who did not. Total mortality was 8 percent higher (95 percent confidence interval, 1 to 16 percent) among the participants who received beta carotene than among those who did not, primarily because there were more deaths from lung cancer and ischemic heart disease. CONCLUSIONS. We found no reduction in the incidence of lung cancer among male smokers after five to eight years of dietary supplementation with alpha-tocopherol or beta carotene. In fact, this trial raises the possibility that these supplements may actually have harmful as well as beneficial effects.
Comparison of maternal and infant outcomes between vacuum extraction and forceps deliveries. S. W. Wen, S. Liu, M. S. Kramer, S. Marcoux, A. Ohlsson, R. Sauve, R. Liston. Am J Epidemiol 2001: 153(2); 103-7. The authors conducted a population-based historical cohort study in the Canadian province of Quebec to assess the maternal and infant outcomes associated with vacuum extraction and forceps deliveries. The study database contains information on 305,391 mother-infant dyads (linked by a common institutional code and hospital chart number) for singleton live vaginal births with a nonbreech presentation at the gestational age of 37 or more completed weeks and a birth weight between 2,500 and 4,000 g during fiscal years 1991/1992 to 1995/1996. Of the births, 31,015 were delivered by vacuum extraction, and 18,727 were delivered by forceps. Compared with delivery by forceps, the adjusted risk ratios for third-/fourth-degree perineal laceration, intracranial hemorrhage, subdural or cerebral hemorrhage, intraventricular hemorrhage, subarachnoid hemorrhage, cephalhematoma, and neonatal in-hospital death were 0.48 (95% confidence interval: 0.45, 0.50), 1.28 (95% confidence interval: 0.73, 2.25), 0.97 (95% confidence interval: 0.49, 1.93), 0.99 (95% confidence interval: 0.16, 5.97), 5.44 (95% confidence interval: 1.26, 23.43), 2.02 (95% confidence interval: 1.89, 2.16), and 0.93 (95% confidence interval: 0.32, 2.70), respectively. The authors conclude that vacuum extraction causes less maternal trauma but may increase the risk of cephalhematoma and certain types of intracranial hemorrhage (e.g., subarachnoid hemorrhage).
evidence >> apples >> historical (3)
A Challenge for HD Researchers. Ken Pidock, Huntington's Disease Advocacy Center. Accessed on 2003-06-20. "To those of us who have watched Huntington's Disease for more than a generation, news about actual clinical trials of potential therapies is most welcome. However, such news also carries issues concerning how such therapies can best be evaluated." www.hdac.org/features/article.php?p_articleNumber=32
The way forward for clinical research. Sir Michael Rawlins, Pharmafocus. Accessed on 2003-06-20. "Historical controls can be very useful, particularly where one is investigating otherwise untreatable conditions where there is a biologically plausible basis for the treatment, and where the outcome untreated is homogenous and either very disabling or fatal." Published June 2, 2003. www.pharmafile.com/Pharmafocus/Features/feature.asp?fID=354
Randomized versus Historical Controls for Clinical Trials. H Sacks, TC Chalmers, H Jr Smith. The American Journal of Medicine 1982: 72(2); 233-240. ABSTRACT: To compare the use of randomized controls (RCTs) and historical controls (HCTs) for clinical trials, we searched the literature for therapies studied by both methods. We found six therapies for which 50 RCTs and 56 HCTs were reported. Forty-four of 56 HCTs (79 percent) found the therapy better than the control regimen, but only 10 of 50 RCTs (20 percent) agreed. For each therapy, the treated patients in RCTs and HCTs had similar outcomes. The difference between RCTs and HCTs of the same therapy was largely due to differences in outcome for the control groups, with HCT control patients generally doing worse than the RCT control groups. Adjustment of the outcomes of the HCTs for prognostic factors, when possible, did not appreciably change the results. The data suggest that biases in patient selection may irretrievably weight the outcome of HCTs in favor of new therapies. RCTs may miss clinically important benefits because of inadequate attention to sample size. The predictive value of each might be improved by reconsidering the use of p less than 0.05 as the significance level for all types of clinical trials, and by the use of confidence intervals around estimates of treatment effects.
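The percentages quoted in the Sacks, Chalmers, and Smith abstract can be checked directly from the counts it reports, and the abstract's closing suggestion to put confidence intervals around estimates can be illustrated with the same numbers. The sketch below is only an illustration: the choice of the Wilson score interval and the helper name `wilson_interval` are assumptions made here, not anything taken from the paper.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%).

    Hypothetical helper for illustration; the paper does not specify a method.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Counts quoted in the abstract: trials that found the therapy better than control.
hct_favourable, hct_total = 44, 56   # historical-control trials
rct_favourable, rct_total = 10, 50   # randomized-control trials

hct_rate = hct_favourable / hct_total
rct_rate = rct_favourable / rct_total
hct_ci = wilson_interval(hct_favourable, hct_total)
rct_ci = wilson_interval(rct_favourable, rct_total)

print(f"HCTs favouring therapy: {hct_rate:.0%}, interval {hct_ci[0]:.2f} to {hct_ci[1]:.2f}")
print(f"RCTs favouring therapy: {rct_rate:.0%}, interval {rct_ci[0]:.2f} to {rct_ci[1]:.2f}")
```

Under these assumptions the two intervals do not overlap, which is in line with the abstract's point that the two designs gave markedly different verdicts on the same therapies.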
evidence >> apples >> matching (4)
Hypothesis: Comparisons of inter- and intra-individual variations can substitute for twin studies in drug research. W. Kalow, B. K. Tang, L Endrenyi. Pharmacogenetics 1998: 8(4); 283-289. ABSTRACT: Twin studies are useful devices to determine the heritability of persistent but variable characteristics that tend to differ among individuals. Drug responses are not persistent affairs; they are temporary characteristics. One therefore may ask whether twin studies are necessary to assess the genetic element in pharmacological responsiveness. To measure the genetic component contributing to their variability, it seems logical to investigate the response variation by repeated drug administration to given individuals, and to compare the variability of the responses within and between individuals. We attempt here to describe a theoretical background of this venture, and to show some results of the exercise. Potential sources of error or uncertainty are discussed.
Removal of radiation dose response effects: an example of over-matching. J. L. Marsh, J. L. Hutton, K. Binks. Bmj 2002: 325(7359); 327-30. [Medline] [Full text] [PDF]
Paired versus Two-Sample Design for a Clinical Trial of Treatments with Dichotomous Outcome: Power Considerations. S Wacholder, CR Weinberg. Biometrics 1982: 38(3); 801-812. ABSTRACT: For the same number of observations in a small-sample clinical trial with dichotomous outcome, the statistical power associated with a two-sample design, analyzed by Fisher's exact test, is slightly greater than that associated with a matched design, analyzed by McNemar's test [...] and hence of the matched design, is monotone increasing in the within-pair correlation between the treatment responses. Power curves are presented which demonstrate that positive within-pair correlation, even when quite small, can result in a superiority in power for the matched design. Conversely, in the rare situations where there is a negative within-pair correlation, choice of a two-sample design can result in a substantial gain in power.
Matching in epidemiology as a paradigm for twin research on the Etiology of Disease. C White. Acta Geneticae Medicae Et Gemellologiae 1981: 30(1); 77-86. Abstract not available.
evidence >> apples >> observational (14)
A comparison of observational studies and randomized, controlled trials. K. Benson, A. J. Hartz. New England Journal of Medicine 2000: 342(25); 1878-86. BACKGROUND: For many years it has been claimed that observational studies find stronger treatment effects than randomized, controlled trials. We compared the results of observational studies with those of randomized, controlled trials. METHODS: We searched the Abridged Index Medicus and Cochrane data bases to identify observational studies reported between 1985 and 1998 that compared two or more treatments or interventions for the same condition. We then searched the Medline and Cochrane data bases to identify all the randomized, controlled trials and observational studies comparing the same treatments for these conditions. For each treatment, the magnitudes of the effects in the various observational studies were combined by the Mantel-Haenszel or weighted analysis-of-variance procedure and then compared with the combined magnitude of the effects in the randomized, controlled trials that evaluated the same treatment. RESULTS: There were 136 reports about 19 diverse treatments, such as calcium-channel-blocker therapy for coronary artery disease, appendectomy, and interventions for subfertility. In most cases, the estimates of the treatment effects from observational studies and randomized, controlled trials were similar. In only 2 of the 19 analyses of treatment effects did the combined magnitude of the effect in observational studies lie outside the 95 percent confidence interval for the combined magnitude in the randomized, controlled trials. CONCLUSIONS: We found little evidence that estimates of treatment effects in observational studies reported after 1984 are either consistently larger than or qualitatively different from those obtained in randomized, controlled trials.
Invited commentary: Rare side effects of obstetric interventions: Are observational studies good enough? P. Buekens. Am J Epidemiology 2001: 153(2); 108-9.
Systematic reviews and lifelong diseases. H. E. Elphick, A. Tan, D. Ashby, R. L. Smyth. Bmj 2002: 325(7360); 381-4. Systematic reviews of randomised controlled trials provide an evidence base for treatment but too often fail to give adequate information on long term outcomes. Elphick and colleagues discuss the limitations of the systematic review of randomised controlled trials for patients with chronic or lifelong diseases and suggest that long term observational studies have a place in the evaluation of the benefits and risks of treatment. [Full text] [PDF]
Statistics in Action. M.H. Gail. Journal of the American Statistical Association 1996: 91(433); 1-13. Abstract not available.
Research Fables from the Sisters Grinn, No. 1. The Hunch-test of Notre Dame. Jeanne Grace, University of Rochester School of Nursing. Accessed on 2003-05-27. "Once upon a time in the land of Evidence, a sickly baby was born. His parents loved him and nursed him back to health and named him Quasi-experiment. As he grew, Quasi-experiment was unable to keep up with the other children. His physical challenges made him unable to compete in games of Manipulate the Independent Variable, and his strength was insufficient for random assignment tasks. While his schoolmates Randomized Clinical Trial and True Experiment received glowing praise for their accomplishments, Quasi-experiment received only disdain. The land of Evidence valued rigorous tests of causality above all else and had no tolerance for other investigative approaches. Saddened and isolated, Quasi-experiment withdrew from the company of others and came to live in the remote towers of the great cathedral of Evidence, Notre Dame." http://www.urmc.rochester.edu/SON/Fables/hunchbck.htm
How Good Is the Evidence Linking Breastfeeding and Intelligence? Anjali Jain, John Concato, John M. Leventhal. Pediatrics Journals 2002 (April): 109(6); 1044-1053. Section of General Pediatrics, Department of Pediatrics, University of Chicago Children’s Hospital, Chicago, Illinois; Robert Wood Johnson Clinical Scholars Program, Yale University, New Haven, Connecticut; Section of General Pediatrics, Department of Pediatrics, Yale University, New Haven, Connecticut; Department of Medicine, Yale University, New Haven, Connecticut; Clinical Epidemiology Unit, West Haven Veterans Affairs Medical Center, West Haven, Connecticut. Background. We conducted a critical review of the many studies that have tried to determine whether breastfeeding has a beneficial effect on intellect. Design/Methods. By searching Medline and the references of selected articles, we identified publications that evaluated the association between breastfeeding and cognitive outcomes. We then appraised and described each study according to 8 principles of clinical epidemiology: 1) study design, 2) target population: whether full-term infants were studied, 3) sample size, 4) collection of feeding data: whether studies met 4 standards of quality: suitable definition and duration of breastfeeding, and appropriate timing and source of feeding data, 5) control of susceptibility bias: whether studies controlled for socioeconomic status and stimulation of the child, 6) blinding: whether observers of the outcome were blind to feeding status, 7) outcome: whether a standardized individual test of general intelligence at an age older than 2 years was used, and 8) format of results: whether studies reported an effect size or some other strategy to interpret the clinical impact of results. Results. We identified 40 pertinent publications from 1929 to February 2001. Twenty-seven (68%) concluded that breastfeeding promotes intelligence. Many studies, however, had methodological flaws.
Only 2 papers studied full-term infants and met all 4 standards of high-quality feeding data, controlled for 2 critical confounders, reported blinding, used an appropriate test, and allowed the reader to interpret the clinical significance of the findings with an effect size. Of these 2, 1 study concluded that the effect of breastfeeding on intellect was significant, and the other did not. Conclusion. Although the majority of studies concluded that breastfeeding promotes intelligence, the evidence from higher quality studies is less persuasive.
Problems and approaches in investigating the role of micronutrients in the aetiology of cancer in humans. J. Little. Br Med Bull 1999: 55(3); 600-18. Observational studies have provided leads regarding a number of micronutrients which may account for the apparent protective effects of high intakes of vegetables and fruit against many types of cancer. In general, these leads have not been confirmed by randomised controlled trials. This apparent conflict raises issues about the timing and duration of a critical period or periods during which micronutrient intake may influence the development of cancer, the dose, possible interaction between high doses of micronutrients and exposures conferring a high risk of cancer and gene-micronutrient interactions. When gene-environmental interaction exists, failure to take both of these sets of factors into account leads to bias in the estimation of disease risk. As a result of recent advances, it is now possible to take measures of genetic susceptibility into account. Therefore, in future studies, the opportunity should be taken to obtain DNA samples to determine genotypes for polymorphisms potentially affecting micronutrient metabolism.
Interpreting the evidence: choosing between randomised and non-randomised studies. M McKee, A Britton, N Black, K McPherson, C Sanderson, C Bain. British Medical Journal 1999: 319(7205); 312-15. Abstract not available. [Medline] [Full text] [PDF]
The arrogance of preventive medicine. D. L. Sackett. Cmaj 2002: 167(4); 363-4.
Humility in observational studies. J. D. Shelton. Science 2002: 297(5590); 2208. Abstract not available yet.
Fat chance: diet and ischemic stroke [editorial; comment]. R. Sherwin, T. R. Price. Jama 1997: 278(24); 2185-6. Abstract not available.
Smoking as "independent" risk factor for suicide: illustration of an artifact from observational epidemiology? G. D. Smith, A. N. Phillips, J. D. Neaton. Lancet 1992: 340(8821); 709-12. Two widely used criteria for determining whether an association between a risk factor and a disease is causal are dose response and independence from other factors. Data from a large US risk factor study (MRFIT) throw up a relation between cigarette smoking and suicide that meets these criteria, yet appears to be biologically implausible. It is likely that many more such associations, for other exposures and other diseases, are equally spurious, but are protected by their lack of obvious implausibility.
Epidemiology faces its limits. G. Taubes. Science 1995: 269(5221); 164-9. Abstract not available.
The Cochrane Lecture. The best and the enemy of the good: randomised controlled trials, uncertainty, and assessing the role of patient choice in medical decision making. K. McPherson. J. Epidemiol. Community Health 1994: 48(1); 6-15. This lecture aimed to create a bridge to span the conceptual and ideological gap between randomised controlled trials and systematic observational comparisons and to reduce unwanted and unproductive polarisation. The argument, simply put, is that since randomisation alone eliminates the selection effect of therapeutic decision making, anything short of randomisation to attribute cause to consequent outcome is a waste of time. If observational comparison does have any significant part in evaluating medical outcomes, there is a grave danger of "the best", to paraphrase Voltaire, becoming "the enemy of the good". The first section aims to emphasise the advantages of randomised controlled trials. Then the nature of an essential precondition--medical uncertainty--is discussed in terms of its extent and effect. Next, the role of patient choice in medical decision making is considered, both when outcomes can safely be attributed to treatment choice and when they cannot. There may be many important situations in which choice itself affects outcome and this could mean that random comparisons give biased estimates of true therapeutic effects. In the penultimate section, the implications of this possibility both for randomised controlled trials and for outcome research is pursued and lastly there are some simple recommendations for reliable outcome research. [Medline]
evidence >> apples >> overview (2)
What is a P-value? Ronald Thisted. Accessed on 2003-06-20. "Results favoring one treatment over another in a randomized clinical trial can be explained only if the favored treatment really is superior or the apparent advantage enjoyed by the treatment is due solely to the working of chance." www.stat.uchicago.edu/~thisted/Distribute/pvalue.pdf
Study designs in medical research. Ronald Thisted. Accessed on 2003-06-20. "Study design is the procedure under which a study is carried out." galton.uchicago.edu/~thisted/courses/315/lectures/0297.pdf
evidence >> apples >> randomization (46)
The mythology of randomization. U. Abel, A. Koch. Accessed on 2003-06-30. "In biostatistics and medicine one sometimes encounters an extremely negative view or even a categorical rejection of nonrandomized studies. This attitude may be comprehensible from a historical, pragmatic, or educational viewpoint but it is not well-founded on epistemological grounds. In addition, it is potentially harmful." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/abel.htm
Coronary artery surgery study (CASS): a randomized trial of coronary artery bypass surgery. Comparability of entry characteristics and survival in randomized patients and nonrandomized patients meeting randomization criteria. CASS Principal Investigators and Their Associates. Journal of the American College of Cardiology 1984: 3(1); 114-28. The Coronary Artery Surgery Study (CASS) includes a randomized trial of coronary artery bypass surgery and medical therapy in the management of patients with mild or moderate stable angina pectoris or free of angina but with a documented history of myocardial infarction. While 780 patients at 11 participating institutions entered the randomized trial, 1,315 patients at the same institutions met randomization criteria but declined participation in the randomized study; they constitute the "randomizable" patients. Half the randomized patients were assigned to surgery and half to the medical group. Of the 1,315 randomizable patients, 43% started with surgical therapy and 57% constitute the medical group. Follow-up periods average 64 months (range 46 to 92). The only entry characteristic in which the randomized and randomizable medical groups differ importantly is the extent of coronary artery disease, which is less extensive in the latter. The two surgical groups also differ in this respect, but with more extensive disease in the randomizable group. At 5 year follow-up, 24% of the medically-assigned randomized patients and 22% of the medically-started randomizable patients have had coronary bypass surgery. Survival in the medically-randomized and randomizable patient groups is similar in the aggregate (both 92% at 5 years) and also in all subgroups based on clinical classification, the number of diseased vessels, the presence of proximal left anterior descending coronary artery disease and ejection fraction. 
Survival for the surgically-assigned randomized patients and the surgically-started randomizable patients is also similar in the aggregate (95 and 94%, respectively) and in all subgroups. It is concluded that the randomized patients in CASS are not a special or atypical subset of those eligible for randomization. The data from the randomizable patients thus support and extend the inference of the generally very good survival of both the medically- and surgically-assigned patients of the randomized trial. [Medline]
The Paired Availability Design: An Update. S. G. Baker. Accessed on 2003-06-30. "Baker and Lindeman [3] introduced the paired availability design for strengthening inference when using historical controls. We review the design in the context of the following updates. First, we make the notation similar to that in the recent literature on all-or-none compliance in randomized trials. See the review in Baker [2] and Angrist et al. [1]. Second, in addition to excess risk, we consider the relative risk as a possible test statistic. Cuzick et al. [4] independently made similar calculations in the context of a randomized trial with all-or-none compliance. Third, we recommend using the inverse of the variance rather than the inverse of the standard error when weighting estimates from multiple pairs. This was also independently suggested by Cuzick et al. [4] in the context of randomized trials. Fourth, to improve the sample size calculation we suggest a method for using exogenous data to estimate the variation due to random time changes. Fifth, we propose an adjustment for one type of systematic change over time." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/baker.htm
Unconventional therapies and cancer. M. Begin, E. Kaegi. Cmaj 1999: 161(6); 686-7. Abstract not available yet.
Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments. I. Chalmers. Int J Epidemiol 2001: 30(5); 1156-64. Histories of clinical trials have recorded and analysed the development of quantification in therapeutic evaluation, the emergence of probabilistic thinking, the application of statistical methods and theory, and the sociology, ethics and politics of clinical trials; but it is surprising that they only rarely identify as a distinct theme the development of efforts to control biases. An exception is Kaptchuk's recent account of the history of blinding and placebos for reducing observer biases. In this complementary paper I introduce and discuss some milestones between 1662 and 1948 in the development of methods to control selection biases when assembling therapeutic comparison groups, to ensure, as far as possible, that 'like is compared with like'. In the paper I note (i) that treatment allocation based on strict alternation abolishes selection bias as effectively as treatment allocation based on strict random allocation; (ii) that use of schedules based on random numbers is more likely to prevent foreknowledge of allocation schedules, and thus the risk of introducing selection bias at the point of recruitment to trials; (iii) that a concern to conceal allocation schedules was the rationale for using schedules based on random numbers in the Medical Research Council trials of vaccination for whooping cough and streptomycin for pulmonary tuberculosis; and (iv) that the introduction of allocation concealment more than half a century ago remains the most recent substantive milestone in the history of efforts to control selection biases in therapeutic experiments.
Experimental Study versus Non-Experimental Study: The Non-Experimental (Non-Randomized) Study as a Methodological Compromise. K. Dannehl. Accessed on 2003-06-30. "Most methodologists agree that the experimental study is not only the best method for physics, chemistry, and biology, but also for medical research. However, often one has to be satisfied with non-experimental, i.e., less than optimal, designs for gaining knowledge. This is due to organisational and economic, as well as legal and ethical limits which we often meet when we conduct experiments in humans and which we can not, may not, or do not want to go beyond." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. Translated by Christoph Trautner. www.symposion.com/nrccs/dannehl.htm
Evidence-based conclusions on the efficacy of a treatment - What can be learned from risk assessment? L. Edler. Accessed on 2003-06-30. "A comparative clinical trial for which the randomization of the patients is either impossible, unwelcome, or inopportune, misses a basic justification for the establishment of a causal relationship between the treatment and the health outcome. Various designs of observational studies have been developed with the aim of identifying and defining treatments which may have a curative or palliative effect for patients despite the absence of this methodological requirement. The discussions of pros and cons of these approaches make obvious the need of new methodologies for clinical studies when treatments and effects are to be related in a non-randomized set-up. In this situation, it may be helpful to adopt an approach similar to that of toxicology where the exposure to hazardous substances is related to the possible noxious effects on human health. Usually randomized studies are unavailable for risk assessments so that toxicological epidemiology has to base its conclusions on best available evidence. In this contribution analogies, resemblances, and dissemblances between risk assessment and treatment evaluation using nonrandomized studies are shown, and, on the basis of partial concordance, a proposal for the achievement of evidence-based Therapy Assessment (EBTA) is derived for a causal relationship between treatment and its effects on the human disease relief. EBTA may be helpful for structuring, ordering and weighting medical evidence when consensus on treatment recommendations has to be found in the face of results from randomized as well as non-randomized studies, and other data." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/ledler.htm
Problems of Randomized Trials. A.R. Feinstein. Accessed on 2003-06-30. "Regardless of how wonderful randomized trials are - and I will yield to no one in acknowledging them as the gold-standard when they can be done - they have some major problems and difficulties (Table 1)." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/feinstein.htm
A Nonparametric Test for Evaluating Coherent Alternatives in Nonrandomised Studies. O. Gefeller, L. Pralle. Accessed on 2003-06-30. "When considering the effect of treatments or exposures on some outcome variable in a nonrandomised study, the presence of coherence provides supporting evidence that an observed relationship between the factors of interest might reflect a causal treatment or exposure effect. In our understanding, coherence means that we have a specific and detailed description of what an actual treatment or exposure effect would look like. The concept of coherence can then be used to formulate a 'coherent pattern' of expected results, indicative of a real effect of the treatment or exposure under study, that can be tested using the observed data. In the paper, we review a simple nonparametric rank test, developed by Rosenbaum, for testing the null hypothesis of no treatment/exposure effect against arbitrarily complicated coherent alternatives. In addition, we introduce a new measure of coherence to summarise quantitatively the coherence present in the data. Two empirical examples, one epidemiological investigation and one nonrandomised clinical trial, illustrate the application of the methodology." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/gefeller.htm
Randomized Controlled Trials: Evidence Biased Psychiatry. David Healy, Alliance for Human Research Protection. Accessed on 2002- "A new drug gets introduced to the market. It has been approved after stringent scrutiny by the FDA, which requires ever more convincing evidence that it works and that it's safe. The new treatment will always cost more than the old treatments, but even on the cost front, many would argue that we have entered an era where placebo controlled clinical trials demonstrate that new in contrast to older treatments actually do work, and if we just stick to treatments that really work costs should fall. Besides it always seems to happen these days that when new and costly antidepressants or antipsychotics are put through an economic model based on the figures from clinical trials and a range of assumptions provided by experts, the model demonstrates that these new drugs costing thousands of dollars a year are in fact cheaper than treatments costing $100 per year or less. So where could the problems lie? Why do we seem to be so slow in reaching the new medical utopia towards which companies and others assure us we are heading?" www.researchprotection.org/COI/healy0802.html
The Analysis of Intervention Effects Using Observational Data Bases. C. Heuer, U. Abel. Accessed on 2003-06-30. "If, in a clinical unit, a new treatment is introduced within a short time period the problem arises as to how to evaluate its immediate impact on the patients’ prognosis, i.e., the (possible) intervention effect. An exploratory tool is described which can be employed to examine this effect. The method is illustrated by means of a clinical example." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/heuer.htm
Proof versus plausibility: rules of engagement for the struggle to evaluate alternative cancer therapies. L. J. Hoffer. Cmaj 2001: 164(3); 351-3.
Methodological Contributions to Clinical Research: Random Sampling, Randomization, and Equivalence of Contrasted Groups in Psychotherapy Outcome Research. Louis M Hsu. Journal of Consulting and Clinical Psychology 1989: 57(1); 131-137.
Comparison of evidence of treatment effects in randomized and nonrandomized studies. J. P. Ioannidis, A. B. Haidich, M. Pappa, N. Pantazis, S. I. Kokori, M. G. Tektonidou, D. G. Contopoulos-Ioannidis, J. Lau. JAMA 2001: 286(7); 821-30. CONTEXT: There is substantial debate about whether the results of nonrandomized studies are consistent with the results of randomized controlled trials on the same topic. OBJECTIVES: To compare results of randomized and nonrandomized studies that evaluated medical interventions and to examine characteristics that may explain discrepancies between randomized and nonrandomized studies. DATA SOURCES: MEDLINE (1966-March 2000), the Cochrane Library (Issue 3, 2000), and major journals were searched. STUDY SELECTION: Forty-five diverse topics were identified for which both randomized trials (n = 240) and nonrandomized studies (n = 168) had been performed and had been considered in meta-analyses of binary outcomes. DATA EXTRACTION: Data on events per patient in each study arm and design and characteristics of each study considered in each meta-analysis were extracted and synthesized separately for randomized and nonrandomized studies. DATA SYNTHESIS: Very good correlation was observed between the summary odds ratios of randomized and nonrandomized studies (r = 0.75; P<.001); however, nonrandomized studies tended to show larger treatment effects (28 vs 11; P =.009). Between-study heterogeneity was frequent among randomized trials alone (23%) and very frequent among nonrandomized studies alone (41%). The summary results of the 2 types of designs differed beyond chance in 7 cases (16%). Discrepancies beyond chance were less common when only prospective studies were considered (8%). Occasional differences in sample size and timing of publication were also noted between discrepant randomized and nonrandomized studies. In 28 cases (62%), the natural logarithm of the odds ratio differed by at least 50%, and in 15 cases (33%), the odds ratio varied at least 2-fold between nonrandomized studies and randomized trials. CONCLUSIONS: Despite good correlation between randomized trials and nonrandomized studies (in particular, prospective studies), discrepancies beyond chance do occur and differences in estimated magnitude of treatment effect are very common.
Amniotomy or oxytocin for induction of labor. Re-analysis of a randomized controlled trial. M. J. Keirse. Acta Obstet Gynecol Scand 1988: 67(8); 731-5. A recently reported "prospective, randomized study into amniotomy and oxytocin as induction methods in a total unselected population" was examined for selection bias and bias after entry into the study. The null hypothesis that clinical attitudes to amniotomy as a means for inducing labor had no influence on the decision to enter women into the trial and allocate them to either amniotomy or oxytocin was rejected at p less than 0.00025. Clinical attitudes were further found to statistically significantly influence the prescribed assessments 4 h after entry into the trial and the selection of the second intervention that was required in the absence of acceptable progress (p less than 0.0005). Bias at the time of this prescribed assessment was large enough to result in an inverse relationship between "acceptable progress within 4 hours" and "delivery within 24 hours" after induction. A subanalysis of the nulliparae entered into the trial further substantiated both bias at entry and bias in following the prescribed protocol. As hypothesized, these biases reached a greater statistical significance in nulliparous than in parous women. The likelihood that all of these observations would be encountered in a truly randomized study of this size can be estimated to be less than one in a billion (or p less than 0.000,000,000,000,1). The study, therefore, provides a classical example of the dangers of non-blind allocation to different treatment groups in clinical trials. It is further concluded that no randomized controlled studies between amniotomy and oxytocin in a "total unselected population" are available.
Discussion: Why Clinical Trials in the Evaluation of Life Style Evaluation? Genell L. Knatterud, PhD. Controlled Clinical Trials 1997: 18(6); 514-516. Abstract not available.
"The 60-Minutes-Myocardial Infarction Project": Comparison with a Registry and a Randomized Clinical Trial. A. Koch, A. Hörmann, H. Löwel, J. Senges. Accessed on 2003-06-30. "There is an ongoing debate about whether observational studies can produce reliable information on treatment comparisons. Randomized clinical trials are the accepted gold standard for this purpose. It is, however, impossible to investigate all important issues in randomized trials. Large observational studies are frequently performed and large clinical databases are available. Thus it is a relevant question, how reliable data from nonrandomized studies are. In this contribution, data of a large nonrandomized multicenter study on decision-making with respect to thrombolytic treatment in patients with acute myocardial infarction are compared with a randomized clinical trial and a population based registry. It is demonstrated that similar event rates are observed in those subgroups of the observational study that are comparable with the randomized study or the registry. Especially, there is no indication of underreporting of deaths in the observational study that would invalidate all investigations on treatment comparisons based on the observational study in the first step. Although such results can hardly be generalized to other situations, they might help balance the view on so-called horror-stories, where results of observational studies could not be verified in subsequent randomized clinical trials." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/koch.htm
Breastfeeding and infant growth: biology or bias? M. S. Kramer, T. Guo, R. W. Platt, S. Shapiro, J. P. Collet, B. Chalmers, E. Hodnett, Z. Sevkovskaya, I. Dzikovich, I. Vanilovich. Pediatrics 2002: 110(2 Pt 1); 343-7. BACKGROUND: Available evidence suggests that prolonged and exclusive breastfeeding is associated with lower infant weight and length by 6 to 12 months of age. This evidence, however, is based on observational studies, which are unable to separate the effects of feeding mode per se from selection bias, reverse causality, and the confounding effects of maternal attitudinal factors. DESIGN/METHODS: A cluster-randomized trial in the Republic of Belarus of a breastfeeding promotion intervention modeled on the World Health Organization (WHO)/UNICEF Baby-Friendly Hospital Initiative versus control (then current) infant feeding practices. Healthy, full-term, singleton breastfed infants (n = 17 046) weighing > or =2500 g were enrolled soon after birth and followed up at 1, 2, 3, 6, 9, and 12 months old for measurements of weight, length, and head circumference. Data were analyzed according to intention-to-treat, while accounting for within-cluster correlation. To assess the potential for bias in observational studies of breastfeeding, we also analyzed our data as if we had conducted an observational study by ignoring treatment, combining the 2 randomized groups, and comparing 1378 infants weaned in the first month and those breastfed for the full 12 months of follow-up with either > or =3 months (n = 1271) or > or =6 months (n = 251) of exclusive breastfeeding. RESULTS: Infants from the experimental sites were significantly more likely to be breastfed (to any degree) at 3, 6, 9, and 12 months and were far more likely to be exclusively breastfed at 3 months (43.3% vs 6.4%). Mean birth weight was nearly identical in the 2 groups (3448 g, experimental; 3446 g, control). Mean weight was significantly higher in the experimental group by 1 month of age (4341 vs 4280 g). The difference increased through 3 months (6153 g vs 6047 g), declined slowly thereafter, and disappeared by 12 months (10564 g vs 10571 g). Analysis by z scores confirmed that infants in both groups gained more weight than the WHO/Centers for Disease Control and Prevention reference, with no evidence of undernutrition in the control group. Length followed a similar pattern. In the observational analyses, infants weaned in the first month were slightly lighter and shorter at birth and their weight-for-age and length-for-age z scores declined by 1 month, but they caught up to both experimental and the other observational groups by 6 months and were heavier and longer by 12 months. Among infants in the 2 prolonged and exclusive breastfeeding groups, weight-for-age z scores fell slightly between 3 and 12 months; length-for-age fell below the reference by 6 months with catch-up to the reference by 12 months. Head circumference showed no significant differences at any age between the 2 trial groups or among the observational groups. CONCLUSIONS: Our data, the first in humans based on a randomized experiment, suggest that prolonged and exclusive breastfeeding may actually accelerate weight and length gain in the first few months, with no detectable deficit by 12 months old. These results add support to current WHO and UNICEF feeding recommendations. Our observational analysis showing faster weight and length gains with early weaning and slower gains with prolonged and exclusive breastfeeding may reflect unmeasured confounding differences or a true biological effect of formula feeding.
Problems of Randomized Controlled Trials (RCT) in Surgery. R. Lefering, E. Neugebauer. Accessed on 2003-06-30. "Randomized controlled trials (RCT) are widely accepted as the gold standard for comparing different therapeutic modalities. The random allocation of patients avoids a selection bias and cares for an equal distribution of conscious as well as unconscious prognostic factors among the study groups, provided the number of patients included is large enough. The credibility of study results is further enhanced by applying techniques like independent investigators, blinding techniques, or homogenisation of patients and therapy." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/lefering.htm
The Psychic Staring Effect: An Artifact of Pseudo Randomization. David F. Marks, John Colwell. Skeptical Inquirer 2000: 24(5); 41-44 and 49.
Evaluating complementary medicine: methodological challenges of randomised controlled trials. S. Mason, P. Tovey, A. F. Long. BMJ 2002: 325(7368); 832-4.
Reference - controlled observational studies - a new tool for post marketing studies and for evaluation of preventive measures. J. Michaelis. Accessed on 2003-06-30. "Several limitations of controlled clinical trials in phase-III drug research (e.g., highly selected patients, limited size of trials) make it mandatory to perform extensive research also in the post marketing phase. It is proposed to enhance the achievable evidence on therapeutic effects from large observational studies by designing small nested randomized trials. In contrast to the "comprehensive cohort studies" which have been discussed several years ago [25, 26], this combination of observational and experimental studies is planned in advance and does not result from patients’ compliance with the idea of randomization." Published in the Proceedings of the International Conference on Nonrandomized Comparative Clinical Studies in Heidelberg, April 10-11, 1997. www.symposion.com/nrccs/michaeli.htm
Effects of a Combination of Beta Carotene and Vitamin A on Lung Cancer and Cardiovascular Disease. GS Omenn, GE Goodman, MD Thornquist, J Balmes, MR Cullen, A Glass, JP Keogh, FL Meyskens, B Valanis, JH Williams, S Barnhart, S Hammar. The New England Journal of Medicine 1996: 334(18); 1150-1155.