Stats #71: Control charts for continuous monitoring of the number needed to harm.
Content: While most of the efforts in signal detection use newly developed data mining algorithms that are both complex and computer intensive, there is still room in your research arsenal for simpler approaches that have withstood the test of time, like the statistical process control chart. By applying a straightforward data transformation, you can use the control chart to monitor the Number Needed to Harm (NNH), an easily interpreted measure of absolute risk.
Teaching strategies: Didactic lectures and small group exercises.
Objectives: In this class you will learn how to
- Identify those situations where simple control charts are preferable, but also recognize their risks and limitations.
- Adapt different decision rules and alternate control chart formats to increase your sensitivity for small but consistent shifts in risk.
- Establish rational targets for the NNH that balance the benefits of a new drug against its risks.
Outline of this talk.
Introduction
- Information about my book.
- Where can you find this handout?
- Why don't I use PowerPoint?
- A plea for open-mindedness
Review
- What is a control chart?
- What is a special cause of variation?
- What is a common cause of variation?
- Statistical koan: The Busy Tailor.
- Advantages of control charts
- Disadvantages of control charts
- Advantages of data mining models
- Two cautionary tales about data mining
A new and simple approach for monitoring safety data
- Date gaps rather than rates
- Adjustments for patient load and the number needed to harm calculations
- What is a reasonable value for NNH?
- Monitoring targets with a CUSUM chart
- Bayesian prior distributions and their application to safety data
Examples
- Monitoring adverse events during peritoneal dialysis
- Tracking central line infections over time
- Tracking adverse events during kidney biopsy.
Conclusion
Information about my book, Statistical Evidence in Medical Trials
I recently published a book, Statistical Evidence in Medical Trials: What Do the Data Really Tell Us?, through Oxford University Press. A good summary of what this book is about appears on the back cover: "Statistical Evidence in Medical Trials is a lucid, well-written and entertaining text that addresses common pitfalls in evaluating medical research. Including extensive use of publications from the medical literature and a non-technical account of how to appraise the quality of evidence presented in these publications, this book is ideal for health care professionals, students in medical or nursing schools, researchers and students in statistics, and anyone needing to assess the evidence published in medical journals." A review by Rebecca Rooney in the International Journal of Epidemiology states: "This book is a clear, concise, and interesting read and should prove to be a useful guide. The examples and case studies make it easy to understand difficult concepts and the jokes and stories make it fun. There are some salient points and hopefully the reader will be enthused about looking at the published research and be more confident about distinguishing between the good and the bad." More information about the book (supporting materials, answers to the exercises, and other updates) can be found on the web at http://www.childrensmercy.org/stats/evidence.asp.
Where can you find this handout?
This handout and the handouts that I use for all of my seminars and training classes are a compilation of individual web pages at www.childrensmercy.org/stats. I use the "Include Page" feature of Microsoft FrontPage to combine these into a single page. You can always find the most recent version of this compilation by going to the web address listed at the bottom of this page. Links for the handouts for other seminars and classes appear at www.childrensmercy.org/stats/training.asp.
Why don't I use PowerPoint?
I stopped using PowerPoint for my presentations in the mid-1990s. This was based on Edward Tufte's advice that presenting information in a paper handout is more effective than presenting the information on a projected screen. I found this to be excellent guidance. I enjoy talking when I don't have to wrestle with a laptop computer. I look at my audience more and interact with them better. I elaborate on this at www.childrensmercy.org/stats/weblog2004/powerpoint.asp.
A plea for open-mindedness (November 2, 2006).
Most people that I work with are quite open minded, but I do encounter, from time to time, someone who is resistant to ideas that originate from outside the sphere of medicine. In particular, some individuals are almost cynical about the application of quality control in health care. The attitude seems to be something like this:
Quality control is an approach that works on assembly lines. I am a doctor not a factory worker, and my patients are not products on an assembly line.
That's a fair statement. Patients are not widgets, and it is a mistake to treat them the same way. But it's also a mistake to think that we can't learn from how other people have approached problems that do indeed bear some semblance of similarity to the problems that you face.
Let me mention a slightly different area where healthcare professionals are indeed listening to and learning from outsiders. Patient safety is a very important issue in hospitals. Healthcare professionals recognize they make mistakes and their patients sometimes suffer from those mistakes. There are numerous well publicized examples of this, such as the following tragic report:
Boy, 6, killed in MRI accident. The Journal News. Melissa Klein and Oliver W. Prichard. (Original publication: July 31, 2001) VALHALLA — A 6-year-old boy died two days after he was smashed in the head by a metal oxygen canister that was pulled by magnetic force into the MRI machine where he was being examined, Westchester Medical Center officials said yesterday. An unidentified hospital employee brought the oxygen tank within reach of the 10-ton magnet's field, and it shot through the air to the center of the machine, the hospital said. (Source: www.mrireview.com/docs/mrideath.pdf).
Or this one:
Doctor mistakenly amputates wrong foot. Dec 28, 2004 (TUXTLA GUTIERREZ, Mexico) - A doctor at a public hospital in southern Mexico mistakenly amputated the right leg of an elderly patient who had sought treatment for an infection in his left foot, the patient's family announced Sunday. Seeking treatment for a foot wound aggravated by diabetes, Alberto Lopez, 74, was admitted to a Social Security Institute hospital in Tuxtla Gutierrez, 430 miles south of Mexico City, and underwent surgery on Friday. But the patient emerged from surgery without a right leg and still suffering from the original infection according to family members who filed a complaint Sunday with the state attorney general's office and a national medical arbitration commission. (Source: abclocal.go.com/wls/story?section=News&id=2552947).
Anecdotes like these have a strong emotional impact on healthcare professionals and on the general public. Healthcare professionals also recognize that many preventable deaths and illnesses in our hospitals go unrecognized, and that simple interventions like regular handwashing are ignored. So who have the doctors, nurses, and other medical professionals turned to for help with patient safety? The surprising answer is documented nicely in a newspaper article by Kate Murphy in the October 31, 2006 issue of the New York Times, What Pilots Can Teach Hospitals About Patient Safety. This article has a very strong lead.
Wearing scrubs and slouching in their chairs, the emergency room staff members, assembled for a patient-safety seminar, largely ignored the hospital’s chief executive while she made her opening remarks. They talked on their cellphones and got up to freshen their coffee or snag another danish. But the room became still and silent when an airline pilot who used to fly F-14 Tomcats for the Navy took the lectern. Handsome, upright and meticulously dressed, the pilot began by recounting how in 1977, a series of human errors caused two Boeing 747s to collide on a foggy runway in the Canary Islands, killing 583 people. Riveted, a surgeon gripped his pen with both hands as if he might break it, an anesthetist stopped maniacally chewing his gum, and a wide-eyed nurse bit her lip. An attention grabber, yes, but what does an airplane crash have to do with patient safety?
Apparently some pretty important people in the healthcare industry do believe that there is a link:
After the Canary Islands accident, NASA convened a panel to address aviation safety and came up with a program called Cockpit or Crew Resource Management. The Federal Aviation Administration requires that all pilots for commercial airlines and the military undergo the training. They learn, among other things, to recognize human limitations and the impact of fatigue, to identify and effectively communicate problems, to support and listen to team members, resolve conflicts, develop contingency plans and use all available resources to make decisions.
Recognizing the positive impact of the program on the aviation industry’s safety record, the Institute of Medicine in 2001 recommended similar training for health care workers. The National Academies, the Agency for Healthcare Research and Quality and the Institute for Healthcare Improvement also advocate the training, as well as the use of other aviation-inspired practices like pre- and post-operative briefings, simulator training, checklists, annual competency reviews and incident reporting systems.
So is there a commonality between landing an airplane at Heathrow and excising a gall bladder?
“The trend is not surprising given the similarities between health care and aviation,” said Dr. David M. Gaba, associate dean of immersive and simulation-based learning at the Stanford University School of Medicine in Palo Alto, Calif. “Both involve hours of boredom punctuated by moments of sheer terror,” he said. In addition to sometimes having to make life-and-death decisions in seconds, pilots and physicians also tend to be highly skilled, Type A personalities, who rely heavily on technology to do their jobs.
There are differences as well and the article points these out.
The definition of an error in health care, Professor Helmreich said, is “fuzzier” than in aviation, where it is easier to identify a “foul-up” and who was responsible. Health care providers’ fear of litigation and losing their medical licenses also hinders the honest reporting of mistakes, whereas aviators are often inoculated against punishment if they promptly report incidents to the authorities. Training programs developed by pilots without knowledge of health care realities can be “appallingly bad,” he said.
I believe that a respectful attitude couched in humility is the best approach for people who are advocating new approaches and people who are listening to those advocates. We can't force fit solutions from the outside that don't respect the unique aspects of healthcare, but neither can we pretend that healthcare is so unique that only an insider can make changes and suggestions for improvement.
If healthcare professionals can learn from pilots, there may even be a sliver of hope that they can learn from statisticians.
This webpage was written by Steve Simon on 2007-11-02, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Quality control
What is a control chart? (November 11, 2006). Category: Definitions, Category: Control charts
A control chart is a graphical tool, used in many industrial settings, that monitors a work process on a continual, ongoing basis. Here is an example of a control chart from the Engineering Statistics Handbook, published by the U.S. National Institute of Standards and Technology.
Source: www.itl.nist.gov/div898/handbook/pmc/section3/pmc322.htm
There is a small typographical error in this chart, but it illustrates the general structure quite well. The control chart is simply a run chart (a plot of a sequence of values) with three reference lines. The center line is typically drawn at the average of all of the data. The control chart also includes two control limits, an upper control limit (UCL) and a lower control limit (LCL). The control limits are set a certain distance away from the center line (I'm deliberately being vague here). Any data values that fall above the UCL or below the LCL are described as "out of control" and represent a "special cause" of variation. If all the data values lie between the LCL and the UCL, the work process is said to be "in control" and all of the observed variation represents "common cause" variation.
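To make the three reference lines concrete, here is a minimal sketch in Python. It assumes that the control limits sit three standard deviations from the center line, which is a common convention but only one of several choices, and it estimates the spread with the ordinary sample standard deviation rather than the moving-range estimate that many control chart references prefer. The function names and the baseline values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (LCL, center line, UCL) computed from baseline data.

    Assumption: limits are k sample standard deviations from the mean.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center, center + k * spread

def out_of_control(values, lcl, ucl):
    """Return the positions of values that fall below the LCL or above the UCL."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical baseline measurements from a stable period.
baseline = [10, 12, 11, 9, 10, 11, 10, 9]
lcl, center, ucl = control_limits(baseline)

# Hypothetical new measurements to monitor against those limits.
new_values = [10, 11, 15, 9]
flagged = out_of_control(new_values, lcl, ucl)
print(f"LCL={lcl:.2f}, center={center:.2f}, UCL={ucl:.2f}, flagged={flagged}")
```

In this sketch the third new value (position 2) lies above the UCL, so it would be treated as a signal of a special cause. The other values stay inside the limits and would be read as common cause variation.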
This webpage was written by Steve Simon on 2006-11-11, edited by Steve Simon, and was last modified on 2008-07-08. This page needs minor revisions. Category: Control charts, Category: Definitions.
What is a special cause of variation?
A special cause of variation is a variation from the mean that has an assignable cause. When you have a special cause in your work process, you need to investigate immediately (while the trail is still warm).
When you spot a special cause:
- The first thing to do is control any damage or problems with an immediate, short-term fix. Be careful not to view this fix as a permanent solution or the process will never be improved.
- Once a quick fix is in place, search for the cause. Ask people in the process what was different that time. What was out of the ordinary? It might not have been much – an unexpected emergency, a change in schedules, or new materials. The need for this sort of information is part of the reason for collecting very complete data the first time around, noting details and traceability factors about a sample or recorded event.
- Once you have discovered the special cause, you can develop a longer-term remedy. Most special causes have a negative impact on the output of the process and need to be removed. Occasionally, a special cause can have a positive impact depending on the nature of the process. If this is the case, find ways to capture and integrate it into the system.
Avoid these mistakes:
- Changing the process to accommodate the special cause. This usually adds cost and bureaucracy.
- Blaming individuals. Not only does everyone make mistakes, but chances are that the problem would have occurred regardless of the individuals involved.
- Exhorting workers to simply "do better." People can only do as well as the system allows them to do.
Source: www.skymark.com/resources/responding_to_variation.asp
What is a common cause of variation?
A common cause of variation is a variation from the mean that is caused by the system as a whole. This variation is not due to an assignable cause, but rather represents variation inherent in the process you are studying.
When a work process has only common causes of variation and no special causes, that process is "in control." This means that it is stable, consistent, and predictable. It might be predictably good or predictably bad, or it might be a very regular mix of good and bad results.
What do you do with a common cause of variation?
Just because a process is stable, or in statistical control, does not mean that its results are satisfactory. A process may be very consistent, day in and day out making items that are nowhere near specification limits. Or, as the Japanese have done so successfully, variation can be systematically reduced, even in stable processes, enabling a gradual tightening of specification limits, and an overall increase in product quality at lower cost.
Improving a stable process is somewhat more difficult than improving an unstable process because, by definition, a stable process has no special causes of variation that jump out at you, asking to be investigated. Instead, you are faced with the task of looking at all data about the process, not simply what made one point different from the others.
Common causes of variation often lie hidden within the system, and are sometimes assumed to be unavoidable. Yet it is very possible, and often very rewarding, to improve processes and reduce common cause variation. Experience has shown that, amongst the people in and around the process, there are enough ideas for improvements to make a significant impact, even on a sound process. (Source: www.skymark.com/resources/responding_to_common_cause_variation.asp).
The following story illustrates the problems that can occur when you fail to recognize the difference between common cause and special cause variation.
The Busy Tailor
When it was his turn to explain his recent work, Student Leaf stood up and portrayed an elegant experiment that used a central composite design with four factors. Master Stem asked, "Is this process ready for such an experiment?"
Student Leaf replied, "I do not understand."
Master Stem looked at him with an air of amusement. "If this process is not ready for an experiment, then you will make yourself very busy for no good reason."
"How can I tell, Master Stem, if a process is ready?"
"Have you computed a control chart for this process? Do you know if the process is in control?"
"I have not computed a control chart, but I do know that the process is too variable. I want to run an experiment to reduce that variation."
"I have a tailor I would like you to meet. He makes all the clothes for my family. I brought my oldest child in for a fitting and the tailor made measurements and started sewing. When I visited the next time, I had my youngest child with me. I apologized, but the tailor still insisted on doing the fitting. This required ripping out all the old seams, remeasuring and resewing. 'I am almost done with the clothes for your youngest child,' he told me, 'please come back tomorrow.' So I returned the next day, but this time I was accompanied by my middle child. 'No matter,' replied the tailor, 'I will rip out all the seams again and make the clothes fit your middle child.'"
"That is a very foolish tailor, Master Stem."
"And you, too, are foolish if you run an experiment without looking at the control chart first. If your process is out of control, that tells you that your process is not a single process, but is many instead. And you do not know which process is visiting at any time. Your experiment, carefully optimized for one process, will fit poorly for the other processes."
This webpage was written by Steve Simon and was last modified on 2008-07-08. Category: Teaching resources
The pros and cons of control charts versus data mining (November 17, 2007)
In a talk I gave in December 2006, I highlighted how in the analysis of adverse event data, control charts can augment more complex statistical tools like data mining. Here's a summary of the pros and cons of using control charts.
Advantages of control charts. Control charts were originally proposed by Walter Shewhart in the 1920's. There is a lot of history behind the control charts, allowing for lots of experience to prove their usefulness and adaptability in a wide range of applications.
The long history of the control chart also makes it a tool that is familiar and comfortable to a lot of people. While most of the applications are in industrial areas, a book published a decade ago,
- Measuring Quality Improvement in Healthcare: A Guide to Statistical Process Control Applications. Carey RG, Lloyd RC (1995) New York: Quality Resources.
highlights numerous applications of control charts in health care.
Finally, the control chart is easy to use. Even with some of the recent enhancements and extensions, control charts remain a relatively simple and accessible tool. You don't need a lot of state-of-the-art statistical tools like you do for a data mining project.
This means that you don't need a lot of statistical and computational expertise to use control charts. There are only a small number of people who have the qualifications and the expertise to do a good job with a data mining model. By placing control charts in the hands of a larger number of people, you increase the number of eyes that look at a problem and (in theory) increase the chances that safety problems are found early.
Disadvantages of control charts. The control chart is an exploratory tool. If the control chart shows a point out of control, the chart won't explain to you WHY it is out of control.
The control chart won't help to identify a subgroup at greater risk if you did not have the foresight to monitor that group. It also won't identify an adverse event that was unexpected. With a control chart, you have to know what you're looking for.
While there are some adaptations of control charts for multivariate data, seasonal data, and other complexities, the control chart is not easily adapted to these types of complexities.
Advantages of data mining models
Data mining models excel in situations where the data streams are large and complex. Some of the data mining methods are adept at handling ambiguous data and missing data. They can also detect subtle non-linearities and interactions that most other statistical methods might miss.
While the data mining methods are not as easy to use as their proponents claim (the old saw "easy to use is easy to say" certainly applies here), the researchers in this field go to great lengths to automate key components of the data mining process. Many methods incorporate techniques like cross validation that allow you to quickly home in on a model that is neither too complex nor overly simple.
There is a wealth of data mining tools, each with its own particular strengths, so a sophisticated modeler can apply a variety of data mining methods to rapidly triangulate on an accurate solution.
Finally, data mining models are just a lot of fun. Or am I the only one who thinks this sort of thing is cool?
Disadvantages of data mining models. While some of the disadvantages of data mining models are highlighted above (the need for highly trained personnel and specialized software), perhaps two additional disadvantages can be summarized by a couple of personal anecdotes that I originally discussed in a January 6, 2005 weblog entry.
The first story was told to me by a doctor here at Children's Mercy, Jay Portnoy. He was describing a data mining model that was fed images of both cars and trucks (a training set, in the parlance of data mining) to see if it could develop a rule for identifying whether a future image was either a car or a truck based just on mathematical properties of that image. It did a pretty good job of finding factors in the training set that distinguished between cars and trucks. But it failed miserably on the first new image it was trying to classify. It was an image of a car on a snow covered highway. The data mining algorithm said that this was almost certainly a truck. What the researchers then realized is that in the training set, anytime there was snow in the background, it was a truck that was being shown and never a car. I suppose it is the tendency of marketing to always show trucks in rugged, primitive, and/or dangerous driving conditions. So the data mining model seized on a key relationship (color of the background) that existed only accidentally in the training set, rather than focusing on those aspects, such as the shape and size of the vehicle, that most of us would use to distinguish cars from trucks.
Moral from anecdote #1. Even the most sophisticated data mining models cannot overcome deficiencies in your data.
The second story was one I heard in a training class by Richard DeVeaux on data mining models that dealt with the question "so what?". He mentioned one of the earliest findings in a data mining model world (though he is uncertain if this is a true story or an urban legend) was that there was an unusual association seen in sales patterns at convenience stores. It seemed that people who came in to buy beer almost always ended up buying diapers at the same visit. This is the classic sort of thing that data mining models are supposed to find: unusual and unexpected associations in a very large data set. So he posed this question to a group of managers: what would you do with this information? A common response was: stock the shelves so that the beer and the diapers are close together to make the trip for the customer faster and more convenient. Another common response was: put the beer and the diapers at opposite ends of the store so that customers would have to spend more time in the store, increasing the chances for impulse purchases. Another common response was a shrug of the shoulders. In fact, we often don't know what to make of the associations found by data mining models.
Moral from anecdote #2. Significant findings from a data mining model are not guaranteed to provide appropriate clinical guidance.
The bottom line. No one statistical tool or method is going to provide you with everything you need. The broader range of methods that you bring to bear on a problem, the better your chances of success.
This webpage was written by Steve Simon on 2007-11-11, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Adverse events in clinical trials
Two cautionary tales about data mining (January 6, 2005). Category: Data mining
I attended a 7am seminar this morning on data warehousing and data mining, which was quite good. It reminded me of a couple of stories I heard about the pitfalls of data mining.
The first was told to me by a doctor here at Children's Mercy, Jay Portnoy. He was describing a data mining model that was fed images of both cars and trucks (a training set, in the parlance of data mining) to see if it could develop a rule for identifying whether a future image was either a car or a truck based just on mathematical properties of that image. It did a pretty good job of finding factors in the training set that distinguished between cars and trucks. But it failed miserably on the first new image it was trying to classify. It was an image of a car on a snow covered highway. The data mining algorithm said that this was almost certainly a truck. What the researchers then realized is that in the training set, anytime there was snow in the background, it was a truck that was being shown and never a car. I suppose it is the tendency of marketing to always show trucks in rugged, primitive, and/or dangerous driving conditions. So the data mining model seized on a key relationship (color of the background) that existed only accidentally in the training set, rather than focusing on those aspects, such as the shape and size of the vehicle, that most of us would use to distinguish cars from trucks.
The second story was one I heard in a training class by Richard DeVeaux on data mining models that dealt with the question "so what?". He mentioned one of the earliest findings in a data mining model world (though he is uncertain if this is a true story or an urban legend) was that there was an unusual association seen in sales patterns at convenience stores. It seemed that people who came in to buy beer almost always ended up buying diapers at the same visit. This is the classic sort of thing that data mining models are supposed to find: unusual and unexpected associations in a very large data set. So he posed this question to a group of managers: what would you do with this information? A common response was: stock the shelves so that the beer and the diapers are close together to make the trip for the customer faster and more convenient. Another common response was: put the beer and the diapers at opposite ends of the store so that customers would have to spend more time in the store, increasing the chances for impulse purchases. Another common response was a shrug of the shoulders. In fact, we often don't know what to make of the associations found by data mining models.
This webpage was written by Steve Simon on 2005-01-06, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Data mining
A new and simple approach for monitoring safety data (November 18, 2007)
Many hospital administrators collect safety data, and for the most part this data is not analyzed well. The people who collect the data are well-meaning, but the simplistic tables and graphs that they use are typically unable to reveal important trends and patterns in the data. Much of the safety data represents a description of events (usually bad events) that occur. The question that always seems to be on their minds is: is there a sudden surge of events that we need to take action on?
The groups that monitor research (Research Ethics Boards or Institutional Review Boards) also examine safety data. The first thing they are looking for is an unexpected adverse event that might require a more detailed informed consent form. These review boards are also concerned with unduly high rates of an adverse event that might tip the risk-benefit ratio the wrong way and require that the research study be modified or shut down. Again, much of the review is well-meaning, but is too simplistic to provide an accurate picture of what is going on.
It was in recognition of the special difficulties that these two groups have with monitoring safety data that I started researching some adaptations of the control chart. The work I've done so far is in several areas, including analysis of date gaps rather than rates, adjustments for patient load that provide solutions analogous to the number needed to harm calculation, and Bayesian prior distributions and their application to safety data.
Date gaps rather than rates
Consider a series of $n$ events that occur at times $T_1, T_2, \ldots, T_n$. The date gaps $G_2, G_3, \ldots, G_n$ are defined as
$G_i = T_i - T_{i-1}$.
You can optionally define an initial time $T_0$ that represents the time that observation started and an initial date gap,
$G_1 = T_1 - T_0$.
Monitoring the date gaps will allow you to monitor important trends. If the events are occurring more frequently than expected, the average time between events will be smaller than expected. If the events are occurring less frequently than expected, then the average time between events will be larger than expected.
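As a rough illustration of the definition above, here is a short Python sketch that turns a list of event dates into date gaps measured in days. The function name `date_gaps` and its optional `start` argument (playing the role of the initial time) are my own hypothetical choices. The three dates are the first three enrollment dates from the example that follows, with observation assumed to start on January 1, 1997.

```python
from datetime import date

def date_gaps(event_dates, start=None):
    """Return the gaps, in days, between consecutive event dates.

    If start is given, it is treated as the initial time, so the first
    gap runs from the start of observation to the first event.
    """
    times = sorted(event_dates)
    if start is not None:
        times = [start] + times
    # Pair each time with its successor and take the difference in days.
    return [(later - earlier).days for earlier, later in zip(times, times[1:])]

events = [date(1997, 2, 26), date(1997, 4, 4), date(1997, 7, 7)]

print(date_gaps(events))                           # gaps between events only
print(date_gaps(events, start=date(1997, 1, 1)))   # includes the initial gap
```

Without a start date you get one fewer gap than you have events. Supplying the start date adds the initial gap, which matters early in a study when only a handful of events have been observed.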
Consider a hypothetical research study that started in January 1997 with the intention to recruit 12 patients per year (one per month) over a ten year period, for a total sample size of 120 patients. By the end of June 2004 (roughly 7 1/2 years), the study had enrolled 42 patients (Table 1).
2/26/1997 4/ 4/1997 7/ 7/1997
7/25/1997 2/ 5/1998 2/15/1998
3/ 6/1998 7/ 3/1998 8/ 3/1998
2/ 8/1999 3/19/1999 4/20/1999
5/29/1999 6/21/1999 7/27/1999
9/ 6/1999 1/10/2000 1/11/2000
2/28/2000 3/ 3/2000 4/13/2000
5/30/2000 11/21/2000 12/18/2000
2/ 6/2001 4/30/2001 8/ 3/2001
1/20/2001 12/ 3/2001 12/ 7/2001
9/27/2002 10/ 1/2002 2/ 2/2003
3/ 3/2003 10/31/2003 11/ 4/2003
11/11/2003 1/ 5/2004 2/ 2/2004
4/15/2004 5/23/2004 6/ 2/2004
Note: this table uses the American format for dates (mm/dd/yyyy) rather than the European format (dd/mm/yyyy).
Clearly this clinical trial has problems. The actual accrual rate is a meager 5.6 patients per year, and now it is probably too late to fix things. In order to finish on time, the researchers would have to recruit at a rate of more than 30 patients per year over the remainder of the study. This is more than 5 times faster than the current accrual rate and about 2.5 times faster than the originally planned accrual rate.
Wouldn't it be nicer if the researcher had noticed the problem two years into the study rather than 7 1/2 years out? The researcher would still have to hustle, but 14 patients per year would allow the study to still finish on time and it represents only a modest increase over the planned rate.
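The catch-up arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the figure of 9 patients enrolled after two years is a count taken from Table 1 rather than a number stated in the text.

```python
def required_rate(target, enrolled, years_left):
    """Patients per year needed to reach the target in the time remaining."""
    return (target - enrolled) / years_left

# Problem noticed at 7.5 years: 42 of 120 patients enrolled, 2.5 years left.
late = required_rate(120, 42, 2.5)

# Problem noticed at 2 years: 9 patients enrolled (counted from Table 1),
# 8 years left.
early = required_rate(120, 9, 8.0)

print(f"Noticed at 7.5 years: {late:.1f} patients per year")
print(f"Noticed at 2 years:   {early:.1f} patients per year")
```

The first figure is the "more than 30" rate and the second rounds to the 14 patients per year mentioned above.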
An important aside: I am using the example of accrual in a clinical trial for two reasons. First, it is easy to explain. There are some minor complexities with tracking adverse events that make it more difficult to discuss. Second, I have done a lot of the preliminary work in this area with the understanding that it can be easily applied to other areas. From the perspective of pharmacovigilance, imagine that the dates are not the dates that patients entered a clinical trial, but rather the dates that a medical device failed or the dates that a patient is hospitalized because of an adverse drug reaction associated with the drug you are studying.
The traditional approach to examining rates is to set a time interval (weeks, months, or years, for example) and count the number of events per that time interval. For example, you could compute the monthly rates
Jan97 0
Feb97 1
Mar97 0
Apr97 1
May97 0
Jun97 0
Jul97 2
etc.
The plot of monthly rates looks like this:
Or the yearly rates
1997 4
1998 5
1999 7
2000 8
etc.
which looks like this:
Or something in between like the quarterly rates
97Q1 1
97Q2 1
97Q3 2
97Q4 0
98Q1 3
etc.
which looks like this:
A narrow time interval allows you to respond very rapidly, but the individual values (mostly zeros and ones) are so granular that the information value of this approach may be limited. The yearly approach has more information for any single time interval, but you have to wait a full year or more to spot any important changes. A quarterly interval offers the best (worst?) of both worlds.
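If you want to reproduce the monthly, quarterly, and yearly counts, here is a minimal Python sketch. It uses only the first nine dates from Table 1 (1997 and 1998), and the variable names are just for illustration.

```python
from collections import Counter
from datetime import date

# First nine enrollment dates from Table 1 (1997 and 1998 only).
dates = [
    date(1997, 2, 26), date(1997, 4, 4), date(1997, 7, 7),
    date(1997, 7, 25), date(1998, 2, 5), date(1998, 2, 15),
    date(1998, 3, 6), date(1998, 7, 3), date(1998, 8, 3),
]

# Count events per calendar interval. A Counter returns 0 for intervals
# with no events, which is exactly what a rate table needs.
monthly = Counter((d.year, d.month) for d in dates)
quarterly = Counter((d.year, (d.month - 1) // 3 + 1) for d in dates)
yearly = Counter(d.year for d in dates)

print("1997:", yearly[1997], " 1998:", yearly[1998])
print("97Q3:", quarterly[(1997, 3)], " 97Q4:", quarterly[(1997, 4)])
print("Jul97:", monthly[(1997, 7)], " Jan97:", monthly[(1997, 1)])
```

The same dates produce very different looking series depending on the interval, which is the trade-off described above.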
Here is how you would compute the date gaps for this data set:
56 = ( 2/26/1997) - ( 1/ 1/1997)
37 = ( 4/ 4/1997) - ( 2/26/1997)
94 = ( 7/ 7/1997) - ( 4/ 4/1997)
etc.
The date gaps offer three advantages over monthly, quarterly, or yearly rates. First, the date gaps are self-scaling. Here's a plot of the date gaps:
I deliberately used a mixture of units on this graph to emphasize an important point. One of the big advantages of using the date gap is that the graphs are self-scaling. If you are examining events that occur frequently, your date gaps will be in the lower portion of the graph, where the units are expressed in days or weeks. If you are examining events that occur rarely, your date gaps will be in the upper portion of the graph, where the units are expressed in months, quarters, or even years.
Another advantage of the date gap is that it liberates you from arbitrary calendar boundaries. Suppose that this chart were monitoring some type of adverse event that was occurring infrequently (every other week or so), and suddenly you noticed three adverse events on three consecutive days (December 2, 3, and 4). Do you tell yourself, "Hmmm, that's interesting. We'll have to see what the monthly rate will be come December 31"? With a date gap model, every time an event occurs, another date gap is added to the chart. You don't have to wait until the end of the month, end of the quarter, or (heaven forbid!) the end of the year before you draw your conclusion. The date gap allows you to respond rapidly to a sudden surge of events.
A third advantage of the date gap is that the terms in the series of date gaps form a telescoping sum. If you computed the average date gap, for example, it would be
(G1 + G2 + ... + Gn) / n = [(T1 - T0) + (T2 - T1) + ... + (Tn - Tn-1)] / n
which simplifies to
(Tn - T0) / n.
When you divide the number of events by the total elapsed time, you get the average rate. So what this formula is telling you is that the average date gap is the inverse of the average rate. Take 42 patients and divide by 7.5 years and you get 5.6 patients per year. The average date gap is 65 days or 0.18 years. If you compute 1 / 0.18, you get 5.6.
This is hardly surprising if you think about it. If you are seeing one event every fifteen days on average (half a month between events), that represents a rate of 2 per month.
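Here is a short Python sketch of the date gap calculation and the telescoping sum, using January 1, 1997 as T0 and the first three enrollment dates from Table 1. The variable names are illustrative only.

```python
from datetime import date

start = date(1997, 1, 1)                                           # T0
events = [date(1997, 2, 26), date(1997, 4, 4), date(1997, 7, 7)]   # T1..T3

# Each date gap is the difference between successive times: Gi = Ti - Ti-1.
times = [start] + events
gaps = [(b - a).days for a, b in zip(times, times[1:])]

# The average gap telescopes to (Tn - T0) / n.
mean_gap = sum(gaps) / len(gaps)
telescoped = (events[-1] - start).days / len(events)

# The average rate is the inverse of the average gap.
rate_per_year = 365 / mean_gap

print(gaps)
print(mean_gap, telescoped)
print(round(rate_per_year, 1), "events per year")
```

The gaps come out as 56, 37, and 94 days, matching the calculation shown earlier, and the two ways of computing the average agree.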
Adjustments for patient load and the number needed to harm calculations
I want to propose some adjustments to the date gap calculation. Let's pretend that we are in a bizarre Einsteinian universe where time is not always constant. This is not too hard to imagine: some days seem to go very slowly and others fly by. There's a joke that is widely circulated about this concept.
If I had only one hour to live, I would spend it in a Statistics class. It would just seem to last so much longer.
Suppose the march of time is represented by a monotone nondecreasing function F( ). It has to be nondecreasing because you don't want to allow for the possibility of travel backwards in time. When the slope of F( ) is large, time marches slowly. When the slope of F( ) is small, time whizzes by quickly.
Think of the curve as a hill that you are climbing. When the hill is steep you need a lot of time to move just a little bit, but when the hill is flat, you can cover long distances quickly.
Define an adjusted date gap Ai by the formula
Ai = F(Ti) - F(Ti-1)
Here's a simple example. Choose a function F that has slope 1 for five days, is flat for two days, then repeats itself.
If you use this function to compute an adjusted gap, it treats some gaps the same way: there are two days between Tuesday and Thursday, for example. But when two time points straddle a weekend, the Saturday and Sunday are ignored. So the adjusted gap between an event on Friday and an event on Monday is only 1, not 3. This adjustment counts the number of working days between two events.
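A minimal Python sketch of this working-day adjustment follows. The function name and the particular dates are my own choices for illustration (November 16, 2007 was a Friday); the only thing taken from the text is the rule that F rises by one on each weekday and stays flat on the weekend.

```python
from datetime import date, timedelta

def working_day_gap(earlier, later):
    """Adjusted gap F(later) - F(earlier), where F rises by 1 on each
    weekday and stays flat on Saturday and Sunday."""
    gap = 0
    day = earlier
    while day < later:
        if day.weekday() < 5:      # Monday=0 ... Friday=4
            gap += 1
        day += timedelta(days=1)
    return gap

friday, monday = date(2007, 11, 16), date(2007, 11, 19)
tuesday, thursday = date(2007, 11, 13), date(2007, 11, 15)

print(working_day_gap(friday, monday))      # weekend is skipped
print(working_day_gap(tuesday, thursday))   # same as the calendar gap
```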
Now in most medical situations, it makes little sense to ignore the weekends because people don't stop taking medications during the weekend. A more realistic use of adjustments involves tracking the cumulative number of patients seen. In the example shown above, the graph of the cumulative number of patients would be
These patients are undergoing peritoneal dialysis. Some of them experienced complications during the placement of their catheters. The patients who experienced problems were recruited on days 93, 579, 1675, and 2588. They represented the 2nd, 9th, 27th, and 39th patients.
When you compute the adjusted date gaps, you are effectively looking at distances in the vertical dimension rather than the horizontal dimension. These adjusted gaps (2, 7, 18, and 12), represent the number of patients that you have to wait between complications rather than the number of days that you have to wait between complications.
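The average adjusted gap also simplifies because of a telescoping sum
(A1 + A2 + ... + An) / n = [(F(T1) - F(T0)) + (F(T2) - F(T1)) + ... + (F(Tn) - F(Tn-1))] / n
which simplifies to
(F(Tn) - F(T0)) / n.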
In the example, the average adjusted gap is (2+7+18+12) / 4 = 39 / 4 = 9.75. The denominator, 4, represents the number of patients who experience problems and the numerator, 39, represents the number of patients seen up to and including the fourth problem.
The fraction 4 / 39 represents the estimated probability that a patient will experience catheter related problems. The inverse of that probability, 39 / 4, is known as the number needed to harm (NNH). This number tells you that you would have to insert about 10 catheters in order to find one patient that has trouble with the catheter.
Each time a new patient experiences an adverse event, you get an additional adjusted gap which helps you refine the estimate of the NNH. The individual adjusted gaps can even be thought of as individual point estimates of NNH and they allow you to look for trends and patterns.
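The catheter example can be checked with a few lines of Python. This is just a sketch; the only inputs are the ordinal positions (2nd, 9th, 27th, and 39th) quoted above.

```python
# Ordinal positions of the patients who had catheter complications.
positions = [2, 9, 27, 39]

# Adjusted gaps: number of patients seen between successive complications.
previous = [0] + positions[:-1]
adjusted_gaps = [p - q for p, q in zip(positions, previous)]

# The average adjusted gap is the NNH; its inverse is the estimated risk.
nnh = sum(adjusted_gaps) / len(adjusted_gaps)
risk = 1 / nnh

print(adjusted_gaps)
print(nnh, round(risk, 3))
```

Because of the telescoping sum, the average is the same as dividing the position of the last complication (39) by the number of complications (4).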
There are other adjustments that also make sense and lead to an NNH calculation. If a patient can experience multiple adverse events (infections or re-hospitalizations, for example), you might want to calculate the cumulative number of patient days at risk. The adjusted chart then measures the number of patient days between events.
Another possibility is to track the cumulative number of medications dispensed by a hospital pharmacy. Then the adjusted chart would measure the number of pills between events.
Finally, the holy grail of medical research is developing statistical measures of acuity. It seems like the doctors who do the best jobs get referrals for the toughest and most intractable patients. So a naive comparison will end up making the best doctors look like the worst performers. It is unclear what form these acuity adjustments will take, but when they become available, a cumulative acuity score will allow you to look at a risk adjusted time between events.
What is a reasonable value for NNH?
The NNH has tremendous value for safety data because it places the data in a context where it is easy for medical professionals to make informed decisions about the relative risks and benefits of a new drug or device.
Here's a simple example that I calculated from a research paper. A flu vaccine has an efficacy of 17%. It prevents the flu in about one out of every six people vaccinated. This tells you that the number needed to treat (NNT) is 6. The vaccine does not come without side effects, however. One of the side effects is fever. About 1.1 % of all patients vaccinated develop a short term fever. This tells you that the NNH is 90.
To see if the benefits are worth the risks, it is useful to examine the ratio of NNH to NNT. This ratio, 90 / 6 = 15, tells you that the vaccine prevents 15 cases of flu for every additional short term fever that has to be endured. I'm not a medical expert, but this seems like a very good tradeoff. The short term fever seems relatively mild compared to the problems caused by a bout of the flu. In fact, I'd be tempted to say that a ratio of 1 to 1 or even higher might still make the vaccine a worthwhile endeavor.
So, to set an acceptable NNH target, ask yourself how serious the side effect is relative to how beneficial a cure would be. Then set a target for NNH that makes its ratio to NNT comparable to the relative severity. Suppose, for example, that we found a drug that cured the common cold. In one out of every four patients (an NNT of 4), the sniffling, sneezing, and coughing just disappeared. But let's suppose that the drug produced a rare but serious side effect, formation of kidney stones. Kidney stones are a very serious matter. If you created as many kidney stone cases as you saved in sniffling, sneezing, and coughing, that would be an unacceptable trade-off. So how much worse are kidney stones: 10 times worse, 50 times worse, 100 times worse? If you believed that kidney stones were 50 times worse--that you would be willing to endure 50 cases of sniffles, sneezing, and coughing rather than a single extra case of kidney stones--then you need to make sure that the NNH is larger than 50*4 = 200.
Now there are complex issues involving public perception, regulator scrutiny, etc. that may dominate your concerns and force you to adopt a different standard. But setting the NNH so that it creates an acceptable ratio to NNT offers a credible medical way of determining what safety level is appropriate.
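The two calculations in this section (the flu vaccine ratio and the kidney stone target) can be written as a pair of one-line Python functions. The function names are hypothetical, and the severity ratio of 50 is the judgment call from the example above, not something the code can determine for you.

```python
def benefit_harm_ratio(nnt, nnh):
    """Number of patients helped for every patient harmed."""
    return nnh / nnt

def minimum_nnh(nnt, severity_ratio):
    """Smallest NNH that keeps the benefit-to-harm ratio at or above
    the stated severity ratio."""
    return severity_ratio * nnt

# Flu vaccine: NNT of 6 for flu prevented, NNH of 90 for short term fever.
print(benefit_harm_ratio(6, 90))

# Cold cure: NNT of 4, kidney stones judged 50 times worse than a cold.
print(minimum_nnh(4, 50))
```

An NNH below the second number would mean more than one extra kidney stone for every 50 colds cured.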
Monitoring targets with a CUSUM chart
The date gaps also provide an interesting pattern when you plot them in a CUSUM plot. The CUSUM plot examines the cumulative deviation from a target. In the example of the clinical trial, the original goal was to recruit 12 patients per year or one every 30 days. So the cumulative sums are
S1 = (30 - 56) = -26
which tells you that the first patient was recruited 26 days behind schedule. The second cumulative sum is
S2 = (30 - 56) + (30 - 37) = -33
Since the second patient took seven days longer than your target, you have fallen 7 more days behind for a total deficit of 33 days. With the third cumulative sum,
S3 = (30 - 56) + (30 - 37) + (30 - 94) = -97
you have learned that you are now more than three months behind schedule. Here's a plot of all the cumulative sums.
You can see that the pattern is consistent--with every patient recruited, you are falling further and further behind. Once in a while you make a tiny bit of progress upward, but the downward trend tells you that this study is already 4 years behind schedule.
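The cumulative sums are easy to compute yourself. Here is a minimal Python sketch using the 30 day target and the first three date gaps (56, 37, and 94 days) from the accrual example.

```python
from itertools import accumulate

target = 30              # planned days between patients
gaps = [56, 37, 94]      # first three observed date gaps, in days

# Running total of (target - observed gap). A negative value is the
# number of days you have fallen behind schedule.
cusum = list(accumulate(target - g for g in gaps))

print(cusum)
```

This reproduces S1, S2, and S3. Feeding in the full list of date gaps would give the whole CUSUM path.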
The rules for identifying a signal in a CUSUM chart are somewhat complex. You set a vertical distance h and a horizontal distance d that define a V-mask.
(Source: www.itl.nist.gov/div898/handbook/pmc/section3/pmc323.htm)
The choices for h and d are not defined well. An alternative choice is to set a Bayesian prior distribution, compute the posterior distribution for each cumulative sum and then examine the 2.5 percentile and 97.5 percentile of this distribution. If the path of future cumulative sums stays inside the 2.5 and 97.5 percentiles then the process is in control. If the path drops below the 2.5 percentile, then events are occurring more frequently than the previous trend might suggest. If the path rises above the 97.5 percentile, then events are occurring less frequently than the previous trend might suggest.
Here's an example

This chart represents the cumulative patient years between exit site infections in a cohort of patients undergoing peritoneal dialysis. Let's suppose that a change in treatment options was made after the 20th event. You want to examine the trend of the following events to see if the change led to a substantial slowing of these bad events. Although the original trend appears to persist for the next seven or eight events, the graph then takes a sharp upward swing. This increase in the amount of patient years between exit site infections shows that the change eventually led to a lower rate of exit site infections.
I'm not an expert on Bayesian methods, so most of the credit for this approach belongs to a colleague of mine, Byron Gajewski. These ideas are still in the early stage of development which may lead to some vagueness in my writing. My relative inexperience in Bayesian methods may also contribute to some of the vagueness. Please bear with me, though, because the Bayesian approach appears to be a very attractive one for safety data.
A common objection to the use of Bayesian prior distributions is that the researcher should not go into the research with preconceived notions on how the data should behave. That's a debate which I don't want to tackle today, but it is worth noting that there are some notable exceptions to the rule about preconceived notions.
First, the Bayesian approach always allows you to specify a vague prior. The vague prior can either be your acknowledgement that you don't really have a lot of information about how this experiment will come out or it can represent your effort not to incorporate any preconceived notions into the data analysis.
Second, the example that I just described involves accrual of patients into a clinical trial. No researcher would start a project unless they had at least an inkling of how many patients were out there who might qualify for the research and how many of those might volunteer for the study.
This perspective is probably accurate for pharmacovigilance studies as well. These studies are not done in a vacuum because you have already accumulated some information about adverse events during the process of getting your drug approved. It would be naive to ignore this information. In fact, the careful and judicious use of Bayesian priors might represent a formal way to combine safety information across Phase III and Phase IV trials.
Third, a process of careful Bayesian analysis ought to include the specification of not a single prior distribution, but several. It might be wise to adopt both an optimistic and a pessimistic prior distribution for an efficacy study, for example. If the Bayesian analysis midway through the trial shows that even a pessimistic prior leads to a declaration of efficacy, you have a strong case for stopping the trial for early evidence of efficacy. After all, the data is convincing enough that even a pessimist has to admit that the results are promising. If the Bayesian analysis midway through the trial shows that even an optimistic prior leads to declaration of no effect, you have a strong case for stopping the trial early for futility. After all, if the data is so disappointing that even an optimist's hopes are dashed, why go any further?
Conclusion
When you are monitoring safety for a newly marketed drug or device, the control chart represents a simple approach that is easy to apply and easy to understand. It is especially useful if the safety event is well defined. You can improve the sensitivity of the control chart by computing the date gap. Adjusting the date gap for the number of patients seen or the number of medications dispensed provides a way for you to continually monitor the number needed to harm. The CUSUM chart and Bayesian prior distributions allow you to improve the sensitivity to small but consistent changes in the signal.
This webpage was written by Steve Simon on 2007-11-18, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Adverse events in clinical trials
Monitoring adverse events during peritoneal dialysis (November 15, 2007).
One of the doctors I was working with had an interesting data set examining adverse events in patients with peritoneal dialysis. These patients start treatment with peritoneal dialysis on a specific day and are followed until they stop this treatment. Reasons for stopping peritoneal dialysis might be that the patient got better and no longer needed any treatment, the patient got worse and needed to switch to hemodialysis, or the patient died. Patients who moved out of town presumably continued their dialysis, but they were lost to follow-up in this particular study. There were two adverse events examined: exit site infections, and peritonitis. Although I ran several complex analyses on this data set, I thought it might be useful to look at a simpler approach to monitoring the frequency of adverse events using control charts.
Here's the data on when patients began treatment and when they ended their treatment.
id t0 t1
0 58 680
1 95 1416
2 189 1247
3 207 532
4 402 1136
5 412 501
6 431 1851
7 550 1414
8 581 1339
9 770 2325
10 809 1498
11 841 1339
12 880 1563
13 903 2664
14 939 1451
15 980 2920
16 1106 2103
17 1107 1291
18 1155 1792
19 1159 1654
20 1200 1968
21 1247 1574
22 1422 2247
23 1449 1544
24 1499 2310
25 1582 1755
26 1677 2142
27 1786 2920
28 1799 2108
29 1803 2729
30 2097 2723
31 2101 2639
32 2225 2419
33 2254 2920
34 2496 2920
35 2500 2920
36 2507 2920
37 2562 2730
38 2590 2870
39 2663 2920
40 2701 2838
41 2711 2920
The first column is the patient id, the second column is when the patient started dialysis (number of days since the start of the review period), and the third column is when the patient ended dialysis (again in number of days).
There were 42 patients in this study. The average patient stayed in the study for 640 days (1.8 years). We will divide the number of days by 365 in most graphical presentations of the data to show time in years rather than days.
This graph shows the number of patients at risk of an adverse event at any time point in the review period.
The vertical line segments at the bottom of the graph represent the times that patients began dialysis and the segments at the top of the graph represent the times that patients ended dialysis. Each time a patient begins dialysis, the graph jumps up one unit. Every time a patient ends dialysis, the graph drops one unit. At the busiest time, there were 17 patients on dialysis. The graph falls to zero at 8 years, not because all patients were removed from dialysis, but because that represented the end of the observation period.
Notice that this graph generally climbs for the first three years, drops off a bit, then is roughly level for years 4 through 8. This is not surprising because only patients who began dialysis during the eight year observation window were included in the study.
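The height of this graph at any day is just a count of the patients whose dialysis interval covers that day. Here is a minimal Python sketch. To keep it short it uses only the first six rows of the table (ids 0 through 5), so it matches the full data only for days before 431, when patient 6 starts. The function name is hypothetical.

```python
# (t0, t1) for patients 0 through 5; the remaining rows are omitted here.
intervals = [(58, 680), (95, 1416), (189, 1247),
             (207, 532), (402, 1136), (412, 501)]

def number_at_risk(t, intervals):
    """Patients on dialysis at day t: started on or before t and
    not yet ended."""
    return sum(1 for t0, t1 in intervals if t0 <= t < t1)

for day in (100, 300, 420):
    print(day, number_at_risk(day, intervals))
```

Evaluating the same function on all 42 rows at every day of the review period would trace out the full step graph.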
Here are the times of exit site infections:
id tx
0 643
0 657
1 143
1 203
1 756
1 900
1 1122
1 1331
2 596
3 231
4 905
4 953
4 986
6 680
6 1273
6 1410
6 1485
6 1584
7 967
9 1292
12 1442
13 1331
13 1351
13 2528
13 2658
14 949
14 1261
14 1447
16 1250
21 1267
27 1947
33 2520
33 2767
35 2769
Notice that some patients experience more than one exit site infection and some patients experience no infections. You might already notice something interesting with the data. About half of the events have three digit days (days in the hundreds). Since the full time range is from zero to a bit less than 3 thousand, you would expect only about a third of these events to have three digit days. This is possible evidence that these events occurred more frequently early in the review period rather than later. A careful analysis, of course, would have to control for the number of patients at risk during these time intervals.
Here is the data on times of peritonitis.
id tx
0 218
1 652
1 1265
1 1328
3 237
7 641
7 1004
7 1080
9 978
9 1036
9 1236
9 1974
9 2116
10 904
10 1305
13 1815
13 1983
13 2082
14 949
15 2859
20 1959
24 1977
24 2089
25 1620
25 1641
27 1803
27 2354
27 2520
28 1809
28 2054
33 2351
41 2740
41 2755
The graph below shows the cumulative number of patient years over the eight year study (you can get this by integrating the figure shown above). This graph also notes the occurrence of exit site infections using a red plus sign.
Here's a second graph with occurrence of peritonitis marked with a red plus sign.
The horizontal distance between successive plus signs represents the waiting time in years between successive events. The vertical distance between successive plus signs represents the waiting time in patient-years. It's a subtle but important difference. Calculating the number of patient-years between successive events adjusts for the fact that during the first two or three years of observation, there were fewer patient-years of data than in later years.
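Here is a rough Python sketch of the vertical distances, working in patient-days (divide by 365 for patient-years). It uses only the first six rows of the dialysis table, which is enough for the first three exit site infections (days 143, 203, and 231) because no other patient starts before day 431. The function name is hypothetical.

```python
# (t0, t1) for patients 0 through 5; the remaining rows are omitted here.
intervals = [(58, 680), (95, 1416), (189, 1247),
             (207, 532), (402, 1136), (412, 501)]

def patient_days(t, intervals):
    """Cumulative patient-days of exposure accumulated by day t."""
    return sum(max(0, min(t, t1) - t0) for t0, t1 in intervals)

# First three exit site infections in calendar order.
event_days = [143, 203, 231]
marks = [0] + event_days

calendar_gaps = [b - a for a, b in zip(marks, marks[1:])]
exposure_gaps = [patient_days(b, intervals) - patient_days(a, intervals)
                 for a, b in zip(marks, marks[1:])]

print(calendar_gaps)   # horizontal distances, in days
print(exposure_gaps)   # vertical distances, in patient-days
```

The first calendar gap is the longest of the three, but once you account for how few patients were on dialysis in those early days, the three gaps look much more alike.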
If you plot the patient-years between successive exit site infections, you will see the following control chart.
The gray region represents the first four years of the study. After the fourth year, an intervention was implemented to reduce adverse events. You can see that it had a dramatic impact at first, with the hospital twice waiting over 10 patient years between events shortly after the intervention started. More recently, it looks as if the waiting time has slipped back at least partway to the norm.
Here's the plot for peritonitis events.
Notice that there is no evidence that peritonitis is becoming a rarer event after the intervention.
This webpage was written by Steve Simon on 2007-11-15, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Adverse events in clinical trials
Tracking central line infections over time (November 18, 2007)
I'm working with a group that is tracking central line infections over time. There were 22 infections over the previous year, and the infants were divided into five risk groups. For this example, I am ignoring the risk groups.
ev# gp day | ev# gp day
1 4 24 | 12 2 113
2 4 43 | 13 1 137
3 4 43 | 14 3 153
4 4 46 | 15 5 165
5 1 47 | 16 1 185
6 2 55 | 17 1 195
7 5 55 | 18 1 228
8 4 71 | 19 1 342
9 5 90 | 20 2 342
10 5 91 | 21 4 343
11 4 102 | 22 1 363
There are a varying number of patients with central lines being cared for at any given time. The number of central line days in each month is
month all gp1 gp2 gp3 gp4 gp5
1 593 70 0 67 188 268
2 624 66 48 53 222 235
3 704 44 69 75 231 285
4 578 0 80 32 115 351
5 582 38 62 61 140 281
6 441 104 36 51 82 168
7 384 64 28 38 72 182
8 521 47 156 24 103 191
9 459 35 50 23 122 229
10 562 23 51 108 93 287
11 531 46 70 67 59 289
12 1581 178 193 229 204 777
Here is a plot showing central line infections and the number of central line days in each month.
Notice that each month has a few events, except for September, October, and November. Also notice that the number of central line days in December is almost three times as high as in a typical month. It turns out, after later review of this data, that the surge in December was just a bookkeeping error. I am keeping it in this teaching example because it illustrates the importance of considering adjustments for sudden changes in work volume.
A control chart tracking the frequency of these adverse events would look like
Notice that we created a pseudo event at the end of the year to track the amount of time from the last event to the end of the calendar. This pseudo event is marked with an X.
The average waiting time between events is 0.5 months and the 19th event is unusual in that we had to wait almost 4 months between that event and the previous event. Something unusual happened at the end of summer that caused a welcome drought in central line infections.
You should consider whether the trends change when you account for the unusually high workload in December, and here is a control chart that looks at the number of patient years between events.
Notice that on average, you have 1.1 events per patient year.
I am still working on some graphs that show that central line infections occur more frequently in the lower birthweight groups.
How can you construct a graph like this?
The graphs shown above require computing of date gaps or waiting times. There are some special considerations when two or more events occur on the same day. You can then compute the average date gap and plot the data on a log scale. When there are variations in the number of patients seen or the volume of work done, then you can adjust these values by prorating the workload among the date gaps.
Computing date gaps or waiting times. The first graph displays the date gaps (also called waiting times) between successive events. The first event occurs on January 24, so you waited 23 days from the beginning of the calendar year (January 1). The second event occurs on February 12. There are 7 days left in January, and when you add that to the 12 days in February, you get a date gap of 19 days.
Two or more events on the same day. There are actually two events on February 12. How do you handle two or more events on the same day? There are several approaches that work reasonably well. The one I like is to consider that an event that occurs on a given day occurs at a random time between 0 hours and 24 hours. We don't know what that time is, so for convenience, we set the time to 12 hours or noon. If there were two events on the same day, you could place both events at noon, but then you have a zero difference, which leads to some complications. Instead, place one of the events at 6 hours and the other at 18 hours. If three events occur on the same day, place the first event at 4 hours, the second at 12 hours and the third at 20 hours. If four events occur on the same day, place the first event at 3 hours, the second at 9 hours, the third at 15 hours and the fourth at 21 hours.
With two events on the same day, this approach effectively sets the waiting time between the two events at half a day. This seems intuitive enough--the events could be separated by no more than 24 hours and no less than 0 hours, so a good compromise is to split the difference. When you use this approach, the "extra" half day is effectively taken from the date gaps on either side. So the number of days between Jan 24 and the first event on Feb 12 is actually 18.75, not 19 and the number of days between the second event on Feb 12 and the event on Feb 15 is actually 2.75 rather than 3.
The graph shown above illustrates how the waiting times between events would be calculated if you made no adjustments for multiple events on the same day. The three waiting times between events occurring on Jan 24, Feb 12, Feb 12, and Feb 15 would be 19, 0, and 3.
This graph shows how you would adjust for two events on the same day. A value of 0.5 days is assigned to the waiting time between two events occurring on Feb 12. The waiting time between Jan 24 and the first event on Feb 12 is reduced from 19 days to 18.75 days. The waiting time between the second event on Feb 12 and the event on Feb 15 is reduced from 3 days to 2.75 days.
If you continue with the rest of the calculations, the date gaps are
23.00 18.75 0.50 2.75
1.00 7.75 0.50 15.75
19.00 1.00 11.00 11.00
24.00 16.00 12.00 20.00
10.00 33.00 113.75 0.50
0.75 20.00
There is a small amount of time left over at the end of the calendar year (3 days to be precise). Although it is not unreasonable to just ignore those 3 days, in some cases you can end up ignoring valuable information. So place a pseudo event on December 31. The last date gap, 3 days, represents a lower bound: we know that we will have waited at least 3 days from December 28 to the next event.
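The date gaps listed above, including the handling of same-day events, can be reproduced with the Python sketch below. Two details are assumptions on my part, chosen so that the output matches the numbers in the text: time is measured with a half day offset (so a single event on day 24 sits at 23.0), and the pseudo event is placed at 365.0 so that the final gap is 3 days. The function name is hypothetical.

```python
from collections import Counter

# Day of the year for each of the 22 central line infections.
days = [24, 43, 43, 46, 47, 55, 55, 71, 90, 91, 102,
        113, 137, 153, 165, 185, 195, 228, 342, 342, 343, 363]

def event_times(days):
    """Spread k same-day events evenly across the day: one event at noon,
    two at 6 and 18 hours, three at 4, 12, and 20 hours, and so on.
    The -0.5 offset is an assumption made to match the text."""
    counts = Counter(days)
    seen = Counter()
    times = []
    for d in sorted(days):
        k, j = counts[d], seen[d]
        times.append((d - 1) + (2 * j + 1) / (2 * k) - 0.5)
        seen[d] += 1
    return times

times = event_times(days)
marks = [0.0] + times + [365.0]     # 365.0 is the pseudo event
gaps = [b - a for a, b in zip(marks, marks[1:])]

print(gaps[:4])
print(gaps[-1], len(gaps), sum(gaps))
```

With the pseudo event included there are 23 gaps, and they add up to the full 365 day window.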
Computing the average date gap. The date gaps form a telescoping sum, and the total is simply the difference between the starting date of the time window and the ending date of the time window. In our case, this value is 365 days.
When you compute the average date gap, this represents length of the time window divided by the number of events. In our case, the numerator is 365 and the denominator is 23 (remember that we placed a pseudo event at the end of the calendar year), which produces an average date gap of 15.9 days. If you reversed the order of the division, placing 23 in the numerator and 365 in the denominator, you would get 0.063, an estimate of the daily rate of central line infections. Multiply by 365 to get 23, the estimated yearly rate.
There is a certain intuition to these calculations. Event rates and waiting times are inversely related. A high event rate implies a short waiting time between events. A low event rate implies a long waiting time between events.
Log transformation. The date gaps are typically skewed, so I use a log transformation on the data. I also reverse the scaling so that small date gaps appear at the top of the graph and large date gaps appear at the bottom. This orientation makes improvements in quality (bad events occur less frequently and with larger date gaps) appear as values near the bottom of the graph and declines in quality (bad events occur more frequently and with smaller date gaps) appear as values near the top of the graph.
Adjusting date gaps for the volume of work done. When there is substantial variation in the number of patients seen or the amount of work done, then you can adjust the date gaps using simple linear interpolation. You may be more familiar with this approach as "prorating" or dividing in a proportionate fashion.
The first event occurred on January 24. There were 593 patient days in that month, so the prorated proportion of time until January 24 is
593 * (23/31) = 439.97.
The second event occurred on February 12, so that gets the remainder of the January patient days plus a prorated proportion of the February patient days.
593 * (8/31) + 624 * (10.75/28) = 392.60.
The half day between the first event on Feb 12 and the second event on Feb 12 translates into
624 * (0.5/28) = 11.14.
The full list of adjusted date gaps is
439.97 392.60 11.14 61.29
22.29 172.71 11.14 355.66
431.48 22.71 211.93 211.93
454.52 296.31 176.40 287.06
123.87 475.06 2165.15 25.50
38.25 1020.00 153.00
Note that this is also a telescoping sum. When you add all the values together, you get 7,560 patient days, which is the total number of patient days across all twelve months. The average is 328.7 patient days or 0.9 patient years. This tells you that you accumulate a bit less than a full patient year between successive central line infections. The inverse value is 1.11. You estimate that there are 1.11 infections per patient year of exposure.
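The prorating can be sketched in Python as a linear interpolation of the cumulative central line days. This is only an illustration: the function name is hypothetical, a non-leap year is assumed (the text uses 28 days for February), and only the first four events are shown.

```python
from bisect import bisect_right
from itertools import accumulate

month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
line_days = [593, 624, 704, 578, 582, 441, 384, 521, 459, 562, 531, 1581]

day_edges = [0] + list(accumulate(month_lengths))   # 0, 31, 59, ..., 365
load_edges = [0] + list(accumulate(line_days))      # cumulative line days

def cumulative_load(t):
    """F(t): central line days accumulated by time t (in days), with each
    month's total spread evenly across that month."""
    m = min(bisect_right(day_edges, t) - 1, 11)     # month index 0..11
    fraction = (t - day_edges[m]) / month_lengths[m]
    return load_edges[m] + fraction * line_days[m]

# Start of the year followed by the times of the first four events.
event_times = [0.0, 23.0, 41.75, 42.25, 45.0]
adjusted = [cumulative_load(b) - cumulative_load(a)
            for a, b in zip(event_times, event_times[1:])]

print([round(a, 2) for a in adjusted])
print(cumulative_load(365.0))
```

Each adjusted gap is a difference F(Ti) - F(Ti-1), so the full set telescopes to F(365) - F(0), the total number of central line days for the year.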
Placing control limits on the chart. You can place control chart limits on this graph to determine when a sudden change in the infection rate has occurred. The average date gap (or adjusted date gap) represents the center line of the control chart. This is the easiest and also the most important reference line to compute. A classic rule for control charts is to declare a special cause whenever you see eight consecutive data points on the same side of the center line.
As a technical note, this rule was developed for symmetric distributions. The waiting time is usually skewed, and research needs to be done to identify whether the rule of eight consecutive points on the same side of the center line still applies or if a slightly different rule (e.g., nine consecutive points above the center line or six consecutive points below the center line) might produce better results.
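A short Python sketch of the run rule follows. The function name is illustrative, and the treatment of a point that falls exactly on the center line (it breaks the run) is an assumption, since conventions differ. The data are the adjusted date gaps listed earlier, with the center line at the average of 328.7.

```python
def run_signals(values, center, run_length=8):
    """Indices where a run of `run_length` points on one side of the center line is completed.

    A point exactly on the center line resets the run (an assumption).
    """
    signals = []
    previous_side = 0
    count = 0
    for i, value in enumerate(values):
        side = (value > center) - (value < center)   # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            count += 1
        else:
            count = 1 if side != 0 else 0
        previous_side = side
        if count >= run_length:
            signals.append(i)
    return signals

adjusted_gaps = [439.97, 392.60, 11.14, 61.29, 22.29, 172.71, 11.14, 355.66,
                 431.48, 22.71, 211.93, 211.93, 454.52, 296.31, 176.40, 287.06,
                 123.87, 475.06, 2165.15, 25.50, 38.25, 1020.00, 153.00]

print(run_signals(adjusted_gaps, center=328.7))                 # rule of eight
print(run_signals(adjusted_gaps, center=328.7, run_length=5))   # a shorter, more sensitive rule
```

Keeping the run length as a parameter makes it easy to try the alternative rules mentioned above, although asymmetric rules (a different length above and below the line) would need a small extension.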
The control limits can be computed in several different ways. Waiting times often follow an exponential distribution, and you can compute limits based on this distribution. Another approach is to use an individual value control chart (an XmR chart). The XmR chart requires the computation of a moving range, the absolute difference between each pair of consecutive data values. The first four date gaps are
23.00 18.75 0.50 2.75
so the first three moving ranges are
|23.00-18.75| = 4.25
|18.75- 0.50| = 18.25
| 0.50- 2.75| = 2.25
The entire list of moving ranges is
4.25 18.25 2.25 1.75
6.75 7.25 15.25 3.25
18.00 10.00 0.00 13.00
8.00 4.00 8.00 10.00
23.00 80.75 113.25 0.25
19.25 17.00
The average of these moving ranges is 17.4 days, and the average date gap (the center line) is 15.9 days. The formula for the control limits is
15.9 +/- 2.660 * 17.4.
The lower limit is negative and will be ignored (a negative lower limit on an XmR chart with skewed data is not uncommon). The upper control limit is 62 days, which corresponds to an infection rate of about 6 per year. This limit tells you that, for this process, any time you go more than two months without an infection, you need to investigate. The process may have suddenly improved, though you also need to be on the lookout for a tendency to underreport problems.
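For comparison, here is a rough sketch of limits based on the exponential distribution. The handout does not give a formula for these, so everything in this block is an assumption: I use probability limits whose tail areas (0.00135 on each side) match those of three sigma limits under a normal distribution, and the function name is illustrative. For an exponential distribution with mean m, the quantile at probability p is -m * log(1 - p).

```python
import math

def exponential_limits(mean_gap, tail_area=0.00135):
    """Probability-based control limits for exponentially distributed waiting times.

    tail_area = 0.00135 mimics the tail area beyond three sigma for a
    normal distribution; this choice is an assumption, not the handout's rule.
    """
    lower = -mean_gap * math.log(1.0 - tail_area)   # quantile at p = tail_area
    upper = -mean_gap * math.log(tail_area)         # quantile at p = 1 - tail_area
    return lower, upper

# 15.9 days is the average date gap used as the center line.
lower, upper = exponential_limits(15.9)
print(f"lower limit = {lower:.3f} days, upper limit = {upper:.1f} days")
```

Under these assumptions the upper limit is roughly 105 days, noticeably wider than the XmR limit of 62 days, so the two approaches can disagree about when a long gap deserves investigation.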
Note that the graph above used months rather than days between events. This is simply a linear transformation (divide everything by 30).
The adjusted date gaps are
459.1 401.3 0.0 66.9
22.3 178.3 0.0 361.7
431.5 19.3 211.9 211.9
454.0 292.2 176.4 284.7
123.9 479.5 2212.1 0.0
51.0 1020.0 102.0
and the average adjusted date gap is 328.7. The moving ranges for the adjusted date gaps are
57.8 401.3 66.9 44.6
156.0 178.3 361.7 69.8
412.2 192.7 0.0 242.1
161.8 115.8 108.3 160.9
355.6 1732.6 2212.1 51.0
969.0 918.0
and the average moving range is 407.7. The upper control limit is
328.7 + 2.660 * 407.7 = 1413.2.
Divide every value by 30 to get estimates in patient months rather than patient days.
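The XmR calculation for the adjusted date gaps can be checked with a short Python sketch. The function name is illustrative, and setting a negative lower limit to zero is my way of coding "ignore the lower limit". Because the listed gaps are rounded to one decimal, the results can differ from the figures above in the last digit.

```python
def xmr_limits(values, multiplier=2.660):
    """Center line, average moving range, and control limits for an XmR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    upper = center + multiplier * mr_bar
    lower = max(0.0, center - multiplier * mr_bar)   # a negative lower limit is ignored
    return center, mr_bar, lower, upper

adjusted_gaps = [459.1, 401.3, 0.0, 66.9, 22.3, 178.3, 0.0, 361.7,
                 431.5, 19.3, 211.9, 211.9, 454.0, 292.2, 176.4, 284.7,
                 123.9, 479.5, 2212.1, 0.0, 51.0, 1020.0, 102.0]

center, mr_bar, lower, upper = xmr_limits(adjusted_gaps)
beyond_upper = [g for g in adjusted_gaps if g > upper]

print(f"center = {center:.1f}, average moving range = {mr_bar:.1f}")
print(f"upper limit = {upper:.1f} patient days")
print("gaps beyond the upper limit:", beyond_upper)
```

Only the gap of 2212.1 patient days exceeds the upper limit, which is the kind of unusually long stretch without an infection that the chart is meant to flag.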
This webpage was written by Steve Simon on 2007-11-18, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Adverse events in clinical trials
Tracking adverse events during kidney biopsy (November 19, 2007).
This is a major revision of the March 14, 2007 and April 5, 2007 weblog entries. I have been helping a colleague who is interested in monitoring the safety of kidney biopsy events. He was kind enough to let me use his data set on my web pages in order to illustrate some new methods for monitoring adverse events. This data set allows you to see some examples of the use of control charts to track adverse events. Here is the raw data.
2003-01-12 ---- 2003-01-28 ---- 2003-02-01 ---- 2003-02-14 ----
2003-02-14 ---- 2003-02-15 H-NO 2003-03-09 ---- 2003-03-17 ----
2003-03-22 ---O 2003-03-25 ---- 2003-03-30 H--- 2003-03-31 ----
2003-04-05 ---- 2003-04-13 ---- 2003-04-15 --N- 2003-04-19 H-NO
2003-04-22 ---- 2003-04-27 ---- 2003-05-11 ---- 2003-05-12 ----
2003-05-13 ---- 2003-05-20 ---- 2003-05-24 ---- 2003-06-02 ----
2003-06-08 ---- 2003-06-10 ---- 2003-06-22 -I-- 2003-06-23 ----
2003-06-24 ---- 2003-07-04 ---- 2003-07-06 ---- 2003-07-15 ----
2003-07-22 ---- 2003-07-25 ---- 2003-07-26 ---- 2003-07-26 ----
2003-08-01 ---O 2003-08-24 ---- 2003-08-26 ---- 2003-08-30 ----
2003-09-26 ---O 2003-09-26 ---- 2003-09-27 ---- 2003-09-27 H-N-
2003-09-28 ---- 2003-10-10 -I-- 2003-10-10 ---- 2003-10-12 --N-
2003-10-19 -I-- 2003-10-24 ---- 2003-10-24 -I-- 2003-10-26 ----
2003-10-31 H--- 2003-11-02 ---- 2003-11-07 -I-- 2003-11-07 ----
2003-11-09 ---- 2003-11-09 ---- 2003-11-15 ---- 2003-11-17 --N-
2003-11-29 -I-- 2003-12-12 ---- 2003-12-20 ---- 2004-01-03 ----
2004-01-04 ---- 2004-01-23 ---- 2004-01-25 --N- 2004-02-08 ----
2004-02-10 ---- 2004-02-14 ---- 2004-02-15 ---- 2004-02-15 ----
2004-02-17 ---- 2004-02-20 ---- 2004-02-22 ---- 2004-03-02 ----
2004-03-19 -I-- 2004-03-22 ---- 2004-03-26 ---- 2004-03-27 H-NO
2004-03-28 ---- 2004-04-10 ---- 2004-04-18 -I-- 2004-04-25 ----
2004-04-30 ---- 2004-05-02 ---- 2004-05-11 -I-- 2004-05-22 ----
2004-05-23 ---- 2004-05-28 ---- 2004-06-08 ---- 2004-06-15 ----
2004-06-20 -I-- 2004-06-26 ---- 2004-07-05 ---- 2004-07-09 ----
2004-07-09 ---- 2004-07-11 -I-- 2004-07-13 ---- 2004-07-24 ----
2004-07-30 ---- 2004-08-01 ---- 2004-08-01 -I-- 2004-08-06 H---
2004-08-07 --N- 2004-08-10 ---- 2004-08-13 ---- 2004-09-05 ----
2004-09-12 ---- 2004-09-21 ---- 2004-10-08 ---- 2004-10-12 ----
2004-10-13 ---- 2004-10-22 ---- 2004-11-02 ---- 2004-11-07 H-NO
2004-11-14 ---- 2004-11-28 -I-- 2004-11-29 ---- 2004-12-07 ----
2004-12-10 -I-- 2004-12-12 ---- 2004-12-13 ---- 2004-12-26 ----
2004-12-26 ---- 2005-01-03 ---- 2005-01-03 ---- 2005-01-09 -I--
2005-01-13 H--- 2005-01-15 ---- 2005-01-17 ---- 2005-01-17 H---
2005-01-20 ---- 2005-01-25 H--- 2005-01-28 ---- 2005-02-08 ----
2005-02-11 --N- 2005-02-11 ---- 2005-02-14 ---- 2005-02-18 ----
2005-02-21 ---- 2005-03-01 ---- 2005-03-07 ---- 2005-03-07 ----
2005-03-18 ---- 2005-03-18 --N- 2005-03-19 H-NO 2005-03-21 ----
2005-03-25 ---- 2005-04-10 ---- 2005-04-11 ---- 2005-04-11 ----
2005-04-15 -I-- 2005-04-23 ---- 2005-04-25 HI-- 2005-04-26 ----
2005-04-26 ---- 2005-04-29 ---- 2005-05-07 ---- 2005-05-09 ----
2005-05-13 ---- 2005-05-23 ---- 2005-06-06 ---- 2005-06-06 --N-
2005-06-10 ---- 2005-06-13 ---- 2005-06-19 ---- 2005-06-20 ----
2005-06-26 ---- 2005-06-30 ---- 2005-07-08 ---- 2005-07-18 ----
2005-07-22 ---- 2005-07-31 H-NO 2005-08-15 ---- 2005-08-19 ----
2005-08-21 H-N- 2005-08-22 ---- 2005-08-28 ---- 2005-08-29 ----
2005-08-29 ---- 2005-09-12 ---- 2005-09-12 ---- 2005-09-16 ----
2005-09-19 H--O 2005-09-23 H--- 2005-09-24 H--- 2005-09-25 ----
2005-09-26 ---- 2005-09-30 ---- 2005-10-09 ---- 2005-10-16 ----
2005-10-21 H--O 2005-11-04 H--- 2005-11-07 ---- 2005-11-14 --N-
2005-11-15 ---- 2005-11-15 ---- 2005-11-26 ---- 2005-11-28 ----
2005-12-02 ---- 2005-12-12 ---- 2005-12-16 ---- 2005-12-18 ----
2006-01-01 ---- 2006-01-02 ---- 2006-01-06 --N- 2006-01-16 ----
2006-01-16 ---- 2006-01-17 ---- 2006-01-20 ---- 2006-01-22 ----
2006-02-05 ---- 2006-02-06 H--- 2006-02-13 --N- 2006-02-24 ----
2006-02-26 ---- 2006-03-12 ---- 2006-03-19 ---- 2006-03-20 ----
2006-03-22 ---- 2006-03-27 --N- 2006-04-03 ---- 2006-04-03 ----
2006-04-14 ---- 2006-04-17 ---- 2006-04-30 ---- 2006-05-01 ----
2006-05-07 ---- 2006-05-09 ---- 2006-05-12 ---- 2006-05-12 ----
2006-05-13 ---- 2006-05-14 ---- 2006-05-19 ---- 2006-05-21 --N-
2006-05-22 --N- 2006-05-26 ---- 2006-05-29 ----
The dates represent dates of the kidney biopsies for 239 consecutive biopsies. I have shifted these dates by an arbitrary constant to protect confidentiality. Those dates with an H represent biopsies where gross hematuria was noted (n=21). An I represents a biopsy where an inadequate amount of tissue was obtained (n=17). An N represents a biopsy where narcotics were required to control the pain (n=22). An O represents any other adverse event (perforation, hematoma, fistula, transfusion needed, prolonged hospitalization, re-admission, or graft loss, n=11). Some of these events (perforation, graft loss) never occurred in this particular data set.
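If you want to work with this listing yourself, the following Python sketch shows one way to read it. It uses only the first three rows of the listing as an excerpt, and the function and variable names are illustrative. Each record is a date followed by a four-character code in which a dash means no event.

```python
from collections import Counter
from datetime import date

# First three rows of the raw data listing (an excerpt, not the full data set).
EXCERPT = """
2003-01-12 ---- 2003-01-28 ---- 2003-02-01 ---- 2003-02-14 ----
2003-02-14 ---- 2003-02-15 H-NO 2003-03-09 ---- 2003-03-17 ----
2003-03-22 ---O 2003-03-25 ---- 2003-03-30 H--- 2003-03-31 ----
"""

def parse_biopsies(text):
    """Return a list of (biopsy date, set of event codes) in the order listed."""
    tokens = text.split()
    records = []
    for day, flags in zip(tokens[0::2], tokens[1::2]):
        events = set(flags) - {"-"}          # drop the placeholder dashes
        records.append((date.fromisoformat(day), events))
    return records

records = parse_biopsies(EXCERPT)
counts = Counter(code for _, events in records for code in events)

print(len(records), "biopsies in the excerpt")
for code in "HINO":
    print(code, counts[code])
```

Pasting the full listing in place of the excerpt should let you check the counts quoted above (H, I, N, and O) against the data.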
The first 170 biopsies occurred prior to a major change in procedure, the use of real time ultrasound to help with needle positioning. The three month period from July through September 2005 was considered a transition period. There were 20 biopsies performed during this transition. The biopsies from October 2005 onward were considered to be part of the post implementation phase.
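A small Python sketch of this three-way split follows. The cutoff dates assume the transition ran from July 1 through September 30, 2005 on the shifted calendar; the year is inferred from the statement that the post implementation phase starts in October 2005, and the names are illustrative.

```python
from datetime import date

# Assumed boundaries on the shifted calendar used in the listing.
TRANSITION_START = date(2005, 7, 1)
POST_START = date(2005, 10, 1)

def phase(biopsy_date):
    """Classify a biopsy as pre-implementation, transition, or post-implementation."""
    if biopsy_date < TRANSITION_START:
        return "pre"
    if biopsy_date < POST_START:
        return "transition"
    return "post"

# Three dates taken from the raw data listing.
for d in (date(2005, 6, 30), date(2005, 7, 8), date(2005, 10, 9)):
    print(d.isoformat(), phase(d))
```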
The plot below shows the time when certain biopsy events occurred. The shaded region represents the transition period.
Here are some control charts monitoring the frequency of events.
Note that about one in 11 biopsies involves hematuria and that rate is relatively stable before, during, and after the transition.
In contrast, the problems with inadequate tissue have virtually disappeared. More than 80 consecutive biopsies, a run that began shortly before the transition, have passed without a single one resulting in inadequate tissue. If you remove this last data point from the calculation of the control limits, you will find that the average number needed to harm prior to the transition was approximately 9. So before the use of ultrasound for kidney biopsies, we were sending every ninth patient back for a second biopsy because of inadequate tissue. After using ultrasound, we have stopped experiencing any problems with inadequate tissue.
Approximately every tenth biopsy required the use of pain control medication. This rate is stable before, during, and after the transition.
Other adverse events occur in one out of every 20 biopsies on average. We are currently experiencing a large gap in other adverse events, and if this continues for 9 or 10 more patients, we will have evidence that this rate has recently slowed.
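The quantity behind these charts is the number of biopsies between successive events, and its average is the estimated number needed to harm. Here is a hedged Python sketch using the event codes of the first 16 biopsies in the listing. The function names are illustrative, and an excerpt this short is far too small to estimate a rate, so treat the numbers as a demonstration of the bookkeeping only.

```python
# Event codes for the first 16 biopsies in the raw data listing.
FLAGS = ["----", "----", "----", "----",
         "----", "H-NO", "----", "----",
         "---O", "----", "H---", "----",
         "----", "----", "--N-", "H-NO"]

def event_gaps(flags, code):
    """Number of biopsies from just after the previous event up to and including each event."""
    gaps = []
    since_last = 0
    for f in flags:
        since_last += 1
        if code in f:
            gaps.append(since_last)
            since_last = 0
    return gaps

def open_gap(flags, code):
    """Number of biopsies since the most recent event (the still-open gap)."""
    count = 0
    for f in reversed(flags):
        if code in f:
            break
        count += 1
    return count

hematuria_gaps = event_gaps(FLAGS, "H")
print("gaps between hematuria events:", hematuria_gaps)
print("average gap:", sum(hematuria_gaps) / len(hematuria_gaps))
print("biopsies since last inadequate tissue event:", open_gap(FLAGS, "I"))
```

The open gap is what matters for the comments above about inadequate tissue and other adverse events: a long open gap is not yet a data point, but once it grows well past the usual gaps it becomes evidence that the rate has changed.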
This webpage was written by Steve Simon on 2007-11-19, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page. Category: Adverse events in clinical trials

I recently published a book, Statistical Evidence in Medical Trials, What do the Data Really Tell Us? through Oxford University Press. A good summary of what this book is about appears on the back cover: