
Original Contribution

How to Improve (and How to Tell)

“If you can’t measure it, you can’t improve it.”

This quote has been attributed to Dr. William Edwards Deming, often described as the father of quality improvement. While that isn’t actually what he said, nor entirely what he meant, it is true that measuring indicators of performance gives managers an objective tool to use when improving the quality of their systems.

Key performance indicators (KPIs) are often used to help maintain a “10,000-foot” overview of the way a system is functioning. They can also help determine if an improvement effort actually enhances performance. To demonstrate how to use KPIs, let’s walk through a sample improvement effort. 

Example: STEMI Care

To help make this a practical discussion, let’s look at a hypothetical system that’s trying to achieve the American Heart Association’s Mission: Lifeline recognition for STEMI care. 

One of the requirements of this recognition is a first-medical-contact-to-device time under 90 minutes 75% of the time. For the sake of this discussion, let’s say this system is fortunate enough to have cardiologists meet them at the ED door on every STEMI alert and inflate balloons in less than 10 minutes from arrival. We need to focus on the EMS component of this metric. Let’s call this EMS time and define it as the time from patient contact to departure from scene (since we can’t change our transport distance, let’s leave that out). 

If we look at our data, we can create a “run chart” that shows the average EMS time for each quarter over the past two years. Figures 2A–D show simple run charts that describe the various components of EMS time. We can see how our average times for each quarter compare to our goal. Each quarter is represented by a bar, while the goal is the straight line running across all quarters.

For the purposes of this article, we’ll focus on just one of these components, time to first ECG. The same concepts discussed here can obviously be applied to any of these components.

Almost all electronic health records allow us to pull this data into reports that export to Microsoft Excel or similar spreadsheets; some EHR systems will even create these run charts for you. Figure 1 shows an example of the time to first ECG for suspected ACS patients from one EHR.

Our goal is to obtain a 12-lead ECG within five minutes of patient contact for all adults with nontraumatic chest pain. The charts in Figure 2 show our average performance but don’t describe how well we’re performing against our actual goal of complying with the five-minute target 95% of the time. To do this we use Excel to add a new column indicating whether each individual call had a 12-lead obtained in under five minutes. Excel has a handy feature called pivot tables that allows us to easily and quickly summarize large amounts of data. Table 1 is an example of such a pivot table; it organizes the data by quarter and includes both the average time and the rate of compliance in each quarter. We can now visualize both quarterly average times and the quarterly compliance rate. Figure 3 does this by adding a line chart showing the compliance rate for each quarter. In this chart the goal line we’ve shown is for compliance rate rather than average time.
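If you prefer to work outside Excel, the same summary takes only a few lines of code. Here is a minimal sketch in Python using pandas; the file name (ecg_times.csv) and column names (incident_date, minutes_to_first_ecg) are hypothetical stand-ins for whatever your EHR export actually contains.

```python
# Minimal sketch of the Excel pivot-table step, assuming a hypothetical CSV
# export with one row per call, a call date and the minutes to first ECG.
import pandas as pd

GOAL_MINUTES = 5  # 12-lead ECG within five minutes of patient contact

df = pd.read_csv("ecg_times.csv", parse_dates=["incident_date"])

# Flag each call as compliant or not, just like adding a new column in Excel.
df["compliant"] = df["minutes_to_first_ecg"] < GOAL_MINUTES

# Group by calendar quarter and summarize: average time and compliance rate.
summary = df.groupby(df["incident_date"].dt.to_period("Q")).agg(
    avg_minutes=("minutes_to_first_ecg", "mean"),
    compliance_rate=("compliant", "mean"),
)
summary["compliance_rate"] *= 100  # express as a percentage

print(summary)
```

The result mirrors Table 1: one row per quarter with the average time to first ECG and the percentage of calls meeting the five-minute goal.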

In and Out of Control

One additional thing to check before trying to figure out where our problems lie is whether the variation in performance from quarter to quarter falls within what we would expect (because this is reality, we’re never going to have exactly the same performance each quarter) or outside of expected variation.

Deming called these variations common cause (the normal type) and special cause (abnormal). If the only thing we see in our process is common cause variation, this is expected, and we shouldn’t waste our time trying to identify a problem in the system. If, on the other hand, we see special cause variation, we should suspect that something “special” changed in our process. In this case it’s worth our time to figure out what it was. 

For example, if our response time performance has been stable over time and suddenly increases, we’d want to know if something changed (increased traffic congestion from road construction, for example) or if that increase is just random and will go back to normal. There are several rules for determining whether variation has a common or special cause. Table 2 shows a general set of these rules.

Deming and his colleague Dr. Walter Shewhart developed a method for rapidly identifying processes that are “in control,” meaning there is only common cause variation, or “out of control,” meaning there is special cause variation. Data is plotted onto a process control chart, also known as a Shewhart chart: the individual data points (in our case quarterly compliance with our goal) are plotted over time. The average of all these points is then calculated and drawn across the chart. Finally, upper and lower control limits are drawn at three standard deviations above and below the center (average) line.
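For readers who want to see the arithmetic behind the chart, here is a minimal sketch in Python. The quarterly compliance values are invented for illustration, and the limits follow the description above (three standard deviations around the average); formal individuals charts often derive their limits from the average moving range instead.

```python
# Minimal sketch of a Shewhart (process control) chart: center line plus
# upper/lower control limits at three standard deviations.
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical quarterly compliance rates (%) over two years
compliance = np.array([72, 78, 75, 80, 76, 79, 74, 82])

center = compliance.mean()
sigma = compliance.std(ddof=1)   # sample standard deviation
ucl = center + 3 * sigma         # upper control limit
lcl = center - 3 * sigma         # lower control limit

plt.plot(compliance, marker="o")
plt.axhline(center, label=f"Center {center:.1f}%")
plt.axhline(ucl, linestyle="--", label=f"UCL {ucl:.1f}%")
plt.axhline(lcl, linestyle="--", label=f"LCL {lcl:.1f}%")
plt.xlabel("Quarter")
plt.ylabel("Compliance (%)")
plt.legend()
plt.show()

# Points beyond the limits suggest special cause variation worth investigating.
outside = np.where((compliance > ucl) | (compliance < lcl))[0]
print("Quarters outside control limits:", outside)
```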

Using the rules from Table 2, we can see from Figure 4 that our process (time to first ECG) is “in control.” That is, only common cause variation exists. Another way of saying this is that the only variation present is within the limits we’d expect. We should not spend time trying to find the reason for this variation. On the other hand, if we saw a single quarter that was outside our control limit, we should consider what happened that quarter. Perhaps we had a software update to our monitors that required medics to input many variables into the monitor before it would acquire the ECG. In this case we would want to talk to the vendor to change the software and allow us to get the ECG faster. 

Although our time to first ECG is “in control,” it isn’t meeting our goal. It’s important to understand that a process control chart is only one part of an improvement effort: it allows us to focus our energy on the things that matter and not “chase squirrels” when we don’t need to. In this case, rather than chasing variation, we should focus on improving compliance.

Digging Deeper

From Figure 3 we see that while we are below a five-minute average every quarter, our actual compliance is below our goal of 95%. This is because averages don’t tell the entire story. An often-used example of this shortcoming is that human beings, on average, have only one testicle. While it may be roughly true, it isn’t very helpful. So let’s dig deeper into this data and see if we can’t find some outliers that suggest areas for improvement. 

We can add a bit more data to our average bars from Figure 3. Since we know wide variation can be hidden within an average, it’s helpful to look at an indicator of the range of data summarized by each average. One common way of doing this is the interquartile range (IQR): sort a column of numbers from smallest to largest and identify the values at the 25th and 75th percentiles; the IQR is the span between them. Figure 5 adds a line to each average bar in Figure 3. These “error bars” indicate the 25th- and 75th-percentile values for each quarter. Looking at this we can see the values at the 75th percentile clearly exceed the goal of five minutes, which indicates there is wide variation in the times each quarter. This variation could be evenly distributed across the system; alternatively, a small group of medics or units could contribute most of the outlying data. We can check for this by using Excel pivot tables to summarize our data by unit or by individual medic.
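As a sketch of how the 25th- and 75th-percentile error bars in Figure 5 might be computed, the snippet below summarizes the spread for each quarter. As before, the file and column names are hypothetical placeholders for your own export.

```python
# Minimal sketch: quarterly average plus 25th/75th percentiles and the IQR.
import pandas as pd

df = pd.read_csv("ecg_times.csv", parse_dates=["incident_date"])
quarter = df["incident_date"].dt.to_period("Q")

grouped = df.groupby(quarter)["minutes_to_first_ecg"]
spread = pd.DataFrame({
    "avg": grouped.mean(),
    "p25": grouped.quantile(0.25),
    "p75": grouped.quantile(0.75),
})
spread["iqr"] = spread["p75"] - spread["p25"]

print(spread)
```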

In Figure 6 we plot the average time to first ECG and compliance with our five-minute goal for each medic unit. We can see that on average, all but one medic unit beats the goal, yet the actual compliance rate of nearly every unit is below our goal of 95%. The exceptions on compliance are Medics 1 and 30. If you look at the data table on this chart, you can see Medic 1 had exactly two patients who needed an ECG; it turns out Medic 1 almost always arrives after another unit. At the other extreme, the worst compliance belongs to Medic 59, which had a low average time (3.78 minutes) but low compliance (50%), again because it saw only two patients. You can see this reflected in the wide variation bars. Medic 59 is one of our community health paramedic units and doesn’t often respond to 9-1-1 calls.

Overall we see that each unit has roughly the same pattern as the overall system; there isn’t one outlier unit. If there were, we could consider the reasons. 

Now let’s see if a few individual medics are causing the compliance to drop. Figure 7 plots the time to first ECG by medic, sorted from lowest to highest, with the goal of five minutes plotted for reference. We also plot the number of cases each medic saw on top of this chart as an alternative view to the data table in Figure 6. As with the outlier medic units, several of the medics with the slowest average times had very few actual cases; in those cases one long time will dramatically influence the average.

Another way of looking at this is with a Pareto chart. These charts plot a value against its cumulative percentage of the whole and are designed to help you rapidly identify the areas needing the most improvement. As an example, let’s look hypothetically at the number of mechanical breakdowns of each of seven ambulances. Table 3 shows the raw data: the number of breakdowns for each ambulance, sorted from most to fewest. Using this data we calculate each unit’s percentage of all breakdowns and the cumulative proportion of all breakdowns. Figure 8 shows the Pareto chart for this example. These charts are good for identifying where 80% of the issue lies in systems where performance is not evenly distributed.

If the problem lies with relatively few units, your time is best spent addressing those individual units rather than trying to fix units that aren’t a problem. In Figure 8 we see 80% of our breakdowns are caused by only three of our units. We can apply this concept to our time-to-first-ECG metric and see if just a few units or medics contribute disproportionately to our low compliance. In Figures 9 and 10 we see Pareto charts for time to first ECG broken down by unit and medic. In each of these charts, 80% of the poor compliance is marked, along with the number of units or medics responsible for that 80%. Neither figure indicates the problem is isolated to just a few units or medics; instead it appears evenly distributed across the system. This is another indication that if we wish to improve our compliance with this metric, we need to address it from a system standpoint and not focus only on one medic or unit.
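To make the Pareto arithmetic concrete, here is a minimal sketch using invented breakdown counts standing in for Table 3 (the unit names and numbers are illustrative only).

```python
# Minimal sketch of a Pareto calculation: each unit's share of the total and
# the cumulative percentage, sorted from most breakdowns to fewest.
import pandas as pd

breakdowns = pd.Series({
    "Medic 7": 14, "Medic 3": 11, "Medic 12": 9, "Medic 5": 4,
    "Medic 9": 3, "Medic 1": 2, "Medic 30": 1,
}).sort_values(ascending=False)

pct = breakdowns / breakdowns.sum() * 100   # share of all breakdowns
pareto = pd.DataFrame({
    "count": breakdowns,
    "pct": pct,
    "cumulative_pct": pct.cumsum(),
})
print(pareto)

# Units accounting for (roughly) the first 80% of all breakdowns
print(pareto[pareto["cumulative_pct"] <= 80].index.tolist())
```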

The PDSA Cycle

Based on our analysis of our data, it appears there is no individual cause for our poor compliance with time to first ECGs. We need to find a way to identify a workable, systematic solution if we want to improve this metric. A frequently used scheme for doing this is called a PDSA cycle (Figure 11). This is an intuitive approach to empiric improvement. The cyclic nature indicates there are often multiple iterations of an improvement effort. 

In the first step (plan), we analyze our existing data and try to come up with potential strategies to improve. These strategies should be specific and have a carefully defined expected outcome: What do we expect to happen, and how will we measure it? Once such a strategy has been developed, test it to see if it works (do). Implement the change on a small scale (say, at one station for one shift only) for a short period (say, a week). The idea is to quickly get feedback on the solution and identify any issues it may have.

We should collect the data we identified during our planning phase. After we’ve given the solution a bit of time, we analyze our experiences (study). The data collected in the do phase should obviously include the primary metric but also subjective things like medics’ opinions on the solution and suggestions for fine-tuning it. If the solution improved the metric and no additional modifications were identified, the solution should be implemented on a larger scale (act). If the process didn’t improve, abandon it and continue the cycle with additional solutions until you find one that works and can be implemented across the system.

In our example, let’s say a group of medics notice they seem to spend a lot of time after first reaching the patient with introductions and taking a history while trying to simultaneously apply ECG leads. They also note first responders are almost always on scene before they arrive. The medics wonder if they might have the first responders apply the ECG leads before they get there. They propose to use a PDSA cycle to try to improve the time to first ECG for their shift at their station. 

They identify several potential obstacles (the first responders don’t have ECG leads and don’t know where to apply them) and solutions (provide them the ECG leads from EMS supplies and replace what’s used; provide a brief training course on how to apply them). They decide they’ll look at the time to first ECG for the next four shifts (they see a lot of patients with chest pain) and see if it makes an improvement. They’ll also be sure to get feedback from both medics and first responders at the end of the four-shift period. Their baseline performance is 65% compliance with the goal of getting a 12-lead ECG within the first five minutes of patient contact. They want to improve that to greater than 95%. 

With a plan in place, they do their change for four shifts. At the end they study their results. They’ve seen 25 patients with chest pain, and their average time to first ECG didn’t change much: it started at 4.3 minutes and was 4.0 minutes with the new process. Their compliance with their goal, however, improved to 85%, achieved by decreasing variation and eliminating prolonged outlier times. They discuss the feedback they’ve collected and decide giving the first responders monitor cables in addition to electrodes might further improve times. They arrange to provide these cables and then go another four shifts. Again the average time doesn’t change much, but compliance increases to 98%. They don’t identify any unintended consequences, and their administration can absorb the cost of the cables. They now roll out the change to all three shifts at their station and see similar results. Finally they implement the change across the entire system.
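The “study” step boils down to simple arithmetic: compute compliance before and after the trial and compare. The sketch below uses invented times purely to illustrate the calculation; the 65%, 85% and 98% figures above come from the hypothetical data in this example.

```python
# Minimal sketch of the "study" comparison: compliance with the five-minute
# goal before and after the small-scale change. Times (minutes) are invented.
GOAL_MINUTES = 5

def compliance(times, goal=GOAL_MINUTES):
    """Fraction of calls with a 12-lead ECG obtained within the goal time."""
    return sum(t < goal for t in times) / len(times)

baseline = [3.2, 4.8, 6.1, 2.9, 7.4, 4.1, 3.6, 5.5, 4.0, 8.2]  # hypothetical
trial    = [3.0, 4.2, 4.6, 2.8, 4.9, 3.7, 3.4, 4.4, 3.9, 4.7]  # hypothetical

print(f"Baseline compliance: {compliance(baseline):.0%}")
print(f"Trial compliance:    {compliance(trial):.0%}")
```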

Types of Performance Measures

There are four broad categories of performance measures: 

1. Structure measures look at the things a system must have to care for patients. As an example, a system needs functioning ECG monitors. A structure measure might track the number of monitor failures each month. 

2. Process measures look at the things we do when caring for patients. These are typically based on agreed-upon industry “standards” and, hopefully, evidence-based medicine. The extent to which we give adult patients aspirin for suspected acute coronary syndrome is an example of a process measure.

3. Outcome measures look at the impact we have on patients. These require data from hospitals and as a result are more challenging for EMS systems to measure but ultimately are the most important indicators of the things our patients care about. The most commonly used outcome measure in EMS is cardiac arrest survival rates. Entities such as the CARES registry help systems measure this outcome. 

4. Balancing measures help us ensure we’re measuring the right things. It’s easy to overly focus on improving one aspect of a system, only to lose sight of the impact this has on other parts of the system. For example, emergency departments rightly focus much time and energy on improving the care of septic patients. A good balancing measure would look at what happens to other patients in the ED at the same time as septic patients. Did they suffer because attention was drawn away from them in favor of the septic patient?

Conclusion

With any improvement effort we must be clearly focused on what we want to achieve and honest about whether we achieved it. In this case we can use a run chart to compare our baseline performance and subsequent performance after our improvement effort. Figure 12 is a continuation of the process control chart from the last two years (Figure 4) but with an additional four quarters of hypothetical data through the end of 2017. We have also added annotations to mark where the improvement effort occurred and what the difference in average compliance was before and after the change. 

To do this in Excel, we added rows representing the additional quarters of data. The new center line and upper/lower control limits are based on the new data. You can see the new process began in the first quarter of 2017 and resulted in an improvement in average compliance from 77.5% to 96.0%. We are now meeting our goal of obtaining an ECG within five minutes of patient contact 95% of the time.
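As a rough sketch of that before/after comparison, the snippet below splits hypothetical quarterly compliance values at the point of the change and computes the two averages (the values are invented so the means match the 77.5% and 96.0% reported above); one reasonable reading of “based on the new data” is to derive the new limits from the post-change quarters.

```python
# Minimal sketch of annotating a before/after split on the control chart.
import numpy as np

# Hypothetical quarterly compliance (%): eight quarters before the change,
# four quarters after the new process started in Q1 2017.
compliance = np.array([76, 79, 74, 82, 78, 75, 80, 76,   # before
                       95, 96, 97, 96])                   # after

change_index = 8  # first quarter of 2017

before, after = compliance[:change_index], compliance[change_index:]
print(f"Average compliance before: {before.mean():.1f}%")   # 77.5%
print(f"Average compliance after:  {after.mean():.1f}%")    # 96.0%

# New center line and control limits derived from the post-change quarters
center = after.mean()
sigma = after.std(ddof=1)
print(f"New limits: {center - 3 * sigma:.1f}% to {center + 3 * sigma:.1f}%")
```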

Deming’s actual quote from his 1994 book The New Economics for Industry, Government, Education is, “It is wrong to assume that if you can’t measure it, you can’t manage it—a costly myth.” Deming is frequently associated with the tools of quality improvement, which may be why he has been misquoted, but his point is that measurement, while necessary for improvement, is not sufficient. 

We have described several tools for measuring and improving the performance of EMS systems: run charts, process control charts, Pareto charts and PDSA cycles. It is frighteningly easy to get absorbed in analyzing data, much like a modern version of divination, just using Excel instead of tea leaves. Management consultant Peter Drucker noted, “Management is doing things right. Leadership is doing the right things.” For a quality improvement process to be successful, we certainly need managers who can use the tools of performance measurement and improvement, but we must also have leaders who understand what is important to measure and, most important, that there are some things that can’t be measured. Some of those things, like the intrinsic value of a medic’s compassion, are more important than any performance indicator. May your EMS system have both managers and leaders and be full of both measurable excellence and unmeasurable compassion.

Jeffrey L. Jarvis, MD, MS, EMT-P, FACEP, FAEMS, is EMS medical director for the Williamson County EMS system and Marble Falls Area EMS and an emergency physician at Baylor Scott & White Hospital in Round Rock, TX. He is board-certified in emergency medicine and EMS. He began his career as a paramedic with Williamson County EMS in 1988 and continues to maintain his paramedic license. 

 
