Question: Is it Appropriate to Plot Averages on a Process Behavior Chart?

I’ve gotten this question a few times recently, basically asking if it’s OK to plot metrics like these on a Process Behavior Chart (PBC):

  • Weekly average emergency patient waiting time
  • Monthly average lost sick days
  • Daily median waiting time for clinic patients

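The concern is usually expressed as, “I was taught it was dangerous to take an average of averages.” See here for an example of that math dilemma, which is worth paying attention to in other circumstances. We don’t need to worry about it with PBCs.

In a PBC, we plot a central line that’s usually the average of the data points we are analyzing. So, when those data points are themselves averages, the central line is an average of averages. That is less problematic in this context because the chart describes the behavior of the metric we chose to plot (the weekly average, for example) rather than claiming to be the overall average across every individual patient.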

I asked Donald J. Wheeler, PhD about this and he replied:

“The advantage of the XmR chart is that the variation is characterized after the data have been transformed. Thus, we can transform the data in any way that makes sense in the context, and then place these values on an XmR chart.”

PBC is another name for the XmR chart method.
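For readers who want to see the arithmetic, here is a minimal Python sketch of how XmR limits are commonly calculated. The central line is the average of the plotted values, and the Natural Process Limits sit 2.66 average moving ranges on either side of it. The constants 2.66 and 3.268 are the usual XmR scaling factors; the function name `xmr_limits` and the sample numbers are made up for illustration and are not the data used in this post.

```python
from statistics import mean


def xmr_limits(values):
    """Return the central lines and limits for an XmR chart (a sketch)."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a moving range")
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": x_bar,
        "lower_limit": x_bar - 2.66 * mr_bar,
        "upper_limit": x_bar + 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_upper_limit": 3.268 * mr_bar,
    }


# Hypothetical weekly average waiting times, in minutes.
weekly_averages = [30, 34, 28, 32, 36, 30]
limits = xmr_limits(weekly_averages)
print(limits)
```

The function does not care whether the inputs are individual values, averages, or medians, which is the point of the quote above: the variation is characterized from whatever values end up on the chart.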

In his excellent textbook Making Sense of Data, Wheeler plots weekly averages with no warnings about that being problematic, as shown here:

I did an experiment with a made-up data set.

The data consist of individual patient waiting times. The only thought I put into it was that waiting times might get longer as the day goes on, so I built in some of that.

Here are the X chart and the MR chart:

When I look at the PBC of individual waiting times, it’s “predictable” (or “in control”) with an average of 31.42. The limits are quite wide, but I don’t see any signals. I had 7 consecutive data points above the average (dumb luck in how I entered the data). I could see a scenario where, for example, afternoon waiting times are always longer than morning waiting times, so we could perhaps see daily “shifts” in the metric.
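A run of 7 points on one side of the average falls just short of the commonly used run rule, which treats 8 consecutive points on the same side of the central line as a signal. Here is a small sketch of that check, assuming that threshold of 8; the helper name `longest_run_one_side` and the numbers are hypothetical, not the waiting times from my data set.

```python
from statistics import mean


def longest_run_one_side(values, center):
    """Length of the longest run of consecutive points on one side of center."""
    longest = 0
    current = 0
    previous_side = 0
    for v in values:
        # +1 if above the central line, -1 if below, 0 if exactly on it.
        side = (v > center) - (v < center)
        if side != 0 and side == previous_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0  # a point on the line breaks the run
        previous_side = side
        longest = max(longest, current)
    return longest


# Hypothetical individual waiting times, in minutes.
waits = [29, 33, 34, 35, 33, 36, 34, 35, 28, 27]
run = longest_run_one_side(waits, mean(waits))
is_signal = run >= 8  # assumed threshold: 8 in a row on one side
print(run, is_signal)
```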

I then plotted the average waiting time for each of these 5 days. That’s a minimal number of data points, but I wanted to test these charts with admittedly minimal effort to start.

The average of the averages is 33.23, compared with 31.42 for the individual waiting times. Not a huge difference.
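The two numbers can only differ when the days have different numbers of patients, because a plain average of daily averages gives every day equal weight regardless of how many patients were seen. The sketch below shows the effect with hypothetical waiting times (not my data set), and shows that weighting each daily average by its patient count recovers the overall average.

```python
from statistics import mean

# Hypothetical waiting times in minutes, grouped by day.
waits_by_day = {
    "Mon": [20, 30],
    "Tue": [25, 35, 45, 55],
    "Wed": [30, 30, 30],
}

# Average over every individual patient.
all_waits = [w for day in waits_by_day.values() for w in day]
grand_average = mean(all_waits)

# Unweighted average of the daily averages: each day counts equally.
daily_averages = [mean(day) for day in waits_by_day.values()]
average_of_averages = mean(daily_averages)

# Weighting each daily average by its patient count recovers the grand average.
weighted_average = sum(
    mean(day) * len(day) for day in waits_by_day.values()
) / len(all_waits)

print(grand_average, average_of_averages, weighted_average)
```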

The PBC for the daily averages is also predictable, with narrower limits, as I’d expect.

I think it’s fine to plot a series of averages. What matters most is how we interpret the PBCs — to avoid overreacting to every up and down in the data, for example.

2 Comments

  • Simon Dodds

    The obvious differences between the two charts are the reduction in the number of points and the narrowing of the process limits caused by the averaging. Averaging is throwing away data, so we need to be aware that aggregating a limited set of data will reduce our diagnostic ability to identify potentially significant signals that have assignable causes from which we can learn. There is a further risk in doing this when designing the flow resilience we need to buffer common cause variation … by suppressing the actual point-to-point variation we may under-estimate the buffering capacity we need and run the risk of tipping our design over the chaotic transition point. Your experiment could be extended to include some signals in the source data set and then see what happens to them when you aggregate. 🙂

  • Mark Graban

    Thanks for the comment, Simon. Yes, there’s some risk caused by aggregating data. There are some scenarios where a weekly metric shows a signal (a data point above the Upper Natural Process Limit) and that gets lost when the data is aggregated into a monthly metric.
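    As a rough illustration of that scenario, here is a sketch with made-up numbers: 16 weekly values containing one spike, which are then averaged into 4 "months" of 4 weeks each. The limits use the usual XmR factor of 2.66 times the average moving range. The helper names are hypothetical, and 4 monthly points is far too few for a real chart; this only shows the mechanism.

```python
from statistics import mean


def natural_process_limits(values):
    """Lower and upper Natural Process Limits from the average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center + spread


def points_outside(values):
    """Return the values that fall outside their own Natural Process Limits."""
    lower, upper = natural_process_limits(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical weekly metric with a single spike (22) in week 11.
weekly = [10, 12, 10, 12, 10, 12, 10, 12, 10, 12, 22, 12, 10, 12, 10, 12]

# Aggregate into 4-week "months" by averaging.
monthly = [mean(weekly[i:i + 4]) for i in range(0, len(weekly), 4)]

weekly_signals = points_outside(weekly)
monthly_signals = points_outside(monthly)
print(weekly_signals, monthly_signals)
```

    In this made-up case the spike is outside the weekly limits, but the month that contains it stays inside the monthly limits.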
