Analyze Phase: Lean Six Sigma in Healthcare Tutorial

5.1 Module 5 Analyze

Hello and welcome to the fifth module of the Lean Six Sigma in Healthcare course offered by Simplilearn.

The DMAIC process is the foundation of LSS projects. DMAIC stands for the project phases: Define, Measure, Analyze, Improve, and Control.

This module is dedicated to the third of these phases, Analyze. In this phase, the data acquired is analyzed to validate the hypotheses about the vital few causes that have the highest impact on the problem.

Let us explore the objectives of this lesson in the next screen.

5.2 Topic 1 Importance of Analyze Phase

The Analyze phase is considered to be the most important phase of the Lean Six Sigma project process because it allows for fact-based discovery rather than jumping to solutions based on preconceived notions. Analysis helps the project team learn about the causal relationships between the input and output variables of a process, which then allows them to determine the highest-impact solutions.

Ignoring this phase, just going through the motions, or jumping past it too quickly could result in a project team traveling down the wrong path to an ineffective or sub-par solution.

On the other hand, staying too long in this phase could lead to a situation commonly referred to as “analysis paralysis,” where a project team keeps analyzing and does not act on the conclusions it reaches.

The reasons for this situation vary from team to team and project to project, but some common explanations are:

  • Lack of confidence in the team’s ability to implement a solution
  • Fear of changing the status quo
  • Fear of making a mistake
  • Fear of not having enough information to make an informed decision

One effective way of avoiding or overcoming analysis paralysis is to set clear goals for what has to be analyzed and what conclusions need to be reached, and then stick to that plan. Of course, sometimes the data reveals new angles that require new hypotheses of causal relationships to be explored, but once again, the team must always keep the project goal in mind and not get lured into searching for a panacea. Remember the Pareto principle: 80% of the problem is probably due to 20% of the causes.

In the next screen, we’ll discuss the types of analyses required in the Analyze phase.

5.3 Topic 2 Data Validation

As we can see from the data validation table given on the screen, a visual analysis is very useful for both existing and new data. Visual analysis can be performed by plotting the data on different types of graphs and by representing different points of view of the same data.

Let us look at some of these graphs in the next screen.

5.4 Topic 3 The Cause and Effect Relationship

The mathematical equation Y=f(x) is a very important concept in LSS. It means that Y (the output variable) is a function of x (the input variable). This representation is simplistic: most outputs are the result of more than one input, so the real relationship is most likely much more complex.

The equation also suggests that the outcome (or problem) is a function of one or more causes (or x’s). The degree of impact of each of these causes varies, with the critical x’s having the greatest impact on the outcome. The equation may look more like this:

Y = f(x1 + 5x2 + 3x3 + … + xn)

So what we’re trying to do in the Analyze phase is determine which of the causes has the highest impact on the outcome (like the one with a multiplier of 5 in the equation). In other words, which are the critical x’s, or key process input variables (KPIVs)?

This equation is important because it allows us to have a deeper understanding of the cause and effect relationship and sets the stage for more robust statistical analyses, if required.

While statistical analyses like hypothesis testing and linear regression can help determine the degree to which certain factors contribute to the outcome, in this course we’ll focus on effective, yet simpler, methods such as Pareto charts and the 7 wastes.

To actually make this formula a working mathematical equation that can then be used for modeling and prediction, it stands to reason that the data in the model must be continuous and not discrete. But since we won’t be doing any modeling or statistical analyses in this course, suffice it to say that the equation Y=f(x) is an excellent way to visually represent which critical x’s the project team is working to determine.

Two analyses are required to determine the critical x’s: process analysis and data analysis. Let us look at process analysis in the next screen.
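As a rough illustration of what a "critical x" means, the sketch below borrows the illustrative multipliers from the example equation (1, 5, and 3) and measures how much Y moves when each input changes by one unit. The weighted-sum form of `f`, the baseline values, and the helper `rank_inputs` are assumptions made for this example and are not part of the course material.

```python
# Hypothetical model: Y = x1 + 5*x2 + 3*x3 (multipliers from the example equation).
WEIGHTS = {"x1": 1.0, "x2": 5.0, "x3": 3.0}


def f(inputs):
    """Return Y as a weighted sum of the inputs."""
    return sum(WEIGHTS[name] * value for name, value in inputs.items())


def rank_inputs(baseline, step=1.0):
    """Rank inputs by how much Y changes when each one moves by `step`."""
    base_y = f(baseline)
    effects = {}
    for name in baseline:
        nudged = dict(baseline)
        nudged[name] += step
        effects[name] = abs(f(nudged) - base_y)
    # Largest effect first: these are the candidate critical x's.
    return sorted(effects, key=effects.get, reverse=True)


baseline = {"x1": 10.0, "x2": 10.0, "x3": 10.0}  # assumed starting values
ranking = rank_inputs(baseline)
print(ranking)  # ['x2', 'x3', 'x1']
```

In a real project the multipliers are not known in advance. The team estimates which inputs matter most from the collected data, which is what the charts covered in this module are for.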

5.5 Topic 4 Data Analysis

Data analysis is very similar to the validation step we covered at the beginning of this module, in which the project team used data to answer their questions. However, the types of questions are different this time around. Earlier, the purpose was to check whether the data was good enough to use. Now, the purpose of the analysis is to uncover the root causes that have the greatest impact on the problem.

So data analysis will provide the project team with insight into the vital few x’s (or KPIVs). This is accomplished by asking questions (or formulating hypotheses) and then using the data to answer them. Different charts and graphs can be used, and the data can be represented in different ways to provide those answers.

We’ll cover the 3 most commonly used graphs in this course: the box plot, the Pareto chart, and the histogram.

In the next screen, we look at the box plot.
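To make the Pareto chart concrete, here is a minimal sketch of the table behind one: categories sorted from largest to smallest, with a running cumulative percentage. The function name `pareto_table` and the tallies are hypothetical and are not data from the course.

```python
def pareto_table(counts):
    """Return (category, value, cumulative percent) rows, largest value first."""
    total = sum(counts.values())
    rows = []
    running = 0.0
    for category, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        rows.append((category, value, round(100.0 * running / total, 1)))
    return rows


# Hypothetical tallies of delay causes; not data from the course.
delay_causes = {
    "Cause A": 50,
    "Cause B": 30,
    "Cause C": 10,
    "Cause D": 6,
    "Cause E": 4,
}

for category, value, cumulative in pareto_table(delay_causes):
    print(f"{category:<8} {value:>3} {cumulative:>6.1f}%")
```

With these made-up numbers, the first two of five causes account for 80% of the total, which is the kind of "vital few" pattern a Pareto chart is meant to expose.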

5.6 Topic 5 Case Study

The data collection effort at Mercy West hospital went well. After a few hiccups in the beginning, when people were not filling in the data travellers completely, a small information session in each department resolved the issue in short order. The team was very glad that they physically followed the first few travellers and caught the errors before the damage was done.

After three weeks, the data-gathering effort was complete and all the results were transcribed and tallied in an Excel spreadsheet. Now it was time to validate the spreadsheet data against the data sheets to make sure there were no errors.

The first step was to simply create a scatterplot to see if there were any outliers in the data. The team discovered a dozen data points that indicated cycle times of several months, which really didn’t make sense. After checking the Excel sheet against the data travellers, they found that the data had been entered correctly but that, for some reason, Excel had formatted it wrongly and swapped the month and the day for each of those data points. The team fixed the error and also decided to check for any other swaps by sorting the data by date and checking whether any data points fell outside the sample date range.

Now that the data was validated, the analysis could begin. John and his team started with a Pareto chart to determine which activities accounted for most of the cycle time. To do this, they calculated the mean duration for each activity in the process.

Next, they calculated the value-added (VA) time and non-value-added (NVA) time for the process, as well as the takt time (or pace required). They discovered that in order to meet the pace, the slowest element would have to triple its productivity. That bottleneck was the administrative clerk who was charged with gathering all the “releases” prior to closing the discharge file. This made complete sense, so the team returned to the data to view the cycle time for each of those release elements.
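Two of the steps above lend themselves to a short sketch: the date-range check used to catch swapped month and day values, and the takt time calculation (available working time divided by demand). Every number, date, and function name below is hypothetical and chosen only for illustration. The case study does not give the team's actual figures.

```python
from datetime import date


def out_of_range(records, start, end):
    """Return records, sorted by date, that fall outside the sampling window."""
    ordered = sorted(records, key=lambda r: r["date"])
    return [r for r in ordered if not (start <= r["date"] <= end)]


def takt_time(available_minutes, demand_units):
    """Takt time = available working time / customer demand."""
    return available_minutes / demand_units


# Hypothetical three-week sampling window and records.
window_start, window_end = date(2023, 3, 6), date(2023, 3, 26)
records = [
    {"id": 1, "date": date(2023, 3, 7)},
    {"id": 2, "date": date(2023, 7, 3)},  # month and day swapped on entry
    {"id": 3, "date": date(2023, 3, 20)},
]
suspects = out_of_range(records, window_start, window_end)
print([r["id"] for r in suspects])  # [2]

# Hypothetical pace: 480 available minutes and 16 discharges per day.
takt = takt_time(480, 16)       # 30.0 minutes per discharge
bottleneck_cycle = 90           # assumed minutes at the slowest step
print(bottleneck_cycle / takt)  # 3.0, i.e. the step must triple its pace
```

The date check only flags records that land outside the window, so it works here because a swapped date from a short sampling period almost always falls in a different month.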
A boxplot was created for each of them, and the two with the greatest variation were release logistics arrangements and continuing care.

Next, John had his team create a histogram for each of those two activities so they could see whether the long delays were a regular occurrence or an exception. In both cases, the histograms were skewed to the right, which indicates that durations near the mean occurred fairly regularly but there were many instances of longer delays.

The team then decided to see if there was a correlation between the delays and the time and day the discharge order was commenced. To do this, they simply plotted the cycle time of the activity against the time of day the discharge was started. There seemed to be a relationship suggesting that discharge orders started after 10 am resulted in longer cycle times. Most interesting, though, was the relationship between cycle time and day of the week: discharges on Tuesday and Wednesday had the longest cycle times, while Thursday had the shortest.

The team now had some interesting insights to focus their improvement efforts.
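The box plot and histogram steps can be approximated numerically. The sketch below computes the five values a box plot draws and applies a rough right-skew check (the mean sitting above the median). The cycle times are invented for illustration, and the helper names are assumptions. The quartiles use the "inclusive" method of Python's `statistics.quantiles`, which is only one of several quartile conventions.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3, max: the values a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def looks_right_skewed(values):
    """Rough check: a long right tail pulls the mean above the median."""
    return statistics.mean(values) > statistics.median(values)


# Hypothetical cycle times in minutes for one release activity.
cycle_times = [20, 22, 25, 25, 27, 30, 32, 35, 60, 120]

print(five_number_summary(cycle_times))
print(looks_right_skewed(cycle_times))
```

A wide gap between Q3 and the maximum, together with a mean above the median, is the numeric counterpart of the right-skewed histograms the team observed.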

5.7 Quiz

You will now attempt a quiz to check your understanding of this lesson.

5.8 Summary

In this module, we discussed some of the important elements of the Analyze phase of an improvement project. Here is a quick recap of what was covered in this module:

  • The Analyze phase is the most important phase because it helps the team learn about causal relationships.
  • There are two types of analyses in the Analyze phase: data and process.
  • Cause and effect is mathematically represented as Y=f(x).
  • Process analysis is used to determine the process factors that have the most influence.
  • Apart from VA and NVA, there is another type of ‘Value Add’ known as Business Non-Value Added (BNVA).
  • Data analysis provides insight into the discovery of the vital few causes.

