Analyze Phase of Lean Six Sigma Black Belt Tutorial

4.1 Welcome

Hello and welcome to the Lean Six Sigma Black Belt program offered by Simplilearn. Here, we will be discussing the Analyze phase, which is covered in the fourth section of this course. Let us discuss the agenda of this section in the next slide.

4.3 Agenda

This section is split into 8 lessons. In lesson one, we will cover some basic considerations; understand the objectives of the Analyze phase; and learn some techniques for visually displaying data in the form of a histogram, run chart, Pareto chart, and scatter diagram, including which chart to use and when. After that, in lesson two, we will learn value stream analysis. In this lesson, we will cover the details of value-added analysis, waste reduction using Lean techniques, and process map building. In the next lesson, i.e., lesson three, we will understand the different sources of variation; we will use cause and effect diagrams, the affinity diagram, and box plots to understand this better. After that, in lesson four, we will go into the details of regression, in which we will cover simple linear, multiple linear, curvilinear, and stepwise regression types. In lesson five, we will get introduced to confidence intervals. In lesson six, we will learn parametric hypothesis testing, in which we will learn how to do the f-test, t-test, z-test, and two types of ANOVA (pronounce it as ae-noh-vah), namely, one way and two way. In lesson seven, we will learn nonparametric hypothesis testing, where we will learn how to do the Mann-Whitney test, the Wilcoxon signed-rank test, and the Kruskal-Wallis test. In lesson eight, we will learn how to analyze categorical data and how to use a current reality tree. Let us now understand some basic considerations in lesson one, before we look into the Analyze phase.

4.4 Lesson 1 Pre Analyze Considerations

In this lesson on pre-analyze considerations, we will look into the basic considerations to be addressed before the Analyze phase. We will look into the agenda of this lesson in the next slide.

4.5 Agenda

In this lesson, we will provide a high-level introduction to the Analyze phase. Then, we will cover the pre-analyze considerations, followed by the objectives of Analyze. And finally, we will discuss visually displaying data (including the histogram, run chart, Pareto chart, and scatter diagram). We will begin with a brief introduction to the Analyze phase in the next slide.

4.6 Analyze Phase Introduction

Analyze is a key part of a Six Sigma project. It is in this phase that all the collected data are reviewed. Apart from that, the major analysis of both inputs and outputs is also done in this phase.

4.7 Pre Analyze Considerations

Before stepping into the Analyze phase, the Six Sigma team must have collected the baseline data and drawn inferences about the process conditions, that is, normality, stability, etcetera. The baseline data along with the process conditions need to be presented in the tollgate meeting. In some of the forthcoming slides, we will learn all the bits of information that the baseline should have. The Six Sigma team may also decide whether they wish to continue with the DMAIC (pronounced as “d-mack”) process, especially if all efforts to bring the process to a stable state continue to fail. This is an important consideration because most companies may not want to waste time on projects for which data collection is a challenge. In such a scenario, a DFSS project could be initiated at the end of the Measure phase. In the next slide, we will continue with the pre-analyze considerations.

4.8 Pre Analyze Considerations(Contd.)

In this slide, we will understand how the baseline data should be presented. We would present information on measurement system analysis, stability and normality checks, and information about the Project Y data. Finally, we would present all the mathematical data collected in the Measure phase, like Cp (pronounced as C-p), Cpk (pronounced as C-p-k), DPMO (pronounced as D-P-M-O), and baseline sigma. A sample baseline document has been provided as part of the toolkit. In the next slide, we will discuss the objectives of Analyze.

4.9 Objectives of Analyze

The key objectives of Analyze are as follows. First is to analyze the value stream to identify gaps in order to fix them. Second is to analyze what is causing the variation behind the gap. Then, to determine the key driver, namely the key process input variable (also known as KPIV (pronounced as K-P-I-V) or X), that impacts the variations in the key process output variables (also referred to as KPOVs or Y). Next is to validate the relationship between KPIV and KPOV (pronounced as K-P-O-V). The final objective is to test hypotheses or confidence intervals to validate assumptions. In the next slide, we will discuss visually displaying data.

4.10 Visually Displaying Data

Before we go any further, let us discuss some details of the visual display and representation of data. Depending on the data to be presented, an appropriate type of graph or tool has to be used. Data can be represented in various forms. Some of these forms are as follows. One is the histogram. It is used for displaying the volume of data in each category or range. For example, if we want to represent the number of people in various age groups, this is the perfect tool. Another is the run chart. This is typically used for plotting data on a time scale. It works best in situations where we want to represent the number of products produced per hour by a machine. Another example of using a run chart would be plotting the temperature every hour to see the trend. Next is the scatter diagram. It is used for finding the correlation between two variables and is widely used for understanding two sets of data points. The Pareto chart, yet another form of data representation, is a modified version of the histogram in which the data is displayed in descending order of volume; a cumulative percentage line is plotted to aid the 80-20 rule representation and to help the user find where the 80 percent lies. In summary, depending on the data and the purpose of the representation, the tool and chart need to be decided. Let us discuss the summary in the next slide.
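
For readers who wish to try these charts outside Minitab or Excel, here is a minimal sketch in Python using numpy and matplotlib. The data in it is made up purely for illustration; it is not the course's data set.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
fig, ax = plt.subplots(2, 2, figsize=(10, 8))

# Histogram: volume of data in each range, for example ages of people
ages = rng.normal(35, 10, 500)
ax[0, 0].hist(ages, bins=15)
ax[0, 0].set_title("Histogram of ages")

# Run chart: a measurement plotted in time order, for example hourly temperature
temps = 20 + 0.2 * rng.normal(0, 1, 24).cumsum()
ax[0, 1].plot(range(24), temps, marker="o")
ax[0, 1].set_title("Run chart of hourly temperature")

# Pareto chart: categories sorted in descending order with a cumulative percent line
counts = {"Scratch": 40, "Dent": 25, "Crack": 15, "Stain": 12, "Other": 8}
labels = sorted(counts, key=counts.get, reverse=True)
values = [counts[k] for k in labels]
cum_pct = np.cumsum(values) / sum(values) * 100
pos = range(len(labels))
ax[1, 0].bar(pos, values)
ax[1, 0].set_xticks(list(pos))
ax[1, 0].set_xticklabels(labels)
twin = ax[1, 0].twinx()
twin.plot(list(pos), cum_pct, color="red", marker="o")
ax[1, 0].set_title("Pareto chart of defect types")

# Scatter diagram: two variables plotted against each other to look for correlation
x = rng.uniform(0, 10, 50)
y = 2 * x + rng.normal(0, 2, 50)
ax[1, 1].scatter(x, y)
ax[1, 1].set_title("Scatter diagram")

plt.tight_layout()
plt.show()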

4.11 Summary

In this lesson, we started with an introduction to the Analyze phase. Then, we covered Pre-analyze considerations, followed by understanding the objectives of the Analyze phase. And finally, we reviewed how to visually display data in various forms like histogram, run chart, Pareto chart, and scatter diagram. Let us now get started with lesson two. This lesson is all about value stream analysis.

4.12 Lesson 2 Value Stream Analysis

Value stream mapping is a way of charting the current process such that it becomes apparent which activities are value add and which are not. In this lesson, we will understand more about value stream analysis.

4.13 Agenda

In this lesson, we will start with an overview of value, waste, and non-value add (NVA (pronounced as N-V-A)) activities. Then we will understand value stream with examples. Next, we will discuss value stream analysis with respect to muda. After that, we will work on building the value stream map and spaghetti charts.

4.14 Value Waste and NVA Activities

In this slide, we will understand what value, waste, and NVA are. Value, as defined by the customer, is any activity or thing in the product or the process that the customer is willing to pay for. All activities that customers are not willing to pay for are considered non-value add activities, also known as NVA. Waste is anything unnecessary for the product or the process. Waste can show up in the form of lost productivity, poor quality, lost time, or added cost. A waste activity consumes resources but doesn’t deliver any value to the customer. We would observe that if there is a lot of waste, the price will go up and the customer will not be willing to pay. Some NVAs could be mandatory for the business. For example, training of resources may not be considered value-adding by the customer, as it doesn’t add value to the product directly, but it is something a company would invest in for long-term benefits. In the next slide, we will discuss the value stream.

4.15 What is a Value Stream

So what is a value stream? A value stream is the flow of activities, consisting of value add as well as non-value add activities, that takes the raw material from the supplier and delivers the end product to the customer. Analysis of the value stream is important to determine the value adding and non-value adding activities in any process. Please note that this might be a new concept for some, and the Black Belt might encounter resistance from people while doing this activity. The customer defines the “value” in a product, not the business. So, when we analyze value, we should always keep in mind whether the customer perceives it as value or not. Here, let us revisit a few words that we learned in section one. Let us start with warusa kagen, the "condition of badness" or "how bad things are" in the current condition. All warusa kagen conditions, namely muda (the Japanese word for waste), mura (the Japanese word for unevenness), and muri (the Japanese word for overburden), are identified in the analysis of a value stream. In the next slide, we will discuss an example of a value stream.

4.16 Value Stream Example

Let us take an example to understand what a value stream really is. A company diagnosed that ‘dents in their shelf welds’ was the key area to be rectified. They spent a lot of money rectifying the weld dents so that their customers would be happy. Then they took the rectified shelves to their customers to find out whether the rectification of the weld dents had added any value to them. The customers said, “No.” They saw no value in the weld-dent rectification, as their primary need was shelves that were straight in angle. Here, the voice of the customer was that they wanted shelves at a straight angle, while the voice of the business focused on weld dents. This disconnect often leads to a misunderstanding of what ‘value’ is for the customers. It is important to note that the VOC often gives the insight into what is value as per the customer. In the next slide, we will discuss value stream analysis with respect to muda.

4.17 Value Stream Analysis Muda

In this slide, we will learn why muda is one of the important issues to be identified. When analyzing the value stream, the Black Belt needs to first identify the muda in the system. In typical cases, muda accounts for 90% of the time spent in operations. The Black Belt needs to identify these muda activities and, to start with, make them transparent to the Six Sigma team and discuss them with the team. This will ensure that everybody is on the same page, understands the challenges around them, and can contribute to their improvement. The Black Belt can expect tremendous resistance to change here. Often, muda stems from the company’s current operating culture. The Black Belt may not be wrong in expecting leadership support to take care of this issue, but in reality, winning employee support is often the major roadblock. In the next slide, we will continue discussing this.

4.18 Value Stream Analysis Muda(Contd.)

This is how muda is categorized. When identifying the muda, the Black Belt should look out for CLOSEDMITTS. The acronym CLOSEDMITTS is explained as follows. C stands for complexity, which refers to unnecessary steps and excessive documentation; L stands for labor, which is inefficient operations and excess headcount; O stands for overproduction, which refers to producing more than, or earlier than, the customer needs; S stands for space, which is the storage space for inventory; E stands for energy, which is wasted human energy; D stands for defects, which is repair or rework in products; M stands for materials, which is scrap or ordering more than needed; I stands for idle time, which captures the time during which material sits idle between processes; the first T stands for time, which is wasted time; the second T stands for transportation, which is movement adding no value; and the final S stands for safety hazards, which means unsafe environments. CLOSEDMITTS loosely maps to the popular TIMWOODS acronym for wastes.

4.19 Value Stream Map

The map given on the slide shows what a value stream map for a technical support process looks like. We will look into the inferences from this map in the next slide.

4.20 Value Stream Map(Contd.)

The inferences from the map shown in the previous slide are as follows. Only about forty percent of the time spent on support is value added; the rest is non-value added. Some activities in the NVA space could be considered mandatory for the business. These are necessary NVAs. Organizations have necessary NVAs due to poor quality levels in the first pass of the process (without any rework). Examples of necessary NVAs are quality checks, management approvals, etc. The Six Sigma team can begin by reviewing the necessary NVAs and exploring options to eliminate or reduce them. This will help in cost reduction, and the time spent on these activities can be utilized better elsewhere. After these activities are done, the variation and the other NVAs in the process should be analyzed and opportunities for improvement should be explored. In the next slide, we will discuss spaghetti charts.

4.21 Spaghetti Charts

In this slide, we will discuss spaghetti charts, an important Lean tool. Every Black Belt must learn how to draw a spaghetti chart for a process. The spaghetti chart is created by calling out the process blocks and then showing the way things move between these blocks. The spaghetti chart helps a Black Belt understand the physical process map (where things actually move in the system physically) and not just the value map (where the focus is only on the value add activities). These charts graphically depict the physical movement of products and people in a process. A spaghetti chart crowded with lines and curves indicates that products and people move up and down a lot in the process. This is typically called waste (the Japanese term for this type of waste is muda). A densely populated spaghetti chart indicates unnecessary redundancies. It means that things move from one place to another and back and forth, instead of moving in one direction with minimum repetition. In the next slide, we will see the spaghetti chart as it is today.

4.22 Spaghetti Chart As Is

A sample spaghetti chart can be seen on this slide. In this example, it shows what the process looks like today. As we can see, there are a lot of noodle straps representing how things move from one process block to another in the process, indicating inefficiencies.

4.23 Spaghetti Chart Should Be

The slide here shows the process as it is planned to be in the future. Compared with the as-is chart on the previous slide, this ‘should be’ version has far fewer noodle straps, as the movement between process blocks is minimized and the inefficiencies are reduced.

4.24 Spaghetti Charts(Contd.)

Spaghetti charts are excellent process mapping tools that make wastes and redundant activities obvious to the naked eye. If used properly, spaghetti charts can also be used to show the ‘to be’, or future state, process. This can be achieved by ensuring minimum overlap of flow and placing the process blocks close to each other wherever possible. Spaghetti charts are also known as standardized work charts.

4.25 Summary

In this lesson, we’ve learned how to identify the wastes and NVAs in a process, how to perform the value stream analysis, the use of a value stream map, and the use of a spaghetti chart in a process setting.

4.26 Lesson 3 Sources of Variation

In this lesson, we will look into the sources of variation.

4.27 Agenda

In this lesson, we will provide details on the sources of variation, with two major types, namely common causes and special causes of variation. Then, we will cover the cause and effect diagram, followed by the affinity diagram. We will finish the lesson with the details of how to draw a box plot.

4.28 Sources of Variation

Broadly, a process output varies due to two reasons, or sources of variation, which are as follows. First, let us see the common causes of variation. These causes occur frequently and come from within the process. According to the principles of statistical control, they can only be reduced, not eliminated. Common causes of variation are often not assigned to their origins and are also known as chance causes, random causes, and noise. Now let us discuss the special causes of variation. These variations happen once in a while, are unusual, and have not been previously observed, as they come from outside the process. They are non-quantifiable variations, and by the principles of statistical control, they should be eliminated if found undesirable for the process. Special causes of variation can be tracked to a reason and are also known as non-random causes, assignable causes, and signals. For example, our commute from home to office might take between 20 and 25 minutes; the day-to-day difference in commute time is due to common causes of variation, and so we cannot assign any specific cause to it. If one day it takes 45 minutes to complete the commute, that would be due to a special cause. Something unusual might have happened, like an accident or rain, which led to a huge variation, and we can assign a cause to it. This is called a special cause of variation. That was a simple example differentiating the common cause and special cause of variation. In the next slide, let us continue with the sources of variation, using another example.

4.29 Sources of Variation(Contd.)

A Black Belt must be able to understand all the reasons that contribute to the variation in the data, which causes the process output to vary. They can use a variety of tools and techniques to understand this. If the Black Belt does any project without a proper understanding of the variation, it can lead to the project's failure. Here is an example to understand the difference between a common cause and a special cause. A train, on an average, takes 30 minutes to go from point A to point B in a city. Typically, it takes around 28 to 35 minutes. This variation is caused by various everyday reasons and can be called common cause variation. On the other hand, suppose a train gets delayed by 10 minutes due to a signal failure, which is a very rare case. In such a scenario, the signal failure due to a malfunctioning piece of electrical equipment is a special cause of variation.

4.30 Cause and Effect Diagram

In this slide, we will discuss the cause and effect diagram (known as the CE diagram for short), also known as the fishbone diagram. It is a very popular and important tool that gives a list of all the issues due to which a problem occurs. The CE diagram gives an insight into whether the source of variation is traceable or not and, further, helps conclude whether it is a common or a special cause of variation. The CE diagram allows a detailed, non-numerical investigation into what is causing the end situation to vary. The end situation here is known as the effect, and the issues that cause the effect to vary are known as causes. Causes could be level one; these are the causes that are visible. If the causes cannot be explained further in terms of why they happen, they are treated as common causes of variation. Then, there are the level two causes, which are a subset of the level one causes, or in other words, more detailed level one causes. Level two causes can be explored further until we reach a dead end. Let us learn how to draw a cause and effect diagram in the next slide.

4.31 Cause and Effect Diagram(Contd.)

To draw a CE diagram, first use the tool ‘cause and effect diagram’ provided in the toolkit. Next, draw the horizontal line and name the effect. Now, brainstorm for causes contributing to the effect. In the next step, map each cause to one of the six M categories. Now, ask why each cause happens. Finally, add these deeper causes as twigs to the level one causes.

4.32 Cause and Effect Diagram(Contd.)

In this slide, we will learn how a cause and effect diagram should be analyzed. CE diagrams help the Black Belt identify the possible causes. These causes can be prioritized using a Pareto chart. A thick cluster of causes in one area is a prompt to investigate that area further. If a main category has only a few specific causes, it may indicate a need for further identification of causes. If several branches have only a few causes each, one can choose to combine them. Look for causes that repeat; these could be root causes. Look for what can be measured in each cause; this will allow validating the cause correlation.

4.33 Cause and Effect Diagram(Contd.)

In this slide, we will see what a CE diagram looks like. A cause and effect diagram tool is added to the toolkit. In the next slide, we will look into the affinity diagram.

4.34 Affinity Diagram

The affinity diagram is a business tool used to organize ideas and items into groups and categories. The tool is commonly used to capture a large number of ideas stemming from brainstorming sessions and organize them. This tool helps large teams in brainstorming, in order to have focused idea generation on a particular category or group. It also helps discover connections between various pieces of information. When the ideas and items are grouped together, it can help drive the brainstorming discussion to find root causes and solutions to a problem. Now let us see how the process works. The first step is to record each idea generated by people on separate cards or notes. Once all the ideas are documented, go to the next step, looking for ideas that seem to be related. After that, sort the cards into groups until all cards have been used. Once the cards have been sorted into groups, the team may sort large clusters into subgroups for easier management and analysis. Once sorting is done, the affinity diagram may be used to create a cause and effect diagram. In many cases, the best results tend to be achieved when the activity is completed by a cross-functional team, including key stakeholders. In the next slide, let us discuss the box plot.

4.35 Box Plot

The box plot is another useful tool that helps understand the nature of variability in a process. It is also known as the whiskers plot, candlestick plot, and box-and-whiskers plot. A box plot drawn in Minitab as well as in Excel divides the entire data set into four parts, known as quartiles; each quartile contains approximately twenty five percent of the data set. The box is plotted for the data between the 25th and 75th percentiles, with a line in between for the 50th percentile. A single line, or whisker, is drawn out of the box for the top 25% and the bottom 25%, and any outliers are highlighted separately. These plots are often used in the Analyze phase to understand the process’s variability. They help see the distribution of values in several groups. The box plot also gives information about the location, spread, and variability of the data. We will continue with the same in the next slide as well.

4.36 Box Plot(Contd.)

The data shown on the slide is collected for delivery hours, assuming it is one of the key input variables that can impact the performance of the output variable.

4.37 Box Plot(Contd.)

In this and the next few slides, we will learn how to construct a box plot with various considerations and how to interpret it. To create a box plot, we can use either Minitab or QI Macros. If QI Macros is used, click on Graph, and then Box Plot. We can see the box plot generated on the right-hand side of the slide.
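
For readers who do not have Minitab or QI Macros, a rough equivalent can be drawn in Python with matplotlib. The delivery-hours values below are placeholders for illustration, not the data set shown on the slide.

import matplotlib.pyplot as plt

# Made-up delivery-hours samples; the second set is tighter and more symmetric.
delivery_hours = [30.1, 30.5, 30.8, 31.0, 31.2, 31.4, 31.5, 31.7, 31.9, 33.0]
delivery_hours_2 = [30.9, 31.0, 31.1, 31.2, 31.3, 31.3, 31.4, 31.5, 31.6, 31.7]

# Each box spans the 25th to 75th percentile, the middle line is the median,
# the whiskers cover the rest of the data, and points beyond 1.5 times the
# interquartile range are flagged as potential outliers (special causes).
plt.boxplot([delivery_hours, delivery_hours_2])
plt.xticks([1, 2], ["Delivery Hours", "Delivery Hours 2"])
plt.ylabel("Hours")
plt.show()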

4.38 Box Plot(Contd.)

We can use box plots to compare two samples’ data and their variability. In a box plot, the data is represented with a line and a box. The box represents the data from the 25th percentile to the 75th percentile. A horizontal line is drawn in the box at the 50th percentile. In this example, one can observe that there is a special cause in the delivery hours data, denoted by an asterisk at 33 hours. While comparing multiple data sets, box plots can be drawn side by side and observed to compare skewness, spread of data, deviation or variation in the mean, etc. The plot on the slide shows some special cause variability in delivery hours and no special cause variability in delivery hours 2. Also, the degree of skewness is less in delivery hours 2, and the common cause variability is less in delivery hours 2.

4.39 Box Plot(Contd.)

A box plot could also be used when we are stratifying the data. This helps understand variability within and between groups. Stratification is an important technique that helps divide the data based on the group and the selected grouping characteristic.

4.40 Summary

Let us do a quick recap of what we have learned in this lesson. We started with the different sources of variation, namely common and special causes of variation. Then we understood the cause and effect diagram and the affinity diagram. Finally, we discussed the box plot and its use in determining variability in a process.

4.41 Lesson 4 Regression

Let us start with the fourth lesson, in which we will cover regression. Regression is a vast topic with plenty of techniques. In this lesson, we will cover only some of them.

4.42 Agenda

Now let us look into the agenda of this lesson. We will start off with the objectives of regression analysis, followed by the concepts of regression analysis. Then we will look into simple linear regression and multiple linear regression. And, in the end, we will provide an introduction to best subsets regression and stepwise regression. So now let us start with the objectives of regression analysis.

4.43 Objectives of Regression Analysis

The main objectives of regression analysis are as mentioned below. Regression analysis is a statistical technique for determining the relationship between a dependent variable and one or more independent variables. The next objective is to find details of the correlation between key process input variables (also known as KPIVs) and key process output variables (also known as KPOVs). Another objective of regression analysis is to determine how much of the variability in a KPOV is explained by a particular KPIV. The last objective is that, since regression analysis provides a best fit line, it can be used to model the data for estimating process performance at certain values, for prediction, and for future use. In the next slide, we will discuss the concepts of regression analysis.

4.44 Concepts of Regression Analysis

In this and the next few slides, we will learn some concepts of regression analysis in theory and then in practice. The first concept that we will discuss is response variables. These are the output variables, which are being tested in the regression analysis. The next concept is predictor variables, which are the input variables that impact the performance of the output variables. Let us now see what the p-value is. The p-value provides the probability value for significance testing. In terms of use, if the p-value is less than 0.05, the result is considered significant, which means a relationship can be concluded between the response and the predictor variables. The last concept is multi-collinearity. It is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated, meaning that one can be linearly predicted from the others with a non-trivial degree of accuracy. Collinearity will be discussed in detail later.

4.45 Concepts of Regression Analysis(Contd.)

Now let us look into Mallows’ Cp. It is a value that compares the precision and bias of the full model to models built from the best subsets of predictors. The next point to be discussed is the Durbin-Watson statistic. It is an important statistic that can be selected in multiple regression; it tests for the presence of auto-correlation between residuals. Next is adjusted R square, which is the coefficient of determination to be used instead of R square when multiple predictor variables are regressed against a response variable. Another concept that we will look into is predicted R square. R square shows how well the model fits the existing data, while predicted R square indicates how well the model can predict responses for new observations. The last concept that we will discuss here is the least squares method. This is the method used to calculate the estimates by fitting a regression line to the points in a data set in such a way that the sum of the squared distances from the data points to the line is minimized.

4.46 Simple Linear Regression

In this slide, we will learn simple linear regression, an easy and popular technique used in Six Sigma projects. In the next few slides, we will learn how to do a simple linear regression with the help of Minitab. Simple linear regression is a simple tool that can be used when there is one response variable (KPOV (pronounced as K-P-O-V)) and one predictor variable (KPIV (pronounced as K-P-I-V)) to test. The decision on how many predictor variables to test should be based on the correlation and other factors, in consultation with the Black Belt and the Process Owner. The sample case tested here is Delivery Hours (KPOV) against Training Hours (KPIV), to check whether there is any correlation between the two. The objective here is to find the correlation and the percent variability in delivery hours explained by training hours. Training hours could have been a key takeaway from the cause and effect analysis done on the fishbone diagram developed earlier.

4.47 Simple Linear Regression(Contd.)

Now let us understand how to do simple linear regression using Minitab. First, click on Stat, and in the options given, click on Regression. Another set of options will come up; in that, click on Regression again. Choose the variables as shown in the window on the right side of the slide.
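
As an optional aside, the same kind of analysis can be sketched in Python with statsmodels. The data below is simulated around the relationship reported later in this lesson (an intercept near 35.6 and a slope near minus 0.663), so the values are placeholders rather than the actual slide data.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
training_hours = rng.uniform(4, 9, 25)
# simulate delivery hours around the relationship reported in this lesson
delivery_hours = 35.6 - 0.663 * training_hours + rng.normal(0, 0.2, 25)

X = sm.add_constant(training_hours)       # adds the intercept term
model = sm.OLS(delivery_hours, X).fit()

print(model.params)       # fitted intercept and slope
print(model.rsquared)     # share of variability in Y explained by X
print(model.summary())    # full regression table, similar to Minitab's output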

4.48 Simple Linear Regression(Contd.)

The first graph on the top left shows the data points of residual versus percent, and the straight line drawn depicts the data spread as a normal plot (since we can draw an approximate straight line through the points). The residuals are thus normally distributed. Now, look at the next graph on the top right side of the slide. We can observe that the residuals for delivery hours are concentrated around the zero line (around 0.0), with some data points going as high as a residual of 0.6. The residuals fall on both sides of 0, and there is no sign of non-random variation. Next, we come to the third graph, which is placed on the bottom left side of the slide. This is a histogram that plots the frequency of occurrence for each of the residual values. We can observe that most of the values are concentrated between -0.1 and 0.1, with the highest concentration around 0.1. The last chart is on the bottom right side. It shows the residual for each of the orders in the sequence they were observed, so we can see how the residual value behaves over a period of time. Most of the values are around the center and show random behavior, and towards the end one goes to a value of 0.6. Reading both the normal and fit plots, one observation can be classified as an outlier. This is a red flag, telling us it is a possible special cause of variation, and so further investigation is needed.

4.49 Simple Linear Regression(Contd.)

This is the table output generated from Minitab showing the details of the simple linear regression analysis. The purpose of this analysis is to find out whether there is any correlation between delivery hours and training hours and, if there is, how it can be expressed in mathematical form for prediction. The output is the regression equation, which is as follows: Delivery Hours = 35.6 – (0.663 X Training Hours) (pronounced as zero point six six three multiplied by training hours), i.e., we can compute delivery hours based on training hours. Here, the R-square value is 92%, which says the correlation is very strong.

4.50 Simple Linear Regression(Contd.)

This is the continuation of the results analysis of simple linear regression from the earlier slide. It shows the analysis of variance and the unusual observations. One can notice in the unusual observations table that observation numbers 9 and 18 either have a substantially large residual value or have an X value that gives them large leverage.

4.51 Simple Linear Regression(Contd.)

Now we can interpret the results. Let us look at the R square value. It is given as ninety two percent. That is a good R square value for us to use the regression model. Now, let us see the predicted R square value, which is approximately eighty eight percent. The model can therefore predict new responses for future observations as well. The equation, delivery hours equals thirty five point six minus zero point six six three times training hours, can be used as a predicting model. If the current average for delivery hours is thirty two hours, the current average training would be about five point four hours; to get to this number, replace the variables with the numbers given and solve for training hours. If the desired delivery hours target is thirty hours, the training hours need to be increased to about eight point four four hours, according to the formula. This gives a controllable measure of where the input variable needs to be in order to give a controlled and desirable output.
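
As a small worked sketch, the fitted equation reported in this lesson can be rearranged in a few lines of Python to answer the two questions above; the coefficients 35.6 and 0.663 are taken from the slide, and everything else is illustrative.

def predicted_delivery_hours(training_hours):
    # fitted equation reported on the slide
    return 35.6 - 0.663 * training_hours

def training_hours_needed(target_delivery_hours):
    # rearrange 35.6 - 0.663 * t = target and solve for t
    return (35.6 - target_delivery_hours) / 0.663

print(training_hours_needed(32))        # about 5.4 hours for a 32-hour average
print(training_hours_needed(30))        # about 8.4 hours for a 30-hour target
print(predicted_delivery_hours(8.44))   # roughly 30 delivery hours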

4.52 Multiple Linear Regression

Now let us look into multiple linear regression. Multiple linear regression is to be used when we have one response variable (KPOV (pronounced as K-P-O-V)) and multiple predictor variables (KPIVs (pronounced as K-P-I-V)) to test. The Black Belt will use the right set of KPIVs to ensure that the multiple regression analysis is done effectively. In the sample case tested here, the response variable is delivery hours, which is the KPOV, and the multiple predictor variables, the KPIVs, are training hours and packaging weight. Here, the objective is to find the correlation and the percent variability in delivery hours explained by training hours and packaging weight. Training hours and packaging weight could have been key takeaways from the fishbone diagram developed earlier. The steps to do a multiple linear regression remain the same as for simple linear regression.

4.53 Multiple Linear Regression(Contd.)

The graphical representations of the multiple linear regression are seen on this slide. As we can see, there are four plots generated, namely the normal probability plot, residuals versus fits, histogram, and residuals versus order. These will be analyzed in the next slide.

4.54 Multiple Linear Regression(Contd.)

This is the graphical analysis. Let us look into the first graph, namely the normal probability plot. In that plot, we can notice a curvature in the tails of the residuals, which is acceptable because of the small sample size. We can therefore say the residuals are normally distributed here. Let us understand the second graph, namely residuals versus fits. The ‘residuals versus fits’ graph indicates a regular up-down pattern. We see that the residuals fall on both sides of zero, and there is no sign of non-random variation. Next, we come to the third graph, which is placed on the bottom left of the slide. This is a histogram that plots the frequency of occurrence for each of the residual values. We can observe that most of the values are concentrated between -0.1 (pronounced as minus zero point one) and 0.1, with the highest concentration around -0.1 (pronounced as minus zero point one). The last graph, the residuals versus order graph, is placed on the bottom right side. This shows the residual for each of the orders in the sequence they were observed, so we can see how the residual value behaves over a period of time. Most of the values are around the center and show random behavior, and towards the end one goes to a value of 0.6. Reading both the normal and fit plots, one observation can be classified as an outlier. This is a red flag, which tells us it is a possible special cause of variation and so further investigation is needed. In the next two slides, we will see the output results as generated with the help of Minitab.

4.55 Multiple Linear Regression(Contd.)

In this slide, the output results are shown for the regression analysis of delivery hours versus training hours and packaging weight. The regression equation is as follows: Delivery Hours = 34.8 – (0.646 X Training Hours) + (0.063 X Packaging Weight in pounds). The table shown on the slide provides the details for each predictor. The p-value given in the table helps us understand whether the predictor has any significant impact on the model and whether it has to be included or not. From the table, we can see that training hours has a p-value of 0.0, which is less than 0.05. This means that training hours has a significant impact on the model. However, the p-value of packaging weight is more than 0.05, which clearly says that it is not significant enough to be used in the model. The variance inflation factor (VIF (pronounced as V-I-F)) provides an index measuring the increase in the variance of an estimated regression coefficient due to collinearity. Any value that is less than 10 can be considered acceptable. In this example, the value is 1.4. By standard convention, this would show moderate collinearity between the variables. The other metrics are S equal to 0.213391, the correlation coefficient R-Square of 92.1%, which means a very strong correlation, the adjusted R-Square of 91.1%, a PRESS value of 1.13802, and a predicted R-Square value of 86.86%.
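
For reference, a hedged Python sketch of the same kind of multiple regression using statsmodels follows. The data frame is simulated around the coefficients reported on the slide, and the column names are assumptions made for illustration.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "training_hours": rng.uniform(4, 9, 25),
    "packaging_weight": rng.uniform(10, 20, 25),
})
# simulate the response around the coefficients reported on the slide
df["delivery_hours"] = (34.8 - 0.646 * df["training_hours"]
                        + 0.063 * df["packaging_weight"]
                        + rng.normal(0, 0.2, 25))

X = sm.add_constant(df[["training_hours", "packaging_weight"]])
model = sm.OLS(df["delivery_hours"], X).fit()

print(model.params)     # intercept plus one coefficient per predictor
print(model.pvalues)    # predictors with p-values above 0.05 are candidates to drop
print(model.rsquared, model.rsquared_adj)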

4.56 Multiple Linear Regression(Contd.)

This slide is a continuation of the results analysis of the multiple linear regression from the earlier slide. It shows the analysis of variance and other unusual observations. From the unusual observations table, we can notice that observation numbers 9 and 18 either have a substantially large residual value or have an X value that gives them large leverage.

4.57 Multiple Linear Regression(Contd.)

In this slide, we will analyze the results and provide the details on how to interpret them. The variance inflation factor (VIF) provides an index measuring the increase in the variance of an estimated regression coefficient because of collinearity. In this example, the value is 1.4. By standard convention, this would show moderate collinearity between the variables. If VIF is greater than 10, collinearity may adversely affect the model. Right now, it is acceptable, so no predictors need to be removed from the model on account of collinearity. In this example, the p-value for training hours is 0.0 (it is significant as it is less than 0.05), but for packaging weight it is 0.0594, which is considered non-significant (as it is more than 0.05). Packaging weight doesn’t significantly impact the result for delivery hours, so the Black Belt may decide to drop packaging weight from the model. The regression equation then becomes: delivery hours equals 34.8 minus 0.646 times training hours. The equation is modified after removing the packaging weight variable, and also by looking at the adjusted R-square of 91.1% and the predicted R-square of 86.5%.

4.58 Multiple Linear Regression(Contd.)

Now let us explain some new and important terms. Let us start with collinearity. It refers to an exact or approximate linear relationship between two explanatory variables in a data set. It means some predictor variables are correlated with other predictors. Collinearity reduces the statistical accuracy of the regression results: the standard errors of the coefficients of the affected predictors are relatively high. The variance inflation factor, or VIF (pronounced as V-I-F), should be checked to identify multi-collinearity. Multi-collinearity is an issue because it increases the variance of the regression terms and makes the coefficient estimates unstable.

4.59 Multiple Linear Regression(Contd.)

The next term to be discussed in this slide is the VIF, or variance inflation factor. VIF shows the extent to which multi-collinearity exists amongst the predictors. If VIF is equal to, or very close to, one, we can say the predictors are not correlated. If VIF is between one and five, the predictors are moderately correlated. When VIF is greater than five, the predictors are highly correlated. On detecting high correlation, the Black Belt should regress one predictor against another to see which predictor is causing the high variance. The next term to be discussed is the Durbin-Watson statistic. It tests for the presence of auto-correlation in the residuals. When adjacent observations are correlated, the OLS method in regression underestimates the standard error of the coefficients. If D is less than the lower bound, positive correlation exists. If D is greater than the upper bound, no correlation exists. If D lies between the bounds, the test is inconclusive.
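
A brief sketch of how VIF and the Durbin-Watson statistic can be computed in Python with statsmodels follows; it assumes 'X' and 'model' are the design matrix (with a constant column) and the fitted OLS result from the multiple regression sketch earlier.

from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

# VIF for each predictor column (the constant column is skipped)
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, variance_inflation_factor(X.values, i))

# Durbin-Watson statistic on the residuals; a value near 2 suggests no
# autocorrelation, while a value well below 2 suggests positive autocorrelation.
print(durbin_watson(model.resid))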

4.60 Multiple Linear Regression(Contd.)

In this slide, we will deal with a new term, the PRESS statistic. PRESS stands for predicted residual sum of squares. It is used as an indication of the predicting power of the model, that is, how well the model could fit new observations. The PRESS statistic is the sum of the squared external residuals. The external residuals are calculated by finding the predicted value for an observation after leaving that observation out of the fit. SSE (pronounced as S-S-E), or sum of squared errors, explains the quality of fit (that is, how well the data fit the line), while the PRESS statistic explains the predicting quality (that is, how well the model predicts new data).
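
Since PRESS is defined through leave-one-out prediction, it can be computed directly from that definition. The sketch below is illustrative; 'X' is assumed to be a NumPy design matrix that already includes the intercept column, and 'y' the response vector.

import numpy as np

def press_statistic(X, y):
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        # refit by ordinary least squares on all observations except observation i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        y_hat_i = X[i] @ beta              # predict the left-out observation
        press += (y[i] - y_hat_i) ** 2     # squared external (deleted) residual
    return press

# Example call, reusing the arrays from the earlier multiple regression sketch:
# press_statistic(X.values, df["delivery_hours"].values)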

4.61 Best Subsets Regression and Stepwise Regression

Now we move on to learn best subsets regression and stepwise regression. First, let us understand best subsets regression. It identifies the best-fitting regression models that can be constructed from the specified predictor variables. Here, the goal of regression is achieved with the fewest predictors possible. For example, if we have three predictors, Minitab will show the best and second best one-predictor models, the best and second best two-predictor models, and then the full model with all three predictors. Let us discuss stepwise regression as the next term. In this regression, the most significant variable is added to, or the least significant variable is removed from, the model at each step. There are three common stepwise regression procedures, namely standard stepwise, forward selection, and backward elimination. Using the Mallows’ Cp statistic in stepwise regression is very popular.
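
Best subsets regression can also be approximated by brute force outside Minitab. The sketch below, with hypothetical column names, fits every combination of candidate predictors with statsmodels and ranks them by adjusted R-square; Mallows' Cp could be compared in the same manner.

from itertools import combinations
import statsmodels.api as sm

def best_subsets(df, response, predictors):
    results = []
    for k in range(1, len(predictors) + 1):
        for subset in combinations(predictors, k):
            X = sm.add_constant(df[list(subset)])
            fit = sm.OLS(df[response], X).fit()
            results.append((subset, fit.rsquared_adj))
    # best (highest adjusted R-square) first
    return sorted(results, key=lambda r: r[1], reverse=True)

# Example with hypothetical column names:
# best_subsets(df, "delivery_hours", ["training_hours", "packaging_weight"])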

4.62 Summary

In this lesson, we have learned the objectives of regression analysis, the concepts of regression analysis, and simple linear regression. We also discussed the concepts and applications of multiple linear regression. And finally, we got introduced to best subsets regression and stepwise regression.

4.63 Lesson 5 Confidence Intervals

In this lesson, we will understand confidence intervals, one of the very important concepts in the analyze phase.

4.64 Agenda

Now let us look into the agenda of this lesson. We will start with discussing the concepts of confidence intervals and confidence intervals testing, and move on to the confidence intervals for difference between two means. Next we will discuss the confidence intervals working and the confidence intervals impactors. We will also look into the chi-square (pronounced as kai-square) confidence intervals for variances and the Z confidence intervals for proportions. Then we will understand chi-square and probability. Finally, we will discuss T distribution confidence intervals.

4.65 Concepts of Confidence Intervals and Confidence Intervals Testing

Let us first understand confidence intervals conceptually. Collecting all the population data may not be practical all the time. We need to come up with an estimate of what sample size is good enough to provide a proxy for the population. A confidence interval (CI) is a type of interval estimate of a population parameter and is used to indicate the reliability of an estimate. The population parameters are µ (pronounced as myu), the population mean, and σ (pronounced as sigma), the population standard deviation. When we collect data as a sample, we represent the same parameters as sample statistics (Xbar as the sample mean and s as the sample standard deviation). As there is variation in every process, we will find uncertainty in these sample estimates. These sample statistics are called sample estimates because they are used to estimate the population parameters. Confidence intervals are known as interval estimates, while the mean, standard deviation, and other measures of descriptive statistics are known as point estimates.

4.66 Concepts of Confidence Intervals and Confidence Intervals Testing(Contd.)

A confidence interval provides a range of values, defined by margins or confidence interval widths, which is likely to include the parameter of the population. The likelihood is determined by the confidence level, which is conventionally set at ninety five percent but can also be tailored to ninety percent or ninety nine percent. This decision is often taken by the Black Belt depending on the kind of experiment being performed. If independent samples are taken repeatedly from the population and a confidence interval is computed for each of them, ninety five percent of these intervals will include the unknown population parameter.

4.67 Concepts of Confidence Intervals and Confidence Intervals Testing(Contd.)

Let us look into an example of confidence intervals. A company conducted a survey amongst a sample of one thousand households in a society of ten thousand to know what percentage of people drank cola. When the company was asked for the percentage, they replied that twenty five percent, plus or minus two percent, drank cola. That means, on the basis of the survey conducted, they estimated that the percentage of cola drinkers in the entire population would lie in the range of twenty three percent to twenty seven percent, and they were ninety five percent confident of their findings. Ninety five percent here is known as the confidence level and is always assumed before the sampling activity. The plus or minus two percent is the margin of error, and the range from twenty three percent to twenty seven percent is the interval in which the population result is expected to lie. If a claimed result, for example twenty two percent, falls outside this range, it could be rejected as not belonging to the population.

4.68 Concepts of Confidence Intervals and Confidence Intervals Testing(Contd.)

Let us understand the statistical significance of a confidence interval in this slide, using an example. A process improvement effort shows that the process yield has improved from seventy eight percent to eighty two percent. The population yield lies within the confidence limits of eighty four percent to eighty eight percent. Now we have to see whether the process has really made any significant improvement. Let us look into the solution. As the sample yield is outside the range defined by the confidence limits, the process has not made any statistically significant improvement. Although some improvement can be seen, the percentage needed to establish statistical significance of the improvement has not yet been reached.

4.69 Confidence Intervals for Difference between Two Means

A confidence interval for the difference between two means establishes the limits within which the difference between the two means could assume values. While doing any analysis, we might observe that the sample statistic and the population parameter could be different. Although the central limit theorem (also known as CLT) simplifies this analogy by providing statistical details about the data, in reality there would be an estimation error or a sampling error. This error is known as the error of estimation, the margin of error, or the standard error.

4.70 Confidence Intervals Working

In this slide, let us work through some problems related to confidence intervals to understand how they work. The basic assumption here is that the variance of the population is known and the sample size is greater than or equal to thirty. Using the z-distribution, we find that the confidence interval equals Xbar plus or minus Z at alpha divided by two, multiplied by sigma divided by the square root of n, where n is the sample size. Let us take this example. At a ninety five percent confidence level, calculate the confidence interval for a sample mean of fifteen and a known standard deviation of one, with the data collected on one hundred samples. Here is the solution. According to the question, the confidence level is ninety five percent, alpha is equal to 5 percent, and therefore alpha by two is two point five percent. Use the Z-table provided in the toolkit for the calculations. The corresponding z score that we get is 1.96. The working for the confidence interval is shown on the slide. There is a two point five percent probability each that the population mean could be less than fourteen point eight zero four or more than fifteen point one nine six.
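
The same calculation can be reproduced in a few lines of Python using scipy for the z critical value; the numbers are the ones from this example.

import math
from scipy.stats import norm

xbar, sigma, n, confidence = 15, 1, 100, 0.95
z = norm.ppf(1 - (1 - confidence) / 2)      # about 1.96 for 95 percent
margin = z * sigma / math.sqrt(n)

print(xbar - margin, xbar + margin)         # roughly 14.804 and 15.196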

4.71 Confidence Intervals Working(Contd.)

Let us look into another example. The basic assumption here is again that the variance of the population is known and the sample size is greater than or equal to thirty. Let us now use the z-distribution confidence interval with the assumption mentioned above. According to the formula, the confidence interval is equal to Xbar plus or minus Z at alpha by two (where alpha is the standard 5 percent), multiplied by sigma divided by the square root of the sample size. Let us take an example. At a ninety five percent confidence level, calculate the confidence interval for a sample mean of fifteen and a known standard deviation of one, with sample data collected on thirty six samples. Now let us discuss the solution. As given in the question, the confidence level is ninety five percent, and alpha by two is two point five percent. Use the Z-table provided in the toolkit. The corresponding z score is 1.96. The working for the confidence interval is shown on the slide. There is a two point five percent probability each that the population mean could be less than fourteen point six seven or more than fifteen point three three.

4.72 Confidence Intervals Impactors

In this slide, we discuss the key factors that impact confidence intervals. One of the factors is the significance level. The choice of alpha for the confidence interval dictates the margin of error, or the confidence interval width. Alpha is the significance level. (Alpha is calculated as 100 percent minus the confidence level. That is, for a confidence level of 95 percent, alpha will be 5 percent, and for a confidence level of 90 percent, alpha would be 10 percent.) At a ninety nine percent confidence level, alpha is one percent and Z at alpha by two is two point five seven six. The same confidence interval for one hundred samples will now widen to approximately fourteen point seven four and fifteen point two six. The next factor to be discussed is the sample size. We will find that, at the same significance level, with an increase in the sample size the confidence interval width decreases. In other words, the accuracy of the confidence interval increases with the increase in the sample size.

4.73 Chi Square Confidence Intervals for Variances

Let us understand the Chi Square (pronounced as kai-square) test for confidence intervals for variances in this slide. Chi-square confidence intervals are used to determine the confidence intervals for variances. The interval represents the most likely place where population variances will fall, given a confidence level of ninety five, ninety, or ninety nine percent. The formula to calculate Chi-Square confidence intervals for variances is given on the slide.

4.74 Chi-Square Confidence Intervals for Variances(Contd.)

Now let us look into an example of calculating the confidence interval for a variance using chi-square. Find the confidence interval for the variance for a sample size of 35 and a known sample standard deviation of 2.5 (that is, a sample variance of 6.25). Assuming alpha to be zero point zero five, alpha by two is zero point zero two five and one minus alpha by two is 0.975. For determining the chi-square values at these levels, use the chi-square table provided in the toolkit. At alpha by two, the chi-square value is fifty one point nine six, and at one minus alpha by two, the chi-square value is 19.806. The confidence interval for the variance, using the formula discussed in the previous slide, is four point zero eight to ten point seven two.
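
The chi-square interval for this example can be checked with scipy as follows, assuming a sample standard deviation of 2.5 (variance 6.25) as used in the worked answer.

from scipy.stats import chi2

n, s, alpha = 35, 2.5, 0.05
lower = (n - 1) * s**2 / chi2.ppf(1 - alpha / 2, n - 1)   # divide by the larger chi-square value
upper = (n - 1) * s**2 / chi2.ppf(alpha / 2, n - 1)       # divide by the smaller chi-square value

print(lower, upper)    # roughly 4.09 and 10.73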

4.75 Z Confidence Intervals for Proportions

In this slide, let us learn to calculate confidence intervals for proportions. The formula for the confidence interval for a proportion is pbar (pronounced as p bar) plus or minus z at alpha by two, multiplied by the square root of pbar multiplied by qbar divided by n. Let us now calculate the ninety five percent confidence interval for a class of seventy reporting thirty percent absenteeism. For the solution, at alpha by two the Z score is one point nine six, as alpha is 5%. pbar is zero point three and qbar (pronounced as q bar) is zero point seven. Substituting these values in the formula, the confidence interval for 30% absenteeism in a class of 70 students is approximately twenty percent to forty percent. In the next slide, we will discuss chi-square and probability.

4.76 Chi Square and Probability

The chi-square statistic can be used to determine the probability of variances occurring. The chi-square statistic is given by the formula n minus one, multiplied by the square of s, divided by the square of sigma. Now let us check an example. For a population variance of one and a sample variance of zero point six computed from the data collected, determine the probability that the sample variance can exceed zero point six. The sample size given here is thirty. Substituting the values in the chi-square statistic formula, the statistic is calculated to be seventeen point four. Using the chi-square table with twenty nine degrees of freedom, the value seventeen point four falls between the critical values for upper-tail probabilities of ninety five percent and ninety seven point five percent. Therefore, the probability of the sample variance being greater than zero point six is between 95 and 97.5%.

4.77 T Distribution Confidence Intervals

After chi-square confidence intervals and Z confidence intervals, let us look into t distribution confidence intervals in this slide. The Student’s t distribution is used to determine confidence intervals when the sample size is less than thirty and the population standard deviation or population variance is unknown. Now let us look into an example: for a sample size of twenty five and a sample standard deviation of two, let us calculate the confidence interval for a mean of fourteen. For finding the solution, use the formula Xbar plus or minus t at alpha by two, multiplied by s divided by the square root of n, where alpha is 5%. Here, s is the sample standard deviation. From the t table, the value at alpha by two for twenty four degrees of freedom is two point zero six four. Substituting the values in the formula, we get the confidence interval as approximately 13.17 to 14.83.
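
The t interval for this example can be verified with scipy as shown below, using a sample standard deviation of two and twenty four degrees of freedom.

import math
from scipy.stats import t

xbar, s, n, alpha = 14, 2, 25, 0.05
t_crit = t.ppf(1 - alpha / 2, n - 1)     # about 2.064 for 24 degrees of freedom
margin = t_crit * s / math.sqrt(n)

print(xbar - margin, xbar + margin)      # roughly 13.17 and 14.83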

4.78 Summary

In this lesson, we have learned the concepts of confidence intervals and confidence interval testing. Then, we understood the objectives of confidence interval testing and the chi-square probability distribution. We also looked into the Student’s t distribution, and confidence intervals for the mean (mu) when the variance (sigma squared) is known and when it is unknown.

4.79 Lesson 6 Parametric Hypothesis Testing

We will now cover a lesson on parametric hypothesis testing. In this lesson, we will cover various hypothesis tests using the parameters available and perform several tests to validate hypotheses.

4.80 Agenda

Let us look into the agenda of this lesson. We will start with discussing the hypothesis testing objectives and then hypothesis testing concepts. We will then move on to understand null and alternate hypothesis, followed by type one and type two errors. Significance level will be the next topic to be discussed, after which we will look into beta and power. The next topic that we will discuss is p-value, and acceptance and rejection conditions. Then, we will understand sample size determination for tests. After that, we will look into 1 sample z test.

4.81 Agenda(Contd.)

Then we will understand 2 sample z test. We will then understand f-test of equality of variances. It will be followed by 1 sample t test, 2 sample t test, and paired t test. We will also discuss paired t test interpretation. Towards the end, we will understand ANOVA (pronounce it as ae-noh-vah), one way ANOVA, and then finally, two way ANOVA with replication.

4.82 Hypothesis Testing Objective

Let us understand the hypothesis testing objectives in this slide. To begin with, let us discuss what hypothesis testing is. Hypothesis testing is a form of decision making based on statistical inference that uses data from a sample to draw conclusions about a population parameter. Hypothesis testing is usually done to statistically validate whether a sample mean belongs to the population. It is also done to validate whether the means of two groups are statistically the same or significantly different, and whether the variances of two groups are the same or different. Parametric hypothesis testing is testing done on groups of data that come from a normal distribution.

4.83 Hypothesis Testing Concepts

In this slide, we will look at some of the hypothesis testing concepts, and in the next set of slides we will learn them in detail. The main concepts that fall under hypothesis testing are null and alternate hypothesis, type one error, type two error, significance level (represented as alpha), beta and power, p-value, acceptance and rejection conditions, and at the end, sample size determination for tests.

4.84 Null and Alternate Hypothesis

Null hypothesis or Ho (pronounced as H-o) is the basic assumption behind doing any activity. For example, we go to a movie assuming it is good. So according to the null hypothesis, Ho, the movie is good. The next type of hypothesis is the alternate hypothesis or Ha (pronounced as H-a). This is the exact opposite of the null hypothesis. The alternate hypothesis is used when the null hypothesis is rejected. Taking the example, we can say that according to alternate hypothesis, Ha, movie is not good. The main objective of conducting hypothesis tests is to reject the null or the alternate hypothesis. Often rejecting the null or alternate hypothesis has practical connotations, which will be discussed when the tests are done.

4.85 Type 1 Error

We will discuss type 1 error in this slide. A type one error is said to be committed when the null hypothesis is rejected even though it was actually true. Considering the same example mentioned in the previous slide, when we say that the movie was bad though it was actually good, a type 1 error is committed. Since the true null hypothesis (the movie is good) was rejected, this is a clear case of a type one error. A type one error is also known as a “false positive”. Let us take the case of a patient as an example. The patient underwent an HIV test, which concluded that he is an HIV carrier, and he is declared HIV positive. But in reality, he does not carry the virus. That is why a type one error is known as a false positive. Type one error is also referred to as producer’s risk, i.e., even though the product is good, it gets rejected.

4.86 Type II Error

Another error that experimenters commit is known as the type two error, which will be discussed in this slide. This occurs when someone rejects the alternate hypothesis when it was actually true, that is, the null hypothesis is accepted even though it is false. For example, a person is declared HIV negative, which means he does not carry the HIV virus, even though the person actually carries the virus. In reality, he should have been declared positive, but the doctor rejected the alternate (that the patient is HIV positive) and declared the patient to be free from the HIV virus. A type I error is considered serious because we rejected the null hypothesis wrongly; here, we had a chance of going with the assumption but we did not. In the above mentioned HIV case, a type two error could be even more dangerous, as we are declaring the patient free from the disease when he is not. Now let us look at an example. Men with high blood sugar are diagnosed with diabetes; the mean blood sugar level is one hundred fifty with a standard deviation of ten, and any individual having blood sugar greater than one hundred twenty five is diagnosed with diabetes. What is the probability of committing a type two error? First, find the z score by taking the difference between the cut-off value (which is 125) and the mean value (which is 150), and then dividing it by the standard deviation. This gives the magnitude of z as 2.5. Now, look into the z table. The tail area beyond two point five corresponds to zero point zero zero six two. That means the probability of a type two error happening here, that is, of a diabetic individual falling below the cut-off and being missed, is zero point six two percent.
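The same tail probability can be computed directly from the normal distribution; here is a minimal sketch assuming the blood sugar values of diabetic individuals follow a normal distribution with the stated mean and standard deviation.

```python
from scipy import stats

mean = 150.0      # mean blood sugar of diabetic individuals
sd = 10.0         # standard deviation
cutoff = 125.0    # diagnosis threshold

# Type II error: a diabetic individual falls below the cut-off and is not diagnosed.
z = (cutoff - mean) / sd                        # -2.5
beta = stats.norm.cdf(cutoff, loc=mean, scale=sd)
print(z, beta)                                  # beta is about 0.0062, i.e., 0.62 percent
```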

4.87 Significance Level (α)

In this slide, we will be dealing with the significance level or alpha. The probability of committing a type one error is known as the level of significance or the significance level, represented by alpha. When alpha is mentioned, a one-tailed probability is used to reject the hypothesis. (In a one-tailed probability, only one direction is considered extreme.) Alpha is equal to hundred percent minus the confidence level. If the confidence level is ninety five percent, the level of significance is five percent. Ninety five percent is often assumed as the default confidence level. The confidence level could be changed depending on the type of experiment done; in that case, possible alpha values could be one percent or ten percent.

4.88 Significance Level (α) (Contd.)

If the sample data on delivery hours has a mean of thirty six hours and a standard deviation of two, with delivery hours being normally distributed, what is the probability of a type one error when delivery hours over forty are diagnosed as defective? Let us look into the solution. First, we need to find the z score. Applying the standard formula, the z score here is two. The area beyond two corresponds to zero point zero two two eight in the z score table. That means the probability of a type one error happening, or the level of significance, is two point two eight percent. Let us discuss beta and power in the next slide.
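Before moving on, here is a minimal sketch of the same calculation, assuming delivery hours are normally distributed with the stated mean and standard deviation.

```python
from scipy import stats

mean = 36.0      # mean delivery hours
sd = 2.0         # standard deviation
limit = 40.0     # hours above this are treated as defective

# Type I error: a good delivery exceeds 40 hours by chance and is flagged as defective.
alpha = stats.norm.sf(limit, loc=mean, scale=sd)
print(alpha)     # about 0.0228, i.e., 2.28 percent
```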

4.90 β and Power

Beta is the probability of committing a type two error, that is, of rejecting the alternate hypothesis when it was actually true. With this error, a test on two group means concludes that it produces non-significant results when the results were actually significant. Power is the fraction of experiments that is expected to yield a statistically significant p-value. Based on the p-value, it is determined whether the null hypothesis is to be accepted or rejected. In other words, power shows the confidence in the test and its results. The usual power value is eighty percent, that is, out of a hundred experiments, eighty of them can be expected to show statistically significant p-values. Therefore, beta is equal to hundred percent minus the power percentage.
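As an illustration, the sketch below computes power and beta for a one-sample, one-sided z test under fully hypothetical settings (null mean 36, true mean 37, sigma 2, n = 25, alpha 0.05); none of these numbers come from the course material.

```python
import math
from scipy import stats

mu0, mu_true = 36.0, 37.0   # hypothetical null mean and true mean
sigma, n = 2.0, 25          # hypothetical population sigma and sample size
alpha = 0.05

se = sigma / math.sqrt(n)
# Reject Ho when the sample mean exceeds this critical value (one-sided test).
critical = mu0 + stats.norm.ppf(1 - alpha) * se

# Power: probability of exceeding the critical value when the true mean is mu_true.
power = stats.norm.sf(critical, loc=mu_true, scale=se)
beta = 1 - power
print(power, beta)          # roughly 0.80 and 0.20 for these settings
```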

4.91 P Value, and Acceptance and Rejection Conditions

We will look into the p-value, and the acceptance and rejection conditions, in this slide. If the p-value is less than alpha, the null hypothesis should be rejected. We can then conclude, with a confidence level of one minus the p-value, that there exists a significant statistical difference between the two groups. Now let us check the second case. If the p-value is greater than alpha, then reject the alternate hypothesis. We can conclude that there is no significant statistical difference between the two groups, with the p-value indicating the level of confidence. By accepting or rejecting the hypothesis, we should be in a position to conclude whether the mean has shifted or whether the variations really belong to the group.

4.93 1 Sample z Test

In this slide, we will understand the one sample z test. A one sample z test is used when a hypothesized mean is compared with the mean of a sample or population. Ho (pronounced as H-o) is that the mean belongs to the sample, and Ha (pronounced as H-a) is that the mean does not belong to the sample. The one sample z test has to be used when the sample size is greater than thirty and the standard deviation of the population is known. Let us take an example. A sample of thirty five was taken for measuring delivery hours. The mean, Xbar, was found to be thirty two point five, and the population standard deviation was found to be two point two. What is the probability that a mean of thirty four could belong to this sample? Let us interpret this in the next slide.
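Before interpreting the results, here is a minimal sketch of this one sample z test computed from the summary statistics above; the variable names are illustrative.

```python
import math
from scipy import stats

n = 35             # sample size
x_bar = 32.5       # sample mean
sigma = 2.2        # known population standard deviation
mu0 = 34.0         # hypothesized mean

se = sigma / math.sqrt(n)
z = (x_bar - mu0) / se
p_value = 2 * stats.norm.sf(abs(z))        # two-sided p-value

# 95% confidence interval around the sample mean.
margin = stats.norm.ppf(0.975) * se
print(z, p_value)                          # z about -4.03, p-value close to zero
print(x_bar - margin, x_bar + margin)      # roughly 31.77 to 33.23
```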

4.94 1 Sample z Test(Contd.)

A straightforward confidence interval test would have given us the interval as thirty one point seven seven to thirty three point two two nine. This clearly calls for the rejection of the null hypothesis, as the hypothesized mean lies outside the interval. The Minitab results for the one sample z test are shown in the slide. The interpretation is that the p-value of zero indicates that we can reject the null with close to hundred percent confidence. This means the hypothesized mean of thirty four cannot belong to the sample. We can therefore interpret that the mean of thirty four did not happen by chance; something special might have contributed to it.

4.95 1 Sample z Test(Contd.)

We will continue with the one sample z test in this slide as well. Let us change the confidence level to ninety percent and then test for just one bound, that is, the upper bound. The Minitab results are displayed on the slide. We can see that the confidence bound changes to thirty two point nine seven. Let us now look into the interpretations. From the results, we can say the p-value of zero indicates rejection of the null with close to hundred percent confidence. This means that the hypothesized mean of thirty four cannot belong to the sample. We can therefore say there is no way that the mean of thirty four happened by chance; something special might have contributed to it.

4.96 2 Sample z test

Let us look into the two sample z test in this slide. The same conditions from the one sample z test apply here. The difference is that a two sample z test tests the means of two groups and returns a significant or a non-significant p-value. Using a two-sample z test, we can compare the means of two groups and conclude whether they are statistically the same. Here, Ho (pronounced as H-o) is that Myu of group A is equal to Myu of group B, and Ha (pronounced as H-a) is that Myu of group A is not equal to Myu of group B. The above hypothesis is assumed for two-tailed probability testing. We will understand the f-test of equality of variances in the next slide.

4.97 f Test of Equality of Variances

We should first do an equal variances f test before conducting a t test. QI Macros, a statistical software package, can be used to conduct the f test. Minitab also provides this test under Stat > Basic Statistics > 2 Variances. The data groups are shown on the slide. As can be seen from the result box, we should accept that the variances of the two groups are statistically the same. The two-tailed probability value is 0.741. The interpretation clearly says the p-value is non-significant, so we accept that the two groups have equal variances. Now we can go ahead and work with the t-test assuming equal variances.
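For reference, an equal-variances f test can also be computed directly; this is a minimal sketch on hypothetical data (the course's actual data groups are only shown on the slide), using the variance ratio and the F distribution.

```python
import numpy as np
from scipy import stats

# Hypothetical data groups; replace with the actual slide data.
group_a = np.array([34.1, 36.2, 35.0, 33.8, 36.5, 34.9])
group_b = np.array([35.3, 34.0, 36.1, 35.8, 33.9, 34.7])

var_a = group_a.var(ddof=1)
var_b = group_b.var(ddof=1)

# F statistic is the ratio of the sample variances.
f_stat = var_a / var_b
df1, df2 = len(group_a) - 1, len(group_b) - 1

# Two-tailed p-value: twice the smaller tail area of the F distribution.
p_value = 2 * min(stats.f.sf(f_stat, df1, df2), stats.f.cdf(f_stat, df1, df2))
print(f_stat, p_value)   # a non-significant p-value supports equal variances
```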

4.98 1 Sample t Test

Let us look into the 1 sample t test. A 1 sample t test is used when we want to compare a hypothesized mean with the mean of a sample or population. Here, Ho is that the mean belongs to the sample, and Ha is that the mean does not belong to the sample. This test has to be used when the sample size is less than thirty and the standard deviation of the population is unknown. Let us look into the example. A sample of twenty five was taken for measuring delivery hours. The mean, Xbar, was found to be thirty two point five, and the sample standard deviation was found to be two. What is the probability that a mean of thirty four could belong to this sample? Let us look into the solution in the next slide.
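Before that, here is a minimal sketch of the same one sample t test computed from the summary statistics above.

```python
import math
from scipy import stats

n = 25             # sample size
x_bar = 32.5       # sample mean
s = 2.0            # sample standard deviation
mu0 = 34.0         # hypothesized mean

se = s / math.sqrt(n)
t_stat = (x_bar - mu0) / se                     # -3.75
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
print(t_stat, p_value)                          # p-value about 0.001
```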

4.99 1 Sample t Test(Contd.)

The Minitab results for the one sample t test are displayed below. If we look at the table, the confidence interval alone is enough to reject the null, and with a p-value of zero point zero zero one, we can reject the null hypothesis. The mean of thirty four does not represent the population from which the sample is drawn. This means that it could not have happened by chance. Let us look into the 2 sample t test in the next slide.

4.100 2 Sample t Test

For the two sample t test, the conditions applied are the same as those of a one sample t test. The only difference from the 1 sample t test is that a two sample t test tests the means of two groups and returns a significant or a non-significant p-value. With a two-sample t test, we can compare the means of two groups and conclude whether they are statistically the same. Here, Ho is that Myu of group A is equal to Myu of group B, and Ha is that Myu of group A is not equal to Myu of group B. The above hypothesis is assumed for 2-tailed probability testing. In a two-tailed probability, both directions can have extreme values. For conducting a two sample t test, it needs to be assumed whether the two groups have equal variances or not. To do this, a homogeneity of variance test has to be conducted first.

4.101 2 Sample t Test(Contd.)

Now let us conduct a two sample t test in this slide. As seen earlier, the two-tailed probability value of the f test is 0.741, a non-significant p-value, so we accept that the two groups have equal variances. We can therefore go ahead and work with the t-test assuming equal variances. For the same group of data, the t-test results are shown in the slide.

4.102 2 Sample t Test(Contd.)

Now let us analyze the 2 sample t test given in the previous slide. Assuming equal variances, and with the two-tailed probability value showing as zero point nine four eight, we can reject the alternate. We can also say, with ninety four point eight percent confidence, that the means of the two groups are equal. Thus, we can interpret that there is no statistical difference between the means of the two samples, which means that both means seem to come from the same population. So, we can finally interpret that there is no indication of non-random variation in the two groups. The degrees of freedom (df) indicated here is ten, that is, n one plus n two minus two. In the next slide, we will look into the paired t test.
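Before moving on, here is a minimal sketch of a pooled (equal-variances) two sample t test on hypothetical data with six observations per group, so the degrees of freedom work out to ten as in the course example; the values themselves are illustrative only.

```python
from scipy import stats

# Hypothetical delivery-hour samples, six observations per group (df = 6 + 6 - 2 = 10).
group_a = [34.1, 36.2, 35.0, 33.8, 36.5, 34.9]
group_b = [35.3, 34.0, 36.1, 35.8, 33.9, 34.7]

# equal_var=True gives the pooled-variance (classical) two sample t test.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
print(t_stat, p_value)   # a large p-value means no significant difference in means
```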

4.103 Paired t Test

The paired t test is a useful before-and-after test, which is used to statistically validate the improvements made. This test is often used in the Improve stage to statistically validate improvements. As the paired t test is not done on independent samples, the groups of data must be correlated. The degrees of freedom are now computed on one group and not two. A paired t test done on the data for delivery hours, before and after improvement measures, is shown in the next slide.

4.104 Paired t Test(Contd.)

This slide shows a snapshot of what the paired t test results look like.
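As a reference for reproducing such results, here is a minimal sketch of a paired t test on hypothetical before and after delivery-hour data with six pairs, so the degrees of freedom are five as in the course example.

```python
from scipy import stats

# Hypothetical delivery hours for the same six deliveries, before and after improvement.
before = [38.0, 36.5, 37.2, 39.1, 36.8, 38.4]
after  = [33.2, 34.0, 32.5, 34.8, 31.9, 33.6]

# Paired t test on the correlated before/after measurements (df = pairs - 1 = 5).
t_stat, p_value = stats.ttest_rel(before, after)
print(t_stat, p_value)   # a significant p-value indicates the means differ
```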

4.105 Paired t Test Interpretation

The null hypothesis, represented by Ho, states that there is no significant statistical difference between the means of the two groups of data. The alternate hypothesis, represented by Ha, states that there is a significant difference between the means of the two groups of data. As p is less than zero point zero five, we can reject the null and conclude that the means are significantly different. This alone only tells us that the before group could be less than or greater than the after group. To check whether the improvement has indeed worked, the after group mean has to be significantly less than the before group mean. From the box plot, it is obvious that the mean has changed significantly. Click on the box plots option in QI Macros for a graphical display.

4.106 Paired t Test(Contd.)

Look at the box plot for delivery hours. As we can see, delivery hours post-improvement, indicated by group 2, have a much lower central tendency than delivery hours pre-improvement. The variability of post-improvement delivery hours is slightly greater than that of pre-improvement. This is considered acceptable because, post improvement, special causes do not seem to be present.

4.107 Paired t Test(Contd.)

Let us understand the interpretations of the paired t test. Looking into the paired t test mentioned in the previous slide, we can say that the improvement is positive, and that the improvement given to the group has worked. The effect of the improvement has been statistically validated. These improvements, which have been implemented on a pilot, could now be implemented across the organization. Importantly, note the degrees of freedom here for the paired t test: it is five, while in the two-sample t test we saw the degrees of freedom to be ten.

4.108 ANOVA

In this slide, we will discuss ANOVA (pronounce it as ae-noh-vah), which stands for analysis of variance. ANOVA is an excellent testing procedure for testing more than two groups at the same significance level. Using ANOVA, the Black Belt would be able to accept or reject the null hypothesis by doing just one test. The three or more groups of data should be independent; their values should not depend on another group’s values. Although ANOVA stands for analysis of variance, it tests for the equality of means. Here the null hypothesis, Ho (pronounced as H-o), is that there is no significant statistical difference between the means of the three groups. Now let us look into one way ANOVA in the next slide.

4.109 One Way ANOVA

Here, we are to test the delivery hours metric across three independent samples of data. We will use a one way ANOVA test in QI Macros. The results table is shown in the slide. As we can see, the p-value is 0.843, which is non-significant. Therefore, we can reject the alternate hypothesis and conclude that there is no statistically significant difference between the means of the three groups.
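The same kind of one way ANOVA can be run in a few lines; this is a minimal sketch on hypothetical delivery-hour samples, since the course's actual data is only shown on the slide.

```python
from scipy import stats

# Hypothetical delivery-hour samples from three independent groups.
group_1 = [34.2, 35.1, 36.0, 33.8, 35.5]
group_2 = [34.8, 35.9, 34.1, 36.2, 35.0]
group_3 = [35.3, 34.6, 36.4, 34.9, 35.7]

# One way ANOVA: tests whether all three group means are equal.
f_stat, p_value = stats.f_oneway(group_1, group_2, group_3)
print(f_stat, p_value)   # a non-significant p-value means no detectable difference in means
```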

4.110 Two Way ANOVA with Replication

Let us, in the next three slides, learn the two-way ANOVA with replication, another powerful test. The two-way ANOVA test is used to test interactions between multiple factors in an experiment. For these experiments, the significance level is maintained at an alpha of zero point zero five. This is the tolerance for type one risk, based on a confidence level of ninety five percent. In order to use this test and present interpretations, the data needs to be captured in the table in the right format. We can use either QI Macros or Minitab to produce the results.

4.111 Two Way ANOVA with Replication(Contd.)

Let us take an example. A company studies the interaction between multiple drugs by using different quantities of each across categories of patients. The readings are presented in the Excel sheet. Conduct a two-way ANOVA with replication and interpret the results. The drug data table is presented on the slide.

4.112 Two Way ANOVA with Replication(Contd.)

The results table is seen on the slide. There are three interpretations highlighted on the table, which would be discussed in the next slide.

4.113 Two Way ANOVA with Replication(Contd.)

Now, let us look into the interpretations of the result that we saw in the table in the earlier slide. Based on the p-values determined in the analysis, we infer the following. The first highlighted p-value is 0.179, which is given against the sample. Looking at the value, we can interpret that the category of patients who use the drug has no effect on the different rates (diff. rates). The second p-value, given against columns, is 0.106. This can be interpreted as the type of drug used by the patients having no effect on the different rates. The final interpretation is given against interaction. The interpretation here is that the interaction between the type of drug used and the category of the patients has a significant effect on the different rates, shown by the low p-value of zero point zero zero six.
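A similar table, with rows for each main effect and the interaction, can be produced with statsmodels; this is a minimal sketch on hypothetical drug data (patient category, drug type, and a response column named rate here for illustration), not the course's actual readings.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical drug-response data: 2 patient categories x 2 drugs, 3 replicates per cell.
data = pd.DataFrame({
    "category": ["A"] * 6 + ["B"] * 6,
    "drug":     ["X", "X", "X", "Y", "Y", "Y"] * 2,
    "rate":     [4.1, 4.3, 3.9, 5.0, 5.2, 5.1,
                 4.0, 4.2, 4.1, 4.2, 4.0, 4.3],
})

# Fit a model with both main effects and the interaction term, then build the ANOVA table.
model = ols("rate ~ C(category) * C(drug)", data=data).fit()
table = sm.stats.anova_lm(model, typ=2)
print(table)   # rows: C(category), C(drug), the C(category):C(drug) interaction, Residual
```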

4.114 Summary

In this lesson, we have learned the following. We started the lesson by covering the hypothesis testing objectives and concepts. Then we covered the one sample z-test and the two sample z-test. After that, we went into the details of determining equality of variances using the f-test. Then, we covered the one sample t-test and the two sample t-test. Following that, we covered the paired t-test, and at the end, we learned one-way ANOVA and two-way ANOVA. In the toolkit, you can look into the file named Hypothesis Tests, which has the workings of all these tests. Use QI Macros to conduct these tests. We will look into the next lesson, nonparametric hypothesis testing, in the next slide.

4.115 Lesson 7 Nonparametric Hypothesis Testing

We will now cover a lesson on nonparametric hypothesis testing. In this lesson, we will cover the various tests that are done when the data does not meet the assumptions of a parametric test.

4.116 Agenda

In this lesson, we will learn the Non Parametric Hypothesis testing. We will start off this lesson by looking into the nonparametric testing conditions and then, Mann-Whitney test. We will then move on to understand 1 Sample Sign and Wilcoxon Test. Following that, we will understand Kruskal Wallis, and Mood’s Median, ending the lesson with Friedman ANOVA.

4.117 Nonparametric Testing Conditions

Let us look into the conditions for using nonparametric tests in this slide. Nonparametric testing is used instead of parametric tests when the data consists of counts or frequencies of different types. It is also used when the data is measured on a nominal or ordinal scale and does not meet the assumptions of a parametric test. Nonparametric tests are also used when the sample is small. Each nonparametric test corresponds to a parametric test, which makes it easier to see where each can be used. For example, a one sample sign test is used to check whether the sample median indeed corresponds to the hypothesized population median; the corresponding parametric tests are the one sample z test and the one sample t test. The table given on the slide presents the correspondence between the nonparametric and parametric tests. The first column lists the nonparametric (that is, distribution free) tests, and the right column shows the parametric (normal distribution) tests. The nonparametric 1-sample sign test is equivalent to the parametric 1-sample z-test and 1-sample t-test. The Wilcoxon test’s equivalents among parametric tests are the 1-sample and 2-sample t-tests. The nonparametric Mann-Whitney test has an equivalent parametric test, the 2-sample t-test. Kruskal-Wallis is a nonparametric test that is similar to the parametric one-way ANOVA. The nonparametric Mood’s median test is also parallel to the one-way ANOVA, and finally, the Friedman nonparametric test is similar to the parametric two-way ANOVA on paired data.

4.118 Mann Whitney Test

Let us look into the Mann Whitney test in this slide. Here are some conditions of the Mann Whitney test. We can use the Mann Whitney test if the data meets the nonparametric testing conditions and is divided into two independent samples. The rejection and acceptance conditions remain the same, that is, when the p-value is less than alpha, we reject the null hypothesis, and if the p-value is not less than alpha, we reject the alternate hypothesis. In this test, alpha is set by default at zero point zero five. You can use the tool Nonparametric Testing in the toolkit to see how nonparametric tests work. We will continue discussing the Mann Whitney test in the next slide as well.

4.119 Mann Whitney Test(Contd.)

Like the various parametric tests, each of these nonparametric tests has its own way of calculating the test statistic. A Black Belt does not have to know the exact way of calculating the test statistic, but he or she should know how to use the tests, which test to use when, and how to interpret them. The ready format of the Mann Whitney test is available for use in the tool file Non Parametric Hypothesis Testing. Please type in the necessary data in the columns with a yellow background, as given on the slide. As we can see, the p-value gets automatically calculated in the cell highlighted in blue-green color on updating the necessary data. Here, the null has to be accepted, which means there are no differences in the medians of the two groups.
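Outside the toolkit sheet, the same test can be run with scipy; this is a minimal sketch on two hypothetical independent samples.

```python
from scipy import stats

# Hypothetical independent samples from two groups.
group_a = [12.1, 14.3, 11.8, 13.5, 12.9, 14.0, 13.2]
group_b = [13.8, 15.1, 14.6, 13.9, 15.4, 14.8, 15.0]

# Mann-Whitney U test: compares the two groups without assuming normality.
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(u_stat, p_value)   # reject the null if the p-value is less than 0.05
```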

4.120 1 Sample Sign

Now let us discuss the 1 sample sign test. The one sample sign test should be used instead of the one sample t test when the data meets the nonparametric testing conditions. Here, the null hypothesis, Ho, is that the hypothesized or assumed median of the sample belongs to the population; the test works by testing the median. The one sample sign test is the nonparametric equivalent of the one sample t test. The one sample sign test ready format is available for use in the tool file Non Parametric Hypothesis Testing. Please type in the necessary data in the columns with a yellow background. As we can see, the p-value gets automatically calculated in the cell highlighted in blue-green color. For this test, the confidence level is set to 95%.
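The sign test itself reduces to a binomial test on the signs of the differences; here is a minimal sketch on a hypothetical sample against a hypothetical median of 32 (assuming a recent scipy that provides binomtest).

```python
from scipy import stats

# Hypothetical sample and hypothesized median.
sample = [31.2, 33.5, 30.8, 34.1, 32.6, 33.2, 31.9, 34.4, 32.7, 30.5]
hypothesized_median = 32.0

# Count values above and below the hypothesized median (exact ties are dropped).
above = sum(x > hypothesized_median for x in sample)
below = sum(x < hypothesized_median for x in sample)

# Under Ho, the signs follow a Binomial(n, 0.5) distribution.
result = stats.binomtest(above, above + below, p=0.5, alternative="two-sided")
print(result.pvalue)   # reject Ho if this p-value is less than 0.05
```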

4.121 Wilcoxon Sign Rank Test

Let us discuss another nonparametric test, the Wilcoxon sign rank test, in this slide. The Wilcoxon sign rank test can be used as a substitute for the 2-sample t test. It can also be used in situations where the Black Belt wishes to regress an input variable against an output variable. For the data presented in the nonparametric testing sheet, which is a part of the toolkit, the interpretation is again done on the p-value. The Wilcoxon sign rank test ready format is available for use in the tool file Non Parametric Hypothesis Testing. Please type in the necessary data in the column with a yellow background. As we can see, the p-value gets automatically calculated in the cell highlighted in blue-green color. The p-value comes out to be 0.01, which is less than the alpha value of 0.05. In this case, we can reject the null hypothesis.
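For reference, scipy's wilcoxon function runs this test on paired measurements (it works on the signed differences between the two columns); the data below is hypothetical.

```python
from scipy import stats

# Hypothetical paired measurements, for example before and after an adjustment.
x = [14.2, 15.1, 13.8, 16.0, 14.9, 15.5, 14.4, 15.8]
y = [13.1, 14.0, 13.5, 14.6, 13.8, 14.2, 13.6, 14.9]

# Wilcoxon signed rank test on the paired differences.
w_stat, p_value = stats.wilcoxon(x, y)
print(w_stat, p_value)   # reject the null if the p-value is less than 0.05
```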

4.122 Kruskal Wallis

Another nonparametric test, Kruskal Wallis, is covered in this slide. The Kruskal Wallis test should be used when we want to check whether the groups of data have the same variance or not. In Kruskal Wallis, the chi-square distribution is used to evaluate the test statistic, unlike other tests which use z-distributions. The Kruskal Wallis ready format is available for use in the tool file Nonparametric Hypothesis Testing provided in the toolkit. Please type in the data in the necessary columns. As we can see, the p-value gets automatically calculated in the cell highlighted in blue-green color. For the data in the nonparametric testing sheet, the results are as shown on the slide. As we can see, the p-value comes out to be 0.151. Since the p-value is more than alpha, which is 0.05, we have to accept the null hypothesis and reject the alternate hypothesis.
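The same test is available in scipy; this is a minimal sketch on three hypothetical groups.

```python
from scipy import stats

# Hypothetical samples from three groups.
group_1 = [7.1, 8.2, 6.9, 7.8, 8.0]
group_2 = [7.5, 8.6, 7.9, 8.1, 7.7]
group_3 = [6.8, 7.3, 7.0, 7.6, 7.2]

# Kruskal-Wallis H test; the statistic is compared against a chi-square distribution.
h_stat, p_value = stats.kruskal(group_1, group_2, group_3)
print(h_stat, p_value)   # accept the null if the p-value exceeds 0.05
```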

4.123 Mood’s Median

In this slide, we will understand what Mood’s median test is and how it is used. The Mood’s median test is a nonparametric test that is used to test the equality of medians from two or more different populations. Please note that the Mood’s median test works when the Y variable is continuous, discrete-ordinal, or discrete-count, and when the X variable is discrete with two or more attributes. Here is how the test works. The first step is to find the median of the combined data set. After that, find the number of values in each sample greater than the median and form a contingency table, as shown on the slide. Then, using the formula, we can find the expected value for each cell. The formula to be used is: expected value equals the product of the row total and the column total divided by the grand total. Once we get the expected values, we can move on to find the chi-square value. The chi-square contribution of each cell is equal to the square of the difference between the actual and expected value divided by the expected value; chi-square is denoted in the formula as chi-square with a subscript of contribution.
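The steps above translate directly into a few lines of code; here is a minimal sketch on two hypothetical samples that builds the contingency table, the expected values, and the chi-square statistic exactly as described.

```python
import numpy as np
from scipy import stats

# Hypothetical samples from two groups.
group1 = [12, 15, 11, 14, 13, 16, 12, 15]
group2 = [10, 11, 9, 12, 10, 13, 11, 9]

# Step 1: median of the combined data set.
grand_median = np.median(group1 + group2)

# Step 2: contingency table of counts above / not above the grand median.
observed = np.array([
    [sum(x > grand_median for x in g) for g in (group1, group2)],
    [sum(x <= grand_median for x in g) for g in (group1, group2)],
])

# Step 3: expected value = row total x column total / grand total,
# chi-square = sum of (observed - expected)^2 / expected over all cells.
row_tot = observed.sum(axis=1, keepdims=True)
col_tot = observed.sum(axis=0, keepdims=True)
expected = row_tot * col_tot / observed.sum()
chi_sq = ((observed - expected) ** 2 / expected).sum()

# p-value from the chi-square distribution with (rows-1)*(cols-1) degrees of freedom.
p_value = stats.chi2.sf(chi_sq, df=(observed.shape[0] - 1) * (observed.shape[1] - 1))
print(chi_sq, p_value)
```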

4.124 Friedman ANOVA

In this slide, we will discuss the Friedman ANOVA. Milton Friedman, a U.S. economist, developed this test, and it was named the Friedman ANOVA test after him. The Friedman test is a form of nonparametric test that makes no assumptions about the specific shape of the population from which the sample is drawn. It allows the analysis of smaller sample data sets. Unlike ANOVA, the Friedman test does not require the data set to be randomly sampled from normally distributed populations with equal variances. It uses a two-tailed hypothesis test where the null hypothesis is that the population median of each treatment is statistically identical to the rest of the group.

4.125 Friedman ANOVA(Contd.)

Consider an example where three treatments are evaluated on four different patients; data is captured for 3 different types of therapy across 4 patients. Ranks are assigned for each patient across the different types of therapy. If values are equal, average the ranks they would have received if they were slightly different. The Friedman test statistic is then calculated using the formula shown on the screen.
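The ranking and the test statistic are computed automatically by scipy; this is a minimal sketch using hypothetical scores for three therapies across four patients.

```python
from scipy import stats

# Hypothetical scores for three therapies, each measured on the same four patients.
therapy_1 = [7.0, 6.5, 8.0, 7.2]
therapy_2 = [8.1, 7.9, 8.6, 8.0]
therapy_3 = [6.2, 6.0, 7.1, 6.4]

# Friedman test: ranks each patient's scores across therapies and compares the rank sums.
chi_sq, p_value = stats.friedmanchisquare(therapy_1, therapy_2, therapy_3)
print(chi_sq, p_value)   # reject the null if the p-value is less than 0.05
```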

4.126 Summary

Let us summarize the key observations of the nonparametric hypothesis tests. In this lesson, we have learned how to do the Mann-Whitney test when the data is divided into two independent samples, and when to use the one sample sign test instead of the one sample t-test. Then, we covered how to use the Wilcoxon sign rank test to regress an input variable against an output variable. We also learned how to do the Kruskal Wallis test to understand whether the groups of data have the same variance or not, and how Mood’s median test is used to test the equality of medians from two or more different populations. And finally, we learned how to use the Friedman ANOVA test. It is important to note that, as a Black Belt, one need not know the formulas that are used to calculate the test statistic for each of the nonparametric tests. But the Black Belt needs to know which test to use when, and how to use the test.

4.127 Lesson 8 Analyze Additionals Categorical Data and Current Reality Tree

Let us now complete the last lesson of the Analyze phase. In this lesson, we will cover additional analysis tools, namely categorical data analysis and the current reality tree tool for root cause identification.

4.128 Agenda

This lesson is an auxiliary lesson for the Analyze phase and covers just two topics, namely, how to analyze categorical data and how to use the current reality tree tool. Let us understand categorical data analysis in the next slide.

4.129 Categorical Data Analysis

Analyzing continuous or discrete data is not a major issue, as the Black Belt has a wide variety of hypothesis tests and confidence intervals to determine the test values. With data that can be represented on the nominal scale, or that can be represented as a fixed number of nominal categories, the analysis needs to be done differently, using categorical data analysis. The most popular way of analyzing categorical data is with the help of a two by two contingency table. In this table, the data lists the number of men and women who are dieting and non-dieting. The chi-square distribution is popularly used in analyzing categorical data.

4.130 Categorical Data Analysis(Contd.)

Let us now discuss an example of analyzing categorical data. A sample of dieting and non-dieting teenagers was studied across the men and women population. Rather than representing the data as frequencies over classes, the frequencies were represented over categories. The data for this can be seen on the slide. Test to see whether the difference in the proportions of dieting and non-dieting people is significant. Let us do Fisher’s exact test in the next slide.

4.131 Categorical Data Analysis(Contd.)

We will use a test known as Fisher’s exact test on two by two tables. This test sheet is provided as part of the toolkit. From the results table that we see on the slide, it is clear that the p-value is significant. Thus, the null hypothesis is rejected, and we accept that there is a significant statistical difference in the proportions. In this example, we use the data set of men and women who are dieting and not dieting. The details are shown in the first table. The right table is the results table of Fisher’s exact test. From the results, one can observe the following: the p-value is around 0.04, which is less than the alpha of 0.05. Therefore, the null hypothesis is rejected, and it can be concluded that there is a significant difference in the proportions of dieting people among men versus women.
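Fisher's exact test is also available in scipy; here is a minimal sketch on a hypothetical two by two dieting table (the actual counts are only shown on the slide).

```python
from scipy import stats

# Hypothetical 2x2 contingency table:
#              dieting   non-dieting
# men             3          12
# women          10           5
table = [[3, 12],
         [10, 5]]

odds_ratio, p_value = stats.fisher_exact(table, alternative="two-sided")
print(odds_ratio, p_value)   # compare the p-value against alpha = 0.05
```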

4.132 Categorical Data Analysis(Contd.)

We will continue discussing categorical data analysis in this slide as well. Fisher’s test, done in the previous slide, is preferred when the degrees of freedom is one. In the example, there were two levels for each of the two variables, as a result of which the degrees of freedom would be (2 minus 1) multiplied by (2 minus 1), which is one. The Fisher’s test sheet is presented for you in the Simplilearn toolkit file. In most other cases, the chi-square goodness of fit test is used. Other tests that are used are Yates’ correction for continuity, the Cochran-Mantel-Haenszel test, McNemar’s test, the Portmanteau test, and the likelihood ratio test for statistical modeling.

4.133 Current Reality Tree

Now let us look into the current reality tree. What is a current reality tree? A current reality tree is an excellent root cause identification tool. This tree helps plot the problem and find the reason why it really happens. The tree is categorized into two groups: symptoms and problems. Problems here refer to any problem that a process or product might be having, and the symptoms are the potential root causes for these. An undesirable effect, also known as a UDE (pronounced as U-D-E), is a symptom that results in multiple symptoms happening. Symptoms are things that happen and are visible to the naked eye. A snapshot of the current reality tree is given in the next slide. The tool for the current reality tree is attached in the toolkit file TOC tools.

4.134 Current Reality Tree(Contd.)

This slide shows a snapshot of what a current reality tree looks like.

4.135 Summary

Let’s summarize this lesson now. In this brief lesson, we have learned how to deal with categorical data. We have also understood how to do Fisher’s test, one of the tests we can use with categorical data, and we looked at a list of other tests that can be used with categorical data. Most of these tests are outside the Black Belt body of knowledge; it is important, though, that a Black Belt knows about them.

4.136 Activity Summary Analyze

In this slide, we will look into the activity summary for Analyze. These activities should be performed in chronological order in the Analyze phase. First, check the lean status of the process. Then, analyze the wastes and eliminate or reduce them. Now re-check the process conditions; if no improvement is seen, proceed. The next step is to brainstorm. After that, map what is causing the variation in the output variable. Then, test the relationships between the input and output variables. In the next step, do a hypothesis test or a confidence interval test for the characteristics. Finally, understand the root cause of the variation.

4.137 Tools Summary Analyze

Let us look into the tools used in the Analyze phase. The tools mentioned below should be used in chronological order in the Analyze phase. They are: value stream map; spaghetti chart; lean tools; fishbone diagram with Pareto charts; cause and effect matrix; regression; confidence intervals; hypothesis tests; and current reality tree and fishbone. It is quiz time! Attempt the quiz to check your understanding of this section. In the next section, we will be dealing with the Improve phase.
