## Certified Six Sigma Green Belt

# Measure Phase of Lean Six Sigma Tutorial

## 3.1 Measure

Hello and welcome to the third lesson of the Certified Six Sigma Green Belt Course offered by Simplilearn. This lesson will cover the details of the measure phase. Let us explore the objectives of the lesson in the next screen.

## 3.2 Objectives

After completing this lesson, you will be able to explain process analysis and documentation, describe probability, statistics, and statistical distributions, and collect and summarize data. Next, you will be able to perform Measurement System Analysis (MSA). Finally, you will be able to describe process and performance Capability. Let us start with the first topic in the following screen.

## 3.3 Topic 1 Process Analysis And Documentation

In this topic, we will discuss process analysis and documentation in detail. Let us start with an introduction to the measure phase in the following screen.

## 3.4 Introduction To Measure Phase

The Measure phase is the second phase in a Six Sigma project. The key objective of the Measure phase is to gather as much information as possible on the current processes. This involves three key tasks: creating a detailed process map, gathering baseline data, and summarizing and analyzing the data. Let us understand process modeling in the following screen.

## 3.5 Process Modeling

Process modeling refers to the visualization of a proposed system, layout, or other change in the process. Process modeling and simulation can determine the effectiveness or ineffectiveness of a new design or process. They can be done using process mapping and flow charts. We will learn about these in the forthcoming screens.

## 3.6 Process Mapping

Let us understand process mapping in this screen. Process mapping refers to a workflow diagram which gives a clear understanding of the process or a series of parallel processes. It is also known as process charting or flow-charting. Process Mapping can be done either in the Measure phase or the Analyze Phase. The Features of process mapping are as follows: Process mapping is usually the first step in process improvement. Process mapping gives a wider perspective of the problems and opportunities for process improvement. It is a systematic way of recording all the activities performed. Process mapping can be done by using any of the methods like flowcharts, written procedures, and detailed work instructions. Let us learn about flowchart in the following screen.

## 3.7 Flowchart

A flowchart is a pictorial representation of all the steps of a process in consecutive order. It is used to plan a project, document processes, and communicate the process methodology to others. There are many symbols used in a flowchart, and the common symbols are shown in the given table. It is recommended that you take a look at the symbols and their descriptions for better understanding. The given flowchart shows the processes involved in software development. The flowchart starts with the Start box, which connects to the Design box. In a software project, design is followed by coding, which is then followed by testing. In the next step, there is a check for errors. If there are errors, the error type is evaluated. If it is a design error, the flow goes back to the beginning of the design stage. If it is not a design error, the flow is routed to the beginning of the coding stage. If there are no errors, the flowchart ends.

## 3.9 Work Instructions

Work instructions define how one or more activities involved in a procedure should be written in a detailed manner with the aid of technology or other resources like flowcharts. They provide step-by-step details for a sequence of activities organized in a logical format so that an employee can follow it easily and independently. For example, in the internal audit procedure, how to fill out the audit results report comes under work instructions. Selection of the three process mapping tools is based on the amount of detail involved in the process. For a less detailed process, you can select flowchart and for a detailed process, with lots of instructions, you can select Work Instructions. Click the button to view an example of work instructions. This example shows the work instructions for shipping electronic instruments. The company name is Nutri Worldwide Inc. The instructions are written by Briana Scott and approved by Andrew Murphy. It is a one page instruction. The work instructions are documented for the shipping of electronic instruments by the shipping department. The scope of the project states that it is applicable to shipping operations. The Procedure is divided into three broad steps. As a first step, the order for the shipment must be prepared. In this step, the shipping person receives an order number from the sales department through an automatic order system. The quantity of the instrument and its card number are looked up from the system file and the packaging is done as per the instructions on the card. The next step is packaging. Special packing instructions must be checked. The instruments are then marked as per the instructions on the card and packed in a special or standard container as per the requirement. The order number is written in the shipping system and the packing list and shipping documentation are obtained. Finally, the quantity of instruments and the documents is checked.

## 3.10 Process Input And Output Variables

Let us understand process input and output variables in this screen. Any improvement of a process has a few prerequisites. To improve a process, the Key Process Output Variables (KPOV) and Key Process Input Variables (KPIV) should first be measured. Metrics for key process variables include percent defective, operation cost, elapsed time, backlog quantity, and documentation errors. Critical variables are best identified by the process owners. Process owners know and understand each step of a process and are in a better position to identify the critical variables initially. Once identified, the relationship between the variables is depicted using tools such as SIPOC and Cause and Effect Matrix. The process input variables’ results are compared to determine which input variables have the greatest effect on the output variables. Let us proceed to the next topic of this lesson in the following screen.

## 3.11 Topic 2 Probability And Statistics

In this topic, we will discuss Probability and Statistics in detail. Let us learn about probability in the following screen.

## 3.12 Probability

Probability refers to the chance of something occurring or happening. An outcome is the result of a single trial of an experiment. Suppose there are N possible outcomes that are equally likely. The probability that a specific type of event occurs is the number of outcomes favorable to that event, say f, divided by the total number of possible outcomes, that is, f/N. For example, in the event of tossing a coin, what is the probability of the occurrence of ‘heads’? A single trial of tossing a coin has two outcomes, heads and tails. Hence, the probability of heads occurring is 1 divided by 2, the total number of outcomes.

## 3.13 Basic Properties of Probability

Let us look at some basic properties of probability in this screen. There are three basic properties of probability. Property one states that the probability of an event is always between zero and one, both inclusive. According to property two, the probability of an event that cannot occur is zero. Such an event is called an impossible event. Property three states that the probability of an event that must occur is one. Such an event is called a certain event. If E is an event, then the probability of its occurrence is written as P(E), which is read as the probability of event E.

## 3.14 Probability Common Terms and Example

In this screen, let us look at some common terms used in probability along with an example. The commonly used terms in probability are sample space, Venn diagram, and event. Sample space is the collection of all possible outcomes for a given experiment. In the coin example discussed earlier, the sample space consists of two outcomes, heads and tails. If two coins are tossed, the sample space contains four outcomes in total. A Venn diagram shows all hypothetically possible logical relations between a finite collection of sets. An event is a collection of outcomes for an experiment, that is, any subset of the sample space. As an example, what is the probability of getting a three followed by a two when a die is thrown twice? When the die is thrown twice, the first throw can show any number from 1 to 6. Similarly, the second throw can also show any number from 1 to 6. So the sample space contains 6 times 6, that is, 36 outcomes. The event in this case is a 3 followed by a 2. This can happen in only one way. So the probability in question is 1 divided by 36.
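The die example can be checked by listing the sample space explicitly. The following is a minimal sketch, assuming a fair six-sided die; the variable names are illustrative and not part of the course material.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered pair (first throw, second throw) of a six-sided die.
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

# Event: a three on the first throw followed by a two on the second throw.
event = [outcome for outcome in sample_space if outcome == (3, 2)]

# Probability = favorable outcomes / total equally likely outcomes.
probability = Fraction(len(event), len(sample_space))

print(len(sample_space))  # 36
print(probability)        # 1/36
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.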

## 3.15 Probability Concepts

Let us discuss the basic concepts of probability in this screen. Some basic concepts of probability are independent events, dependent events, mutually exclusive events, and mutually inclusive events. When the occurrence of one event does not affect the probability of occurrence of another event, the two events are said to be independent. Suppose you rolled a die and flipped a coin at the same time. The number you get on the die in no way influences the probability of getting heads or tails on the coin. When the occurrence of one event influences the likelihood of the other event, the events are said to be dependent. Events are said to be mutually exclusive if the occurrence of any one of them prevents the occurrence of all the others. In other words, only one event can occur at a time. Consider an example of flipping a coin. When you flip a coin, you will either get heads or tails, but not both. Because these two events are mutually exclusive, the probability of getting either heads or tails is the sum of their individual probabilities. Any two events wherein one event cannot occur without the other are said to be mutually inclusive events.

## 3.16 Multiplication Rules or AND Rules

In this screen, let us learn about the multiplication rules, also known as AND rules. The multiplication rules or AND rules depend on the event dependency. If the events are independent of each other, the special multiplication rule applies. It is as follows: if the events A, B, C, and so on are independent of each other, then the probability of A and B and C and so on is equal to the product of their individual probabilities. As an example of this rule, suppose there are three events which are independent of each other: flipping a coin and getting heads, drawing a card and getting an ace, and throwing a die and getting a one. What is the probability of occurrence of all these events? The probability of A and B and C is equal to the product of their individual probabilities, which is one-half multiplied by one-thirteenth multiplied by one-sixth. The result is 1/156, or about 0.0064, which is 0.64%. Hence, there is a 0.64% probability of all of the events occurring.
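The special multiplication rule for the coin, card, and die example can be sketched in a few lines. This assumes a fair coin, a standard 52-card deck with four aces, and a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction

p_heads = Fraction(1, 2)   # fair coin lands on heads
p_ace = Fraction(4, 52)    # ace drawn from an assumed standard 52-card deck (1/13)
p_one = Fraction(1, 6)     # fair six-sided die shows a one

# Independent events: P(A and B and C) = P(A) * P(B) * P(C).
p_all = p_heads * p_ace * p_one

print(p_all)                    # 1/156
print(round(float(p_all), 4))   # 0.0064
```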

## 3.17 Multiplication Rules or AND Rules (contd.)

We will continue the discussion on multiplication rules in this screen. The multiplication rule for non-independent or conditional events, which is also the general multiplication rule, is as follows. If A and B are two events, then the probability of A and B is equal to the product of the probability of A and the probability of B given A. Alternatively, we can say that for any two events, their joint probability is equal to the probability that one of these events occurs multiplied by the conditional probability of the other event given the first event. As an example of this rule, a bag contains 6 golden coins and 4 silver coins. Two coins are drawn without replacement from the bag. What is the probability that both of the coins are silver? Let A be the event that the first coin is silver, and B be the event that the second coin is silver. There are 10 coins in the bag, 4 of which are silver. Therefore, P(A) = 4/10. After the first selection, there are 9 coins in the bag, 3 of which are silver. Therefore, P(B|A) = 3/9. Based on the rule of multiplication, P(A ∩ B) = (4/10) × (3/9). The answer is twelve divided by ninety, which is about 0.1333. Hence, there is about a 13% probability that both the coins are silver.
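The general multiplication rule for the bag of coins can be sketched as follows. The counts come from the example; the variable names are illustrative.

```python
from fractions import Fraction

gold, silver = 6, 4
total = gold + silver

# P(A): the first coin drawn is silver.
p_a = Fraction(silver, total)                   # 4/10

# P(B|A): the second coin is silver, given one silver coin has been removed.
p_b_given_a = Fraction(silver - 1, total - 1)   # 3/9

# General multiplication rule: P(A and B) = P(A) * P(B|A).
p_both_silver = p_a * p_b_given_a

print(p_both_silver)                    # 2/15
print(round(float(p_both_silver), 4))   # 0.1333
```

The exact fraction 2/15 is the reduced form of twelve divided by ninety.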

## 3.18 Permutation And Combination

In this screen, we will look at the definitions and formulae of permutation and combination. Permutation is the total number of ways in which a set, group, or number of things can be arranged. The order matters in permutation, that is, the manner in which the objects or numbers are arranged is taken into account. The formula for permutation is nPr = P(n, r) = n! / (n - r)!, where n is the number of objects and r is the number of objects taken at a time. An unordered selection from a set, group, or number of things is known as a combination. The order does not matter in combination. The formula for combination is nCr = C(n, r) = n! / (r! (n - r)!), where n is the number of objects and r is the number of objects taken at a time. Let us look at an example for calculating permutation and combination in the following screen.

## 3.19 Calculating Permutation and Combination Example

From a group of 10 employees, a company has to select 4 for a particular project. In how many ways can the selection happen under each of the following conditions? First, when the arrangement of employees needs to be different, and second, when the arrangement of employees need not be different. In the given example, the values of n and r are 10 and 4 respectively. Let us consider the first condition: from a group of 10 employees, 4 employees need to be selected, and the arrangement needs to be different. Using the permutation formula, nPr = P(n, r) = n! / (n - r)!, we get 10P4 = P(10, 4) = 10! / (10 - 4)! = 5040. Therefore, the 4 employees can be selected in 5040 ways. Let us now consider the second condition: from a group of 10 employees, 4 employees need to be selected, and the arrangement of employees need not be different. Using the combination formula, nCr = C(n, r) = n! / (r! (n - r)!), we get 10C4 = C(10, 4) = 10! / (4! (10 - 4)!) = 210. Therefore, the 4 employees can be selected from a group of 10 employees in 210 different ways.
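Both counts can be reproduced with Python's standard library. This is a minimal sketch, assuming Python 3.8 or later, where `math.perm` and `math.comb` are available.

```python
import math

n, r = 10, 4  # 10 employees, 4 to be selected

# Order matters: n! / (n - r)!
ordered_selections = math.perm(n, r)

# Order does not matter: n! / (r! * (n - r)!)
unordered_selections = math.comb(n, r)

print(ordered_selections)    # 5040
print(unordered_selections)  # 210

# Each unordered group of 4 can be arranged in 4! = 24 ways.
print(ordered_selections == unordered_selections * math.factorial(r))  # True
```

The last line shows how the two formulas relate: the permutation count is the combination count multiplied by r factorial.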

## 3.20 Types Of Statistics

Let us understand the two types of statistics in this screen. Statistics refers to the science of collection, analysis, interpretation, and presentation of data. In Six Sigma, statistical methods and principles are used to measure and analyze the process performance and improvements. There are two major types of statistics, descriptive statistics and inferential statistics. Descriptive statistics is also known as enumerative statistics and inferential statistics is also known as analytical statistics. Descriptive statistics includes organizing, summarizing, and presenting the data in a meaningful way, whereas inferential statistics includes making inferences and drawing conclusions from the data. Descriptive statistics describes what is going on in the data. The main objective of inferential statistics is to make inferences from the data to more general conditions. Histograms, pie charts, box plots, frequency distributions, and measures of central tendency (mean, median, and mode) are all examples of descriptive statistics. On the other hand, examples of inferential statistics are hypothesis testing, scatter diagrams, etc.

## 3.21 Analytical Statistics

The main objective of statistical inference is to draw conclusions on population characteristics based on the information available in the sample. Collecting data from a population is not always easy, especially if the size of the population is big. The easier way is to collect a sample from the population and, from the sample statistic collected, make an assessment about the population parameter. As an example of statistical inference, the management team of a cricket council wants to know if the team’s performance has improved after recruiting a new coach. The management conducts a test to establish this statistically. Let us consider Ya and Yb, where Ya stands for the efficiency of Coach A and Yb stands for the efficiency of Coach B. To conduct the test, the basic assumption is that Coach A and Coach B are equally effective. This basic assumption, which represents the status quo, is known as the Null Hypothesis. Hence, the null hypothesis (H0) can be given by Ya = Yb. The management team also challenges this basic assumption by assuming the coaches are not equally effective. This is their Alternate Hypothesis, which states that the efficiencies of the two coaches differ. If the null hypothesis is rejected, the alternate hypothesis is accepted. Hence, the alternate hypothesis (H1) can be given by Ya ≠ Yb. These hypothesis statements are used in a Hypothesis Test, which will be discussed in the later part of the course.

## 3.22 Types Of Errors

In this screen, we will learn about the types of errors. When collecting data from a population as a sample and forming a conclusion on the population based on the sample, you run the risk of committing errors. There are two possible errors, Type I error and Type II error. A Type I error occurs when the null hypothesis is rejected when it is, in fact, true. Type I error is also known as Producer’s Risk. The chance of committing a Type I error is known as Alpha, or the Significance Level, and is typically chosen to be 5%. This means the maximum risk you accept of committing a Type I error is 5%. Let us consider the previous example. Arriving at the conclusion that the two coaches differ, for instance that Coach B is better than Coach A, when in fact they are at the same level, is a Type I error. With Alpha at 5%, there is a 5% chance of reaching this wrong conclusion when the coaches are in fact equally effective. A Type II error occurs when the null hypothesis is accepted when it is, in fact, false. In other words, when you reject the Alternate Hypothesis when it is actually true, you commit a Type II error. Type II error is also referred to as Consumer’s Risk. In comparing the two coaches, a Type II error means the coaches were actually different in their efficiencies, but the conclusion was that they are the same. The chance of committing a Type II error is known as Beta, and it is typically limited to a maximum of 20%. In the next screen, we will learn about central limit theorem.
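The meaning of Alpha and Beta can be illustrated with a small simulation of the two-coach comparison. This is only a sketch under stated assumptions, not the course's method: the scores are assumed to be normally distributed with a known standard deviation, the test is a simple two-sided Z test, and the sample size, number of trials, and the 0.5 difference between the coaches are all hypothetical values chosen for illustration.

```python
import random
from statistics import NormalDist, mean

random.seed(42)   # fixed seed so repeated runs give the same figures

ALPHA = 0.05      # significance level from the text
SIGMA = 1.0       # assumed, known standard deviation of each coach's scores
N = 30            # assumed number of observations per coach
TRIALS = 2000     # number of simulated experiments

# Two-sided critical value of Z for the chosen significance level.
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)


def rejects_null(mean_a, mean_b):
    """Run one simulated experiment and report whether H0 (Ya = Yb) is rejected."""
    sample_a = [random.gauss(mean_a, SIGMA) for _ in range(N)]
    sample_b = [random.gauss(mean_b, SIGMA) for _ in range(N)]
    standard_error = SIGMA * (2 / N) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    return abs(z) > Z_CRIT


# Type I error: H0 is true (equal means) but the test rejects it.
type_1_rate = sum(rejects_null(0.0, 0.0) for _ in range(TRIALS)) / TRIALS

# Type II error: H0 is false (means differ by a hypothetical 0.5) but is not rejected.
type_2_rate = sum(not rejects_null(0.0, 0.5) for _ in range(TRIALS)) / TRIALS

print(f"Estimated Type I error rate (alpha): {type_1_rate:.3f}")
print(f"Estimated Type II error rate (beta): {type_2_rate:.3f}")
```

The estimated Type I rate should land close to the chosen 5%. The Type II rate depends on the assumed difference and sample size; with these settings it comes out well above 20%, and a larger sample would reduce it.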

## 3.23 Central Limit Theorem

Central Limit Theorem (CLT) states that when samples of sufficient size are drawn from a population, the distribution of the sample means approaches a normal distribution, whatever the distribution of the population. A sample size greater than 30 is a common rule of thumb. In simple words, the sample means cluster around the population mean in a bell-shaped pattern. For example, if you have sample 1 and its mean is Mean 1, sample 2 and its mean is Mean 2, and so on, the average of Mean 1, Mean 2, etc., taken over all possible samples, is the same as the population mean. In such cases, the Standard Error of Mean, also known as SEM, which represents the variability between the sample means, is small. The SEM is the standard deviation of the sampling distribution of the mean. The formula for SEM is the population standard deviation divided by the square root of the sample size. Selecting a sample size also depends on the concept called Power, also known as Power of the test. We will cover this concept in detail in the later part of the course. Let us look at the graphical representation of the Central Limit Theorem in the following screen.

## 3.24 Central Limit Theorem Graph

The plot of the three numbers two, three, and four looks as shown in the graph. It is interesting to note that the total number of times each digit is chosen is six. When the plot of the sample means of nine samples of size two each is drawn, it looks like the red line which is plotted in the figure. The x-axis shows the values of the sample means, which are two, two point five, three, three point five, and four. On the y-axis, the frequency is plotted. The point at which arrows from numbers 2 and 3 converge is the mean of 2 and 3. Similarly, the point at which arrows from 2 and 4 converge is the mean of the numbers 2 and 4. Let us discuss the concluding points of the central limit theorem in the next screen.
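The nine samples of size two can be listed to see the shape of the plot. The sketch below assumes the samples are drawn with replacement from the numbers 2, 3, and 4, which is what gives nine ordered samples; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from statistics import pstdev

population = [2, 3, 4]
n = 2  # sample size

# All ordered samples of size two drawn with replacement: 3 * 3 = 9 samples.
samples = list(product(population, repeat=n))

# How often each digit appears across all samples.
digit_counts = Counter(digit for sample in samples for digit in sample)

# Mean of each sample and how often each mean occurs.
sample_means = [sum(sample) / n for sample in samples]
frequency = Counter(sample_means)

for value in sorted(frequency):
    print(value, frequency[value])

# The mean of the sample means equals the population mean.
mean_of_means = sum(sample_means) / len(sample_means)
population_mean = sum(population) / len(population)

# SEM from the formula versus the observed spread of the sample means.
sem_from_formula = pstdev(population) / n ** 0.5
sem_observed = pstdev(sample_means)

print(mean_of_means, population_mean)
print(round(sem_from_formula, 4), round(sem_observed, 4))
```

The frequencies rise to a peak at three and fall away symmetrically, which is the start of the bell shape. The two SEM values agree, matching the formula of population standard deviation divided by the square root of the sample size.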

## 3.25 Central Limit Theorem Conclusions

The central limit theorem concludes that the sampling distributions are helpful in dealing with non-normal data. If you take the sample data points from a population and plot the distribution of the means of the sample, you get the sampling distribution of the means. The mean of the sampling distribution, also known as the mean of means, will be equal to the population mean. Also, the sampling distribution approaches normality as the sample size increases. Note that CLT enables you to draw inferences from the sample statistics about the population parameters. This is irrespective of the distribution of the population. CLT also becomes the basis for calculating confidence interval for hypothesis tests, as it allows the use of a standard normal table. Let us proceed to the next topic of this lesson in the following screen.

## 3.26 Topic 3 Statistical Distributions

In this topic, we will cover the concept of statistical distributions. Let us start with discrete probability distribution in the following screen.

## 3.27 Discrete Probability Distribution

Discrete probability distribution is characterized by the probability mass function. It is important to be familiar with discrete distributions while dealing with discrete data. Some of the examples of discrete probability distribution are binomial distribution, Poisson distribution, negative binomial distribution, geometric distribution, and hypergeometric distribution. We will focus only on the two most useful discrete distributions, binomial distribution and Poisson distribution. Like most probability distributions, these distributions use what is known about a population to help predict the behavior of a sample. Let us learn about binomial distribution in the following screen.

## 3.28 Binomial Distribution

Binomial distribution is a probability distribution for discrete data. Built on the work of the Swiss mathematician Jacob Bernoulli, it is an application of population knowledge to predict the sample behavior. Binomial distribution describes the discrete data resulting from a particular process, like the tossing of a coin for a fixed number of times or the success or failure in an interview. A process is known as a Bernoulli process when the process output has only two possible values, like defective or OK, pass or fail, and yes or no. Binomial distribution is used to deal with defective items. A defect is any noncompliance with a specification. A defective is a product or service with at least one defect. Binomial distribution is most suitable when the sample size is less than thirty and less than ten percent of the population. It applies provided the probability of creating a defective item remains the same over the period. Let us look at the equation. The probability of exactly r successes out of a sample size of n is denoted by P(r) and is given by P(r) = nCr × p^r × (1 - p)^(n - r). In the equation, p is the probability of success, r is the number of successes desired, and n is the sample size. To continue discussing the binomial distribution, let us look at some of its key calculations in the following screen.

## 3.29 Binomial Distribution (contd.)

The mean of a binomial distribution is denoted by μ (mu) and is given by n multiplied by p. The standard deviation of a binomial distribution is denoted by σ (sigma), which is equal to the square root of n multiplied by p multiplied by one minus p. A factorial is calculated as follows: factorial of five is the product of five, four, three, two, and one, which is equal to one hundred twenty. Similarly, factorial of four is the product of four, three, two, and one, which is equal to twenty four. Let us look at an example of calculating binomial distribution in the next screen.

## 3.30 Calculating Binomial Distribution Example

Suppose you wish to know the probability of getting heads five times in eight coin tosses. You can use the binomial equation for this. The tossing of a coin has only two outcomes, heads and tails. It means that the probability of each outcome is 0.5 and it remains fixed over a period of time. Additionally, the outcomes are statistically independent. In this case, the probability of success denoted by p is 0.5, the number of successes desired is denoted by r, which is five, and the sample size is denoted by n, which is eight. Therefore, the probability of five heads is P(5) = 8C5 × 0.5^5 × (1 - 0.5)^(8 - 5), where 8C5 is factorial of eight divided by the product of factorial of five and factorial of eight minus five, which is 56. This calculation gives a result of 0.21875, which is about 21.88 percent.
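The coin-toss calculation, together with the mean and standard deviation formulas for the binomial distribution, can be sketched as follows. The helper name `binomial_pmf` is hypothetical and chosen for illustration.

```python
from math import comb, sqrt


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n trials with success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n, p, r = 8, 0.5, 5  # eight tosses, fair coin, five heads desired

prob_five_heads = binomial_pmf(r, n, p)
mean = n * p                                 # mean = n * p
std_dev = sqrt(n * p * (1 - p))              # sigma = sqrt(n * p * (1 - p))

print(prob_five_heads)     # 0.21875
print(mean)                # 4.0
print(round(std_dev, 4))   # 1.4142
```

Carried to full precision the probability is 0.21875, or roughly 21.9 percent.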

## 3.31 Poisson Distribution

Let us learn about Poisson distribution in this screen. Poisson distribution is named after Simeon Denis Poisson and is also used for discrete data. Poisson distribution is an application of the population knowledge to predict the sample behavior. It is generally used for describing the probability distribution of an event with respect to time or space. Some of the characteristics of Poisson distribution are as follows. Poisson distribution describes the discrete data resulting from a process, like the number of calls received by a call center agent or the number of accidents at a signal. Unlike binomial distribution, which deals with binary discrete data, Poisson distribution deals with counts, which can take any non-negative integer value. Poisson distribution can be used for predicting the number of defects as well, given a low defect occurrence rate. Poisson distribution is suitable for analyzing situations wherein the number of trials, similar to the sample size in binomial distribution, is large and tends towards infinity. Additionally, it is used in situations where the probability of success in each trial is very small, almost tending toward zero. This is the reason why Poisson distribution is applicable for predicting the occurrence of rare events like plane crashes, car accidents, etc., and is therefore widely used in the insurance sector. Let us look at the formula for calculating Poisson distribution in the next screen.

## 3.32 Poisson Distribution Formula

The Poisson probability of exactly x occurrences is given by P(x) = (λ^x × e^(-λ)) / x!, that is, lambda to the power of x multiplied by e to the power of minus lambda, whole divided by factorial of x. In this equation, λ (lambda) is the mean number of occurrences during the interval, x is the number of occurrences desired, and e is the base of the natural logarithm, which is approximately equal to 2.71828. The mean of the Poisson distribution is given by lambda and the standard deviation of a Poisson distribution is given by sigma, which is the square root of lambda. Let us look at an example to calculate Poisson distribution in the next screen.

## 3.33 Calculating Poisson Distribution Example

The past records of an accident-prone road junction show a mean of 5 accidents per week at this junction. Assume that the number of accidents follows a Poisson distribution, and calculate the probability of zero, one, and more than two accidents happening in a week. Given the situation, you know that the value of lambda or mean is five. So P(0), i.e., the probability of zero accidents per week, is calculated as five to the power of zero multiplied by e to the power of minus 5, whole divided by factorial of zero. The answer is about 0.0067. Applying the same formula, the probability of one accident per week is about 0.0337 and the probability of two accidents per week is about 0.0842. The probability of more than two accidents per week is one minus the sum of the probabilities of zero, one, and two accidents, which is about 0.875. In other words, the probability is about 87.5%.
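The accident example can be worked through with a short function. This is a minimal sketch; the helper name `poisson_pmf` is hypothetical and chosen for illustration.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean number of occurrences is lam."""
    return lam ** x * exp(-lam) / factorial(x)


lam = 5  # mean number of accidents per week

p0 = poisson_pmf(0, lam)
p1 = poisson_pmf(1, lam)
p2 = poisson_pmf(2, lam)

# More than two accidents is the complement of zero, one, or two accidents.
p_more_than_two = 1 - (p0 + p1 + p2)

print(round(p0, 4))               # 0.0067
print(round(p1, 4))               # 0.0337
print(round(p2, 4))               # 0.0842
print(round(p_more_than_two, 4))  # 0.8753
```

Carrying full precision through the sum gives about 0.875 for more than two accidents. Rounding each term down before adding leaves a smaller sum to subtract and therefore a slightly higher final figure.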

## 3.34 Normal Distribution

Let us learn about normal distribution in this screen. The normal or Gaussian distribution is a continuous probability distribution. The normal distribution is represented as N, and depends on two factors: μ (mu), which stands for the mean, and σ (sigma), which gives the standard deviation of the data points. Normal distribution has a higher frequency of values around the mean and fewer occurrences away from it. It is often used as a first approximation to describe real-valued random variables that tend to cluster around a single mean value. The distribution is bell shaped and symmetrical. The total area under the normal curve, which represents the total probability, is one. Various types of data such as body weight, height, the output of a manufacturing device, etc., follow the normal distribution. Additionally, the tails of the normal distribution are asymptotic to the x-axis, which means they approach the x-axis but never touch it. Let us continue to discuss normal distribution in the following screen.

## 3.35 Normal Distribution (contd.)

In a normal distribution, to standardize comparisons of dispersion across different measurement units like inches, meters, grams, etc., a standard Z variable is used. The uses of the Z value are as follows. The value of Z, or the number of standard deviations, is unique for each probability within the normal distribution, so it helps in finding probabilities of data points anywhere within the distribution. It is dimensionless as well, that is, it has no units such as millimeters, liters, coulombs, etc. We will focus on one commonly used formula, which standardizes a value: Z = (Y - μ) / σ. Here, Z is the number of standard deviations between Y and the mean, Y is the value of the data point in concern, μ is the mean of the population or data points, and σ is the standard deviation of the population or data points. Let us look at an example for calculating normal distribution in the following screen.

## 3.36 Calculating Normal Distribution Example

Suppose the time taken to resolve customer problems follows a normal distribution with a mean of two hundred fifty hours and a standard deviation of twenty three hours. Find the probability of a problem resolution taking more than three hundred hours. In this case, Y equals 300, μ equals 250, and σ equals 23. Applying the formula, Z is equal to three hundred minus two hundred fifty, whole divided by twenty three. The result is about 2.17. When you look at the normal distribution table, the Z value of 2.17 corresponds to an area of about 0.985 under the curve to its left. This means the probability of a problem taking up to three hundred hours to be resolved is about 98.5 percent, and therefore the chance of a problem resolution taking more than three hundred hours is about 1.5 percent.
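The same result can be obtained without a printed table. The sketch below uses `statistics.NormalDist` from Python's standard library (available from Python 3.8) in place of the table lookup; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 250, 23   # mean and standard deviation in hours
y = 300               # resolution time of interest

# Standardize: Z = (Y - mu) / sigma.
z = (y - mu) / sigma

# Area to the left of Y is the probability of taking up to 300 hours.
resolution_time = NormalDist(mu=mu, sigma=sigma)
p_up_to_300 = resolution_time.cdf(y)
p_more_than_300 = 1 - p_up_to_300

print(round(z, 2))                # 2.17
print(round(p_up_to_300, 3))      # 0.985
print(round(p_more_than_300, 3))  # 0.015
```

The figures differ from a table lookup only in the later decimal places, because the table rounds Z to two decimal places before the lookup.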

## 3.37 Z Table Usage

Let us understand the usage of the Z-table in this screen. The graphical representation of Z-table usage is given here. The total area under the curve, and therefore the total probability, is 1. For an actual value, one can calculate the Z score and then look up its probability in the Z-table. As shown, this probability is the area under the curve between 0 and the point "+a". If the mean and standard deviation calculated from the actual data are, for example, 25 and 5 respectively, the data follows a normal distribution with those parameters. If the same data is standardized to a mean of zero and a standard deviation of one, it follows the standard normal distribution. In the next screen, we will take a look at the Z-table.

## 3.38 Z Table

The Z-table gives the probability that Z is between zero and a positive number. There are different forms of normal distribution Z-tables followed globally. The most common form of Z-table, with positive Z scores, is shown here. The value of "a", called the percentage point, is given along the borders of the table (in bold) to 2 decimal places. The values in the main table are the probabilities that Z is between 0 and "+a". Note that the values running down the side of the table are to 1 decimal place, and the numbers along the column headings supply only the 2nd decimal place. Let us look at some examples of how to use a Z-table.

Let us find the value of P(Z < 0). The table is not needed to find the answer once we know that the variable Z takes a value less than (or equal to) zero. First, the area under the curve is 1, and second, the curve is symmetrical about Z = 0. Hence there is a 0.5 or 50% chance of Z being above zero and a 0.5 or 50% chance of Z being below zero.

Let us find the value of P(Z > 1.12). In this case, we need the chance that Z is greater than a number. You can find this by using the following fact. The opposite or complement of an event A is the event not A (that is, the complement of event A occurring is event A not occurring), and its probability is given by P(not A) = 1 - P(A). In other words, P(Z > 1.12) is 1 minus P(Z < 1.12). Using the table, P(Z < 1.12) = 0.5 + P(0 < Z < 1.12) = 0.5 + 0.3686 = 0.8686. Hence P(Z > 1.12) = 1 - 0.8686 = 0.1314. Note that the answer is less than 0.5.

## 3.39 Using Z Table Example

Let us find the value of P(Z lies between 0 and 1.12). In this case, where Z falls within an interval starting at zero, the probability can be read straight off the table: P(Z lies between 0 and 1.12) equals 0.3686.
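The Z-table look-ups for zero and 1.12 can also be reproduced in code. The following is a minimal sketch, assuming the standard library's error function stands in for the printed table; the helper name `normal_cdf` is our own.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

a = 1.12

p_below_zero = normal_cdf(0.0)                 # symmetry: half the area lies below 0
p_between = normal_cdf(a) - normal_cdf(0.0)    # the value printed in the Z-table body
p_above = 1.0 - normal_cdf(a)                  # complement rule: P(not A) = 1 - P(A)

print(f"P(Z < 0)        = {p_below_zero:.4f}")
print(f"P(0 < Z < 1.12) = {p_between:.4f}")
print(f"P(Z > 1.12)     = {p_above:.4f}")
```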

## 3.40 Chi Square Distribution

We will learn about the chi-square distribution in this screen. The chi-square distribution is also known as the chi-squared or χ² distribution. Chi-squared with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. The chi-square distribution is one of the most widely used probability distributions in inferential statistics, particularly in hypothesis testing. When used in hypothesis tests, it needs only one sample for the test to be conducted. Conventionally, for such a test the degrees of freedom are k-1, where k is the sample size. For example, if w, x, y, and z are four independent random variables with standard normal distributions, then the random variable f, which is the sum of w², x², y², and z², has a chi-square distribution. The degrees of freedom of the distribution, df, equal the number of normally distributed variables used. In this case, df is equal to four. Let us look at the formula to calculate chi-square in the following screen.

## 3.41 Chi Square Distribution Formula

The calculated chi-square, or chi-square index, is the sum over all categories of the squared difference between the observed and expected frequencies, divided by the expected frequency: $\chi^2 = \sum (f_o - f_e)^2 / f_e$. Here, $f_o$ stands for an observed frequency and $f_e$ stands for an expected frequency, determined through a contingency table. Let us understand t-distribution in the next screen.
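To see the formula at work, here is a small sketch that sums the squared difference between observed and expected frequencies, divided by the expected frequency, over all categories. The function name and the frequencies are hypothetical values chosen for illustration; they do not come from the lesson.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Hypothetical observed and expected frequencies for three categories.
observed = [10, 20, 30]
expected = [20, 20, 20]

chi_sq = chi_square_statistic(observed, expected)
print(f"Chi-square calculated = {chi_sq}")
```

The calculated value would then be compared with the critical value from a chi-square table for the relevant degrees of freedom.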

## 3.42 T Distribution

The t-distribution method is the most appropriate method to be used in the following situations: when you have a sample size of less than thirty, when the population standard deviation is not known, and when the population is approximately normal. Unlike the normal distribution, a t-distribution is lower at the mean and higher at the tails as seen in the image. T-distribution is used for hypothesis testing. Also, as seen in the image, the t-distribution is symmetrical in shape, but flatter than the normal distribution. As the sample size increases, the t-distribution approaches normality. For every possible sample size or degrees of freedom, there is a different t-distribution. Let us learn about F-distribution in the following screen.

## 3.43 F Distribution

The F-distribution is the distribution of a ratio of two chi-square variables, each divided by its degrees of freedom. A specific F-distribution is denoted by the degrees of freedom for the numerator chi-square and the degrees of freedom for the denominator chi-square. The F-test is performed to check whether the standard deviations or variances of two processes are significantly different. The project teams are usually concerned about reducing the process variance. As per the formula, F calculated equals S1² divided by S2², where S1 and S2 are the standard deviations of the two samples. If F calculated is one, there is no difference in the variance. By convention, the larger variance is placed in the numerator, so if S1 is greater than S2, S1² is the numerator. The degrees of freedom are df1 = n1 - 1 and df2 = n2 - 1, where n1 and n2 are the sample sizes. From the F-distribution table, you can find the critical F value at alpha for the degrees of freedom of the samples of the two processes, df1 and df2. Let us proceed to the next topic of this lesson in the following screen.
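The F calculation can be sketched in a few lines. This example assumes the common convention of placing the larger sample variance in the numerator; the function name and the two small samples are hypothetical.

```python
import statistics

def f_calculated(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a = statistics.variance(sample_a)   # sample variance, divides by n - 1
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical measurements from two processes.
process_1 = [1, 2, 3, 4, 5]
process_2 = [2, 4, 6, 8, 10]

f_value, df1, df2 = f_calculated(process_1, process_2)
print(f"F calculated = {f_value:.2f} with df1 = {df1}, df2 = {df2}")
```

The result would then be compared with the critical F value from the table at the chosen alpha.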

## 3.44 Topic 4 Collecting And Summarizing Data

In this topic, we will discuss collecting and summarizing data in detail. Let us learn about types of data in the following screen.

## 3.45 Types of Data

Data is objective information, which everyone can agree on. It is a collection of facts from which conclusions may be drawn. The two types of data are attribute data and variable data. Attribute or discrete data is data that can be counted and only includes whole numbers such as two, forty, or one thousand fifty. Attribute data is commonly called Pass-Fail or Good-Bad data. It cannot be broken down into a smaller unit meaningfully. It answers questions such as how many, how often, or what type. Some examples of attribute data are the number of defective products, the percentage of defective products, the frequency at which a machine is repaired, or the type of award received. Any data that can be measured on a continuous scale is continuous or variable data. This type of data answers questions such as how long, what volume, or how far. Examples of continuous data include height, weight, time taken to complete a task, temperature, and so on.

## 3.46 Selecting Data Type

Let us understand the importance of selecting the data type in this screen. Deciding the data type facilitates analysis and interpretation. Therefore, the first step in the measure phase is to determine what type of data should be collected. This can be done by considering the following. The first consideration is to identify what is already known. For this, the values already identified for the process are listed. These include Critical to Quality parameters or CTQs, Key Process Output Variables or KPOVs, and Key Process Input Variables or KPIVs. Next, to understand how to proceed with the data gathered, it is necessary to determine the data type that fits the metrics for the key variables identified. The question now arises, why should the data type be identified? This is important as it enables the right set of data to be collected, analyzed, and used to draw inferences. It is not advisable to convert one type of data into another. Converting attribute data to variable data is difficult and requires assumptions to be made about the process. It may also require additional data gathering, including retesting units. Let us look at measurement scales in the following screen.

## 3.47 Measurement Scales

There are four measurement scales arranged in the table in increasing order of their statistical desirability. In the nominal scale, the data consists of only names or categories and there is no possibility of ordering. An example of this type of measurement can be a bag of colored balls, which contains ten green balls, five black balls, eight yellow balls, and nine white balls. This is the least informative of all scales. The most appropriate measure of central tendency for this scale is mode. In the ordinal or ranking scale, data is arranged in order and values can be compared with each other. An example of this scale can be the ratings given to different restaurants, three for A, five for B, two for C, and four for D. The measure of central tendency for this scale is median or mode. The interval scale is used for ranking items in step order along a scale of equidistant points. For example, the temperatures of three metal rods are one hundred degrees, two hundred degrees, and six hundred degrees Fahrenheit respectively. Note that six hundred degrees cannot be described as three times as hot as two hundred degrees, because the zero point of the scale is arbitrary. The measure of central tendency here is mean, median, or mode. Mean is used if the data does not have any outliers. The ratio scale represents variable data and is measured against a known standard or increment. In addition, this scale has an absolute zero, that is, no numbers exist below zero. Examples of the ratio scale are physical measures such as height, weight, and electric charge. Note that negative length is not possible. Again here, you would use mean, median, or mode as the measure of central tendency. In the next screen, we will learn about assuring data accuracy.

## 3.48 Assuring Data Accuracy

To ensure data is accurate, sampling techniques are used. Sampling is the process, act, or technique of selecting an appropriate test group or sample from a larger population. It is preferable to survey 100 people rather than 10,000 people. Sampling saves the time, money, and effort involved in collecting data. The three types of sampling techniques described here are random sampling, sequential sampling, and stratified sampling. Random sampling is the technique where a group of subjects or a sample for study is selected from a larger group or population at random. Sequential sampling is similar to multiple sampling plans except that it can, in theory, continue indefinitely. In other words, it is a non-probability sampling technique wherein the researcher picks a single subject or a group of subjects in a given time interval, conducts the study, analyzes the results, and then picks another group of subjects if needed, and so on. In stratified sampling, the idea is to take samples from sub-groups of a population. This technique gives an accurate estimate of the population parameter.

## 3.49 Simple Random Sampling Vs Stratified Sampling

In this screen, we will compare simple random sampling with stratified sampling. Simple random sampling is easy to do, while stratified sampling takes more time. The possibility of simple random sampling giving erroneous results is higher, while stratified sampling minimizes the chances of error. Simple random sampling does not have the power to show possible causes of variation, while stratified sampling, if done correctly, will show assignable causes. In the next screen, we will look at the check sheet method of collecting data.
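The difference between the two techniques can be illustrated with a short sketch. Everything here is an assumption made for the example: the two strata (a day shift and a night shift), the unit numbers, the 10 percent sampling fraction, and the function names.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Pick n units at random from the whole population."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(strata, fraction, seed=0):
    """Pick the same fraction of units at random from every sub-group."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 units split into two sub-groups.
strata = {
    "day_shift": list(range(0, 60)),
    "night_shift": list(range(60, 100)),
}
population = [unit for members in strata.values() for unit in members]

srs = simple_random_sample(population, 10)
strat = stratified_sample(strata, 0.10)

print("Simple random sample:", sorted(srs))
for name, units in strat.items():
    print(f"Stratified sample from {name}:", sorted(units))
```

The simple random sample may over-represent or under-represent either shift by chance, whereas the stratified sample always draws six units from the day shift and four from the night shift.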

## 3.50 Data Collection Methods Check Sheets

The process of collecting data is expensive. Wrongly collected data leading to wrong analysis and inferences results in resources being wasted. A check sheet is a structured form prepared to collect and analyze data. It is a generic tool that is relatively simple to use and can be adapted for a variety of purposes. Check sheets are used when the data can be observed and collected repeatedly by the same person or at the same location. They are also used while collecting data from a production process. A common example is calculating the number of absentees in a company. The table shows absentee data collected for a week. We will discuss data coding and its advantages in the following screen.
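A check sheet is essentially a tally, so it can be mimicked in code. The following is a minimal sketch with hypothetical absentee observations; the data shown on the screen is not reproduced here.

```python
from collections import Counter

# Hypothetical observations: one entry per absentee, labelled by weekday.
observations = ["Mon", "Mon", "Tue", "Wed", "Wed", "Wed", "Fri"]
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]

tally = Counter(observations)

# Print one row per day: the day, its tally marks, and the count.
for day in days:
    count = tally[day]
    print(f"{day:<5}{'|' * count:<6}{count}")

total = sum(tally.values())
print(f"Total absentees for the week: {total}")
```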

## 3.51 Data Coding

Data coding is a process of converting and condensing raw data into categories and sets so that the data can be used for further analysis. The benefits of data coding are listed here. Data coding simplifies the large quantity of data that is collected from sources. The large amount of data makes analysis and drawing conclusions difficult. It leads to chaos and ambiguity. Data coding simplifies the data by coding it into variables and then categorizing these variables. Raw data cannot be easily entered into computers for analysis. Data coding is used to convert raw data into processed data that can be easily fed into computing systems for calculation and analysis. Coding of data makes it easy to analyze the data. Converted data can either be analyzed directly or fed into computers. The analyst can easily draw conclusions when all the data is categorized and computerized. Data coding also enables organized representation of data. Division of data into categories helps organize large chunks of information, thus making analysis and interpretation easier. Data coding also ensures that data repetition does not occur and duplicate entries are eliminated so that the final result is not affected. In the following screen, we will discuss measures of central tendency of the descriptive statistics in detail.

## 3.52 Descriptive Statistics Measures of Central Tendency

A measure of central tendency is a single value that indicates the central point in a set of data and helps in identifying data trends. The three most commonly used measures of central tendency are mean, median, and mode. Mean, also called arithmetic mean or average, is the most widely used measure of central tendency. It is the sum of all the data values divided by the number of data points. Also known as positional mean, median is the number present in the middle of the data set when the numbers are arranged in ascending or descending order. If the data set has an even number of entries, then the median is the mean of the two middle numbers. The position of the median in the ordered data set can be found by the formula n plus one, divided by 2, where n is the number of entries. Mode, also known as frequency mean, is the value that occurs most frequently in a set of data. Data sets that have two modes are known as bimodal, and those with more than two modes as multimodal.

## 3.53 Mean Median and Mode Example

Let us look at an example for determining mean, median, and mode in this screen. The data set has the numbers one, two, three, four, five, five, six, seven, and eight. As previously defined, mean is the sum of all the data items divided by the number of items. Therefore, the mean is equal to forty one divided by nine, which is equal to four point five six. The number in the middle of the data set is 5, therefore the median is 5. Mode is the most frequently occurring number, which is again five.
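The same three measures can be checked with Python's standard `statistics` module. This is just a quick verification sketch of the example above.

```python
import statistics

data = [1, 2, 3, 4, 5, 5, 6, 7, 8]

mean_value = statistics.mean(data)       # sum of values / number of values
median_value = statistics.median(data)   # middle value of the ordered data
mode_value = statistics.mode(data)       # most frequently occurring value

print(f"Mean   = {mean_value:.2f}")
print(f"Median = {median_value}")
print(f"Mode   = {mode_value}")
```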

## 3.54 Mean Median And Mode Outliers

In this screen, we will understand the effect of outliers on the dataset. Let us consider a minor change to the dataset. One of the two fives is replaced with a new number, 100. On using the same formula to calculate mean, the new mean is 136 divided by 9, which is 15.11. Ideally, 50% of values should lie on either side of the mean value. However, in this example, it can be seen that almost 90% of values lie below the mean value of 15.11, and only one value lies above the mean. The data point 100 is called an outlier. An outlier is an extreme value in the data set that skews the mean value to one side of the dataset. Note that the median remains unchanged at 5. Therefore, mean is not an appropriate measure of central tendency if the data has outliers; median is preferred in this case. In the next screen, we will look at measures of dispersion of the descriptive statistics.
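The effect of the outlier can be reproduced with a few lines of code. This sketch assumes the modified dataset is 1, 2, 3, 4, 5, 6, 7, 8, and 100, which is the dataset that reproduces the mean of 15.11 quoted above.

```python
import statistics

# Assumed modified dataset: it yields the mean of 15.11 quoted in the lesson.
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 100]

mean_value = statistics.mean(with_outlier)
median_value = statistics.median(with_outlier)
below_mean = sum(1 for x in with_outlier if x < mean_value)

print(f"Mean   = {mean_value:.2f}")
print(f"Median = {median_value}")
print(f"Values below the mean: {below_mean} of {len(with_outlier)}")
```

The mean is pulled towards the outlier, while the median stays at 5.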

## 3.55 Descriptive Statistics Measures Of Dispersion

Apart from central tendency, another important parameter to describe a data set is spread or dispersion. Unlike the measures of central tendency such as mean, median, and mode, measures of dispersion express the spread of values. The higher the variation of the data points, the higher the spread of the data. The three main measures of dispersion are range, variance, and standard deviation. We will discuss each of these in the upcoming screens.

## 3.56 Measures Of Dispersion Range

Let us start with the first measure of dispersion, range. The range of a particular set of data is defined as the difference between the largest and the smallest values of the data. In the example, the largest value of the data is nine and the smallest value is one. Therefore the range is nine minus one, which is eight. In calculating range, not all the data points are needed; only the maximum and minimum values are required. Let us understand the next measure of dispersion, variance, in the following screen.

## 3.57 Measures Of Dispersion Variance

The variance, denoted as sigma squared (σ²) or s squared (s²), is defined as the average of the squared differences from the mean and shows the variation in a data set. To calculate the variance for a sample data set of 10 numbers, type the numbers in an Excel sheet. Calculate the variance using the formula =VAR.P or =VAR.S. The VAR.P formula gives the population variance, which is 7.24 for this example. The VAR.S formula gives the sample variance, 8.04. Population variance is calculated when the data set is for an entire population, and sample variance is calculated when data is available only for a sample of the population. Population variance is preferred over sample variance when data for the entire population is available, as the latter is only an estimate. An estimate from a sample leaves a broader range of possible values for the true population parameter, that is, the confidence intervals are wider. Note that variance is expressed in squared units, so it is a measure of variation rather than the variation itself in the units of the data. In the following screen, we will understand the next measure of dispersion, standard deviation.

## 3.58 Measures Of Dispersion Standard Deviation

Standard deviation, denoted by sigma or S, is given by the square root of variance. The statistical notation of this is given on screen. Standard deviation is the most important measure of dispersion. Standard deviation is always relative to the mean. For the same data set, the population standard deviation is 2.69 and the sample standard deviation is 2.83. As with variance, the population standard deviation applies when the data set is measured for every unit in a population, and the sample standard deviation applies otherwise; both can be calculated in Excel using the formula given on the screen. The steps to manually calculate the standard deviation are as follows. First, calculate the mean. Then calculate the difference between each data point and the mean, and square each difference. Next, calculate the sum of the squares. Next, divide the sum of the squares by N for a population or n-1 for a sample to find the variance. Lastly, find the square root of variance, which gives the standard deviation. In the next screen, we will look at frequency distribution of the descriptive statistics.
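The manual steps for variance and standard deviation can be followed line by line in code. The data set below is hypothetical, since the ten numbers used on the screen are not listed here; the results are cross-checked against Python's `statistics` module.

```python
import math
import statistics

# Hypothetical data set (not the one shown on the screen).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean = sum(data) / n                                # step 1: mean
squared_diffs = [(x - mean) ** 2 for x in data]     # step 2: squared differences
sum_of_squares = sum(squared_diffs)                 # step 3: sum of the squares

pop_variance = sum_of_squares / n                   # step 4: divide by N (population)
sample_variance = sum_of_squares / (n - 1)          #         or by n - 1 (sample)

pop_sd = math.sqrt(pop_variance)                    # step 5: square root
sample_sd = math.sqrt(sample_variance)

print(f"Population variance = {pop_variance:.2f}, standard deviation = {pop_sd:.2f}")
print(f"Sample variance     = {sample_variance:.2f}, standard deviation = {sample_sd:.2f}")

# Cross-check against the standard library (Excel's VAR.P / VAR.S equivalents).
print(statistics.pvariance(data), statistics.variance(data))
```

As in the lesson's example, the sample figures are slightly larger than the population figures because the sum of squares is divided by n-1 instead of N.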

## 3.59 Descriptive Statistics Frequency Distribution

Frequency distribution is a method of grouping data into mutually exclusive categories showing the number of observations in each class. An example is presented to demonstrate frequency distribution. A survey was conducted among the residents of a particular area to collect data on cars owned by each home. A total of 20 homes were surveyed. To create a frequency table for the results collected in the survey, the first step is to divide the results into intervals and count the number of results in each interval. For instance, in this example, the intervals would be the number of households with no car, one car, two cars, and so on. Next, a table is created with separate columns for the intervals, the tallied results for each interval, and the number of occurrences or frequency of results in each interval. Each result for a given interval is recorded with a tally mark in the second column. The tally marks for each interval are added and the sum is entered in the frequency column. The frequency table allows viewing distribution of data across a set of values at a glance. In the following screen, we will look at cumulative frequency distribution.

## 3.60 Cumulative Frequency Distribution

A cumulative frequency distribution table is similar to the frequency distribution table, only more detailed. There are additional columns for cumulative frequency, percentage, and cumulative percentage. In the cumulative frequency column, the cumulative frequency of the previous row or rows is added to the frequency of the current row. The percentage is calculated by dividing the frequency by the total number of results and multiplying by 100. The cumulative percentage is calculated similarly to the cumulative frequency. Let us look at an example for cumulative frequency distribution. The ages of all the participants in a chess tournament are recorded. The lowest age is thirty seven and the highest is ninety one. Keeping intervals of ten, the lowest interval starts with the lower limit as thirty five and the upper limit as forty four. Similar intervals are created until an upper limit of ninety four. In the frequency column, the number of times a result appears in a particular interval is recorded. In the cumulative frequency column, the cumulative frequency of the previous row is added to the frequency of the current row. For the first row, the cumulative frequency is the same as the frequency. In the second row, the cumulative frequency is one plus two, which is three, and so on. In the percentage column, the percentage of the frequency is listed by dividing the frequency by the total number of results, which is 10, and multiplying the value by 100. For instance, in the first row, the frequency is one and the number of results is ten. Therefore, the percentage is ten. The final column is the cumulative percentage column. In this column, the cumulative frequency is divided by the total number of results, which is 10, and the value is multiplied by one hundred. In the first row, the cumulative frequency is one and the total number of results is ten. Therefore, the cumulative percentage of the first row is ten. Note that the last number in this column should be equal to one hundred.
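The table-building steps can be sketched in code. The ten ages below are hypothetical: they are chosen only to match the properties stated in the example (lowest age 37, highest 91, one result in the first interval and two in the second), and the actual tournament data is not reproduced here.

```python
# Hypothetical ages consistent with the description of the example.
ages = [37, 45, 52, 55, 58, 63, 66, 72, 78, 91]

# Intervals of ten: 35-44, 45-54, ..., 85-94.
intervals = [(low, low + 9) for low in range(35, 95, 10)]

total = len(ages)
cumulative = 0
rows = []
for low, high in intervals:
    frequency = sum(1 for age in ages if low <= age <= high)
    cumulative += frequency                       # add previous rows to current row
    percentage = 100 * frequency / total
    cumulative_percentage = 100 * cumulative / total
    rows.append((f"{low}-{high}", frequency, cumulative,
                 percentage, cumulative_percentage))

print(f"{'Interval':<10}{'Freq':>6}{'Cum freq':>10}{'%':>8}{'Cum %':>8}")
for label, freq, cum, pct, cum_pct in rows:
    print(f"{label:<10}{freq:>6}{cum:>10}{pct:>8.0f}{cum_pct:>8.0f}")
```

The last row of the cumulative percentage column reaches 100, which is a quick check that every result has been counted exactly once.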

## 3.61 Graphical Methods Stem And Leaf Plots

Let us look at stem and leaf plots, which are one of the graphical methods of understanding distribution. Graphical methods are extremely useful tools to understand how data is distributed. Sometimes, merely by looking at the data distribution, errors in a process can be identified. The stem and leaf method is a convenient method of manually plotting data sets. It is used for presenting data in a graphical format to assist in visualizing the shape of a given distribution. In the example on the screen, the temperatures in Fahrenheit for the month of May are given. To collate this information in a stem and leaf plot, all the tens digits are entered in the Stem column and all the units digits against each tens digit are entered in the Leaf column. To start with, the lowest value is considered. In this case, the lowest temperature is fifty one. In the first row, five is entered in the Stem column and one in the Leaf column. The next lowest temperature is fifty eight. Eight is entered in the Leaf column corresponding to five in the Stem. The next number is fifty nine. All the temperatures falling in the fifties are similarly entered. In the next row, the same process is repeated for temperatures in the 60s. This is continued till all the temperature values are entered in the table. Let us understand another graphical method in the next screen, box and whisker plots.
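A stem and leaf plot is easy to generate in code by splitting each value into its tens digit and units digit. The temperatures below are hypothetical, apart from the three lowest values of 51, 58, and 59 mentioned in the example; the function name is our own.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group the units digits (leaves) under their tens digits (stems)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical temperatures in Fahrenheit.
temperatures = [64, 51, 75, 58, 62, 59, 70, 64, 83, 67, 73, 75, 78, 81]

plot = stem_and_leaf(temperatures)
print("Stem | Leaf")
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:>4} | {leaves}")
```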

## 3.62 Graphical Methods Box And Whisker Plots

A box and whisker graph, based on medians or quartiles, is used to display a data set in a way that allows viewing the distribution of the data points easily. Consider the following example. The lengths of 13 fish caught in a lake were measured and recorded. The data set is given on the screen. The first step to draw a box and whisker plot is to arrange the numbers in increasing order. Next, find the median. As there is an odd number of data entries, the median is the number in the middle of the data set, which, in this case, is 12. The next step is to find the lower median or quartile. This is the median of the lower six numbers. The middle of these numbers is halfway between eight and nine, which is eight point five. Similarly, the upper median or quartile is located for the upper six numbers to the right of the median. The upper median is halfway between the two values, fourteen and fourteen. Therefore, the upper median is fourteen.

Let us now understand how the box and whisker chart is drawn using the values of the median and the upper and lower quartiles. First, a number line is drawn extending far enough to include all the data points. Then, a vertical line is drawn at the median point, 12. The lower and upper quartiles, 8.5 and 14 respectively, are marked with vertical lines and these are joined with the median line to form two boxes, as shown on the screen. Next, two whiskers are extended from either end of the boxes, as shown, to the smallest and largest numbers in the dataset, 5 and 20 respectively. The box and whisker graph is now complete.

The following inferences can be drawn from the box and whisker plot. The lengths of the fish range from 5 to 20. The range is, therefore, 15. The quartiles split the data into four equal parts. In other words, one quarter of the data numbers is less than eight point five, one quarter is between eight point five and twelve, the next quarter is between twelve and fourteen, and the last quarter is greater than fourteen.
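The five values needed to draw the plot can be computed with the median-of-halves method described above. The fish lengths below are hypothetical: they are chosen to reproduce the stated results (smallest 5, lower quartile 8.5, median 12, upper quartile 14, largest 20), since the actual data set is shown only on the screen.

```python
import statistics

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the median itself is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0],
            statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper),
            ordered[-1])

# Hypothetical lengths of 13 fish.
lengths = [12, 5, 14, 9, 20, 7, 13, 8, 11, 14, 10, 17, 13]

minimum, q1, median, q3, maximum = five_number_summary(lengths)
print(f"Min = {minimum}, Q1 = {q1}, Median = {median}, Q3 = {q3}, Max = {maximum}")
print(f"Range = {maximum - minimum}")
```

Note that statistical packages may use slightly different quartile conventions, so their results can differ from the median-of-halves method for some data sets.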

## 3.63 Graphical Methods Scatter Diagrams

In this screen, we will learn about another graphical method, scatter diagrams. A scatter diagram or scatter plot is a tool used to analyze the relationship or correlation between two variables, X and Y, with X as the independent variable and Y as the dependent variable. A scatter diagram is also useful when cause-effect relationships have to be examined, or root causes have to be identified. There are five different types of correlation that can be observed in a scatter diagram. Let us learn about them in the next screen.

## 3.64 Scatter Diagrams Types of Correlation

The five types of correlation are perfect positive correlation, moderate positive correlation, no relation or no correlation, moderate negative correlation, and perfect negative correlation.

In perfect positive correlation, the value of dependent variable Y increases proportionally with any increase in the value of independent variable X. This is said to be 1:1 (one is to one), that is, any change in one variable results in an equal amount of change in the other. The following example is presented to demonstrate perfect positive correlation. The consumption of milk is found to increase proportionally with an increase in the consumption of coffee. The data is presented in the table on the screen. The scatter diagram for the data is also shown. It can be observed from the graph that as X increases, Y also increases proportionally. Hence, the points lie on a straight line.

In moderate positive correlation, as the value of the X variable increases, the value of Y also increases, but not in the same proportion. To demonstrate this, the following example is presented. The increase in savings for an increase in salary is shown in the table. As you can notice in the scatter diagram, the points do not lie on a straight line. Although the value of Y increases with an increase in the value of X, the increase is not proportional.

When a change in one variable has no impact on the other, there is no relation or correlation between them. Let us consider the following example. To study the relation between the number of fresh graduates in the city and the job openings available, data for both was collected over a few months and tabulated as shown. The scatter diagram for the same is also displayed. It can be observed that the data points are scattered and there is no trend emerging from the graph. Therefore, there is no correlation between the number of fresh graduates and the number of job openings in the city.

In moderate negative correlation, an increase in one variable results in a decrease in the other variable. However, this change is not proportional to the change in the first variable. To demonstrate moderate negative correlation, the prices of different products are listed along with the number of units sold for each product. The data is shown in the table. From the scatter diagram shown, it can be observed that the higher the price of a product, the fewer the units of that product sold. However, the decrease in the number of units with increasing price is not proportional.

In perfect negative correlation, an increase in one variable results in a proportional decrease of the other variable. This is also an example of 1:1 correlation. As an example, the effect of an increase in the project time extension on the success of the project is considered. The data is shown in the table. The scatter diagram for the data shows a proportional decrease in the probability of the project's success with each extension of the project time. Hence, the points lie on a straight line. Perfect correlations are rare in the real world. When encountered, they should be investigated and verified.
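The strength and direction of a correlation can also be expressed as a number. The sketch below uses the Pearson correlation coefficient, which this lesson does not cover, as one common way to do so: it is +1 for a perfect positive correlation, -1 for a perfect negative correlation, and lies in between otherwise. All data values and the function name are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)

# Hypothetical data for the independent variable X and three Y variables.
x = [1, 2, 3, 4, 5]
perfect_positive = [2, 4, 6, 8, 10]     # Y rises in exact proportion to X
moderate_positive = [2, 3, 7, 6, 9]     # Y rises with X, but not proportionally
perfect_negative = [10, 8, 6, 4, 2]     # Y falls in exact proportion to X

r_perfect_pos = pearson_r(x, perfect_positive)
r_moderate_pos = pearson_r(x, moderate_positive)
r_perfect_neg = pearson_r(x, perfect_negative)

print(f"Perfect positive:  r = {r_perfect_pos:.2f}")
print(f"Moderate positive: r = {r_moderate_pos:.2f}")
print(f"Perfect negative:  r = {r_perfect_neg:.2f}")
```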

## 3.65 Graphical Methods Histogram

In this screen, we will look at another graphical method, histograms. Histograms are similar to bar graphs, except that the data in histograms is grouped into intervals. They are used to represent interval-wise data graphically. A histogram is best suited for continuous data. The following example illustrates how a histogram is used to represent data. Data on the number of hours spent by a group of 15 people on a special project in one week is collected. This data is then divided into intervals of 2 hours and the frequency table for the data is created. The histogram for the same data is also displayed. Looking at the histogram, it can be observed at a glance that most of the team members spent between 2 and 4 hours on the project. In the following screen, we will look at the next graphical method, normal probability plots.

## 3.66 Graphical Methods Normal Probability Plots

Normal probability plots are used to identify whether a sample has been taken from a normally distributed population. When sample data from a normally distributed population is represented as a normal probability plot, it forms a straight line. The following example is presented to illustrate normal probability plots. A sampling of diameters from a drilling operation is done and the data is recorded. The dataset is given. To create a normal probability plot, the first step is to construct a cumulative frequency distribution table. This is followed by calculating the mean rank probability by dividing the cumulative frequency by the number of samples plus one, and multiplying the answer by 100. The fully populated table for mean rank probability estimation is shown on the screen. In the next step, a graph is plotted on normal probability paper or with Minitab using this data. Minitab is a statistical software package used in Six Sigma. Minitab normal probability plot instructions are also given on the screen. The completed graph is shown on the screen. From the graph, it can be seen that the random sample forms a straight line and therefore the data is taken from a normally distributed population.
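The mean rank probability table can be built in a few lines. The diameters below are hypothetical, because the drilling data is shown only on the screen; the calculation follows the rule stated above, cumulative frequency divided by the number of samples plus one, multiplied by 100.

```python
from collections import Counter

def mean_rank_table(values):
    """Rows of (value, frequency, cumulative frequency, mean rank in percent)."""
    counts = Counter(values)
    n = len(values)
    cumulative = 0
    table = []
    for value in sorted(counts):
        cumulative += counts[value]
        mean_rank = 100.0 * cumulative / (n + 1)
        table.append((value, counts[value], cumulative, mean_rank))
    return table

# Hypothetical hole diameters from a drilling operation.
diameters = [0.249, 0.247, 0.250, 0.248, 0.249, 0.251, 0.248, 0.249, 0.250]

table = mean_rank_table(diameters)
print(f"{'Diameter':<10}{'Freq':>6}{'Cum freq':>10}{'Mean rank %':>13}")
for value, freq, cum, rank in table:
    print(f"{value:<10}{freq:>6}{cum:>10}{rank:>13.1f}")
```

Plotting the diameter against the mean rank percentage on a normal probability scale would then show whether the points fall close to a straight line.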

## 3.67 Topic 5 Measurement System Analysis

Let us proceed to the next topic of this lesson. In this topic, we will discuss Measurement System Analysis or MSA in detail. Let us understand what MSA is in the following screen.

## 3.68 Measurement System Analysis

Throughout the DMAIC process, the output of the Measurement System (MS) is used for metrics, analysis, and control efforts. An error-prone measurement system will only lead to incorrect data, and incorrect data leads to incorrect conclusions. It is therefore important to correct the MS before collecting the data. Measurement System Analysis or MSA is a technique that identifies measurement error or variation and the sources of that error in order to reduce the variation. It evaluates the measuring system to ensure the integrity of data used for analysis. MSA is therefore one of the first activities in the measure phase. The measurement system's capability is calculated, analyzed, and interpreted using Gage Repeatability and Reproducibility to determine measurement correlation, bias, linearity, percent agreement, and precision or tolerance. Let us discuss the objectives of MSA in the next screen.

## 3.69 Measurement System Analysis Objectives

A primary objective of MSA is to obtain information about the type of measurement variation associated with the measurement system. It is also used to establish criteria to accept and release new measuring equipment. MSA also compares one measuring method against another. It helps to form a basis for evaluating a method which is suspected of being deficient. The measurement system variations should be resolved to arrive at the correct baselines for the project objectives. As baselines contain crucial data on which decisions are based, it is extremely important that the measurement system be as free of error as possible. Let us look at measurement analysis in detail in the next screen.

## 3.70 Measurement Analysis

In measurement analysis, the observed value is equal to the sum of the true value and the measurement error. The measurement error can be a negative or a positive value. Measurement error refers to the net effect of all sources of measurement variability that cause an observed value to deviate from the true value. The total, or observed, variability is the sum of the process variability and the measurement variability. Both process variability and measurement variability must be evaluated and improved, but measurement variability should be addressed before looking at process variability. If process variability is corrected before resolving measurement variability, then any improvements to the process cannot be trusted to have taken place, owing to a faulty measurement system. In the following screen, we will identify the types of measurement errors.

## 3.71 Types of Measurement Errors

The two types of measurement errors are measurement system bias and measurement system variation. Measurement system bias involves a calibration study. In the calibration study, the total mean is given by the sum of the process mean and the measurement mean. The statistical notation is shown on the screen. Measurement system variation involves a Gage Repeatability and Reproducibility or GRR study. In the GRR study, the total variance is calculated by adding the process variance and the measurement variance. The statistical notation is shown on the screen.
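The on-screen notation is not reproduced in this transcript. A plausible reconstruction of the two relationships described above, with subscript names chosen here for illustration, is:

$$\mu_{\text{total}} = \mu_{\text{process}} + \mu_{\text{measurement}}$$

$$\sigma^2_{\text{total}} = \sigma^2_{\text{process}} + \sigma^2_{\text{measurement}}$$

Note that it is the variances that add, not the standard deviations. This assumes that the process and the measurement system vary independently of each other.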

## 3.72 Sources Of Variation

In this screen, we will discuss the sources of variation. The chart on the screen lists the different sources of variation. Observed process variation is divided into two, actual process variation and measurement variation. Actual process variation can be divided into long term and short term process variations. In a Gage RR Study, process variation is often called Part variation. Measurement variation can be divided into variations caused by operators and variations due to the gage. The variation due to operators is owing to reproducibility. Variation due to gage is owing to repeatability. Both actual process variation and measurement variation have a common factor, that is, variation within a sample. Let us understand Gage Repeatability and Reproducibility or GRR in the next screen.

## 3.73 Gage Repeatability And Reproducibility

Gage Repeatability and Reproducibility or GRR is a statistical technique to assess if a gage or gaging system will obtain the same reading each time a particular characteristic or parameter is measured. Gage repeatability is the variation in measurement when one operator uses the same gage to measure identical characteristics of the same part repeatedly. Gage reproducibility is the variation in the average of measurements when different operators use the same gage to measure identical characteristics of the same part. The figures on the screen illustrate gage repeatability and reproducibility. In the next screen, we will discuss the components of GRR study.

## 3.74 Components Of Grr Study

The figure on the screen illustrates the difference between gage repeatability and reproducibility. The figure shows the repeatability and reproducibility for six different parts represented by the numbers one to six for two different trial readings by three different operators. As can be observed, a difference in reading for part one, indicated by the color green, by three different operators, is known as reproducibility error. A difference in reading of part four, indicated by red, by the same operator in two different trials, is known as the repeatability error. In the following screen, we will look at some guidelines for Gage Repeatability and Reproducibility studies.

## 3.75 Guidelines For Grr Studies

The following should be kept in mind while carrying out Gage Repeatability and Reproducibility or GRR studies. GRR studies should be performed over the range of expected observations. Care should be taken to use actual equipment for GRR studies. Written procedures and approved practices should be followed as they would be in actual operations. The measurement variability should be represented as it is, not the way it was designed to be. After GRR, the measurement variability is separated into causal components, sorted according to priority, and then targeted for action. In the following screen, let us look at some more concepts associated with GRR.

## 3.76 Other Grr Concepts

Bias is the distance between the sample mean value and the true, or reference, value. It is also called accuracy. Bias is equal to the mean minus the reference value. Process variation is equal to 6 times the standard deviation. The bias percentage is calculated as bias divided by the process variation. The next term is linearity. Linearity refers to the consistency of bias over the range of the gage. Linearity is given by the product of slope and process variation. Precision is the degree of repeatability or closeness of data. The smaller the dispersion in the dataset, the better the precision. The variation in the gage is the sum of the variation due to repeatability and the variation due to reproducibility. In the following screen, we will understand measurement resolution.
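Let us sketch these calculations. The readings, reference value, process standard deviation, and slope below are hypothetical, and expressing the bias percentage as a percentage of the absolute bias is a choice made for this illustration.

```python
from statistics import mean


def bias(measurements, reference):
    """Bias = mean of the measurements minus the reference value."""
    return mean(measurements) - reference


def percent_bias(bias_value, process_sigma):
    """Bias as a percentage of process variation (6 x standard deviation)."""
    process_variation = 6 * process_sigma
    return abs(bias_value) / process_variation * 100


def linearity(slope, process_sigma):
    """Linearity = slope x process variation."""
    return abs(slope) * 6 * process_sigma


# Hypothetical repeated readings of a 5.00 mm reference part.
readings = [5.02, 5.03, 5.01, 5.04, 5.00]
reference = 5.00
process_sigma = 0.05  # assumed process standard deviation

b = bias(readings, reference)
print(f"bias = {b:.3f}, % bias = {percent_bias(b, process_sigma):.1f}%")
print(f"linearity for a slope of 0.1 = {linearity(0.1, process_sigma):.3f}")
```

Here the slope would come from regressing the bias on the reference value across several reference parts that span the range of the gage.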

## 3.77 Measurement Resolution

Measurement resolution is the smallest detectable increment that an instrument will measure or display. The number of increments in the measurement system should extend over the full range for a given parameter. Some examples of wrong gages or incorrect measurement resolution are given here. A truck weighing scale is used for measuring the weight of a tea pack. A caliper capable of measuring differences of 0.1 mm is used to show compliance when the tolerance limits are ±0.07 mm. Thus, only a measurement system that matches the range of the data should be used. An important prerequisite for GRR studies is that the gage has an acceptable resolution. In the next screen, we will look at examples for repeatability and reproducibility.

## 3.78 Repeatability And Reproducibility

Repeatability is also called equipment variation or EV. It occurs when the same technician or operator repeatedly measures the same part or process, under identical conditions, with the same measurement system. The following example illustrates this concept. A 36 km per hour pace mechanism is timed by a single operator over a distance of 100 meters on a stop watch and three readings are taken. Trial 1 takes 9 seconds, trial 2 takes 10 seconds, and trial 3 takes 11 seconds. The process is measured with the same equipment in identical conditions by the same operator. Assuming no operator error, the variation in the three readings is known as Repeatability or Equipment Variation. Reproducibility is also called Appraiser Variation or AV. It occurs when different technicians or operators measure the same part or process, under identical conditions, using the same measurement system. Let us extend the example for repeatability to include data measured by two operators. The readings are displayed on the slide. The difference in the readings of both operators is called Reproducibility or Appraiser Variation. It is important to resolve Equipment Variation before Appraiser Variation. If Appraiser Variation is resolved first, the results will still not be identical due to variation in the equipment itself.

## 3.79 Data Collection In Grr

In this screen, we will learn about data collection in GRR. There are some important considerations for data collection in GRR studies. There are usually three operators and around ten units to measure. General sampling techniques must be used to represent the population and each unit must be measured two to three times by each operator. It is important that the gage be calibrated accurately. It should also be ensured that the gage has an acceptable resolution. Another practice is that the first operator measures all the units in random order. Then this order is maintained by all other operators. All the trials must be repeated. In the next screen, we will discuss the ANOVA method of analyzing GRR studies.

## 3.80 Anova Method Of Analyzing Grr Studies

The ANOVA method is considered to be the best method for analyzing GRR studies, for two reasons. First, ANOVA not only separates equipment and operator variation, but also provides insight into the combined effect of the two. Second, ANOVA uses standard deviation instead of range as a measure of variation and therefore gives a better estimate of the measurement system variation. The one drawback of using ANOVA is the time, resources, and cost it requires. In the next screen, we will understand how MSA can be interpreted.

## 3.81 Interpretation Of Measurement System Analysis

Two results are possible for an MSA. In the first case, the reproducibility error is larger than the repeatability error. This occurs when the operators are not trained or when calibrations on the gage dial are not clear. The other possibility is that the repeatability error is larger than the reproducibility error. This is typically a maintenance issue and can be resolved by calibrating the equipment or performing maintenance on the equipment. It can also indicate that the gage needs to be redesigned to be more rigid and that the location for gaging needs to be improved. It also occurs when there is ambiguity in SOPs. MSA is an experiment which seeks to identify the components of variation in the measurement. In the following screen, we will look at a template used for GRR studies.

## 3.82 Gage Rr Template

A sample GAGE RR Sheet is given on this screen. The operators here are Andrew Murphy and Lucy Wang, who are the appraisers in this study. They have measured and rated the performance of three employees: Ebrahim Glasov, Brianna Scott, and Jason Schmidt. This is a sample template for a GAGE R&R study. The parts are shown across the top of the sheet. In this case, the measurement system is being evaluated using three ‘parts’, the employees, Ebrahim Glasov, Brianna Scott, and Jason Schmidt. The operators measure each part repeatedly. From this data, the average (X) and ranges (R) for each inspector and for each part are calculated. The grand average for each inspector, and each part is also calculated. In this example, a control limit, UCL in the sheet, was compared with the difference in averages of the two inspectors, to identify if there is a significant difference in their measurements. Their difference is 0.111 which is outside the UCL of 0.108 given the R average of 0.042. In the next screen, we will look at the results page for this GRR study.

## 3.83 Gage Rr Results Summary

The sheet on the screen displays the results for the data entered in the template in the previous screen. Please spend some time to go through the data for a better understanding of the concept. In the following screen, we will look at the interpretation to this results page.

## 3.84 Gage Rr Interpretation

The percentage GRR value is highlighted in the center right of the table in the previous screen. There are three important observations to be made here about the Gage RR study. First, this study also shows the interaction between operators and parts. If the percentage GRR value is less than 30, then the Gage is acceptable, and the measurement system does not require any change. If the value is greater than 30, then the Gage needs correction. The Equipment Variation is checked and resolved first, followed by the Appraiser Variation. Second, if EV = 0, it means the MS is reliable, the equipment is perfect and the variation in the gage is contributed by different operators. If the AV is equal to zero, the MS is precise. Third, if EV = 0 and there is AV, the operators have to be trained to ensure all operators follow identical steps during measurement and the AV is minimal. The interaction between operators and parts can also be studied under GRR using Part Variation. The trueness and precision cannot be determined in a GRR if only one gage or measurement method is evaluated as it may have an inherent bias that would go undetected merely by varying operators and parts. Let us proceed to the next topic of this lesson in the following screen.
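Let us sketch how a percentage GRR value could be derived and checked against the cut-off of 30 mentioned above. The lesson does not give the formula, so this uses a common convention as an assumption: EV and AV combine as a root sum of squares, and GRR is expressed as a percentage of total variation. The input values are hypothetical.

```python
from math import sqrt


def percent_grr(ev, av, pv):
    """Percentage GRR from equipment (ev), appraiser (av), and part (pv) variation.

    Assumption: all three are standard deviations and combine as
    root sums of squares, a common convention not taken from the lesson.
    """
    grr = sqrt(ev ** 2 + av ** 2)
    total_variation = sqrt(grr ** 2 + pv ** 2)
    return grr / total_variation * 100


def gage_verdict(pct, threshold=30.0):
    """Apply the lesson's rule: below the threshold the gage is acceptable."""
    return "acceptable" if pct < threshold else "needs correction"


# Hypothetical variation components.
pct = percent_grr(ev=0.03, av=0.04, pv=0.12)
print(f"%GRR = {pct:.1f} -> gage {gage_verdict(pct)}")
```

In this made-up case AV is larger than EV, so operator training would be the first place to look once the equipment variation has been checked.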

## 3.85 Topic 6 Process And Performance Capability

In this topic, we will discuss Process and Performance Capability in detail. In the following screen, we will look at the differences between natural process limits and specification limits.

## 3.86 Natural Process Limits Vs Specification Limits

Natural process limits, or control limits, are derived from the process data and are the voice of the process. The data consists of real-time values from past process performance. Therefore, these values represent the actual process limits and indicate variation in the process. The two control limits are Upper Control Limit (UCL) and Lower Control Limit (LCL). Specification limits are provided by customers based on their requirements, or the voice of customer, and cannot be changed by the organization. These limits act as targets for the organization, and processes are designed around the requirements. The product or service has to meet customer requirements and has to be well within the specification limits. If the product or service does not meet customer requirements, it is considered a defect. Therefore, specification limits are the intended results or requirements from the product or service that are defined by the customer. The two specification limits are upper specification limit or USL and lower specification limit or LSL. The difference between the two is called tolerance. An important point to note is that, for a process, if the control limits lie within the specification limits, the process is said to be capable of meeting customer requirements. Conversely, if specification limits lie within the control limits, the process will not meet customer requirements. In the following screen, we will look at process performance metrics and how they are calculated.
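Let us sketch this comparison. The lesson does not say how the natural process limits are computed, so this assumes the common convention of the mean plus or minus three sample standard deviations. The data and specification limits are hypothetical.

```python
from statistics import mean, stdev


def natural_limits(data):
    """Assumed convention: mean +/- 3 sample standard deviations."""
    centre, sigma = mean(data), stdev(data)
    return centre - 3 * sigma, centre + 3 * sigma


def fits_within_spec(data, lsl, usl):
    """True if both natural process limits lie inside the specification limits."""
    lcl, ucl = natural_limits(data)
    return lsl <= lcl and ucl <= usl


# Hypothetical measurements from past process performance.
data = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]

lcl, ucl = natural_limits(data)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
print("Spec 9.5 to 10.5:", fits_within_spec(data, 9.5, 10.5))
print("Spec 9.8 to 10.2:", fits_within_spec(data, 9.8, 10.2))
```

With the wider specification the process limits fit inside the customer's limits. With the tighter one they do not, so some output would fall outside the customer's requirements.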

## 3.87 Process Performance Metrics

The two major metrics used to measure process performance are Defects per Unit or DPU and Defects per Million Opportunities or DPMO. DPU is calculated by dividing the number of defects by the total number of units. DPMO is calculated by multiplying the defects per opportunity with one million. In the following screen, we will look at an example for calculating process performance.

## 3.88 Calculating Process Performance Example

In this example, the quality control department checks the quality of finished goods by sampling a batch of ten items from the produced lot every hour. The data is collected over twenty four hours. The table displays the data for the number of defects for the sampling period. If items are consistently found to be outside the control limits on any given day, the production process is stopped for the next day. Let us now interpret the results of the sampling. In this example, as the sample size is constant, DPU or p-bar is used to calculate the process capability. The total number of defects is thirty four and the subgroup size is ten. The total number of units is ten multiplied by twenty four, which is two hundred forty. The defects per unit is thirty four divided by two hundred forty, which is zero point one four one seven. Taking each unit as one opportunity, the defects per million opportunities is obtained by multiplying the defects per unit by one million, which is 141,666.67. Therefore, by looking at the DPMO table, it can be said that the process is currently working at approximately two point six sigma or 86.4 percent yield.
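Let us sketch this calculation with the figures from the example: 34 defects in 24 samples of 10 units. Treating each unit as one opportunity and adding a 1.5 sigma shift when converting to a sigma level are assumptions. The shift is the convention described later in this lesson, and the function names are chosen for illustration.

```python
from statistics import NormalDist


def defects_per_unit(defects, units):
    return defects / units


def defects_per_million(defects, units, opportunities_per_unit=1):
    """DPMO = defects per opportunity x 1,000,000."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_level(dpmo, shift=1.5):
    """Convert DPMO to a sigma level, assuming the conventional 1.5 sigma shift."""
    long_term_yield = 1 - dpmo / 1_000_000
    return NormalDist().inv_cdf(long_term_yield) + shift


total_defects = 34
units = 10 * 24  # subgroup size x number of hourly samples

dpu = defects_per_unit(total_defects, units)
dpmo = defects_per_million(total_defects, units)
level = sigma_level(dpmo)

print(f"DPU = {dpu:.4f}, DPMO = {dpmo:,.2f}, sigma level = {level:.1f}")
```

The computed sigma level rounds to 2.6, which agrees with the value read from the DPMO table. The yield quoted from the table belongs to the nearest table row, so it differs slightly from the value implied by the exact DPMO.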

## 3.89 Process Stability Studies

We will learn about process stability studies in this screen. The activities carried out in the measure phase are MSA, collection of data, statistical calculations, and checking for accuracy and validity. This is followed by a test for stability as changes cannot be made to an unstable process. With a set of data believed to be accurate, the process is checked for stability. This is important because, if a process is unstable, no changes can be implemented. Why does a process become unstable? A process can become unstable due to special causes of variation. Multiple special causes of variation lead to instability. A single special cause leads to an out-of-control condition. Run charts in Minitab can be used to check for process stability. Let us look at the steps to plot a Run chart in Minitab in the following screen.

## 3.90 Process Stability Studies Run Charts In Minitab

To plot a Run chart in Minitab, first enter the sample data collected to check for stability. Next, click Stat on the Minitab window followed by Quality Tools. Next, click Run Chart, select the column, and choose the subgroup size as 2. Click OK. The graph shown on the screen is interpreted by looking at the four p-values reported with the chart, for clustering, mixtures, trends, and oscillation. If any of the p-values is less than 0.05, the presence of special causes of variation can be validated. This means there is a good chance that the process will become unstable. In the following screen, we will look at the causes of variation in process stability studies.

## 3.91 Process Stability Studies Causes of Variation

Variation can be due to two types of causes: common causes of variation and special causes of variation. Common causes of variation are the many sources of variation within a process which have a stable and repeatable distribution over a period. They contribute to a state of statistical control where the output is predictable. Some other factors which do not always act on the process can also cause variation. These are special causes of variation. These are external to the process and are irregular in nature. When present, the process distribution changes and the process output is not stable over a period. Special causes may result in defects, and need to be eliminated to bring the process under control. Run charts indicate the presence of special causes of variation in the process. If special causes are detected, the process has to be brought to a stop and a root cause analysis has to be carried out. If the root cause analysis reveals the special cause to be undesirable, corrective actions are taken to remove the special cause.

## 3.92 Verifying Process Stability And Normality

We will learn about verifying process stability and normality in this screen. Based on the type of variation a process exhibits, it can be verified if the process is in control. If there are special causes of variation, the process output is not stable over time. The process cannot be said to be in control. Conversely, if there are only common causes of variation in a process, the output forms a distribution that is stable and predictable over time. A process being in control means the process does not have any special causes of variation. Once a process is understood to be stable, the control chart data can be used to calculate the process capability indices. In the following screen, we will discuss process capability studies.

## 3.93 Process Capability Studies

Process capability is the actual variation of the process measured against its specification. To carry out a process capability study, first, plan for data collection. Next, collect the data. Finally, plot and analyze the results. Obtaining the appropriate sampling plan for the process capability study depends on the purpose and whether there are any customer or standard requirements for the study. For new processes, or a project proposal, the process capability can be estimated by a pilot run. Let us look at the objectives of process capability studies in the next screen.

## 3.94 Objectives Of Process Capability Studies

The objectives of a process capability study are to establish a state of control over a manufacturing process and then maintain the state of control over a period of time. On comparing the natural process limits or the control limits with the specification limits, any of the following outcomes is possible. First, the process limits are found to fall between the specification limits. This shows that the process is running well and no action is required. The second possibility is that the process spread and the specification spread are approximately the same. In this case, the process is adjusted so that it is centered, which brings the batch of products within the specifications. The third possibility is that the process limits fall outside the specification limits. In this case, reduce the variability by partitioning the pieces or batches to locate and target the largest offender. A designed experiment can be used to identify the primary source of variation. In the following screen, we will learn about identifying characteristics in process capability.

## 3.95 Identifying Characteristics

Process capability deals with the ability of the process to meet customer requirements. Therefore, it is crucial that the characteristic selected for a process capability study indicates a key factor in the quality of the product or process. Also, it should be possible to influence the value of the characteristic by adjusting a process. The operating conditions that affect the characteristic should also be defined and controlled. Apart from these requirements, other factors determining the characteristics to be measured are customer purchase order requirements or industry standards. In the following screen, we will look at identifying specifications or tolerances in process capability.

## 3.96 Identifying Specifications Or Tolerances

The process specification or tolerances are defined either by industry standards based on customer requirements, or by the organization’s engineering department in consultation with the customer. A stability study followed by a comprehensive capability study also helps in identifying if the process mean meets the target or the customer mean. The process capability study indicates whether the process is capable. It is used to determine if the output consistently meets specifications and the probability of a defect or defective. This information is used to evaluate and improve the process to meet the tolerance requirements. In the following screen, we will learn about process performance indices.

## 3.97 Process Performance Indices

Process performance is defined as a statistical measurement of the outcome of a process characteristic which may or may not have been demonstrated to be in a state of statistical control. In other words, it is an estimate of the process capability of a process during its initial set-up, before it has been brought into a state of statistical control. It differs from the process capability in that, for process performance, a state of statistical control is not required. The three basic process performance indices are Process Performance or Pp, Process Performance Index or Ppk, and Process Capability Index, denoted as Ppm or Cpm. Pp stands for process performance. It is computed by subtracting the lower specification limit from the upper specification limit, the whole divided by natural process variation, or six sigma. Ppk is the process performance index and is the minimum of the values of the upper and lower process capability indices. The upper and lower process capability indices are calculated as shown on the screen. Ppu or upper process capability index is given by the formula USL minus x, divided by 3s. Ppl or lower process capability index is given by x minus LSL, divided by 3s. Here x is the process average, better known as X-bar, and s is the sample standard deviation. Cpm denotes the process capability index mean, which accounts for the location of the process average relative to a target value. It can be calculated as shown on the screen. Here, mu stands for the process average, the sigma symbol denotes the process standard deviation, USL is the upper specification limit, and LSL is the lower specification limit. T is the target value, which is typically the center of the tolerance, x-i is the sample reading, and n is the number of sample readings.
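Let us sketch these indices. Pp, Ppu, and Ppl follow the formulas described above. The on-screen Cpm formula is not reproduced in this transcript, so the form used here, which measures spread around the target T using the sample readings, is an assumption based on a common textbook definition. The readings and limits are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pp(usl, lsl, s):
    """Pp = (USL - LSL) / 6s."""
    return (usl - lsl) / (6 * s)


def ppk(usl, lsl, xbar, s):
    """Ppk = min(Ppu, Ppl) with Ppu = (USL - xbar)/3s and Ppl = (xbar - LSL)/3s."""
    ppu = (usl - xbar) / (3 * s)
    ppl = (xbar - lsl) / (3 * s)
    return min(ppu, ppl)


def cpm(usl, lsl, readings, target):
    """Assumed form: (USL - LSL) / (6 * sqrt(sum((x_i - T)^2) / (n - 1)))."""
    n = len(readings)
    spread_about_target = sqrt(sum((x - target) ** 2 for x in readings) / (n - 1))
    return (usl - lsl) / (6 * spread_about_target)


# Hypothetical sample readings and limits; the target is the centre of the tolerance.
readings = [10.0, 10.2, 10.4, 10.2, 10.0, 10.4, 10.2]
LSL, USL, TARGET = 9.4, 10.6, 10.0

xbar, s = mean(readings), stdev(readings)
pp_value = pp(USL, LSL, s)
ppk_value = ppk(USL, LSL, xbar, s)
cpm_value = cpm(USL, LSL, readings, TARGET)

print(f"Pp = {pp_value:.3f}, Ppk = {ppk_value:.3f}, Cpm = {cpm_value:.3f}")
```

In this made-up case the process average sits above the target, so both Ppk and Cpm come out lower than Pp.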

## 3.98 Key Terms In Process Capability

We will look at the key terms in Process Capability in this screen. ZST, or short term capability, is the potential performance of the process in control at any given point of time. It is based on the sample collected in the short term. The long term performance is denoted by ZLT. It is the actual performance of the process over a given period of time. Subgroups are several small samples collected consecutively. Each sample forms a subgroup. The subgroups are chosen so that the data points are likely to be similar within the subgroup, but different between two subgroups. The process shift is calculated by subtracting the long term capability from the short term capability. The process shift also reflects how well a process is controlled. It is usually a factor of one point five. Let us look at short term and long term process capability in the next screen.

## 3.99 Short Term And Long Term Process Capability

The concept of short term and long term process shift is explained graphically on this screen. There are three different samples taken at time one, time two, and time three. The smaller waveforms represent the short term capability, and they are joined with their means to show the shift in long term performance. The long term performance curve is shown below, with the target value marked in the center. It is important to note that over a period of time or subgroups, a typical process will shift by approximately one point five times the standard deviation. Also, long term variation is more than short term variation. This difference is known as the sigma shift and is an indicator of the process control. The reasons for a process shift include changes in operators, raw material used, wear and tear, and time periods. We will discuss the assumptions and conventions of process variations in the following screen.

## 3.100 Assumptions and Conventions Process Variations

Long term variation is always greater than short term variation. Short term variations are due to common causes. The variance is inherent in the process and is known as the natural variation. Short term variations show variation within a subgroup, and are therefore called within subgroup variation. They are usually based on a small number of samples collected at short intervals. In short term variation, the variation due to common causes is captured. However, common causes are difficult to identify and correct. The process may have to be redesigned to remove common causes of variation. Long term variations are due to common as well as special causes. The added variation or abnormal variation is due to factors external to the usual process. Long term variation is also known as the overall variation and is the sample standard deviation for all the samples put together. Long term variation shows variations within the subgroup and between subgroups. Special causes that increase variation include changes in operators, raw material, and wear and tear. The special causes need to be identified and corrected for process improvement.

## 3.101 Stability Capability Spread And Defects Summary

This screen explains how stability, capability, spread, and defects are used together to interpret the process condition. The table gives the process condition for different levels or types of variation with reference to common causes and special causes. The table is read as follows. In the first scenario, the process has low common causes of variation or CCV and no special causes of variation or SCV. In this case, the variability is low, the capability is high, the possibility of defects is low, and the process is said to be capable and in control. Next, if the process has low CCV, and some SCV are present, then it has high variability, low capability, and a high possibility of defects. The process is said to be out of control and incapable. The third possibility is that the process has high CCV and no SCV. In this case, the variability is moderate to high, the capability is very low, and the possibility of defects is very high. Although the process is in control, it is incapable. Finally, at the other extreme is the situation where the process has high CCV and SCV is also present. Here, the process has high variability, low capability, a high possibility of defects, and is out of control and incapable. This table is a quick reference to understand process conditions. In the next screen, we will compare the Cpk and Cp values.

## 3.102 Comparing Cpk And Cp

When Cpk and Cp values are compared, three outcomes are possible. When Cpk is less than Cp, it can be inferred that the mean is not centered. When Cpk is equal to Cp, the inference is that the process mean is centered between the specification limits. The process is considered capable if Cpk is greater than 1. This will happen only if variations are in control. Cpk can never be greater than Cp. If this situation occurs, the calculations have to be rechecked. We will look at an example problem for calculating process variation in the following screen.

## 3.103 Process Variation Example

The table on this screen shows data for customer complaint resolution time over a period of three weeks. Each week's data forms a subgroup. For example, the resolution time is forty eight hours for a particular case in week one. In week two, the case takes up fifty hours and in week three it takes about forty nine hours. The subgroup size is ten. Let us understand how the long term and short term standard deviations are calculated for this data. The average for each week is calculated by dividing the sum of the resolution times in that week by the subgroup size. A grand average is also calculated for all the three weeks. The variations within subgroups and between subgroups for each week are calculated. This is followed by calculating the total variations within and between subgroups. Overall variation is given by the sum of total variation within subgroups and total variation between subgroups. Finally, the standard deviations for the short term and the long term are calculated using the formulae given on the screen. The results for the process variation calculations are as follows: The grand average for all three weeks is forty seven point five. The total variation within subgroups is one thousand twenty three point eight. The total variation between subgroups is one sixty one point six seven. Both these variations are added to give the overall variation of one thousand one hundred and eighty five point five. The short term standard deviation is 6.2 and the long term standard deviation is 6.4. Note that the overall variation can also be calculated with the usual sample variance formula.
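Let us sketch how the within and between subgroup variations could be computed. The formulae on the screen are not reproduced in this transcript, so the divisors used here (n minus the number of subgroups for the short term, n minus 1 for the long term) are assumptions. They do reproduce the lesson's rounded results of 6.2 and 6.4 from its totals. The weekly data below is hypothetical and smaller than the lesson's table.

```python
from math import sqrt
from statistics import mean


def variation_components(subgroups):
    """Return grand mean, within SS, between SS, short-term and long-term sigma."""
    all_values = [x for group in subgroups for x in group]
    grand = mean(all_values)
    group_means = [mean(group) for group in subgroups]

    # Variation of each reading about its own subgroup average.
    ss_within = sum((x - m) ** 2
                    for group, m in zip(subgroups, group_means) for x in group)
    # Variation of subgroup averages about the grand average.
    ss_between = sum(len(group) * (m - grand) ** 2
                     for group, m in zip(subgroups, group_means))

    n, k = len(all_values), len(subgroups)
    sigma_short = sqrt(ss_within / (n - k))                # assumed divisor
    sigma_long = sqrt((ss_within + ss_between) / (n - 1))  # usual sample variance
    return grand, ss_within, ss_between, sigma_short, sigma_long


# Hypothetical resolution times in hours, one list per week.
weeks = [[48, 44, 52, 46], [50, 47, 53, 49], [49, 45, 51, 47]]
grand, ss_w, ss_b, sigma_st, sigma_lt = variation_components(weeks)
print(f"grand average = {grand:.2f}, within = {ss_w:.2f}, between = {ss_b:.2f}")
print(f"short term sigma = {sigma_st:.2f}, long term sigma = {sigma_lt:.2f}")

# The lesson's own totals: 30 readings in 3 subgroups.
sigma_st_lesson = sqrt(1023.8 / (30 - 3))
sigma_lt_lesson = sqrt((1023.8 + 161.67) / (30 - 1))
print(f"lesson figures: {sigma_st_lesson:.1f} and {sigma_lt_lesson:.1f}")
```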

## 3.104 Effect Of Mean Shift

Let us discuss the effect of mean shift on the process capability in this screen. The table given here shows the defect levels at different sigma multiple values and different mean shifts. From the table, it can be seen that when the mean is centered within the specification limits and the process capability is 1, that is, ±3s fits within the specification limits, the DPMO is 2700 and the probability of a good result is 99.73 percent. If the mean shifts by 1.5 sigma, then a tail moves outside the specification limit to a greater extent. Now the DPMO increases to over 66,000, which is roughly 25 times the original defect level. If the process has a process capability of 2, that is, ±6s fits within the specification limits, and the mean shifts by 1.5 sigma, then the probability of a good result is 99.99966 percent. This is comparable to a process with a capability of 1.5, that is, ±4.5s fitting within the specification limits, and no shift in the mean. The long term and short term capability table shows the variations in capabilities. For the purposes of Six Sigma, the assumption is that the long term variability will have a 1.5s difference from the short term variability. As seen in Statistical Process Control, this assumption can be challenged if control charts are used and these kinds of shifts are detected quickly. In the chart, it can be seen that the effect of the mean shift becomes negligible as the process capability increases. Therefore, for a six sigma process, the long term variation does not have much effect. In the next screen, we will look at key concepts in process capability for attribute data.
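The defect levels discussed above can be approximated with a small sketch. It assumes the process output is normally distributed and counts the area in both tails outside the specification limits; the function names `normal_cdf` and `shifted_dpmo` are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def shifted_dpmo(sigma_multiple, mean_shift=0.0):
    """DPMO when the specification limits sit at +/- sigma_multiple
    standard deviations and the mean has moved by mean_shift standard
    deviations toward the upper limit."""
    upper_tail = normal_cdf(-(sigma_multiple - mean_shift))
    lower_tail = normal_cdf(-(sigma_multiple + mean_shift))
    return (upper_tail + lower_tail) * 1_000_000


for multiple, shift in [(3, 0.0), (3, 1.5), (4.5, 0.0), (6, 1.5)]:
    print(f"+/-{multiple}s, shift {shift}s: {shifted_dpmo(multiple, shift):.1f} DPMO")
```

Under these assumptions the centered ±3s case gives about 2700 DPMO, the 1.5 sigma shift raises it to roughly 66,800, and the shifted ±6s case gives about 3.4 DPMO. Because this model counts both tails, the centered ±4.5s case comes out at about 6.8 DPMO, that is, about 3.4 in each tail.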

## 3.105 Process Capability For Attribute Data Key Concepts

The customary procedure for defining process capability for attribute data is to define the mean rate of non-conformity. Defects and defectives are examples of non-conformity. Defects per million opportunities, or DPMO, is the measure of process capability for attribute data. For this, the mean and the standard deviation for attribute data have to be defined. For defectives, p-bar is used for checking process capability for both constant and variable sample sizes. For defects, c-bar and u-bar are used for constant and variable sample sizes respectively. For attribute data, p-bar, c-bar, and u-bar are the equivalents of the standard deviation, denoted by sigma, that is used for continuous data.
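As a rough illustration of these measures, the sketch below computes DPMO, p-bar, c-bar, and u-bar using their usual definitions. All function names and inspection figures are hypothetical and are not taken from the lesson.

```python
def attribute_dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects * 1_000_000 / (units * opportunities_per_unit)


def p_bar(defectives, sample_sizes):
    """Average proportion of defective units across all samples."""
    return sum(defectives) / sum(sample_sizes)


def c_bar(defect_counts):
    """Average number of defects per sample (constant sample size)."""
    return sum(defect_counts) / len(defect_counts)


def u_bar(defect_counts, sample_sizes):
    """Average number of defects per unit (variable sample size)."""
    return sum(defect_counts) / sum(sample_sizes)


# Made-up inspection results for illustration only.
print(attribute_dpmo(defects=25, units=500, opportunities_per_unit=10))
print(p_bar(defectives=[3, 5, 4], sample_sizes=[100, 120, 80]))
print(c_bar(defect_counts=[4, 6, 5]))
print(u_bar(defect_counts=[4, 6, 5], sample_sizes=[10, 20, 20]))
```

Note that p-bar and u-bar divide by the total number of units inspected, so they remain valid when the sample size changes, while c-bar divides by the number of samples and therefore assumes a constant sample size.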

## 3.106 Quiz

Following is the quiz section to check your understanding of the lesson.

## 3.107 Summary

Let us summarize what we have learned in this lesson. Process modeling refers to the visualization of a proposed system, layout, or other change in the process and can determine the effectiveness of a new design or process. Probability refers to the chance of an event occurring. The three properties of probability are: (1) the probability of an event is always between zero and one, both inclusive, (2) the probability of an event that cannot occur is zero, and (3) the probability of an event that must occur is one. Statistics refers to the science of collection, analysis, interpretation, and presentation of data. The major types are Descriptive Statistics and Inferential Statistics. We will continue reading the summary in the next screen.

## 3.108 Summary (contd.)

Measures of central tendency, dispersion, and graphical methods are used to analyze sample data. MSA is used to calculate, analyze, and interpret a measurement system's capability using Gage Repeatability and Reproducibility. Variation in a process can arise from common causes and special causes, which determine the stability, capability, distribution, and defects of a process.

## 3.109 Thank You

With this, we have come to the end of this lesson. The next lesson will cover the concepts of the Analyze phase.
