Measure Phase: Lean Six Sigma Application in Information Technology Tutorial

4.1 Lesson 04 Measure

Hello and welcome to the third lesson of Lean Six Sigma Application in Information Technology offered by Simplilearn. In this lesson we will look at applying Measure phase tools and concepts to IT. Let us explore the objectives of this lesson in the next screen.

4.2 Objectives

After completing this lesson, you will be able to:
Identify and interpret basic statistics
Identify common metrics tracked in IT and common data sources available
Determine the accuracy of data being used for analysis
Evaluate the capability of processes within IT
In the next screen we will discuss the goal and tools common to the measure phase of a DMAIC project.

4.3 Measure Phase Review

The purpose of the measure phase in a lean six sigma project is to understand and document the current state, understand how the process is performing relative to customer expectations, identify the data needed, and ensure the accuracy of the data to be leveraged in the analyze phase of the project. The screen provides a detailed list of common measure phase deliverables. (Read each box) In this lesson, we will be focusing on several of the measure concepts and learning how to apply them to IT processes. Specifically, we will be focusing on basic data analysis, measurement systems analysis, and process capability. We will begin with the first topic of the lesson in the next screen.

4.4 Measure Topic 1 Basic Statistics Review

The first topic in this lesson is Basic Statistics Review. In the next screen, we will begin our discussion on basic statistics.

4.5 Importance of Statistics

Statistics plays an important role in today’s world. The proliferation of technology and the internet produces vast amounts of data, most of which is of little or no value in its raw form. Statistics is a method by which the raw data stored in a variety of systems can be translated into useful, actionable knowledge. The ability to understand and apply even the simplest statistics can have a powerful impact on an organization. While many people are intimidated by the thought of statistics, the fact is that it is present in all aspects of our daily routines. Sports statistics and opinion polls are just two examples of statistics that are used and discussed in everyday life. Next, let us discuss the purpose of statistics in an organization.

4.6 Purpose of Statistics

Statistics play an important role in describing process performance. Simple statistics such as mean and median provide valuable information on process performance. In addition to basic statistics, complex statistical models and tools can be used to identify critical inputs and build predictive models of process behavior. The use of statistics also allows for rational, fact-based decision making. Too often, people rely on “gut instincts” or experience alone to guide decision making. When used correctly, statistics provide an impartial tool to ensure sound decision making. As referenced in the previous slide, statistics provide a method to translate meaningless data points into knowledge that can be used to drive efficiency and enhanced performance results. In the next screen, we will discuss common statistics used in Lean Six Sigma Green Belt projects.

4.7 Basic Statistics in a Lean Six Sigma Project

Listed here are a few examples of simple, basic statistics commonly used in the measure phase of a lean six sigma project. We will cover many of these in greater depth in future slides.
Range: the spread between the minimum and maximum value of a data set
Probabilities: the likelihood that an event will occur
Percentiles: specific waypoints in the data set; generally the 25th, 50th, and 75th percentiles are calculated
Sums: a simple sum of the data set
Counts: a simple count of occurrences
Percent defective: the percentage of outcomes that do not meet specification
Median and Mean: common statistical measures of location
Variation: a common statistical measure of the spread of the data
Let us now discuss the concept of population and sample statistics.

4.8 Population vs. Sample

When talking about statistics it is important to understand two concepts, population and sample, which give rise to the two types of statistics. The population consists of every data point or object that shares a characteristic or attribute in common. An example of a population would be all registered voters in the United States of America; the common characteristic they share is the ability to cast a vote during an election. When statistics are calculated using the full population data, they are called descriptive statistics. They are referred to in this way because they describe the attributes of the entire population. Keeping with our example, as of 2012, there were approximately 146.3 million registered voters in the US. It would be very expensive and time consuming to gather data from every single one of them and produce descriptive statistics. This is where sampling or sample statistics come into play. Instead of trying to collect data on all 146 million voters, it is possible to collect data on a representative sample of the registered voters, then use sample statistics to draw inferences about the overall population based on what was observed in the sample. This is where the name inferential statistics comes from. We will discuss the different types of data in the next screen.

4.9 Data Types

Different processes produce different data. It is important to understand what type of data is available, as this will determine the types of tools and analyses that can be used in the measure and analyze phases of a project. There are two main types of data, which are defined below:
Continuous or Variable: Continuous data can be thought of as measurement data. Examples of continuous data include height, weight, time, temperature, and length.
Attribute or Discrete: Discrete data can be thought of as descriptive or count data. Examples include Pass/Fail, Good/Bad, and Low/Medium/High. Discrete data is often shown in the form of a proportion or percentage. Care should be taken when working with survey data, as its ratings can be averaged and are easily mistaken for continuous data.
Continuous data is common in manufacturing and operational types of processes, while attribute data is more often found in transactional processes. We will look into some IT specific examples in the next screen.

4.10 Data Types (contd.)

IT processes create both continuous and discrete data. Listed below are a few examples:
Continuous/Variable: processing speed, call answer time, time to fix, project cycle time, time to close defects
Discrete/Attribute: defects in production, % of SLA achieved, help desk customer satisfaction, system uptime %, defect closure rates
The next screen will cover commonly used tools for continuous data.

4.11 Continuous Data Tools

Graphing and analysis can be done in a number of ways when continuous data is used. The table provides some examples of commonly used statistics, graphs, and analysis tools in lean six sigma projects.
Statistics: Mean/Median/Mode, Standard Deviation/Variance, Confidence Interval, Normality
Graphs: Histogram, Dot Plot, Scatter Plot
Analysis: Simple Linear Regression, Multiple Regression, T test, Design of Experiment, X-bar and R control chart, I-MR control chart
Next we will review some commonly used tools for discrete data.

4.12 Discrete Data Tools

The choices for discrete data are not as comprehensive as those for continuous data; however, there are tools that can be used to analyze and draw conclusions from discrete data. Some of them are:
Basic Statistics: % defective, Defects per Million Opportunities (DPMO), Confidence Interval
Graphs: Pareto Chart, Pie Chart, Bar Chart, Main Effects Plot
Analysis: Logistic Regression, Chi Squared, % defective, Proportion testing, P control chart, U control chart
The next screen will review the purpose of basic statistics.
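As an illustration of the first two basic statistics for discrete data, here is a minimal Python sketch. The ticket counts and the number of opportunities per ticket are hypothetical values chosen for the example, and the function names are ours. DPMO is computed with its standard definition: defects divided by total opportunities, multiplied by one million.

```python
def percent_defective(defective_units: int, total_units: int) -> float:
    """Share of units with at least one defect, as a percentage."""
    return 100.0 * defective_units / total_units


def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    return 1_000_000 * defects / (units * opportunities_per_unit)


# Hypothetical help desk data: 500 tickets, each with 4 fields that could be wrong.
tickets = 500
opportunities_per_ticket = 4
defects_found = 30          # total defects across all tickets
tickets_with_defect = 25    # tickets containing at least one defect

print(percent_defective(tickets_with_defect, tickets))        # 5.0
print(dpmo(defects_found, tickets, opportunities_per_ticket))  # 15000.0
```

Note that percent defective counts units, while DPMO counts individual defects against every opportunity for a defect, so the two can move differently when a single unit carries several defects.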

4.13 Basic Statistics

Basic statistics are easy to calculate and understand. Though simple, they form the foundation for the more complex statistical tools used to understand performance and identify root cause. Understanding the basic characteristics of the data produced by a process provides vital information needed to improve and maximize process performance. Basic statistics consist of measures of location (the mean, median, and mode) and measures of variation (the range, standard deviation, and variance). We will begin our discussion of measures of location in the next screen.

4.14 Measures of Location

There are 3 measures of location: mean, median, and mode. These measures of location are also sometimes referred to as measures of central tendency. The first measure of location is the mean, which is the calculated average of a data set. It is calculated by simple division in which the numerator is the sum of the entire data set and the denominator is the total count of data points. The equation for mean is located in the appendix at the end of the lesson. Care should be taken when using the mean, as it is very much influenced by extreme outliers. In a data set with a large standard deviation driven by outliers, a more representative statistic is the median. The median represents the midpoint of a data set. If the data set has an odd number of data points, the midpoint is the exact middle value. When the data set consists of an even number of data points, the two middle values of the data set are averaged. The median is the most common measure of location used in polling and scientific analysis, as it is not as dramatically influenced by extreme outliers as the mean. The final measure of location is the mode. The mode is simply the value that occurs most frequently in a data set. Mean and median are often used to report performance against goal. While this is useful information, it is important to remember that they are both simple calculated statistics. The customer rarely experiences the mean, but does experience the range of values that make up the process spread. We will now move to a discussion on measures of process spread.
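To make the three measures of location concrete, here is a small Python sketch using the standard library `statistics` module. The time-to-fix values are hypothetical; the last one is deliberately an extreme outlier, to show how it pulls the mean while leaving the median almost unchanged.

```python
import statistics

# Hypothetical time-to-fix values in hours; the last ticket is an extreme outlier.
time_to_fix = [2, 3, 3, 4, 5, 6, 40]

mean_value = statistics.mean(time_to_fix)      # sum / count = 63 / 7 = 9
median_value = statistics.median(time_to_fix)  # middle value of the sorted data = 4
mode_value = statistics.mode(time_to_fix)      # most frequent value = 3

# Dropping the outlier leaves an even number of points, so the median
# becomes the average of the two middle values (3 and 4).
without_outlier = time_to_fix[:-1]
median_without_outlier = statistics.median(without_outlier)  # 3.5

print(mean_value, median_value, mode_value, median_without_outlier)
```

With the outlier included the mean is 9 hours, even though six of the seven tickets were fixed in 6 hours or less. The median moves only from 3.5 to 4.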

4.15 Measures of Spread

There are three major measures of spread. Range: It is a very simple calculation and represents the difference between the largest data point in the data set and the smallest. Standard Deviation: This is the most commonly used measure of spread and represents, roughly, the average deviation of values from the mean in a data set. The larger the standard deviation, the more variability is present in the process. Variance: This is the third measure of process spread. It is simply the standard deviation squared. Calculating variance is helpful because there are statistical tests that require the sources of variation to be added together to estimate the total. Variances from independent sources can be summed, while standard deviations cannot. The formulas for both standard deviation and variance are located in the appendix at the end of the lesson. As discussed in the previous screen, measures of spread are important because they represent what the customer actually experiences. After understanding basic statistical measures, we will now move on to basic graphical techniques.
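The three measures of spread can be sketched in Python as follows. The call answer times and the two process-step standard deviations are hypothetical. The calculation uses the sample formulas (dividing by n - 1), which is an assumption on our part; the appendix formulas may use the population form. The second part illustrates why variances, not standard deviations, are added, assuming the two steps vary independently.

```python
import math
import statistics

# Hypothetical call answer times in seconds.
answer_times = [20, 25, 30, 35, 40]

data_range = max(answer_times) - min(answer_times)  # 40 - 20 = 20
variance = statistics.variance(answer_times)        # sample variance = 62.5
std_dev = statistics.stdev(answer_times)            # square root of the variance

# Two independent process steps with hypothetical standard deviations of 3 and 4.
step_a_sd, step_b_sd = 3.0, 4.0
total_variance = step_a_sd ** 2 + step_b_sd ** 2    # variances add: 9 + 16 = 25
total_sd = math.sqrt(total_variance)                # 5.0, not 3 + 4 = 7

print(data_range, variance, round(std_dev, 2), total_sd)
```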

4.16 Graphing Basic Stats

Statistics are important; however, adults often respond to pictures better than to lines and lines of data and calculations. A picture really is worth a thousand words when it comes to letting the data tell the story of the process. Graphs can take the most complex analysis and put it in a form that is easy for others to understand and interpret, regardless of their statistical knowledge. There are lots of graphs to choose from, with options available whether your data is continuous or discrete. We will cover a few common graphs in the next few slides. We will start with the box plot in the next screen.

4.17 Box Plots

A box plot is a useful graph to visualize the location, spread, and symmetry of the data set. The box plot is used with continuous data. The components of a box plot are shown on the screen, which displays a box plot of the help desk call times for tier 2 calls. There are several things we can learn from the box plot. Median: The line in the box represents the median value of the data. In a perfectly symmetrical distribution, the median would be in the center of the box. We can also tell the data set is not symmetrical because the bottom tail is longer than the top tail of data. The box plot also illustrates what is called the interquartile range. The bottom of the box represents the 25th percentile and the top of the box represents the 75th percentile of the data set. The whiskers cover the rest of the data, so the bottom whisker ranges from the minimum to the 25th percentile and the top whisker represents the data from the 75th percentile to the maximum, excluding any points plotted separately as outliers. Finally, the asterisk seen above the top tail indicates there is an outlier in the data, in this case an extreme high value. We will cover histograms on the next screen.
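The numbers behind a box plot can be computed directly. The sketch below uses hypothetical call times and assumes the common convention of flagging points more than 1.5 times the interquartile range beyond the box as outliers. The lesson does not state which rule the charting software used, and percentile methods differ slightly between packages.

```python
import statistics


def box_plot_summary(values):
    """Return the quartiles, interquartile range, and outliers of a data set."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed outlier rule: more than 1.5 * IQR below Q1 or above Q3.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical tier 2 call times in minutes, with one extreme high value.
call_times = [4, 5, 6, 7, 8, 9, 10, 11, 30]
summary = box_plot_summary(call_times)
print(summary)
```

For this data the box runs from 6 to 10 minutes with the median at 8, and the 30 minute call is the only point that would be drawn as an asterisk.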

4.18 Histograms

A histogram is a very common graph used to provide a picture of the data distribution. It is an important picture that helps in understanding the process behavior as well as in determining the appropriate statistical tools to use. Data distributions will be covered later in the lesson. Histograms are useful as we can create them with continuous or variable data. Displayed is a histogram of the help desk call time data used to produce the box plot in the previous screen. We will cover dot plots in the next screen.

4.19 Dot Plot

In this screen we will discuss the dot plot. A dot plot is another graph that is used to illustrate the distribution of data points. Dot plots are best used with continuous data. Most statistical software packages provide functionality to create a single dot plot of the entire process. More importantly, they provide the ability to create dot plots broken down into categories. The graph above is a dot plot of the help desk call times, broken down by day of the week. This is helpful in determining differences in process performance and distribution on different days of the week. Advance to the next screen to discuss main effects plots.

4.20 Main Effects Plot

The final graph we will cover in this lesson is a main effects plot. The main effects plot graphs the mean and can be produced using continuous or discrete data. When using discrete data, it must be transformed into a numerical value such as a percentage or count. The main effects plot compares means across defined categories or factors. The graph illustrates the mean call time data categorized by days of the week. The main effects plot is effective in highlighting the highest mean call time, which is on Friday, and the lowest, which is on Monday. This provides valuable information to help direct a team to dig deeper into what is occurring on Wednesdays and Fridays. We will now move to a discussion on different data distributions in the next screen.
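The calculation behind a main effects plot is simply a mean per category. Here is a minimal Python sketch; the call records are hypothetical and are not the data shown on the screen.

```python
from collections import defaultdict
from statistics import mean


def mean_by_category(records):
    """Group (category, value) pairs and return the mean value per category."""
    grouped = defaultdict(list)
    for category, value in records:
        grouped[category].append(value)
    return {category: mean(values) for category, values in grouped.items()}


# Hypothetical (day of week, call time in minutes) records.
calls = [
    ("Mon", 6), ("Mon", 8),
    ("Wed", 10), ("Wed", 14),
    ("Fri", 15), ("Fri", 17),
]

mean_by_day = mean_by_category(calls)
highest_day = max(mean_by_day, key=mean_by_day.get)
lowest_day = min(mean_by_day, key=mean_by_day.get)
print(mean_by_day, highest_day, lowest_day)
```

Plotting these means against the categories, in order, gives the main effects plot.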

4.21 Normal Distribution

The normal distribution is the most widely used, and sometimes misused, distribution in statistics. As seen on the screen, the normal distribution is smooth and symmetrical. Many of the continuous data types discussed earlier in the topic follow a normal distribution. Examples include temperature, weight, and height. Many classic lean six sigma statistical tools assume a normal distribution. Unfortunately, most transactional processes, which produce data including cycle time and defects, do not follow a normal distribution, so care must be taken to understand the type of data available and use the appropriate tools for that type of data. In the next screen let us look into the probabilities of the normal distribution.

4.22 Normal Distribution Probabilities

Due to its symmetrical nature, it is easy to determine the probabilities of occurrence in relation to the mean. The graph here demonstrates the proportion of data that will fall within plus or minus 6 standard deviations from the mean. As the graphic demonstrates, in a normal distribution, approximately 68.27% of data will fall within 1 standard deviation of the mean. Moving out to 2 standard deviations, approximately 95.45% of the data is included. By the time you get to 6 standard deviations from the mean, approximately 99.9999998% of data will fall within the distribution. In the next screen we will begin a discussion on the Central Limit Theorem.
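These proportions can be checked numerically. For a normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)), which Python's standard `math` module can evaluate. The function name below is ours.

```python
import math


def within_k_sigma(k: float) -> float:
    """Probability that a normally distributed value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3, 6):
    print(f"within {k} standard deviation(s): {within_k_sigma(k):.7%}")
```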

4.23 Central Limit Theorem

In the prior screens, the concept of sample or inferential statistics was introduced. The Central Limit Theorem provides the link between a normal distribution and sampling distributions, which is crucial in making sample statistics correctly infer the properties of the population. The central limit theorem states that, for nearly any underlying distribution, the distribution of sample means will approximate a normal distribution if the sample size is large enough. In other words, the central limit theorem uses the means taken from multiple samples to approximate the normal distribution. We will illustrate how the central limit theorem works in the next screen.

4.24 Central Limit Theorem in Pictures

The central limit theorem may sound complicated, but it is a simple concept. The example shown on the screen will walk you through the concept in pictures.
(Graphic 1) This is a non-normal distribution. In this case, it is a uniform distribution, which is a distribution that contains values that occur with roughly the same frequency.
(Graphic 2) Multiple samples are drawn from the population.
(Graphic 3) The sample mean is then calculated for each of the individual sample data sets pulled.
(Graphic 4) With a large enough sample size, when the sample means are recorded and a histogram is produced, the result will approximate the normal distribution.
It is important to understand that the distribution itself does not change; it is the distribution of the sample means that approximates a normal distribution. The next screen will cover why using the central limit theorem is important.
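The four graphics can be reproduced with a short simulation. The sketch below draws repeated samples from a uniform distribution on 0 to 10 and records each sample mean. The sample size of 30, the 2,000 samples, and the random seed are all assumptions made for the illustration.

```python
import random
import statistics


def simulate_sample_means(sample_size, num_samples, seed=42):
    """Draw num_samples samples from a uniform(0, 10) population and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.uniform(0, 10) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


sample_means = simulate_sample_means(sample_size=30, num_samples=2000)

# The sample means cluster around the population mean of 5, with a spread close to
# the population standard deviation (10 / sqrt(12)) divided by sqrt(sample_size).
print(round(statistics.mean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```

A histogram of `sample_means` would show the familiar bell shape, even though every individual value came from a flat, uniform distribution.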

4.25 Central Limit Theorem

The central limit theorem can seem abstract; however, it can be used for various practical purposes. One such use is to be able to predict probabilities using the normal distribution. From what we have discussed earlier, there are probabilities of occurrence associated with a normal distribution. These probabilities are not available for non-normal distributions, so using the central limit theorem to approximate a normal distribution and assign a probability of the data falling within a set standard deviation range is the foundation of process control and control chart application. Control charts and their benefits in IT operations will be covered in the Control lesson. The central limit theorem can also be leveraged to minimize the variation seen in measurement system analysis, which will be covered in a later topic in this lesson. This concludes topic 1. The next screen will begin topic 2.

4.26 Measure Topic 2 Common IT Metrics and Data Sources

The next topic in this lesson is common IT metrics and data sources.

4.27 IT Metrics

The basic statistics covered in the last topic form the foundation upon which metrics are built. Metrics serve many purposes in an organization. They help an organization to understand the current performance of the function or process. While there is benefit in understanding current state, the true benefit of metrics is establishing a target and monitoring the current performance against the set target. Targets can be set in many ways, but most commonly are tied to required business goals or set as a result of benchmarking against best practice organizations. Metrics also allow an organization to identify trends, positive or negative, in their performance. Finally, if the right metrics are in place, they are an effective tool in measuring and communicating the value IT brings to the business. In the context of a lean six sigma project, existing metrics can identify problem areas to focus project efforts on and can provide a baseline and goal for the project. Many organizations track metrics, with varying degrees of success. Some guidelines for effective metrics are:
Timely: Metrics should be collected and reported in a timely manner. Metrics that are reported weeks or months after the fact may be interesting; however, they typically are not actionable or correctable at that point.
Comprehensive and balanced: Metrics should encompass a variety of factors including cost, quality, speed, and safety. Focusing exclusively on one type of metric may provide gains in that area, but almost guarantees problems across the other dimensions.
Aligned to the business strategy and goal: It is important to monitor and report the items that are critical to the strategy and goals of the business.
Voice of the customer: Finally, metrics should incorporate the voice of the customer. If time is not taken to understand what is important to the customer, it is possible for the metric performance to look good while the organization still has dissatisfied customers.
We will now take a look at common metrics and data sources for the three main IT functions.

4.28 Plan Metrics and Data Sources

As a review from a prior lesson, the planning process encompasses work from the initial project idea to the project launch. There are many metrics that can be applied to the planning process. IT organizations have historically focused metrics on the build and run functions, but planning metrics are growing in importance. Following are a few examples of commonly reported metrics that measure the effectiveness of the planning process.
Portfolio ROI: measures the overall benefit the portfolio delivers.
Total cost of ownership: measures the financial cost of implementing and maintaining a technology solution over its useful life.
Application retirement: measures the number or % of current applications that will be retired from the technology environment as a result of the implementation of new technology.
Demand metrics: Predicting the demand to execute the approved portfolio is a key planning activity. Many organizations measure the actual demand vs. the estimate of demand to evaluate the effectiveness of the estimating processes.
Strategic spend %: With IT organizations playing an increasingly strategic role in organizations, the % of portfolio dedicated to strategic vs. tactical efforts is increasing in prominence.
The data sources for planning metrics are varied as well and include items such as the business case, financial systems, capacity management systems, the prioritization matrix, and the technology roadmap. We will discuss common build metrics in the next screen.

4.29 Build Metrics and Data Sources

The build process encompasses all the activities that occur during project delivery. Common build metrics include items such as:
Project cycle time: overall and in relation to the original project timeline estimate
Project cost in relation to estimate
Defects per requirement
Test case completion: % of test cases successfully completed
Defects released into production
Common data sources for build are the project plan, the project tracking system, and the results of system and user acceptance testing. We will cover run metrics and data sources in the next screen.

4.30 Run Metrics and Data Sources

Run processes encompass all activities to maintain and support systems. There are dozens of run metrics that are commonly tracked in an organization. Listed below are a few:
Service level agreements or SLAs: performance against an agreed upon time to fix a problem or defect
System downtime and user reported incidents: time when the system is not available for use
Cycle time: time from when a case is opened to when it is closed
Intrusion events: data security breaches
First time final rates: help desk calls solved with one call
Hold times: time for calls to be answered in the help desk
Escalated cases: cases that can’t be solved with the initial call
Systems to collect metrics are also varied and include output from system monitoring programs, call center tracking systems, security monitoring software, help desk ticket systems, and problem tracking systems. We will now move to the next topic in this lesson.

4.31 Measure Topic 3 Measurement Systems Analysis in IT

The third topic in this lesson is Measurement Systems Analysis. The next screen will begin our discussion of this topic.

4.32 Measurement Systems Analysis

Measurement Systems Analysis is a critical first step any time data is going to be used for metrics or any other data analysis. It measures key characteristics of the data, including location (stability, bias, and linearity) and variation (repeatability and reproducibility). Ultimately, measurement systems analysis lets you know if your data can be trusted in regard to accuracy. Continuous data can be tested using a gauge study and attribute data using an attribute study. In the next screen we will discuss gauge studies.

4.33 Gauge MSA

A good gauge has several critical attributes, including:
Resolution: The gauge must be able to measure at the resolution needed. If a process is being measured in seconds, and the gauge used to measure the cycle time is a watch with no second hand, the gauge will have poor resolution. In a process measured in seconds, an appropriate gauge would be a stopwatch.
Accuracy: Accuracy has several components; however, in general it measures whether the gauge records the true value.
Precision: Precision measures whether the same gauge measuring the same item gets the same reading every time. It is possible to have a gauge that is accurate on average but not precise.
Examples of gauges include items such as scales, stopwatches, and calipers. The bottom line is that a gauge MSA, sometimes referred to as a gauge R&R study, is effective in measuring the accuracy of the gauge and understanding how much of the observed variation in the data is because of the actual process output vs. how much is because of the gauge being used to measure the data. Let us now discuss how to evaluate the quality of a measurement system for attribute data.

4.34 Attribute MSA

Since much of the data generated in IT departments is discrete in nature, Attribute MSA will be covered in more detail than Gauge or variable MSA. Completing measurement system analysis is critical with discrete data because it often requires human judgment to decide if an outcome is good or bad, or to categorize a defect into a certain classification. Like the gauge study, an attribute MSA measures if the measurement method is able to be duplicated amongst different operators (reproducibility) and if the same operator looking at the same thing gets the same outcome (repeatability). It also measures if the appraiser gets the answer correct. In the next screen we will discuss how to conduct an attribute MSA.

4.35 Attribute MSA Process

Listed below are the steps to conduct an attribute MSA:
Select 30 items to measure from the process: 50% good output and 50% defects. Use marginal good and bad samples if possible.
Have an appraiser decide the correct answer.
Use statistical software to set up and record the outcome.
Have 2-3 process doers review the output and classify each item as good or defect.
Wait several days and have the same doers re-evaluate the output and classify each item as good or defect.
Run the analysis.
We will review a sample output on the next screen.
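The "run the analysis" step above can be sketched without statistical software. The example below is a simplified illustration, not the output of any particular package. The ratings are hypothetical, only 10 items are used to keep the listing short, and all function names are ours. It computes the four agreement percentages an attribute MSA typically reports: within appraiser, each appraiser vs. the standard, between appraisers, and all appraisers vs. the standard.

```python
def percent_agreement(flags):
    """Percentage of True values in an iterable of booleans."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags)


def within_appraiser(trial_1, trial_2):
    """Repeatability: the appraiser gave the same rating in both trials."""
    return percent_agreement(a == b for a, b in zip(trial_1, trial_2))


def appraiser_vs_standard(trial_1, trial_2, standard):
    """Both of the appraiser's ratings match the known correct answer."""
    return percent_agreement(a == b == s for a, b, s in zip(trial_1, trial_2, standard))


def between_appraisers(all_trials):
    """Reproducibility: every rating of an item, across all appraisers and trials, is the same."""
    return percent_agreement(len(set(ratings)) == 1 for ratings in zip(*all_trials))


def all_vs_standard(all_trials, standard):
    """Every rating of an item matches the known correct answer."""
    return percent_agreement(
        all(r == s for r in ratings) for ratings, s in zip(zip(*all_trials), standard)
    )


# Hypothetical data: "G" = good, "D" = defect.
standard = ["G", "G", "D", "D", "G", "D", "G", "D", "G", "D"]
a1 = ["G", "G", "D", "D", "G", "D", "G", "D", "G", "G"]  # appraiser A, trial 1
a2 = ["G", "G", "D", "D", "G", "D", "G", "D", "D", "G"]  # appraiser A, trial 2
b1 = ["G", "D", "D", "D", "G", "D", "G", "D", "G", "D"]  # appraiser B, trial 1
b2 = ["G", "D", "D", "D", "G", "D", "G", "G", "G", "D"]  # appraiser B, trial 2

print(within_appraiser(a1, a2), within_appraiser(b1, b2))                           # 90.0 90.0
print(appraiser_vs_standard(a1, a2, standard), appraiser_vs_standard(b1, b2, standard))  # 80.0 80.0
print(between_appraisers([a1, a2, b1, b2]))                                         # 60.0
print(all_vs_standard([a1, a2, b1, b2], standard))                                  # 60.0
```

Statistical packages add confidence intervals and kappa statistics on top of these percentages; this sketch shows only the matching logic.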

4.36 MSA Output

An MSA study was run to test the output of a reporting process. Two operators, Brad and Julie, looked at 20 reports. This first section of the output measures the repeatability of the measurement system: how often each appraiser agreed with himself or herself across the two trials. In the example, Brad agreed with himself 80% of the time and Julie 70% of the time. The confidence interval included with the analysis indicates we are 95% confident that Brad's true matched percent is between 56.34% and 94.27%. Let us now see how the appraisers did versus the standard or the correct answer on the next screen.
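The lesson does not say how the confidence interval was calculated. The quoted range is consistent with an exact (Clopper-Pearson) binomial interval for 16 agreements out of 20, which is the 80% figure, so the sketch below assumes that method. It uses only the Python standard library and finds each bound by bisection; the function names are ours.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for a binomial distribution with n trials and success probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(successes, n, confidence=0.95):
    """Exact (Clopper-Pearson) confidence interval for a proportion."""
    alpha = 1 - confidence

    def solve(func, target):
        # func is decreasing in p on [0, 1]; bisect until func(p) is close to target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if func(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= successes) equals alpha / 2.
    lower = 0.0 if successes == 0 else solve(
        lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: the p at which P(X <= successes) equals alpha / 2.
    upper = 1.0 if successes == n else solve(
        lambda p: binom_cdf(successes, n, p), alpha / 2
    )
    return lower, upper


lower, upper = exact_ci(16, 20)
print(f"{lower:.2%} to {upper:.2%}")
```

The width of the interval is a reminder that with only 20 items, an observed 80% agreement is compatible with a true agreement rate anywhere from the mid 50s to the mid 90s.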

4.37 MSA Output

The next part of the analysis measures how often each appraiser agreed with the correct answer. Again, Brad was at 80% and Julie at 70%. Next we will evaluate how often the two appraisers agreed with each other.

4.38 MSA Output Each Appraiser vs. Standard

4.39 MSA Output Between Appraisers

The next part of the analysis measures how often the appraisers agreed with each other. In the example, they agreed with each other 60% of the time. Finally, we will evaluate the overall effectiveness of the measurement system.

4.40 MSA Output All Appraisers vs. Standard

The final part measures the overall effectiveness of the measurement system. A general rule of thumb states that a measurement system error under 10% is good, under 30% is acceptable, and anything over that figure should be fixed before using the data. In this case, there is a 40% error rate in all appraisers agreeing with the known standard, so action should be taken to address the measurement system before using the data for metrics or other analysis. We will now look at a case study regarding measurement system analysis.

4.41 Measure Topic 3 Case Study—Should I trust this data

We will work through a case study involving an MSA in the next screens. The background for the case study is on the next screen.

4.42 MSA Case Study Background

Software Development LLC is a small software solutions company that provides custom developed software solutions to health care organizations. The manager of solution delivery is concerned about the number of defects being released into the production environment. His hypothesis is that testing is missing critical defects prior to release. The manager has asked a trained Green Belt to reduce the % of defects reaching production. The staff in the solution delivery area keep statistics in a database in which they classify whether an observed defect was caused by a defect in testing or is a true production defect. The staff members use their judgment to determine the cause of the defect. Prior to launching the analysis, the Green Belt decides to conduct an MSA on the data to evaluate the quality of the defect classification. We will discuss the set up of the MSA on the next screen.

4.43 MSA Case Study Set up

30 classified defects were pulled for analysis. 3 staff members, John, Tim, and Anne, were selected to participate in the MSA study. Each participant looked at the defects and classified each one in one of two ways: testing defect or production defect. Each participant did the same exercise on the same defects one week later. The manager evaluated the defects and provided the correct answer for use in the study. The data was collected and the analysis is covered in the following screens, which contain the output from the MSA study.

4.44 MSA Case Study Results

Output measuring how often each appraiser agreed with himself or herself.

4.45 MSA Case Study Results

Output measuring how often each appraiser agreed with the standard.

4.46 MSA Case Study—Results

Output measuring how often the appraisers agreed with each other.

4.47 MSA Case Study Results

The next screens contain questions to answer based on the case study findings.

4.48 Case Study Question 1

How often did the assessors agree with each other? They agreed with each other approximately 57% of the time.

4.49 Case Study Question 2

Which assessor was the most accurate at classifying the defect type? John is the most accurate at 80%.

4.50 Case Study Question 3

Should you trust this data? No, given the assessors agreed with themselves and the known standard 50% of the time, some remedial actions should be taken to address the accuracy of the measurement system before proceeding. We will now move to the final topic in the lesson.

4.51 Measure Topic 4 Determining Process Capability for Common IT Processes

The final topic in this lesson will focus on process capability. The topic will begin on the next screen.

4.52 Process Capability Defined

Process capability compares the voice of the customer to process performance. When calculating process capability, the voice of the customer is translated into a set of specification limits. The specification limits define the range about the target that the customer considers acceptable performance. Any values outside of the specification limits are considered defects. Process capability measures how often the process delivers a result within the specification limits. We will discuss uses of process capability in the next screen.
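As a rough illustration of "how often the process delivers a result within the specification limits", the sketch below counts the share of observations that fall inside the limits. The function name, the resolution times, and the 0 to 8 hour specification are hypothetical values chosen for the example.

```python
def fraction_within_spec(values, lsl, usl):
    """Share of observations that fall inside the specification limits."""
    inside = sum(lsl <= v <= usl for v in values)
    return inside / len(values)


# Hypothetical help desk resolution times in hours (assumption),
# with an assumed customer specification of 0 to 8 hours.
resolution_hours = [2.5, 7.0, 9.5, 4.0, 8.0, 12.0, 3.5, 6.0, 1.0, 5.5]

print(fraction_within_spec(resolution_hours, lsl=0, usl=8))
```

Here two of the ten tickets exceed the upper limit, so those two would be counted as defects.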

4.53 Uses of Process Capability

Process capability can be used to set the baseline for current performance. The calculation of process capability is a key deliverable in any lean six sigma project. It is generally calculated in the measure phase to determine the baseline, then calculated again after solution implementation to test for improvement in process performance. Outside of a lean six sigma project, understanding current process capability is useful when developing metrics and scorecards, as it provides the starting point and a methodology for measuring changes in performance. Process capability does not have to be a one-time calculation; it can be recalculated periodically to monitor process performance over time. Since process capability measures performance against customer specifications, it is a strong predictor of customer satisfaction. Process capability can be determined for any process, as long as targets and specifications can be established. Given this, process capability is useful in comparing the quality level of disparate processes. Let us move to the next screen to review process capability definitions.

4.54 Process Capability Terminology

Target: the desired value.
Specifications: the boundaries outside which performance is not acceptable to the customer.
LSL: Lower Specification Limit.
USL: Upper Specification Limit.
Cpk: a measure of process location as compared to the specification limits.
The next screen will review the concept of Cpk.

4.55 Cpk Defined

Cpk is a measure of process location as compared to the specification limits. This index accounts for the mean shift in the process, that is, the amount that the process is off target. Cpk is calculated against both the upper and lower specification limits: $C_{pk} = \min\left(\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right)$, where $\mu$ is the process mean and $\sigma$ is the process standard deviation. To calculate Cpk, you therefore need the process mean and the process standard deviation. Once both values are calculated, Cpk is the smaller of the two numbers, as a process is only as good as its worst performing part. A Cpk over one is considered to indicate a capable process. The target should be a Cpk of at least 1.3, as this allows for special cause variation without creating defects. Any Cpk under one indicates a process that is currently not delivering to customer specifications. Let us move to the next screen to discuss conducting a capability study.
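The Cpk calculation described above can be sketched as follows. This is a minimal example that assumes normally distributed data and uses the overall sample standard deviation. Statistical packages may estimate the standard deviation differently, for example from within-subgroup variation. The function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def cpk(values, lsl, usl):
    """Smaller of the upper and lower capability indices."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (assumption)
    upper = (usl - mu) / (3 * sigma)  # distance from mean to USL
    lower = (mu - lsl) / (3 * sigma)  # distance from mean to LSL
    return min(upper, lower)


# Hypothetical measurements with assumed specification limits of 7 and 14.
measurements = [9, 10, 11, 10, 10]

print(round(cpk(measurements, lsl=7, usl=14), 2))
```

In this example the mean sits closer to the lower limit, so the lower index is the smaller of the two and becomes the reported Cpk.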

4.56 Conducting a Process Capability Study

To conduct a process capability study, first utilize voice of customer techniques to establish the target and specification limits. Next, collect process data. Then identify the data distribution. There are different process capability studies depending on the type of data the process produces, so it is very important to understand the underlying distribution of your data and select the right study. Finally, conduct the process capability study using a statistical software package. Advance to the next screen for a discussion of common IT data distributions to help in the selection of the correct process capability study.

4.57 Process Capability Data Types

Process capability is often presented and calculated in training and texts using normally distributed data. As discussed in a prior lesson, data commonly available in IT often does not follow a normal distribution. Luckily, most statistical software packages provide tools to calculate process capability on many kinds of data. The common types of IT data and their underlying distributions are:
Cycle time: IT produces cycle time data in a variety of areas. Help desk resolution times, outage times, and project cycle times are all examples of cycle time data. Process capability can be calculated for cycle time using an exponential distribution.
Defective or % defective: When calculating process capability for defective data, utilize the binomial distribution.
Defects per unit: An example would be defects per requirement. When calculating process capability for this type of data, the Poisson distribution should be used.
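The sketch below shows how the share of defect-free results might be estimated for each of the three data types, using only the Python standard library. It is not how any particular statistical package performs its capability studies. The function names and the example numbers are hypothetical, and expressing the yield as an equivalent Z value is an added convention rather than something taken from the lesson.

```python
import math
from statistics import NormalDist


def exponential_yield(mean_cycle_time, usl):
    """Cycle time: probability that an exponential cycle time is within the USL."""
    return 1 - math.exp(-usl / mean_cycle_time)


def binomial_yield(defectives, units):
    """Defective data: share of units that are not defective."""
    return 1 - defectives / units


def poisson_yield(defects, units):
    """Defects per unit: probability that a unit has zero defects."""
    dpu = defects / units
    return math.exp(-dpu)


def equivalent_z(yield_fraction):
    """Standard normal value with the same yield (must be between 0 and 1)."""
    return NormalDist().inv_cdf(yield_fraction)


# Hypothetical examples (assumptions, not case study data).
print(exponential_yield(mean_cycle_time=4, usl=8))  # 4 hour mean, 8 hour limit
print(binomial_yield(defectives=5, units=100))      # 5 defective releases in 100
print(poisson_yield(defects=10, units=100))         # 10 defects in 100 requirements
print(equivalent_z(0.95))
```

Converting each yield to an equivalent Z value puts the three data types on a common scale, which is one way to compare the quality level of disparate processes.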

4.59 Summary

Statistics is a powerful way to take raw data and turn it into knowledge that can be used to improve the organization. There are two types of data: continuous and discrete. Basic statistics provide the foundation for all statistical tools. The use of graphical techniques helps translate statistics into something everyone can understand and interpret. Verifying the validity and accuracy of data via an MSA study is critical to ensure correct statistical analysis. Process capability provides a measure of process performance against customer specifications. The next screen shows the equations discussed in the lesson.

4.60 Statistical Formulas

The topic in our lesson is Basic Statistics Review. In the next screen, we will begin our discussion of basic statistics.

4.61 Conclusion

This concludes lesson four, Measure. In the next lesson, we will discuss Analyze tools for IT.

4.41 Measure Topic 3 Case Study—Should I trust this data

We will work through a case study involving an MSA in the next screens. The background for the case study is on the next screen.

