The Measure phase of Lean Six Sigma represents a critical juncture in any process improvement initiative. During this phase, practitioners collect data, establish baselines, and quantify process performance using various statistical tools and methodologies. Understanding the terminology associated with this phase is essential for effective communication among team members and stakeholders, as well as for implementing meaningful improvements within an organization.
This comprehensive glossary breaks down the most important statistical and measurement terms you will encounter during the Measure phase. Whether you are a business professional beginning your Lean Six Sigma journey or a manager seeking to better understand your improvement team’s work, this guide will provide clarity on these essential concepts.
Fundamental Measurement Concepts
Accuracy
Accuracy refers to how close a measured value is to the true or actual value of what is being measured. For instance, if a scale shows 100 grams when weighing an object that actually weighs 100 grams, the scale is perfectly accurate. In a manufacturing setting, imagine measuring the diameter of a bolt that should be exactly 10 millimeters. If your measuring instrument consistently shows readings of 10.1 millimeters when measuring this 10-millimeter bolt, your measurements lack accuracy even if they are very consistent.
Precision
Precision describes the consistency or repeatability of measurements. A precise measurement system produces similar results when measuring the same item multiple times under the same conditions. Consider a scenario where you weigh the same package five times, obtaining readings of 5.2 kg, 5.21 kg, 5.19 kg, 5.2 kg, and 5.21 kg. These measurements demonstrate high precision because they cluster closely together, even though they might not be accurate if the actual weight is 5.5 kg.
Bias
Bias represents the difference between the observed average measurement and the true value. It indicates systematic error in your measurement system. For example, if a thermometer consistently reads 2 degrees higher than the actual temperature across all measurements, it has a bias of +2 degrees. In a call center measuring customer wait times, if the timing system starts counting 10 seconds after customers actually begin waiting, every measurement would understate the true wait time by 10 seconds, a bias of −10 seconds.
Resolution
Resolution indicates the smallest increment a measurement instrument can detect and display. A bathroom scale with 0.1-pound increments has better resolution than one showing only whole pounds. In quality control applications, choosing instruments with appropriate resolution is crucial. For instance, when measuring a component that must be accurate within 0.01 millimeters, using a measuring tool with 1-millimeter resolution would be inadequate for detecting important variations.
Data Types and Classification
Continuous Data
Continuous data can take any value within a given range and can be meaningfully divided into smaller increments. Examples include temperature, time, weight, height, and distance. In a hospital emergency department, patient wait time represents continuous data. A patient might wait 23.5 minutes, 45.8 minutes, or 62.3 minutes. These measurements can theoretically be recorded with infinite precision, limited only by the measuring instrument.
Sample dataset of continuous data for product weights in grams: 250.3, 249.8, 251.2, 250.1, 249.9, 250.7, 250.4, 250.0, 249.7, 250.6
Discrete Data
Discrete data consists of countable values, typically integers, with no meaningful values between the counts. Examples include the number of defects, number of customers, or number of complaints. In a manufacturing environment, counting the number of scratches on a finished product yields discrete data. You might find 0, 1, 2, or 3 scratches, but never 2.5 scratches.
Sample dataset of discrete data for daily customer complaints: 3, 7, 2, 5, 4, 8, 3, 6, 4, 5
Attribute Data
Attribute data classifies observations into categories and often appears as pass/fail, yes/no, or good/bad classifications. When inspecting light bulbs on a production line, each bulb either works or does not work. This binary classification represents attribute data. Similarly, in healthcare, screening tests often produce attribute data, indicating whether a patient tests positive or negative for a particular condition.
Central Tendency Measures
Mean
The mean, commonly called the average, is calculated by summing all values in a dataset and dividing by the number of observations. The mean is sensitive to extreme values and works best with normally distributed continuous data. Consider daily production output over ten days: 245, 252, 248, 251, 247, 249, 250, 253, 246, and 259 units. The mean production is (245+252+248+251+247+249+250+253+246+259)/10 = 250 units per day.
Median
The median represents the middle value when data is arranged in order. Half of all observations fall below the median, and half fall above it. The median is particularly useful when data contains outliers that might distort the mean. Using the production data above and arranging it in order: 245, 246, 247, 248, 249, 250, 251, 252, 253, 259. With ten values, the median falls between the fifth and sixth values: (249+250)/2 = 249.5 units.
Mode
The mode identifies the most frequently occurring value in a dataset. A dataset might have one mode (unimodal), two modes (bimodal), or more. In customer service, if you track call durations and find that calls of 5 minutes occur more frequently than any other duration, then 5 minutes is the mode. The mode is particularly useful for categorical data where mean and median cannot be calculated.
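As a quick illustration, the sketch below uses Python's statistics module to compute the mean and median of the production data above; the call-duration values used for the mode are hypothetical, included only to show a dataset with a repeated value.

```python
import statistics

# Daily production output in units, from the example above
production = [245, 252, 248, 251, 247, 249, 250, 253, 246, 259]

print(statistics.mean(production))    # 250
print(statistics.median(production))  # 249.5

# Hypothetical call durations in minutes; multimode() lists every
# value that ties for the highest frequency
call_durations = [4, 5, 5, 6, 5, 7, 5, 3, 6, 5]
print(statistics.multimode(call_durations))  # [5]
```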
Variation and Spread Measures
Range
Range measures the spread between the highest and lowest values in a dataset. While simple to calculate, the range only considers two data points and ignores all other information. In monitoring daily website traffic over a week, if your lowest daily visitors numbered 1,200 and your highest reached 1,850, your range would be 650 visitors. This gives a quick sense of variability but provides no information about how the other five days performed.
Variance
Variance quantifies how far individual data points spread from the mean. It is calculated by finding the average of squared differences from the mean. While variance is mathematically important, its units are squared, making interpretation less intuitive. For a process where delivery times are 25, 28, 22, 30, and 25 minutes, the mean is 26 minutes. The variance calculation would be: [(25-26)² + (28-26)² + (22-26)² + (30-26)² + (25-26)²]/5 = [1 + 4 + 16 + 16 + 1]/5 = 7.6 square minutes.
Standard Deviation
Standard deviation is the square root of variance and returns to the original units of measurement, making it more interpretable. It indicates the typical distance of data points from the mean. In the delivery time example above, the standard deviation would be the square root of 7.6, approximately 2.76 minutes. This means that delivery times typically vary by about 2.76 minutes from the average of 26 minutes.
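The short sketch below reproduces the range, variance, and standard deviation calculations for the delivery-time example, using the population formulas (dividing by n) to match the hand calculation above.

```python
import statistics

# Delivery times in minutes, from the example above
times = [25, 28, 22, 30, 25]

data_range = max(times) - min(times)    # 8 minutes
mean = statistics.mean(times)           # 26
variance = statistics.pvariance(times)  # 7.6 (population formula, divides by n)
std_dev = statistics.pstdev(times)      # ~2.76

print(data_range, mean, variance, round(std_dev, 2))
```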
Coefficient of Variation
The coefficient of variation expresses standard deviation as a percentage of the mean, enabling comparison between datasets with different units or scales. It is calculated as (standard deviation / mean) × 100. This is particularly useful when comparing variation between processes measured in different units. For instance, if Process A has a mean of 50 with a standard deviation of 5, its coefficient of variation is 10%. If Process B has a mean of 500 with a standard deviation of 25, its coefficient of variation is 5%, indicating that Process B is relatively more consistent despite having a higher absolute standard deviation.
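A minimal sketch of the calculation, applied to the Process A and Process B figures above:

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage of the mean."""
    return (std_dev / mean) * 100

print(coefficient_of_variation(5, 50))    # 10.0 -> Process A
print(coefficient_of_variation(25, 500))  # 5.0  -> Process B is relatively more consistent
```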
Process Capability Terminology
Specification Limits
Specification limits define the acceptable range of variation for a product or process characteristic as determined by customer requirements or design specifications. Upper Specification Limit (USL) represents the maximum acceptable value, while Lower Specification Limit (LSL) represents the minimum. For example, a bottling company might specify that bottles should contain 500 milliliters with an LSL of 495 ml and USL of 510 ml. Any bottle outside these limits fails to meet specifications.
Control Limits
Control limits differ from specification limits by describing what the process actually does rather than what it should do. They are calculated from process data and typically set at three standard deviations from the process mean. These limits help identify when a process experiences unusual variation. If a process naturally produces outputs between 480 and 520 ml (control limits), but specifications require 495 to 510 ml (specification limits), the process is incapable of consistently meeting specifications even when operating normally.
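As a simplified sketch, three-sigma control limits can be estimated directly from individual measurements as shown below; the fill volumes are assumed values for illustration, and formal control charts typically compute limits from subgroup statistics and chart constants rather than a raw standard deviation.

```python
import statistics

# Hypothetical bottle fill volumes in ml (illustrative data only)
fills = [501.2, 499.8, 503.4, 498.7, 500.5, 502.1, 497.9, 500.8, 499.3, 501.6]

mean = statistics.mean(fills)
sigma = statistics.stdev(fills)  # sample standard deviation

ucl = mean + 3 * sigma  # upper control limit
lcl = mean - 3 * sigma  # lower control limit
print(f"LCL = {lcl:.1f} ml, mean = {mean:.1f} ml, UCL = {ucl:.1f} ml")
```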
Process Capability Index (Cp)
Cp measures the potential capability of a process by comparing the specification width to the process width. It is calculated as (USL – LSL) / (6 × standard deviation). A Cp value of 1.0 indicates the process spread exactly equals the specification width. Values above 1.33 are generally considered acceptable for most processes. If specifications require parts between 10 and 20 mm, and the process standard deviation is 1.5 mm, then Cp = (20 – 10)/(6 × 1.5) = 10/9 = 1.11, suggesting marginal capability.
Process Capability Index (Cpk)
Cpk improves upon Cp by accounting for process centering. A process might have adequate spread (good Cp) but be off-center, producing defects. Cpk is the minimum of two calculations: (USL – mean)/(3 × standard deviation) and (mean – LSL)/(3 × standard deviation). Using the previous example, if the process mean is 12 mm instead of the centered 15 mm, then even with the same standard deviation, Cpk = min((20 – 12)/(3 × 1.5), (12 – 10)/(3 × 1.5)) = min(1.78, 0.44) = 0.44, well below the Cp of 1.11, indicating that the off-center process produces more defects.
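The following sketch wraps both indices in a small helper function and reproduces the centered and off-center examples above.

```python
def process_capability(usl, lsl, mean, std_dev):
    """Return (Cp, Cpk) for the given specification limits and process statistics."""
    cp = (usl - lsl) / (6 * std_dev)
    cpu = (usl - mean) / (3 * std_dev)
    cpl = (mean - lsl) / (3 * std_dev)
    return cp, min(cpu, cpl)

# Centered process: specifications 10-20 mm, standard deviation 1.5 mm
print(process_capability(20, 10, 15, 1.5))  # approx (1.11, 1.11)

# Off-center process: same spread, mean shifted to 12 mm
print(process_capability(20, 10, 12, 1.5))  # approx (1.11, 0.44)
```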
Measurement System Analysis Terms
Repeatability
Repeatability measures the variation in measurements when one operator measures the same item multiple times using the same instrument under identical conditions. Imagine a quality technician measuring the same metal rod ten times with the same caliper. If the measurements vary significantly, the measurement system has poor repeatability. Good repeatability means the measurement instrument itself contributes minimal variation to the results.
Reproducibility
Reproducibility assesses the variation between different operators measuring the same items using the same measurement instrument. If three different technicians measure the same ten parts and their average measurements differ substantially, the measurement system has poor reproducibility. This variation might result from different measurement techniques, interpretation differences, or operator training levels.
Gage R&R (Gage Repeatability and Reproducibility)
Gage R&R studies quantify the total measurement system variation by combining repeatability and reproducibility. The study typically involves multiple operators measuring multiple parts several times each. Results express measurement system variation as a percentage of total variation or specification tolerance. A Gage R&R below 10% indicates an acceptable measurement system, 10-30% suggests marginal acceptability requiring improvement, and above 30% indicates an unacceptable system needing immediate correction.
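A simplified sketch of how the percentage is expressed and judged against those thresholds appears below; the variance components are assumed numbers, and a real study would estimate them from a crossed operators-by-parts experiment, typically via ANOVA.

```python
import math

def percent_grr(repeatability_var, reproducibility_var, total_var):
    """Express measurement system variation as a percentage of total study variation.

    Comparison is made on the standard-deviation scale, as is conventional.
    """
    grr_sd = math.sqrt(repeatability_var + reproducibility_var)
    return 100 * grr_sd / math.sqrt(total_var)

def classify(grr_percent):
    if grr_percent < 10:
        return "acceptable"
    if grr_percent <= 30:
        return "marginal - improvement recommended"
    return "unacceptable - correct before relying on the data"

# Hypothetical variance components (illustrative only)
pct = percent_grr(repeatability_var=0.02, reproducibility_var=0.01, total_var=1.25)
print(f"{pct:.1f}% GR&R -> {classify(pct)}")  # 15.5% -> marginal
```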
Discrimination
Discrimination, sometimes called resolution or sensitivity, refers to the measurement system’s ability to detect small changes in the measured characteristic. A good rule of thumb suggests that the measurement instrument should discriminate to at least one-tenth of the specification tolerance or process variation. If you are measuring a dimension with a tolerance of ±0.1 mm, your measuring instrument should read to at least 0.01 mm to adequately discriminate between parts.
Sampling and Data Collection Terms
Population
A population includes all possible observations or measurements of interest for a particular study. In manufacturing, the population might consist of all products produced in a month. In customer service, it might include all customer interactions during a specific period. Populations can be finite, such as all employees in a company, or effectively infinite, such as all possible measurements of a continuous process.
Sample
A sample is a subset of the population selected for measurement and analysis. Since measuring entire populations is often impractical or impossible, we draw samples that represent the population. If a factory produces 10,000 widgets daily, measuring all of them might be prohibitively expensive. Instead, selecting a random sample of 100 widgets can provide reliable information about the entire day’s production if done correctly.
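For the widget example, drawing a simple random sample might look like the sketch below; in practice the sampling frame would be a list of actual unit identifiers rather than a range of index numbers.

```python
import random

population_size = 10_000  # widgets produced in one day
sample_size = 100

random.seed(42)  # fixed seed so the example is reproducible
sampled_ids = random.sample(range(population_size), k=sample_size)
print(sorted(sampled_ids)[:10])  # first few selected widget indices
```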
Sample Size
Sample size refers to the number of observations included in a sample. Larger samples generally provide more reliable estimates of population parameters but require more resources. Determining appropriate sample size involves balancing statistical requirements with practical constraints. For detecting a 10% improvement in a process with 95% confidence and 80% power, statistical calculations might indicate that 50 observations are needed, but practical considerations like time and cost influence the final decision.
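One common way to arrive at such a number is the normal-approximation formula for detecting a shift in a process mean with known standard deviation; the sketch below is illustrative, with sigma and the detectable shift chosen as assumptions rather than taken from the text.

```python
import math
from scipy.stats import norm

def sample_size_for_mean_shift(sigma, delta, alpha=0.05, power=0.80):
    """Observations needed to detect a mean shift of `delta` (two-sided test)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # about 1.96 for 95% confidence
    z_beta = norm.ppf(power)           # about 0.84 for 80% power
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)

# Illustrative: detect a shift of 2 units in a process with sigma = 5
print(sample_size_for_mean_shift(sigma=5, delta=2))  # 50
```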
Sampling Frequency
Sampling frequency determines how often samples are collected from a process. High-frequency sampling provides more information about process changes but increases measurement costs. In monitoring chemical concentrations in a production process, sampling every hour might be necessary to detect and respond to problems quickly. However, if the process is very stable, sampling every four hours might suffice, reducing laboratory costs while maintaining adequate process understanding.
Rational Subgrouping
Rational subgrouping involves organizing data collection so that measurements within each subgroup are as similar as possible while differences between subgroups are maximized. This strategy makes detecting process changes easier. In a three-shift manufacturing operation, grouping measurements by shift creates rational subgroups because variation between shifts might indicate operator differences, while variation within shifts reflects natural process variation.
Distribution and Normality Terms
Normal Distribution
The normal distribution, often called the bell curve, is a symmetric probability distribution where most values cluster around the mean with fewer values appearing as you move away from the center. Many natural phenomena and process outputs follow this pattern. Heights of adult males, measurement errors, and many manufacturing dimensions approximate normal distributions. Understanding whether data follows a normal distribution matters because many statistical tools assume normality.
Skewness
Skewness describes the asymmetry of a distribution. Positive skew means the distribution has a longer tail extending toward higher values, while negative skew indicates a longer tail toward lower values. Wait times often show positive skew because while most customers might wait 5-10 minutes, some might wait 30 minutes or more, creating a long right tail. A perfectly normal distribution has zero skewness.
Kurtosis
Kurtosis measures the heaviness of distribution tails compared to a normal distribution. High kurtosis indicates more extreme outliers than expected in a normal distribution, while low kurtosis suggests fewer outliers. In financial applications, returns that exhibit high kurtosis experience more extreme gains or losses than normally distributed returns would predict, affecting risk assessment.
Normality Tests
Normality tests are statistical procedures that assess whether data follows a normal distribution. Common tests include the Anderson-Darling test, Kolmogorov-Smirnov test, and Shapiro-Wilk test. These tests compare your data against what would be expected from a normal distribution. However, with very large samples, these tests might flag minor deviations as significant even when the deviation has little practical impact on subsequent analyses.
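The sketch below applies these ideas to the product-weight data from the continuous-data example, using scipy.stats; with only ten observations the tests have little power, so the output is illustrative rather than conclusive.

```python
from scipy import stats

# Product weights in grams, from the continuous-data example above
weights = [250.3, 249.8, 251.2, 250.1, 249.9, 250.7, 250.4, 250.0, 249.7, 250.6]

print("skewness:", stats.skew(weights))
print("excess kurtosis:", stats.kurtosis(weights))  # 0 for a perfect normal distribution

# Shapiro-Wilk: a small p-value (e.g. < 0.05) suggests departure from normality
w_stat, p_value = stats.shapiro(weights)
print(f"Shapiro-Wilk: W = {w_stat:.3f}, p = {p_value:.3f}")

# Anderson-Darling: compare the statistic against its critical values
result = stats.anderson(weights, dist="norm")
print("Anderson-Darling statistic:", result.statistic)
print("critical values:", result.critical_values)
```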
Correlation and Relationship Terms
Correlation
Correlation measures the strength and direction of the linear relationship between two variables. Correlation coefficients range from −1 to +1. A correlation of +1 indicates a perfect positive linear relationship, −1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship. In studying process factors, you might find that oven temperature and product hardness have a correlation of 0.85, suggesting that higher temperatures strongly associate with harder products.
Correlation Coefficient (r)
The correlation coefficient, represented by r, quantifies correlation between two variables. In analyzing the relationship between advertising spending and sales, for example, an r value close to +1 would indicate that periods with higher advertising spend also tend to show higher sales, while an r near 0 would suggest no linear relationship between the two.
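A short sketch of the calculation with scipy.stats.pearsonr is shown below; the advertising and sales figures are hypothetical, chosen only to illustrate a strong positive relationship.

```python
from scipy import stats

# Hypothetical monthly advertising spend and sales, both in thousands
ad_spend = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]
sales = [110, 118, 131, 145, 148, 160, 172, 178, 195, 210]

r, p_value = stats.pearsonr(ad_spend, sales)
print(f"r = {r:.2f}, p = {p_value:.4f}")  # r near +1 indicates a strong positive linear relationship
```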








