Understanding whether your data follows a normal distribution is a fundamental requirement in many statistical analyses. The Shapiro-Wilk test stands as one of the most powerful and widely used methods for assessing normality in datasets. This comprehensive guide will walk you through everything you need to know about conducting and interpreting the Shapiro-Wilk test, complete with practical examples and step-by-step instructions.
What Is the Shapiro-Wilk Test?
The Shapiro-Wilk test is a statistical hypothesis test that evaluates whether a given sample of data comes from a normally distributed population. Developed by Samuel Shapiro and Martin Wilk in 1965, this test has become the gold standard for normality testing, particularly for small to medium-sized datasets containing fewer than 2000 observations.
Unlike visual methods such as histograms or Q-Q plots, the Shapiro-Wilk test provides an objective, numerical measure of normality. This makes it especially valuable when you need to make definitive decisions about which statistical procedures to apply to your data.
Why Testing for Normality Matters
Many statistical tests and quality improvement methodologies assume that data follows a normal distribution. These include t-tests, analysis of variance (ANOVA), linear regression, and various Six Sigma tools. When data significantly deviates from normality, these methods may produce unreliable results, leading to incorrect conclusions and poor decision-making.
By performing the Shapiro-Wilk test before conducting your main analysis, you can:
- Determine whether parametric or non-parametric tests are appropriate
- Identify potential data quality issues or outliers
- Validate assumptions required for statistical process control
- Ensure the reliability of confidence intervals and hypothesis tests
- Make informed decisions about data transformations
Understanding the Hypotheses
Like all hypothesis tests, the Shapiro-Wilk test operates on two competing hypotheses:
Null Hypothesis (H0): The data is normally distributed. The sample comes from a population that follows a normal distribution.
Alternative Hypothesis (H1): The data is not normally distributed. The sample does not come from a normally distributed population.
The test produces a test statistic (W) and a corresponding p-value that together help you decide which hypothesis to support.
Interpreting the Test Results
The Shapiro-Wilk test statistic (W) ranges from 0 to 1, where values closer to 1 indicate greater similarity to a normal distribution. However, the p-value is what most analysts use for decision-making.
Decision Rule:
- If the p-value is greater than your chosen significance level (typically 0.05), you fail to reject the null hypothesis. This suggests your data is consistent with a normal distribution.
- If the p-value is less than or equal to your significance level, you reject the null hypothesis. This indicates significant evidence that your data is not normally distributed.
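This decision rule can be sketched as a small helper function. The function name and the default alpha of 0.05 are illustrative choices for this article, not part of any standard library:

```python
from scipy import stats

def check_normality(data, alpha=0.05):
    """Run the Shapiro-Wilk test and apply the decision rule.

    Returns (W, p_value, looks_normal), where looks_normal is True
    when we fail to reject the null hypothesis at level alpha.
    """
    w_stat, p_value = stats.shapiro(data)
    return w_stat, p_value, p_value > alpha
```

A result of `looks_normal = True` means only that the data is consistent with normality, not that normality is proven.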
Step-by-Step Guide to Performing the Shapiro-Wilk Test
Step 1: Prepare Your Data
Begin by organizing your data into a single column or vector. Ensure that you have removed any missing values, as these can interfere with the test. The Shapiro-Wilk test requires at least 3 observations; the original 1965 procedure was designed for samples of up to 50 observations, and later extensions handle datasets of up to roughly 2000.
Step 2: Choose Your Significance Level
Select an appropriate significance level (alpha) before conducting the test. The most common choice is 0.05, which represents a 5% risk of concluding that the data is not normal when it actually is. More conservative researchers might choose 0.01, while exploratory analyses might use 0.10.
Step 3: Conduct the Test
Most statistical software packages include built-in functions for the Shapiro-Wilk test. The calculation involves comparing the ordered sample values with expected values from a normal distribution, but the mathematical details are handled automatically by the software.
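In Python, for example, the calculation is handled by scipy.stats.shapiro. A minimal sketch using a reproducible, normally distributed sample (the mean, standard deviation, and seed are arbitrary demonstration values):

```python
import numpy as np
from scipy import stats

# Generate a reproducible sample from a normal distribution
rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=0.5, size=40)

# shapiro returns the W statistic and the p-value
w_stat, p_value = stats.shapiro(sample)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
```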
Step 4: Examine the Results
Review both the W statistic and the p-value. Document these values in your analysis report.
Step 5: Make Your Decision
Compare the p-value to your predetermined significance level and draw your conclusion about normality.
Practical Example with Sample Data
Let us work through a concrete example to illustrate the process. Imagine you are a quality engineer measuring the diameter of manufactured bolts in millimeters. You collect the following 25 measurements:
Sample Dataset: 10.2, 10.1, 10.3, 10.0, 10.2, 10.4, 10.1, 10.3, 10.2, 10.1, 10.3, 10.2, 10.4, 10.0, 10.2, 10.3, 10.1, 10.2, 10.3, 10.2, 10.1, 10.4, 10.2, 10.3, 10.2
You need to verify whether this data follows a normal distribution before conducting further analysis.
Analyzing the Example
After entering this data into statistical software and running the Shapiro-Wilk test with a significance level of 0.05, you might obtain results similar to these:
W statistic: 0.962
P-value: 0.458
Interpretation: Since the p-value (0.458) is greater than the significance level (0.05), you fail to reject the null hypothesis. There is insufficient evidence to conclude that the bolt diameter measurements deviate from a normal distribution. You can proceed with parametric statistical methods that assume normality.
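The same workflow can be reproduced in Python; note that the exact W and p-value your software reports may differ slightly from the illustrative figures above:

```python
from scipy import stats

# The 25 bolt-diameter measurements from the example
bolt_diameters = [10.2, 10.1, 10.3, 10.0, 10.2, 10.4, 10.1, 10.3, 10.2,
                  10.1, 10.3, 10.2, 10.4, 10.0, 10.2, 10.3, 10.1, 10.2,
                  10.3, 10.2, 10.1, 10.4, 10.2, 10.3, 10.2]

w_stat, p_value = stats.shapiro(bolt_diameters)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")

# Apply the decision rule at alpha = 0.05
if p_value > 0.05:
    print("Fail to reject H0: data is consistent with normality")
else:
    print("Reject H0: data departs from normality")
```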
Example of Non-Normal Data
Consider a different scenario where you measure customer wait times in minutes at a service center:
Sample Dataset: 2.1, 2.5, 3.0, 2.8, 3.2, 15.6, 2.9, 3.5, 18.2, 2.7, 3.1, 2.6, 20.5, 3.3, 2.4, 3.0, 2.8, 16.8, 3.2, 2.9
Running the Shapiro-Wilk test on this data might produce:
W statistic: 0.721
P-value: 0.0002
Interpretation: The p-value (0.0002) is much smaller than 0.05, leading you to reject the null hypothesis. This data shows significant departure from normality, likely due to the presence of extreme outliers (the long wait times). You should consider using non-parametric methods or investigating and addressing these outliers before proceeding.
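Running this second dataset through the same code shows how clearly the outliers drive the result:

```python
from scipy import stats

# The 20 customer wait-time measurements from the example
wait_times = [2.1, 2.5, 3.0, 2.8, 3.2, 15.6, 2.9, 3.5, 18.2, 2.7,
              3.1, 2.6, 20.5, 3.3, 2.4, 3.0, 2.8, 16.8, 3.2, 2.9]

w_stat, p_value = stats.shapiro(wait_times)
print(f"W = {w_stat:.3f}, p = {p_value:.4f}")
# With four extreme values among twenty observations, the p-value
# falls well below 0.05 and normality is rejected.
```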
Important Considerations and Limitations
Sample Size Matters: The Shapiro-Wilk test can be overly sensitive with very large samples, potentially flagging minor deviations from normality that have little practical impact on subsequent analyses. Conversely, with very small samples (fewer than 10 observations), the test may lack sufficient power to detect non-normality.
Complement with Visual Methods: Always supplement the Shapiro-Wilk test with visual inspections such as histograms, Q-Q plots, or box plots. These graphics can reveal the nature of any departures from normality and help identify outliers or data entry errors.
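The quantities behind a Q-Q plot can also be computed directly. As one option, scipy.stats.probplot returns the theoretical quantiles, the ordered sample values, and a least-squares fit; a correlation coefficient near 1 supports normality (the seeded sample here is illustrative):

```python
import numpy as np
from scipy import stats

# Reproducible sample drawn from a standard normal distribution
rng = np.random.default_rng(1)
sample = rng.normal(size=30)

# probplot returns the Q-Q coordinates and a straight-line fit
(osm, osr), (slope, intercept, r) = stats.probplot(sample, dist="norm")
print(f"Q-Q correlation: r = {r:.3f}")
```

Plotting `osr` against `osm` and overlaying the fitted line yields the familiar Q-Q plot.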
Context Is Critical: A statistically significant result does not always mean you cannot use parametric methods. Many statistical procedures are robust to moderate departures from normality, especially with larger sample sizes. Consider the practical implications rather than relying solely on the p-value.
What to Do When Data Is Not Normal
If the Shapiro-Wilk test indicates non-normality, you have several options:
- Apply data transformations such as logarithmic, square root, or Box-Cox transformations
- Use non-parametric statistical methods that do not assume normality
- Investigate and potentially remove outliers if they represent data errors
- Increase your sample size, as some procedures become more robust with larger datasets
- Consider alternative distributions that might better fit your data
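As an illustration of the first option, right-skewed data can often be brought closer to normality with a log transformation and then re-tested. The lognormal sample below is generated purely for demonstration:

```python
import numpy as np
from scipy import stats

# A reproducible right-skewed (lognormal) sample
rng = np.random.default_rng(7)
skewed = rng.lognormal(mean=1.0, sigma=0.8, size=50)

# Test for normality before and after a log transformation
_, p_raw = stats.shapiro(skewed)
_, p_log = stats.shapiro(np.log(skewed))

print(f"p before transform: {p_raw:.4f}")
print(f"p after  transform: {p_log:.4f}")
```

Because the log of a lognormal variable is exactly normal, the p-value after transformation should no longer signal a departure from normality.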
Applications in Quality Improvement
The Shapiro-Wilk test plays a vital role in Lean Six Sigma projects and other quality improvement initiatives. Process capability analyses, control charts, and hypothesis testing all rely on normality assumptions. By incorporating normality testing into your DMAIC (Define, Measure, Analyze, Improve, Control) methodology, you ensure that your improvement decisions rest on solid statistical foundations.
Quality professionals use the Shapiro-Wilk test to validate measurement systems, verify process stability assumptions, and determine appropriate statistical process control methods. This rigor distinguishes successful improvement projects from those that fail to deliver sustainable results.
Enhance Your Statistical Expertise
Mastering the Shapiro-Wilk test represents just one component of a comprehensive statistical toolkit. To truly excel in data-driven decision-making and process improvement, you need structured training that builds both theoretical knowledge and practical skills.
Professional certification in Lean Six Sigma provides the systematic framework you need to apply normality testing and other statistical methods effectively in real-world situations. Whether you are beginning your quality journey or advancing your existing capabilities, structured training accelerates your development and enhances your value to organizations.
Enrol in Lean Six Sigma Training Today and transform your approach to data analysis and process improvement. Gain hands-on experience with normality testing, hypothesis testing, statistical process control, and the full range of Six Sigma tools. Our comprehensive programs, taught by experienced practitioners, provide the knowledge and credentials that employers seek. Do not let statistical uncertainty hold back your projects or your career. Take the next step toward becoming a confident, capable problem-solver. Visit our website to explore training options tailored to your experience level and professional goals. Your journey to statistical mastery and process excellence begins now.