Analyse Phase: Creating Process Failure Trees for Effective Problem Solving in Lean Six Sigma

by Lean 6 Sigma Hub | Dec 20, 2025 | DMAIC - Analyze Phase

In the world of continuous improvement and quality management, identifying the root causes of process failures stands as one of the most critical challenges organizations face. The Analyse phase of the DMAIC (Define, Measure, Analyse, Improve, Control) methodology introduces powerful tools that help teams dissect complex problems systematically. Among these tools, Process Failure Trees emerge as an exceptionally valuable technique for visualizing and understanding the multiple pathways through which processes can fail.

This comprehensive guide explores the concept of Process Failure Trees, their construction, application, and practical implementation within the Analyse phase of Lean Six Sigma projects. Whether you are a quality professional, process improvement enthusiast, or business leader seeking to enhance operational excellence, understanding this analytical tool will significantly strengthen your problem-solving capabilities.

Understanding the Analyse Phase in DMAIC Methodology

Before diving into Process Failure Trees specifically, we need to establish context within the broader DMAIC framework. The Analyse phase represents the third stage of this structured improvement methodology, positioned strategically after Define and Measure phases have established the problem scope and collected relevant data.

During the Analyse phase, project teams focus on identifying root causes rather than symptoms. This phase answers the fundamental question: “Why is the process failing?” The tools and techniques employed here transform raw data into actionable insights, enabling teams to target their improvement efforts precisely where they will generate maximum impact.

The primary objectives of the Analyse phase include:

Identifying potential root causes of process failures and defects
Validating hypotheses through statistical analysis
Distinguishing between causes and symptoms
Prioritizing root causes based on their impact on process performance
Establishing relationships between process inputs and outputs

What Are Process Failure Trees?

A Process Failure Tree, also known as a Fault Tree Analysis (FTA), represents a top-down, deductive analytical method that maps out all possible causes of a specific failure or undesired event. Think of it as a detailed family tree, but instead of showing genealogical relationships, it illustrates the logical relationships between various failure modes and their contributing factors.

The technique originated in 1962 at Bell Laboratories, where engineers developed it under a U.S. Air Force contract to evaluate the reliability of the Minuteman missile launch control system. Since then, it has been adapted and widely applied across industries including aerospace, manufacturing, healthcare, software development, and service operations.

Process Failure Trees use logical gates (primarily AND and OR gates) to show how different failures combine to produce an undesired top-level event. This visual representation helps teams understand not just what might go wrong, but how various factors interact to create failure scenarios.

Key Components of Process Failure Trees

To construct and interpret Process Failure Trees effectively, you must understand their fundamental building blocks:

Top Event

The top event represents the ultimate failure or problem you are analyzing. This should be a specific, clearly defined undesirable outcome, such as “Customer receives incorrect order” or “Production line stoppage exceeding 30 minutes.” The top event sits at the apex of your tree, and everything below it represents potential causes or contributing factors.

Intermediate Events

These are failures or conditions that result from combinations of other events. Intermediate events bridge the gap between the top event and basic events, showing the logical progression of how lower-level failures cascade upward to create the ultimate problem.

Basic Events

Basic events represent the fundamental failure modes or root causes that require no further breakdown. These are the actionable elements where improvement interventions can be applied. Basic events typically represent equipment failures, human errors, environmental conditions, or system limitations.

Logic Gates

Logic gates define the relationships between events. The two primary types are:

OR Gate: The output event occurs if any one or more of the input events occur. This represents alternative pathways to failure.

AND Gate: The output event occurs only when all input events occur simultaneously. This represents combined conditions necessary for failure.
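The gate definitions above can be sketched as a small data structure. This is a minimal illustration, not a standard FTA library: the class names `Basic` and `Gate`, the function `occurs`, and the three-event mini-tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Basic:
    """A basic event, identified by name."""
    name: str


@dataclass
class Gate:
    """An intermediate or top event fed by a logic gate."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", Basic]] = field(default_factory=list)


def occurs(node: Union[Gate, Basic], state: Dict[str, bool]) -> bool:
    """Return True if the event occurs, given which basic events are present."""
    if isinstance(node, Basic):
        return state.get(node.name, False)
    results = [occurs(child, state) for child in node.inputs]
    if node.kind == "AND":
        return all(results)  # every input must be present
    if node.kind == "OR":
        return any(results)  # one input is enough
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical mini-tree: the top event needs A and B together, or C alone.
tree = Gate("top", "OR", [
    Gate("a_and_b", "AND", [Basic("A"), Basic("B")]),
    Basic("C"),
])

only_a = occurs(tree, {"A": True})
a_and_b = occurs(tree, {"A": True, "B": True})
only_c = occurs(tree, {"C": True})
print(only_a, a_and_b, only_c)
```

Event A alone does not trigger the top event because it sits under an AND gate, while C alone does because every gate above it is an OR gate.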

Step-by-Step Process for Creating Process Failure Trees

Constructing an effective Process Failure Tree requires systematic thinking and collaborative input from subject matter experts. Follow these detailed steps to develop comprehensive failure trees:

Step 1: Define the Top Event Precisely

Begin by articulating exactly what failure you are analyzing. Ambiguous definitions lead to incomplete analysis. Your top event should be observable, measurable, and significant enough to warrant detailed investigation. Gather data on frequency, impact, and cost associated with this failure to establish its priority.

Step 2: Identify First-Level Causes

Ask yourself: “What immediate conditions or events could directly cause this top event?” List all plausible direct causes. At this stage, include all possibilities without filtering. You can validate and prioritize later using your measurement data.

Step 3: Determine Logical Relationships

For each first-level cause, determine whether it alone could trigger the top event (OR relationship) or whether it must occur in combination with other events (AND relationship). This distinction is crucial for understanding failure scenarios and prioritizing corrective actions.

Step 4: Decompose Each Branch

Take each first-level cause and repeat the questioning process: “What causes this event?” Continue breaking down each branch until you reach basic events that either cannot be meaningfully subdivided further or already represent actionable root causes.

Step 5: Validate Against Data

Cross-reference your Process Failure Tree against the data collected during the Measure phase. Historical failure records, process documentation, and incident reports should support the relationships you have mapped. Remove branches that lack data support and add any failure modes your data reveals but your initial analysis missed.

Step 6: Calculate Probabilities

If you have sufficient data, assign probability values to basic events. Using Boolean algebra and the rules for combining probabilities through logic gates, you can calculate the probability of the top event occurring. This quantitative dimension helps prioritize improvement efforts.
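The combination rules referred to here can be written out as follows. They hold only under the assumption that the inputs to each gate are statistically independent, which the article does not state and which should be checked against the process. The symbols are illustrative: $p_i$ is the probability of the $i$-th input event of a gate with $n$ inputs.

```latex
% Assumes the inputs to each gate are statistically independent.
\begin{align*}
P_{\text{AND}} &= \prod_{i=1}^{n} p_i \\
P_{\text{OR}}  &= 1 - \prod_{i=1}^{n} (1 - p_i) \\
P_{\text{OR}}  &\approx \sum_{i=1}^{n} p_i \quad \text{(when every } p_i \text{ is small)}
\end{align*}
```

Working from the basic events upward, each gate's output becomes an input to the gate above it until the top event is reached.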

Practical Example: E-commerce Order Fulfillment Failure

Let us examine a detailed, practical example to illustrate Process Failure Tree construction. Consider an e-commerce company experiencing problems with incorrect orders being shipped to customers.

Sample Context and Data

The company processes approximately 5,000 orders monthly. Over the past quarter, they recorded 185 incidents of incorrect orders reaching customers, an error rate of about 1.2% across roughly 15,000 orders. This failure generates significant costs including returns processing, shipping expenses, customer service time, and damage to brand reputation. The company estimates each incorrect shipment costs $45 in direct expenses, plus a customer satisfaction impact that is harder to quantify.

After data collection during the Measure phase, the team identified these incident breakdowns:

Wrong item picked from warehouse: 78 incidents (42%)
Correct item, wrong quantity: 35 incidents (19%)
Order packed incorrectly despite correct picking: 28 incidents (15%)
System generated incorrect picking list: 24 incidents (13%)
Wrong shipping label applied: 20 incidents (11%)
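As a quick check on the breakdown above, the counts can be ranked Pareto-style. This sketch uses only the incident counts and the $45 direct-cost estimate from the article. The variable names are illustrative.

```python
# Incident counts as reported in the breakdown above.
incidents = {
    "Wrong item picked from warehouse": 78,
    "Correct item, wrong quantity": 35,
    "Order packed incorrectly despite correct picking": 28,
    "System generated incorrect picking list": 24,
    "Wrong shipping label applied": 20,
}
COST_PER_INCIDENT = 45  # USD, the article's direct-cost estimate

total = sum(incidents.values())
ranked = sorted(incidents.items(), key=lambda kv: kv[1], reverse=True)

# Build (category, count, share %, cumulative share %) rows.
pareto = []
cumulative = 0
for category, count in ranked:
    cumulative += count
    share = round(100 * count / total, 1)
    cumulative_share = round(100 * cumulative / total, 1)
    pareto.append((category, count, share, cumulative_share))

direct_cost = total * COST_PER_INCIDENT

for category, count, share, cumulative_share in pareto:
    print(f"{category:<50} {count:>4} {share:>6}% {cumulative_share:>6}%")
print(f"Total incidents: {total}, direct cost: ${direct_cost:,}")
```

The top two categories together account for just over 60% of incidents, and the quarter's direct cost comes to $8,325 before any customer satisfaction effects.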

Constructing the Process Failure Tree

Top Event: Customer receives incorrect order

The first-level analysis reveals that this top event can occur through multiple independent pathways, so we use an OR gate connecting to the intermediate events below. The wrong-quantity category (35 incidents) is not decomposed further in this example.

Wrong item selected during picking
Correct item selected but incorrectly packed
Correct packing but wrong label applied
System error generating incorrect pick list

Branch 1: Wrong item selected during picking

Further analysis of the 78 wrong-picking incidents reveals this can happen when multiple conditions exist. The team determines this requires both a confusing warehouse situation AND a human error, so an AND gate connects:

Similar products stored in adjacent locations (Basic Event: Probability 0.35 based on warehouse layout analysis)
Picker working without adequate verification (Basic Event: Probability 0.42 based on observation data)

Additional OR-connected paths under wrong picking include:

Inadequate lighting in picking zone (Basic Event: Probability 0.18, occurs in specific warehouse sections)
Barcode scanning equipment malfunction (Basic Event: Probability 0.12 based on equipment logs)

Branch 2: Correct item selected but incorrectly packed

Investigation of the 28 incorrect packing incidents shows these scenarios connected by OR gates:

Multiple orders processed simultaneously causing confusion (Basic Event: Probability 0.25 during peak periods)
Packaging station lacks clear order separation (Basic Event: Probability 0.31 based on workstation audits)
Packer interrupted during process (Basic Event: Probability 0.28 per time-motion studies)

Branch 3: Correct packing but wrong label applied

The 20 labeling errors break down into OR-connected causes:

Printer produces multiple labels in sequence leading to misapplication (Basic Event: Probability 0.22)
Label adhesive fails, requiring replacement label with potential mixup (Basic Event: Probability 0.15)
Manual label application without barcode verification (Basic Event: Probability 0.38)

Branch 4: System error generating incorrect pick list

Analysis of the 24 system-generated errors reveals an AND relationship between:

Inventory database contains incorrect product location data (Basic Event: Probability 0.18)
Recent system update introduced bug in pick list algorithm (Basic Event: Probability 0.09)

Plus OR-connected alternatives:

Product variants not properly distinguished in database (Basic Event: Probability 0.28)
Manual inventory adjustments not properly recorded (Basic Event: Probability 0.33)
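The tree above can be evaluated numerically as a sketch, assuming every basic event is independent and taking the listed values at face value. The helper names `and_gate` and `or_gate` are illustrative. The result is instructive mainly as a sanity check: applied strictly, these values give a top event probability near 0.96, far above the error rate the company actually observed. This suggests the figures are better read as how often each condition is present than as per-order failure probabilities.

```python
from math import prod


def and_gate(*probabilities: float) -> float:
    """All inputs must occur: multiply (assumes independence)."""
    return prod(probabilities)


def or_gate(*probabilities: float) -> float:
    """At least one input occurs: 1 minus the chance that none occur."""
    return 1 - prod(1 - p for p in probabilities)


# Basic-event values are copied from the four branches above.
picking = or_gate(and_gate(0.35, 0.42), 0.18, 0.12)
packing = or_gate(0.25, 0.31, 0.28)
labeling = or_gate(0.22, 0.15, 0.38)
system = or_gate(and_gate(0.18, 0.09), 0.28, 0.33)

top = or_gate(picking, packing, labeling, system)

for name, value in [("picking", picking), ("packing", packing),
                    ("labeling", labeling), ("system", system), ("top", top)]:
    print(f"{name:<9} {value:.3f}")
```

For a quantitative tree to reproduce the observed rate, each basic event would need a probability expressed per order, for example estimated from incident counts divided by order volume.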

Analyzing the Process Failure Tree

Once constructed, your Process Failure Tree becomes a powerful analytical tool. The visual representation immediately highlights several insights:

Critical Path Identification

In our e-commerce example, the “wrong item selected during picking” branch accounts for 42% of failures. Within this branch, the AND relationship between similar product storage and inadequate verification creates a critical vulnerability. Because the gate is AND, removing either condition breaks this path, while addressing both adds a second layer of protection. This makes the branch a high-priority improvement opportunity.

Single Point Failures

Events connected by OR gates represent alternative failure paths, each sufficient on its own. The “manual label application without barcode verification” basic event shows probability 0.38, and every gate between it and the top event is an OR gate, meaning this single factor alone can cause the top event. This makes it a prime target for quick wins through process standardization.

Compound Failures

AND gates reveal where multiple conditions must align for failure to occur. These might seem less urgent since both conditions must exist, but they often represent systemic vulnerabilities. The system error requiring both database inaccuracy AND the software bug suggests underlying data governance issues that, if unaddressed, will create recurring problems.
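Single point failures and compound failures correspond to what fault tree practice calls minimal cut sets: the smallest combinations of basic events that are sufficient to cause the top event. The article does not use that term, so treat the sketch below as one possible way to list them. The labels are shortened paraphrases of the example's basic events, and the tuple encoding of the tree is an assumption made for brevity.

```python
from itertools import product

# A node is a basic-event label (str) or a tuple: (gate kind, list of child nodes).
TREE = ("OR", [
    ("OR", [  # Branch 1: wrong item selected during picking
        ("AND", ["similar products adjacent", "picker lacks verification"]),
        "inadequate lighting",
        "scanner malfunction",
    ]),
    ("OR", [  # Branch 2: correct item selected but incorrectly packed
        "simultaneous orders",
        "no order separation at station",
        "packer interrupted",
    ]),
    ("OR", [  # Branch 3: correct packing but wrong label applied
        "labels printed in sequence",
        "label adhesive fails",
        "manual labeling without scan",
    ]),
    ("OR", [  # Branch 4: system error generating incorrect pick list
        ("AND", ["incorrect location data", "pick list bug"]),
        "variants not distinguished",
        "adjustments not recorded",
    ]),
])


def cut_sets(node):
    """Return every cut set (a frozenset of labels) for the given node."""
    if isinstance(node, str):
        return [frozenset([node])]
    kind, children = node
    child_sets = [cut_sets(child) for child in children]
    if kind == "OR":
        # Any one child's cut set is enough on its own.
        return [cs for sets in child_sets for cs in sets]
    if kind == "AND":
        # One cut set from every child is needed at the same time.
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate kind: {kind}")


def minimal(sets):
    """Drop any cut set that strictly contains another one."""
    unique = set(sets)
    return [s for s in unique if not any(other < s for other in unique)]


mcs = minimal(cut_sets(TREE))
single_points = sorted(next(iter(s)) for s in mcs if len(s) == 1)
compound = sorted(sorted(s) for s in mcs if len(s) > 1)

print("Single point failures:", single_points)
print("Compound failures:", compound)
```

Cut sets of size one are the single point failures, and the two pairs under AND gates are the compound failures discussed above.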

Integrating Process Failure Trees with Other Analyse Phase Tools

Process Failure Trees deliver maximum value when integrated with complementary analytical tools:

Failure Mode and Effects Analysis (FMEA)

While Process Failure Trees map logical relationships between failures, FMEA evaluates the severity, occurrence, and detection of each failure mode. Use your Process Failure Tree to identify failure modes, then apply FMEA to prioritize them based on Risk Priority Numbers (RPN). This combination ensures both comprehensive coverage and smart prioritization.
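The RPN is the product of the severity, occurrence, and detection ratings, each conventionally scored from 1 to 10. The sketch below shows the calculation for three basic events from the example tree. The ratings are hypothetical placeholders chosen for illustration, not values from the article's data.

```python
# Hypothetical 1-10 ratings for illustration only; none come from the article.
failure_modes = [
    {"mode": "Manual label application without barcode verification",
     "severity": 7, "occurrence": 6, "detection": 8},
    {"mode": "Similar products stored in adjacent locations",
     "severity": 6, "occurrence": 5, "detection": 5},
    {"mode": "Bug in pick list algorithm after system update",
     "severity": 8, "occurrence": 2, "detection": 4},
]

# RPN = severity x occurrence x detection.
for fm in failure_modes:
    fm["rpn"] = fm["severity"] * fm["occurrence"] * fm["detection"]

ranked = sorted(failure_modes, key=lambda fm: fm["rpn"], reverse=True)

for fm in ranked:
    print(f"{fm['rpn']:>4}  {fm['mode']}")
```

A higher detection rating means the failure is harder to detect, so poorly detected failure modes rise in the ranking.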

Root Cause Analysis (RCA)

Process Failure Trees provide structure for root cause analysis by systematically breaking down complex problems. The basic events in your tree represent hypothesized root causes that you can validate through techniques like the 5 Whys or fishbone diagrams.

Statistical Analysis

Assign probabilities to basic events using statistical data from your Measure phase. Hypothesis testing can validate whether specific factors truly contribute to failures at statistically significant levels. Regression analysis might reveal which variables most strongly predict failure occurrence.
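One way to test whether a suspected factor is associated with failures is a two-proportion z-test. The sketch below compares labeling error rates for manually labeled orders against scan-verified orders. The counts are hypothetical, invented purely to show the calculation, and the function name `two_proportion_z_test` is illustrative. A significant result shows association only; confirming causation would need further evidence.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Return (z, two-sided p-value) for H0: both groups share one error rate."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided tail area of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: labeling errors among manually labeled orders
# versus orders whose labels were verified with a barcode scan.
manual_errors, manual_orders = 18, 1500
scanned_errors, scanned_orders = 7, 3500

z, p_value = two_proportion_z_test(
    manual_errors, manual_orders, scanned_errors, scanned_orders
)
print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```

The normal approximation used here is reasonable only when each group has enough expected errors, commonly taken as at least five. With sparser data an exact test is the safer choice.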

Common Pitfalls and How to Avoid Them

Even experienced practitioners encounter challenges when creating Process Failure Trees. Avoid these common mistakes:

Analysis Paralysis

Teams sometimes create excessively detailed trees that become unmanageable. Focus on levels of detail that lead to actionable insights. If a branch does not change your improvement recommendations, it may be unnecessarily detailed.

Confirmation Bias

Teams may construct trees that confirm preexisting beliefs about causes rather than objectively analyzing all possibilities. Combat this by involving diverse perspectives, validating against data, and actively seeking disconfirming evidence.

Incorrect Gate Logic

Incorrectly assigned logic gates fundamentally misrepresent failure scenarios. Carefully consider whether events must occur together to cause the failure (AND) or whether any one of them is sufficient on its own (OR). When uncertain, gather more observational data or conduct designed experiments.

Stopping at Symptoms

Ensure your basic events represent true root causes, not symptoms of deeper issues. Apply the “5 Whys” test to each basic event to verify you have reached foundational causes.

Moving from Analysis to Action

The ultimate purpose of Process Failure Trees is enabling effective improvement. Once your analysis is complete, translate insights into action:

Prioritize Based on Impact and Feasibility

Not all root causes merit immediate attention. Consider factors like frequency of occurrence, severity of consequences, cost to address, and implementation timeline. Create a prioritization matrix that balances quick wins with strategic long-term improvements.

Design Targeted Interventions

For each prioritized root cause, develop specific countermeasures. In our e-commerce example, addressing the “similar products in adjacent locations” basic event might involve warehouse reorganization using product differentiation principles. The “manual label application without verification” issue could be resolved by implementing mandatory barcode scanning before order closure.

Predict Improvement Impact

Use the probability calculations in your tree to forecast improvement outcomes. If you eliminate a basic event with probability 0.38 connected through OR gates, you can estimate the reduction in top event occurrence. This quantitative prediction helps justify improvement investments and sets measurable targets for the Improve phase.
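As a worked illustration, take the labeling branch of the e-commerce example, whose three basic events (0.22, 0.15, and 0.38) feed an OR gate. Assuming the events are independent and taking the listed values at face value, the branch probability before and after eliminating the 0.38 event is:

```latex
% Assumes independent basic events; values are from the labeling branch of the example.
\begin{align*}
P_{\text{before}} &= 1 - (1 - 0.22)(1 - 0.15)(1 - 0.38) = 1 - 0.41106 \approx 0.589 \\
P_{\text{after}}  &= 1 - (1 - 0.22)(1 - 0.15) = 1 - 0.663 = 0.337 \\
\Delta P &= P_{\text{before}} - P_{\text{after}} \approx 0.252
\end{align*}
```

This is a branch-level estimate. The effect on the top event depends on the other branches, which must be recombined through the top-level OR gate.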

Real-World Application Across Industries

Process Failure Trees demonstrate versatility across diverse sectors:

Healthcare

Hospitals use failure trees to analyze medication errors, surgical complications, and patient safety incidents. The systematic breakdown helps identify contributing factors from technology failures to communication breakdowns to environmental conditions.

Manufacturing

Production facilities apply this technique to equipment failures, quality defects, and safety incidents. The logical structure helps maintenance teams develop preventive maintenance strategies targeting the most critical failure pathways.

Software Development

Technology companies analyze system outages, data breaches, and user experience failures through this methodology. The tree structure maps technical dependencies and identifies single points of failure in complex architectures.

Financial Services

Banks and financial institutions examine transaction errors, security breaches, and compliance failures. The formal structure supports regulatory documentation requirements while driving operational improvements.

Building Competency in Process Failure Tree Analysis

Mastering Process Failure Trees requires both conceptual understanding and practical application. The technique combines logical thinking, process knowledge, statistical analysis, and collaborative problem-solving. While this article provides foundational knowledge, developing true proficiency demands hands-on practice with real organizational challenges.

Formal Lean Six Sigma training provides structured learning pathways that build these competencies systematically.
