In the world of process improvement and quality management, identifying potential failures before they occur is not just beneficial; it is essential. The Failure Mode and Effects Analysis (FMEA) stands as one of the most powerful tools in the Lean Six Sigma methodology, particularly during the Analyse phase of the DMAIC (Define, Measure, Analyse, Improve, Control) framework. This comprehensive guide will walk you through the intricacies of conducting an effective FMEA, complete with practical examples and real-world applications that demonstrate its transformative impact on organizational excellence.
What is Failure Mode and Effects Analysis?
Failure Mode and Effects Analysis is a systematic, proactive method for evaluating processes to identify where and how they might fail and assessing the relative impact of different failures. The primary objective is to identify potential failure modes, determine their effect on the operation of the product or process, and prioritize actions to reduce failures based on their severity, occurrence, and detectability.
Originally developed in the late 1940s by the United States military, FMEA has evolved into a cornerstone technique across industries, from automotive manufacturing to healthcare, software development to food service. Its versatility and structured approach make it invaluable for organizations committed to continuous improvement and risk mitigation.
The Role of FMEA in the Analyse Phase
Within the DMAIC framework, the Analyse phase serves as the bridge between understanding what is happening (Measure) and determining what to do about it (Improve). During this critical phase, teams dig deep into data to identify root causes of problems and potential sources of variation. FMEA complements other analytical tools by providing a forward-looking perspective that anticipates problems before they manifest.
While tools like root cause analysis examine why failures have occurred, FMEA asks the equally important question: what could go wrong in the future? This proactive stance allows organizations to implement preventive measures rather than reactive fixes, saving time, resources, and reputation.
Core Components of FMEA
Understanding the fundamental elements of FMEA is crucial for conducting an effective analysis. Each component plays a specific role in evaluating and prioritizing potential failures.
Failure Modes
A failure mode describes the way in which a process, product, or service could potentially fail to meet design intent or performance requirements. These are the specific ways something can go wrong. For example, in a coffee shop scenario, failure modes might include: coffee served at incorrect temperature, wrong order delivered to customer, excessive wait time, or equipment malfunction during peak hours.
Failure Effects
The consequences of each failure mode on the customer, process, or system constitute the failure effects. These effects can range from minor inconveniences to catastrophic outcomes. Using our coffee shop example, the effect of serving coffee at incorrect temperature might be customer dissatisfaction, negative reviews, or even potential injury if the beverage is dangerously hot.
Severity, Occurrence, and Detection
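The three critical ratings that determine prioritization are severity (how serious the effect is), occurrence (how frequently the failure mode happens), and detection (how likely the failure is to be caught before reaching the customer). Each is typically rated on a scale of 1 to 10, with higher numbers indicating greater severity, greater frequency, or greater difficulty in detection. Note that the detection scale runs in the opposite direction to intuition: a high detection rating means the failure is unlikely to be caught, not that detection is good.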
Risk Priority Number (RPN)
The RPN is calculated by multiplying severity, occurrence, and detection ratings. This numerical value helps teams prioritize which failure modes require immediate attention. The maximum RPN is 1000 (10 x 10 x 10), while the minimum is 1 (1 x 1 x 1). Generally, higher RPNs indicate higher priority for corrective action.
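The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a standard implementation: the function name `risk_priority_number` is an assumption made for this example, and the range check simply enforces the 1 to 10 scale described above.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Return severity x occurrence x detection, each rated from 1 to 10."""
    for name, value in (
        ("severity", severity),
        ("occurrence", occurrence),
        ("detection", detection),
    ):
        # Reject anything that is not a whole-number rating on the 1-10 scale.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 10:
            raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")
    return severity * occurrence * detection


print(risk_priority_number(1, 1, 1))     # minimum: 1
print(risk_priority_number(10, 10, 10))  # maximum: 1000
```

Validating the inputs up front is a design choice: a rating of 0 or 11 usually signals a data entry mistake, and it is easier to catch at this point than after the RPNs have been ranked.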
Step-by-Step Process for Conducting FMEA
Step 1: Assemble Your FMEA Team
Success begins with the right people. An effective FMEA team should include individuals with diverse perspectives and expertise related to the process under examination. Include process owners, operators, engineers, quality specialists, and customer representatives when possible. A cross-functional team typically produces more comprehensive and realistic assessments.
For our working example, let us consider a mid-sized electronics manufacturer producing smartphone charging cables. The FMEA team would include production managers, quality control specialists, design engineers, procurement staff, and customer service representatives who handle warranty claims.
Step 2: Define the Scope and Boundaries
Clearly articulate what process, product, or service you are analyzing. Be specific about boundaries to ensure the analysis remains focused and manageable. For the charging cable manufacturer, the scope might be limited to the cable assembly process, specifically the steps involved in connecting the USB connector to the cable wire.
Step 3: Map the Process
Create a detailed process map showing each step in the workflow. This visual representation helps ensure the team has a shared understanding and identifies all potential failure points. For the cable assembly process, the key steps might include:
- Wire stripping and preparation
- Connector pin insertion
- Soldering connections
- Insulation application
- Connector housing assembly
- Quality inspection
- Packaging
Step 4: Identify Potential Failure Modes
For each process step, brainstorm all the ways it could fail. Encourage creative thinking and avoid dismissing ideas prematurely. Document every potential failure mode, no matter how unlikely it may seem initially.
For the soldering connections step in our example, potential failure modes might include: insufficient solder applied, excessive solder creating bridges between pins, cold solder joints due to incorrect temperature, wrong pins soldered together, or solder contamination from foreign materials.
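One way to keep the brainstorm organized is to record failure modes against the process map and check which steps have not been covered yet. The sketch below uses the cable assembly steps and the soldering failure modes listed above; the names `PROCESS_STEPS`, `failure_modes`, and `steps_without_failure_modes` are illustrative assumptions, not part of any FMEA standard.

```python
# Process steps from the cable assembly map, in workflow order.
PROCESS_STEPS = [
    "Wire stripping and preparation",
    "Connector pin insertion",
    "Soldering connections",
    "Insulation application",
    "Connector housing assembly",
    "Quality inspection",
    "Packaging",
]

# Failure modes brainstormed so far, keyed by process step.
failure_modes = {
    "Soldering connections": [
        "Insufficient solder applied",
        "Excessive solder creating bridges between pins",
        "Cold solder joint due to incorrect temperature",
        "Wrong pins soldered together",
        "Solder contamination from foreign materials",
    ],
}


def steps_without_failure_modes(steps, modes_by_step):
    """Return the steps that have no failure modes recorded yet."""
    return [step for step in steps if not modes_by_step.get(step)]


for step in steps_without_failure_modes(PROCESS_STEPS, failure_modes):
    print(f"Still to brainstorm: {step}")
```

A coverage check like this helps the team notice when a session has concentrated on one familiar step and left the rest of the process map untouched.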
Step 5: Determine Failure Effects
For each identified failure mode, describe the consequences. Consider effects on the end user, downstream processes, and the organization. Effects can cascade, so consider both immediate and downstream consequences.
Taking the “cold solder joint” failure mode from our example, the effects might include: intermittent electrical connection, complete connection failure during use, cable overheating during charging, device damage, potential fire hazard, customer injury, warranty claims, and brand reputation damage.
Step 6: Assign Severity Ratings
Rate the seriousness of each effect on a scale of 1 to 10. Use consistent criteria across the team. A rating of 1 indicates minimal consequence, while 10 represents catastrophic impact potentially involving safety hazards or regulatory non-compliance.
For the cold solder joint example, given the potential for device damage and a fire hazard, the team might assign a severity rating of 8, reflecting serious safety and legal implications.
Step 7: Identify Potential Causes
Determine the root causes that could lead to each failure mode. Understanding causes is essential for developing effective preventive actions. Causes for cold solder joints might include: soldering iron temperature set too low, insufficient heating time, contaminated surfaces, incorrect solder type, operator training deficiency, or faulty soldering equipment.
Step 8: Assign Occurrence Ratings
Rate how frequently each cause is likely to occur on a scale of 1 to 10. Base these ratings on historical data when available, or on team expertise and industry benchmarks when data is limited.
If the cable manufacturer has documented that soldering temperature issues occur approximately once per 500 units produced, the team might assign an occurrence rating of 3, indicating a low but not negligible frequency.
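When historical data is available, the mapping from failure rate to rating can be written down as a lookup so that every team member applies it the same way. The sketch below is only an illustration: the cut-offs in `OCCURRENCE_SCALE` are hypothetical values chosen so that one failure per 500 units maps to a 3, as in the example above. They are not taken from any published standard and should be replaced with criteria your team agrees on.

```python
from fractions import Fraction

# HYPOTHETICAL occurrence scale: (minimum failure rate, rating), highest first.
# These cut-offs are illustrative assumptions, not a published standard.
OCCURRENCE_SCALE = [
    (Fraction(1, 2), 10),
    (Fraction(1, 5), 9),
    (Fraction(1, 10), 8),
    (Fraction(1, 20), 7),
    (Fraction(1, 50), 6),
    (Fraction(1, 100), 5),
    (Fraction(1, 200), 4),
    (Fraction(1, 500), 3),
    (Fraction(1, 5000), 2),
]


def occurrence_rating(failing_units: int, total_units: int) -> int:
    """Map an observed failure rate to a 1-10 occurrence rating."""
    if total_units <= 0 or not 0 <= failing_units <= total_units:
        raise ValueError("need 0 <= failing_units <= total_units and total_units > 0")
    # Fractions avoid floating-point surprises at the exact cut-off values.
    rate = Fraction(failing_units, total_units)
    for minimum_rate, rating in OCCURRENCE_SCALE:
        if rate >= minimum_rate:
            return rating
    return 1  # rarer than the lowest cut-off


print(occurrence_rating(1, 500))  # 3 under this hypothetical scale
```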
Step 9: Identify Current Controls
Document existing measures that either prevent the failure mode from occurring or detect it before the product reaches the customer. For our soldering example, current controls might include: temperature monitoring systems on soldering equipment, operator training programs, visual inspections, electrical continuity testing, and periodic equipment calibration.
Step 10: Assign Detection Ratings
Rate the likelihood that current controls will detect the failure mode before it causes harm, on a scale of 1 to 10. A rating of 1 means detection is virtually certain, while 10 means detection is highly unlikely or impossible.
If the manufacturer performs electrical continuity testing on every cable, which would reliably detect cold solder joints, the team might assign a detection rating of 2, indicating high confidence in catching the defect.
Step 11: Calculate Risk Priority Numbers
Multiply severity, occurrence, and detection ratings to calculate the RPN for each failure mode. For our cold solder joint example: RPN = 8 (severity) x 3 (occurrence) x 2 (detection) = 48.
Step 12: Prioritize and Develop Action Plans
Rank failure modes by RPN and develop action plans for the highest priority items. Many organizations establish a threshold RPN above which action is mandatory. Actions might focus on reducing severity (design changes), reducing occurrence (process improvements), or improving detection (enhanced testing).
Practical Example: Complete FMEA Sample Dataset
To illustrate how an FMEA comes together, here is a sample dataset for the smartphone charging cable assembly process:
Process Step: Soldering connections
Failure Mode 1: Cold solder joint
Potential Effects: Intermittent connection, cable failure, device damage, fire hazard
Severity: 8
Potential Causes: Incorrect soldering temperature
Occurrence: 3
Current Controls: Temperature monitoring, electrical testing
Detection: 2
RPN: 48
Recommended Actions: Implement automated temperature control with alerts, increase testing frequency
Failure Mode 2: Solder bridges between pins
Potential Effects: Short circuit, device damage, immediate cable failure
Severity: 9
Potential Causes: Excessive solder application, operator error
Occurrence: 2
Current Controls: Visual inspection, electrical testing
Detection: 2
RPN: 36
Recommended Actions: Implement solder paste dispensing system for consistent application
Process Step: Wire stripping
Failure Mode 3: Insufficient insulation removed
Potential Effects: Poor electrical connection, increased resistance, cable heating
Severity: 6
Potential Causes: Incorrect tool setting, worn stripping blades
Occurrence: 4
Current Controls: Periodic tool calibration, visual inspection
Detection: 5
RPN: 120
Recommended Actions: Implement automated wire stripping equipment, add measurement verification step
Failure Mode 4: Wire strands damaged during stripping
Potential Effects: Reduced current capacity, premature cable failure
Severity: 7
Potential Causes: Excessive stripping force, dull blades
Occurrence: 3
Current Controls: Blade replacement schedule, operator training
Detection: 6
RPN: 126
Recommended Actions: Establish preventive maintenance program with tracking, add microscopic inspection sampling
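In this example, “Wire strands damaged during stripping” emerges as the highest priority with an RPN of 126, followed closely by “Insufficient insulation removed” at 120. These would receive immediate attention in the improvement planning phase. “Solder bridges between pins” has the lowest RPN (36) but a severity of 9, so it also warrants review despite its low ranking.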
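The sample dataset can also be held as records so that the RPNs are recomputed rather than typed by hand. The following is a minimal sketch using the four failure modes above; the `FailureMode` class and `WORKSHEET` list are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    step: str
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Ratings copied from the sample dataset above.
WORKSHEET = [
    FailureMode("Soldering connections", "Cold solder joint", 8, 3, 2),
    FailureMode("Soldering connections", "Solder bridges between pins", 9, 2, 2),
    FailureMode("Wire stripping", "Insufficient insulation removed", 6, 4, 5),
    FailureMode("Wire stripping", "Wire strands damaged during stripping", 7, 3, 6),
]

# Highest RPN first.
ranked = sorted(WORKSHEET, key=lambda fm: fm.rpn, reverse=True)
for fm in ranked:
    print(f"{fm.rpn:>4}  S={fm.severity} O={fm.occurrence} D={fm.detection}  {fm.name}")
```

Deriving the RPN from the three ratings, instead of storing it as a separate field, means the ranking stays correct whenever the team revises a rating.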
Common Pitfalls and How to Avoid Them
Inconsistent Rating Scales
Team members often interpret rating scales differently. Avoid this by establishing clear criteria for each rating level before beginning the analysis. Create a reference guide that defines what each number means for severity, occurrence, and detection specific to your context.
Analysis Paralysis
Some teams become so thorough that the FMEA process becomes unwieldy and time-consuming. Set realistic boundaries on scope and establish time limits for each phase. Remember that FMEA is a living document that can be updated as knowledge increases.
Focusing Only on High RPNs
While RPN provides valuable prioritization, do not ignore failure modes with high severity ratings even if their overall RPN is moderate. A failure mode with severity of 9 or 10 deserves attention regardless of occurrence or detection ratings.
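A simple way to guard against this pitfall is to flag failure modes on two independent criteria instead of RPN alone. The sketch below applies that idea to the sample dataset. The threshold of 100 is a hypothetical value chosen for illustration, and the function name `reasons_for_action` is an assumption; the severity cut-off of 9 follows the guidance above.

```python
RPN_THRESHOLD = 100    # HYPOTHETICAL: action required at or above this RPN
SEVERITY_OVERRIDE = 9  # severity 9 or 10 always gets attention


def reasons_for_action(severity, occurrence, detection,
                       rpn_threshold=RPN_THRESHOLD,
                       severity_override=SEVERITY_OVERRIDE):
    """Return the reasons a failure mode needs action (empty list if none)."""
    reasons = []
    if severity * occurrence * detection >= rpn_threshold:
        reasons.append("RPN at or above threshold")
    if severity >= severity_override:
        reasons.append("high severity")
    return reasons


# (name, severity, occurrence, detection) from the sample dataset.
sample = [
    ("Cold solder joint", 8, 3, 2),
    ("Solder bridges between pins", 9, 2, 2),
    ("Insufficient insulation removed", 6, 4, 5),
    ("Wire strands damaged during stripping", 7, 3, 6),
]

for name, s, o, d in sample:
    reasons = reasons_for_action(s, o, d)
    print(f"{name}: {', '.join(reasons) if reasons else 'no action triggered'}")
```

Under these assumed settings, the solder bridge failure mode is flagged on severity alone even though its RPN of 36 is the lowest in the dataset.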
Inadequate Follow-Through
The most comprehensive FMEA is worthless without action. Assign clear ownership for recommended actions, establish deadlines, and create accountability mechanisms. Schedule follow-up reviews to verify that actions have been implemented and assess their effectiveness.
Benefits of Implementing FMEA
Organizations that effectively implement FMEA experience numerous tangible and intangible benefits. Financial advantages include reduced warranty costs, fewer product recalls, decreased scrap and rework, and lower insurance premiums. Quality improvements manifest through enhanced product reliability, improved customer satisfaction, and strengthened brand reputation.
Operational benefits include better process understanding, improved communication across departments, enhanced risk awareness, and more efficient resource allocation. The systematic documentation creates valuable institutional knowledge that persists even as personnel change.
From a strategic perspective, FMEA demonstrates due diligence to regulators and customers, supports compliance with industry standards like ISO 9001 and IATF 16949, and provides competitive advantage through superior quality and reliability.
Integrating FMEA with Other Six Sigma Tools
FMEA does not exist in isolation but works synergistically with other Lean Six Sigma tools. Process mapping provides the foundation for identifying failure points. Control charts and capability analysis from the Measure phase inform occurrence ratings with statistical evidence. Root cause analysis tools like fishbone diagrams and the 5 Whys help identify causes of failure modes.
During the Improve phase, Design of Experiments (DOE) can test proposed solutions identified through FMEA. In the Control phase, the improved process incorporates prevention and detection methods recommended by FMEA, which are then monitored through control plans and statistical process control.
Advanced FMEA Variations
Design FMEA (DFMEA)
Applied during product development, DFMEA evaluates potential failures in product design before manufacturing begins. This proactive approach prevents costly design flaws from reaching production.
Process FMEA (PFMEA)
Focused on manufacturing and assembly processes, PFMEA identifies potential failures in how products are made rather than in the design itself. Our charging cable example represents a PFMEA application.
System FMEA (SFMEA)
This higher-level analysis examines interactions between subsystems and components, identifying failures that might occur due to interface issues rather than individual component problems.
Service FMEA
Applied to service industries, this variation analyzes potential failures in service delivery processes, from healthcare procedures to financial transactions to customer support workflows.
Real-World Impact: Success Stories
Consider a medical device manufacturer that implemented FMEA during the design of an insulin pump. The analysis identified a potential failure mode where the pump could deliver incorrect dosages due to software timing issues under specific battery conditions. By detecting this during design, the company modified the software architecture and added redundant safety checks. This prevented a potentially life-threatening failure that would have resulted in product recalls, legal liability, and incalculable damage to patients and reputation.
In the automotive industry, a transmission manufacturer used FMEA to