[OR] [P2] [S3] [ITo] [C14] Common Challenges and Pitfalls

Written by Moh Heng Goh | May 8, 2026 9:59:45 AM

[P2] [S3] Chapter 14

Introduction

While the concept of impact tolerance is increasingly well understood, its practical implementation remains challenging for many organisations.

Common pitfalls arise from legacy thinking, fragmented data, weak governance, and insufficient testing. If not addressed, these issues can lead to misaligned tolerances, ineffective resilience strategies, and regulatory non-compliance.

This chapter highlights the most frequent challenges encountered when setting and implementing impact tolerances, along with practical mitigation strategies to address them.

Purpose of the Chapter

The purpose of this chapter is to:

Identify common challenges in setting impact tolerances
Highlight pitfalls that undermine resilience effectiveness
Provide practical mitigation strategies
Help organisations avoid common implementation mistakes

Over-Reliance on RTO/RPO

Challenge

Many organisations continue to rely heavily on traditional Recovery Time Objective (RTO) and Recovery Point Objective (RPO) metrics when defining impact tolerance.

Why This Is a Problem

RTO and RPO are technology- or process-centric, not service-centric
They focus on recovery after disruption, not continuity during disruption
They do not fully capture customer harm, systemic impact, or service degradation

Example

A system may recover within its RTO of 4 hours, but during that time:

Customers cannot access funds
Transactions fail
Regulatory expectations are breached

Mitigation Strategies

Shift from system-level metrics to service-level impact metrics
Combine RTO/RPO with:

Customer impact thresholds
Transaction volume limits
Service capacity measures

Align tolerances with Critical Business Service (CBS) outcomes, not individual systems
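
The combination of time-based and service-level measures described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the class names, field names, and all numeric limits are hypothetical and chosen only to mirror the 4-hour example.

```python
from dataclasses import dataclass


@dataclass
class ServiceTolerance:
    """Hypothetical service-level tolerance for one CBS (limits are illustrative)."""
    max_outage_hours: float        # maximum tolerable duration of disruption
    max_customers_affected: int    # customer impact threshold
    max_failed_transactions: int   # transaction volume limit


@dataclass
class DisruptionSnapshot:
    """Observed state of the service during a disruption."""
    outage_hours: float
    customers_affected: int
    failed_transactions: int


def breached_limits(tol: ServiceTolerance, obs: DisruptionSnapshot) -> list[str]:
    """Return the names of the limits that the disruption has exceeded."""
    breaches = []
    if obs.outage_hours > tol.max_outage_hours:
        breaches.append("outage duration")
    if obs.customers_affected > tol.max_customers_affected:
        breaches.append("customers affected")
    if obs.failed_transactions > tol.max_failed_transactions:
        breaches.append("failed transactions")
    return breaches


# The system recovers inside a 4-hour window, yet customer impact is exceeded.
tolerance = ServiceTolerance(max_outage_hours=4, max_customers_affected=10_000,
                             max_failed_transactions=5_000)
snapshot = DisruptionSnapshot(outage_hours=3.5, customers_affected=25_000,
                              failed_transactions=1_200)
print(breached_limits(tolerance, snapshot))  # prints ['customers affected']
```

A time-only check would report this disruption as acceptable; the additional dimensions are what surface the customer harm.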

Lack of Service-Level Thinking

Challenge

Organisations often focus on internal processes rather than end-to-end service delivery.

Why This Is a Problem

Leads to fragmented tolerance definitions
Ignores how disruptions affect the customer experience
Fails to capture interdependencies across functions

Example

Payment initiation works, but clearing fails
Individual systems appear operational, but the service is not delivered end-to-end

Mitigation Strategies

Adopt a service-centric approach
Define tolerances at the CBS and Sub-CBS levels
Map and assess end-to-end service delivery chains
Conduct end-to-end scenario testing
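
One way to express the end-to-end view is to treat a service as delivered only when every step in its delivery chain is operational. The sketch below assumes a simple linear chain; the step names and status values are hypothetical and echo the payment example above.

```python
def failed_steps(chain: list[str], step_status: dict[str, bool]) -> list[str]:
    """Return the steps in the chain that are not operational.

    Steps missing from step_status are treated as not operational,
    so an unmapped step cannot silently count as healthy.
    """
    return [step for step in chain if not step_status.get(step, False)]


def service_delivered(chain: list[str], step_status: dict[str, bool]) -> bool:
    """A service is delivered only if no step in its chain has failed."""
    return not failed_steps(chain, step_status)


# Hypothetical payment service: initiation works, but clearing has failed.
payment_chain = ["initiation", "authorisation", "clearing", "settlement"]
status = {"initiation": True, "authorisation": True,
          "clearing": False, "settlement": True}

print(service_delivered(payment_chain, status))  # prints False
print(failed_steps(payment_chain, status))       # prints ['clearing']
```

Three of the four components report as healthy, but the service-level answer is still "not delivered".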

Poor Data Quality in Mapping

Challenge

Dependency mapping is often incomplete, outdated, or inaccurate.

Why This Is a Problem

Hidden dependencies are not identified
Single points of failure remain undiscovered
Tolerances are based on incorrect assumptions

Example

Missing a third-party dependency leads to unexpected service failure
Outdated system mapping does not reflect the current architecture

Mitigation Strategies

Establish structured mapping templates
Regularly update mapping data
Validate mapping through cross-functional workshops
Integrate mapping with configuration management and asset inventories
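
As an illustration of a basic mapping quality check, the sketch below flags dependencies that have no alternate (potential single points of failure) or that have not been validated recently. The record fields, the example dependencies, and the 180-day validation window are assumptions made for this example, not a recommended standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dependency:
    """Hypothetical entry in a dependency map."""
    name: str
    kind: str             # e.g. "system", "third party", "people"
    has_alternate: bool   # is there a fallback if this dependency fails?
    last_validated: date  # when the mapping entry was last confirmed


def mapping_issues(deps: list[Dependency], today: date,
                   max_age_days: int = 180) -> dict[str, list[str]]:
    """Return the quality issues found for each dependency that has any."""
    issues = {}
    for dep in deps:
        problems = []
        if not dep.has_alternate:
            problems.append("single point of failure")
        if (today - dep.last_validated).days > max_age_days:
            problems.append("mapping not validated recently")
        if problems:
            issues[dep.name] = problems
    return issues


today = date(2026, 5, 8)
deps = [
    Dependency("Core banking platform", "system", True, date(2026, 3, 1)),
    Dependency("Card processor", "third party", False, date(2025, 6, 1)),
]
print(mapping_issues(deps, today))
```

In practice the records would come from the configuration management database or asset inventory rather than being typed in by hand.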

Unrealistic Tolerance Setting

Challenge

Impact tolerances may be set too leniently or too strictly.

Why This Is a Problem

Tolerance setting	Risk
Too lenient	Excessive customer harm and regulatory risk
Too strict	Operationally unachievable and costly

Example

Tolerance set at 1 hour, but actual recovery capability is 4 hours (too strict to be achievable)
Tolerance set at 24 hours for a critical payment service (too lenient and misaligned with regulatory expectations)

Mitigation Strategies

Use scenario-based calibration
Align tolerances with:

Customer expectations
Regulatory requirements
Actual operational capability

Validate tolerances through testing and data analysis
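
The calibration check can be sketched as a comparison of a proposed tolerance against two reference points: demonstrated recovery capability and the maximum a regulator would be expected to accept. The function name and the 8-hour regulatory ceiling used here are hypothetical; real values depend on the service and the jurisdiction.

```python
def assess_tolerance(tolerance_hours: float, capability_hours: float,
                     regulatory_max_hours: float) -> str:
    """Classify a proposed tolerance against capability and a regulatory ceiling."""
    if tolerance_hours > regulatory_max_hours:
        return "too lenient: exceeds regulatory expectation"
    if tolerance_hours < capability_hours:
        return "too strict: below demonstrated recovery capability"
    return "plausible: within capability and regulatory limits"


# Capability of 4 hours and an assumed regulatory ceiling of 8 hours.
print(assess_tolerance(1, capability_hours=4, regulatory_max_hours=8))
print(assess_tolerance(24, capability_hours=4, regulatory_max_hours=8))
print(assess_tolerance(6, capability_hours=4, regulatory_max_hours=8))
```

A "too strict" result does not necessarily mean the tolerance should be relaxed; it may instead signal that investment is needed to close the capability gap.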

Weak Governance and Ownership

Challenge

Accountability for each CBS and its impact tolerances is unclear or missing.

Why This Is a Problem

Tolerances are not actively managed
No ownership for maintaining service within thresholds
Weak oversight and challenge

Example

Multiple teams assume responsibility, but no clear owner exists
Tolerance breaches are not escalated or addressed

Mitigation Strategies

Assign clear ownership for each CBS and Sub-CBS
Align governance with the Three Lines of Defence model
Establish Board and Senior Management oversight
Define clear approval, monitoring, and escalation processes
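
A simple ownership check over a tolerance register might look like the sketch below. The register structure, service names, and role titles are hypothetical; the point is only that "no owner" and "several owners" are both flagged as accountability gaps.

```python
def ownership_gaps(owners_by_cbs: dict[str, list[str]]) -> dict[str, str]:
    """Flag services with no named owner or with more than one."""
    gaps = {}
    for cbs, owners in owners_by_cbs.items():
        if not owners:
            gaps[cbs] = "no owner assigned"
        elif len(owners) > 1:
            gaps[cbs] = "multiple owners; accountability unclear"
    return gaps


register = {
    "Retail payments": ["Head of Payments"],
    "Online banking": ["Head of Digital", "Head of IT Operations"],
    "Trade finance": [],
}
print(ownership_gaps(register))
```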

Inadequate Scenario Testing

Challenge

Scenario testing is often limited, unrealistic, or not conducted regularly.

Why This Is a Problem

Tolerances are not validated
Gaps remain unidentified
Organisations lack evidence for regulatory review

Example

Testing only covers system recovery, not end-to-end service
Scenarios do not reflect severe but plausible conditions

Mitigation Strategies

Conduct regular scenario testing (OR-P2-S4)
Use severe but plausible scenarios (SuPS)
Include:

Technology failures
Cyber incidents
Third-party disruptions
People unavailability

Perform end-to-end CBS testing
Measure actual performance against the defined tolerance
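
Comparing test outcomes with the defined tolerance can be sketched as follows. The scenario names and observed durations are invented for illustration; each value is assumed to be the end-to-end service disruption, in hours, recorded during a test.

```python
def evaluate_scenarios(tolerance_hours: float,
                       observed: dict[str, float]) -> dict[str, dict]:
    """Compare the observed disruption per scenario with the tolerance."""
    return {
        name: {
            "observed_hours": hours,
            "headroom_hours": tolerance_hours - hours,  # negative means breach
            "within_tolerance": hours <= tolerance_hours,
        }
        for name, hours in observed.items()
    }


results = evaluate_scenarios(
    tolerance_hours=4.0,
    observed={
        "Data centre outage": 3.0,
        "Ransomware attack": 9.5,
        "Third-party outage": 4.0,
    },
)
for name, outcome in results.items():
    print(name, outcome)
```

Recording headroom as well as a pass or fail result shows which scenarios sit close to the limit and may deserve attention before they become breaches.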

Siloed Implementation Across Functions

Challenge

Different functions (ORM, BCM, IT, Risk) operate independently.

Why This Is a Problem

Inconsistent metrics and assumptions
Lack of coordination during disruptions
Inefficient use of resources

Mitigation Strategies

Integrate impact tolerance across:

Operational Risk Management
Business Continuity Management
Cyber Resilience
Third-Party Risk Management

Establish common metrics and frameworks
Promote cross-functional collaboration

Insufficient Monitoring and Metrics

Challenge

Monitoring mechanisms and performance indicators are missing or ineffective.

Why This Is a Problem

Organisations cannot detect approaching tolerance breaches
Reactive rather than proactive response
Limited visibility into resilience performance

Mitigation Strategies

Define clear metrics and Key Risk Indicators (KRIs)
Implement real-time monitoring systems
Establish early warning thresholds
Integrate monitoring into incident response frameworks
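
An early warning threshold can be expressed as a fraction of the tolerance already consumed. In the sketch below, the 50% and 80% trigger points and the status labels are assumptions for illustration, not recommended values.

```python
def tolerance_status(elapsed_minutes: float, tolerance_minutes: float,
                     warning_fraction: float = 0.5,
                     critical_fraction: float = 0.8) -> str:
    """Return an alert level based on how much of the tolerance has been used."""
    if tolerance_minutes <= 0:
        raise ValueError("tolerance_minutes must be positive")
    used = elapsed_minutes / tolerance_minutes
    if used >= 1.0:
        return "BREACHED"
    if used >= critical_fraction:
        return "CRITICAL"
    if used >= warning_fraction:
        return "WARNING"
    return "NORMAL"


# A 4-hour (240-minute) tolerance checked at several points in a disruption.
for elapsed in (60, 130, 200, 240):
    print(elapsed, tolerance_status(elapsed, tolerance_minutes=240))
```

Linking the WARNING and CRITICAL levels to escalation steps in the incident response framework gives responders time to act before the tolerance is breached.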

Failure to Update and Refine Tolerances

Challenge

Impact tolerances are not reviewed regularly.

Why This Is a Problem

Tolerances become outdated
Misalignment with the current business and technology environment
Increased risk of failure during disruption

Mitigation Strategies

Conduct regular reviews (at least annually)
Update tolerances based on:

Incident lessons learned
Scenario testing results
Business and technology changes
Regulatory updates

Embed tolerance review into governance processes
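
The review triggers listed above can be combined into a single check: a tolerance is due for review if the periodic interval has lapsed or if any relevant change has occurred since the last review. The function name, the 365-day default, and the example dates and events are hypothetical.

```python
from datetime import date


def review_reasons(last_review: date, today: date,
                   changes_since_review: tuple[str, ...] = (),
                   max_interval_days: int = 365) -> list[str]:
    """Return the reasons a tolerance review is due (an empty list means not due)."""
    reasons = []
    if (today - last_review).days > max_interval_days:
        reasons.append("periodic review overdue")
    reasons.extend(f"trigger: {change}" for change in changes_since_review)
    return reasons


today = date(2026, 5, 8)
print(review_reasons(date(2025, 3, 1), today, ("core banking migration",)))
print(review_reasons(date(2026, 1, 15), today))  # prints []
```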

Lack of Evidence and Documentation

Challenge

Documentation is insufficient to support tolerance setting and validation.

Why This Is a Problem

Difficult to demonstrate compliance
Weak audit trail
Reduced credibility with regulators

Mitigation Strategies

Maintain comprehensive documentation:

Impact tolerance registers
Scenario testing results
Dependency maps
Governance approvals

Ensure documentation is structured, updated, and accessible

Summary of Challenges and Mitigations

Challenge	Key Mitigation
Over-reliance on RTO/RPO	Adopt service-level impact metrics
Lack of service-level thinking	Focus on CBS and end-to-end delivery
Poor mapping data	Improve data quality and validation
Unrealistic tolerances	Use scenario-based calibration
Weak governance	Strengthen ownership and oversight
Inadequate testing	Conduct regular, realistic scenario testing
Siloed implementation	Integrate across resilience pillars
Weak monitoring	Implement KRIs and real-time monitoring
Lack of updates	Establish regular review cycles
Poor documentation	Maintain comprehensive audit evidence

Setting impact tolerances is a complex process that requires a shift in mindset, strong governance, accurate data, and continuous validation. The challenges outlined in this chapter are common across organisations, but they can be effectively managed through structured methodologies, cross-functional collaboration, and disciplined execution.

By recognising and addressing these pitfalls early, organisations can avoid missteps that undermine resilience and instead build a robust, credible, and sustainable impact tolerance framework. Ultimately, overcoming these challenges ensures that impact tolerances are not only defined but also realistic, actionable, and aligned with both operational capability and regulatory expectations.
