[OR] [P2] [S3] [ITo] [C14] Common Challenges and Pitfalls

Written by Moh Heng Goh | May 8, 2026 9:59:45 AM

[P2] [S3] Chapter 14

Common Challenges and Pitfalls

Introduction

While the concept of impact tolerance is increasingly well understood, its practical implementation remains challenging for many organisations.

Common pitfalls arise from legacy thinking, fragmented data, weak governance, and insufficient testing. If not addressed, these issues can lead to misaligned tolerances, ineffective resilience strategies, and regulatory non-compliance.

This chapter highlights the most frequent challenges encountered when setting and implementing impact tolerances, along with practical mitigation strategies to address them.

Purpose of the Chapter

The purpose of this chapter is to:

  • Identify common challenges in setting impact tolerances
  • Highlight pitfalls that undermine resilience effectiveness
  • Provide practical mitigation strategies
  • Help organisations avoid common implementation mistakes

Over-Reliance on RTO/RPO

Challenge

Many organisations continue to rely heavily on traditional Recovery Time Objective (RTO) and Recovery Point Objective (RPO) metrics when defining impact tolerance.

Why This Is a Problem
  • RTO/RPO are technology or process-centric, not service-centric
  • They focus on recovery after disruption, not continuity during disruption
  • They do not fully capture customer harm, systemic impact, or service degradation
Example

A system may recover within its RTO of 4 hours, but during that time:

  • Customers cannot access funds
  • Transactions fail
  • Regulatory expectations are breached
Mitigation Strategies
  • Shift from system-level metrics to service-level impact metrics
  • Combine RTO/RPO with:
    • Customer impact thresholds
    • Transaction volume limits
    • Service capacity measures
  • Align tolerances with Critical Business Service (CBS) outcomes, not individual systems
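
As a minimal sketch of how a time-based limit can be combined with customer and transaction thresholds, the Python below models a service-level tolerance and reports which limits a disruption has exceeded. The class names, field names, and all figures are hypothetical and chosen only for illustration; they are not taken from any regulatory definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTolerance:
    """Hypothetical service-level impact tolerance for one service."""
    max_outage_minutes: int        # time-based limit (the RTO-like element)
    max_customers_affected: int    # customer impact threshold
    max_failed_transactions: int   # transaction volume limit


@dataclass(frozen=True)
class DisruptionSnapshot:
    """Observed state of the service at a point during a disruption."""
    outage_minutes: int
    customers_affected: int
    failed_transactions: int


def breached_limits(tol: ServiceTolerance, obs: DisruptionSnapshot) -> list[str]:
    """Return the name of every limit that the observed disruption exceeds."""
    checks = {
        "outage duration": obs.outage_minutes > tol.max_outage_minutes,
        "customers affected": obs.customers_affected > tol.max_customers_affected,
        "failed transactions": obs.failed_transactions > tol.max_failed_transactions,
    }
    return [name for name, exceeded in checks.items() if exceeded]


# Illustrative figures only: the outage is well inside a 4-hour time limit,
# yet the customer impact threshold has already been exceeded.
tolerance = ServiceTolerance(
    max_outage_minutes=240, max_customers_affected=10_000, max_failed_transactions=5_000
)
snapshot = DisruptionSnapshot(
    outage_minutes=90, customers_affected=25_000, failed_transactions=1_200
)
print(breached_limits(tolerance, snapshot))  # ['customers affected']
```

The point of the sketch is that a disruption can sit comfortably inside its time limit and still breach the tolerance on another dimension, which a standalone RTO would not reveal.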

Lack of Service-Level Thinking

Challenge

Organisations often focus on internal processes rather than end-to-end service delivery.

Why This Is a Problem
  • Leads to fragmented tolerance definitions
  • Ignores how disruptions affect the customer experience
  • Fails to capture interdependencies across functions
Example
  • Payment initiation works, but clearing fails
  • Individual systems appear operational, but the service is not delivered end-to-end
Mitigation Strategies
  • Adopt a service-centric approach
  • Define tolerances at the CBS and Sub-CBS levels
  • Map and assess end-to-end service delivery chains
  • Conduct end-to-end scenario testing
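
The end-to-end view can be illustrated with a small sketch: a service counts as delivered only when every stage in its delivery chain is operational. The stage names and the shape of the status dictionary are assumptions made for this example.

```python
def failed_stages(stage_status: dict[str, bool]) -> list[str]:
    """Return the stages of the delivery chain that are not operational."""
    return [stage for stage, operational in stage_status.items() if not operational]


def service_delivered(stage_status: dict[str, bool]) -> bool:
    """A service counts as delivered only when every stage in the chain works."""
    return not failed_stages(stage_status)


# Hypothetical payment chain: initiation works, clearing does not.
payment_chain = {
    "initiation": True,
    "validation": True,
    "clearing": False,
    "settlement": True,
}
print(service_delivered(payment_chain))  # False
print(failed_stages(payment_chain))      # ['clearing']
```

Three of the four stages report as healthy, yet the service as a whole is not delivered, which mirrors the payment example above.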

Poor Data Quality in Mapping

Challenge

Dependency mapping is often incomplete, outdated, or inaccurate.

Why This Is a Problem
  • Hidden dependencies are not identified
  • Single points of failure remain undiscovered
  • Tolerances are based on incorrect assumptions
Example
  • Missing a third-party dependency leads to unexpected service failure
  • Outdated system mapping does not reflect the current architecture
Mitigation Strategies
  • Establish structured mapping templates
  • Regularly update mapping data
  • Validate mapping through cross-functional workshops
  • Integrate mapping with configuration management and asset inventories
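
One simple check that structured mapping data makes possible is sketched below: for each function a service depends on, list the components or providers able to perform it, then flag functions with exactly one provider (a potential single point of failure) or none recorded (a gap in the mapping). The dictionary layout and all names are hypothetical.

```python
def single_points_of_failure(dependency_map: dict[str, set[str]]) -> dict[str, str]:
    """Map each function that has exactly one provider to that provider."""
    return {
        function: next(iter(providers))
        for function, providers in dependency_map.items()
        if len(providers) == 1
    }


def unmapped_functions(dependency_map: dict[str, set[str]]) -> list[str]:
    """List functions with no recorded provider, i.e. gaps in the mapping data."""
    return sorted(f for f, providers in dependency_map.items() if not providers)


# Hypothetical mapping for a payments service.
payments_map = {
    "customer authentication": {"identity platform A", "identity platform B"},
    "payment clearing": {"third-party clearing provider"},
    "fraud screening": set(),
}
print(single_points_of_failure(payments_map))
# {'payment clearing': 'third-party clearing provider'}
print(unmapped_functions(payments_map))
# ['fraud screening']
```

A check like this is only as good as the data behind it, which is why the validation workshops and inventory integration listed above still matter.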

Unrealistic Tolerance Setting

Challenge

Impact tolerances may be set too leniently or too strictly.

Why This Is a Problem

Type        | Risk
Too lenient | Excessive customer harm and regulatory risk
Too strict  | Operationally unachievable and costly

Example
  • Tolerance set at 1 hour, but recovery capability is 4 hours
  • Tolerance set at 24 hours for a critical payment service (regulatory misalignment)
Mitigation Strategies
  • Use scenario-based calibration
  • Align tolerances with:
    • Customer expectations
    • Regulatory requirements
    • Actual operational capability
  • Validate tolerances through testing and data analysis
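
The two failure modes can be expressed as a rough calibration check, sketched below. It compares a proposed tolerance with the recovery time demonstrated in testing and with an externally driven ceiling, such as a regulatory or customer expectation. The function name and the 8-hour ceiling are assumptions for illustration only; the 1-hour, 4-hour, and 24-hour figures come from the examples above.

```python
def assess_tolerance(
    tolerance_hours: float,
    tested_recovery_hours: float,
    external_ceiling_hours: float,
) -> str:
    """Classify a proposed tolerance against capability and external expectations."""
    if tolerance_hours > external_ceiling_hours:
        # Longer than customers or regulators are assumed to accept.
        return "too lenient"
    if tolerance_hours < tested_recovery_hours:
        # Tighter than the organisation has shown it can actually achieve.
        return "too strict for current capability"
    return "plausible"


# The 8-hour ceiling is an assumed figure, not a regulatory value.
print(assess_tolerance(1, tested_recovery_hours=4, external_ceiling_hours=8))
# too strict for current capability
print(assess_tolerance(24, tested_recovery_hours=4, external_ceiling_hours=8))
# too lenient
print(assess_tolerance(6, tested_recovery_hours=4, external_ceiling_hours=8))
# plausible
```

A "too strict" result does not necessarily mean the tolerance should be relaxed; it may equally indicate that recovery capability needs investment.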

Weak Governance and Ownership

Challenge

Accountability for CBS and impact tolerances is often unclear.

Why This Is a Problem
  • Tolerances are not actively managed
  • No ownership for maintaining service within thresholds
  • Weak oversight and challenge
Example
  • Multiple teams assume responsibility, but no clear owner exists
  • Tolerance breaches are not escalated or addressed
Mitigation Strategies
  • Assign clear ownership for each CBS and Sub-CBS
  • Align governance with the Three Lines of Defence model
  • Establish Board and Senior Management oversight
  • Define clear approval, monitoring, and escalation processes

Inadequate Scenario Testing

Challenge

Scenario testing is often limited, unrealistic, or not conducted regularly.

Why This Is a Problem
  • Tolerances are not validated
  • Gaps remain unidentified
  • Organisations lack evidence for regulatory review
Example
  • Testing only covers system recovery, not end-to-end service
  • Scenarios do not reflect severe but plausible conditions
Mitigation Strategies
  • Conduct regular scenario testing (OR-P2-S4)
  • Use severe but plausible scenarios (SuPS)
  • Include:
    • Technology failures
    • Cyber incidents
    • Third-party disruptions
    • People unavailability
  • Perform end-to-end CBS testing
  • Measure actual performance against the defined tolerance
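
Measuring test outcomes against the tolerance can be as simple as recording the observed end-to-end restoration time for each scenario and computing the margin. The sketch below assumes a single time-based tolerance in minutes; the scenario names and timings are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    scenario: str
    observed_minutes: int  # measured time to restore the service end to end


def compare_to_tolerance(results, tolerance_minutes):
    """Return (scenario, margin_minutes, within_tolerance) for each test.

    A negative margin means the test exceeded the tolerance.
    """
    return [
        (r.scenario, tolerance_minutes - r.observed_minutes,
         r.observed_minutes <= tolerance_minutes)
        for r in results
    ]


# Hypothetical test results measured against a 240-minute tolerance.
results = [
    ScenarioResult("Data centre outage", 180),
    ScenarioResult("Third-party provider failure", 300),
]
for scenario, margin, ok in compare_to_tolerance(results, tolerance_minutes=240):
    print(f"{scenario}: margin {margin} min, within tolerance: {ok}")
```

Keeping the margin, not just a pass or fail flag, gives the evidence trail that supports later recalibration and regulatory review.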

Siloed Implementation Across Functions

Challenge

Different functions (ORM, BCM, IT, Risk) operate independently.

Why This Is a Problem
  • Inconsistent metrics and assumptions
  • Lack of coordination during disruptions
  • Inefficient use of resources
Mitigation Strategies
  • Integrate impact tolerance across:
    • Operational Risk Management
    • Business Continuity Management
    • Cyber Resilience
    • Third-Party Risk Management
  • Establish common metrics and frameworks
  • Promote cross-functional collaboration

Insufficient Monitoring and Metrics

Challenge

Organisations often lack effective monitoring mechanisms and performance indicators.

Why This Is a Problem
  • Organisations cannot detect approaching tolerance breaches
  • Reactive rather than proactive response
  • Limited visibility into resilience performance
Mitigation Strategies
  • Define clear metrics and Key Risk Indicators (KRIs)
  • Implement real-time monitoring systems
  • Establish early warning thresholds
  • Integrate monitoring into incident response frameworks
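
An early warning threshold can be sketched as a fraction of the tolerance that, once consumed, triggers escalation before a breach occurs. The 70% warning ratio and the status labels below are assumptions chosen for the example, not recommended values.

```python
def tolerance_status(
    elapsed_minutes: float,
    tolerance_minutes: float,
    warning_ratio: float = 0.7,
) -> str:
    """Classify a live disruption by how much of the tolerance it has consumed."""
    if tolerance_minutes <= 0:
        raise ValueError("tolerance_minutes must be positive")
    consumed = elapsed_minutes / tolerance_minutes
    if consumed >= 1.0:
        return "breach"
    if consumed >= warning_ratio:
        return "early warning"
    return "within tolerance"


# Illustrative readings against a 240-minute tolerance.
print(tolerance_status(60, 240))   # within tolerance
print(tolerance_status(180, 240))  # early warning
print(tolerance_status(240, 240))  # breach
```

In practice the same pattern could be applied to other KRIs, such as customers affected or failed transactions, with a warning ratio set per indicator.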

Failure to Update and Refine Tolerances

Challenge

Impact tolerances are not reviewed regularly.

Why This Is a Problem
  • Tolerances become outdated
  • Misalignment with the current business and technology environment
  • Increased risk of failure during disruption
Mitigation Strategies
  • Conduct regular reviews (at least annually)
  • Update tolerances based on:
    • Incident lessons learned
    • Scenario testing results
    • Business and technology changes
    • Regulatory updates
  • Embed tolerance review into governance processes
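
A periodic review cycle can be supported by a simple overdue check against the tolerance register, sketched below. The 365-day interval reflects the "at least annually" guidance above; the register contents and dates are hypothetical, and event-driven reviews (after incidents or major changes) would need separate triggers.

```python
from datetime import date, timedelta


def review_overdue(last_reviewed: date, as_of: date, max_interval_days: int = 365) -> bool:
    """Return True when more than max_interval_days have passed since the last review."""
    return as_of - last_reviewed > timedelta(days=max_interval_days)


# Hypothetical register: service name -> date its tolerance was last reviewed.
register = {
    "Retail payments": date(2025, 1, 15),
    "Online banking": date(2025, 11, 3),
}
as_of = date(2026, 5, 8)
overdue = sorted(name for name, reviewed in register.items() if review_overdue(reviewed, as_of))
print(overdue)  # ['Retail payments']
```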

Lack of Evidence and Documentation

Challenge

Documentation supporting tolerance setting and validation is often insufficient.

Why This Is a Problem
  • Difficult to demonstrate compliance
  • Weak audit trail
  • Reduced credibility with regulators
Mitigation Strategies
  • Maintain comprehensive documentation:
    • Impact tolerance registers
    • Scenario testing results
    • Dependency maps
    • Governance approvals
  • Ensure documentation is structured, updated, and accessible

Summary of Challenges and Mitigations

Challenge                      | Key Mitigation
Over-reliance on RTO/RPO       | Adopt service-level impact metrics
Lack of service-level thinking | Focus on CBS and end-to-end delivery
Poor mapping data              | Improve data quality and validation
Unrealistic tolerances         | Use scenario-based calibration
Weak governance                | Strengthen ownership and oversight
Inadequate testing             | Conduct regular, realistic scenario testing
Siloed implementation          | Integrate across resilience pillars
Weak monitoring                | Implement KRIs and real-time monitoring
Lack of updates                | Establish regular review cycles
Poor documentation             | Maintain comprehensive audit evidence

Setting impact tolerances is a complex process that requires a shift in mindset, strong governance, accurate data, and continuous validation. The challenges outlined in this chapter are common across organisations, but they can be effectively managed through structured methodologies, cross-functional collaboration, and disciplined execution.

By recognising and addressing these pitfalls early, organisations can avoid missteps that undermine resilience and instead build a robust, credible, and sustainable impact tolerance framework. Overcoming these challenges helps ensure that impact tolerances are not only defined but also realistic, actionable, and aligned with both operational capability and regulatory expectations.

