[P2] [S3] Chapter 14
Common Challenges and Pitfalls
Introduction
While the concept of impact tolerance is increasingly well understood, its practical implementation remains challenging for many organisations.
Common pitfalls arise from legacy thinking, fragmented data, weak governance, and insufficient testing. If not addressed, these issues can lead to misaligned tolerances, ineffective resilience strategies, and regulatory non-compliance.
This chapter highlights the most frequent challenges encountered when setting and implementing impact tolerances, along with practical mitigation strategies to address them.
Purpose of the Chapter
The purpose of this chapter is to:
- Identify common challenges in setting impact tolerances
- Highlight pitfalls that undermine resilience effectiveness
- Provide practical mitigation strategies
- Help organisations avoid common implementation mistakes
Over-Reliance on RTO/RPO
Challenge
Many organisations continue to rely heavily on traditional Recovery Time Objective (RTO) and Recovery Point Objective (RPO) metrics when defining impact tolerance.
Why This Is a Problem
- RTO and RPO are technology- or process-centric, not service-centric
- They focus on recovery after disruption, not continuity during disruption
- They do not fully capture customer harm, systemic impact, or service degradation
Example
A system may recover within its RTO of 4 hours, but during that time:
- Customers cannot access funds
- Transactions fail
- Regulatory expectations are breached
Mitigation Strategies
- Shift from system-level metrics to service-level impact metrics
- Combine RTO/RPO with:
  - Customer impact thresholds
  - Transaction volume limits
  - Service capacity measures
- Align tolerances with Critical Business Service (CBS) outcomes, not individual systems
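The combination of time-based and service-level measures described above can be sketched in code. The example below is a minimal illustration, not a prescribed model: the class names, field names, and all figures are hypothetical and chosen only to mirror the 4-hour RTO example earlier in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """Hypothetical service-level tolerance for one critical business service."""
    max_outage_minutes: int        # longest tolerable disruption
    max_customers_affected: int    # customer impact threshold
    max_failed_transactions: int   # transaction volume limit
    min_capacity_pct: float        # lowest tolerable service capacity


@dataclass(frozen=True)
class DisruptionSnapshot:
    """Observed state of the service at one point during a disruption."""
    outage_minutes: int
    customers_affected: int
    failed_transactions: int
    capacity_pct: float


def breached_dimensions(tol: ImpactTolerance, obs: DisruptionSnapshot) -> list[str]:
    """Return the name of every tolerance dimension the observation exceeds."""
    breaches = []
    if obs.outage_minutes > tol.max_outage_minutes:
        breaches.append("duration")
    if obs.customers_affected > tol.max_customers_affected:
        breaches.append("customers")
    if obs.failed_transactions > tol.max_failed_transactions:
        breaches.append("transactions")
    if obs.capacity_pct < tol.min_capacity_pct:
        breaches.append("capacity")
    return breaches


# Illustrative figures only: two hours into the outage the system is still
# inside a 4-hour RTO, yet the customer and transaction limits are exceeded.
payments_tolerance = ImpactTolerance(240, 10_000, 5_000, 60.0)
two_hours_in = DisruptionSnapshot(120, 25_000, 8_000, 70.0)
print(breached_dimensions(payments_tolerance, two_hours_in))
```

With these assumed figures, the duration dimension is still within tolerance while two service-level dimensions are not, which is the gap a purely RTO-based view would miss.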
Lack of Service-Level Thinking
Challenge
Organisations often focus on internal processes rather than end-to-end service delivery.
Why This Is a Problem
- Leads to fragmented tolerance definitions
- Ignores how disruptions affect the customer experience
- Fails to capture interdependencies across functions
Example
- Payment initiation works, but clearing fails
- Individual systems appear operational, but the service is not delivered end-to-end
Mitigation Strategies
- Adopt a service-centric approach
- Define tolerances at the CBS and Sub-CBS levels
- Map and assess end-to-end service delivery chains
- Conduct end-to-end scenario testing
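The difference between component status and end-to-end service status can be illustrated with a short sketch. The chain steps and status values below are assumptions made for illustration, based on the payment initiation and clearing example above.

```python
from typing import Optional

# Hypothetical delivery chain for a payment service, in processing order.
PAYMENT_CHAIN = ["initiation", "validation", "clearing", "settlement"]


def service_delivered(chain: list[str], step_ok: dict[str, bool]) -> bool:
    """The service is delivered only if every step in the chain is working.

    A step with no reported status is treated as failed, so gaps in
    monitoring cannot make the service look healthy.
    """
    return all(step_ok.get(step, False) for step in chain)


def first_failed_step(chain: list[str], step_ok: dict[str, bool]) -> Optional[str]:
    """Return the earliest failing step, or None if all steps are working."""
    for step in chain:
        if not step_ok.get(step, False):
            return step
    return None


# Three of four steps report healthy, but the service is not delivered.
status = {"initiation": True, "validation": True, "clearing": False, "settlement": True}
print(service_delivered(PAYMENT_CHAIN, status))
print(first_failed_step(PAYMENT_CHAIN, status))
```

Treating an unreported step as failed is a deliberately conservative design choice in this sketch; an organisation may prefer a separate "unknown" state.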
Poor Data Quality in Mapping
Challenge
Dependency mapping is often incomplete, outdated, or inaccurate.
Why This Is a Problem
- Hidden dependencies are not identified
- Single points of failure remain undiscovered
- Tolerances are based on incorrect assumptions
Example
- Missing a third-party dependency leads to unexpected service failure
- Outdated system mapping does not reflect the current architecture
Mitigation Strategies
- Establish structured mapping templates
- Regularly update mapping data
- Validate mapping through cross-functional workshops
- Integrate mapping with configuration management and asset inventories
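One way to integrate mapping with an asset inventory is a simple reconciliation between the two. The sketch below assumes both sources can be reduced to sets of asset names for a single service; the asset names are hypothetical.

```python
def reconcile_mapping(mapped: set[str], inventory: set[str]) -> dict[str, set[str]]:
    """Compare a service's dependency map with its asset inventory.

    Assumes `inventory` holds only the assets recorded against this service.
    """
    return {
        # In the map but no longer in the inventory: likely outdated entries.
        "stale_in_mapping": mapped - inventory,
        # In the inventory but never mapped: possible hidden dependencies.
        "missing_from_mapping": inventory - mapped,
    }


# Hypothetical data for one payment service.
mapped_dependencies = {"core-banking", "payment-gateway", "legacy-batch"}
asset_inventory = {"core-banking", "payment-gateway", "card-processor-vendor"}

gaps = reconcile_mapping(mapped_dependencies, asset_inventory)
print(sorted(gaps["stale_in_mapping"]))
print(sorted(gaps["missing_from_mapping"]))
```

A check like this only flags candidates for review. Cross-functional workshops are still needed to confirm whether each flagged item is a genuine gap.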
Unrealistic Tolerance Setting
Challenge
Impact tolerances may be set too leniently or too strictly.
Why This Is a Problem
| Type | Risk |
|---|---|
| Too lenient | Excessive customer harm and regulatory risk |
| Too strict | Operationally unachievable and costly |
Example
- Tolerance set at 1 hour, but recovery capability is 4 hours
- Tolerance set at 24 hours for a critical payment service (regulatory misalignment)
Mitigation Strategies
- Use scenario-based calibration
- Align tolerances with:
  - Customer expectations
  - Regulatory requirements
  - Actual operational capability
- Validate tolerances through testing and data analysis
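The calibration checks above can be expressed as a simple comparison. This is a minimal sketch: the function name and the 2-hour regulatory expectation are assumptions for illustration, while the 1-hour, 4-hour, and 24-hour values come from the examples in this section.

```python
def calibration_issues(
    tolerance_hours: float,
    tested_recovery_hours: float,
    regulatory_max_hours: float,
) -> list[str]:
    """Flag a tolerance that is unachievable or too lenient.

    `regulatory_max_hours` is a hypothetical stand-in for whatever limit the
    applicable regulator or customer expectation implies.
    """
    issues = []
    if tolerance_hours < tested_recovery_hours:
        issues.append("too strict: tested recovery takes longer than the tolerance")
    if tolerance_hours > regulatory_max_hours:
        issues.append("too lenient: tolerance exceeds the regulatory expectation")
    return issues


# Tolerance of 1 hour against a tested recovery capability of 4 hours.
print(calibration_issues(1, 4, regulatory_max_hours=2))
# Tolerance of 24 hours for a critical payment service.
print(calibration_issues(24, 4, regulatory_max_hours=2))
```

A "too strict" result does not by itself mean the tolerance should be relaxed. It may instead indicate that recovery capability needs investment.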
Weak Governance and Ownership
Challenge
Clear accountability for CBS and impact tolerances is often lacking.
Why This Is a Problem
- Tolerances are not actively managed
- No ownership for maintaining service within thresholds
- Weak oversight and challenge
Example
- Multiple teams assume responsibility, but no clear owner exists
- Tolerance breaches are not escalated or addressed
Mitigation Strategies
- Assign clear ownership for each CBS and Sub-CBS
- Align governance with the Three Lines of Defence model
- Establish Board and Senior Management oversight
- Define clear approval, monitoring, and escalation processes
Inadequate Scenario Testing
Challenge
Scenario testing is often limited, unrealistic, or not conducted regularly.
Why This Is a Problem
- Tolerances are not validated
- Gaps remain unidentified
- Organisations lack evidence for regulatory review
Example
- Testing only covers system recovery, not end-to-end service
- Scenarios do not reflect severe but plausible conditions
Mitigation Strategies
- Conduct regular scenario testing (OR-P2-S4)
- Use severe but plausible scenarios (SuPS)
- Include:
  - Technology failures
  - Cyber incidents
  - Third-party disruptions
  - People unavailability
- Perform end-to-end CBS testing
- Measure actual vs defined tolerance
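Measuring actual performance against the defined tolerance can be as simple as computing the remaining headroom per scenario. The sketch below uses hypothetical scenario names and outage durations, one for each scenario category listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    """Outcome of one end-to-end scenario test (illustrative fields)."""
    scenario: str
    category: str               # technology, cyber, third-party, or people
    actual_outage_minutes: int  # disruption measured at service level


def tolerance_headroom(results: list[ScenarioResult], tolerance_minutes: int) -> dict[str, int]:
    """Map each scenario to tolerance minus actual outage; negative means breach."""
    return {r.scenario: tolerance_minutes - r.actual_outage_minutes for r in results}


# Hypothetical test results against an assumed 240-minute tolerance.
results = [
    ScenarioResult("data centre loss", "technology", 180),
    ScenarioResult("ransomware", "cyber", 300),
    ScenarioResult("vendor outage", "third-party", 240),
    ScenarioResult("staff unavailability", "people", 90),
]
headroom = tolerance_headroom(results, tolerance_minutes=240)
breaches = [name for name, spare in headroom.items() if spare < 0]
print(headroom)
print(breaches)
```

Scenarios with zero or very small headroom, such as the vendor outage here, are not breaches but are natural candidates for closer review.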
Siloed Implementation Across Functions
Challenge
Functions such as Operational Risk Management (ORM), Business Continuity Management (BCM), IT, and Risk often operate independently.
Why This Is a Problem
- Inconsistent metrics and assumptions
- Lack of coordination during disruptions
- Inefficient use of resources
Mitigation Strategies
- Integrate impact tolerance across:
  - Operational Risk Management
  - Business Continuity Management
  - Cyber Resilience
  - Third-Party Risk Management
- Establish common metrics and frameworks
- Promote cross-functional collaboration
Insufficient Monitoring and Metrics
Challenge
Effective monitoring mechanisms and performance indicators are often lacking.
Why This Is a Problem
- Organisations cannot detect approaching tolerance breaches
- Reactive rather than proactive response
- Limited visibility into resilience performance
Mitigation Strategies
- Define clear metrics and Key Risk Indicators (KRIs)
- Implement real-time monitoring systems
- Establish early warning thresholds
- Integrate monitoring into incident response frameworks
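An early warning threshold can be sketched as a three-state classification of a KRI against its tolerance. The 80% warning ratio, the status names, and the 240-minute tolerance below are assumptions for illustration, not recommended values.

```python
from enum import Enum


class Status(Enum):
    GREEN = "green"   # comfortably within tolerance
    AMBER = "amber"   # early warning: approaching the tolerance
    RED = "red"       # tolerance breached


def kri_status(value: float, tolerance: float, warning_ratio: float = 0.8) -> Status:
    """Classify a KRI where a higher value means greater impact.

    `warning_ratio` is a hypothetical early-warning level, expressed as a
    fraction of the tolerance.
    """
    if value > tolerance:
        return Status.RED
    if value >= warning_ratio * tolerance:
        return Status.AMBER
    return Status.GREEN


# Outage minutes observed against an assumed 240-minute tolerance.
for minutes in (100, 200, 250):
    print(minutes, kri_status(minutes, tolerance=240).value)
```

The amber state is what allows a proactive response: it triggers escalation while the service is still within tolerance.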
Failure to Update and Refine Tolerances
Challenge
Impact tolerances are not reviewed regularly.
Why This Is a Problem
- Tolerances become outdated
- Misalignment with the current business and technology environment
- Increased risk of failure during disruption
Mitigation Strategies
- Conduct regular reviews (at least annually)
- Update tolerances based on:
  - Incident lessons learned
  - Scenario testing results
  - Business and technology changes
  - Regulatory updates
- Embed tolerance review into governance processes
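The review triggers above can be captured in a small rule: a tolerance is due for review when the annual interval has elapsed or when any trigger event has occurred since the last review. The function name, the 365-day interval, and the dates below are illustrative assumptions.

```python
from datetime import date, timedelta

# "At least annually", modelled here as a 365-day interval (an assumption).
REVIEW_INTERVAL = timedelta(days=365)


def review_due(last_reviewed: date, today: date, trigger_events: list[str]) -> bool:
    """Return True if the interval has elapsed or any trigger event is recorded.

    `trigger_events` would hold items such as incidents, test findings,
    business or technology changes, and regulatory updates.
    """
    return bool(trigger_events) or (today - last_reviewed) >= REVIEW_INTERVAL


today = date(2024, 6, 1)
print(review_due(date(2024, 1, 15), today, []))                         # recent, no triggers
print(review_due(date(2024, 1, 15), today, ["core system migration"]))  # change-driven
print(review_due(date(2023, 1, 15), today, []))                         # annual review overdue
```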
Lack of Evidence and Documentation
Challenge
Documentation is often insufficient to support tolerance setting and validation.
Why This Is a Problem
- Difficult to demonstrate compliance
- Weak audit trail
- Reduced credibility with regulators
Mitigation Strategies
- Maintain comprehensive documentation:
  - Impact tolerance registers
  - Scenario testing results
  - Dependency maps
  - Governance approvals
- Ensure documentation is structured, updated, and accessible
Summary of Challenges and Mitigations
| Challenge | Key Mitigation |
|---|---|
| Over-reliance on RTO/RPO | Adopt service-level impact metrics |
| Lack of service-level thinking | Focus on CBS and end-to-end delivery |
| Poor mapping data | Improve data quality and validation |
| Unrealistic tolerances | Use scenario-based calibration |
| Weak governance | Strengthen ownership and oversight |
| Inadequate testing | Conduct regular, realistic scenario testing |
| Siloed implementation | Integrate across resilience pillars |
| Weak monitoring | Implement KRIs and real-time monitoring |
| Lack of updates | Establish regular review cycles |
| Poor documentation | Maintain comprehensive audit evidence |
Setting impact tolerances is a complex process that requires a shift in mindset, strong governance, accurate data, and continuous validation. The challenges outlined in this chapter are common across organisations, but they can be effectively managed through structured methodologies, cross-functional collaboration, and disciplined execution.
By recognising and addressing these pitfalls early, organisations can avoid missteps that undermine resilience and instead build a robust, credible, and sustainable impact tolerance framework. Ultimately, overcoming these challenges ensures that impact tolerances are not only defined but also realistic, actionable, and aligned with both operational capability and regulatory expectations.