Enterprise AI: redefining resilience in the age of AI
Executive conversations over the last few years have centred on a single, critical concept: resilience. Whether framed as ‘business’, ‘operational’, or ‘digital’ resilience, the core concern remains the same: mitigating the impact of system downtime.
This focus on resilience is not theoretical; it is increasingly being written into law. In the Australian financial sector, APRA has mandated CPS 230, directly embedding operational resilience into banking compliance.
The impact of AI on resilience
Resilience is now colliding with the reality of AI. AI applications are typically complex workflows built on a chain of API endpoints. These workflows — which might trigger a search, run inference, query a database or synthesise a response — introduce potential delay which accumulates with each workflow step, directly impacting the user experience. Recent industry analysis shows the share of organisations citing latency as a key AI infrastructure constraint has roughly doubled in a year, from about a third to more than half, driven by the proliferation of real-time applications.
As AI shifts from experimental use to the core of daily operations, this dependency forces a critical shift: the conversation is moving beyond simply asking, ‘Is the service available?’ to demanding, ‘Is the service responsive?’ Latency is no longer just an engineering concern; it is becoming a quantifiable business continuity risk.
Pragmatically, not every AI workload requires low latency. Training jobs, overnight analytics, document summarisation or human-paced workflows can all tolerate slower response times. The critical shift concerns a specific and growing subset of workloads: those where AI is making or gating a real-time decision on the operational path, and where the value of the answer collapses once a timing threshold is missed.
Classifying workloads by response sensitivity
To manage this risk, we can classify workloads into distinct tiers:
- Tier 1 — Decision critical (sub-50 ms latency): These are mission-critical scenarios — such as real-time control loops or point-of-transaction fraud detection. Here, if the answer arrives late, the decision has already been made without it.
- Tier 2 — Retention sensitive (~300 ms latency): This affects live, customer-facing interactions, like voice AI. As delays build, human perception kicks in, and dissatisfaction begins to erode the user experience.
- Tier 3 — Functional response: Many text chatbots fall here. Users notice the delay, but the system remains functional. This forces a business decision: is the risk of losing user focus/failure to complete an interaction worth the cost of optimising response time?
- Tier 4 — Non-sensitive: This broad category includes batch analytics, document processing and overnight research — workloads that are not sensitive to timing.
With the explosion of agentic AI in 2026, tiers 1 and 2 are the fastest-growing segments, representing the areas where latency most critically impacts core business resilience.
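The tiering above can be sketched as a simple mapping from a workload's latency budget to its tier. The following is an illustrative sketch only; the tier names and thresholds come from the list above, while the workload inventory and function names are hypothetical:

```python
from typing import Optional

def classify_tier(latency_budget_ms: Optional[float]) -> str:
    """Map a latency budget (ms) to a response-sensitivity tier.
    A budget of None marks a workload that is not time-sensitive."""
    if latency_budget_ms is None:
        return "Tier 4 — non-sensitive"
    if latency_budget_ms <= 50:
        return "Tier 1 — decision critical"
    if latency_budget_ms <= 300:
        return "Tier 2 — retention sensitive"
    return "Tier 3 — functional response"

# Hypothetical workload inventory: name -> latency budget in milliseconds.
workloads = {
    "fraud-check": 40,        # point-of-transaction decision
    "voice-agent": 250,       # live, customer-facing interaction
    "text-chatbot": 2000,     # users notice the delay but the system works
    "overnight-batch": None,  # timing-insensitive
}

for name, budget in workloads.items():
    print(f"{name}: {classify_tier(budget)}")
```

In practice the budget for each workflow would come from the business-impact mapping discussed below, not from a hard-coded table.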
Operationalising AI resilience
To maintain and improve resilience, adopt an SLO (service level objective) mindset that treats latency with the same rigour we apply to uptime. Key measures include:
- Reclassify risk by business impact: Do not treat AI as a single category; instead, mandate that every critical AI workflow is mapped against a response threshold. This makes it possible to judge, for example, that a 100 ms delay is acceptable for a document summary but jeopardises a real-time fraud check.
- Treat latency SLOs as availability targets: Apply the same rigour used for service availability to latency, assigning clear owners, defining incident response playbooks and budgeting for remediation when latency degrades.
- Architect for proximity: The future of AI performance demands moving compute power closer to the point of interaction — the edge. This is not just an IT decision; it is a strategic decision about where value is created and where risk resides.
- Embed performance in contracts: If AI is on the critical path of revenue generation, latency guarantees must be negotiated alongside uptime guarantees in vendor and partnership agreements.
- Consider network security: Protecting AI systems against denial-of-service and degradation attacks is no longer a separate concern. For workloads where delay equals downtime, appropriate security measures determine whether the service can meet its purpose at all.
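Treating a latency SLO like an availability target means continuously measuring tail latency against a budget and triggering incident response when the budget is breached. A minimal sketch of that check follows; the sample data, thresholds and function names are hypothetical, and in production the samples would come from request telemetry:

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def slo_breached(samples, slo_ms, pct=99):
    """True if the pct-th percentile latency exceeds the SLO budget."""
    return percentile(samples, pct) > slo_ms

# Hypothetical one-minute window of request latencies for a fraud check.
latencies_ms = [32, 41, 38, 45, 290, 36, 44, 39, 42, 37]

print("p99:", percentile(latencies_ms, 99), "ms")
print("SLO breached (50 ms budget):", slo_breached(latencies_ms, 50))
```

Note that a single slow outlier is enough to breach a tail-latency SLO even when the average looks healthy, which is exactly why percentile targets, not averages, belong in vendor agreements for Tier 1 and Tier 2 workloads.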
In today’s hyper-competitive landscape, the market rewards seamless, instantaneous response. For AI-dependent enterprises, managing latency is not merely about optimising infrastructure; it is the defining factor in delivering a resilient and responsive customer experience, and with it preserving loyalty and market share.