The rise of AI-native ops

by Freddie Heygate

I recently attended AWS re:Invent 2025. Here are the headlines you may have seen:

  • Observability now includes models, prompts, retrieval, tools and token paths.
  • GPU, Trainium, Inferentia and data movement are reshaping cost curves.
  • “Reliability” now includes inference latency, RAG accuracy and agent success rates.
  • From Amazon’s own perspective: Prime Day traffic hit eye-watering levels. 200 million Prime members, 9 billion same- or next-day packages, ElastiCache at ~1.5 quadrillion requests/day, ads infrastructure at over a trillion requests/minute, CloudFront serving more than 3 trillion HTTP requests.
  • Custom silicon matters: over 40 percent of Amazon.com traffic now runs on AWS Graviton, while AWS Trainium and AWS Inferentia power Amazon Bedrock for training and inference with drastically lower cost per token.
  • Robotics & autonomy have gone mainstream: more than one million robots operate across Amazon’s networks; Zoox uses petabyte-scale S3, thousands of GPUs, Slurm scheduling, Amazon EKS and Amazon SageMaker HyperPod for simulation and model development.
  • Internal AI agents are exploding: more than 21,000 internal agents deployed across Amazon’s eCommerce Foundation, built on AgentCore and the Strands Agents SDK.
  • Enterprise customers are already seeing results, from 2% conversion lifts across 100,000 SKUs to 30% autonomous resolution of complex workflows to measurable reductions in operational overhead.

Elevator pitch concluded.

Now, let’s get into why this matters for you.

Why AI-native operations and agents matter – in detail

The sessions made one truth unavoidable: AI-native systems change how the business operates. For the better – if done well. And painfully – if not.

A strong AI operating model unlocks:

1. Faster execution and experimentation

Amazon teams are already using spec-first workflows and integrated AI tooling to achieve 4.5× higher deployment frequency. This isn’t “code generation.”

It’s AI woven into planning, testing, deployment and rollback.

When the platform supports it, teams ship more with less.

2. Measurable revenue and cost outcomes

These are outcomes, not hype:

Rufus – Amazon’s generative shopping assistant – increased purchase completion by around 60% with 4.5× lower cost thanks to Trainium/Inferentia inference, continuous batching and prompt caching. 

Marketplace optimisation agents delivered 2% absolute conversion increases over massive SKU catalogs.

Internal Amazon teams reported billions in projected cost savings from workflow automation and tool-driven execution at scale.

The pattern is clear: AI is moving real commercial metrics, not just internal enthusiasm.

3. Reliability as feature zero

Across Prime Video’s streaming stack, Zoox’s autonomy platform and Amazon’s retail backbone, reliability wasn’t a footnote – it was the strategy.

AI workloads now demand:

  • SLOs for inference latency, token streaming and RAG freshness
  • Multi-region and edge-cloud architectures for real-time loads
  • Agent success, escalation and tool-call reliability tracking
  • Observability across prompt → model → reasoning → tool

No AI system gets a pass on uptime.

4. A better experience for teams

AI agents step into the repetitive, procedural work: checking eligibility, summarising state, generating options, running validations.

Humans shift into:

  • Higher-order design
  • Debugging real issues
  • Experimenting and improving
  • Delivering customer impact 

Teams ship more. On-call pain goes down. Morale goes up. 

…but a weak operating model reverses everything

  • Pilots stall
  • Costs explode
  • Governance breaks
  • Incidents cross multiple opaque AI layers
  • Confidence erodes inside and outside the company 

Your AI operating model is now your moat. 

What makes an AI operating model actually work?

At the highest level, it must be:

Achievable

Your teams can realistically support it at 3 a.m.

Market-competitive 

You don’t need Amazon’s numbers – but you need credible performance.

Customer-centric

Agents should solve real friction: conversion loss, slow resolution, operational bottlenecks.

Transparent

Stakeholders should know what agents do, why they act and how quality is measured.

Flexible 

You can add new agents and tools without rebuilding the platform.

Governed

Guardrails, IAM integration, auditability and human oversight where needed.

The components of an AI-native stack

1. Foundation platform

The runtime for models, agents and orchestration. On AWS, the key building blocks:

  • Amazon Bedrock for foundation models
  • AWS Trainium for training
  • AWS Inferentia for inference
  • Amazon EKS and Amazon ECS for container workloads
  • Application Load Balancer (ALB) tuned for AI traffic patterns
  • Continuous batching, token streaming and cost-aware autoscaling

Teams should standardise on a tight set of building blocks.
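Token streaming, one of the building blocks above, is worth seeing in miniature: the point is that the caller gets output as it is produced, so time to first token stays low even when full generation is slow. The generator below is a stand-in for a model server, not a real Bedrock call:

```python
import time
from typing import Iterator

def generate_stream(prompt: str) -> Iterator[str]:
    # Stand-in for a model endpoint; real services stream over HTTP/SSE or gRPC
    for token in ["AI-native ", "ops ", "need ", "streaming."]:
        time.sleep(0.01)  # simulated per-token latency
        yield token

first_token_at = None
start = time.perf_counter()
out = []
for tok in generate_stream("why stream?"):
    if first_token_at is None:
        # time to first token: the metric streaming is designed to protect
        first_token_at = time.perf_counter() - start
    out.append(tok)
print("".join(out))
```

Continuous batching and cost-aware autoscaling build on the same idea: per-token work items that the scheduler can regroup between steps.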

2. Data & context layer

Agents require current, contextual and governed data. That means:

  • High-quality data contracts
  • Search and retrieval pipelines
  • Feature-store freshness SLOs
  • Strict access control

Without this, your agents operate on hope, not truth.
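A data contract can be as small as a required-field set plus a freshness budget, enforced before any record reaches an agent. The field names and the 15-minute staleness SLO below are illustrative assumptions:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical contract for an agent's context feed (names are illustrative)
CONTRACT = {
    "required_fields": {"customer_id", "order_status", "updated_at"},
    "max_staleness": timedelta(minutes=15),  # freshness SLO
}

def violates_contract(record: dict) -> list[str]:
    """Return a list of contract violations (empty means the record is usable)."""
    problems = []
    missing = CONTRACT["required_fields"] - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    updated_at = record.get("updated_at")
    if updated_at is not None:
        age = datetime.now(timezone.utc) - updated_at
        if age > CONTRACT["max_staleness"]:
            problems.append(f"stale by {age - CONTRACT['max_staleness']}")
    return problems

record = {
    "customer_id": "c-123",
    "order_status": "shipped",
    "updated_at": datetime.now(timezone.utc) - timedelta(minutes=2),
}
print(violates_contract(record))
```

Rejecting or quarantining violating records at this boundary is what turns "hope" into "truth" for downstream agents.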

3. Agent & tool framework

Defines how agents reason, call tools, escalate and log actions. On AWS, that means:

  • Strands Agents SDK – open-source agent framework
  • AgentCore – the managed agent runtime in Amazon Bedrock

Provides:

  • Tool registries
  • Reasoning loops
  • Policy & guardrail enforcement
  • Human-in-the-loop patterns
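The first three of those concerns fit in a few lines: a registry of callable tools, a policy allow-list, and an audit log on every call. This is a generic sketch of the pattern only; it is not the Strands Agents SDK or AgentCore API, and the tool names are invented:

```python
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}
ALLOWED = {"lookup_order"}  # policy: tools this agent may call without approval

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function as an agent-callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"  # stand-in for a real backend call

@tool
def refund_order(order_id: str) -> str:
    return f"order {order_id}: refunded"  # side-effecting, so gated below

def call_tool(name: str, **kwargs) -> str:
    """Enforce the allow-list and log every call (the audit trail)."""
    if name not in ALLOWED:
        return f"ESCALATE: {name} requires human approval"
    print(f"audit: {name}({kwargs})")
    return TOOLS[name](**kwargs)

print(call_tool("lookup_order", order_id="A1"))
print(call_tool("refund_order", order_id="A1"))
```

The escalation branch is the human-in-the-loop hook: side-effecting tools route to approval instead of executing.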

4. Reliability & observability

AI ops require deep visibility across:

  • Latency, throughput and token use
  • RAG retrieval success
  • Model drift and content safety
  • Tool-call chains
  • Agent decision traces
  • Escalation logic

If you can’t observe it, you can’t run it.
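One way to make agent decisions observable is a per-request trace: one record per stage of the prompt → retrieval → model → tool chain, emitted as structured logs. The stage and field names here are illustrative assumptions; in production these would feed a tracing backend:

```python
import json
import time

def traced_step(trace: list, stage: str, **fields) -> None:
    """Append one timestamped stage record to the request trace."""
    trace.append({"stage": stage, "ts": time.time(), **fields})

trace: list[dict] = []
traced_step(trace, "prompt", tokens_in=412)
traced_step(trace, "retrieval", docs=3, hit=True)
traced_step(trace, "model", tokens_out=128, latency_ms=840)
traced_step(trace, "tool_call", name="lookup_order", ok=True)

# One JSON line per request makes token use, RAG hit rates and
# tool-call reliability queryable after the fact
print(json.dumps({"stages": [s["stage"] for s in trace]}))
```

Aggregating these traces is what turns "agent success rate" from a slide claim into a dashboard.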

5. Security, compliance and risk

AI must fit into your identity, access and governance stack: 

  • IAM integration
  • Model and agent policy boundaries
  • Audit logs for all tool calls
  • Safety constraints and approvals
  • Data minimisation

6. Operating model & ownership

This is where AI becomes “how we work,” not “what we’re testing.” It requires:

  • Named owners for platform and agents
  • Change-management processes
  • Regular performance and risk reviews
  • Lifecycle management for prompts, data and tools

Six proven practices for getting value from AI & agents

1. Monitor what your customers and users really care about

Uptime, latency, agent completion and business metrics – not model stats.

2. Treat reliability as a designed feature, not an outcome

Build it into the architecture, not the incident process.

3. Insert agents into existing flows

Adoption skyrockets when agents sit inside the tools teams already use.

4. Put guardrails into the runtime

Policies, access, approvals and auditability must live in the platform.

5. Review and tune regularly

Agents are not “set and forget.” They drift. Your data drifts. Your business moves. Reviews must be scheduled and mandatory.

6. Turn wins into stories you can sell 

Every uplift in conversion, cost efficiency or resolution speed must fuel your value narrative.

How Just After Midnight helps

We help organisations move from AI slideware to AI that runs reliably, safely and cost-efficiently in production.

That includes:

  • SRE & DevOps for AI workloads – inference SLOs, guardrails, observability
  • Agentic AI enablement – design, implementation and ops for Strands & AgentCore
  • Cloud modernisation – Trainium/Inferentia optimisation, GPU strategy, FinOps for AI
  • Data foundations – retrieval pipelines, feature stores, governance
  • Edge & real-time operations – robotics, streaming, containerised inference at the edge

If you’re looking at the re:Invent announcements thinking, “This is incredible… but who keeps this running at 3 a.m.?” that’s the gap we fill.

The real question isn’t whether you’ll adopt AI – it’s how you’ll operate it in 2026.

If concerns about reliability, cost control, observability or agent governance are already surfacing, that's a signal to act early – get in touch with our team.

We help teams close that gap before it shows up as incidents and overruns.