No. 12 · Technical

The Agentic Technology Stack

The cloud platforms, models, gateways, and controls behind Alberta's agentic workloads, and the principles for choosing them.

Abstract. Alberta has tested a wide range of agentic coding tools and generative AI platforms. The aim is to select the right model, in the right environment, for the right use case, with a deliberately diversified supply chain to prevent provider lock-in. Alberta's AI solutions run in a hybrid environment across the Google Enterprise Agent Platform, Azure AI Foundry, and AWS Bedrock, with an on-premises compute cluster for sensitive workloads. AI usage is governed by Alberta's AI usage policy, AI governance framework, and the Protection of Privacy Act. The stack described here is a snapshot and will change as AI technology evolves.

To properly select the best tools, we have tested a wide range of agentic coding tools and generative AI platforms. What follows is our current stack. We expect it to change as new tools emerge, and our choices are guided by a simple aim: the right model, in the right environment, for the right use case. We have deliberately diversified our AI supply chain, for both the capacity and the variety needed to cover a wide range of work, and we avoid placing every workload with a single provider.

## §01 Choosing the right tools

The industry moves quickly, and the lead changes hands every couple of months. Google releases an astounding image model like Nano Banana 2, and a few months later OpenAI ships GPT Image 2, another step change. We stay flexible and use each platform where it fits. Over the past eighteen months we have moved between Gemini, OpenAI, and Anthropic models for most use cases. At present, most staff find that Anthropic's Claude models outperform most others. We expect that to change, and the stack is built so it can. For the work in these white papers we have relied mainly on Claude's Opus and Sonnet models, moving from 3.5 a year ago to today's Opus 4.8. The models are evolving quickly, and so are the tools and interfaces we use to access them: Claude Code and other coding apps gain new features almost daily.

An abstraction layer, like the Bifrost gateway, lets us move a workload from one provider to another, or from the cloud to our own compute, as needs and prices change. It also keeps spending under control, with per-workload budgets so token use does not run away. We would not advise any government to put all its eggs in one basket; the industry is moving too quickly for that.
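The two jobs described above, provider abstraction and per-workload budgets, can be illustrated with a minimal sketch. This is not Bifrost's API or configuration; the `Gateway` class, the workload and provider names, and the stub providers are all hypothetical, and a real gateway would track budgets per period and meter tokens from provider responses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class BudgetExceeded(Exception):
    """Raised when a workload has already spent its token allowance."""


# A provider is any callable that takes a prompt and returns (reply, tokens_used).
Provider = Callable[[str], Tuple[str, int]]


@dataclass
class Gateway:
    providers: Dict[str, Provider]   # hypothetical provider name -> client
    routes: Dict[str, List[str]]     # workload -> providers to try, in order
    budgets: Dict[str, int]          # workload -> token allowance
    spent: Dict[str, int] = field(default_factory=dict)

    def complete(self, workload: str, prompt: str) -> str:
        used = self.spent.get(workload, 0)
        # Pre-check only: the call that crosses the limit is still served,
        # and the next one is refused.
        if used >= self.budgets[workload]:
            raise BudgetExceeded(f"{workload} has used {used} tokens")
        last_error = None
        for name in self.routes[workload]:
            try:
                reply, tokens = self.providers[name](prompt)
            except ConnectionError as exc:
                last_error = exc      # fall through to the next provider
                continue
            self.spent[workload] = used + tokens
            return reply
        raise RuntimeError(f"no provider available for {workload}") from last_error


def cloud_stub(prompt: str) -> Tuple[str, int]:
    """Stand-in for a hosted model that is currently unreachable."""
    raise ConnectionError("hosted endpoint unavailable")


def onprem_stub(prompt: str) -> Tuple[str, int]:
    """Stand-in for a local model; counts words as a crude token estimate."""
    return f"echo: {prompt}", len(prompt.split())
```

In this sketch, moving a workload from the cloud to local compute is a one-line change to its entry in `routes`; the calling application does not change.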

## §02 The stack

Alberta runs a hybrid environment. We use the major hyperscaler clouds, Google, Azure, and AWS, alongside a number of smaller platforms, and we still operate three large legacy data centres carrying older code, systems, and infrastructure. We run open-source models on our own GPU cluster, and we reach hosted models through AWS Bedrock, the Google Enterprise Agent Platform, and Azure AI Foundry. We proactively test nearly every model that comes online: open-source models on our own hardware for offline, Protected C workloads, and a range of open models running locally on high-performance Mac M5 devices. Our primary workloads run through AWS Bedrock and the Google Enterprise Agent Platform.

Most of our cloud-based agentic workloads currently deploy into Google Cloud, while our traditional applications continue to run in Azure and AWS. The containerization behind this is described in the Nexus paper. We are also looking at Red Hat OpenShift to host more workloads on our own hardware and to make fuller use of the data-centre capacity we already have.

## §03 Classification and residency

The Government of Alberta classifies information as unclassified, Protected A, Protected B, and Protected C. Only a few systems contain Protected C information; the vast majority (ninety-nine percent or more) hold nothing above Protected B. Much of the AI-built work in these papers uses unclassified, government-owned legacy code and architecture, free of any personal information in production systems. This lets us reimagine how government works, with AI as the builder of future systems, while the AI never touches or observes citizen data unless it is genuinely needed.
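One way to make the four classification levels enforceable in code is an ordered enumeration with a ceiling per environment. The sketch below is illustrative only: the environment names and the ceilings assigned to them are placeholder assumptions, not stated Government of Alberta policy. The one mapping taken from this paper is that Protected C workloads run offline on government-owned hardware.

```python
from enum import IntEnum
from typing import List


class Classification(IntEnum):
    """The four levels named above, ordered from least to most sensitive."""
    UNCLASSIFIED = 0
    PROTECTED_A = 1
    PROTECTED_B = 2
    PROTECTED_C = 3


# ASSUMPTION: placeholder ceilings for illustration, not actual policy.
# Each value is the highest classification the environment may handle.
ENVIRONMENT_CEILING = {
    "hosted-cloud": Classification.PROTECTED_B,   # hypothetical ceiling
    "on-prem-gpu": Classification.PROTECTED_C,    # offline, government-owned
}


def allowed_environments(level: Classification) -> List[str]:
    """Return the environments cleared for data at the given level."""
    return sorted(
        name for name, ceiling in ENVIRONMENT_CEILING.items() if level <= ceiling
    )
```

A gateway or deployment pipeline could call `allowed_environments` before dispatching a job and refuse any route that is not in the returned list.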

All data tied to these workloads is controlled in Canada through our enterprise agreements and platform guardrails. Depending on the workload, we use domestic compute or inference from the United States, which is what the cloud providers primarily offer today. We are working with every cloud provider to bring more hardware into Canada and shift ever more workloads into a sovereign environment, and we are watching a range of Canadian sovereign-compute consortiums closely. We have released a Sovereign Compute prequalification request inviting wholly Canadian companies to help with our most sensitive work, and we support the Government of Canada's objective to grow the sovereign-compute space. For health and security workloads, exclusive Government of Alberta control of the data is a non-negotiable position.

## §04 Governance and privacy

Our decisions are guided by Alberta's AI usage policy and our AI governance framework, which set out the considerations for the public service when adopting AI. Privacy is governed by Alberta's Protection of Privacy Act, which defines where AI can be used and when government must complete a privacy impact assessment and register it with the Office of the Information and Privacy Commissioner before proceeding.

Prompt-injection resistance: 95 to 98 percent. Even the best models resist hijacking only 95 to 98 percent of the time. No model today is immune, so public-facing AI and anything touching personal information are avoided or selectively designed.

We are deliberately careful about introducing public-facing AI workloads. No model today fully resists prompt injection, and even the best resist hijacking only ninety-five to ninety-eight percent of the time. There are well-documented cases of platforms being gamed and agents tricked into harmful work on a prompter's behalf. At present, public-facing AI chatbots and assistants, and anything touching personal information, are either avoided or selectively designed and evaluated prior to implementation. To support exploration and ideation, we have used unclassified and synthetic data to validate what these systems could do and to give ministry partners a clear set of considerations to weigh for their use cases.
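A short calculation shows why a 95 to 98 percent resistance rate is not enough for a public-facing system, where an attacker can simply try again. The sketch assumes each attempt is independent and succeeds with the same probability, which is a simplification: real attackers adapt, and real defences add layers beyond the model.

```python
def breach_probability(resistance: float, attempts: int) -> float:
    """Chance that at least one of `attempts` injection attempts succeeds.

    ASSUMPTION: attempts are independent and each is resisted with
    probability `resistance`.
    """
    return 1.0 - resistance ** attempts


for resistance in (0.95, 0.98):
    for attempts in (1, 10, 100):
        p = breach_probability(resistance, attempts)
        print(f"resistance {resistance:.0%}, {attempts:>3} attempts: {p:.1%}")
```

Under these assumptions, a model that resists 95 percent of attempts has roughly a 40 percent chance of being hijacked at least once in ten tries, and at 98 percent the chance still exceeds 85 percent after a hundred tries.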

## §05 Security and the supply chain

We have been cautious with open-source agent components, by practice rather than by policy. A harness, even when it is only a set of skill files, is software, and it needs auditing before use. Its provenance is important. An agent skill can carry a prompt injection, or a poisoned supply chain can make an agent behave erratically, and in the worst case the agent itself becomes the threat actor. So, we are selective about what we pull down, and we evaluate it first in our sandbox environments before opening it to broader use.

Around the models we run gateways like Bifrost for provider abstraction, cost containment, and personal-information detection controls, and we employ Microsoft Defender and Purview, or equivalents, as data-loss-prevention controls, alongside GitHub's integrated tooling. There is more to do. We are engaging industry on the security apparatus around agents, on agentic identity management, and on following and observing agent activity across the network, and we will invite strong companies to test tools against government workloads through the Sovereign Compute process and forthcoming requests for proposals. We are also seeking partnerships with federal, provincial, territorial, and municipal peers to keep an active community of practice and to learn from how other governments are solving the same problems.
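As a rough illustration of what a personal-information detection control does at the gateway, the sketch below masks two common patterns before a prompt leaves the network. It is a toy under stated assumptions: the patterns, labels, and `redact` function are hypothetical, cover only email addresses and ten-digit phone numbers, and do not represent how Bifrost, Defender, or Purview work.

```python
import re
from typing import Dict, Tuple

# ASSUMPTION: two illustrative patterns only; production controls cover far
# more identifiers and use more than regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace each match with a bracketed label and count the replacements."""
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, found = pattern.subn(f"[{label}]", text)
        counts[label] = found
    return text, counts
```

The counts give the gateway something to log or alert on, so a workload that repeatedly sends personal information can be reviewed rather than silently cleaned.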

## §06 Cost and scale

We now consume billions of compute tokens monthly, and we anticipate consumption costs reaching hundreds of thousands of dollars a month as the four approaches are implemented and an AI agent supports every application for monitoring, cybersecurity remediation, and development. Relative costs remain low: even with this new spending, the cost avoidance on a single application out of hundreds can quite frequently exceed the entire year's cost of AI compute.

Capacity is becoming a constraint. A shortage of infrastructure and memory is limiting compute, and even our enterprise agreements, at twenty-five to fifty million tokens per minute, will likely be exceeded soon. We are therefore moving to a diversified, load-balanced approach to throughput.
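A load-balanced approach to throughput can be sketched as routing each request to whichever provider has the most unused tokens-per-minute capacity. The `Provider` class and `pick_provider` function are hypothetical, the per-minute counters are never reset in this sketch, and a real balancer would also weigh latency, price, and data classification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    name: str                    # hypothetical label, not a real contract
    tokens_per_minute: int       # contracted ceiling
    used_this_minute: int = 0    # a real system would reset this every minute

    @property
    def headroom(self) -> int:
        return self.tokens_per_minute - self.used_this_minute


def pick_provider(providers: List[Provider], request_tokens: int) -> Optional[Provider]:
    """Reserve capacity on the provider with the most headroom.

    Returns None when no provider can fit the request, so the caller
    can queue it or retry in the next minute.
    """
    candidates = [p for p in providers if p.headroom >= request_tokens]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda p: p.headroom)
    chosen.used_this_minute += request_tokens
    return chosen
```

With one agreement at 25 million tokens per minute and another at 50 million, a 30-million-token burst in this sketch lands on the larger agreement, and later requests spill over to the smaller one as the larger fills.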

Planning a long-running job: 250 agents · 50 days. Processing roughly 14 million historical images and records will require 250 concurrent agents running for as long as 50 days. Even at a cost of several hundred thousand dollars, this is roughly one percent of the cost of doing the same work by hand.
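The figures in this callout imply a per-agent pace that is easy to check. The arithmetic below assumes the work is split evenly across agents and that each agent runs around the clock for the full 50 days; both are simplifying assumptions, not details from the plan itself.

```python
ITEMS = 14_000_000   # historical images and records (approximate)
AGENTS = 250         # concurrent agents
DAYS = 50            # upper bound on run time

# ASSUMPTION: even split across agents, 24-hour operation, no retries.
agent_days = AGENTS * DAYS
items_per_agent_per_day = ITEMS / agent_days
seconds_per_item = 24 * 60 * 60 / items_per_agent_per_day

print(f"{agent_days:,} agent-days in total")
print(f"{items_per_agent_per_day:,.0f} items per agent per day")
print(f"about {seconds_per_item:.0f} seconds per item")
```

Under those assumptions each agent handles about 1,120 items a day, or one item roughly every 77 seconds, which is the budget that model latency, retries, and rate limits must fit within.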

These cost numbers tell only part of the story. AI agents are highly cost-effective when used well, but they do more than reduce the cost of work we already do: they open up work we would simply not have pursued otherwise. As the tools become available, the appetite to build meaningful new services and the need to remediate technical debt grow together. Our aim is to modernize at speed and to deliver enhanced services, with hundreds of millions of dollars a year in cost avoidance while staffing holds steady, and to move from vendor-locked technologies toward platforms that give us latitude as we scale. We explore AI for the reasons set out in The Cyber Imperative and the two-billion-dollar Ship of Theseus, and we are careful not to trade one problem for another.

Getting the best return on this investment means teaching our staff to plan and implement AI-driven workloads in government effectively, the topic of our next paper on the Alberta AI Academy.

Tags: tech-stack, cloud, models, sovereign-compute, bifrost, security, governance, open-source
