No. 09 · Technical
The AI Factory: Orchestration and Observation (Nexus)
A secure cloud sandbox where agents work with autonomy, and every action stays observable and auditable as they go.
Abstract. Granting agents autonomy lets them solve problems in novel and creative ways. To do this safely in a government environment, Alberta built a custom tool called Nexus that makes the agents' actions observable as they work. Nexus provides a secure sandbox where agents work independently or in groups on hard problems that have no clear solution in advance. Because every action an agent takes in that technical space stays visible and auditable, the autonomy is safe to grant. An enterprise-grade application can now be deployed securely within Nexus in minutes. Scaling the platform to support greater orchestration will enable all four AI-based modernization strategies and open new doors for autonomous agent operations.
When AI systems are given the latitude and tools to act with some autonomy, choosing their own next steps based on their own judgment, they can accurately be called 'agents'. This agency is enabled by tools that the AI activates through its outputs; the tools run on the computer and often return a result. The AI picks up that result, evaluates its success or failure, and tries again in an ongoing process of experimentation. This is an effective pattern of working. Agentic processes elevate the effectiveness of AI past how most people use it today, as a chat tool. An AI agent becomes a persistent, capable collaborator.

## §01 Setting the conditions for success

Today, the latest agents can do meaningful work with minimal intervention, running with limited or no supervision for hours or even days. As with any worker, an agent's outputs and outcomes must be measured and managed so we can confirm its actions are meeting our goals. An AI-generated output that lacks evidence cannot be trusted, regardless of what the final product looks like. AI needs to show its work, and our Alberta AI Academy training teaches our staff to 'verify, then trust' all AI outputs.

The right balance is to give agents autonomy while enforcing observability and an audit trail, so that agents operate with speed but also provide proof that they are following the correct processes. These tools carry extensive built-in knowledge of the commands needed to operate a computer, and over-constraining them wastes that capability. We need to know what they are doing and how, but sometimes we also need to get out of their way.

## §02 Connecting the pieces is the hard part

The problem-solving ability of an AI agent is a critical success factor in building or repairing technology solutions. Creating a software application is only partly about producing code.
An app sits inside a larger, often complicated environment of operating systems, permission structures, networks, databases, tools, and third-party systems. Connecting these pieces together correctly is one of the most time-consuming parts of the work. We know how to write code; making it run reliably across all those connections is the hard part, because the pieces do not always fit. A good developer spends a great deal of time designing and testing these integrations, while also preparing the application to survive the move from the development environment to production, where thousands of unpredictable users may interact with it at once. Good code has to be logically correct, applying the right policies and rules, and also functionally aligned to the environment it runs in. Any experienced developer will tell you that this work demands persistent problem solving, experimentation, and steady learning. In short, grit. Persistence.

Over the last 18 months, Technology and Innovation staff have studied and monitored these AI agents in the domain of software development. We have seen significant growth in the capabilities of these AI agents, and also in the tools which allow them to be effective. Alberta's AI Maximalists also observed that unbounded agents, those given latitude to solve a problem without a predefined solution path, frequently revealed novel solutions to tough problems. Often the agents used methods the Maximalists grasped conceptually but lacked the specific technical knowledge to anticipate. Not all of their interventions were wanted, and so guard rails were introduced through the harness to make the agents' work more opinionated, following enterprise standards, without inhibiting forward momentum.

When given instruction to create sub-agents, the AI agents spawned dozens of instances to approach the same problem from different angles.
Such investigation revealed that they were effective in identifying workable new solutions, and also bugs and gaps in existing code, which could be closed quickly.

Enabling effective agents means guiding them with your judgment and knowing when to step back. An agent often needs context and direction from a human orchestrator to begin the work, and again at critical decisions and junctions. But the human's own knowledge can be both enabling and limiting. If the human says 'do A' and A is wrong, the agent is bound by that mistake. If instead the human says 'develop and test ten approaches in parallel' and leaves the agent to evaluate them, an unexpected solution has room to emerge. Being prescriptive about how the agent must work often confines it to your own view of the problem.

So how can we get the best of both worlds? Enter Nexus.

## §03 What is Nexus

Nexus is a Google Cloud Platform-hosted virtual environment where each developer has access to their own virtual machine and can run an unlimited number of AI agent instances in a secure sandbox, while delegating access to the agent to assist in containerization and deployment to Google Cloud. It comes with a terminal, a browser, a file system, observability, and publishing controls, as well as an overlay security model. As a further layer of protection, all apps are deployed behind a private endpoint, accessible only through a VPN connection internal to the Government of Alberta.

The system lets agents operate as first-class users of the cloud environment, doing immediate deployments into the cloud with a single 'publish this app' prompt. They have their own built-in harness that enables them to access the system's controls, and you can redelegate access every few hours so that no access you grant the agent is permanent. This is a privileged-identity-management style of access granting: the user grants access for a limited time, and the agent operates on their behalf.
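
The time-limited delegation described above can be pictured with a small sketch. This is not the Nexus implementation: the `Delegation` record, the `delegate` and `is_allowed` helpers, the scope string, and the four-hour default lifetime are all hypothetical names and values chosen only to illustrate the idea that every grant carries an expiry and must be renewed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Delegation:
    """Hypothetical record of access a user has lent to one agent."""
    user: str
    agent_id: str
    scope: str            # illustrative scope name, e.g. "deploy:publish"
    expires_at: datetime  # every grant expires; none is permanent


def delegate(user: str, agent_id: str, scope: str, now: datetime,
             ttl: timedelta = timedelta(hours=4)) -> Delegation:
    """Create a grant that lapses after `ttl` (assumed default: 4 hours)."""
    return Delegation(user, agent_id, scope, now + ttl)


def is_allowed(grant: Delegation, agent_id: str, scope: str,
               now: datetime) -> bool:
    """Allow an action only for the named agent, the named scope,
    and only while the grant has not yet expired."""
    return (
        grant.agent_id == agent_id
        and grant.scope == scope
        and now < grant.expires_at
    )


if __name__ == "__main__":
    start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
    grant = delegate("dev@example.org", "agent-1", "deploy:publish", start)
    print(is_allowed(grant, "agent-1", "deploy:publish", start + timedelta(hours=1)))
    print(is_allowed(grant, "agent-1", "deploy:publish", start + timedelta(hours=5)))
```

Once a grant lapses, the agent can act again only after the user issues a fresh one, which is the property the redelegation cycle is meant to guarantee.
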
Nexus also integrates with a suite called Ent Tools, our enterprise tools, which extend the capability of any agent. We have Brave Search, ElevenLabs, all of the API endpoints for the major cloud providers, open-source models, and a private compute cluster. We have a series of other open APIs: time, weather, and news. We have social media integration so that an agent can tap into real-time information about the world. And we have enterprise tools being built out to support SharePoint, ServiceNow, and 1GX ERP integrations. Through single sign-on, the user can delegate their access into these. We will expand it into the Microsoft 365 space in the near future, so a user can delegate Teams, email, calendar, and other access.

The purpose of the enterprise tools is to safely provide agentic resources to ministry partners who do not have an IT shop but want to build meaningful tools. Through delegation, we make it easy to gain access to these resources in a safe and monitored way.

## §04 The expanded universe

We are also layering in the Bifrost AI Gateway and custom scripts to add personally identifiable information (PII) detection and removal. Users who submit PII to models that do not have sufficient classification are flagged and notified that there is a misalignment between their model selection and their use case. Tool use through the Enterprise Tools gateway adds a further layer of security, and provides meaningful audit and review of outbound interactions.

Both platforms enable cost containment: developers and workloads can be given daily budgets to prevent runaway token use on long-running jobs, or can request point-in-time budget approvals right through the console for large amounts of data processing.

Observability comes from allowing the human developer, in their own Nexus environment, to monitor what the coding agents are doing. Administrative users of Nexus can also observe and audit all agents on all virtual machines.
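
The daily budgets and point-in-time approvals mentioned above can be sketched as a simple counter. This is an illustration under stated assumptions, not how the Bifrost or Enterprise Tools gateways are implemented: the `DailyBudget` class, its field names, and the choice to count tokens and reset at a date change are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DailyBudget:
    """Hypothetical per-developer or per-workload token budget."""
    limit_tokens: int        # standing daily allowance
    day: date                # the day the counters below apply to
    used_tokens: int = 0
    approved_extra: int = 0  # one-off approvals granted for this day only

    def _roll_over(self, today: date) -> None:
        # A new day resets usage and discards one-off approvals.
        if today != self.day:
            self.day = today
            self.used_tokens = 0
            self.approved_extra = 0

    def approve_extra(self, tokens: int, today: date) -> None:
        """Record a point-in-time approval for a large job."""
        self._roll_over(today)
        self.approved_extra += tokens

    def try_spend(self, tokens: int, today: date) -> bool:
        """Charge the budget if the request fits; refuse it otherwise."""
        self._roll_over(today)
        if self.used_tokens + tokens > self.limit_tokens + self.approved_extra:
            return False
        self.used_tokens += tokens
        return True
```

A gateway built this way would call `try_spend` before forwarding each request, so a long-running job stops when the allowance is exhausted instead of continuing to consume tokens unnoticed.
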
Administrative views have shown interesting and sometimes unexpected patterns. Agents have demonstrated growing persistence in probing their own environments to understand what options are available. Some moves surprised us. We used observability to see what the agents were doing, and created other agents to audit them, which caused us to change our patterns as agents pushed the boundaries and found gaps. This 'shakedown' period made Nexus more robust and surfaced gaps that had initially been hidden.

## §05 Nexus and the four approaches

For Alberta, Nexus has allowed us to use these agents to solve novel problems, and we have. All of the workloads we have talked about, including Git Insights, Git Insights Ministry, and dozens of apps, have been built in Nexus. Since it went live just three months ago, over 600 applications have been built on the platform, and it has enabled the kind of velocity necessary to achieve the transformations we are looking for. Nexus was a necessary evolution that unlocked a significant acceleration for our AI Garage and AI Factory models. Without Nexus simplifying the complex build process down to a single 'publish this app' prompt, the forward velocity of our developers would be stymied by the manual processes of requesting tickets and waiting days or weeks for elements of the infrastructure.

Looking forward, Nexus provides the blueprint for two of the transformation approaches coming next, discussed in the white paper on The Four Approaches to AI Modernization. Nexus currently supports Approach 1, the AI Garage, and Approach 2, the AI Factory, which take on direct application remediation and development. Expanding Nexus enables Approach 3, with layers of orchestrator agents overseeing hundreds of virtual environments, one for each legacy application. We can imagine a situation where all the applications in a ministry are booted up in their own virtual environment with an overarching orchestration that monitors them.
Embedded within each app, one agent manages its health and performance, patching, and security. If there are 200 apps in a ministry, there would be 200 agents overseeing them, and then oversight agents on top, monitoring the telemetry on the uptime and downtime of the apps, the status of the agents, and what they are working on. Are they patching? Are they releasing? Are they documenting? And then there are further abstracted layers of architects, hard at work integrating and rebuilding systems into new target environments and tech stacks.

Nexus also lays the groundwork for Approach 4, where we build an entirely headless agent orchestration layer, with the functions of government exposed and consumed via API. In this environment, individuals from across government who have been trained through the AI Academy will be able to define an agent, delegate access to it, and then monitor and track its steps to complete their objectives. That future version of Nexus is not far off, and we are working diligently to understand how it scales out.

## §06 Enter the claw

This platform is also allowing our AI Delivery and Enablement branch to begin implementing 'claw-based' orchestration, where self-directed agents, known now as claws, or continuous learning autonomous workers, like OpenClaw or Hermes, are activated in a controlled government environment. This is the likely target state for agentic use in government. A claw agent operates in a networked environment, where the development, the networking, the cybersecurity, and the monitoring are all handled by claw-style agents working somewhat collaboratively, or even adversarially. In such a scenario, white-hat agents can simulate cyber threat actors, playing cat and mouse. This is the kind of persistent pattern being emulated by threat actors, so it is reasonable for us to mimic these patterns using a claw such as OpenClaw or Hermes.
Expanding the use of this kind of claw architecture through Nexus becomes a reasonable next step within the coming 6 to 12 months.

The virtualization and containerization of our workloads, the speed to deployment, the observability, and the scalability across hundreds of developers, each running tens to hundreds of agents, each managing tens to hundreds of applications, are absolute requirements for using AI at scale within a large organization like government. So is the ability to run agents autonomously with sufficient confidence that they are working in alignment, that mistakes can be caught, that integrations to GitHub are happening, and that commits enable undo and revert. All of this increases confidence in AI use and management. These habits are codified into the harnesses, so AI agents follow the patterns set out by best practice. Nexus has unlocked the forward velocity needed to advance our vision of a twentyfold acceleration of delivery.

## §07 Expanding AI access

The Nexus platform lets us prepare for democratized access to autonomous agentic AI across a larger workforce, including non-IT staff, where the aggregate effect of unmanaged AI usage could otherwise cause chaos. Nexus drives AI operation workloads into a state of consistency, where every Builder has their own environment, every environment is observable, and every agent is managed. Model access is controlled through gateways like Bifrost, and tool access is air-traffic-controlled by the rules at the Enterprise Tools gateway. Any government organization seeking to move at speed with agentic AI needs to lay down a similar architecture to match security with speed.

Nexus solves one critical half of observability: knowing that the agents are working, and being able to audit their technical activities. But that information remains inaccessible to business clients and partners, subject matter experts, the engagement team, and project managers.
For this, we built a second observability layer called Velocity, which the next paper covers.

DM Janak Alford and ED Zoran Mijajlovic walk through the Nexus platform. Video: https://youtu.be/3PIEPoLiwG8
Tags: ai-factory, nexus, sandbox, gcp, agents, orchestration, observability, claw