No. 14 · Policy & People
The Compression Problem
How hierarchy, communication, and human oversight must change in the AI age.
Abstract. Hierarchical organizations exist to solve an information problem. Ground truth must flow up to senior leadership for decision-making, and along the path it is compressed and selectively disclosed. The pyramid is a pre-digital, lossy compression algorithm. AI agents work differently and remove much of the time and verbosity tax that the human hierarchy imposes, though they face their own compression limits set by finite context windows. To capture the gains, organizations must redesign the hierarchy, shift briefings away from text toward simulation and visual interaction, and move human oversight from the individual agent to the level of the system.
Hierarchical organizations exist to solve an information problem. Ground truth information must flow up to senior leadership for decision-making, but along the path it must be prioritized and selectively disclosed based on importance. Each level of middle management further compresses and reduces the information based on experience and insight, so that only what is perceived to be the most important information reaches decision-makers. Along this path, individual managers discriminate based on relevance and abstract into common themes and principles.

## §01 The compression problem

This compression is a necessity. If an organization has a thousand individual contributors, each providing a one-page written report on the issues they are facing, the leadership of the organization would need to read a thousand pages of documentation every day, which is not possible. So the information along that path must be compressed, the concepts grouped thematically, and the lower-priority and lower-value issues resolved locally or omitted, so that by the time the information reaches senior leadership it is roughly the same length and structure as a single ground-truth report, although highly compressed.

The hierarchy acts as a form of lossy compression algorithm, with unwritten rules and inconsistent application. In its best implementation, there are standard operating procedures, and managers understand clearly what is important and what is not. In a dysfunctional organization, it is reasonable to expect that important signals are lost, as individual intermediary managers hold an inconsistent understanding of the strategic vision and of what constitutes an important matter.

The pyramid-shaped hierarchy is a pre-digital information compression algorithm to support decision-making. If this were not the case, organizational hierarchies would be flat, with a single leader and a thousand individual contributors reporting directly.
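The thousand-contributor example can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a span of control of eight (the government ratio cited later in this paper), a one-page briefing at the top, and an even share of the compression at each layer. The function names are hypothetical.

```python
def layers_needed(contributors: int, span_of_control: int) -> int:
    """Count the management layers needed to funnel everyone to one leader."""
    layers = 0
    groups = contributors
    while groups > 1:
        groups = -(-groups // span_of_control)  # ceiling division
        layers += 1
    return layers


def per_layer_ratio(pages_in: float, pages_out: float, layers: int) -> float:
    """Compression each layer must apply if the load is shared evenly."""
    return (pages_in / pages_out) ** (1 / layers)


# Assumptions: 1,000 one-page reports, one page reaching the top, span of 8.
layers = layers_needed(1000, 8)
ratio = per_layer_ratio(1000, 1, layers)
print(f"{layers} layers, each keeping about 1 page in {ratio:.1f}")
```

Under these assumptions the pyramid needs four layers, and each layer must discard roughly four fifths of what it receives. A flat organization (span of one thousand) would need a single layer that performs the entire thousand-to-one compression alone.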
"The pyramid-shaped hierarchy is a pre-digital information compression algorithm to support decision-making." · Janak Alford, Deputy Minister, Ministry of Technology and Innovation

In reverse, hierarchical organizations also serve as an information decompression algorithm. In the Alberta AI Academy we use the term 'hydration': a small amount of information, such as a vision statement, can be expanded and contextualized until it is relevant to each of the thousand individual contributors.

Imagine a vision statement from a leader that the organization will be 'extremely human-centric'. That statement means different things to different functions. In customer service, it may mean training people to have meaningful and empathetic conversations. In product development, it may mean building products that are highly customizable to the user's individual physiology or psychology. In marketing, the teams may focus on telling stories that resonate with customers on a personal level. In finance, it could suggest sacrificing some profitability to support the brand image and to remain principled in how you deliver your services. All of these require the context of the unit, and a leader may not have the time to write a bespoke vision statement for every employee.

Middle management takes a direction or strategy and, through its closer association with and knowledge of the units it oversees, contextualizes that vision and answers the critical question: what do we do differently on Monday morning as a result of this vision? A functional organization does this effectively, transforming vision into specific tactics. A dysfunctional organization leaves such statements ambiguous and leaves individual contributors confused and uncertain about how their standard operating procedures and tactics are expected to change.

## §02 How hierarchies translate in the AI age

As AI agents enter the organization, they offer an opportunity to address this compression problem.
They work fundamentally differently from people. They still suffer from challenges in context, but they overcome many of the specific challenges that emerge within a hierarchical organization.

The first is the time tax. It is common for an organization to take a month to pass a request from senior leadership down to ground truth, to formulate a response, and to pass it back up through the levels of the hierarchy. With a strong sense of urgency this can happen in hours, but standard business takes weeks to months for a round trip. This is a function of the cadence of meetings, the method of communication, the time it takes to absorb a message and transform it, and the need at each level to structure it, vet it, disseminate it differently to each unit, assemble and vet the response of individual contributors, and pass editorial and quality checks back up through the organization. That month of delay is a time and productivity tax on any given issue.

AI agents arranged in a hierarchy do not suffer from this tax. Given their speed and their ability to share a common, exact recollection of what is being asked, a request in an agent hierarchy could take seconds. A supervisor agent asks a worker agent to respond and receives a ground truth reading, contextualized back to it, in moments. The speed problem of hierarchy disappears.

The second is verbosity. The length and detail of ground truth information no longer need the same compression. An AI with a million-token context window could absorb and hold the status reports of a thousand individual contributors in a single prompt and then determine, through pattern-matching, which signals are most relevant. It could save this information, write deterministic scripts to process it and extract the most relevant tags, and recursively analyze the data and insights across a wide range of thematic categories, a process that takes seconds.
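A rough budget check shows why a thousand one-page reports plausibly fit in a million-token window. This is a sketch under stated assumptions, not a measurement: roughly 500 words per page and roughly 1.3 tokens per English word are rules of thumb that vary by document and tokenizer, and the function name is hypothetical.

```python
def estimate_tokens(reports: int, words_per_report: int,
                    tokens_per_word: float) -> int:
    """Rough token count for a batch of reports (rule-of-thumb only)."""
    return round(reports * words_per_report * tokens_per_word)


CONTEXT_WINDOW = 1_000_000   # the million-token window named in the text
WORDS_PER_PAGE = 500         # assumption: one dense page
TOKENS_PER_WORD = 1.3        # assumption: varies by tokenizer and language

needed = estimate_tokens(1000, WORDS_PER_PAGE, TOKENS_PER_WORD)
per_report = estimate_tokens(1, WORDS_PER_PAGE, TOKENS_PER_WORD)
max_reports = CONTEXT_WINDOW // per_report

print(f"1,000 reports need about {needed:,} tokens")
print(f"Fits in one window: {needed <= CONTEXT_WINDOW}")
print(f"Ceiling at this density: about {max_reports:,} reports")
```

Under these assumptions the thousand reports use about two thirds of the window, which leaves limited headroom for instructions and output. The margin is real but not unlimited, a point the paper returns to in §04.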
AI does not face the same compression requirements as people, so the data does not need to transit through intermediary layers or be contextualized in the same way.

The third concerns the structure of memory itself. People communicate the way they do because they do not share a common memory bank. What I know has to be converted into words and transmitted slowly, through typing at roughly one word per second or through oral communication at roughly forty bits per second. What I have observed and understood goes through a level of lossy human compression before I can share it at all. AI agents do not share this failing, because they can access the same ground truth information at any level of the hierarchy. The transiting and transformation of data, the compression and contextualization, need not apply. One agent passes a signal that such data exists, and another agent selects it, reads it in real time from the same synchronized source, and provides feedback according to its function.

There is still a need for differentiation between agents, and this is a key point. Agents can share the same common memory source, but a well-configured agent with its own harness and its own context will perform a different function. Industry literature shows that you cannot build one agent to rule them all. You must separate concerns, similar to how humans hold specific roles. A coder agent and a cybersecurity agent, separated and not sharing the same exact context, will outperform a single agent that holds the entirety of the skills.

Differentiation also enables parallelism. Two separate agents can undertake different pieces of work concurrently, while one agent can take only a single step at a time. The more agents you bring on, the more you increase parallel processing, and it is common to have dozens or hundreds of agents processing concurrently on the same initiative as they each run their own auditing and review.
A further reason for separation is that an individual AI agent is an unreliable witness to its own performance. When prompted, it will often make mistakes and overconfidently claim that its work is complete. Another agent, with its own context window and its own prompt, will look at the claims with a fresh set of eyes and evaluate the work differently. Two coding agents, one with the primary task of coding and one with the secondary task of audit, will outperform a single agent in speed and reliability as well as in individual skills and harness capabilities.

## §03 Inverting the hierarchy

In a human hierarchy, you want to maximize individual contributors while minimizing the overhead of management. In government there is roughly a one-to-eight ratio: one manager for every eight staff. In an agentic hierarchy it may be beneficial to invert that ratio. For every AI worker, you could reasonably expect eight supervisors auditing and validating that the work is accurate and complete.

Inverting the ratio · 1 : 8. In a human hierarchy, roughly one manager for every eight staff. In an agentic hierarchy the ratio may invert, with eight supervisor agents auditing and validating every worker.

If a single coding agent were producing an enterprise application, you could surround it with a cybersecurity agent, a QA agent, a change management agent, a code review agent, an infrastructure agent, a communications agent, a legal agent, and a privacy agent, each overseeing the work and raising tickets when deficiencies occur. This is the likely future implementation model, where you segment the different concerns. There can be only one chef in the kitchen, but you can surround that chef with sub-agents, each with its own concern, running ongoing audit and evaluation.
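The one-worker, many-supervisors pattern can be sketched in a few lines. This is a toy illustration, not the implementation described in any of the referenced papers: the `Ticket` type, the three check functions, and the string-matching rules are all hypothetical stand-ins for what would in practice be separate agents with their own prompts and context.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Ticket:
    raised_by: str  # which supervisor flagged the issue
    issue: str      # short description of the deficiency


# A supervisor is modelled as a function: work product in, issues out.
Check = Callable[[str], List[str]]


def security_check(code: str) -> List[str]:
    return ["hard-coded credential"] if "password =" in code else []


def qa_check(code: str) -> List[str]:
    return [] if "def test_" in code else ["no tests found"]


def privacy_check(code: str) -> List[str]:
    return ["personal identifier in source"] if "ssn" in code.lower() else []


# Three of the eight concerns named in the text; the rest follow the same shape.
SUPERVISORS: Dict[str, Check] = {
    "cybersecurity": security_check,
    "qa": qa_check,
    "privacy": privacy_check,
}


def audit(work_product: str, supervisors: Dict[str, Check]) -> List[Ticket]:
    """Run every supervisor over the single worker's output and collect tickets."""
    tickets: List[Ticket] = []
    for name, check in supervisors.items():
        for issue in check(work_product):
            tickets.append(Ticket(raised_by=name, issue=issue))
    return tickets


draft = 'password = "hunter2"\ndef login(user): ...\n'
for ticket in audit(draft, SUPERVISORS):
    print(f"[{ticket.raised_by}] {ticket.issue}")
```

The design choice worth noting is that the worker never grades itself: each supervisor sees only the work product and its own narrow concern, which mirrors the separation of context the section argues for.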
You may use your primary model with its largest context window as the worker, while secondary agents perform simpler deterministic high-speed checks and flag when systems fall out of sync, as covered in the Red, Blue, Yellow, and Green paper and in the Anti-Drift Harness paper. This inverts the hierarchy in the agentic space, because for every worker you may have eight supervisors.

You may also require a differently shaped hierarchy altogether, where issues that cannot be resolved at the individual agentic layer are escalated, with the ground truth attached, to be resolved in real time by a council or a single agent that keeps implementation and vision from falling out of sync with the overall strategy.

A role for hierarchy and for differentiation of tasks remains, but many traits of the human hierarchy fall away. You are left with a fundamentally different operating model that retains some individual distinction but rests on a common memory system, and many of the artifacts of a traditional human hierarchy disappear.

## §04 The limits agents still face

Agentic systems still suffer from a significant failing. Even though an agent can process thousands of times more information in a single prompt than a person, at the largest scale this moves the problem up an order of magnitude or two rather than removing it. This paper opens with "The $2 Billion Ship of Theseus" and a system of 466 million lines of code. That is more code by a wide margin than a single agent can process. Single agents are effective with perhaps forty thousand to four hundred thousand characters, which means an individual agent can hold an effective understanding of only so much information at its lowest level.

The effective reach of a single agent · 40K–400K. Single agents are effective with perhaps forty thousand to four hundred thousand characters, a fraction of the 466 million lines of code in the estate, which is why levels of hierarchy persist, simply set at a different scale.
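The gap between the estate and a single agent's reach can be estimated directly. The sketch below takes the 466 million lines and the 400,000-character upper bound from the text. The average line length of 40 characters and the 4,000-character summary each agent hands upward are assumptions chosen only to make the arithmetic concrete, and the function names are hypothetical.

```python
LINES_OF_CODE = 466_000_000   # from the text
AGENT_REACH_CHARS = 400_000   # upper bound of effective reach, from the text
CHARS_PER_LINE = 40           # assumption: average source line length
SUMMARY_CHARS = 4_000         # assumption: size of each summary passed upward


def leaf_chunks(lines: int, chars_per_line: int, reach: int) -> int:
    """Number of agent-sized reads needed to cover the raw code once."""
    total_chars = lines * chars_per_line
    return -(-total_chars // reach)  # ceiling division


def levels_to_single_view(chunks: int, reach: int, summary_chars: int) -> int:
    """Summarization layers needed before one agent can hold the whole picture."""
    fan_in = reach // summary_chars  # summaries one agent can read at once
    if fan_in < 2:
        raise ValueError("summaries are too large to merge")
    levels = 0
    while chunks > 1:
        chunks = -(-chunks // fan_in)
        levels += 1
    return levels


chunks = leaf_chunks(LINES_OF_CODE, CHARS_PER_LINE, AGENT_REACH_CHARS)
levels = levels_to_single_view(chunks, AGENT_REACH_CHARS, SUMMARY_CHARS)
print(f"{chunks:,} leaf reads, then {levels} layers of summarization")
```

Under these assumptions the code base needs tens of thousands of leaf-level reads and three further layers of abstraction before a single strategic agent can hold it. At the lower bound of reach (40,000 characters) the same calculation gives ten times as many reads and a deeper tree.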
To guide the implementation, we need to read every line of that code to determine the business rules and functions it describes. What was the code attempting to do? Was it a user interface, a workflow, data protection, a sign-in form, or any of the thousands of functions a modern application serves? To analyze this, we need thousands of agents reading the code, contextualizing the function of every piece, and abstracting it up into common themes.

A sign-in form with a username, a password, and single sign-on can be categorized as authentication. Each of the thousands of primary functions built into the code becomes an abstraction in an overall architecture. We could agree that an application has authentication, but the implementation will vary across every application based on the nature and sensitivity of the data it holds. There are common themes and common architectures, and arriving at them is itself a form of compression.

You need this process of compression and abstraction so that an overall plan is not 466 million lines of code, and not even thousands of modules, but a comprehensive yet highly compressed business capability canvas that a strategic agent can understand, vet, and govern. Levels of hierarchy will persist because of the context window limits of the model. They will simply be set at a different scale.

A year or two ago we were pleased when a model could reliably produce a piece of working code that compiled. Now it is uncommon for a page, or even a full application, to fail to compile, because coding agents have become far more advanced. That pushes us higher up the hierarchy. Instead of one page we want the whole application, instead of one application a ministry, and instead of a ministry a government. The context windows have increased, but not by that much.
If you tried to have one agent parse and redevelop the whole of government, it would be too slow and would hit the same limits a human encounters, where complexity eventually overwhelms the agent.

Unlike people, AI has firmly capped context windows. After roughly a million tokens, even the best models reset, go through a form of compression, and summarize what they are working on. That is a form of engineered amnesia. In building even a single application, an agent can go through tens or hundreds of compression cycles, and the worker can lose its train of thought.

The human analogy is sleep. If you stop mid-task and sleep, you wake remembering the general nature of the task, but some fidelity is lost when you pick the work back up. It may not happen over one night, but after a two-week vacation, or a year of leave, the fidelity of the task at hand fades. That is similar to the compression a model goes through. Every handoff from one agent to the next is in essence a new agent stepping up to take on the task. We can maintain a high-fidelity, compressed narrative of the steps we have taken for the next agent to pick up, much as a worker documents the case file before going home so a colleague can resume the engagement the next day.

Information compression has to happen in the agentic era too. It simply happens at a different magnitude and across a different degree of fidelity in memory and communication. If we are going to transform 466 million lines of code down to several thousand business functions and several hundred modules, this compression has to occur, for the agent's benefit and for our own. If we are going to keep a human in the loop, and forgoing one currently seems untenable, we will need to compress the information back down to human levels of understanding.

## §05 Transforming the briefing

If a human and an agent confer on the transformation of two hundred applications, the information presented needs radical compression for a decision-maker.
The rationale and methodology behind the compression may be captured in thousands of pages of process, but that exceeds human capacity to read, and exceeds the capacity of the often non-technical decision-maker to oversee and vet.

So how do you compress a proposal such as the one in the Velocity white papers, which themselves span hours of reading, into a single briefing a decision-maker can understand? The common method is a briefing note that gives a decision-maker enough to understand the broad concepts, the risk mitigations, and the general intent, with evidence that the implementation can be trusted. There is research, there is evidence, and there are identified risks with mitigation strategies. All of this lets senior leadership feel grounded in their approval and justified if challenged or if things go off the rails.

That compression matters, because if it is done wrongly, if the briefing fails to satisfy the human desire for information even though it is well-grounded in its thousands of underlying pages, the project will not be approved.

This paper argues that we are over-compressing our briefings, because human capacity and AI capacity are now misaligned. An AI supervisor would readily read the thousand pages and make a determination. The human supervisor may read only one or two. So how do we keep human capacity limits from blocking a good idea simply because people cannot process the data needed to fully appreciate it?

There is an opportunity to change the modality into a less compressed form of communication. Written and spoken briefings suffer from a comprehension and compression problem, while our visual and spatial understanding is far stronger. Instead of preparing a written briefing, an AI could generate a simulation, a visualization, a spatial interaction, or a game that lets the decision-maker engage with the material in a different way.
If a five-year remediation were simulated and animated forward at an accelerated rate, showing the kinds of processes that would be executed, the decision-maker's spatial and visual reasoning could fill the gaps that a written briefing cannot represent. By shifting the modality from writing toward visual, spatial, and audible analysis, we present the information in a different way.

The cost of a mistake may also fall significantly. To implement a hundred-million-dollar system, I need high confidence that the checks and balances are met, because it is a one-shot opportunity. The economics of AI de-risk some of that. If the downside of a decision is a few thousand dollars and a weekend of lost time, the risk of the decision is trivial, and I can run ideation and development as a hypothesis, like a lab experiment. I come back on Monday morning, see the results of the modernization effort, refine, review, and test, and make my next decision at a low cost of failure. That may be a reasonable way to refine our expectations when the only cost is a small amount of compute. The risk decreases because the cost of failure becomes orders of magnitude smaller and issues can be addressed as they appear.

Without a shift toward different modalities, humans will not be able to keep up with or understand the technical basis of what the AI presents, simply because of the mismatch in context windows.

## §06 The future organization and the human in the loop

We speak at length about the need for a human in the loop, but that presupposes we keep our current hierarchies and our current human-centric briefings, neither of which survives the AI era. Organizations such as Anthropic already identify that we should not be prompting agents directly but building loops of processes that prompt them on our behalf.
An individual contributor having a one-on-one conversation with an agent becomes an anti-pattern as agents scale in capability, because the interaction is limited by the lowest common denominator. If you communicate at forty bits per second but the AI can work at four million bits per second, you inhibit its growth and constrain it to what humans are slowest at. You leave the Ferrari in the garage and ignore our millennia of evolution in visual, spatial, and auditory understanding that can exceed those limits. If the human-in-the-loop modality is going to survive long-term, it must shift away from text toward simulation, engagement, and a fully spatial, visual, and sensory means of receiving a briefing as a proxy for understanding the detailed decisions.

Two speeds of communication · 40 vs 4M bits/s. People carry meaning at roughly forty bits per second; an AI can work at roughly four million. Hold the interaction to human speed and you leave the Ferrari in the garage.

"You leave the Ferrari in the garage and ignore our millennia of evolution in visual, spatial, and auditory understanding." · Janak Alford, Deputy Minister, Ministry of Technology and Innovation

Consider an analogy. In a garden, billions or trillions of organisms interact within the soil, air, and water. There are the plants, the bacteria, the viruses, the worms, and the bugs. In your own body, somewhere between fifteen and thirty trillion organisms make up your microbiome. You deal with all of them through an abstracted experience. You understand in principle what they are doing and how to intervene when the aggregated signs show that things are not going well, but you leave those ground-level processes without distinct management. You do not manage every bacterium in the soil or every corn plant in the field. You manage the overall system for their success.
You meet the broad necessities of life and then allow each organism to use its own intelligence, because each has its own non-human biological intelligence. You monitor the aggregate signs rather than the individual activities.

The human role in the loop needs to shift in the same way, because of the compression challenge we face. It will not be that every action of an agent is individually overseen by a human, but that we have systems of compression and abstraction that let us understand these things at a meta level, a systems level, and make decisions accordingly. The work of a human individual contributor today is likely to change significantly as the sophistication of the work rises. We scale along that line of growth with AI and move toward a different form of experience.

So what does a briefing note look like in the future? I do not want to take the transformation of government on anyone's word, no individual person and no individual language model, because we are all prone to our specific deficiencies. What I will do is take the evidence of a system being presented and simulated, where I can witness with confidence the transformation of a government system through an accelerated, simplified simulation that fits within my own ability to perceive and manage, that satisfies my curiosity through interaction, and that leaves me confident the outputs are within the controls I am looking for. Then I allow it to go forward, especially when the costs of investment are so reduced.

The only way I get there is by addressing the compression problem: changing the nature of hierarchy in my organization, the nature of communication, and the way the human in the loop factors in. Only when I alleviate this issue can we maximally reap the benefits of AI integration into the workforce.
If we maintain the existing protocols, the slow hierarchy that takes a month to pass a message between an individual contributor and a senior executive, no amount of AI will produce the return I am seeking, because the time being lost is not the time of people doing work. It is the time in between work.

Recall from the Velocity paper that the system now tracks the time a task spends at the stage or step level within a module and allocates it to the proper player. The AI may complete its work in an hour, but the human may complete their review in a week. If we do not use different ways of communicating and different types of hierarchy to present information, then even collapsing the time of doing the work might move the briefing only from thirty days to twenty-nine, because the interstitial losses happen at human speed and at the failings of the organization: the time to read a briefing, the time to write and edit it, the movement between inboxes, and the sequencing of meetings.

The prize, and the condition · 20×. A twentyfold improvement in delivery speed is not achievable while maintaining the exact same hierarchy we administer today. The hierarchy has to be reinvented at the same time we explore AI.

A twentyfold improvement in speed is not achievable while maintaining the exact same hierarchy we administer today. The successful organizations will be the ones willing to reinvent the hierarchy at the same time they explore AI, producing a result that is the best of both worlds: human decision-making at the ecosystem level rather than at the level of the individual agent, with AI doing what it is good at and people doing what they are good at, spatial and strategic reasoning complementing tactical, on-the-ground reasoning.

There is more to say on this topic. It is the defining challenge of AI implementation and will be an ongoing priority for this government and this ministry as we work to find the savings and gains needed to reach that twentyfold improvement.
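The "thirty days to twenty-nine" point is the familiar Amdahl-style bound: speeding up one part of a process helps only in proportion to that part's share of the total. The sketch below assumes, for illustration only, that a thirty-day round trip consists of one day of actual work and twenty-nine days of waiting between steps. The function names and the thousandfold work speedup are hypothetical.

```python
def overall_speedup(work_days: float, wait_days: float,
                    work_speedup: float) -> float:
    """End-to-end speedup when only the working time is accelerated."""
    before = work_days + wait_days
    after = work_days / work_speedup + wait_days
    return before / after


def max_wait_for_target(work_days: float, wait_days: float,
                        work_speedup: float, target: float) -> float:
    """Waiting time that can remain if the end-to-end target is to be met."""
    allowed_total = (work_days + wait_days) / target
    return allowed_total - work_days / work_speedup


# Assumption: 1 day of doing, 29 days of inboxes, meetings, and review queues.
gain = overall_speedup(1, 29, work_speedup=1000)
budget = max_wait_for_target(1, 29, work_speedup=1000, target=20)

print(f"AI makes the work 1000x faster: overall gain is {gain:.3f}x")
print(f"To reach 20x, waiting must fall from 29 days to {budget:.2f} days")
```

Under these assumptions, even a thousandfold faster worker yields an overall gain of about 3 percent, while the twentyfold target requires the interstitial time to shrink from twenty-nine days to roughly a day and a half. That is why the gain depends on the hierarchy and the briefing process rather than on the agent alone.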
Tags: compression, hierarchy, organizational-design, agents, context-window, human-in-the-loop, change-management