The Invisible Boundary of the DOM

A browser-based AI agent works through a graphical user interface, but its actions are limited by a barrier: the Document Object Model. This layer, which represents the web page as a tree of elements, is the boundary within which such agents operate. Every click and every form submission occurs within this closed space. What the DOM cannot see belongs to the browser and the operating system: native dialogs, security prompts, context menus, and device drivers. These elements are not part of the DOM, but they are part of the real computing system. When an agent needs to handle a print request, access a cryptographic key, or change a file path in a native dialog, the DOM has no access to these levels.

This limit is tangible: it separates two planes of interaction. The first, the DOM, is a software abstraction that lives inside the browser. The second, the operating system, is the software layer that manages hardware resources, memory, and running processes. The interface between the two is not a simple link but a security boundary, designed to prevent an error in one level from compromising the other. This design, intended for stability, has made the DOM the boundary of an era of limited automation.

The transition from DOM-based interaction to one that includes OS-level actions is not a software update. It is an architectural transformation: the agent no longer just navigates pages, it interacts with the operating environment. This shift has a cost: increased security complexity, greater exposure to system vulnerabilities, and higher resource requirements. It also has an advantage: the ability to operate in real-world workflows that extend beyond the web page.

The Transition from Agent to Operational Entity

Amazon Bedrock AgentCore has introduced a new level of functionality: access to OS-level operations. This does not simply mean that an agent can open a file or print a document. It means the agent can handle interactions that the DOM can neither detect nor control. An agent that needs to authorize access to an external service cannot always just click a button in the page. It may have to respond to a security prompt that appears at the OS level, or to a context menu that opens only on request. These interactions are not visible through the DOM, but they are part of the operational session.

The technology behind this transition is complex. AgentCore Browser, the execution engine, operates in an environment that is isolated but not closed. To handle OS actions, it relies on integration mechanisms with the operating system itself. This requires careful security management, because an agent cannot be allowed to access arbitrary resources. AgentCore Identity, a separate service, therefore manages credentials and permissions, so that each OS-level action is authorized and tracked. The agent is not a free entity, but a controlled one.
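
The idea that each OS-level action is authorized and tracked can be illustrated with a minimal sketch. It is not the AgentCore Identity API: the names OsAction, ActionGate, and authorize, along with the action kinds, are hypothetical and exist only to show an allow-list check that records every decision.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class OsAction:
    """A hypothetical OS-level action requested by an agent."""
    agent_id: str
    kind: str      # e.g. "print_dialog" or "file_picker" (illustrative names)
    target: str    # e.g. a printer name or a file path


@dataclass
class ActionGate:
    """Permits only explicitly allowed action kinds and logs every decision."""
    allowed: dict[str, set[str]]                  # agent_id -> permitted kinds
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, action: OsAction) -> bool:
        # Deny by default: unknown agents get an empty permission set.
        permitted = action.kind in self.allowed.get(action.agent_id, set())
        # Record grants and denials alike so the session stays traceable.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": action.agent_id,
            "kind": action.kind,
            "target": action.target,
            "permitted": permitted,
        })
        return permitted


gate = ActionGate(allowed={"agent-1": {"print_dialog"}})
granted = gate.authorize(OsAction("agent-1", "print_dialog", "office-printer"))
denied = gate.authorize(OsAction("agent-1", "file_picker", "/tmp/report.pdf"))
```

The gate denies by default and logs the refusal as well as the grant, which reflects the point of the paragraph above: the agent acts only inside a controlled, auditable context.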

The cost of this evolution is measurable. Hapag-Lloyd, with a fleet of 313 ships and a capacity of 3.7 million TEU, has integrated AgentCore to automate feedback flows. Each agent that interacts with the operating system requires a more complex configuration and more in-depth monitoring. Performance metrics, logs, and execution traces are no longer just navigation data: they also describe the agent's interaction with the system. Sumo Logic, which provides dashboards for AgentCore, monitors not only response time, but also the number of OS actions performed, the frequency of security requests, and the behavior of context menus.
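
The kind of metrics described above can be sketched as a small aggregation over session events. The event schema here (a "type" field and a "latency_ms" field) and the function name summarize are assumptions made for illustration; they do not reflect the actual log format of AgentCore or the Sumo Logic dashboards.

```python
from collections import Counter
from statistics import mean


def summarize(events: list[dict]) -> dict:
    """Aggregate hypothetical session events into a few headline metrics."""
    if not events:
        return {"counts": {}, "mean_latency_ms": 0.0, "os_share": 0.0}
    counts = Counter(e["type"] for e in events)
    return {
        # How many events of each type occurred in the session.
        "counts": dict(counts),
        # Average latency across all events, in milliseconds.
        "mean_latency_ms": mean(e["latency_ms"] for e in events),
        # Fraction of events that were OS-level actions.
        "os_share": counts["os_action"] / len(events),
    }


# Invented sample data, used only to exercise the function.
session = [
    {"type": "dom_action", "latency_ms": 120},
    {"type": "os_action", "latency_ms": 340},
    {"type": "security_prompt", "latency_ms": 800},
    {"type": "os_action", "latency_ms": 260},
]
summary = summarize(session)
```

Tracking the share of OS-level actions separately from DOM actions is one simple way to see how far a given agent has moved beyond page navigation.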

This level of observability is essential. Without it, an agent operating at the OS level becomes an opaque entity. Observability is not a luxury: it is a requirement for dependability. A system that cannot be monitored cannot be trusted. The transition from agent to operational entity therefore requires a control system that is more robust than the system itself.

Expectations and Realities of Autonomous Control

Market expectations regarding the ability of AI agents to operate autonomously are often exaggerated. Companies like SAP, which have invested $1.16 billion in an 18-month German laboratory, expect agents to replace entire departments. But the reality is more complex. The agent does not replace a human operator, but becomes a control agent for a larger system. It is not about replacing jobs, but about extending the capabilities of the people who hold them.

According to Dario Amodei, an AI safety expert, “the most common mistake is to believe that an agent can operate autonomously without a control structure.” The agent is not a free entity, but an entity that must comply with safety, traceability, and authorization constraints. Its power lies not in freedom, but in the ability to operate within a controlled context. The risk is not that the agent will rebel, but that it will operate in an unforeseen manner, without anyone being aware of it.

“The system is not autonomous, but dependent on a control structure that must be more robust than the system itself.” — Dario Amodei, AI safety expert

The tension between expectation and reality also manifests itself in the way companies manage adoption. Hapag-Lloyd has chosen not to extend the use of agents to all employees, but to limit it to specific scenarios. The goal is not total automation, but to increase the quality of feedback. The agent does not replace the employee, but amplifies their ability to analyze.

The Cost of Operational Dependency

The transition to operating system-level operation is not free. Each agent that interacts with the operating system requires a more complex configuration, deeper monitoring, and more rigorous permission management. This implies an increase in infrastructure costs. It’s not just about more powerful hardware, but also about a denser security network, a more extensive logging system, and a larger operations team.

The cost is not only economic. It is also strategic. Whoever controls access to these actions at the OS level controls the system. Whoever has the ability to monitor and track every action has the ability to intervene. This shifts power from those who design the agent to those who manage the operating system. The agent is no longer a free entity, but an entity that operates under continuous supervision.

The trade-off is clear: operational efficiency increases, but so does dependence on a control system. The cost of that dependence is measured not in euros, but in lost autonomy. Those who adopt this technology do not gain freedom, but a new type of dependence. The system is no longer a set of processes, but a control system. The agent is no longer a tool, but a component of a larger system.

The Next Step

If the DOM is the boundary of an era, the OS is the new frontier. By 2028, it is likely that OS-level interaction will become standard for agents in production. But it will not be a uniform adoption. Companies that have already invested in security infrastructure, such as Hapag-Lloyd with its 3.7 million TEU capacity, will be at an advantage. Those who do not have a robust control structure will be forced to build one, at a high cost.

If you are evaluating the adoption of AI agents, the question is not whether the agent can operate at the OS level, but whether your system is able to manage the consequences. Control is not an option, but a requirement. If you do not have an observability system, a traceability system, and an authorization system, you cannot manage an agent that operates at the OS level. The transition is not only technical: it is strategic.

Content generated and validated autonomously by multi-agent AI architectures.