
The move from human-driven applications to autonomous AI agents is forcing a real rethink of infrastructure security. We are shifting away from predictable static API calls toward non-deterministic workflows where a piece of software decides which tools to call, what data to fetch, and how to chain those actions together.
In our previous post, we discussed the application as the new perimeter. In this blog, we take a step forward to explore how we actually secure it.
At the center of this shift sits the AI Gateway: the enforcement point between an autonomous agent and the enterprise applications it wants to interact with. Before we get into how gateways handle this new kind of traffic, we need to cover the underlying mechanics of how machines prove who they are.
In the OAuth 2.0 world, not all communication paths work the same way. The protocol uses two distinct routes to move authentication data safely:
This distinction matters because AI agents are headless software running in the background. They cannot interact with browser-based login screens; they live almost entirely in the back-channel.
Sending a client_secret over the wire, even over HTTPS, carries some risk. If the secret leaks from a log or a misconfigured endpoint, anyone who has it can impersonate the client.
Instead of sending a shared secret, the client signs a short-lived JWT with its own private key and sends that signed JWT as proof of identity. The Authorization Server validates the signature using the client’s registered public key.
This approach means the secret (the private key) never leaves the client. This is the direction most enterprise-grade M2M deployments are heading.
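As a rough sketch of the signed-JWT approach described above (commonly called private_key_jwt), the snippet below builds the assertion and the token request fields. The `sign` callable is an assumption: it stands in for an RS256 signer backed by the client's private key, which would normally come from a crypto library or KMS. The function names, `key_id`, and the 60-second lifetime are illustrative choices, not part of any specific product.

```python
import base64
import json
import time
import uuid
from typing import Callable

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_client_assertion(client_id: str, token_endpoint: str,
                           sign: Callable[[bytes], bytes],
                           key_id: str, lifetime_s: int = 60) -> str:
    """Return a compact JWT the client presents instead of a client_secret.

    `sign` is a hypothetical hook: it must sign the input bytes with the
    client's private key (RS256 is assumed in the header below).
    """
    now = int(time.time())
    header = {"alg": "RS256", "typ": "JWT", "kid": key_id}
    claims = {
        "iss": client_id,          # the client asserts its own identity
        "sub": client_id,
        "aud": token_endpoint,     # binds the assertion to one auth server
        "iat": now,
        "exp": now + lifetime_s,   # short-lived by design
        "jti": str(uuid.uuid4()),  # unique ID so the server can reject replays
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"


def token_request_form(assertion: str, scope: str) -> dict:
    """Form fields for a client-credentials request authenticated by JWT."""
    return {
        "grant_type": "client_credentials",
        "scope": scope,
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": assertion,
    }
```

Only the signed assertion crosses the wire; the private key stays behind the `sign` hook, so a leaked request log exposes at most a token that expires within the chosen lifetime.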
Once that access token is issued, it carries two types of information that serve different purposes:
Here is the important relationship that often gets overlooked: scopes and claims are not fully independent. In OpenID Connect (which builds on OAuth 2.0), scopes act as gatekeepers that control which claims end up in the token. For example, the profile scope causes the Authorization Server to include profile-related claims such as name, preferred_username, and picture. So, scopes define the boundary; claims are the resulting payload.
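To make the gatekeeper relationship concrete, here is a small sketch of how an Authorization Server might filter claims by granted scope. The mapping is a deliberately shortened assumption (OpenID Connect defines more claims under profile than the three listed here), and `claims_for_scopes` and the sample user record are hypothetical.

```python
# Simplified scope-to-claims mapping; real OIDC scopes cover more claims.
SCOPE_TO_CLAIMS = {
    "profile": {"name", "preferred_username", "picture"},
    "email": {"email", "email_verified"},
}


def claims_for_scopes(granted_scopes, user_record):
    """Return only the claims that the granted scopes unlock."""
    allowed = set()
    for scope in granted_scopes:
        allowed |= SCOPE_TO_CLAIMS.get(scope, set())
    return {name: value for name, value in user_record.items() if name in allowed}


user = {
    "name": "Alice Example",
    "picture": "https://example.com/alice.png",
    "email": "alice@example.com",
}
print(claims_for_scopes(["profile"], user))
# The email claim is withheld because the email scope was not granted.
```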
The standard M2M flow works well for a static script running a predictable job, but it starts to break down when you apply it to autonomous AI agents that dynamically decide which tools to invoke.
To solve these problems, organizations can deploy an AI gateway. The gateway sits in front of enterprise applications and acts as a zero-trust enforcement point. Administrators define access policies for their applications, and the gateway enforces them. Because these policies can be more expressive than OAuth scopes, they offer much tighter control over what an agent is allowed to do.
We will dive deeper into AI gateways and related concepts in subsequent posts, but let’s establish some basics now.

Inbound authentication to a gateway is the process of verifying that an AI agent or client is who it claims to be before the gateway allows access to backend tools or resources. The gateway validates the caller's (the agent's) cryptographic proof of identity and then determines which pre-configured policies to enforce.
The inbound request might also carry a human user’s identifier, for example when the agent performs an on-behalf-of task or runs as a desktop agent. In such cases, policy enforcement evaluates the agent identifier and the human identifier together.
Once inbound auth confirms the agent/user combination is legitimate, the gateway’s policy enforcement comes into play. The gateway resolves the policies that apply to the caller and enforces them on the request.
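A minimal sketch of that lookup, assuming policies are keyed on the agent identifier plus an optional human identifier. The agent name, user name, action strings, and the `POLICIES` table are all hypothetical; a real gateway would load policies from its own store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caller:
    agent_id: str
    user_id: Optional[str] = None  # set when the agent acts on behalf of a user


# Hypothetical policy table: (agent, user) -> actions the gateway will allow.
POLICIES = {
    ("report-agent", None): {"reports.read"},
    ("report-agent", "alice"): {"reports.read", "reports.export"},
}


def allowed_actions(caller: Caller) -> set:
    """Resolve the policy for this exact agent/user pair; default is deny-all."""
    return POLICIES.get((caller.agent_id, caller.user_id), set())


def authorize(caller: Caller, action: str) -> bool:
    """Allow the action only if the resolved policy lists it explicitly."""
    return action in allowed_actions(caller)
```

Under these assumptions, `authorize(Caller("report-agent", "alice"), "reports.export")` is allowed, while the same agent without a user, or paired with an unknown user, is denied the export.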
Once policy enforcement succeeds, the gateway moves to a phase called "Outbound Auth." Here, the gateway calls the requested application on behalf of the agent. Depending on what the upstream endpoint's auth server supports, you can adopt various strategies, from static non-human identities (NHIs) to short-lived, narrow-scope tokens.
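One way a gateway could obtain a short-lived, narrow-scope token for the outbound call is OAuth 2.0 Token Exchange (RFC 8693). The sketch below only builds the form-encoded request body; the parameter names and URN values come from the RFC, while the function name, audience URL, and scope strings are assumptions for illustration.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(inbound_token: str, audience: str, scopes: list) -> str:
    """Build the request body that trades the inbound token for a narrower one."""
    fields = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": inbound_token,        # the token the agent presented
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                  # the one upstream app being called
        "scope": " ".join(scopes),             # request only what this call needs
    }
    return urlencode(fields)


body = token_exchange_body(
    "inbound-token-value",
    "https://reports.example.internal",  # hypothetical upstream application
    ["reports.read"],
)
```

The gateway would POST this body to the upstream auth server's token endpoint, which is only possible if that server implements token exchange.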
Protocols like token exchange come in handy here, but success depends heavily on whether the upstream auth server supports it and whether its scopes are granular enough to prevent over-privileging. For example, GitHub's repo scope grants full read and write access to every repository the user can reach, so a token that carries it can do far more than list repositories. Therefore, a token exchange based solely on the repo scope will not truly enforce least privilege. This is exactly why gateway policies must be more granular than scope-based authorization.
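The gap between a coarse scope and least privilege can be sketched as two checks layered on top of each other. The operation names and the `triage-agent` identifier below are hypothetical and do not reflect GitHub's actual scope model; they only illustrate one broad scope covering several operations.

```python
# Hypothetical: a single coarse scope that unlocks several operations.
SCOPE_OPERATIONS = {
    "repo": {"list_repo", "create_repo", "delete_repo"},
}

# Hypothetical gateway policy: per-agent allowlist at the operation level.
GATEWAY_POLICY = {
    "triage-agent": {"list_repo"},
}


def scope_allows(token_scopes, operation: str) -> bool:
    """Scope-only check: any granted scope that covers the operation passes."""
    return any(operation in SCOPE_OPERATIONS.get(s, set()) for s in token_scopes)


def gateway_allows(agent_id: str, token_scopes, operation: str) -> bool:
    """Gateway check: the scope must cover it AND the policy must list it."""
    return (scope_allows(token_scopes, operation)
            and operation in GATEWAY_POLICY.get(agent_id, set()))


print(scope_allows(["repo"], "delete_repo"))                    # scope alone permits deletion
print(gateway_allows("triage-agent", ["repo"], "delete_repo"))  # gateway policy blocks it
```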
If an agent is compromised mid-task, a locally validated JWT cannot be revoked; it remains valid until it expires. Organizations handle this in a few ways:
Here too, the gateway acts as a central point of authorization that can mitigate an attack much faster than pure auth-based solutions.
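A minimal sketch of how a gateway could cut off a compromised agent before its token expires, assuming the JWT signature has already been verified and the claims are available as a dict. The `RevocationList` class and its in-memory sets are hypothetical; a production gateway would back this with shared storage.

```python
import time


class RevocationList:
    """In-memory deny list consulted on every request (illustrative only)."""

    def __init__(self):
        self._revoked_agents = set()
        self._revoked_token_ids = set()

    def revoke_agent(self, agent_id: str) -> None:
        self._revoked_agents.add(agent_id)

    def revoke_token(self, jti: str) -> None:
        self._revoked_token_ids.add(jti)

    def is_revoked(self, claims: dict) -> bool:
        return (claims.get("sub") in self._revoked_agents
                or claims.get("jti") in self._revoked_token_ids)


def accept(claims: dict, revocations: RevocationList, now=None) -> bool:
    """Reject expired tokens, then reject anything on the deny list."""
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return False
    return not revocations.is_revoked(claims)
```

Because every request passes through the gateway, calling `revoke_agent` takes effect on the next request, even though the token itself would still pass local validation until its `exp`.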
Everything we have covered so far assumes the agent is acting on its own authority, using its own identity. But many real-world scenarios involve an agent and a human user acting together. For example, a user asks their AI assistant to send an email, book a meeting, or query a financial report.
The standard Client Credentials Grant does not capture this relationship. It gives the agent its own token with its own permissions. There is no record that a specific user asked the agent to perform the action, there is no audit trail tying the action back to a human, and it leaves no way for the resource server to enforce the user’s permissions rather than the agent’s.
Microsoft’s Entra Agent ID takes a back-channel approach to this problem using an On-Behalf-Of token exchange (in the lineage of RFC 8693). It is generally available and carries the agent’s identity alongside the user’s in the downstream token. This goes a long way toward solving the gateway policy enforcement problem.
Getting this right is critical. Without explicit delegation tracking, you end up in a world where agents accumulate broad permissions and act without meaningful accountability to the humans they serve.
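One way to picture explicit delegation tracking is the `act` (actor) claim defined in RFC 8693, where `sub` names the user and nested `act` objects name the agents acting for them. The sketch below is generic and is not a description of Entra's actual token format; the identifiers and the `audit_record` shape are hypothetical.

```python
def delegation_chain(claims: dict) -> list:
    """Return actors from the current one back to the earliest.

    Per RFC 8693, the outermost `act` is the current actor and each nested
    `act` is a prior actor in the chain.
    """
    chain = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub"))
        act = act.get("act")
    return chain


def audit_record(claims: dict, action: str) -> dict:
    """Tie an action back to the human and to every agent in the chain."""
    return {
        "user": claims["sub"],
        "actors": delegation_chain(claims),
        "action": action,
    }


token_claims = {
    "sub": "user-c",                                   # the human being served
    "act": {"sub": "agent-b", "act": {"sub": "agent-a"}},
}
print(audit_record(token_claims, "send_email"))
```

With claims shaped like this, a gateway or resource server can evaluate the user's permissions while still logging which agents carried the request.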
Securing the gateway’s front door is only the first step. As AI systems evolve into networks of agents calling other agents, the concept of identity itself has to change.
However, the harder questions do not have clean answers yet. How do we handle an agent spun up by another agent? How do we track the provenance of a decision across a chain of four agents, each acting on behalf of the one before it? SPIFFE gives each agent a verifiable identity, but it does not inherently tell us that agent B was invoked by agent A on behalf of user C. If agent B’s certificate is revoked, what happens to the in-flight requests that a child agent spawned by agent B is still processing?
These questions will define the next chapter of infrastructure security. For now, Agent Gateways are the most secure way to govern your agents. At Andromeda, we are using these guiding principles to build the right solutions for enterprises—combining strict policy governance with the rapid evolution of modern authentication.
In our next few blogs, we will continue our journey to better understand this landscape and explore potential solutions to the complex problems of agentic AI security.