Securing the New Perimeter: AI Gateways

The move from human-driven applications to autonomous AI agents is forcing a real rethink of infrastructure security. We are shifting away from predictable static API calls toward non-deterministic workflows where a piece of software decides which tools to call, what data to fetch, and how to chain those actions together.

In our previous post, we discussed the application as the new perimeter. In this post, we go a step further and explore how to actually secure it.

At the center of this shift sits the AI Gateway: the enforcement point between an autonomous agent and the enterprise applications it wants to interact with. Before we get into how gateways handle this new kind of traffic, we need to cover the underlying mechanics of how machines prove who they are.

The Channels of Trust: Front-Channel vs. Back-Channel

In the OAuth 2.0 world, not all communication paths work the same way. The protocol uses two distinct routes to move authentication data safely:

  • Front-Channel: This is the interactive, browser-based route. When you click “Log in with IdP” and the browser redirects you to a login page, you are using the front-channel. Data travels in URL parameters via HTTP redirects (302 responses), where it can surface in browser history, server logs, and Referer headers. Because of this exposure risk, highly sensitive secrets are never sent via the front-channel.
  • Back-Channel: This is a direct server-to-server connection over HTTPS. It completely bypasses the user’s browser. It is invisible to the end user and is the secure path where sensitive credentials (like client secrets) and access tokens are exchanged.

This distinction matters because AI agents are headless software running in the background. They cannot interact with browser-based login screens; they live almost entirely in the back-channel.

The Groundwork: The M2M Flow

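Machine-to-machine (M2M) authentication typically uses the OAuth 2.0 Client Credentials Grant, which runs entirely over the back-channel in three steps:

  • The Request: The machine client presents its unique credentials (client_id and client_secret) directly to an Authorization Server via a secure backend API call.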
  • The Issuance: The Authorization Server verifies the credentials and issues a short-lived access token. In many implementations, this is a JWT (JSON Web Token), but the OAuth spec does not mandate a specific token format. Some Authorization Servers issue opaque tokens instead, which changes how downstream systems validate them.
  • The Access: The machine attaches this token to its HTTP requests as a Bearer token in the Authorization header to prove it is authorized to access the target API.
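
The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference client: the token URL, client ID, secret, and scope are hypothetical placeholders, and the sketch only assembles the requests instead of sending them.

```python
import base64
from urllib.parse import urlencode

# Hypothetical values for illustration only; none of them come from a real deployment.
TOKEN_URL = "https://auth.example.com/oauth2/token"
CLIENT_ID = "report-agent"
CLIENT_SECRET = "s3cr3t-value"


def build_token_request(client_id: str, client_secret: str, scope: str) -> dict:
    """Assemble the back-channel POST for the Client Credentials Grant."""
    # HTTP Basic auth carries "client_id:client_secret", base64-encoded.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return {
        "url": TOKEN_URL,
        "headers": {
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        "body": urlencode({"grant_type": "client_credentials", "scope": scope}),
    }


def bearer_header(access_token: str) -> dict:
    """Header the client attaches to each call to the target API."""
    return {"Authorization": f"Bearer {access_token}"}


request = build_token_request(CLIENT_ID, CLIENT_SECRET, "data:read")
print(request["body"])
```

Note that the secret itself travels in the request. That is the weakness the next section addresses.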

Beyond Shared Secrets: Client Assertions

Sending a client_secret over the wire, even over HTTPS, carries some risk. If the secret leaks from a log or a misconfigured endpoint, anyone who has it can impersonate the client.

Client assertions address this. Instead of sending a shared secret, the client signs a short-lived JWT with its own private key and sends that signed JWT as proof of identity (the private_key_jwt client authentication method, built on RFC 7523). The Authorization Server validates the signature using the client’s registered public key.

This approach means the secret (the private key) never leaves the client. This is the direction most enterprise-grade M2M deployments are heading.
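
As a rough sketch of what a client assertion looks like on the wire, the Python below assembles the claims and the token request. The function names, the 60-second lifetime, and the `sign` hook are our own assumptions: signing is delegated to a caller-supplied function, because a real deployment would use a vetted cryptography library to produce the RS256 signature.

```python
import base64
import json
import time
import uuid
from typing import Callable
from urllib.parse import urlencode

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def build_client_assertion(client_id: str, token_url: str,
                           sign: Callable[[bytes], bytes],
                           lifetime_s: int = 60) -> str:
    """Return a compact JWT (header.payload.signature).

    `sign` is a hypothetical hook: it receives the signing input and is
    assumed to return an RS256 signature made with the client's private key.
    """
    now = int(time.time())
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": client_id,          # the client vouches for itself
        "sub": client_id,
        "aud": token_url,          # only this Authorization Server should accept it
        "iat": now,
        "exp": now + lifetime_s,   # short-lived by design
        "jti": str(uuid.uuid4()),  # unique ID so the server can reject replays
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"


def token_request_body(assertion: str) -> str:
    """Form body for the back-channel call; it contains no client_secret."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": assertion,
    })
```

The only thing that crosses the wire is the signed assertion, which expires quickly and is bound to one Authorization Server through the aud claim.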

The Vocabulary of Trust: Claims vs. Scopes

Once that access token is issued, it carries two types of information that serve different purposes:

  • Claims: These are the key-value pairs inside the token that describe the identity. They answer: Who is this? Who issued this? When does this expire? Registered claims are defined by the JWT specification (iss, sub, exp, aud). Custom claims are added by the Authorization Server based on the client’s configuration (role, department, tenant_id).
  • Scopes: This is an OAuth 2.0 construct that defines the boundaries of access. They answer: What is this machine allowed to do? Examples include data:read, tools:invoke, or admin:write.

Here is the important relationship that often gets overlooked: scopes and claims are not fully independent. In OpenID Connect (which builds on OAuth 2.0), scopes act as gatekeepers that control which claims the Authorization Server releases. For example, requesting the profile scope makes profile-related claims such as name, preferred_username, and picture available in the ID token or the UserInfo response. So, scopes define the boundary; claims are the resulting payload.
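
To make the distinction concrete, here is a small Python sketch that reads the payload of a JWT access token and separates identity claims from granted scopes. The payload values are hypothetical, and the sketch assumes the server places scopes in a space-delimited scope claim; some Authorization Servers use a different claim name or an array.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip off."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_payload(jwt: str) -> dict:
    """Return the claims of a compact JWT WITHOUT verifying the signature.

    For inspection only; a resource server must verify the signature first.
    """
    _header, payload, _signature = jwt.split(".")
    return json.loads(b64url_decode(payload))


def granted_scopes(claims: dict) -> set:
    """Assumes scopes arrive as one space-delimited string in a 'scope' claim."""
    return set(claims.get("scope", "").split())


def _segment(obj: dict) -> str:
    """Helper that encodes one JWT segment for the demo token below."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Hypothetical payload: registered claims, one custom claim, and scopes.
example_claims = {
    "iss": "https://auth.example.com",
    "sub": "report-agent",
    "aud": "https://api.example.com",
    "exp": 1900000000,
    "tenant_id": "acme",
    "scope": "data:read tools:invoke",
}
# The third segment is a placeholder, not a real signature.
token = ".".join([_segment({"alg": "RS256", "typ": "JWT"}), _segment(example_claims), "sig"])

claims = read_payload(token)
print(claims["sub"], sorted(granted_scopes(claims)))
```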

The Agent Dilemma

The standard M2M flow works well for a static script running a predictable job, but it starts to break down when you apply it to autonomous AI agents that dynamically decide which tools to invoke.

  • Token Bloat: If an agent can access 50 enterprise tools (search, email, database, ticketing, CRM, etc.), representing each tool as a distinct OAuth scope inflates the token to an unmanageable size.
  • Lack of Granularity: Scopes are coarse by design. A tools:invoke scope tells an API that the agent can use tools, but it says nothing about which tools or under what conditions the access should be allowed. Scopes were never designed to carry that level of detail.
  • Ephemeral Agents: Modern AI workflows spin up agents for a single task and tear them down when done. Manually provisioning static client credentials for thousands of short-lived agents creates an administrative bottleneck and a sprawling attack surface of hard-to-track credentials.

The AI Gateway

To solve these problems, organizations can deploy an AI gateway. The gateway sits in front of enterprise applications and acts as a zero-trust enforcement point. Administrators define access policies for their applications, and the gateway enforces them. These richer policy sets offer much tighter control over security enforcement.

We will dive deeper into AI gateways and related concepts in subsequent posts, but let’s establish some basics now.

Gateway: Inbound Auth

Inbound authentication to a gateway is the process of verifying that an AI agent or client is who it claims to be before the gateway allows access to backend tools or resources. The gateway validates the cryptographic proof of the caller (the agent) and then determines the correct pre-configured policy enforcement.

An inbound request might also carry a human user’s identifier, for example when the agent performs an on-behalf-of task or runs as a desktop agent. In such cases, policy enforcement evaluates the combination of the agent identifier and the human identifier.

Gateway: Policy Enforcement

Once inbound auth confirms the agent/user combination is legitimate, the gateway’s policy enforcement comes into play. The gateway works out which policies apply to the caller, based on who is calling and what they are asking to do, and forwards the request only if those policies permit it.
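
One way to picture this step is as a default-deny rule lookup keyed on agent, tool, action, and (optionally) user. The sketch below is our own simplification, not a description of any particular gateway product; the Rule shape and all names in it are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rule:
    agent: str                   # agent identity, e.g. the token's sub claim
    tool: str                    # tool the rule covers
    actions: frozenset           # operations allowed on that tool
    user: Optional[str] = None   # None means the rule does not depend on a user


def is_allowed(rules: Iterable[Rule], agent: str, tool: str, action: str,
               user: Optional[str] = None) -> bool:
    """Default-deny: allow only if some rule matches every field."""
    for rule in rules:
        if (rule.agent == agent
                and rule.tool == tool
                and action in rule.actions
                and rule.user in (None, user)):
            return True
    return False


# Hypothetical policy set: one agent-only rule, one agent-plus-user rule.
RULES = [
    Rule("report-agent", "crm", frozenset({"read"})),
    Rule("mail-agent", "email", frozenset({"send"}), user="alice"),
]

print(is_allowed(RULES, "report-agent", "crm", "read"))
print(is_allowed(RULES, "mail-agent", "email", "send", user="bob"))
```

Because the rules name individual tools and actions, they can express distinctions that a single coarse scope cannot.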

Gateway: Outbound Auth

Once policy enforcement succeeds, the gateway moves to a phase called "Outbound Auth." Here, the gateway makes an outbound call to the requested application on behalf of the agent. Depending on what the upstream endpoint's auth server supports, you can adopt various strategies, ranging from static non-human identities (NHIs) to short-lived, narrow-scope tokens.

Protocols like token exchange (RFC 8693) come in handy here, but success depends heavily on whether the upstream auth server supports it and whether its scopes are granular enough to prevent over-privileging. For example, GitHub’s repo OAuth scope grants read and write access to code, collaborators, and webhooks across public and private repositories. A token exchange based solely on the repo scope therefore does not truly enforce least privilege. This is exactly why gateway policies must be more granular than scope-based authorization.
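
For reference, a token exchange request in the style of RFC 8693 might be assembled as in the sketch below. The parameter names and URN values come from the RFC; the audience, scope, and token values are hypothetical, and whether a given upstream auth server accepts this grant is exactly the open question described above.

```python
from urllib.parse import parse_qs, urlencode

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_body(subject_token: str, audience: str, scope: str) -> str:
    """Form body asking for a narrower token aimed at one upstream application."""
    return urlencode({
        "grant_type": GRANT_TYPE,
        "subject_token": subject_token,           # the token the agent presented
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,                     # the upstream app the new token is for
        "scope": scope,                           # request only what this call needs
    })


# Hypothetical values for illustration.
body = build_exchange_body("inbound-agent-token", "https://tickets.example.com", "tickets:read")
print(parse_qs(body)["scope"])
```

The request can only be as narrow as the scopes the upstream server defines, which is why the gateway's own policy check still has to run first.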

The Revocation Problem

If an agent is compromised mid-task, a JWT that resource servers validate locally cannot be revoked in practice: nothing checks back with the Authorization Server, so the token remains valid until it expires. Organizations handle this in a few ways:

  • Using short-lived tokens (60–120 seconds) that force frequent re-authentication.
  • Implementing gateway-level deny lists that block specific jti (JWT ID) or sub (subject) values before any downstream call.
  • Deploying circuit breakers at the gateway that can instantly shut down all traffic from a specific agent identity.

Here too, the gateway acts as a central point of authorization that can contain an attack much faster than waiting for tokens to expire.
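
The three mitigations can live side by side in a single gateway-side check. The Python below is a minimal in-memory sketch under our own assumptions: the class name, the violation threshold, and the idea of tripping the breaker after repeated policy violations are illustrative choices, and a production gateway would need shared, persistent state.

```python
import time
from collections import Counter
from typing import Optional


class RevocationGuard:
    """Checks a gateway could run before any downstream call."""

    def __init__(self, violation_limit: int = 3):
        self.denied_jti = set()       # individual tokens to reject
        self.denied_sub = set()       # whole agent identities to reject
        self.violations = Counter()   # policy violations seen per agent
        self.violation_limit = violation_limit

    def deny_token(self, jti: str) -> None:
        """Deny list entry for one specific token."""
        self.denied_jti.add(jti)

    def record_violation(self, sub: str) -> None:
        """Circuit breaker: enough violations cut the agent off entirely."""
        self.violations[sub] += 1
        if self.violations[sub] >= self.violation_limit:
            self.denied_sub.add(sub)

    def admits(self, claims: dict, now: Optional[float] = None) -> bool:
        """True if an already signature-verified token may proceed."""
        now = time.time() if now is None else now
        if claims.get("exp", 0) <= now:
            return False  # expired; short lifetimes bound the damage
        if claims.get("jti") in self.denied_jti:
            return False
        if claims.get("sub") in self.denied_sub:
            return False
        return True
```

A token that is still cryptographically valid is rejected as soon as its jti or sub lands on a deny list, without waiting for exp to pass.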

Delegated Access: When Agents Act on Behalf of Users

Everything we have covered so far assumes the agent is acting on its own authority, using its own identity. But many real-world scenarios involve an agent and a human user acting together. For example, a user asks their AI assistant to send an email, book a meeting, or query a financial report.

The standard Client Credentials Grant does not capture this relationship. It gives the agent its own token with its own permissions. There is no record that a specific user asked the agent to perform the action, there is no audit trail tying the action back to a human, and it leaves no way for the resource server to enforce the user’s permissions rather than the agent’s.

Microsoft’s Entra Agent ID takes a back-channel approach to this problem using an On-Behalf-Of token exchange (in the lineage of RFC 8693). It is generally available and carries the agent’s identity alongside the user’s in the downstream token. Because the gateway can then see both identities, this goes a long way toward solving the policy enforcement problem.

Getting this right is critical. Without explicit delegation tracking, you end up in a world where agents accumulate broad permissions and act without meaningful accountability to the humans they serve.
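
RFC 8693 defines an act (actor) claim for exactly this kind of tracking: sub names the user, act names the party acting for them, and nested act claims record earlier actors. The sketch below shows how a gateway or resource server might turn that into an audit line. The payload is hypothetical, and we are not asserting that any specific product emits tokens in this shape.

```python
def delegation_chain(claims: dict) -> list:
    """Actors from the 'act' claim: current actor first, earlier actors after."""
    chain = []
    actor = claims.get("act")
    while isinstance(actor, dict):
        chain.append(actor.get("sub", "unknown"))
        actor = actor.get("act")  # nested 'act' holds the previous actor
    return chain


def audit_line(claims: dict, action: str) -> str:
    """Human-readable record tying an action back to the user it was done for."""
    chain = delegation_chain(claims)
    if not chain:
        return f"{claims['sub']} performed {action} with its own authority"
    line = f"{chain[0]} performed {action} on behalf of {claims['sub']}"
    if len(chain) > 1:
        line += f" (earlier actors: {', '.join(chain[1:])})"
    return line


# Hypothetical delegated token payload: alice asked an assistant to send mail.
delegated = {
    "sub": "alice",
    "act": {"sub": "mail-agent", "act": {"sub": "planner-agent"}},
}
print(audit_line(delegated, "email.send"))
```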

Looking Ahead: The Evolution of Agent Identity

Securing the gateway’s front door is only the first step. As AI systems evolve into networks of agents calling other agents, the concept of identity itself has to change.

  • MCP: Quickly becoming the standard for agent-to-tool communication, the Model Context Protocol (MCP) bases its authorization on OAuth 2.1 and supports Dynamic Client Registration (RFC 7591). The November 2025 spec update goes further, recommending OAuth Client ID Metadata Documents as the preferred registration mechanism. This lets an agent use an HTTPS URL pointing to a JSON metadata document as its client identifier, eliminating the out-of-band registration step entirely. We have discussed this in a previous post.
  • Workload Identity Standards: These are maturing in parallel. The IETF WIMSE (Workload Identity in Multi-System Environments) working group is developing specifications for cryptographically verifiable workload identities by building directly on top of SPIFFE.
  • SPIFFE & SPIRE: These CNCF projects have been solving the workload identity problem for microservices for years. The core idea is simple: instead of giving a workload a username and password, you give it a cryptographically verifiable identity document called an SVID (SPIFFE Verifiable Identity Document), issued as either an X.509 certificate or a JWT. WIMSE takes this foundation and extends it toward the multi-system, multi-hop scenarios that AI architectures require.
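
The identity inside an SVID is a SPIFFE ID, a URI of the form spiffe://trust-domain/path. As a small illustration, the Python below splits one into its trust domain and workload path. The example ID is hypothetical, and the checks are deliberately simplified; they do not implement the full validation rules of the SPIFFE specification.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple:
    """Split a SPIFFE ID into (trust_domain, path) with simplified checks."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be spiffe")
    if not parts.netloc:
        raise ValueError("trust domain is missing")
    # SPIFFE IDs carry no userinfo, port, query, or fragment.
    if "@" in parts.netloc or ":" in parts.netloc or parts.query or parts.fragment:
        raise ValueError("unexpected URI component")
    return parts.netloc, parts.path


def same_trust_domain(id_a: str, id_b: str) -> bool:
    """True if both workloads were issued identities by the same trust domain."""
    return parse_spiffe_id(id_a)[0] == parse_spiffe_id(id_b)[0]


# Hypothetical agent identity.
print(parse_spiffe_id("spiffe://example.org/agents/report-agent"))
```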

However, the harder questions do not have clean answers yet. How do we handle an agent spun up by another agent? How do we track the provenance of a decision across a chain of four agents, each acting on behalf of the one before it? SPIFFE gives each agent a verifiable identity, but it does not inherently tell us that agent B was invoked by agent A on behalf of a particular user. If agent B’s certificate is revoked, what happens to the in-flight requests that agent C (which agent B spawned) is still processing?

These questions will define the next chapter of infrastructure security. For now, AI gateways are the most effective way we see to govern your agents. At Andromeda, we are using these guiding principles to build the right solutions for enterprises, combining strict policy governance with the rapid evolution of modern authentication.

In our next few posts, we will continue our journey to better understand this landscape and explore potential solutions to the complex problems of agentic AI security.
