Omar Santos
Cisco Employee

We are all experiencing the rapid proliferation of autonomous AI agents and Multi-Agent Systems (MAS). These are no longer just AI chatbots and assistants; they are increasingly self-directed entities capable of making decisions, performing actions, and interacting with critical systems at unprecedented scales. We need to fundamentally re-evaluate how identities are managed and how access is controlled for these AI agents.

Cisco Talos has identified identity-based attacks as the dominant threat vector in recent years, with their 2024 Year in Review and Q1 2025 reports highlighting the pervasive and evolving nature of these attacks across many environments. Identity-based attacks accounted for 60% of all Cisco Talos Incident Response (IR) cases in 2024, making them the most common attack method observed. These attacks targeted traditional systems. Now imagine the complexity and potential for abuse in agentic AI implementations.

The shift from passive computational tools to active, autonomous entities operating within complex ecosystems fundamentally alters the requirements for trust, accountability, and security. If these autonomous entities are to be trusted with significant responsibilities, they must possess verifiable identities, be granted precise permissions, and have those permissions revoked reliably when necessary.

I have talked about this for a while now, for example in a presentation at the IEEE World Technology Summit in Silicon Valley, in the “Securing Generative AI” video course, in the article titled “The AI Security Landscape in 2025”, and in other presentations.

The core problem at hand is a fundamental mismatch between existing Identity and Access Management (IAM) paradigms and the unique characteristics of AI agents in MAS. Traditional IAM systems, such as OAuth, OpenID Connect (OIDC), and SAML, were primarily designed for human users or static machine identities. These frameworks could be inadequate for the dynamic, interdependent, and often ephemeral nature of AI agents operating at scale in MAS implementations.


Their limitations stem from coarse-grained controls, single-entity focus, and lack of context-awareness. This means that merely adapting existing protocols is not just difficult but fundamentally flawed. The question is: Do we need a purpose-built approach to secure the future of agentic AI?

The paper titled “A Novel Zero-Trust Identity Framework for Agentic AI: Decentralized Authentication and Fine-Grained Access Control” proposed a solution. The overarching goal of “the solution” covered in this paper is to “establish the foundational trust, accountability, and security for agentic AI implementations and the complex ecosystems they will inhabit”.

The Characteristics of AI Agents

AI agents operate independently, often without direct human intervention, making their actions difficult to monitor and control through conventional means. Their self-directed nature demands a system that can verify their legitimacy and authorized scope of action without constant human oversight.


Many agents are short-lived, spun up and down rapidly to perform specific tasks. This transient nature makes static identity provisioning, which was designed for long-lived human user sessions, cumbersome and inefficient. An agent's functions, permissions, and even its behavioral scope can change rapidly based on its learning, assigned tasks, or environmental context. Traditional IAM struggles to adapt to such fluid identities and permissions in real-time.

Agents interact with other agents, APIs, and systems in intricate, often cascading ways, leading to complex chains of delegated authority. This creates a profound accountability challenge, as delegated authority can cascade through multiple agents, obscuring responsibility if not managed appropriately. If a single compromised agent can autonomously delegate its (potentially elevated) permissions to a multitude of other agents, tracing the origin and scope of a malicious action becomes incredibly difficult, and malicious activity may become effectively untraceable.
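
To make the delegation problem concrete, here is a minimal Python sketch of a delegation chain in which each hop may only narrow the authority it received. The `Delegation` structure, the permission strings, and the agent names are assumptions I made up for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Delegation:
    """One hop in a delegation chain (hypothetical structure)."""
    delegator: str                  # agent handing over authority
    delegate: str                   # agent receiving authority
    permissions: FrozenSet[str]     # permissions passed along at this hop


def validate_chain(root_permissions: FrozenSet[str],
                   chain: List[Delegation]) -> Optional[str]:
    """Return None if every hop only narrows authority, else a reason string."""
    allowed = root_permissions
    holder = chain[0].delegator if chain else None
    for hop in chain:
        # The delegator must be the agent that currently holds the authority.
        if hop.delegator != holder:
            return f"broken chain: {hop.delegator} does not hold the authority"
        # A hop may never pass along more than it received.
        extra = hop.permissions - allowed
        if extra:
            return f"{hop.delegator} escalated {sorted(extra)} to {hop.delegate}"
        allowed, holder = hop.permissions, hop.delegate
    return None


def trace_origin(chain: List[Delegation]) -> List[str]:
    """List every agent the authority passed through, starting at the origin."""
    if not chain:
        return []
    return [chain[0].delegator] + [hop.delegate for hop in chain]
```

Recording every hop explicitly is what makes the origin traceable: `trace_origin` walks back to the first delegator, and `validate_chain` flags the exact hop where permissions were widened.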


We need a real-time, context-aware policy enforcement and revocation mechanism that traditional, slower-moving IAM systems simply cannot provide.

Multi-Agent Systems can involve dozens or even hundreds of interacting agents, far exceeding the scale for which human-centric IAM was designed. Managing identities and access for such a vast and dynamic population requires automated, scalable solutions.

The Consequences of Inadequate IAM

The failure to address these identity challenges could lead to severe security incidents, loss of accountability, and erosion of trust in AI systems. As the aforementioned paper highlights, a compromised autonomous agent in a financial system could “cascade unauthorized transactions,” or a swarm of interacting agents in critical infrastructure could be manipulated with devastating consequences.

To further illustrate the fundamental disparity, the following table (Table 1) outlines the differences between traditional IAM and the specific needs of agentic AI.

Table 1: Traditional IAM vs. Agentic AI IAM Needs

| Category | Traditional IAM (Designed For) | Agentic AI IAM Needs (Required For) |
| --- | --- | --- |
| Identity Type | Human Users / Static Machines | Dynamic, Ephemeral AI Agents |
| Scalability | Human-centric (limited) | Massively Scalable (MAS) |
| Control Granularity | Coarse-grained (e.g., Role-Based) | Fine-grained (Capability/Behavioral) |
| Context-Awareness | Low (static permissions) | High (real-time context-dependent) |
| Ephemerality Support | Poor (long-lived sessions) | Robust (rapid provisioning/deprovisioning) |
| Trust Model | Centralized Authority | Decentralized / Zero-Trust |
| Accountability | Difficult for cascading actions | Clear for complex, delegated actions |

A Zero-Trust Framework for Agentic AI

The paper advocates for a "purpose-built approach" that "redefines agent identity" rather than merely adapting existing protocols. This framework embraces the "Zero-Trust" principle, meaning no agent, whether internal or external, is inherently trusted. Every interaction requires verification, which is particularly crucial for autonomous agents operating at scale in dynamic environments.

The proposed framework is comprehensive, built upon "rich, verifiable Agent Identities (IDs), leveraging Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs)". It incorporates advanced cryptographic primitives like Zero-Knowledge Proofs (ZKPs), an Agent Naming Service (ANS) for secure discovery, dynamic fine-grained access control mechanisms, and critically, a unified global session management and policy enforcement layer.

 


The design of this framework, combining "Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs)" with a "unified global session management and policy enforcement layer", represents a sophisticated architectural choice. While "decentralized" and "unified global" might initially appear contradictory, the design balances the benefits of decentralization (autonomy, resilience, privacy) for identity issuance and proof with the necessity of a centralized or federated control plane for dynamic policy enforcement and rapid response in complex multi-agent systems.

The following table provides a concise overview of the framework's core components and their primary benefits for AI agents.

Table 2: Key Components of the Zero-Trust Agentic AI Framework

| Component | Description | Primary Benefit for AI Agents |
| --- | --- | --- |
| Decentralized Identifiers (DIDs) | Cryptographically verifiable, self-sovereign identifiers that encapsulate an agent's capabilities, provenance, and behavioral scope. | Provides verifiable, rich, and immutable agent identities, crucial for accountability. |
| Verifiable Credentials (VCs) | Tamper-proof digital attestations issued by trusted entities, confirming specific attributes or permissions of an agent. | Enables dynamic, granular, and verifiable attestation of agent capabilities and access rights. |
| Zero-Knowledge Proofs (ZKPs) | Cryptographic methods allowing proof of information possession (e.g., credentials) without revealing the information itself. | Ensures privacy-preserving attribute disclosure and verifiable policy compliance. |
| Agent Naming Service (ANS) | A secure mechanism for capability-aware discovery and resolution of AI agents within Multi-Agent Systems. | Facilitates secure, reliable, and efficient discovery of trusted agents based on their functions. |
| Dynamic Fine-Grained Access Control | Mechanisms to grant or revoke permissions in real-time, based on an agent's context, behavior, and evolving policy. | Allows precise, adaptive control over agent actions, minimizing over-privileging and risk. |
| Unified Global Session Management & Policy Enforcement | A centralized or federated layer for real-time control and consistent revocation of policies across diverse agent communications. | Ensures consistent security posture and immediate response/revocation across the entire MAS. |

DIDs, VCs, and ZKPs

The proposed framework builds upon several advanced cryptographic and decentralized technologies to establish a foundation for agent identity.

DIDs are a type of globally unique identifier that is cryptographically verifiable and “resolvable” without relying on a centralized authority. For AI agents, DIDs are more than just a name; they are designed to "encapsulate an agent's capabilities, provenance, behavioral scope, and security posture". The inclusion of "provenance" (origin/history) and "behavioral scope" (intended boundaries of action) directly within the agent's verifiable identity is a critical innovation. This moves the concept of identity beyond simple authentication, which merely answers "who are you?", to a much deeper "what are you allowed to do, and where did you come from?". This is paramount for establishing clear accountability and audit trails in complex AI systems. If an agent's identity inherently includes these attributes, its actions can be continuously verified against its authorized behavior, providing a strong basis for "fine-grained access control" and ensuring that agents operate within their defined parameters. This significantly enhances trust and reduces the risk of unauthorized or malicious actions by providing a clear, verifiable record of an agent's authorized capabilities and history.
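
As a rough illustration of an identity that carries provenance and behavioral scope, here is a minimal Python sketch. The `AgentIdentity` fields, the fingerprint-based identifier, and the scope strings are my own assumptions; `did:example` is only a placeholder method name, and a real deployment would follow a registered DID method and keep these attributes in the DID document or in credentials.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical identity record bound to an agent's identifier."""
    did: str
    provenance: str                     # who built or deployed the agent
    capabilities: Tuple[str, ...]       # what the agent is designed to do
    behavioral_scope: FrozenSet[str]    # actions it is allowed to attempt


def make_did(public_key: bytes) -> str:
    # Assumption: derive the identifier from a hash of the public key.
    fingerprint = hashlib.sha256(public_key).hexdigest()[:32]
    return "did:example:" + fingerprint


def within_scope(identity: AgentIdentity, action: str) -> bool:
    """Check a requested action against the scope carried by the identity."""
    return action in identity.behavioral_scope


agent_x = AgentIdentity(
    did=make_did(b"demo-public-key-for-agent-x"),   # placeholder key bytes
    provenance="Omar's team",
    capabilities=("threat-hunting", "anomaly-detection"),
    behavioral_scope=frozenset({"read:network-logs", "analyze:endpoint-telemetry"}),
)
```

With the scope attached to the identity, `within_scope(agent_x, "write:firewall-rules")` is `False`, so a verifier can reject the action before any access-control policy is even consulted.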

Verifiable Credentials (VCs) for Dynamic Attributes and Permissions

Verifiable Credentials (VCs) are tamper-proof digital certificates issued by trusted entities, such as an AI orchestrator or a regulatory body. They attest to specific attributes, capabilities, or permissions of an AI agent. VCs enable dynamic updates to an agent's access rights and capabilities, allowing them to evolve with the agent's tasks or context without requiring re-registration or manual intervention. This dynamic nature is essential for ephemeral and constantly evolving AI agents, providing flexibility while maintaining cryptographic integrity.
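
The issue-and-verify cycle can be sketched in a few lines of Python. This is an illustration only: real Verifiable Credentials use the issuer's asymmetric digital signature, while this sketch substitutes an HMAC with a shared key so that it runs on the standard library alone. The field names and function names are assumptions, not the W3C data model.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def _proof(issuer_key: bytes, payload: dict) -> str:
    # Canonical JSON so issuer and verifier hash identical bytes.
    message = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    # Stand-in for a digital signature (assumption: shared secret key).
    return hmac.new(issuer_key, message, hashlib.sha256).hexdigest()


def issue_credential(issuer_key: bytes, issuer: str, subject: str,
                     claims: dict, ttl_seconds: float,
                     now: Optional[float] = None) -> dict:
    issued_at = time.time() if now is None else now
    payload = {
        "issuer": issuer,
        "subject": subject,
        "claims": claims,
        "issued_at": issued_at,
        "expires_at": issued_at + ttl_seconds,   # short-lived by design
    }
    return {"payload": payload, "proof": _proof(issuer_key, payload)}


def verify_credential(issuer_key: bytes, credential: dict,
                      now: Optional[float] = None) -> bool:
    current = time.time() if now is None else now
    expected = _proof(issuer_key, credential["payload"])
    # Reject any credential whose contents changed after issuance.
    if not hmac.compare_digest(expected, credential["proof"]):
        return False
    # Reject expired credentials, which suits ephemeral agents.
    return current < credential["payload"]["expires_at"]
```

Updating an agent's rights then means issuing a fresh credential with new claims and a new expiry, rather than re-registering the agent.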

Zero-Knowledge Proofs (ZKPs) for Privacy and Compliance

Zero-Knowledge Proofs (ZKPs) are powerful cryptographic methods that allow one party to prove they possess certain information (e.g., an attribute, a credential, or adherence to a policy) without revealing the underlying sensitive data itself. Within this framework, ZKPs enable "privacy-preserving attribute disclosure and verifiable policy compliance". This explicit integration of ZKPs demonstrates a sophisticated understanding of the real-world challenges facing AI ecosystems, particularly concerning privacy. In a multi-agent system where agents might handle sensitive data (e.g., financial, medical) or operate across different organizational boundaries, ZKPs allow an agent to prove it has the necessary authorization or meets specific compliance criteria without exposing the confidential details of its identity or the sensitive data it processes. For instance, an agent could prove it is authorized to access a specific database without revealing its full operational history or the exact data it is accessing. This capability is vital for fostering greater interoperability and trust among diverse organizations deploying MAS, as it enables necessary verification while mitigating privacy and competitive risks, thereby accelerating broader adoption of agentic AI.
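
To show the basic idea of proving possession without disclosure, here is a toy Python sketch of a Schnorr proof of knowledge made non-interactive with a hash-derived challenge. It proves that an agent knows the secret behind a public value without revealing the secret. The group parameters are deliberately tiny and insecure, the function names are my own, and this is not the specific proof system used in the paper; a real system would rely on a vetted library and standard-size groups.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): G generates a subgroup of
# prime order Q in the multiplicative group modulo the prime P.
P, Q, G = 23, 11, 2


def _challenge(public: int, commitment: int, context: str) -> int:
    # Hash-derived challenge, bound to a context string such as a request ID.
    data = f"{G}|{public}|{commitment}|{context}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def keygen():
    secret = secrets.randbelow(Q - 1) + 1       # secret in 1..Q-1
    return secret, pow(G, secret, P)            # (secret, public value)


def prove(secret: int, context: str):
    public = pow(G, secret, P)
    nonce = secrets.randbelow(Q)                # fresh randomness per proof
    commitment = pow(G, nonce, P)
    c = _challenge(public, commitment, context)
    response = (nonce + c * secret) % Q         # reveals nothing without nonce
    return commitment, response


def verify(public: int, proof, context: str) -> bool:
    commitment, response = proof
    c = _challenge(public, commitment, context)
    # Holds exactly when response = nonce + c * secret (mod Q).
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P
```

The verifier only ever sees the public value, the commitment, and the response. Proving statements about credential attributes, as the framework envisions, requires more elaborate constructions than this single-secret example.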

Agent Naming Service (ANS) for Secure Discovery

The framework includes an Agent Naming Service (ANS) designed for secure and capability-aware discovery of agents. This functions similarly to a Domain Name System (DNS) for AI agents, allowing them to reliably find and identify each other based on their verifiable capabilities and roles within the system. In large-scale, dynamic multi-agent systems, agents constantly need to discover and interact with other agents to accomplish tasks. If this discovery process is insecure or unreliable, it opens doors for malicious agents to impersonate legitimate ones, or for agents to connect to unintended or compromised services. My friend Akram Sheriff is one of the architects of ANS, along with Ken Huang, Vineeth Sai Narajala, and Idan Habler. They recently published an IETF Internet-Draft: https://www.ietf.org/archive/id/draft-narajala-ans-00.html

By providing a trusted, capability-aware directory, the ANS ensures that agents can reliably find and authenticate the correct counterparts based on their verifiable capabilities. This helps prevent man-in-the-middle attacks and misdirection, and it supports robust, scalable, and secure inter-agent communication, which is foundational for the entire MAS.

The ANS architecture is built upon several interconnected components:


  • Requesting Agent: The entity initiating the agent registration process (e.g., an individual, organization, or automated system).
  • Agent Registry: A potentially distributed database storing crucial information about registered agents, including their capabilities, security policies, Public Key Infrastructure (PKI) certificates, and Decentralized Identifier (DID) related information.
  • Certificate Authority (CA): A trusted entity responsible for issuing and managing X.509 digital certificates for agents, forming the root of trust.
  • Registration Authority (RA): Verifies agent registration and renewal requests, interacts with the CA to issue certificates, and manages the agent's lifecycle (registration, renewal, revocation).
  • Protocol Adapter Layer: A modular layer that translates between the registry’s internal representation and various protocol-specific formats, supporting diverse communication standards like Agent2Agent (A2A), Model Context Protocol (MCP), and Agent Communication Protocol (ACP).
  • Request/Response Schema: A protocol-agnostic, JSON-based schema used for all registry interactions, incorporating PKI data and allowing for protocol-specific extensions.
  • Agent Name Service (ANS) Core: This central component enables agent discovery using human-readable, structured names, coupled with agent capability-based resolution.


Proposed Functionality of ANS:

  • Formalized Agent Registration and Renewal: ANS defines explicit lifecycle management for agents, including processes for initial registration, periodic renewal, and deregistration or revocation (e.g., due to key compromise).
  • DNS-Inspired Naming Conventions with Capability-Aware Resolution: It proposes a formal naming structure (ANSName) that combines elements like Protocol, AgentID, agentCapability, Provider, Version, and Extension (e.g., a2a://textProcessor.DocumentTranslation.AcmeCorp.v2.1.hipaa). The resolution mechanism maps this ANSName to an actionable reference (Endpoint) and facilitates precise capability discovery.
  • Modular Protocol Adapter Layer: This layer ensures the registry can support various agent communication protocols without tight coupling, with each adapter handling protocol-specific metadata and validation.
  • Structured Communication using JSON Schema: All registry interactions use JSON Schema documents, ensuring structured and validated communication.
  • Secure Resolution Algorithms: ANS integrates PKI for verifiable agent identity and trust, with algorithms for certificate chain verification and digital signature verification to ensure the authenticity and integrity of messages and documents.
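
To make the naming and resolution idea concrete, here is a small Python sketch that parses an ANSName and resolves it against an in-memory registry. The grammar is inferred only from the single example above (`a2a://textProcessor.DocumentTranslation.AcmeCorp.v2.1.hipaa`), so treat the regular expression as an assumption; the Internet-Draft is the authority on the real format. The registry contents and endpoint URLs are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed layout: protocol://agentID.capability.provider.vVersion[.extension]
ANS_NAME = re.compile(
    r"^(?P<protocol>[A-Za-z0-9]+)://"
    r"(?P<agent_id>[^.]+)\.(?P<capability>[^.]+)\.(?P<provider>[^.]+)"
    r"\.v(?P<version>\d+(?:\.\d+)*)"
    r"(?:\.(?P<extension>[^.]+))?$"
)


@dataclass(frozen=True)
class AnsName:
    protocol: str
    agent_id: str
    capability: str
    provider: str
    version: str
    extension: Optional[str]


def parse_ans_name(name: str) -> AnsName:
    match = ANS_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid ANSName: {name!r}")
    return AnsName(**match.groupdict())


# Hypothetical registry mapping ANSNames to endpoints.
REGISTRY = {
    "a2a://textProcessor.DocumentTranslation.AcmeCorp.v2.1.hipaa":
        "https://agents.example.com/translate",
    "a2a://logAnalyzer.ThreatHunting.AcmeCorp.v1.0":
        "https://agents.example.com/hunt",
}


def resolve(name: str) -> Optional[str]:
    """Map a syntactically valid ANSName to its endpoint, if registered."""
    parse_ans_name(name)
    return REGISTRY.get(name)


def find_by_capability(capability: str) -> List[str]:
    """Capability-aware discovery: every registered name offering it."""
    return sorted(n for n in REGISTRY
                  if parse_ans_name(n).capability == capability)
```

In the full design, a resolver would also verify the certificate chain and signature on the registry's response before trusting the endpoint; that step is omitted here.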

Dynamic Fine-Grained Access Control

The framework enables "dynamic fine-grained access control mechanisms". This allows permissions to be granted or revoked in real-time, adapting to an agent's current context, observed behavior, and evolving policy requirements. This capability stands in stark contrast to the coarse-grained, static permissions typically found in traditional IAM systems, which are ill-suited for the fluid nature of agent interactions. By allowing precise control over what an agent can do at any given moment, the framework minimizes the risk of over-privileging and reduces the attack surface.
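
A context-dependent, time-limited grant could be sketched in Python as follows. The `Grant` and `Context` types, the "critical alert from a verified source" rule, and the five-minute default lifetime are assumptions I chose for illustration; the paper does not prescribe a policy language.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Context:
    alert_severity: str            # e.g., "low", "high", "critical"
    alert_source_verified: bool    # was the alert's issuer authenticated?


@dataclass(frozen=True)
class Grant:
    agent: str
    action: str
    resource: str
    expires_at: float              # grants are temporary by construction


def request_grant(agent: str, held_permissions: FrozenSet[str], action: str,
                  resource: str, context: Context, now: float,
                  ttl: float = 300.0) -> Optional[Grant]:
    # Static check: the agent must hold the permission at all.
    if action not in held_permissions:
        return None
    # Dynamic check: the current context must justify the action.
    if context.alert_severity != "critical" or not context.alert_source_verified:
        return None
    return Grant(agent, action, resource, now + ttl)


def is_allowed(grant: Optional[Grant], agent: str, action: str,
               resource: str, now: float) -> bool:
    # The grant covers one agent, one action, one resource, for a short time.
    return (grant is not None
            and grant.agent == agent
            and grant.action == action
            and grant.resource == resource
            and now < grant.expires_at)
```

The same agent with the same permissions gets a different answer depending on context and time, which is the contrast with static role assignments.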

Unified Global Session Management and Policy Enforcement

A critical component is the "unified global session management and policy enforcement layer for real-time control and consistent revocation across heterogeneous agent communication". While DIDs and VCs provide decentralized, robust identity, this layer serves as the operational linchpin, providing the necessary centralized (or federated) control for real-world enterprise deployments. If an agent is compromised, or its behavior deviates from established policy, this layer enables "real-time control and consistent revocation" across all its active sessions and interactions. This capability is crucial for rapidly containing breaches and mitigating the "cascading unauthorized transactions" or "manipulated swarm" scenarios mentioned earlier. Without this global, dynamic layer, managing and revoking access for thousands of ephemeral agents in real-time would be practically impossible, leaving the entire system vulnerable to rapid, uncontrolled propagation of threats and making dynamic risk management unfeasible.
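
Here is a minimal Python sketch of what "consistent revocation" across sessions could look like: one call terminates every active session for an agent and blocks new ones. The `SessionManager` class and its method names are hypothetical, and a real deployment would be distributed or federated rather than a single in-memory object.

```python
import itertools
from collections import defaultdict


class SessionManager:
    """Hypothetical in-memory stand-in for a global session layer."""

    def __init__(self):
        self._sessions = {}                    # session ID -> agent DID
        self._by_agent = defaultdict(set)      # agent DID -> session IDs
        self._revoked_agents = set()
        self._ids = itertools.count(1)

    def open_session(self, agent_did: str) -> str:
        # A revoked agent cannot start new sessions anywhere.
        if agent_did in self._revoked_agents:
            raise PermissionError(f"{agent_did} has been revoked")
        session_id = f"s{next(self._ids)}"
        self._sessions[session_id] = agent_did
        self._by_agent[agent_did].add(session_id)
        return session_id

    def is_active(self, session_id: str) -> bool:
        return session_id in self._sessions

    def revoke_agent(self, agent_did: str) -> int:
        """Terminate all sessions for the agent; return how many were closed."""
        self._revoked_agents.add(agent_did)
        session_ids = self._by_agent.pop(agent_did, set())
        for session_id in session_ids:
            del self._sessions[session_id]
        return len(session_ids)
```

Indexing sessions by agent identifier is the design choice that keeps revocation a single lookup instead of a scan over every session in the system.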

Sample Scenarios

Let's illustrate how the proposed Zero-Trust Identity Framework could operate in a cybersecurity scenario involving my "AI Agent Team" (Omar's AI Agent Team). Imagine a sophisticated cybersecurity operation where a team of autonomous AI agents, each with specialized roles, works together to detect, analyze, and respond to threats.

In this example, "AI Agent X," "AI Agent Y," and "AI Agent Z" are key players in a security operations center (SOC) that leverages agentic AI for enhanced threat detection and response.

  1. Omar's AI Agent X (Threat Hunter & Anomaly Detector):
    • Identity (DID): At its creation, Agent X is assigned a unique DID that cryptographically verifies its origin (developed by Omar's team), its core function ("Threat Hunting & Anomaly Detection"), and its initial behavioral scope (e.g., "read-only access to network logs," "endpoint telemetry analysis"). This DID acts as its immutable, verifiable identity.
    • Credentials (VCs): Agent X receives VCs from a trusted “Security Orchestrator” (perhaps a human-controlled system or Omar's AI Agent Z). These VCs attest to its specific capabilities, such as “permission to access SIEM logs”, “permission to query DNS records”, and “permission to flag suspicious executables”. These VCs are dynamic and can be updated as Agent X's responsibilities evolve.
  2. Omar’s AI Agent Y (Incident Responder & Containment):
    • Identity (DID): Agent Y also possesses a unique DID, encapsulating its role as an "Incident Responder & Containment Agent," its developer, and its initial security posture (e.g., "can initiate network segmentation," "can isolate compromised hosts"). This DID is its verifiable digital fingerprint.
    • Credentials (VCs): Agent Y receives VCs for actions like "initiate firewall blocks," "quarantine endpoints," "collect forensic images," and "trigger alerts to human analysts." Some of these VCs might be conditional, requiring a high-severity alert from a trusted source before activation.
  3. Omar's AI Agent Z (Policy Enforcer & Orchestrator):
    • Identity (DID): Agent Z's DID identifies it as the "Security Policy Enforcer & Orchestrator," granting it the highest level of authority within the agent ecosystem for managing other agents' identities and permissions.
    • Credentials (VCs): Agent Z holds VCs that allow it to "issue and revoke VCs to other agents," "update global security policies," and "manage agent session lifecycles".

The Flow of a Cybersecurity Incident:

Let's trace a potential threat detection and response scenario:


  1. Initial Detection (Agent X):
    • Agent X, performing its threat hunting duties, continuously analyzes network traffic and endpoint telemetry. It identifies a highly unusual pattern of outbound communication from a critical server, indicating potential command-and-control activity.
    • To confirm its suspicion, Agent X needs to query an external threat intelligence feed. It uses the Agent Naming Service (ANS) to securely discover and connect to a trusted "Threat Intelligence Agent" (e.g., "Omar's AI Agent A") that has the verifiable capability to provide such data.
    • Before transmitting the suspicious IP address to Agent A, Agent X uses a Zero-Knowledge Proof (ZKP) to prove to Agent A that it possesses a valid VC authorizing it to "access sensitive network metadata" and that the query is for a "high-priority security incident," without revealing the specific IP address or the full details of its internal credentials. This ensures privacy and compliance while verifying legitimacy.

  2. Escalation and Response (Agent Y):
    • Upon confirming the malicious nature of the communication, Agent X generates a critical alert. It then uses the ANS to find an agent with "Incident Response" capabilities, which leads it to Omar's AI Agent Y.
    • Agent X presents its "High-Priority Alert Generation" VC to Agent Y. Agent Y verifies this VC.
    • Based on the critical alert, Agent Y's pre-approved VCs are activated. Agent Y determines the immediate need to isolate the compromised server.
    • Agent Y then interacts with the Unified Global Session Management and Policy Enforcement layer, managed by Agent Z. Agent Y requests permission to "initiate network segmentation" for the specific server.
    • Agent Z's layer, in real-time, checks Agent Y's VCs and the context (critical alert from a verified source). It dynamically grants Agent Y temporary, fine-grained access to modify only the necessary firewall rules to isolate that specific server, preventing lateral movement of the threat. This access is precisely scoped and time-limited.
  3. Containment and Accountability:
    • Agent Y executes the containment action. Every step taken by Agent Y (e.g., firewall rule modification, endpoint quarantine) is cryptographically logged and linked back to its DID, providing an immutable audit trail.
    • If, during the incident, Agent Y attempts an action outside its granted VCs (e.g., trying to access an unrelated database), the Unified Global Session Management and Policy Enforcement layer would immediately detect and block it, and Agent Z could instantly revoke any active sessions or permissions for Agent Y if its behavior deviates from policy.
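
The audit trail in the containment step can be approximated with a hash chain, where each log entry commits to the one before it and names the acting agent's DID. This Python sketch is an assumption about one possible mechanism, not the paper's design. A hash chain makes tampering detectable rather than impossible, and a production log would also carry each agent's signature on its entries.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64   # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, agent_did: str, action: str) -> str:
    body = json.dumps(
        {"prev": prev_hash, "agent": agent_did, "action": action},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(body).hexdigest()


def append_entry(log: List[Dict[str, str]], agent_did: str, action: str) -> None:
    """Append an action, chaining it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({
        "prev": prev,
        "agent": agent_did,
        "action": action,
        "hash": _entry_hash(prev, agent_did, action),
    })


def verify_log(log: List[Dict[str, str]]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        expected = _entry_hash(entry["prev"], entry["agent"], entry["action"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

Because each entry names the agent's DID, an investigator can filter the verified log by DID to reconstruct exactly what Agent Y did during the incident.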

This example demonstrates how DIDs provide verifiable identities, VCs enable dynamic and granular permissions, ZKPs ensure privacy during inter-agent communication, ANS facilitates secure discovery, and the global policy layer ensures real-time control and accountability in a complex, autonomous cybersecurity environment.

The future of AI hinges on such purpose-built solutions that prioritize verifiable identity, granular control, and accountability.
