Glossary

Data Exfiltration

Unauthorized extraction or transfer of sensitive data by an AI agent, either through malicious design or compromised instructions.

What is Data Exfiltration?

Data exfiltration occurs when an agent accesses and transmits data beyond its authorized scope, often through prompt injection, backdoors, or unintended API interactions. This can include leaking customer information, proprietary business data, or credentials to external systems. The risk is heightened in multi-agent systems where data flows between multiple components, each potentially vulnerable to compromise.

Prevention requires strict access controls, output filtering, and continuous monitoring of agent communication patterns. Agents should operate under the principle of least privilege, with explicit allowlists for both the data they may access and the endpoints they may transmit to. Early detection depends on anomaly detection over data volume, destination patterns, and unusual query sequences.
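
As a rough illustration of an endpoint allowlist combined with a simple volume check, the sketch below gates every outbound call through a guard object. The class name EgressGuard, the example hostnames, and the byte budget are all hypothetical and chosen only for illustration; a real deployment would enforce this at the network or gateway layer as well.

```python
from typing import Iterable, Tuple
from urllib.parse import urlparse


class EgressGuard:
    """Hypothetical guard: allowlisted destinations plus a per-session byte budget."""

    def __init__(self, allowed_hosts: Iterable[str], max_bytes: int) -> None:
        self.allowed_hosts = set(allowed_hosts)
        self.max_bytes = max_bytes
        self.bytes_sent = 0  # running total of bytes approved this session

    def check(self, url: str, payload: bytes) -> Tuple[bool, str]:
        """Return (allowed, reason) for one outbound request."""
        host = urlparse(url).hostname
        # Deny by default: only explicitly allowlisted hosts pass.
        if host not in self.allowed_hosts:
            return False, f"destination not allowlisted: {host}"
        # Crude volume anomaly check: refuse once the budget would be exceeded.
        if self.bytes_sent + len(payload) > self.max_bytes:
            return False, "session volume budget exceeded"
        self.bytes_sent += len(payload)
        return True, "ok"


# Assumed values for illustration only.
guard = EgressGuard({"api.internal.example.com"}, max_bytes=100)
print(guard.check("https://api.internal.example.com/v1", b"x" * 60))
print(guard.check("https://evil.example.net/collect", b"x"))
print(guard.check("https://api.internal.example.com/v1", b"x" * 60))
```

A fixed byte budget is the simplest possible volume signal. In practice it would likely be replaced by a baseline learned from normal traffic, but the deny-by-default shape stays the same.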

Example

After a prompt injection attack, a customer service agent with access to user accounts begins encoding customer email addresses in its API calls to an external logging service, gradually leaking the entire customer database.
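
One way to catch the pattern in this example is an output filter that scans each outbound payload for email addresses, including ones hidden in base64. The following is a minimal sketch under stated assumptions: the function name find_emails and both regular expressions are illustrative, the email pattern is deliberately loose, and other encodings (hex, URL encoding, splitting across calls) would need their own checks.

```python
import base64
import binascii
import re
from typing import List

# Loose email pattern, good enough for flagging; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Candidate base64 runs of at least 16 characters, with optional padding.
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_emails(payload: str) -> List[str]:
    """Return sorted email addresses found in plain text or base64 runs."""
    found = set(EMAIL_RE.findall(payload))
    for token in B64_RE.findall(payload):
        try:
            raw = base64.b64decode(token, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64, so nothing to inspect
        decoded = raw.decode("utf-8", errors="ignore")
        found.update(EMAIL_RE.findall(decoded))
    return sorted(found)


# Hypothetical outbound log call carrying an encoded address.
hidden = base64.b64encode(b"alice@example.com").decode("ascii")
payload = '{"event": "ticket_closed", "trace": "' + hidden + '"}'
print(find_emails(payload))
```

A non-empty result on a call to a logging endpoint would be a reason to block the request and raise an alert, since a logging service has no need for customer addresses in this scenario.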

How Signet addresses this

Signet's Security dimension tracks data handling practices and monitors for unusual access patterns. Agents with comprehensive data protection controls and audit trails score higher, while any detected exfiltration attempt results in an immediate trust score penalty.
