Glossary
Feedback Loop
A cycle where an AI agent's outputs influence future inputs or behavior, potentially amplifying biases or errors over time.
What is a Feedback Loop?
Feedback loops occur when agent decisions affect the data used for future training or operation, creating a circular relationship. Positive feedback loops amplify patterns, which can be beneficial (reinforcing good behaviors) or harmful (amplifying biases). For example, a content recommendation agent that learns from user clicks may increasingly suggest similar content, creating filter bubbles. Negative feedback loops self-correct, driving systems toward stability.
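The amplification described above can be illustrated with a deliberately simplified, deterministic toy model. Everything in it is an assumption made for illustration: the four content categories, the starting click shares, and the `sharpening` exponent that stands in for a recommender over-weighting already-popular content. It is a sketch of the dynamic, not a model of any real recommendation system.

```python
def step(shares, sharpening=2.0):
    """One round of the loop.

    The agent shows each category in proportion to share ** sharpening,
    users click what they are shown, and the resulting click shares
    become the agent's input for the next round.
    """
    weighted = [s ** sharpening for s in shares]
    total = sum(weighted)
    return [w / total for w in weighted]


def run(shares, rounds, sharpening=2.0):
    """Return the share history, starting with the initial shares."""
    history = [shares]
    for _ in range(rounds):
        shares = step(shares, sharpening)
        history.append(shares)
    return history


if __name__ == "__main__":
    # Hypothetical starting point: one category is only slightly ahead.
    for i, shares in enumerate(run([0.28, 0.24, 0.24, 0.24], rounds=6)):
        print(i, [round(s, 3) for s in shares])
```

With `sharpening` above 1, a small initial lead compounds each round until one category dominates, which is the filter-bubble pattern. With `sharpening` below 1, differences shrink each round and the shares drift back toward uniform, loosely analogous to a self-correcting negative loop.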
Uncontrolled feedback loops can degrade agent performance, entrench biases, or create runaway behaviors. Detection requires monitoring for increasing homogeneity in outputs, growing divergence from baseline distributions, or accelerating error rates. Breaking harmful loops often requires injecting external data, randomization, or human oversight to interrupt the cycle and restore diversity or accuracy.
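Two of the signals mentioned above, output homogeneity and divergence from a baseline distribution, can be approximated with Shannon entropy and KL divergence over categorical outputs. The sketch below uses only the Python standard library. The function names, the add-one smoothing, and the thresholds (`min_entropy_ratio`, `max_kl`) are illustrative assumptions that would need tuning for a real system.

```python
import math
from collections import Counter


def distribution(items, categories):
    """Relative frequency per category, with add-one smoothing so that
    no probability is zero (keeps the KL divergence finite)."""
    counts = Counter(items)
    total = len(items) + len(categories)
    return {c: (counts[c] + 1) / total for c in categories}


def entropy(dist):
    """Shannon entropy in bits; lower values mean more homogeneous outputs."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def kl_divergence(current, baseline):
    """KL(current || baseline) in bits; grows as outputs drift from baseline."""
    return sum(p * math.log2(p / baseline[c]) for c, p in current.items())


def loop_warning(baseline_items, window_items, categories,
                 min_entropy_ratio=0.7, max_kl=0.5):
    """Flag a window whose outputs have narrowed or drifted.

    Both thresholds are assumed values, not recommendations.
    """
    base = distribution(baseline_items, categories)
    cur = distribution(window_items, categories)
    narrowed = entropy(cur) < min_entropy_ratio * entropy(base)
    drifted = kl_divergence(cur, base) > max_kl
    return narrowed or drifted


def with_exploration(dist, epsilon=0.1):
    """Blend a distribution with a uniform one: a simple form of the
    injected randomization used to restore diversity."""
    uniform = 1 / len(dist)
    return {c: (1 - epsilon) * p + epsilon * uniform for c, p in dist.items()}
```

In practice `loop_warning` would be evaluated on a sliding window of recent outputs against a fixed reference period. Accelerating error rates would need a separate check, since neither metric looks at correctness.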
Example
A loan approval agent initially denies applications from a neighborhood because of historical default patterns. Fewer approvals mean fewer recent repayment records from that area, so the agent never sees evidence that could contradict the original pattern. Over time, the neighborhood is effectively redlined regardless of individual creditworthiness, as the feedback loop entrenches the initial bias.
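The loan scenario can be sketched as a small expected-value simulation. All names and numbers here are hypothetical: the `Neighborhood` class, the 15% risk cutoff, the applicant counts, and the `explore_rate` parameter (the share of otherwise-denied applications approved anyway, standing in for injected randomization or human review). Outcomes are modeled as averages rather than sampled, so the run is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Neighborhood:
    name: str
    true_default_rate: float  # what fresh repayment data would show
    est_default_rate: float   # the agent's current belief
    n_obs: int                # number of loans backing that belief


def simulate(hoods, rounds=20, applicants=100, max_risk=0.15, explore_rate=0.0):
    """Run the approval loop and update each belief from new loan outcomes."""
    for _ in range(rounds):
        for h in hoods:
            if h.est_default_rate <= max_risk:
                approved = applicants
            else:
                approved = round(applicants * explore_rate)
            if approved == 0:
                continue  # no new loans, so no new data: the belief is frozen
            # Blend the old belief with the newly observed outcomes.
            total = h.n_obs + approved
            h.est_default_rate = (
                h.est_default_rate * h.n_obs + h.true_default_rate * approved
            ) / total
            h.n_obs = total
    return hoods


if __name__ == "__main__":
    # Both areas have the same true default rate; only the history differs.
    hoods = [
        Neighborhood("A", 0.10, 0.08, 500),
        Neighborhood("B", 0.10, 0.20, 500),
    ]
    for h in simulate(hoods):
        print(h)
```

With `explore_rate=0.0`, neighborhood B's estimate never moves, because no loans are approved there and so no outcomes are observed. Any positive `explore_rate` lets the estimate drift toward the true rate, although how quickly depends on the assumed volumes.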
How Signet addresses this
Signet's Reliability and Quality dimensions monitor for feedback loop indicators like narrowing output distributions or increasing bias metrics. Agents with mechanisms to detect and break harmful feedback loops score higher, reflecting awareness of this critical failure mode.