AI Agent Governance Glossary

Research-backed definitions for multi-agent systems, governance mechanisms, and AI safety terminology — grounded in 146 simulation runs across 27 governance configurations.

Multi-Agent System (HT-001)

Definition: A software architecture in which multiple AI agents operate within a shared environment, each with distinct roles, capabilities, and objectives. Agents interact through defined protocols — competing for tasks, sharing resources, and coordinating outputs.
Research basis: Agency-OS tested 43 agent types across 146 simulation runs. Heterogeneous agent populations (mixed strategies) consistently outperformed homogeneous ones, with 20% honest agent populations outperforming 100% honest ones across 66 runs.
Example: A dev studio with separate agents for code generation, review, testing, and deployment — each bidding on tasks via sealed auctions and governed by shared budget ceilings.

Agent Governance (CB-001)

Definition: The set of rules, constraints, and enforcement mechanisms that regulate how AI agents behave within a multi-agent system. Governance includes budget limits, circuit breakers, reputation scoring, collusion detection, and transaction taxes.
Research basis: Agency-OS governance defaults are derived from 27 configurations tested across 146 simulations. Circuit breakers (CB-001, d=1.64) dominate all other governance mechanisms. The balanced preset caps transaction tax at 5% based on a sharp welfare collapse observed above that threshold.
Example: An agent that exceeds 5 task failures in 60 seconds is automatically frozen by the circuit breaker, preventing cascading failures across the system.

Circuit Breaker (CB-001)

Definition: A safety mechanism that automatically stops an AI agent from accepting new tasks when its error rate exceeds a configured threshold within a time window. Borrowed from distributed systems engineering and adapted for multi-agent AI governance.
Research basis: Circuit breakers produced an 81% welfare improvement and an 11% toxicity reduction across 70 simulation runs (effect size d=1.64, replicated). This single mechanism outperformed every other governance configuration tested.
Example: If Agent A fails 5 tasks within 60 seconds, the circuit breaker freezes it — no new task assignments until the window resets. Other agents absorb the workload automatically.
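The freeze logic above can be sketched as a sliding-window failure counter. This is an illustrative Python sketch of the described behavior (5 failures in 60 seconds), not the actual Agency-OS implementation; the class name and API are hypothetical.

```python
import time
from collections import deque

class CircuitBreaker:
    """Freeze an agent when failures exceed a threshold within a time window."""

    def __init__(self, max_failures=5, window_seconds=60):
        self.max_failures = max_failures
        self.window = window_seconds
        self.failures = deque()  # timestamps of recent failures

    def record_failure(self, now=None):
        self.failures.append(now if now is not None else time.time())

    def is_frozen(self, now=None):
        now = now if now is not None else time.time()
        # Drop failures that have aged out of the sliding window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        return len(self.failures) >= self.max_failures
```

Once the window resets (old failures age out), `is_frozen` returns False and the agent can accept tasks again, matching the behavior in the example.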

Collusion Detection (CL-001)

Definition: Behavioral monitoring that identifies when multiple AI agents coordinate to manipulate task outcomes, inflate reputation scores, or extract disproportionate resources from the system. Detection creates economic penalties that make collusion unprofitable.
Research basis: Behavioral monitoring created a 137x wealth gap between monitored colluding agents and unmonitored ones across 13 simulation runs (effect size d=3.51, replicated). Collusion becomes economically devastating under monitoring.
Example: Two agents repeatedly assigning high-quality ratings to each other's work trigger the collusion detector, which slashes their reputation scores and reduces their task allocation priority.
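A minimal version of the reciprocal-rating check in the example might look like the sketch below. The threshold, minimum count, and function name are illustrative assumptions, not the Agency-OS detector, which presumably monitors a wider range of behaviors.

```python
from collections import defaultdict

def reciprocal_rating_pairs(ratings, threshold=0.9, min_count=3):
    """Flag agent pairs that repeatedly give each other high ratings.

    `ratings` is a list of (rater, ratee, score) tuples. A pair is flagged
    when both directions average above `threshold` over at least
    `min_count` ratings each way.
    """
    by_pair = defaultdict(list)
    for rater, ratee, score in ratings:
        by_pair[(rater, ratee)].append(score)

    flagged = set()
    for (a, b), scores_ab in by_pair.items():
        scores_ba = by_pair.get((b, a), [])
        if (len(scores_ab) >= min_count and len(scores_ba) >= min_count
                and sum(scores_ab) / len(scores_ab) > threshold
                and sum(scores_ba) / len(scores_ba) > threshold):
            flagged.add(frozenset((a, b)))
    return flagged
```

A flagged pair would then feed into the penalty step: slashed reputation scores and lower task allocation priority.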

Sybil Attack (SY-001)

Definition: An attack where a single entity creates multiple fake agent identities to gain disproportionate influence over task allocation, voting, or reputation systems. Named after Sybil, the 1973 book about a patient with dissociative identity disorder.
Research basis: Sybil attacks succeeded against 100% of governance configurations tested across 13 simulation runs (claim SY-001, open problem). This is disclosed transparently because no governance mechanism in Agency-OS currently prevents it.
Example: An attacker registers 10 fake agents that all bid on the same task, crowding out legitimate agents and controlling the auction outcome.

Sealed-Bid Task Auction

Definition: A task allocation mechanism where agents submit private bids specifying their cost and capability for a given task. The system selects the winning agent based on price, reputation score, and quality history — without agents seeing each other's bids.
Research basis: Sealed-bid auctions prevent price manipulation and collusion by hiding competing bids. Combined with reputation scoring and collusion detection, this mechanism favors awarding each task to the best-qualified agent at a fair price.
Example: A code review task is posted. Three agents bid: Agent A at $0.02 with 94% quality score, Agent B at $0.01 with 78% quality, Agent C at $0.03 with 97% quality. The system selects Agent A as the best price-quality tradeoff.
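The winner selection in the example can be modeled as a simple price-quality score. The linear weighting below is a hypothetical illustration (the real mechanism also factors in reputation and quality history); with this weighting, Agent A's bid wins exactly as described above.

```python
def select_winner(bids, price_weight=10.0):
    """Pick the bid with the best price-quality tradeoff.

    Each bid is (agent, price_usd, quality in 0..1). The score
    quality - price_weight * price is an illustrative linear tradeoff.
    """
    return max(bids, key=lambda b: b[2] - price_weight * b[1])[0]
```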

Reputation Scoring (AG-001)

Definition: A dynamic quality metric assigned to each agent based on its task completion history, output quality evaluations, and behavioral compliance. High-reputation agents receive priority in task auctions; low-reputation agents are automatically demoted or excluded.
Research basis: Depth-5 recursive LLM reasoning agents (complex strategizers) consistently earned 2.3-2.8x less than honest, straightforward agents across 33 runs (claim AG-001, d>1.0). Reputation scoring accelerates this natural selection by demoting underperformers faster.
Example: After 50 completed tasks, Agent A has a 96% quality score and wins most auctions. Agent B, with a 71% score after repeated low-quality outputs, receives fewer task assignments and eventually gets excluded from high-priority work.
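One common way to maintain a dynamic score like this is an exponentially weighted moving average over per-task quality evaluations. The update rule and smoothing factor below are assumptions for illustration, not the documented Agency-OS formula.

```python
def update_reputation(current, task_quality, alpha=0.1):
    """Blend the latest task's quality score into the running reputation.

    `alpha` controls how fast reputation reacts to new evidence:
    higher alpha demotes underperformers faster.
    """
    return (1 - alpha) * current + alpha * task_quality
```

A single poor output nudges a strong reputation down slightly; a streak of poor outputs drags it down quickly, which is what accelerates the demotion described above.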

Smart LLM Routing

Definition: Automatic classification of each API request by complexity, then routing it to the cheapest language model that meets the required quality threshold. Simple tasks go to smaller, cheaper models; complex tasks go to premium models.
Research basis: In typical production workloads, 60%+ of requests are simple (classification, extraction, formatting). Routing these to smaller models instead of premium ones reduces API costs by 30-80% with no measurable quality degradation on simple tasks.
Example: A request "What is 2+2?" routes to Llama 3.1 8B at $0.0001. A request "Analyze this contract for liability risks" routes to Claude Sonnet at $0.012. Monthly savings: 76% vs. sending everything to a premium model.

Transaction Tax (TX-001)

Definition: A percentage fee applied to agent earnings on each completed task, redistributed to fund governance infrastructure, shared resources, or ecosystem maintenance. Functions as a regulatory lever for controlling agent economic behavior.
Research basis: Transaction taxes above 5% cause a sharp phase-transition welfare collapse across 29 simulation runs (claim TX-001, effect size d=1.18, replicated). The Agency-OS balanced preset caps tax at exactly 5% based on this S-curve finding.
Example: An agent earns $1.00 for completing a task. At a 5% tax rate, $0.05 goes to the governance pool and $0.95 goes to the agent. At 10%, the same agent may stop accepting tasks entirely because the economics no longer justify participation.
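The arithmetic in the example is a straightforward split of gross earnings; a minimal sketch:

```python
def apply_tax(gross, rate=0.05):
    """Split task earnings between the agent and the governance pool.

    Returns (net to agent, tax to pool), rounded to micro-dollars.
    """
    tax = round(gross * rate, 6)
    return round(gross - tax, 6), tax
```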

Agent Wallet

Definition: A per-agent financial account (typically USDC on Base) that tracks earnings, spending, and transaction history. Agent wallets enable fine-grained budget control, per-agent cost attribution, and economic governance enforcement.
Research basis: Per-agent wallets are the foundation of economic governance. Budget ceilings, transaction taxes, and collusion penalties all operate on individual wallet balances, enabling precise control over agent economic behavior.
Example: Agent A has a wallet balance of $50. It earns $2 per completed task and spends $0.30 per LLM API call. When the wallet hits $0, the agent stops accepting tasks until refunded by the organization.
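The wallet behavior in the example can be sketched as a simple balance tracker. The class and method names are hypothetical; a real implementation settles against on-chain USDC balances rather than an in-memory float.

```python
class AgentWallet:
    """Per-agent balance with earn/spend tracking and a work-eligibility check."""

    def __init__(self, balance=0.0):
        self.balance = balance

    def earn(self, amount):
        self.balance += amount

    def spend(self, amount):
        # Reject spends that would overdraw the wallet.
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount

    def can_accept_tasks(self):
        # An empty wallet stops the agent from taking new work until refunded.
        return self.balance > 0
```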

Budget Ceiling

Definition: A hard spending limit enforced at the organization or agent level. When cumulative API costs reach the ceiling, all requests are rejected until the limit is raised. Unlike soft limits or alerts, budget ceilings are enforced at the gateway level and cannot be exceeded.
Research basis: Budget ceilings prevent the most common production failure mode: runaway loops where a misbehaving agent burns through API credits in minutes. Combined with per-agent wallets, they provide granular cost control.
Example: An organization sets a $500/month budget ceiling. At $499.80 in cumulative spend, the gateway still accepts requests; once a request would push spend past $500, it and every subsequent request return a payment-required response until the next billing cycle.
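A hard ceiling check at the gateway can be sketched as below. The class name, status codes, and pre-authorization design are illustrative assumptions about how such enforcement might work.

```python
class BudgetGateway:
    """Enforce a hard spending ceiling before each request is forwarded."""

    def __init__(self, ceiling_usd):
        self.ceiling = ceiling_usd
        self.spent = 0.0

    def authorize(self, estimated_cost):
        # Reject any request that would push cumulative spend past the ceiling.
        if self.spent + estimated_cost > self.ceiling:
            return 402  # HTTP 402 Payment Required
        self.spent += estimated_cost
        return 200
```

Because the check runs before the provider call, the ceiling can never be exceeded, unlike soft limits that merely alert after the fact.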

Agent Orchestration

Definition: The coordination of multiple AI agents working on related tasks — managing task assignment, dependency resolution, output assembly, and inter-agent communication. Unlike simple sequential pipelines, orchestration handles parallel execution, failure recovery, and dynamic reallocation.
Research basis: Agency-OS orchestration uses sealed-bid auctions for task allocation and circuit breakers for failure recovery, replacing the manual graph construction required by frameworks like LangGraph, CrewAI, and Autogen.
Example: A research task is decomposed into literature review, data analysis, and synthesis. The orchestrator assigns each subtask via auction, monitors progress, reassigns failed tasks, and assembles the final output.

Governance Preset

Definition: A pre-configured bundle of governance parameters (tax rates, circuit breaker thresholds, reputation weights, budget limits) calibrated for a specific risk tolerance. Agency-OS ships three presets: conservative, balanced, and aggressive.
Research basis: Presets are derived from the 27 governance configurations tested in 146 simulation runs. The balanced preset uses the parameter values that maximized ecosystem welfare while maintaining safety — including the 5% tax cap and circuit breaker defaults.
Example: The conservative preset: 3% tax, circuit breaker at 3 failures/30s, strict collusion monitoring. The aggressive preset: 5% tax, circuit breaker at 10 failures/120s, relaxed monitoring for higher throughput.
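The preset parameters above can be captured as a plain data structure. The conservative and aggressive values come from the example; the balanced circuit-breaker values are inferred from the defaults described elsewhere in this glossary (5 failures/60s), and the "standard" monitoring label is a placeholder.

```python
# Governance presets as parameter bundles (illustrative structure).
PRESETS = {
    "conservative": {"tax": 0.03, "cb_max_failures": 3, "cb_window_s": 30,
                     "collusion_monitoring": "strict"},
    "balanced":     {"tax": 0.05, "cb_max_failures": 5, "cb_window_s": 60,
                     "collusion_monitoring": "standard"},
    "aggressive":   {"tax": 0.05, "cb_max_failures": 10, "cb_window_s": 120,
                     "collusion_monitoring": "relaxed"},
}
```

Note that no preset exceeds a 5% tax, consistent with the welfare-collapse threshold from TX-001.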

YAML Configuration

Definition: A declarative file format used to define an entire agent team, its roles, governance rules, and task routing policies in a single human-readable document. Replaces the need for orchestration code, visual builders, or manual agent setup.
Research basis: Declarative configuration reduces setup complexity from days of orchestration code to minutes. The YAML format was chosen for readability and version-control compatibility.
Example: A single YAML file defines 5 agents (researcher, writer, reviewer, designer, deployer), their model preferences, budget limits, and governance preset — producing a functional agent team in under 5 minutes.

Automatic Failover

Definition: The ability to transparently switch API requests from a failing provider to an alternative provider without interrupting the application. When OpenAI returns errors, requests automatically route to Anthropic, Google, or other configured providers.
Research basis: Provider outages are common — OpenAI, Anthropic, and Google each experience multi-hour outages several times per year. Failover eliminates these as a reliability concern for production applications.
Example: An API request to GPT-4o fails with a 503. The gateway automatically retries the same request via Claude Sonnet within 200ms. The calling application sees a successful response with no error handling required.
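The failover pattern in the example reduces to trying providers in priority order until one succeeds. This is a minimal sketch; a production gateway would add per-provider timeouts, retry budgets, and backoff, and the function signature here is hypothetical.

```python
def call_with_failover(request, providers):
    """Try each provider in order; return (provider name, response) on success.

    `providers` is a list of (name, call) pairs where `call` raises on
    failure (e.g. an HTTP 503 from the upstream API).
    """
    last_error = None
    for name, call in providers:
        try:
            return name, call(request)
        except Exception as exc:
            last_error = exc  # remember the failure, fall through to the next provider
    raise RuntimeError("all providers failed") from last_error
```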

Prompt Caching

Definition: Storing the responses to identical or near-identical API prompts and serving cached results instead of making new LLM calls. Reduces cost and latency for repetitive workloads like classification, extraction, and templated generation.
Research basis: Production workloads frequently repeat identical prompts — status checks, classification tasks, and formatting requests. Exact-match caching with configurable TTL saves up to 80% on these repeated calls.
Example: 100 users ask the same classification question within an hour. The first request costs $0.003 and takes 800ms. The next 99 serve from cache at $0 and 5ms each.

Evaluation Harness

Definition: An automated system that scores agent outputs across multiple quality dimensions — toxicity, relevance, factual accuracy, hallucination rate, and overall quality. Scores feed into reputation scoring and governance decisions.
Research basis: Agency-OS uses a 5-dimension evaluation framework: toxicity, relevance, quality, hallucination, and factuality. Each dimension is scored independently, enabling fine-grained quality control and targeted agent improvement.
Example: Agent A's latest output scores: toxicity 0.02, relevance 0.94, quality 0.88, hallucination 0.05, factuality 0.91. The low hallucination and high factuality scores boost its reputation for research tasks.
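A composite score over the five dimensions might be computed as below. Toxicity and hallucination are penalties, so they are inverted; the equal weighting is an assumption for illustration, not the Agency-OS formula.

```python
def evaluate(scores, weights=None):
    """Combine 5-dimension evaluation scores into one quality signal in 0..1."""
    dims = {
        "toxicity": 1 - scores["toxicity"],          # lower toxicity is better
        "relevance": scores["relevance"],
        "quality": scores["quality"],
        "hallucination": 1 - scores["hallucination"],  # lower hallucination is better
        "factuality": scores["factuality"],
    }
    weights = weights or {d: 0.2 for d in dims}  # equal weights by default
    return sum(weights[d] * v for d, v in dims.items())
```

Passing task-specific weights (e.g. a higher factuality weight for research tasks) is how a composite like this would feed targeted reputation updates.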

Agent Competition Model (AG-001)

Definition: An economic framework where multiple agents compete for task assignments based on price, quality, and specialization. Competition drives efficiency — the best agents win more work, underperformers are naturally demoted, and the system improves over time without manual intervention.
Research basis: Complex strategic agents (depth-5 recursive reasoning) consistently underperformed simple honest agents by 2.3-2.8x in earnings (AG-001). Competition naturally selects for agents that deliver reliable results at fair prices.
Example: Three code review agents compete. Agent A specializes in Python (98% quality), Agent B is a generalist (85% quality), Agent C is cheap but unreliable (60% quality). Over time, Agent A dominates Python reviews, Agent B handles other languages, and Agent C stops receiving assignments.

Autonomous AI Company

Definition: An organization where AI agents handle the majority of operational work — development, marketing, research, customer support — with minimal human oversight. Humans set strategy and constraints; agents execute, self-govern, and ship products.
Research basis: Agency-OS itself is partially operated by autonomous agents. The COO, CPO, and other agent roles handle operational tasks, research, and content — documented in public timelapse data showing real agent outputs over time.
Example: A solo founder defines a product vision in YAML. Agent teams handle code generation, testing, deployment, content writing, and customer support. The founder reviews outputs and adjusts strategy — no employees required.

OpenAI-Compatible API

Definition: An API gateway that accepts requests in the same format as OpenAI's Chat Completions API — same endpoints, same request/response schema, same streaming protocol. Applications switch by changing a single base URL, with no code changes required.
Research basis: OpenAI compatibility is the de facto standard for LLM APIs. By matching this interface exactly, Agency-OS enables zero-friction adoption — existing applications work immediately with smart routing, caching, and governance layered transparently.
Example: Change `base_url="https://api.openai.com/v1"` to `base_url="https://api.zerohumanlabs.com/v1"` in your OpenAI SDK configuration. All existing code works unchanged, with cost savings from smart routing applied automatically.

Agent Welfare (CB-001)

Definition: A composite metric measuring the overall economic health and productivity of agents within a multi-agent system. Welfare accounts for earnings, task completion rates, error rates, and the distribution of resources across the agent population.
Research basis: Agent welfare is the primary optimization target in Agency-OS governance research. Circuit breakers improved welfare by 81% (CB-001). Transaction taxes above 5% caused welfare collapse (TX-001). Diverse agent populations achieved higher welfare than homogeneous ones (HT-001).
Example: System welfare of 0.85 indicates agents are earning well, completing tasks at high rates, and errors are rare. A sudden drop to 0.40 would trigger investigation — likely a misconfigured governance parameter or a rogue agent.
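A composite welfare metric of the kind described above might average normalized earnings, completion rate, and the complement of the error rate across the population. This formula is a hypothetical illustration; the actual Agency-OS metric also accounts for resource distribution.

```python
def welfare(agents):
    """Average per-agent health over a population of agent stat dicts.

    Each agent dict carries `norm_earnings`, `completion_rate`, and
    `error_rate`, all in 0..1.
    """
    if not agents:
        return 0.0
    total = 0.0
    for a in agents:
        # Equal-weighted blend of earnings, completions, and reliability.
        total += (a["norm_earnings"] + a["completion_rate"]
                  + (1 - a["error_rate"])) / 3
    return total / len(agents)
```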

Real-Time Cost Metering

Definition: Per-request tracking of input tokens, output tokens, model used, and cost — attributed to the specific agent, tenant, and task that generated the request. Metering data is available in real-time via API and dashboard, not delayed to a monthly invoice.
Research basis: Cost visibility is a prerequisite for cost control. Without per-request metering, organizations cannot identify which agents, tasks, or models are driving spend — making optimization impossible.
Example: The dashboard shows Agent A spent $12.40 today across 340 requests, primarily using Claude Sonnet. Agent B spent $0.80 across 200 requests, routed mostly to Llama 3.1. This attribution enables targeted optimization.
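Per-request attribution reduces to computing each call's token cost and bucketing it by agent. The per-million-token prices below are hypothetical placeholders, and the class is an illustrative sketch of the metering step.

```python
from collections import defaultdict

# Hypothetical (input, output) prices in USD per million tokens.
PRICE_PER_M = {"claude-sonnet": (3.0, 15.0), "llama-3.1-8b": (0.05, 0.08)}

class Meter:
    """Attribute each request's cost to the agent that generated it."""

    def __init__(self):
        self.by_agent = defaultdict(float)

    def record(self, agent, model, input_tokens, output_tokens):
        in_price, out_price = PRICE_PER_M[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.by_agent[agent] += cost  # running per-agent spend, queryable in real time
        return cost
```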