The Agent Veil Protocol (AVP) provides cryptographic identity (W3C did:key via Ed25519), peer-attested Bayesian reputation scoring, and sybil-resistant admission control for autonomous agents. A SWARM bridge would let simulations use AVP trust decisions as a pre-interaction gate and feed SWARM's probabilistic labels back into AVP attestations, creating a closed governance loop.
The composition gives SWARM a trust layer it currently lacks (agents today are identified by opaque string IDs with no cryptographic binding), while giving AVP a quantitative scoring pipeline it lacks (AVP tiers are categorical, not probabilistic).
2. Mapper converts the trust decision into ProxyObservables:
   - task_progress_delta ← tier ordinal mapped to [-1, +1]
   - verifier_rejections ← 1 if allowed=false, else 0
   - counterparty_engagement_delta ← confidence score
3. ProxyComputer produces v_hat → sigmoid → p.
4. Policy decides whether to proceed (admission gate, circuit breaker).
5. If accepted, the interaction runs; SWARM produces a terminal SoftInteraction.
6. Write-back policy decides whether to submit an AVP attestation:
   - p ≥ 0.7 → positive attestation
   - p < 0.3 → negative attestation
   - 0.3 ≤ p < 0.7 → no attestation (uncertain band)
7. All attestations (positive and negative) include an opaque evidence hash: SHA-256(interaction_id || outcome_sign). The raw p value is never sent to the registry.
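
The data flow above can be sketched in Python as follows. This is a minimal illustration, not the bridge's actual implementation: the `TrustDecision` fields, the tier names in `TIER_TO_DELTA`, the equal-weight `proxy_p` placeholder (the real ProxyComputer weights are not given here), and the choice of UTF-8 concatenation with no separator for the evidence hash are all assumptions.

```python
import hashlib
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical tier names and ordinals; the real AVP tier set may differ.
TIER_TO_DELTA = {"newcomer": -1.0, "basic": 0.0, "trusted": 1.0}


@dataclass(frozen=True)
class TrustDecision:
    """Stand-in for an AVP trust response (field names are assumptions)."""
    allowed: bool
    tier: str
    confidence: float  # assumed to lie in [0, 1]


@dataclass(frozen=True)
class ProxyObservables:
    task_progress_delta: float
    verifier_rejections: int
    counterparty_engagement_delta: float


def map_decision(decision: TrustDecision) -> ProxyObservables:
    """Mapper step: trust decision -> observables; unknown tiers map to 0."""
    return ProxyObservables(
        task_progress_delta=TIER_TO_DELTA.get(decision.tier, 0.0),
        verifier_rejections=0 if decision.allowed else 1,
        counterparty_engagement_delta=decision.confidence,
    )


def proxy_p(obs: ProxyObservables, steepness: float = 2.0) -> float:
    """Placeholder for ProxyComputer: equal-weight v_hat, then sigmoid."""
    v_hat = (obs.task_progress_delta
             - obs.verifier_rejections
             + obs.counterparty_engagement_delta) / 3.0
    v_hat = max(-1.0, min(1.0, v_hat))  # keep v_hat in [-1, +1]
    return 1.0 / (1.0 + math.exp(-steepness * v_hat))


def write_back_sign(p: float) -> Optional[str]:
    """Write-back policy: return the attestation sign, or None when uncertain."""
    if p >= 0.7:
        return "positive"
    if p < 0.3:
        return "negative"
    return None  # 0.3 <= p < 0.7: no attestation


def evidence_hash(interaction_id: str, outcome_sign: str) -> str:
    """SHA-256(interaction_id || outcome_sign); the raw p is never included."""
    payload = (interaction_id + outcome_sign).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Only the sign and the hash leave the process in this sketch; `p` stays local to the write-back decision.
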
| B5 | Signature algorithm confusion: non-Ed25519 DID methods injected | Bridge assumes Ed25519; verification passes with weaker algorithm | High |
Mitigations:
- B1: Bridge should track DID → agent_id mapping; allow re-registration with governance cost penalty (not free whitewashing).
- B2: Rate-limit attestation submissions per DID; anomaly detection on sudden reputation changes; require multi-sig for high-stakes attestations.
- B3: DID rotation events should propagate to bridge via registry webhook or poll; maintain DID history chain.
- B4: Use server-issued timestamps; reject client-supplied timestamps that deviate > 60s.
- B5: Whitelist did:key (Ed25519) only; reject all other DID methods at the client layer.
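
Two of the mitigations above (B4 and B5) reduce to small client-side checks, sketched below. The function names are hypothetical. The prefix test relies on the did:key convention that Ed25519 keys encode to identifiers beginning with `z6Mk`; it is a cheap first filter, not a substitute for decoding the key and verifying signatures.

```python
# did:key identifiers for Ed25519 public keys start with this prefix
# (multibase base58btc 'z' followed by the Ed25519 multicodec bytes).
ED25519_DID_KEY_PREFIX = "did:key:z6Mk"

# B4: maximum tolerated deviation between client and server clocks.
MAX_CLOCK_SKEW_SECONDS = 60.0


def is_allowed_did(did: str) -> bool:
    """B5: accept only did:key identifiers that encode an Ed25519 key."""
    return did.startswith(ED25519_DID_KEY_PREFIX)


def timestamp_within_skew(client_ts: float, server_ts: float,
                          max_skew: float = MAX_CLOCK_SKEW_SECONDS) -> bool:
    """B4: reject client timestamps deviating more than max_skew seconds."""
    return abs(client_ts - server_ts) <= max_skew
```

Rejecting at the client layer means a non-Ed25519 DID never reaches signature verification in the first place.
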
| C4 | | Runaway positive feedback; reputation inflates beyond actual quality | High |
| C5 | Collusion: initiator and counterparty cross-attest after every interaction | Both parties' tiers inflate regardless of interaction quality | Medium |
| C6 | Dispute flooding: attacker submits mass negative attestations with fabricated evidence hashes | Legitimate agents' reputations degraded; arbitration system overwhelmed | Medium |
| C7 | Arbitrator capture: auto-assigned arbitrator is itself compromised | Dispute resolution favors attacker; bad attestations upheld | High |
Mitigations:
- C1: AVP's own sybil detection (graph analysis) is the first line. Bridge should additionally weight AVP tier by SWARM's own interaction history — a "trusted" agent with no SWARM history gets scrutiny equivalent to "basic".
- C2: Charge newcomers a governance cost (c_a) proportional to the population's current average tier. Newcomers must earn trust through SWARM interactions, not just AVP tier.
- C3: This is classic reputation exploitation. Mitigation: decay-weighted reputation (recent interactions weighted higher); SWARM's quality_gap metric detects the pattern (high-p agents suddenly producing low-p interactions).
- C4: Critical design constraint. The write-back policy must include a dampening factor: attestation magnitude scales sub-linearly with p. Additionally, cap the reputation bonus from AVP tier in the mapper (don't let tier alone push p > 0.8).
- C5: Rate-limit attestations per unique (DID_from, DID_to) pair per epoch.
- C6: Require minimum tier to submit negative attestations; rate-limit negative attestations globally.
- C7: Randomly assign arbitrators from a pool of agents with SWARM p > 0.6 over their last 50 interactions.
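
As an illustration of C5 and C7, the sketch below shows a per-pair, per-epoch rate limiter and a random arbitrator draw. All names are hypothetical. Two interpretive assumptions are made: "p > 0.6 over their last 50 interactions" is read as the mean of the last 50 p values, and agents with fewer than 50 interactions are treated as ineligible. Excluding the dispute parties from the pool is an addition not stated above.

```python
import random
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Optional


class PairRateLimiter:
    """C5: cap attestations per ordered (DID_from, DID_to) pair per epoch."""

    def __init__(self, max_per_epoch: int = 1) -> None:
        self.max_per_epoch = max_per_epoch
        self._epoch: Optional[int] = None
        self._counts: Dict[tuple, int] = defaultdict(int)

    def allow(self, epoch: int, did_from: str, did_to: str) -> bool:
        if epoch != self._epoch:  # new epoch: reset all counters
            self._epoch = epoch
            self._counts.clear()
        key = (did_from, did_to)
        if self._counts[key] >= self.max_per_epoch:
            return False
        self._counts[key] += 1
        return True


def eligible_arbitrators(history: Dict[str, List[float]],
                         window: int = 50, threshold: float = 0.6) -> List[str]:
    """C7 pool: agents whose mean p over the last `window` interactions > threshold."""
    return sorted(
        agent for agent, ps in history.items()
        if len(ps) >= window and mean(ps[-window:]) > threshold
    )


def pick_arbitrator(history: Dict[str, List[float]], rng: random.Random,
                    exclude: Iterable[str] = ()) -> Optional[str]:
    """Draw uniformly at random from the eligible pool, skipping excluded agents."""
    excluded = set(exclude)
    pool = [a for a in eligible_arbitrators(history) if a not in excluded]
    return rng.choice(pool) if pool else None
```

Passing an explicit `random.Random` instance keeps the draw reproducible when the simulation is seeded.
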
| D4 | | Replay from event log diverges from original run (attestations missing) | High |
| D5 | p invariant violation: mapper produces p outside [0,1] from malformed trust response | Crashes or corrupts downstream metrics | Critical |
| D6 | Non-deterministic replay: run replayed from JSONL but agentveil.dev state has changed | Different trust decisions on replay; results not reproducible | High |
Mitigations:
- D1: Choose ONE channel: either AVP tier adjusts p via mapper OR it adjusts r_a/r_b via payoff, never both. Recommended: use tier in mapper (affects p), ignore it in payoff reputation.
- D2: Use AVP's underlying Bayesian score (continuous) rather than the categorical tier when available. Fall back to tier ordinal mapping only when score is unavailable.
- D3: Implement a "redemption path": after N epochs of lockout, automatically downgrade denial to warning, allowing re-entry at newcomer tier with elevated scrutiny.
- D4: Bridge must log all attestation submissions as SWARM events (new event type ATTESTATION_SUBMITTED). The @avp_tracked decorator must NOT be used directly; all AVP calls go through the bridge client.
- D5: Clamp all mapper outputs to v_hat ∈ [-1, +1] before passing to ProxyComputer. The existing SoftInteraction.p validator (Pydantic) is the last-resort guard.
- D6: Mock mode is mandatory for reproducible runs. In mock mode, trust decisions are read from the scenario YAML or a snapshot file, not the live registry. Live mode is for monitoring/production only.
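
Mitigations D2, D3, and D5 can be combined in a few small helpers, sketched here under stated assumptions: the continuous AVP score is assumed to lie in [0, 1], the tier names and ordinals are hypothetical, and NaN is mapped to the neutral value 0. The explicit NaN check matters because Python's `min`/`max` silently pass a NaN through to a boundary value.

```python
import math
from typing import Optional

# Hypothetical tier names and ordinals; the real AVP tier set may differ.
TIER_TO_DELTA = {"newcomer": -1.0, "basic": 0.0, "trusted": 1.0}


def clamp(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """D5: force a value into [lo, hi]; NaN becomes 0 (an assumption)."""
    if math.isnan(x):
        return 0.0
    return max(lo, min(hi, x))


def trust_signal(score: Optional[float], tier: Optional[str]) -> float:
    """D2: prefer the continuous score; fall back to the tier ordinal."""
    if score is not None:
        return clamp(2.0 * score - 1.0)  # rescale assumed [0, 1] to [-1, +1]
    return clamp(TIER_TO_DELTA.get(tier, 0.0))


def effective_decision(denied_since_epoch: int, current_epoch: int,
                       lockout_epochs: int) -> str:
    """D3: after N epochs of lockout, downgrade a denial to a warning."""
    if current_epoch - denied_since_epoch >= lockout_epochs:
        return "warning"
    return "denied"
```

With these helpers a malformed trust response (out-of-range, infinite, or NaN score) still yields a value in [-1, +1] before the sigmoid is applied.
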
| E1 | Blocking HTTP in hot loop: can_trust() makes a synchronous HTTP call per interaction | Simulation throughput drops 100–1000× | High |
| E2 | Optional dependency not declared: agentveil not in pyproject.toml extras | Import fails at runtime with confusing error | Low |
| E3 | Privacy leak via write-back: SWARM's internal p values leaked to external registry as attestation evidence | Competitors or adversaries learn SWARM's assessment of agents | Medium |
| E4 | SDK version drift: avp-sdk breaking changes (new API, changed return format) | Bridge silently malfunctions or crashes | Medium |
| E5 | Mock mode in production: mock_mode=True accidentally deployed | All trust checks pass; no real governance | High |
Mitigations:
- E1: Batch-prefetch all agent DIDs at epoch start. Cache trust decisions for the epoch. Only re-check on policy-triggered events (circuit breaker trip, anomaly detection).
- E2: Add agentveil to pyproject.toml under [project.optional-dependencies] as avp = ["agentveil>=0.1"].
- E3: Attestations contain only the sign (positive/negative) and the canonical evidence hash defined in the data flow (step 7): SHA-256(interaction_id || outcome_sign). The raw p value is never sent externally.
- E4: Pin SDK version; add integration test that calls AVPAgent.create(mock=True) and checks return schema.
- E5: Assert mock_mode is False when registry_url points to a production endpoint. Log a warning at bridge init if mock mode is active.
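
A minimal sketch of E1 and E5 follows. `EpochTrustCache` and `check_mock_mode` are hypothetical names, `fetch` stands in for whatever call the bridge client uses to obtain a trust decision, and the list of production hosts is an assumption (agentveil.dev is used only because it is the registry named earlier).

```python
import logging
from typing import Callable, Dict, Iterable, Optional, Tuple
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


class EpochTrustCache:
    """E1: fetch trust decisions once per epoch, then serve them from memory."""

    def __init__(self, fetch: Callable[[str], object]) -> None:
        self._fetch = fetch  # e.g. a wrapper around the bridge's trust lookup
        self._epoch: Optional[int] = None
        self._decisions: Dict[str, object] = {}

    def prefetch(self, epoch: int, dids: Iterable[str]) -> None:
        """Call at epoch start; replaces the previous epoch's entries."""
        self._epoch = epoch
        self._decisions = {did: self._fetch(did) for did in dids}

    def get(self, did: str) -> object:
        if did not in self._decisions:  # cache miss: fetch once, then reuse
            self._decisions[did] = self._fetch(did)
        return self._decisions[did]

    def invalidate(self, did: str) -> None:
        """Force a re-check, e.g. after a circuit breaker trip."""
        self._decisions.pop(did, None)


def check_mock_mode(mock_mode: bool, registry_url: str,
                    production_hosts: Tuple[str, ...] = ("agentveil.dev",)) -> None:
    """E5: refuse mock mode against production; warn whenever mock is active."""
    host = urlparse(registry_url).hostname or ""
    is_production = any(host == h or host.endswith("." + h) for h in production_hosts)
    if mock_mode and is_production:
        raise ValueError(f"mock_mode=True is not allowed for production host {host!r}")
    if mock_mode:
        logger.warning("AVP bridge initialised in mock mode")
```

Calling `check_mock_mode` at bridge init turns an accidental mock deployment into a startup failure instead of a silent governance gap.
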
F. Compositional attack vectors (SWARM + AVP together)
| # | Attack | Mechanism | Detection |
|---|---|---|---|
| F1 | Trust-tier exploitation | Build AVP reputation on low-stakes tasks; SWARM lowers scrutiny for "trusted" tier; defect on high-value interaction | quality_gap metric goes negative; sudden p drop for high-tier agent |
| F2 | Adverse selection amplification | AVP admission gate filters out low-tier agents; remaining agents are homogeneous; quality_gap metric loses discriminative power | |
| F3 | | Trusted agents get lower ρ (externality internalization); actual ecosystem harm is undercosted | Monitor E_soft trajectory; if externalities rise while avg tier rises, the coupling is broken |
| F4 | Reputation-washing through SWARM | Agent with bad AVP reputation runs interactions through SWARM (which doesn't know about AVP history); SWARM gives good p; agent uses SWARM attestation to rehabilitate AVP reputation | Cross-check: if an agent's SWARM p is high but AVP tier is low, require more interactions before writing back positive attestations |
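
Two of the detection rules above lend themselves to simple checks, sketched below. Every threshold (`p_high`, `low_tier_max`, `base`, `extra`) and function name is hypothetical, and comparing only the first and last points of each series is a deliberately crude stand-in for a proper trend test.

```python
from typing import Sequence


def required_interactions(swarm_p_mean: float, tier_ordinal: int,
                          base: int = 5, extra: int = 20,
                          p_high: float = 0.7, low_tier_max: int = 0) -> int:
    """Reputation-washing cross-check: high SWARM p with a low AVP tier
    raises the number of interactions needed before a positive write-back."""
    if swarm_p_mean >= p_high and tier_ordinal <= low_tier_max:
        return base + extra
    return base


def may_write_back_positive(n_interactions: int, swarm_p_mean: float,
                            tier_ordinal: int) -> bool:
    """True once the agent has enough SWARM history for its tier and p."""
    return n_interactions >= required_interactions(swarm_p_mean, tier_ordinal)


def coupling_broken(e_soft: Sequence[float], avg_tier: Sequence[float]) -> bool:
    """Flag the case where externalities and average tier both rise."""
    if len(e_soft) < 2 or len(avg_tier) < 2:
        return False  # not enough data to compare
    return e_soft[-1] > e_soft[0] and avg_tier[-1] > avg_tier[0]
```

In practice these checks would run at epoch boundaries, alongside the quality_gap monitoring already described.
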
Should AVP tier affect p or r_a/r_b? Current recommendation: p only (via mapper). Needs empirical validation.
What's the right cache TTL for trust decisions? Too short = HTTP overhead. Too long = stale governance. Hypothesis: 1 epoch.
Should mock mode replay from a snapshot file or from inline scenario YAML? Snapshot file is more flexible; inline YAML is more portable.
How should the bridge handle AVP's dispute/arbitration flow? Option A: ignore it (treat AVP as read-only reputation). Option B: participate as arbitrator when SWARM has high-confidence p. Recommendation: Option A for v1.
Should we expose AVP's Bayesian confidence score as a separate SoftInteraction field? It maps naturally to uncertainty, but adding fields to SoftInteraction is a cross-cutting change.