Five pillars. One platform. The infrastructure your engineers expect for human employees, applied to the digital ones: quantified risk on every connector call, server-side approval gates, drift detection against declared intent, a hash-chained personnel record, and clean termination. The first three pillars take you from Stage 02 (Stuck) to Stage 03 (Operating) in week one.
Before Arx: no registry, no scoped credentials, no audit trail. Agents run that the security team cannot account for, and deployments sit stuck in vendor review with evidence scattered across screenshots and Slack threads. The primitives below are how Arx moves you forward: from Stuck to Operating in week one, then compounding into Scaling.
Risk scoring is what makes "Enforced" enforceable. The score is the spine of the platform. It is computed before the call leaves the connector, from four signals you can audit and tune: the operation's blast radius (read vs. write vs. destructive vs. containment), the connector's data sensitivity, this session's action frequency, and the target system's criticality. Crossing your configured threshold doesn't fire an alert. It deterministically denies the call, and writes the verdict to the audit trail.
# From: governance/risk-scoring. The formula is open and explicit.
risk_score = (
    operation_risk            # 0–40  by op type
    + connector_sensitivity   # 0–25  by connector class
    + session_frequency       # 0–15  escalates with action count
    + target_sensitivity      # 0–20  prod > staging > dev
)

if risk_score >= policy.deny_threshold:
    verdict = "DENY"                # automated containment
elif risk_score >= policy.review_threshold:
    verdict = "APPROVAL_REQUIRED"   # gray zone → human
else:
    verdict = "ALLOW"               # 94% of calls in production
Pick the deny and review thresholds that match your appetite. Simulate against the last 30 days before enforcing.
Every call's score, formula inputs, and verdict are written to the immutable trail. Your auditor can recompute and verify.
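As a minimal sketch of what "recompute and verify" could look like on the auditor's side, the snippet below re-derives a score and verdict from logged inputs. The component ceilings come from the formula above; the `Policy` class, the threshold values, and the sample signal values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    deny_threshold: int
    review_threshold: int


# Component ceilings from the published formula (they sum to 100).
MAX_POINTS = {
    "operation_risk": 40,
    "connector_sensitivity": 25,
    "session_frequency": 15,
    "target_sensitivity": 20,
}


def risk_score(signals: dict) -> int:
    """Sum the four signals, rejecting any value outside its documented range."""
    total = 0
    for name, ceiling in MAX_POINTS.items():
        value = signals[name]
        if not 0 <= value <= ceiling:
            raise ValueError(f"{name}={value} is outside 0..{ceiling}")
        total += value
    return total


def verdict(score: int, policy: Policy) -> str:
    """Apply the deny threshold first, then the review threshold."""
    if score >= policy.deny_threshold:
        return "DENY"
    if score >= policy.review_threshold:
        return "APPROVAL_REQUIRED"
    return "ALLOW"


# Hypothetical thresholds and logged inputs, for illustration only.
policy = Policy(deny_threshold=80, review_threshold=50)
logged = {
    "operation_risk": 30,
    "connector_sensitivity": 15,
    "session_frequency": 5,
    "target_sensitivity": 20,
}
score = risk_score(logged)
print(score, verdict(score, policy))
```

Because the inputs are additive and bounded, an auditor only needs the four logged values and the policy thresholds in force at the time to reproduce the verdict.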
Scoped permissions, like an engineer's least-privilege access: an agent that chooses when to call its own approval gate will eventually choose not to. In Arx, policy is enforced server-side — inside the connector that holds the credential. There is nothing for the agent to route around.
# Policy: servicenow.change.close over 1 ticket needs approval.
# Evaluated inside the connector, before hitting ServiceNow.
policy "change-close-over-one":
  when: servicenow.change.close
  if:   payload.change_ids.length > 1
  then: approval_required(severity="high")
    approvers=group("secops-leads")
    timeout="4h"
    diff_shown_to_approver=true
Run a proposed policy against the last 30 days of agent activity. See what would have been blocked, approved, and queued — before anything is enforced.
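A rough sketch of the replay idea, assuming historical activity is available as a list of call records. The record shape, the `simulate` helper, and the Python mirror of the change-close policy are all hypothetical; this is not Arx's simulation API.

```python
from collections import Counter


def change_close_over_one(call: dict) -> str:
    """Python mirror of the example policy: closing more than one change needs approval."""
    is_close = call["action"] == "servicenow.change.close"
    if is_close and len(call["payload"].get("change_ids", [])) > 1:
        return "APPROVAL_REQUIRED"
    return "ALLOW"


def simulate(policy, history: list) -> Counter:
    """Count the verdict each historical call would have received under the policy."""
    return Counter(policy(call) for call in history)


# Hypothetical slice of past agent activity.
history = [
    {"action": "servicenow.change.close", "payload": {"change_ids": ["CHG001"]}},
    {"action": "servicenow.change.close", "payload": {"change_ids": ["CHG002", "CHG003"]}},
    {"action": "servicenow.incident.read", "payload": {}},
]
report = simulate(change_close_over_one, history)
print(dict(report))
```

Nothing is enforced during the replay; the report only says how the fleet's past behavior would have been classified.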
The exact payload diffed against current state; the agent's reasoning trace; the cascade blast radius. Nothing else.
A remediation agent (e.g. Cloud Posture Remediation) can request temporary write — say, S3 write to fix a misconfigured bucket — with a 15-minute TTL. The approver sees the exact scope. The grant auto-reverts when the TTL expires. Arx grants the permission; your agent does the fix; the audit row proves both.
If no human responds inside the request's TTL, the request is automatically declined and recorded as such. Silence is not consent.
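The timeout rule can be sketched as a small pure function. The state names and the `approval_state` helper are assumptions for illustration; the 4-hour TTL matches the `timeout="4h"` in the example policy.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def approval_state(requested_at: datetime, ttl: timedelta, now: datetime,
                   decision: Optional[str] = None) -> str:
    """Resolve an approval request.

    `decision` is a human response ("APPROVED" or "DECLINED") that is assumed
    to have been recorded before the deadline, or None if nobody has responded.
    """
    if decision is not None:
        return decision
    if now >= requested_at + ttl:
        return "DECLINED_TIMEOUT"  # silence is recorded as a decline, never an approval
    return "PENDING"


requested = datetime(2026, 4, 14, 9, 0, tzinfo=timezone.utc)
ttl = timedelta(hours=4)

print(approval_state(requested, ttl, requested + timedelta(hours=1)))
print(approval_state(requested, ttl, requested + timedelta(hours=4)))
```

The key design choice is that the absence of a decision can only ever resolve to a decline; no code path turns silence into an approval.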
The intent manifest says what an agent should be doing. Drift detection compares it against what the agent is actually doing and assigns a severity. Low drift raises an alert and medium drift throttles the agent. High and critical drift do something stronger: they suspend the agent immediately. Subsequent connector calls are denied at the intercept layer with a risk score of 100, before policy evaluation even runs.
# Drift response actions, by severity. Set in policy.
severity: "low"      → action: "alert"
severity: "medium"   → action: "throttle"
severity: "high"     → action: "suspend"   # agent.status = suspended
severity: "critical" → action: "suspend"   # + drift.detected webhook fired

# A suspended agent must be manually reactivated by an admin
# after investigation. There is no auto-resume.
Most behavioral-drift systems compare against rolling history, which means the first time an agent misbehaves the system learns that misbehavior is normal. Arx compares against the declared manifest. The bar doesn't move.
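To make the difference concrete, here is a toy comparison of the two approaches. The action names, the manifest contents, and the `RollingBaseline` class are hypothetical; the sketch models only set membership, not how Arx assigns severity.

```python
# Hypothetical declared manifest: the only actions this agent should perform.
MANIFEST = frozenset({"s3.get_object", "s3.put_bucket_policy"})


def manifest_drift(observed) -> set:
    """Actions that fall outside the declared manifest. The bar never moves."""
    return set(observed) - MANIFEST


class RollingBaseline:
    """Toy history-based detector: anything seen once becomes 'normal'."""

    def __init__(self, seed):
        self.seen = set(seed)

    def drift(self, observed) -> set:
        new = set(observed) - self.seen
        self.seen |= set(observed)  # the misbehavior is absorbed into the baseline
        return new


# The agent performs the same undeclared action on two consecutive days.
day = {"s3.get_object", "s3.delete_bucket"}

rolling = RollingBaseline(MANIFEST)
rolling_day1 = rolling.drift(day)
rolling_day2 = rolling.drift(day)

manifest_day1 = manifest_drift(day)
manifest_day2 = manifest_drift(day)

print(rolling_day1, rolling_day2)
print(manifest_day1, manifest_day2)
```

The rolling detector flags the undeclared action once and then goes quiet, while the manifest comparison flags it every time it occurs.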
Suspending an agent stops the bleeding; it does not fix what caused the deviation. That is correct. Arx's job is to deny and contain. Your remediation agent's job is to fix — under a scoped grant, with full audit.
An immutable record of what each agent did, like a personnel file for machines. Every platform action is hashed into a chain. The tip is signed and published every five minutes to a witness bucket in your account that Arx can write to but cannot read or delete. Drift against the declared manifest is tracked separately; see drift detection. Integrity is something you verify, not something we promise.
from arxsec.verify import verify_chain

result = verify_chain(
    witness_bucket="s3://bank-grc/arx-witness/",
    arx_export="exports/2026-04-14.ndjson.gz",
)
assert result.tip_matches_witness   # True
assert result.no_gaps               # True
assert result.signatures_valid      # True
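For readers who want the intuition behind chain verification, the following is a generic hash-chain sketch using only the standard library. The entry layout (SHA-256 over the previous hash plus canonical JSON) is an assumption made for illustration, not Arx's export format, and signature checking is deliberately left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain: list, witnessed_tip: str) -> bool:
    """Recompute every link, then compare the final hash with the witnessed tip."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return prev == witnessed_tip


chain = []
for action in ("llm.chat", "servicenow.change.close", "agent.decommission"):
    append(chain, {"action": action})

tip = chain[-1]["hash"]          # what the witness bucket would hold
intact = verify(chain, tip)

chain[1]["record"]["action"] = "servicenow.change.delete"   # tamper with history
tampered_ok = verify(chain, tip)
print(intact, tampered_ok)
```

Because each hash covers the one before it, changing any earlier record breaks every later link, which is why an independently held tip is enough to detect tampering.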
The registry is what your CISO sees when the board asks what's running. Each agent has a named owner, a declared connector graph, a blast radius, and a live health signal. Ownership is required at registration — not retroactively reconstructed during the next incident. The manifest is also the bar that drift detection measures against.
Agents register themselves with a manifest. Anything not in the manifest fails closed. Drift shows up as a control violation, not a mystery.
Tags such as env:prod/region:us-east work as expected. Filter, group, and attest the fleet along whatever axis your org already cares about.
Every agent's connector graph renders as a single diagram. Read vs. write, gated vs. open, PII-touching vs. not — visible at a glance.
Registry state is immutable per-version. Rollback is a first-class operation with a bound control attestation that travels with it.
Agent-native attestation across Safety, Security, Reliability, Accountability, Data Privacy, Society — crosswalked to NIST AI RMF and ISO 42001, evidence emitted per-agent.
Connectors are SDK-shaped on the agent side and policy-enforced on the platform side. Secrets never leave Arx; the agent receives a short-lived handle. Rotation is a platform operation, not a deploy.
Compliance is the byproduct, not the product — the inherited SOC 2 / HIPAA / ISO posture on Aptible is the floor; Arx maps controls to what your agents actually do. Once risk is quantified, contained, and audit-traceable, the framework mapping writes itself. Static analysis reads your agent's Python source and Dockerfile, builds a connector graph, and produces a per-control mapping with evidence pointers to specific line ranges. 78 of 113 SOC 2 Type II controls pre-mapped on day one. The rest are explicitly marked as human-owned — because most of them are.
CC6.1, CC6.7, CC7.2, CC8.1 and 74 others bound to source spans with hash pinning.
GOVERN, MAP, MEASURE, MANAGE operationalized per-agent, with a workbook per release.
Annex A plus 42001 AI-management controls mapped 1:1 against deployed policies.
Risk classification per-agent; high-risk agents ship with conformity evidence attached.
Your agent doesn't quit when its LLM vendor blips — failover is part of the runtime contract, like any other dependency.
On April 17, 2026, Anthropic's automated systems suspended a legitimate enterprise org with sixty-plus accounts. Access was restored five hours later — after a Twitter thread. For a security team running agents against live production infrastructure, five hours of frontier-model downtime is not a degraded experience. It is an outage with a named owner.
Arx treats the frontier model the way it treats any other vendor: as a connector that can fail. Every LLM call flows through a provider-neutral router. Primary is whoever you choose (Anthropic by default). On a transient failure, the next provider in order takes over before the agent's code ever sees the error.
# Audit row after a failover. The agent never knew.
action_type: llm.chat
connector: llm
model_tier: frontier
provider_used: openai
failover_hops: 1
attempts: [
  { provider: "anthropic", kind: "LLMAuthError", status_code: 403 },
  { provider: "openai", ok: true, latency_ms: 712 }
]
usage: { total_tokens: 914 }
401, 403, 429, 5xx, timeout → try the next provider. 400 and content-policy rejections re-raise immediately — the secondary would reject them too, and silent re-routing is how compliance violations happen.
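The classification rule can be sketched as follows. `ProviderError`, `is_retryable`, and `call_with_failover` are hypothetical names invented for this example, and the two provider functions are stand-ins rather than real SDK calls.

```python
class ProviderError(Exception):
    """Hypothetical error type; kind is 'http', 'timeout', or 'content_policy'."""

    def __init__(self, kind="http", status_code=None):
        super().__init__(f"{kind}:{status_code}")
        self.kind = kind
        self.status_code = status_code


def is_retryable(err: ProviderError) -> bool:
    """401, 403, 429, 5xx and timeouts fail over; 400 and content-policy do not."""
    if err.kind == "timeout":
        return True
    if err.kind == "content_policy":
        return False
    code = err.status_code
    return code in (401, 403, 429) or (code is not None and 500 <= code <= 599)


def call_with_failover(providers, request):
    """Try providers in order; return (result, attempts) for the audit row."""
    attempts = []
    for name, call in providers:
        try:
            result = call(request)
        except ProviderError as err:
            attempts.append({"provider": name, "kind": err.kind,
                             "status_code": err.status_code})
            if not is_retryable(err):
                raise  # never silently re-route a rejected request
            continue
        attempts.append({"provider": name, "ok": True})
        return result, attempts
    raise RuntimeError(f"all providers failed: {attempts}")


def primary(request):
    raise ProviderError(status_code=403)  # stand-in for a suspended account


def secondary(request):
    return f"echo: {request}"


result, attempts = call_with_failover(
    [("anthropic", primary), ("openai", secondary)], "hello")
failover_hops = len(attempts) - 1
print(result, failover_hops)
```

Keeping the attempt list alongside the result is what lets the audit row show the handoff instead of hiding it.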
After five failures in sixty seconds, Arx marks a provider unhealthy for two minutes and stops sending requests to it. State lives in Redis so one worker's failures inform the entire fleet.
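A minimal in-process sketch of that breaker, using the thresholds stated above. The `CircuitBreaker` class is hypothetical and keeps its state in memory; the shared Redis state described in the text is not modeled here. Time is passed in explicitly so the behavior is easy to check.

```python
from collections import deque


class CircuitBreaker:
    """Mark a provider unhealthy after too many failures inside a sliding window."""

    def __init__(self, max_failures=5, window_s=60.0, cooldown_s=120.0):
        self.max_failures = max_failures
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self._failures = deque()
        self._unhealthy_until = 0.0

    def record_failure(self, now: float) -> None:
        self._failures.append(now)
        # Drop failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self.window_s:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self._unhealthy_until = now + self.cooldown_s
            self._failures.clear()

    def is_healthy(self, now: float) -> bool:
        return now >= self._unhealthy_until


# Five failures within sixty seconds trip the breaker for two minutes.
burst = CircuitBreaker()
for t in (0, 10, 20, 30, 40):
    burst.record_failure(t)

# Five failures spread over two minutes never trip it.
spread = CircuitBreaker()
for t in (0, 30, 60, 90, 120):
    spread.record_failure(t)

print(burst.is_healthy(41), burst.is_healthy(160), spread.is_healthy(121))
```

In a fleet, the failure timestamps and the unhealthy-until marker would need to live in a shared store so that every worker sees the same state.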
Customers with an Anthropic-first contract keep it. Customers who prefer OpenAI-first flip the order from the dashboard. Keys live encrypted in the platform vault; rotation is one click, never a redeploy.
Every call writes an audit row with provider_used, the failover hop count, and the attempt chain. Your auditor does not need to trust us. They can see the handoff.
An agent's offboarding is the same hygiene you'd expect for a departing engineer, and the same liability if it's done badly. Arx makes termination a one-click platform operation: credentials are revoked at the connector, short-lived handles invalidated immediately, and the agent's personnel record is sealed and retained per your retention policy. The audit chain captures the termination event itself, so an auditor six months later can see who decommissioned the agent, when, and why.
# Termination is a platform op. Effect propagates to every connector.
agent.decommission(
    reason="replaced by triage-cs-02",
    retention_class="sox-7yr",
)
# → credentials revoked across 7 connectors
# → in-flight handles invalidated within ~30s
# → personnel record sealed, hash-chained, witness-signed
# → retention policy applied: 7 years, immutable
Termination is distinct from rotation. Rotation issues a new credential under the same identity; termination invalidates the identity itself. Both are platform operations, both are auditable, neither requires a deploy.
The personnel record stays in the audit chain: sealed for retention, queryable for audit, immutable by anyone, including Arx. The same hash-chain integrity that protected it while live continues after termination.
If the decommissioned agent is being replaced, the replacement's record carries a parent pointer. The lineage is queryable: an auditor can walk from the live agent back through every prior version and the reasons each was retired.
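Walking that lineage amounts to following parent pointers until there are none left. The record layout and field names below (`parent`, `retired_reason`) and the agent IDs other than triage-cs-02 are hypothetical; this is a sketch of the traversal, not the Arx query API.

```python
def lineage(records: dict, agent_id: str) -> list:
    """Return (agent_id, retired_reason) pairs from the live agent back to the first version."""
    chain = []
    seen = set()
    current = agent_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parent pointers at {current}")
        seen.add(current)
        record = records[current]
        chain.append((current, record.get("retired_reason")))
        current = record.get("parent")
    return chain


# Hypothetical sealed records keyed by agent ID.
records = {
    "triage-cs-00": {"parent": None, "retired_reason": "pilot ended"},
    "triage-cs-01": {"parent": "triage-cs-00",
                     "retired_reason": "replaced by triage-cs-02"},
    "triage-cs-02": {"parent": "triage-cs-01", "retired_reason": None},  # live
}

history = lineage(records, "triage-cs-02")
for agent_id, reason in history:
    print(agent_id, reason)
```

The cycle check is defensive: parent pointers should form a simple chain, but a traversal that cannot loop forever is safer for audit tooling.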
Any time-bound remediation grants the agent held are revoked at termination, not at TTL expiry. The window between "we shut it down" and "the grant actually elapsed" is zero.
We run on Aptible's SOC 2 Type II / HIPAA-certified infrastructure. Our own controls are independently audited annually, with continuous evidence packets available to your GRC team. This is the base layer; your agents build on it, not next to it.
AIUC-1 in flight. Readiness completion targeted Q3 2026; Type II report following the observation window, Q2 2027. Pursued under the platform / infrastructure track, with a thin builder-track scope for our own LLM router and MCP server.
Annual third-party pentests; executive summary available under NDA.
Transparent subprocessor list with 30-day change notification.
Bring your own key in AWS, Azure, or GCP. Arx never holds plaintext.
US and EU regions available; deployment-scoped, not tenant-scoped.
We'll spin up a sandbox workspace, ingest one of your Python agents, and generate the evidence bundle you'd ship to review.