LLM API

The LLM API is the public interface to ARX's frontier-model HA pair. Every call goes through the platform's LLM router, which evaluates policy, writes an audit log entry, and fails over automatically when a vendor is down, rate-limits the request, or suspends the account. Callers reference logical tiers (frontier, fast, cheap) rather than vendor-specific model ids.

All endpoints are scoped to the authenticated user's organization. Provider credentials are stored encrypted (Fernet) in llm_credentials and decrypted only at call time — no credential ever leaves ARX in plaintext.

Chat Completion

POST /v1/llm/chat executes a chat completion with automatic failover between Anthropic (Claude) and OpenAI.

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| messages | Message[] | Yes | Chat history. At least one message required. |
| model_tier | "frontier" \| "fast" \| "cheap" | No | Tier → model-id resolution happens per-provider. Default: frontier. |
| max_tokens | int | No | 1 – 200 000. Default: 4096. |
| temperature | float | No | 0.0 – 2.0. Default: 0.7. |
| tools | Tool[] | No | Tool definitions. JSON-Schema input_schema; the provider adapter wraps as needed. |
| tool_choice | string | No | "auto", "any", "none", or a specific tool name. |
| stop_sequences | string[] | No | Optional stop sequences. |
| agent_id | UUID | No | The agent making the call. Drives per-agent policy evaluation. |
| request_id | string | No | Caller-supplied correlation id; written to the audit row. |
| metadata | object | No | Free-form metadata, forwarded to the policy engine as session_context. |

Message:

| Field | Type | Description |
|---|---|---|
| role | "system" \| "user" \| "assistant" \| "tool" | Standard chat role. |
| content | string \| object[] | Text or structured content blocks. |
| tool_call_id | string | Required when role == "tool". |
| name | string | Optional message name. |
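The constraints in the two tables above can be checked client-side before a request is sent. The following is a minimal sketch, not part of any ARX SDK: `build_chat_request` and its error messages are hypothetical names introduced for illustration, and the server remains the authority on validation.

```python
ROLES = {"system", "user", "assistant", "tool"}
TIERS = {"frontier", "fast", "cheap"}
OPTIONAL_FIELDS = {"tools", "tool_choice", "stop_sequences",
                   "agent_id", "request_id", "metadata"}


def build_chat_request(messages, model_tier="frontier", max_tokens=4096,
                       temperature=0.7, **optional):
    """Hypothetical helper: build a request body using the documented defaults."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for msg in messages:
        if msg.get("role") not in ROLES:
            raise ValueError(f"unknown role: {msg.get('role')!r}")
        # tool_call_id is required when role == "tool"
        if msg["role"] == "tool" and not msg.get("tool_call_id"):
            raise ValueError("tool messages require tool_call_id")
    if model_tier not in TIERS:
        raise ValueError(f"unknown model_tier: {model_tier!r}")
    if not 1 <= max_tokens <= 200_000:
        raise ValueError("max_tokens must be between 1 and 200000")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"undocumented fields: {sorted(unknown)}")

    body = {"messages": list(messages), "model_tier": model_tier,
            "max_tokens": max_tokens, "temperature": temperature}
    # Optional fields are included only when the caller supplied a value.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

Calling `build_chat_request([{"role": "user", "content": "hi"}], request_id="r-1")` yields a body with the default tier, token limit, and temperature filled in.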

Example Request

curl -X POST https://api.arxsec.io/v1/llm/chat \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a terse SOC analyst."},
      {"role": "user",   "content": "Summarize the top 3 findings from last hour."}
    ],
    "model_tier": "frontier",
    "agent_id": "7a4d…"
  }'

Example Response

{
  "content": "1. …",
  "tool_calls": [],
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 142,
    "completion_tokens": 78,
    "total_tokens": 220
  },
  "provider_used": "openai",
  "model_used": "gpt-5"
}

provider_used is the vendor that actually served the response. If it differs from the configured primary, the request was failed over. The same information is persisted on the audit row along with failover_hops and an attempts[] array.
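A caller can detect failover from the response alone. The sketch below assumes the configured primary is the first entry of the org's failover order; `was_failed_over` is a hypothetical helper name.

```python
def was_failed_over(response: dict, failover_order: list) -> bool:
    """True when a provider other than the primary served the response.

    Assumption: the primary is the first entry of the failover order.
    """
    primary = failover_order[0]
    return response["provider_used"] != primary


# The example response above, served by OpenAI while Anthropic is primary.
response = {"provider_used": "openai", "model_used": "gpt-5"}
print(was_failed_over(response, ["anthropic", "openai"]))  # True
```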

Failover Behavior

| Provider response | Action |
|---|---|
| 401 / 403 | Fail over to the next provider (covers vendor-side account suspension). |
| 429 | Fail over (the caller can't fix noisy-neighbor rate limiting). |
| 5xx, timeout, connection error | Retry 3× with exponential backoff within the provider, then fail over. |
| 400 invalid_request | Re-raise as 400 to the caller; do not fail over (deterministic). |
| Content-policy refusal | Re-raise as 400; the secondary would refuse the same prompt. |
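The table above can be read as a small routing loop. The following is an illustrative sketch, not the router's actual code: `ProviderError`, `route`, the `kind` strings, and the 0.5 s base delay are all hypothetical, and "3×" is interpreted here as three retries after the first attempt.

```python
import time


class ProviderError(Exception):
    """Hypothetical error raised by a provider adapter."""
    def __init__(self, status=None, kind=""):
        super().__init__(f"{status} {kind}".strip())
        self.status = status  # HTTP status, or None for timeout/connection errors
        self.kind = kind      # e.g. "invalid_request", "content_policy", "timeout"


class CallerError(Exception):
    """Surfaced to the caller as HTTP 400."""


class AllProvidersExhausted(Exception):
    """Surfaced to the caller as HTTP 503."""


def classify(err):
    """Map a provider error to the action listed in the table."""
    if err.kind == "content_policy" or err.status == 400:
        return "raise"
    if err.status in (401, 403, 429):
        return "failover"
    if err.status is None or 500 <= err.status < 600:
        return "retry"
    raise ValueError(f"status {err.status} is not covered by the table")


def route(order, call, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Try each provider in order; `call(provider)` performs one attempt."""
    attempts = []
    for hops, provider in enumerate(order):
        retries = 0
        while True:
            try:
                result = call(provider)
                return {"result": result, "provider_used": provider,
                        "failover_hops": hops, "attempts": attempts}
            except ProviderError as err:
                attempts.append((provider, str(err)))
                action = classify(err)
                if action == "raise":
                    raise CallerError(str(err)) from err
                if action == "retry" and retries < max_retries:
                    sleep(base_delay * 2 ** retries)  # exponential backoff
                    retries += 1
                    continue
                break  # retries exhausted or immediate failover: next provider
    raise AllProvidersExhausted(attempts)
```

Injecting `sleep` keeps the sketch testable without real delays. A production router would also consult the circuit breaker before each attempt, which this sketch omits.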

Status Codes

| Code | Meaning |
|---|---|
| 200 | Completion returned. |
| 400 | Deterministic caller error or malformed tool call. |
| 403 | Policy engine denied the call (DENY verdict). |
| 502 | Unexpected LLM error. |
| 503 | All providers exhausted (all in the failover order failed or are circuit-broken). |
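On the client side, the status codes above map naturally onto typed exceptions. This is one possible arrangement, not an official client: the exception names and the `detail` field of the error body are assumptions made for illustration.

```python
class LLMAPIError(Exception):
    def __init__(self, status, message=""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


class BadRequest(LLMAPIError):          # 400: deterministic caller error
    pass


class PolicyDenied(LLMAPIError):        # 403: policy engine returned DENY
    pass


class UnexpectedLLMError(LLMAPIError):  # 502
    pass


class ProvidersExhausted(LLMAPIError):  # 503: every provider failed or is circuit-broken
    pass


_ERRORS = {400: BadRequest, 403: PolicyDenied,
           502: UnexpectedLLMError, 503: ProvidersExhausted}


def check_status(status: int, body: dict) -> dict:
    """Return the body on 200; otherwise raise the matching exception."""
    if status == 200:
        return body
    exc_type = _ERRORS.get(status, LLMAPIError)
    # "detail" is an assumed field name for the error message.
    raise exc_type(status, str(body.get("detail", "")))
```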

Credentials Management

Provider API keys are stored encrypted, one set per organization. The plaintext key is never returned after creation; responses carry only a masked preview.

List Credentials

Returns all credentials scoped to the caller's org, each with a masked key preview (sk-ant-…abcd).
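A masked preview in the shape shown above could be produced as follows. This is a sketch under stated assumptions: the number of leading and trailing characters kept (7 and 4) is inferred from the `sk-ant-…abcd` example, and `mask_key` is a hypothetical name.

```python
def mask_key(api_key: str, keep_prefix: int = 7, keep_suffix: int = 4) -> str:
    """Return a preview that keeps only the ends of the key.

    Assumption: 7 leading and 4 trailing characters are kept, matching
    the 'sk-ant-…abcd' example. Keys too short to mask safely are hidden
    entirely.
    """
    if len(api_key) <= keep_prefix + keep_suffix:
        return "…"
    return f"{api_key[:keep_prefix]}…{api_key[-keep_suffix:]}"


print(mask_key("sk-ant-api03-XXXXXXXXXXXXabcd"))  # sk-ant-…abcd
```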

Create Credential

{
  "provider": "anthropic",
  "api_key":  "sk-ant-…",
  "label":    "production"
}

The response echoes the stored credential with the key already masked. The plaintext key is encrypted with the platform Fernet key before it reaches storage.

Revoke Credential

Soft-deletes the credential (sets active = false). In-flight requests that have already loaded the key run to completion; subsequent lookups skip the row.
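The soft-delete semantics can be illustrated with an in-memory stand-in for the credentials table. Everything here (`Credential`, `revoke`, `lookup`, the dict-based store) is hypothetical; the real storage is the llm_credentials table.

```python
from dataclasses import dataclass


@dataclass
class Credential:
    id: str
    provider: str
    active: bool = True


def revoke(store: dict, cred_id: str) -> None:
    """Soft delete: the row stays, only the flag changes."""
    store[cred_id].active = False


def lookup(store: dict, provider: str):
    """Return the first active credential for a provider, or None."""
    for cred in store.values():
        if cred.provider == provider and cred.active:
            return cred
    return None


store = {"c1": Credential("c1", "anthropic")}
in_flight = lookup(store, "anthropic")   # a request loads the key
revoke(store, "c1")
print(in_flight.id)                      # the in-flight request still holds "c1"
print(lookup(store, "anthropic"))        # None: new lookups skip the revoked row
```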

Failover Order

Each organization may override the platform-default failover order.

Get Order

{
  "order":   ["anthropic", "openai"],
  "source":  "org",
  "default": ["anthropic", "openai"]
}

source is "org" when the org has set an explicit override, "default" when falling back to the platform configuration.
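The resolution rule for `source` can be sketched as a single function. `resolve_order` is a hypothetical name, and treating a missing or empty override as "no override" is an assumption.

```python
def resolve_order(org_override, platform_default):
    """Build the Get Order response from an optional org override."""
    default = list(platform_default)
    if org_override:  # assumption: None or an empty list means no override
        return {"order": list(org_override), "source": "org", "default": default}
    return {"order": default, "source": "default", "default": default}


print(resolve_order(["openai", "anthropic"], ["anthropic", "openai"]))
print(resolve_order(None, ["anthropic", "openai"]))
```

An explicit override that happens to equal the platform default still reports `"source": "org"`, which matches the example response above.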

Set Order

{ "order": ["openai", "anthropic"] }

Provider Health

Exposes recent probe results and failover events. Probes run every 60 s via Celery beat against a cheap model; failures feed the shared Redis circuit breaker.
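To make the probe-to-breaker relationship concrete, here is a minimal in-memory circuit breaker. It is a sketch only: the platform's breaker lives in Redis and its thresholds are not documented here, so the failure threshold (3 consecutive failures) and cooldown (120 s) below are invented for illustration.

```python
import time


class CircuitBreaker:
    """In-memory stand-in for the shared breaker; parameters are assumptions."""

    def __init__(self, failure_threshold=3, cooldown_s=120.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._failures = {}   # provider -> consecutive failed probes
        self._opened_at = {}  # provider -> time the circuit last opened

    def record(self, provider, ok):
        """Feed one probe result into the breaker."""
        if ok:
            self._failures[provider] = 0
            self._opened_at.pop(provider, None)
            return
        self._failures[provider] = self._failures.get(provider, 0) + 1
        if self._failures[provider] >= self.failure_threshold:
            # Each further failure restarts the cooldown.
            self._opened_at[provider] = self._clock()

    def is_open(self, provider):
        """True while the provider should be skipped by the router."""
        opened = self._opened_at.get(provider)
        if opened is None:
            return False
        if self._clock() - opened >= self.cooldown_s:
            # Cooldown elapsed: let traffic through again.
            del self._opened_at[provider]
            self._failures[provider] = 0
            return False
        return True
```

With a 60 s probe interval and these assumed parameters, a provider would be skipped after its third consecutive failed probe.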

Response

| Field | Type | Description |
|---|---|---|
| latest | ProviderHealthSnapshot[] | Newest snapshot per provider. |
| history | ProviderHealthSnapshot[] | All snapshots in the window (max 500). |
| recent_failovers | FailoverEvent[] | Audit rows from the caller's org where failover_hops > 0. |

ProviderHealthSnapshot fields: provider, status (healthy | degraded | unhealthy), latency_ms, error, probe_kind, checked_at.

FailoverEvent fields: created_at, provider_used, failover_hops, primary_error, duration_ms.
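The relationship between `history` and `latest` can be shown by deriving one from the other. This sketch assumes `checked_at` is an ISO 8601 UTC timestamp string in a uniform format, so that string comparison orders snapshots chronologically; `latest_per_provider` is a hypothetical helper.

```python
def latest_per_provider(history):
    """Pick the newest snapshot for each provider.

    Assumption: checked_at values are ISO 8601 strings in one uniform
    format, so lexicographic order equals chronological order.
    """
    latest = {}
    for snap in history:
        current = latest.get(snap["provider"])
        if current is None or snap["checked_at"] > current["checked_at"]:
            latest[snap["provider"]] = snap
    # Sort by provider name for a stable result.
    return sorted(latest.values(), key=lambda s: s["provider"])


history = [
    {"provider": "openai", "status": "healthy", "checked_at": "2025-01-01T00:00:00Z"},
    {"provider": "anthropic", "status": "degraded", "checked_at": "2025-01-01T00:01:00Z"},
    {"provider": "openai", "status": "unhealthy", "checked_at": "2025-01-01T00:01:00Z"},
]
for snap in latest_per_provider(history):
    print(snap["provider"], snap["status"])
```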