LLM API

The LLM API is the public interface to ARX's frontier-model HA pair. Every call is routed through the platform's LLM router, which evaluates policy, writes an audit log entry, and fails over automatically when a vendor is down, rate-limited, or has suspended the account. Callers reference logical tiers (frontier, fast, cheap) rather than vendor-specific model IDs.

All endpoints are scoped to the authenticated user's organization. Provider credentials are stored encrypted (Fernet) in llm_credentials and decrypted only at call time — no credential ever leaves ARX in plaintext.

Chat Completion

Executes a chat completion with automatic Anthropic ↔ OpenAI failover.

Method: POST
Path: /v1/llm/chat
Required Role: Any authenticated user

Request Body

Field	Type	Required	Description
`messages`	`Message[]`	Yes	Chat history. At least one message required.
`model_tier`	`"frontier" | "fast" | "cheap"`	No	Logical tier; each provider resolves it to its own model ID. Default: `frontier`.
`max_tokens`	`int`	No	1 – 200 000. Default: 4096.
`temperature`	`float`	No	0.0 – 2.0. Default: 0.7.
`tools`	`Tool[]`	No	Tool definitions, each with a JSON Schema `input_schema`; the provider adapter wraps them as needed.
`tool_choice`	`string`	No	`"auto"`, `"any"`, `"none"`, or a specific tool name.
`stop_sequences`	`string[]`	No	Optional stop sequences.
`agent_id`	`UUID`	No	The agent making the call. Drives per-agent policy evaluation.
`request_id`	`string`	No	Caller-supplied correlation id; written to the audit row.
`metadata`	`object`	No	Free-form metadata, forwarded to the policy engine as `session_context`.

Message:

Field	Type	Description
`role`	`"system" | "user" | "assistant" | "tool"`	Standard chat role.
`content`	`string | object[]`	Text or structured content blocks.
`tool_call_id`	`string`	Required when `role == "tool"`.
`name`	`string`	Optional message name.
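
The field constraints above can be checked client-side before a request is sent. The following is a minimal sketch, not ARX's own validation: the helper name `build_chat_request` is hypothetical, and the checks mirror only the limits listed in the two tables.

```python
from typing import Any, Dict, List

VALID_TIERS = {"frontier", "fast", "cheap"}
VALID_ROLES = {"system", "user", "assistant", "tool"}


def build_chat_request(
    messages: List[Dict[str, Any]],
    model_tier: str = "frontier",
    max_tokens: int = 4096,
    temperature: float = 0.7,
    **optional: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a /v1/llm/chat body and enforce the documented limits."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for msg in messages:
        role = msg.get("role")
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        # tool_call_id is required only for tool-result messages.
        if role == "tool" and not msg.get("tool_call_id"):
            raise ValueError("messages with role 'tool' require tool_call_id")
    if model_tier not in VALID_TIERS:
        raise ValueError(f"unknown model_tier: {model_tier!r}")
    if not 1 <= max_tokens <= 200_000:
        raise ValueError("max_tokens must be between 1 and 200000")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    body: Dict[str, Any] = {
        "messages": messages,
        "model_tier": model_tier,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    # Optional fields such as agent_id, request_id, metadata, or tools pass through unchanged.
    body.update(optional)
    return body
```

Failing fast on the client avoids a round trip that would end in a deterministic `400`.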

Example Request

curl -X POST https://api.arxsec.io/v1/llm/chat \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a terse SOC analyst."},
      {"role": "user",   "content": "Summarize the top 3 findings from last hour."}
    ],
    "model_tier": "frontier",
    "agent_id": "7a4d…"
  }'

Example Response

{
  "content": "1. …",
  "tool_calls": [],
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 142,
    "completion_tokens": 78,
    "total_tokens": 220
  },
  "provider_used": "openai",
  "model_used": "gpt-5"
}

provider_used is the vendor that actually served the response. If it differs from the configured primary, the request was failed over. The same information is persisted on the audit row along with failover_hops and an attempts[] array.
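
A caller can detect failover by comparing `provider_used` with the primary provider. The sketch below assumes the primary is the first entry of the `order` list returned by the provider-order endpoint; the function name `was_failed_over` is hypothetical.

```python
from typing import Any, Dict, List


def was_failed_over(response: Dict[str, Any], provider_order: List[str]) -> bool:
    """Return True when the serving vendor is not the first provider in the order.

    Assumption: the configured primary is provider_order[0].
    """
    if not provider_order:
        raise ValueError("provider_order must not be empty")
    return response.get("provider_used") != provider_order[0]


# With the example response above and an order of ["anthropic", "openai"],
# the call was served by the secondary.
example = {"provider_used": "openai", "model_used": "gpt-5"}
failed_over = was_failed_over(example, ["anthropic", "openai"])
```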

Failover Behavior

Provider response	Action
`401` / `403`	Fail over to the next provider (covers vendor-side account suspension).
`429`	Fail over (the caller can't fix noisy-neighbor rate limiting).
`5xx`, timeout, connection error	Retry 3× with exponential backoff within the provider, then fail over.
`400 invalid_request`	Re-raise as `400` to the caller; do not fail over (the error is deterministic).
Content-policy refusal	Re-raise as `400`; the secondary would refuse the same prompt.
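
The table above can be read as a small decision function. The following sketch is illustrative only and is not the router's actual code: the names `Action`, `classify`, and `backoff_delays` are hypothetical, and the 0.5 s base delay is an assumption, since the source states only "3× with exponential backoff".

```python
import enum
from typing import List, Optional


class Action(enum.Enum):
    FAILOVER = "failover"
    RETRY_THEN_FAILOVER = "retry_then_failover"
    RAISE_400 = "raise_400"


def classify(status: Optional[int], content_refusal: bool = False) -> Action:
    """Map a provider outcome to an action. status=None means timeout or connection error."""
    if content_refusal:
        return Action.RAISE_400            # the secondary would refuse as well
    if status is None or 500 <= status <= 599:
        return Action.RETRY_THEN_FAILOVER  # transient: retry within the provider first
    if status in (401, 403, 429):
        return Action.FAILOVER             # the caller cannot fix these
    if status == 400:
        return Action.RAISE_400            # deterministic: failing over would not help
    raise ValueError(f"status {status} is not covered by the failover table")


def backoff_delays(retries: int = 3, base_seconds: float = 0.5) -> List[float]:
    """Delays before each retry, doubling each time (base delay is an assumption)."""
    return [base_seconds * (2 ** attempt) for attempt in range(retries)]
```

Keeping classification separate from the retry schedule makes each piece easy to test on its own.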

Status Codes

Code	Meaning
`200`	Completion returned.
`400`	Deterministic caller error, malformed tool call, or content-policy refusal.
`403`	Policy engine denied the call (`DENY` verdict).
`502`	Unexpected LLM error.
`503`	All providers exhausted: every provider in the failover order failed or is circuit-broken.
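
On the caller side, these codes can be turned into a typed exception. This is a sketch under stated assumptions: `LLMApiError` and `check_status` are hypothetical names, and treating only `503` as worth retrying later is a guess at sensible client behavior, not something the API specifies.

```python
from typing import Dict, Tuple


class LLMApiError(Exception):
    """Hypothetical client-side error carrying the HTTP status and a retry hint."""

    def __init__(self, status: int, reason: str, retry_later: bool) -> None:
        super().__init__(f"{status}: {reason}")
        self.status = status
        self.retry_later = retry_later


# (reason, retry_later) per documented code; the retry hint is an assumption.
_ERRORS: Dict[int, Tuple[str, bool]] = {
    400: ("deterministic caller error or malformed tool call", False),
    403: ("policy engine denied the call", False),
    502: ("unexpected LLM error", False),
    503: ("all providers exhausted", True),
}


def check_status(status: int) -> None:
    """Return silently on 200; raise LLMApiError for anything else."""
    if status == 200:
        return
    reason, retry_later = _ERRORS.get(status, ("undocumented status", False))
    raise LLMApiError(status, reason, retry_later)
```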

Credentials Management

Provider API keys are stored encrypted, per organization. Plaintext keys are never returned after creation; responses include only a masked preview.

List Credentials

Method: GET
Path: /v1/llm/credentials

Returns all credentials scoped to the caller's org, each with a masked key preview (sk-ant-…abcd).
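
A masked preview in the style of `sk-ant-…abcd` can be produced by keeping a short prefix and the last four characters. The sketch below is an assumption about the masking rule, inferred only from that one example; `mask_key` and its default lengths are hypothetical.

```python
def mask_key(api_key: str, prefix_len: int = 7, suffix_len: int = 4) -> str:
    """Keep a short prefix and suffix, replacing the middle with an ellipsis."""
    if len(api_key) <= prefix_len + suffix_len:
        # Too short to show both ends without revealing the whole key.
        return "…"
    return f"{api_key[:prefix_len]}…{api_key[-suffix_len:]}"


preview = mask_key("sk-ant-0123456789abcd")
```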

Create Credential

Method: POST
Path: /v1/llm/credentials

{
  "provider": "anthropic",
  "api_key":  "sk-ant-…",
  "label":    "production"
}

Response echoes the stored credential with the key already masked. The plaintext key is encrypted with the platform Fernet key before reaching storage.

Revoke Credential

Method: DELETE
Path: /v1/llm/credentials/{credential_id}

Soft-deletes the credential (sets active = false). In-flight requests that have already loaded the key continue to work; subsequent lookups skip the row.
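
The soft-delete semantics can be illustrated with an in-memory model. This is only a sketch: the `Credential` dataclass and the `revoke` and `lookup_active` functions are hypothetical stand-ins for the `llm_credentials` table and its queries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Credential:
    id: str
    provider: str
    active: bool = True


def revoke(creds: List[Credential], credential_id: str) -> bool:
    """Soft-delete: flip active to False and keep the row. Returns True if found."""
    for cred in creds:
        if cred.id == credential_id:
            cred.active = False
            return True
    return False


def lookup_active(creds: List[Credential], provider: str) -> Optional[Credential]:
    """Lookups skip revoked rows, so a revoked key is never loaded again."""
    for cred in creds:
        if cred.provider == provider and cred.active:
            return cred
    return None
```

Because the row is kept rather than removed, audit rows that reference the credential still resolve after revocation.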

Failover Order

Each organization may override the platform-default failover order.

Get Order

Method: GET
Path: /v1/llm/provider-order

{
  "order":   ["anthropic", "openai"],
  "source":  "org",
  "default": ["anthropic", "openai"]
}

source is "org" when the org has set an explicit override, "default" when falling back to the platform configuration.
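
The override-or-default rule can be sketched as follows. The function name `resolve_provider_order` is hypothetical, and treating an empty or missing override as "no override" is an assumption.

```python
from typing import Any, Dict, List, Optional


def resolve_provider_order(
    org_override: Optional[List[str]],
    platform_default: List[str],
) -> Dict[str, Any]:
    """Build a response in the documented shape from an optional org override."""
    if org_override:
        order, source = list(org_override), "org"
    else:
        # No explicit override: fall back to the platform configuration.
        order, source = list(platform_default), "default"
    return {"order": order, "source": source, "default": list(platform_default)}
```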

Set Order

Method: PUT
Path: /v1/llm/provider-order

{ "order": ["openai", "anthropic"] }

Provider Health

Exposes recent probe results and failover events. Probes run every 60 s via Celery beat against a cheap model; failures feed the shared Redis circuit breaker.

Method: GET
Path: /v1/llm/health?hours=24

Response

Field	Type	Description
`latest`	`ProviderHealthSnapshot[]`	Newest snapshot per provider.
`history`	`ProviderHealthSnapshot[]`	All snapshots in the window (max 500).
`recent_failovers`	`FailoverEvent[]`	Audit rows from the caller's org where `failover_hops > 0`.

ProviderHealthSnapshot fields: provider, status (healthy | degraded | unhealthy), latency_ms, error, probe_kind, checked_at.

FailoverEvent fields: created_at, provider_used, failover_hops, primary_error, duration_ms.
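
A consumer of this endpoint might summarize which providers currently need attention. The sketch below works on an already-parsed response body; the function name `providers_needing_attention` is hypothetical, and the sample values are invented for illustration.

```python
from typing import Any, Dict, List


def providers_needing_attention(health: Dict[str, Any]) -> List[str]:
    """Return providers whose newest snapshot is degraded or unhealthy, sorted by name."""
    return sorted(
        snap["provider"]
        for snap in health.get("latest", [])
        if snap.get("status") != "healthy"
    )


# Illustrative payload using the documented field names (values are made up).
sample = {
    "latest": [
        {"provider": "openai", "status": "degraded", "latency_ms": 2100},
        {"provider": "anthropic", "status": "healthy", "latency_ms": 340},
    ],
    "history": [],
    "recent_failovers": [],
}
flagged = providers_needing_attention(sample)
```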

Related

Audit API: per-call rows with provider_used, failover_hops, and the attempts chain.
Platform overview (Model Continuity): narrative description of the HA pair.