How requests work
Valar speaks the OpenAI Responses API at /v1/responses, reachable under the base URL https://api.valarhq.ai/v1. Any OpenAI-compatible client works once you point it at that base URL and give it a Valar key.
Because agent work is rarely interactive, the API leans on a background mode. Send background: true and the request returns a response id straight away instead of blocking; you then poll that id until the job reports completed. Pair that with completion windows to trade latency for price, and you can keep large batches in flight without holding a connection open per request.
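The submit-then-poll flow described above can be sketched with nothing but the standard library. This is a minimal illustration, not Valar's SDK: the helper names (build_request, submit_background, wait_until_done) are hypothetical, the GET /responses/{id} retrieval path and the model, input, and status fields are assumed to follow the OpenAI Responses API shape, and every terminal status other than completed is an assumption.

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://api.valarhq.ai/v1"

# Assumption: statuses mirror the OpenAI Responses API. Only "completed"
# is named in this page; the other terminal values are assumed.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}


def build_request(method, path, payload=None, api_key=None):
    """Build (but do not send) an authenticated request to the API."""
    key = api_key or os.environ["VALAR_API_KEY"]
    headers = {"Authorization": f"Bearer {key}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a request and decode the JSON body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


def submit_background(model, prompt, api_key=None, transport=send):
    """Start a background job and return its response id without blocking."""
    payload = {"model": model, "input": prompt, "background": True}
    body = transport(build_request("POST", "/responses", payload, api_key))
    return body["id"]


def wait_until_done(response_id, api_key=None, transport=send,
                    interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll the response id until it reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        request = build_request(
            "GET", f"/responses/{response_id}", api_key=api_key
        )
        body = transport(request)
        if body["status"] in TERMINAL_STATUSES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{response_id} still {body['status']}")
        sleep(interval)
```

A typical call would be submit_background("your-model-id", "Summarise this log") followed by wait_until_done on the returned id, where the model id is a placeholder. The transport and sleep parameters exist only so the loop can be exercised without network access; a fixed polling interval is the simplest choice, and a webhook avoids polling altogether.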
Authenticate with Authorization: Bearer <key>. Generate keys from the dashboard at app.valarhq.ai, and store them in the VALAR_API_KEY environment variable so the SDK and OpenAI clients pick them up automatically.
Choose your execution mode
How soon you need each result decides how you call Valar. Inference modes covers this in depth; in short:

| Mode | How you call it | Latency | Cost | Best for |
|---|---|---|---|---|
| Realtime | Synchronous request on the Now window | Seconds | Highest | Interactive chat, prototyping |
| Async | background=True on the Standard window, then poll or wait on a webhook | Minutes | Lower | Agent loops, background jobs |
| Batch | The Batches API on the Standard window | Up to hours | Lowest | Large datasets, evals, offline jobs |
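One way to read the table is as a decision rule keyed on how long you can wait for a result. The sketch below encodes the rows as data and picks the cheapest mode that fits a latency budget. The cutoffs of 60 seconds and one hour are illustrative assumptions standing in for "Seconds" and "Minutes", not documented limits, and choose_mode is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    how_to_call: str
    window: str
    latency: str
    cost: str


# Rows copied from the table above, ordered from most to least expensive.
REALTIME = Mode("Realtime", "synchronous request", "Now", "seconds", "highest")
ASYNC = Mode("Async", "background request, then poll or webhook",
             "Standard", "minutes", "lower")
BATCH = Mode("Batch", "Batches API", "Standard", "up to hours", "lowest")

# Assumed cutoffs for illustration only; tune them to your workload.
REALTIME_MAX_WAIT_S = 60
ASYNC_MAX_WAIT_S = 60 * 60


def choose_mode(max_wait_seconds):
    """Return the cheapest mode whose typical latency fits the budget."""
    if max_wait_seconds < REALTIME_MAX_WAIT_S:
        return REALTIME  # result needed within seconds
    if max_wait_seconds < ASYNC_MAX_WAIT_S:
        return ASYNC     # minutes are acceptable
    return BATCH         # hours are acceptable, so take the lowest price


for budget in (5, 600, 86_400):
    mode = choose_mode(budget)
    print(f"{budget:>6}s budget -> {mode.name} ({mode.how_to_call})")
```

The rule deliberately ignores volume; a large dataset may still justify Batch even when each individual result could tolerate only minutes.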
Where to go next
Quickstart
Make your first API request with Valar