/v1/responses, so any OpenAI-compatible client works after you change two settings: the base URL and the key.
This walkthrough runs one realistic task end to end: classifying an inbound support ticket and drafting a reply. You send it as a background job, then retrieve the result once Valar finishes. The same pattern scales from this single call to the thousands of concurrent requests an agent fans out at runtime.
Create an API key
Sign in at app.valarhq.ai and create a key from the dashboard.
Point at Valar
Install the OpenAI SDK and point it at Valar. Set the base URL to https://api.valarhq.ai/v1 and pass your key as a bearer token. Nothing else about the OpenAI client changes.
Dispatch the task in the background
Setting background returns a response id immediately rather than holding the connection open. For one ticket this is convenient; across a queue of them it is what lets the work run concurrently. Use a model from the Models page (here, zai-org/GLM-5.1-FP8).
Retrieve the request
The create call hands back a response id and a status of queued or in_progress. Retrieve that id until it reaches completed, then read output_text. In production you would replace this poll loop with a webhook so you aren’t holding a thread per job.
Going further
A single triaged ticket is the unit; an agent is many of them in a loop. From here:
- See Models for the full list of supported models, and Pricing for per-token rates.
- Turn this into a tool-using agent that looks up the customer’s billing record before replying; see Building a tool-calling agent.
- Run the same task over a backlog of tickets at once with Requests at scale, choosing a completion window per the latency you can tolerate.
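To tie the steps of this walkthrough together, here is a minimal end-to-end sketch in Python. It assumes the OpenAI Python SDK's Responses interface (responses.create with background=True, and responses.retrieve) works unchanged against Valar, as the walkthrough describes. The VALAR_API_KEY environment variable, the ticket text, the prompt wording, and the helper names (make_client, dispatch, wait_for_result, main) are illustrative assumptions rather than anything Valar defines. Any status other than queued, in_progress, or completed is treated as an error here, which is also an assumption.

```python
import os
import time

BASE_URL = "https://api.valarhq.ai/v1"  # base URL from the walkthrough
MODEL = "zai-org/GLM-5.1-FP8"           # model named in the walkthrough

# Hypothetical ticket text, for illustration only.
TICKET = "I was charged twice for my subscription this month. Please refund one charge."


def make_client():
    """Build an OpenAI SDK client that talks to Valar instead of OpenAI."""
    from openai import OpenAI  # pip install openai

    # VALAR_API_KEY is an assumed environment variable name.
    return OpenAI(base_url=BASE_URL, api_key=os.environ["VALAR_API_KEY"])


def dispatch(client, ticket):
    """Submit the triage task as a background job and return right away."""
    return client.responses.create(
        model=MODEL,
        input=(
            "Classify this support ticket as billing, technical, or other, "
            "then draft a short reply.\n\n" + ticket
        ),
        background=True,  # hand back an id instead of holding the connection
    )


def wait_for_result(client, response_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll the job until it completes, then return its output text."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = client.responses.retrieve(response_id)
        if response.status == "completed":
            return response.output_text
        if response.status not in ("queued", "in_progress"):
            # Assumption: anything else (e.g. a failure status) is terminal.
            raise RuntimeError(f"job ended with status {response.status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {response_id} is still {response.status}")
        time.sleep(poll_seconds)


def main():
    client = make_client()
    job = dispatch(client, TICKET)
    print(job.id, job.status)  # the id plus a status of queued or in_progress
    print(wait_for_result(client, job.id))
```

Calling main() with VALAR_API_KEY set runs the full flow for one ticket. Because dispatch and wait_for_result take the client as an argument, you can submit many tickets first and collect their results afterwards, or swap the polling helper for a webhook handler later.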