POST /responses
curl --request POST \
  --url https://api.valarhq.ai/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "zai-org/GLM-5.1-FP8",
  "input": "Explain the key ideas behind transformer architectures."
}
'
{
  "id": "<string>",
  "created_at": 123,
  "model": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "total_tokens": 123,
    "prompt_tokens": 123,
    "completion_tokens": 123
  },
  "metadata": {},
  "input": "<string>",
  "output": "<string>",
  "error": {},
  "incomplete_details": {},
  "max_output_tokens": 123,
  "reasoning": {},
  "text": {
    "format": {}
  },
  "store": true,
  "temperature": 123,
  "top_p": 123,
  "parallel_tool_calls": true,
  "tool_choice": "<string>",
  "tools": [
    {}
  ],
  "truncation": "<string>",
  "user": "<string>"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Idempotency-Key
string

Lets you retry safely. Valar records a reservation under the combination of organization, API key, and Idempotency-Key, so sending the same value again hands back the stored response rather than running inference a second time. Values can be up to 255 characters. See Idempotent Requests for the complete rules.

Maximum string length: 255
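
As a minimal sketch of the retry pattern described above, the Python below builds the same request twice with one Idempotency-Key. It only constructs the request and does not send it. The helper name `build_request` and the use of a UUID as the key are assumptions for illustration; the URL, headers, and body come from the curl example.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.valarhq.ai/v1/responses"  # taken from the curl example
MAX_KEY_LENGTH = 255  # documented maximum for Idempotency-Key


def build_request(token: str, body: dict, idempotency_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries an Idempotency-Key.

    Hypothetical helper, not part of any Valar SDK.
    """
    if len(idempotency_key) > MAX_KEY_LENGTH:
        raise ValueError("Idempotency-Key must be at most 255 characters")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": idempotency_key,
        },
        method="POST",
    )


# Generate the key once per logical operation and reuse it on every retry.
# A UUID is an assumption here; any unique string within the limit would do.
key = str(uuid.uuid4())
body = {
    "model": "zai-org/GLM-5.1-FP8",
    "input": "Explain the key ideas behind transformer architectures.",
}
first_attempt = build_request("<token>", body, key)
retry = build_request("<token>", body, key)
```

Passing either object to `urllib.request.urlopen` would send it. Because both carry the same key, the second call should return the stored response instead of running inference again, per the description above.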

Body

application/json
model
string
required
input
required

Accepts text and, when you target a multimodal model, image input (input_image). Valar does not yet accept audio, files, or item references.

Minimum string length: 1
max_output_tokens
integer | null
Required range: x >= 1
temperature
number | null
Required range: 0 <= x <= 2
top_p
number | null
Required range: 0 <= x <= 1
text
object
reasoning
object
background
boolean
prompt_cache_key
string

An optional hint Valar uses to keep prompt-prefix caches local. Requests sharing a key are steered toward the same place so you hit the cache more often.

store
enum<boolean>

Must be true; no other value is accepted.

Available options:
true
truncation
enum<string>
Available options:
disabled
stream
enum<boolean>

Streaming is not available on this endpoint yet.

Available options:
false
user
string
Maximum string length: 256
metadata
object

Optional string-valued metadata. Use completion_window to influence scheduling, and completion_webhook together with webhook_token to wire up a webhook that fires on completion.
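
The constraints listed for the request body can be checked on the client before a request is sent. The following is a rough sketch, not Valar's own validation: the function name `validate_body` and the wording of its messages are hypothetical, and it covers only the limits stated in this section.

```python
def validate_body(body: dict) -> list[str]:
    """Return the problems found; an empty list means these checks passed.

    Hypothetical client-side helper mirroring the documented limits only.
    """
    problems = []
    if not isinstance(body.get("model"), str):
        problems.append("model: required string")
    if "input" not in body:
        problems.append("input: required")
    elif isinstance(body["input"], str) and len(body["input"]) < 1:
        problems.append("input: string must have at least 1 character")
    max_tokens = body.get("max_output_tokens")
    if max_tokens is not None and max_tokens < 1:
        problems.append("max_output_tokens: must be >= 1")
    temperature = body.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature: must be between 0 and 2")
    top_p = body.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        problems.append("top_p: must be between 0 and 1")
    if body.get("store", True) is not True:
        problems.append("store: only true is accepted")
    if body.get("stream", False) is not False:
        problems.append("stream: only false is accepted")
    if body.get("truncation", "disabled") != "disabled":
        problems.append("truncation: only 'disabled' is accepted")
    if len(body.get("user", "")) > 256:
        problems.append("user: at most 256 characters")
    metadata = body.get("metadata", {})
    if any(not isinstance(value, str) for value in metadata.values()):
        problems.append("metadata: every value must be a string")
    return problems


body = {
    "model": "zai-org/GLM-5.1-FP8",
    "input": "Explain the key ideas behind transformer architectures.",
    "temperature": 0.2,
}
problems = validate_body(body)
```

Returning a list instead of raising on the first failure lets a caller report every violation at once. The server remains the authority on what is accepted.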

Response

The response has finished and is returned immediately.

id
string
required
object
enum<string>
required
Available options:
response
created_at
integer
required
status
enum<string>
required
Available options:
pending, running, failed, completed, cancelled
model
string
required
usage
object
required
metadata
object
required
input

Accepts text and, when you target a multimodal model, image input (input_image). Valar does not yet accept audio, files, or item references.

Minimum string length: 1
output
error
object
incomplete_details
object
max_output_tokens
integer | null
reasoning
object
text
object
store
boolean
temperature
number
top_p
number
parallel_tool_calls
boolean
tool_choice
tools
object[]
truncation
user
string | null
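
To illustrate how a client might read the required response fields, here is a small sketch that parses a response body and summarizes its status and token usage. The function name `summarize` is hypothetical, the sample values are invented for illustration, and treating failed, completed, and cancelled as terminal is an assumption inferred from the status names, not something this page states.

```python
import json

KNOWN_STATUSES = {"pending", "running", "failed", "completed", "cancelled"}
# Assumption: these three statuses mean no further work will happen.
TERMINAL_STATUSES = {"failed", "completed", "cancelled"}


def summarize(raw: str) -> dict:
    """Extract id, status, and total token count from a response body."""
    response = json.loads(raw)
    status = response["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return {
        "id": response["id"],
        "status": status,
        "finished": status in TERMINAL_STATUSES,
        "total_tokens": response["usage"].get("total_tokens"),
    }


# Illustrative payload only; the values are made up, not real API output.
sample = json.dumps({
    "id": "resp_example",
    "object": "response",
    "created_at": 0,
    "status": "completed",
    "model": "zai-org/GLM-5.1-FP8",
    "usage": {"input_tokens": 10, "output_tokens": 20, "total_tokens": 30},
    "metadata": {},
})
summary = summarize(sample)
```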