Create a response

curl --request POST \
  --url https://api.valarhq.ai/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "zai-org/GLM-5.1-FP8",
  "input": "Explain the key ideas behind transformer architectures."
}
'
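
For readers who prefer Python, the same request can be sketched with the standard library. This is a minimal illustration that only builds the request object; the helper name `build_request` is hypothetical, while the URL, headers, and body mirror the curl example.

```python
import json
import urllib.request

API_URL = "https://api.valarhq.ai/v1/responses"


def build_request(token: str, model: str, input_text: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"model": model, "input": input_text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "<token>",
    "zai-org/GLM-5.1-FP8",
    "Explain the key ideas behind transformer architectures.",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Replace `<token>` with a real auth token before sending; the block above performs no network I/O.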

{
  "id": "<string>",
  "created_at": 123,
  "model": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "cached_tokens": 123,
      "reasoning_tokens": 123
    },
    "total_tokens": 123,
    "prompt_tokens": 123,
    "completion_tokens": 123
  },
  "metadata": {},
  "input": "<string>",
  "output": "<string>",
  "error": {},
  "incomplete_details": {},
  "max_output_tokens": 123,
  "reasoning": {},
  "text": {
    "format": {}
  },
  "store": true,
  "temperature": 123,
  "top_p": 123,
  "parallel_tool_calls": true,
  "tool_choice": "<string>",
  "tools": [
    {}
  ],
  "truncation": "<string>",
  "user": "<string>"
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Idempotency-Key

string

Lets you retry safely. Valar records a reservation under the combination of organization, API key, and Idempotency-Key, so sending the same value again hands back the stored response rather than running inference a second time. Values can be up to 255 characters. See Idempotent Requests for the complete rules.

Maximum string length: 255
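
A safe retry can be sketched as follows: generate one key per logical request and reuse it on every retry of that request. The helper name `idempotency_headers` and the choice of a UUID as the key are assumptions for illustration; the only constraint taken from this page is the 255-character limit.

```python
import uuid
from typing import Dict, Optional

MAX_IDEMPOTENCY_KEY_LENGTH = 255  # limit stated in the header description


def idempotency_headers(token: str, key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers that carry an Idempotency-Key.

    A random UUID is generated when no key is supplied (an illustrative
    choice, not a documented requirement).
    """
    key = key or str(uuid.uuid4())
    if len(key) > MAX_IDEMPOTENCY_KEY_LENGTH:
        raise ValueError("Idempotency-Key must be at most 255 characters")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Idempotency-Key": key,
    }


first_attempt = idempotency_headers("<token>")
# A retry of the same logical request reuses the key from the first attempt.
retry = idempotency_headers("<token>", key=first_attempt["Idempotency-Key"])
```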

Body

application/json

model

string

required

input

required

Accepts text input, plus image input (input_image) when the target model is multimodal. Valar does not yet accept audio, files, or item references.

Minimum string length: 1

max_output_tokens

integer | null

Required range: x >= 1

temperature

number | null

Required range: 0 <= x <= 2

top_p

number | null

Required range: 0 <= x <= 1
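
The three ranges above can be checked client-side before a request is sent. The sketch below is one possible validator; the function name `validate_sampling` is hypothetical, and it treats `None` as "leave the field out", which is an assumption about how null values are best handled.

```python
from typing import Any, Dict, Optional


def validate_sampling(
    max_output_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
) -> Dict[str, Any]:
    """Check the documented ranges and return only the fields that were set."""
    params: Dict[str, Any] = {}
    if max_output_tokens is not None:
        if max_output_tokens < 1:
            raise ValueError("max_output_tokens must be >= 1")
        params["max_output_tokens"] = max_output_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("top_p must be between 0 and 1")
        params["top_p"] = top_p
    return params


sampling = validate_sampling(max_output_tokens=512, temperature=0.7)
```

The returned dictionary can be merged into the request body alongside `model` and `input`.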

text

object

reasoning

object

background

boolean

prompt_cache_key

string

An optional hint Valar uses to keep prompt-prefix caches local. Requests sharing a key are steered toward the same place so you hit the cache more often.
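
One way to produce a stable key is to hash the prompt prefix that a group of requests shares, so that every request with the same prefix sends the same hint. This is only a sketch: the key format and the helper name `cache_key_for_prefix` are assumptions, since this page specifies nothing beyond the field being an optional string.

```python
import hashlib


def cache_key_for_prefix(shared_prefix: str) -> str:
    """Derive a deterministic prompt_cache_key from a shared prompt prefix."""
    digest = hashlib.sha256(shared_prefix.encode("utf-8")).hexdigest()
    # A shortened digest keeps the key compact; the length is arbitrary.
    return f"prefix-{digest[:16]}"


system_prompt = "You are a support assistant for the billing team."
body = {
    "model": "zai-org/GLM-5.1-FP8",
    "input": system_prompt + "\n\nCustomer question: where is my invoice?",
    "prompt_cache_key": cache_key_for_prefix(system_prompt),
}
```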

store

enum<boolean>

Must be true; no other value is accepted.

Available options:

true

truncation

enum<string>

Available options:

disabled

stream

enum<boolean>

Streaming is not available on this endpoint yet.

Available options:

false

user

string

Maximum string length: 256

metadata

object

Optional string-valued metadata. Use completion_window to influence scheduling, and completion_webhook together with webhook_token to wire up a webhook that fires on completion.
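
A metadata object using these keys might be assembled as in the sketch below. The helper name `build_metadata`, the webhook URL, and the placeholder values are hypothetical; the accepted value formats are not given on this page. The check that `completion_webhook` and `webhook_token` appear together is one reading of the description above, not a documented rule.

```python
from typing import Dict, Optional


def build_metadata(
    completion_window: Optional[str] = None,
    completion_webhook: Optional[str] = None,
    webhook_token: Optional[str] = None,
    **extra: str,
) -> Dict[str, str]:
    """Assemble string-valued metadata for the request body."""
    # Assumption: the webhook URL and its token are only useful as a pair.
    if (completion_webhook is None) != (webhook_token is None):
        raise ValueError("completion_webhook and webhook_token go together")
    metadata = dict(extra)
    if completion_window is not None:
        metadata["completion_window"] = completion_window
    if completion_webhook is not None:
        metadata["completion_webhook"] = completion_webhook
        metadata["webhook_token"] = webhook_token
    for key, value in metadata.items():
        if not isinstance(value, str):
            raise TypeError(f"metadata value for {key!r} must be a string")
    return metadata


metadata = build_metadata(
    completion_webhook="https://example.com/hooks/valar",  # hypothetical URL
    webhook_token="<webhook-token>",  # placeholder
    job="nightly-summary",  # arbitrary caller-defined entry
)
```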

Response

The response finished and is returned to you right away.

id

string

required

object

enum<string>

required

Available options:

response

created_at

integer

required

status

enum<string>

required

Available options:

pending

running

failed

completed

cancelled
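
Client code often needs to know whether a response has reached a final state. The sketch below groups the five status values by their names; treating `pending` and `running` as in progress and the other three as final is an inference from those names, not something this page states, and `is_terminal` is a hypothetical helper.

```python
# Grouping inferred from the status names; not confirmed by this page.
IN_PROGRESS_STATUSES = frozenset({"pending", "running"})
TERMINAL_STATUSES = frozenset({"failed", "completed", "cancelled"})


def is_terminal(response: dict) -> bool:
    """Return True when the response object is in a final state."""
    status = response["status"]
    if status in TERMINAL_STATUSES:
        return True
    if status in IN_PROGRESS_STATUSES:
        return False
    raise ValueError(f"unexpected status: {status!r}")


done = is_terminal({"id": "<string>", "status": "completed"})
waiting = is_terminal({"id": "<string>", "status": "running"})
```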

model

string

required

usage

object

required
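
The usage object can be condensed into a few figures for logging. The sketch below reads the field names that appear in the sample response at the top of this page; the helper name `summarize_usage` and the derived `cached_input_fraction` value are illustrative additions, not part of the API.

```python
def summarize_usage(usage: dict) -> dict:
    """Pull token counts out of a usage object and derive a cache-hit ratio."""
    input_tokens = usage.get("input_tokens", 0)
    cached = usage.get("input_tokens_details", {}).get("cached_tokens", 0)
    return {
        "input_tokens": input_tokens,
        "output_tokens": usage.get("output_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
        # Share of input tokens reported as cached (0.0 when there is no input).
        "cached_input_fraction": cached / input_tokens if input_tokens else 0.0,
    }


summary = summarize_usage(
    {
        "input_tokens": 100,
        "input_tokens_details": {"cached_tokens": 25},
        "output_tokens": 40,
        "total_tokens": 140,
    }
)
```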

metadata

object

required

input

Accepts text input, plus image input (input_image) when the target model is multimodal. Valar does not yet accept audio, files, or item references.

Minimum string length: 1

output

error

object

incomplete_details

object

max_output_tokens

integer | null

reasoning

object

text

object

store

boolean

temperature

number

top_p

number

parallel_tool_calls

boolean

tool_choice

tools

object[]

truncation

user

string | null