POST /chat/completions
Create a chat completion
curl --request POST \
  --url https://api.valarhq.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "zai-org/GLM-5.1-FP8",
  "messages": [
    {
      "role": "system",
      "content": "You are a concise assistant."
    },
    {
      "role": "user",
      "content": "Summarize retrieval-augmented generation in 3 bullets."
    }
  ],
  "max_completion_tokens": 300
}
'
{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "content": "<string>",
        "reasoning_content": "<string>",
        "refusal": "<string>"
      },
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {},
    "completion_tokens_details": {}
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Idempotency-Key
string

Lets you retry safely. Valar records a reservation under the combination of organization, API key, and Idempotency-Key, so sending the same value again hands back the stored response rather than running inference a second time. Values can be up to 255 characters. See Idempotent Requests for the complete rules.

Maximum string length: 255
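
As a minimal sketch of the retry pattern described above, the Python below builds two identical requests that share one Idempotency-Key. The helper names new_idempotency_key and build_request are hypothetical, and using a UUID as the key is an assumption; the only limit taken from this page is the 255-character maximum. The request is built but not sent.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.valarhq.ai/v1/chat/completions"  # URL from the curl example
MAX_KEY_LENGTH = 255  # documented maximum length for Idempotency-Key


def new_idempotency_key() -> str:
    """Generate a fresh key. Assumption: any unique string works; a UUID4 is 36 characters."""
    return str(uuid.uuid4())


def build_request(token: str, body: dict, idempotency_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the Idempotency-Key header."""
    if len(idempotency_key) > MAX_KEY_LENGTH:
        raise ValueError("Idempotency-Key may be at most 255 characters")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": idempotency_key,
        },
        method="POST",
    )
```

Generate the key once per logical request and pass the same value to every retry of that request; urllib.request.urlopen(request) would then send it.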

Body

application/json
model
string
required

messages
object[]
required
Minimum array length: 1

temperature
number | null
Required range: 0 <= x <= 2

top_p
number | null
Required range: 0 <= x <= 1

max_completion_tokens
integer | null
Required range: x >= 1

response_format
object

reasoning_effort
enum<string> | null
Available options: none, minimal, low, medium, high, xhigh

n
enum<integer>

Must be 1; requesting multiple choices is not supported yet.

Available options: 1

modalities
enum<string>[]
Required array length: 1 element
Available options: text

stream
boolean

Set this to true to receive the answer as a Server-Sent Events stream of chat.completion.chunk objects rather than one JSON payload.

stream_options
object

Settings that only take effect while streaming (stream is true).

store
enum<boolean>

Must be true; no other value is accepted.

Available options: true

user
string
Maximum string length: 256

metadata
object

Optional string-valued metadata. Use completion_window to influence scheduling, and completion_webhook together with webhook_token to wire up a webhook that fires on completion.
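
The body constraints listed above can be checked on the client before a request is sent. The following is a rough sketch only: validate_body is a hypothetical helper, it covers just the ranges and allowed values stated on this page, and the server remains the authority on what is accepted.

```python
from typing import Any

# Allowed values copied from the reasoning_effort field above.
REASONING_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh"}


def _in_range(value: Any, low: float, high: float) -> bool:
    """True when value is a number within [low, high]."""
    return isinstance(value, (int, float)) and low <= value <= high


def validate_body(body: dict[str, Any]) -> list[str]:
    """Return a list of constraint violations; an empty list means the body passes."""
    errors: list[str] = []

    if not isinstance(body.get("model"), str):
        errors.append("model is required and must be a string")
    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        errors.append("messages is required and needs at least 1 item")

    # Nullable fields: None (JSON null) is allowed, so only real values are checked.
    temperature = body.get("temperature")
    if temperature is not None and not _in_range(temperature, 0, 2):
        errors.append("temperature must satisfy 0 <= x <= 2")
    top_p = body.get("top_p")
    if top_p is not None and not _in_range(top_p, 0, 1):
        errors.append("top_p must satisfy 0 <= x <= 1")
    max_tokens = body.get("max_completion_tokens")
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens < 1):
        errors.append("max_completion_tokens must be an integer >= 1")
    effort = body.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        errors.append("reasoning_effort is not one of the available options")

    # Single-value enums: omitting the field is fine, any other value is rejected.
    if body.get("n", 1) != 1:
        errors.append("n must be 1")
    if body.get("modalities", ["text"]) != ["text"]:
        errors.append("modalities must be exactly ['text']")
    if body.get("store", True) is not True:
        errors.append("store must be true")

    user = body.get("user")
    if user is not None and (not isinstance(user, str) or len(user) > 256):
        errors.append("user must be a string of at most 256 characters")
    metadata = body.get("metadata", {})
    if not all(isinstance(value, str) for value in metadata.values()):
        errors.append("metadata values must be strings")
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every violation in one pass.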

Response

The chat completion. By default this is one JSON object; with stream: true it becomes a Server-Sent Events stream of chat.completion.chunk objects closed by a final data: [DONE] line.
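
A streamed response can be consumed line by line. The sketch below assumes each event's data fits on a single data: line, which this page does not guarantee, and it treats every chunk as an opaque JSON object because the fields inside chat.completion.chunk are not described here. iter_chunks is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield each parsed chunk object until the final data: [DONE] line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel, not JSON
        yield json.loads(payload)
```

Any iterable of decoded text lines works as input, for example the lines of an HTTP response body read incrementally.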

id
string
required

object
enum<string>
required
Available options: chat.completion

created
integer
required

model
string
required

choices
object[]
required
Minimum array length: 1

usage
object
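
As an illustration of reading a non-streaming response, the sketch below pulls the message fields and token total out of a parsed JSON object. The field names under message and usage come from the example response on this page; summarize_completion is a hypothetical helper, and treating a missing usage object as empty is an assumption based on usage not being marked required.

```python
from typing import Any


def summarize_completion(response: dict[str, Any]) -> dict[str, Any]:
    """Extract the first choice's message fields and the total token count."""
    if response.get("object") != "chat.completion":
        raise ValueError("expected an object of type chat.completion")
    first_choice = response["choices"][0]  # choices is required with at least 1 item
    message = first_choice.get("message", {})
    usage = response.get("usage") or {}  # usage is not marked required
    return {
        "content": message.get("content"),
        "reasoning_content": message.get("reasoning_content"),
        "refusal": message.get("refusal"),
        "total_tokens": usage.get("total_tokens"),
    }
```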