Set the window
Pass metadata.completion_window on the request. It accepts two values: "asap" (the Now tier) and "standard".
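As a rough illustration of where the field sits, here is a minimal sketch that builds a request body in Python. Only metadata.completion_window and its two values come from this page; the function name build_request, the model and input fields, and the model name are hypothetical placeholders, so check the API reference for the real request shape.

```python
import json
from typing import Any, Dict, Optional

# The two window values named in this section.
VALID_WINDOWS = {"asap", "standard"}


def build_request(
    model: str, prompt: str, completion_window: Optional[str] = None
) -> Dict[str, Any]:
    """Build a request body with an optional completion window.

    Assumption: "model" and "input" are placeholder field names, not
    confirmed by the docs. Only metadata.completion_window is.
    """
    if completion_window is not None and completion_window not in VALID_WINDOWS:
        raise ValueError(f"unknown completion_window: {completion_window!r}")

    body: Dict[str, Any] = {"model": model, "input": prompt}
    if completion_window is not None:
        # The window is carried under the request's metadata object.
        body["metadata"] = {"completion_window": completion_window}
    return body


if __name__ == "__main__":
    request = build_request("example-model", "Summarize this ticket.", "standard")
    print(json.dumps(request, indent=2))
```

Omitting the third argument leaves metadata out of the body entirely, which corresponds to sending the request without a window.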
The two tiers
| Window | Avg. turn time | Price | Best for |
|---|---|---|---|
| Now (asap) | Immediate | Higher | Realtime calls, interactive UIs, human-in-the-loop |
| Standard (standard) | ~5 min | Lower | Cost-optimized agents, async loops, batch work |
Actual turn times vary with the specific workload and the model you choose.
How each tier behaves
Now runs immediately on the fastest available hardware in a latency-optimized setup, at the higher on-demand rate. Use it for realtime, interactive requests where a person or another system is waiting on the result.

Standard is the default wherever a model supports it. It runs on Valar’s maximum-efficiency serving stack and targets roughly a five-minute average turn time across a balanced workload. Most of Valar’s published prices reference this window, and it’s the right default for async agent loops and batch jobs.

Default behavior
Leave completion_window off and the request defaults to standard when the model supports it; otherwise it falls back to the Now tier.
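The fallback rule can be sketched as a small client-side helper that predicts which window a request will run in. This is an illustration of the rule as described here, not Valar's implementation; the name resolve_window and the supports_standard flag are hypothetical, and the page does not say what happens when "standard" is requested explicitly on a model that lacks it, so the sketch simply passes explicit values through.

```python
from typing import Optional


def resolve_window(requested: Optional[str], supports_standard: bool) -> str:
    """Return the window a request is expected to run in.

    Assumption: supports_standard is a flag the caller already knows
    for the chosen model; it is not a documented API field.
    """
    if requested is not None:
        # Explicit choice: passed through unchanged in this sketch.
        return requested
    # Window omitted: standard where supported, otherwise the Now tier.
    return "standard" if supports_standard else "asap"


if __name__ == "__main__":
    print(resolve_window(None, supports_standard=True))
    print(resolve_window(None, supports_standard=False))
    print(resolve_window("asap", supports_standard=True))
```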