How billing works
You pay per token. Three rates apply to each request: input for the tokens you send, cached for input tokens served from a prefix cache, and output for the tokens the model generates. Every figure in the table below is in USD per 1M tokens. Caching is automatic: Valar matches shared prompt prefixes for you and charges the lower cached rate on the tokens that hit. You can raise your hit rate by passing prompt_cache_key as a routing hint, but it’s optional.
The rate you pay also depends on the completion window you request. Faster scheduling carries a higher rate, so the on-demand Now window costs more than Standard. The table prices out the windows available for each model; coverage varies, and you can mix windows per request. Completion Windows explains the trade-offs.
Window coverage differs by model, and we keep adding models and widening window support. If a model or window you want isn’t shown, get in touch.
Rate table
USD per 1M tokens
| Model | Window | Input | Cached | Output |
|---|---|---|---|---|
| | Standard | 0.90 | 0.15 | 2.30 |
| | Now | 1.60 | 0.15 | 4.10 |
| | Standard | 0.135 | 0.015 | 0.21 |
| | Now | 0.243 | 0.027 | 0.378 |
| | Standard | 0.45 | 0.20 | 0.30 |
| | Now | 0.90 | 0.20 | 3.60 |
| | Standard | 0.15 | 0.03 | 0.60 |
| | Now | 1.30 | 0.26 | 4.40 |
| | Standard | 0.04 | 0.02 | 0.30 |
| | Now | 0.06 | 0.03 | 0.40 |
| | Standard | 0.25 | 0.05 | 0.75 |
| | Now | 0.45 | 0.09 | 1.35 |
| | Standard | 0.18 | 0.10 | 0.30 |
| | Now | 0.36 | 0.20 | 0.60 |
| | Standard | 0.15 | 0.03 | 0.60 |
| | Now | 0.27 | 0.054 | 1.08 |
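
To make the arithmetic concrete, here is a minimal sketch of how the three rates might combine into the cost of a single request. It is an illustration, not Valar's billing code: the function name request_cost and the token counts are hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the input rate. Rounding rules, minimum charges, and taxes are not modeled. The rates in the usage lines are the first Standard and Now rows of the rate table.

```python
from decimal import Decimal

# Rates in the table are quoted in USD per 1M tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(input_tokens, cached_tokens, output_tokens,
                 input_rate, cached_rate, output_rate):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens counts the input tokens that hit the prefix
    cache, so they are charged at cached_rate and the remaining input
    tokens at input_rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * Decimal(input_rate)
        + cached_tokens * Decimal(cached_rate)
        + output_tokens * Decimal(output_rate)
    )
    return total / TOKENS_PER_UNIT


# Illustrative request: 10,000 input tokens, 6,000 of them cached, 2,000 output.
standard = request_cost(10_000, 6_000, 2_000, "0.90", "0.15", "2.30")
now = request_cost(10_000, 6_000, 2_000, "1.60", "0.15", "4.10")
# standard equals 0.0091 USD and now equals 0.0155 USD for this request.
```

Rates are passed as strings and converted to Decimal so that values such as 0.135 are not distorted by binary floating point. Running the same token counts through both windows shows how the choice of window changes the cost of an otherwise identical request.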