Pricing - Valar

How billing works

You pay per token. Three rates apply to each request: input for the tokens you send, cached for input tokens served from a prefix cache, and output for the tokens the model generates. Every figure in the table below is in USD per 1M tokens.

Caching is automatic: Valar matches shared prompt prefixes for you and charges the lower cached rate on the tokens that hit. You can raise your hit rate by passing prompt_cache_key as a routing hint, but it’s optional.

The rate you pay also depends on the completion window you request. Faster scheduling carries a higher rate, so the on-demand Now window costs more than Standard. The table lists a price for each window available on each model. Coverage varies by model, and you can choose a different window for each request. Completion Windows explains the trade-offs.
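
The three-rate arithmetic can be sketched in a few lines of Python. This is an illustrative estimate, not Valar's billing code: the function name `request_cost` and its arguments are hypothetical, and it assumes cached tokens are billed at the cached rate instead of the input rate (so the two input counts do not overlap). The example rates are the gpt-oss-120b Standard figures from the rate table.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)  # published rates are USD per 1M tokens


def request_cost(uncached_input, cached_input, output, rates):
    """Estimate the USD cost of one request.

    Assumption: cached tokens are charged at the cached rate *instead of*
    the input rate, so uncached_input and cached_input do not overlap.
    """
    total = (
        Decimal(uncached_input) * Decimal(rates["input"])
        + Decimal(cached_input) * Decimal(rates["cached"])
        + Decimal(output) * Decimal(rates["output"])
    )
    return total / PER_MILLION


# gpt-oss-120b, Standard window (USD per 1M tokens, from the rate table).
rates = {"input": "0.04", "cached": "0.02", "output": "0.30"}

# 8,000 uncached input tokens, 2,000 cache hits, 1,000 generated tokens.
cost = request_cost(8_000, 2_000, 1_000, rates)
print(f"${cost}")
```

Rates are passed as strings and converted to `Decimal` so that small per-request amounts are not distorted by binary floating-point rounding.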

Window coverage differs by model, and we keep adding models and widening window support. If a model or window you want isn’t shown, get in touch.

Rate table

USD per 1M tokens

Model	Window	Input	Cached	Output
DeepSeek V4 Pro `deepseek-ai/DeepSeek-V4-Pro`	Standard	0.90	0.15	2.30
DeepSeek V4 Pro `deepseek-ai/DeepSeek-V4-Pro`	Now	1.60	0.15	4.10
DeepSeek V4 Flash `deepseek-ai/DeepSeek-V4-Flash`	Standard	0.135	0.015	0.21
DeepSeek V4 Flash `deepseek-ai/DeepSeek-V4-Flash`	Now	0.243	0.027	0.378
Kimi-K2.6 `moonshotai/Kimi-K2.6`	Standard	0.45	0.20	0.30
Kimi-K2.6 `moonshotai/Kimi-K2.6`	Now	0.90	0.20	3.60
GLM-5.1 `zai-org/GLM-5.1-FP8`	Standard	0.15	0.03	0.60
GLM-5.1 `zai-org/GLM-5.1-FP8`	Now	1.30	0.26	4.40
gpt-oss-120b `openai/gpt-oss-120b`	Standard	0.04	0.02	0.30
gpt-oss-120b `openai/gpt-oss-120b`	Now	0.06	0.03	0.40
Qwen3.5-397B-A17B `Qwen/Qwen3.5-397B-A17B`	Standard	0.25	0.05	0.75
Qwen3.5-397B-A17B `Qwen/Qwen3.5-397B-A17B`	Now	0.45	0.09	1.35
Gemma 4 31B IT `google/gemma-4-31B-it`	Standard	0.18	0.10	0.30
Gemma 4 31B IT `google/gemma-4-31B-it`	Now	0.36	0.20	0.60
MiniMax M2.7 `MiniMaxAI/MiniMax-M2.7`	Standard	0.15	0.03	0.60
MiniMax M2.7 `MiniMaxAI/MiniMax-M2.7`	Now	0.27	0.054	1.08
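
To see what a faster window costs for a given workload, you can price the same token volume under both windows and compare. The sketch below does this for Qwen3.5-397B-A17B using the rates in the table. The helper `workload_cost` and the workload size are hypothetical, and the calculation assumes cached tokens are billed at the cached rate in place of the input rate.

```python
from decimal import Decimal

# Rates copied from the table above, USD per 1M tokens: (input, cached, output).
RATES = {
    ("Qwen/Qwen3.5-397B-A17B", "Standard"): ("0.25", "0.05", "0.75"),
    ("Qwen/Qwen3.5-397B-A17B", "Now"): ("0.45", "0.09", "1.35"),
}


def workload_cost(model, window, uncached_m, cached_m, output_m):
    """Estimate USD cost for a workload given in millions of tokens.

    Assumption: cached tokens are charged at the cached rate instead of
    the input rate, so the two input counts do not overlap.
    """
    inp, cached, out = (Decimal(r) for r in RATES[(model, window)])
    return (
        Decimal(uncached_m) * inp
        + Decimal(cached_m) * cached
        + Decimal(output_m) * out
    )


model = "Qwen/Qwen3.5-397B-A17B"
# Hypothetical workload: 10M uncached input, 5M cached input, 2M output tokens.
standard = workload_cost(model, "Standard", 10, 5, 2)
now = workload_cost(model, "Now", 10, 5, 2)
premium = now / standard

print(f"Standard: ${standard}  Now: ${now}  premium: {premium}x")
```

For this model every Now rate is 1.8 times the Standard rate, so the ratio is the same for any token mix. For models where the three rates scale differently, the ratio depends on the share of input, cached, and output tokens.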

For each model’s capabilities — image input and reasoning support — see Models.
