Catalog

Models

Perchy supports clear-lane scheduling and traditional token pricing for the most-used open and proprietary models.

Claude Opus 4.7

anthropic/claude-opus-4.7
Frontier reasoning, 10% off
1M ctx, frontier, reasoning, vision
Input / 1M: $4.50 (list $5.00, 10% off)
Output / 1M: $22.50 (list $25.00, 10% off)
Cache read / 1M: $0.450 (list $0.500, 10% off)

Claude Opus 4.6

anthropic/claude-opus-4.6
Frontier reasoning, 10% off
1M ctx, frontier, reasoning, vision
Input / 1M: $4.50 (list $5.00, 10% off)
Output / 1M: $22.50 (list $25.00, 10% off)
Cache read / 1M: $0.450 (list $0.500, 10% off)

Claude Sonnet 4.6

anthropic/claude-sonnet-4.6
General purpose, 10% off
1M ctx, chat, tools, vision
Input / 1M: $2.70 (list $3.00, 10% off)
Output / 1M: $13.50 (list $15.00, 10% off)
Cache read / 1M: $0.270 (list $0.300, 10% off)

Claude Haiku 4.5

anthropic/claude-haiku-4.5
Fast, 10% off
200K ctx, chat, tools, fast
Input / 1M: $0.900 (list $1.00, 10% off)
Output / 1M: $4.50 (list $5.00, 10% off)
Cache read / 1M: $0.0900 (list $0.100, 10% off)

GPT-5.5 Pro

openai/gpt-5.5-pro
Frontier reasoning, 10% off
1M ctx, frontier, reasoning, vision
Input / 1M: $27.00 (list $30.00, 10% off)
Output / 1M: $162.00 (list $180.00, 10% off)
Cache read / 1M: $2.70 (list $3.00, 10% off)

GPT-5.5

openai/gpt-5.5
Frontier reasoning, 10% off
1M ctx, frontier, reasoning, vision
Input / 1M: $4.50 (list $5.00, 10% off)
Output / 1M: $27.00 (list $30.00, 10% off)
Cache read / 1M: $0.450 (list $0.500, 10% off)

GPT-5.4

openai/gpt-5.4
General purpose, 10% off
1M ctx, chat, tools, vision
Input / 1M: $2.25 (list $2.50, 10% off)
Output / 1M: $13.50 (list $15.00, 10% off)
Cache read / 1M: $0.225 (list $0.250, 10% off)

GPT-5.4 Mini

openai/gpt-5.4-mini
Fast, 10% off
400K ctx, chat, tools, fast
Input / 1M: $0.675 (list $0.750, 10% off)
Output / 1M: $4.05 (list $4.50, 10% off)
Cache read / 1M: $0.0675 (list $0.0750, 10% off)

GPT-5.4 Nano

openai/gpt-5.4-nano
Fast, 10% off
400K ctx, chat, fast
Input / 1M: $0.180 (list $0.200, 10% off)
Output / 1M: $1.13 (list $1.25, 10% off)
Cache read / 1M: $0.0180 (list $0.0200, 10% off)

Gemini 3.1 Pro

google/gemini-3.1-pro-preview
Frontier reasoning, 10% off
1M ctx, frontier, vision, audio
Input / 1M: $1.80 (list $2.00, 10% off)
Output / 1M: $10.80 (list $12.00, 10% off)
Cache read / 1M: $0.180 (list $0.200, 10% off)

Gemini 3.5 Flash

google/gemini-3.5-flash
Fast, 10% off
1M ctx, chat, vision, fast
Input / 1M: $1.35 (list $1.50, 10% off)
Output / 1M: $8.10 (list $9.00, 10% off)
Cache read / 1M: $0.135 (list $0.150, 10% off)

Gemini 3.1 Flash Lite

google/gemini-3.1-flash-lite
Fast, 10% off
1M ctx, chat, fast, long-context
Input / 1M: $0.225 (list $0.250, 10% off)
Output / 1M: $1.35 (list $1.50, 10% off)
Cache read / 1M: $0.0225 (list $0.0250, 10% off)

Gemma 4 31B

google/gemma-4-31b-it
General purpose, 10% off
256K ctx, chat, tools, open-weights
Input / 1M: $0.108 (list $0.120, 10% off)
Output / 1M: $0.333 (list $0.370, 10% off)
Cache read / 1M: $0.0108 (list $0.0120, 10% off)

Gemma 4 26B A4B

google/gemma-4-26b-a4b-it
Fast, 10% off
256K ctx, chat, moe, open-weights
Input / 1M: $0.0540 (list $0.0600, 10% off)
Output / 1M: $0.297 (list $0.330, 10% off)
Cache read / 1M: $0.0054 (list $0.0060, 10% off)

Qwen 3.7 Max

qwen/qwen3.7-max
Frontier reasoning, 10% off
1M ctx, frontier, reasoning, tools
Input / 1M: $2.25 (list $2.50, 10% off)
Output / 1M: $6.75 (list $7.50, 10% off)
Cache read / 1M: $0.225 (list $0.250, 10% off)

Qwen 3.6 Plus

qwen/qwen3.6-plus
General purpose, 10% off
1M ctx, chat, tools, long-context
Input / 1M: $0.292 (list $0.325, 10% off)
Output / 1M: $1.75 (list $1.95, 10% off)
Cache read / 1M: $0.0293 (list $0.0325, 10% off)

Qwen 3.6 Flash

qwen/qwen3.6-flash
Fast, 10% off
1M ctx, chat, fast, long-context
Input / 1M: $0.169 (list $0.188, 10% off)
Output / 1M: $1.01 (list $1.13, 10% off)
Cache read / 1M: $0.0169 (list $0.0188, 10% off)

Qwen 3.6 35B A3B

qwen/qwen3.6-35b-a3b
General purpose, 10% off
262K ctx, chat, moe, open-weights
Input / 1M: $0.135 (list $0.150, 10% off)
Output / 1M: $0.900 (list $1.00, 10% off)
Cache read / 1M: $0.0135 (list $0.0150, 10% off)

Qwen 3.6 27B

qwen/qwen3.6-27b
Reasoning, 10% off
262K ctx, reasoning, thinking, open-weights
Input / 1M: $0.270 (list $0.300, 10% off)
Output / 1M: $2.88 (list $3.20, 10% off)
Cache read / 1M: $0.0270 (list $0.0300, 10% off)

DeepSeek V4 Pro

deepseek/deepseek-v4-pro
Frontier reasoning, 10% off
1M ctx, frontier, reasoning, moe
Input / 1M: $0.392 (list $0.435, 10% off)
Output / 1M: $0.783 (list $0.870, 10% off)
Cache read / 1M: $0.0392 (list $0.0435, 10% off)

DeepSeek V4 Flash

deepseek/deepseek-v4-flash
Fast, 10% off
1M ctx, chat, fast, moe
Input / 1M: $0.0900 (list $0.100, 10% off)
Output / 1M: $0.180 (list $0.200, 10% off)
Cache read / 1M: $0.0090 (list $0.0100, 10% off)

MiniMax M2.7

minimax/minimax-m2.7
General purpose, 10% off
200K ctx, chat, tools, long-context
Input / 1M: $0.251 (list $0.279, 10% off)
Output / 1M: $1.08 (list $1.20, 10% off)
Cache read / 1M: $0.0251 (list $0.0279, 10% off)

MiniMax M2.5

minimax/minimax-m2.5
Fast, 10% off
200K ctx, chat, fast
Input / 1M: $0.135 (list $0.150, 10% off)
Output / 1M: $1.03 (list $1.15, 10% off)
Cache read / 1M: $0.0135 (list $0.0150, 10% off)

GLM 5.1

z-ai/glm-5.1
Frontier reasoning, 10% off
200K ctx, frontier, reasoning, tools
Input / 1M: $0.882 (list $0.980, 10% off)
Output / 1M: $2.77 (list $3.08, 10% off)
Cache read / 1M: $0.0882 (list $0.0980, 10% off)

GLM 5

z-ai/glm-5
General purpose, 10% off
200K ctx, chat, tools
Input / 1M: $0.540 (list $0.600, 10% off)
Output / 1M: $1.73 (list $1.92, 10% off)
Cache read / 1M: $0.0540 (list $0.0600, 10% off)

GLM 4.7 Flash

z-ai/glm-4.7-flash
Fast, 10% off
200K ctx, chat, fast
Input / 1M: $0.0540 (list $0.0600, 10% off)
Output / 1M: $0.360 (list $0.400, 10% off)
Cache read / 1M: $0.0054 (list $0.0060, 10% off)
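
Estimating request cost

Because every price above is quoted per 1M tokens, the cost of a single request is each token count multiplied by its rate and divided by one million. The sketch below is illustrative only: `PRICES` and `estimate_cost` are hypothetical names, not part of any Perchy API, and the split between uncached input tokens and cache-read tokens is an assumption about how billing is applied. The rates are the discounted prices from two of the entries above.

```python
from decimal import Decimal

# Discounted prices in USD per 1M tokens, copied from the catalog entries above.
# Only two models are included here to keep the sketch short.
PRICES = {
    "anthropic/claude-opus-4.7": {
        "input": Decimal("4.50"),
        "output": Decimal("22.50"),
        "cache_read": Decimal("0.450"),
    },
    "deepseek/deepseek-v4-flash": {
        "input": Decimal("0.0900"),
        "output": Decimal("0.180"),
        "cache_read": Decimal("0.0090"),
    },
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(model, input_tokens, output_tokens, cache_read_tokens=0):
    """Hypothetical helper: estimated USD cost of one request.

    Assumption: input_tokens counts only uncached input, and
    cache_read_tokens are billed separately at the cache-read rate.
    """
    rates = PRICES[model]
    total = (
        Decimal(input_tokens) * rates["input"]
        + Decimal(output_tokens) * rates["output"]
        + Decimal(cache_read_tokens) * rates["cache_read"]
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    cost = estimate_cost(
        "anthropic/claude-opus-4.7",
        input_tokens=200_000,
        output_tokens=10_000,
        cache_read_tokens=800_000,
    )
    print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of `float` so that sub-cent rates such as $0.0090 per 1M tokens are not distorted by binary rounding when many requests are summed.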