Kimi K2 became available on Groq earlier today.
This is good news, because I've been looking for a strong model on Groq; their response times are stellar.
I love the big AI labs as much as the next AI engineer, but seeing results processed in about a second makes the extra couple of seconds from gpt-4.1 feel sluggish in comparison. gpt-4.1-mini and gemini-2.0-flash have the speeds I want, but of course, those models perform notably worse than gpt-4.1 or gemini-2.5-pro.
And while Llama 4 Maverick and Scout can technically do the job too, I've had tool-use issues with both of those models, and so far Kimi K2 hasn't had those problems.

Here's the list of models I'd normally compare against; Kimi K2 hits the nice middle ground I want.
| Model | Input ($/M tokens) | Output ($/M tokens) | Combined | Txn Cost ($) | Monthly Cost ($) |
|---|---|---|---|---|---|
| Anthropic: Claude Opus 4 | 15 | 75 | 90 | 0.21 | 105 |
| Anthropic: Claude Sonnet 4 | 3 | 15 | 18 | 0.042 | 21 |
| Google: Gemini 2.5 Pro | 1.25 | 10 | 11.25 | 0.025 | 12.5 |
| OpenAI: GPT-4.1 | 2 | 8 | 10 | 0.024 | 12 |
| MoonshotAI: Kimi K2 | 0.55 | 2.2 | 2.75 | 0.0066 | 3.3 |
| Google: Gemini 2.5 Flash | 0.3 | 2.5 | 2.8 | 0.0062 | 3.1 |
| OpenAI: GPT-4.1 Mini | 0.4 | 1.6 | 2 | 0.0048 | 2.4 |
| Google: Gemini 2.0 Flash | 0.1 | 0.4 | 0.5 | 0.0012 | 0.6 |
| OpenAI: GPT-4.1 Nano | 0.1 | 0.4 | 0.5 | 0.0012 | 0.6 |
| Google: Gemini 2.0 Flash Lite | 0.075 | 0.3 | 0.375 | 0.0009 | 0.45 |
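The per-transaction and monthly columns in the table are consistent with a fixed usage profile: roughly 4,000 input tokens and 2,000 output tokens per transaction, and about 500 transactions per month. Those numbers are inferred by back-solving from the table, not stated anywhere official, so treat this as a sketch of the arithmetic rather than a spec:

```python
# Assumed usage profile (inferred from the table's derived columns,
# not from any official source):
INPUT_TOKENS = 4_000      # input tokens per transaction
OUTPUT_TOKENS = 2_000     # output tokens per transaction
TXNS_PER_MONTH = 500      # transactions per month

def txn_cost(input_price: float, output_price: float) -> float:
    """USD cost of one transaction, given prices in USD per 1M tokens."""
    return (input_price * INPUT_TOKENS + output_price * OUTPUT_TOKENS) / 1_000_000

def monthly_cost(input_price: float, output_price: float) -> float:
    """USD cost of a month of transactions."""
    return txn_cost(input_price, output_price) * TXNS_PER_MONTH

# MoonshotAI: Kimi K2 at $0.55 in / $2.20 out per 1M tokens
print(txn_cost(0.55, 2.2))      # 0.0066, matching the table
print(monthly_cost(0.55, 2.2))  # 3.3
```

Plugging in the other rows (e.g. $2/$8 for GPT-4.1 gives $0.024 per transaction, $15/$75 for Claude Opus 4 gives $105/month) reproduces the table, which is why the Combined column alone doesn't tell the whole story: output tokens are weighted double here.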