Understanding the economics of LLMs
LLMs are generally priced per 1 million tokens, with separate rates for input, cached input, and output. This lets us match an LLM to our budget. Below is the list of models supported by Cursor, one of the most popular agentic coding tools, with their per-million-token prices (a small cost-calculation sketch follows the table).
| Provider | Model | Input ($) | Cache Write ($) | Cache Read ($) | Output ($) |
|---|---|---|---|---|---|
| OpenAI | GPT-5-Pro | 15 | 15 | 1.5 | 120 |
| Anthropic | Claude 4 Opus | 15 | 18.75 | 1.5 | 75 |
| Anthropic | Claude 4.1 Opus | 15 | 18.75 | 1.5 | 75 |
| Anthropic | Claude 4 Sonnet 1M | 6 | 7.5 | 0.6 | 22.5 |
| OpenAI | GPT-5 Fast | 2.5 | 2.5 | 0.25 | 20 |
| Anthropic | Claude 4 Sonnet | 3 | 3.75 | 0.3 | 15 |
| Anthropic | Claude 4.5 Sonnet | 3 | 3.75 | 0.3 | 15 |
| xAI | Grok 4 | 3 | 3 | 0.75 | 15 |
| Google | Gemini 3 Pro | 2 | 2 | 0.2 | 12 |
| Google | Gemini 2.5 Pro | 1.25 | 1.25 | 0.125 | 10 |
| OpenAI | GPT-5 | 1.25 | 1.25 | 0.125 | 10 |
| OpenAI | GPT-5-Codex | 1.25 | 1.25 | 0.125 | 10 |
| OpenAI | GPT-5.1 | 1.25 | 1.25 | 0.125 | 10 |
| OpenAI | GPT-5.1 Codex | 1.25 | 1.25 | 0.125 | 10 |
| Cursor | Composer 1 | 1.25 | 1.25 | 0.125 | 10 |
| DeepSeek | Deepseek R1 (05/28) | 3 | 3 | 3 | 8 |
| OpenAI | GPT 4.1 | 2 | 2 | 0.5 | 8 |
| OpenAI | o3 | 2 | 2 | 0.5 | 8 |
| Anthropic | Claude 4.5 Haiku | 1 | 1.25 | 0.1 | 5 |
| Google | Gemini 2.5 Flash | 0.3 | 0.3 | 0.03 | 2.5 |
| OpenAI | GPT-5 Mini | 0.25 | 0.25 | 0.025 | 2 |
| OpenAI | GPT-5.1 Codex Mini | 0.25 | 0.25 | 0.025 | 2 |
| DeepSeek | Deepseek V3.1 | 0.56 | 0.56 | 0.56 | 1.68 |
| xAI | Grok Code | 0.2 | 0.2 | 0.02 | 1.5 |
| xAI | Grok 4 Fast | 0.2 | 0.2 | 0.05 | 0.5 |
| OpenAI | GPT-5 Nano | 0.05 | 0.05 | 0.005 | 0.4 |
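To make the per-million-token pricing concrete, here is a minimal Python sketch of how the cost of a single agentic request could be estimated. The rates are taken from the table above; the token counts are made-up assumptions, and Anthropic's cache-write surcharge is ignored for simplicity.

```python
# Rough cost estimate for one agentic coding request, priced per 1M tokens.
# Rates come from the Cursor pricing table above (USD per 1M tokens).
# The token counts below are illustrative assumptions, not measurements.

def request_cost(input_tokens, cached_tokens, output_tokens,
                 input_rate, cache_read_rate, output_rate):
    """Cost in USD: uncached input + cached (re-read) input + generated output."""
    uncached = input_tokens - cached_tokens
    return (uncached * input_rate
            + cached_tokens * cache_read_rate
            + output_tokens * output_rate) / 1_000_000

# Example: 40k-token context, 30k of it served from the prompt cache, 2k generated.
models = {
    "GPT-5":             dict(input_rate=1.25, cache_read_rate=0.125, output_rate=10),
    "Claude 4.5 Sonnet": dict(input_rate=3.00, cache_read_rate=0.30,  output_rate=15),
    "GPT-5 Nano":        dict(input_rate=0.05, cache_read_rate=0.005, output_rate=0.40),
}

for name, rates in models.items():
    cost = request_cost(40_000, 30_000, 2_000, **rates)
    print(f"{name}: ${cost:.4f} per request")
```

Run over thousands of requests per day, the gap between these per-request figures is what separates a cheap coding agent from an expensive one.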
Where are the open source LLMs?
If you look at the models supported by Cursor, you can see that, other than DeepSeek, none of them are open source. This is a significant gap in the market. If you want an alternative, you have to look at independent inference chip makers like Groq and Cerebras, which now serve open-source LLMs that are remarkably capable at coding.
LLMs Provided by Groq
| Model | Speed (Tokens/sec) | Input Price ($/M Tokens) | Output Price ($/M Tokens) |
|---|---|---|---|
| GPT OSS 20B 128k | 1,000 TPS | $0.075 | $0.30 |
| GPT OSS Safeguard 20B | 1,000 TPS | $0.075 | $0.30 |
| GPT OSS 120B 128k | 500 TPS | $0.15 | $0.60 |
| Kimi K2-0905 1T 256k | 200 TPS | $1.00 | $3.00 |
| Llama 4 Scout (17Bx16E) 128k | 594 TPS | $0.11 | $0.34 |
| Llama 4 Maverick (17Bx128E) 128k | 562 TPS | $0.20 | $0.60 |
| Llama Guard 4 12B 128k | 325 TPS | $0.20 | $0.20 |
| Qwen3 32B 131k | 662 TPS | $0.29 | $0.59 |
| Llama 3.3 70B Versatile 128k | 394 TPS | $0.59 | $0.79 |
| Llama 3.1 8B Instant 128k | 840 TPS | $0.05 | $0.08 |
It can be observed that Groq serves an open-source flagship model for coding: Kimi K2-0905 1T 256k.
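Because Groq exposes an OpenAI-compatible API, these open-source models can be called with the standard `openai` client simply by pointing it at Groq's endpoint. Below is a minimal sketch; the model ID shown for Kimi K2-0905 is an assumption and should be verified against Groq's current model list.

```python
import os
from openai import OpenAI  # Groq exposes an OpenAI-compatible endpoint

# Point the standard OpenAI client at Groq's API instead of OpenAI's.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

# NOTE: the model ID below is an assumption for the Kimi K2-0905 model;
# check Groq's model listing (client.models.list()) for the exact name.
response = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct-0905",
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Write a Python function that parses a CSV line."},
    ],
)
print(response.choices[0].message.content)
```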
Finding the right LLM for coding
To find the best model for coding, you can refer to the SWE-bench benchmark, in which LLMs are evaluated on fixing real repository issues. This quickly narrows down the right LLM for coding without having to investigate each model individually (a cost-effectiveness sketch follows the results table below).
Using the SWE-bench benchmark
| Model | Score (% resolved) | Avg. cost per task ($) | Date | Version |
|---|---|---|---|---|
| Gemini 3 Pro Preview (2025-11-18) | 74.20 | $0.46 | 2025-11-18 | 1.15.0 |
| Claude 4.5 Sonnet (20250929) | 70.60 | $0.56 | 2025-09-29 | 1.13.3 |
| Claude 4 Opus (20250514) | 67.60 | $1.13 | 2025-08-02 | 1.0.0 |
| GPT-5 (2025-08-07) (medium reasoning) | 65.00 | $0.28 | 2025-08-07 | 1.7.0 |
| Claude 4 Sonnet (20250514) | 64.93 | $0.37 | 2025-07-26 | 1.0.0 |
| GPT-5 mini (2025-08-07) (medium reasoning) | 59.80 | $0.04 | 2025-08-07 | 1.7.0 |
| o3 (2025-04-16) | 58.40 | $0.33 | 2025-07-26 | 1.0.0 |
| Qwen3-Coder 480B/A35B Instruct | 55.40 | $0.25 | 2025-08-02 | 1.0.0 |
| GLM-4.5 (2025-08-22) | 54.20 | $0.30 | 2025-08-22 | 1.9.1 |
| Gemini 2.5 Pro (2025-05-06) | 53.60 | $0.29 | 2025-07-26 | 1.0.0 |
| Claude 3.7 Sonnet (20250219) | 52.80 | $0.35 | 2025-07-20 | 0.0.0 |
| o4-mini (2025-04-16) | 45.00 | $0.21 | 2025-07-26 | 1.0.0 |
| Kimi K2 Instruct | 43.80 | $0.53 | 2025-08-07 | 1.7.0 |
| GPT-4.1 (2025-04-14) | 39.58 | $0.15 | 2025-07-26 | 1.0.0 |
| GPT-5 nano (2025-08-07) (medium reasoning) | 34.80 | $0.04 | 2025-08-07 | 1.7.0 |
| Gemini 2.5 Flash (2025-04-17) | 28.73 | $0.13 | 2025-07-26 | 1.0.0 |
| gpt-oss-120b | 26.00 | $0.06 | 2025-08-07 | 1.7.0 |
| GPT-4.1-mini (2025-04-14) | 23.94 | $0.44 | 2025-07-20 | 0.0.0 |
| GPT-4o (2024-11-20) | 21.62 | $1.53 | 2025-07-20 | 0.0.0 |
| Llama 4 Maverick Instruct | 21.04 | $0.31 | 2025-07-20 | 0.0.0 |
| Gemini 2.0 Flash | 13.52 | — | 2025-07-26 | 0.0.0 |
| Llama 4 Scout Instruct | 9.06 | $0.12 | 2025-07-20 | 0.0.0 |
| Qwen2.5-Coder 32B Instruct | 9.00 | $0.07 | 2025-08-03 | 1.0.0 |
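Raw score is not the only consideration: dividing the SWE-bench score by the reported per-task cost gives a rough cost-effectiveness ranking. The sketch below does this for a handful of rows copied from the table above.

```python
# Rank a few models from the SWE-bench table above by score per dollar.
# (score, avg cost per task in USD) pairs are copied from the table.
results = {
    "Gemini 3 Pro Preview": (74.20, 0.46),
    "Claude 4.5 Sonnet":    (70.60, 0.56),
    "GPT-5 (medium)":       (65.00, 0.28),
    "GPT-5 mini (medium)":  (59.80, 0.04),
    "Qwen3-Coder 480B":     (55.40, 0.25),
    "GLM-4.5":              (54.20, 0.30),
}

by_value = sorted(results.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
for name, (score, cost) in by_value:
    print(f"{name:22s} score={score:5.2f}  cost=${cost:.2f}  score/$={score / cost:7.1f}")
```

On this crude metric, GPT-5 mini stands out: it scores within striking distance of the frontier models at a fraction of the per-task cost.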
Conclusion
Looking at the SWE-bench results, all of the open-source LLMs that perform well at coding (Qwen3-Coder, GLM-4.5, Kimi K2) come from Chinese labs.