Which AI model to use? (here's a list and recommendations)

Last updated: 1.1.2025

Are you still using the web version of ChatGPT? You’re missing out on a lot! The question is no longer simply “Will it do what I want?” but “Which model can I use to achieve results with consistent quality at the best price?” This is because API usage is billed per token rather than at a flat rate.
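To make per-token billing concrete, here is a minimal sketch of the arithmetic. The function name `request_cost` and the token counts are made up for illustration; the prices ($2.50 input, $10 output per 1M tokens) are the GPT-4o figures listed in this article.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Illustrative request: 2,000 input tokens and 500 output tokens.
cost = request_cost(2_000, 500, 2.50, 10.00)
print(f"${cost:.4f} per request")
```

Input and output tokens are priced separately, so a model that writes long answers can cost more than one that reads long prompts.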

When comparing AI models from various providers, it’s all about adjusting expectations. We know that models are becoming faster, cheaper, and more powerful. However, larger and more expensive doesn’t always equate to better performance. In some cases, smaller and less expensive models may actually be more effective. Here’s an updated overview of current models, their pricing, and performance.

Prices differ widely: output pricing in the table below ranges from $0.14 to $75 per 1M tokens. So choose (and test) your model wisely.

Comparison of Large Language Model specifications, pricing, and recommended usage (sorted by price, cheapest first)

Model	Parameters	Inference Speed	Reasoning Capability	Pricing	Recommended Usage	Source
Amazon Nova Micro	2B	Fast	Moderate	Input: $0.04, Output: $0.14 per 1M tokens	Lightweight applications, mobile devices, cost-sensitive use cases	[1]
Gemini 1.5 Flash-8B ≤128k	8B	Very Fast	Moderate	Input: $0.04, Output: $0.15 per 1M tokens	Lightweight applications, mobile devices, cost-sensitive use cases	[2]
Amazon Nova Lite	6B	Fast	Moderate	Input: $0.06, Output: $0.24 per 1M tokens	Lightweight applications, mobile devices, cost-sensitive use cases	[1]
Gemini 1.5 Flash ≤128k	125B	Fast	Moderate	Input: $0.07, Output: $0.30 per 1M tokens	Rapid response applications, simple language tasks, chatbots	[2]
Gemini 1.5 Flash-8B >128k	8B	Very Fast	Moderate	Input: $0.07, Output: $0.30 per 1M tokens	Lightweight applications, mobile devices, cost-sensitive use cases	[2]
GPT-4o Mini	6B	Fast	Moderate	Input: $0.15, Output: $0.60 per 1M tokens	Lightweight applications, cost-sensitive use cases	[3]
Gemini 1.5 Flash >128k	125B	Fast	Moderate	Input: $0.15, Output: $0.60 per 1M tokens	Rapid response applications, simple language tasks, chatbots	[2]
Amazon Nova Pro	30B	Moderate	High	Input: $0.80, Output: $3.20 per 1M tokens	General-purpose language tasks, complex reasoning	[1]
Claude 3.5 Haiku	175B	Fast	Moderate	Input: $0.80, Output: $4 per 1M tokens	Concise, creative writing	[4]
Gemini 1.5 Pro ≤128k	175B	Moderate	High	Input: $1.25, Output: $5 per 1M tokens	General-purpose language tasks, complex reasoning, long-form content generation	[2]
GPT-4o	175B	Moderate	High	Input: $2.50, Output: $10 per 1M tokens	General-purpose language tasks, complex reasoning	[3]
Gemini 1.5 Pro >128k	175B	Moderate	High	Input: $2.50, Output: $10 per 1M tokens	General-purpose language tasks, complex reasoning, long-form content generation	[2]
o1-mini	6B	Fast	Moderate	Input: $3, Output: $12 per 1M tokens	Lightweight applications, cost-sensitive use cases	[3]
Claude 3.5 Sonnet	175B	Moderate	High	Input: $3, Output: $15 per 1M tokens	Creative writing, poetry generation	[4]
o1-preview	175B	Moderate	High	Input: $15, Output: $60 per 1M tokens	Cutting-edge research, advanced language tasks	[3]
Claude 3 Opus	175B	Moderate	High	Input: $15, Output: $75 per 1M tokens	Comprehensive language tasks, long-form content generation	[4]
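To see what these differences mean for a real workload, the following sketch ranks a few models by monthly cost. The prices are copied from the table above; the workload (100,000 requests per month, 1,000 input and 300 output tokens each) and the helper names `monthly_cost` and `rank_by_cost` are assumptions for illustration only. It compares price, not output quality, so you still need to test each candidate on your own task.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Amazon Nova Micro": (0.04, 0.14),
    "GPT-4o Mini": (0.15, 0.60),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Claude 3 Opus": (15.00, 75.00),
}


def monthly_cost(model: str, requests: int,
                 input_tokens: int, output_tokens: int) -> float:
    """USD cost of `requests` calls with the given token counts per call."""
    in_price, out_price = PRICES[model]
    per_request = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return requests * per_request


def rank_by_cost(requests: int, input_tokens: int, output_tokens: int):
    """Return (model, monthly cost) pairs, cheapest first."""
    costs = {m: monthly_cost(m, requests, input_tokens, output_tokens)
             for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical workload: 100,000 requests, 1,000 input + 300 output tokens each.
for model, cost in rank_by_cost(100_000, 1_000, 300):
    print(f"{model:<20} ${cost:,.2f} per month")
```

For this assumed workload the cheapest and most expensive entries differ by a factor of several hundred, which is why a short trial run on a cheaper model is usually worth the effort before committing to a premium one.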

Sources

[1] Amazon AWS Pricing: https://aws.amazon.com/sagemaker/pricing/
[2] Google Gemini Pricing: https://cloud.google.com/gemini/pricing
[3] OpenAI Pricing: https://openai.com/pricing
[4] Anthropic Pricing: https://www.anthropic.com/pricing
