Which AI model to use? (here's a list and recommendations)

Last updated: 1.1.2025

Are you still using the web version of ChatGPT? You’re missing out on a lot! The question is no longer simply “Will it do what I want?” but “Which model gives me consistent quality at the best price?” That shift matters because costs are now determined by token usage rather than a flat rate.

Comparing AI models from different providers is largely a matter of adjusting expectations. Models keep getting faster, cheaper, and more powerful, but larger and more expensive doesn’t always mean better performance. In some cases, a smaller and less expensive model may actually be more effective for the task at hand. Here’s an updated overview of current models, their pricing, and performance.

As the table below shows, prices differ considerably: input costs range from $0.04 to $15 per 1M tokens. So choose (and test) your model wisely.

| Model | Parameters | Inference Speed | Reasoning Capability | Pricing (USD per 1M tokens) | Recommended Usage | Source |
| Amazon Nova Micro | 2B | Fast | Moderate | Input: $0.04, Output: $0.14 | Lightweight applications, mobile devices, cost-sensitive use cases | [1] |
| Gemini 1.5 Flash-8B ≤128k | 8B | Very Fast | Moderate | Input: $0.04, Output: $0.15 | Lightweight applications, mobile devices, cost-sensitive use cases | [2] |
| Amazon Nova Lite | 6B | Fast | Moderate | Input: $0.06, Output: $0.24 | Lightweight applications, mobile devices, cost-sensitive use cases | [1] |
| Gemini 1.5 Flash ≤128k | 125B | Fast | Moderate | Input: $0.07, Output: $0.30 | Rapid response applications, simple language tasks, chatbots | [2] |
| Gemini 1.5 Flash-8B >128k | 8B | Very Fast | Moderate | Input: $0.07, Output: $0.30 | Lightweight applications, mobile devices, cost-sensitive use cases | [2] |
| GPT-4o Mini | 6B | Fast | Moderate | Input: $0.15, Output: $0.60 | Lightweight applications, cost-sensitive use cases | [3] |
| Gemini 1.5 Flash >128k | 125B | Fast | Moderate | Input: $0.15, Output: $0.60 | Rapid response applications, simple language tasks, chatbots | [2] |
| Amazon Nova Pro | 30B | Moderate | High | Input: $0.80, Output: $3.20 | General-purpose language tasks, complex reasoning | [1] |
| Claude 3.5 Haiku | 175B | Fast | Moderate | Input: $0.80, Output: $4 | Concise, creative writing | [4] |
| o1-mini | 6B | Fast | Moderate | Input: $3, Output: $12 | Lightweight applications, cost-sensitive use cases | [3] |
| Gemini 1.5 Pro ≤128k | 175B | Moderate | High | Input: $1.25, Output: $5 | General-purpose language tasks, complex reasoning, long-form content generation | [2] |
| GPT-4o | 175B | Moderate | High | Input: $2.50, Output: $10 | General-purpose language tasks, complex reasoning | [3] |
| Gemini 1.5 Pro >128k | 175B | Moderate | High | Input: $2.50, Output: $10 | General-purpose language tasks, complex reasoning, long-form content generation | [2] |
| Claude 3.5 Sonnet | 175B | Moderate | High | Input: $3, Output: $15 | Creative writing, poetry generation | [4] |
| o1-preview | 175B | Moderate | High | Input: $15, Output: $60 | Cutting-edge research, advanced language tasks | [3] |
| Claude 3 Opus | 175B | Moderate | High | Input: $15, Output: $75 | Comprehensive language tasks, long-form content generation | [4] |
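
To get a feel for what token-based pricing means in practice, here is a minimal Python sketch that estimates the bill for a given workload. The prices are copied from a few rows of the table above; the function names (`estimate_cost`, `rank_by_cost`) and the example workload are assumptions made up for illustration, not part of any provider's SDK.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# Only a few rows are included; add the others as needed.
PRICES = {
    "Amazon Nova Micro": (0.04, 0.14),
    "GPT-4o Mini": (0.15, 0.60),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "o1-preview": (15.00, 60.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimated cost in USD for the given token counts (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_by_cost(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: estimate_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Assumed workload: 10,000 requests with 1,000 input and 500 output tokens each.
    requests = 10_000
    for model, cost in rank_by_cost(requests * 1_000, requests * 500):
        print(f"{model:<20} ${cost:,.2f}")
```

For this assumed workload, the same job costs roughly $1 on Amazon Nova Micro and $450 on o1-preview, which is why it pays to test whether a cheaper model already delivers the quality you need.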

Sources

  1. Amazon AWS Pricing: https://aws.amazon.com/sagemaker/pricing/
  2. Google Gemini Pricing: https://cloud.google.com/gemini/pricing
  3. OpenAI Pricing: https://openai.com/pricing
  4. Anthropic Pricing: https://www.anthropic.com/pricing