Comprehensive AI Model Rankings: Performance Comparison of Over 30 LLMs

Explore our detailed ranking of over 30 AI models, including GPT-4o, Llama 3, and more. Compare key metrics such as quality, output speed (tokens per second), latency (seconds to first response chunk), and price (USD per 1M tokens) to find the best AI model for your needs. Stay informed with the latest insights on leading large language models (LLMs) and their performance.
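Two of the table's metrics map directly onto back-of-envelope estimates: price (USD per 1M tokens) gives request cost, and latency plus speed give rough wall-clock response time. A minimal sketch, using GPT-4o mini's figures from the table ($0.26/1M tokens, 130.4 tokens/s, 0.39 s latency); the helper names are illustrative, not from any API:

```python
# Back-of-envelope helpers for the metrics in the table:
#   price   - USD per 1M tokens (blended)
#   speed   - output tokens per second
#   latency - seconds until the first chunk arrives

def request_cost(total_tokens: int, price_per_1m_usd: float) -> float:
    """Approximate cost of a request at a blended per-1M-token price."""
    return total_tokens / 1_000_000 * price_per_1m_usd

def response_time(output_tokens: int, latency_s: float, tokens_per_s: float) -> float:
    """Rough wall-clock time: time to first chunk plus streaming time."""
    return latency_s + output_tokens / tokens_per_s

# GPT-4o mini figures from the table below
print(round(request_cost(10_000, 0.26), 4))       # 0.0026 (USD)
print(round(response_time(500, 0.39, 130.4), 2))  # 4.22 (seconds)
```

These are simplifications: real providers usually price input and output tokens separately, so a blended figure is only a first approximation.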

| Model | Creator | Context | Quality | Price (USD / 1M tokens) | Speed (tokens/s) | Latency (s to first chunk) |
| --- | --- | --- | --- | --- | --- | --- |
| o1-preview | OpenAI | 128k | 85 | 26.25 | 31.4 | 33.09 |
| o1-mini | OpenAI | 128k | 82 | 5.25 | 72.5 | 14.51 |
| GPT-4o (Aug '24) | OpenAI | 128k | 77 | 4.38 | 104.1 | 0.39 |
| GPT-4o (May '24) | OpenAI | 128k | 77 | 7.5 | 110.9 | 0.37 |
| Claude 3.5 Sonnet | Anthropic | 200k | 77 | 6 | 68.3 | 0.94 |
| Qwen2.5 72B | Alibaba | 131k | 75 | 0.38 | 29.6 | 0.58 |
| GPT-4 Turbo | OpenAI | 128k | 74 | 15 | 38.4 | 0.57 |
| Mistral Large 2 | Mistral | 128k | 73 | 3.75 | 37.5 | 0.52 |
| Gemini 1.5 Pro | Google | 2m | 72 | 5.25 | 64.9 | 0.8 |
| Llama 3.1 405B | Meta | 128k | 72 | 5 | 27.8 | 0.65 |
| GPT-4o mini | OpenAI | 128k | 71 | 0.26 | 130.4 | 0.39 |
| Claude 3 Opus | Anthropic | 200k | 70 | 30 | 24.9 | 1.75 |
| Qwen2 72B | Alibaba | 128k | 69 | 0.63 | 51.3 | 0.34 |
| DeepSeek-Coder-V2 | DeepSeek | 128k | 67 | 0.17 | 16.8 | 1.14 |
| DeepSeek-V2 | DeepSeek | 128k | 66 | 0.17 | 16.8 | 1.22 |
| DeepSeek-V2.5 | DeepSeek | 128k | 66 | 0.17 | 16.8 | 1.21 |
| Llama 3.1 70B | Meta | 128k | 65 | 0.88 | 77.1 | 0.38 |
| Jamba 1.5 Large | AI21 Labs | 256k | 64 | 3.5 | 61.5 | 1.02 |
| Sonar Large | Perplexity | 33k | 62 | 1 | 45.7 | 0.24 |
| Llama 3 70B | Meta | 8k | 62 | 0.9 | 54.4 | 0.44 |
| Mixtral 8x22B | Mistral | 65k | 61 | 1.2 | 59.2 | 0.34 |
| Mistral Small (Sep '24) | Mistral | 128k | 60 | 0.3 | 80.2 | 0.44 |
| Gemini 1.5 Flash | Google | 1m | 60 | 0.13 | 310.5 | 0.37 |
| Yi-Large | 01.AI | 32k | 58 | 3 | 63.9 | 0.39 |
| Claude 3 Sonnet | Anthropic | 200k | 57 | 6 | 53.7 | 0.96 |
| Reka Core | Reka AI | 128k | 57 | 4 | 14.5 | 1.13 |
| Command-R+ (Aug '24) | Cohere | 128k | 56 | 5.19 | 47.8 | 0.53 |
| Mistral Large | Mistral | 33k | 56 | 6 | 35 | 0.47 |
| Claude 3 Haiku | Anthropic | 200k | 54 | 0.5 | 131.1 | 0.48 |
| GPT-3.5 Turbo | OpenAI | 16k | 53 | 0.75 | 83.7 | 0.38 |
| Llama 3.1 8B | Meta | 128k | 53 | 0.14 | 278.4 | 0.3 |
| Mistral NeMo | Mistral | 128k | 52 | 0.15 | 134.1 | 0.25 |
| Command-R (Aug '24) | Cohere | 128k | 51 | 0.51 | 106.6 | 0.35 |
| Mistral Small (Feb '24) | Mistral | 33k | 50 | 1.5 | 47.9 | 0.42 |
| DBRX | Databricks | 33k | 50 | 1.16 | 86.5 | 0.44 |
| Gemma 2 27B | Google | 8k | 49 | 0.8 | 66.4 | 0.36 |
| Gemma 2 9B | Google | 8k | 47 | 0.2 | 114.2 | 0.27 |
| Llama 3 8B | Meta | 8k | 46 | 0.15 | 98.3 | 0.35 |
| Jamba 1.5 Mini | AI21 Labs | 256k | 46 | 0.25 | 162.5 | 0.8 |
| Command-R+ (Apr '24) | Cohere | 128k | 46 | 6 | 47.2 | 0.56 |
| Reka Flash | Reka AI | 128k | 46 | 1.1 | 29.9 | 0.97 |
| OpenChat 3.5 | OpenChat | 8k | 43 | 0.06 | 69.5 | 0.28 |
| Mixtral 8x7B | Mistral | 33k | 42 | 0.5 | 86.6 | 0.33 |
| Sonar Small | Perplexity | 33k | 41 | 0.2 | 141.8 | 0.19 |
| Command-R (Mar '24) | Cohere | 128k | 36 | 0.75 | 106.2 | 0.39 |
| Codestral-Mamba | Mistral | 256k | 36 | 0.25 | 94.9 | 0.55 |
| Llama 2 Chat 70B | Meta | 4k | 34 | 1.39 | 47.6 | 0.36 |
| Reka Edge | Reka AI | 64k | 30 | 0.55 | 0 | 0 |
| Gemma 7B | Google | 8k | 28 | 0.07 | 1 | 0.85 |
| Jamba Instruct | AI21 Labs | 256k | 28 | 0.55 | 77.9 | 0.73 |
| Command | Cohere | 4k | 26 | 1.44 | 21.9 | 0.52 |
| Llama 2 Chat 13B | Meta | 4k | 25 | 0.3 | 53.6 | 0.4 |
| Mistral 7B | Mistral | 33k | 24 | 0.16 | 107.3 | 0.28 |
| Command Light | Cohere | 4k | 14 | 0.38 | 34.4 | 0.54 |
| Llama 2 Chat 7B | Meta | 4k | 10 | 0.33 | 122.6 | 0.25 |
| Phi-3 Medium 14B | Microsoft Azure | 128k | 0 | 0.45 | 55 | 0.41 |
| Codestral | Mistral | 33k | 0 | 0.3 | 49.6 | 0.48 |
| Mistral Medium | Mistral | 33k | 0 | 4.09 | 38.3 | 0.81 |
| Sonar 3.1 Large | Perplexity | 131k | 0 | 1 | 58.9 | 0.21 |
| GPT-4 | OpenAI | 8k | 0 | 37.5 | 28.3 | 0.58 |
| Claude 2.0 | Anthropic | 100k | 0 | 12 | 31.3 | 1.08 |
| Claude Instant | Anthropic | 100k | 0 | 1.2 | 74.9 | 0.57 |
| Claude 2.1 | Anthropic | 200k | 0 | 12 | 28.4 | 1.58 |
| GPT-3.5 Turbo Instruct | OpenAI | 4k | 0 | 1.63 | 107.8 | 0.53 |
| Gemini 1.0 Pro | Google | 33k | 0 | 0.75 | 99.7 | 1.18 |
| Sonar 3.1 Small | Perplexity | 131k | 0 | 0.2 | 135.6 | 0.18 |
| Pixtral 12B | Mistral | 128k | 0 | 0.15 | 80 | 0.59 |

Note: a quality score of 0 indicates no quality benchmark result is available for that model, not a genuine score of zero.
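To illustrate how the table's columns can be traded off against each other, here is a minimal sketch that ranks a handful of rows by "quality per dollar" (quality score divided by blended price). The figures are copied from the table above; the scoring formula is just one illustrative choice, not an official methodology:

```python
# Rank selected models from the table by quality per dollar.
# Tuples: (name, quality, price in USD per 1M tokens, speed in tokens/s),
# all values taken from the table above.
models = [
    ("GPT-4o mini",       71,  0.26, 130.4),
    ("Claude 3.5 Sonnet", 77,  6.00,  68.3),
    ("Llama 3.1 8B",      53,  0.14, 278.4),
    ("Gemini 1.5 Flash",  60,  0.13, 310.5),
    ("o1-preview",        85, 26.25,  31.4),
]

def quality_per_dollar(m):
    """Quality score per USD per 1M tokens; higher is more cost-effective."""
    name, quality, price, speed = m
    return quality / price

for name, quality, price, speed in sorted(models, key=quality_per_dollar, reverse=True):
    print(f"{name:18}  quality/$ = {quality / price:6.1f}  ({speed} tok/s)")
```

On these rows, the small, cheap models (Gemini 1.5 Flash, Llama 3.1 8B) dominate quality per dollar, while o1-preview ranks last despite the highest raw quality score; which trade-off is right depends on your workload.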