| Model | Context window | Notes |
|---|---|---|
| Gemini 1.5 Pro | 1,000,000 tokens | Early-access model from Google; by far the largest context window on this list. |
| Claude 2.1 | 200,000 tokens | Very large context window; strong at long-document tasks. |
| Claude 2.0 | 100,000 tokens | Previous version, still widely used. |
| Command R+ (Cohere) | 128,000 tokens | RAG-optimized; aimed at enterprise use. |
| GPT-4 Turbo | 128,000 tokens | Faster, cheaper GPT-4 variant; used in ChatGPT and the OpenAI API. |
| GPT-4 | 32,768 tokens | Earlier GPT-4 versions. |
| Gemini 1.0 Pro | ~32,000 tokens | Competitive with GPT-4-32k. |
| Mixtral (Mistral) | ~32,000 tokens | Open-weight mixture-of-experts model, efficient; context limit estimated. |
| GPT-3.5 Turbo | 16,385 tokens | Available via the OpenAI API. |
| Mistral 7B | 8,192 tokens | Smaller open-weight model. |
| GPT-3.5 | 4,096 tokens | Default, older GPT-3.5 model. |
| LLaMA 2 70B | 4,096 tokens | Popular open-weight base model. |
| LLaMA 3 (expected) | TBD | Not yet released; expected to support a larger window. |
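One practical use of these numbers is checking whether a prompt will fit before sending it to a model. The sketch below is a minimal, hypothetical example: the model names, the `fits_in_context` helper, and the 4-characters-per-token ratio are all assumptions for illustration (real token counts depend on each model's tokenizer), with the limits taken from the table above.

```python
# Hypothetical sketch: pre-flight check that a prompt fits a model's
# context window, using the limits from the table above.
# The 4-chars-per-token ratio is a rough heuristic, not an exact count.

CONTEXT_WINDOWS = {
    "gemini-1.5-pro": 1_000_000,
    "claude-2.1": 200_000,
    "gpt-4-turbo": 128_000,
    "gpt-4": 32_768,
    "gpt-3.5-turbo": 16_385,
    "llama-2-70b": 4_096,
}

def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(model: str, prompt: str, reserved_for_output: int = 1024) -> bool:
    """True if the estimated prompt tokens plus reserved output tokens fit."""
    limit = CONTEXT_WINDOWS[model]
    return estimate_tokens(prompt) + reserved_for_output <= limit

print(fits_in_context("gpt-3.5-turbo", "hello " * 1000))  # short prompt fits
```

Reserving some of the window for the model's output matters in practice, since the limit covers input and output tokens combined.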