# What is Context Length?
Context length (also called context window) is the maximum number of tokens that a large language model can process in a single interaction. It includes both the input prompt and the generated output.
## Context Length by Model
| Model | Context Length |
|---|---|
| GPT-4 | 8K-128K tokens |
| Claude 3 | 200K tokens |
| Gemini 1.5 | 1M+ tokens |
| Llama 3 | 8K-128K tokens |
## Why It Matters

- **Input capacity:** determines how much context (documents, conversation history, instructions) you can provide in a single prompt.
- **Output length:** input and output share the same window, so the prompt plus the generated response must fit within the limit.
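As a sketch of that budgeting, the check below estimates whether a prompt plus a planned output length fits a given window. The ~4-characters-per-token figure is a rough heuristic for English, not a real tokenizer, so treat the result as an estimate and leave a safety margin:

```python
def fits_in_context(prompt: str, max_output_tokens: int,
                    context_window: int, chars_per_token: float = 4.0) -> bool:
    """Rough check: does the prompt plus the planned output fit the window?

    chars_per_token=4.0 is a crude heuristic for English prose; real
    counts come from the model's own tokenizer.
    """
    estimated_prompt_tokens = len(prompt) / chars_per_token
    return estimated_prompt_tokens + max_output_tokens <= context_window

# A 400,000-character document (~100K tokens) plus a 4K-token answer
# fits a 200K-token window but not an 8K one.
doc = "x" * 400_000
print(fits_in_context(doc, 4_000, 200_000))  # True
print(fits_in_context(doc, 4_000, 8_000))    # False
```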
## Use Cases
- Long document analysis
- Code review
- Research synthesis
- Extended conversations
## Token Estimation

### English

- ~4 characters per token
- ~0.75 words per token
- 1,000 tokens ≈ 750 words

### Code

- Token counts are more variable
- Depends on the programming language and the tokenizer
- Typically more tokens per character than English prose
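The English rules of thumb above can be turned into quick back-of-the-envelope helpers. These are approximations only; for exact counts you would use the model's own tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Estimate token count via the ~4-characters-per-token rule of thumb."""
    return max(1, round(len(text) / 4))

def tokens_to_words(tokens: int) -> int:
    """Apply the ~0.75-words-per-token rule: tokens -> approximate words."""
    return round(tokens * 0.75)

print(estimate_tokens("x" * 4000))  # 1000 tokens for 4,000 characters
print(tokens_to_words(1000))        # 750 words, matching the rule above
```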
## Strategies for Long Content

- **Chunking:** break the content into smaller pieces and process each piece separately.
- **Summarization:** compress the information before sending it to the model.
- **RAG (Retrieval-Augmented Generation):** retrieve only the portions relevant to the query.
- **Hierarchical:** combine a high-level summary with drill-down details.
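The chunking strategy can be sketched as a simple overlapping splitter. The sizes here are illustrative character counts; real pipelines usually chunk on token boundaries and try to respect sentence or paragraph breaks:

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character-based chunks.

    The overlap keeps context that straddles a boundary visible in
    both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

parts = chunk_text("x" * 5000)
print(len(parts))     # 3 chunks
print(len(parts[0]))  # 2000 characters (the last chunk may be shorter)
```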
## Considerations

- Cost scales with the number of tokens processed
- Attention quality can degrade over very long contexts
- The "lost in the middle" phenomenon: models recall information at the beginning and end of a long prompt more reliably than information buried in the middle
- Quality vs. quantity trade-off: more context is not always better