Context Window

The maximum amount of text (measured in tokens) that a language model can process at once, including both the input prompt and generated output.

Also known as: Context Length, Token Limit

What is a Context Window?

A context window (or context length) is the maximum number of tokens a language model can process in a single interaction. It encompasses both the input prompt and the model's response, defining the "working memory" of the model.

Context Window Sizes

Historical

  • GPT-2: 1,024 tokens
  • GPT-3: 2,048 tokens (4,096 in later GPT-3.5 models)

Current Generation

  • GPT-4: 8K-128K tokens (8K at launch, 128K with GPT-4 Turbo)
  • Claude: 100K-200K tokens (200K from Claude 2.1 onward)
  • Gemini: up to 1M tokens (Gemini 1.5 Pro)

Token Basics

What is a Token?

  • One token is roughly 4 characters of English text
  • ~1,000 tokens ≈ 750 English words
  • Counts vary by language and by tokenizer (see the sketch below)
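
As a concrete illustration, token counts can be measured with a tokenizer library. The sketch below uses OpenAI's open-source tiktoken; cl100k_base is the encoding used by GPT-4-era OpenAI models, and other model families ship their own tokenizers:

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the tokenizer used by GPT-4-era OpenAI models;
# other models use different tokenizers, so counts will differ.
enc = tiktoken.get_encoding("cl100k_base")

text = "A context window is the model's working memory."
tokens = enc.encode(text)

print(len(text), "characters")  # character count
print(len(tokens), "tokens")    # roughly len(text) / 4 for English prose
```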

Token Counting

  • Input tokens (the prompt)
  • Output tokens (the generated response)
  • Input plus output must fit within the window (see the check below)
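
A minimal sketch of the budgeting this implies; the window size and output reservation below are illustrative numbers, not tied to any particular model:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

CONTEXT_WINDOW = 8_192     # illustrative window size, not any specific model's
MAX_OUTPUT_TOKENS = 1_024  # budget reserved for the model's response

def fits_in_window(prompt: str) -> bool:
    """True if the prompt plus the reserved output budget fits in the window."""
    input_tokens = len(enc.encode(prompt))
    return input_tokens + MAX_OUTPUT_TOKENS <= CONTEXT_WINDOW

print(fits_in_window("Summarize the attached report."))  # True for short prompts
```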

Implications

Capabilities

  • Process longer documents in a single pass
  • Supply more supporting context, improving response quality
  • Sustain extended multi-turn conversations without truncation

Limitations

  • Cost scales with the number of tokens processed
  • Latency increases with longer inputs
  • The "lost in the middle" phenomenon: information buried in the middle of a long context is recalled less reliably than information at the start or end

Strategies for Long Content

Chunking: Split long content into smaller pieces that each fit within the window; a minimal chunker is sketched below.
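
A minimal token-based chunker, reusing the tiktoken encoding from the examples above (the 500-token chunk size is an arbitrary example value):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def chunk_by_tokens(text: str, max_tokens: int = 500) -> list[str]:
    """Split text into consecutive pieces of at most max_tokens tokens each."""
    tokens = enc.encode(text)
    return [
        enc.decode(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

chunks = chunk_by_tokens("a long document ... " * 1000)
print(len(chunks), "chunks")
```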

Summarization: Compress earlier or less important content into shorter summaries so its key points fit in fewer tokens; see the sketch below.
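
One common pattern is hierarchical (map-reduce) summarization. In this hedged sketch, llm_summarize is a hypothetical stand-in for a real model call, and chunk_by_tokens is the helper defined above:

```python
def llm_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call; here it just truncates.

    Replace with a request to whatever summarization model you use.
    """
    return text[:200]

def summarize_long_document(document: str) -> str:
    # Map: summarize each chunk independently, so every call fits the window.
    partials = [llm_summarize(chunk) for chunk in chunk_by_tokens(document)]
    # Reduce: summarize the concatenated partial summaries into one result.
    return llm_summarize("\n\n".join(partials))
```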

RAG (Retrieval-Augmented Generation): Retrieve only the passages relevant to the current query and place those, rather than an entire corpus, in the prompt; see the retrieval sketch below.
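
A minimal retrieval sketch using cosine similarity over embeddings; embed is a hypothetical placeholder, and any real embedding model would stand in for it:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in: returns a deterministic but semantically
    meaningless vector. Replace with a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(64)

def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    vectors = [embed(c) for c in chunks]  # in practice, precompute and cache
    scores = [
        float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        for v in vectors
    ]
    ranked = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in ranked]
```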

Sliding Window: Process long content sequentially in overlapping segments so information at segment boundaries is not lost; sketched below.
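
A sketch of overlapping sliding windows over a token sequence; the window and overlap sizes are illustrative:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def sliding_windows(text: str, window: int = 500, overlap: int = 100):
    """Yield overlapping token windows so content spanning a boundary is kept."""
    tokens = enc.encode(text)
    stride = window - overlap
    for start in range(0, max(len(tokens) - overlap, 1), stride):
        yield enc.decode(tokens[start : start + window])

windows = list(sliding_windows("a long transcript ... " * 2000))
print(len(windows), "overlapping windows")
```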

Best Practices

  • Use the window efficiently; avoid padding prompts with redundant text
  • Prioritize relevant information, placing the most important content at the start or end of the prompt
  • Weigh token cost against response quality
  • Test behavior across a range of input lengths