Sep 25, 2023
Claude Doesn't Get the Attention It Deserves
A short note on Claude's long-context advantage, why full-document context matters, and Anthropic's analysis of prompting strategies.
Claude doesn't get the attention it deserves.
For longer documents like research papers, technical reports, or legal texts, the 32k-token context window GPT-4 offers is often not enough.
You can split the documents into smaller parts, of course, but then you always risk that an answer, or part of it, isn't in the section you think it is. Retrieval-Augmented Generation (RAG) helps a lot with that. But especially for tasks like summarization or paraphrasing, it's much better if a model can handle the full text all at once.
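To make the splitting risk concrete, here is a minimal sketch. It assumes whitespace-separated words as a rough stand-in for tokens and uses a made-up three-sentence document; the helper names `split_into_chunks` and `chunks_containing` are hypothetical, not from any library.

```python
def split_into_chunks(words, chunk_size):
    """Split a list of words into consecutive, non-overlapping chunks."""
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def chunks_containing(chunks, phrase):
    """Return the indices of chunks that contain the whole phrase."""
    return [i for i, chunk in enumerate(chunks) if phrase in " ".join(chunk)]


# Toy document (invented for illustration); words approximate tokens.
document = (
    "The contract starts in January . "
    "Either party may terminate with ninety days notice . "
    "Fees are due monthly ."
)
words = document.split()
answer = "terminate with ninety days notice"

# A tiny "context window" of 10 words forces a split mid-sentence.
chunks = split_into_chunks(words, chunk_size=10)
hits = chunks_containing(chunks, answer)

print("answer in full text:", answer in " ".join(words))
print("chunks holding the complete answer:", hits)
```

In this toy case the chunk boundary falls in the middle of the answer, so no single chunk holds it even though the full text does. Overlapping chunks or retrieval can reduce the problem, but a model that reads the whole document avoids it entirely.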
Anthropic just released an analysis of different prompting strategies and how they affect the accuracy of its long-context models (up to 95k tokens).
One thing that surprised me is that Claude Instant 1.2 doesn't seem to show the "Lost in the Middle" effect, popularized by a paper of the same name, where a model recalls information from the middle of a long context worse than information near the beginning or end.