“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time
“For the first time, the best available models—Opus for advanced tasks, Haiku for cost and efficiency—are from a vendor that isn’t OpenAI”
“For the first time, the best available models—Opus for advanced tasks, Haiku for cost and efficiency—are from a vendor that isn’t OpenAI”
Claude 3 performs better than the GPT family of language models that power ChatGPT on a series of benchmark cognitive tests. On our tests, we found that Claude is more articulate than ChatGPT, and its answers are usually better written and easier to read.
When Anthropic says that Claude 3 can outperform GPT-4 Turbo, which is currently still widely seen as the market leader in terms of general capability and low hallucinations, one needs to take that with a grain of salt—or a dose of vibes.
Anthropic says Opus outperformed most models in several benchmarking tests. It showed better graduate-level reasoning than OpenAI’s GPT-4, getting 50.4 percent in that test over GPT-4’s 35.7 percent.