Anthropic has a fast new AI model — and a clever new way to interact with chatbots
Anthropic says 3.5 Sonnet outperforms 3 Opus, and its benchmarks show it does so by a pretty wide margin.
“We’re not a company that believes a certain set of things about the dangers that AI systems are going to have,” Amodei says.
Jan Leike, a key OpenAI researcher who resigned earlier this month citing “safety concerns,” has joined competitor Anthropic to “work on scalable oversight, weak-to-strong generalization, and automated alignment research.”
Even at this early stage, though, Anthropic’s research provides an exciting framework for making an LLM’s “black box” results that much more interpretable and, potentially, controllable.
“The process of implementing the policy has surfaced a range of important questions, projects, and dependencies that might otherwise have taken longer to identify or gone undiscussed.”
The Claude mobile app can act as a chatbot, and users can also upload photos straight to the app for “image analysis”.
A great resource with examples of effective prompts, from Claude provider Anthropic.
Less than a month after overtaking GPT-4 Turbo in the Chatbot Arena, Anthropic’s Claude 3 Opus has been pushed into second place in the overall category, followed by GPT-4-1106-preview, an older version of GPT-4 Turbo, in third place.
Developing ways to measure the persuasive capabilities of AI models is important because it serves as a proxy measure of how well AI models can match human skill in an important domain, and because persuasion may ultimately be tied to certain kinds of misuse, such as using AI to generate disinformation, or persuading people to take actions against their own interests.
A large language model (LLM) can be convinced to tell you how to build a bomb if you prime it with a few dozen less-harmful questions first.
Amazon invests US $4 billion in Anthropic, saying: “We believe our strategic collaboration with Anthropic will further improve our customers’ experiences, and look forward to what’s next.”
Claude 3 Opus gives you a very similar experience to Google Gemini Advanced. There are no plugins, there is no code interpreter, and the two generate text at very similar speeds.
“For the first time, the best available models—Opus for advanced tasks, Haiku for cost and efficiency—are from a vendor that isn’t OpenAI”
Claude 3 performs better than the GPT family of language models that power ChatGPT on a series of benchmark cognitive tests. In our tests, we found that Claude is more articulate than ChatGPT, and its answers are usually better written and easier to read.
An Anthropic engineer shared a story from internal testing of Opus where the model seemingly demonstrated a type of “metacognition” or self-awareness during a “needle-in-the-haystack” evaluation, leading to both curiosity and skepticism online.
When Anthropic says that Claude 3 can outperform GPT-4 Turbo, which is currently still widely seen as the market leader in terms of general capability and low hallucinations, one needs to take that with a grain of salt—or a dose of vibes.
Anthropic says Opus outperformed most models in several benchmarking tests. It showed better graduate-level reasoning than OpenAI’s GPT-4, getting 50.4 percent in that test over GPT-4’s 35.7 percent.
“Over the last year, the start-up’s valuation has tripled to $15 billion… It hit roughly $8 million in monthly revenue last year and expects that to grow by around eightfold this year…”
Microsoft has invested billions of dollars in OpenAI, the maker of ChatGPT, while Amazon and Google have each committed billions of dollars to Anthropic, another leading A.I. start-up.
Anthropic has announced that the latest update of its chatbot, Claude 2.1, can digest up to 200,000 tokens at once for Pro tier users, which it says equals over 500 pages of material. The company also says Claude will hallucinate half as often as before.
Although OpenAI is well known for its ChatGPT – and for the GPT-4 ‘large language model’ that powers ChatGPT – it’s not the only game in town. Anthropic – a breakaway group of OpenAI engineers – have been hard at work at ‘Claude’, their own answer to ChatGPT. This week they’ve announced a paid version. […]