Speed of AI development is outpacing risk assessment

A new ‘foundation’ model built to power an AI chatbot seems to arrive every day. But are any of them really safe? Ars Technica investigates:

New AI systems routinely emerge that can “completely ace” existing benchmarks, [Cohere chief executive Aidan] Gomez said. “As models get better, the capabilities make these evaluations obsolete,” he said.

The problem of how to assess LLMs has shifted from academia to the boardroom, as generative AI has become the top investment priority of 70 percent of chief executives, according to a KPMG survey of more than 1,300 global CEOs.

“People won’t use technology they don’t trust,” said Shelley McKinley, chief legal officer at GitHub, a repository for code that is owned by Microsoft. “It’s incumbent on companies to put out trustworthy products.”

Assessment is always a hard problem – whether for people or for chatbots.

Read the full story here. (It’s excellent.)
