As OpenAI’s multimodal API launches broadly, research shows it’s still flawed

This week saw a big leap in the capabilities OpenAI offers its customers. But does the update fix any of the model's underlying weaknesses? Not really, says TechCrunch:

Hwang, who conducted a more systematic review of GPT-4 with vision’s capabilities, found that the model remains flawed in several significant — and problematic, in some cases — ways.

“I discovered that GPT-4 with vision often correctly described the positions of elements [in an image] but was less successful with their structural or relative relationships,” Hwang told TechCrunch in an email. “For example, it once correctly said that two curves on a line graph leaned upward, but incorrectly said which one was higher than the other. And it made quite a few errors with graphs in general, from incorrectly estimating the values on a bar or line graph to misinterpreting the colors in a legend.”

While the new tools are better, they’re still far from perfect.

Read the article here.
