OpenAI’s new “CriticGPT” model is trained to criticize GPT-4 outputs
OpenAI created CriticGPT to act as an AI assistant to human trainers who review programming code generated by the ChatGPT AI assistant.
Here we go again: chatbot fails are now a familiar meme. The tendency to make things up is holding chatbots back, but making things up is just what chatbots do.
People find it difficult to distinguish the GPT-4 model from a human agent when interacting with it in a two-person conversation.
“We argue that these falsehoods, and the overall activity of large language models, is better understood as bulls**t in the sense explored by Frankfurt: the models are in an important way indifferent to the truth of their outputs.”
Even at this early stage, though, Anthropic’s research provides an exciting framework for making an LLM’s “black box” results that much more interpretable and, potentially, controllable.
GPT-4 showed markedly superior performance compared with unspecialised junior doctors, whose specialist eye knowledge is comparable to that of general practitioners.
The visual recognition skills of the large language model fell far short of clinical standards, achieving a positive predictive value (PPV) of less than 25% in its best attempt at identifying image findings from a set of 100 chest X-rays.
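For readers unfamiliar with the metric, PPV is the fraction of findings a model flags as present that are actually present. A quick sketch with made-up numbers (not the study's data):

```python
def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)

# Illustrative only: if the model flagged 100 findings and just 24 were real,
# PPV = 24 / (24 + 76) = 0.24, consistent with the sub-25% figure above.
print(ppv(24, 76))  # 0.24
```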
Med-Gemini was tested on 14 medical benchmarks and established a new state-of-the-art (SoTA) performance on 10, surpassing the GPT-4 model family on every benchmark where a comparison could be made.
Researchers found that morality judgments given by ChatGPT-4 were "perceived as superior in quality to humans'" along a variety of dimensions, such as virtuousness and intelligence.
ChatGPT-4 generated acceptable messages to patients without any additional editing by radiation oncologists 58% of the time, and 7% of responses generated by GPT-4 were deemed unsafe by the radiation oncologists if left unedited.
Generative AI is continuing to improve — so publishers, grant-funding agencies and scientists must consider what constitutes ethical use of LLMs, and what over-reliance on these tools says about a research landscape that encourages hyper-productivity.
By ensuring that the first few tokens of a conversation remain in memory, the researchers' method allows a chatbot to keep chatting no matter how long the conversation goes.
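A rough sketch of the caching idea being described, assuming a token-level key-value cache: keep the first few entries plus a sliding window of recent ones, and evict the middle. This is an illustrative simplification, not the researchers' actual implementation.

```python
def evict(cache: list, n_sink: int = 4, window: int = 1024) -> list:
    """Keep the first n_sink entries plus the most recent `window` entries."""
    if len(cache) <= n_sink + window:
        return cache                          # short conversation: keep everything
    return cache[:n_sink] + cache[-window:]   # initial tokens + recent window

# Usage: the first four entries survive forever; the middle of a long chat is dropped.
cache = list(range(5000))                     # stand-in for per-token cache entries
kept = evict(cache)
assert kept[:4] == [0, 1, 2, 3] and len(kept) == 4 + 1024
```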
A team of researchers at New York University wondered if AI could learn like a baby. What could an AI model do when given a far smaller data set—the sights and sounds experienced by a single child learning to talk?
An artificial intelligence (AI) system trained to conduct medical interviews matched, or even surpassed, human doctors’ performance at conversing with simulated patients and listing possible diagnoses on the basis of the patients’ medical history.
We found that only GPT-4 nearly flawlessly processes inputs with unnatural errors, even under the extreme condition, a task that poses significant challenges for other LLMs and often even for humans…
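To see what such "unnatural errors" can look like, here is one hedged way to generate scrambled inputs, shuffling the interior letters of each word. The paper's actual corruption procedure may be more extreme (the quote mentions an "extreme condition"); this is illustrative only.

```python
import random

def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle a word's interior letters, keeping the first and last fixed."""
    if len(word) <= 3:
        return word
    inner = list(word[1:-1])
    rng.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]

def scramble(text: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    return " ".join(scramble_word(w, rng) for w in text.split())

print(scramble("large language models process scrambled inputs"))
# e.g. "lagre lnagauge mdoels porcess srcmbaled inputs"
```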
Because LLMs have been shown to "hallucinate" factually incorrect information, using them to make verifiably correct discoveries is a challenge. But what if we could harness the creativity of LLMs by identifying and building upon only their very best ideas?
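One way to picture that "only their very best ideas" approach is a propose-score-select loop: sample candidates, score each with a verifiable check, and seed the next round with the best so far. The sketch below uses toy stand-ins (a random-walk `propose` and a numeric `score`) rather than a real LLM or any published system's API.

```python
import random

# Hypothetical stand-ins: in the real setting `propose` would sample from an
# LLM and `score` would run a verifiable evaluation (tests, a proof, a metric).
def propose(rng: random.Random, best: float | None) -> float:
    return (0.0 if best is None else best) + rng.gauss(0, 1)

def score(candidate: float) -> float:
    return -abs(candidate - 10.0)             # toy objective: get close to 10

def refine(rounds: int = 20, samples: int = 8, seed: int = 0) -> float:
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        for _ in range(samples):
            cand = propose(rng, best)
            s = score(cand)
            if s > best_score:                # build only on the best idea so far
                best, best_score = cand, s
    return best

print(round(refine(), 2))                     # ratchets toward 10.0
```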
Artificial intelligence (AI) systems are often depicted as sentient agents poised to overshadow the human mind. But AI lacks the crucial human ability of innovation, researchers at the University of California, Berkeley have found.
This feature in Nature asks whether the poor use of AI is doing science more harm than good:
ChatGPT failed to accurately risk-stratify 35% of the patients studied, but the artificial intelligence (AI) chatbot was able to provide accurate treatment recommendations.
The current version of ChatGPT has limitations in accurately answering MCQs and generating correct and relevant rationales, particularly when it comes to referencing. To avoid potential risks, ChatGPT should be used under supervision.
“It will make it very easy for any researcher or group of researchers to create fake measurements on non-existent patients, fake answers to questionnaires or to generate a large data set on animal experiments.”
The authors describe the results as a “seemingly authentic database”.
The study revealed that large language models may, in fact, be able to understand and respond to emotional cues. Researchers found that LLMs produced higher-quality outputs when emotional language was used in prompts to AI chatbots.
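A hedged sketch of how such a comparison could be set up: the same task, with and without an emotional stimulus appended. The suffix wording and the `ask_llm` call are illustrative stand-ins, not the study's exact protocol or a real API.

```python
EMOTIONAL_SUFFIX = "This is very important to my career."  # one style of stimulus

def build_prompts(task: str) -> tuple[str, str]:
    """Return a plain prompt and the same prompt with an emotional suffix."""
    return task, f"{task} {EMOTIONAL_SUFFIX}"

plain, emotional = build_prompts("Summarise this discharge note for the patient.")
# responses = [ask_llm(p) for p in (plain, emotional)]  # hypothetical API call;
# the study's finding is that the second prompt tends to yield better output
```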
Data sets that are poorly thought out or insufficiently described increase the risk of ‘garbage in, garbage out’ studies and the propagation of biases, rendering outcomes meaningless or, even worse, dangerous.
This feature from COSMOS Magazine asks — can we even ask if AI is conscious? And what does ‘consciousness’ even mean?
In this latest study, DeepMind researchers found "Take a deep breath and work on this problem step by step" to be the most effective prompt when used with Google's PaLM 2 language model. The phrase achieved the top accuracy score of 80.2 percent in tests against GSM8K, a data set of grade-school math word problems.
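A minimal sketch of how a candidate prompt prefix might be scored against a question set, in the spirit of that evaluation. The `query_model` callable and the toy dataset are hypothetical stand-ins, not GSM8K or DeepMind's harness.

```python
PREFIX = "Take a deep breath and work on this problem step by step."

def accuracy(dataset: list[tuple[str, str]], query_model) -> float:
    """Fraction of questions whose model reply contains the expected answer."""
    correct = 0
    for question, answer in dataset:
        reply = query_model(f"{PREFIX}\n\n{question}")
        correct += answer in reply            # crude containment check
    return correct / len(dataset)

# Toy usage with a stub model that always gives the same reply:
toy = [("What is 2 + 3?", "5"), ("What is 10 - 4?", "6")]
print(accuracy(toy, lambda prompt: "The answer is 5."))  # 0.5 with this stub
```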