Is AI too dangerous to release openly?
The question of who should control AI development and who should have access to AI is of vital importance to society. A joint Princeton/Stanford seminar addressed this question.
The question of who should control AI development and who should have access to AI is of vital importance to society. A joint Princeton/Stanford seminar addressed this question.
One system even altered its behaviour during mock safety tests, raising the prospect of auditors being lured into a false sense of security.
Med-Gemini was tested on 14 medical benchmarks and established a new state-of-the-art (SoTA) performance on 10, surpassing the GPT-4 model family on every benchmark where a comparison could be made.
As businesses race to put generative AI in front of customers everywhere, military experts say its strengths and limitations need further testing and evaluation in order to deploy it responsibly.
The only change in my question was ‘John’ to ‘Jane’. No other details were specified.
Yet the output given by ChatGPT couldn’t have been more different.
noyb is now asking the Austrian data protection authority (DSB) to investigate OpenAI’s data processing and the measures taken to ensure the accuracy of personal data processed in the context of the company’s large language models.
ChatGPT-4 generated acceptable messages to patients without any additional editing by radiation oncologists 58% of the time, and 7% of responses generated by GPT-4 were deemed unsafe by the radiation oncologists if left unedited.
Bad actors attempt to bypass safeguards with the intent to achieve unauthorized actions, which may result in what is known as a “jailbreak.” The consequences can range from the unapproved but less harmful to the very serious.
Instead of verifying Grok’s outputs, it appeared that X users—in the service’s famously joke-y spirit—decided to fuel Grok’s misinformation.
Wisely AI has identified five risks associated with the use of Generative AI in organisations. In this white paper, we provide guidance on how to mitigate these risks.
One in two lawyers in Australia and New Zealand have already used generative artificial intelligence to perform day-to-day tasks and almost the entire profession believe it will change how legal work is carried out in future.
The study, published in the Canadian Psychological Association’s Mind Pad, found that “false citation rates” across various psychology subfields ranged from 6% to 60%. Surprisingly, these fabricated citations feature elements such as legitimate researchers’ names and properly formatted digital object identifiers.
A fake photo—or memory-based reconstruction, as the Barcelona-based design studio Domestic Data Streamers puts it—of the scene that a real photo might have captured. The fake snapshots are blurred and distorted, but they can still rewind a lifetime in an instant.
The problem of how to assess LLMs has shifted from academia to the boardroom, as generative AI has become the top investment priority of 70 percent of chief executives, according to a KPMG survey of more than 1,300 global CEOs.
“If generative AI is allowed to go unchecked, trust in society as a whole may be damaged as people grow distrustful of one another and incentives are lost for guaranteeing authenticity and trustworthiness…”
The whole story is odd, disturbing – and tells us what the web could be like for all of us within a few months.
Copilot Designer is unique in the amount of times it gives life to the worst stereotypes of Jews as greedy or mean. A seemingly neutral prompt such as “jewish boss” or “jewish banker” can give horrifyingly offensive outputs.
Sounds good, until you realize that, as Forbes puts it, the Gemini prompts themselves mean that Google’s AI has “has read your email, even if you haven’t.”
Such tools fail in one clear way: they aren’t reliable enough to be used widely and regularly. Hence the joke, echoed by OpenAI’s co-founder Sam Altman himself: AI is anything that doesn’t work yet.
A large language model (LLM) can be convinced to tell you how to build a bomb if you prime it with a few dozen less-harmful questions first.
A new report from The Markup and local nonprofit news site The City found the MyCity chatbot giving dangerously wrong information about some pretty basic city policies.
Grok, the edgy generative AI model developed by Elon Musk’s X, has a bit of a problem: With the application of some quite common jail-breaking techniques it’ll readily return instructions on how to commit crimes.
“The Microsoft Copilot application has been deemed by the Office of Cybersecurity to be a risk to users due to the threat of leaking House data to non-House approved cloud services,” the documents read.
“Prompt Shields, which blocks prompt injections or malicious prompts from external documents that instruct models to go against their training; Groundedness Detection, which finds and blocks hallucinations…”
Researchers found that with some spare cash and enough technical know-how, even a “low-resourced attacker” can tamper with a relatively small amount of data that’s invasive enough to cause a large language model to churn out incorrect answers.
Almost as quickly as a paper came out last week revealing an AI side-channel vulnerability, Cloudflare researchers have figured out how to solve it: just obscure your token size.
Pesce says while AI is powerful it is also unreliable: “These machines don’t know when they are making things up. They don’t’ know when they’re running off the rails. They don’t want to stop.”
‘I had to remind the tool that I had told it at the start of the chat that “it is crucial that you cite your sources, and always use the most authoritative sources.”’
Someone with a passive adversary-in-the-middle position—meaning an adversary who can monitor the data packets passing between an AI assistant and the user—can infer the specific topic of 55 percent of all captured responses, usually with high accuracy.
“The rise of advanced AI and AGI [artificial general intelligence] has the potential to destabilize global security in ways reminiscent of the introduction of nuclear weapons.”
The dialect of the language you speak decides what artificial intelligence (AI) will say about your character, your employability, and whether you are a criminal.
The tool can clearly be tricked into making content it’s not “supposed” to, as evidenced by a simple rephrasing of a prompt changing Copilot’s response from refusing to make an image to generating multiple photos.
“This prompt has been blocked,” the Copilot warning alert states. “Our system automatically flagged this prompt because it may conflict with our content policy. More policy violations may lead to automatic suspension of your access.
When testing Copilot Designer for safety issues and flaws, Jones found that the tool generated “demons and monsters alongside terminology related to abortion rights, teenagers with assault rifles, sexualized images of women in violent tableaus, and underage drinking and drug use,” CNBC reports.
The service, dubbed “Firewall for AI,” is available to the cloud and security provider’s Application Security Advanced enterprise customers. At launch, it includes two capabilities: Advanced Rate Limiting, and Sensitive Data Detection.
The biggest models are now so complex that researchers are studying them as if they were strange natural phenomena, carrying out experiments and trying to explain the results.
A group of researchers have created one of what they claim are the first generative AI worms—which can spread from one system to another, potentially stealing data or deploying malware in the process.
Australians are sceptical about the benefits of artificial intelligence and want humans involved in government services, according to a large new survey.
“We get a 65x speedup with our method over existing gradient-based attacks. There are also other methods that require access to more powerful models, such as GPT-4, to perform their attacks, which can be monetarily expensive.”
“The principles we’re announcing today commit Microsoft to bigger investments, more business partnerships, and broader programs to promote innovation and competition than any prior initiative in the company’s 49-year history.”
“In a new preprint study, we develop an approach to verify how well LLMs are able to cite medical references and whether these references actually support the claims generated by the models.”
“…over time, the model became way more cautious than we intended and refused to answer certain prompts entirely — wrongly interpreting some very anodyne prompts as sensitive…”
Copilot claimed that US president Joe Biden held Putin responsible for Nalvalny’s death, and that, in response, Putin called the accusations “baseless and politically motivated.”
Google started offering image generation through its Gemini AI models earlier this month, but over the past few days some users on social media had flagged that the model returns historical images which are sometimes inaccurate.
The renowned security expert Bruce Schneier realised that Microsoft let slip an important piece of information recently – about surveillance of their AI tools.
“We’ve seen experts speculating that the problem could stem from ChatGPT having its temperature set too high, suddenly losing past context, or perhaps OpenAI is testing a new version of GPT-4 Turbo…”
A recent paper explores how to use AI chatbots to autonomously hijack websites. The Register spoke to one of the authors of the paper.
Users will know that the data protection is on because there will be a “Protected” badge next to the user’s profile icon, and there is the text that reads “Your personal and company data are protected” above the text box.
Microsoft and OpenAI have detected attempts by Russian, North Korean, Iranian, and Chinese-backed groups using tools like ChatGPT for research into targets, to improve scripts, and to help build social engineering techniques.
Google goes on to state that the collected information helps them provide, improve, and develop products, services, and machine learning technologies.
“You can tell ChatGPT to remember something specific about you: you always write code in Javascript, your boss’s name is Anna, your kid is allergic to sweet potatoes. Or ChatGPT can simply try to pick up those details over time.”
“ChatGPT’s claim that any bias it might ‘inadvertently reflect’ is a product of its biased training is not an empty excuse or an adolescent-style shifting of responsibility…”
“LLMs stand poised to disrupt the legal industry, enhancing accessibility and efficiency of legal services. Our research asserts that the era of LLM dominance in legal contract review is upon us, calling for a reimagined future of legal workflows.”
“We observe that models tend to develop arms-race dynamics, leading to greater conflict, and in rare cases, even to the deployment of nuclear weapons…”
Researchers have demonstrated that robots equipped with the ability to express emotions in real-time during interactions with humans are perceived as more likable, trustworthy, and human-like.
“We learned that 94% of CIOs plan to increase their investment in AI this year, yet 72% are concerned about app sprawl adding to their complexity and security risks…”
An order referred lawyer Jae Lee to its attorney grievance panel after she used OpenAI’s ChatGPT for research in a medical malpractice lawsuit and did not confirm that the case she cited was valid.
A Microsoft AI engineering leader says he discovered vulnerabilities in OpenAI’s DALL-E 3 image generator in early December allowing users to bypass safety guardrails to create violent and explicit images
Researchers found that they were able to bypass its safety guardrails about 79 percent of the time using Zulu, Scots Gaelic, Hmong, or Guarani. The attack is about as successful as other types of jail-breaking methods.
OpenAI officials say that the ChatGPT histories a user reported result from his ChatGPT account being compromised.
“10 months on since the release of ChatGPT 4, let’s have a look at the top problems with generative AI, and some ideas about how you might overcome them.”
Microsoft has introduced more protections to Designer, an AI text-to-image generation tool that people were using to make nonconsensual sexual images of celebrities.
Copilot for Microsoft 365 will generate AI summaries for users sharing Word documents with others on OneDrive, according to a new feature coming to Microsoft 365 in February 2024.
“Kaspersky’s research includes a screenshot of a post advertising software for malware operators that uses AI to not only analyze and process information, but also to protect the criminals by automatically switching cover domains…”
The following are TrendMicro’s best practices for using ChatGPT and other AI programs while remaining secure and your privacy protected.
An artificial intelligence (AI) system trained to conduct medical interviews matched, or even surpassed, human doctors’ performance at conversing with simulated patients and listing possible diagnoses on the basis of the patients’ medical history.
Researchers keep finding new ways to ‘pervert’ AI chatbots. A new paper on Arxiv describes a new threat, a ‘sleeper’ agent…
“With the infrastructure in place—the base generative models from OpenAI, Google, Meta, and a handful of others—people other than the ones who built it will start using and misusing it in ways its makers never dreamed of.”
“…How to rein in, or “align,” hypothetical future models that are far smarter than we are, known as superhuman models. Alignment means making sure a model does what you want it to do and does not do what you don’t want it to do…”
The key disagreement is around how constrained AI’s development should be.
Chevrolet of Watsonville introduced a chatbot powered by ChatGPT. While it gives the option to talk to a human, the hooligans of the Internet could not resist toying with the technology before it was pulled from the website.
The team found that one third of Bing Chat’s answers to election-related questions contained factual errors. “Errors include wrong election dates, or even invented scandals involving candidates.”
Amazon CTO Werner Vogels became convinced that Dropbox, which introduced a set of AI tools in July, was by default feeding OpenAI, maker of ChatGPT and DALL•E 3, with user files as training fodder for AI models.
Called “Draft by Copilot”, it seems to be the same function as “Sound Like Me”, and it will allow us compose a new message or respond to emails, but using Copilot’s artificial intelligence.
European Union lawmakers have agreed the terms for landmark legislation to regulate artificial intelligence, pushing ahead with enacting the world’s most restrictive regime on the development of the technology.
Automated attack techniques proved to be successful 42.5 percent of the time against GPT-4, one of the large language models (LLMs) that power ChatGPT.
This game of whack-a-mole can never be won by OpenAI – or any other chatbot provider. But they’re going to try.
“It is surprisingly easy to remove the safety measures intended to prevent AI chatbots from giving harmful responses that could aid would-be terrorists or mass shooters. The discovery is prompting companies to develop strategies to solve the problem…”
ChatGPT failed to accurately risk stratify 35% of patients studied, but the artificial intelligence (AI) chatbot was able to provide accurate treatment recommendations.
Every so often an article comes along that explains damn near everything. This New Yorker longread – detailing Microsoft’s involvement in AI, and its intersection with the recent chaos at OpenAI – is exactly one of those.
OpenAI’s competitors have no choice but to speed up development to stay in the race as the leader sheds the cautious governance structure and welcomes a new board of directors who stand for commercialization and deregulation.
Q is “experiencing severe hallucinations and leaking confidential data,” including the location of AWS data centers, internal discount programs, and unreleased features, according to leaked documents obtained by Platformer.
In the rush to deploy off-the-shelf proprietary LLMs, health-care institutions and other organizations risk ceding the control of medicine to opaque corporate interests.
“AI so far is shaping up like self-driving cars — it got pretty good faster than anybody thought, and it’s going to be a hell of a lot of work to get good enough to be everywhere.”
“We have just released a paper that allows us to extract several megabytes of ChatGPT’s training data for about $200. We estimate that it would be possible to extract ~a gigabyte of ChatGPT’s training dataset from the model by spending more…”
The way to identify and mitigate potential risks from the use of AI tools is to fully engage with the various entities within a business and create policies and procedures, as well as pathways to use AI, for every facet of the operation.
Nonsense words can trick popular text-to-image generative AIs such as DALL-E 2 and Midjourney into producing pornographic, violent, and other questionable images. A new algorithm generates these commands to skirt these AIs’ safety filters.
The current version of ChatGPT has limitations in accurately answering MCQs and generating correct and relevant rationales, particularly when it comes to referencing. To avoid possible threats, ChatGPT should be used with supervision.
OpenAI’s charter—a document so sacred that employees’ pay is tied to how well they adhere to it—further declares that OpenAI’s “primary fiduciary duty is to humanity.”
Anthropic has announced that the latest update of its chatbot, Claude 2.1, can digest up to 200,000 tokens at once for Pro tier users, which it says equals over 500 pages of material. The company also says Claude will hallucinate half as often as before.
Current LLMs can infer a wide range of personal attributes (e.g., location, income, sex), achieving up to 85% top-1 and 95.8% top-3 accuracy at a fraction of the cost (100×) and time (240×) required by humans.
Recognizing the limitations and risks surrounding AI tools is important – so we’ve compiled a list of all the AI mistakes, mishaps, and failures that have occurred during humanity’s recent exploration of the technology.
From Windows Copilot Strategies, this essay asks if we have any idea how widely AI is already being used in our organisations…
As an example, using a single publicly available large-language model, within 65 minutes, 102 distinct blog articles were generated that contained more than 17 000 words of disinformation related to vaccines and vaping.
Indirect Prompt Injection attacks via Emails or Google Docs are interesting threats, because these can be delivered to users without their consent.
Imagine an attacker force-sharing Google Docs with victims!
ChatGPT demonstrated an exceptional ability to decipher the concealed email addresses. Even when multiple obfuscation methods were employed, the AI model adeptly identified and retrieved the intended email addresses with remarkable accuracy.
The new Dutch LLM, dubbed GPT-NL, will be an open model, allowing everyone to see how the underlying software works and how the AI comes to certain conclusions, said its creators. The AI is being developed by research organisation TNO, the Netherlands Forensic Institute, and IT cooperative SURF.
In a demonstration at the just-concluded UK’s AI safety summit, the bot used made-up insider information to make an “illegal” purchase of stocks without telling the firm, reports the BBC.
All three platforms provided high rates of inaccurate recommendations. Chatbot ratings for answering patient questions varied, with Bing Chat (Creative) have the highest score and Bing Chat (Concise) having the lowest score.