
Credit: Novikov Alexey/Shutterstock
Prosecutors in Florida have launched a criminal investigation into the artificial-intelligence company OpenAI, and whether the company’s chatbot ChatGPT was used to assist the suspect in a mass school shooting at Florida State University in April last year. No charges have been filed against the company, nor has it been accused of a crime, but the investigation has put a spotlight on one of the biggest challenges facing AI companies: why is it so hard to develop chatbots that adhere to human laws, ethics and values?
Florida law states that anyone who aids someone in committing a crime can also be held responsible for that crime. In a statement to the media, the state’s Attorney-general James Uthmeier said that if the chatbot were a person, then they would be facing charges for murder.
Concerns about large language model (LLM) chatbots giving dangerous or illegal advice have been growing for the past few years, following examples of them encouraging people to take their own life, create illegal sexual material and commit financial fraud.
Regardless of whether the Florida investigation leads to legal consequences for OpenAI, based in San Francisco, California, it will increase pressure on companies to prove that their safety measures are effective, says Usman Naseem, an LLM alignment researcher at Macquarie University in Sydney, Australia. At the same time, research into the process of encoding human values into AI models to make them helpful and safe — called alignment — is attempting to find other solutions.
OpenAI did not respond to Nature’s request for comment on the investigation, but a spokesperson for the AI company told the BBC that it has co-operated with authorities and that “ChatGPT is not responsible for this terrible crime.”
What are the current safety measures?
At present, safety standards for AI chatbots are set by companies and there is limited external oversight, says Naseem. Many companies have acknowledged there is a problem and say that they have introduced safety measures to prevent chatbots from giving advice that might lead to dangerous behaviour. But some researchers are calling for independent safety testing.
One safety measure that’s been introduced is content filters that mean the AI tool refuses to respond to requests that include certain words. However, there are many ways that users can get around these, says Toby Walsh, an AI researcher at the University of New South Wales, Sydney. Prompts with harmful intent can be reframed to work around safeguards, for instance, by writing requests in hypothetical or fictional contexts. This can make it difficult for the AI tool to screen problematic requests from benign ones, adds Walsh.
But many of the safety measures, including content filters, behavioural training and policy rules, are external controls layered on top of the system rather than the system having a genuine understanding of ethics or intent, says Naseem. “Those safeguards help, but they are not perfect, and determined users can still find ways around them,” he adds.
Why don’t LLMs comply with human laws?
Part of the issue is the way the most popular LLMs learn — by example rather than following a set of rules. LLMs are trained on the vast repository of text available on the Internet. When a user asks a question, or ‘prompt’ the LLM predicts the most likely sequence of words.
This design means they are able to respond to a large array of prompts. They are “a jack of all trades”, says Walsh, but that makes it challenging to put guardrails around what they should not say.
An LLM’s answers are pattern completion, says Naseem. “They do not truly understand meaning or consequences,” he says.
Walsh says that researchers have tried to teach AI systems to follow rules in the past. An older type of AI known as symbolic AI that was popular in the 1950s and 1960s involved teaching computers to follow rules. But these systems did not work for large-scale, real-world problems because developers could not write enough rules to cover all situations, says Simon Lucey, an AI researcher at Adelaide University, Australia.
Walsh says that one way to tune existing LLMs to be safer is reinforcement learning from human feedback. In this example, humans test an LLM’s output and help it to write preferred responses to prompts. But this type or learning is resource intensive and expensive, he adds.
Another way to ensure safety would be to remove harmful information from the initial data sets used to train AI models, but research has shown that that isn’t always successful. And to manually comb through huge data sets would also be very expensive for technology companies, Walsh says.

