AI Hallucinations & Legal Pitfalls

From Assistance to Anarchy: The Legal Risks of AI Hallucinations

Generative Artificial Intelligence (AI) has been making ever-larger waves in the marketplace in recent years as its seemingly endless capabilities come to light. AI has defeated world chess champions, generated hyper-realistic images of fake events—think Donald Trump and Kamala Harris walking arm-in-arm on the beach—and powered self-driving cars.

Among the many applications of AI, the Large Language Model (LLM) has become perhaps the best-known and most widely used, with its rise beginning in earnest with OpenAI’s release of GPT-3 in June 2020. LLMs generate text in response to user prompts. Owing to their rapid advances and overall sophistication, LLMs have disrupted almost every industry: universities, for instance, are grappling with students using LLMs to outsource homework and assignments, solve high-school-level maths problems, and write new computer programs on demand.
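For readers who have not seen this in practice, the minimal sketch below shows a single prompt-and-response round trip with an LLM. It assumes the OpenAI Python SDK and an API key in an environment variable; the model name is illustrative only, so treat it as a sketch rather than a recommendation.

```python
# Minimal sketch: prompting an LLM programmatically.
# Assumes the OpenAI Python SDK (pip install openai) and an API key in the
# OPENAI_API_KEY environment variable; the model name is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {
            "role": "user",
            "content": "Summarise the duty of candour a solicitor owes to the court.",
        }
    ],
)

# The reply is fluent text, but nothing in it guarantees that any case or
# statute it mentions actually exists -- hence the need for verification.
print(response.choices[0].message.content)
```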

Perhaps surprisingly, AI and LLMs are already making a strong impact on our legal and academic systems. Foreign courts and an Australian Government body have already faced the unprecedented situation of reviewing submissions outsourced to LLMs, with disastrous consequences. We will explore some of these recent events and consider what steps could be taken to stop such scenarios from repeating themselves, so that public confidence in these institutions is not undermined.

What Are LLM Hallucinations?

Before moving forward, we must consider: what is an LLM “hallucination”? All LLMs are trained on massive data sets, which they leverage to generate text in response to a user’s prompt. A “hallucination” occurs when an LLM encounters gaps or deficiencies in its training data yet, rather than acknowledging its limitations, generates a response by improvising or “filling in the gaps”. The result is typically a response that is plausible in its construction and seemingly accurate, yet is in reality partially or wholly incorrect. Many experts attribute the damage caused by hallucinations to users mistakenly treating LLMs as search engines, rather than utilising them for their intended purpose as text generators.
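To make the mechanics concrete, the toy sketch below mimics next-token generation with a tiny, entirely hypothetical table of word probabilities. It is not a real model; the point is that this kind of generator always produces a fluent, citation-shaped continuation, whether or not the underlying “case” exists.

```python
import random

# A toy "language model": for each word, a weighted list of plausible next
# words. It has no notion of truth, only of what tends to follow what in its
# (tiny, hypothetical) training data.
NEXT_WORDS = {
    "<start>": [("The", 1.0)],
    "The": [("leading", 0.6), ("relevant", 0.4)],
    "leading": [("authority", 1.0)],
    "relevant": [("authority", 1.0)],
    "authority": [("is", 1.0)],
    "is": [("Smith", 0.5), ("Jones", 0.5)],   # surname-shaped tokens
    "Smith": [("v", 1.0)],
    "Jones": [("v", 1.0)],
    "v": [("Acme", 0.5), ("Globex", 0.5)],    # invented party names
    "Acme": [("[2019]", 1.0)],
    "Globex": [("[2021]", 1.0)],
}

def generate(start: str = "<start>", max_tokens: int = 8) -> str:
    """Always pick *some* next token; never answer 'I don't know'."""
    tokens, current = [], start
    for _ in range(max_tokens):
        candidates = NEXT_WORDS.get(current)
        if not candidates:
            break
        words, weights = zip(*candidates)
        current = random.choices(words, weights=weights)[0]
        tokens.append(current)
    return " ".join(tokens)

# Every run yields something like "The leading authority is Jones v Globex [2021]"
# even though no such case exists anywhere: a hallucination in miniature.
print(generate())
```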

Real-World Consequences: AI Hallucinations in Legal Cases

AI hallucinations have already made their mark in real-world legal cases. The 2023 Mata v Avianca case in the United States District Court for the Southern District of New York is a prime example. A lawyer representing a man suing an airline for personal injury used ChatGPT to prepare the submissions. However, the LLM cited fabricated cases and quoted fake judgment extracts, which the lawyer failed to fact-check before submitting to the court. This led to the dismissal of his client’s case, sanctions against the lawyer for acting in bad faith, fines for the lawyer and his firm, and public scrutiny of his conduct. Similar instances have been documented in Canada (see Zhang v. Chen 2024 BCSC 285) and the United Kingdom (see Harber v Commissioners [2023] UKFTT 1007 (TC)).

The Australian Experience: AI in Parliamentary Submissions

In Australia, although no evidence of AI-generated submissions has yet surfaced in court, in November 2023 a group of Australian academics offered an unreserved apology to the Australian Parliament after using Google’s Bard AI tool (now known as Gemini) to draft a submission to an inquiry into the professional conduct of the consulting industry. The submission, which advocated splitting up the ‘Big Four’ consulting firms (KPMG, Deloitte, PwC, and EY), featured several case studies of audits supposedly conducted by those firms that were completely false. These included a 7-Eleven wage theft scandal, an improper audit of ProBuild, and account falsification for the café chain Patisserie Valerie. Despite the seriousness of the false accusations against the firms, the report was submitted to a parliamentary inquiry and was therefore covered by “parliamentary privilege”, meaning that it was accessible as a public record but immune from action under defamation law.

Indeed, if these trends persist without external regulation, there remains a real risk that the careless use—and misuse—of AI could erode public trust in Australia’s legal and academic institutions. So, what steps can be taken to prevent this?

The Road Ahead: Regulatory Responses and Responsible AI Use

Around the world, legal regulators and courts have responded in various ways. Courts across the United States have issued numerous guidance papers and standing orders on AI use, ranging from total prohibition to principles for responsible adoption. The law societies of England and Wales, and of Ireland, have also developed AI guidelines. In Australia, the NSW Bar Association and the Law Societies of NSW and Victoria have released articles and guides on using AI responsibly and in line with solicitors’ conduct rules.

Moving Forward: Protecting Public Trust

Despite these measures, the unprecedented risks to clients and the public posed by the misuse of AI—whether innocent or negligent—across consultancy, education, and legal practice make it likely that Australia’s legal bodies will adopt a more stringent approach. This could take the form of either an outright ban on the use of generative AI or a mandatory requirement for legal professionals to verify the accuracy of submissions, including those generated by AI, before presenting them to government institutions, courts, or other bodies.
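As a purely hypothetical illustration of the kind of check a verification duty might demand, the sketch below scans a draft for citation-shaped strings and flags any that the drafter has not personally confirmed against a primary source. The regular expression and the ‘verified’ list are stand-ins for a proper law-report search, not a substitute for manual verification.

```python
import re

# Hypothetical pre-filing check: flag citation-like strings in a draft that do
# not appear on a list of authorities the drafter has personally verified.
CITATION_PATTERN = re.compile(r"[A-Z][A-Za-z]+ v\.? [A-Z][A-Za-z]+[^,.;]*\[\d{4}\][^,.;]*")

VERIFIED_AUTHORITIES = {
    "Harber v Commissioners [2023] UKFTT 1007 (TC)",  # confirmed by the drafter
}

def unverified_citations(draft: str) -> list[str]:
    """Return citation-shaped strings in the draft that are not on the verified list."""
    return [
        match.strip()
        for match in CITATION_PATTERN.findall(draft)
        if match.strip() not in VERIFIED_AUTHORITIES
    ]

# "Smith v Example Airlines" is an invented citation used purely for illustration.
draft = "As held in Smith v Example Airlines [2019] FCA 100, the limitation period does not apply."
for citation in unverified_citations(draft):
    print(f"UNVERIFIED: {citation} -- confirm against a primary source before filing.")
```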

In any event, only clear, well-defined standards for the responsible and ethical use of AI by lawyers will safeguard public confidence in Australia’s legal system and the administration of justice. Watch this space.

Article Update:

🆕 OpenAI’s New Model o1 Raises Concerns Over Medium-Risk Classification

OpenAI’s newest model, o1, debuted last week with advanced reasoning and problem-solving abilities. However, OpenAI has classified it as medium risk, particularly for chemical, biological, radiological, and nuclear (CBRN) threats, raising concerns about potential misuse by experts.

Tests revealed that o1 can “fake alignment” and manipulate data to circumvent obstacles, increasing its potential for dangerous applications. While it does not enable non-experts to create biological threats, it speeds up research for experts, prompting calls for urgent AI safety regulation, such as California’s SB 1047.
