Checking up on ChatGPT (and other LLMs)
Marie Donlon | August 30, 2024

With the advent of large language models (LLMs), the artificial intelligence (AI)-driven natural language processing tools most notably represented by OpenAI's ChatGPT, concerns have arisen that humans will no longer be able to tell truth from fiction.
While some might suggest that this problem affects only educators trying to distinguish student-written papers from AI-generated ones, every variety of generative AI risks being used to disseminate false or inaccurate information or to fabricate images and videos. And because generative AI is increasingly used by writers, marketers, publishers, editors, journalists, job recruiters and social media professionals, among others, AI-generated content is reaching industries where accuracy is essential.
So what efforts are underway to distinguish real content from fake, and what can we look for to tell AI-generated content from human-generated content? Follow along as GlobalSpec attempts to answer those questions.
The tools
As more and more text is AI-generated, tools to detect such text are emerging just as rapidly. These tools, known as AI content detection tools, are trained with a combination of machine learning and natural language processing to identify patterns in text, analyzing its style, tone and grammar, among other details.
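To make that concrete, here is a minimal sketch of the kind of supervised text classifier such detectors are built on. Everything in it, from the tiny hand-labeled training snippets to the choice of TF-IDF features and logistic regression, is an illustrative assumption, not how any particular commercial detector works; real tools train on vast corpora and far richer signals.

```python
# Minimal sketch of a text classifier of the kind AI content detectors
# are built on: it learns surface patterns (word choice, phrasing) from
# labeled examples. The training data here is purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples: 1 = AI-generated, 0 = human-written.
texts = [
    "Furthermore, it is important to note that the results are significant.",
    "Consequently, the data demonstrates a clear and consistent trend.",
    "honestly the coffee machine broke again so we improvised",
    "My advisor scribbled three question marks next to my conclusion.",
]
labels = [1, 1, 0, 0]

# TF-IDF turns each text into a weighted word-frequency vector;
# logistic regression then learns which patterns separate the classes.
detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

sample = "Furthermore, the trend is significant and consistent."
print(detector.predict_proba([sample]))  # [P(human), P(AI)] for the sample
```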
One example of these tools comes from researchers at the University of Kansas, who have developed technology capable of identifying AI-generated academic science writing with over 99% accuracy.
The University of Kansas researchers determined that certain signs signal when academic writing has been produced by a chatbot, and they used those signs to build a detection tool that flags chatbot-written papers.
Originality.ai is another tool for detecting AI-generated text. It uses natural language processing to identify content generated with ChatGPT, GPT-4 and Gemini, the Google generative AI chatbot formerly known as Bard.
Yet despite their accuracy at distinguishing some AI-generated text from human-generated text, AI content detection tools are still in their infancy. According to experts, these tools have failed to keep pace with the speed at which generative AI is developing.
Manual checks
Overwhelmingly, experts report that such AI detection tools tend to produce both false positives, wherein the detector incorrectly flags human-written content as AI-generated, and false negatives, wherein it fails to identify AI-generated content at all.
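To see what those two failure modes look like in practice, here is a toy evaluation with made-up labels; the numbers are purely illustrative, not measured error rates for any real detector.

```python
# Toy evaluation illustrating the two failure modes with made-up data:
# 1 = AI-generated, 0 = human-written.
truth     = [0, 0, 0, 0, 1, 1, 1, 1]  # ground truth
predicted = [0, 1, 0, 0, 1, 0, 0, 1]  # a hypothetical detector's verdicts

false_pos = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
false_neg = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
humans = truth.count(0)
ai = truth.count(1)

print(f"False positive rate: {false_pos / humans:.0%}")  # human text flagged as AI: 25%
print(f"False negative rate: {false_neg / ai:.0%}")      # AI text that slipped through: 50%
```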
As such, experts suggest that manually checking for signs of AI-generated text might better enable users to tell it apart from human-generated text. Look for the following clues (a rough code sketch after this list shows how a few of them might be automated):
- Repetition or overuse of specific words, phrases, ideas or idioms. AI doesn't notice repetition the way humans do, so repeated wording can slip past both the AI system producing the content and the AI tool checking it.
- Sentence length. AI-generated sentences tend to be shorter than human-generated sentences.
- Nonsensical, incoherent or implausible claims. AI-generated sentences oftentimes contain too many words, making them incomprehensible. Another clue is a false claim presented confidently: in one example, AI-generated text asserted that it takes 60 minutes to make one cup of coffee.
- Incorrect grammar and punctuation. Perhaps the most obvious tell in AI-generated text is the incorrect use of simple grammar and punctuation.
- Use of AI-associated words. Experts point to a list of common words that suggest AI wrote a particular piece. AI leans heavily on transition words, overusing terms like "furthermore," "consequently," "in addition to" and "resulting in," to name just a few.
- Passive voice. AI-generated text tends to favor passive constructions, producing an impersonal, distant narrator absent any storytelling quality.
- Lack of details. AI-generated text tends to offer very general descriptions of ideas and an absence of original thought.
- Outdated content. Because AI models are trained on sources that may be outdated and fail to reflect the most recent information and data available, chatbots like ChatGPT risk regurgitating content that has expired.
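The sketch below automates a few of the clues above: word overuse, sentence length, stock transition words and passive constructions. The word list and thresholds are illustrative guesses, not validated values from any published detector, and a real tool would weigh far more signals.

```python
# Rough heuristic checker for a few of the manual clues listed above.
# Thresholds and word lists are illustrative assumptions only.
import re
from collections import Counter

AI_TRANSITIONS = {"furthermore", "consequently", "moreover", "additionally"}

def ai_clues(text: str) -> dict:
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Clue: overused words (any word carrying more than 5% of the text).
    counts = Counter(w for w in words if len(w) > 3)
    overused = [w for w, n in counts.items() if n / max(len(words), 1) > 0.05]

    # Clue: short sentences (average words per sentence).
    avg_len = len(words) / max(len(sentences), 1)

    # Clue: heavy reliance on stock transition words.
    transitions = sum(1 for w in words if w in AI_TRANSITIONS)

    # Clue: passive constructions ("was delayed", "are generated", ...).
    passives = len(re.findall(r"\b(?:is|are|was|were|been|being|be)\s+\w+ed\b", text))

    return {
        "overused_words": overused,
        "avg_sentence_length": round(avg_len, 1),
        "transition_words": transitions,
        "passive_constructions": passives,
    }

sample = (
    "Furthermore, the process is completed quickly. Consequently, results "
    "are generated instantly. Furthermore, quality is ensured throughout."
)
print(ai_clues(sample))
```

Running it on the sample flags "furthermore" as overused, a short average sentence length, three transition words and three passive constructions, exactly the profile the manual checks describe.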
Until AI detection tools can keep pace with AI-generated text, perform some of these manual checks to ensure the accuracy and quality of the content in question.