Ghostbuster: A Cutting-Edge Tool to Detect AI-Generated Text

Ghostbuster: Detecting Text Ghostwritten by Large Language Models

Large language models such as ChatGPT write convincingly well, which has led some students to use them to ghostwrite assignments and some schools to ban them. A second problem is that text generated by these models often contains factual errors. Readers therefore benefit from knowing whether an article or other content was ghostwritten with a generative AI tool before placing trust in it.

Existing detectors for AI-generated text often perform poorly on data that differs substantially from the data they were trained on. When such a detector misclassifies genuine human writing as AI-generated, it puts students whose work is honest at risk.

Ghostbuster is a state-of-the-art method for detecting AI-generated text. It works by computing the probability of each token in a document under several weaker language models. This approach does not require knowing which model generated a document, which makes Ghostbuster useful for detecting text that may have come from a black-box model. Ghostbuster has been evaluated across a range of ways text could be generated: several domains (using new datasets of essays, news, and stories), several language models, and several prompts.
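To make the idea of per-token probabilities concrete, here is a minimal sketch that uses an add-one-smoothed unigram model as a stand-in for a "weaker language model". The function names, the toy corpus, and the whitespace tokenizer are all assumptions made for illustration; they are not taken from Ghostbuster itself.

```python
from collections import Counter

def train_unigram(corpus_tokens):
    """Fit an add-one-smoothed unigram model (an illustrative weak model).

    Returns a function mapping a token to its probability.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves probability mass for unseen tokens

    def prob(token):
        return (counts[token] + 1) / (total + vocab_size)

    return prob

def token_probabilities(document, model):
    """Return one probability per whitespace-separated, lowercased token."""
    return [model(tok) for tok in document.lower().split()]

# Toy corpus (hypothetical); a real weak model would be trained on far more text.
corpus = "the cat sat on the mat the dog sat on the rug".split()
unigram = train_unigram(corpus)

# One probability per token; the unseen token "sofa" gets the smallest value.
probs = token_probabilities("The cat sat on the sofa", unigram)
```

Running the same document through several such models would give several probability vectors per document, which is the kind of raw material the later stages work with.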

Why Ghostbuster?

Many current systems for detecting AI-generated text are brittle across different types of text. Simple models that use perplexity alone cannot capture more complex features and do especially poorly on new writing domains. Classifiers built on large language models like RoBERTa capture complex features easily but tend to overfit the training data and generalize poorly. Zero-shot methods, which classify text without training on labeled data by calculating the probability that the text was generated by a specific model, tend to do poorly when the text was actually generated by a different model.
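For reference, a perplexity-only detector of the kind described above can be sketched in a few lines. The threshold value and the decision rule (lower perplexity is treated as more likely AI-generated) are assumptions for illustration, and the probability lists are made-up numbers rather than real model output.

```python
import math

def perplexity(token_probs):
    """Perplexity: exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

def perplexity_detector(token_probs, threshold):
    """Flag text as AI-generated when its perplexity is below the threshold.

    The rule and the threshold are hypothetical; a single cutoff like this
    is exactly the kind of feature that tends not to transfer across domains.
    """
    return perplexity(token_probs) < threshold

# Made-up per-token probabilities for two documents.
predictable = [0.5, 0.5, 0.5, 0.5]
surprising = [0.5, 0.01, 0.2, 0.05]

flag_predictable = perplexity_detector(predictable, threshold=5.0)
flag_surprising = perplexity_detector(surprising, threshold=5.0)
```

Because the whole decision rests on one number and one cutoff, a threshold tuned on one kind of writing can be badly wrong on another.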

How Does Ghostbuster Work?

Ghostbuster is trained in three stages: computing probabilities, selecting features, and training a classifier. First, each document is converted into a series of vectors by computing the probability of each word in the document under a series of weaker language models. Second, features are selected with a structured search procedure: a set of vector and scalar operations is defined for combining the probabilities, and forward feature selection searches for useful combinations of these operations by repeatedly adding the best remaining feature. Finally, a linear classifier is trained on the best probability-based features together with some additional manually selected features.
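The three stages can be sketched end to end on toy data. This is a simplified illustration under stated assumptions, not Ghostbuster's implementation: the operation sets, the feature names, the two probability vectors per document (`p_a`, `p_b`), the tiny logistic-regression trainer, and the data are all hypothetical, and features are scored on training accuracy only for brevity.

```python
import math
from statistics import mean

# Hypothetical operation sets: scalar reductions and vector combinations.
SCALAR_OPS = {"mean": mean, "min": min, "max": max}
VECTOR_OPS = {
    "sub": lambda a, b: [x - y for x, y in zip(a, b)],
    "div": lambda a, b: [x / y for x, y in zip(a, b)],  # assumes nonzero b
}

def candidate_features(p_a, p_b):
    """Combine two per-token probability vectors into named scalar features."""
    feats = {}
    for sname, sop in SCALAR_OPS.items():
        feats[f"{sname}(a)"] = sop(p_a)
        feats[f"{sname}(b)"] = sop(p_b)
        for vname, vop in VECTOR_OPS.items():
            feats[f"{sname}({vname}(a,b))"] = sop(vop(p_a, p_b))
    return feats

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1 + ez)

def train_logistic(X, y, epochs=500, lr=0.5):
    """Fit a linear classifier (logistic regression) with plain SGD."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def accuracy(model, X, y):
    w, b = model
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def forward_select(rows, labels, max_features=2):
    """Greedy forward selection: repeatedly add the best remaining feature.

    Scores use training accuracy here; a real setup would use held-out data.
    """
    remaining, chosen, best_score = sorted(rows[0]), [], 0.0
    while remaining and len(chosen) < max_features:
        scored = []
        for name in remaining:
            X = [[r[c] for c in chosen + [name]] for r in rows]
            scored.append((accuracy(train_logistic(X, labels), X, labels), name))
        score, name = max(scored)
        if score <= best_score:  # stop when no feature improves the score
            break
        chosen.append(name)
        remaining.remove(name)
        best_score = score
    return chosen, best_score

# Made-up documents: (probs under model a, probs under model b, label 1 = AI).
docs = [
    ([0.30, 0.40, 0.35], [0.60, 0.70, 0.65], 1),
    ([0.25, 0.45, 0.30], [0.55, 0.75, 0.60], 1),
    ([0.10, 0.05, 0.20], [0.15, 0.10, 0.20], 0),
    ([0.15, 0.10, 0.05], [0.20, 0.10, 0.10], 0),
]
rows = [candidate_features(a, b) for a, b, _ in docs]
labels = [label for _, _, label in docs]
chosen, score = forward_select(rows, labels)
```

The greedy loop is the part that corresponds to "repeatedly adding the best remaining feature": each round retrains the linear classifier once per candidate and keeps only the single feature that helps most.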

Ghostbuster improves on existing models for detecting AI-generated text. It generalizes well across domains and can identify text from black-box or unknown models, because it does not require access to probabilities from the specific model used to generate a document.

Future work includes explaining model decisions, improving robustness to attacks that try to fool detectors, combining AI-generated text detection with alternatives such as watermarking, and supporting applications such as filtering language model training data or flagging AI-generated content on the web.

Disclaimer: This article was written with the assistance of AI. The original article can be found here.