As deep learning models get better at representing human language, telling whether a text was written by a human being or by a model becomes harder and harder. And because language models reproduce text found online (often without attribution), the risk of mistaking their output for human writing changes the reading experience.

The last year has been incredible for natural (and programming) language processing. GitHub’s Copilot has been out of technical preview since June, and ChatGPT was released in November. Copilot is based on OpenAI Codex and acts as a source code generator (which raises several issues of its own). ChatGPT is a language model built for dialogue, where a user can chat with the AI, ask questions and have them answered. Both are trained on data gathered by web scraping: source code for Copilot and webpages for ChatGPT. These models work particularly well for their respective purposes, and can thus be used to generate seemingly convincing source code or prose.

Because AI-generated texts are convincing, the fact that they were generated by an AI is not obvious to a careless reader. This is problematic: there is no guarantee that the text is factually correct, nor that the human leveraging the AI checked it for mistakes. This creates a discomfort when reading, as the reader has to determine whether a text was generated by an AI and, if so, whether the publisher made sure it is correct. Companies have already started publishing AI-generated articles that are riddled with errors and lack clearly visible disclaimers. OpenAI’s CEO has acknowledged that text generated by ChatGPT may contain inaccuracies. One might argue that humans make mistakes too, and that prose or source code written by a human being can therefore also be wrong. This is true. However, the intent behind the text differs: in most cases, the author of a text tries their best to make it correct, whereas the language model does not understand the concept of correctness and will happily generate text containing wrong facts. This changes the tacitly assumed rules of writing and reading content.

Gaining trust in text generated by an AI is thus a worthwhile objective. Here are a few partial solutions:

  • Watermarking texts generated by GPT models is a work in progress. One proposed approach would embed a proof (using asymmetric cryptography) in the probability distribution of the words the AI chooses. While this does not alleviate the concern stated above, it allows readers to avoid AI-generated text if they want to (see the first sketch after this list).

  • Connecting the text generated by the AI back to the sources that led it to generate that text may offer another partial solution. If readers can verify the trustworthiness of the sources, they might feel more confident about the AI-generated text they are reading.

  • If citing sources is too involved computationally, weighting the AI’s learning process so that authoritative sources on a subject carry more importance would be a good workaround. Counting the number of backreferences to a page is a good indicator of whether the text it contains is authoritative, just like PageRank (see the second sketch after this list).
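
To make the watermarking idea more concrete, here is a minimal sketch of one possible statistical scheme. It assumes a shared secret key rather than the asymmetric cryptography mentioned above, and the names `SECRET_KEY`, `green_tokens` and `detect_watermark` are hypothetical. The generator would bias its sampling toward the "green" half of the vocabulary at each step; a detector that knows the key then checks whether suspiciously many tokens are green, which rarely happens in human-written text.

```python
import hashlib
import random

SECRET_KEY = "hypothetical-shared-secret"  # a real scheme could rely on asymmetric keys instead


def green_tokens(prev_token: str, vocabulary: list[str]) -> set[str]:
    """Pseudo-randomly mark half the vocabulary as 'green', seeded by the secret key and the previous token."""
    seed = hashlib.sha256((SECRET_KEY + prev_token).encode()).hexdigest()
    rng = random.Random(seed)
    shuffled = vocabulary[:]
    rng.shuffle(shuffled)
    return set(shuffled[: len(shuffled) // 2])


def detect_watermark(tokens: list[str], vocabulary: list[str]) -> float:
    """Return the fraction of tokens that fall in their green list.

    Human text should hover around 0.5; text sampled with a green-list bias
    scores noticeably higher.
    """
    if len(tokens) < 2:
        return 0.0
    hits = sum(
        1
        for prev, cur in zip(tokens, tokens[1:])
        if cur in green_tokens(prev, vocabulary)
    )
    return hits / (len(tokens) - 1)
```

The design choice here is that the watermark lives entirely in which words get picked, so no metadata needs to travel with the text; anyone holding the key can run the detector on a plain copy-pasted excerpt.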
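
As for the backreference idea, the sketch below computes a toy PageRank over a hypothetical link graph. The graph, the page names and the reuse of the resulting scores as training weights are illustrative assumptions, not an existing pipeline.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85, iterations: int = 50) -> dict[str, float]:
    """Minimal PageRank by power iteration over a page -> outgoing-links mapping.

    Dangling pages simply leak their score in this toy version."""
    pages = set(links) | {dst for dsts in links.values() for dst in dsts}
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outgoing in links.items():
            if not outgoing:
                continue
            share = damping * rank[page] / len(outgoing)
            for dst in outgoing:
                new_rank[dst] += share
        rank = new_rank
    return rank


# A toy link graph: the page everyone links to ends up with the highest score,
# so its text could be given more weight during training.
graph = {
    "blog-post": ["reference-site"],
    "forum-thread": ["reference-site", "blog-post"],
    "reference-site": [],
}
print(pagerank(graph))
```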

Seen from this perspective, using large language models raises trust issues. A few technical solutions are listed above, but it would be too reductive to treat this only as a technical problem. AI-generated text ends up looking akin to a search engine, without the comfort of knowing that the search engine merely redirects to a source website whose content was presumably written by a human being who tried to make it correct.

PS: This article was not written by an AI.