Artificial intelligence is not trying to be correct
As deep learning models get better at representing human language, it becomes harder and harder to tell whether a text was written by a human being or by a model. And because language models reproduce text found online (often without attribution), treating their output as if it were written by a human changes the reading experience for the reader.
The last year has been incredible for natural (and programming) language processing. GitHub’s Copilot has been out of technical preview since June, and ChatGPT was released in November. Copilot is based on OpenAI Codex and acts as a source code generator (which raises several issues of its own). ChatGPT is a language model built for dialogue: a user can chat with the AI, ask questions, and have them answered. Both are trained on data scraped from the web, source code for Copilot and webpages for ChatGPT. Both models work particularly well for their respective purposes, and can thus be used to generate seemingly convincing source code or prose.
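To make the "chatting with the AI" part concrete, here is a minimal sketch of how one might query a dialogue model programmatically, using OpenAI's Python client (pre-1.0 interface). The model name, API key, and prompt are placeholders for illustration; this shows the general shape of such a request, not how ChatGPT itself is served.

```python
# Minimal sketch: asking a dialogue model a question through
# OpenAI's Python client (pre-1.0 interface). Model name, key,
# and prompt are illustrative placeholders.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, use your own key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # assumed chat model for this sketch
    messages=[
        {"role": "user", "content": "Who wrote 'The Stranger'?"},
    ],
)

# The reply reads fluently whether or not it is factually correct.
print(response.choices[0].message.content)
```

The important point is what the call optimizes for: a plausible continuation of the conversation, not a verified answer.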