Publications including The New York Times, CNN, the Chicago Tribune and Australia's ABC have blocked OpenAI's web crawler because they do not want the articles in their archives used to train the large language models that underlie AI applications.


Several major publications have blocked GPTBot, OpenAI's web crawler, so that the company can no longer access their content to train AI tools.

The publications did not give reasons for the decision, but it is generally linked to intellectual property law, under which media content may not be used without the approval of its creators. In addition, these articles are, with some exceptions, available only to subscribers.

The blocks on GPTBot took effect in August. Another web crawler that has been blocked is CCBot.
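The reports do not describe how the blocks were put in place. Sites commonly turn crawlers away through a robots.txt file, and the sketch below assumes that mechanism. The robots.txt contents, the `news.example.com` address and the `is_allowed` helper are hypothetical and are not taken from any publisher's site; only the crawler names GPTBot and CCBot come from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example news site (an assumption for
# illustration, not a copy of any real publisher's file).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical article address used only for the check below.
ARTICLE_URL = "https://news.example.com/archive/article.html"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("GPTBot", "CCBot", "SomeOtherBot"):
        status = "allowed" if is_allowed(ROBOTS_TXT, bot, ARTICLE_URL) else "blocked"
        print(f"{bot}: {status}")
```

Under these assumed rules, GPTBot and CCBot are refused every path on the site, while any other crawler falls through to the wildcard entry and is allowed. A robots.txt file is a request that well-behaved crawlers honor, not a technical barrier.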

Some publications have also changed their terms of use to state explicitly that their content may not be used to develop software or to train machine learning or artificial intelligence (AI) systems.

Sources: The Guardian, The Verge