GPTBot: OpenAI releases new web crawler

You can now prevent OpenAI's ChatGPT from accessing your website, or parts of it, using robots.txt.

Barry Schwartz on August 7, 2023 at 6:43 am | Reading time: 2 minutes

OpenAI has published information about GPTBot, its new web crawler.

What is GPTBot. GPTBot is OpenAI’s web crawler. OpenAI uses it to crawl the web, consume knowledge for its AI features (e.g., ChatGPT) and provide AI-generated answers to questions (or prompts).

User agent. GPTBot’s user-agent token is “GPTBot”. Its full user-agent string is: “Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)”.
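If you want to spot GPTBot in your own server logs, a minimal Python sketch might look like the following. The function name `is_gptbot` and the constant names are illustrative assumptions, not anything OpenAI publishes; the only sourced value is the user-agent string quoted above. Keep in mind that any client can send this header, so a match is a hint rather than proof.

```python
import re

# Full user-agent string as quoted in the article.
GPTBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "GPTBot/1.0; +https://openai.com/gptbot)"
)

# Match the "GPTBot" token as a whole word, ignoring case.
_GPTBOT_TOKEN = re.compile(r"\bGPTBot\b", re.IGNORECASE)


def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token.

    Hypothetical helper for illustration only: the header is self-reported
    and can be spoofed.
    """
    return bool(_GPTBOT_TOKEN.search(user_agent))


if __name__ == "__main__":
    print(is_gptbot(GPTBOT_UA))
    print(is_gptbot("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```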

Robots.txt. You can use robots.txt to block GPTBot from accessing your website, or parts of it. To block GPTBot from your entire site, add the following to your site’s robots.txt:

User-agent: GPTBot
Disallow: /

To allow GPTBot to access only parts of your site, you can add the GPTBot token to your site’s robots.txt like this:

User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
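Before deploying either set of rules, you may want to check how a parser reads them. The sketch below uses Python's standard-library `urllib.robotparser`; the helper name `is_allowed` and the `example.com` URLs are assumptions for illustration. It shows how this one parser interprets the rules, which is not necessarily identical to how GPTBot itself evaluates them.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets from the article.
BLOCK_ALL = """\
User-agent: GPTBot
Disallow: /
"""

PARTIAL = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`.

    Hypothetical helper: parses the rules from a string instead of
    fetching a live robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    for path in ("/", "/directory-1/page", "/directory-2/page"):
        url = "https://example.com" + path
        print(path, "block-all:", is_allowed(BLOCK_ALL, "GPTBot", url),
              "partial:", is_allowed(PARTIAL, "GPTBot", url))
```

Under this parser, a path that matches neither rule in the second example (for instance `/directory-3/`) is still allowed, because robots.txt permits anything that is not explicitly disallowed. If you want GPTBot limited strictly to one directory, you would likely need an additional `Disallow` rule covering the rest of the site.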

GPTBot documentation. You can read the documentation on GPTBot.

GPTBot IP ranges. OpenAI also published the IP ranges that GPTBot uses. The list currently contains only one range, but I suspect OpenAI will add more over time.
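Because a user-agent header can be faked, the published IP ranges give you a second way to check a request. Here is a rough Python sketch using the standard-library `ipaddress` module. The range in `PUBLISHED_RANGES` is a placeholder from the block reserved for documentation (192.0.2.0/24), not OpenAI's real range, and the function name is an assumption; substitute the values from OpenAI's published list.

```python
import ipaddress
from typing import Iterable

# PLACEHOLDER, NOT OpenAI's actual range: 192.0.2.0/24 is reserved for
# documentation examples. Replace with the ranges OpenAI publishes.
PUBLISHED_RANGES = ["192.0.2.0/24"]


def ip_in_ranges(ip: str, cidr_ranges: Iterable[str]) -> bool:
    """Return True if `ip` falls inside any of the CIDR ranges given.

    Hypothetical helper for illustration; raises ValueError if `ip` or
    a range is not a valid address or network.
    """
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


if __name__ == "__main__":
    print(ip_in_ranges("192.0.2.10", PUBLISHED_RANGES))
    print(ip_in_ranges("198.51.100.1", PUBLISHED_RANGES))
```

Taking the ranges as a list means the check keeps working unchanged if OpenAI adds more ranges later.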

Why we care. You can disallow GPTBot from crawling your site if you don’t want OpenAI using your content in any way. This is the same protocol you would use to block Googlebot, Bingbot or other web crawlers. These companies are also looking for an alternative to robots.txt for these purposes.

Dig deeper. Should you block ChatGPT’s web browser plugin from accessing your website?
