7

How to Prevent a Website Page From Showing Up in Search Results

 3 years ago
source link: https://www.dannyguo.com/blog/how-to-prevent-a-website-page-from-showing-up-in-search-results/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

How to Prevent a Website Page From Showing Up in Search Results

May 11, 2021  ·  373 words  ·  ~2 minutes to read

To prevent a website page from showing up in search results, either set a robots meta tag or send a X-Robots-Tag HTTP header.

So you can add this tag to the page:

<meta name="robots" content="noindex" />

Or send this header for the page:

X-Robots-Tag: noindex

One benefit of the header approach is that you can use it for non-HTML content, like a PDF or JSON file.

The noindex value tells crawlers, such as Google and Bing, not to index the page, so it won’t show up in search results.

Don’t Use robots.txt

You might think to use the robots exclusion standard (i.e. robots.txt) to disallow crawling, but that doesn’t work because then the crawlers can’t see your directive to not index the page. You’ve instructed them not to look at the page at all! So if other websites link to your page, a crawler can still pick up and index the page.

The robots.txt file is for controlling crawling, not indexing.

Directives

There are many possible directive values, and you can specify more than one by separating them with commas:

  • all: no restrictions (the default behavior)
  • noindex: exclude the page from search results
  • nofollow: don’t follow the links in the page
  • none: the same as noindex, nofollow
  • noarchive or nocache: don’t link to a cached version of the page
  • nosnippet: don’t show a description, snippet, thumbnail, or video preview of the page in search results
  • max-snippet:[length]: limit a snippet to [length] number of characters
  • max-image-preview:[setting]: set an image preview’s maximum size, where [setting] can be none, standard, or large
  • max-video-preview:[length]: limit a video preview to [length] number of seconds
  • notranslate: don’t link to a translation of the page
  • noimageindex: don’t index images on the page
  • unavailable_after:[datetime]: exclude the page from search results after [datetime], which should be in a standard format, such as ISO 8601

However, not all crawlers support all values. For example, check out this documentation for Google, this documentation for Bing, and this documentation for Yandex.

Specifying Crawlers

If you want to use different directives based on the specific crawler, you can specify the user agent in the meta tag’s name:

<meta name="googlebot" content="noindex" />
<meta name="bingbot" content="nofollow" />

Or in the header value:

X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: nofollow


Follow me on Twitter or subscribe to my free newsletter or RSS feed for future posts.

Found an error or typo? Feel free to open a pull request on GitHub.



Twitter GitHub Email

You can encrypt messsages with my GPG key if you'd like.

© 2021 Danny Guo

This content is licensed under CC BY-NC-SA 4.0 and hosted on GitHub.

As an Amazon Associate I earn from qualifying purchases.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK