
By default, robots or spiders such as Googlebot will try to crawl sites that are protected by an SP, which results in the robot attempting to walk through the steps of the IdP authentication process. Modern spidering algorithms used by most robots introduce long delays between each URL fetched, so the LoginContext expires from the StorageService long before the robot returns for the next step in the process. This was less of an issue before 2.2.0, but with more redirects now used instead of internal forwards, it generates many errors over the course of a day. One step to reduce the number of errors generated by robots is to add a robots.txt file to the root of the site.

/robots.txt:
User-agent: *
Disallow: /idp/
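
As a quick sanity check, a short script using Python's standard urllib.robotparser can confirm that compliant crawlers such as Googlebot will treat everything under /idp/ as off limits. This is only a sketch; the host idp.example.org and the example endpoint URL are placeholders for whatever site root actually serves the robots.txt in front of the SP and IdP.

from urllib.robotparser import RobotFileParser

# Placeholder host; substitute the site root that actually serves /robots.txt.
robots_url = "https://idp.example.org/robots.txt"

parser = RobotFileParser(robots_url)
parser.read()  # fetch and parse the live robots.txt

# With "User-agent: *" and "Disallow: /idp/", any URL under /idp/ is excluded,
# so a well-behaved crawler should skip the IdP endpoints entirely.
print(parser.can_fetch("Googlebot",
                       "https://idp.example.org/idp/profile/SAML2/Redirect/SSO"))
# Prints: False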