Search Engine Indexing

Purpose

To give administrators control over the behaviour of search engines when they crawl the site.

Details

Search engines (e.g. Google) scan the internet for content to include in their index:

  • Crawling: visiting a website at its root, analysing the index.html page and following every link it contains; searching for a sitemap and visiting every URL it lists (a minimal sitemap is shown below)
  • Indexing: by following a link from a (remote) website, the crawler ends up on a local page (which may or may not be accessible through crawling)
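For illustration, this is what a minimal sitemap looks like in the standard sitemaps.org format (the URL is a placeholder):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/some-page</loc>
      </url>
    </urlset>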

The feature instructs crawlers whether or not to include the site's content in their index.

Requests for GET /robots.txt are answered dynamically:

  • enabled: access to /my/transfers/* and /my/drive is disallowed; everything else is not explicitly disallowed (see the example responses below)
  • disabled: all access is disallowed
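The corresponding robots.txt bodies would look roughly like this (Disallow rules are prefix matches, so /my/transfers/ also covers /my/transfers/*):

    # enabled: only the listed paths are excluded
    User-agent: *
    Disallow: /my/transfers/
    Disallow: /my/drive

    # disabled: the whole site is excluded
    User-agent: *
    Disallow: /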

Configuration

  • Scope: Configured at the Adminunit level; applies to all Storagehosts of an Adminunit.
  • Privileges: Configurable by an Admin.
  • Default: disabled; search engines are instructed not to store visited pages in their index or to follow links, regardless of the page (a sketch of the handler follows this list).
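A minimal sketch of the dynamic handler, assuming a Flask application and a hypothetical indexing_enabled_for() lookup for the Adminunit-level setting; the X-Robots-Tag header used here is one common way to deliver the noindex/nofollow instruction and is an assumption, not a confirmed implementation detail:

    from flask import Flask, Response, request

    app = Flask(__name__)

    ENABLED_BODY = (
        "User-agent: *\n"
        "Disallow: /my/transfers/\n"
        "Disallow: /my/drive\n"
    )
    DISABLED_BODY = "User-agent: *\nDisallow: /\n"

    def indexing_enabled_for(host: str) -> bool:
        # Hypothetical lookup of the Adminunit-level setting for the
        # Storagehost serving this request; the feature defaults to disabled.
        return False

    @app.route("/robots.txt")
    def robots_txt():
        # Answer GET /robots.txt dynamically based on the configuration.
        body = ENABLED_BODY if indexing_enabled_for(request.host) else DISABLED_BODY
        return Response(body, mimetype="text/plain")

    @app.after_request
    def add_robots_header(resp):
        # While the feature is disabled, also mark every page as
        # "noindex, nofollow" for crawlers that arrive via remote links.
        if not indexing_enabled_for(request.host):
            resp.headers["X-Robots-Tag"] = "noindex, nofollow"
        return resp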

The feature is enabled or disabled in the admin interface:

[Image: toggle for enabling/disabling the feature in the admin interface]

Dependencies

None

Conflicts

None