Commit 9ed4499e authored by Konstantin Pavlov

Provide robots-https.txt and sitemap-https.xml.

If the request comes in over HTTPS, nginx internally rewrites /robots.txt
to /robots-https.txt and serves contents whose Sitemap directive points
to https links.  This should help search engines index the https website
instead of the http one.
parent d7b43132
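The nginx configuration that performs this rewrite is not part of the commit; the block below is only a minimal sketch of one way to do it, with an assumed server_name, document root, and listen directives (TLS certificate settings omitted).

server {
    listen 80;
    listen 443 ssl;
    server_name www.videolan.org;   # assumed
    root /var/www/videolan;         # assumed document root

    location = /robots.txt {
        # On HTTPS requests, serve robots-https.txt internally so the
        # Sitemap directive points at the https:// sitemap.
        if ($scheme = https) {
            rewrite ^ /robots-https.txt last;
        }
        # Plain HTTP requests continue to get the regular robots.txt.
    }
}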
# $Id$
Sitemap: https://www.videolan.org/sitemap-https.xml
User-agent: *
Disallow: /pub
Disallow: /removed
Disallow: /doc/logs
Disallow: /mirror.php
Disallow: /mirror-geo.php
Disallow: /mirror-geo-redirect.php
Disallow: /vlc/download-skins2-go.php
Disallow: /private
Disallow: /~videolan/
Disallow: /developers/vlc/po
Disallow: /developers/vlc-branch/po
# Do not crawl CVS and .svn directories
User-agent: *
Disallow: CVS
Disallow: .svn
# "This robot collects content from the Internet for the sole purpose of
# helping educational institutions prevent plagiarism. [...] we compare
# student papers against the content we find on the Internet to see if we
# can find similarities." (http://www.turnitin.com/robot/crawlerinfo.html)
# --> fuck off.
User-Agent: TurnitinBot
Disallow: /
# "NameProtect engages in crawling activity in search of a wide range of
# brand and other intellectual property violations that may be of interest
# to our clients." (http://www.nameprotect.com/botinfo.html)
# --> fuck off.
User-Agent: NPBot
Disallow: /
# "iThenticate® is a new service we have developed to combat the piracy
# of intellectual property and ensure the originality of written work for
# publishers, non-profit agencies, corporations, and newspapers."
# (http://www.slysearch.com/)
# --> fuck off.
User-Agent: SlySearch
Disallow: /
(diff of sitemap-https.xml collapsed)