Message boards : Number crunching : Changes to robots.txt to avoid Boinc DB overhead by bots
Dimitris Hatzopoulos · Joined: 5 Jan 06 · Posts: 336 · Credit: 80,939 · RAC: 0
I was reading over at SIMAP about excluding bots from accessing BOINC DBs (for performance reasons) -here- and, checking R@h's robots.txt, I noticed that it needs to be changed by adding the "/rosetta/" path to the URLs excluded from bot (Googlebot, Yahoo Slurp, etc.) visits.

i.e. https://boinc.bakerlab.org/rosetta/robots.txt currently reads:

User-agent: *
Disallow: /account
Disallow: /add_venue
Disallow: /am_
Disallow: /bug_report
Disallow: /edit_
Disallow: /host_
Disallow: /prefs_
Disallow: /result
Disallow: /team
Disallow: /workunit

but should be:

User-agent: *
Disallow: /rosetta/account
Disallow: /rosetta/add_venue
Disallow: /rosetta/am_
Disallow: /rosetta/bug_report
... etc.

as the default example rules apply only to project URLs that don't include a path prefix: a Disallow rule matches by prefix against the URL path, so "Disallow: /account" does not match "/rosetta/account", and bots remain free to crawl the DB-heavy pages.

Best UFO Resources | Wikipedia R@h | How-To: Join Distributed Computing projects that benefit humanity
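The prefix-matching point above can be checked with Python's standard-library robots.txt parser. This is a quick sketch, with the rule lists abridged to a few of the entries quoted in the post:

```python
# Demonstrate why the "/rosetta/" prefix matters, using Python's
# stdlib robots.txt parser (urllib.robotparser).
from urllib.robotparser import RobotFileParser

def parse_rules(rules: str) -> RobotFileParser:
    """Build a parser from an in-memory robots.txt string."""
    rp = RobotFileParser()
    rp.parse(rules.splitlines())
    return rp

# Abridged version of the rules as currently served (no path prefix).
current = parse_rules("""\
User-agent: *
Disallow: /account
Disallow: /result
Disallow: /workunit
""")

# Abridged version of the proposed rules (with the "/rosetta/" prefix).
fixed = parse_rules("""\
User-agent: *
Disallow: /rosetta/account
Disallow: /rosetta/result
Disallow: /rosetta/workunit
""")

# Disallow matches by path prefix, so the served rules do NOT block
# the DB-heavy pages living under /rosetta/, while the fixed ones do.
print(current.can_fetch("Googlebot", "/rosetta/result"))  # True: still crawlable
print(fixed.can_fetch("Googlebot", "/rosetta/result"))    # False: blocked as intended
print(fixed.can_fetch("Googlebot", "/rosetta/forum"))     # True: forums stay indexable
```

With the served rules, `can_fetch` still returns True for the result pages, which is exactly the overhead the post is trying to avoid.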
©2024 University of Washington
https://www.bakerlab.org