If you are responsible for a website, the following information may be of interest to you. We hope you find it helpful.
What is SurveyBot?
A. SurveyBot monitors Internet Statistics.
Each week SurveyBot will query websites for statistics and other useful information. This information
goes into the creation of the Domain Tools domain search engine.
B. The Future of SurveyBot.
SurveyBot will continue to improve, and with that will come more information
on each domain. We hope that webmasters and domain buyers will find the diagnostic information that
SurveyBot collects valuable when they are looking for domains to buy.
Robots.txt
A. Is SurveyBot compliant?
SurveyBot obeys all known robots.txt guidelines. If SurveyBot is denied access to a site, it will not include
web content from that website. However, it will still record miscellaneous information, such as which
web server the site is running and the status of the website (active/parked/deleted).
B. How do I identify SurveyBot?
SurveyBot uses the User-Agent field to identify itself to websites. The user agent will
look like:
SurveyBot/2.3 (Domain Tools)
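For webmasters who want to spot SurveyBot in their logs, the check can be sketched in Python as below. This is a minimal illustration, not part of SurveyBot itself: the function name surveybot_version is hypothetical, and the pattern assumes the user agent keeps the "SurveyBot/<version>" form shown above even if the version number changes. Keep in mind that any client can send any User-Agent string, so a match is only a hint.

```python
import re

# Assumption: the user agent keeps the "SurveyBot/<version>" shape shown above.
SURVEYBOT_PATTERN = re.compile(r"SurveyBot/(\d+(?:\.\d+)*)")


def surveybot_version(user_agent: str):
    """Return the SurveyBot version found in a User-Agent string, or None."""
    match = SURVEYBOT_PATTERN.search(user_agent)
    return match.group(1) if match else None


print(surveybot_version("SurveyBot/2.3 (Domain Tools)"))  # prints 2.3
print(surveybot_version("Mozilla/5.0 (X11; Linux x86_64)"))  # prints None
```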
C. Is SurveyBot a Spider or a Probe?
Technically, SurveyBot is a probe. SurveyBot does not spider websites; it simply probes
to see whether the website is active and what shape the site is in. The content that SurveyBot
records is the website images, title tag, and meta tags from the default document. SurveyBot
observes the robots.txt directives, which help webmasters control where robots go
and what information they gather.
D. Opting-out of Crawling
SurveyBot only gathers one page per domain: the default document located at "/".
Before visiting your website, SurveyBot first consults the "/robots.txt" file to see whether it has
permission to crawl the entire site. SurveyBot looks for either of the following groups of tags; finding
one stops the crawl and also removes your content from our website. The first group bans all robots; the
second group excludes only SurveyBot.
User-agent: *
Disallow: /
User-agent: SurveyBot
Disallow: /
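To check that a robots.txt file really blocks SurveyBot before publishing it, you could test it locally. The sketch below uses Python's standard urllib.robotparser module; it is not SurveyBot's own parser, so treat the result as an approximation of how SurveyBot will read the file. The function name allows_surveybot is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allows_surveybot(robots_txt: str, path: str = "/") -> bool:
    """Return True if the given robots.txt text lets 'SurveyBot' fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("SurveyBot", path)


ban_all_robots = "User-agent: *\nDisallow: /\n"
ban_surveybot_only = "User-agent: SurveyBot\nDisallow: /\n"

print(allows_surveybot(ban_all_robots))      # prints False
print(allows_surveybot(ban_surveybot_only))  # prints False
print(allows_surveybot(""))                  # prints True (no rules at all)
```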
E. Special Opting-out of all domains on one IP Address
As with the instructions in section D, you place a special tag in robots.txt, but in two places: in the robots.txt of the website that resolves
for the IP address itself, and in the robots.txt file of one of the domains on that IP address. For example, suppose you own example.com and example.org and both resolve to the
same IP address, 192.168.0.97. You could exclude all domains on that IP address by placing
a robots.txt at http://192.168.0.97/robots.txt, but you would also need to let the robot know by placing the special tag in the robots.txt file of one
of your domains hosted on that IP address, because we will not check the IP address's website unless told to by a domain on that IP.
This prevents users on a shared system from removing other websites on the same IP address. Only the administrator
can make the choice to globally block all domains on a host.
User-agent: SurveyBot_IgnoreIP
Disallow: /
Please note: wildcard entries are ignored for this special case of blocking every domain on an IP address. We MUST find the exact, explicit string above.
We will not directly look at http://[your.ip.address]/robots.txt until we find the special
tag above in a robots.txt on one of your domains. Only then will SurveyBot double-check the robots.txt of the vhost for the
IP address. If the tag exists there as above, then and only then will the robot skip every domain on that
IP address.
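The two-step check described above can be sketched in Python as follows. This is only an illustration of the rules as stated here, not SurveyBot's actual code: the function names has_ignore_ip_tag and ip_is_opted_out are hypothetical, and treating the tag as case-sensitive is an assumption based on the "exact explicit string" requirement.

```python
from typing import Callable

IGNORE_IP_AGENT = "SurveyBot_IgnoreIP"  # the exact string from the tag above


def has_ignore_ip_tag(robots_txt: str) -> bool:
    """True if robots.txt has 'Disallow: /' under the exact IgnoreIP agent."""
    agents = []       # user-agents named by the group currently being read
    in_rules = False  # True once the current group has an allow/disallow line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            # Exact match only: a wildcard "*" group never counts here.
            if field == "disallow" and value == "/" and IGNORE_IP_AGENT in agents:
                return True
    return False


def ip_is_opted_out(domain_robots_txt: str,
                    fetch_ip_robots_txt: Callable[[], str]) -> bool:
    """Consult the IP address's robots.txt only if a domain carries the tag."""
    if not has_ignore_ip_tag(domain_robots_txt):
        return False
    return has_ignore_ip_tag(fetch_ip_robots_txt())
```

Passing the IP lookup as a callable mirrors the order of operations above: the robots.txt at the IP address is only requested after the tag has been found on one of the domains.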