
After studying the "human" visits in my logs, I looked at the visits made by bots that can be identified by their userAgent... and we can say they are legion.
The major search engines are of course present.
Google

"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Bing (Microsoft)

"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
and another Microsoft bot, for MSN
"msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"
Yahoo! (I thought they used the Bing search engine?)

"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
"Yahoo! Slurp China"
But also some lesser-known engines:
Exalead (French)

"Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"
Voilà (French)

"Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 (support.voilabot@orange-ftgroup.com)"
Yacy (a free and decentralized search engine I discovered thanks to my logs)

"yacybot (webportal-global; amd64 Linux 3.6.10-nrj-desktop-1rosa; java 1.7.0_b147-icedtea; Europe/fr) http://yacy.net/bot.html"
Baidu (Chinese)

"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
Jike (Chinese)
"Mozilla/5.0 (compatible; JikeSpider; +http://shoulu.jike.com/spider.html)"
Yandex (Russian)
"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
"Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)"
Blekko

"Mozilla/5.0 (compatible; Blekkobot; ScoutJet; +http://blekko.com/about/blekkobot)"
gimme60

"gimme60 (Gimme60 Store ID Bot; gimme60.com)"
In addition to these search engines, there are also data-extraction bots whose activities are less visible on the Internet.
alexa.com (Ranking site)

"ia_archiver (+http://www.alexa.com/site/help/webmasters; crawler@alexa.com)"
a Twitter bot

"Twitterbot/1.0"
A site using Twitter

"Twitmunin Crawler http://www.twitmunin.com"
And many companies that collect and cross-reference data to sell it to their customers
80legs

http://www.80legs.com/webcrawler.html;) Gecko/2008032620"
panscient

"panscient.com"
Netcraft

"Mozilla/5.0 (compatible; NetcraftSurveyAgent/1.0; +info@netcraft.com)"
ahrefs

"Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)"
gnip
"UnwindFetchor/1.0 (+http://www.gnip.com/)"
Topsy

"Mozilla/5.0 (compatible; Butterfly/1.0; +http://labs.topsy.com/butterfly/) Gecko/2009032608 Firefox/3.0.8"
And a few small bots whose role I was not able to determine
"Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
"Web front page analyser. robots.txt complaint (norw.acd.inst@gmail.com)"
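For readers who want to repeat this kind of survey on their own logs, here is a minimal sketch in Python of how bots could be picked out by their userAgent. It is not the script used for this post. It assumes that the userAgent is the last double-quoted field of each log line (as in the Apache "combined" format), and the names BOT_TOKENS, extract_user_agent, identify_bot and count_bots are hypothetical helpers invented for the illustration. The tokens themselves are taken from the strings listed above.

```python
import re
from collections import Counter

# Hypothetical lookup table: a distinctive substring of each userAgent
# listed in this post, mapped to the name of the bot's operator.
BOT_TOKENS = {
    "Googlebot": "Google",
    "bingbot": "Bing",
    "msnbot": "MSN",
    "Yahoo! Slurp": "Yahoo!",
    "Exabot": "Exalead",
    "VoilaBot": "Voilà",
    "yacybot": "Yacy",
    "Baiduspider": "Baidu",
    "JikeSpider": "Jike",
    "YandexBot": "Yandex",
    "YandexImages": "Yandex",
    "Blekkobot": "Blekko",
    "ia_archiver": "Alexa",
    "Twitterbot": "Twitter",
    "NetcraftSurveyAgent": "Netcraft",
    "AhrefsBot": "ahrefs",
}


def extract_user_agent(line):
    """Return the last double-quoted field of a log line, or None.

    Assumption: the userAgent is the last quoted field, as in the
    Apache "combined" log format. Adjust for other formats.
    """
    fields = re.findall(r'"([^"]*)"', line)
    return fields[-1] if fields else None


def identify_bot(user_agent):
    """Return the operator name for a known bot userAgent, else None."""
    lowered = user_agent.lower()
    for token, name in BOT_TOKENS.items():
        if token.lower() in lowered:
            return name
    return None


def count_bots(lines):
    """Count visits per identified bot; unknown agents are grouped together."""
    counts = Counter()
    for line in lines:
        user_agent = extract_user_agent(line)
        if user_agent is None:
            continue  # no quoted field on this line
        counts[identify_bot(user_agent) or "unidentified"] += 1
    return counts
```

Calling count_bots(open("access.log")) on a log file (the file name is only an example) would give a tally per bot. Plain substring matching only catches bots that announce themselves honestly, which is exactly the limitation mentioned at the start of this post.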
Next time I'll talk about the traces left by geeks in the teapot log!