Wedge
Public area => The Pub => Off-topic => Topic started by: Norodo on September 8th, 2011, 04:22 AM
-
What is this... It does this all the time. Are you toying with my head, or is Baidu very Norodo-happy?
(http://i.imgur.com/kRw5Ll.png)(http://imgur.com/kRw5L)
-
It's a very active bot.
Baidu (180.76.5.183) 06:48:39 Viewing the album "St Abbs Oct 16 - 17th 2010" in the gallery
Baidu (180.76.5.94) 06:42:21 Viewing the album "St Abbs Oct 16 - 17th 2010" in the gallery
-
Baidu is a horrifically vicious bot. Where Google might send two or three requests at a time, I've seen Baidu send 20.
Interestingly, because of default Bad Behaviour checks, Baidu is actually currently blocked, and I'm honestly not sure whether I should change that or not - and I'm more inclined not to change it for the most part.
-
It's because Wedge will not be released in English. Its language strings are from now on hardcoded to Mandarin. Did we forget to tell you? China is the biggest market now.
Oh, and because IE6 is the leading browser in China, I've dropped compatibility with other browsers. Standards are too annoying to follow.
-
/me quietly chokes on first cup of tea of the day :lol:
-
We are all being stalked by Baidu, RUN!!!!!
-
I'm not being stalked by Baidu on at least one site of mine ;)
-
Hmm yeah...Baidu keeps stalking me as well. I might as well blow up their asses.
-
Protip: examine the HTTP headers.
If there's no Accept header, chances are it's a dodgy request - or it's Baidu.
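For anyone who wants to try that check, here's a rough sketch in Python. The function name `classify_request` and the three labels are made up for illustration, and the "Baiduspider" user-agent token is not something quoted in this thread; it's the token Baidu's crawler normally announces itself with. Treat it as a starting point, not a complete bot filter.

```python
def classify_request(headers):
    """Roughly classify a request from its HTTP headers.

    `headers` is a plain dict of header name -> value. Header names are
    compared case-insensitively, as HTTP requires.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    # Assumption: Baidu's crawler identifies itself as "Baiduspider".
    user_agent = normalized.get("user-agent", "")
    if "baiduspider" in user_agent.lower():
        return "baidu"

    # A missing (or empty) Accept header is the red flag described above.
    if not normalized.get("accept"):
        return "suspicious"

    return "ok"
```

Ordinary browsers send an Accept header with every request, so the "suspicious" branch should rarely catch real visitors, but it's worth logging for a while before you start blocking on it.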
-
Baidu 3:00 Viewing Norodo's profile.
Baidu 3:05 Viewing Norodo's profile.
Baidu 3:10 Viewing Norodo's profile.
Ok Baidu. I think you've seen all there is to see by now.
Why must you stalk me? It keeps happening.
-
Because Baidu's a shitty spider.
-
But...
(http://i.imgur.com/S6UPy.png)
Why me?
-
Do you want me to add a robots.txt rule for that page to push Baidu away? I can even change the page for it, to post a message to them :P
-
Baidu is so persistent you will probably draw 30 Baidu bots to the page :lol:
-
Yeah, they don't really honour robots.txt.
-
Why not stop Baidu with iptables?
iptables -I INPUT -s 119.63.192.0/24 -j DROP
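One caveat: a /24 only covers 256 addresses, so it's worth checking the rule against the IPs you actually see. Here's a quick sketch with Python's standard `ipaddress` module, using the two Baidu addresses quoted earlier in this thread. The helper name `is_blocked` is just for illustration.

```python
import ipaddress

# The range from the iptables rule above.
BLOCKED = ipaddress.ip_network("119.63.192.0/24")


def is_blocked(ip, network=BLOCKED):
    """Return True if `ip` falls inside `network`."""
    return ipaddress.ip_address(ip) in network


# Addresses reported for Baidu earlier in the thread.
seen = ["180.76.5.183", "180.76.5.94"]

for ip in seen:
    print(ip, "blocked" if is_blocked(ip) else "NOT blocked")
```

Neither of those addresses is inside 119.63.192.0/24, so that single rule would not have stopped the requests shown above; you'd need to add the other ranges Baidu crawls from as well.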
-
Because not everyone has control of iptables (though I think Nao has access to it here)
But you can definitely exclude them by examining the HTTP request.
-
I may have access to iptables, but... Why exactly would I stop Baidu? As long as it's not killing the bandwidth, I couldn't care less about it...
If you don't like seeing it in the who's online area, I can remove it from the list of recognized bots... :P
-
That's the thing: it consumes 8-10x more bandwidth than any other search engine, and it's going to keep going.
-
That and for people outside China it's just not a bot that is needed. We have -much- better search engines.
-
In other news, now it's viewing my profile 3 times over.
-
I say ban they. :p
-
Maybe they're getting better; I told Baidu to bugger off in my robots.txt and (so far) they've not returned (fingers crossed, touch wood...)
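In case anyone wants to try the same thing, here's roughly what such a rule could look like, checked with Python's standard `urllib.robotparser`. The robots.txt content is an assumed example, not the actual file; "Baiduspider" is the user-agent token Baidu's crawler normally uses, and `example.com` is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Assumed example: shut Baidu's crawler out, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a crawler obeying `robots_txt` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("Baiduspider", "http://example.com/profile"))
print(allowed("Googlebot", "http://example.com/profile"))
```

This only tells you what a well-behaved crawler would conclude from the file. robots.txt is advisory, so it does nothing against a bot that ignores it.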
-
Wait, Baidu LISTENED TO YOU? Colour me surprised.
-
They have never honored my robots.txt. I was sometimes getting 40 to 50 of them at a time. The damn little roaches were everywhere at once. I don't have access to iptables, so I just banned them in .htaccess about 6 months or so ago.
-
Maybe it requires a specific format in the robots.txt...? Like, Baidu has to be written in hanzi? :P