Blocking spambots with .htaccess

JohnForsythe

I just wrote an article about using .htaccess to block spambots and scrapers, and thought it might be good to post here. In the article, I go over how to block access by user-agent, referrer, IP address, and a few other things.
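As a rough illustration of the three blocking methods mentioned above, here is a minimal Python sketch that generates mod_rewrite lines for an .htaccess file. It is not taken from the article: the function name `build_htaccess_rules` and the sample values are hypothetical, and it assumes mod_rewrite is enabled on the server.

```python
import re


def build_htaccess_rules(user_agents, referrers, ips):
    """Return mod_rewrite lines that answer 403 Forbidden to matching requests.

    Hypothetical helper: each argument is a list of plain strings, which are
    regex-escaped before being written into a RewriteCond pattern.
    """
    checks = []
    for agent in user_agents:
        checks.append(("%{HTTP_USER_AGENT}", re.escape(agent)))
    for referrer in referrers:
        checks.append(("%{HTTP_REFERER}", re.escape(referrer)))
    for ip in ips:
        # Anchor the pattern so 192.0.2.1 does not also match 192.0.2.10.
        checks.append(("%{REMOTE_ADDR}", "^" + re.escape(ip) + "$"))

    if not checks:
        # A RewriteRule with no conditions would block every request.
        return []

    lines = ["RewriteEngine On"]
    for index, (variable, pattern) in enumerate(checks):
        # Every condition except the last is chained with OR.
        flags = "[NC,OR]" if index < len(checks) - 1 else "[NC]"
        lines.append(f"RewriteCond {variable} {pattern} {flags}")
    lines.append("RewriteRule ^ - [F]")
    return lines


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print("\n".join(build_htaccess_rules(["BadBot"], ["spam.example"], ["192.0.2.10"])))
```

The generated lines would need to go above any existing rewrite rules in the file, and it is worth testing on a non-production copy first, since a mistake in .htaccess can take the whole site down.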

Comments

thanks

ggevalt

I appreciate your article in that it's very clear and to the point.
.htaccess IS a very powerful way to combat the bots, but it is also very time-consuming and, as you point out, very easy to screw up. The time, of course, goes into navigating the hundreds, and in our case thousands, of referrers that have developed. Some, in fact, no longer exist, given that they operate like credit card thieves: setting up shop with some unsuspecting host, doing their thing for a few weeks, and then splitting.

What intrigues me is whether the best minds on the Net can in fact figure out better ways to detect certain behaviors. For instance, I had someone trying to log in as "anonymous" about 25 times in the space of seven minutes. Would that not be easy to deal with in core? (In fact, is there already something that deals with this?) And, to reiterate my point about time, I do NOT have the time to check the IP address for each of those anonymous login attempts. I randomly checked three and they were the same, so I blocked that IP address. It seems that behavior detection, or something core- or even server-based, would have more impact and would do more to address the cause rather than just the symptom.
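To make the idea of behavior detection concrete, here is a minimal Python sketch of a sliding-window counter for failed logins per IP address. It is only an illustration, not how Drupal core or any module does it: the class name `LoginThrottle` and the thresholds (5 failures within 420 seconds) are assumptions chosen to fit the "25 times in seven minutes" case.

```python
from collections import defaultdict, deque


class LoginThrottle:
    """Hypothetical sliding-window counter of failed logins per IP address."""

    def __init__(self, max_attempts=5, window_seconds=420):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures

    def record_failure(self, ip, now):
        """Record a failed login at time `now` (in seconds).

        Returns True when the IP has exceeded max_attempts within the window,
        meaning the caller may want to block it.
        """
        recent = self._failures[ip]
        recent.append(now)
        # Drop failures that have fallen out of the time window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_attempts


if __name__ == "__main__":
    throttle = LoginThrottle()
    # Simulate 25 failed attempts from one made-up address, 17 seconds apart.
    for attempt in range(25):
        if throttle.record_failure("203.0.113.7", attempt * 17):
            print("block 203.0.113.7 after attempt", attempt + 1)
            break
```

With these assumed thresholds the address would be flagged automatically instead of being checked by hand.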

And, on another level, given the malicious and damaging nature of all of this, is there not a public service or even a business opportunity in going after the perpetrators? And is there not public policy that needs to be developed? (If, in fact, a lot of these guys are coming from China, are there not higher-level discussions that should be had to get China to develop better controls?)

I don't want to sound like a crazy person here. But we KNOW that every Internet user in the world complains about and has to waste time with spam e-mail, and we KNOW that most every Internet user complains about losing legit email because of "filters" and the like. What if it reaches the same level on Web sites and we suddenly realize that ALL of us are diminishing content, accessibility and community building because of these bots?

gg

ggevalt
www.youngwritersproject.org

I couldn't agree more!

siliconmeadow

Looking at my Logwatch report for today, I've had 34 attempts to log in to my server via ssh as root: 28 from one IP address and 6 from another. And this was a slow day.
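For anyone who wants the same per-address tally without Logwatch, here is a minimal Python sketch that counts failed ssh password attempts by source IP. It assumes the typical OpenSSH "Failed password for ... from ..." log line, which may differ on your system; the function name `count_failed_logins` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed OpenSSH log format, e.g.
# "... sshd[123]: Failed password for root from 192.0.2.10 port 4000 ssh2"
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")


def count_failed_logins(log_lines, user="root"):
    """Count failed password attempts for `user`, grouped by source IP."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match and match.group(1) == user:
            counts[match.group(2)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines; in practice these would be read from the auth log.
    sample = [
        "Jan  1 10:00:00 host sshd[1]: Failed password for root from 192.0.2.10 port 4000 ssh2",
        "Jan  1 10:00:05 host sshd[2]: Failed password for root from 192.0.2.10 port 4001 ssh2",
        "Jan  1 10:01:00 host sshd[3]: Failed password for root from 198.51.100.5 port 4002 ssh2",
    ]
    for ip, attempts in count_failed_logins(sample).most_common():
        print(ip, attempts)
```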

I was explaining to my wife that it's like a couple of kids walking down a busy street checking the doors of all the parked cars in broad daylight. The other pedestrians are completely oblivious, and there are no cops. I have reported it to the police stations (the ISPs who own the IP address blocks) in the past, even copying the relevant paragraphs from their acceptable use policies and giving the exact time of day at which the incidents occurred, but I've had no responses. I do realise that the IP addresses could be spoofed, but ALL OF THEM?

It is cybercrime, surely. And surely I've gathered enough evidence to identify a script-kiddie or two. Any other suggestions how we could bust them?


Richard Sheppard
http://www.siliconmeadow.net

I have found DenyHosts

jaydub

I have found DenyHosts (http://denyhosts.sourceforge.net) to be a worthwhile tool for automatically detecting and blocking SSH attackers...

Thank you!

siliconmeadow

Looks like a good tool to try, though fortunately I feel my server is pretty well protected. My interest now is in catching and punishing the people who are attempting these cracks. My annoyance is that, as there are no consequences, there is no reason for them to stop.


Richard Sheppard
http://www.siliconmeadow.net

Combating Spam and Bots
