How to block access to bots?

Started by SIROTA, Aug 03, 2022, 11:52 AM

SIROTA (topic starter)

Bots visit my site regularly, and I don't even know what they want. Maybe they are inflating behavioral factors for themselves or trying to push my site out of the top results, or maybe some bot-tracking service is trying to pressure me into paying for its services this way.
It doesn't matter.

They used to be very easy to deal with. But I had to move to another hosting provider, and it does not even offer a basic service for tracking the IP addresses of bots.
Every host has its own sore spots: one has one set of problems, the next has another...

There is a WordPress plugin called CIDRAM. It is completely free, but hard to set up because of poor instructions. I use about 80 percent of its features.
I would like that to be 100 percent.

I assume the bots all come from a single LLC Cloud Networks server.
How can I ban bots by name?


Quote from: SIROTA on Aug 03, 2022, 11:52 AM
How can I ban bots by name?

Just put the site behind Cloudflare and block the subnets you want in a couple of clicks.
It's free.

SIROTA (topic starter)

Quote from: Ali_Pro on Aug 03, 2022, 12:10 PM
Just put the site behind Cloudflare and block the subnets you want in a couple of clicks.
It's free.

I connected it, and now email to the mailbox on that domain has stopped arriving.

What needs to be configured, and where?


Quote from: SIROTA on Aug 03, 2022, 01:34 PM
What needs to be configured, and where?

The MX records in the DNS settings in Cloudflare.

SIROTA (topic starter)

The thing is, the records have already been updated, and mail for the connected domain has stopped working since the switch to Cloudflare. I turned off the proxy for the email records in the DNS settings, and the messages still do not arrive. Sending over SMTP works, but nothing comes in over IMAP.

Post Merge: Aug 03, 2022, 05:26 PM

I had gotten the NS and MX records slightly mixed up :)
Now everything is fine.

Great. Why does nobody mention that Cloudflare successfully keeps bots off a site, and that you don't have to pay anything?
I clicked only 2 buttons (the JavaScript check and something else) and nearly all the bots dropped off. 4 bots are left, and to hell with them. I used to suffer: tracking them down, setting up CIDRAM, banning subnets along with real users. It turns out everything is simple and does not even require my involvement.

Thank you, kind sir, for the advice!


This issue has been discussed on the forums for many years.

It is much more effective to allow only the countries you need and send all the others through the JS check. For the allowed countries, also put the subnets of hosting providers (they are easy to find by googling) behind the JS check.

Some bots pass the JS check. For those you can simply put up a CAPTCHA: they will not solve it, because it would be too expensive for them.


You can keep bots, parsers and other spam traffic off your site with rules in the .htaccess file (practice has shown that recommendations in robots.txt are often ignored). How to do this is described below.

Blocking bots via .htaccess
There are three ways to prevent bots from visiting your site:

By IP address
By User-Agent
By IP mask (by country)

Identifying unwanted bots by their IP addresses and then banning those addresses is tedious and ineffective: IP addresses change, new ones appear, and so on.
There is a much better solution: ban the bot by its name (User-Agent). In this case, it does not matter which IP it comes from. You can additionally block bots by the IP address masks of the foreign countries they come from (the servers of SEO services, spam bots and other parsers are often located abroad).

List of popular bots
There are quite a lot of unwanted robots, but the most popular and, accordingly, the most annoying are listed below:

AhrefsBot – a service robot that analyzes site pages for external links.
SemrushBot – a robot of the Semrush analytics service that analyzes websites.
MJ12bot – a search robot of the Majestic service that collects data about outgoing links on websites.
Riddler – service robot
aiHitBot – service robot
trovitBot – service robot
Detectify – service robot
BLEXBot – a robot from
dotbot – service robot
FlipboardProxy – service robot
rogerBot – a robot from Moz Pro. It accesses your site's code, analyzes it and delivers a report on it to Moz Pro.
MegaIndex – service robot (automated promotion system)

Blocking access by User-Agent
Add the following code to .htaccess:

Option #1

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ".*AhrefsBot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*SemrushBot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*MJ12bot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*Riddler.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*aiHitBot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*trovitBot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*Detectify.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*BLEXBot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*dotbot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*FlipboardProxy.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*rogerBot.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*MegaIndex\.ru/2\.0.*" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*LinkpadBot.*"
RewriteRule ".*" "-" [F]
</IfModule>

Option #2

RewriteEngine on
# Each RewriteCond applies only to the RewriteRule directly below it.
RewriteCond %{HTTP_USER_AGENT} AhrefsBot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} SemrushBot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} MJ12bot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} Riddler
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} aiHitBot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} trovitBot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} Detectify
RewriteRule (.*) - [F,L]
# The BLEXBot condition was missing; without it the rule below blocks every visitor.
RewriteCond %{HTTP_USER_AGENT} BLEXBot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} dotbot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} FlipboardProxy
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} rogerBot
RewriteRule (.*) - [F,L]
RewriteCond %{HTTP_USER_AGENT} LinkpadBot
RewriteRule (.*) - [F,L]
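
Both options can probably be collapsed into a single condition. The sketch below is only one possible variant, not something tested on the sites discussed here: it assumes mod_rewrite is enabled, matches the names case-insensitively with [NC] (so it is slightly broader than the options above), and uses plain MegaIndex instead of the exact MegaIndex.ru/2.0 string from Option #1.

```apache
<IfModule mod_rewrite.c>
RewriteEngine on
# One case-insensitive condition covering every bot name used in the options above.
# Assumption: matching on "MegaIndex" alone is acceptable for your site.
RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|SemrushBot|MJ12bot|Riddler|aiHitBot|trovitBot|Detectify|BLEXBot|dotbot|FlipboardProxy|rogerBot|MegaIndex|LinkpadBot) [NC]
# Answer with 403 Forbidden; the [F] flag implies [L], so no further rules are processed.
RewriteRule ^ - [F]
</IfModule>
```

With this form, adding another bot means adding one more name inside the parentheses, and there is no chain of [OR] flags or repeated rules to keep in sync.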

Blocking access by IP address mask
A significant share of the services that put load on your site run on servers located abroad.
After analyzing the log files of dozens of sites, I identified the IP address masks of the robots that visit most often.

To block bots coming from these IP addresses, add the following lines to .htaccess:

Order deny,allow
Deny from 5.9. 66.4.

You can add other IP addresses that show a lot of activity to the .htaccess file.
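
The Order and Deny directives above are Apache 2.2 syntax; on Apache 2.4 they only work through the mod_access_compat module. If the hosting runs Apache 2.4 and allows authorization directives in .htaccess (both are assumptions, check with your host), a roughly equivalent sketch looks like this. The two prefixes from the example are written here as CIDR ranges.

```apache
# Apache 2.4 syntax (mod_authz_core and mod_authz_host).
<RequireAll>
    # Allow everyone by default...
    Require all granted
    # ...except the two address ranges from the example above.
    Require not ip 5.9.0.0/16
    Require not ip 66.4.0.0/16
</RequireAll>
```

It is better not to mix the old and the new syntax in one file; pick the one that matches the Apache version on the server.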