Semalt Taking Over Your Google Analytics? Block It!

Semalt claims to be an SEO service, but there are many clear reports that they are using malware to boost search engine rankings. They should be blocked!

Above all, do not use their “removal tool”: it messes up your entire Google Analytics history. The thread at https://productforums.google.com/forum/#!topic/analytics/ePCUyPkDVvs has numerous very upset people who used Semalt’s “Removal Tool”.

Semalt taps into a malware network of hundreds of infected computers to make their activities harder to spot. Read http://www.infosecurity-magazine.com/news/semalt-hijacks-hundreds-of/

You can block the semalt.com web bot from accessing your site, tell Google Analytics to ignore all Semalt activity on your site, or do both.

To tell Google Analytics to ignore Semalt, see http://www.hallaminternet.com/2014/remove-semalt-google-analytics/

(The same idea works for telling Analytics to ignore your own visits to your site; see http://www.hallaminternet.com/2012/get-a-clearer-picture-of-your-website-traffic-how-to-exclude-internal-visitors-from-google-analytics/ )

To completely block Semalt from ever getting into your site, add these lines to your .htaccess file. They should go before the WordPress lines. Notice that the period in semalt.com has to have a backslash \ before it. Replace “/shared/bad-webbot.php” with the actual location of your minimal HTML page; mine simply displays “Bad Webbot! Sit in the Corner” and logs what they tried to do.

SetEnvIfNoCase User-Agent (semalt\.com) badUserAgent=$1
# If badUserAgent is set (compares greater than the empty string) and the request
# is not already for an error page, serve the minimal page meant just for bad bots.
# The error pages have to be excluded: simply using   RewriteRule (.*) - [F]
# gave a 500 Server Error.
RewriteCond %{ENV:badUserAgent} >""
RewriteCond %{REQUEST_URI} !/shared/bad-webbot\.php$ [NC]
RewriteCond %{REQUEST_URI} !/shared/403\.php$ [NC]
RewriteRule (.*) /shared/bad-webbot.php [L]
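Semalt is widely reported as referrer spam, so its requests may identify themselves in the Referer header rather than in the User-Agent. As a rough sketch only, and assuming your access log really does show semalt.com in the Referer field (that is my assumption, not something the rules above check), one extra line can set the same variable from that header:

```apache
# Assumption: Semalt requests carry a Referer header containing "semalt.com".
# Check your access log before relying on this.
# Put this next to the User-Agent line; the RewriteCond and RewriteRule
# lines stay exactly as they are, since they only look at badUserAgent.
SetEnvIfNoCase Referer (semalt\.com) badUserAgent=$1
```

Because the rewrite conditions only test whether badUserAgent is set, it does not matter which header set it.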

While you’re at it, block several other bad bots. Insert these lines right before the semalt (first) line:

SetEnvIfNoCase User-Agent (binlar|casper|cmsworldmap|comodo|diavol|dotbot|feedfinder|flicky|ia_archiver|jakarta|kmccrew|nutch|planetwork|purebot|pycurl|skygrid|sucker|turnit|vikspider|zmeu) badUserAgent=$1
SetEnvIfNoCase User-Agent "ia_archiver \(\+http://www\.alexa\.com/site/help/webmasters; crawler@alexa\.com\)" !badUserAgent
SetEnvIfNoCase User-Agent "LinkedInBot/1\.0 \(compatible; Mozilla/5\.0; Jakarta Commons-HttpClient/3\.1 \+http://www\.linkedin\.com\)" !badUserAgent

That says Alexa and LinkedIn are exceptions (!badUserAgent means not bad); all those other user agents are nuisances or worse.
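The same pattern works for any other bot you want to let through: a broad line marks everything that matches, and a later, more specific line takes the mark off again. SetEnvIf lines are applied in the order they appear, so the exception has to come after the broad rule. Here is a minimal sketch; the crawler name “ExampleSearch-Nutch/1.2” is made up for illustration and the block list is shortened:

```apache
# Broad rule (shortened here): any user agent containing "nutch" or "zmeu" is marked bad.
SetEnvIfNoCase User-Agent (nutch|zmeu) badUserAgent=$1
# Hypothetical exception: "ExampleSearch-Nutch/1.2" is a made-up crawler name.
# Because this line comes later, it removes the variable the line above just set.
SetEnvIfNoCase User-Agent "ExampleSearch-Nutch/1\.2" !badUserAgent
```

Putting the pattern in double quotes is the safe way to include spaces if the real user agent string has them.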

