Discovering traffic increases and new referring sites in Google Analytics reports is a great feeling. SEO is time-consuming work, so it’s incredibly rewarding to discover that the effort you’re putting into it is paying off. But before celebrating, it’s important to make sure those successes aren’t the result of fake traffic—Google Analytics spam.
Recently, industry experts have seen an increase in website traffic from spambots, which cause new (false) keywords, languages, hostnames, and referrers to appear in Analytics.
To enjoy the benefits of Google Analytics for measuring SEO and marketing campaigns, you need clean data. By identifying and filtering Google Analytics referrer spam, you can ensure data and reports are clean, accurate, and a reflection of interactions from real users.
Google Analytics spam is inflated data that appears when malicious bots send fake traffic to Analytics properties. Many of these fake hits never actually occur—meaning the bots don’t really visit your website. Instead, they spam Analytics accounts with fake data in hopes of conning webmasters into visiting their websites.
Spambots do this by falsifying keyword, language, hostname, and referrer data. When webmasters see this data in their Google Analytics accounts, the natural reaction is to investigate it, searching for the falsified keywords or navigating to the referring websites. This behavior helps spammers increase traffic to their own properties.
But it also completely sabotages the data you need to make good decisions for your marketing strategies. Identifying, blocking, and filtering Google Analytics spam ensures that the data you rely on is clean, accurate, and a reflection of interactions from real users.
The first step in eliminating Google Analytics spam is identifying it. Most often, Analytics spam appears in organic keyword, language, hostname, and referral reports.
Organic Keyword Report Spam:
Keywords that specify the web address of an unrelated site are likely the result of spam hits.
Language Report Spam:
Google formats languages as “xx-xx,” so anything in the language report that doesn’t match that format is likely spam.
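If you export the language report, a quick script can flag the odd entries for you. The sketch below is only an illustration: it assumes legitimate values follow the “xx-xx” format described above, and the function name and sample values are made up for the example. Treat anything it flags as a candidate for review rather than confirmed spam.

```python
import re

# Assumption: legitimate language values look like "en-us"
# (two letters, a hyphen, two letters), per the format described above.
LANGUAGE_PATTERN = re.compile(r"[a-z]{2}-[a-z]{2}", re.IGNORECASE)


def find_suspicious_languages(values):
    """Return the values that do not match the xx-xx format."""
    return [v for v in values if not LANGUAGE_PATTERN.fullmatch(v.strip())]


# Hypothetical values copied from a language report export.
sample = ["en-us", "fr-fr", "visit spam-example.invalid today", "pt-br"]
suspicious = find_suspicious_languages(sample)
print(suspicious)
```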
Hostname Report Spam:
Hostname spam can be more difficult to spot. Additional steps are required to confirm that hostname traffic is the result of Analytics spam.
Referral Report Spam:
Referrals should point to pages on your property. If the landing page for the referral is another website, it’s likely a spam referrer.
Some spam is obvious—anything in the language report that isn’t a language is clearly spam. Other times, you’ll have to conduct a few tests to confirm that what you’re seeing is spam and not legitimate visits or referrers.
Once you have a list of all of the spambots that are attacking your Analytics property, it’s time to block future report data from those spammers, and filter their data from historical reports.
With a list of sites that are sending referral spam in hand, set up filters in Google Analytics to block fake traffic from spambots in the future and to remove historical spam data from reports.
First, to reduce manual work in the future, enable automatic bot filtering. Google is aware of the spambot problem and will automatically filter out known bots when this setting is turned on. Within the “Admin” tab, click “View Settings,” and check the “Exclude all hits from known bots and spiders” box under “Bot Filtering.” Save changes to have Google automatically filter spambot hits.
This doesn’t solve the problem, though. New spambots are created every day, and spammers use masked IP addresses and other deceptive approaches to overcome auto-filtering. For this reason—and to eliminate historical spambot data—you’ll need to manually add filters for existing and new spammers.
Before creating any filters, create a sandbox environment where you can test changes so you don’t accidentally skew your data:
Creating a separate view allows you to test filters without accidentally skewing data, and ensures you have ongoing access to unfiltered data in the default “All Web Site Data” view.
With a filtered view created, you’re ready to establish spambot filters.
A hostname inclusion filter tells Google Analytics to record only hits whose hostname belongs to your site. To create a hostname inclusion filter:
This will filter out a lot of the spam, and it provides the added bonus of potentially filtering out traffic sent from dev, staging, or other hosting environments.
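The filter pattern for a hostname inclusion filter is a regular expression, and it can be handy to test one before pasting it into Analytics. Here is a minimal sketch in Python, assuming two made-up hostnames (www.example.com and shop.example.com) and hypothetical helper names. Python’s regex syntax may differ slightly from what Analytics accepts, so check the generated pattern in your test view first.

```python
import re


def build_hostname_pattern(hostnames):
    """Build one regex that matches only the listed hostnames, in full."""
    # Escape each hostname so "." is treated as a literal dot, not a wildcard.
    escaped = [re.escape(h.lower()) for h in hostnames]
    return "^(" + "|".join(escaped) + ")$"


def is_valid_hostname(hostname, pattern):
    """Return True if the hostname is one of the hostnames in the pattern."""
    return re.fullmatch(pattern, hostname.lower()) is not None


# Hypothetical hostnames; replace them with the ones your site really uses.
valid_hostnames = ["www.example.com", "shop.example.com"]
pattern = build_hostname_pattern(valid_hostnames)

for host in ["www.example.com", "staging.example.com", "spam-example.invalid"]:
    print(host, is_valid_hostname(host, pattern))
```

Anchoring the pattern at both ends matters here: without it, a spam hostname that merely contains your domain name would still pass the filter.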
You may also want to create filters to exclude specific patterns of spam that are tarnishing reports. To create referrer exclusion filters:
Language and keyword spam can also be blocked with exclusion filters using the same process. Just substitute the “Filter Field” selection (step six) with “Language Settings” for language spam, or “Search Term” for keyword spam.
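Exclusion filters also take a regular expression, so a long list of spam referrers can be combined into a handful of patterns. The sketch below shows one way to do that. It assumes a 255-character limit on each filter pattern (verify this against your Analytics account), and the domain names and function name are placeholders invented for the example.

```python
import re

# Assumption: each filter pattern may hold at most 255 characters.
MAX_PATTERN_LENGTH = 255


def build_exclusion_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Join spam domains with "|" into as few patterns as fit the limit."""
    patterns = []
    current = ""
    for domain in domains:
        escaped = re.escape(domain.lower())
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_length and current:
            # The current pattern is full: store it and start a new one.
            patterns.append(current)
            current = escaped
        else:
            # Note: a single domain longer than the limit is kept as-is.
            current = candidate
    if current:
        patterns.append(current)
    return patterns


# Hypothetical spam referrers collected from the referral report.
spam_domains = ["spamone.invalid", "spamtwo.invalid", "spamthree.invalid"]
for number, pattern in enumerate(build_exclusion_patterns(spam_domains), start=1):
    print(f"Filter {number}: {pattern}")
```

Each pattern returned would go into its own exclusion filter, so a very long spam list simply produces more filters.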
After setting up these filters, you can expect to see changes in historical reports. The significance of the changes will depend on what percentage of your existing traffic was from spam referrals. You may see a decrease in referral numbers, organic traffic numbers, and keywords that are driving users to your site, but at least now you’re seeing accurate data.
While going through these steps will prevent known spambot referrals from appearing in reports, spammers are always finding new ways to circumvent filters. As such, identifying and preventing Google Analytics spam is an ongoing process that needs to be repeated before you run each new round of analytics reports.
In general, Google Analytics spam is problematic if it represents more than 2% of overall traffic, or 3% of traffic for any channel—and it’s important to check both. Even if only 1% of overall traffic is spam, that 1% could be 10% of organic search traffic, which would distort segmented analysis on organic traffic reports.
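To make that arithmetic concrete, here is a small sketch that checks both thresholds at once. The session counts are invented to mirror the example above (spam is 1% of overall traffic but 10% of organic), and the function and variable names are hypothetical.

```python
OVERALL_THRESHOLD = 0.02  # spam is a problem above 2% of overall traffic
CHANNEL_THRESHOLD = 0.03  # or above 3% of traffic for any single channel


def spam_report(sessions_by_channel, spam_by_channel):
    """Compute spam share overall and per channel, and flag problem areas."""
    total = sum(sessions_by_channel.values())
    total_spam = sum(spam_by_channel.get(c, 0) for c in sessions_by_channel)
    overall_share = total_spam / total if total else 0.0
    channel_shares = {
        channel: (spam_by_channel.get(channel, 0) / count if count else 0.0)
        for channel, count in sessions_by_channel.items()
    }
    flagged = sorted(c for c, s in channel_shares.items() if s > CHANNEL_THRESHOLD)
    return {
        "overall_share": overall_share,
        "overall_problem": overall_share > OVERALL_THRESHOLD,
        "channel_shares": channel_shares,
        "flagged_channels": flagged,
    }


# Hypothetical numbers: 10,000 sessions in total, 100 of them spam,
# all of which landed in the organic channel.
sessions = {"organic": 1000, "direct": 5000, "referral": 4000}
spam = {"organic": 100}
report = spam_report(sessions, spam)
print(report["overall_problem"], report["flagged_channels"])
```

With these numbers the overall check passes while the organic channel is flagged, which is exactly why both checks are worth running.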
A simple first step is enabling spam auto-filtering in Google Analytics. While it won’t remove all fake traffic, it will eradicate many known offenders. After that, set up filters for other spammers that are impacting your reports.
If you need help, a good SEO partner can conduct an audit of your Analytics traffic and establish the necessary filters for you. This will help ensure that your historical data—as well as all future reporting—accurately reflects actual visits to your site by real users.