Explanation behind copyrightinstitute.org visitor log In Google Analytics

Everything about complaint***copyrightinstitute.org referrer spam (also known as referral spam, log spam or referrer bombing) and how to prevent it and similar spam

Last updated on December 7th, 2017 at 09:56 am

Explanation behind copyrightinstitute.org Referrer Spam AKA Log Spam/Referrer Bombing And How To Prevent Them in Google Analytics

Over the last couple of days, webmasters have been panicking over visitor logs showing traffic to their websites from complaint1********.copyrightinstitute.org. Some concerned users have started multiple discussions on popular forums such as reddit and the WordPress Forum.

We did a little digging ourselves and concluded that this visit is purely automated and is classified as referrer spam (also known as referral spam, log spam or referrer bombing). Such visits are called ghosts because they never access your site. It is important to keep this in mind.

The technique involves making repeated web site requests using a fake referrer URL to the site the spammer wishes to advertise. Sites that publish their access logs, including referer statistics, will then inadvertently link back to the spammer’s site.



These links will be indexed by search engines as they crawl the access logs, improving the spammer’s search engine ranking. Except for polluting their statistics, the technique does not harm the affected sites.

Example traffic sources from the same primary domain include:

  • complaint100380163.copyrightinstitute.org
  • complaint100199116.copyrightinstitute.org
  • complaint103421430.copyrightinstitute.org

This type of traffic is sent to thousands of Google Analytics accounts to drive traffic to the spammer's site and promote their services or products.
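To spot these entries in an exported list of referral sources, the shared naming pattern can be matched with a regular expression. The following is a minimal Python sketch, not a definitive rule: it assumes the "complaint" prefix is always followed by digits only, which is inferred solely from the three examples above, and the function name and sample list are hypothetical.

```python
import re

# Assumption: "complaint" + digits + ".copyrightinstitute.org",
# inferred only from the three example sources listed above.
SPAM_SOURCE = re.compile(r"^complaint\d+\.copyrightinstitute\.org$", re.IGNORECASE)

def is_copyrightinstitute_spam(source: str) -> bool:
    """Return True if a referral source matches the assumed spam pattern."""
    return SPAM_SOURCE.match(source.strip()) is not None

# Hypothetical export of referral sources.
referrals = [
    "complaint100380163.copyrightinstitute.org",
    "google.com",
    "complaint103421430.copyrightinstitute.org",
]

flagged = [r for r in referrals if is_copyrightinstitute_spam(r)]
print(flagged)
```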

Observations

  1. Some webmasters will see traffic logs originating from the subdomain complaint1********.copyrightinstitute.org.
  2. These logs can be accessed using Google Analytics or other website traffic monitoring utilities.
  3. This log is mostly associated with WordPress-driven websites, but reports have also come from other website platforms.
  4. Clicking the link brings you to the Chrome Store app profile page for o-o-0-o-o Search Bar.
  5. o-o-0-o-o Search Bar is a generic app that resembles a fake aggregated search engine called o-o-0-o-o.

Is It Good or Bad?

There is no need to pull the cables, run a WHOIS search or pull out your DNS sniffers: we have observed no harmful or suspicious behavior associated with the Chrome Store app or with visiting the o-o-0-o-o domains. Don't be alarmed about being in trouble with copyright enforcers because of this log entry. This is not the procedure that copyright enforcers use.

Since at least 2014, a new variation of this form of spam has appeared in Google Analytics. Spammers send fake visits to Google Analytics, often without ever accessing the affected site. The technique is used to make the spammers' URLs appear in the site statistics, inducing the site owner to visit the spam URLs. When the spammer never visited the affected site, the fake visits are also called Ghost Spam.

  • o-o-0-o-o.com
  • o-o-1-o-o.com
  • o-o-2-o-o.com
  • o-o-3-o-o.com
  • o-o-4-o-o.com
  • o-o-5-o-o.com
  • o-o-7-o-o.com
  • o-o-8-o-o.com
  • o-o-9-o-o.com

How to Stop ALL Ghost Spam in Google Analytics with One Effective Filter

Usually it is recommended to add the referral to an exclusion filter after it is spotted. Although this is useful for a quick action against the spam, it has three big disadvantages.

  • Making filters every week for every new spam detected is tedious and time-consuming, especially if you manage many sites. Plus, by the time you apply the filter and it starts working, you already have some affected data.
  • Some of the spammers use direct visits along with the referrals.
  • These direct hits won't be stopped by the filter, so even if you exclude the referral you will still receive invalid traffic, which explains why some people have seen an unusual spike in direct traffic.

Luckily, there is a good way to prevent all these problems. Most ghost spam works by hitting random GA tracking IDs, meaning the offender doesn't really know who the target is. For that reason, either the hostname is not set or a fake one is used. (See report below)

Ghost-Spam.png

You can see that they use some weird names or don’t even bother to set one. Although there are some known names in the list, these can be easily added by the spammer.


On the other hand, valid traffic will always use a real hostname. In most cases, this will be your domain. But it can also come from paid services, translation services, or any other place where you've inserted the GA tracking code.

Valid-Referral.png

Based on this, we can make a filter that will include only hits that use real hostnames. This will automatically exclude all hits from ghost spam, whether it shows up as a referral, keyword, or pageview; or even as a direct visit.
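The effect of such an include filter can be sketched in Python. This only illustrates the idea and is not how Google Analytics implements filters: the hit records, field names and the hostname expression below are hypothetical placeholders.

```python
import re

# Hypothetical expression listing the hostnames that really carry your tracking code.
VALID_HOSTNAMES = re.compile(r"yourmaindomain\.com|payingservice\.com")

# Hypothetical hits: ghost spam either has no hostname or a fake one.
hits = [
    {"hostname": "yourmaindomain.com", "source": "google"},
    {"hostname": "(not set)", "source": "complaint100380163.copyrightinstitute.org"},
    {"hostname": "fake-host.example", "source": "(direct)"},
    {"hostname": "blog.yourmaindomain.com", "source": "(direct)"},
]

def include_by_hostname(hits, pattern):
    """Keep only the hits whose hostname matches the valid-hostname expression."""
    return [hit for hit in hits if pattern.search(hit["hostname"])]

kept = include_by_hostname(hits, VALID_HOSTNAMES)
for hit in kept:
    print(hit["hostname"], hit["source"])
```

In this sketch the fake direct visit is dropped along with the fake referral, because the decision depends only on the hostname and not on the traffic source.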

To create this filter, you will need to find the report of hostnames. Here’s how:

  1. Go to the Reporting tab in GA
  2. Click on Audience in the left-hand panel
  3. Expand Technology and select Network
  4. At the top of the report, click on Hostname

Valid-list

You will see a list of all hostnames, including the ones that the spam uses. Make a list of all the valid hostnames you find, as follows:

  • yourmaindomain.com
  • blog.yourmaindomain.com
  • es.yourmaindomain.com
  • payingservice.com
  • translatetool.com
  • anotheruseddomain.com

For small to medium sites, this list of hostnames will likely consist of the main domain and a couple of subdomains. After you are sure you got all of them, create a regular expression similar to this one:

yourmaindomain\.com|anotheruseddomain\.com|payingservice\.com|translatetool\.com
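If the hostname list is long, the expression can be assembled rather than typed by hand. Here is a minimal Python sketch, assuming the placeholder hostnames from the list above. The helper name build_filter_pattern is hypothetical, and Python's re.search is used only to approximate a partial match. Check the result with the filter verification in GA before relying on it.

```python
import re

# Placeholder main domains taken from the example list above.
valid_hostnames = [
    "yourmaindomain.com",
    "anotheruseddomain.com",
    "payingservice.com",
    "translatetool.com",
]

def build_filter_pattern(hostnames):
    """Escape the dots in each hostname and join the results with "|"."""
    return "|".join(host.replace(".", r"\.") for host in hostnames)

pattern = build_filter_pattern(valid_hostnames)
print(pattern)

# A partial match on the main domain also covers a subdomain.
print(bool(re.search(pattern, "blog.yourmaindomain.com")))
print(bool(re.search(pattern, "(not set)")))
```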

You don’t need to put all of your subdomains in the regular expression. The main domain will match all of them. If you don’t have a view set up without filters, create one now.

Then create a Custom Filter.

Make sure you select INCLUDE, then select “Hostname” on the filter field, and copy your expression into the Filter Pattern box.

filter

You might want to verify the filter before saving to check that everything is okay. Once you're ready, save it and apply the filter to all the views you want (except the view without filters).

This single filter will get rid of future occurrences of ghost spam that use invalid hostnames, and it doesn't require much maintenance. But it's important that every time you add your tracking code to a new service, you add that service's hostname to the end of the filter expression.

Now you should only need to take care of the crawler spam. Since crawlers access your site, you can block them by adding these lines to the .htaccess file:

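## STOP REFERRER SPAM
# Enable the rewrite engine (harmless if it is already on elsewhere in the file)
RewriteEngine On
# Match referrers containing either spam domain, case-insensitively
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
# Answer matching requests with 403 Forbidden
RewriteRule .* - [F]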

It is important to note that this file is very sensitive, and misplacing a single character in it can bring down your entire site. Therefore, make sure you create a backup copy of your .htaccess file prior to editing it.


If you don't feel comfortable messing around with your .htaccess file, you can alternatively make an expression with all the crawlers, then add it to an exclude filter by Campaign Source.
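Such a crawler expression could be built and tried out as in the Python sketch below. It uses only the two crawler domains named in the .htaccess rules. The sample Campaign Source values and the helper name is_crawler_spam are hypothetical, and the case-insensitive match is an assumption chosen to mirror the [NC] flag.

```python
import re

# The two crawler domains named in the .htaccess rules above.
crawler_domains = ["semalt.com", "buttons-for-website.com"]

# Escape the dots and join with "|" to get one exclude expression.
exclude_pattern = "|".join(d.replace(".", r"\.") for d in crawler_domains)

def is_crawler_spam(campaign_source: str) -> bool:
    """Return True if the source matches any crawler domain, ignoring case."""
    return re.search(exclude_pattern, campaign_source, re.IGNORECASE) is not None

# Hypothetical Campaign Source values.
sources = ["google", "Semalt.com", "buttons-for-website.com", "newsletter"]
clean = [s for s in sources if not is_crawler_spam(s)]

print(exclude_pattern)
print(clean)
```

Each newly spotted crawler would need to be appended to crawler_domains, so this list requires more upkeep than the hostname include filter.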

Implement these combined solutions, and you will worry much less about spam contaminating your analytics data. This will have the added benefit of freeing up more time for you to spend actually analyzing your valid data.
