How to clean up unwanted bot traffic in Sitecore Analytics

Sitecore Analytics is more than the reports you can view in the Experience Analytics module.
Its power lies primarily in its integration with the complete Sitecore Experience Platform. A solid Sitecore Analytics configuration is key to fully benefiting from Sitecore features such as profiling, personalization, Path Analyzer and A/B testing.

Sitecore uses server-side tracking to measure user behavior, unlike traditional analytics technologies that rely on client-side tracking (with JavaScript). The primary advantage of working server-side is the ability to highly customize the response based on the user's requirements, access rights or queries into data stores. The downside of server-side tracking is that every request is measured, and Sitecore has difficulty distinguishing human traffic from non-human traffic such as bots and scraping tools.

Sitecore has built-in robot detection, but in our experience a lot of bot traffic still slips through the net. The European market in particular has many tools and technologies that Sitecore isn’t familiar with. As a result, the Experience Database fills up with polluted data. Compared to Google Analytics, we measured a discrepancy of 200% more sessions. This makes analyzing your data more difficult, and the larger volume of stored data will eventually result in higher hosting costs.

Fortunately, it’s perfectly possible to customize the robot detection to your requirements. Our experience with several Sitecore configurations helped us improve and optimize the default robot detection. We developed two modules to improve the Sitecore configuration:

  1. Extending (and maintaining) the robot detection list.
  2. Cleaning up polluted data based on specific preferences.
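
The two ideas can be illustrated with a minimal sketch in Python. It is not the modules themselves and does not use any Sitecore API: the Session type, the marker list and the function names are all hypothetical, chosen only to show the principle of matching user agents against a maintained list and then filtering stored sessions with it.

```python
from dataclasses import dataclass

# Hypothetical user-agent fragments (assumption, not a Sitecore list).
# A real list would be built and maintained from your own traffic analysis.
EXTRA_BOT_MARKERS = ("bot", "crawler", "spider", "scraper", "monitoring")


@dataclass
class Session:
    """Simplified stand-in for a stored analytics session (hypothetical)."""
    user_agent: str
    page_views: int


def is_known_bot(user_agent: str, markers=EXTRA_BOT_MARKERS) -> bool:
    """Idea 1: case-insensitive substring match against the extended list."""
    ua = user_agent.lower()
    return any(marker in ua for marker in markers)


def is_polluted(session: Session) -> bool:
    """Idea 2: an example clean-up preference.

    A session counts as polluted when its user agent is empty
    or matches the extended robot list.
    """
    if not session.user_agent.strip():
        return True
    return is_known_bot(session.user_agent)


def clean_sessions(sessions):
    """Keep only the sessions that do not look like robot traffic."""
    return [s for s in sessions if not is_polluted(s)]


sessions = [
    Session("Mozilla/5.0 (Windows NT 10.0; Win64; x64)", 5),
    Session("ExampleBot/1.0 (+http://example.com/bot)", 1),
    Session("", 1),
]
kept = clean_sessions(sessions)
```

Simple substring matching is easy to maintain but can produce false positives (a legitimate browser string that happens to contain "bot"), so any such list needs regular review against real traffic.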

These customizations significantly reduced the robot traffic! We now see discrepancies between 5% and 20% compared to Google Analytics. This is perfectly reasonable, since the two technologies take different approaches to measuring sessions and users.
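
To make these percentages concrete, here is a small sketch of how such a discrepancy could be computed. It assumes the figure is the relative surplus of Sitecore sessions over Google Analytics sessions; the function name and the session counts are made up for illustration and are not measurements.

```python
def session_discrepancy_percent(sitecore_sessions: int, ga_sessions: int) -> float:
    """Relative surplus of Sitecore sessions over Google Analytics, in percent."""
    if ga_sessions <= 0:
        raise ValueError("ga_sessions must be positive")
    return 100 * (sitecore_sessions - ga_sessions) / ga_sessions


# Illustrative numbers only.
before = session_discrepancy_percent(3000, 1000)  # 200.0: three times as many sessions
after = session_discrepancy_percent(1150, 1000)   # 15.0: inside the 5% to 20% range
```

Under this reading, "200% more sessions" means Sitecore recorded three times as many sessions as Google Analytics over the same period.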




We plan to launch these modules on the Sitecore Marketplace in Q4 2017, after we have tested and fine-tuned them further. In the meantime, don’t hesitate to get in touch if you want more information or consultancy on this topic.

