Sitecore ships with robot detection out of the box, but in our experience a lot of bot traffic still slips through the net. The European market in particular uses many tools and technologies that Sitecore is not familiar with. As a result, the Experience Database fills up with polluted data: compared to Google Analytics we measured roughly 200% more sessions. This makes analyzing your data more difficult, and the extra data storage eventually translates into higher hosting costs.
Fortunately, it is perfectly possible to customize the robot detection to your own requirements. Our experience with several Sitecore configurations helped us improve and optimize the default robot detection, and we developed two modules for it:
- Extending (and maintaining) the robot detection list (see the example patch after this list).
- Cleaning up polluted data based on specific preferences.
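To give an idea of what extending the list involves: additional user-agent strings can be patched into Sitecore's excluded-robots list via an include file. The sketch below is a minimal example and not our actual module; the file name and agent strings are made up, and the element names reflect the Sitecore 8.x Sitecore.Analytics.ExcludeRobots.config structure, so check them against your own version.

```xml
<!-- App_Config/Include/zzz/MyCompany.ExcludeRobots.config (example name) -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <excludeRobots>
      <excludedUserAgents>
        <!-- Example user-agent strings that the default list might miss -->
        <agent>ExampleEuropeanUptimeBot</agent>
        <agent>ExampleSeoCrawler</agent>
      </excludedUserAgents>
    </excludeRobots>
  </sitecore>
</configuration>
```

Keeping these additions in a separate patch file makes the list easy to maintain and redeploy across environments, which is essentially what the first module automates.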
These customizations significantly reduced the robot traffic! We now see discrepancies between 5% and 20% compared to Google Analytics, which is perfectly reasonable since the two technologies measure sessions and users differently.
We are planning to launch these modules on the Sitecore Marketplace in Q4 2017 after further testing and fine-tuning. In the meantime, don't hesitate to get in touch if you would like more information or consultancy on this topic.