In my previous post, I compared Web data to water. Just as we can’t survive for an extended period of time without safe drinking water, data-driven businesses can’t survive without good data. In this second part, I will focus on different considerations related to data validation.
If you have a preteen daughter, then the Hannah Montana song “Nobody’s Perfect” is, no doubt, burned into your cerebral cortex. (I have more than one tween daughter so just imagine its effect on my brain.) Well . . . no implementation is ever perfect either. In fact, seeking perfection in your implementation can be a dangerous goal. You want your Web data quality to be as complete and accurate as possible, but perfection or near-perfection can be costly to achieve.
The higher the level of accuracy that is required, the higher the investment of time and effort that is needed from the business to calibrate its implementation. Settling on a target level of accuracy may call for a quick cost-benefit analysis. Is your organization willing to invest more hours in calibrating your implementation and internal systems to gain additional benefits (e.g., executive confidence, user adoption, external reporting, etc.)? In some cases, small, incremental changes in the margin of error can be a big deal. In other cases, they can result in diminishing returns--delaying or wasting time that could be spent on analysis, testing, or other high-value activities.
In most cases, you’re dealing with both an explainable and an unexplained margin of error. You typically want to reduce a large unexplained margin of error first. If you have an explainable margin of error, then you have a couple of options: close the gap or acknowledge it. For example, a retailer knows that its real-time SiteCatalyst data is consistently 10% higher in terms of revenue than its back-end system, because the back-end system removes fraudulent orders and product returns from its final revenue numbers. In this case, the retailer can decide to close the gap between SiteCatalyst and its back-end system by feeding this post-sale data back into SiteCatalyst. Or the retailer can acknowledge the gap and move forward with optimizing its Web site and campaigns based on the understanding that its Web data doesn’t factor out fraudulent and returned orders.
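To make the explained-versus-unexplained split concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name reconcile_revenue, the dollar figures, and the two adjustment categories are hypothetical and are not pulled from SiteCatalyst or any real back-end system.

```python
def reconcile_revenue(web_revenue, backend_revenue, known_adjustments):
    """Split the gap between Web-analytics revenue and back-end revenue
    into an explained part (known adjustments) and an unexplained rest."""
    gap = web_revenue - backend_revenue
    explained = sum(known_adjustments.values())
    unexplained = gap - explained
    return {
        "gap": gap,
        "gap_pct_of_backend": gap * 100 / backend_revenue,
        "explained": explained,
        "unexplained": unexplained,
        "unexplained_pct_of_backend": unexplained * 100 / backend_revenue,
    }


# Hypothetical monthly totals: Web data runs 10% above the back-end system.
report = reconcile_revenue(
    web_revenue=110_000.0,
    backend_revenue=100_000.0,
    known_adjustments={
        "fraudulent_orders": 3_500.0,  # removed by the back-end system
        "product_returns": 6_000.0,    # removed by the back-end system
    },
)

for key, value in report.items():
    print(f"{key}: {value:,.2f}")
```

In this made-up case, most of the 10% gap is accounted for by fraud and returns, and only the small remainder would need further investigation.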
Industry pundits Jim Novo, Avinash Kaushik, and others have advocated for precision over accuracy. Precision focuses on reproducibility and repeatability compared to accuracy, which focuses on obtaining the exact number. As long as your Web data consistently falls within an acceptable threshold of accuracy, your business should be able to act on the data’s directional insights with confidence.
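One rough way to picture the distinction is to compare Web numbers against a reference system over several periods. The sketch below is my own simplification, not a formula from Novo or Kaushik: it treats the average ratio's distance from 1 as a stand-in for accuracy and the spread of the ratios as a stand-in for precision. The function name and the sample figures are hypothetical.

```python
from statistics import mean, pstdev


def accuracy_and_precision(web_values, reference_values):
    """Return (bias, spread) for paired measurements.

    bias   -> how far the average web/reference ratio sits from 1.0
              (a proxy for accuracy; closer to 0 is more accurate)
    spread -> population standard deviation of the ratios
              (a proxy for precision; closer to 0 is more repeatable)
    """
    ratios = [w / r for w, r in zip(web_values, reference_values)]
    return mean(ratios) - 1.0, pstdev(ratios)


reference = [100.0, 200.0, 300.0]

# Consistently 10% high: not accurate, but very precise.
steady_bias, steady_spread = accuracy_and_precision(
    [110.0, 220.0, 330.0], reference
)

# Closer to the reference on average, but bouncing around: less precise.
erratic_bias, erratic_spread = accuracy_and_precision(
    [95.0, 230.0, 290.0], reference
)

print(f"steady:  bias={steady_bias:.3f}, spread={steady_spread:.3f}")
print(f"erratic: bias={erratic_bias:.3f}, spread={erratic_spread:.3f}")
```

The steady series is the easier one to act on: a stable, known offset can be acknowledged or corrected, while an erratic one undermines trend comparisons.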
Twofold Responsibility
When it comes to data validation, you need to focus on two areas. First, were the original business requirements successfully met by the implementation? Hopefully, you have a measurement strategy or business requirement document in place that you can refer to. Your team needs to verify that the desired reports were set up properly and that they’re collecting data. Do you have all of the right buckets in place? Are the buckets capturing anything?
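The first check can be as simple as comparing the reports your requirements call for against what is actually being collected. The sketch below assumes you have already exported a count of collected rows per report into a plain dictionary; the report names, the counts, and the function name check_report_coverage are all hypothetical.

```python
def check_report_coverage(required_reports, collected_counts):
    """Compare required reports against collected row counts.

    Returns (missing, empty):
      missing -> required reports that were never set up
      empty   -> required reports that exist but have captured nothing
    """
    missing = [r for r in required_reports if r not in collected_counts]
    empty = [r for r in required_reports if collected_counts.get(r) == 0]
    return missing, empty


# Hypothetical requirements list and export of rows collected per report.
required = ["Pages", "Internal Search Terms", "Campaigns"]
collected = {"Pages": 48210, "Campaigns": 0}

missing, empty = check_report_coverage(required, collected)
print("Buckets not in place:", missing)
print("Buckets capturing nothing:", empty)
```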
Second, is the data in the reports sound and accurate (precise)? You might be initially thrilled to see a comprehensive list of custom reports in the SiteCatalyst interface until you start looking more closely at the actual data. Is the data flowing into the buckets any good? At this stage, data validation should involve a Web analyst who can bring a business perspective to the reports to determine whether the data is drinkable. Frequently, the data validation responsibility falls solely to technical QA staff. In their minds, if the JavaScript code executes fine and doesn’t throw any errors, it passes. However, what about the actual values in the report? This is where a Web analyst who is knowledgeable about the Omniture tool can help.
For example, to an untrained, inexperienced, or unfamiliar eye, the collected data in the Pages report might look fine--“We’re collecting a bunch of page data, and it’s nicely formatted. Booyah!” A trained eye, however, will spot something concerning: three or four instances of the same home page listed as separate entries in the Pages report.
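A trained eye can also get a little help from a script. The sketch below assumes the Pages report has been exported as simple (page name, page views) rows, and it only catches variants that differ by letter case, stray whitespace, or a trailing slash. The page names, the counts, and the normalization rules are hypothetical; real duplicates often need rules specific to your site.

```python
from collections import defaultdict


def normalize(page_name):
    """Collapse trivial variations: surrounding whitespace, letter case,
    and trailing slashes."""
    return page_name.strip().lower().rstrip("/")


def find_duplicate_pages(page_rows):
    """Group rows by normalized page name and keep only the groups
    that contain more than one raw entry."""
    groups = defaultdict(list)
    for name, views in page_rows:
        groups[normalize(name)].append((name, views))
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


# Hypothetical export of a Pages report.
pages_report = [
    ("Home", 12000),
    ("home", 800),
    ("Home/", 450),
    ("Products", 5000),
]

duplicates = find_duplicate_pages(pages_report)
for key, rows in duplicates.items():
    total = sum(views for _, views in rows)
    print(f"'{key}' appears {len(rows)} times, {total} page views combined")
```

A script like this only flags candidates; deciding whether the entries really are the same page, and why the implementation is splitting them, still takes an analyst.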
Next: Lost in translation




