What is the future for big data analytics?

Gary Angel is President and CTO at Semphonic. He co-founded Semphonic and continues to lead and develop the industry-leading online measurement practice. Under his leadership, Semphonic has become the world’s largest Web analytics consultancy. Gary’s ground-breaking work in hands-on Web analytics includes the development of functionalism (the public-domain methodology for tactical Web analysis), pioneering work in the creation of SEM analytics as a discipline, and numerous methodological improvements to the field of Web analytics and the study of online behaviour. Gary was recently recognised for his work and awarded ‘Most Influential Industry Contributor’ at this year’s annual Web Analytics Association (WAA) Awards for Excellence. Angel publishes a regular blog on all things analytics and hosts a series of Webinars on cutting-edge analysis techniques. Gary’s background in CRM, survey analysis, database marketing and large-scale data mining and business intelligence has helped keep Semphonic at the leading edge of online measurement. He has spoken at countless conferences including the X Change Conference, SMX, AIM, EUCI, OMMA Global, eMetrics, the WAA Symposium and The Red Door Speaker Series. The X Change Web Analytics Conference – the industry’s premier conference for web measurement professionals – is Gary’s original concept and continues to grow, with an additional conference this year being held in Berlin. Prior to founding Semphonic, Angel created and implemented multi-million dollar database marketing and CRM systems for Fortune 500 companies including VISA, Bank of America, and American Express. Angel graduated, with honors, from Duke University in North Carolina.

The Problem

Digital measurement at the enterprise level has been driven by two major trends: the use of tags to collect user behaviour, and the reliance on SaaS vendors to provide aggregated reporting on digital marketing. Both of these trends are at a crisis point in 2012.

Tagging represented a significant breakthrough in measurement data collection, enabling widespread access to Web analytics data throughout the organisation. Unfortunately, real drawbacks to the use of tagging have emerged. Customised tagging is cumbersome and difficult – a particular challenge since the amount of customisation demanded has grown steadily.

The process of tagging is time-consuming and turns out to be never-ending. Most IT organisations have little understanding of how tags function and how they relate to actual measurement. Because tagging turns out to be much more cumbersome than initially realised, it raises the cost of switching tools to a nearly prohibitive level – locking many enterprises into unwanted measurement solutions.

Web analytics tools, too, are at something of a crisis moment. When a measurement vendor is collecting vast amounts of Web data for thousands of clients, providing deep access to that data isn’t easy. Vendors focus on providing the set of reports that best meets the needs of most clients – a least-common-denominator approach to measurement.

As a consequence, Web analytics tools provide little or no access to detailed data, little or no customer-level analysis, and little or no ability to do advanced analytics or customer-based testing. Though most Web analytics tools upgraded their data platforms in 2012 to support increased segmentation, these changes still leave them with little real customer analytics capability.

What’s the latest?

Enterprise analytics managers aren’t standing still in the face of these problems. Organisations have rapidly started to investigate either internal or cloud-based warehousing solutions that provide much deeper access and integration of the data.

Today’s warehousing technologies provide rich, flexible integration of virtually any form of data. That means unlimited tables, unlimited fields, multiple data types, flexible access paths, unlimited data transformation, and open tool access. This makes it much easier to drive outbound services; if your organisation is interested in actually using Web data to drive personalisation, targeting, or CRM support, this outbound capability is critical.
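As a minimal sketch of the kind of integration described above – behavioural Web data joined against CRM data through an open SQL interface – consider the following. The table and column names are hypothetical, and an in-memory SQLite database stands in for a real warehouse:

```python
import sqlite3

# In-memory stand-in for a warehouse; a real deployment would use
# dedicated warehousing technology with far larger data volumes.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE web_sessions (customer_id TEXT, pages INTEGER);
    CREATE TABLE crm (customer_id TEXT, lifetime_value REAL);
    INSERT INTO web_sessions VALUES ('c1', 12), ('c2', 3);
    INSERT INTO crm VALUES ('c1', 2500.0), ('c2', 90.0);
""")

# Open tool access: any SQL client can join behavioural and CRM data,
# e.g. to find high-value customers for an outbound targeting campaign.
rows = db.execute("""
    SELECT w.customer_id, w.pages, c.lifetime_value
    FROM web_sessions w JOIN crm c USING (customer_id)
    WHERE c.lifetime_value > 1000
""").fetchall()
print(rows)  # → [('c1', 12, 2500.0)]
```

The point is not the specific schema but the openness: once the data is in a warehouse, any tool that speaks SQL can combine it with other customer data, which a closed SaaS reporting interface cannot offer.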

The #1 priority for most of our clients is to increase the relevancy (and, thereby, the efficiency) of their digital communications. To do this, data is used to generate “micro-decisions” about what to show or offer the customer. While these micro-decisions are based on rules that don’t need to be developed in real-time, the data used to make each individual decision must be available in real-time. The single most important thing to know about a customer is “what they are doing right now.”
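The pattern described above – rules authored offline, evaluated per visitor against real-time session state – can be sketched roughly as follows. All field names and rule conditions here are hypothetical illustrations, not a prescription:

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """Real-time view of 'what the customer is doing right now' (hypothetical fields)."""
    pages_viewed: list = field(default_factory=list)
    cart_value: float = 0.0
    is_returning: bool = False

def decide_offer(state: SessionState) -> str:
    """Micro-decision: rules are developed offline, but each decision
    is evaluated against the visitor's current session in real time."""
    if state.cart_value > 100:
        return "free-shipping-banner"
    if "pricing" in state.pages_viewed and not state.is_returning:
        return "first-visit-discount"
    return "default-content"

# A new visitor currently on the pricing page:
print(decide_offer(SessionState(pages_viewed=["home", "pricing"])))
# → first-visit-discount
```

Note that the rules themselves are trivial to change; the hard part, as discussed below, is having the session data assembled and queryable at decision time.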

There’s a widespread perception that this real-time decision-making is the unique domain of black-box optimisation techniques – of pure, machine-generated optimisations done by systems that live entirely outside your warehouse. Not true. Black-box solutions are nowhere near as capable of building meaning out of data as a good analyst with the correct tools.

But real-time micro-decisioning is hard even on very fast systems. You have to assemble, analyse, and act on the data in sub-second timeframes. Because of this, many organisations forgo real-time decision-making – leaving the biggest analytics ROI on the table.

We have a problem…

Getting data into a warehouse and exposing it to powerful query and analysis tools sounds like a panacea for all the problems associated with using Web data. Between big-data engines and powerful analysis tools, organisations will surely be able to milk this data of its full value. Or will they?

Two huge challenges face any technology designed to support digital marketing analytics. First, there’s the small question of getting the data into the warehouse. Nearly all current digital data warehouses rely on data feeds from existing tagging systems.

Not only do these systems carry over the problems that have brought tagging to a crisis point, they introduce severe delays into the system. Web analytics data feeds operate on a daily basis – meaning the data is a day old before it ever hits your warehouse. If you’re building an infrastructure to support marketing personalisation, that’s simply unacceptable. Tempting as it may seem, your Web analytics tool is the wrong source for your analytics warehouse.

Systems dedicated to providing real-time sourcing of Web data to the warehouse are starting to emerge, and organisations looking to create a robust infrastructure for the warehouse that extends beyond the next 12 months need to be looking beyond their Web analytics tool to an infrastructure specifically designed for the task.
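The difference between a daily feed and real-time sourcing can be sketched in miniature. Here an in-memory queue stands in for the streaming pipeline of a dedicated collection system (all names are hypothetical; real infrastructure would use a durable message queue, not a Python queue):

```python
import json
import queue
import time

# In-memory stand-in for a streaming collection pipeline.
event_stream = queue.Queue()

def collect(event: dict) -> None:
    """Capture a behavioural event and forward it immediately,
    rather than batching it into a next-day data feed."""
    event["collected_at"] = time.time()
    event_stream.put(json.dumps(event))

def warehouse_loader():
    """Consumer side: events land in the warehouse seconds,
    not a day, after they happen."""
    while not event_stream.empty():
        yield json.loads(event_stream.get())

collect({"visitor_id": "v-123", "action": "page_view", "page": "/pricing"})
for row in warehouse_loader():
    print(row["action"])  # → page_view
```

The architectural point is the timestamp: with streaming collection, the gap between `collected_at` and warehouse availability is seconds, which is what makes the real-time micro-decisioning discussed earlier feasible at all.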

Equally problematic is the question of how to understand digital data. Web analytics data – how visitors move from page to page on a Website – is thin gruel for most marketers. You can’t build effective marketing campaigns that focus on how many pages a visitor viewed or how long their visit lasted.

This isn’t a problem of data access; the fastest warehouse in the world won’t solve the meaning problem. If you don’t have a meaningful data model for incorporating digital data into your view of the Customer, your data warehousing effort is going to fail.

Chances are, neither your BI team nor your Web analytics team has any idea how to build that data model. One knows warehousing and customer analytics; the other knows Web analytics tools. The gap in between is far larger than you’d expect, and bridging it is a critical part of a successful analytics warehousing program.
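One hedged illustration of what such a data model might do: roll raw, page-level events up into customer-level facts a marketer can act on. The mapping from pages to marketing-meaningful interests below is entirely hypothetical – defining that mapping for a real site is exactly the bridging work described above:

```python
from collections import defaultdict

# Raw, page-level digital data: thin gruel for a marketer.
page_views = [
    {"customer_id": "c1", "page": "/pricing", "seconds": 40},
    {"customer_id": "c1", "page": "/docs/api", "seconds": 120},
    {"customer_id": "c2", "page": "/home", "seconds": 5},
]

# Hypothetical mapping from site content to marketing-meaningful interests;
# building this mapping is the data-modelling work itself.
PAGE_TOPIC = {
    "/pricing": "purchase-intent",
    "/docs/api": "technical-evaluation",
    "/home": None,  # navigational page, carries no interest signal
}

def customer_view(views):
    """Roll page-level events up into a customer-level model."""
    profile = defaultdict(lambda: {"topics": set(), "engaged_seconds": 0})
    for v in views:
        p = profile[v["customer_id"]]
        p["engaged_seconds"] += v["seconds"]
        topic = PAGE_TOPIC.get(v["page"])
        if topic:
            p["topics"].add(topic)
    return dict(profile)

print(sorted(customer_view(page_views)["c1"]["topics"]))
# → ['purchase-intent', 'technical-evaluation']
```

The output is the kind of fact a campaign can actually use – “this customer is evaluating the product technically and looking at pricing” – rather than a raw page-view count.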

The future of digital measurement

Web analytics tools have expanded their range and sophistication dramatically but still deliver only a small fraction of the analysis capabilities necessary for segmentation, personalisation, or interesting site testing, and provide virtually no customer-level analysis.

Collection, aggregation, and segmentation/personalisation/testing have all, to date, been too generic, too focused on levels beyond the customer, and too siloed. As organisations embrace the analytics warehouse, there is an unprecedented opportunity to solve all these problems in new ways. Organisations should be thinking about the long-term infrastructure necessary to support a truly integrated and efficient system of data collection, data warehousing, and personalisation.

It’s not too early to put the right infrastructure in place: an infrastructure that will provide robust data collection in a truly maintainable fashion and that will support real-time data collection and the robust digital data model necessary to take advantage of all that capability.
