Topology: The critical forgotten factor in digital analytics

Gary Angel is President and CTO at Semphonic. He co-founded Semphonic and continues to lead and develop the industry-leading online measurement practice. Under his leadership, Semphonic has become the world’s largest Web analytics consultancy. Gary’s ground-breaking work in hands-on Web analytics includes the development of functionalism (the public-domain methodology for tactical Web analysis), pioneering work in the creation of SEM analytics as a discipline, and numerous methodological improvements to the field of Web analytics and the study of online behavior. Gary was recently recognised for his work and awarded ‘Most Influential Industry Contributor’ at this year’s annual Web Analytics Association (WAA) Awards for Excellence. Angel publishes a regular blog on all things analytics and hosts a series of Webinars on cutting-edge analysis techniques. Gary's background in CRM, survey analysis, database marketing and large-scale data mining and business intelligence has helped keep Semphonic at the leading edge of online measurement. He has spoken at countless conferences including the X Change Conference, SMX, AIM, EUCI, OMMA Global, eMetrics, the WAA Symposium and The Red Door Speaker Series. The X Change Web Analytics Conference - the industry's premier conference for web measurement professionals - is Gary's original concept and continues to grow with an additional conference this year being held in Berlin. Prior to founding Semphonic, Angel created and implemented multi-million dollar database marketing and CRM systems for Fortune 500 companies including VISA, Bank of America, and American Express. Angel graduated, with honors, from Duke University in North Carolina.

The Problem with Statistical Analysis in Web Analytics

The practice of Digital Analytics is built on a few simple, largely unquestioned assumptions. The first of these assumptions is “intentionality”. When a visitor looks at a page about a topic or product, we assume they have an interest in the product or topic – that the behavior was intentional.

The second key assumption is “influence”. When a visitor views a page and then subsequently does something we consider a success, we assume that the earlier page view influenced the subsequent action. These two assumptions – intention and influence – are so deeply ingrained in Web analytics practice that we hardly ever even bother to think about them.

That’s dangerous, because the way visitors traverse a website is controlled, to some extent, by the options and pathways provided. Like a magician “forcing” the pick of a card, we exert significant control over where visitors go and how they get to key locations by the way we structure the website. So when we infer “intention” or “influence”, we may actually only be measuring our own little sleights of hand.

Think about it this way: websites are very much like city streets. Some pathways are big and broad, others small and narrow. Often, there’s no direct way to get from Point A to Point B. No analyst would ever be foolish enough to think that a straightforward correlation model would work for analysing city traffic. Yet, surprisingly, many have made exactly that same mistake when it comes to websites.

Basic statistical analysis techniques aren’t designed to handle data sets where the data is topologically arranged – and the structure of a website imposes a deep topology on its web data.

Simple correlation analysis, for example, does nothing to separate out the impact of site structure. So pages that are closely related navigationally are almost always highly correlated, whether or not visitors have any real interest in both. This makes it impossible to interpret the true intention of users or the true influence of pages, and it leaves simple correlation almost completely useless for that purpose.
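
To make the point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the five-page site, the 30% chance of leaving at each step, and visitors who click links completely at random. Because these simulated visitors have no intent at all, any correlation between pages comes from site structure alone.

```python
import random

# Hypothetical five-page site: each page lists the pages it links down to.
LINKS = {
    "home": ["products", "support"],
    "products": ["widget"],
    "support": ["faq"],
    "widget": [],
    "faq": [],
}

def random_session(rng, stop_prob=0.3):
    """Simulate a visitor with no intent: follow a random link or leave."""
    page, seen = "home", {"home"}
    while LINKS[page] and rng.random() > stop_prob:
        page = rng.choice(LINKS[page])
        seen.add(page)
    return seen

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def page_correlation(sessions, a, b):
    """Correlate 'viewed a' with 'viewed b' across sessions (0/1 flags)."""
    return pearson([int(a in s) for s in sessions],
                   [int(b in s) for s in sessions])

rng = random.Random(0)
sessions = [random_session(rng) for _ in range(5000)]
near = page_correlation(sessions, "products", "widget")  # one click apart
far = page_correlation(sessions, "widget", "faq")        # different branches
print(f"products~widget: {near:.2f}, widget~faq: {far:.2f}")
```

In this toy site the only route to "widget" runs through "products", so the two pages come out positively correlated, while "widget" and "faq" sit on separate branches and come out negatively correlated. Neither number says anything about interest or influence.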

Creating a Topographical Analysis

So any real analysis of visitor behaviour will have to take account of topology before it will be possible to measure correlation and infer intentionality or influence. In effect, you have to remove your sleight of hand from the equation.

One of the easiest ways to do this is to create a logical model of the site (rather like a sitemap) and then count distances between nodes in the hierarchy. We call this a topographical design. Even better, you can build a behavioural topology model that shows how users actually navigate the Website. From this model, the distance between nodes can be calculated either as the distance in the tree or as the average number of clicks visitors actually take between the two points.
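
Both distance measures described above can be sketched in a few lines. This is an illustrative sketch only: the page names, the `PARENT` map and the sample sessions are hypothetical, and a session is assumed to be an ordered list of the pages a visitor viewed.

```python
# Hypothetical logical model of a site: child page -> parent page.
PARENT = {
    "products": "home",
    "support": "home",
    "widget": "products",
    "gadget": "products",
    "faq": "support",
}

def ancestors(page):
    """Return the path from a page up to the root, page first."""
    path = [page]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def tree_distance(a, b):
    """Number of edges between two pages in the site hierarchy."""
    depth_from_a = {p: i for i, p in enumerate(ancestors(a))}
    for steps_b, p in enumerate(ancestors(b)):
        if p in depth_from_a:
            return depth_from_a[p] + steps_b
    raise ValueError("pages are not in the same tree")

def mean_click_distance(sessions, a, b):
    """Average clicks from the first view of `a` to the following view
    of `b`, over sessions that contain both pages in that order."""
    gaps = []
    for pages in sessions:
        for i, p in enumerate(pages):
            if p == a and b in pages[i + 1:]:
                gaps.append(pages.index(b, i + 1) - i)
                break
    return sum(gaps) / len(gaps) if gaps else None

sessions = [
    ["home", "products", "widget"],
    ["home", "widget"],  # e.g. via a promo link on the home page
    ["home", "support", "faq"],
]
print(tree_distance("widget", "faq"))                   # 4
print(mean_click_distance(sessions, "home", "widget"))  # 1.5
```

The two measures can disagree: "widget" is two steps from "home" in the hierarchy, but only 1.5 clicks away on average in these made-up sessions because some visitors reach it through a shortcut. That gap is the reason a behavioural model is preferable to a purely logical one.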

Creation of a behavioral topology model is truly a foundational project in digital analytics. Without it, every analysis you do of your Website is likely to be deeply flawed. With a behavioral topology come numerous new analytic opportunities that few Web analysts have explored. These models also open up the opportunity to use classic statistical analysis techniques more fully.

By creating objective measures of distance and a true topography of the website, these models make it possible to look at the relationship between content and outcome on the website while controlling for the site’s inherent structure. 
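
One simple way to control for structure, offered here as an assumption rather than as the method used in practice at Semphonic, is to fit a straight line predicting page-pair correlation from distance and then look at what is left over. The numbers below are made up purely for illustration, and the helper names `fit_line` and `structure_adjusted` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def structure_adjusted(pairs):
    """Return each pair's correlation minus the value its distance predicts.

    `pairs` maps (page_a, page_b) -> (distance, observed_correlation).
    """
    keys = list(pairs)
    dists = [pairs[k][0] for k in keys]
    corrs = [pairs[k][1] for k in keys]
    intercept, slope = fit_line(dists, corrs)
    return {k: c - (intercept + slope * d)
            for k, d, c in zip(keys, dists, corrs)}

# Made-up numbers purely for illustration: (distance, correlation).
pairs = {
    ("products", "widget"): (1, 0.70),
    ("products", "gadget"): (1, 0.66),
    ("widget", "gadget"): (2, 0.40),
    ("widget", "faq"): (4, 0.30),
    ("gadget", "faq"): (4, -0.10),
}
residuals = structure_adjusted(pairs)
for pair, r in sorted(residuals.items(), key=lambda kv: -kv[1]):
    print(pair, round(r, 2))
```

A pair with a large positive residual is more strongly related than its navigational distance alone would predict, which makes it a better candidate for genuine intention or influence than a pair that is merely adjacent. A straight line is a crude choice; any model of correlation as a function of distance could take its place.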
