My Digital Footprint

A two-sided digital business model where your privacy will be someone else's business!

MY DIGITAL FOOTPRINT

To reiterate the previous argument, the idea of MY DIGITAL FOOTPRINT extends the idea of raw data to the wider concept of capture, store, analysis and value created from data generated through digital engagement. This process is based on a structured approach incorporating inputs and outputs, and a feedback loop that governs the whole process. This feedback loop progressively enriches and refines the outputs (value) over time. The analysis phase is able to take raw data from various sources (which I refer to as the digital footprint) and generate value in the form of services, such as personalisation, reputation or discovery – the output from the analysis process I call ‘behavioural DNA’. The value derived from the process is MY DIGITAL FOOTPRINT.

There are two central ideas which underpin this book (and this section): the feedback loop which enriches the digital footprint and the role of the mobile device in enriching the value from MY DIGITAL FOOTPRINT.

Identity is not a digital footprint, but a digital footprint has to be related to a digital identity in order to be interpreted and given value of some kind. It must also be related (at some point) to an identity in order to unlock its potential for control and contribution from the user.

The Nobel Prize winner with a poor reputation

Let us first understand the difference between identity and reputation from an imaginary anecdote.

Many social network sites depend on explicit reputation mechanisms. These can be implemented in many ways, such as the members rating each other ‘good’ or providing testimonials. However, such explicit recommendations have limitations and can be ‘gamed’. We illustrate the drawbacks of explicit reputation by considering the problem of the ‘Nobel Prize winner with a poor reputation’.

If a person’s ‘reputation’ depends on updating someone’s site (or depends on my friends updating the site) then it could mean: if I win the Nobel Prize (which presumably means I am the best in my field), but I forget to update that site, then my reputation is low, irrespective of my achievements.

This is a simplistic example but it illustrates the concept. Reputation should be implicit and should be reflected by your actions (which in turn are automatically captured in transactions/data), in other words, reputation should be a by-product of activities on a network.

Identity and digital footprints

The above example illustrates the difference between identity and reputation – in that identity (a passport, an ID card or even the Nobel Prize) is explicitly conferred by an external body and is recognised within its jurisdiction. Reputation (or digital footprint) is a much softer concept with a different use case, as presented in the earlier work. Identity is based on documents, such as a driving licence, bank details, social security, certificates, etc. The digital footprint, on the other hand, is captured in the form of data streams such as click data, content data, my data and social data – generated both by the individual and by the social context of that individual, as per Figure 16.

Figure 16 Identity is not a digital footprint

The six screens of life  

For the most part, we are consumers of content. In our daily lives we consume professionally created, produced and edited content from traditional and new media providers on our ‘six screens of life’. These screens are divided into two broad categories, big screens and small screens, each with three subgroups as shown in Figure 17.

Figure 17 The six screens of life (for digital engagement)

Both for big and small screens, the user has traditionally been a passive receiver of content (content has been broadcast to the user) or the user has been seen as a member of a carefully controlled and managed audience (e.g. voting) – but not as a primary creator of content. For instance:

· both TV and cinema need users to consume (view); and

· a website needs users to consume/interact in most cases.

However, according to Forrester, “Monolithic blocks of eyeballs are gone. In their place is a perpetually shifting mosaic of audience micro-segments that forces marketers to play an endless game of audience hide and seek.” As advertisers lose the ability to invade the home, they will have to wait for invitations, and this means they have to learn how to adopt and understand the user, a good reason for understanding the impact of digital footprint.

Reflecting the above trend, most of the content on the mobile device to date has also been the ‘re-presentation’ and ‘reproduction’ of existing material delivered to the mobile screen. The mobile device, however, is changing from being primarily a consumer to being a major creator of content. It is worth noting at this point that the separate screens of life don’t have to be managed or offered by separate service providers. Devices will no longer control what you can do and where; instead the screen will come under the control of the user, with services relevant to the size of the screen. However, lest we forget, the interactive age of communication and social media delivers diversity and innovation, whereas broadcast amplifies.

The click streams of life

Whilst the broadcast medium dealt with consumers in isolation, solitude and separation, Web 2.0 has brought relationship, engagement and conversation, and consumers now have greater influence over the media they want to create and consume.

Related to the idea of the ‘screens of life’ is the concept of ‘click streams of life’. By this, we mean the ability of each platform to capture increasingly greater information (as opposed to merely consume it). For instance, web captures attention, browsing patterns, search patterns, click information, content creation, content consumption, as per Figure 18. TV captures only viewing and time preferences. Mobile captures a lot more, for instance: location, attention, browsing patterns, search patterns, time, who we consumed that information with, click stream information, content creation and consumption patterns, etc.

Figure 18 Data types from mobile, web and TV

In addition, there is a proportional relationship between the time we spend with the mobile device and the data captured by the device, as shown in Figure 19.

Figure 19 Relationship between time and data

Figure 19 shows two results: one for the amount of time a user spends with a certain screen of life (TV, mobile and PC), and one as a measure of value generated from the same devices. The top graph shows that we tend to have the mobile on and with us for the majority of our waking hours. However, there is evidence that young people are rejecting this model and choosing to leave the mobile at ‘home’ to avoid parental control. This is ironic because, in some original research by Norman Lewis for Orange in 2004, young people adopted the mobile as it was a place where there was no parental control. The lower graph presents a measure of value generated from the different devices and our interaction, shown as a normalised percentage. This assumes that all the value generated from mobile, web and TV creates 100% of value. About 15–17% will be generated from data from our interaction with the mobile Internet, even though this will only account for 2–3% of time on our chosen devices. A further 5–7% will be generated from the analysis of data that we will provide from our interaction with our TV (including voting). Watching TV accounts for an average of 17% of the total available time. Some 20% of total value will come from the analysis of data from the web, leaving a balance of about 52–55% of all value to be created from data that our mobile devices pick up. This is data that our mobile generates without the user doing anything. The battleground is who can collect this data, who can store it, who can analyse it and who can create value from it.
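The comparison of value against time can be made concrete with a little arithmetic. The sketch below is only an illustration: it takes the percentage ranges quoted above, uses the midpoint of each range (an assumption, as is every variable and function name), and divides value share by time share for the two sources where both figures are given. It also totals the quoted value ranges, which come to roughly 92 to 99%, so the figures are best read as approximate.

```python
# Value and time shares quoted in the text, as (low, high) percentages.
# Time shares are only quoted for mobile Internet and TV.
value_share = {
    "mobile_internet": (15, 17),
    "tv": (5, 7),
    "web": (20, 20),
    "mobile_passive": (52, 55),  # data the mobile generates without user action
}
time_share = {
    "mobile_internet": (2, 3),
    "tv": (17, 17),
}


def midpoint(rng):
    """Midpoint of a (low, high) range; using midpoints is an assumption."""
    low, high = rng
    return (low + high) / 2


def value_per_time(source):
    """Ratio of value share to time share for one source."""
    return midpoint(value_share[source]) / midpoint(time_share[source])


# Sum of the lower and upper bounds of the quoted value shares.
total_low = sum(low for low, _ in value_share.values())
total_high = sum(high for _, high in value_share.values())

for source in time_share:
    print(f"{source}: {value_per_time(source):.2f} units of value per unit of time")
print(f"quoted value shares total {total_low} to {total_high} percent")
```

On these midpoints, a unit of time on the mobile Internet yields roughly eighteen times the value of a unit of time in front of the TV, which is the imbalance Figure 19 is drawing attention to.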

Converged click streams, Wikipedia button on my TV remote

Related to the two ideas of greater amounts of information being captured by mobile devices and the increasing amounts of time that we spend with mobile devices is the question of how information captured from mobile devices can be used. Using the web analogy of mashups, information captured on one platform can be combined with data from another service to create a new application or service and a new dataset.

Figure 20 Value from the mashup of data from mobile, web and TV

In Figure 20, it is assumed that each of the platforms (web, mobile and broadcast) has a ‘create’ and a ‘consume’ component. MMD is Mobile Metadata, WMD is Web Metadata and BMD is Broadcast Metadata. In each case the platform Metadata comprises patterns of consumption and creation. The cross-platform concept of a mashup is the analysis of these datasets together. The output or value can be that a service that started on one platform is improved on that platform, not only by that platform’s own Metadata, but also as a result of Metadata from another platform. It also includes the possibility that an experience that starts on one platform improves a service on another. Additional user data, such as NFC cards (e.g. Oyster in London) and non-web financial transactions, can be added either by the mobile device being a component of the actual transaction, or by it being used as a sensor, or by the user downloading the data and adding it manually. However, given that the mobile will have location, NFC data for travel may not actually add any new information.

Mashups are not new and the idea of cross-platform mashups is a natural extension of the mashup concept. Initial synergies between platforms are based on relatively simple structured mechanisms, such as SMS voting on TV shows, (voice-based) fixed to mobile convergence, etc. However, a deeper integration means that data gathered from one platform can enhance services on another platform, in our case mobile data enriching a web platform. This deep integration of data is happening slowly and causes a conflict in business models as some are based on access or subscription whilst others on advertising and product sales.

Most traditional mediums, such as TV, are trying to incorporate some form of 'controlled interactivity'; for instance, SMS voting. Interactivity is interesting but it is old stuff, especially if it is 'managed' or ignored in the interest of editorial prowess. A truly converged medium would be ‘non-linear’ and boundaries between platforms would blur. For instance, the web has caused most people to absorb knowledge in a non-linear manner. By extension, a truly useful feature would be a Wikipedia button on our premium subscription TV remote. Traditional media will not allow it since they like (need) linearity for the advertisement business model. They fear that I may 'go away' – which I well might. But that's how the new mind may work; the user will return to the provider who allows freedom. From a digital footprint point of view, as customers begin asking for such converged services, it creates an opportunity to create truly integrated services based on understanding the digital footprint. We explore these ideas in greater detail below.

After you have trodden this path come the analysts and the anthropologists

As we have discussed so far, we all create digital footprints as we engage with digital platforms. Platforms like mobile devices will create a larger share of that footprint. Digital footprints will be cross-platform and will be ‘mashed up’ across platforms (for the lack of a better word). We have extended the idea of ‘digital footprints’ to ‘MY DIGITAL FOOTPRINT’. The concept of MY DIGITAL FOOTPRINT is therefore complex, but this book suggests it is a system and process for the ‘collection, store, analysis and value created from digital data from mobile, web and TV’.

Storage and analysis of digital footprints raises some important questions. Who analyses the digital footprint? Who stores it? What value is derived from the process and for whom?

Humans have always left traces of their activity. The oldest human footprints found date back to about 3.6 million years ago at Laetoli, Tanzania[i]. The ancient human beings who left those footprints would not have known that 3.6 million years later we modern humans would analyse them, photograph them, categorise them and draw new insights from those footprints about the people who created them. The difference now is that there is no need to wait 3.6 million years. As soon as digital footprints are created, there is a host of online companies analysing the cookie crumbs and immediately creating new insights (to be used for commercial reasons).

As we discussed previously, harnessing collective intelligence is the key idea behind Web 2.0. This is not a problem in itself, but the dark side arises if a business entices its audience (customers, clients, delegates, patients, friends) to give up their digital data, collects their digital footprint without their agreement, charges people to view their own data, or sells their data off with the sole expectation of making cash through the one-sided route of exploitation.

On the other hand, the value of digital footprints lies in using the analysis of data to complement services. However, this use (exploitation) is linked to the most fundamental components of digital identity: risk, privacy and trust. These three inter-related components bond and bridge all the characteristics of a digital footprint and identity, as we will see later in business models.

Figure 21 pictorially shows the aspects of the model that will be explored in the next chapter. It shows that on one side there is the collection of data. These are the inputs (click data, content data, my data and social data) from web, TV and mobile. These inputs create the digital footprints (raw data), which are stored. This stored digital footprint is analysed to create behavioural DNA. This is what your data says about you. This behavioural analysis is then used to create outputs, such as service discovery, service improvement and trade or barter.

Figure 21 Components of MY DIGITAL FOOTPRINT

The connection between the inputs and the outputs is the algorithm.

As previously mentioned, the algorithm is the component that creates the value; the outputs are how that value is realised. The algorithm that computes, combines, compares and analyses the digital footprint is the differentiator for a service provider. A good algorithm can produce success, a poor one can bring a company down. Whilst a company can implement the same algorithm, the way it is presented to the community will also lead to success or failure. This provides the bridges and bonds to risk, trust and privacy and how governance and the culture of the company, led by the CEO, will bring some brands down and others to new heights. Considering the algorithm is important. It is a very complex component and the part of the process that will bring differentiation.
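The flow from inputs through the stored footprint to behavioural DNA and outputs can be sketched in a few lines. This is a toy illustration only, not the model in Figure 21: the class and function names are hypothetical, and the ‘algorithm’ here is a deliberately naive frequency count standing in for whatever analysis a real provider would apply.

```python
from collections import Counter
from dataclasses import dataclass, field

# Input classes named in the text: click data, content data, my data, social data.
INPUT_TYPES = {"click", "content", "my", "social"}


@dataclass
class Footprint:
    """Stored raw data: a list of (input_type, platform, topic) events."""
    events: list = field(default_factory=list)

    def collect(self, input_type, platform, topic):
        if input_type not in INPUT_TYPES:
            raise ValueError(f"unknown input type: {input_type}")
        self.events.append((input_type, platform, topic))


def analyse(footprint):
    """Toy 'algorithm': behavioural DNA as topic frequencies (an assumption)."""
    return Counter(topic for _, _, topic in footprint.events)


def discover(dna, n=1):
    """One possible output (service discovery): the n most frequent topics."""
    return [topic for topic, _ in dna.most_common(n)]


fp = Footprint()
fp.collect("click", "web", "cycling")
fp.collect("content", "mobile", "cycling")
fp.collect("social", "tv", "cooking")

dna = analyse(fp)
print(discover(dna))
```

Swapping a different function in for `analyse` changes the outputs without touching collection or storage, which is one way to read the claim that the algorithm, rather than the data collection, is the differentiator.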

Figure 22 compares the credit card algorithm with what a Web 2.0 MY DIGITAL FOOTPRINT algorithm would look like. On the left is a traditional credit card algorithm. Original data collected from transactions was used to build an algorithm for the purpose of predicting if a transaction is fraudulent. When you sign up and start to use a new credit card, normalised data provides a traditional pattern of spending for your income group in your location and your profession (the behavioural model). Over time this is complemented with your own data, which sets up triggers and thresholds. If one of these triggers or thresholds is broken, there is a simple action of alert and a person steps in to determine the next action. This is a complex algorithm based on a lot of historical data, which works very well but has a singular function – fraud detection. Yes, the same datasets are used on another system applying another algorithm to determine if you pay on time, if you should get increased credit and if you may purchase new financial products, but the core of the value is to reduce fraud. The norms do provide a very good prediction, it is socially acceptable to do this and we enjoy the value that comes from giving up this data and the rights to it.
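The trigger-and-threshold idea can be sketched as follows. This is a simplified illustration under stated assumptions, not how any card issuer actually works: the threshold is taken to be the mean plus three standard deviations, the peer-group norm is used until the card holder has five transactions of their own, and all names and sample figures are hypothetical.

```python
from statistics import mean, pstdev


def threshold(group_spend, own_spend, k=3.0, min_history=5):
    """Alert threshold: mean + k standard deviations of past spend.

    Uses the peer-group norm until the card holder has enough history,
    then switches to their own spend. The values of k and min_history
    are illustrative assumptions.
    """
    history = own_spend if len(own_spend) >= min_history else group_spend
    return mean(history) + k * pstdev(history)


def needs_review(amount, group_spend, own_spend):
    """True when a transaction breaks the threshold and a person should step in."""
    return amount > threshold(group_spend, own_spend)


group = [20, 35, 50, 40, 30, 45]   # normalised peer-group transactions (made up)
own = [10, 12, 9, 11, 13, 10]      # this card holder's transactions (made up)

print(needs_review(60, group, []))    # new card: judged against the group norm
print(needs_review(60, group, own))   # established card: judged against own pattern
```

With these sample figures, a payment of 60 passes for a new card holder but is flagged for the established one, whose own pattern of small payments has tightened the threshold.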

Figure 22 Understanding the analysis issue

On the right is an entirely new and more complex algorithm. It is not principally data mining, as the output has moved from a trigger or threshold to prediction of intent, providing: personalisation (filters), delivering context at a specific moment in real-time, determining reputation, providing recommendations, pushing discovery, adding protection, and adding an ability to trade and barter. In this case, your data and that of others from your social groups are combined, compared and contrasted and, through a ‘chaos’ algorithm, linkages, bridges and bonds are developed to deliver value. The algorithm also takes the immediate reaction to that value back into the system to refine it; a continuum of feedback, honing and improving services and applications. As we will see in the business models chapter, my feedback produces focus and depth, and my social groups’ data produces colour and breadth; components in real life that are at odds with each other. This is why the algorithm is complex, but it is the part that will deliver differentiation.

Of course not all digital footprints are created equal, both in terms of the person to whom they relate and the actual value of the different data types that could be collected. Whilst it is obvious that some people are worth more to certain brands (lifestyle and segmentation works), it is less obvious which types of data have value and why it varies. Some intrinsic value in MY DIGITAL FOOTPRINT, the output of the analysis stage, lies in how difficult it is to change or swop provider of a service. If a provider is applying good analysis tools and delivering value through the output mechanisms, it is hoped that the propensity to change/swop/migrate/move/try will be reduced. This means lower customer churn, higher retention, lower cost of customer acquisition and gaining access to more of the lifetime Net Present Value (NPV) of the customer. In this context, the lifetime NPV of a customer means that the longer a customer is with you, the higher the value you as a business have achieved from that customer, which will affect your share price. Therefore, what data types are there and which data could affect the ability of a brand to keep a customer loyal? In another direction the question could be, what data is hard or takes a long time to capture and would benefit the outcomes from analysis in a positive way, so as to ensure customers are more loyal?

As an example, let’s review the types of data a mobile operator could have of its customers and see if any of the data could be used to make you more loyal. Figure 23 provides an overview.

The left-hand column shows the types of data that a mobile operator could collect, starting with the obvious ones on billing (monthly invoicing or pre-paid spend pattern), user’s name, and billing (home) address. In reality these data types provide an operator with little competitive advantage, because another operator would be able to replicate them in a short period. There is some value from the analysis of this data (classic segmentation) but the value is biased to the operator.

Looking down the list, the majority of data that a mobile operator can collect is quick to replicate and would take a short period for someone else (another interested service provider) to collect the same data if the customer churned. Some data would take a little longer to replicate, such as your mobile web browsing history, your application download history (assuming an operator-controlled portal) and what media you consume. These latter data types, if analysed with a ‘good’ algorithm (one that creates value for the user), will provide some additional loyalty or stickiness.

The operator challenge, however, is that they don’t have access (under existing terms and conditions) to most of these data classes, other than by snooping, as users go off portal and depend on the operator as an access (IP) provider. Some data types in the list would take a long time to replicate; an example would be the IMEI/device history. However, whilst it would take a long time to replicate, there is very limited value in knowing this data.

Not all data, even though it can be collected, has value. Indeed, the operator does have access to some data that will not be/cannot be replicated, such as adverts responded to, past call record patterns and payments made by SMS or m-payment (assuming the operator was party to the transaction). However, whilst such non-replicable data types could produce loyalty and value, these classes of data are also held by third parties and, on their own, have limited value. Knowing you used the m-payment service for a ticket once is probably not as valuable as the data your bank has on you, as the user’s bank (transaction provider) can see all transactions.

Certain types of data are very hard to replicate because they take a long time to build (and hence could be very valuable), but this value should not be assumed. Indeed, from this view, the mobile operator will find it difficult to retain customers and improve loyalty, as the data types it owns and has access to are of low value.
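The operator discussion above amounts to a two-question filter: how long would a rival take to replicate the data, and does it carry value for the user? A minimal sketch of that filter follows. The rows paraphrase the data types mentioned in the text, while the coarse labels (‘fast’, ‘medium’, ‘slow’, ‘never’ and ‘little’, ‘some’, ‘limited’) are an illustrative encoding of the prose, not values taken from Figure 23.

```python
# (data type, time for a rival to replicate, value noted in the text)
# The labels are a coarse, assumed encoding of the prose above.
OPERATOR_DATA = [
    ("billing pattern", "fast", "little"),
    ("name and billing address", "fast", "little"),
    ("mobile web browsing history", "medium", "some"),
    ("application download history", "medium", "some"),
    ("media consumed", "medium", "some"),
    ("IMEI/device history", "slow", "limited"),
    ("adverts responded to", "never", "limited"),
    ("past call record patterns", "never", "limited"),
    ("SMS/m-payment records", "never", "limited"),
]


def loyalty_candidates(rows):
    """Data types that are not quick to copy and carry some value for the user."""
    return [name for name, speed, value in rows
            if speed != "fast" and value == "some"]


print(loyalty_candidates(OPERATOR_DATA))
```

Only the browsing, download and media rows survive the filter, and those are the classes the operator largely cannot see once users go off portal.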

Figure 23 Types of data inputs that make up a digital footprint

Creation of services therefore requires the user/customer to share data with the service provider. In sharing the data with the service provider, the services should become better, more accurate and more efficient; in exchange for giving up some value, the customer should receive value. However, not all services are equal. Specifically, services which have a higher cost of data acquisition or take longer to acquire the data will be seen to be more valuable, as they present the opportunity to create barriers to entry.

It is worth a brief look at some other data types of a more generic nature to determine if there are any obvious data types that could create loyalty. Figure 24 provides such an overview. Again, the left-hand column shows different classes of data types, such as food, clothes and content purchased. It is possible to see that most of the generic data types are quick to replicate. By this it is meant that it would not take long for two major food retailers to have the same profile of a user based on till/online spend, unless the user is very discerning and only buys specialist or selective foods from each store. The stores, however, know this, as they can compare individual spend to the norm of other buyers in their store. Some data types, such as web searches, browsing, TV viewing and media consumption, take longer to pick up and hence are slower to replicate. It would take a medium length of time to replicate the data between two providers of the service. This starts to open up the opportunity for a provider to add value and hold the user. Therefore, a good predictive algorithm could help one provider make a user more loyal, and gain that additional slower-to-replicate data, as long as the user perceives value has been added. Very slow-to-replicate data, which takes a long time to build, can have significant value. However, some legislative jurisdictions are seeking to require companies to hold data only for a short period, which would block out the value from this data class. Ignoring these possible hurdles, because high value goods are only purchased every several years, it is useful to know when re-purchase (new or preventative) is required or when to offer additional insurance.

Another class of data that takes time to build is routes and routines. This class of data is about where you have been and what you do. To generate this data class from the source of location takes time. In a chapter later in the book we look at how this data class (routes and routines) could be used to offer a secure mobile service, based not on passwords but on your behaviour.

Figure 24 Generic data types and comments on value

This chapter has focused on MY DIGITAL FOOTPRINT’s reputation and identity, and how our interactions with the six screens of life will both help generate data for our digital footprints (raw data) and also how value will be made through cross-platform integration, co-operation and convergence of services. Finally, we have seen that not all digital footprint data has equal value and just collecting data does not create any competitive advantage, but will introduce serious governance issues.

Fish tails, a model to categorise data

Having reviewed which types of data it is possible to collect and if value can be reached, this section provides a model to look at these data types in a framework and suggests two forms of classification, in addition to the idea of replication time. The first observation is how the data arrives if collected, the second is a classification. These tools I use to help companies undertake a simple data survey and determine where value will be generated.

As already mentioned, not all data is created equal in terms of its value. Further, not all data is created in the same way, and it is even more complex than this, as the value of data changes depending on the context. An online bill for payment in 30 days is less important (has less value) than that same online bill when the deadline is just 1 day. Data also arrives in different patterns: location data is a continuous feed, whereas food is short bursts of information, either daily for lunch or once a week for a household. High value goods and services (holidays, cars and a tax accountant) present a small amount of data very infrequently. Content and music both provide two types of data: a one-off dataset related to purchase, as previously mentioned, and a usage or consumption dataset, which varies by time. Some of these data types are presented in Figure 25.

Figure 25 Fish tails, a model to categorise data

Taking this obvious view that data is created in different ways, when this creation process is mapped onto the knowledge that data types are all inter-related it becomes clear, or very confusing, that the inter-relationships, bonds and dependencies mean that many companies with different data collection capabilities can reach the same value at outputs. By this I mean, two companies can have different datasets from the same user, but reach the same conclusion via analysis as to where value lies. This presents an interesting dilemma for application companies: should I ask the user for their home address or should I find it from records, depending on a relationship to a social group and behavioural data? However, if I show that I know the user’s home address, will that spook the user? This is why trust, risk and privacy are so important. As an applications company, who should I buy data from and should I reveal my data sources? Figure 26 shows some of these inter-relationships between data types and, as can be seen, collecting some types of data will reveal other material facts; however, gaining access to all of that class of data could be difficult, especially payment. Increasingly, with a move to mobile payment, micropayment and near field pre-pay card payment, no one provider has all the data. Indeed, it could be important that the user is able to collect this data class and offer it through an open API to other providers, as it will reveal all of the payment behaviours.

Figure 26 Data is inter-related

I do like the work in progress by Leafar[ii] and Fred Cavazza's[iii] 2006 framework for mapping our digital identity and, to quote some translated work, ‘increasingly, footprints appear on the digital sands over which we don't exercise any control: people blog about you, take and publish pictures, and if someone searches for your name in Google or GoogleBlogsearch or Clusty or IceRocket or Technorati, up comes whatever comes, and that's also defining your digital identity – each one of those search results composes one of your digital personas, giving information and hints about what you do, your character, your opinions, your network, in short, who you are. And there are more: just think of Second Life avatars.’ This work is a good framework for items we create about ourselves, but not for data created by others on us, via mashups or automated data from sensors.

Therefore, data collection from the screens of life (mobile, web and TV) is in some ways easy, from both explicit and active (the user providing) and implicit and passive (automatic via sensors) sources. However, not all data has equal value and having some data can create barriers to entry and loyalty. In the long term, data collection will be a commodity business and data will be traded. The value will lie not in the position to collect but in the analysis and the algorithm that drives the outputs and value. Today’s simple CRM and data mining is not up to it; however, the skills that underpin them will bring the advantage.


[i] http://anthropology.si.edu/humanorigins/ha/laetoli.htm

[ii] http://ulik.typepad.com/leafar/2006/10/ulik_unleash_id.html

[iii] http://www.fredcavazza.net/index.php?2006/10/22/1310-qu-est-ce-que-l-identite-numerique (in French)


