Like Neil Armstrong walking on the moon and Nelson Mandela walking free from Robben Island, we all leave footprints. Footprints are more than identity. Footprints are about where we have been, for how long, how often and the inter-relationships; they are memories and moments. Therefore, digital footprints are not about your identity, your passport, bank account or social security number. Digital footprints come from your mobile, web and TV interactions and comprise the digital data and also the Metadata[i] (data about data) of who we are. This is where the true value lies, and it is why the ownership of this data class is the battleground to be won and lost.
The original web-based digital footprint and its digital data belonged to the individual at some point. However, the individual is currently not empowered to hold or manage this digital footprint. Mobile adds a unique dimension to the digital footprint since mobile provides new content, Metadata and the social context for the digital footprint. In contrast, both TV and the web can provide some data – but the mobile device is unique in terms of its contribution to our digital footprint. This idea is illustrated in Figure 6. The taps and the volume of water that flows are a visual representation of the amount of value that can be created from user data. The water fills a pool, which represents the digital data that is stored about you from your interactions.
Data from Broadcast/Listen include: viewing times and schedules, preferences for channels and content, timing of programmes and presence (are you actually there, this could be determined through motion, channel hopping, fast forward or a secondary device, PC or mobile listening to your TV preference).
Data from the web includes: attention, how long you read a page for, browsing history, search words and spelling, patterns and clicks, content created or viewed and purchases (consume). Data from a mobile would include: location, attention, browsing, search, time, who you are with (Bluetooth), proximity, clicks, creation of data and media, consumer, play lists and presence. The actual raw data could be location co-ordinate, a click, two-way interactions or a picture; the size of the tap and subsequent flow represents the volume of data that can be added to the digital footprint pool.
From digital footprints to MY DIGITAL FOOTPRINT
The idea of digital footprints has been discussed from a privacy or data protection standpoint. However, key commentators agree that we are increasingly leaving larger digital footprints over time, especially given the rise of popular social networks and mobile devices.
A digital footprint is the persistent trail of data left online by a user’s activity in a digital environment – which Nicholas Negroponte called the ‘slug trail’ in Being Digital and John Battelle calls the ‘Clickstream exhaust’[ii].
According to the Pew Internet report[iii], there are two main classifications for digital footprints: passive digital footprints and active digital footprints. A passive digital footprint is created when data is collected about an action without any client activation (implicit) and includes data from sensors; whereas active digital footprints are created when a user deliberately releases personal data for the purpose of sharing information about himself (explicit).
On the web, many interactions, such as creating a social networking profile or commenting on a picture on Flickr, leave a digital footprint. In a mobile context, CDRs (Call Data Records) are the transactional data that constitute the user’s digital footprint. But the mere availability of transactional data alone is not enough since privacy and data protection rules will apply to the usage of data, and rightly so. It is the ability to store, analyse and create value from the digital footprint that differentiates the study of digital footprints. In other words, if we all left digital footprints – and nothing happened to those footprints – then there would be no concerns and no benefits.
Hence, in this book, we present a definition of digital footprints to include capture, store, analysis and value. The ‘capture’ phase includes not only your own activity – but also the activity of others related to that information element – for instance, the impact of your social graph and third parties on the digital footprint. This idea is depicted in Figure 7.
The capture of data itself arises from multiple sources and importantly not only from the data created by the person but by sensors in or connected to devices. The storage of the digital footprint relates to where raw data is stored physically, its ownership and portability [the analysis is also stored but cannot be reversed to create raw data again and identify the user, the location, the service, the purchase or anything unique]. Analysis is the key differentiator in terms of where wealth could be created and in turn leads to the potential value available to both the user and the service providers.
Thus the data captured consists of the click data (the digital trace), and the content (the actual data created – e.g. the pictures uploaded from a mobile device). My data, as presented in Figure 8, is the automated data collection, which can be an embedded location in a picture or enabled by the user in some application. sensory.net is part of my data and is where the device acts as a sensor and collects information. The last input is the social component. Unlike the other inputs, the social component is not your own data about you. Social data is what your social group provides about you as an individual: your references in LinkedIn, your rating on eBay, your image tag on Flickr and Facebook.
There is a subtle but important distinction between ‘digital footprints’, that is, the raw data that is captured and stored for analysis, and my digital footprint. The idea of my digital footprint extends the idea of raw data to the wider concept of capture, store, analysis and value created from data generated through digital engagement. The term my digital footprint is used to focus on the system and the value created.
This process of building MY DIGITAL FOOTPRINT is based on a structured approach incorporating inputs (collection) and outputs (value) and a feedback loop that governs the whole process. Who does the building is an important topic and is addressed later in the book under the business models chapter. This feedback loop progressively enriches and refines the outputs (value). The analysis phase is able to take raw data from various stored sources (which we refer to as the digital footprint) and from the analysis of this raw feed generates value (wealth and services), which takes the form of components such as personalisation, reputation or discovery. The output of the analysis stage we call ‘behavioural DNA’, as it provides a detailed description of you, but has not delivered any value to you.
In the chapter on the two-sided business model there is a detailed description of the inputs and outputs and how the feedback loop works; however, here is a short description of the inputs and outputs. The inputs to MY DIGITAL FOOTPRINT are the data elements and the outputs are the value derived from the process which is in turn enhanced by the feedback loop.
Inputs into MY DIGITAL FOOTPRINT
Attention
Data that indicates what you are doing: it details which applications and services you are engaged with. This could come from a widget on your desktop, mobile or set-top providing insight into which applications are open, how long you edited a document, which pictures you viewed, what music you listened to and how often. The attention data stream is the record of what you spend your time doing in a digital world on TV, web and mobile.
Location
The data record of where you are. The live feed captures your current position; if stored, it also records where you were (the route taken).
Time
The time data record is both the time of day and also the period of time.
Search
The data string of search requests, currently the text words (and voice-based search on Google mobile), entered into a search engine, but progressing to automated search based on requests from 3D barcodes and local available intelligence.
Content (create)
The data record of type, context and information about the content you have created for text, voice, presentation, music, audio, images, video, blogs, tags and recommendation.
Activity
This is the dataset that defines what you are doing. Whereas attention says you are looking at a web page, activity establishes that you are at a football ground. Location gives you the co-ordinates.
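As a rough illustration of how the inputs listed above might sit together in one captured record, here is a minimal sketch in Python. The class name, field names and the helper function are assumptions made for this example only; they are not a format defined in this book or used by any particular service.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class FootprintInput:
    """One captured record. Every field name here is hypothetical."""
    attention: str                            # application, service or page being engaged with
    location: Optional[Tuple[float, float]]   # (latitude, longitude), if available
    timestamp: datetime                       # time: the time of day
    duration: timedelta                       # time: the period of time
    search: Optional[str] = None              # search request string, if any
    content: List[str] = field(default_factory=list)  # descriptors of content created
    activity: Optional[str] = None            # what the user is doing, e.g. "at a football ground"


def route_taken(records: List[FootprintInput]) -> List[Tuple[float, float]]:
    """Stored location records, ordered by time, give the route taken."""
    located = [r for r in records if r.location is not None]
    return [r.location for r in sorted(located, key=lambda r: r.timestamp)]
```

The sketch keeps attention, activity and location as separate fields because the text treats them as distinct inputs: the same web page can be viewed from very different places and during very different activities.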
Outputs - value created from digital footprints
Intent
Intent is an output which provides a prediction about what you will do next based on what you have done, what your social graph does but also on what you have told/inferred/implied that you are about to do, such as your calendar, email or IM trail. Whereas context is about now, intent is about next.
Reputation
Reputation (digital) has many components. Reputation is both about a rating (good and bad) and about your propensity to do something, such as leave a comment. Reputation (digital) is therefore partly about your value to the community as a participant.
This output data produces a record which is your digital reputation.
Discovery
This output provides concepts, ideas, insight to enable the user to discover. Discovery is about risk and comes in the form of improvement to an existing service or discovery of a new service/application.
Recommendation
This is where, based on your digital footprint (you and your social graph), a service/application is able to make a recommendation about an existing or new product or service with a degree of confidence that it will be relevant. Where discovery is risk, recommendation is about trust.
Protection
This is where your data can be used to protect you and your data, in the same way fraud detection on credit cards works. Your data is a good predictor of whether you are the individual who is providing the data. This does depend on humans and certain social groups being creatures of habit.
Personalisation
This is where the application or service is personalised to a user for the particular instance or time. It is the modification of a generic service automatically, but based on what is known about you.
Trade or barter
This second order output function enables the user to trade or barter for goods or services. The trade or barter will not be for cash (this is payment) but for data or for insights, research, etc. This trade or barter is based on input data, analysis, intent and reputation.
Contextual adaptation
This is where the service or application will adapt to deliver a service that is unique to the individual’s requirements based on the existing environment.
The paradox of privacy
There is a paradox with privacy. On the one hand, everyone fears losing it. Scott McNealy of Sun Microsystems famously said that: “Consumer privacy is a red herring: we have zero privacy – and we should all get over it”[iv]. This view has gathered credence after 9/11. Esther Dyson argues that we need more granular control over our data. She believes that the notion of privacy doesn’t fully capture the challenges of the current environment online. “We need to stop talking about privacy and start talking about control over data,” she says. She argues that, “In the future, users are going to want more granular control over their data, making detailed decisions about what gets shared with whom. Users may be overwhelmed when first setting up an account, but when they get more comfortable with an application, they will exert more control.”
The idea of granular control over our personal information, based on the work of Kim Cameron[v] and Stefan Brands[vi], is worth reading about if you have a particular need for detail on privacy. The kind of revised technical model enshrined in the laws of identity, combined with the smart cryptography of minimal disclosure tokens, provides (at least at the technical level) an important breakthrough in the way we think about engineering the design of digital products and services, and about empowering users to control their data in a highly granular way.
On the other hand, we all have an incentive to contribute data about ourselves, while reflecting on the manner in which we want to be seen, so as to be more visible within a digital context. For instance, even if we don’t want to leave a digital footprint (traces, exhausts, trails, shadows), we do want to be searchable on the web and prefer the results to be seen to be favourable to ourselves.
So, either we are passively creating digital footprints or we are actively contributing information about ourselves. In either case, we are contributing to someone’s view about us, but in doing so are giving up data and our digital footprint, which could be harnessed.
Who is harnessing your collective intelligence?
Web 2.0 taught us the concept of ‘harnessing collective intelligence’, which will be discussed in greater detail in this section. Harnessing collective intelligence is not a problem in itself. The dark side, however, arises if a business entices its audience (customers, clients, delegates, patients, friends) to give up their digital data, collects their digital footprint without their agreement, charges people to view their own data, or sells OUR data off with the sole expectation of making money through the one-sided route of exploitation. My [Tony Fish] mobile number is widely available on the web, yet I never get unwanted calls. My home number is ex-directory and only listed on private applications, yet once a week I receive an unwanted sales call. Who sold my data?
On May 1 2009, Spock[vii] was acquired by Intelius (a background check company). Spock is based on a robot which automatically creates tags for any person it finds. It trawls the web for sites such as Wikipedia, LinkedIn and others. It also allows users to enrich their data by letting them add tags of their own and add other data such as relationships between people. Thus, following the ideas of Web 2.0, the site gets better as more people use it. However, from the user standpoint, it gets tricky because:
· Spock trawls the web looking for our data;
· it creates a profile about us in their site without direct approval;
· it encourages us to enrich that information;
· it charges us to access our own information; and
· ultimately, it sells that same information to a background check company.
This leaves open the question, can I delete my own information in Spock? In many ways it now becomes more interesting. Spock says on deletion of information[viii]:
‘If you'd like to remove yourself from Spock, please read the following information and click the link below. Before requesting removal, please make sure the original source of the information Spock found for you has been removed or made private (MySpace, blog, Friendster, etc). This will prevent you from being re-indexed on the site. Please note that you can only request removal for your Spock search result. When filling out your information please make sure to include your name, e-mail, a link to your Spock Search Result (http://www.spock.com/Tiger-Woods), and the reason why you'd like to be removed. The Spock Support Team will review your claim and get back to you within 24–48 hours.’
The implication is that, as a user, I have to ensure that the original sources of information that they (Spock) sourced (via a spider) the profiles from should also be made private (my blog, my Facebook profile, etc) or they will 'harness' me again. This is the Web 2.0 equivalent of harvesting email addresses and selling them on.
In future, legislation may extend to cover such practices and the idea of empowering the customer will gain more acceptance. As mobile devices become more common, the issue becomes more significant with Mobile Web 2.0. (By Mobile Web 2.0 – I mean the concept of extending the idea of harnessing collective intelligence to mobile devices, which are more attuned to capturing data along with the accompanying Metadata.) But on a more optimistic note, the use of user-generated data to create a better service is a good thing provided there is transparency and the control rests with the user.
Thus, whilst the principles of Web 2.0 were sound, some implementations of the business model may not be, especially if they are FREE. Responsible companies will come to use the principles and create better services (and will use the web and the mobile holistically). These may not be free, hence not 'Web 2.0' in the traditional sense. But they will be more honest in their relationship to their customers and more transparent in their usage of data from their customers. Users may also be wiser and more empowered. We will learn from the mistakes of Web 2.0 and create better engagement and trust based on converged/mobile-driven services.
Privacy - is your privacy someone else's business?
The above discussion raises the question: ‘Is our privacy becoming someone else’s business?’ What I mean by this is: are companies exploiting my private data for their own gains, and has the privacy I thought I had either never existed or been eroded?
These practices (exploiting data) have attracted the attention of regulatory bodies, such as the Federal Trade Commission in the USA[ix]. It is widely accepted that the populace take the role of media and brands for granted. Marketers will tell you that ‘brands need us and we need them’.
Much technology and innovation adoption is now driven by brands, think iPod, and the interests of the brands are not necessarily aligned with the interests of the customer. Hence, the notion that 'we need brands and brands need us' has to be tempered with the basic reality that the primary purpose of brands is to sell. And let’s not forget that. As media becomes rich and complex, brands seek to engage with us and to measure that engagement for maximising their revenue. Hence, their interest in ‘harnessing the digital footprint’ and its consequent impact on privacy. Digital footprints and privacy, trust and risk are two sides of the same coin; they are bonded and bridged as discussed earlier. There is an extended chapter on these relationships towards the end of the book as the focus moves to the business model. Advertisers need a lot of data to make their advertising more personalised (and, by extension, to claim more money from the companies who use their advertising), but acquiring the data needs the customer to give up certain privacy rights, trust someone and take some measured risk, all in the interests of the advertiser.
Consider the industry’s emphasis on 'convergence'. Convergence could take a much darker meaning in light of harnessing a digital footprint. If the media and advertisers were to indeed 'join the dots' between the various information elements left by us (cookie crumbs of information in different media), then advertising becomes powerful and personalised. This benefits the advertiser especially in a converged media scenario (where the same provider owns the TV, landline, mobile subscriptions, etc); however, this could lead to some questionable behaviour which could be currently legal but may soon be regulated. It could also lead to consumer backlash, as explained in the scenario below.
Consider this scenario: you are a fan of a rock group and you blog about it. Many TV companies are exploring ways to 'personalise' TV advertising to the home. For instance, they seek to gain viewing preferences from set-top boxes and other avenues and then (in an extreme scenario), to tailor the advertising to each home. The question becomes: Which data elements can be used to tailor this advertising?
Consider the ‘rock concert’ data element, which can be obtained from an RSS feed from my blog. It is easy to combine three sets of data: my home address and my name, which the cable company already holds; the RSS feed from my blog (which ties to my name or, via packet inspection, my persona); and the phone book/voter registration (as a confirmation of my address). Knowing these elements, the cable TV company can join the dots (analysis in this book’s terminology) and then 'personalise' the advertisement specifically to me (show me an advertisement for the performer’s next rock concert in the commercial break on the TV/cable). Presumably, this makes the advertiser 'happy' since they are 'personalising' the advertisement to me and I could even 'engage' with it by pressing the 'Buy it Now' button (advertising utopia!). On the other hand, it could be seen as a gross invasion of privacy and questionable 'Big Brother' tactics, which is why the first word in the book was ‘Marmite’ – some like this idea, others don’t.
This personalisation and engagement could be made progressively worse in the future by the abundant availability of different datasets open to advertisers, all of which could be correlated to gain new insights about us in order to 'sell' to us. This ability to re-identify or even identify does raise important privacy issues. You could call this 'micro-persuasion' and indeed it raises some questions about the ethics of advertisements and engagement (although none of this behaviour would be seen to be illegal). It also raises a genuine spectre of consumer backlash (e.g. if I were to see many such rock concert ads, I would know that my TV is watching me and I would take action, such as change channel or service provider).
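To make the 'join the dots' step in the rock concert scenario concrete, here is a minimal sketch in Python. All names, addresses, account numbers and field names are invented for illustration, and matching on an exact name string is a deliberate simplification; it is not a description of how any cable company actually works.

```python
from typing import Dict, List

# All records below are invented for illustration.
subscribers = [
    {"name": "A. Example", "address": "1 Sample Street", "account": "TV-001"},
    {"name": "B. Sample", "address": "2 Sample Street", "account": "TV-002"},
]
blog_interests = [
    {"author": "A. Example", "topic": "rock concert"},
]
public_register = [
    {"name": "A. Example", "address": "1 Sample Street"},
    {"name": "B. Sample", "address": "9 Other Road"},
]


def join_the_dots(subscribers, blog_interests, public_register) -> List[Dict[str, str]]:
    """Link blog interests to a household when the name matches and the
    address is confirmed by the public register."""
    confirmed = {(p["name"], p["address"]) for p in public_register}

    topics_by_author: Dict[str, List[str]] = {}
    for post in blog_interests:
        topics_by_author.setdefault(post["author"], []).append(post["topic"])

    profiles = []
    for sub in subscribers:
        if (sub["name"], sub["address"]) not in confirmed:
            continue  # address not confirmed, so no profile is built
        for topic in topics_by_author.get(sub["name"], []):
            profiles.append(
                {"account": sub["account"], "address": sub["address"], "interest": topic}
            )
    return profiles
```

None of the three datasets says much on its own; it is the join that attaches a stated interest to a specific household, which is exactly where the privacy concern arises.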
Governments will also follow suit. Governments need to be involved in two ways:
· creating regulation that benefits consumers in addition to the advertisers, especially in relation to new areas where regulation is sparse and consumers can be potentially exploited; and
· ensuring that the privacy rights of individuals are protected in the light of ever increasing encroachments from brands and advertisers.
At one level, we have laws such as the Data Protection Act[x] in the UK. However, government can also be part of the problem, for instance, the proposed law on 'data sharing' in the UK. Under the guise of 'mass exchange of data can offer some benefits' (to advertisers and governments), the UK Government is proposing legislation (source: Telegraph website[xi]) by which data held by the police, the NHS, schools, the HMRC (tax), local councils and the DVLA could all end up in private hands, according to Privacy International. At the same time, information gathered by companies including hotel registrations, bank details and telecommunications data could be transferred to the Government as part of the provisions of the Coroners and Justice Bill, it is claimed. The campaign group admits the ‘mass exchange of personal information has the potential to deliver some benefit’.
Yet another grey area is targeting minors and ethnic groups. There is no law that prevents the targeting of specific ethnic groups by advertisers. In fact, it can be profitable to do so, as suggested by the ad network JumpTap, which predicted that Hispanic-centric campaigns would quadruple this year, with revenue increasing at least 20% in the segment[xii]. There may be indeed nothing wrong in selling Hispanic-orientated content, music, etc, targeting a specific demographic, but change the model slightly and you get some serious privacy concerns. For instance, the South Asian population is genetically susceptible to diabetes[xiii]. Does this mean that diabetes medication advertisements should be targeted to South Asians in the UK? Again, this is not too difficult to do using current technology and increasing convergence and data availability. Where do we draw the line?
For many of us who travel to the United States, we see drug companies advertising medication on TV. This is illegal in many countries – especially in Europe. The message from the advertisements seems to be 'Call your General Practitioner (GP) and ask him to recommend our drugs'. Broadcasting drug company advertisements raises the ethical issue of the advertising company influencing the doctor's judgement for commercial reasons (selling their products). In many countries, this is an ethical question and under regulatory scrutiny.
Yet another area is the protection of minors, especially in an era dominated by mobile and social networking. Social networks, mobile and other emerging mediums offer the possibility of pushing the boundaries of advertising to target kids. Again, this practice is not illegal (yet!), but it is certainly morally questionable. The Guardian newspaper’s[xiv] review of the book Consumer Kids by Ed Mayo summarises the case study of seven-year-old Sarah, who has been recruited through Dubit.com to act as a brand ambassador for Mattel and promote her Barbie MP3 player to school friends. In exchange for keeping the sought-after shiny pink gadget, her job description includes creating a fan site where she blogs about the product, taking pictures of her sales missions and posting them back to Dubit, where she is rewarded.
As more and more mobile devices are able to purchase goods and services, extending the above discussion, we enter into the realm of the ethics of impulse purchasing. Impulse purchasing is not unethical in itself. Supermarkets, for instance, regularly encourage impulse purchases through product placements. However, with a mobile device, new problems could arise. Consider the example of the phone 'reminding' you to buy a related product. This would be based on 'opt-in' so it's not spam. So far, so good. At worst a minor irritation, at best a useful recommendation.
Now extend this further. Knowing the person/object they are looking at (based on location, e.g. they are standing in front of a car showroom) and their credit history (available on the web), can we offer a 'One Click' loan to 'engage' with the person and 'encourage' them to buy the car? Legally and technologically it is not banned. However, morally and ethically it may be considered dubious. Note that all this precise engagement and personalisation can be enabled by correlating different datasets.
If we consider advertising: to what extent does advertising dictate content? It is an intriguing question and most media channels will deny that their content is influenced by advertising. However, there are indicators that this may be the case based on the limited and advertising-led range of content. For instance, advertisers would favour entertainment-led content since it places the viewer in a more receptive mood to buy, in contrast to the more serious documentary-based content (which does not). The question of profiles is also interesting and raises some questions.
For instance, consider the abstract of the following patent filed by Google (source: search engine journal)[xv]. ‘Personalized advertisements are provided to a user using a search engine to obtain documents relevant to a search query. The advertisements are personalized in response to a search profile that is derived from personalized search results. The search results are personalized based on a user profile of the user providing the query. The user profile describes interests of the user, and can be derived from a variety of sources, including prior search queries, prior search results, expressed interests, demographic, geographic, psychographic, and activity information.’
Such a profile would appear to be recording all our activities in cyberspace and tying them individually to us (to be used for the purposes of advertising). This practice does raise privacy, trust and risk concerns. On the other side are anonymised profiles which seek to anonymise personal data and then create 'templates' of user behaviour, which may be used to predict future behaviour based on past behaviour. For instance, it may be used to identify in advance who will churn (move off) from a social network. In this case, rather than getting an individual profile, we get audience segments. Audience segments are not tied to individuals (of course in a very small segment, for example, a segment of one, it would be a direct link).
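The caveat about very small segments can be sketched in code. The following Python example groups users into audience segments and drops any segment that is too small to hide an individual. The threshold of three, the attribute names and the function name are assumptions chosen for illustration, not values taken from this book or from any regulation.

```python
from collections import Counter
from typing import Dict, List, Tuple

MIN_SEGMENT_SIZE = 3  # assumed threshold, chosen only for illustration


def audience_segments(
    users: List[Dict[str, str]],
    keys: Tuple[str, ...],
    min_size: int = MIN_SEGMENT_SIZE,
) -> Dict[Tuple[str, ...], int]:
    """Count users per segment and suppress segments smaller than min_size,
    since a segment of one is effectively an individual profile."""
    counts = Counter(tuple(user[k] for k in keys) for user in users)
    return {segment: n for segment, n in counts.items() if n >= min_size}
```

For example, segmenting on a hypothetical ("age_band", "usage") pair returns only the counts for combinations shared by at least three users; the identifiers of the users themselves never appear in the output.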
When it comes to the mobile platform, the mobile operators generally have a good reputation for managing data and preventing misuse from advertisers. Misleading promotions, such as the Crazy Frog ringtone[xvi] in Europe were not created by telecom operators but rather by mobile marketing companies. Certainly, most operators take privacy seriously. Over time, mobile operators and the industry will face new challenges and will work with new forms of advertising as indicated in the discussion above. Whatever the direction we choose, 'mobile', due to its unique, personalised nature, will have to go beyond 'opt-in' and may need higher standards beyond statutory regulation based on moral and ethical integrity with a view to protect consumer interests. The future of privacy will lie in customer empowerment. Some of the mechanisms for privacy that we discuss later include:
· anonymisation;
· revocation;
· vendor relationship management; and
· full disclosure.
For marketers, the temptation to treat social media and the digital footprint as a 'channel' is strong, along with the desire to retrofit the new world of communication to the familiar world of brands, traffic, audiences, growth and value. However, this is not (always) in the consumer's interest. The pendulum of legislation will shift from an emphasis on brands to empowering the consumer, and the debate is just beginning.
Perceptions of privacy
It is interesting to see the public’s reaction to the privacy issue. Pew International[xvii] gives some insights about customer perception. According to their analysis: ‘Online adults can be divided into four categories based on their level of concern about their online information and whether or not they take steps to limit their online footprint:
· Confident Creatives are the smallest of the four groups, comprising 17% of online adults. They say they do not worry about the availability of their online data, and actively upload content, but still take steps to limit their personal information.
· The Concerned and Careful fret about the personal information available about them online and take steps to proactively limit their own online data. One in five online adults (21%) fall into this category.
· Despite being anxious about how much information is available about them, members of the Worried by the Wayside group do not actively limit their online information. This group contains 18% of online adults.
· The Unfazed and Inactive group is the largest of the four groups – 43% of online adults fall into this category. They neither worry about their personal information nor limit the amount of information that can be found out about them online.’
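The four groups quoted above can be read as a two-by-two grid: whether a person worries about their online information, and whether they take steps to limit it. The Python sketch below encodes that reading. It is a simplification made for this example (the quoted description of Confident Creatives also mentions actively uploading content, which is ignored here), and the function and variable names are hypothetical.

```python
def pew_category(worried: bool, limits_information: bool) -> str:
    """Map two yes/no answers onto the four groups quoted above."""
    if worried and limits_information:
        return "Concerned and Careful"
    if worried:
        return "Worried by the Wayside"
    if limits_information:
        return "Confident Creatives"
    return "Unfazed and Inactive"


# Shares of online adults, in per cent, as quoted in the list above.
QUOTED_SHARES = {
    "Confident Creatives": 17,
    "Concerned and Careful": 21,
    "Worried by the Wayside": 18,
    "Unfazed and Inactive": 43,
}
```

Read this way, the quoted figures suggest that 39% of online adults worry about their information (21% plus 18%), while 38% take steps to limit it (17% plus 21%).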
Thus, I see a range of perceptions of privacy and digital footprints. As we shall see going forward, the issues, benefits and perceptions are all going to change significantly in the near future; especially in relation to mobile.
[i] http://en.wikipedia.org/wiki/Metadata July 2009
[ii] http://battellemedia.com/archives/000647.php
[iii] http://www.pewinternet.org/Reports/2007/Digital-Footprints.aspx
[iv] http://www.wired.com/politics/law/news/1999/01/17538
[v] http://www.identityblog.com/
[viii] http://www.spock.com/do/pages/help#claim-search-result
[ix] http://www.forbes.com/2009/01/12/mobile-marketing-privacy-tech-security-cx_ag_0113mobilemarket.html
[x] http://en.wikipedia.org/wiki/Data_Protection_Act July 2009
[xi] http://www.telegraph.co.uk/news/newstopics/politics/4339771/Threat-to-privacy-under-data-law-campaigners-warn.html
[xii] http://adage.com/digital/article?article_id=134036
[xiii] http://news.bbc.co.uk/1/hi/health/7760413.stm
[xiv] http://www.guardian.co.uk/media/2009/jan/26/marketing-online-children-kids-underage-regulation
[xv] http://www.searchenginejournal.com/google-advertising-patents-for-behavioral-targeting-personalization-and-profiling/2311/
[xvi] http://www.out-law.com/page-6483
[xvii] http://pewglobal.org/reports/display.php?ReportID=247
