Blog 2 – Scholarly Sources

Profiling: A Hidden Challenge to the Regulation of Data Surveillance

Summary: Introduction to the data surveillance technique known as profiling, which infers information about individuals from gathered data and generates a list of possible targets (criminal suspects, potential clients, etc.). Regulation seems to be a big issue, and the article discusses why it is necessary given the social implications of using profiling.


Author: Roger Clarke

Clarke immediately begins by subtly implying his aversion to data surveillance in general, mentioning its cheapness as a reason for its use over other forms of surveillance. He also explains the difference between personal and mass data surveillance, and lists the ways that both work intrusively to gather information about individuals.

Clarke claims that there isn’t ‘anywhere in the world’ that has enacted legislation to actively control how profiling is carried out, and that he intends to define the technique and assess its implications so that he can establish the need for its use to be regulated. This line of reasoning suggests Clarke is purposely searching for faults in the way the technique operates, which in turn points to an obvious bias.

Outside of data surveillance, Clarke defines profiling as the process of creating and using a schematic representation of a person’s interests for the purposes of information retrieval. In relation to law enforcement, the accepted definition is “correlating a number of distinct data items in order to assess how close a person comes to a predetermined characterisation or model of infraction”. In relation to the marketing industry, it is more about analysing responsiveness to advertising, and the frequency and value of purchases, to select possible customers.
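To make the idea of assessing "how close a person comes" to a profile more concrete, here is a minimal Python sketch using the marketing case. It is only an illustration and is not taken from Clarke's article: the attribute names, the simple "fraction of items matched" score, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    # Hypothetical data items gathered about the individual.
    attributes: dict = field(default_factory=dict)


def closeness(person, profile):
    """Return the fraction of profile items the person's data matches (0.0 to 1.0)."""
    if not profile:
        return 0.0
    hits = sum(
        1 for key, wanted in profile.items()
        if person.attributes.get(key) == wanted
    )
    return hits / len(profile)


def shortlist(people, profile, threshold):
    """List (name, score) pairs at or above the threshold, best match first."""
    scored = [(p.name, closeness(p, profile)) for p in people]
    selected = [pair for pair in scored if pair[1] >= threshold]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical profile of a "likely customer" (assumed for illustration only).
profile = {
    "opened_last_email": True,
    "bought_recently": True,
    "in_delivery_area": True,
}

people = [
    Person("A", {"opened_last_email": True, "bought_recently": True, "in_delivery_area": True}),
    Person("B", {"opened_last_email": True, "bought_recently": False, "in_delivery_area": True}),
    Person("C", {}),
]

possible_customers = shortlist(people, profile, threshold=0.5)
```

Even in this toy version, the person being scored has no say in which data items are chosen or where the threshold sits, which is part of why regulation becomes a concern.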

In addition to its use in crime prevention and marketing, Clarke identifies other ways that profiling could be used, such as determining the likely skills of students, assessing the propensity of individuals to attempt suicide, identifying patients likely to be suffering from particular diseases or disorders, and matching employers and employees with particular skills.

The process of profiling is said to follow the same set of steps whenever it is used: describe the class of person and instances of where that class is located, then use existing experience to define a profile of that class of person (based in part on informal information, and references to related knowledge about particular fields, as well as the direction of the organisation doing the profiling).


The digital persona and its application to data surveillance

Summary: Explains the concept of the digital persona and how it is useful for developing an understanding of the behaviour of the world as a network. Also looks at where the concept originated and shows its use in real-world applications. Continues by highlighting the threat of the digital persona in relation to data surveillance.

Media: The Information Society – 26 April, 2010

Author: Roger Clarke

The digital persona is the construction of an individual’s public profile formed from a cluster of related data, and intended to be used as a proxy or avatar for the individual in a network environment. Peculiarly, an individual can have more than one digital persona, as the power to collate the individual’s data is not limited to the individual, and other people or organisations can create their own digital persona of the individual.

It is important to recognise the difference between a projected persona and an imposed persona – the former being determined by the individual and somewhat controllable, and the latter created by outside parties and harder to influence.

Digital personas are used in two methods of data surveillance: data matching and profiling. Data matching brings together otherwise isolated pieces of information about an individual to detect errors, abuses of systems, and fraud. However, it also puts at risk the privacy of everyone whose data is in the system, and can shift the balance of power between the individual and the organisation collecting the data.
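As a rough illustration of data matching, the Python sketch below compares records held about the same people in two separate systems and flags the fields that disagree. It is a simplified assumption of how such a comparison might work, not a description of any real system: the record layout, the identifiers, and the field names are all hypothetical.

```python
def match_records(system_a, system_b, fields):
    """Return {person_id: [fields that disagree]} for people present in both systems."""
    discrepancies = {}
    # Only people who appear in both data sets can be matched.
    for person_id in system_a.keys() & system_b.keys():
        differing = [
            f for f in fields
            if system_a[person_id].get(f) != system_b[person_id].get(f)
        ]
        if differing:
            discrepancies[person_id] = differing
    return discrepancies


# Hypothetical records held by two different organisations.
benefits_office = {
    "p1": {"address": "1 Elm St", "employed": False},
    "p2": {"address": "9 Oak Ave", "employed": True},
}
tax_office = {
    "p1": {"address": "1 Elm St", "employed": True},
    "p2": {"address": "9 Oak Ave", "employed": True},
    "p3": {"address": "4 Ash Rd", "employed": True},
}

flagged = match_records(benefits_office, tax_office, ["address", "employed"])
```

Note that every record in both systems is read in order to flag a single inconsistency, which illustrates why the privacy of everyone in the system is affected and not just that of the people who end up flagged.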