Big Data Capabilities and Citizen Glitching

by Dan McQuillan


Big data has followed the web out of the accelerator tunnels. When I was a particle physicist in the late 1980s, the data flowing out of detectors was a mere one or two Mb/s[0]. Now the Large Hadron Collider produces at least 1 GB/s[1], while self-generated personal data flows into Facebook data centres at a similar rate. And on this journey out of the superconducting dark, big data has (like software) acquired a dimension called 'open'.

These days the big open data movement slurries through the streets like a mudslide, swirling repetitions of hopeful intention seeping over the sandbags of criticality. Big open data will bring transparency, accountability and democracy, and will sweep rigid institutions and governmental structures into line.

Perhaps the institutions of power have not been hypnotised by open data. Perhaps they are happy to ride the wave for political advantage. In the UK, government open data could be a vital lubricant for civic outsourcing; part of a privatisation API that slots the Sercos and G4Ss neatly into place[2]. Selling off non-anonymised data from the UK's National Pupil Database is only the start[3].

Not that the idea of big open data for good has gone unvoiced. The UN's Global Pulse initiative has tried to harness big digital data and real-time analytics, asking “How do you find indicators for changes in human well-being in big data? How do you know which digital signals are relevant enough to warrant further investigation?”, and hoping to find answers through partnerships with data research companies and research centres[4].
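To make that question less abstract, here is a minimal sketch (in Python, with synthetic numbers, and emphatically not Global Pulse's actual pipeline) of what 'finding a digital signal' can look like: flag the days when a proxy indicator jumps well clear of its recent baseline. Whether such a spike says anything about human wellbeing is exactly the judgement that cannot be automated away.

```python
# A toy illustration, not Global Pulse's method: flag candidate "digital
# signals" as days where a proxy indicator deviates sharply from its
# recent rolling baseline. All data here are synthetic.
import pandas as pd
import numpy as np

rng = np.random.default_rng(0)
days = pd.date_range("2012-01-01", periods=180, freq="D")
# Hypothetical proxy: daily count of posts mentioning food prices.
counts = pd.Series(rng.poisson(120, size=180), index=days)
counts.iloc[150:160] += 80   # injected spike standing in for a real-world event

baseline = counts.rolling(window=28, min_periods=14).mean()
spread = counts.rolling(window=28, min_periods=14).std()
zscore = (counts - baseline) / spread

# Flag days more than three deviations above the recent baseline.
candidate_signals = counts[zscore > 3]
print(candidate_signals)
```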

But the current data ecosystem lacks diversity, especially in the capabilities of citizen actors. Sen & Nussbaum's capability theory sees wellbeing as based on a set of functionings (‘beings and doings’) that we have reason to value – a view that they trace back to the Aristotelian notion of flourishing[5]. The negative freedom of open data (“we won't stop you using it”) needs to be superseded by the positive and substantive freedom of being able to use big open data to enrich the lives of people and communities.

There are some worthy initiatives trying to fill the capability gap, such as the School of Data[6], although the current beneficiaries tend to be from disciplines trying to update themselves (journalists, social scientists). More fundamentally, the exercise of capabilities is based on the ability “to choose from possible livings”. It requires a critical understanding of the present and the development of an Imaginary about possible futures.

A citizen capability approach to big open data needs a critical pedagogy that fits with technological forms of life; a combination of critical peer learning and rapid prototyping that can be called Critical Hacktivism[7]. The gains of open data are not to be found only in statistical correlations but in the critical engagement of participants in examining and questioning what represents their world inside the data machine, and having the ability to intervene on behalf of their preferred futures.  

As people engage with the data, they will encounter its obstinacy and material resistance. A data scientist knows that the bulk of the work is beneath the surface – cleaning and purifying the data ready for analysis and visualisation. But these glitches can also be heuristic; they can surface questions about the way the categories are constructed (“are the causes of my problems really captured by the category of Troubled Family?”[8]). The exercise of separating indivisible lived experience into suitable data objects becomes political, and the next logical step is to create data that is meaningful to us as citizens, that has value to us because it is part of a process of achieving wellbeing. This is where data science meets citizen science.
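As a concrete illustration of how the cleaning stage can turn heuristic, the sketch below (Python, with invented records and an invented eligibility rule standing in for whatever criteria the real programme uses) normalises messy labels and then compares them against the rule that supposedly defines the category; the rows where they disagree are where the political questions start.

```python
# A minimal sketch of the unglamorous cleaning stage, using invented records.
# The interesting part is not the tidying itself but what it exposes: the
# category boundary ("troubled family", here a crude rule over three fields)
# is a decision someone made, and it becomes visible the moment you have to
# reconcile messy labels against it.
import pandas as pd

raw = pd.DataFrame({
    "household": ["H1", "H2", "H3", "H4"],
    "label": ["Troubled Family", "troubled_family", "none", None],
    "unemployed_adult": [True, True, False, True],
    "school_absence": [True, False, True, True],
    "asb_record": [False, True, False, True],   # anti-social behaviour flag
})

# Typical cleaning: normalise inconsistent labels before any analysis.
clean_label = (raw["label"].fillna("unknown")
                           .str.lower()
                           .str.replace(" ", "_", regex=False))

# The official-style category reconstructed as an explicit rule (an assumption
# made up for illustration, not the programme's actual criteria).
derived = (raw["unemployed_adult"] & raw["school_absence"]) | raw["asb_record"]

# Where recorded label and derived rule disagree is exactly where the question
# "does this category capture my situation?" becomes concrete.
print(pd.DataFrame({"recorded": clean_label, "rule_says": derived}))
```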

Participatory citizen science combines techniques of data analysis and mapping with a community development methodology, enabled by the affordances of technological innovation. A citizen science project conducted by Mapping for Change in Deptford developed a methodology for collecting noise measurements with cheap, hand-held devices that the residents of the Pepys Housing Estate could use to create an online map of noise pollution in the area, as part of their campaign against an unpopular local scrapyard. At a public meeting, the community were able to present the authorities with the evidence. After professional acousticians carried out a survey that largely confirmed the results of the residents' study, the Environment Agency revoked the licence for the scrapyard[9].
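The data-handling side of such a project can be disarmingly simple. The sketch below (invented readings, a crude assumption about grid size, and a naive way of averaging decibels) shows the kind of binning that turns hand-held measurements into something a web map can colour in.

```python
# A minimal sketch, with invented readings, of turning hand-held noise
# measurements into something map-ready: snap each reading to a coarse
# grid cell and summarise the level per cell. Real projects (including the
# Pepys study) involve calibration, time-of-day effects and far more care.
import pandas as pd

readings = pd.DataFrame({
    "lat": [51.4861, 51.4863, 51.4872, 51.4874, 51.4875],
    "lon": [-0.0302, -0.0299, -0.0310, -0.0311, -0.0308],
    "db":  [68.0, 72.5, 81.0, 84.5, 79.0],   # hand-held decibel readings
})

# Snap coordinates to roughly 100 m cells (0.001 degrees, a rough assumption).
readings["cell_lat"] = (readings["lat"] / 0.001).round() * 0.001
readings["cell_lon"] = (readings["lon"] / 0.001).round() * 0.001

# Naive arithmetic mean of dB; proper acoustics would average sound energy.
cells = (readings.groupby(["cell_lat", "cell_lon"])["db"]
                 .agg(["mean", "max", "count"]))
print(cells)   # each row is one grid cell that could be coloured on the map
```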

Citizen science scales to big data when it meets the Internet of Things. Post-Fukushima projects like Safecast[10] are starting to generate large quantities of citizen-powered radiation measurements. The infrastructure is growing in the form of projects like Cosm/Pachube[11], and the Public Laboratory has successfully crowdfunded a DIY spectrometry kit[12].
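Once such streams exist, they arrive as ordinary tabular data that anyone with basic tooling can interrogate. The sketch below assumes a hypothetical CSV export with made-up column names (it is not Safecast's actual schema or API) and runs the kind of sanity checks a citizen analyst might try before mapping anything.

```python
# A minimal sketch of working with a crowd-sourced sensor export. The file
# name and column names ("captured_at", "latitude", "longitude", "cpm") are
# assumptions for illustration only, not Safecast's actual schema.
import pandas as pd

# Hypothetical export: timestamp, latitude, longitude, reading in counts/min.
measurements = pd.read_csv("radiation_export.csv",
                           parse_dates=["captured_at"])

# Simple checks a citizen analyst might run before mapping anything:
# keep recent readings, bin them coarsely, and look at the median per area.
recent = measurements[measurements["captured_at"] > "2012-01-01"]
by_area = (recent.assign(lat_bin=recent["latitude"].round(1),
                         lon_bin=recent["longitude"].round(1))
                 .groupby(["lat_bin", "lon_bin"])["cpm"]
                 .median())

print(by_area.sort_values(ascending=False).head(10))
```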

What happens when citizen data meets big data? If citizen capabilities have been informed by a critical pedagogy, we can expect something like the approach of the Counter-Cartographies Collective to mapping data: 

“One big point of discussion was how to deal with the embedded biopolitics behind data sources like US Census data that we use in our maps — as 3Cs, we often talk about how we ‘queer’ data or statistics by pulling map stories out of them that they weren’t intended for.”[13]
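What queering a dataset might look like at the level of code is unglamorous: recombining official columns into a measure the table was never designed to report. The sketch below uses invented census-style rows and a made-up 'rent burden' reading, purely as an illustration of pulling a story out of the data that it wasn't intended for.

```python
# A minimal sketch of "queering" an official table: invented census-style
# rows recombined into a ratio the table was never built to publish.
# Both the data and the derived measure are assumptions for illustration.
import pandas as pd

census = pd.DataFrame({
    "tract": ["001", "002", "003"],
    "renter_households": [820, 310, 560],
    "owner_households": [180, 690, 440],
    "median_rent": [1150, 950, 1250],
    "median_household_income": [28000, 61000, 35000],
})

# A story the table wasn't designed to tell: what share of income the median
# rent would absorb, alongside how many households actually rent.
census["rent_burden"] = (census["median_rent"] * 12
                         / census["median_household_income"])
census["renter_share"] = (census["renter_households"]
                          / (census["renter_households"]
                             + census["owner_households"]))

print(census[["tract", "rent_burden", "renter_share"]]
      .sort_values("rent_burden", ascending=False))
```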

In this way, citizen data could bring the New Aesthetic[14] to big data; a glitching that reveals the computational assumptions behind our databased world, a hacking away at the invisible voxels of power that striate society, an R[15]-powered interruption that returns a capability to the collective citizenry.

[0] The ZEUS detector and Data Storage at ZEUS

[1] CASTOR2 rises to LHC's data storage challenge http://cerncourier.com/cws/article/cern/31529

[2] Outsource to easyCouncil? Not in our name, says Barnet http://www.guardian.co.uk/commentisfree/2012/nov/11/local-government-democracy-outsourcing-barnet

[3] Opening the National Pupil Database? http://www.timdavies.org.uk/2012/11/12/opening-the-national-pupil-database

[4] Video from Global Pulse’s 8 November briefing to the UN General Assembly http://www.unglobalpulse.org/ResearchAndBigDataVideos

[5] Technology as empowerment: a capability approach to computer ethics, by Justine Johnstone, Science and Technology Policy Research, Freeman Centre, University of Sussex http://eccleethics.wikispaces.com/Technology+as+empowerment+a+capability+approach+to+computer+ethics+by+Justine+Johnstone

[6] School of Data - Learn how to find, process, analyze and visualize data http://schoolofdata.org/

[7] Critical Hacktivism, by Dan McQuillan, Internet.Artizans blog http://www.internetartizans.co.uk/critical_hacktivism

[8] Troubled families, Department for Communities and Local Government http://communities.gov.uk/communities/troubledfamilies/

[9] Scientists and Citizens, Chinadialogue http://www.chinadialogue.net/article/show/single/en/4782-Scientists-and-citizens

[10] Safecast global sensor network - “for collecting and sharing radiation measurements to empower people with data about their environments” http://blog.safecast.org/

[11] Cosm platform, API and community – How it works https://cosm.com/how_it_works

[12] Public Lab DIY Spectrometry Kit, on Kickstarter http://www.kickstarter.com/projects/jywarren/public-lab-diy-spectrometry-kit

[13] 3Cs in Chicago (part 1), Counter-Cartographies Collective http://countercartographies.wordpress.com/2010/02/13/3cs-in-chicago-part-1/

[14] The New Aesthetic: Waving at the Machines, a talk by James Bridle http://booktwo.org/notebook/waving-at-machines/

[15] RStudio - Take control of your R code http://www.rstudio.com/ide/