The development agenda is heralding a new cure-all for humanitarian and development challenges - data.
In the latest incarnation of the development world's dominant paradigm, ICTs for Development, data is being embraced, analysed and monitored by companies, humanitarian organisations, aid donors and governments alike. Yet despite the promises of data evangelists that big and open data can revolutionise innovation, education, health care and infrastructure, the potential risks of data - exclusion, discrimination, identification, persecution, and violations of the right to privacy - bear serious consideration. Without critical analysis and legal oversight, data could become the new conflict resource, causing and sustaining human rights violations.
International agencies and organisations such as UN Global Pulse, UN Economic Commission for Latin America and the Caribbean, OECD, and the World Economic Forum have all sung the praises of data as a tool to accelerate development, reduce poverty, spur innovation, and improve accountability and transparency. A recent report of the UN High Level Panel on the Post-2015 Development Agenda went so far as to call for "a New Data Revolution", drawing on existing and new sources of data "to fully integrate statistics into decision making, promote open access to, and use of, data and ensure increased support for statistical systems."
The Data Evangelist movement
For data evangelists, two interrelated forms of data promise to revolutionise the development and humanitarian agendas. Open data implies the digitisation, publication and reproduction of data to enable it to be freely used, reused and redistributed. The data that is being made open is primarily that created by the public sector: maps, traffic, crime, travel schedules, government contracts and spending, and other data sets in the hands of government can be released and made available openly, i.e. to any individual or institution, so that they can generate services, research, or analyses. The movement is framed as a means of enhancing government accountability and supporting scientific, environmental and social research.
Big data refers to the amassing and analysis of high volumes of digital data in order to uncover new correlations. Due to the rapid growth in the quantity and diversity of data generated by digital activities - call logs, mobile-banking transactions, online user-generated content, online searches, satellite images, etc. - new forms of analysis can be conducted and new correlations uncovered. Algorithms can be applied to develop intelligence on people, groups, events and places. With enough data, the theory goes, we can even try to predict behaviour based on past activities.
Open government or scientific data acts as a starting point and can be fed into big data to generate new forms of information and insight. Both open data and big data are increasingly viewed by governments as key to governance, accountability and innovation - the UK government has positioned itself as a leader on the former, publishing an Open Data White Paper and founding, with seven other states, the Open Government Partnership, while Australia has recently announced its intention to become the world leader in big data with its Big Data Strategy.
A deafening silence
From a perspective that values privacy and a human rights approach to development, big and open data pose a number of challenges.
The tired but apt truth that "knowledge is power" means that the publication of data about public entities can empower citizens and democratise information in a productive way. However, while open data can be analysed to shine a light into government activities in a way that promotes open and democratic societies, governments choose which data gets released and, importantly, which does not. While data about infrastructure may be vital to improving access to services, data that relates - even tangentially - to individuals may be used in ways or to ends that facilitate the publication of personal information. Meanwhile, big data relies on data held by both companies and governments, and these datasets and their resultant analyses are not as likely to be shared openly, generating closed forms of intelligence.
The data evangelism movement also poses very real risks to the enjoyment of human rights, particularly the right to privacy. The silence within the development and humanitarian worlds on the potential drawbacks of big and open data movements has been deafening, with only a handful of exceptions (notably, Steve Song, Patrick Meier and Linnet Taylor). There have been some half-hearted attempts to bring privacy into the debate (see here, pg 24 and here, pg 39) but a genuine and sustained consideration of the impact of big and open data on privacy and the protection of personal data has been mostly absent from this field.
The threat to privacy rights
The right to privacy implies that individuals have the right to decide whether to engage with society by sharing or exchanging their personal information and to determine on what terms they are prepared to do so. The right to privacy through the protection of personal data is not only an important right in itself, but it is a key element of individual autonomy and dignity.
That is why big and open data present a serious challenge. Digitising data and pairing personal data with multiple other data sources can result in the mosaic effect, in which data elements that appear non-personal or innocuous in isolation are combined to enable the detailed profiling of individuals. Both agendas imagine that personal information is a resource that can be mined and disclosed by organisations without any consideration of the wishes of the individual. Governments may release open data sets containing personal information, even though this information was collected with the understanding that it was necessary for the administration of the state, not for the additional purposes to which it is subjected by anyone who chooses to download it. Organisations mine the personal data they hold to discern individuals' interests, movements, habits and other highly personal characteristics.
Proponents of big and open data argue that their information is anonymised, and the analyses are about the aggregate, not the individual. The serious problems with data anonymisation and the potential for de-anonymisation have been well publicised and continue to plague the big and open data movements, despite assurances by regulators that such risks can be mitigated.
The problems of anonymisation are compounded by the lack of safeguards and standards inherent in data-for-development initiatives. International consensus on data protection standards remains elusive, data protection legislation is still largely absent on the African continent, and few development and humanitarian organisations have self-standing data protection and privacy policies to guide their work in developing countries. As the UN itself admits, "while private-sector organisations and [g]overnment regulators have been grappling with this issue for almost a decade, humanitarian organisations appear further behind."[pg. 40] In the absence of strong legal safeguards and accountability institutions, individuals in developing countries have little recourse against the violation of their privacy.
'Imagine what Muammar Qaddafi would have done with this sort of data'
Data is not context free. Developing countries are also plagued by historical divisions, ethnic conflicts and other social and cultural vulnerabilities that heighten the risk that big and open data will be misused. Discrimination or persecution could easily be the result of de-anonymisation of big data pertaining to, for example, electoral trends, public health issues, political activity or location.
Call and text message records held by the private sector, for example, were used by the Egyptian authorities to track down and convict protesters in the aftermath of anti-government food protests in 2008. The risk of the misuse of personal data is heightened when data is open and thus accessible by anyone for any reason. Even the open digitisation and publication of seemingly banal information can have adverse effects - in Pakistan, for example, the publication of locations of food distribution points and clinics led to threats to aid workers responding to floods. In India, the digitisation of land title records led to increased corruption and facilitated the capturing by very large players of vast quantities of land at a time, thus consolidating inequalities in land ownership. Big data initiatives such as that conducted by Orange in Côte d'Ivoire have shown that even a basic mobile phone traffic data set can enable conclusions about social divisions and segregation on the basis of ethnicity, language, religion or political persuasion.
As Alex Pentland, director of the Human Dynamics Lab at MIT, points out, "imagine what Muammar Qaddafi would have done with this sort of data".
A further consideration is the potential for fetishisation of data and the prospect that data will be misinterpreted or manipulated to support particular viewpoints. As Steve Song points out, the big and open data movements are founded on an assumption that 'data', 'facts' and 'truth' are roughly equivalent. Data can be politicised or misrepresented and yet come to represent an authoritative version of the truth, having serious implications for decision-making that could deeply affect individuals' life choices and futures. Health data pertaining to reproductive health, HIV status or birth control information made available through big data or open government initiatives might be particularly vulnerable to such manipulation.
Each of the challenges highlighted above is significant. As technologies continue to advance and lives - particularly those in developing countries - are increasingly lived online, personal information and data privacy become vital and tangible rights that must be protected. This is particularly the case given the indelible and unretractable nature of big and open data sets.
The question that data evangelists should thus be asking is not what problems can big or open data solve, but what problems might they cause. It is time that the development and humanitarian agendas recognised data as what it is - a new form of conflict resource that has the potential to contribute to, benefit from or result in the commission of serious violations of human rights.
Privacy International will soon be launching a research and advocacy project entitled Aiding Surveillance that will focus on the role of international development, humanitarian and funding organisations in promoting privacy and data protection.