Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. Share, remix, reuse – just do it for fun, for profit and for the public good… Once the data is liberated, good things will follow! Alas, some Cassandras beg to differ.

Can the output of a process based entirely on publicly available data be considered unfit for public availability? As Marek Mahut explains in “The danger of transparency: A lesson from Slovakia”, the answer is ‘yes’ according to a court in Bratislava, which ordered the immediate censorship of some information produced by an application whose input is composed entirely of publicly available data.

As a French citizen, I’m not surprised – for more than thirty years, our law has recognized that merging data sources is a danger to privacy.

I was prepared to translate the relevant section of the original French text of “Act N°78-17 of 6 January 1978 on data processing, data files and individual liberties” for you… But in its great benevolence, my government has kindly provided an official translation – so I’ll use that… Here is the relevant extract:

Chapter IV, Section 2 : Authorisation
Article 25
I. – The following may be carried out after authorisation by the “Commission nationale de l’informatique et des libertés”, with the exception of those mentioned in Articles 26 (State security and criminal offences processing) and 27 (public processing NIR, i.e. social security number – State biometrics – census – e-government online services):
[..]
5° automatic processing whose purpose is:
– the combination of files of one or several legal entities who manage a public service and whose purposes relate to different public interests;
– the combination of other entities’ files of which the main purposes are different.

Short version: if you want to join data from two isolated sources, you must first request and receive authorization, on a case-by-case basis.

That law only applies to personal data, which it defines (Chapter I, Article 2) as ‘any information relating to a natural person who is or can be identified, directly or indirectly’. That last word opens a big can of worms: data de-anonymization techniques have shown that, with sufficient detail, anonymous data can be linked back to individuals. With that knowledge, one may consider that the whole Open Data movement falls in the shadow of that law.
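
To make the ‘indirectly’ part concrete, here is a minimal sketch of how joining two sources can re-identify people. Everything in it is an assumption made up for illustration: the two datasets, the field names (`postcode`, `birth_year`, `sex`), the placeholder names and the `link` helper are hypothetical, and real de-anonymization work is considerably more subtle. The idea is simply that a record with no name attached becomes personal data as soon as its remaining attributes match exactly one person in another file.

```python
from collections import defaultdict

# Hypothetical "anonymized" file: names removed, quasi-identifiers kept.
anonymized = [
    {"postcode": "75011", "birth_year": 1975, "sex": "F", "sensitive": "record A"},
    {"postcode": "75011", "birth_year": 1980, "sex": "M", "sensitive": "record B"},
    {"postcode": "69003", "birth_year": 1962, "sex": "M", "sensitive": "record C"},
]

# Hypothetical public register: names plus the same quasi-identifiers.
public_register = [
    {"name": "Person 1", "postcode": "75011", "birth_year": 1975, "sex": "F"},
    {"name": "Person 2", "postcode": "75011", "birth_year": 1980, "sex": "M"},
    {"name": "Person 3", "postcode": "75011", "birth_year": 1980, "sex": "M"},
    {"name": "Person 4", "postcode": "13001", "birth_year": 1990, "sex": "F"},
]

# Fields shared by both files (an assumption of this sketch).
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def key(row):
    """Build the join key from the quasi-identifier fields."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def link(anonymized_rows, register_rows):
    """Return {name: sensitive value} for rows matching exactly one person."""
    names_by_key = defaultdict(list)
    for person in register_rows:
        names_by_key[key(person)].append(person["name"])

    reidentified = {}
    for row in anonymized_rows:
        candidates = names_by_key.get(key(row), [])
        # Only an unambiguous match counts as a re-identification here.
        if len(candidates) == 1:
            reidentified[candidates[0]] = row["sensitive"]
    return reidentified


linked = link(anonymized, public_register)
print(linked)
```

In this toy data, only the first anonymized row is linked to a name: the second matches two people in the register and the third matches nobody. The more detailed the shared fields, the more rows end up with a single candidate, which is exactly the kind of file combination that Article 25 subjects to authorisation.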

To my knowledge, this question has not yet been brought before a court, so there is no case law to guide us… But it is only a matter of time – watch this space!