Irish Census 2016 & Privacy

I’ve been looking at the 2016 census results with the last few years and there is a great deal of suppression of values for relevant Small Areas. The CSO suppress results or aggregate them depending on the number of people living in a Small Area. If the population is too small and could lead to individuals being identified, the data is suppressed. They are legally required to undertake this exercise under s33 of the Statistics Act, 1993.

I’ve been looking at a selection of variables and after reading this piece on the traveller accommodation crisis by RTÉ I decided to map the percentage travellers per Small Area. I have all this data in a PostGIS database but I’ll quickly run through how to do it without having to use PostGIS. I downloaded the Small Areas shapefile (generalised to 50m) and the CSV of all of the Small Area values from the CSO here. Instead of having to use a spreadsheet or QGIS to manually delete the 802 fields I didn’t need I used the pandas library, the python code below that took 0.3 seconds to run. It opens the relevant CSV and only selects the columns that I need and then strips the first 7 characters from the ‘GEOGID’ string as these are not needed for the join I’ll do in QGIS later.

import pandas as pd, time

start = time.time()
df = pd.read_csv('SAPS2016_SA2017.csv', usecols=['GUID', 'GEOGID', 'GEOGDESC', 'T1_1AGETT','T2_2WIT'])

df.GEOGID.apply(str)

df['GEOGID'] = df['GEOGID'].str[7:]

df.to_csv('SAPS2016_SA2017_New_GEOGID.csv')
end = time.time()
print(end - start)

I then opened the shapefile in QGIS, imported the CSV and joined them. This was  subsequently exported to a GeoPackage and I used GDAL’s ogr2ogr library to convert it to a GeoJSON in order to upload it to Carto.

ogr2ogr command to convert to GeoJSON
ogr2ogr command to convert to GeoJSON

Below is the resultant map with some formatting of headings undertaken to make it more legible. You can make it full screen using the button on the left. What struck me about this was how with a small amount of work it was very easy to visualise accurately the resident locations of one of the most vulnerable groups of society. Obviously this information is useful to local governments, state government agencies, NGOs and so forth but I question whether this data should been available to the general public regardless of it being aggregated to the Small Area geography.