Irish Census 2016 & Privacy

I’ve been looking at the 2016 census results over the last few years and there is a great deal of suppression of values for relevant Small Areas. The CSO suppresses or aggregates results depending on the number of people living in a Small Area. If the population is so small that individuals could be identified, the data is suppressed. They are legally required to undertake this exercise under s33 of the Statistics Act, 1993.

I’ve been looking at a selection of variables and, after reading this piece on the Traveller accommodation crisis by RTÉ, I decided to map the percentage of Travellers per Small Area. I have all this data in a PostGIS database but I’ll quickly run through how to do it without PostGIS. I downloaded the Small Areas shapefile generalised to 50m and the CSV of all of the Small Area values from the CSO here. Instead of using a spreadsheet or QGIS to manually delete the 802 fields I didn’t need, I used the pandas library; the Python code below took 0.3 seconds to run. It opens the relevant CSV, selects only the columns that I need and then strips the first 7 characters from the ‘GEOGID’ string as these are not needed for the join I’ll do in QGIS later.

import time

import pandas as pd

start = time.time()

# Read only the columns that are needed rather than every field in the CSV
df = pd.read_csv('SAPS2016_SA2017.csv', usecols=['GUID', 'GEOGID', 'GEOGDESC', 'T1_1AGETT', 'T2_2WIT'])

# Make sure GEOGID is a string, then strip its first 7 characters
df['GEOGID'] = df['GEOGID'].astype(str).str[7:]

df.to_csv('SAPS2016_SA2017_New_GEOGID.csv')
end = time.time()
print(end - start)

I then opened the shapefile in QGIS, imported the CSV and joined them. I exported the result to a GeoPackage and used GDAL’s ogr2ogr utility to convert it to GeoJSON in order to upload it to Carto.
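For anyone who wants to script the conversion step, here is a minimal sketch that builds an ogr2ogr call from Python. The file names are hypothetical, and the `-t_srs EPSG:4326` reprojection is my assumption about what a web map upload would want rather than the exact command used for this post.

```python
import shlex


def geopackage_to_geojson_command(src_gpkg, dst_geojson):
    """Build an ogr2ogr call that converts a GeoPackage to WGS84 GeoJSON."""
    return [
        "ogr2ogr",
        "-f", "GeoJSON",        # output driver
        "-t_srs", "EPSG:4326",  # assumption: reproject to longitude/latitude
        dst_geojson,            # ogr2ogr takes the destination before the source
        src_gpkg,
    ]


# Hypothetical file names, for illustration only.
command = geopackage_to_geojson_command("small_areas.gpkg", "small_areas.geojson")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Building the command as a list keeps the file names safely quoted if it is later handed to `subprocess.run`.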

ogr2ogr command to convert to GeoJSON

Below is the resultant map with some formatting of the headings to make it more legible. You can make it full screen using the button on the left. What struck me about this was how, with a small amount of work, it was very easy to visualise accurately the resident locations of one of the most vulnerable groups in society. Obviously this information is useful to local governments, state government agencies, NGOs and so forth, but I question whether this data should be available to the general public, regardless of it being aggregated to the Small Area geography.

Ireland’s Social Housing

Housing and all its intricacies have come to dominate the media discourse at home over the last few years. We’ve truly come out the other side of the recession and now the conversation is around the shortage of housing and where that has led us. I’ve been thinking about this recently and in particular social housing. I think that most people assume that we built the majority of our social housing in the 1950s and 1960s. Collectively I think we assume we know when social housing was built but not where. This is where the census data can come in. Part 2 of question H3 in census 2016 asks ‘If renting, who is your landlord?’.

I have all of the census 2016 returns for each geographical unit in a PostGIS database so it was a simple exercise to add the households that rent from a local authority to those that rent from a voluntary/co-operative housing body and divide by the total number of households. One inherent weakness of this method is that it doesn’t capture the social housing tenants that rent from a private landlord.
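The calculation itself is simple enough to sketch in a few lines of plain Python. The function name and the sample figures below are made up for illustration; the real counts come from the census tenure tables.

```python
def social_housing_percentage(local_authority, voluntary_body, total_households):
    """Share of households renting from a local authority or a voluntary body."""
    if total_households <= 0:
        return None  # nothing to divide by, e.g. an area with no households
    return round((local_authority + voluntary_body) / total_households * 100, 1)


# Made-up counts: (local authority renters, voluntary body renters, all households)
areas = {
    "Area A": (40, 10, 200),
    "Area B": (0, 0, 150),
    "Area C": (0, 0, 0),
}

for name, counts in areas.items():
    print(name, social_housing_percentage(*counts))
```

Returning `None` for an empty area keeps it out of the choropleth classes rather than showing it as 0%.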

Ireland Census 2016-Question H3

I added a new column in PostGIS for the percentage of social housing and then symbolised this in QGIS. I used QGIS’s powerful Atlas generation tool. You’ll have to excuse the basemap; I’m aware that it’s a bit difficult to discern, but in the interest of producing this entirely with open-source software I used OpenStreetMap as the basemap.

The next step will be to take the top-ten counties and use the Global Human Settlement Layer as a base to give an approximate indication of what epoch they were built in.

North Atlantic Sea Surface Temperature

This week I was trying to recreate Joshua Stevens’s Commanding Cartography. He presented this at NACIS 2018 and I was keen to give it a go. As I use Windows 10, the first step for me was to install Windows Subsystem for Linux and then install GDAL. I then used Wget to download the month of April from the United States’ National Oceanic and Atmospheric Administration’s Coral Reef Watch as NetCDF4 files. Earlier in the week I ran through his presentation again to see if I encountered any problems trying to run through the steps on a single NetCDF4 file. The steps were as follows:

  1. Convert the NetCDF4 files to tifs.
  2. Crop the tifs to the area of interest (North Atlantic Ocean).
  3. Reproject to Albers Equal Area.
  4. Apply a colour palette to the image.
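As a rough sketch, the four steps above map onto four GDAL commands. The snippet below only builds and prints them. The Albers string is the one used later in this post; the NetCDF variable name, the crop extent, the file names and the palette file are all assumptions on my part and would need checking against the real data.

```python
import shlex

# Albers Equal Area definition used for the North Atlantic in this post.
ALBERS = "+proj=aea +lat_1=5.101266605156489 +lat_2=56.35029964751531 +lon_0=-28.125"


def sst_pipeline(nc_file, variable, extent, palette):
    """Return the GDAL commands for one daily file as lists of arguments."""
    stem = nc_file.rsplit(".", 1)[0]
    xmin, ymin, xmax, ymax = (str(value) for value in extent)
    return [
        # 1. Convert the NetCDF4 variable to a tif
        ["gdal_translate", f'NETCDF:"{nc_file}":{variable}', f"{stem}.tif"],
        # 2. Crop to the area of interest (extent in longitude/latitude)
        ["gdalwarp", "-te", xmin, ymin, xmax, ymax, f"{stem}.tif", f"{stem}_Cropped.tif"],
        # 3. Reproject to Albers Equal Area
        ["gdalwarp", "-t_srs", ALBERS, f"{stem}_Cropped.tif", f"{stem}_Projected.tif"],
        # 4. Apply a colour palette from a text file
        ["gdaldem", "color-relief", f"{stem}_Projected.tif", palette, f"{stem}_Colour.tif"],
    ]


# Hypothetical inputs: file name, variable name, extent and palette are placeholders.
steps = sst_pipeline("sst_20190401.nc", "analysed_sst", (-100, 0, 30, 70), "palette.txt")
for step in steps:
    print(shlex.join(step))
```

Each list can be passed to `subprocess.run(step, check=True)` once GDAL is installed.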

The only problems I encountered were trying to find the same data that Joshua used, and getting the projection type right for GDAL. I think I found the right download source by going to the daily NetCDF4 data (as below) and downloading the ‘SST’ data.

NOAA’s Website-Download Type

After some research, trial and error I reprojected the data to the Albers Equal Area projection with help from the brilliant Projection Wizard website. I selected the area of interest and copied the relevant proj4 string.

projectionwizard.org

I created the first image of the sea surface temperature for April and posted it on Twitter during the week:

Next, I wanted to create a GIF for the entire month of April. I followed Joshua’s presentation again and used a series of bash for-loops. Following his example, I used ImageMagick from the command line to convert and resize my final tifs and then create the GIF. This is shown below along with my for-loop that worked to reproject all 30 daily temperature tifs to Albers.

My planned next step will be to use the gravity setting in ImageMagick to annotate each of the individual tifs with their date taken as a variable from the filename. This is so the GIF will show each date as it cycles through. I’m not 100% sure once I resize etc. whether this will actually work but I’m hoping to find out.
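Here is a small, untested-against-real-data sketch of how that annotation step might look. It assumes each filename contains the date as eight digits (YYYYMMDD), which I haven’t confirmed for these files, and the gravity, point size, colour and offset values are placeholders.

```python
import re
import shlex
from datetime import datetime


def date_label(filename):
    """Pull an eight-digit date out of a filename and format it for display."""
    match = re.search(r"(\d{8})", filename)
    if match is None:
        raise ValueError("No YYYYMMDD date found in " + filename)
    return datetime.strptime(match.group(1), "%Y%m%d").strftime("%d %B %Y")


def annotate_command(src, dst):
    """Build an ImageMagick call that stamps the date along the bottom edge."""
    return [
        "convert", src,
        "-gravity", "South",     # anchor the text to the bottom centre
        "-pointsize", "36",
        "-fill", "white",
        "-annotate", "+0+20",    # nudge the label up from the edge
        date_label(src),
        dst,
    ]


# Hypothetical filename, for illustration only.
example = annotate_command("sst_20190401_Projected.tif", "sst_20190401_Labelled.tif")
print(shlex.join(example))
```

Annotating after the resize would keep the text size consistent across frames, though that ordering is a guess until I try it.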

for f in *.tif; do
	gdalwarp -t_srs "+proj=aea +lat_1=5.101266605156489 +lat_2=56.35029964751531 +lon_0=-28.125" "$f" "${f%.*}_Projected.tif"
done

Final GIF-North Atlantic Ocean Sea Surface Temperature-April ’19

Australia-Durack Electoral Division

Like a lot of people, I spent a great deal of time following the 2019 federal election results. I was (and still am) very impressed with the Australian Electoral Commission’s Tally Room, where results are easily available and downloadable. While browsing their site I came across the Western Australian federal seat of Durack. What piqued my interest is its stated area of 1,629,858km². Its wiki page states that it’s the largest electoral division in the world that practices compulsory voting. The Guardian have a good article about it which contains a graph that compares it in size to different countries in the world.

I decided to spend some of my weekend making a map of it. I downloaded the dataset from the Australian Electoral Commission’s website and the country admin data and hillshade from the brilliant Natural Earth. Below is the result, feel free to use it as you’d like.

Durack Electoral Division, WA-Largest Electoral Division in the world that practices compulsory voting.

New Cork City Boundary

On the 1st of June Cork City’s boundary will change and the official city area will become almost five times larger; it will encompass Ballincollig, Blarney, Douglas, Glanmire and Rochestown. Its population will increase by 85,000 people. To put that in perspective, it’s 4,796 more people than the population of Ireland’s two largest towns, Drogheda (Census ’16 population was 40,956) and Swords (Census ’16 population was 39,248) combined. The council has launched an interactive web-map to show the new boundary that will come into force. It can be found here. The council has also produced a PDF of the new boundary which is shown below and the original can be accessed here.

New Cork City Boundary

Ireland-Census 2016

I’ve had to work recently on an older Linux-based machine and as such most of my usual routes to edit and display data aren’t available to me. I needed to perform a join between the Small Areas geometry and the Small Areas table, both of which are available from the CSO’s website here. Even though the CSV only has ~18,000 rows, the field calculator in QGIS 3.2 Bonn couldn’t cope and kept crashing.

Enter Python to the rescue. I downloaded the Geany IDE, which I find to be nice and lightweight for older computers. I needed to remove the first 7 characters from the ‘GEOGID’ field. All of the values in this column started with ‘SA2017_’. The following is a quick few lines in Python 2 to remove the first 7 characters using Python’s built-in csv module. For reference, on this very average laptop from 2011 it took 3 seconds to run.

import csv

# Python 2: the csv module expects files opened in binary mode
with open('SAPS2016_SA2017.csv', 'rb') as input_file, open('output.csv', 'wb') as output_file:
    reader, writer = csv.reader(input_file), csv.writer(output_file)
    # Copy the header row and add a name for the new column
    first_row = reader.next()
    first_row.append("Strp_GeogID")
    writer.writerow(first_row)
    for row in reader:
        # GEOGID is the second column; drop its 'SA2017_' prefix
        item_to_change = row[1]
        modified_item = item_to_change[7:]
        row.append(modified_item)
        writer.writerow(row)
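For anyone on Python 3, a rough equivalent is sketched below. The function name is my own, the UTF-8 encoding is an assumption about the CSO file, and the demonstration rows are made up so that the snippet runs without the real download.

```python
import csv
import os
import tempfile


def add_stripped_geogid(src_path, dst_path, prefix_length=7):
    """Copy a CSV, appending a column holding GEOGID without its prefix."""
    # Python 3: open in text mode with newline="" so csv handles line endings
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader, writer = csv.reader(src), csv.writer(dst)
        header = next(reader)  # reader.next() became next(reader)
        writer.writerow(header + ["Strp_GeogID"])
        for row in reader:
            # GEOGID is taken to be the second column, as in the script above
            writer.writerow(row + [row[1][prefix_length:]])


# Small demonstration with made-up rows in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    src_file = os.path.join(folder, "in.csv")
    dst_file = os.path.join(folder, "out.csv")
    with open(src_file, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([["GUID", "GEOGID"], ["abc", "SA2017_017001001"]])
    add_stripped_geogid(src_file, dst_file)
    with open(dst_file, newline="", encoding="utf-8") as handle:
        result = list(csv.reader(handle))

print(result)
```

For the real file the call would be `add_stripped_geogid('SAPS2016_SA2017.csv', 'output.csv')`.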

Perth, Australia

Australia passed the 25 million people mark shortly after 11pm on the 7th of August 2018. This got me thinking, what would a map of Perth look like showing each nationality? Over 28% of Australians were born abroad, what would this translate to in Perth terms?

I took a quick look online to see if anything already existed; the only thing I could find is the below from Perth’s Wikipedia page:

One Dot per 100 persons, Perth, Wikipedia

It’s from 2008 and although a gallant effort, there are a few major problems, most notably the lack of a legend. So I decided to see if I could make something, if not better, then at least as good as the above.

My first job was to source the data. I knew from previously working with ABS data that their pre-built geopackages or datapacks wouldn’t contain the data I needed (question 12 from census ’16), but the geopackages were useful for downloading the geometry that I needed.

Question 12

I needed to use the Tablebuilder in order to collate the data that I needed for the geometry that I was going to use. This was the main learning area for me; I didn’t know enough about which unit of statistical geography I wanted to use for this exercise. Luckily, the ABS have a website where you can compare and contrast each unit.

The ABS already had the hard work done in that one of their statistical units is ‘Greater Perth’. I used this as my boundary and then chose the SA2 as the statistical unit. I went back to Tablebuilder and tried in vain to make sense of it; I found it very cumbersome and non-intuitive to use at the start and their introductory videos weren’t of any help. Fortunately, I found an amazing video on YouTube that explained Tablebuilder in great detail and once I’d watched that everything made sense, and I’m a Tablebuilder convert now!

I then used Tablebuilder to build the exact statistics that I needed (Country of Birth by SA2). I saved the table in Tablebuilder and downloaded it as a CSV file. In QGIS I then joined this with the SA2 geopackage file for WA and clipped it using the Greater Perth boundary that I had also downloaded. I then exported this layer as a new geopackage. I had previously found the top 8 nationalities by country of birth (using Tablebuilder) and then created new fields for each one where each number represented 200 persons born in that country. I then used the Random Points Inside Polygons tool to create random points for each nationality.
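To show what the dot calculation and the random placement amount to, here is a minimal plain-Python sketch. It is not what QGIS does internally: the rejection-sampling approach, the rounding down to whole dots, the function names and the L-shaped test polygon are all my own assumptions for illustration.

```python
import random


def dots_needed(persons, persons_per_dot=200):
    """Number of whole dots for a count, at one dot per 200 persons."""
    return persons // persons_per_dot


def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside a polygon given as (x, y) vertices?"""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y):  # this edge straddles the horizontal ray
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def random_points_in_polygon(polygon, count, rng=None):
    """Draw points in the bounding box and keep those that fall inside."""
    rng = rng or random.Random()
    xs = [vertex[0] for vertex in polygon]
    ys = [vertex[1] for vertex in polygon]
    points = []
    while len(points) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Made-up L-shaped area with 1,250 residents born in one country.
area = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
dots = random_points_in_polygon(area, dots_needed(1250), random.Random(42))
print(len(dots), "dots placed")
```

Because every dot is placed independently, dots for different countries can land on top of one another.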

Generate Random Points Inside Polygons using QGIS

I then used Adobe Color [sic] to pick a decent colour scheme for the various dots. I used Quick OSM in QGIS to download a layer with the towns in Greater Perth to be used for reference. This took about 10 seconds to do; Quick OSM is really useful.

Quick OSM in QGIS

Lastly, I used Google Fonts to download some nice fonts. I also used some styling effects in QGIS before I exported everything to Inkscape in order to add the text. Below is the finished product. The biggest flaw in what I have done is that there are overlapping points, but I still think it gives a good overall understanding of where people of different nationalities live in Greater Perth.


Excel and Removing Columns

I had a situation today where I had a spreadsheet that contained hundreds of columns. I only needed five or so of these and I didn’t fancy going through them one-by-one to delete the unnecessary ones. I found the below snippet of VBA on Stack Overflow. The code ran almost instantaneously and deleted all columns that I didn’t need. The country names are the columns that I needed to keep. I didn’t need the part of the code that checks whether a heading contains the string ‘Homer’, so I left it unchanged.

Sub deleteIrrelevantColumns()
    Dim currentColumn As Integer
    Dim columnHeading As String

    ActiveSheet.Columns("L").Delete

    For currentColumn = ActiveSheet.UsedRange.Columns.Count To 1 Step -1

        columnHeading = ActiveSheet.UsedRange.Cells(1, currentColumn).Value

        'CHECK WHETHER TO KEEP THE COLUMN
        Select Case columnHeading
            Case "England", "New Zealand", "India", "South Africa", "Malaysia", "China", "Philippines", "Scotland"
                'Do nothing
            Case Else
                'Delete if the cell doesn't contain "Homer"
                If Instr(1, _
                   ActiveSheet.UsedRange.Cells(1, currentColumn).Value, _
                   "Homer",vbBinaryCompare) = 0 Then
                    ActiveSheet.Columns(currentColumn).Delete

                End If
        End Select
    Next

End Sub

Ireland-Townlands

In Ireland, the townland is the smallest unit of land division. They pre-date the Anglo-Norman conquest (source). What I find amazing about them is how prevalent their use is to this day. Where I grew up in Kerry, they are still used, day-in, day-out for everything from giving directions to advertising property and house sales. I find this fascinating; what also amazes me is the number of discussions that occur among friends and in the community regarding townlands and their exact boundaries. Until the OSI released the below dataset, any disputes on the boundaries would have to be resolved using someone’s copy of maps from the 19th Century. It is great to be able to solve these using accurate data.

There has been an OSM project ongoing for a few years to map all the townlands of Ireland. The Ordnance Survey of Ireland released the townland boundaries as open data under a Creative Commons licence. There are no townlands for the cities of Dublin and Cork but they cover the rest of the country. There are 50,380 townlands in this dataset.

Townlands of Ireland

Because the ArcGIS Online viewer isn’t fantastic, I uploaded the townlands to Carto to view online. I have only uploaded the 50m generalised dataset as the ungeneralised dataset is ~240MB. Below is a Carto web map of the townlands of Ireland. I hope to do some work on these townlands in the future, such as producing some general statistics.

ArcPy Data Driven Pages Script

I had a situation at work a few weeks back where an individual needed 180 maps within a few hours. The maps themselves weren’t overly complex: they required satellite imagery as the basemap and some Ordnance Survey mapping overlaid, with each map showcasing a particular site (in the Greater London area). I knew I wouldn’t be able to turn these around if I had to export them manually, so enter ArcPy and the power of data driven pages in ArcMap.

I used the basic ‘Grid Index Features’ tool to create the index for the mapbook and I then created the mapbook as normal. I needed each page to only show one site; to achieve this there is a little workaround in ArcMap to white-out irrelevant features.

The next step was to insert dynamic text for each page using a value from the attribute table (this was the layer name). I carried out a spatial join between the polygons (which contained the field with the polygon/page name) and the grid index features so that each grid index polygon would have the page/polygon name as an attribute. I then used this guidance to ensure each page had its own page number.

Although it would have been possible to export the mapbook from the ‘File’ menu, that would just export one (quite large) PDF with a single filename as opposed to 180 individual PDFs, each with the correct label and title. I then wrote the following Python function in order to export each page. It appends an underscore and a running number to each name, so if the same name comes up more than once the files don’t overwrite each other. It then took about 20 minutes to export the 180 plans, automation for the win!

#Export Data Driven Pages to PDF (Proper Names)
import arcpy, os
def export_pdf_maps():
	strOutpath = r"\\Output_Location"
	mxd = arcpy.mapping.MapDocument(r"\\Example.mxd")
	#Count how many times each U_Code has been exported so far
	ucodes = {}
	for pageNum in range(1, mxd.dataDrivenPages.pageCount + 1):
		mxd.dataDrivenPages.currentPageID = pageNum
		pageorder = mxd.dataDrivenPages.pageRow.U_Code
		#Check if we have already found this Ucode
		if pageorder not in ucodes:
			ucodes[pageorder] = 0
		ucodes[pageorder] += 1
		pdfname = pageorder + "_" + str(ucodes[pageorder]) + ".pdf"
		print(pdfname)
		#Join folder and file name so the PDF lands inside the output folder
		pdfpath = os.path.join(strOutpath, pdfname)
		if os.path.exists(pdfpath):
			print("Error", pdfname)
		arcpy.mapping.ExportToPDF(mxd, pdfpath)
	del mxd
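The naming logic can be tried on its own without ArcMap. Below is a minimal sketch of just that part; the function name and the sample codes are made up for illustration.

```python
from collections import defaultdict


def numbered_pdf_names(page_names):
    """Give every page a file name, numbering repeats of the same name."""
    counts = defaultdict(int)
    names = []
    for name in page_names:
        counts[name] += 1  # 1 for the first occurrence, 2 for the second...
        names.append("{}_{}.pdf".format(name, counts[name]))
    return names


# Made-up page codes; "A12" appears twice.
print(numbered_pdf_names(["A12", "B07", "A12"]))
# → ['A12_1.pdf', 'B07_1.pdf', 'A12_2.pdf']
```

Every name gets a number, including the first occurrence, which is why all of the exported files end in `_1`, `_2` and so on.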