DAC and CRS code lists – Now available as Frictionless Data!

This blog post was originally published on the Publish What You Fund website.

Maintained, machine readable versions of the DAC and CRS code lists are now available as CSV and JSON! Here’s how Publish What You Fund and Open Knowledge made it happen…

DAC CRS Bot

The OECD’s Development Assistance Committee (DAC) maintains a set of code lists used by donors to report on their aid flows. These are used as part of donors’ DAC reporting, but also in their IATI publications. Not only that, but since some of the codes (e.g. those for aid classification) are so widely used, they are also useful to recipient country governments for mapping aid activities to their own budgets. So they’re super important!

Keeping in sync

Now, these code lists are available on the OECD website as a non-machine-readable XLS file. There’s also an XML version, but it was last updated 18 months ago, and as such it differs significantly from the canonical XLS version on the OECD website.

Because of this lack of a machine-readable version, IATI maintains its own replicated versions of these code lists. These replicated versions are used by d-portal, the IATI Dashboard and others. However, due to the overheads involved in maintaining them, these too have fallen out of sync with the source file.

There has been a-rumbling (and some grumbling!) within the IATI community about getting the DAC to produce a machine-readable version of these code lists. This idea has long been in the offing, and we at Publish What You Fund would very much welcome such a development.

In the meantime, though, we have taken matters into our own hands. Together with Open Knowledge, we’ve published a Frictionless Data package of the DAC code lists, with data available in machine-readable CSV and JSON formats. This is published as an Open Knowledge Core Dataset, one of a group of important and commonly used datasets in high-quality, easy-to-use and open form.
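To give a feel for what machine-readable buys you, here is a minimal sketch in Python of reading a code list from CSV and re-serialising it as JSON. The column names (`code`, `name_en`) and the sample rows are placeholders for illustration, not the actual layout of the published files, so check the data package itself before relying on them.

```python
import csv
import io
import json

# Placeholder rows: the real files may use different columns and values.
SAMPLE_CSV = """code,name_en
001,Example category A
002,Example category B
"""


def load_codelist(csv_text):
    """Parse CSV text into a {code: name} lookup (assumed column names)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["code"]: row["name_en"] for row in reader}


codelist = load_codelist(SAMPLE_CSV)

# The same lookup serialised as JSON, e.g. for use in a web application.
as_json = json.dumps(codelist, indent=2, sort_keys=True)
print(as_json)
```

Codes are kept as strings rather than converted to integers, so that any leading zeros survive the round trip.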

But how does it work? The science bit!

The data is stored on GitHub, and maintained by a scraper that runs nightly on morph.io (created by the wonderful OpenAustralia Foundation). When a change to the data is detected, a pull request is sent by DAC CRS Bot, and reviewed by a (human) maintainer. Via GitHub, we maintain a version history of changes to the data, so it’s possible to tell what changed and when.
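The change-detection step might look something like the sketch below. This is not the bot’s actual code: it simply assumes that each code list is a CSV file keyed by a `code` column (a hypothetical name), compares the previously stored version with a freshly scraped one, and reports which codes were added, removed or changed. The sample rows are made up.

```python
import csv
import io


def parse(csv_text, key="code"):
    """Index the rows of a CSV code list by their code (assumed key column)."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_codelists(old_text, new_text, key="code"):
    """Compare two versions of a code list and summarise the differences."""
    old, new = parse(old_text, key), parse(new_text, key)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Made-up snapshots standing in for the stored and freshly scraped data.
OLD = "code,name_en\n001,Alpha\n002,Beta\n"
NEW = "code,name_en\n001,Alpha\n002,Beta (revised)\n003,Gamma\n"

result = diff_codelists(OLD, NEW)

# Only bother a human maintainer when something actually differs.
needs_pull_request = any(result.values())
print(result, needs_pull_request)
```

A summary like this could also go in the pull request description, so the maintainer can see at a glance what they are being asked to review.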

The next logical step would be for IATI to use this data to maintain their replicated lists as a routine maintenance task. We’ve already tested this as a one-off proof of concept, bringing all the relevant replicated IATI code lists up to date, including adding all French translations. De rien!
