Get Started

We are a community of civic hackers, data wranglers and ordinary citizens intrigued and excited by the possibilities of combining technology and information for good – making government more accountable, culture more accessible and science more efficient.

We focus on making things - whether that’s apps, insights or tools. We have a strong preference for open data and free/open source software.

We are part of Open Knowledge International and operate as a collaborative community which anyone can join. Labs is organized through a public GitHub repository.

Data Wrangling

Data wrangling can be described as the process of getting, cleaning and using data. The following list covers almost all activities we do.

Find out more about data wrangling: getting, cleaning and publishing data online.
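
As a rough illustration of the getting, cleaning and using steps described above, here is a minimal Python sketch. It is not taken from any Labs project: the inline CSV text, the place names, the numbers and the function names get, clean and use are all invented for this example, and the inline text stands in for a file you would normally download.

```python
import csv
import io

# Hypothetical raw data with typical problems: stray whitespace,
# inconsistent capitalisation and a missing value. All values are invented.
RAW_CSV = """region,year,population
 northland ,2010,1200
southland,2010,
EASTLAND,2010,800
"""


def get(raw_text):
    """Get: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Clean: trim whitespace, normalise names, drop rows with no population."""
    cleaned = []
    for row in rows:
        population = row["population"].strip()
        if not population:
            continue  # skip incomplete records
        cleaned.append({
            "region": row["region"].strip().title(),
            "year": int(row["year"]),
            "population": int(population),
        })
    return cleaned


def use(rows):
    """Use: answer a simple question, here the total population."""
    return sum(row["population"] for row in rows)


rows = clean(get(RAW_CSV))
total = use(rows)
print(rows)
print(total)
```

In real work the first step would usually read from a downloaded file or a URL, and the cleaning rules would depend on the dataset at hand.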

Skills required

This site is for anyone interested in working with data. It assumes you have a basic working knowledge of UNIX shell commands, the Python programming language, networking, SQL and common file formats like HTML, CSV or XML. If you are not familiar with these technologies, start by working through a Python tutorial, such as Zed Shaw’s Learn Python the Hard Way, and then pick up the others as you need them.

We hope that some of the content will be especially useful to the many emerging types of data users, such as data journalists, civic hackers, coding wonks etc.

While most techniques apply to any kind of machine-readable information, some of the material refers to a specific class of data that we care a lot about: open government data. Government information is a good example both because of the interesting things we can learn from it and because virtually any imaginable data problem applies to it: incompleteness, corruption, strange formats or sheer size.

What can I start doing now?

The Core Datasets project involves a lot of data wrangling, and we could use your help!