We make tools and insights using
open data, open content and open code

The Data Wrangling Blog

  • 25 July 2016 Dan Fowler

    Publish Data Packages to DataHub (CKAN)

    Back in March, I wrote about a CKAN extension for publishing and exporting Data Packages. This extension, datapackager, has been updated and is now live on our very own CKAN instance, DataHub. DataHub users can now import and export Data Packages via the CKAN UI...
  • 18 July 2016 Dan Fowler

    Comma Chameleon at csv,conf,v2

    Having co-organized csv,conf,v2 this past May, a few of us from Open Knowledge International had the awesome opportunity to travel to Berlin and sit in on a range of fascinating talks on the current state of the art in wrangling messy data. One such talk was given by...
  • 14 July 2016 Dan Fowler

    Using Data Packages with R

    R is a popular open-source programming language and platform for data analysis. Frictionless Data is an Open Knowledge International project aimed at making it easy to publish and load high-quality data into tools like R through the creation of a standard wrapper format called the...
  • 13 July 2016 Alexandre Bonnasseau

    'Continuous Processing' with Data Packages

    When storing your data in Data Packages, it is considered good practice to store scripts for updating, processing, or analyzing your data in a directory called scripts/ placed at the root of your Data Package. I’ve written a tutorial to show how to achieve continuous...
  • Much of the open data on the web is published in CSV or Excel format. Unfortunately, it is often messy and can require significant manipulation to actually be usable. In this post, I walk through a workflow for automating data validation on every update to...