Rufus Pollock [rgrp]

Member since 01 January 2005

Rufus is an avid hacker on many small data tools, an enthusiastic collector of new datasets and a keen out-of-hours data-based investigator. His skills include some Python, JavaScript, sysadmin and data wrangling.

Projects

Rufus Pollock is a contributor to the following projects.

Data Patterns

A collection of tips, tricks and patterns for data work

DataPortals.org

DataPortals.org is the most comprehensive list of open data portals in the world. It is curated by a group of leading open data experts from around the world - including representatives from local, regional and national governments, international organisations such as the World Bank, and...

Public Bodies

PublicBodies.org is a website hosting a database of so-called public bodies, that is, government-run or government-controlled organizations (which may or may not have a distinct corporate existence). Examples include government ministries or departments, as well as state-run organizations such as libraries, police and fire departments. Contributions...

FacetView

FacetView is a pure JavaScript frontend for ElasticSearch search indices. It lets you easily embed a faceted browser and search frontend into any web page. It also provides a micro-framework you can build on when creating user interfaces to ElasticSearch. It is currently used in...

BibServer

BibServer is an open-source RESTful bibliographic data server. BibServer makes it easy to create and manage collections of bibliographic records such as reading lists, publication lists and even complete library catalogs. FacetView is included to provide a rich interface for complex search queries. BibServer supports...

Bad Data

Bad Data is a site detailing real-world examples of how not to prepare or provide data. It showcases poorly structured, misformatted, or just plain ugly datasets and what they get wrong. While its primary purpose is to serve as an educational tool for governments and...

Data Package Manager

dpm (data package manager) is a library and command-line tool for installing and managing data packages. Inspired by software package management tools like apt for Debian, dpm aims to reduce the friction of sharing and working with data.

Open Knowledge Labs website

The Open Knowledge Labs website (i.e. the site you’re looking at right now) is itself a collaborative project of Open Knowledge. It is built using Jekyll, a static site generator, and hosted on GitHub Pages. You can contribute by addressing some of the items on...

CSV.js

A simple, zero-dependency JavaScript CSV library focused on the browser. It supports both parsing and serializing CSV. Originally developed as part of ReclineJS, it is now fully standalone.

Data Pipes

Data Pipes is a service that provides streaming, “pipe-like” data transformations on the web: deleting rows or columns, find and replace, head, grep, etc.
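
To make the “pipe-like” idea concrete, here is a minimal sketch in Python of chaining streaming transformations over CSV rows. It is an illustration only and not the Data Pipes API: the function names (`read_rows`, `head`, `grep`, `cut`, `replace`), the sample data and the `run_example` helper are all assumptions made for this example.

```python
import csv
import io
import re
from itertools import islice
from typing import Iterable, Iterator, List, Set

Row = List[str]


def read_rows(text: str) -> Iterator[Row]:
    """Lazily yield CSV rows from a string (stand-in for a streamed source)."""
    yield from csv.reader(io.StringIO(text))


def head(rows: Iterable[Row], n: int) -> Iterator[Row]:
    """Keep only the first n rows."""
    return islice(rows, n)


def grep(rows: Iterable[Row], pattern: str) -> Iterator[Row]:
    """Keep rows in which at least one cell matches the regular expression."""
    regex = re.compile(pattern)
    return (row for row in rows if any(regex.search(cell) for cell in row))


def cut(rows: Iterable[Row], drop: Set[int]) -> Iterator[Row]:
    """Delete the columns whose zero-based indexes are listed in drop."""
    return ([cell for i, cell in enumerate(row) if i not in drop] for row in rows)


def replace(rows: Iterable[Row], find: str, repl: str) -> Iterator[Row]:
    """Find and replace a literal string in every cell."""
    return ([cell.replace(find, repl) for cell in row] for row in rows)


# Hypothetical sample data, used only for this illustration.
SAMPLE = "name,country,year\nAda,UK,1843\nGrace,USA,1952\nAlan,UK,1936\n"


def run_example() -> List[Row]:
    """Chain the steps; nothing is read until the final list() consumes the pipe."""
    rows = read_rows(SAMPLE)
    rows = grep(rows, r"UK")           # also drops the header, which has no "UK"
    rows = cut(rows, {2})              # delete the "year" column
    rows = replace(rows, "UK", "United Kingdom")
    rows = head(rows, 1)               # keep the first surviving row
    return list(rows)


if __name__ == "__main__":
    print(run_example())
```

Each step is a generator that consumes the previous one row by row, so the whole input never has to be held in memory. That is the property that makes this style of transformation suitable for streaming.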

ElasticSearch.JS

A simple JavaScript library for working with ElasticSearch. It also provides a backend interface to ElasticSearch suitable for use with the Recline suite of data libraries.

Data Explorer

Data Explorer is an in-browser data cleaning and visualization app. Load tabular data, process it with JavaScript, and graph the results, all in the comfort of your browser. Gist-based persistence enables simple versioning and sharing of projects.

Frictionless Data

There’s too much friction working with data - friction getting data, friction processing data, friction sharing data. This friction stops people doing stuff: stops them creating, sharing, collaborating, and using data - especially amongst more distributed communities. It kills the cycles of find, improve, share...

ReclineJS

Recline is a simple but powerful library for building data applications in pure JavaScript and HTML. Building on Backbone, Recline supplies components and structure to data-heavy applications by providing a set of models (Dataset, Record/Row, Field) and views (Grid, Map, Graph, etc.).

Textus

In a nutshell, Textus is an open-source platform for working with collections of texts. It enables students, researchers and teachers to share and collaborate around texts using a simple and intuitive interface.

Listify

Turn a Google spreadsheet into a beautiful, searchable listing in seconds

TimeMapper

Make timelines & maps from a Google Spreadsheet in seconds

Recline Chrome CSV Viewer

A Chrome extension that lets you view, search, graph and map CSV files in the browser (built using Recline)

Data Converters

A Python library and command-line tool for converting data from one format to another. It builds on messytables, GDAL and many more great open-source libraries for processing data, and provides one easy-to-use standard API.

WikipediaJS

WikipediaJS is a simple JS library for accessing information in Wikipedia articles such as dates, places, abstracts etc. The library is the work of Labs member Rufus Pollock. In essence, it is a small wrapper around the data and APIs of the DBPedia project and...

Yourtopia

Yourtopia is a web app for crowdsourcing preferences about how to weight the components of indices such as the Human Development Index.

Posts

The SEC EDGAR Database

04 March 2014

Data as Code Deja-Vu

04 October 2013

Recline JS Search Demo

01 November 2012

Get Involved

Join our discussion list. Here, we exchange datasets and ideas and plan our projects.

Many of us also hang out and chat on Gitter:

https://gitter.im/okfn/chat

Get Hacking

Check out the projects list or the ideas page. Contribute and earn a badge!