Welcome to the Data Wrangling Handbook!
Data wrangling for fun and profit
This handbook is not a finished document but a collection of opinions and evolving best practices. The purpose is not to present all available options and technologies but to pick one and follow it through.
The Handbook is also a collaborative effort: if you have a recipe, a tool or a how-to that you would like to share, please contribute a patch or make a suggestion.
The Handbook consists of two main parts:
- Guides and Tutorials: Guides and tutorials that walk you through the main aspects of data wrangling
- Patterns: A set of “patterns” or recipes for doing specific tasks, from scraping an HTML table to geocoding in a spreadsheet
Guides and Tutorials
- Walk-through of some basic data wrangling tasks
- SQL for Data Manipulation
- Introduction to CSV - the Lingua Franca of Data
- Glossary of Terms
Contributing
- Edit directly on the Handbook GitHub Repository
- Submit an issue to our Issue Tracker
Donate
If you have found this useful and would like to support our work, please consider making a small donation.