Overview
Teaching: 10 min
Exercises: 0 min
Questions
What is the motivation for using OpenRefine?
Objectives
Describe OpenRefine’s uses and applications.
Differentiate data cleaning from data organization.
Experiment with OpenRefine’s user interface.
Locate helpful resources to learn more about OpenRefine.
Note: OpenRefine is a Java program that runs on your own machine, not in the cloud. Its interface opens in your web browser, but no internet connection is needed.
Follow the Setup instructions to install OpenRefine.
If OpenRefine does not open automatically after you install and start it, point your browser at http://127.0.0.1:3333/ or http://localhost:3333 to open the program.
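If the browser shows nothing at that address, it can help to check whether anything is answering on the local port at all. The following is a minimal Python sketch, not part of OpenRefine itself: the function name `is_reachable` and the constant `OPENREFINE_URL` are made up for this illustration, and the sketch assumes OpenRefine was started on the default address given above. It only tests that some HTTP server responds there; it cannot confirm that the server is OpenRefine.

```python
import http.client
import urllib.error
import urllib.request

# Default local address from the lesson text (an assumption: change it if
# you started OpenRefine on a different host or port).
OPENREFINE_URL = "http://127.0.0.1:3333/"


def is_reachable(url: str = OPENREFINE_URL, timeout: float = 3.0) -> bool:
    """Return True if some HTTP server answers at `url`, False otherwise."""
    # An empty ProxyHandler bypasses any system proxy settings, since the
    # request is meant for a server on this machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, although with an error status code.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timed out, or the reply was not valid HTTP.
        return False


if __name__ == "__main__":
    if is_reachable():
        print(f"A server is answering at {OPENREFINE_URL}")
    else:
        print(f"Nothing is answering at {OPENREFINE_URL}; is OpenRefine running?")
```

If the check reports that nothing is answering, the usual next step is to confirm that OpenRefine was actually started and is still running before trying the browser again.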
You can find out much more about OpenRefine at http://openrefine.org, where you can also check out some great introductory videos. There is a Google Group that can answer many beginner questions and problems. There is also an OpenRefine Google Plus community where you can find a lot of help; many of its members come from the life sciences. As with other programs of this type, libraries of OpenRefine scripts are available too: you can find a script you need and copy it into your OpenRefine instance to run it on your dataset.
Key Points
OpenRefine is a powerful, free and open source tool that can be used for data cleaning.
OpenRefine will automatically track any steps you take in working with your data.