OpenRefine for Ecology

Examining Numbers in OpenRefine

Overview

Teaching: 10 min
Exercises: 10 min
Questions
  • Examining numerical data

Objectives
  • Transform a text column into a number column.

  • Identify and modify non-numeric values in a column using facets.

  • Use the scatterplot facet to examine relationships among columns.

Lesson

Numbers

When a table is imported into OpenRefine, all columns are treated as text. Earlier we saw that a column can be sorted by interpreting its values as numbers, but sorting did not change the cells themselves from text to numbers.

Be sure to remove any Text filter facets from the left margin so that we can examine the whole rodent dataset.

To transform the cells in the recordID column to numbers, click the column's pulldown menu and select Edit cells > Common transforms… > To number. The recordID values will change from left-justified to right-justified, and from black to green.
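For readers who think in code, the text-to-number step can be sketched in Python. This is only an illustration of the idea, not OpenRefine's own implementation: the function name `to_number` and the sample cells are made up, and the sketch assumes that cells which cannot be parsed as numbers are simply left as text.

```python
def to_number(cell):
    """Return an int or float if the text parses as a number; otherwise return the cell unchanged."""
    text = cell.strip()
    for parse in (int, float):
        try:
            return parse(text)
        except ValueError:
            pass  # try the next parser, or fall through to keep the text
    return cell


# Made-up cell values standing in for a text column.
raw_cells = ["1", "2", "3.5", "n/a", ""]
converted = [to_number(cell) for cell in raw_cells]
print(converted)  # [1, 2, 3.5, 'n/a', '']
```

Cells such as `n/a` and the empty string stay as text, which is why a column can still hold non-numeric values after the transform.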

Challenge

Numeric facet

Sometimes a column contains non-numeric values or blanks, and we want to find them. We can do that with a Numeric facet.
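The kind of grouping a numeric facet offers can be sketched in Python as follows. The category names, the `classify` helper, and the sample cells are assumptions made for illustration; they are not taken from OpenRefine itself.

```python
from collections import Counter


def classify(cell):
    """Place a cell into one of three illustrative groups: blank, numeric, or non-numeric."""
    if cell is None or (isinstance(cell, str) and cell.strip() == ""):
        return "blank"
    if isinstance(cell, (int, float)) and not isinstance(cell, bool):
        return "numeric"
    return "non-numeric"


# Made-up column contents after a text-to-number transform.
cells = [1, 2, 3.5, "n/a", "", None, 7]

counts = Counter(classify(cell) for cell in cells)
# Positions of the cells that would need fixing by hand.
to_fix = [i for i, cell in enumerate(cells) if classify(cell) == "non-numeric"]

print(counts["numeric"], counts["non-numeric"], counts["blank"])  # 4 1 2
print(to_fix)  # [3]
```

Restricting the view to the non-numeric and blank groups is what makes the problem cells easy to find and edit.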

Challenge

When done examining the numeric data, remove this facet by clicking the x in the upper left corner of its panel.

Scatterplot facet

Now that we have multiple columns as numbers, we can see how they relate to one another using the scatterplot facet. Select a numeric column, say recordID, and from its pulldown menu select Facet > Scatterplot facet. A new window called Scatterplot Matrix will appear. It contains one square for each pair of numeric columns, arranged in an upper-right triangle. Each square plots one small dot per row, positioned by that row's values in the two columns.
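The layout of the matrix, one square per unordered pair of numeric columns, can be sketched in Python. Only recordID is a column name from this lesson; `col_a`, `col_b`, the row values, and the `points_for` helper are hypothetical, and skipping rows that lack a value in either column is an assumption of this sketch.

```python
from itertools import combinations

# Made-up rows; only recordID is a column name taken from the lesson.
rows = [
    {"recordID": 1, "col_a": 10, "col_b": 2.5},
    {"recordID": 2, "col_a": 12, "col_b": None},
    {"recordID": 3, "col_a": 9, "col_b": 3.0},
]
numeric_columns = ["recordID", "col_a", "col_b"]


def points_for(rows, x_col, y_col):
    """Collect one (x, y) dot per row, skipping rows missing either value."""
    return [
        (row[x_col], row[y_col])
        for row in rows
        if row[x_col] is not None and row[y_col] is not None
    ]


# One entry per unordered pair of columns: the upper triangle of the matrix.
matrix = {
    pair: points_for(rows, *pair)
    for pair in combinations(numeric_columns, 2)
}

for pair, points in matrix.items():
    print(pair, points)
```

With n numeric columns there are n(n-1)/2 squares, so three columns give three squares and four columns give six.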

Challenge

Examine a pair of columns in detail

We can examine one pair of columns by clicking on its square in the Scatterplot Matrix. A new facet with only that pair will appear in the left margin.

Challenge

Challenge

Key Points