Aug 01-Aug 02, 2017
9:00 am - 4:30 pm
Instructors: Simon Hettrick, Alistair Bailey, Rob Blair
Helpers: Arshad Emmambux, Olivier Phillipe, Robin Wilson
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Our mission is to provide researchers with high-quality training covering the full lifecycle of data-driven research. Data Carpentry is a sibling organization of Software Carpentry. Where Software Carpentry teaches best practices in software development, our focus is on the introductory computational skills needed for data management and analysis in all domains of research. Our initial target audience is learners who have little to no prior computational experience. We create a friendly environment for learning to empower researchers and enable data-driven discovery. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.
For more information on what we teach and why, please see our paper "Best Practices for Scientific Computing".
Who: The course is aimed at graduate students and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.
Where: Room 2207, Building 85 (Biological Sciences Building). Get directions with OpenStreetMap or Google Maps.
When: Aug 01-Aug 02, 2017. Add to your Google Calendar.
Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below). They are also required to abide by Software Carpentry's Code of Conduct.
Accessibility: We are committed to making this workshop accessible to everybody. The workshop organisers have checked that:
Contact: Please email rsg-info@soton.ac.uk for more information.
Surveys
Please be sure to complete these surveys before and after the workshop.
Day 1
09:00 | Arrival and Setup |
09:15 | Wendy White - Library |
09:30 | Introduction to Data Carpentry |
10:00 | Data Organisation in Spreadsheets |
11:00 | Break |
11:30 | Data Cleaning with OpenRefine |
12:30 | Lunch |
13:30 | Introduction to R and RStudio |
14:30 | Break |
15:00 | Introduction to R and RStudio contd. |
16:00 | Wrap-up |
Day 2
10:00 | Data Analysis and Visualization in R |
11:00 | Break |
11:30 | Data Analysis and Visualization in R contd. |
12:30 | Jeremy Frey - The EDISON project |
12:45 | Lunch |
13:30 | Data Management with SQL |
14:30 | Break |
15:00 | Data Management with SQL contd. |
16:00 | Wrap-up |
To participate in a Data Carpentry workshop, you will need to bring a laptop with the software described below.
To work with spreadsheets, we can use Microsoft Excel, OpenOffice.org, or other programs. Commands may differ a bit between programs, but the general ideas for thinking about spreadsheets are the same. For this lesson, if you don’t have a spreadsheet program already, you can use LibreOffice. It’s a free, open source spreadsheet program.
Only if you don't have MS Excel installed. Install LibreOffice by going to the download page. Your download should begin automatically. You will go to a page that asks about a donation, but you don’t need to make one.
Install LibreOffice by going to the download page. The version for Linux should automatically be selected. Click Download Version 5.3.X. You will go to a page that asks about a donation, but you don’t need to make one. Your download should begin automatically.
For this lesson you will need OpenRefine (formerly Google Refine) and a web browser.
Note: this is a program that runs on your machine (not in the cloud). It is accessed via your browser, but no web connection is needed.
spctl --add /Applications/OpenRefine.app
and try again.
java -version
. If you don't have it, then run
sudo apt-get install default-jre
(Ubuntu) or
sudo dnf install java-1.8.0-openjdk
(Fedora).
./refine
into the terminal within the OpenRefine directory.
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
sudo apt-get install r-base
and for Fedora run
sudo yum install R
).
SQL is a specialized programming language used with databases. We use a very lightweight database system called SQLite in our lessons. On its own, it's so light, it doesn't even include a user interface! So, we use DB Browser for SQLite.
Download and install DB Browser for SQLite (Windows)
Download and install DB Browser for SQLite (Mac)
Download and install DB Browser for SQLite (Linux)
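On Linux, the command-line parts of the setup above (Java for OpenRefine, and R) can be sanity-checked before the workshop. The following is only a minimal sketch, assuming a POSIX shell; the helper name check_tool is hypothetical and is not part of the workshop materials.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command
# is available on PATH and returns non-zero when it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# Tools named in the Linux setup notes: Java (needed by OpenRefine) and R.
for tool in java R; do
    # "|| true" keeps the loop going even when a tool is missing.
    check_tool "$tool" || true
done
```

A "not found" line for java or R points back to the install commands given earlier for Ubuntu and Fedora.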