247 Hesburgh Library, Navari Family Center for Digital Scholarship
Good data organization is the foundation of any research project. Most researchers keep their data in spreadsheets, so that is where many research projects start.
We typically organize data in spreadsheets the way we, as humans, want to work with it. However, computers require data to be organized in particular ways. To use tools that make computation more efficient, such as the programming languages R or Python, we need to structure our data the way computers need it. Since spreadsheets are where most research projects start, this is where we want to start, too!
In this lesson, you will learn:
In this lesson, however, you will NOT learn about data ANALYSIS with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you organize the data so that you can perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson, you will learn how to think about data organization and some practices for more effective data wrangling. With this approach, you can better format existing data and plan new data collection so that less data wrangling is needed.
Prerequisites:
-----
This workshop will follow a Carpentries curriculum. Learn more about Carpentries workshops at Notre Dame (https://libguides.library.nd.edu/carpentries).