Schedule

May 13 - 17, 2019
James B. Hunt Jr. Library
NC State University

DRAFT Schedule is below. Content will be updated once finalized.

Monday, May 13

8:00 am

Transportation from hotel to Duke Energy Hall, Hunt Library

8:30 am - 9:45 am

Breakfast, Welcome and Learning Norms

9:45 am - 10:00 am

Break

10:00 am - 12:30 pm

Fundamentals of Data, Part 1 (Emily Griffith, NCSU)

This session focuses on the first steps of data exploration and analysis. Topics will include the nature and types of measurement, descriptive statistics, and guidance on choosing an appropriate analysis method based on the research question and data types. Demonstrations and hands-on exercises will use popular statistical analysis packages such as SAS and Stata. We’ll also discuss the role of statistical consultants and some typical data analysis challenges.

12:30 pm - 1:30 pm

Lunch

1:30 pm - 4:00 pm

Fundamentals of Data, Part 2 (Emily Griffith, NCSU)

4:15 pm

Transportation back to hotel

5:30 pm - 7:30 pm

DSVIL Welcome Reception, Sheraton Hotel

Tuesday, May 14

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 12:00 pm

Data Visualization, Part 1 (Angela Zoss, Duke University)

Participants will be introduced to a wide range of visualization types that can be created with Excel and easy-to-use, free software like Tableau Public, Plot.ly, and Raw. We will focus on skills and techniques that will apply well to both personal visualization projects and patron requests for help creating visualizations. Almost entirely hands-on, this workshop will cover the process of making data visual from start to finish--knowing what visualizations work well with particular types of data, knowing what tools can produce different types of visualizations, and knowing how to adjust the design or layout of a figure to create professional output.

12:00 pm - 1:00 pm

Lunch

1:00 pm - 4:30 pm

Data Visualization, Part 2 (Angela Zoss, Duke University)

4:45 pm

Transportation back to hotel

Wednesday, May 15

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 11:00 am

Web Scraping: Gathering data from websites (John Little, Duke University)

Preexisting, clean data sets such as the General Social Survey (GSS) or Census data are readily available, cover long periods of time, and have well-documented codebooks. Meanwhile, researchers increasingly want to gather their own data from websites, which introduces a different layer of complexity. Accessing content from web sources requires different tools and new techniques. In this workshop we will use webscraper.io to crawl a website and gather text from an online newsletter.

11:00 am - 11:15 am

Break

11:15 am - 12:00 pm

Data Cleaning and Preparation, Part 1 (John Little, Duke University)

OpenRefine is a tool used to impose structure upon semi-structured data. The often-intuitive interface is a great convenience. Its powerful and extensible method for normalizing data makes OpenRefine a “go to” option for quick and easy data transformations. Categorical facets can be exposed for simple data clean-up. Bulk data clustering options are so easy that the process looks like nerdy fun. Few tools are better suited for bulk data cleaning. This hands-on session will explore how OpenRefine can help with common data cleaning challenges.

12:00 pm - 1:00 pm

Lunch

1:00 pm - 2:30 pm

Data Cleaning and Preparation, Part 2 (John Little, Duke University)

2:30 pm - 2:45 pm

Break

2:45 pm - 4:30 pm

Parsing HTML & JSON, Orchestrating APIs, and Gathering Twitter Streams (John Little, Duke University)

As time allows, we will build on our newly developed OpenRefine knowledge to move beyond beginner web scraping techniques. Using OpenRefine, we will gather and clean data from less structured web pages. Then, following a discussion about Application Programming Interfaces (APIs), we will use the TAGS tool to gather Twitter data.

4:45 pm

Transportation back to hotel

Thursday, May 16

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 10:30 am

R session - TBA

10:30 am - 10:45 am

Break

10:45 am - 11:00 am

Group Photo & Tour Group Travel

11:00 am - 12:00 pm

Hunt Visualization Spaces (Karen Ciccone, Walt Gurley and Hannah Rainey, NCSU Libraries)

A tour of Hunt Library's data and visualization spaces, highlighting interesting uses and collaborations among librarians, faculty, and students.

12:00 pm - 1:00 pm

Lunch

1:30 pm - 4:30 pm

Data Description, Sharing, and Reuse (Thu-Mai Christian, UNC) 

An introduction to the descriptive, administrative, and technical metadata that facilitate discovery and reuse of existing data. Participants will explore unique identifiers and best practices for version control, and will discuss data sensitivity and tools for managing privacy and access controls, including data sharing agreements between researchers, de-identification, and encryption. The session will also demonstrate methods of sharing large files among researchers and making them available, such as file transfer and posting in general and subject matter repositories, and ways of ensuring that data adheres to FAIR principles.

4:45 pm

Transportation back to the hotel

Friday, May 17

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 10:30 am

Electives

Elective 1: Data Visualization with R (Alison Blaine, NCSU Libraries)

An introductory overview and hands-on practice with making data visualizations with R, a statistical computing language. Topics covered will include an overview of visualization packages and considerations for creating publication-quality graphics using R. The hands-on activity will consist of creating a few visualizations using R in the RStudio development environment.

Elective 2: Bibliometric Network Analysis Using Sci2 and Gephi (Danica Lewis & Shaun Bennett, NCSU Libraries)

Hands-on co-authorship analysis activities using Web of Science data, Sci2, and Gephi.

10:30 am - 10:45 am

Break

10:45 am - 12:00 pm

Roundtables

Participants have the opportunity to propose topics and facilitate discussions on data-related topics.

12:00 pm - 1:00 pm

Lunch & Wrap Up

1:15 pm

Transport to RDU airport or hotel