Schedule

June 4 - 8, 2018
James B. Hunt Jr. Library
NC State University

Monday, June 4

8:00 am

Transportation from hotel to Duke Energy Hall, Hunt Library

8:30 am - 9:30 am

Breakfast, Welcome and Learning Norms

9:30 am - 10:45 am

Opening Keynote: Start with Do: Expertise + Facility with Tools = Change 

(Juliane Schneider, Harvard Catalyst’s eagle-i)

There are a million different conversations surrounding data curation, scholarly communication, and open access. There are enough reports, proposals, white papers, guides, think pieces and 'findings' of grant-funded meetings that it can make your head spin. Juliane Schneider will present a simple approach towards coping with data services in libraries: learn some concrete skills using data tools, and use them as a communication channel to the faculty and researchers who need them. She will show how having these skills gives context to the conversations going on in the wider world. There will also be a short group activity using Markdown and Pandoc.

10:45 am - 11:00 am

Break

11:00 am - 12:30 pm

Fundamentals of Data, Part 1 (Emily Griffith, NCSU)

Focusing on the first steps of data exploration and analysis, topics covered will include the nature and types of measurement, descriptive statistics, and guidance on how to choose an appropriate analysis method based on the research question and data types. Demonstrations and hands-on exercises will use popular statistical analysis packages such as SAS and Stata. We’ll discuss the role of statistical consultants and some typical data analysis challenges.

12:30 pm - 1:30 pm

Lunch

1:30 pm - 3:00 pm

Fundamentals of Data, Part 2 (Emily Griffith, NCSU)

3:00 pm - 3:15 pm

Break

3:15 pm - 4:30 pm

Conversation with Faculty about Data (Paul Fyfe, Hollylynne Lee, Maria Mayorga, and Darby Orcutt)

What data issues occupy the minds of faculty and librarians? NCSU faculty and librarians from multiple disciplines share how data creation, management, storage, and dissemination affect their work. Bring your questions and add to the conversation.

4:45 pm

Transportation back to hotel

5:30 pm - 7:30 pm

DSVIL Welcome Reception, Sheraton Hotel

Tuesday, June 5

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 12:00 pm

Data Visualization, Part 1 (Angela Zoss, Duke University)

Participants will be introduced to a wide range of visualization types that can be created with Excel and easy-to-use, free software like Tableau Public, Plot.ly, CartoDB, and Raw. We will focus on skills and techniques that will apply well to both personal visualization projects and patron requests for help creating visualizations. Almost entirely hands-on, this workshop will cover the process of making data visual from start to finish--knowing what visualizations work well with particular types of data, knowing what tools can produce different types of visualizations, and knowing how to adjust the design or layout of a figure to create professional output.

12:00 pm - 1:00 pm

Lunch

1:00 pm - 4:00 pm

Data Visualization, Part 2 (Angela Zoss, Duke University)

4:00 pm - 4:45 pm

Break - Ice Cream Social

Enjoy NC State's Howling Cow ice cream and more!

5:00 pm

Transportation back to hotel

Wednesday, June 6

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 11:00 am

Web Scraping: Gathering data from websites (John Little, Duke University)

Preexisting and clean data sets such as the General Social Survey (GSS) or Census data are readily available, cover long periods of time, and have well-documented codebooks. Meanwhile, researchers increasingly want to gather their own data from websites, which introduces a different layer of complexity. Accessing content from web sources requires different tools and new techniques. In this workshop we will use webscraper.io to crawl a website and gather text from an online newsletter.

11:00 am - 11:15 am

Break

11:15 am - 12:00 pm

Data Cleaning and Preparation, Part 1 (John Little, Duke University)

OpenRefine is a tool used to impose structure upon semi-structured data. The often-intuitive interface is a great convenience. Its powerful and extensible method for normalizing data makes OpenRefine a “go-to” option for quick and easy data transformations. Categorical facets can be exposed for simple data clean-up. Bulk data clustering options are so easy that the process looks like nerdy fun. Few tools are better suited for bulk data cleaning. This hands-on session will explore how OpenRefine can help with common data cleaning challenges.

12:00 pm - 1:00 pm

Lunch

1:00 pm - 2:30 pm

Data Cleaning and Preparation, Part 2 (John Little, Duke University)

2:30 pm - 2:45 pm

Break

2:45 pm - 4:30 pm

Parsing HTML & JSON, Orchestrating APIs, and Gathering Twitter Streams (John Little, Duke University)

As time allows, we will build on our newly developed OpenRefine knowledge to move beyond beginner web scraping techniques. Using OpenRefine, we will gather and clean data from less structured web pages. Then, following a discussion about Application Programming Interfaces (APIs), we will use the TAGS tool to gather Twitter data.

4:45 pm

Transportation back to hotel

Thursday, June 7

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 11:15 am

Bibliometric Network Analysis Using Sci2 and Gephi (Kris Alpi and Danica Lewis, NCSU Libraries)

Hands-on co-authorship analysis activities using Web of Science data, Sci2, and Gephi.

11:15 am - 11:30 am

Break and travel

11:30 am - 12:30 pm

Hunt Visualization Spaces (Karen Ciccone, Walt Gurley and Jason Jefferies, NCSU Libraries)

Tour of the Hunt Library video walls and visualization spaces, highlighting interesting uses and collaborations among librarians, faculty, and students.

12:30 pm - 1:30 pm

Lunch

1:30 pm - 4:00 pm

Data Description, Sharing, and Reuse (Thu-Mai Christian, UNC) 

Introduction to descriptive, administrative, and technical metadata that facilitate discovery and reuse of existing data. The session will explore unique identifiers and best practices for version control; discuss data sensitivity and tools for managing privacy and access controls, including data sharing agreements between researchers, de-identification, and encryption; and demonstrate methods of sharing large files among researchers and making them available, such as file transfer and posting in general and subject-matter repositories.

4:00 pm - 4:30 pm

Data Use Agreements Activity (Kris Alpi, NCSU Libraries)

An activity examining how data is addressed in publisher copyright agreements and instructions to authors, with a focus on how data use terms have evolved along with funder mandates for data access.

4:45 pm

Transportation back to hotel

Friday, June 8

7:30 am - 8:00 am

Transportation from hotel to Hunt Library

8:00 am - 8:45 am

Breakfast and Small Group Reflections

Small groups will explore discussion questions from the previous day's speakers.

8:45 am - 9:00 am

Break

9:00 am - 10:30 am

Electives

Elective 1: Data Visualization with R (Alison Blaine, NCSU Libraries)

An introductory overview and hands-on practice with making data visualizations with R, a statistical computing language. Topics covered will include an overview of visualization packages and considerations for creating publication-quality graphics using R. The hands-on activity will consist of creating a few visualizations using R in the RStudio development environment.

Elective 2: Principles of Version Control and GitHub (Bret Davidson & Heidi Tebbe, NCSU Libraries)

General overview of the definition, history, and purpose of version control systems with examples of use in software development and data management. Walkthrough of the key features of the GitHub web interface and desktop client. Participants will do hands-on exercises using a rubric to evaluate open source software on GitHub.

Elective 3: Managing Data & Visualization Services (discussion session)

Discussion of developing and managing data & visualization services and staff learning.

10:30 am - 10:45 am

Break

10:45 am - 12:00 pm

Lightning Talks!

Participants have the opportunity to give short talks on data-related topics:

Luke Aeschleman - "Colab Notebooks"
Emily Boyd - "Entrepreneurship and Data: The Library Can Help Your Pitch!"
Anna J Dabrowski - "Brief Experience in Two Areas of RDM Services"
Tess Grynoch - "Resources for Data-Driven Discovery & The Data Thesaurus"
Janelle Hedstrom - "Liaisons and Data Librarians: Strategic Partnering to Scale-up Data Services"
Rachael Posey - "The Case for Mode"
Hannah Rainey - "Data Security @ NCSU Libraries"
Jason Reed - "Keeping the Visualization Train Rolling"
Mary Ellen Sloane - "Data Information Literacy"

12:00 pm - 1:00 pm

Lunch & Wrap Up

1:15 pm

Transport to RDU airport or hotel