Data Services Librarian
One of the best places to look for data is the Inter-University Consortium for Political and Social Research (ICPSR), at the University of Michigan. D. H. Hill Library maintains the ICPSR membership for NCSU, which means that we have access to all of the thousands of studies archived there. The data are, for all intents and purposes, free (to you), and are quick and easy to obtain.
Individual ICPSR studies come with a variety of files, sometimes as few as an ASCII file of raw data and an electronic codebook, but often with syntax files for SAS, SPSS and/or Stata to read the raw data (which cuts down on your work considerably).
The first thing to do to gain access to ICPSR data is to create your login. You must do this from an on-campus computer.
Look for the Login/Create Account link in the upper left corner of the ICPSR home page. You will be asked for your email address and information about your NCSU affiliation, and you will create your own password. To complete the account creation process, ICPSR will send you a verification email; you must click on the link (URL) it contains.
You may search the database without logging in, but you must log in before you can download a study's documentation or any of its data files.
ICPSR provides a variety of ways to search the data collection. The basic search is similar to that of many databases.
You can do a keyword search on the title or abstract of the studies in the archive, or search by Principal Investigator. A listing of all studies matching your keyword(s) will be displayed. If you have too many hits (studies) to be useful, try a more specific keyword. If you have no hits, try to think of a broader term.
If you have just a general idea of what you are interested in, you can browse the ICPSR data archives by subject heading (e.g., health care, education, social institutions). Clicking on one of the subjects in the list will display all the studies in the ICPSR collection falling under that subject heading.
ICPSR gives each study in its collection a unique identification number. If you know the ICPSR number for the study in which you are interested, you can search for that specific study by entering the number.
The list of studies matching your keyword(s), or those falling under the subject heading you selected, will include the name of the study, the Principal Investigators, the ICPSR study number, and links to the study Description, to the Download screen (with a list of files available with the study) and to Related Literature.
The first thing you should do is carefully read the study Description. This abstract provides important information about the study, such as the title of the study, a summary (e.g., topics included, sample information), principal investigators, data collection period, funding agency, and lots more. Also provided is information about each of the files in the study, such as the type of file (e.g., codebook, data, SAS program, etc.), number of cases, and number of variables.
This information should give you a good idea as to whether the study might be of use to you. Does it appear to have the variables in which you are interested? Does the sample look reasonable? Is the time period reasonable? Are the data too complicated for the assignment? And, if you are pressed for time -- are any program files (SAS or SPSS programs) available?
Note for future: If you work with a particular data set and find later that it has been updated, the Access and Availability section of the Description provides details about when and how a data set is revised. ICPSR archives older versions but they are not available for direct download from the Web site; you would need to contact the staff directly to receive a particular version.
We'll come back to downloading in a moment. ICPSR maintains a bibliography of research papers and publications which use data from their archive. The Bibliography of Related Literature can be used to survey research on a given topic, narrow down a topic for your own research, and to find additional resources for your research. Don't forget to submit your research when published so it can be listed in the Bibliography, too.
To "download" a file means to copy it electronically from some place out there on the internet to your PC.
Make note of the cumulative size of all the files you want to download -- you need to have enough space available on your computer to accommodate them.
The first step in downloading ICPSR data is to log in if you have not already done so. When you click on either the Download or the Documentation tab you will be prompted to log in. Then you'll be passed to a page listing all the files available for the study. There are four main steps here.
- You may choose to download just the data or just the documentation files, or both. Downloading the documentation only is recommended if you're still considering whether a study is appropriate. You may download lots of studies' documentation before finding a suitable data set, and downloading only the documentation will save you time (and computer file space).
- You must select the individual data sets you want. Sometimes there will only be one, but many studies have several sections divided by topic or methodology and longitudinal studies may present different years' data in sections.
- You must add the data to your "Data Cart." You only need to put items in your Data Cart when you intend to download them. As you add items to your cart, the page will reload and automatically tell you the total size of the files you have. Remember, your computer has to have enough space to receive them. These file sizes can get quite large. You may want to consider doing the download on campus if you only have dial-up at home.
- There's an optional interim step here where you can review your cart, but you can skip that and go directly to the final step, the actual download. Your computer will pop up a window asking if you want to open or save the file. You don't want to open it; just save it.
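The advice about having enough space can be checked with a few lines of Python before you start the download. This is only a sketch: the file sizes below are made-up numbers standing in for the sizes shown in your Data Cart, and the safety factor of 2 is an assumption meant to leave room for both the compressed download and the files unpacked from it.

```python
import shutil

def has_room(total_download_bytes, target_dir=".", safety_factor=2.0):
    """Return True if target_dir has enough free space for the download.

    safety_factor is an assumed margin so that the compressed file and
    its unpacked contents can sit on the disk at the same time.
    """
    free = shutil.disk_usage(target_dir).free
    return free >= total_download_bytes * safety_factor

# Hypothetical file sizes in bytes, as you might copy them from the Data Cart.
cart_sizes = [12_500_000, 3_200_000, 450_000]
total = sum(cart_sizes)

print(f"Cart total: {total} bytes")
print("Enough space" if has_room(total) else "Free up disk space first")
```

Point `target_dir` at the folder where you plan to save the files if it is on a different drive from your working directory.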
All files that you download from ICPSR are automatically compressed. Before reading a compressed file you must first "decompress" it. Many applications are available to do this, including WinZip, StuffIt, etc., and many offer free trial versions. If you need help, contact your computer consultant.
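If you have Python installed, you can also decompress a download without a separate utility. The sketch below assumes the file you saved is a standard .zip archive; the file name in the usage comment is hypothetical, so substitute the name of the file you actually downloaded.

```python
import zipfile
from pathlib import Path

def unpack(archive_path, dest_dir):
    """Extract every file in a .zip archive into dest_dir.

    Returns the list of file names stored in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
        return zf.namelist()

# Example call (hypothetical file name):
# for name in unpack("ICPSR_download.zip", "icpsr_study"):
#     print(name)
```

The returned list is a quick way to see which codebooks, setup files, and raw data files came with the study.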
Once you decompress the download file, you will have both it and the individual uncompressed files on your PC. You can delete the compressed file if you wish and open the uncompressed files. Files of various types may be included in the same compressed file, so you may end up with things in ASCII text, PDF, and SAS (or other statistical software) formats. Note: while you can open SAS and SPSS program files and raw data files in word processing programs, be sure to save them as "ASCII text" files if you make any changes, because SAS and SPSS will not be able to read them unless they are in ASCII format.
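If you have edited a program or data file and are not sure it is still plain ASCII, a short check like the following can point to the problem lines. It is a sketch only; the function name is made up for this example, and it simply reports lines containing any byte outside the ASCII range (for instance, "smart quotes" inserted by a word processor).

```python
def non_ascii_lines(path):
    """Return the 1-based numbers of lines that contain a non-ASCII byte."""
    bad = []
    with open(path, "rb") as fh:  # read bytes so nothing is silently decoded
        for lineno, raw in enumerate(fh, start=1):
            if not raw.isascii():
                bad.append(lineno)
    return bad

# Example call (hypothetical file name):
# print(non_ascii_lines("setup_file.sas"))
```

An empty list means every line is plain ASCII.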
You will need to write a program to read the raw data and create, for instance, the SAS data set or SPSS system file which includes the variables you want.
NOTE: UNLESS A STATISTICAL SOFTWARE SETUP FILE IS PROVIDED FOR YOU, WHAT YOU HAVE IS JUST RAW DATA. YOU MUST WRITE A PROGRAM TO RUN ANY PROCEDURES ON THE DATA.
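To give a sense of what such a program does, here is a minimal sketch in Python rather than SAS or SPSS. It assumes the raw data file is fixed-width (each variable occupies set columns on every line), which you would confirm from the codebook. The variable names and column positions below are invented for illustration and do not come from any real ICPSR study.

```python
import io

# Hypothetical codebook layout: variable name -> (first column, last column),
# counted from 1 and inclusive, the way codebooks usually list them.
LAYOUT = {"caseid": (1, 4), "age": (5, 6), "sex": (7, 7)}

def parse_line(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped string values."""
    return {name: line[start - 1:end].strip()
            for name, (start, end) in layout.items()}

def read_raw(lines, layout=LAYOUT):
    """Parse every non-blank line of a raw data file."""
    return [parse_line(line.rstrip("\n"), layout)
            for line in lines if line.strip()]

# Two made-up records standing in for a raw data file.
sample = io.StringIO("0001342\n0002 71\n")
records = read_raw(sample)
print(records)
# [{'caseid': '0001', 'age': '34', 'sex': '2'}, {'caseid': '0002', 'age': '7', 'sex': '1'}]
```

With a real study you would pass an open file instead of `sample` and copy the column positions for the variables you want from the codebook. The values are kept as text here; converting them to numbers and handling missing-value codes would also follow the codebook.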
There are resources on campus for getting help with writing such programs. See the Services page for information about classes or contact Data Services for a referral. Note that different offices have expertise with different software packages.
- Online Data Use Tutorial - for getting started with ICPSR data sets
- Data User Help Center - other tutorials for downloading and using data
- ICPSR's Help Documentation - searchable database of FAQs