Data Services Librarian
Guide to Planning Data Research
When seeking out secondary data, there are a few essential questions to ponder before beginning.
What is my research question?
You obviously need to have a pretty good idea of what you are interested in studying. If your topic is too broad (e.g., "gender"), you will get too many "hits" in a keyword search for the results to be of much use. On the other hand, if your interests are too narrow (e.g., "single fathers raising 6 or more children in rural coastal towns"), it is not likely you will find any data at all. The key is to be focused without being overly specific.
What is the purpose of my research?
There is a big difference between a methods course paper that demonstrates you understand how to run and interpret a crosstab and a dissertation or major research project. Many (many) novice graduate students have that "magnum opus" mentality -- they want to answer THE question of their career during their first semester in school. While it is useful to begin focusing on a thesis topic fairly early in graduate school, and it is a great help to be very familiar with a potentially complicated data set before you begin your thesis, you need to learn to walk before you run.
Data comes in all shapes and forms, from relatively easy to work with to practically impossible. There is no reason to start with something overly complicated -- it will mean a lot of frustration for you (and a LOT of assistance from staff -- generally more than most people will have time to provide for a class paper). As you move along in your career and come to understand the basics of reading codebooks, writing programs to read raw data, merging files, etc., you will become more comfortable working with more complicated data sets.
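To give a feel for what "reading a codebook, reading raw data, and merging files" can look like in practice, here is a minimal sketch in Python. Everything in it is made up for illustration: the codebook layout, the variable names (respondent_id, age, sex, region), and the data lines are assumptions, not taken from any real study. Real codebooks and raw files are usually much larger, and your course may use a statistical package instead.

```python
# Hypothetical codebook: variable name -> (start column, end column),
# 1-indexed and inclusive, as many codebooks describe fixed-width raw files.
CODEBOOK = {
    "respondent_id": (1, 4),
    "age": (5, 6),
    "sex": (7, 7),
}


def read_raw(lines, codebook):
    """Cut each fixed-width line into named fields using the codebook."""
    records = []
    for line in lines:
        record = {}
        for name, (start, end) in codebook.items():
            # Convert 1-indexed inclusive columns to a Python slice.
            record[name] = line[start - 1:end].strip()
        records.append(record)
    return records


def merge_on(left, right, key):
    """Keep records in `left` that have a match in `right`, combining fields."""
    lookup = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = lookup.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged


# Made-up raw data lines, for illustration only.
raw_lines = ["0001341", "0002572", "0003291"]
people = read_raw(raw_lines, CODEBOOK)

# A second made-up file keyed on the same respondent ID.
regions = [
    {"respondent_id": "0001", "region": "Northeast"},
    {"respondent_id": "0003", "region": "South"},
]

combined = merge_on(people, regions, "respondent_id")
for row in combined:
    print(row)
```

Note that this sketch keeps only respondents found in both files, so a respondent missing from the second file silently drops out. Deciding how to handle unmatched cases is exactly the kind of detail that makes complicated data sets time-consuming.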
In general, for your early course work, you should think about topics that are interesting to you and that are related to what you might want to pursue in more detail later on. Find a data set that has some measure for your dependent variable, as well as several useful independent variables (and, importantly, variables that meet the criteria of the assignment, e.g., a linear dependent variable). Note that it is better to have fewer but more applicable variables than to wallow around in hundreds of variables trying to see what pans out.
Of course, it is often difficult to know ahead of time if a given data set is going to be useful to you, and/or if it is too complicated for the given assignment or the amount of time you have in which to complete it. Your professor will be your first and best resource in deciding which data might be most appropriate.
How much time do I have? (Or, AGHH! What can I possibly do now that I have waited so long???)
Time (or lack thereof) is a critical factor in choosing a data set. The sooner you focus on a topic and find a data set, the better. Although obtaining data is becoming increasingly easy, sometimes there are slowdowns. Sometimes the data doesn't have what you want or need and you have to find something else. Sometimes there are computer glitches. Sometimes the faculty member you need goes to a conference!
However, if your deadline is approaching and you still don't have any data, there are several useful (and heavily used) studies immediately available, such as the General Social Survey or the American National Election Studies. These studies generally have adequate data for class assignments (although probably not for advanced analysis courses...).
See some of the many resources available for Choosing a Topic.