This is an Eval Central archive copy; find the original at evalacademy.com.
This article is part of a series: How To Enter Survey Data
Part 1: Three Steps for Painless Survey Data Entry
Part 2: Preventing Mistakes in Survey Data Entry
Part 3: Common Issues with Survey Data Entry (and How to Solve Them)
Arguably the most exciting part about conducting a survey is seeing the results – finally your hard work has come to fruition, and you get to hear what everybody had to say about your program or organization! But before you can get to that step, you need to transform the stack of paper surveys on your desk into useable data.
For some, survey data entry is a mind-numbing task, but I kind of love it. You get to switch off the critical thinking part of your brain and focus on one simple task, which is an opportunity we rarely get in this fast-paced world.
I’m going to share my three-step system for making survey data entry as easy and painless as possible, which comes from my experience designing surveys and entering and analyzing survey data.
Before you get started entering survey data, you should think about your goals. My priorities for survey data entry are that it is:
- Accurate,
- Easy to analyze, and
- Fast.
The most important requirement of data entry is accuracy. If the data isn’t accurate, then forget analysis and speed. Accurate data entry means that what ends up in the spreadsheet reflects exactly what was on the survey, every single time.
The next priority is that the survey data is easy to analyze. With some forethought, you can save your data analyst (who might also be you!) a lot of time and headaches down the road.
Finally, data entry should be as fast as possible – time is money, after all! But never, ever sacrifice accuracy for speed.
Here are the three steps you can follow to set yourself up for painless survey data entry:
1. Review the survey carefully
Familiarize yourself with the questions on the survey, and the available options. Are there fill in the blanks? Multiple choice? Select all that apply? Most likely there are many question types, and understanding all the different questions is critical to steps 2 and 3. If it’s your first time seeing the particular survey, you might want to sit down and fill out a blank copy as if you were a respondent to get a really good feel for the questions.
2. Create the codebook
You should never be typing out the verbatim responses to each question while entering survey data (e.g., “yes” “yes” “no” “yes”). Instead, assign each response a number (e.g., yes = 1, no = 2) and enter those numbers instead of words. This fulfills all of our data entry priorities: it is more accurate, easier to analyze, and faster.
The codebook is your translator between the survey and the data. It tells you (and the analyst) how to turn survey responses into numbers, and back again. A copy of this codebook should live in the same folder as the data entry sheet and be clearly named. For added convenience, I paste a copy of the codebook into the data entry spreadsheet (Step 3) in a tab called Codebook. Here is what a simple codebook looks like:
The codebook outlines which number should be entered for each response. In this example, if someone answered Yes to Q1, you would enter “1.” If they answered No, enter “2.” You’ll notice I added the question numbers beside the questions – sometimes the paper surveys you receive won’t have the questions numbered, so you should write them into the codebook.
How you assign the response codes is up to you, but I strongly recommend following this system: from left-to-right and top-to-bottom, number the responses sequentially starting from 1. This way, the codes are the same no matter what the question is, which helps you ensure accuracy and speed. By following the same coding system for every question, the data entry person knows that the first response is always “1,” the second is always “2,” and so on. Numbers are faster and more accurate to type than letters because they are all close together on your keyboard’s number pad. Note: there are some exceptions to this rule when it comes to more complex question types, which I will cover in a follow-up article.
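The sequential numbering rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `build_codes` helper, the Q2 question, and its response options are hypothetical and not taken from any real survey.

```python
def build_codes(responses):
    """Number the response options sequentially, starting from 1.

    `responses` should list the options in the order they appear on the
    paper survey (left-to-right, top-to-bottom).
    """
    return {label: number for number, label in enumerate(responses, start=1)}


# Hypothetical questions: Q1 is a yes/no question, Q2 is a rating scale.
codebook = {
    "Q1": build_codes(["Yes", "No"]),
    "Q2": build_codes(["Poor", "Fair", "Good", "Excellent"]),
}

for question, codes in codebook.items():
    print(question, codes)
```

Because every question is numbered the same way, the first option on the page is always 1, whatever the question happens to be.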
When it comes time to analyze the data, you might need to recode the data depending on how it will be analyzed (for example, maybe you want to change all the 1’s back to Yes’s, or change all the 2’s to 0’s). This is quick and easy to do at the analysis stage, and is not very prone to errors as long as you document any changes you make. Trust me, it’s way easier to change all the 1’s to Yes’s at the end than it is to type out “y-e-s” (or even just “y”) during data entry.
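As a rough sketch of what recoding at the analysis stage might look like, here is a small Python example. The sample data and the `recode` helper are hypothetical; the mappings themselves double as documentation of the change, and the entered data is left untouched.

```python
# Hypothetical entered data for Q1, using the codes yes = 1, no = 2.
q1_entered = [1, 1, 2, 1]

# Recode map for reporting: turn the codes back into words.
to_labels = {1: "Yes", 2: "No"}

# Recode map for analysis: keep 1 for yes and change 2 to 0.
to_binary = {1: 1, 2: 0}


def recode(values, mapping):
    """Return a new list with each code replaced according to the mapping.

    Codes that are not in the mapping are left unchanged, so nothing is
    silently dropped.
    """
    return [mapping.get(value, value) for value in values]


q1_labels = recode(q1_entered, to_labels)
q1_binary = recode(q1_entered, to_binary)

print(q1_labels)  # ['Yes', 'Yes', 'No', 'Yes']
print(q1_binary)  # [1, 1, 0, 1]
```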
You’ll notice that I added “blank = 99 and unclear response = 98.” These are codes you will use when someone skips a question (99) or if they check more boxes than they are supposed to (98). How you deal with missing or unclear responses is up to you – just decide on a rule, document it, and apply it consistently. Entering 99 instead of leaving a blank cell is good practice because then you know for sure that question was skipped by the respondent, and not accidentally missed during data entry. However, do not use 98 and 99 if you are recording a numeric variable like age, because you won’t know if it is supposed to be “99 years old” or “missing data.” In this case, you may want to use 999 for missing data instead. Read more about blanks in data entry in our article “Four Common Data Entry Mistakes (and How to Fix Them)”.
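To show why the missing-data code has to depend on the question type, here is a minimal Python sketch. The `age` question, the sample values, and the `drop_missing` helper are assumptions made for illustration only.

```python
# Missing-data codes per question. Q1 is a yes/no question; "age" is a
# hypothetical numeric question where 98 and 99 could be real answers.
MISSING_CODES = {
    "Q1": {98, 99},  # 98 = unclear response, 99 = blank
    "age": {999},    # 999 = missing, so a real age of 99 stays usable
}


def drop_missing(question, values):
    """Return only the values that are real responses for this question."""
    missing = MISSING_CODES[question]
    return [value for value in values if value not in missing]


q1_values = [1, 2, 99, 1, 98]
age_values = [34, 99, 999, 52]

print(drop_missing("Q1", q1_values))    # [1, 2, 1]
print(drop_missing("age", age_values))  # [34, 99, 52]
```

Keeping the rule in one place, as in the `MISSING_CODES` dictionary here, is one way to document it and apply it consistently.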
3. Create the data entry spreadsheet
Now that you have the codebook, the data entry spreadsheet is easy to create. Using Microsoft Excel or Google Sheets (or other spreadsheet software), create a new file with one column for each question, plus a column for an identification number (ID#). Each cell will contain one number corresponding to the response to that question. For the above example, the spreadsheet (with some sample data) would look like this:
In a data entry spreadsheet, each row should always contain all the data for one unique individual. I like to add an ID column and fill the ID numbers all the way down the column before starting data entry. Even if there is no ID number on the survey to begin with, it is a good idea to add it to the spreadsheet because some statistical programs require a unique ID for each respondent. You may also want to manually write the ID numbers on the surveys as you enter them — if you don’t put ID numbers on the surveys, it is very difficult to go back and fix mistakes or do quality control.
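If you prefer to generate the empty sheet rather than build it by hand, something like the following Python sketch could work. The question columns, the number of surveys, and both helper functions are hypothetical; the output is plain CSV, which spreadsheet software can open.

```python
import csv
import io

QUESTIONS = ["Q1", "Q2", "Q3"]  # hypothetical question columns
NUM_SURVEYS = 5                 # hypothetical size of the paper stack


def make_template(questions, num_surveys):
    """Build an empty entry sheet: one row per survey, IDs pre-filled."""
    header = ["ID"] + questions
    rows = [[survey_id] + [""] * len(questions)
            for survey_id in range(1, num_surveys + 1)]
    return [header] + rows


def duplicate_ids(rows):
    """Return any ID that appears more than once (header row excluded)."""
    seen, duplicates = set(), set()
    for row in rows[1:]:
        if row[0] in seen:
            duplicates.add(row[0])
        seen.add(row[0])
    return sorted(duplicates)


template = make_template(QUESTIONS, NUM_SURVEYS)

# Written to an in-memory buffer here; to save a real file, replace the
# buffer with open("entry_sheet.csv", "w", newline="").
buffer = io.StringIO()
csv.writer(buffer).writerows(template)
print(buffer.getvalue())

# Every respondent should have a unique ID, so this list should be empty.
print(duplicate_ids(template))
```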
When entering data, I keep my right hand on the keyboard’s number pad, and my left hand on the Tab key. Hitting Tab moves you to the right in the spreadsheet (to the next question), and when you get to the end of each survey you hit Enter to move down to the start of the next row. Remember to keep an eye on the screen to make sure you are still entering data in the correct cells.
Now that you’ve familiarized yourself with the survey and set up your codebook and data entry spreadsheet, it’s time to start entering data! This is the part where I turn on a podcast or some music, and let my mind focus solely on the task of data entry. If you follow these steps, you might be surprised at how painless (and even relaxing) data entry can be.
In the next article, I will cover some more advanced survey data entry topics, such as entering complex question types and dealing with unclear responses.