Module 2: Cases, variables, and measurement


Module 2 asks you to engage some fundamental questions: What are data? Cases? Variables? It also asks you to grapple with critical issues surrounding measurement and to consider how these issues play out in Daniel Treisman’s famous article on “The causes of corruption” (2005). Finally, the Module introduces important statistical and visual tools for exploring and describing variables.


  • Define data, cases, and variables
  • Explain the qualities of “good” measurement; begin to analyze scholarship in light of these qualities
  • Grasp the intuitions behind common measures of central tendency and spread; become familiar with their notation and learn to calculate them in R
  • Explore data in R using summary functions, tables, and plots

Exercise 1

Download, complete, and submit Exercise 1 by 5pm CST on Monday, 1/25. The file is available on Canvas. I recommend that you preview the Exercise early in the week so that you can plan your time accordingly.

What is data?

Read OIS sections 1.1–1.5. Some of this material may be review, but don’t worry if it isn’t!

Now watch the brief video on “What is data?”

Variables and measurement

As the video mentions, we will dig into some actual data as we complete the practice questions. But first, watch the video on “Variables and measurement.” Note that the video will occasionally ask you to press “pause” and then spend some time answering some “class questions.” You don’t have to submit your answers to these questions. However, quickly jotting your answers down may be useful, as we will discuss some of the questions during our section meetings.

How do issues surrounding conceptual clarity, validity, and reliability play out in social science scholarship? In a moment, you will read “The causes of corruption,” by Daniel Treisman (2005). This is a famous article that uses linear regression to test common hypotheses about the causes of corruption worldwide. Before you read, take some time to answer a few questions. (As above, I suggest that you jot your answers down somewhere): (1) What is corruption? Define it. (2) Imagine that you wanted to measure corruption levels across countries. How would you go about doing that? What kind of data would you look for? (3) What are some advantages of your approach? What are some of its disadvantages?

Now read the article.

Once you have finished reading, answer these questions: (1) Do you think that Transparency International (TI) Index scores are valid and reliable as a measure of corruption? (2) Treisman makes the case that they are. What does he argue to support the validity and reliability of his measure? (3) Are you persuaded? Why or why not?

Describing variables

Now read OIS sections 1.6 and 1.7. Then, watch the video on “Describing Variables: Measures of Central Tendency and Spread.”
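As a quick preview of how these measures look in R, here is a minimal sketch. The `scores` vector is made up purely for illustration and is not part of the course data; the functions themselves (`mean()`, `median()`, `var()`, `sd()`, `IQR()`, `range()`) are all part of base R.

```r
# Hypothetical data: seven made-up values, used only for illustration
scores <- c(2, 4, 4, 5, 7, 9, 11)

# Measures of central tendency
mean(scores)    # arithmetic mean: 6
median(scores)  # middle value of the sorted data: 5

# Measures of spread
var(scores)     # sample variance (divides by n - 1): 10
sd(scores)      # sample standard deviation, the square root of the variance: about 3.16
IQR(scores)     # interquartile range (Q3 minus Q1): 4
range(scores)   # smallest and largest values: 2 and 11
```

Note that `var()` and `sd()` divide by n - 1 rather than n, so they return the sample versions of these statistics.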

And now, watch the video on “Describing Variables: Tables and Plots.”
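If you would like to experiment before opening the practice questions, the sketch below shows a few base R tools for tables and plots. It uses `mtcars`, a small dataset that ships with R, only as a stand-in; the practice questions may use different data and different variable names.

```r
# mtcars is a built-in R dataset, used here only as a stand-in example
data(mtcars)

# Five-number summary plus the mean for a numeric variable
summary(mtcars$mpg)

# Frequency table for a variable with few distinct values
cyl_counts <- table(mtcars$cyl)
cyl_counts

# The same table expressed as proportions
cyl_props <- prop.table(cyl_counts)
cyl_props

# Histogram: the distribution of one numeric variable
hist(mtcars$mpg, main = "Fuel efficiency", xlab = "Miles per gallon")

# Bar plot: counts for each category
barplot(cyl_counts, xlab = "Number of cylinders", ylab = "Count")

# Box plot: a numeric variable split by groups
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Number of cylinders", ylab = "Miles per gallon")
```

Histograms and box plots suit numeric variables, while tables and bar plots suit variables with a small number of categories.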

Let’s apply some of the concepts presented above! Download the “module 2 practice questions” RMD from Canvas (“Files/Practice Questions”). Note that there are 2 parts to the practice questions. Complete the practice questions in RMD and then knit your files to PDF.

Remember: If you get stuck at any point… breathe. Coding can be frustrating at first, but we will work through it together. There are lots of ways to seek help:

  1. Use the “Help” tab in RStudio
  2. Search the internet
  3. Post your question to Ed Discussion
  4. As a final option, email me directly or visit me during my office hours

As you seek help, try to specify the nature of the problem: Examine any warnings or error messages. What line of code seems to be the issue? Which function, specifically? (During knitting, R Markdown will often tell you which line of code is stalling the knitting process.) If you are getting error messages, are you missing parentheses, commas, or quotation marks? (This happens to me all the time.) Answering these questions will help to ensure that you get the help you need.
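To make those common slips concrete, here is a small illustrative sketch. The broken lines are commented out so that the chunk still runs; the exact wording of the error messages can vary, so treat the comments as a rough guide rather than a transcript. The built-in `mtcars` dataset is used only as a stand-in.

```r
# Missing closing parenthesis: R waits for more input or reports a syntax error
# mean(c(1, 2, 3)
avg <- mean(c(1, 2, 3))        # fixed: every "(" has a matching ")"

# Missing comma between values: R reports an unexpected numeric constant
# c(1 2, 3)
values <- c(1, 2, 3)           # fixed: values separated by commas

# Missing quotation marks: R looks for an object called mpg and cannot find it
# mtcars[, mpg]
mpg_column <- mtcars[, "mpg"]  # fixed: the column name is a quoted string

# Opening a help page from code does the same job as the "Help" tab
help("mean")
```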