Online Respondents Struggle with Longer Pages. Pollsters Should Take Note.
by Scott Blatte (Class of ’23)
In my last blog post, I identified a group of contradictory responses to two policing questions from individuals who, I argued, were likely inattentive when answering them. This finding raises two important follow-up questions, both of which I work to answer in this post. First, did these individuals exhibit illogical response patterns only on the policing questions, or were their responses to other questions also suspect? Second, if these respondents are indeed systematically responding based on factors apart from their personal views, on which questions is this happening, and why?
Let’s begin with the former question, which is important for determining the magnitude of these respondents’ impact. If these respondents are inattentive only on the policing questions, then their impact on the overall survey is minimal. However, if 6% of the sample (the share of people who contradicted themselves) exhibits response error on a significant number of questions, then the impact of these respondents could be considerable. My previous blog post focused on inattentiveness in the two policing-specific questions; whether the same phenomenon persisted throughout the CES was unclear. It is plausible that some unobserved factor specific to policing altered the response patterns of these respondents, limiting the scope of their unreliability to policing alone. If that is the case, then the source of the contradictory answers may not be inattentiveness, but rather the topic of policing itself. On the other hand, if the contradictory respondents show similarly fast response times across many questions, then inattentiveness is far more likely to be the cause. Here, I try to resolve that ambiguity and identify just how problematic these 3,000 people are for the CES.
To answer this question, I again relied on the CES’s wealth of timing data, which is automatically collected for each page of the survey. For each page, I subtracted the mean response time of the contradictory respondents from the mean response time of the normal respondents. The following graph depicts the differences for each page.
Based on this alone, there is strong evidence that the contradictory group systematically differed from the normal group. On average, the normal respondents took longer to respond than the contradictory respondents. For most pages, the difference is negligible, clustering tightly around zero. For a handful of pages, however, the gap becomes more pronounced and likely more consequential. This is reflected by the red points, which mark pages with a mean difference greater than five seconds. All fourteen of those pages have positive differences, indicating that when the two groups diverge substantially, it is always the contradictory respondents taking less time to respond.
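For readers who want to reproduce this kind of check on the public CES timing data, here is a minimal sketch of the calculation described above, the per-page gap plus the five-second flag. The file name, the contradictory flag, and the page_timing_ column prefix are illustrative assumptions, not the actual CES variable names.

```python
import pandas as pd

# Hypothetical layout: one row per respondent, a "contradictory" flag,
# and one timing column per survey page, in seconds. The file and column
# names are illustrative, not the actual CES variable names.
timings = pd.read_csv("ces_page_timings.csv")
timings["contradictory"] = timings["contradictory"].astype(bool)  # in case the flag is stored as 0/1
timing_cols = [c for c in timings.columns if c.startswith("page_timing_")]

# Mean response time per page for each group, then the gap
# (normal minus contradictory), matching the calculation described above.
group_means = timings.groupby("contradictory")[timing_cols].mean()
gap = group_means.loc[False] - group_means.loc[True]

# Pages where the gap exceeds five seconds (the red points in the graph).
flagged = gap[gap > 5].sort_values(ascending=False)
print(f"{len(flagged)} pages show a gap above 5 seconds")
print(flagged)
```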
Given this evidence, I now move to the second question I posed at the beginning of this post: which questions are respondents diverging on, and why? Specifically, I focus on the first cluster of red points from the preceding graph. The mean response times for each group on those pages are shown below.
On one hand, these questions could be united by a common theme: highly polarized topics, policies with direct personal impacts (such as policing and racial justice), or some other trait shared by the questions with larger differences in response times. That theme, whatever it is, would then cause a specific subset of respondents to exhibit unusual response behavior on these questions. While the topics in the second graph are all politically polarized, and thus overlap somewhat, they still span an extremely wide range; abortion, policing, and trade are all distinct from one another. Based on this graph alone, then, there is no strong evidence that any particular issue drives the inattentive responses.
The evidence does, however, support the hypothesis that the time required to complete a page is a strong predictor of satisficing behavior. The divergence between the two groups appears to occur when the page itself requires more time to complete; for context, most other pages on the CES took under 20 seconds. This suggests one potential cause of inattentiveness: the length of time required to read, process, and respond to the question. Of course, a longer baseline completion time mechanically implies a larger absolute gap between the two groups, even if their behavior is otherwise identical. But as the graph shows, the gap on pages with mean differences greater than five seconds is larger in relative terms as well, not just in magnitude. The intuition behind this explanation is straightforward: when a question demands a longer response time, respondents predisposed to inattentiveness see the length of the question and the effort required to process and answer it, and, rather than take that time, they respond quickly without consulting their actual views.
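To make the “relative terms” comparison concrete, one rough way to check it is to scale each page’s gap by the normal group’s mean time for that page. The snippet below builds on the hypothetical group_means and gap objects from the earlier sketch.

```python
# Relative gap: each page's gap as a share of the normal group's mean time.
# Builds on the hypothetical `group_means` and `gap` from the sketch above.
relative_gap = gap / group_means.loc[False]

# If longer pages only inflated the gap mechanically, the relative gap would
# stay roughly flat across pages; a larger relative gap on the flagged (long)
# pages is what suggests disproportionate speeding on them.
print(relative_gap[gap > 5].mean())
print(relative_gap[gap <= 5].mean())
```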
Interestingly, the pages highlighted above are not only all above average in length; they also share the same structure: a grid. That is, each page is a set of questions on a single topic arranged in a grid, and the timing for the entire grid is recorded as a single value, which is what my analysis displays. Of course, a grid with multiple questions is bound to take longer than a single text-box question, so it may simply be that the grid questions happen to be the longest questions on the CES and therefore produce the response-time divergence. I wonder, however, whether the grid structure itself is a factor in inattentiveness. For example, consider a single, more in-depth question that should take about 40 seconds to answer and a three-question grid page that also takes approximately 40 seconds. Under my hypothesis, a respondent prone to inattentiveness who sees the single question is less likely to rush or skip through the page, because nothing clearly signals that the question requires extra time. When presented with a three-part grid that plainly signals additional time will be required, however, respondents predisposed to rush or respond inattentively will do so in greater proportions, and that may undermine the reliability of the results on these grid questions.
In sum, respondent perceptions about length appear to pose problems. If respondents believe that a question or page will take longer because of the presence of a visible correlate of time (grids, for example), then they are more likely to respond illogically due to satisficing. Pollsters should take note. Complex or compound questions already raise concerns that respondents will get confused or lost in the question. But in online surveys, regardless of complexity, the number of questions on a single page may pose a problem of its own. The preceding analysis is observational and should not be read as “proof” that 6% of respondents were inattentive, that longer questions have a nonlinear, increasing effect on the gap between response times, or that grid questions are associated with more inattentive responses. Nevertheless, as the CES Guide warns, “if just 0.5% of respondents provide a mistaken response to a question, then it may result in hundreds of respondents being mis-categorized,” thereby confounding analysis, especially for small subgroups. And if hundreds of mis-categorized respondents pose a problem for large-n surveys, what happens when thousands give problematic responses? The answer depends largely on whether that error has a consistent direction, which is why I think the question deserves further attention and exploration from pollsters.
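As a rough back-of-the-envelope comparison, using only the figures quoted in this post (so the implied sample size is an estimate, not an official CES number), the Guide’s 0.5% scenario and the contradictory group found here differ by roughly an order of magnitude:

```python
# Back-of-the-envelope counts using only the figures cited in this post;
# the implied sample size is a rough estimate, not an official CES number.
contradictory = 3_000                                  # contradictory respondents identified earlier
contradictory_share = 0.06                             # roughly 6% of the sample
implied_sample = contradictory / contradictory_share   # about 50,000 respondents

guide_scenario = 0.005 * implied_sample                # the Guide's 0.5% case: about 250 people
print(implied_sample, guide_scenario, contradictory)
```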