Assessing the Risks to Online Polls from Bogus Respondents

As online polling has become more widespread, Pew Research Center’s methodologists have examined whether variability in how online respondents are recruited has consequences for data quality. In this session, Director of Survey Research Courtney Kennedy will discuss the risk to online polls from untrustworthy, or “bogus,” respondents whose answers degrade the quality of survey data.

Key takeaways will include:
Not all online polls suffer from this problem. Sourcing affects data quality: online samples fielded with widely used opt-in sources contain small but measurable shares of bogus respondents (about 4% to 7%, depending on the source). Critically, these bogus respondents are not just answering at random or “adding noise”; rather, they tend to select positive answer choices, introducing a small, systematic bias into estimates such as presidential approval.
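
To make that mechanism concrete, here is a back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption rather than a result from the study: a 50% true approval rate, a 5% bogus share (within the 4% to 7% range above), and bogus respondents who always pick the positive option.

    # Illustrative only: all values below are assumptions, not study findings.
    true_approval = 0.50        # assumed approval among genuine respondents
    bogus_share = 0.05          # assumed bogus share, within the 4%-7% range
    bogus_positive_rate = 1.0   # assume bogus respondents always choose "approve"

    # The observed estimate is a mixture of genuine and bogus answers.
    observed = (1 - bogus_share) * true_approval + bogus_share * bogus_positive_rate
    print(f"observed: {observed:.1%}, bias: {observed - true_approval:+.1%}")
    # observed: 52.5%, bias: +2.5%

Even a 5% bogus share moves the estimate by a couple of points in one direction, which is why the problem is a systematic bias rather than random noise.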

No method of online polling is perfect, but there are notable differences across approaches in the risks posed by bogus interviews. Crowdsourced and opt-in survey panel respondents were more likely to give bogus data than respondents recruited offline via random sampling of addresses. To help the public differentiate between trustworthy and untrustworthy polls, it would be helpful if poll methodology statements described what data-quality checks, if any, were performed.

Some poll questions are more affected by bogus respondents than others. Questions that allow the respondent to give a positively valenced answer show larger effects than those that do not.

Two of the most common checks for detecting low-quality online interviews (screening out respondents who answer items too fast or who fail an attention check, or ‘trap,’ question) are not very effective at identifying bogus respondents. Some 84% of bogus respondents pass a trap question, and 87% pass a check for responding too quickly.
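
For readers unfamiliar with these two checks, the sketch below shows a minimal version of each: a speeding filter and a trap-question filter. The field names, the time threshold, and the trap item are hypothetical, not taken from the study.

    # Minimal, hypothetical sketch of the two common checks described above.
    # Field names, the 120-second threshold, and the trap item are assumptions.
    MIN_SECONDS = 120                # assumed minimum plausible completion time
    TRAP_CORRECT = "somewhat agree"  # e.g., "Please select 'somewhat agree' here."

    def fails_common_checks(respondent: dict) -> bool:
        """Return True if the interview is flagged by either check."""
        too_fast = respondent["duration_seconds"] < MIN_SECONDS
        failed_trap = respondent["trap_answer"] != TRAP_CORRECT
        return too_fast or failed_trap

    respondents = [
        {"id": 1, "duration_seconds": 95, "trap_answer": "strongly agree"},
        {"id": 2, "duration_seconds": 640, "trap_answer": "somewhat agree"},
    ]
    print([r["id"] for r in respondents if fails_common_checks(r)])  # [1]

Because 84% of bogus respondents pass the trap question and 87% pass the speed check, a pipeline relying on these filters alone will retain most bogus interviews.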