Web based health surveys: Using a Two Step Heckman model to examine their potential for population health analysis
Social Science & Medicine
In June 2011, BBC Lab UK carried out a web-based survey on the causes of mental distress. The ‘Stress Test’ was launched on ‘All in the Mind’, a BBC Radio 4 programme, and the test’s URL was publicised on radio and TV broadcasts and made available via BBC web pages and social media. Given the large amount of data created (over 32,800 participants, with corresponding diagnosis, demographic and socioeconomic characteristics), the dataset is potentially an important source of data for population-based research on depression and anxiety. However, as respondents self-selected to participate in the online survey, the survey may comprise a non-random sample. It may be that only individuals who listen to BBC Radio 4 and/or use its website participated in the survey. In this instance, using the Stress Test data for wider population-based research may create sample selection bias. Focusing on the depression component of the Stress Test, this paper presents an easy-to-use method, the Two Step Probit Selection Model, to detect and statistically correct selection bias in the Stress Test. Using a Two Step Probit Selection Model, this paper did not find statistically significant selection on unobserved factors for participants of the Stress Test. That is, survey participants who accessed and completed an online survey are not systematically different from non-participants on the variables of substantive interest.
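For readers unfamiliar with the approach named in the abstract, the following is a minimal sketch of a generic Heckman two-step correction on simulated data. It is not the paper's specification: the paper's model and covariates are not given in this record, and the sketch assumes a continuous outcome with a linear second step, whereas the paper's depression outcome may call for a probit second step. All names (`fit_probit`, `heckman_two_step`, `z_excl`) and all simulated values are hypothetical. Step one fits a probit for participation; step two adds the inverse Mills ratio to the outcome regression, and a coefficient on that term near zero suggests no selection on unobserved factors.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_pdf(x):
    # Standard normal density
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    # Standard normal distribution function
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))

def fit_probit(Z, s, n_iter=50, tol=1e-10):
    # Step 1: probit for participation (s = 1), fitted by Fisher scoring
    gamma = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        zg = Z @ gamma
        p = np.clip(norm_cdf(zg), 1e-10, 1.0 - 1e-10)
        d = norm_pdf(zg)
        w = d / (p * (1.0 - p))
        grad = Z.T @ (w * (s - p))             # score vector
        info = (Z * (w * d)[:, None]).T @ Z    # expected information
        step = np.linalg.solve(info, grad)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma

def heckman_two_step(y, X, Z, s):
    gamma = fit_probit(Z, s)
    zg = Z @ gamma
    # Inverse Mills ratio for each observation
    imr = norm_pdf(zg) / np.clip(norm_cdf(zg), 1e-10, None)
    sel = s == 1
    # Step 2: OLS on participants only, with the inverse Mills ratio added
    X_aug = np.column_stack([X[sel], imr[sel]])
    coef, *_ = np.linalg.lstsq(X_aug, y[sel], rcond=None)
    return gamma, coef[:-1], coef[-1]

# Simulated data (assumption: illustrative values only, not Stress Test data)
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
z_excl = rng.normal(size=n)   # affects participation but not the outcome
rho = 0.6                     # correlation between the two error terms
u = rng.normal(size=n)
e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.normal(size=n)
s = (0.5 + 0.5 * x + 1.0 * z_excl + u > 0).astype(float)
y = 1.0 + 2.0 * x + e         # only used where s == 1

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), x, z_excl])
gamma, beta, lam = heckman_two_step(y, X, Z, s)
print("selection coefficients:", gamma)
print("outcome coefficients:", beta)
print("inverse Mills ratio coefficient:", lam)
```

In this simulation the error terms are correlated by construction, so the coefficient on the inverse Mills ratio should be clearly non-zero. The abstract reports the opposite finding for the Stress Test. The sketch does not compute the corrected second-step standard errors that a formal significance test would require.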
This is the author accepted manuscript. The final version is available from the publisher via the DOI in this record.
Article in press version. Available online 1 July 2016