Relative Bias in Health Estimates from Probability-Based Online Panels: Systematic Review and Meta-Analysis
Keywords:
Probability-based online panels, Data quality, Relative bias, Meta-analysis, Health research

Abstract
Introduction: Health surveys require the highest data quality, especially when they inform public health policies. With recent technological developments, probability-based online panels (PBOPs) have become an attractive, cost-effective alternative to traditional surveys and are beginning to be used for official health statistics. However, concerns remain about bias in PBOP data, especially for health-related estimates.
Method: Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach, we conducted a systematic review and meta-analysis of PBOP health survey data quality, with the relative bias (RB) of the estimates as the effect size. We analysed 137 health-related survey items from 14 studies and used a linear regression model to examine factors that moderate RB.
Results: RB varied considerably across subjects, with an overall median of 12.7%. The highest RBs were observed for disabilities (23.6%), mental illnesses (23.2%), personal mental health conditions (20.8%) and drug use (20.7%), and the lowest for doctor’s treatment (2.2%). Items measured on ordinal scales showed higher RB (25.8%), and certain country effects were also observed.
Conclusion: This moderate bias in health estimates raises concerns about the accuracy of PBOP estimates on sensitive health topics. PBOPs should therefore be used cautiously for official health statistics, and item and study characteristics should be treated as methodological considerations when designing PBOP surveys on health subjects.
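For illustration, the RB effect size can be sketched as follows. The abstract does not state the exact formula used in the study, so this minimal sketch assumes the common definition RB = |panel estimate − benchmark| / benchmark × 100; the item values are hypothetical:

```python
# Sketch: relative bias (RB) of panel estimates against benchmark values.
# Assumed definition: RB = |estimate - benchmark| / benchmark * 100 (percent);
# the item data below are hypothetical, for illustration only.
from statistics import median

def relative_bias(estimate: float, benchmark: float) -> float:
    """Absolute relative bias of a panel estimate, in percent."""
    return abs(estimate - benchmark) / benchmark * 100

# Hypothetical item-level prevalence estimates: (panel %, benchmark %)
items = [(18.0, 15.0), (6.1, 5.9), (31.0, 40.0)]
rbs = [relative_bias(est, bench) for est, bench in items]
print(round(median(rbs), 1))  # median RB across items, as reported per subject
```

Summarising item-level RBs by their median, as above, is robust to the occasional extreme item, which matters when a few sensitive topics show much larger bias than the rest.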
References
Ali, A., Hassiotis, A., Strydom, A., & King, M. (2012). Self stigma in people with intellectual disabilities and courtesy stigma in family carers: A systematic review. Research in Developmental Disabilities, 33(6), 2122–2140. https://doi.org/10.1016/j.ridd.2012.06.013
Berzelak, J., & Vehovar, V. (2018). Mode effects on socially desirable responding in web surveys compared to face-to-face and telephone surveys. Metodološki Zvezki, 15(2), 21–43. https://doi.org/10.51936/lrkv4884
Bialik, K. (2018, December 6). How asking about your sleep, smoking or yoga habits can help pollsters verify their findings. Pew Research Center. https://www.pewresearch.org/short-reads/2018/12/06/how-asking-about-your-sleep-smoking-or-yoga-habits-can-help-pollsters-verify-their-findings/
Bosch, O. J., & Maslovskaya, O. (2023, May 26). GenPopWeb2: The utility of probability-based online surveys – literature review. National Centre for Research Methods. https://doi.org/10.31235/osf.io/f69hy
Bosnjak, M., Dannwolf, T., Enderle, T., Schaurer, I., Struminskaya, B., Tanner, A., & Weyandt, K. W. (2018). Establishing an open probability-based mixed-mode panel of the general population in Germany: The GESIS Panel. Social Science Computer Review, 36(1), 103–115. https://doi.org/10.1177/0894439317697949
Bosnjak, M., Haas, I., Galesic, M., Kaczmirek, L., Bandilla, W., & Couper, M. P. (2013). Sample composition discrepancies in different stages of a probability-based online panel. Field Methods, 25(4), 339–360. https://doi.org/10.1177/1525822X12472951
Bradley, V. C., Kuriwaki, S., Isakov, M., Sejdinovic, D., Meng, X. L., & Flaxman, S. (2021). Unrepresentative big surveys significantly overestimated US vaccine uptake. Nature, 600(7890), 695–700. https://doi.org/10.1038/s41586-021-04198-4
Callegaro, M., Manfreda, K. L., & Vehovar, V. (2015). Selected topics in web survey implementation (Chapter 5). In Web survey methodology (pp. 191–230). SAGE Publications. https://doi.org/10.4135/9781529799651
Cornesse, C., & Blom, A. G. (2023). Response quality in nonprobability and probability-based online panels. Sociological Methods & Research, 52(2), 879–908. https://doi.org/10.1177/0049124120914940
Cornesse, C., Felderer, B., Fikel, M., Krieger, U., & Blom, A. G. (2022). Recruiting a probability-based online panel via postal mail: Experimental evidence. Social Science Computer Review, 40(5), 1259–1284. https://doi.org/10.1177/08944393211006059
Cornesse, C., Krieger, U., Sohnius, M., Fikel, M., Friedel, S., Rettig, T., Wenz, A., Juhl, S., Lehrer, R., Möhring, K., Naumann, E., Reifenscheid, M., & Blom, A. G. (2021). From German Internet Panel to Mannheim Corona Study: Adaptable probability‐based online panel infrastructures during the pandemic. Journal of the Royal Statistical Society, Series A (Statistics in Society), 185, 773–797. https://doi.org/10.1111/rssa.12749
Dettori, J. R., Norvell, D. C., & Chapman, J. R. (2022). Fixed-effect vs random-effects models for meta-analysis: 3 points to consider. Global Spine Journal, 12(7), 1624–1626. https://doi.org/10.1177/21925682221110527
Dever, J. A., Amaya, A., Srivastav, A., Lu, P. J., Roycroft, J., Stanley, M., Stringer, M. C., Bostwick, M. G., Greby, S. M., Santibanez, T. A., & Williams, W. W. (2021). Fit for purpose in action: Design, implementation, and evaluation of the National Internet Flu Survey. Journal of Survey Statistics and Methodology, 9(3), 449–476. https://doi.org/10.1093/jssam/smz050
Digital Library of University of Ljubljana (DiKUL). http://dikul.uni-lj.si
do Nascimento, I. J. B., Pizarro, A. B., Almeida, J. M., Azzopardi-Muscat, N., Gonçalves, M. A., Björklund, M., & Novillo-Ortiz, D. (2022). Infodemics and health misinformation: A systematic review of reviews. Bulletin of the World Health Organization, 100(9), 544–561. https://doi.org/10.2471/BLT.21.287654
Eckman, S. (2015). Does the inclusion of non-Internet households in a Web panel reduce coverage bias? Social Science Computer Review, 34(1), 41–58. https://doi.org/10.1177/0894439315572985
Groves, R. M., & Lyberg, L. (2010). Total survey error: Past, present, and future. Public Opinion Quarterly, 74(5), 849–879. https://doi.org/10.1093/poq/nfq065
Hays, R. D., Liu, H., & Kapteyn, A. (2015). Use of Internet panels to conduct surveys. Behavior Research Methods, 47(3), 685–690. https://doi.org/10.3758/s13428-015-0617-9
Herman, P. M., Slaughter, M. E., Qureshi, N., Azzam, T., Cella, D., Coulter, I. D., DiGuiseppi, G., Edelen, M. O., Kapteyn, A., Rodriguez, A., Rubinstein, M., & Hays, R. D. (2024). Comparing health survey data cost and quality between Amazon’s Mechanical Turk and Ipsos’ KnowledgePanel: Observational study. Journal of Medical Internet Research, 26, e63032. https://doi.org/10.2196/63032
Kaczmirek, L., Phillips, B., Pennay, D. W., Lavrakas, P. J., & Neiger, D. (2019). Building a probability-based online panel: Life in Australia™ (CSRM & SRC Methods Paper No. 2/2019). ANU Centre for Social Research & Methods. https://csrm.cass.anu.edu.au/research/publications/building-probability-based-online-panel-life-australia
Kemp, S., & Grace, R. C. (2021). Using ordinal scales in psychology. Methods in Psychology, 5, 100054. https://doi.org/10.1016/j.metip.2021.100054
Kennedy, C., Mercer, A., Keeter, S., Hatley, N., McGeeney, K., & Gimenez, A. (2016, May 2). Evaluating online nonprobability surveys: Vendor choice matters; widespread errors found for estimates based on blacks and Hispanics. Pew Research Center. https://www.pewresearch.org/methods/2016/05/02/evaluating-online-nonprobability-surveys/
Keusch, F., & Yang, T. (2018). Is satisficing responsible for response order effects in rating scale questions? Survey Research Methods, 12(3), 259–270. https://doi.org/10.18148/srm/2018.v12i3.7263
Kim, Y., Park, S., Kim, N.-S., & Lee, B.-K. (2013). Inappropriate survey design analysis of the Korean National Health and Nutrition Examination Survey may produce biased results. Journal of Preventive Medicine and Public Health, 46(2), 96–104. https://doi.org/10.3961/jpmph.2013.46.2.96
Kocar, S., & Baffour, B. (2023). Comparing and improving the accuracy of nonprobability samples: Profiling Australian surveys. Methods, Data, Analyses, 2(2023). https://doi.org/10.12758/MDA.2023.04
Kocar, S., & Biddle, N. (2023). Do we have to mix modes in probability-based online panel research to obtain more accurate results? Methods, Data, Analyses, 16(1), 93–120. https://doi.org/10.12758/mda.2022.11
Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13), 1–26. https://doi.org/10.18637/jss.v082.i13
Lalla, M. (2017). Fundamental characteristics and statistical analysis of ordinal variables: A review. Quality & Quantity, 51(1), 435–458. https://doi.org/10.1007/s11135-016-0314-5
Latkin, C. A., Edwards, C., Davey-Rothwell, M. A., & Tobin, K. E. (2017). The relationship between social desirability bias and self-reports of health, substance use, and social network factors among urban substance users in Baltimore, Maryland. Addictive Behaviors, 73, 133–136. https://doi.org/10.1016/j.addbeh.2017.05.005
Lavrakas, P. J., Pennay, D., Neiger, D., & Phillips, B. (2022). Comparing probability-based surveys and nonprobability online panel surveys in Australia: A total survey error perspective. Survey Research Methods, 16(2), 241–266. https://doi.org/10.18148/srm/2022.v16i2.7907
Lemcke, J., Loss, J., Allen, J., Öztürk, I., Hintze, M., Damerow, S., Kuttig, T., Wetzstein, M., Hövener, C., Hapke, U., Ziese, T., Scheidt-Nave, C., & Schmich, P. (2024). Health in Germany: Establishment of a population-based health panel. Journal of Health Monitoring, 9(Suppl 2), 2–21. https://doi.org/10.25646/11992.2
MacInnis, B., Krosnick, J. A., Ho, A. S., & Cho, M.-J. (2018). The accuracy of measurements with probability and nonprobability survey samples: Replication and extension. Public Opinion Quarterly, 82(4), 707–744. https://doi.org/10.1093/poq/nfy038
Martinsson, J., & Riedel, K. (2015). Postal recruitment to a probability based web panel: Long term consequences for response rates, representativeness and costs (LORE working paper 2015:1). University of Gothenburg. https://gup.ub.gu.se/publication/222612?lang=en
Maslovskaya, O., & Lugtig, P. (2022). Representativeness in six waves of cross-national online survey (CRONOS) panel. Journal of the Royal Statistical Society: Series A (Statistics in Society), 185(3), 851–871. https://doi.org/10.1111/rssa.12801
Matsumoto, D., & van de Vijver, F. J. R. (2012). Cross-cultural research methods. In H. Cooper (Ed.-in-Chief), P. M. Camic, D. L. Long, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology, Volume 1: Foundations, planning, measures, and psychometrics (pp. 85–102). American Psychological Association. https://doi.org/10.1037/13619-000
Mercer, A., & Lau, A. (2023). Comparing two types of online survey samples. Pew Research Center. https://www.pewresearch.org/methods/2023/09/07/comparing-two-types-of-online-survey-samples/
Nayak, S. D. P., & Narayan, K. A. (2019). Strengths and weaknesses of online surveys. IOSR Journal of Humanities and Social Sciences, 24, 31–38.
Pacheco, J. (2020). The policy consequences of health bias in political voice. Political Research Quarterly, 73(4), 935–949. https://doi.org/10.1177/1065912919874256
Page, M. J., Moher, D., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., … McKenzie, J. E. (2021). PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ, 372, n160. https://doi.org/10.1136/bmj.n160
Pennay, D. W., Neiger, D., Lavrakas, P. J., & Borg, K. (2018). The Online Panels Benchmarking Study: A Total Survey Error comparison of findings from probability-based surveys and non-probability online panel surveys in Australia (CSRM Methods Series No. 2/2018). Centre for Social Research and Methods. https://csrm.cass.anu.edu.au/research/publications/online-panels-benchmarking-study-total-survey-error-comparison-findings
Pforr, K., & Dannwolf, T. (2017). What do we lose with online-only surveys? Estimating the bias in selected political variables due to online mode restriction. Statistics, Politics and Policy, 8(1), 105–120. https://doi.org/10.1515/spp-2016-0004
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Schwarz, N., Knäuper, B., & Oyserman, D. (2008). The psychology of asking questions. In E. de Leeuw, J. Hox, & D. Dillman (Eds.), International handbook of survey methodology (pp. 18–34). Taylor & Francis.
Spijkerman, R., Knibbe, R., Knoops, K., Van De Mheen, D., & Van Den Eijnden, R. (2009). The utility of online panel surveys versus computer-assisted interviews in obtaining substance-use prevalence estimates in the Netherlands. Addiction, 104(10), 1641–1645. https://doi.org/10.1111/j.1360-0443.2009.02642.x
Stevens, G. A., Alkema, L., Black, R. E., Boerma, J. T., Collins, G. S., Ezzati, M., Grove, J. T., Hogan, D. R., Hogan, M. C., Horton, R., Lawn, J. E., Marušić, A., Mathers, C. D., Murray, C. J., Rudan, I., Salomon, J. A., Simpson, P. J., Vos, T., & Welch, V. (The GATHER Working Group). (2016). Guidelines for accurate and transparent health estimates reporting: The GATHER statement. The Lancet, 388(10062), E19–E23. https://doi.org/10.1016/S0140-6736(16)30388-9
Struminskaya, B., de Leeuw, E., & Kaczmirek, L. (2015). Mode system effects in an online panel study: Comparing a probability-based online panel with two face-to-face reference surveys. Methods, Data, Analyses, 9(1), 3–56. https://doi.org/10.12758/mda.2015.001
Struminskaya, B., Kaczmirek, L., Schaurer, I., & Bandilla, W. (2014). Assessing representativeness of a probability-based online panel in Germany. In M. Callegaro, R. Baker, J. Bethlehem, A. S. Göritz, J. A. Krosnick, & P. J. Lavrakas (Eds.), Online panel research: A data quality perspective (Chapter 3). Wiley. https://doi.org/10.1002/9781118763520.ch3
Survey Quality Predictor. (2017). SQP coding instructions. Universitat Pompeu Fabra. http://sqp.upf.edu/media/files/sqp_coding_instructions.pdf
Tourangeau, R., Conrad, F. G., & Couper, M. P. (2013). Measurement error on the web and in other modes of data collection (Chapter 7). In The science of web surveys (pp. 129–150). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199747047.003.0007
Unangst, J., Amaya, A. E., Sanders, H. L., Howard, J., Ferrell, A., Karon, S., & Dever, J. A. (2020). A process for decomposing total survey error in probability and nonprobability surveys: A case study comparing health statistics in US Internet panels. Journal of Survey Statistics and Methodology, 8(1), 62–88. https://doi.org/10.1093/jssam/smz040
Vehovar, V., Čehovin, G., & Praček, A. (2023). The use of probability web panels in national statistical institutes [Report, preliminary study]. Faculty of Social Sciences, Centre for Social Informatics. https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=slv&id=145313
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., ... Yutani, H. (2019). Welcome to the tidyverse. The Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Yeager, D. S., Krosnick, J. A., Chang, L., Javitz, H. S., Levendusky, M. S., Simpser, A., & Wang, R. (2011). Comparing the accuracy of RDD telephone surveys and internet surveys conducted with probability and non-probability samples. Public Opinion Quarterly, 75(4), 709–747. https://doi.org/10.1093/poq/nfr020
License
Copyright (c) 2025 Andrea Ivanovska, Michael Bosnjak, Vasja Vehovar

This work is licensed under a Creative Commons Attribution 4.0 International License.
Funding data
The Slovenian Research and Innovation Agency
Grant numbers: P5-0399, J5-3100, J5-50159