Echo chambers and vaccines against COVID-19 mis/disinformation on Twitter: machine learning and network analysis-based approach




Vaccine; Infodemics; Fake News; Echo Chambers; Twitter; Machine learning.


Given the infodemiological importance of echo chambers in the dissemination of mis/disinformation, we aimed to analyze the interaction networks of the users most exposed to mis/disinformation or controversy about vaccines in the context of the COVID-19 pandemic. To this end, we propose a methodology based on machine learning and Social Network Analysis for the automated detection of controversial and mis/disinformative content about vaccines; the resulting model achieved 92% accuracy. Of the nearly 24 million tweets collected, 12.4 million (52%) were flagged as controversial and/or potentially mis/disinformative. January and June 2021 were the months with the highest activity and were analyzed as a cohort. Unlike previous work, we analyzed the network formed by all types of interaction on Twitter and the entire text of the tweets, not just links or hashtags. For the conversation about COVID-19 vaccines, the findings differed from those previously described in the literature for party-political discussion: the network of mentions and replies favors heterophilic relationships, and "echo" conformations were not observed. Further studies are needed to better understand the dissemination of misinformation about vaccines on Twitter.
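The notion of a network that "favors heterophilic relationships" can be illustrated with a small sketch. The abstract does not state which measure the authors used; the example below assumes the Krackhardt-Stern E-I index, computed as (external - internal) / (external + internal) over mention and reply edges, where positive values indicate mostly cross-group ties. The function name `ei_index`, the user names, and the group labels are all hypothetical.

```python
from collections import Counter


def ei_index(edges, group_of):
    """Return the E-I index of a set of edges.

    edges    -- iterable of (source_user, target_user) pairs
    group_of -- dict mapping each user to a group label
    Result lies in [-1, 1]: -1 means every tie is within a group
    (homophily), +1 means every tie crosses groups (heterophily).
    """
    counts = Counter()
    for source, target in edges:
        if source not in group_of or target not in group_of:
            continue  # skip users that have no group label
        same_group = group_of[source] == group_of[target]
        counts["internal" if same_group else "external"] += 1

    total = counts["internal"] + counts["external"]
    if total == 0:
        raise ValueError("no labelled edges to score")
    return (counts["external"] - counts["internal"]) / total


# Hypothetical toy data: users labelled by whether most of their
# tweets were flagged by a classifier (an assumption for illustration).
group_of = {
    "ana": "flagged",
    "bia": "flagged",
    "caio": "not_flagged",
    "davi": "not_flagged",
}
# Each pair is one mention or reply from the first user to the second.
edges = [("ana", "caio"), ("bia", "davi"), ("caio", "ana"), ("ana", "bia")]

score = ei_index(edges, group_of)
print(f"E-I index: {score:+.2f}")
```

In this toy network three of the four ties cross group lines, so the index is positive. An echo chamber in the usual sense would instead push the value toward -1.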

Author Biographies

Arthur da Silva Lopes, Universidade Federal da Bahia

MSc student in Collective Health (ISC-UFBA), Interdisciplinary Bachelor's degree in Health (UFBA) and Data Scientist in Natural Language Processing. 

Antonio Brotas, Fundação Oswaldo Cruz

PhD in Culture and Society (UFBA). Professor of the Master's in Science, Technology and Health Communication at Casa de Oswaldo Cruz, Oswaldo Cruz Foundation (COC/Fiocruz), researcher at the National Institute for Public Communication of Science and Technology (INCT-CPCT), journalist and communication advisor at Fiocruz Bahia.





How to Cite

LOPES, A. da S.; BROTAS, A. Echo chambers and vaccines against COVID-19 mis/disinformation on Twitter: machine learning and network analysis-based approach. Research, Society and Development, [S. l.], v. 12, n. 2, p. e22812240159, 2023. DOI: 10.33448/rsd-v12i2.40159. Available at: Accessed on: 3 Jun. 2023.



Health Sciences