The COVID-19 pandemic in Brazil: an application of the k-means clustering method

DOI:

https://doi.org/10.33448/rsd-v9i10.9059

Keywords:

Clusters; COVID-19; Coronavirus in Brazil; SARS-CoV-2.

Abstract

COVID-19 is an infection caused by the SARS-CoV-2 coronavirus. Its first cases were recorded in the Chinese city of Wuhan in December 2019, and the World Health Organization (WHO) declared it a pandemic in March 2020. In Brazil, COVID-19 spread to all 27 federative units (UFs). Decision-making to slow transmission was based on WHO recommendations, the main one being social isolation. However, because the population is heterogeneous across UFs, the pandemic spread differently in each of them. It is therefore useful to group UFs by similarity in selected characteristics and then examine the measures taken against COVID-19 in each group. The aim of this study was to group the UFs by cluster analysis, using the non-hierarchical k-means method applied to the epidemiological coefficients of incidence, prevalence, and lethality. The data were obtained from the website of the Ministry of Health of Brazil and comprised the numbers of cases and deaths, both new and accumulated, in each UF, in addition to the population at risk. For the cluster analysis, the database was divided into three chronological periods for each of the three coefficients under study. The cluster analysis made it possible to verify the stratification of UFs according to their similarities in relation to COVID-19. Thus, the stratification of UFs by incidence, prevalence, and lethality can serve as an additional resource for indicating where measures should be adopted, which ones, and where they were effective.
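
The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the coefficient formulas are common textbook definitions (incidence and prevalence per 100,000 inhabitants, lethality as a percentage of cases), the four units `UF_A` to `UF_D` and their counts are hypothetical placeholders rather than Ministry of Health data, the coefficients are z-scored before clustering, the number of clusters is fixed at k = 2, and a deterministic farthest-point initialization replaces the random starts that k-means normally uses.

```python
import math

def coefficients(new_cases, total_cases, total_deaths, population, per=100_000):
    """Return (incidence, prevalence, lethality) for one unit in one period.
    Assumed definitions: rates per `per` inhabitants, lethality in percent."""
    incidence = new_cases * per / population
    prevalence = total_cases * per / population
    lethality = total_deaths * 100 / total_cases
    return incidence, prevalence, lethality

def standardize(rows):
    """Z-score each column so that no coefficient dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid division by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, sds)) for r in rows]

def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point that is
    farthest from the centroids chosen so far (an illustrative choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain k-means (Lloyd's algorithm) with Euclidean distance."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels, centroids

# Hypothetical counts: (new cases, accumulated cases, accumulated deaths,
# population at risk). These are NOT real data.
toy = {
    "UF_A": (1_200, 15_000, 300, 3_000_000),
    "UF_B": (1_100, 14_000, 290, 2_800_000),
    "UF_C": (9_000, 120_000, 6_000, 12_000_000),
    "UF_D": (8_500, 110_000, 5_600, 11_000_000),
}

names = list(toy)
features = standardize([coefficients(*toy[name]) for name in names])
labels, centroids = kmeans(features, k=2)

for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

In a real analysis the number of clusters would be chosen with a selection criterion rather than fixed in advance, and the procedure would be repeated for each chronological period and each coefficient, as the abstract describes.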

Published

09/10/2020

How to Cite

ALVES, H. J. de P.; FERNANDES, F. A.; LIMA, K. P. de; BATISTA, B. D. de O.; FERNANDES, T. J. The COVID-19 pandemic in Brazil: an application of the k-means clustering method. Research, Society and Development, [S. l.], v. 9, n. 10, p. e5829109059, 2020. DOI: 10.33448/rsd-v9i10.9059. Available at: https://rsdjournal.org/index.php/rsd/article/view/9059. Accessed: 2 Jan. 2025.

Section

Health Sciences