Modeling with logistic regression for credit grant analysis
DOI:
https://doi.org/10.33448/rsd-v11i7.29761
Keywords:
Data mining; ROC curve; Probability.
Abstract
With the advance of Big Data and the growing volume of data across the most diverse areas of study, data mining techniques are increasingly necessary to obtain accurate and robust statistical information. This study aimed to show the efficiency of logistic regression as a data mining technique for obtaining a useful and statistically effective model for analyzing customers who apply for bank credit. The data come from the UCI Machine Learning Repository at the University of California, Irvine. The database was divided into two groups: training and testing. The fitted model was selected using the stepwise method in R. The model met goodness-of-fit expectations, with an accuracy of approximately 72% in discriminating defaulting from non-defaulting customers, a sensitivity of 87% (of the 140 non-defaulting customers, the model correctly classified 122), and a specificity of 38%. The area under the ROC curve was 0.847, suggesting an effective fit.
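The evaluation measures named in the abstract can be illustrated with a short sketch. It is written in Python rather than the R used in the study, it is not the authors' code, and all function names are hypothetical. It assumes, as the abstract implies, that non-defaulting customers are the positive class. Only the counts 122 and 140 come from the abstract; the small label and score lists are toy values made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Share of actual positives classified as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives classified as negative."""
    return tn / (tn + fp)


def accuracy(tp, fn, tn, fp):
    """Share of all cases classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve as the probability that a random positive
    outscores a random negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Figure stated in the abstract: 122 of 140 non-defaulting customers correct.
reported_sensitivity = sensitivity(tp=122, fn=140 - 122)
print(f"{reported_sensitivity:.3f}")  # 0.871

# Toy data (illustrative only, not from the study).
toy_true = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cutoff

tp, fn, tn, fp = confusion_counts(toy_true, toy_pred)
toy_accuracy = accuracy(tp, fn, tn, fp)
toy_auc = roc_auc(toy_true, toy_scores)
```

Sensitivity and specificity depend on the chosen classification cutoff, whereas the area under the ROC curve summarizes performance over all cutoffs, which is one reason a model can show a high area alongside a low specificity at a particular threshold.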
License
Copyright (c) 2022 Rafaella Santos Beserra; Nyedja Fialho Morais Barbosa; Ana Patricia Bastos Peixoto; Érika Fialho Morais Xavier; Jader Silva Jale; Sílvio Fernando Alves Xavier Júnior
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
1) Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
2) Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
3) Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.