Journal: Journal of Computer Science and Engineering Research (JCSER), Volume:2, Issue:1, Pages: 9-20
Authors: Deepa Shukla, Sunil Gupta
Date: 3-2025
Abstract: Access to credit is crucial for economic participation, yet traditional credit scoring models often fail those with limited credit history ("thin-file" consumers). This research presents a novel multi-factor credit scoring model specifically designed to address this challenge. Leveraging machine learning and alternative data sources, our model aims to provide a more comprehensive and inclusive assessment of creditworthiness. We detail the model's architecture, data inputs (including synthetic data from the Harvard Dataverse), and algorithm selection, followed by a rigorous performance evaluation. Results demonstrate the model's superior predictive accuracy compared to traditional methods, particularly for thin-file individuals. This research contributes to the growing body of knowledge on financial inclusion and offers a practical solution for lenders seeking to expand credit access responsibly.
Keywords: Computing Techniques; Empirical Evaluation; Credit Scoring; Thin-File Consumers; Multi-Factor Model; Synthetic Data.
References:
[1] Dastile, X., Çelik, T., & Potsane, M. (2020). Statistical and machine learning models in credit scoring: A systematic literature survey. Appl. Soft Comput., 91, 106263. [Q1] https://doi.org/10.1016/j.asoc.2020.106263.
[2] Bhatore, S., Mohan, L., & Reddy, Y. (2020). Machine learning techniques for credit risk evaluation: a systematic literature review. Journal of Banking and Financial Technology, 1-28. https://doi.org/10.1007/s42786-020-00020-3.
[3] Allen, F., Gu, X., & Jagtiani, J. (2020). A Survey of Fintech Research and Policy Discussion. ERN: Econometric Studies of Private Equity. https://doi.org/10.21799/frbp.wp.2020.21.
[4] Bazarbash, M. (2019). Fintech in Financial Inclusion: Machine Learning Applications in Assessing Credit Risk. FinPlanRN: Other Finance Planning Fundamentals (Topic). https://doi.org/10.5089/9781498314428.001.
[5] Teng, S., & Khong, K. (2021). Examining actual consumer usage of E-wallet: A case study of big data analytics. Comput. Hum. Behav., 121, 106778. [Q1] https://doi.org/10.1016/J.CHB.2021.106778.
[6] Jain, P., & Pamula, R. (2021). A systematic literature review on machine learning applications for consumer sentiment analysis using online reviews. Comput. Sci. Rev., 41, 100413. [Q1] https://doi.org/10.1016/j.cosrev.2021.100413.
[7] Tsoy, N., Steubing, B., Giesen, C., & Guinée, J. (2020). Upscaling methods used in ex ante life cycle assessment of emerging technologies: a review. The International Journal of Life Cycle Assessment, 25, 1680-1692. [Q1] https://doi.org/10.1007/s11367-020-01796-8.
[8] Louzada, F., Ara, A., & Fernandes, G. (2016). Classification methods applied to credit scoring: A systematic review and overall comparison. arXiv: Applications. https://doi.org/10.1016/J.SORMS.2016.10.001.
[9] Cavalcante, R., Brasileiro, R., Souza, V., Nóbrega, J., & Oliveira, A. (2016). Computational Intelligence and Financial Markets: A Survey and Future Directions. Expert Syst. Appl., 55, 194-211. [Q1] https://doi.org/10.1016/j.eswa.2016.02.006.
[10] Kumar, A., & Jaiswal, A. (2020). Systematic literature review of sentiment analysis on Twitter using soft computing techniques. Concurrency and Computation: Practice and Experience, 32. [Q2] https://doi.org/10.1002/cpe.5107.
[11] Poria, S., Cambria, E., Bajpai, R., & Hussain, A. (2017). A review of affective computing: From unimodal analysis to multimodal fusion. Inf. Fusion, 37, 98-125. [Q1] https://doi.org/10.1016/J.INFFUS.2017.02.003.
[12] Deng, Y., Loy, C., & Tang, X. (2016). Image Aesthetic Assessment: An experimental survey. IEEE Signal Processing Magazine, 34, 80-106. [Q1] https://doi.org/10.1109/MSP.2017.2696576.
[13] Kumar, M., Sharma, S., Goel, A., & Singh, S. (2019). A comprehensive survey for scheduling techniques in cloud computing. J. Netw. Comput. Appl., 143, 1-33. [Q1] https://doi.org/10.1016/J.JNCA.2019.06.006.
[14] Verdoliva, L. (2020). Media Forensics and DeepFakes: An Overview. IEEE Journal of Selected Topics in Signal Processing, 14, 910-932. [Q1] https://doi.org/10.1109/JSTSP.2020.3002101.
[15] Adewumi, A., & Akinyelu, A. (2017). A survey of machine-learning and nature-inspired based credit card fraud detection techniques. International Journal of System Assurance Engineering and Management, 8, 937-953. [Q2] https://doi.org/10.1007/S13198-016-0551-Y.
[16] Kim, C., Lim, S., Woo, S., Kang, W., Seo, Y., Lee, S., Lee, S., Kwon, D., Oh, S., Noh, Y., Kim, H., Kim, J., Bae, J., & Lee, J. (2018). Emerging memory technologies for neuromorphic computing. Nanotechnology, 30. [Q1] https://doi.org/10.1088/1361-6528/aae975.
[17] Kumar, A., Mangla, S., Luthra, S., Rana, N., & Dwivedi, Y. (2018). Predicting changing pattern: building model for consumer decision making in digital market. J. Enterp. Inf. Manag., 31, 674-703. [Q1] https://doi.org/10.1108/JEIM-01-2018-0003.
[18] Yu, S. (2018). Neuro-Inspired Computing With Emerging Nonvolatile Memorys. Proceedings of the IEEE, 106, 260-285. [Q1] https://doi.org/10.1109/JPROC.2018.2790840.
[19] Salehi, H., & Burgueño, R. (2018). Emerging artificial intelligence methods in structural engineering. Engineering Structures. [Q1] https://doi.org/10.1016/J.ENGSTRUCT.2018.05.084.
[20] Shukla, D. (2023). Replication Data for: Credit scoring of thin file consumers. Harvard Dataverse. https://doi.org/10.7910/DVN/6MLVVI.