Model Construction of Consumer Credit Ratings Based on Nonparametric Statistical Testing and Machine Learning Fusion

Yawen Wang

Keywords

consumer credit rating, nonparametric testing, Cox proportional hazards model, machine learning, risk assessment

Abstract

Consumer credit is growing rapidly, which increases the need for reliable assessment of borrower risk. Traditional credit scoring methods rely on linear assumptions or lack interpretability, and artificial intelligence (AI) methods have limitations of their own, so more robust models are needed. This study proposes a consumer credit rating model that integrates nonparametric statistical testing with machine learning to improve both accuracy and interpretability. Statistically significant features are screened with Kolmogorov-Smirnov tests (e.g., D = 0.28, p = 0.002 for annual income) and Mann-Whitney U tests (U = 210, p = 0.008), and multicollinearity is addressed with Spearman rank correlation. A fusion framework combining the Cox proportional hazards model with conformal prediction generates confidence intervals for the default probability. The results demonstrate the model's superior performance, achieving an area under the curve (AUC) of 0.775, a higher recall (0.72), and a lower Brier score (0.158) than logistic regression and survival forest. The study bridges the gap between predictive accuracy and interpretability, offering financial institutions a reliable tool for assessing credit risk.
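
The feature screening step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the data are synthetic, the feature names (`annual_income`, `income_proxy`, `noise`), the helper `screen_features`, and the thresholds `alpha` and `rho_max` are all assumptions made for illustration. A feature is kept only if both the Kolmogorov-Smirnov and the Mann-Whitney U test separate defaulters from non-defaulters, and a Spearman rank correlation filter then drops one feature from each highly correlated pair.

```python
import numpy as np
from scipy.stats import ks_2samp, mannwhitneyu, spearmanr


def screen_features(X, y, names, alpha=0.05, rho_max=0.8):
    """Hypothetical helper: nonparametric screening plus a collinearity filter.

    alpha and rho_max are illustrative thresholds, not values from the paper.
    """
    good, bad = X[y == 0], X[y == 1]

    # Step 1: keep features whose distributions differ between the two groups
    # according to BOTH tests.
    candidates = []
    for j, name in enumerate(names):
        ks = ks_2samp(good[:, j], bad[:, j])
        mw = mannwhitneyu(good[:, j], bad[:, j], alternative="two-sided")
        if ks.pvalue < alpha and mw.pvalue < alpha:
            candidates.append((j, name, ks.statistic))

    # Step 2: visit candidates from largest to smallest KS statistic and skip
    # any feature that is strongly rank-correlated with one already selected.
    candidates.sort(key=lambda item: -item[2])
    chosen = []
    for j, name, _ in candidates:
        collinear = False
        for k, _ in chosen:
            rho, _ = spearmanr(X[:, j], X[:, k])
            if abs(rho) >= rho_max:
                collinear = True
                break
        if not collinear:
            chosen.append((j, name))
    return [name for _, name in chosen]


# Synthetic data (an assumption, for illustration only).
rng = np.random.default_rng(0)
n = 400
y = (rng.random(n) < 0.3).astype(int)                 # 1 = default
annual_income = rng.normal(60.0, 15.0, n) - 15.0 * y  # lower for defaulters
income_proxy = annual_income + rng.normal(0.0, 1.0, n)  # near-duplicate feature
noise = rng.normal(0.0, 1.0, n)                       # unrelated to default

X = np.column_stack([annual_income, income_proxy, noise])
names = ["annual_income", "income_proxy", "noise"]
selected = screen_features(X, y, names)
print(selected)
```

In this sketch the two income features are nearly identical, so the correlation filter retains only one of them. The surviving features would then be passed to the downstream survival model.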
