Leveraging Machine Learning for Telecom Banking Card Fraud Detection: A Comparative Analysis of Logistic Regression, Random Forest, and XGBoost Models

Guanyu Liu

Keywords

Telecom Banking, Fraud Detection, Machine Learning, Logistic Regression, Random Forest, XGBoost, Predictive Modeling.

Abstract

Telecommunication bank card fraud has become a major threat to financial security in recent years, which makes robust detection mechanisms necessary. This study examines the application of three machine learning techniques (logistic regression, random forest, and XGBoost) to identifying fraudulent telecom bank transactions. Using a dataset of one million transaction records from the 2024 National Student Data Statistics and Analytics Competition, we implemented the models and evaluated them on accuracy, precision, recall, F1 score, and ROC-AUC. XGBoost outperformed the other models in both accuracy and robustness, and random forest also performed well, achieving almost perfect classification accuracy. Logistic regression, while effective, lagged behind in handling the complexity of the data. The analysis further highlights the critical role of features such as transaction amount ratios and online transaction status in predicting fraud. These findings suggest that advanced machine learning models, especially ensemble methods such as XGBoost, are highly effective against telecom banking fraud and should be integrated into existing detection systems to enhance their predictive capabilities.
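
The five evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the study's evaluation code: the labels and scores below are invented toy values, and the helper names (`confusion_counts`, `threshold_metrics`, `roc_auc`) and the 0.5 decision threshold are assumptions made for illustration. The point is only to show how the metrics relate to one another on imbalanced fraud data, where accuracy alone can look good while recall stays low.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as 'fraud'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 after thresholding model scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random fraud case outscores a
    random legitimate case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): 8 legitimate transactions, 2 fraudulent ones.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.60, 0.25, 0.10, 0.80, 0.40]

metrics = threshold_metrics(y_true, scores, threshold=0.5)
auc = roc_auc(y_true, scores)
print(metrics)
print("roc_auc:", auc)
```

On this toy data the classifier is 80% accurate yet catches only one of the two fraud cases, which is why precision, recall, F1, and ROC-AUC are reported alongside accuracy when the positive class is rare.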
