Research and Analysis of NLP Based Motion Understanding Large Language Model from Perception to Cognition


Chenyang Xue

Keywords

large language models, perception to cognition, multi-modal, intelligence analysis

Abstract

With the rapid development of Large Language Models (LLMs), Natural Language Processing (NLP) has opened a new path for intelligent analysis in sports. However, most existing models' comprehension of sports knowledge is limited to static text, and they struggle to capture real-time information. Following the main line of “from perception to cognition”, this paper systematically reviews research progress on large language models in sports intelligence analysis. At the perception level, it reviews text-based encoding of sports knowledge, video-based understanding of sports events, and sports perception-oriented datasets. At the cognition level, it examines the current state of large language models in sports modeling, focusing on tactical analysis, decision understanding and evaluation, and match trend prediction. The review further summarizes current challenges in data resources and model capabilities, and outlines future development pathways, including multimodal dataset construction, temporal awareness enhancement, and reasoning stability improvement. Advancing large language models from perception to cognition is expected to enable their in-depth application in sports, making them intelligent tools for match comprehension and decision support.

