Analysis of the Bottlenecks and Solutions for Enhancing the Capabilities of Large Language Models

Yihan Wang

doi:10.70267/ic-aimees.20260226231

School of Statistics and Data Science, Lanzhou University of Finance and Economics, Lanzhou 730101, China

Keywords

large language models, data, algorithms, bottlenecks and solutions

Abstract

Large language models, with their ability to understand human language, have become intelligent assistants and efficiency tools in education, medical care, entertainment, and other fields, driving the intelligent development of society. This paper examines the current obstacles to improving the capabilities of large language models from three aspects: data, algorithms, and the models themselves, reviewing the problems and the corresponding solutions for each in turn. The review finds that capability improvement faces several common problems: insufficient data diversity, redundant and low-quality generated text, an inability to reliably identify the model's own errors, an inability to achieve efficient iterative evolution, model fragility, and reduced generalization ability. Future development of large language models could shift toward improving accuracy, security, and controllability, with research focusing on breakthroughs in self-thinking, self-correction, and efficient reasoning. This paper aims to provide researchers in related fields with ideas and theoretical references for model optimization, helping large language models overcome development bottlenecks.

References

[1] Ding Y, Xi Z, He W, et al. Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling[C]//Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2025: 10627-10646.
[2] Pang J, Wei J, Shah A P, et al. Improving data efficiency via curating llm-driven rating systems[J]. arXiv preprint arXiv:2410.10877, 2024.
[3] Maity A, Potamitis N, Arora A. Reconciling Divergent Views Through a Critical Analysis of Iterative Self-Improvement in LLMs[J]. 2025.
[4] Jiang M, Lupu A, Bachrach Y. Bootstrapping task spaces for self-improvement[J]. arXiv preprint arXiv:2509.04575, 2025.
[5] Qin Z, Lyu K, Yu Q, et al. The Achilles' Heel of LLMs: How Altering a Handful of Neurons Can Cripple Language Abilities[J]. arXiv preprint arXiv:2510.10238, 2025.
[6] Yuan X, Zhang C, Liu Z, et al. Superficial self-improved reasoners benefit from model merging[C]//Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 2025: 5912-5932.
[7] Gao D, Dai J, Liu S, et al. SDGT: LLMs fine-tuning with seed-driven growth technology based on GPT-4 data expansion[J]. Neurocomputing, 2026: 132766.
[8] Zhang J, Zhang C X, Liu Y, et al. D3: Diversity, difficulty, and dependability-aware data selection for sample-efficient llm instruction tuning[J]. arXiv preprint arXiv:2503.11441, 2025.
[9] Ma R, Wang P, Liu C, et al. S2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning[C]//Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2025: 22632-22654.
[10] Samanta A, Magesh A, Jain A, et al. Structure Enables Effective Self-Localization of Errors in LLMs[J]. arXiv preprint arXiv:2602.02416, 2026.
[11] Nazari N, Makrani H M, Fang C, et al. Forget and rewire: Enhancing the resilience of transformer-based models against Bit-Flip attacks[C]//33rd USENIX Security Symposium (USENIX Security 24). 2024: 1349-1366.

Published

Jun 8, 2026

Conference Proceedings Volume

Vol. 14 (2026): Proceedings of the 2nd International Conference on Artificial Intelligence, Modern Engineering and Environmental Sustainability (IC-AIMEES 2026)

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Wang, Yihan. “Analysis of the Bottlenecks and Solutions for Enhancing the Capabilities of Large Language Models”. Exploring Science Academic Conference Series, vol. 14, June 2026, pp. 226-31, https://doi.org/10.70267/ic-aimees.20260226231.
