Multi-Table QA: Evaluating Modern LLM Strategies on Table Understanding


Letian Li

Keywords

large language models, multi-table question answering, tool-augmented reasoning, structured data understanding, agent-based workflows, tabular data reasoning

Abstract

Tables constitute the majority of structured data in enterprise environments. While single-table question answering has received significant attention, research on multi-table reasoning remains limited. Compared to single-table QA, multi-table QA requires schema alignment, relational inference, and scalable context management. We introduce tool-augmented reasoning as a paradigm for multi-table QA, and systematically study two complementary strategies: (1) free-form tool interaction, where models iteratively call exploration and computation tools, and (2) structured agent workflows, which stage tool use into exploration, preparation, and analysis phases. Using the TQA-Bench dataset, we evaluate GPT-4o-mini and GPT-5-mini across both small (8k) and large (128k) database scales under varying tool context constraints. We show that tool augmentation substantially improves robustness and accuracy over direct prompting, with gains of up to +28 percentage points. Structured workflows yield further benefits for weaker models on large databases, but regress for stronger models, revealing a scale–capacity trade-off in the value of structure. These results establish tool-augmented reasoning as a powerful paradigm for multi-table QA. Structure aids weaker models under scale but constrains stronger models, underscoring that tool-use strategies must be adapted jointly to model ability and database complexity.
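The two strategies contrasted in the abstract lend themselves to a short illustration. The sketch below is a minimal, hypothetical rendering of both control flows over an in-memory SQLite database, not the authors' implementation: the tool names (list_tables, run_sql), the llm callable and its JSON-action protocol, and the scripted stand-in model are all assumptions introduced for this example.

```python
import json
import sqlite3
from typing import Callable

def make_tools(conn: sqlite3.Connection) -> dict[str, Callable[[str], str]]:
    """Exploration and computation tools exposed to the model (hypothetical)."""
    def list_tables(_arg: str) -> str:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
        return json.dumps([r[0] for r in rows])

    def run_sql(query: str) -> str:
        return json.dumps(conn.execute(query).fetchall())

    return {"list_tables": list_tables, "run_sql": run_sql}

def free_form_loop(llm: Callable[[str], dict], tools: dict,
                   question: str, max_steps: int = 8) -> str:
    """Strategy 1: the model freely interleaves reasoning and tool calls,
    seeing each tool result before choosing its next action."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        action = llm("\n".join(transcript))  # {"tool", "arg"} or {"answer"}
        if "answer" in action:
            return action["answer"]
        result = tools[action["tool"]](action["arg"])
        transcript.append(f"{action['tool']}({action['arg']!r}) -> {result}")
    return "no answer within step budget"

def staged_workflow(llm: Callable[[str], dict], tools: dict,
                    question: str) -> str:
    """Strategy 2: tool use is staged into fixed exploration, preparation,
    and analysis phases rather than left to the model."""
    schema = tools["list_tables"]("")                             # exploration
    plan = llm(f"Write one SQL query joining the tables needed for "
               f"{question!r}. Tables: {schema}")
    rows = tools["run_sql"](plan["arg"])                          # preparation
    final = llm(f"Answer {question!r} given these rows: {rows}")  # analysis
    return final["answer"]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE customers(id, name);"
        "CREATE TABLE orders(id, customer_id, total);"
        "INSERT INTO customers VALUES (1, 'Ada');"
        "INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0);")
    tools = make_tools(conn)

    # Scripted stand-in for a real model, just to exercise the loop.
    script = iter([
        {"tool": "list_tables", "arg": ""},
        {"tool": "run_sql",
         "arg": "SELECT c.name, SUM(o.total) FROM orders o "
                "JOIN customers c ON o.customer_id = c.id GROUP BY c.name"},
        {"answer": "Ada, with a total of 15.0"},
    ])
    print(free_form_loop(lambda _prompt: next(script), tools,
                         "Which customer spent the most?"))
```

The design contrast is visible in who drives the tool calls: in free_form_loop the model chooses every action from the growing transcript, while staged_workflow fixes the phase order in code, trading flexibility for a predictable context footprint, which mirrors the scale–capacity trade-off the abstract reports.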


References

  • Chen, Z., Zhou, K., Zhang, B., Gong, Z., Zhao, W. X. and Wen, J.-R. (2023). ChatCoT: Tool-augmented chain-of-thought reasoning on chat-based large language models. EMNLP 2023, Singapore, pp. 14777-14790.
  • Gao, L., Madaan, A., Zhou, S., Alon, U., Liu, P., Yang, Y., Callan, J. and Neubig, G. (2023). PAL: Program-aided language models. International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, USA. PMLR, pp. 10764-10799.
  • Lu, W., Zhang, J., Fan, J., Fu, Z., Chen, Y. and Du, X. (2025). Large language model for table processing: A survey. Frontiers of Computer Science, vol. 19, no. 2, p. 192350.
  • Qiu, Z., Peng, Y., He, G., Yuan, B. and Wang, C. (2024). TQA-Bench: Evaluating LLMs for multi-table question answering with scalable context and symbolic extension. arXiv preprint arXiv:2411.19504.
  • Wang, X. (2016). A brief discussion on implementing national plans to enhance China's soybean competitiveness. Heilongjiang Grain, no. 11, pp. 30-33.
  • Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K. R. and Cao, Y. (2023). ReAct: Synergizing reasoning and acting in language models. The Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda.
