Research on Oracle Bone Inscription Detection and Recognition Algorithm Based on YOLO11-ViT

Zhiyuan Lu

Keywords

oracle bone inscription recognition, YOLO11, Vision Transformer, object detection, image preprocessing

Abstract

Oracle bone inscriptions are the earliest mature writing system discovered in China and constitute an important historical source for the origin of Chinese characters and traditional Chinese culture. However, recognition accuracy on oracle bone rubbing images is considerably restricted by variations in manual carving, degradation from long-term underground burial, imbalanced sample distributions, severe noise interference and highly similar character structures. To address these challenges, this paper proposes a two-stage oracle bone inscription detection and recognition model that combines YOLO11 with a Vision Transformer (ViT). First, the images are preprocessed using grayscale conversion, Otsu binarization, Gaussian denoising and morphological optimization. Second, YOLO11m is employed to detect and localize oracle bone characters in the rubbing images. Finally, a ViT model is used to classify the cropped single-character regions. Experiments are conducted on the dataset provided by the 2024 14th MathorCup Mathematical Application Challenge. The results show that the proposed method achieves an mAP@0.5 of 0.92 in the detection stage and an accuracy of 97.60%, a recall of 0.96 and an F1 score of 0.97 in the recognition stage, outperforming the compared methods.
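
The abstract names Otsu binarization as one of the preprocessing steps. As an illustration only, the following is a minimal pure-Python sketch of Otsu's method, which selects the gray level that maximizes the between-class variance of the histogram. The function names `otsu_threshold` and `binarize`, the 256-level assumption, and the choice to map brighter pixels to 255 are assumptions made for this example and are not taken from the paper, whose implementation is not described here.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t form the other.
    Assumes integer gray values in the range [0, levels).
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class so far
    sum0 = 0    # sum of gray values in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty, nothing left to split
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image: List[List[int]], threshold: int) -> List[List[int]]:
    """Map pixels above the threshold to 255 and all others to 0."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny synthetic "image": one dark row and one bright row.
    img = [[10, 12, 11], [200, 210, 205]]
    t = otsu_threshold([p for row in img for p in row])
    print(t, binarize(img, t))
```

In practice an image library would normally supply this step; the sketch is only meant to show what the threshold selection computes. Which polarity counts as the character stroke depends on how the rubbings are rendered and would need to be checked against the data.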
