Research on Cross-modal and Semantic Collaborative Unsupervised Domain Adaptation for All-weather Autonomous Driving Perception Enhancement


Zhuochao Du

Keywords

autonomous driving, unsupervised domain adaptation, all-weather perception, cross-modal knowledge transfer, semantic correlation alignment

Abstract

Autonomous driving perception systems often suffer severe “domain shift” in real-world deployment. Deep learning models that perform well in clear weather lose significant accuracy in adverse conditions such as rain, snow, fog, and low-light nighttime scenes, as well as when the sensor modality changes. Most current Unsupervised Domain Adaptation (UDA) methods rely on class-agnostic global alignment and neglect fine-grained semantic correlations and complex visual gaps. This paper examines how UDA can make autonomous driving perception more robust in complex environments and proposes a multi-dimensional framework that coordinates visual style, feature extraction, and semantic correlation. Combining theoretical analysis with an algorithmic review, the study organizes several recent technical paths. First, we introduce Style Prompt Tuning guided by pre-trained Vision-Language Models (VLMs), which transfers style while preserving image geometry. Second, we use the Transformer-based DTN-DETR model, which combines frequency-domain and spatial-domain optimization to improve feature extraction in low-light conditions. Finally, we analyze the Graph Embedding Interclass Relation-Aware Adaptive Network (GeIraA-Net), which is based on Graph Convolutional Networks (GCNs), and the Dual Semantic Correlation Alignment (DSCA) mechanism. These methods strengthen cross-scene semantic consistency through topological structure and contextual logic.
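The limitation of class-agnostic global alignment noted in the abstract can be illustrated with a small toy sketch. It is not taken from any of the methods reviewed here: the function names, the mean-matching criterion, and the toy features are all assumptions chosen for illustration. The sketch shows that two domains can have identical overall feature means while their per-class means are far apart, which is the kind of fine-grained mismatch a global criterion cannot see.

```python
import numpy as np

def global_mean_gap(src, tgt):
    """Class-agnostic discrepancy: distance between the overall feature means."""
    return float(np.linalg.norm(src.mean(axis=0) - tgt.mean(axis=0)))

def classwise_mean_gap(src, src_labels, tgt, tgt_pseudo_labels):
    """Class-aware discrepancy: mean distance between per-class feature means.

    Target labels are assumed to be pseudo-labels, since the target domain
    is unlabeled in UDA. Classes absent from either domain are skipped.
    """
    gaps = []
    for c in np.intersect1d(src_labels, tgt_pseudo_labels):
        mu_s = src[src_labels == c].mean(axis=0)
        mu_t = tgt[tgt_pseudo_labels == c].mean(axis=0)
        gaps.append(np.linalg.norm(mu_s - mu_t))
    return float(np.mean(gaps)) if gaps else 0.0

# Hypothetical toy features: two classes in 2-D, with the class centroids
# swapped between the source and target domains.
src = np.array([[0.0, 0.0], [0.0, 0.0], [2.0, 2.0], [2.0, 2.0]])
src_labels = np.array([0, 0, 1, 1])
tgt = np.array([[2.0, 2.0], [2.0, 2.0], [0.0, 0.0], [0.0, 0.0]])
tgt_pseudo_labels = np.array([0, 0, 1, 1])

g = global_mean_gap(src, tgt)
c = classwise_mean_gap(src, src_labels, tgt, tgt_pseudo_labels)
print(f"global gap: {g:.3f}, class-wise gap: {c:.3f}")
```

In this toy case the global gap is zero even though every class is misaligned, so a loss that only matches global statistics would report nothing left to fix. Practical methods use richer criteria than mean matching, but the same blind spot motivates class-aware and relation-aware alignment.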


References

  • [1] Wang, Q., Wang, M., Huang, J., Liu, T., Shen, T., & Gu, Y. (2024). Unsupervised domain adaptation for cross-scene multispectral point cloud classification. IEEE Transactions on Geoscience and Remote Sensing, 62, 1-15.
  • [2] Gong, T., Lu, X., Sang, Y., Li, S., & Yu, B. (2025). DTN-DETR: Day-night domain adaptive Transformer for nighttime object detection. Computer Engineering, 1-16. https://doi.org/10.19678/j.issn.1000-3428.0252921
  • [3] Kim, Y. H., Shin, U., Park, J., & Kweon, I. S. (2021). MS-UDA: Multi-spectral unsupervised domain adaptation for thermal image semantic segmentation. IEEE Robotics and Automation Letters, 6(4), 6497-6504.
  • [4] Rao, C., Fang, X., Zhang, Y., Fan, W., & Zhou, D. (2025). Cross-domain autonomous driving visual segmentation based on enhanced target data learning. ICT Express, 11(1), 53-58.
  • [5] Xiao, H., Zhou, T., Xiong, S., Li, J., Li, Z., Liu, X., & Deng, T. (2025). Unsupervised domain-adaptive object detection: An efficient method based on UDA-DETR. Neurocomputing, 631, 129711.
  • [6] Cha, S., Choi, G., Kwak, M., & Choi, J. (2025). Style prompt tuning for bridging visual gaps in autonomous driving. Engineering Applications of Artificial Intelligence, 161, 112105.
  • [7] Yang, T., Xiao, S., Qu, J., Dong, W., Du, Q., & Li, Y. (2024). Graph embedding interclass relation-aware adaptive network for cross-scene classification of multisource remote sensing data. IEEE Transactions on Image Processing.
  • [8] Guo, Y., Yu, H., Xie, S., Ma, L., Cao, X., & Luo, X. (2024). DSCA: A dual semantic correlation alignment method for domain adaptation object detection. Pattern Recognition, 150, 110329.