Research on Energy-Efficient Multi-Objective Satellite Orbit Control Based on Deep Reinforcement Learning
Keywords
deep reinforcement learning, satellite orbit control, multi-objective optimization, fuel efficiency, continuous control, sim-to-real transfer, interpretability, trajectory optimization
Abstract
Satellite orbit control must trade off several competing objectives while coping with uncertainty and nonlinear dynamics. To address these challenges, this paper proposes a data-driven control framework based on deep reinforcement learning. A composite reward function integrates fuel consumption, trajectory-tracking accuracy, control smoothness, and mission constraints such as collision avoidance, and an adaptive weighting mechanism balances these competing objectives. The algorithm employs an enhanced Soft Actor-Critic architecture in which both the actor and critic networks are built from deep residual networks and augmented with an attention mechanism to capture long-term dependencies in orbital dynamics. Experimental results show that, for low-Earth-orbit transfer and proximity operations, the proposed approach reduces average fuel consumption by approximately 18.7% compared with conventional optimal control and baseline deep deterministic policy gradient methods, while meeting the same mission accuracy requirements. Control stability improves by 22.4%. Under model parameter perturbations and measurement noise, the method achieves a success rate of 99.2%, indicating strong robustness and adaptability. The framework thus enables effective multi-objective coordination and long-horizon, fuel-efficient planning, providing a viable pathway toward autonomous and intelligent orbital control.
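The abstract does not give the form of the composite reward or of the adaptive weighting, so the following Python sketch is only one plausible reading, not the paper's method. It assumes a weighted sum of four per-step cost terms and a multiplicative weight update that raises the weight of any objective whose cost exceeds a target. All names (`StepInfo`, `composite_reward`, `adapt_weights`), the safe-distance threshold, the targets, and the update rate are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Per-step quantities; fields and units are illustrative assumptions."""
    delta_v: float         # fuel proxy: velocity change spent this step
    tracking_error: float  # distance from the reference trajectory
    control_change: float  # change in thrust command since the last step
    min_separation: float  # distance to the nearest other object


def composite_reward(info, weights, safe_distance=100.0):
    """Return (reward, costs): reward is the negative weighted sum of costs."""
    costs = (
        info.delta_v,                                     # fuel consumption
        info.tracking_error,                              # tracking accuracy
        info.control_change,                              # control smoothness
        max(0.0, safe_distance - info.min_separation),    # collision constraint
    )
    reward = -sum(w * c for w, c in zip(weights, costs))
    return reward, costs


def adapt_weights(weights, costs, targets, rate=0.1):
    """Assumed update rule: grow the weight of objectives above their target,
    shrink those below it, then renormalize so the weights sum to 1."""
    raw = []
    for w, c, t in zip(weights, costs, targets):
        exponent = max(-5.0, min(5.0, rate * (c - t)))  # clip for stability
        raw.append(w * math.exp(exponent))
    total = sum(raw)
    return tuple(r / total for r in raw)


# Example step: tracking error is above its target, the rest are on target.
info = StepInfo(delta_v=0.5, tracking_error=2.0,
                control_change=0.1, min_separation=150.0)
weights = (0.25, 0.25, 0.25, 0.25)
reward, costs = composite_reward(info, weights)
new_weights = adapt_weights(weights, costs, targets=(0.5, 1.0, 0.1, 0.0))
```

In this sketch the tracking term gains weight after the update because it is the only cost above its target. In practice the cost terms would need to be normalized to comparable scales before weighting, which the abstract does not describe.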
