Simulator for Accelerated Embodied AI Training
Artificial Intelligence (AI) models require a substantial number of training samples, ranging upwards of millions for robotics tasks, which is a significant hurdle due to labor and operating costs. Deep Reinforcement Learning (DRL) is an effective Machine Learning (ML) technique that updates Artificial Neural Network (ANN) weights through repeated interactions with the surrounding environment. Collecting samples at this scale can take up to years on hardware, depending on the complexity of the task, which can be reduced to days with parallelized simulations. While simulators speed up the generation of training data, physics inaccuracies make simulation-to-reality (sim-to-real) transfer arduous, and the availability of high-fidelity simulation assets is a bottleneck. AUTOnomous ground Vehicle deep Reinforcement Learning simulator (AutoVRL) [1] is an open-source simulator for accelerated embodied AI training built upon the Bullet physics engine. It is equipped with sensor implementations of Global Positioning System (GPS), Inertial Measurement Unit (IMU), LiDAR and camera, a high-fidelity model of the eXperimental one-TENTH scaled vehicle platform for Connected autonomy and All-terrain Research (XTENTH-CAR) [2], and 3D environments for training and evaluation, and it is extensible to new robots and environments. XTENTH-CAR and its digital twin are depicted in Figure 1, and the environments in Figure 2.
Figure 1. XTENTH-CAR mobile robot and digital twin in AutoVRL.
Figure 2. Woodland, urban and racetrack environments.
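The training-time argument above can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function `collection_days`, the step rates, the duty cycle and the number of parallel environments are all hypothetical assumptions, not measured AutoVRL or XTENTH-CAR figures. Only the 50,000,000-sample count comes from the text.

```python
SECONDS_PER_DAY = 86_400


def collection_days(samples, steps_per_second, parallel_envs=1, duty_cycle=1.0):
    """Wall-clock days needed to gather `samples` environment steps.

    steps_per_second: steps collected per second by ONE environment.
    parallel_envs:    number of environments stepping concurrently.
    duty_cycle:       fraction of wall-clock time spent collecting data
                      (on hardware, resets and charging lower this value).
    """
    if samples < 0 or steps_per_second <= 0 or parallel_envs < 1:
        raise ValueError("samples must be >= 0, rate > 0, parallel_envs >= 1")
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    effective_rate = steps_per_second * parallel_envs * duty_cycle
    return samples / effective_rate / SECONDS_PER_DAY


if __name__ == "__main__":
    samples = 50_000_000  # sample count quoted in the text

    # Hypothetical numbers, chosen only to illustrate the scaling.
    hardware = collection_days(samples, steps_per_second=10, duty_cycle=0.1)
    simulated = collection_days(samples, steps_per_second=100, parallel_envs=16)

    print(f"single physical robot: {hardware:.0f} days")
    print(f"16 parallel simulations: {simulated:.2f} days")
```

Because the effective sample rate is the product of the per-environment rate, the number of environments and the duty cycle, doubling the number of parallel simulations halves the estimated collection time under these assumptions. The estimate ignores policy-update cost, which adds to the total in practice.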
Accelerated simulation training enables policies trained over 50,000,000 samples for map-free exploration and navigation to a target object [3], shown in Video 1 for search and rescue. Zero-shot sim-to-real transfer, attained with simulator-exploit aware rewards, yields a racing policy trained over 20,000,000 samples that is 12% faster than a human demonstration [4], illustrated by an example in Figure 3, and robust map-free navigation in unstructured terrain [5], depicted in Figure 4.
Video 1. Exploration and navigation to target object for search and rescue, trained over 50,000,000 samples.
Figure 3. Sim-to-real transfer of racing policy, trained over 20,000,000 samples, 12% faster than a human demonstration.
Figure 4. Sim-to-real transfer of robust navigation policy in unstructured terrain.
References
[1] S. Sivashangaran, A. Khairnar and A. Eskandarian, “AutoVRL: A High Fidelity Autonomous Ground Vehicle Simulator for Sim-to-Real Deep Reinforcement Learning,” IFAC-PapersOnLine, vol. 56, no. 3, pp. 475-480, Dec. 2023. (Link) (Preprint)
[2] S. Sivashangaran and A. Eskandarian, “XTENTH-CAR: A Proportionally Scaled Experimental Vehicle Platform for Connected Autonomy and All-Terrain Research,” Proceedings of the ASME 2023 International Mechanical Engineering Congress and Exposition. Volume 6: Dynamics, Vibration, and Control. New Orleans, LA, USA, Oct. 29–Nov. 2, 2023. V006T07A068. American Society of Mechanical Engineers. (Link) (Preprint)
[3] S. Sivashangaran, A. Khairnar and A. Eskandarian, “Cognitive Navigation for Search and Rescue Autonomous Ground Vehicle using Soft Actor-Critic,” The American Control Conference (ACC), Poster, San Diego, CA, USA, May 31–Jun. 2, 2023. (Link)
[4] S. Sivashangaran, A. Khairnar, S. Gohari, V. Dutta and A. Eskandarian, “Physics-Informed Reinforcement Learning of Spatial Density Velocity Potentials for Map-Free Racing,” arXiv preprint arXiv:2604.09499, Apr. 2026. (Link)
[5] S. Sivashangaran, A. Khairnar and A. Eskandarian, “Mobile Robot Exploration Without Maps via Out-of-Distribution Deep Reinforcement Learning,” IFAC-PapersOnLine, vol. 59, no. 30, pp. 533-538, Dec. 2025. (Link) (Preprint)