Towards Generalizable Safety in Crowd Navigation via Conformal Uncertainty Handling
- Jianpeng Yao
- Xiaopan Zhang
- Yu Xia
- Zejing Wang
- Amit K. Roy-Chowdhury
- Jiachen Li
An uncertainty-aware crowd navigation framework leveraging constrained reinforcement learning to handle prediction uncertainty through adaptive conformal inference, achieving significant in-distribution improvements and strong OOD robustness to velocity variations, policy changes, and group dynamics.
Abstract
Mobile robots navigating in crowds trained using reinforcement learning are known to suffer performance degradation when faced with out-of-distribution scenarios. We propose that by properly accounting for the uncertainties of pedestrians, a robot can learn safe navigation policies that are robust to distribution shifts. Our method augments agent observations with prediction uncertainty estimates generated by adaptive conformal inference, and it uses these estimates to guide the agent's behavior through constrained reinforcement learning. The system helps regulate the agent's actions and enables it to adapt to distribution shifts. In the in-distribution setting, our approach achieves a 96.93% success rate, which is over 8.80% higher than the previous state-of-the-art baselines with over 3.72 times fewer collisions and 2.43 times fewer intrusions into ground-truth human future trajectories. In three out-of-distribution scenarios, our method shows much stronger robustness when facing distribution shifts in velocity variations, policy changes, and transitions from individual to group dynamics. We deploy our method on a real robot, and experiments show that the robot makes safe and robust decisions when interacting with both sparse and dense crowds.
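To make the idea of adaptive conformal inference more concrete, the following is a minimal sketch of the standard online update, in which the miscoverage level is nudged after every step depending on whether the last prediction error fell inside the current uncertainty radius. It is an illustration under stated assumptions, not the implementation used in this work: the function names `aci_radius` and `aci_step`, the target level `target_alpha`, and the step size `gamma` are all hypothetical choices.

```python
import numpy as np


def aci_radius(scores, alpha_t):
    """Uncertainty radius: empirical (1 - alpha_t) quantile of past errors.

    `scores` holds past prediction errors (e.g. distances in meters between
    predicted and observed pedestrian positions). Hypothetical helper.
    """
    if len(scores) == 0:
        return 0.0
    # alpha_t can drift outside [0, 1] under the update below, so clip the level.
    level = min(max(1.0 - alpha_t, 0.0), 1.0)
    return float(np.quantile(scores, level))


def aci_step(alpha_t, radius, new_error, target_alpha=0.1, gamma=0.05):
    """One adaptive conformal inference update of the miscoverage level.

    A miss (error outside the radius) lowers alpha_t, which enlarges the
    next radius; a covered step raises alpha_t slightly, shrinking it.
    """
    miss = 1.0 if new_error > radius else 0.0
    return alpha_t + gamma * (target_alpha - miss)


# Example: the radius reacts to a sudden increase in prediction error.
past_errors = [0.1, 0.2, 0.3, 0.4, 0.5]
alpha = 0.1
radius = aci_radius(past_errors, alpha)
alpha_after_miss = aci_step(alpha, radius, new_error=1.0)
alpha_after_hit = aci_step(alpha, radius, new_error=0.1)
```

Under a distribution shift such as faster pedestrians, misses become more frequent, so the level drops and the radius grows. This growing radius is the kind of signal that could be appended to the agent's observations.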
Key Ideas and Contributions
2) Behavior-Level Constraint Mechanism: We introduce a behavior-level constraint mechanism that constrains cumulative intrusions into uncertainty areas rather than directly constraining collision rates, providing richer cost feedback and effectively addressing the sparse constraint feedback issue.
3) Performance and Robustness: Our method achieves state-of-the-art performance with a 96.93% success rate in dense crowd navigation and over 3.72× fewer collisions in in-distribution settings, while demonstrating superior robustness across three different OOD scenarios including velocity variations, policy changes, and group dynamics.
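The behavior-level constraint in item 2 can be illustrated with a small sketch of a per-step intrusion cost and its accumulation over an episode. This is only one plausible reading: the names `intrusion_cost` and `cumulative_cost`, the circular shape of the uncertainty areas, the robot radius, and the discount factor are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def intrusion_cost(robot_xy, predicted_xy, radii, robot_radius=0.3):
    """Return 1.0 if the robot overlaps any uncertainty area, else 0.0.

    Each uncertainty area is assumed to be a disc centered on a predicted
    pedestrian position with the corresponding conformal radius.
    """
    robot_xy = np.asarray(robot_xy, dtype=float)
    predicted_xy = np.asarray(predicted_xy, dtype=float).reshape(-1, 2)
    radii = np.asarray(radii, dtype=float).reshape(-1)
    if predicted_xy.shape[0] == 0:
        return 0.0
    dists = np.linalg.norm(predicted_xy - robot_xy, axis=1)
    return float(np.any(dists < radii + robot_radius))


def cumulative_cost(step_costs, discount=0.99):
    """Discounted sum of per-step costs over one episode."""
    total = 0.0
    for t, cost in enumerate(step_costs):
        total += (discount ** t) * cost
    return total


# Example: the robot at the origin intrudes into the first disc only.
near = intrusion_cost([0.0, 0.0], [[1.0, 0.0], [3.0, 0.0]], [0.8, 0.5])
far = intrusion_cost([0.0, 0.0], [[2.0, 0.0]], [0.5])
episode_cost = cumulative_cost([1.0, 0.0, 1.0], discount=0.5)
```

Because intrusions into uncertainty areas happen far more often than actual collisions, a cost of this kind is nonzero on many more steps. That is the sense in which it gives the constrained learner richer feedback than a collision indicator.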
Test Results in In-Distribution and OOD Settings