Poster Session

H M Sabbir Ahmad and Ehsan Sabouni
Advisors: Wenchao Li and Christos Cassandras
Boston University

Talk Title: Bridging Learning and Safety: A Reinforcement Learning Based Approach for Joint Receding Horizon and Control Barrier Function Framework

Abstract: Optimal control methods provide solutions to safety-critical problems but easily become intractable. Control Barrier Functions (CBFs) have emerged as a popular technique that facilitates their solution by provably guaranteeing safety, through their forward invariance property. This approach involves defining a performance objective alongside CBF-based safety constraints that must always be enforced. However, achieving a balance between performance and solution feasibility can be heavily influenced by two main factors: the choice of cost function and its parameters, as well as the calibration of parameters within the CBF-based constraints. To address these challenges, we propose a Reinforcement Learning (RL)-based Receding Horizon Control (RHC) approach leveraging Model Predictive Control (MPC) with CBFs (MPC-CBF). In particular, we parameterize our controller and use bilevel optimization, where RL is used to learn the optimal parameters while MPC computes the optimal control input. We validate our method by applying it to the challenging automated merging control problem for Connected and Automated Vehicles (CAVs) at conflicting roadways. This work contributes to enhancing the safety of learning-based autonomous systems in real-world applications, offering a promising direction for addressing the performance-safety trade-off in learning-based control systems.
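
The forward invariance property mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the authors' MPC-CBF controller: it assumes a hypothetical one-dimensional system x' = u with safe set h(x) = x_max - x >= 0, for which the CBF condition reduces to u <= alpha * h(x) and the safety-filter QP has a closed-form solution. The names `cbf_filter`, `simulate`, `x_max`, and `alpha` are illustrative only.

```python
def cbf_filter(x, u_nom, x_max=1.0, alpha=2.0):
    """Closed-form solution of: min (u - u_nom)^2  s.t.  u <= alpha * h(x).

    Assumed toy system: x' = u, barrier h(x) = x_max - x, so dh/dt = -u and
    the CBF condition dh/dt >= -alpha * h(x) becomes u <= alpha * h(x).
    """
    h = x_max - x                 # barrier value; the safe set is h >= 0
    return min(u_nom, alpha * h)  # clip the nominal input only when needed


def simulate(x0=0.0, u_nom=1.5, dt=0.01, steps=500):
    """Forward-Euler rollout of the filtered system (requires alpha * dt <= 1)."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(x, u_nom)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    print("largest state reached:", max(traj))
```

In this toy setting, `alpha` plays the role of a CBF parameter that trades conservativeness against performance: a small value slows the approach to the boundary early, a large value lets the nominal input act for longer. This is the kind of parameter the abstract proposes to learn with RL.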

Abin Binoy George
Advisor: Sabrina Neuman
Boston University

Talk Title: Co-Design of Mechanical & Computer Hardware for Robotics

Abstract: Efficiently executing specific tasks is the primary goal of robotics, with key paradigms including mechanical design, control systems, and computational hardware. These three facets are interdependent; altering one will affect the others. For example, using research frameworks like RoboGrammar, we demonstrate that variations in hardware mass significantly affect optimal robot designs. Our focus is on co-designing the robot’s mechanical structure and control systems to enhance efficiency, using an origami quadruped as our experimental platform. Initial explorations involve tuning design and gait parameters to improve power consumption and performance, employing Bayesian Optimization for optimal configurations. Future work will integrate advanced controllers and a computer hardware simulator to create a comprehensive framework for optimizing robotic systems across various tasks and terrains.

Zijian Guo
Advisor: Wenchao Li
Boston University

Talk Title: Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning

Abstract: Offline safe reinforcement learning (RL) aims to train a constraint-satisfying policy from a fixed dataset. Current state-of-the-art approaches are based on supervised learning with a conditioned policy. However, these approaches fall short in real-world applications that involve complex tasks with rich temporal and logical structures. In this paper, we propose the temporal logic Specification-conditioned Decision Transformer (SDT), a novel framework that combines the expressive power of signal temporal logic (STL), used to specify complex temporal rules that an agent should follow, with the sequential modeling capability of the Decision Transformer (DT). Empirical evaluations on the DSRL benchmarks demonstrate that SDT is better able to learn safe and high-reward policies than existing approaches. In addition, SDT shows good alignment with respect to different desired degrees of satisfaction of the STL specification that it is conditioned on.
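
The "degree of satisfaction" of an STL specification is usually measured by a quantitative robustness value: positive when the rule holds, negative when it is violated. The sketch below shows the standard robustness semantics for two operators on a discrete-time signal. It is an illustration under assumed names (`always_below`, `eventually_above`, `robustness`) and an assumed example rule, not the specification or code used in SDT.

```python
def always_below(signal, bound):
    """Robustness of 'always (x < bound)': the worst-case margin over time."""
    return min(bound - x for x in signal)


def eventually_above(signal, goal):
    """Robustness of 'eventually (x > goal)': the best-case margin over time."""
    return max(x - goal for x in signal)


def robustness(cost_signal, progress_signal, cost_bound, goal):
    """Robustness of the conjunction of the two rules (minimum of both)."""
    return min(
        always_below(cost_signal, cost_bound),
        eventually_above(progress_signal, goal),
    )


if __name__ == "__main__":
    # Hypothetical trajectory: per-step cost and progress toward a goal.
    cost = [0.1, 0.3, 0.2]
    progress = [0.0, 0.5, 1.2]
    print(robustness(cost, progress, cost_bound=0.5, goal=1.0))
```

A policy conditioned on a target robustness value can then be asked for trajectories that satisfy the rule with more or less margin.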

Ran Jing
Advisor: Andrew Sabelhaus
Boston University

Talk Title: Self-Sensing for Proprioception and Contact Detection in Soft Robots Using Shape Memory Alloy Artificial Muscles

Abstract: Estimating a soft robot’s pose and applied forces, also called proprioception, is crucial for safe interaction of the robot with its environment. However, most solutions for soft robot proprioception use dedicated sensors, particularly for external forces, which introduce design trade-offs, rigidity, and risk of failure. This work presents an approach for pose estimation and contact detection for soft robots actuated by shape memory alloy (SMA) artificial muscles, using no dedicated force sensors. Our framework uses the unique material properties of SMAs to self-sense their internal stress, via offboard measurements of their electrical resistance and in-situ temperature readings, in an existing fully-soft limb design. We demonstrate that a simple polynomial regression model on these measurements is sufficient to predict the robot’s pose under no-contact conditions. Then, we show that if an additional measurement of the true pose is available (e.g., from an already-in-place bending sensor), it is possible to predict a binary contact/no-contact state using multiple combinations of self-sensing signals. Our hardware tests verify our hypothesis via a contact detection test with a human operator. This proof-of-concept validates that self-sensing signals in SMA-actuated soft robots can be used for proprioception and contact detection, and suggests a direction for integrating proprioception into soft robots without design compromises. Future work could employ machine learning for enhanced accuracy.

Anni Li
Advisor: Christos Cassandras
Boston University

Talk Title: Human-Autonomous Vehicle Safe Interactions in Multi-agent Systems

Abstract: We study safe driving interactions between Human-Driven Vehicles (HDVs) and Connected and Autonomous Vehicles (CAVs) in mixed traffic to derive time and energy-optimal policies for CAVs to complete lane change maneuvers. The interaction between CAVs and HDVs can be formulated using a bilevel optimization setting with an appropriate behavioral model for HDVs, requiring the best possible response from a CAV to actions by its neighboring HDVs. An iterated best response (IBR) method is then used to determine a Nash equilibrium. We also show that CAV cooperation can eliminate or greatly reduce the interaction between CAVs and HDVs. We derive a simple threshold-based criterion to select an optimal policy for the lane-changing CAV to merge ahead of a cooperating CAV in the target lane. In this case, the trajectory of the lane-changing CAV is independent of HDV behavior. Moreover, in the case where the dynamics and control policies of HDVs are unknown and hard to predict, we employ event-triggered Control Barrier Functions (CBFs) to estimate the HDV model online, construct data-driven and state-feedback safety controllers, and transform constrained optimal control problems for CAVs into a sequence of event-triggered quadratic programs. We show that we can ensure collision-free interactions between HDVs and CAVs and demonstrate the robustness and flexibility of our framework on different types of human drivers in lane-changing scenarios while guaranteeing the satisfaction of safety constraints.

Ehsan Sabouni
Advisor: Christos Cassandras
Boston University

Talk Title: Bridging Learning and Safety: A Reinforcement Learning Based Approach for Joint Receding Horizon and Control Barrier Function Framework

Zili Wang
Advisor: Sean Andersson
Boston University

Talk Title: Learning-enabled Navigation and Nonlinear Control for Resource Constrained Robots

Abstract: Navigating complex and unknown environments is a remarkable ability shared by both humans and animals, allowing them to predict important environmental features and devise effective strategies for success based on past experiences. Equipping robots with a similar capability is a valuable challenge for real-world applications. While recent advances in deep learning have improved robotic task execution through experience-based learning, many existing methods demand extensive sensing and computational resources. This research addresses the challenge of resource-efficient semantic navigation and reliable nonlinear control in unknown structured environments. We propose a novel framework that divides decision-making into two key components. (1) High-level planning under resource constraints, where the selection of intermediate goals is informed by the environment’s predictable structure. This stage includes a scene network module that extrapolates the environment from partial data and a planning module that charts the trajectory toward the target. (2) Low-level control, where a nonlinear controller navigates toward the goals set in the planning stage. This component focuses on a control network designed for stability and safety in polygonal environments. By integrating the two components, we provide a robust decision-making framework applicable to a wide range of robotic applications.

Alp Eren Yilmaz
Advisor: Sabrina Neuman
Boston University

Talk Title: Accelerating Rigid Body Dynamics through Variable Fixed-Precision with Provable Error-Bounds

Abstract: Robotic applications increasingly demand mobility, imposing strict constraints on battery capacity, weight, and power consumption. The integration of advanced AI and optimal control at the edge has heightened computational demands, worsening these constraints, especially for tiny robots. This drives the need for compressing these algorithms to enable deployment on embedded edge hardware. A promising approach is fixed-point arithmetic, which requires less memory and is faster than standard floating-point arithmetic, and has already been demonstrated in optimal control applications. Previous works used a single fixed-point representation chosen through trial and error, missing key optimization opportunities. Tools like Daisy now enable rigorous stability proofs for computational graphs after fixed-point conversion, but their generalized approach often leads to suboptimal results or extended runtimes. Our approach leverages domain-specific knowledge to address these issues. Additionally, we are developing an end-to-end pipeline to provide provable error bounds on variable-representation fixed precision for robotics applications, focusing initially on rigid body dynamics, a known bottleneck in optimal control. Early experiments show that a 20x speedup is achievable on a microcontroller without an FPU, such as the Raspberry Pi Pico.
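
To make the idea of a fixed-point representation concrete, the sketch below stores a real number as an integer with a chosen number of fractional bits and multiplies two such numbers using integer operations only. It is a generic illustration under assumed names (`to_fixed`, `to_float`, `fixed_mul`, `product_error`), not the authors' pipeline and not how Daisy works. It shows why the number of fractional bits matters: fewer bits give a larger rounding error.

```python
def to_fixed(x, frac_bits):
    """Quantize a real number to an integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def to_float(q, frac_bits):
    """Convert a fixed-point integer back to a float for inspection."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point numbers that share the same format.

    The raw product carries 2 * frac_bits fractional bits, so it is shifted
    right by frac_bits (truncating) to return to the original format.
    """
    return (a * b) >> frac_bits


def product_error(x, y, frac_bits):
    """Absolute error of the fixed-point product against the float product."""
    q = fixed_mul(to_fixed(x, frac_bits), to_fixed(y, frac_bits), frac_bits)
    return abs(to_float(q, frac_bits) - x * y)


if __name__ == "__main__":
    for bits in (8, 16):
        print(bits, "fractional bits, error:", product_error(0.1, 0.2, bits))
```

A variable-representation scheme, as described in the abstract, would pick a different `frac_bits` for different intermediate quantities rather than a single value for the whole computation.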

Mela Coffey & Dawei Zhang
Advisors: Alyssa Pierson and Roberto Tron
Boston University

Talk Title: Reactive and Safe Co-Navigation with Haptic Guidance

Abstract: We propose a co-navigation algorithm that facilitates collaborative navigation between a human and a robot toward a shared goal. In this framework, the human pilot makes high-level directional decisions, while the robot provides haptic feedback for collision avoidance and path guidance, adapting to dynamic environmental changes. Our algorithm uses optimized Rapidly-exploring Random Trees (RRT*) to generate paths to lead the user to the goal, via an attractive force feedback computed using a Control Lyapunov Function (CLF). Simultaneously, we ensure collision avoidance through a Control Barrier Function (CBF) where necessary. Our approach is validated through simulations with a virtual pilot and hardware experiments with a human pilot. The results show that integrating RRT* with CBFs is a promising tool for effective human-robot co-navigation.

Zirui Zang & Ahmad Amine
Advisor: Rahul Mangharam
University of Pennsylvania

Talk Title: IT-LMPC: Information Theoretic Learning Model Predictive Control for Safe Iterative Learning Control

Abstract: We present an approach to solve the Learning Model Predictive Control (LMPC) problem using Model Predictive Path Integral (MPPI) as a means of extending LMPC to stochastic non-linear systems. LMPC solves for the optimal control policy that minimizes a value function by iteratively improving performance and learning from previous trajectories. Unlike reinforcement learning, the LMPC formulation provides desirable theoretical properties for safety and optimality, which are valuable for safety-critical applications. Previous methods utilizing optimization-based techniques or Cross-Entropy-Method sampling to solve the LMPC problem fall short when dealing with high-dimensional, nonlinear dynamics. Our MPPI-based framework optimizes the control policy from an information-theoretic perspective and overcomes these limitations by providing a systematic way of handling constraints without sacrificing sample spread. We validate our approach through simulations and real-world experiments, demonstrating significant improvements in constraint satisfaction and final trajectory performance.
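
For readers unfamiliar with MPPI, its core step is a cost-weighted average of sampled control perturbations: each sample receives a weight that decays exponentially with its rollout cost, and the nominal control sequence is shifted by the weighted mean perturbation. The sketch below shows only this generic update, with assumed names (`mppi_weights`, `mppi_update`) and an assumed temperature parameter `lam`. The constraint handling and the LMPC value function described in the abstract are not shown.

```python
import math


def mppi_weights(costs, lam):
    """Softmin weights over sampled rollouts; lower cost gives higher weight."""
    s_min = min(costs)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(s - s_min) / lam) for s in costs]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def mppi_update(nominal, perturbations, costs, lam=1.0):
    """Shift the nominal control sequence by the weighted mean perturbation.

    nominal:       list of controls, one per time step
    perturbations: one perturbation sequence per sample, same length as nominal
    costs:         rollout cost of each perturbed sequence
    """
    weights = mppi_weights(costs, lam)
    updated = list(nominal)
    for w, eps in zip(weights, perturbations):
        for t, e in enumerate(eps):
            updated[t] += w * e
    return updated


if __name__ == "__main__":
    nominal = [0.0, 0.0]
    perturbations = [[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]]
    costs = [1.0, 2.0, 3.0]  # hypothetical rollout costs
    print(mppi_update(nominal, perturbations, costs))
```

A smaller `lam` concentrates the weight on the lowest-cost samples, while a larger value averages more evenly across all samples.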