Computationally Efficient and Safe Control for Aerial Robotic Systems under Threat and Disturbance

The Practical–Research Gap

Despite major advances in aerial trajectory planning and control, state-of-the-art methods remain time- and compute-heavy — hard to deploy on real hardware.

Particularly pronounced on miniature aerial platforms, where onboard compute is often limited
NMPC is the state-of-the-art, yet: “impractical to run NMPC on some miniature aerial vehicles with a limited computational budget” [1]
Comparable state-of-the-art approaches are either computationally complex or require higher-order reference derivatives or detailed model knowledge

The practical state-of-the-art remains largely tethered to advanced PID control — a gap between the practical and research state-of-the-art.

Newton–Raphson Flow Controller

For $\dot x = f(x,u),\; y = h(x)$, introduce a differentiable look-ahead predictor $\hat y(t+T) = \rho(x(t), u(t))$ under zero-order hold on $u$. Applying NRT to the predictor [2]:

\[ \Psi(x,u) \;\triangleq\; \Bigl(\tfrac{\partial \rho}{\partial u}(x,u)\Bigr)^{-1} \bigl(r(t+T) - \rho(x,u)\bigr), \qquad \dot u(t) = \alpha\,\bigl(\Psi(x,u) + \eta\bigr) \]

\[ u_{k+1} = u_k + \dot u(t_k)\Delta t \]

Asymptotic error bound: \[\limsup_{t\to\infty}\lVert r(t)-y(t)\rVert \leq \nu_1 + \frac{\nu_2}{\alpha}\]

$\nu_1$: prediction-model mismatch, the gap between predictor $\rho$ and true $y(t+T)$; cannot be attenuated by increasing $\alpha$
$\nu_2$: reference-variation residual; scales with $\sup_t\lVert\dot r(t+T)\rVert$ and is attenuated by speedup $\alpha$
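
A minimal sketch of the controller loop above on a toy scalar plant, to make the role of $\alpha$ concrete. The plant $\dot x = -x + u$, the reference $r(t)=\sin t$, the horizon $T$, and the step size are all assumptions chosen for illustration (not the blimp or quadrotor models), and $\eta = 0$ here.

```python
import math

def simulate(alpha, T=0.1, dt=1e-3, t_end=20.0):
    """Track r(t) = sin(t) with the toy plant x' = -x + u, y = x.

    Returns the worst tracking error |r(t) - y(t)| over the second half
    of the run, i.e. after the transient has died out.
    """
    a = math.exp(-T)          # exact ZOH predictor: rho(x, u) = a*x + (1 - a)*u
    drho_du = 1.0 - a         # d(rho)/du, constant for this linear plant
    x, u, worst = 0.0, 0.0, 0.0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        rho = a * x + drho_du * u                  # predicted y(t + T)
        psi = (math.sin(t + T) - rho) / drho_du    # Newton-Raphson term Psi(x, u)
        u_dot = alpha * psi                        # eta = 0: no safety filter
        x += dt * (-x + u)                         # plant, forward Euler
        u += dt * u_dot                            # u_{k+1} = u_k + u_dot * dt
        if t > t_end / 2:
            worst = max(worst, abs(math.sin(t + dt) - x))
    return worst
```

Calling `simulate(5.0)` and `simulate(50.0)` shows the steady-state error shrinking as the speedup grows, consistent with the $\nu_2/\alpha$ term in the bound; the residual that remains at large $\alpha$ plays the role of $\nu_1$, since the input is not actually held constant over the horizon.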

Integral CBFs [3] \[\eta \;=\; \arg\min_{\eta}\tfrac12\lVert\eta\rVert^2 \quad \text{s.t.}\quad \dot b + \gamma(b) \ge 0\]

CBF for dynamic control laws
Barrier $b(x,u)$ — safety in the state–input set, not just $x$
Admits QP even for non-control-affine systems
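
Why a QP arises: with $\dot u = \alpha(\Psi + \eta)$, the chain rule gives $\dot b = \tfrac{\partial b}{\partial x} f + \alpha\tfrac{\partial b}{\partial u}(\Psi + \eta)$, which is affine in $\eta$ regardless of how $u$ enters $f$. For a single barrier the min-norm problem then has a closed form. The sketch below assumes the constraint has already been written as $a^\top\eta + c \ge 0$, with $a = \alpha(\partial b/\partial u)^\top$ and $c = \tfrac{\partial b}{\partial x} f + \alpha\tfrac{\partial b}{\partial u}\Psi + \gamma(b)$; the function name is hypothetical.

```python
def min_norm_eta(a, c):
    """Solve  min 0.5*||eta||^2  s.t.  a . eta + c >= 0  in closed form.

    a: list of floats (constraint gradient), c: float (constraint offset).
    """
    if c >= 0.0:
        return [0.0] * len(a)              # constraint inactive: no correction
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("infeasible: a = 0 while c < 0")
    # Projection of the origin onto the half-space boundary a . eta + c = 0
    return [-c * ai / norm_sq for ai in a]
```

For example, `min_norm_eta([2.0, 0.0], -4.0)` returns `[2.0, 0.0]`, which makes the constraint hold with equality. With several barriers a general QP solver would be needed instead.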

Prediction Strategies

The central design choice is the prediction strategy. Throughout our work, we’ve developed different approaches with various tradeoffs:

Simplified closed-form linearized predictor. Small-angle approx; predictor and Jacobian computed offline
- Pure matrix-vector costs
- 100 Hz on RPi; no compilation needed
Flatness-based predictor. Prediction is a kinematic expansion in flat-output space
- Bypasses ODE integration; 100 Hz without compilation
- Fast computation at some cost in accuracy
Learned predictor. $\rho_{\mathrm{NN}}(x,u;\theta)$
- Feedforward and RNN successful in simulation
- Project won $3^{rd}$ place in Deep Learning Symposium
Nonlinear RK4 predictor. Compiled
- Full-model RK4/Fwd Euler
  - JIT-Compilation w/forward-mode autodiff
  - Cython + finite-difference
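
A rough sketch of the RK4 plus finite-difference variant, written in plain Python for readability rather than speed. The function names, the step count, and the scalar toy plant are assumptions for illustration; the deployed predictors are compiled and use the full vehicle model.

```python
import math

def rk4_predict(f, h, x, u, T, n_steps=10):
    """Integrate x' = f(x, u) over [0, T] with u held constant; return h(x(T))."""
    dt = T / n_steps
    for _ in range(n_steps):
        k1 = f(x, u)
        k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], u)
        k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], u)
        k4 = f([xi + dt * ki for xi, ki in zip(x, k3)], u)
        x = [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return h(x)

def fd_jacobian(rho, x, u, eps=1e-6):
    """Central-difference Jacobian J[i][j] = d rho_i / d u_j."""
    cols = []
    for j in range(len(u)):
        up, um = list(u), list(u)
        up[j] += eps
        um[j] -= eps
        yp, ym = rho(x, up), rho(x, um)
        cols.append([(p - m) / (2 * eps) for p, m in zip(yp, ym)])
    return [list(row) for row in zip(*cols)]   # transpose columns into rows

# Toy plant (assumption): x' = -x + u, y = x
def toy_f(x, u):
    return [-x[0] + u[0]]

def toy_h(x):
    return [x[0]]

def toy_rho(x, u):
    return rk4_predict(toy_f, toy_h, x, u, T=0.1)
```

For this linear toy plant the result can be checked against the exact zero-order-hold solution $e^{-T}x + (1-e^{-T})u$. Each Jacobian costs two predictor rollouts per input channel, which is the price the finite-difference route pays relative to forward-mode autodiff.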

Quadrotor running NRT on a Raspberry Pi (Spinning Helix).

Flatness-based NRT on 3D quadrotor.

Hardware Demonstration & Implementation Details

Quadrotor with NRT and NMPC running onboard on a Raspberry Pi.

Avg per-iteration compute — NRT: 2.45 ms · NMPC: 12.88 ms → NRT ~$5.3\times$ faster

NMPC implementation:

Solved via Acados with Real-Time Iteration (RTI)
Carefully tuned weights, horizon, steps
Delay compensation infrastructure needed

NRT implementation:

Highly modular
Minimal tuning required
RK4 prediction

Empirical $\alpha$-RMSE plateau — broad region insensitive to speedup $\alpha$.

Quadrotor Tracking — Overall Results

Quadrotor hardware: top NRT, bottom NMPC. Flight data in red, reference in blue.

Average CPU energy per iteration — NRT is lowest on both platforms:

Method	Energy / iter [$\mu$J]
Blimp NRT	$\mathbf{1.25\times 10^4}$
Blimp NMPC	$2.04\times 10^5$
Blimp FBL	$3.06\times 10^4$
Quad NRT	$\mathbf{1.73\times 10^4}$
Quad NMPC	$6.15\times 10^4$

RMSE on full flight paths — NRT and NMPC are roughly even:

Trajectory	NRT [m]	NMPC [m]
Circle A	0.123	0.107
Circle B	0.123	0.152
Lemniscate A	0.122	0.113
Lemniscate B	0.152	0.105
Lemniscate C	0.145	0.173
Helix A	0.149	0.169
Helix B	0.189	0.180
Circle C	0.171	0.177
Sawtooth	0.045	0.060
Triangle	0.081	0.084
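
The "roughly even" reading can be checked directly from the table. A short script using only the values listed above:

```python
# RMSE values [m], in table order (Circle A ... Triangle)
nrt  = [0.123, 0.123, 0.122, 0.152, 0.145, 0.149, 0.189, 0.171, 0.045, 0.081]
nmpc = [0.107, 0.152, 0.113, 0.105, 0.173, 0.169, 0.180, 0.177, 0.060, 0.084]

mean_nrt = sum(nrt) / len(nrt)
mean_nmpc = sum(nmpc) / len(nmpc)
nrt_wins = sum(a < b for a, b in zip(nrt, nmpc))   # trajectories where NRT is lower

print(f"NRT mean {mean_nrt:.3f} m, NMPC mean {mean_nmpc:.3f} m, "
      f"NRT lower on {nrt_wins} of {len(nrt)}")
# prints: NRT mean 0.130 m, NMPC mean 0.132 m, NRT lower on 6 of 10
```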

Competitive tracking against state-of-the-art NMPC, with the lowest CPU energy per iteration on both platforms

Blimp Results — Tracking Comparison

Miniature blimp tracking references under FBL, NMPC, & NRT.

NMPC missed the 25 ms deadline at 40 Hz; NRT and FBL never did. Only NRT used no derivative information.

Blimp hardware: flight data (red) vs reference (blue). Rows: NRT, NMPC, FBL.

Per-trajectory RMSE — NRT wins on all 8:

Trajectory	NRT [m]	NMPC [m]	FBL [m]
Circle A	0.079	0.141	0.193
Circle B	0.051	0.110	0.143
Lemniscate A	0.104	0.185	0.230
Lemniscate B	0.054	0.103	0.146
Lemniscate C	0.056	0.118	0.131
Helix A	0.072	0.077	0.113
Helix B	0.088	0.215	0.220
Circle C	0.101	0.403	0.384

Integration 1 — I-STL Runtime Assurance

Separation of timescales: 2 Hz RTA, 50 Hz splines + NRT.

Unsafe trajectory (pink); runtime assurance produces a safe one (green).

We couple NRT to a runtime-assurance layer enforcing temporal-logic safety [4]:

Reference is spline-interpolated to high rate, tracked by NRT with iCBF-enforced limits
NRT tracking tolerance $\varepsilon$ folded back into the RTA safety margin

$\alpha$-Stability for Miniature Blimp

Proposition [5]. For the hover linearization of the six-DOF miniature blimp with exact ZOH predictor $\hat y(t+T)=Ce^{AT}x + CA^{-1}(e^{AT}-I)Bu$, the closed-loop linearized system is $\alpha$-stable.

Definition [2, Def. 4.1]

A uniform-in-$\alpha$ variant of BIBS stability: $\exists\,\bar\alpha\!\geq\!0$ and class-$\mathcal{K}$ functions $\beta,\gamma_1,\gamma_2$ independent of $\alpha\!\in\![\bar\alpha,\infty)$ such that \[\lVert z(t)\rVert \leq \beta(\lVert z_0\rVert) + \gamma_1(\lVert r\rVert_\infty) + \gamma_2(\lVert\dot r\rVert_\infty)\] for all $\alpha\!\geq\!\bar\alpha$.

Implication [2, Prop. 4.2]

$\alpha$-stability $\Rightarrow$ asymptotic exact tracking as the speedup grows: \[\lim_{\alpha\to\infty}\limsup_{t\to\infty}\lVert r(t)-\hat y(t)\rVert = 0.\]

Once $\alpha$-stability holds, arbitrary tracking precision is achievable by speedup.

Proof Sketch. With this predictor, the closed-loop Newton–Raphson system is linear in the extended state $z := [x^\top,\, u^\top]^\top$: \[ \dot z = \Phi_\alpha\, z + \Psi_\alpha\, r(t+T) \]

Reference [2] associates polynomials $P_0(s)$ and $Q(s)$ with this closed loop and gives a sufficient condition: if both polynomials are Hurwitz, the system is $\alpha$-stable.

For the linearized blimp system, both $P_0$ and $Q$ can be shown to have all roots in the open left half-plane for any value of $\alpha$. Therefore, the linearized closed loop is $\alpha$-stable, and the true system is stable near equilibrium. $\blacksquare$
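
To make the structure of $\Phi_\alpha$ concrete, here is a toy check on a scalar plant with $A=-1$, $B=C=1$ (an assumption for illustration, not the blimp model). The ZOH predictor reduces to $e^{-T}x + (1-e^{-T})u$, and substituting it into $\dot u = \alpha\Psi$ gives $\Phi_\alpha = \begin{bmatrix}-1 & 1\\ -\alpha a/g & -\alpha\end{bmatrix}$ with $a=e^{-T}$, $g=1-e^{-T}$. The sketch only verifies that this $\Phi_\alpha$ is Hurwitz over a sweep of $\alpha$; it does not reproduce the $P_0$/$Q$ test of [2].

```python
import cmath
import math

def closed_loop_eigs(alpha, T=0.1):
    """Eigenvalues of Phi_alpha for the scalar toy plant A = -1, B = 1, C = 1."""
    a = math.exp(-T)          # C e^{AT}
    g = 1.0 - a               # C A^{-1} (e^{AT} - I) B
    trace = -1.0 - alpha
    det = alpha + alpha * a / g           # (-1)(-alpha) - (1)(-alpha*a/g)
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0

def is_hurwitz(alpha, T=0.1):
    return all(ev.real < 0.0 for ev in closed_loop_eigs(alpha, T))
```

In this toy case the trace is negative and the determinant positive for every $\alpha > 0$, so the check passes across the whole sweep.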

Integration 2 — GP Invariant Tubes

Hardware MM-GPR demo — wind estimation OFF vs ON.

MM-GPR tube rejects an unmodeled industrial-fan disturbance. With GP estimation disabled, tubes become inaccurate and the mission fails [6].

NRT inside a runtime-assurance framework using forward-invariant tubes around reference trajectories:

Disturbances estimated online via time-varying Gaussian processes
Replanning triggered when tube can no longer certify safety
NRT at 100 Hz, immrax [7] tube updates at 10 Hz — full pipeline meets 10 ms budget on average

Thrust 1 — Takeaways

Summary of Thrust 1

NRT method occupies a valuable middle ground between high-accuracy control and efficient compute
Modular implementation with minimal tuning
$\alpha$-stability proof on the linearized miniature blimp; empirically near-insensitive to $\alpha$ over a broad plateau
Deployed on real hardware (blimp, quadrotor) with high accuracy, fast computation time, and low-energy footprint
Efficiency leaves time and compute for high-level tasks; demonstrated in I-STL and GP invariant tube work

Thrust 2 — Safe Planning with RTD-RAX

Thrust II · Safe Trajectory Planning

Reachability-based Trajectory Design (RTD) Core

Planning model. Low-fidelity model used for FRS computation offline; worst-case tracking-error inflation absorbs model mismatch.
Offline FRS. Trajectories parameterized by $k$.
Online Optimization. $k^\star=\arg\min_{k\in K_{\mathrm{adm}}} J(k)\ \ \text{s.t. } k \text{ is obstacle-free}$

Limitations we address:
(1) worst-case inflation → safe trajectories falsely deemed infeasible
(2) offline FRS cannot absorb a priori unknown disturbances.

Offline FRS computation: each trajectory parameter $k$ (left, parameter space) maps to a trajectory plus uncertainty tube inside the entire forward reachable set (right, state space). Reproduced from [8].

RTD-RAX Architecture

RTD-RAX [9]: offline RTD candidate generation + online MMR verification + repair.

Departure: strip the offline FRS of its tracking-error inflation and delegate safety certification to the online MMR verifier against the measured disturbance. First RTD-class framework to safely plan under a priori unknown disturbances.

Why Mixed-Monotone Reachability

Embedding trajectory and reachable tube under bounded disturbance $w\in[-1,1]$.

MMR enables efficient calculation of hyperrectangular reachable set overapproximations via a single ODE integration. immrax [7] provides an embedding system for our dynamics and JIT-compiles it for real-time reachability.
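
A hand-written sketch of the idea for a two-state toy system, $\dot x_1 = -x_1 - x_2$, $\dot x_2 = x_1 - x_2 + w$ with $w\in[-1,1]$. The system, step size, and function names are assumptions for illustration, and this is not the immrax API: immrax constructs the embedding automatically, whereas here the decomposition is written out by hand from the sign of each cross term.

```python
import random

def embed_step(lo, hi, w_lo, w_hi, dt):
    """One Euler step of the 4-state embedding system (lower and upper corners)."""
    lo1 = lo[0] + dt * (-lo[0] - hi[1])        # x2 enters x1' negatively: use upper x2
    hi1 = hi[0] + dt * (-hi[0] - lo[1])
    lo2 = lo[1] + dt * (lo[0] - lo[1] + w_lo)  # x1 and w enter x2' positively
    hi2 = hi[1] + dt * (hi[0] - hi[1] + w_hi)
    return [lo1, lo2], [hi1, hi2]

def true_step(x, w, dt):
    return [x[0] + dt * (-x[0] - x[1]), x[1] + dt * (x[0] - x[1] + w)]

def reach_tube(x0, w_lo=-1.0, w_hi=1.0, dt=0.01, steps=100):
    """Hyperrectangular tube [(lo, hi), ...] from a single embedding rollout."""
    lo, hi = list(x0), list(x0)
    tube = [(lo, hi)]
    for _ in range(steps):
        lo, hi = embed_step(lo, hi, w_lo, w_hi, dt)
        tube.append((lo, hi))
    return tube

def sample_run(x0, dt=0.01, steps=100, seed=0):
    """One trajectory of the true system under a random admissible disturbance."""
    rng = random.Random(seed)
    x, traj = list(x0), [list(x0)]
    for _ in range(steps):
        x = true_step(x, rng.uniform(-1.0, 1.0), dt)
        traj.append(x)
    return traj
```

One rollout of the embedding system bounds every disturbed trajectory from the same initial state, which is what makes the verification cheap enough to run online. The box is an overapproximation, so its width can grow faster than the true reachable set over long horizons.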

Case Study 1 — Narrow Gap

Animated side-by-side: Standard RTD (left) remains infeasible through the gap; RTD-RAX (right) uses the non-inflated FRS plus immrax verification to certify a safe corridor path.

Two rectangular obstacles make a narrow gap:

Standard RTD: inflated FRS declares infeasible → fail-safe triggered
RTD-RAX: optimizer returns $k^\star$ through the gap → mixed-monotone tube certifies safe

Scenario isolates the effect of offline FRS conservatism
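
A toy reconstruction of the narrow-gap effect, under heavy simplifying assumptions: straight-line paths parameterized by heading $k$, a square tube of half-width `eps` around sampled path points (sampling is not a guaranteed overapproximation), and a grid search standing in for the RTD optimization. All geometry and the two `eps` values are hypothetical.

```python
import math

# Two axis-aligned boxes (xmin, ymin, xmax, ymax) leaving a 0.3-wide gap at y = 0
OBSTACLES = [(1.0, 0.15, 1.2, 1.0), (1.0, -1.0, 1.2, -0.15)]
GOAL = (2.0, 0.0)

def tube_hits(k, eps, length=2.0, ds=0.05):
    """True if the eps-inflated straight path with heading k touches an obstacle."""
    n = int(round(length / ds))
    for i in range(n + 1):
        px, py = i * ds * math.cos(k), i * ds * math.sin(k)
        for (x0, y0, x1, y1) in OBSTACLES:
            if (px + eps >= x0 and px - eps <= x1 and
                    py + eps >= y0 and py - eps <= y1):
                return True
    return False

def plan(eps, length=2.0):
    """Grid search for k* = argmin J(k) s.t. the tube is obstacle-free."""
    best_k, best_cost = None, float("inf")
    for i in range(13):
        k = (i - 6) * 0.05                     # headings in [-0.3, 0.3] rad
        if tube_hits(k, eps, length):
            continue
        cost = math.hypot(length * math.cos(k) - GOAL[0],
                          length * math.sin(k) - GOAL[1])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k                              # None means: trigger the fail-safe
```

With a worst-case inflation of `eps = 0.2` the tube is wider than the gap and `plan` returns `None`; with a tighter tube of `eps = 0.05` it returns the straight path `k = 0.0`. The only thing that changes between the two calls is the tube width, which is the conservatism the case study isolates.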

Case Study 2 — Angled Obstacle

At each step: generate candidate → verify → execute or repair.

Standard RTD succeeds conservatively
RTD-RAX: rejections become repaired candidates via speed-backoff + buffer tightening → shorter, more efficient paths
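
The generate, verify, execute-or-repair cycle can be sketched as a small loop. The `verify` and `repair` callables, the candidate fields, and the back-off factors below are placeholders for illustration, not the RTD-RAX implementation.

```python
def plan_cycle(candidate, verify, repair, max_repairs=5):
    """Return (plan, n_repairs); plan is None when the fail-safe should fire."""
    n = 0
    while not verify(candidate):
        if n == max_repairs:
            return None, n
        candidate = repair(candidate)
        n += 1
    return candidate, n

# Stand-ins (assumptions): the tube check passes once speed is at most 0.5
def toy_verify(c):
    return c["speed"] <= 0.5

def toy_repair(c):
    # speed back-off plus buffer tightening, with made-up factors
    return {"speed": 0.8 * c["speed"], "buffer": 0.9 * c["buffer"]}
```

Starting from `{"speed": 1.0, "buffer": 0.2}`, the toy loop accepts after four repairs; capping `max_repairs` at two returns `None`, which corresponds to falling back to the fail-safe maneuver.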

Side-by-side animation: Standard RTD (conservative but succeeds) and RTD-RAX (online verification + repair reaches goal).

The RTD-RAX path is shorter than the Standard RTD path.

Close-up of two repair events: unsafe candidate tube (pink) rejected, repaired tube (green) chosen and executed. Animation pauses at each repair for visual clarity.

Case Study 3 — Unknown Disturbances

Multi-gap course with disturbance patches. Top: Standard RTD collides on cycle 3. Bottom: RTD-RAX senses, verifies, repairs, and reaches the goal.

The verifier + repair layer turns mission failures into safe alternatives.

Verification and simple repairs often happen near-instantly: trade $\approx\!1$ ms online compute for a recovered, certifiably safe path

Receding-horizon with multiple disturbance patches.

Planner	Outcome	Cycles	Repairs	Mean / p95 [ms]
Std RTD	Collision	3	—	10.5 / 21.9
RTD-RAX	Goal reached	19	3	10.5 / 37.4

Mean unchanged — verification is flat on typical cycles.
p95 tail isolates the repair loop’s worst-case cost.

First RTD-class framework to certify safety under a priori unknown runtime disturbances [9].

Thrust 2 — Takeaways

Summary of Thrust 2

RTD-RAX separates fast candidate generation from execution-time safety certification.
Uses MMR with measured disturbance bounds online — first RTD-class method accommodating a priori unknown disturbances.
Verification adds negligible mean cycle time but adds cost on the p95 tail when many repairs fire.
Repair layer turns rejections into safe alternatives before falling back to a certified fail-safe.

Proposed Work

Thrust 1 · Low-Level

Contraction-NN tracker
Head-to-head vs NRT / NMPC
Holybro X500 deployment (collaboration)

Thrust 2 · Mid-Level

RTD-RAX:

Gradient feedback for repair
GP-learned disturbance bounds
GTernal / Holybro hardware
Parameter → STL-satisfying trajectory map

RL-STL: MILP-on-trigger policy

Thrust 3 · High-Level

Diagnose → recover → inoculate pipeline
Parameter → safe-strategy map (per fault class)
Certified backup-controller library

Thrust 1 — Proposed Work

Thrust I · Proposed Work

P1.1 — Contraction-NN Hardware

In addition to upcoming methodological contributions, we will leverage our hardware-implementation expertise in collaborative efforts.

Upcoming collaboration:

Deploy a recently-proposed contraction-based NN tracking controller [10] on the Holybro X500
Preliminary results show similar impact to NRT: fast, efficient, high-accuracy

X- and Y-axis position RMSE by controller (w/FF)

Preliminary Gazebo Results

Reference (gray) vs actual figure-8 paths in simulation.

Mean per-iteration computation time by controller.

Thrust 2 — Proposed Work

Thrust II · Proposed Extensions

P2.1 — Algorithmic Improvements

Current

Rejected parameters are altered in a guess-and-check manner that empirically works but may be inefficient.
RTD candidate-generation and online MMR verifier communicate only through accept/reject.

Proposed. Two parallel tracks.

Define a differentiable distance-to-obstacle measure that accounts for the disturbance, and take a gradient step on the parameters for smarter repair

\[ k_{\text{new}} \leftarrow k^\star + \eta\;\nabla_k\bigl(\min_i\operatorname{dist}(\mathcal B_j(k,\mathbf d), \mathrm{obs}_i)\bigr) \]

Feed disturbance measurements back into core RTD program as constraints and/or warm-start with disturbance-aware repaired trajectory parameters. \[ \min_k \, J(k) \; \text{s.t. } k \text{ is obstacle-free},\;\; k_0 = k^{\text{wind}} \]
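
A toy sketch of the first track, the gradient-based repair step. Everything here is an assumption for illustration: a single waypoint at $(1, k + d)$ stands in for the disturbed tube, one circular obstacle stands in for $\min_i \operatorname{dist}(\cdot)$, and the gradient is taken by central differences rather than through a differentiable reachability pipeline.

```python
import math

def clearance(k, d, obs=(1.0, 0.0), radius=0.3):
    """Signed distance from the disturbed waypoint (1, k + d) to a circular obstacle."""
    return math.hypot(1.0 - obs[0], k + d - obs[1]) - radius

def gradient_repair(k, d, step=0.1, margin=0.04, max_iters=20, h=1e-6):
    """Ascend the clearance until it exceeds the margin; None if it never does."""
    for _ in range(max_iters):
        if clearance(k, d) >= margin:
            return k
        grad = (clearance(k + h, d) - clearance(k - h, d)) / (2 * h)
        k += step * grad          # k_new <- k + eta * d(clearance)/dk
    return None
```

Starting from `k = 0.1` with disturbance offset `d = 0.05`, the waypoint sits inside the obstacle and two gradient steps move it to roughly `k = 0.3`, where the clearance exceeds the margin. The gradient vanishes if the waypoint sits exactly at the obstacle center, so a practical version would need a tie-breaking rule.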

P2.2 — Learned GP Disturbance Bounds

Current. Simulated disturbance known at every iteration.

Not viable for real-world implementations.

Proposed. Utilize time-varying GPs as in [6].

Learn the disturbance online from a measured residual between estimated acceleration and measured acceleration.

Pay-off. Learns from flight data.

More recent measurements weighted more strongly. SITL precedent in gale-force conditions. Hardware precedent in IFL against an industrial fan.

Setup for hardware experiments in the Indoor Flight Lab at Georgia Tech. Shown are the quadrotor (blue), the motion capture system tracking it (red), the landing pad (yellow), and the industrial fan generating wind (green).

P2.3 — Hardware Validation

Step 1: GTernal ground robot — port simulation case studies, characterize real disturbances, validate repair online
Step 2: Holybro X500 quadrotor — challenges: higher-dimensional MMR, tube blows up quickly

These will be the first hardware demonstrations of RTD-RAX.

P2.4 — RTD-RAX for STL Missions

Extend from spatial safety to full STL mission specifications by learning which parameters map onto given STL formulas:

Offline. Learn mapping between trajectory parameter space and specification-observing trajectories.
Online. Use this map instead of core RTD optimization until robustness falls below desired value, or higher-level safety concerns become more important
- MMR verifier always ensures safety vis-a-vis obstacles with disturbance-awareness

What STL adds beyond “avoid obstacles”: timed, sequenced, conditional objectives.

Reach $G$ by $T$: $\Diamond_{[0,T]}(x{\in}G)$
Visit $A$ then $B$: $\Diamond_{[0,T]}(x{\in}A \land \Diamond_{[0,T]}(x{\in}B))$
Avoid $O$ until done: $\lnot(x{\in}O)\,\mathcal{U}\,(x{\in}G)$
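
A minimal sketch of how two of these formulas are scored on a sampled trajectory, using the usual quantitative (robustness) semantics: eventually is a max over time, negation flips the sign, and until is a max over switching times of the min of both sides. The box predicate, the helper names, and the example trace are assumptions for illustration; the trace is taken to span $[0,T]$, and the until operator uses the inclusive convention, which is one of several in use.

```python
def in_box(p, box):
    """Robustness of 'p in box': smallest edge margin, negative when outside."""
    x0, y0, x1, y1 = box
    return min(p[0] - x0, x1 - p[0], p[1] - y0, y1 - p[1])

def eventually(rho_seq):
    """Robustness of 'eventually phi' over the whole sampled trace."""
    return max(rho_seq)

def until(rho_phi, rho_psi):
    """Robustness of 'phi until psi' over the whole sampled trace."""
    best, running = float("-inf"), float("inf")
    for p, q in zip(rho_phi, rho_psi):
        running = min(running, p)           # phi must have held up to now
        best = max(best, min(q, running))   # ... and psi holds now
    return best

# Example trace and regions (hypothetical)
TRACE = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
G = (2.5, -0.5, 3.5, 0.5)
O = (0.5, 1.0, 1.5, 2.0)

reach_G = eventually([in_box(p, G) for p in TRACE])
avoid_O_until_G = until([-in_box(p, O) for p in TRACE],
                        [in_box(p, G) for p in TRACE])
```

Both scores come out positive for this trace, meaning the specifications are satisfied with margin; a negative value would indicate violation, and its magnitude how far the trace is from satisfying the formula.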

P2.5 — Learning-Accelerated STL (RL-STL)

STL-based control synthesis typically requires a mixed-integer program (MILP) — expensive to solve every step.

Pipeline

Train an RL policy $\pi_\theta$ offline to imitate MILP solutions over a distribution of environments and specifications
At runtime: run $\pi_\theta$ each step — fast policy inference
Re-solve the MILP only when STL robustness $\rho_\varphi$ drops below threshold $\rho_{\text{trig}}$

Online compute dominated by policy inference rather than solving MILP.
Safety maintained because MILP re-solve is triggered by robustness degradation.
Threshold $\rho_{\text{trig}}$ introduces a compute–conservatism tradeoff.
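
The runtime switch can be sketched in a few lines. The callables `policy`, `solve_milp`, and `robustness` are placeholders for the learned policy $\pi_\theta$, the MILP solver, and the STL monitor; none of them refer to an existing implementation.

```python
def rl_stl_step(state, policy, solve_milp, robustness, rho_trig):
    """Use the learned policy unless robustness has dropped below rho_trig."""
    if robustness(state) >= rho_trig:
        return policy(state), "policy"     # fast path: policy inference only
    return solve_milp(state), "milp"       # slow path: re-solve on trigger
```

Raising `rho_trig` makes the slow path fire more often, which is the compute versus conservatism tradeoff noted above.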

Runtime pipeline: STL-robustness monitor drives an SPDT switch on $\rho_\varphi \geq \rho_{\text{trig}}$, selecting the RL-STL policy $\pi_\theta$ when the condition is satisfied and a MILP re-solve when it is violated.

Thrust 3 — Proposed Work

Thrust III · Secure Control
