ROS 2 Has a Nondeterminism Problem. Two New Papers Want to Fix It.
The middleware that powers most research robots can't guarantee the same input produces the same output. That's a problem when you're shipping AI-controlled machines into the real world.
Why does your robot do something different every time you run the same code?
If you've spent any time with ROS 2, you've probably asked yourself this question. Maybe you've blamed the network, or your sensor drivers, or that one callback that seems to fire whenever it feels like it. But the truth is simpler and more uncomfortable: the middleware itself is fundamentally nondeterministic. The order in which callbacks execute isn't guaranteed. Distributed deployments add more chaos from message interleaving and network latency. Run the same program twice, and you can get different behavior each time.
For research prototypes, this is annoying. For safety-critical deployed systems, it's potentially catastrophic. And now that we're shoving large AI models into the control loop (vision-language-action models, learned policies, the whole Physical AI menagerie), the problem is getting worse, not better.
Two papers dropped on arXiv this week that tackle this head-on, and they're worth reading together because they represent two different philosophies for solving what is essentially the same problem.
The first paper, from researchers working with Lingua Franca, a coordination language developed at UC Berkeley, presents a framework that can take an unmodified ROS 2 application and run it under their system to guarantee deterministic execution. Same input, same execution order, every time.
The key insight is that ROS 2's publish-subscribe pattern, for all its flexibility, is the source of the chaos. Callbacks get dispatched by executors in whatever order the scheduler decides, and when you distribute across nodes, you're adding network timing jitter on top of that. The Lingua Franca approach uses logical time (a concept that's been around in distributed systems for decades, by the way) to impose order on this mess.
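To make the logical-time idea concrete, here is a minimal Python sketch. It is not Lingua Franca's API and not the paper's implementation; the function name `run_in_tag_order`, the topic names, and the timestamps are all made up for illustration. It assumes every message already carries a logical timestamp, and it processes messages in timestamp order (breaking ties by topic name) rather than in whatever order they happened to arrive.

```python
import heapq
import random


def run_in_tag_order(arrivals):
    """Process messages by logical timestamp, not by physical arrival order.

    `arrivals` is a list of (logical_time, topic, payload) tuples in the
    order they happened to arrive. All names here are hypothetical.
    """
    queue = []
    for message in arrivals:
        # Tuples compare field by field, so equal timestamps are ordered
        # by topic name: a fixed tie-break instead of a scheduler's whim.
        heapq.heappush(queue, message)

    trace = []
    while queue:
        trace.append(heapq.heappop(queue))  # stand-in for running a callback
    return trace


# Illustrative messages: two sensors publishing at logical times 10 and 20.
messages = [
    (10, "camera", "frame-0"),
    (10, "lidar", "scan-0"),
    (20, "camera", "frame-1"),
    (20, "lidar", "scan-1"),
]


def shuffled(seed):
    """Simulate network jitter by scrambling the arrival order."""
    copy = list(messages)
    random.Random(seed).shuffle(copy)
    return copy


# Five runs with five different arrival orders produce one execution order.
traces = [run_in_tag_order(shuffled(seed)) for seed in range(5)]
print(all(trace == traces[0] for trace in traces))
```

What this toy version skips is the hard part: in a distributed system, the runtime has to know when it is safe to advance logical time without having seen every message that might still be in flight.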
They tested it on the Autoware reference system, which is about as close to a real autonomous vehicle stack as you get in open source. The results are what you'd expect: vanilla ROS 2 showed callbacks executing in different orders across runs, with end-to-end latencies that varied significantly. Under their LF-controlled system, execution order was consistent and latencies were predictable.
Now, I've seen determinism claims before. I covered real-time operating systems back when QNX was the hot thing and everyone was promising microsecond-level guarantees. The question is always the same: what do you give up? In this case, the paper is honest that not every ROS 2 feature can be executed deterministically under logical time. There are constraints. But the fact that they can do automatic conversion without modifying the ROS 2 code is, I have to admit, pretty clever.
The second paper takes a different angle. Instead of trying to make ROS 2 deterministic after the fact, it argues that robot middleware needs to evolve into what the AI agent community calls a "harness": the external system that mediates tools, manages state, bounds resources, and records execution.
This framing comes from the language model agent world, where researchers have been grappling with how to safely let AI models interact with external tools. The robotics community, according to these authors, hasn't adopted this framing yet, and it should.
Their argument is that a Physical AI harness has to mediate at three levels simultaneously: control (what commands get sent to actuators), computing (how much inference time the model gets), and communication (how much bandwidth it consumes). A learned policy's output crosses all three dimensions. Robot middleware, they claim, is the only layer in the stack with abstractions over all three, so it's the natural place to build enforcement.
What enforcement? They propose three functions: Projection (gate each output at emission), Isolation (bound the model's execution and transmission slot), and Transfer (fall back to a verified baseline when checks fail). If this sounds like safety engineering 101, well, it is. The point is that these functions exist today as hand-built application code scattered across deployed systems. The paper argues they should be standardized as a "ROS 2 Harness Profile."
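For a rough sense of how those three functions fit together, here is a small Python sketch. Everything in it is an assumption of mine rather than anything from the paper or from ROS 2: the names `Limits`, `project`, and `harness_step`, the speed limit, and the time budget are hypothetical. It also approximates Isolation by checking the elapsed time after the model returns, which detects a blown budget but does not preempt a runaway model the way real isolation would.

```python
import math
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_speed: float  # hypothetical bound on commanded speed, in m/s
    budget_s: float   # hypothetical inference time budget, in seconds


def project(speed, limits):
    """Projection: clamp a commanded speed into the allowed range."""
    return max(-limits.max_speed, min(limits.max_speed, speed))


def harness_step(policy, baseline, observation, limits, clock=time.monotonic):
    """Run one control step; return (command, source).

    `source` is "policy" if the learned policy's output was accepted and
    "baseline" if the harness fell back to the verified controller.
    """
    start = clock()
    try:
        raw = policy(observation)
    except Exception:
        # Transfer: the model crashed, so hand control to the baseline.
        return baseline(observation), "baseline"

    # Isolation (approximated): a late answer is treated as no answer.
    if clock() - start > limits.budget_s:
        return baseline(observation), "baseline"

    # Transfer: a non-numeric or non-finite output cannot be projected.
    if not isinstance(raw, (int, float)) or not math.isfinite(raw):
        return baseline(observation), "baseline"

    return project(raw, limits), "policy"


def stop(observation):
    """Hypothetical verified baseline: always command zero speed."""
    return 0.0


limits = Limits(max_speed=1.0, budget_s=0.05)
# An over-eager policy asks for 3.0 m/s; the harness clamps it to the limit.
print(harness_step(lambda obs: 3.0, stop, None, limits))
```

The `clock` parameter exists only so the timing check can be exercised with a fake clock; a standardized profile would presumably hook into the executor and the transport layer rather than wrap a single function call.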
Here's the thing that strikes me about both papers: they're describing problems that the autonomous vehicle industry has been wrestling with for a decade. Deterministic execution? Waymo and Cruise built custom middleware stacks precisely because ROS wasn't cutting it for safety-critical applications. Resource bounding for AI inference? Every AV company has horror stories about a perception model taking too long and the planning stack running on stale data.
I've seen this movie before. Academic robotics builds on ROS because it's convenient and well-documented. Industry builds custom systems because ROS doesn't meet their requirements. Then, years later, the academic community rediscovers the problems industry already solved (often in proprietary, unpublished ways) and proposes solutions that look suspiciously similar.
That's not a criticism, exactly. It's just how the field works. And to be fair, both papers are doing something valuable: they're making these solutions available to the broader community, not locked up inside some company's internal codebase.
The timing question is whether any of this matters for the current wave of humanoid and general-purpose robot startups. Most of them are building on ROS 2 because, well, what else are you going to use? But they're also moving fast and shipping robots into warehouses and factories where "nondeterministic behavior" is not exactly a selling point.
My guess (and it's just a guess; call me old-fashioned, but I prefer to wait for actual deployment data) is that we'll see a bifurcation. Research labs and early-stage startups will keep using vanilla ROS 2 because it's good enough for prototyping. Companies shipping safety-critical systems will either adopt frameworks like Lingua Franca, build their own harness layers, or abandon ROS entirely.
The Harness Profile idea is interesting because it's trying to prevent that abandonment. If ROS 2 can evolve to meet the requirements of Physical AI deployment, maybe the ecosystem stays unified. If it can't, we'll end up with the same fragmentation we saw in the autonomous vehicle space, where everyone built their own thing and interoperability went out the window.
It's too early to say which way this goes. But if you're building robots with AI in the loop and you haven't thought about determinism and resource bounding, these papers are worth your time. The problems they describe are real, and they're only going to get worse as models get bigger and deployment stakes get higher.
If you want to argue about any of this, my email's on the about page.