A new paper claims to solve the "demonstration scarcity" problem with synthetic data
One of robotics' oldest bottlenecks may have a real solution. Or it may not. A new arXiv paper makes a strong case for synthetic demonstration data.
Demonstration scarcity has been the longest-running bottleneck in robot manipulation research. Models need demonstrations to learn from; demonstrations come from human teleoperation; human teleoperation is expensive and slow. The standard response has been to argue that we need more data.
A new paper on arXiv proposes a different response: generate the data synthetically.
What the paper claims
The authors describe a method for generating synthetic robot demonstrations from a small seed of real demonstrations. The pipeline trains a learned simulator on the seed, then uses the simulator to roll out variations of the demonstrated tasks. The variations include perturbations to object positions, lighting, camera angles, and task instructions.
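To make the shape of that pipeline concrete, here is a minimal Python sketch. It is not the authors' code: the `Demo` record, the `ToySimulator` class, the perturbation ranges, and the instruction paraphrases are all hypothetical, and the "learned simulator" is replaced by a trivial stand-in that moves the gripper in a straight line to the object. Only the overall flow (fit on a seed, perturb the scene, roll out) follows the description above.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Demo:
    object_xy: tuple          # (x, y) of the target object, in metres
    lighting: float           # relative brightness, 1.0 = as recorded
    camera_yaw_deg: float     # camera angle relative to the recording
    instruction: str
    waypoints: tuple          # gripper (x, y) positions over time


class ToySimulator:
    """Hypothetical stand-in for the learned simulator (assumption)."""

    def fit(self, seed_demos):
        # "Training" here only learns the average demonstration length.
        mean_len = sum(len(d.waypoints) for d in seed_demos) / len(seed_demos)
        self.steps = max(1, round(mean_len))
        return self

    def rollout(self, scene):
        # Move the gripper in a straight line from the origin to the object.
        x, y = scene.object_xy
        fracs = [t / self.steps for t in range(1, self.steps + 1)]
        return replace(scene, waypoints=tuple((x * f, y * f) for f in fracs))


# Illustrative paraphrase table; a real system would need a richer source.
PARAPHRASES = {"pick up the block": ["grasp the block", "lift the block"]}


def perturb(demo, rng):
    # Vary object position, lighting, camera angle, and task instruction.
    x, y = demo.object_xy
    options = [demo.instruction] + PARAPHRASES.get(demo.instruction, [])
    return replace(
        demo,
        object_xy=(x + rng.uniform(-0.05, 0.05), y + rng.uniform(-0.05, 0.05)),
        lighting=demo.lighting * rng.uniform(0.7, 1.3),
        camera_yaw_deg=demo.camera_yaw_deg + rng.uniform(-10.0, 10.0),
        instruction=rng.choice(options),
    )


def generate_synthetic(seed_demos, per_seed, rng):
    # Fit the simulator once on the seed, then roll out perturbed scenes.
    sim = ToySimulator().fit(seed_demos)
    return [sim.rollout(perturb(d, rng)) for d in seed_demos for _ in range(per_seed)]


seed = [
    Demo((0.30, 0.10), 1.0, 0.0, "pick up the block",
         ((0.10, 0.03), (0.20, 0.07), (0.30, 0.10))),
]
synthetic = generate_synthetic(seed, per_seed=10, rng=random.Random(0))
print(len(seed), "seed demo ->", len(synthetic), "synthetic demos")
```

In this toy version the rollout is correct by construction, which is exactly what a learned simulator cannot guarantee. Any mismatch between simulated and real dynamics would be copied into every synthetic demonstration.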
The authors report that, on standard manipulation benchmarks, models trained on the synthetic data perform comparably to models trained on 10x larger real-demonstration sets.
If the result holds up, the implications for the field are significant. The demonstration data that has constrained robotics for a decade could become much cheaper to obtain.
Why researchers are sceptical
MIT Technology Review captures the appropriate caution.
Simulator gap is the eternal demon. — Stanford professor (via MIT Technology Review)