Generalization in robot learning
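Compared with conventional CV / NLP tasks, robot learning tasks require stronger generalization, because robot data is harder to collect and the operating conditions are stricter.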
- Cross-environment
  - Capture device / location / objects / lighting / background / geometry
- Cross-task (Open-X)
- Cross-embodiment
How to improve generalization in robot learning?
- Learning from large and diverse datasets / data augmentation, etc.
- RT-2 / RT-Trajectory | the paper includes a fuller survey of related literature
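As a rough illustration of the data augmentation bullet above, here is a minimal sketch using only the Python standard library. It is not taken from RT-2 or RT-Trajectory. The function names (`jitter_brightness`, `random_shift`, `augment`), the grayscale list-of-rows image format, and the default magnitudes are all assumptions made for the example. Brightness jitter loosely stands in for lighting changes, and a small random shift for changes in camera placement.

```python
import random

# Assumed image format for this sketch: grayscale, rows of floats in [0, 1].
Image = list[list[float]]


def jitter_brightness(img: Image, rng: random.Random, max_delta: float = 0.2) -> Image:
    """Add one random brightness offset to every pixel, clamped to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in img]


def random_shift(img: Image, rng: random.Random, max_shift: int = 2) -> Image:
    """Translate the image by a random integer offset, repeating edge pixels."""
    h, w = len(img), len(img[0])
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    out = []
    for y in range(h):
        src_y = min(h - 1, max(0, y + dy))  # clamp the source row index
        out.append([img[src_y][min(w - 1, max(0, x + dx))] for x in range(w)])
    return out


def augment(img: Image, rng: random.Random) -> Image:
    """Apply the geometric shift first, then the photometric jitter."""
    return jitter_brightness(random_shift(img, rng), rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    image = [[(x + y) / 14 for x in range(8)] for y in range(8)]
    augmented = augment(image, rng)
    print(len(augmented), len(augmented[0]))
```

In practice an image library would replace the nested lists, and the augmentation strength would need tuning so that task-relevant cues (for example object color) are not destroyed.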
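Small-scale generalization scenarios (cross-task / cross-embodiment not discussed for now):
- Different camera devices
- Same scene, different materials (glossy vs. rough kitchen surfaces), colors, lighting, etc.
- Different scenes, same task (study vs. kitchen; both involve grasping an object)
- Grasping different objects (ball → cube)
- Static objects → dynamic objects
- Task-irrelevant visual interference (distractors)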
Robot Foundation Models
Vision-Language-Action Model (Input: Vision/Language | Output: Actions)
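The input/output contract above can be sketched as a small typed interface. This is only a hypothetical illustration, not the API of any published model. `Observation`, `Action`, `VLAPolicy`, `ZeroPolicy`, and `rollout` are invented names, and the 6-value pose delta plus a gripper command is an assumed action format.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass
class Observation:
    """Model input: one camera image plus a natural-language instruction."""
    image: Sequence[Sequence[float]]  # assumed grayscale rows, values in [0, 1]
    instruction: str


@dataclass
class Action:
    """Model output (assumed format): end-effector pose delta and gripper command."""
    delta_pose: tuple[float, ...]  # dx, dy, dz, droll, dpitch, dyaw
    gripper: float                 # 0.0 = closed, 1.0 = open


class VLAPolicy(Protocol):
    """Anything that maps (vision, language) to an action fits this interface."""
    def act(self, obs: Observation) -> Action: ...


class ZeroPolicy:
    """Placeholder policy: ignores its input and commands no motion."""
    def act(self, obs: Observation) -> Action:
        return Action(delta_pose=(0.0,) * 6, gripper=1.0)


def rollout(policy: VLAPolicy, observations: Sequence[Observation]) -> list[Action]:
    """Query the policy once per observation and collect the actions."""
    return [policy.act(obs) for obs in observations]


if __name__ == "__main__":
    frame = [[0.0] * 4 for _ in range(4)]
    steps = [Observation(frame, "pick up the ball") for _ in range(3)]
    print(rollout(ZeroPolicy(), steps))
```

A real model would replace `ZeroPolicy` with a network that encodes the image and instruction; the interface only fixes what goes in and what comes out.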