ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots
Source: ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots from Google Research
Posted by Michael Ahn, Software Engineer and Vikash Kumar, Research Scientist, Robotics at Google
Learning-based methods for solving robotic control problems have recently seen significant momentum, driven by the widening availability of simulated benchmarks (like dm_control or OpenAI-Gym) and advancements in flexible and scalable reinforcement learning techniques (DDPG, QT-Opt, or Soft Actor-Critic). While learning through simulation is effective, these simulated environments often encounter difficulty in deploying to real-world robots due to factors such as inaccurate modeling of physical phenomena and system delays. This motivates the need to develop robotic control solutions directly in the real world, on real physical hardware.
The majority of current robotics research on physical hardware is conducted on high-cost, industrial-quality robots (PR2, Kuka-arms, ShadowHand, Baxter, etc.) intended for precise, monitored operation in controlled environments. Furthermore, these robots are designed around traditional control methods that focus on precision, repeatability, and ease of characterization. This stands in sharp contrast with the learning-based methods that are robust to imperfect sensing and actuation, and demand (a) a high degree of resilience to allow real-world trial-and-error learning, (b) low cost and ease of maintenance to enable scalability through replication and (c) a reliable reset mechanism to alleviate strict human monitoring requirements.
In “ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots”, to be presented at CoRL 2019, we introduce an open-source platform of cost-effective robots and curated benchmarks designed primarily to facilitate research and development on physical hardware in the real world. Analogous to an optical table in the field of optics, ROBEL serves as a rapid experimentation platform, supporting a wide range of experimental needs and the development of new reinforcement learning and control methods. ROBEL consists of D’Claw, a three-fingered hand robot that facilitates learning of dexterous manipulation tasks and D’Kitty, a four-legged robot that enables the learning of agile legged locomotion tasks. The robotic platforms are low-cost, modular, easy to maintain, and are robust enough to sustain on-hardware reinforcement learning from scratch.
In order to make the robots relatively inexpensive and easy to build, we based ROBEL’s designs on off-the-shelf components and commonly-available prototyping tools (3D-printed or laser cut). Designs are easy to assemble and require only a few hours to build. Detailed part lists (with CAD details), assembly instructions, and software instructions for getting started are available here.
We devised a set of tasks suitable for each platform, D’Claw and D’Kitty, which can be used for benchmarking real-world robotic learning. ROBEL’s task definitions include both dense and sparse task objectives, and introduce metrics for hardware-safety in the task definition, which for example, indicate if joints are exceeding “safe” operating bounds or force thresholds. ROBEL also supports a simulator for all tasks to facilitate algorithmic development and rapid prototyping. D’Claw tasks are centered around three commonly observed manipulation behaviors — Pose, Turn, and Screw.
D’Kitty tasks are centered around three commonly observed locomotion behaviors — Stand, Orient, and Walk.
We evaluated several classes (on-policy, off policy, demo-accelerated, supervised) of deep reinforcement learning methods on each of these benchmark tasks. The evaluation results and the final policies are included as baselines in the software package for comparison. Full task details and baseline performances are available in the technical report.
Reproducibility & Robustness
ROBEL platforms are robust to sustain direct hardware training, and have clocked over 14,000 hours of real-world experience to-date. The platforms have significantly matured over the year. Owing to the modularity of the design, repairs are trivial and require minimal to no domain expertise, making the overall system easy to maintain.
To establish the replicability of the platforms and reproducibility of the benchmarks, ROBEL was studied in isolation by two different research labs. Only software distribution and documentation was used in this study. No in-person visits were allowed. Using ROBEL’s design files and assembly instructions both sites were able to replicate both hardware platforms. Benchmark tasks were trained on robots built at both sites. In the figure below we see that two D’Claw robots built at two different sites not only exhibit similar training progress but also converge to the same final performance, establishing reproducibility of the ROBEL benchmarks.
ROBEL has been useful in a variety of reinforcement learning studies so far. Below we highlight a few of the key results, and you can find all our results in this comprehensive gallery. D’Claw platforms are completely autonomous and can sustain reliable experimentation for an extended period of time, and has facilitated experimentation with a wide variety of reinforcement learning paradigms and tasks using both rigid and flexible objects.
Importantly, D’Claw platforms are modular and easy to replicate, which facilitates scalable experimentation. With our scaled setup, we find that multiple D’Claws can collectively learn tasks faster by sharing experience.
We have also been successful in deploying robust locomotion policies on the D’Kitty platform. Below we show a blind D’Kitty walking over indoor and outdoor terrains exhibiting the robustness of its gait in presence of unseen disturbances.
When presented with information about its torso and objects present in the scene, D’Kitty can learn to interact with these objects exhibiting complex behaviors.
In conclusion, ROBEL platforms are low cost, robust, reliable and are designed to accommodate the needs of the emerging learning-based paradigms that need scalability and resilience. We are proud to announce the release of ROBEL to the open source community and are excited to learn about the diversity of research and experimentation they will enable. For getting started on ROBEL platforms and ROBEL benchmarks refer to roboticsbenchmarks.org.
Google’s ROBEL D’Claw evolved from earlier designs Vikash Kumar developed at the Universities of Washington and Berkeley. Multiple people across organizations have contributed towards the ROBEL projects. We thank our co-authors Henry Zhu (UC Berkeley), Kristian Hartikainen (UC Berkeley), Abhishek Gupta (UC Berkeley) and Sergey Levine (Google and UC Berkeley) for their contributions and extensive feedback throughout the project. We would like to acknowledge Matt Neiss (Google) and Chad Richards (Google) for their significant contribution to the platform designs. We would also like to thank Aravind Rajeshwaran (U-Washington), Emo Todorov (U-Washington), and Vincent Vanhoucke (Google) for their helpful discussions and comments throughout the project.
Posted by Michael Ahn, Software Engineer and Vikash Kumar, Research Scientist, Robotics at Google Learning-based methods for solving robotic control problems have recently seen significant momentum, driven...
本次博文是cousera的第三次课程，主要涉及slam建图，没有特别复杂的算法，几乎都是概念理解。 我们知道地图，是对机器人所在环境的空间建模(spatial model)。那么，机器人是怎么理解地图得？地图该包含哪些信息？使用什么样的坐...
本次课程讲解，假设在给定地图的情况下，如何根据机器人测量进行定位，如何使用机器人定位算法估计每个时刻机器人的位姿。作业是利用粒子滤波求解机器人定位或跟踪问题。 一、里程模型Odometry Model 1.各种定位传感器...
README.md S-RL Toolbox: Reinforcement Learning (RL) and State Representation Learning (SRL) Toolbox for Robotics This repository was made to evaluate S...
Soft Actor-Critic: Deep Reinforcement Learning for Robotics 2019-01-19...
It seems like robocalls are only getting worse, but there are several ways of reducing the scam calls and these annoying, constant interruptions.
Teaching Uncalibrated Robots to Visually Self-Adapt 2018-06-23
A new generation of robot systems which are modular, flexible and safe for human-robot interaction are needed. Existing cobots seem to meet only the later and require a modular approach to improve their reconfigurability...
No Comments Until recently I was hesitant to recommend UI tests, as far as I was concerned they were brittle and took far t...