Can we make Transformers better and more efficient for robot learning? Excited to introduce Body Transformer (BoT) 🤖, an architecture that leverages robot embodiment in the attention mechanism by treating the robot as a graph of sensors and actuators.

Corrective localized actuation is crucial for efficient locomotion and manipulation (e.g., humans use their ankles to correct for imbalance at the feet). Robot policies typically do not exploit such spatial interrelations, mostly reusing architectures developed for NLP or computer vision.

In practice, we separate observations and actions into a graph of nodes representing the sensors and actuators spread across the robot body. Then, we use masked attention to make sure that, at each layer of the Body Transformer, a node can only attend to itself and its neighbors. Information propagates throughout the graph over successive layers. Provided there is a sufficient number of layers, this simply guides the learning process without compromising the representation power of the architecture. It makes for a ‘flexible’ but strong inductive bias!

BoT surpasses MLP and Transformer baselines on both imitation and reinforcement learning. It shows better generalization and strong scaling properties, as well as potential for much more efficient learning (up to 2× fewer FLOPs in the attention mechanism).

We deployed BoT on a real robot (Unitree A1), sim-to-real, showing its feasibility for real-world deployment!

A lot more details in the paper! Work done at Berkeley Artificial Intelligence Research, with absolutely great collaborators Dun-Ming H., Fangchen Liu, Jongmin Lee, and Pieter Abbeel.

Website: https://lnkd.in/d8a4r2j6
arXiv: https://lnkd.in/ddfbJjmw
Code: https://lnkd.in/dqjVQx4N
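To make the masked-attention idea concrete, here is a minimal NumPy sketch of one attention layer where each node attends only to itself and its graph neighbors. The 5-node chain graph, the embedding size, and the absence of learned projections are all illustrative assumptions for brevity, not details from the paper or its code release:

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 5, 8
# Hypothetical chain of body nodes, e.g., torso-hip-knee-ankle-foot.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Boolean mask: a node may attend to itself and its immediate neighbors.
mask = np.eye(num_nodes, dtype=bool)
for i, j in edges:
    mask[i, j] = mask[j, i] = True

x = rng.standard_normal((num_nodes, dim))  # one embedding per sensor/actuator node
scores = x @ x.T / np.sqrt(dim)            # single head, no projections, for brevity
scores = np.where(mask, scores, -np.inf)   # block attention between non-neighbors
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)   # row-wise softmax over allowed nodes
out = attn @ x                             # each node mixes only neighbor features
```

After one such layer, node 0 has only mixed with node 1; stacking layers lets information travel one hop farther per layer, which is why a sufficient depth preserves full representation power.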