Learning from Symmetry: Meta-Reinforcement Learning with Symmetrical Behaviors and Language Instructions
Xiangtong Yao, Zhenshan Bing, Genghang Zhuang, Kejia Chen, Hongkuan Zhou, Kai Huang, and Alois Knoll
Video of simulation and real-world experiments
Real-world experiments
![scene.jpg](https://static.wixstatic.com/media/b33402_3ea11cf64f6b46d1afbcd370dba3ca35~mv2.jpg/v1/fill/w_540,h_305,al_c,q_80,usm_0.66_1.00_0.01,enc_auto/scene.jpg)
Real-world experiment scenario. We use Intel RealSense D435i cameras to locate the red object.
Push-right and Push-left
Push-right tasks
Task goal position: [0.1, 0.7, 0]
Task goal position: [0.05, 0.65, 0]
Push-left tasks
Task goal position: [-0.1, 0.7, 0]
Task goal position: [-0.05, 0.65, 0]
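The goal positions listed above suggest that each push-left goal is the push-right goal reflected across the plane x = 0. The following is a minimal sketch of that relationship, assuming the symmetry is a sign flip of the x coordinate only; the helper name `mirror_goal` is hypothetical and is not taken from the authors' code.

```python
import numpy as np


def mirror_goal(goal, axis=0):
    """Reflect a 3-D goal position across the plane where `axis` equals zero.

    Assumption: the push-left goals are obtained from the push-right goals
    by negating the x coordinate (axis 0), as the listed values suggest.
    """
    mirrored = np.asarray(goal, dtype=float).copy()
    mirrored[axis] = -mirrored[axis]
    return mirrored


# Push-right goals as listed on this page.
push_right_goals = [[0.1, 0.7, 0.0], [0.05, 0.65, 0.0]]

# Mirrored goals; these match the push-left goals listed on this page.
push_left_goals = [mirror_goal(g).tolist() for g in push_right_goals]
print(push_left_goals)
```

Only the x coordinate changes sign; the y and z coordinates are left untouched, which matches the four goal positions given above.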
We adapt the Panda robot to the Meta-world environment. For more information, please visit
Supplementary materials
![AIRL.jpg](https://static.wixstatic.com/media/b33402_6dce17e2a3784920a2651e25610fb12b~mv2.jpg/v1/fill/w_112,h_83,al_c,q_80,usm_0.66_1.00_0.01,blur_2,enc_auto/AIRL.jpg)
The recovered reward functions of other task families
![airl_door-close.gif](https://static.wixstatic.com/media/b33402_835c0eeeb8fe4bef86e61b3de942a0cc~mv2.gif)
Door-close
![airl_drawer-open.gif](https://static.wixstatic.com/media/b33402_ddad004b6c8f496da2a3cd5cc4ee9e81~mv2.gif)
Drawer-open
![airl_faucet_open.gif](https://static.wixstatic.com/media/b33402_bf3232b539794e88a46688b9c23d52c6~mv2.gif)
Faucet-open
![airl_window-close.gif](https://static.wixstatic.com/media/b33402_15532e1eb9e5475f9494460db8b25f5a~mv2.gif)
Window-close
Visualisation of the trained AIRL policies, each of which is trained on symmetrical trajectories generated by the Symmetric Data Generator.
![meta-training-task.jpg](https://static.wixstatic.com/media/b33402_d22b4717c9d04cd4b9a788866bdf4747~mv2.jpg/v1/fill/w_112,h_82,al_c,q_80,usm_0.66_1.00_0.01,blur_2,enc_auto/meta-training-task.jpg)
![symmetry task.jpg](https://static.wixstatic.com/media/b33402_2098bbe7de7d41369cf031f827a8f9b9~mv2.jpg/v1/fill/w_118,h_82,al_c,q_80,usm_0.66_1.00_0.01,blur_2,enc_auto/symmetry%20task.jpg)
Language instructions of meta-training tasks and symmetry tasks
![door-open.gif](https://static.wixstatic.com/media/b33402_14fd956336604d31881ace07f5096ecc~mv2.gif)
Door-open
![drawer-close.gif](https://static.wixstatic.com/media/b33402_d32e650a673d412399001bdc80f9c692~mv2.gif)
Drawer-close
![window-open.gif](https://static.wixstatic.com/media/b33402_2542b1ae18384c06b7cb2b48a208e2a6~mv2.gif)
Window-open
![faucet-close.gif](https://static.wixstatic.com/media/b33402_5e6be4cfe74c4f70a3c4febde766aca0~mv2.gif)
Faucet-close
![door-close.gif](https://static.wixstatic.com/media/b33402_8e6233b7fbb64ed9ab81097556393a0c~mv2.gif)
Door-close
![drawer-open.gif](https://static.wixstatic.com/media/b33402_b4bc30a57c0847da9aead0ecedccbc95~mv2.gif)
Drawer-open
![window-close.gif](https://static.wixstatic.com/media/b33402_a59fd66e17f44c80b18a3c5950ef8054~mv2.gif)
Window-close
![faucet-open.gif](https://static.wixstatic.com/media/b33402_c90ee552d1c54f59b3f702e75a1034d8~mv2.gif)
Faucet-open
Visualisation of the meta-training tasks and meta-test tasks. The settings of the above tasks are the same as those of the Meta-world benchmark.