2024

Active Vision Might Be All You Need: Exploring Active Vision in Bimanual Robotic Manipulation

Ian Chuang*, Andrew Lee*, Dechen Gao, Iman Soltani (* equal contribution)

Workshop on Whole-body Control and Bimanual Manipulation @ CoRL 2024
International Conference on Robotics and Automation (ICRA) 2025

We introduce AV-ALOHA, a new bimanual teleoperation robot system that extends the ALOHA 2 robot system with Active Vision. This system provides an immersive teleoperation experience, with bimanual first-person control, enabling the operator to dynamically explore and search the scene and simultaneously interact with the environment. We conduct imitation learning experiments and our results show significant improvements over fixed cameras in tasks with limited visibility.

InterACT: Inter-dependency Aware Action Chunking with Hierarchical Attention Transformers for Bimanual Manipulation

Andrew Lee, Ian Chuang, Ling-Yuan Chen, Iman Soltani

Conference on Robot Learning (CoRL) 2024

InterACT is an imitation learning model that captures the inter-dependencies between dual-arm joint positions and visual inputs. By doing so, InterACT guides the two arms to perform bimanual tasks with precision—independently yet in seamless coordination.
