MIT researchers have introduced a new approach to robot training that diverges from traditional methods by borrowing a strategy from the success of large language models such as GPT-4. Rather than training robots only on narrow, task-specific data, the method pools vast, diverse datasets to improve their adaptability and problem-solving capabilities.
This approach, named Heterogeneous Pretrained Transformers (HPT), was developed to address the limitations of standard imitation learning, in which robots often fail when facing minor environmental changes or unexpected obstacles.
Traditional models trained on a limited dataset struggle to adapt because they lack the depth of data necessary to cope with variability. By contrast, HPT uses an architecture designed to integrate information from an array of sensors and settings, creating a rich, varied learning environment for robots.
Lirui Wang, the lead author of the study, explains that whereas language models work with relatively uniform data (sentences), robotics deals with far more heterogeneous data: visual inputs, sensor readings from different kinds of environments, and various task-related information.
To handle this complexity, the HPT framework employs transformers, a type of neural network architecture known for its prowess in handling sequences in natural language processing, to weave together this diverse data.
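The paper's actual implementation is not reproduced here, but the general idea of mapping heterogeneous inputs into a shared token space that a single transformer then processes can be sketched roughly as follows. This is a minimal PyTorch illustration; the module names, dimensions, and pooling choices are illustrative assumptions, not the HPT code.

```python
import torch
import torch.nn as nn

class ModalityStem(nn.Module):
    """Projects one input modality (e.g. camera features or joint states)
    into a fixed number of tokens in a shared embedding space."""
    def __init__(self, input_dim: int, embed_dim: int, num_tokens: int = 16):
        super().__init__()
        self.proj = nn.Linear(input_dim, embed_dim * num_tokens)
        self.num_tokens = num_tokens
        self.embed_dim = embed_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, input_dim) -> (batch, num_tokens, embed_dim)
        return self.proj(x).view(-1, self.num_tokens, self.embed_dim)

class SharedTrunk(nn.Module):
    """A transformer encoder shared across robot bodies, sensors, and tasks."""
    def __init__(self, embed_dim: int = 256, depth: int = 6, heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.encoder(tokens)

class ActionHead(nn.Module):
    """Maps pooled trunk features to an action vector for one robot body."""
    def __init__(self, embed_dim: int, action_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, action_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Mean-pool the token sequence, then predict actions.
        return self.mlp(features.mean(dim=1))

# Example: fuse vision features and proprioception for one robot.
vision_stem = ModalityStem(input_dim=512, embed_dim=256)   # e.g. image encoder output
proprio_stem = ModalityStem(input_dim=14, embed_dim=256)   # e.g. joint angles
trunk = SharedTrunk(embed_dim=256)
head = ActionHead(embed_dim=256, action_dim=7)

vision = torch.randn(4, 512)
proprio = torch.randn(4, 14)
tokens = torch.cat([vision_stem(vision), proprio_stem(proprio)], dim=1)
actions = head(trunk(tokens))   # shape: (4, 7)
```

The key design point, as described in the article, is that only the input and output modules depend on a particular robot or sensor suite; the transformer in the middle sees everything as tokens and can therefore be trained on data from many different sources at once.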
The size of the transformer model plays a crucial role; the larger the model, the more nuanced and effective the learning can be. This method allows robots to be initially trained on a broad spectrum of scenarios, equipping them with a more robust set of skills that can be fine-tuned for specific tasks. Users can specify the robot’s design, its operational environment, and the task at hand, making the training process highly adaptable.
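In that spirit, adapting the pretrained model to a specific robot design, environment, and task might look like the following continuation of the sketch above. Again, this is an assumed workflow, not the authors' code: the broadly pretrained trunk is reused, while new embodiment-specific stems and an action head are trained for the target robot.

```python
# Hypothetical fine-tuning step: reuse the pretrained shared trunk and
# attach fresh input/output modules for a new robot and task.
new_proprio_stem = ModalityStem(input_dim=20, embed_dim=256)  # different joint layout
new_head = ActionHead(embed_dim=256, action_dim=6)            # different action space

# Freeze the broadly pretrained trunk; train only the robot-specific parts.
for p in trunk.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    list(new_proprio_stem.parameters()) + list(new_head.parameters()), lr=1e-4)
```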
David Held, an associate professor at Carnegie Mellon University, expressed an ambitious vision for this technology: envisioning a future where robots could come pre-equipped with a “universal robot brain,” ready to perform tasks without extensive on-site training. This aspiration underscores the transformative potential of HPT, suggesting a future where robots could be as versatile and adaptable in physical tasks as AI models are in generating text.
This research, partially funded by the Toyota Research Institute (TRI), reflects a broader trend in robotics toward leveraging AI for more autonomous and flexible robotic operations.
TRI’s previous work in overnight robot training, combined with their recent collaboration with Boston Dynamics, highlights the industry’s move towards integrating sophisticated AI learning with advanced robotic hardware, potentially revolutionizing how robots are developed and deployed across various sectors.
h/t TechCrunch