CAMBRIDGE, MA—Engineers at the Massachusetts Institute of Technology (MIT) have developed a way for machines to use whole-body manipulation instead of relying on grippers and other types of end-of-arm tools.
Their approach to contact-rich manipulation planning uses an AI technique called “smoothing,” which summarizes many contact events into a smaller number of decisions, so that even a simple algorithm can quickly identify an effective manipulation plan for the robot.
Reinforcement learning is a machine-learning technique in which an agent, like a robot, learns to complete a task through trial and error, earning a reward as it gets closer to a goal. This type of learning takes a black-box approach, because the system must learn everything about the world through trial and error. It has been used effectively for contact-rich manipulation planning, where the robot seeks to learn the best way to move an object in a specified manner.
But because a robot may have to reason about billions of potential contact points when determining how to use its fingers, hands, arms and body to interact with an object, this trial-and-error approach requires a great deal of computation.
“Rather than thinking about this as a black-box system, if we can leverage the structure of these kinds of robotic systems using models, there is an opportunity to accelerate the whole procedure of trying to make these decisions and come up with contact-rich plans,” says H.J. Terry Suh, an electrical engineering and computer science graduate student at MIT who’s working on the project. “Reinforcement learning may need to go through millions of years in simulation time to actually be able to learn a policy.”
Suh and his colleagues conducted a detailed analysis and found that smoothing is what enables reinforcement learning to perform well. Smoothing averages away many unimportant, intermediate decisions, leaving a few important ones. Reinforcement learning performs this smoothing implicitly by trying many contact points and then computing a weighted average of the results.
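The averaging idea can be illustrated with a toy example. The sketch below is not the researchers' model; it is a minimal one-dimensional illustration that assumes a hypothetical pusher and box, Gaussian noise on the pusher's target, and a plain (equally weighted) average over samples. A pusher that stops short of the box gets no signal about how to move it, while the averaged outcome shows that pushing farther would help.

```python
import random
import statistics

BOX_START = 1.0  # hypothetical initial box position (assumption for this toy)


def box_position(push_target: float) -> float:
    """Toy contact model: the pusher starts at 0 and moves to push_target.

    The box moves only if the pusher actually reaches it.
    """
    return max(BOX_START, push_target)


def smoothed_box_position(push_target: float, noise_std: float = 0.5,
                          samples: int = 5000, seed: int = 0) -> float:
    """Average the outcome over randomly perturbed push targets.

    A fixed seed reuses the same noise on every call, so nearby
    targets can be compared without sampling jitter.
    """
    rng = random.Random(seed)
    return statistics.fmean(
        box_position(push_target + rng.gauss(0.0, noise_std))
        for _ in range(samples)
    )


def finite_difference(f, x: float, h: float = 1e-3) -> float:
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    target = 0.5  # the pusher stops short of the box
    print("slope without smoothing:", finite_difference(box_position, target))
    print("slope with smoothing:   ", finite_difference(smoothed_box_position, target))
```

Without smoothing the slope is exactly zero, because a small change to the target still leaves the pusher short of the box. With smoothing, the fraction of perturbed samples that do make contact yields a positive slope, which is the kind of information a planner can follow.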
Drawing on this insight, the MIT engineers designed a simple model that performs a similar type of smoothing, enabling it to focus on core robot-object interactions and predict long-term behavior. They discovered that this approach can be just as effective as reinforcement learning at generating complex plans.
Even though smoothing greatly simplifies the decisions, searching through the remaining decisions can still be a difficult problem. So, the researchers combined their model with an algorithm that can rapidly and efficiently search through all possible decisions the robot could make. With this combination, the computation time was cut down to about 1 minute on a standard laptop.
Suh believes this new method could enable manufacturers to use smaller, mobile robots that can manipulate objects with their entire arms or bodies, rather than large robotic arms that can only grasp using fingertips.