As a child, I often accompanied my mother to the grocery store. As she pulled out her card to pay, I heard the same phrase like clockwork: “Go bag the groceries.” It was not my favorite task. Now imagine a world where robots could delicately pack your groceries, and items like bread and eggs are never crushed beneath heavier items. We might be getting closer with RoboGrocery.
Researchers at the Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) have created a new soft robotic system that combines advanced vision technology, motor-based proprioception, soft tactile sensors, and a new algorithm. RoboGrocery can handle a continuous stream of unpredictable objects moving along a conveyor belt, the researchers said.
“The challenge here is making immediate decisions about whether to pack an item or not, especially since we make no assumptions about the object as it comes down the conveyor belt,” said Annan Zhang, a Ph.D. student at MIT CSAIL and one of the lead authors on a new paper about RoboGrocery. “Our system measures each item, decides if it’s delicate, and packs it directly or places it in a buffer to pack later.”
RoboGrocery demonstrates a light touch
RoboGrocery’s pseudo market tour was a success. In the experimental setup, the researchers selected 10 items from a set of previously unseen, realistic grocery items and placed them on a conveyor belt in random order. They repeated this process three times and evaluated “bad packs” by counting the number of heavy items placed on delicate items.
The soft robotic system showed off its light touch by performing nine times fewer item-damaging maneuvers than the sensorless baseline, which relied solely on pre-programmed grasping motions without sensory feedback. It also damaged items 4.5 times less than the vision-only approach, which used cameras to identify items but lacked tactile sensing, said MIT CSAIL.
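That “bad pack” count is the key metric behind these comparisons. A minimal sketch of how such a count might be computed is below; the scoring function, names, and the bottom-to-top stack representation are illustrative assumptions, not the procedure from the paper:

```python
def count_bad_packs(stack, delicate):
    """Count 'bad packs': non-delicate (heavy) items placed above any delicate item.

    `stack` lists item names from the bottom of the bin to the top;
    `delicate` is the set of item names considered delicate.
    """
    bad = 0
    seen_delicate = False
    for name in stack:
        if name in delicate:
            seen_delicate = True
        elif seen_delicate:
            bad += 1  # a heavy item sits above at least one delicate item
    return bad


# A soup can packed on top of bread counts as one bad pack;
# grapes placed last (on top) do not.
print(count_bad_packs(["bread", "soup can", "grapes"], {"bread", "grapes"}))
```

Under this scheme, a perfect run scores zero: every delicate item ends up above the heavy ones.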
To illustrate how RoboGrocery works, let’s consider an example. A bunch of grapes and a can of soup come down the conveyor belt. First, the RGB-D camera detects the grapes and soup, estimating sizes and positions.
The gripper picks up the grapes, and the soft tactile sensors measure the pressure and deformation, signaling that they’re delicate. The algorithm assigns a high delicacy score and places them in the buffer.
Next, the gripper goes in for the soup. The sensors measure minimal deformation, meaning “not delicate,” so the algorithm assigns a low delicacy score, and packs it directly into the bin.
Once all non-delicate items are packed, RoboGrocery retrieves the grapes from the buffer and carefully places them on top so they aren’t crushed. Throughout the process, a microprocessor handles all sensory data and executes packing decisions in real time.
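The pack-or-buffer logic in the walkthrough above can be sketched in a few lines of Python. The deformation-based delicacy score, the threshold value, and all of the names here are illustrative assumptions, not the authors’ implementation:

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    deformation: float  # normalized tactile deformation reading (0 = fully rigid)


# Hypothetical cutoff; the system's actual heuristic may differ.
DELICACY_THRESHOLD = 0.5


def delicacy_score(item):
    """Stand-in for the tactile heuristic: more deformation = more delicate."""
    return item.deformation


def pack_stream(items):
    """Pack a stream of items as they arrive: rigid items go straight into
    the bin, delicate ones wait in a buffer and are placed on top at the end."""
    bin_stack, buffer = [], []
    for item in items:
        if delicacy_score(item) >= DELICACY_THRESHOLD:
            buffer.append(item)    # delicate: defer packing
        else:
            bin_stack.append(item)  # rigid: pack directly
    bin_stack.extend(buffer)        # delicate items end up on top
    return bin_stack


# Grapes arrive first but are buffered; the soup can is packed immediately,
# so the final bin has the soup on the bottom and the grapes on top.
order = pack_stream([Item("grapes", 0.8), Item("soup can", 0.1)])
print([item.name for item in order])
```

The buffer is what makes this an online algorithm: the system never needs to see the whole order in advance, only to defer anything fragile until the rigid items are in place.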
The researchers tested various grocery items to ensure robustness and reliability. They included delicate items such as bread, clementines, grapes, kale, muffins, chips, and crackers. The team also tested non-delicate items like soup cans, ground coffee, chewing gum, cheese blocks, prepared meal boxes, ice cream containers, and baking soda.
RoboGrocery handles more varied objects than other systems
Traditionally, bin-packing tasks in robotics have focused on rigid, rectangular objects. These methods, though, can fail to handle objects of varying shapes, sizes, and stiffness.
However, with its custom blend of RGB-D cameras, closed-loop control servo motors, and soft tactile sensors, RoboGrocery overcomes this limitation, according to MIT. The cameras provide depth information and color images to accurately determine the objects’ shapes and sizes as they move along the conveyor belt.
The motors offer precise control and feedback, allowing the gripper to adjust its grasp based on the object’s characteristics. Finally, the sensors, integrated into the gripper’s fingers, measure the pressure and deformation of the object, providing data on stiffness and fragility.
Despite its success, there’s always room for improvement. The current heuristic for determining whether an item is delicate is somewhat crude and could be refined with more advanced sensing technologies and better grippers. “Currently, our grasping methods are quite basic, but enhancing these techniques can lead to significant improvements,” says Zhang. “For example, determining the optimal grasp direction could minimize failed attempts and efficiently handle items placed on the conveyor belt in unfavorable orientations. A cereal box lying flat might be too large to grasp from above, but standing upright, it could be perfectly manageable.”
MIT CSAIL team looks ahead
While the project is still in the research phase, its potential applications could extend beyond grocery packing. The team envisions use in various online packing scenarios, such as packing for a move or in recycling facilities, where the order and properties of objects are unknown.
“This is a significant first step towards having robots pack groceries and other items in real-world settings,” says Zhang. “Although we’re not quite ready for commercial deployment, our research demonstrates the power of integrating multiple sensing modalities in soft robotic systems.”
“Automating grocery packing with robots capable of soft and delicate grasping and high-level reasoning like the robot in our project has the potential to impact retail efficiency and open new avenues for innovation,” says senior author Daniela Rus, CSAIL director and professor of electrical engineering and computer science (EECS) at MIT.
“Soft grippers are suitable for grasping objects of various shapes and, when combined with proper sensing and control, they can solve long-lasting robotics problems, like bin packing unknown objects,” adds Cecilia Laschi, Provost’s Chair Professor of robotics at the National University of Singapore, who was not involved in the work. “This is what this paper has demonstrated, bringing soft robotics a step forward towards concrete applications.”
“The authors have addressed a longstanding problem in robotics — the handling of delicate and irregularly-shaped objects — with a holistic and bioinspired approach,” says Harvard University professor of electrical engineering Robert Wood, who was not involved in the paper. “Their use of a combination of vision and tactile sensing parallels how humans accomplish similar tasks and, importantly, sets a benchmark for performance that future manipulation research can build on.”
Zhang co-authored the paper with EECS Ph.D. student Valerie K. Chen ’22 MEng ’23, Jeana Choi ’21 MEng ’22, and Lillian Chin ’17 SM ’19 PhD ’23, currently an assistant professor at the University of Texas at Austin. The researchers presented their findings at the IEEE International Conference on Soft Robotics (RoboSoft) earlier this year.
About the author
Rachel Gordon is senior communications manager at MIT CSAIL. This article is reposted with permission.