When we think of autonomous navigation, the first thing that usually comes to mind is self-driving cars. Although their development has spanned decades, recent years have seen significant advancements.
One important framework used throughout the self-driving car industry is the classification of levels of driving automation. Defined by the Society of Automotive Engineers (SAE) in 2014, this framework remains a standard reference in the field.
While indoor mobile robots have enjoyed nowhere near the fame that self-driving cars have, they’ve evolved substantially in the past decade as well. Driven by staff shortages, service robots are increasingly being deployed across various industries, including hospitality, healthcare, warehouse and logistics, food service, and cleaning.
Relay robots, in particular, are being deployed in busy hospitals and hotels across the world. However, unlike automated driving, there is currently no widely adopted standard for levels of autonomous navigation for indoor robots. Our objective is to present such a framework.
Because a human driver is inherently available as a fallback in self-driving cars, much of the SAE framework is based on the distribution of driving responsibilities between the human driver and the self-driving agent. Level 0 indicates no automation, where the human driver is completely in control.
Levels 1, 2, and 3 have varying degrees of partial automation. At Level 4, the vehicle is fully self-driving, but only under certain defined conditions. Leading self-driving companies like Waymo have achieved this level of autonomy.
Finally, Level 5 is full automation everywhere and in all conditions. This level has not been achieved yet.
What influences levels of autonomous navigation for indoor robots?
Installation complexity
Indoor robots do not have an inherent partnership with a human driver. Essentially, they begin at Level 4 of the SAE framework in this regard. But indoor robots do have a different advantage, another crutch to rely on at the initial levels of autonomy: the ability to modify their environment.
For example, modifying a building’s infrastructure by painting lines on the floor or placing landmarks on the walls is far less difficult than modifying all road infrastructure. Such markers can be very helpful aids for automated guided vehicle (AGV) navigation.
In general, indoor robots today go through an installation process before being put into operation. In addition to modifying building infrastructure, mapping, labeling, and other required setup can be a part of this process. This can often be cost-, time-, and labor-intensive.
The more advanced the navigation skills of the robot, though, the less complicated the installation process tends to be. And lower installation complexity leads to lower cost and less friction for adoption.
Installation complexity is thus an important factor to consider while defining the levels of autonomous navigation for indoor robots.
Social navigation
Another major distinction between self-driving cars and indoor autonomous robots is, of course, the difference in environments. With the exception of factory-like environments, most indoor environments are very unstructured. There are no lanes or signals, no dedicated crosswalks for people, and no well-defined rules of the road.
Instead, indoor environments are highly social spaces. Robots have to co-navigate with all other agents, human and robot, that are also using the space. Well-defined rules of the road are replaced by a loosely defined set of social rules that change based on country, environment, situation and many other factors. For instance, do robots, people, or other vehicles pass on the left or the right?
Successfully navigating in these highly unstructured and social environments requires skills and behaviors that are usually placed under the label “social navigation.” At a high level, social navigation is a set of behaviors that allows a robot to navigate in human-populated environments in a way that preserves or even enhances the experience of the humans around it.
While functional navigation focuses on safety and efficiency, resulting in robots that can complete a task but often need humans to adapt to them, social navigation focuses on the quality of human experience and allows robots to adapt to humans. This may not be crucial for controlled, human-sparse environments like factories and warehouses but becomes increasingly important for unstructured, human-populated environments.
Know your operational domain
A robot’s operational domain is the kinds of environments it can be successful in. Not all indoor environments are the same. Different environments have different needs and might require different levels of navigation sophistication.
For instance, warehouses and factories allow robots with simpler, safety-focused navigation to be successful. On the other hand, environments like hotels or restaurants are unstructured and unpredictable, and they require higher levels of navigation skill, particularly social navigation. Even more challenging are highly crowded environments or sensitive environments like hospitals and elder care homes.
Not every indoor environment requires a robot of the highest social navigation level, but placing a robot with low social navigation skill in environments like hospitals can result in poor performance. So it is important to define the operational domain of a robot.
Multi-floor autonomous navigation
Self-driving cars need only worry about single-level roads. But a large number of buildings in the world are multi-floor, and robots need to be able to traverse those floors to be effective. Overcoming this challenge of vertical navigation can result in a huge increase in a robot’s operational domain and is an important factor to consider when defining a robot’s level.
So installation complexity, social navigation, and operational domain are the three barometers against which we can measure the level of autonomous navigation for indoor robots.
Multi-floor navigation, while hugely important, is somewhat orthogonal to 2D navigation skill and robots of every navigation level could potentially access it. So we create a level modifier for this capability that could be added to any level.
With that, let’s dive into defining levels of indoor robot navigation.
Levels of autonomous navigation for indoor robots
Level 0
These are robots that have no autonomous navigation capabilities and rely entirely on humans to operate them. This category includes telepresence robots and remote-controlled robots such as remote-controlled cars.
Level 1
These are robots that have a minimal sensor suite and can only navigate on paths that are predefined using physical mechanisms like wires buried in the floor, magnetic tape, or paint. These Level 1 robots have no ability to leave these predefined paths.
Such AGVs have no concept of location, using only the distance traveled along the path to make decisions. They can typically detect obstacles and slow down or stop for them, but they do not have the ability to avoid obstacles.
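The decision logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not any vendor's controller: the function name `level1_speed`, the slow-zone list, and all speeds and thresholds are hypothetical values chosen for the example.

```python
def level1_speed(distance_m, obstacle_range_m, slow_zones,
                 cruise=1.0, slow=0.3, slow_range_m=2.0, stop_range_m=0.5):
    """Pick a speed (m/s) for a robot that can only follow a fixed guide path.

    distance_m:       distance travelled along the path (the robot's only
                      notion of "where it is")
    obstacle_range_m: range to the nearest obstacle ahead, or None if clear
    slow_zones:       (start_m, end_m) stretches of the path, configured at
                      installation, where the robot must slow down
    All names and numbers here are assumptions made for illustration.
    """
    # The robot can stop for an obstacle, but it cannot steer around it.
    if obstacle_range_m is not None and obstacle_range_m <= stop_range_m:
        return 0.0

    in_slow_zone = any(start <= distance_m <= end for start, end in slow_zones)
    obstacle_near = obstacle_range_m is not None and obstacle_range_m <= slow_range_m
    if in_slow_zone or obstacle_near:
        return slow

    return cruise
```

Note that the only outputs are speeds. There is no steering decision to make, which is exactly why a blocked path halts a Level 1 robot until the obstacle is removed.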
Level 1 robots need extensive changes to a building’s infrastructure during installation, leading to significant cost. They have almost no social navigation capability, and so their operational domain is mainly highly structured and controlled manufacturing and logistics environments.
Level 2
Robots operating at Level 2 are AGVs that do not need physical path definition but still rely on paths that are digitally defined during installation. These mobile robots can localize themselves within a site using external aids such as reflectors, fiducials or beacons that are placed in strategic locations at the site. They can use this location to follow the virtually defined paths.
Like Level 1 robots, these robots also cannot leave their virtual predefined paths and can only detect and stop for obstacles but cannot avoid them.
Although the infrastructure changes required are not as intrusive as those for Level 1, these robots still need external localization sources to be installed, so their installation complexity is moderate. The fixed paths mean that they have low social navigation skill and are still best used in relatively structured environments with little to no interaction with humans.
Level 3
Robots operating at Level 3 rely entirely on onboard sensors for navigation. They use lidars and/or cameras to form a map of their environment and localize themselves within it. Using this map, they can plan their own paths through the site. They can also dynamically change their path if they detect obstacles on it. So they can not only detect obstacles, but can also avoid them.
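As a rough illustration of planning a path on a map and changing it when an obstacle appears, here is a minimal sketch. It assumes the map is a small occupancy grid (0 = free, 1 = blocked) and uses plain breadth-first search; real robots use far richer maps and planners, and the function names `plan_path` and `replan_if_blocked` are hypothetical.

```python
from collections import deque


def plan_path(grid, start, goal):
    """Shortest path on a 4-connected occupancy grid via breadth-first search.

    grid is a list of rows; 0 means free, 1 means blocked.
    Returns a list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and nxt not in parents):
                parents[nxt] = cell
                queue.append(nxt)
    return None  # goal is unreachable


def replan_if_blocked(grid, path, start, goal):
    """Keep the current path unless a newly sensed obstacle lies on it."""
    if any(grid[r][c] == 1 for r, c in path):
        return plan_path(grid, start, goal)
    return path


# Usage: an obstacle appears on the planned route, so the robot detours.
grid = [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]
path = plan_path(grid, (0, 0), (0, 2))
grid[0][1] = 1  # onboard sensors detect an obstacle on the path
path = replan_if_blocked(grid, path, (0, 0), (0, 2))
```

The contrast with Levels 1 and 2 is the second function: instead of stopping and waiting, the robot searches its own map for another route.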
This independence and flexibility of Level 3 robots results in moderate social navigation skills and significantly reduced installation complexity since no infrastructure changes are required.
Level 3 robots can be used in unstructured environments where they can navigate alongside humans. They represent a significant increase in intelligence, and systems of this level and higher are called autonomous mobile robots (AMRs). Most modern service robots belong to this category.
Level 4
Even though robots of Level 3 cross the threshold of navigating in unstructured environments alongside humans, they still navigate with moderate social navigation skill. They do not have the advanced social navigation skills needed to adapt to all human interaction scenarios with sophistication. This sometimes requires the humans they interact with to compensate for their behavioral limitations.
In contrast, Level 4 robots are AMRs with social navigation skills evolved enough to be on par with humans. They can capably navigate in any indoor environment in any situation provided there aren’t any physical limitations.
This means that their operational domain can include all indoor environments. Another ramification of this is that Level 4 robots should never need human intervention to navigate.
This level has not yet been fully achieved, and defining and evaluating everything that is required for such sophisticated social navigation is challenging and remains an active area of research. Here is an infographic from a recent attempt to capture all the facets of social navigation:
To navigate capably in all indoor environments, robots need to be able to optimize within a complex, ill-defined, and constantly changing set of rules. This is something that humans handle effortlessly and often without conscious thought, but that ease belies a lot of complexity. Here are a few challenges that lie on the path to achieving human-level social navigation:
- Proxemics: Every person has a space around them that is considered personal space. Invading that space can make them uncomfortable, and robots need to respect that while navigating. However, the size and shape of this space bubble can vary based on culture, environment, situation, crowd density, age, gender, etc. For example, a person with a walker might need a larger-than-average space bubble around them for comfort, but this space has to shrink considerably when taking an elevator. Specifying rules for every situation can quickly become intractable.
- Shared resources: Doors, elevators, and other shared resources in a building have their own implicit sets of rules for use. Navigation patterns that hold for the rest of the building might not apply here. In addition, robots need to follow certain social norms while using these resources. Opening doors for others is considered polite. Waiting for people to exit an elevator before trying to enter, making space for people trying to get off a crowded elevator, or even temporarily getting off the elevator entirely to make space for people to exit are common courtesies that robots need to observe.
- Communicating intent: Robots need to be able to communicate their intent while co-navigating with other agents. Not doing so can sometimes create uncertainty and confusion. Humans do this with body language, eye contact, or verbal communication. We rely on this particularly when we find ourselves in deadlock situations like walking toward another person in a narrow corridor or when approaching the same door at the same time. Robots also need to be able to resolve situations like these while preserving the safety and comfort of the humans they’re interacting with.
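To make the proxemics challenge concrete, here is a minimal sketch of a context-dependent personal-space cost that a planner could try to keep low. Every number and context label in it is an assumption made up for illustration, not a measured or recommended value, and the function names are hypothetical.

```python
import math

# Assumed baseline radius of a person's comfort bubble, in meters.
BASE_RADIUS_M = 1.2

# Assumed multipliers: the bubble grows for a person using a walker and
# shrinks in an elevator, echoing the examples in the list above.
CONTEXT_SCALE = {
    "mobility_aid": 1.5,
    "elevator": 0.4,
}


def personal_space_radius(contexts=()):
    """Scale the baseline bubble by every context that applies."""
    radius = BASE_RADIUS_M
    for context in contexts:
        radius *= CONTEXT_SCALE.get(context, 1.0)
    return radius


def proxemic_cost(robot_xy, person_xy, contexts=()):
    """Return 0.0 outside the bubble, rising linearly to 1.0 at the person."""
    radius = personal_space_radius(contexts)
    distance = math.dist(robot_xy, person_xy)
    if distance >= radius:
        return 0.0
    return 1.0 - distance / radius
```

Even this toy version shows why hand-written rules scale poorly: each new factor (culture, crowd density, age, and so on) adds another entry to the table and another interaction between entries.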
All in all, achieving this level of social navigation is extremely challenging. While some Level 3 robots may have partially solved some of these problems, there is still a long way to go to reach true Level 4 autonomy.
Level 5
As humans, we are able to find our way even in new, unfamiliar buildings by relying on signage, using semantic knowledge, and by asking for directions when necessary. Robots today cannot do this. At the very least, the site needs to be fully mapped during installation.
Level 5 robots are robots that could navigate in all indoor environments on par with human skill, as well as do so in a completely new environment without detailed prebuilt maps and a manually intensive installation process. This would remove installation complexity entirely, allowing robots to be operational in new environments instantly, reducing friction for adoption, and paving the way for robots to become more widespread.
The framework for self-driving cars has no equivalent level, even though self-driving cars go through a similar process in which high-precision 3D maps of an area are created and annotated before a car can operate in it. Developments in artificial intelligence could help realize Level 5 capability.
Multi-floor autonomous navigation+
Robots that can either climb stairs or call, board, and leave elevators unlock multi-floor navigation and get the “plus” designation. So a Level 2 robot that can successfully ride elevators would be designated Level 2+. In addition, any robot that operates in a multi-floor building needs highly reliable sensors to detect and avoid safety hazards like staircases and escalators.
Elevator riding is the more common of the two approaches to this capability and may require infrastructure changes to the elevator system to achieve. So this introduces additional installation complexity.
It is also worth noting that in human-populated environments, elevators present robots with an additional social navigation challenge. Riding an elevator requires moving in a confined space with many other agents, meeting tight time constraints for entry and exit, and dealing with the special behavioral patterns that humans engage in while riding elevators.
In summary, robots of Levels 1 and 2 rely heavily on infrastructure changes for navigation and have low social navigation, so they are best suited for structured, human-sparse environments.
Robots of Level 3 are more intelligent and self-reliant. They require almost no infrastructure changes during installation, but at minimum they require the environment to be mapped and labeled. They possess moderate social navigation skills and can operate in unstructured, human-populated environments.
Level 4 represents an advancement to human-level navigation skill allowing for safe deployment in any indoor environment. Level 5 robots take this a step further, navigating with the same proficiency even in entirely new, unfamiliar spaces. Any of these robots that can do multi-floor navigation get the additional “+” designation.
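The framework can also be written down as a small lookup. This is a sketch under stated assumptions: the capability flags and the names `NavCapabilities` and `navigation_level` are hypothetical simplifications of the level descriptions above, not a formal test procedure.

```python
from dataclasses import dataclass


@dataclass
class NavCapabilities:
    """Hypothetical capability flags distilled from the level descriptions."""
    autonomous: bool = False          # moves without a human operator
    virtual_paths: bool = False       # follows digital paths using external aids
    onboard_mapping: bool = False     # maps, localizes, plans, avoids obstacles
    human_level_social: bool = False  # social navigation on par with humans
    no_prebuilt_map: bool = False     # works in new sites without installation
    multi_floor: bool = False         # climbs stairs or rides elevators


def navigation_level(cap):
    """Map capabilities to a level label, adding "+" for multi-floor robots."""
    if not cap.autonomous:
        level = 0
    elif cap.onboard_mapping and cap.human_level_social and cap.no_prebuilt_map:
        level = 5
    elif cap.onboard_mapping and cap.human_level_social:
        level = 4
    elif cap.onboard_mapping:
        level = 3
    elif cap.virtual_paths:
        level = 2
    else:
        level = 1  # autonomous, but only along physically defined paths
    return f"Level {level}{'+' if cap.multi_floor else ''}"
```

For example, a robot with `autonomous`, `onboard_mapping`, and `multi_floor` set would be labeled Level 3+, because the multi-floor modifier is applied independently of the base level.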
Autonomous navigation must be reliable
A crucial factor for success that is not represented in this framework is the overall robustness and reliability of the product. It is easy to underestimate the complexity and unpredictability of real-world environments. Robotic systems typically take several years of field experience to go from a cool lab demonstration to a robust and reliable product that people can rely on.
For example, Relay Robotics offers Level 3+ robots that have already completed over 1.5 million successful deliveries and accumulated years of real-world operational experience. With this mature technology as a foundation, the company is making strides toward Level 4+ navigation.
Relay’s focus on creating sophisticated social navigation that can handle even busy and stressful environments like hospital emergency departments has made its AMRs among the most sophisticated on the market today. For the company and the broader industry, the key to advancing further lies in enhancing social navigation capabilities.
Even though there is still much work to do, Relay Robotics is using breakthroughs in AI and deep learning to get there.
About the authors
Sonali Deshpande is a senior navigation engineer at Relay Robotics. Prior to that, she was a robotics software engineer at Mayfield Robotics, a perception systems engineer at General Motors, and a robotics engineer at Discovery Robotics.
Deshpande has a master’s in robotic systems development from Carnegie Mellon University.
Jim Slater is a robotics engineer at Relay Robotics. This article is posted with permission.