Abstract

Mobile robots are continuously gaining importance in numerous areas such as agriculture, surveillance, defense, and planetary exploration, to name a few. Accurate navigation of a mobile robot is essential for its uninterrupted operation. Simultaneous localization and mapping (SLAM) is one of the most widely used techniques in mobile robots for localization and navigation. SLAM consists of front- and back-end processes, where the front end includes the SLAM sensors. These sensors play a significant role in acquiring accurate environmental information for further processing and mapping. Therefore, understanding the operational limits of the available SLAM sensors, as well as data collection from a single sensor or multiple sensors, is worthwhile. In this article, a detailed literature review of widely used SLAM sensors such as acoustic sensors, RADAR, cameras, Light Detection and Ranging (LiDAR), and RGB-D sensors is provided. The performance of the SLAM sensors is compared using the analytical hierarchy process (AHP) based on key indicators such as accuracy, range, cost, working environment, and computational cost.

1. Introduction

Autonomous mobile robots have a wide spectrum of applications in fields such as agriculture, surveillance, defense, and planetary exploration. For autonomous operation, a robot needs to sense the environment and execute operations without human intervention while using a robust algorithm. Numerous algorithms, such as simultaneous localization and mapping (SLAM), Kalman filters (KF), extended Kalman filters (EKF) [1, 2], particle filters (PF) [3], region of interest (ROI) with nearest-neighbor search [4], genetic algorithms [5], neural networks (NN) [6], and recurrent fuzzy neural networks [7, 8], are used for the autonomous operation of a mobile robot. Localization, path planning, and motion control are the main challenges in the navigation of mobile robots [9], and SLAM can play an elementary role in meeting these challenges [10]. Over the past few decades, SLAM has been widely used for autonomous navigation because of its flexibility, high accuracy, straightforward implementation, and concurrent mapping and localization without any prior environmental knowledge [11].

SLAM consists of front- and back-end processes. The front end includes the sensors, while the back end includes mapping, localization, data fusion, and actuation. Sensors are critical for providing accurate environmental information to the robot, which motivates a closer look at the limits and capabilities of the available SLAM sensors. Sensors face numerous challenges such as flawless detection of landmarks, compromised performance in dynamic environments, and fast tracking of real-time features during high-speed motion [12]. These complexities can compromise accurate navigation; therefore, error compensation is indispensable, and for this purpose, multisensor approaches combined with estimators such as the extended Kalman filter (EKF) and the particle filter are employed [13]. In a nutshell, an ideal SLAM front end should offer accuracy, consistency, high resolution, adequate range, decent computational efficiency, cost-effectiveness, and environmental reliability.

This paper provides a holistic literature review of SLAM sensors, including acoustic, visual, Light Detection and Ranging (LiDAR), radio detection and ranging (RADAR), and RGB-D sensors. Their performance is compared using the analytical hierarchy process (AHP) on the basis of range, cost, computational complexity, accuracy, and environmental reliability. The results suggest that LiDAR and RADAR are better choices as they provide longer range with less computational complexity, and their efficiency is largely independent of the environment. The paper is organized as follows: Sections 2 and 3 provide details on the SLAM method and a detailed review of the SLAM sensors, respectively. Section 4 compares the characteristics of the SLAM sensors, Section 5 discusses map building with different sensors, Section 6 provides a comparison of the sensors based on AHP, and the conclusion is given in Section 7.

2. Simultaneous Localization and Mapping (SLAM)

Simultaneous localization and mapping (SLAM) is widely used in autonomous mobile robots for the self-exploration of environments and landmarks. SLAM was proposed by Smith and Cheeseman in 1986 and is now widely used in mobile robots for localization, mapping, and navigation [14]. Over time, numerous improvements to SLAM have been proposed. In 1991, a probabilistic technique (the Kalman filter) was introduced, which combines a series of measurements with the control input to suppress sensor noise [15]. The Kalman filter was later extended to the extended Kalman filter (EKF) to deal with the nonlinear behavior of the system. In 2002, FastSLAM was introduced, which measures individual landmarks independently so that robots can explore their environment [16]. Subsequently, UFastSLAM, based on the unscented transformation, was developed [17]. Various filters have also been implemented, such as the extended information filter and the extended Kalman filter (EKF) [18].

Simultaneous planning, localization, and mapping (SPLAM) is an emerging approach for the autonomous navigation of a mobile robot that combines SLAM and path planning techniques (see Figure 1) while using the Bellman and shooting methods [19]. The Bellman approach works well for accurate localization, map building, and optimal path planning. Optimal path planning is used for finding the shortest, most accurate, and most robust path. Path planning is carried out using search techniques such as A*, Breadth-First Search, Depth-First Search, and Dijkstra's algorithm. However, access to the map is a prerequisite for path planning.

In SLAM, the robot needs to deal with numerous tasks such as estimating its current state, taking sensor measurements, controlling the robot, and building a map of the environment. Let the current state of the robot be x_t, which represents its position in the plane. The robot updates its position or state along its path according to the control input, moving from x_{t-1} to x_t. The control input or executed action u_t can be measured through the change in the robot's direction and position and acts as the control stimulus (linear or angular velocity of the wheels). The current state can be identified from sensor data such as a rotary encoder, GPS, or IMU. z_t denotes the sensor measurement in SLAM, by virtue of which the robot can distinguish objects in the environment and construct an artificial map m_i of the landmarks using an obstacle avoidance sensor [21]. Figure 2 shows the error between the estimated state and the real case with the SLAM algorithm.
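For reference, the relationship among these quantities can be written as the standard online SLAM posterior over the current state and the map, given all measurements and controls up to time t. This is the generic textbook formulation, not tied to any specific cited implementation:

```latex
% Online SLAM posterior: estimate the current state x_t and map m from all
% sensor measurements z_{1:t} and control inputs u_{1:t} up to time t.
p(x_t, m \mid z_{1:t}, u_{1:t}) \propto
  p(z_t \mid x_t, m)
  \int p(x_t \mid x_{t-1}, u_t)\,
       p(x_{t-1}, m \mid z_{1:t-1}, u_{1:t-1})\, \mathrm{d}x_{t-1}
```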

SLAM comprises front-end and back-end processes (see Figure 3) for distance measurement, feature prediction, position estimation, and real-time mapping. The front end includes sensors and related functions such as signal conditioning and processing. The back end includes optimization of the mathematical models, localization, and map building (with real-time update).

2.1. Front End

The front end consists of the SLAM sensors, and it is employed for measuring the position, velocity, and direction of the robot as well as observing and localizing the landmarks. The widely used SLAM sensors are the Global Positioning System (GPS), rotary encoders, infrared (IR), acoustic sensors (ultrasound, microphone, and sonar), cameras, inertial measurement units (IMU), ultrawideband (UWB), LiDAR, RADAR, and RGB-D sensors. GPS, IMU, and rotary encoders determine the position, location, and velocity of the robot, and they are briefly discussed in this section, while the other obstacle avoidance sensors are covered in Section 3.

Rotary encoders and GPS sensors are frequently used for localization in indoor and outdoor environments along with the IMU. Rotary encoders generate a number of pulses per revolution as an output signal to identify the position and speed of the robot and the location of the landmarks [23]. The Global Positioning System (GPS) is a satellite-based radio navigation system used to estimate the position, velocity, and time of the robot. The GPS signal can only be detected in outdoor environments; it is widely used as a robust and accurate system for localization, but it is costly compared to other solutions such as rotary encoders [24]. A robot can localize itself more accurately by measuring the vehicle speed, altitude, object location, and turn rate or inclination of the robot with the IMU. An IMU combines a gyroscope, an accelerometer, and a magnetometer to determine the orientation and acceleration of the robot. Moreover, the IMU is mostly used along with GPS or rotary encoders because the drift error of the IMU can enlarge the positioning error [25]. A combination of rotary encoders and an IMU compensates for this error by measuring short-term velocity with the IMU and long-term velocity with the rotary encoder. A multisensor approach such as IMU and GPS or IMU and rotary encoder is used for precise localization. Table 1 summarizes the features of the IMU, GPS, and encoders.
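As a minimal illustration of the encoder/IMU complementarity described above, the sketch below (Python; the function names, wheel parameters, and blending factor are illustrative assumptions, not taken from the cited works) derives wheel velocity from encoder pulses and blends it with an IMU-derived velocity:

```python
import math

def encoder_velocity(pulse_count, pulses_per_rev, wheel_radius_m, dt_s):
    """Wheel linear velocity from encoder pulses counted over dt_s seconds."""
    revolutions = pulse_count / pulses_per_rev
    distance_m = revolutions * 2.0 * math.pi * wheel_radius_m
    return distance_m / dt_s

def fused_velocity(v_encoder, v_imu, alpha=0.98):
    """Simple complementary blend: the drift-free encoder estimate dominates
    the long-term value, while the IMU contributes short-term changes."""
    return alpha * v_encoder + (1.0 - alpha) * v_imu

# Example: 512-PPR encoder, 5 cm wheel radius, 200 pulses counted in 0.1 s
v_enc = encoder_velocity(200, 512, 0.05, 0.1)   # ~1.23 m/s
print(fused_velocity(v_enc, v_imu=1.20))
```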

On the other hand, several sensors are used for accurate localization of landmarks, including acoustic sensors, laser sensors, infrared sensors [26], cameras (optical sensors), LiDAR, RADAR, and RGB-D [27]. Environment-independent and high-precision operation is an important characteristic in the selection of a sensor.

Ultrawideband (UWB) is a radio wave technology used for localizing landmarks; it uses short-range, high-bandwidth radio waves. It works by transmitting millions or billions of short pulses, receiving them at the receiver, and interpreting their time of arrival (TOA) to find the range between the obstacle and the sensor. The accuracy of this technique is better than that of other wireless technologies used for localization. Zhou et al. have proposed a localization solution combining UWB and LiDAR sensors [28]. In addition, UWB-based localization is not affected by the environment and has a strong ability to retransmit signals without cumulative errors. Segura et al. compared UWB localization and SLAM-based localization with an extended Kalman filter [29]. Experimental results showed that the UWB-based localization technique produced higher variance than SLAM localization; however, the overall accuracy of both methods was found to be similar. For soccer players, the absolute errors in the x and y directions measured with GPS were larger than those measured with UWB [30], which shows the better capability of the UWB sensor.
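The TOA ranging principle mentioned above reduces to multiplying the propagation speed of the radio pulse by its measured time of flight; the snippet below is a minimal sketch (Python; illustrative names, and it assumes synchronized transmitter and receiver clocks rather than the two-way ranging used by practical UWB modules):

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

def uwb_range_from_toa(time_of_flight_s):
    """One-way UWB range: radio pulses travel at the speed of light,
    so range = c * time of flight (clocks assumed synchronized)."""
    return SPEED_OF_LIGHT_M_S * time_of_flight_s

# Example: 20 ns of flight corresponds to roughly 6 m
print(uwb_range_from_toa(20e-9))   # ~5.996 m
```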

2.2. Back-End Processes

The SLAM problem is usually addressed as current-state estimation in which prediction and correction of uncertainties are performed. The back end consists of one or more procedures used to estimate the pose and position of the robot and the location of the environmental landmarks with the help of the front end [31]. Numerous back-end processes are used for the operation of autonomous mobile robots, including the Kalman filter, extended Kalman filter, FastSLAM, ORB-SLAM, RAT-SLAM, and path planning (A*, breadth-first search, depth-first search, and Dijkstra's algorithm), to name a few. The measured data contain environmental noise that can cause statistical or quantitative errors such as state error. Therefore, back-end techniques are exploited to resolve such errors. These techniques are very supportive in finding the location of the robot and landmarks, path estimation, distance measurement, and map building.

The state error (δx) is the difference between the true state (x_t) and the nominal state (x). If we consider a specific method such as the Kalman filter, it is called the error-state Kalman filter [32]:

δx = x_t − x. (1)

The error-state Kalman filter estimates the error and injects the correction directly into the nominal state. Uncorrected state errors decrease the accuracy of localization and mapping in SLAM; even minor errors can shift the estimated positions of landmarks and thereby degrade the map.
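A minimal sketch of one correction step is given below (Python; a generic textbook error-state update with illustrative variable names, not the exact filter of any cited work). The filter estimates the error from the measurement residual and injects it back into the nominal state:

```python
import numpy as np

def error_state_correction(x_nominal, P, H, R, z, z_predicted):
    """One error-state Kalman filter correction step.
    x_nominal   : nominal state vector
    P           : error-state covariance
    H           : measurement Jacobian with respect to the error state
    R           : measurement noise covariance
    z           : actual sensor measurement
    z_predicted : measurement predicted from the nominal state"""
    y = z - z_predicted                     # innovation (residual)
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    delta_x = K @ y                         # estimated state error
    x_corrected = x_nominal + delta_x       # inject error into the nominal state
    P_corrected = (np.eye(len(x_nominal)) - K @ H) @ P
    return x_corrected, P_corrected
```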

A map is a symbolic representation of the environment and is used for path planning in subsequent stages. Mapping of the environment is a necessary step, whether the map is static or dynamic (depending on whether the environment is static or dynamic). 2D mapping can be achieved with filter-based SLAM, Hector SLAM (LiDAR-based SLAM), or Gmapping SLAM. A 3D map can be created with visual SLAM (RGB-D SLAM), ORB-SLAM, or 3D LiDAR-based SLAM; 2D mapping takes less storage than 3D mapping, which increases the computational cost.

Loop closure, the sonar sensor, and the management of point cloud data (from LiDAR or visual sensors) are factors to be considered for optimizing the microprocessor load. Managing the computational cost is essential as the area of the environment grows. Computational cost is particularly challenging for 3D mapping, where the point cloud of a visual sensor is required; for example, 322 MB of storage was used for 226,000 visual-sensor point clouds [33]. For this reason, a sonar sensor is used for 3D mapping of large environments, together with the loop closure technique and point cloud optimization [34]. More details on loop closure are given in Table 2.

3. SLAM Sensors

This section provides a holistic review of the SLAM sensors which are widely employed in autonomous mobile robots. SLAM-based robots are largely dependent on sensing capabilities; therefore, robots are equipped with single or multiple sensors according to their functionality. In the front end of SLAM, different sensors like acoustic, infrared (IR), camera, LiDAR, RADAR, and RGB-D are used for sensing the landmarks of the environment [35]. The selection of appropriate sensors plays an important role in the accurate measurement of landmarks during robot navigation. With the help of front-end sensors, the back-end process constructs an artificial map, which is used for path planning, obstacle avoidance, and navigation.

In early SLAM systems, acoustic and LiDAR sensors were used as range sensors. Acoustic sensors are generally used for underwater (sonar) and short-range applications. LiDAR-based SLAM was introduced by Nguyen et al. in 2005 [36]; this sensor measures distance at long range [24, 26], but the lack of visual information and limited feature extraction are its main drawbacks. Cameras are used as vision sensors to resolve this issue; they have been used in mobile robots since the early 1990s, although dedicated visual sensor-based SLAM was explored in 2011 by Lategahn et al. [37]. Furthermore, the monocular camera lacks depth measurement, which is essential for estimating the distance to an object. In the last few years, stereo cameras and RGB-D sensors [38, 39] have been introduced, which are capable of depth measurement. The RGB-D sensor is an advanced version of the camera with depth measurement, and the first RGB-D SLAM system was introduced by Henry et al. in 2012 [40]. Medium-range RADARs have also been introduced for autonomous mobile robots, although RADAR-based robots are mostly short-ranged systems [41, 42]. RADAR-based SLAM is an emerging technology that was implemented in 2014 by Dickmann et al. [43].

Multisensor data fusion is utilized for robust localization and mapping; for example, proximity and visual sensors along with an IMU, rotary encoder, and GPS are used for accurate mapping and localization. Each sensor has certain strengths and limitations; for instance, sonar and 2D LiDAR work well for obstacle avoidance but are not capable of building 3D maps. On the other hand, visual sensors are used for 3D mapping but overburden the processor. Researchers have combined visual sensors and encoder-based wheel odometry with a Kalman filter [44]. For indoor localization and mapping over a distance of 43.47 m, visual RGB-D SLAM resulted in a 2.7% error, while the error of wheel odometry was found to be 1.04%. Zhang and Singh implemented a SLAM algorithm (EKF) using data from LiDAR, a stereo camera, and an IMU; in this method, if data acquisition from the LiDAR or camera fails due to aggressive motion or a degraded environment, the corresponding module is bypassed [45].

In [46], Chen et al. implemented an IMU and stereo camera for SLAM in order to better estimate the poses of the mobile robot in an indoor environment. They proposed a robust technique, STCM-based visual-inertial simultaneous localization and mapping (STCM-SLAM), which works on feature extraction and is claimed to outperform ORB-SLAM and Open Keyframe-based Visual-Inertial SLAM (OKVIS). SLAM using an extended Kalman filter that fuses LiDAR, a sonar sensor, and a rotary encoder has also been implemented [21]. In that work, the sonar sensor was used for autonomous movement while the LiDAR and encoder were used for localizing landmarks and positioning the robot; in this way, loop closure was handled efficiently.

In the following, we discuss these SLAM sensors in detail.

3.1. Acoustic Sensor

The acoustic sensor is widely used because of its accuracy, simplicity, low power consumption (0.01-1 W), and low computational and economic cost [47, 48]. The sensor transmits an acoustic wave at a specific frequency and locates an object by sensing the echo reflected from it. In the case of ultrasonic sensors, the waves travel through the air at the speed of sound and bounce back at the same speed after striking a landmark. An object's distance can be measured by calculating the time from the emission of the signal to the reception of the echo [49]. Usually, a sonar sensor operates in the underwater environment at low frequency [50], while ultrasonic sensors operate at high frequency in air. The position and location of landmarks can then be estimated from the sensor data using a back-end algorithm such as the particle filter or EKF [51, 52]; more details on the sensors and algorithms are given in Tables 2 and 3.
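The pulse-echo principle described above amounts to halving the round-trip distance; a minimal sketch follows (Python; the sound speeds are nominal assumed values):

```python
SPEED_OF_SOUND_AIR_M_S = 343.0      # ~20 degrees C, for ultrasonic sensors
SPEED_OF_SOUND_WATER_M_S = 1500.0   # nominal value used for sonar

def echo_range(round_trip_time_s, wave_speed_m_s=SPEED_OF_SOUND_AIR_M_S):
    """Range from a pulse-echo acoustic sensor: the wave travels to the
    landmark and back, so the distance is half of the round-trip path."""
    return wave_speed_m_s * round_trip_time_s / 2.0

# Example: a 12 ms echo in air corresponds to roughly 2 m
print(echo_range(0.012))            # ~2.06 m
```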

The sonar sensor is more accurate in the underwater environment than LiDAR and vision sensors [53]; however, acoustic sensors have a limited range (normally 2 to 10 meters) and are therefore rarely used in industrial applications [54]. As microphones and speakers can also be used as acoustic sensors [47], they offer a cheap solution.

3.2. Light Detection and Ranging (LiDAR)

Light Detection and Ranging (LiDAR) is a preferred sensor for mobile and aerial robotic platforms because of its low computational cost, good measurement range, and omnidirectional detection [55]. Robots can measure distance in 2D and 3D using LiDAR [56]. LiDAR measures depth by emitting laser light and receiving its reflection. The displacement and rotation of the robot are calculated by detecting the laser lines, which give information about the surface topography, and the depth of an object is measured from the time of flight [57]. A Kalman filter is used to estimate the positions of multiple objects from LiDAR data. Ground filtration, surface extraction, and model construction of urban buildings are performed with morphological transformations, the Hough transformation, RANSAC [58], CNNs [59], and deep learning algorithms [60]; more details on the features and modules of different LiDAR sensors and LiDAR algorithms are given in Tables 2 and 3, respectively.

The scanning angle and the accuracy in measuring distance, angle, and depth are important parameters for detecting landmarks [61, 62]. These sensor parameters are used for map creation and obstacle avoidance. In addition, LiDAR has high accuracy (even under environmental disturbances such as fog, storms, and rain) compared to the camera and RGB-D. Similarly, LiDAR offers omnidirectional detection (360°), unlike line-of-sight sensors such as the camera, acoustic, RGB-D, and infrared sensors. LiDARs are generally classified as solid-state, mechanical, and hybrid LiDAR [63]; they can provide 360° visibility with high accuracy while measuring remote landmarks at ranges from 20 to 300 meters (with an accuracy of 15 mm). However, their power consumption is very high (50-200 W) [47].
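As a minimal sketch of the time-of-flight and sweep geometry described above (Python; the parameters are illustrative and not tied to a specific LiDAR model), the snippet below converts a round-trip time to a range and a 2D sweep into Cartesian points in the sensor frame:

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0

def lidar_range(round_trip_time_s):
    """Time-of-flight range: the laser pulse travels to the target and back."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0

def scan_to_points(ranges_m, angle_min_rad, angle_increment_rad):
    """Convert one 2D sweep (one range per beam) into Cartesian points in the
    sensor frame -- the usual input to scan matching or mapping."""
    points = []
    for i, r in enumerate(ranges_m):
        theta = angle_min_rad + i * angle_increment_rad
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

# Example: a 67 ns round trip corresponds to roughly 10 m
print(lidar_range(67e-9))   # ~10.04 m
```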

3.3. Camera

Visual SLAM is becoming popular, with the camera as the most widely used sensor. Vision sensors are commonly used in SLAM-based robots because of their simple configuration and comparatively easy programming [64]. Two types of cameras, the monocular camera and the stereo camera, are widely used [65]. The monocular camera was first applied to the SLAM problem in 2003, in an approach named MonoSLAM [66]. The monocular camera is simple (hardware-wise), economical, and small; however, it has a high computational cost when measuring depth. A stereo camera can be used to reduce the computational cost while measuring depth [67].

Feature-based and direct approaches are used to solve V-SLAM. A feature-based approach can be implemented with the Kalman filter [68]; however, it has a high computational cost because the state vector grows in a large environment. The loop closure problem can be solved proficiently with the help of a feature-based technique [69]. V-SLAM can also be solved with the direct method, which uses images directly without extracting features; in this class, direct tracking, mapping, and the large-scale direct monocular (LSD) method are used. The dense technique is also a direct method and is generally used for measuring per-pixel depth [62]. Calibration of cameras (stereo and monocular) is always required for measuring the accurate depth of landmarks, and the intrinsic and extrinsic parameters of the camera are needed to exploit the calibration. The pose of a camera is given by its extrinsic parameters, while the intrinsic parameters consist of the focal length, principal point, and pixels per unit length [70]. In machine learning and deep learning, CNNs and regression are used for solving visual SLAM (for more details, see Table 2). The range of a monocular camera depends on its pixel resolution and electronics in combination with its intrinsic and extrinsic properties. Power consumption for a monocular and a stereo camera is 0.01-10 W and 2-15 W [47], respectively.
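To make the roles of the intrinsic and extrinsic parameters concrete, the sketch below (Python; the intrinsic matrix values are illustrative assumptions, not calibration results from the cited works) projects a 3D point into pixel coordinates using the standard pinhole model:

```python
import numpy as np

# Illustrative intrinsics: focal lengths fx, fy and principal point (cx, cy), in pixels
K = np.array([[525.0,   0.0, 319.5],
              [  0.0, 525.0, 239.5],
              [  0.0,   0.0,   1.0]])

def project_point(X_world, R, t):
    """Pinhole projection: the extrinsics (R, t) move the point into the camera
    frame, and the intrinsics K map camera coordinates to pixel coordinates."""
    X_cam = R @ X_world + t       # world -> camera frame (extrinsic parameters)
    u, v, w = K @ X_cam           # camera frame -> homogeneous pixels (intrinsic parameters)
    return u / w, v / w

# Example: a point 2 m in front of a camera placed at the world origin
print(project_point(np.array([0.1, 0.0, 2.0]), np.eye(3), np.zeros(3)))  # ~(345.75, 239.5)
```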

3.4. Microsoft Kinect Sensor (RGB-D)

The Microsoft Kinect sensor is one of the widely used sensors in SLAM because it combines vision and range sensing in a simple configuration. It is compact, useful for 3D mapping, and moderately priced. The digital output of this sensor is identical to that of a monocular camera with the addition of depth. Its main feature is an IR transmitter and receiver combined with a monocular camera, and it works on the structured light (SL) and time-of-flight (TOF) techniques. The RGB-D sensor was released in November 2010 [71]. It is mostly used in indoor environments because the IR emitter and receiver cannot form fine patterns outdoors, which generates noise.

The RGB-D camera is an advanced technology in V-SLAM. It is similar to a camera that generates RGB color pixels but provides additional depth information. The KinectFusion method is used for representing a 3D environment with the help of voxel space, and the SLAM++ method is used for recognizing 3D objects. Segmentation is used to separate objects from each other according to their features and depth [72, 73]. The RGB-D sensor is comparatively cheaper than LiDAR but more expensive than a camera, with a shorter measurement range [47]. Its power consumption ranges from 2 to 5 W.

3.5. RADAR

RADAR is an emerging technology in SLAM used for long-range distance measurement. A rotating antenna is used in RADAR for localizing landmarks by emitting radio waves. RADAR is a robust sensor and can work in every environment, such as dust, rain, day, and night. A movable antenna is used in RADAR, which can rotate up to 360° for data acquisition [74]. RADARs used in mobile robots are small, and their range varies from 3 m to 40 m [75], depending on the power of the emitted radio waves. The cost of a RADAR sensor depends on its size and the wave technology used, such as IR, radio, and UV [76]. Comparing a RADAR and a LiDAR, such as the MPR (RADAR) and the Velodyne VLP-16 (LiDAR), the linear range of the MPR is up to 20 m while the Velodyne VLP-16 LiDAR can measure up to 40 m. Moreover, the angular resolutions of the LiDAR and RADAR are 0.4° and 1.8°, respectively [77].

RADAR is a relatively new sensor in SLAM-based robots, and several methods used for this sensor are identical to those used for LiDAR. It acquires data from every direction by transmitting radio signals while rotating the antenna and monitoring the echo signals. The feature-based method [78], RANSAC, the Kalman filter, and the particle filter are used for processing the sensor data [79, 80]. A panorama method is used to combine all directions into one image for localizing landmarks [81].

The SLAM algorithm integrates various quantities, such as sensor readings, the robot status, and landmark positions, employing methods such as the extended Kalman filter (EKF) and the particle filter to minimize errors. Sensor errors include human errors, random errors, and systematic errors. Similar-looking locations in the environment can also produce loop-closure errors. These errors are resolved using SLAM back-end algorithms such as the EKF, particle filter, and FastSLAM (for more details, see Table 2).

4. A Comparison of SLAM Sensors

Every sensor has unique strengths and certain limitations. Sensor selection depends upon cost, environment (workspace), computational cost, accuracy, and measuring range (space required during sensing). Each system can have a different priority; for example, in some cases, cost-effectiveness is the top priority while sometimes accuracy is the requirement. The sensor’s ability to perform well in the desired environment and range measurement are the top priorities. In this section, we will discuss some important factors which are linked with the selection of a sensor.

4.1. Computational Cost

The computational cost plays an essential role in the selection of a SLAM sensor. The calculations performed on SLAM-based mobile robots are very complex because of complex signals; therefore, the sensor data require a high-end processor. In order to achieve high accuracy during localization, the computer's microprocessor should be very fast, and implementing such a demanding process on embedded microcomputers is challenging. For example, a vision sensor (camera) provides abundant information in the form of pixels; a camera can provide around 2.5 megabytes of data in a single frame. In such a case, the computer or microcontroller should have good processing speed for smooth operation. The camera acquires data in digital form, i.e., a huge matrix, which increases the computational cost in comparison to other sensors. RADAR rotates 360° during landmark localization, and the panoramic method is used to integrate all directional data into one image or matrix; this needs remarkably high-capacity RAM and a fast processor. On the other hand, LiDAR works well with average computer hardware or a microcontroller: it provides the required information in every possible direction of the robot, and its signal can be processed on low-end microprocessor systems. The acoustic sensor is also a good choice because of its low computational cost.
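A rough back-of-the-envelope comparison of per-measurement data volume clarifies why the camera burdens the processor far more than a 2D LiDAR (Python; the resolution and angular settings below are illustrative assumptions, not figures from the cited works):

```python
def camera_frame_bytes(width_px, height_px, bytes_per_pixel=3):
    """Raw size of one uncompressed RGB frame."""
    return width_px * height_px * bytes_per_pixel

def lidar_readings_per_sweep(scan_angle_deg=360.0, angular_resolution_deg=1.0):
    """Number of range values produced by one 2D LiDAR sweep."""
    return int(scan_angle_deg / angular_resolution_deg)

# Illustrative numbers
print(camera_frame_bytes(1280, 720))        # 2,764,800 bytes (~2.8 MB) per frame
print(lidar_readings_per_sweep(360, 1.0))   # 360 range values per sweep
```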

The feature-based visual SLAM method requires less computational cost. On the other hand, the direct visual SLAM method needs high computational cost because of the large amount of calculation performed during operation; however, the feature-based method does not perform well during fast motion. Therefore, LiDAR or sonar sensors are used in SLAM when computational cost is the top priority. The computational properties of the RGB-D sensor are similar to those of the monocular and stereo cameras, apart from its depth measurement. Based on these parameters, priorities were assigned (for every cited sensor) in the AHP; the resulting AHP ranking is shown in Figure 4.

Figure 4 shows the computational cost ranking of the SLAM sensors calculated with the analytical hierarchy process (AHP). The acoustic sensor needs only minimal computer hardware because its output is a single digital range value (e.g., 2 to 5 m) for SLAM-based robots, whereas the other cited sensors require more capable computational hardware. LiDAR signals need an average microcontroller or computer: a LiDAR sweep consists of 25 to 360 digital values depending on the angular resolution, which a computer can process smoothly. RADAR comes next in priority among the SLAM sensors; its signal is similar to that of LiDAR but needs the panorama method for integration, which increases its complexity. Processing vision and RGB-D sensor signals requires complex algorithms. The computer specifications used by researchers for these SLAM sensors are given in Table 4.

4.2. Measuring Range

For a mobile robot, the ability to measure range accurately is an important factor. A monocular vision sensor has a major issue with range measurement. Usually, a stereo camera is utilized for range measurements; however, its accuracy is low compared to LiDAR, sonar, and RADAR. For RGB-D, data acquisition raises complexities for range measurement and needs a high-end computing machine to run the range measurement algorithm, but its depth measurement is superior to that of the monocular and stereo cameras. Sonar sensors are low-cost; however, they possess a low measurement range (2-5 m), which is insufficient for an industrial mobile robot. RADAR sensors are popular in mobile robots for long-range measurements. LiDAR (20-300 m) is the preferred choice for range measurement in comparison to the other sensors because of its low computational cost. Figure 5 and Table 3 show a comparison of the nominal measurement ranges of the SLAM sensors.

4.3. Environment

A self-exploring mobile robot encounters a number of environment-related challenges such as complex geographical features and obstacles. Every SLAM sensor has certain limitations under different environmental conditions. An ideal SLAM sensor should be robust enough to work in conditions such as a bright sunny day, dust, rain, or smoke. The performance of the camera is heavily compromised in the above-stated conditions, which can lead to fatal errors in data acquisition and interpretation; for example, at night, a camera gives all-zero digital values in an image. IR cameras can work in bad weather conditions, but with a major compromise in accuracy in the case of rain and smoke. Similarly, the Kinect sensor has drawbacks similar to those of the vision sensor; its pixel-by-pixel digital data can drop to zero at night or in smoky conditions. LiDAR works well in every environment except underwater [99]. A sonar sensor also works in every environment, but it can generate artifacts during data acquisition. RADAR is one of the best choices for working in every environment without any compromise in accuracy. Figure 6 shows the efficiency and compatibility ranking of the different SLAM sensors under multiple environmental conditions. Based on these parameters, we assigned priorities (for every cited sensor) in the AHP. According to the AHP results, RADAR works well in different environmental conditions, while the camera and RGB-D are equally ranked and have the lowest performance in bad environmental conditions such as smoky and rainy weather.

4.4. Cost-Effectiveness

Cost is another major factor influencing the selection of the SLAM sensor because every designer desires a low-cost autonomous robot. It is hard to compare the sensors purely on cost because cost depends on sensor properties such as accuracy, size, lifetime, range, and resolution. The acoustic sensor is one of the cheapest of all SLAM sensors. A monocular camera is slightly more expensive than a sonar sensor but less expensive than the other sensors. RGB-D is widely used in SLAM-based robots as it has an average cost and performs well in terms of computational complexity compared to the camera. RADAR is an emerging sensor whose price varies with its properties; however, the RADAR used for autonomous robots is much cheaper than conventional RADAR. LiDAR is the most expensive sensor used for a mobile robot, and its cost changes drastically depending on properties such as angular resolution, measurement range, range resolution, and scan angle [100].

4.5. Accuracy

Accuracy in localization and mapping depends on the algorithm used, the environment, and the range of the sensors. Here, however, we consider the ranging accuracies provided by the manufacturers: the HR304 acoustic sensor has an accuracy of 3 mm [101], and the YD-LiDAR sensor has a systematic error of 2 cm at ranges of more than 10 m [102]. The accuracy of the camera for 3D measurement ranged from 0.001 to 5.358% [103], the accuracy of the RGB-D sensor was recorded as up to 60 mm [104], and RADAR has a mean translation error of up to 62 mm [105].

5. Map Building in SLAM and Sensor

The map is a symbolic representation of the environment in SLAM, in which the robot localizes itself and the landmarks. A map can be of two types, static and dynamic. In a static map, every object in the environment is static; the sensor outputs are combined after finding the proportional scale of the objects with respect to the stationary world to build a map of the static environment, which is called active mapping [106]. In a dynamic map, objects are in motion and continuously change the environment. Various classifications of SLAM have been discussed based on its operation: online, full, active, and passive SLAM. In online SLAM, the current pose of the robot is estimated together with the map. In full SLAM, both the map and the complete navigated path (the history of poses) are estimated. In active SLAM, the robot actuates autonomously and acquires data of the environment for mapping, whereas in passive SLAM, the robot is actuated manually and the sensors acquire the data autonomously.

In a real-world scenario, a fully static environment is not realistic due to the motion of people, cars, and animals. Two types of dynamic objects are identified: high-dynamic and low-dynamic objects [107]. High-dynamic objects change their location abruptly, and sensors observe them only for a short time. Low-dynamic objects move with low frequency, and most sensors cannot even observe their movement, for example, the movement of doors and furniture. A sensor with good accuracy can detect abrupt changes. Table 5 shows different approaches that have been used for building a map using different sensors.

6. Optimization of Selecting SLAM Sensor Process Using Analytical Hierarchy Process

The AHP method is a technique used for selecting among alternatives based on their properties. It is a decision-making approach introduced by Saaty [117] and is used in diverse applications such as healthcare, industrial sensors, and government. It is a mathematical and practical approach for dealing with decision-making in complex problems, and its goal is to select the most suitable alternative. In this technique, the decision-maker sets priorities according to experience, and based on these priorities, the technique mathematically computes a score for each alternative; in our case, it computes a score for each sensor.

Multicriteria decision-making (MCDM) methods use normalization to generate an aggregate score for the alternatives; the key point is to find the best result from a set of priorities. Data normalization is a necessary part of the decision-making process, transforming input data into comparable numerical values so that results can be rated and ranked to select the best items. The AHP covers the mathematical properties of the SLAM sensors and the required preferences, such as cost, computational complexity, environmental reliability, and range. The complexity of a problem can be reduced by converting each preference into pairwise comparisons [118].

The main objective of using AHP here is the selection of the best sensor for SLAM. Cost-effectiveness, measuring range, computational cost, and environmental behavior were set as the criteria, while the acoustic sensor, camera, LiDAR, RADAR, and Microsoft Kinect sensor were set as the alternatives (subcriteria) for selecting the best sensor (see Table 6 for more details).

The AHP results can be verified through the following parameters. The consistency ratio (CR) is the ratio of the consistency index to the corresponding random index, and it must be less than 0.1 for the weights to be considered consistent. In equation (2), A is the pairwise comparison matrix and w is the weight vector; the product Aw equals λ_max w, where λ_max is the largest eigenvalue of the matrix. For a perfectly consistent matrix, all eigenvalues are zero except one, so the sum of the eigenvalues equals the trace of A. To make w unique, it is normalized by dividing its entries by their sum. λ_max = n shows that A is consistent:

A w = λ_max w. (2)

Small changes in a_ij imply that λ_max deviates from n, which creates a deviation from consistency called the consistency index (CI). The formula of CI is

CI = (λ_max − n) / (n − 1), (3)

where
(i) A: pairwise comparison matrix
(ii) w: normalized weight vector
(iii) λ_max: maximum eigenvalue of A
(iv) λ_i: eigenvalues of A
(v) a_ij: numerical pairwise comparison values
(vi) n: order of the matrix A.

The average consistency index of randomly generated matrices of the same order is called the random index (RI). Hence, the ratio of the consistency index to the random index is called the consistency ratio, as shown in equation (4); CR values below 0.1 indicate acceptable consistency of the criteria or subcriteria:

CR = CI / RI. (4)
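A minimal sketch of the consistency check in equations (2)-(4) is shown below (Python; the example pairwise judgments are illustrative and are not the matrices used in this study):

```python
import numpy as np

SAATY_RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41}

def ahp_weights_and_cr(A):
    """Priority weights (principal eigenvector) and consistency ratio
    for an AHP pairwise-comparison matrix A."""
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)
    lam_max = eigvals.real[k]                         # largest eigenvalue (eq. (2))
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                                   # normalized weight vector
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0    # consistency index (eq. (3))
    ri = SAATY_RI[n]
    cr = ci / ri if ri > 0 else 0.0                   # consistency ratio (eq. (4))
    return w, cr

# Example: three criteria compared pairwise (illustrative judgments)
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])
weights, cr = ahp_weights_and_cr(A)
print(weights, cr)   # CR below 0.1 indicates acceptable consistency
```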

Our requirements for selecting SLAM sensors consist of priorities such as cost-effectiveness, computational complexity, environment, accuracy, and range. These priorities can be changed based on the requirements of the user. For high-budget projects, accuracy, working environment, and range are the main focus, whereas for a low budget, more priority is given to the cost and computational complexity of the sensors. Here, we have given more importance to the environment and range measurement for a mobile robot. The analytical hierarchy process compares five sensors (acoustic sensor, camera, RADAR, LiDAR, and RGB-D) for the SLAM problem. Figure 7 shows that the acoustic sensor is the superior sensor based on cost-effectiveness, with the vision sensor ranking second. In the case of the environmental factor, RADAR is at the top for working under all environmental disturbances, and LiDAR is the second best based on the numerical value. Based on computational complexity, the acoustic sensor is one of the most reliable sensors, and LiDAR can also be used with low-speed computing machines. When range is the prime factor, LiDAR is the best choice with range capabilities of 20-100 m.

All calculations were performed online using the AHP software [119]. According to the AHP results shown in Table 7, LiDAR can be considered the best sensor because of its long range, better performance in different environmental conditions, high accuracy, and average computational cost. RGB-D is the lowest in the ranking because of its short range and poor environmental reliability.

7. Conclusion

In this paper, SLAM sensors have been compared based on cost, computational complexity, environment, accuracy, and range as the key performance indices, and a holistic review of the SLAM sensors has been provided. Autonomous robots depend on sensor data for localization, pose and location estimation, and map building. Proper sensor selection for the desired application is highly important, and an AHP-based selection can be an appropriate choice. The AHP method with the stated priorities shows that LiDAR is the best choice for long-range applications compared to the acoustic, vision, RADAR, and RGB-D sensors. In addition, SLAM sensors such as acoustic, vision, LiDAR, RADAR, and RGB-D have been discussed in detail. Our study suggests that RADAR is increasingly being explored for application in autonomous mobile robotics. Vision sensors provide more details about the environment; however, complex algorithms and computational complexity are their limitations. The acoustic sensor is cost-effective with a linear output; however, its limited range is a key limitation. The AHP analysis reveals that LiDAR is preferable among all the cited sensors for the SLAM problem due to its long range, minimal computational complexity, and capability to work in noisy and foggy environments. The analysis further shows that RADAR can be the second choice after LiDAR due to its adequate measurement range and consistent performance in diverse environmental conditions.

Abbreviations

SLAM:Simultaneous localization and mapping
AHP:Analytical hierarchy process
p:Probability distribution
u_t:Robot odometry
z_t:Sensor measurement
m:Map
x_t:State of robot
GPS:Global Positioning System
IMU:Inertial measuring unit
UWB:Ultrawideband
UFASTSLAM:Unscented FastSLAM
LiDAR:Light Detection and Ranging
RADAR:Radio detection and ranging
SONAR:Sound navigation ranging
SL:Structure light
TOF:Time of flight
RGB-D:Red green blue-depth
CNN:Convolutional neural network
FMCW:Frequency-modulated continuous wave.

Data Availability

The article is a rigorous literature review, and an online AHP software is used for SLAM sensor comparison. All papers and tools used are well cited.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

We thank the Pakistan Science Foundation (PSF) for supporting this research through funding under grant PSF/NSLP/C-NUST (746) and support by Robot Maker Lab, National Centre of Robotics and Automation, National University of Sciences and Technology. Muhammad Shahzad Alam Khan thanks NUST, for financial support for his MS degree.