Abstract

In robot motion control, complex kinematic computations consume excessive central processing unit (CPU) time and degrade the responsiveness of robot motion. To solve this problem, this paper proposes a parallel method for solving the kinematic equations of articulated robots based on the coordinate rotation digital computer (CORDIC) algorithm. The method uses the CORDIC algorithm for fast calculation of transcendental functions, adopts a tree structure to optimize the critical computational paths of the forward and inverse solutions, and uses a parallel pipeline design to achieve low latency and high throughput. The proposed method is validated on a field-programmable gate array (FPGA) hardware platform. The experimental results show that the complete kinematic computation takes 4.68 μs, of which the forward kinematic solution takes 0.52 μs and the inverse kinematic solution takes 4.16 μs.

1. Introduction

Real-time control of robots is a challenging research priority, especially in space, medical, and industrial robotics applications where fast response is essential [1, 2]. However, robot kinematics involves the real-time computation of a large number of functions such as cosine, sine, arctangent, and square root. Solving the kinematic equations takes up a large share of CPU time, which makes it difficult for the robot to respond quickly [3–5].

Digital signal processing (DSP) and field-programmable gate array (FPGA) platforms are two possible solutions to this problem. Some researchers perform kinematic calculations for robots on DSP [6–8], but as robot systems become increasingly complex, the computational tasks undertaken by the DSP also become more intensive, and excessive resource occupation reduces the robot’s response speed. FPGA is widely used in the field of robot control because of its programmable hardwired characteristics and fast parallel computing capabilities, which improve the processing power of the hardware and the speed of real-time information processing [9–12]. Chand et al. [13] used an embedded FPGA for motion planning and control of dual-arm robots, establishing accurate arm motion sequences to execute multiple tasks precisely. Gürsoy and Efe [14] proposed FPGA-based proportional–integral–derivative (PID) and sliding mode control schemes for robot manipulators, achieving better trajectory tracking performance. Furthermore, FPGA-based kinematic solutions have also been proposed by researchers.

The foremost issue is the computation of transcendental functions on the FPGA, for which the look-up-table (LUT) method [15], the Taylor series expansion method [16], and the coordinate rotation digital computer (CORDIC) algorithm [17–19] have been proposed. Zhang et al. [15] proposed an FPGA-based forward and inverse kinematics computation method for a master–slave surgical robot in which all transcendental functions are computed with the LUT method. This effectively improves the computation rate of the transcendental functions but consumes a large amount of LUT resources for high-precision computation. To address the hardware computation of the arctangent and arccosine in kinematics, Kung et al. [16] combined the Taylor series expansion method with the LUT method, which reduces LUT resource usage for high-precision computation but requires multipliers for the polynomial evaluation and therefore lowers the computation rate. The CORDIC algorithm achieves high speed and good area efficiency in digital signal processing applications [20], and multiplexer-based and fully pipelined CORDIC architectures [21] have been used to obtain fast and efficient hardware on FPGAs. Wei et al. [17], Zhang et al. [18], and Çelik et al. [19] used the CORDIC algorithm to compute transcendental functions, which gives a higher computation rate and lower resource consumption when implemented on an FPGA. To improve the calculation speed of the kinematic equations, Zhang et al. [18] and Petko et al. [22] proposed heterogeneous schemes in which the FPGA serves only as a coprocessor for fast computation of transcendental functions. This reduces the time spent on transcendental functions but increases the instruction scheduling time and places higher demands on the timing control of the heterogeneous platform.
Solving the kinematic equations on a single FPGA avoids the shortcomings of heterogeneous platform methods, with a simpler system composition and better robustness and stability. Chen et al. [23] proposed an FPGA-based kinematic intellectual property (IP) core for selective compliance assembly robot arm (SCARA) robots, which completes the overall kinematic computation within 10 μs. However, because it relies on finite-state machines, this kinematic IP cannot perform parallel streaming computation and does not fully exploit the FPGA’s parallel data processing capability. Fan et al. [24] proposed a high-level synthesis method based on a Zynq FPGA to realize the inverse kinematics computation of humanoid robots. Although high-level synthesis enables rapid FPGA development, it is better suited to scenarios with looser timing requirements. Robot kinematics computation is complex, and explicit timing control of the computation process is more conducive to improving the robot’s real-time performance.

Motivated by the above discussion, this paper proposes an FPGA-based hardware parallel method for solving robot kinematic equations. The method uses the CORDIC algorithm for fast computation of transcendental functions, adopts the tree structure method to optimize the critical computational paths of the forward and inverse solutions, and applies a parallel pipeline design to the whole forward and inverse solution computation to achieve low-latency, high-throughput solving of the kinematic equations. The main contributions and innovations of this paper are as follows:
(1) The proposed kinematic hardware computation model takes only 4.68 μs to complete the whole kinematic computation. The computation adopts a parallel pipeline design that accepts new data in every clock cycle, which greatly improves the computational efficiency.
(2) The computational timing of the proposed model is fixed, so when the data bit width is increased, the number of computation cycles remains unchanged while the computational accuracy improves accordingly.

The rest of the paper is organized as follows: in Section 2, the kinematic model is developed and forward/inverse kinematic equations are derived. The principle of CORDIC algorithm is presented in Section 3. In Section 4, a hardware parallel FPGA-based solution of the kinematic equations is given. Experimental results are given in Section 5. Conclusions are given in Section 6.

2. The Kinematic Equation for Articulated Robots

In this section, the structure of the robot is first described, then the robot is modeled using the Denavit–Hartenberg (D–H) method [25], and finally the forward kinematics equations and inverse kinematics equations are derived separately.

2.1. The Kinematic Model of Articulated Robots

An articulated robot is an open-chain structure composed of a series of links connected by joints [26, 27]. To describe the robot’s poses accurately, this paper uses the D–H method, which builds the kinematic model from 4 × 4 homogeneous transformation matrices that describe the relationship between adjacent links, and derives the pose of each link relative to the base recursively.

The robot studied in this paper is a 6-degree-of-freedom industrial robot whose kinematic model is built using the D–H method. The link coordinate systems are shown in Figure 1, which also marks the link lengths and link offsets.

Based on the arrangement of the coordinate systems in Figure 1 and the link parameters, the D–H parameters of the robot model can be derived: the link twist, link length, link offset, and joint angle of each link, as shown in Table 1.

In the D–H method, a vector defined in one link coordinate system is transformed into the adjacent link coordinate system by a homogeneous transformation matrix, which is expressed as Equation (1) [28] using shorthand symbols for the trigonometric terms.

The D–H parameters in Table 1 are substituted into Equation (1) to obtain the homogeneous transformation matrix between each pair of adjacent links of this robot.
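
For reference, the link transform takes the following form under the classic D–H convention. This is a sketch under an assumption: the symbols θ_i (joint angle), d_i (link offset), a_i (link length), and α_i (link twist), and the choice of the classic rather than the modified convention, are not confirmed by the text above, and Equation (1) of the paper may order the elementary transforms differently.

```latex
% Classic D-H link transform (assumed convention), with c = cos and s = sin:
% rotate about z by theta_i, translate along z by d_i,
% translate along x by a_i, rotate about x by alpha_i.
{}^{i-1}T_{i} =
\begin{bmatrix}
c\theta_i & -s\theta_i\, c\alpha_i &  s\theta_i\, s\alpha_i & a_i\, c\theta_i \\
s\theta_i &  c\theta_i\, c\alpha_i & -c\theta_i\, s\alpha_i & a_i\, s\theta_i \\
0         &  s\alpha_i             &  c\alpha_i             & d_i \\
0         &  0                     &  0                     & 1
\end{bmatrix}
```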

2.2. The Forward Kinematic Equations of Articulated Robots

After the homogeneous transformation matrix between each pair of adjacent coordinate systems has been determined, the kinematic equation, Equation (2), is obtained by multiplying the transformation matrices of the links in sequence.

Solving the forward kinematic equation means finding the position of the end effector with respect to the reference coordinate system, given the joint parameters (the joint angles) of each joint. Solving the inverse kinematic equation means finding the motion parameters of each joint from a given pose of the end effector relative to the reference coordinate system.
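
The chained product behind the forward solution can be illustrated in software. The following is a minimal floating-point sketch, assuming the classic D–H convention. The function names and the `PLACEHOLDER_DH` values are hypothetical and are not the parameters of Table 1. It illustrates the arithmetic only, not the hardware design of this paper.

```python
import math

def dh_transform(theta, d, a, alpha):
    """4x4 link transform under the classic D-H convention (an assumption)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul4(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(thetas, dh_rows):
    """Multiply the link transforms in order, base to end effector.

    dh_rows holds one (d, a, alpha) tuple per joint.
    """
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, (d, a, alpha) in zip(thetas, dh_rows):
        T = matmul4(T, dh_transform(theta, d, a, alpha))
    return T

# Hypothetical (d, a, alpha) rows for a 6-joint arm; NOT the values of Table 1.
PLACEHOLDER_DH = [
    (0.40, 0.05, -math.pi / 2),
    (0.00, 0.45, 0.0),
    (0.00, 0.04, -math.pi / 2),
    (0.42, 0.00, math.pi / 2),
    (0.00, 0.00, -math.pi / 2),
    (0.08, 0.00, 0.0),
]

pose = forward_kinematics([0.1, -0.4, 0.7, 0.2, -0.3, 0.5], PLACEHOLDER_DH)
position = [pose[0][3], pose[1][3], pose[2][3]]  # end-effector position
```

The last column of the returned matrix holds the end-effector position and the upper-left 3 × 3 block holds its orientation. A simple sanity check is a planar two-link arm with unit link lengths: with the first joint at 90° and the second at 0°, the end effector lies two units along the y axis.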

2.3. The Inverse Kinematic Equations of Articulated Robots

The robot studied in this paper satisfies the Pieper criterion [29], i.e., three adjacent joint axes intersect at a point, so a closed-form solution exists. To meet the requirements of real-time robot control, this paper uses the algebraic closed-form method to calculate the inverse kinematic solution. The resulting analytical expressions for the joint angles, together with the intermediate terms they depend on, form Equation (3).

3. Parallel Calculation Method of Transcendental Functions for Kinematic Equations

The CORDIC algorithm is an iterative algorithm that uses only shift, addition, and subtraction operations, and was introduced to solve the problem of real-time computation of trigonometric functions in air navigation control systems [30]. Building on this, Walther [31] proposed a unified form of the CORDIC algorithm that extends its application to inverse trigonometric functions, hyperbolic functions, and other transcendental functions. The algorithm is well suited to platforms such as FPGAs because of its high hardware efficiency. From the perspective of FPGA portability, implementing the CORDIC algorithm with fixed-point numbers preserves computational speed while keeping FPGA resource utilization under control. Therefore, this paper employs the fixed-point CORDIC algorithm for the trigonometric functions, inverse trigonometric functions, and square-root operations in the kinematic solution process.

The unified iterative equations of the CORDIC algorithm for the circular, linear, and hyperbolic systems are given as Equation (4), in which the value of a mode parameter distinguishes the circular, linear, and hyperbolic systems.
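
For reference, the unified CORDIC iteration is commonly written as follows. This is a sketch in the notation usually attributed to Walther: the symbol names m, σ_i, and e_i are assumptions and may differ from those of Equation (4).

```latex
% Unified CORDIC iteration (commonly used notation; symbol names assumed)
\begin{aligned}
x_{i+1} &= x_i - m\,\sigma_i\, y_i\, 2^{-i},\\
y_{i+1} &= y_i + \sigma_i\, x_i\, 2^{-i},\\
z_{i+1} &= z_i - \sigma_i\, e_i,
\end{aligned}
\qquad
e_i =
\begin{cases}
\arctan\!\left(2^{-i}\right), & m = 1 \ \text{(circular)},\\
2^{-i}, & m = 0 \ \text{(linear)},\\
\operatorname{artanh}\!\left(2^{-i}\right), & m = -1 \ \text{(hyperbolic)},
\end{cases}
% Direction of each micro-rotation:
%   rotation mode:  sigma_i = +1 if z_i >= 0, else -1  (drives z to zero)
%   vectoring mode: sigma_i = -1 if y_i >= 0, else +1  (drives y to zero)
```

In the hyperbolic system the iteration index starts at 1, and certain iterations (4, 13, 40, ...) are conventionally repeated to guarantee convergence.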

In the circular system, the CORDIC algorithm computes trigonometric and inverse trigonometric functions and has two modes, the rotation mode and the vectoring mode. The former computes trigonometric functions, and the latter computes inverse trigonometric functions. Equation (3) also contains square-root operations, which the CORDIC algorithm computes in the vectoring mode of the hyperbolic system.
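
The vectoring mode of the circular system can be sketched in a few lines. This is a floating-point illustration only, assuming an input in the right half-plane (x > 0). The names `cordic_vectoring`, `ATAN_TABLE`, and `GAIN` are hypothetical, and the hardware of this paper uses fixed-point shifts instead of the multiplications by powers of two shown here.

```python
import math

N_ITER = 16  # number of micro-rotations
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]
# Accumulated CORDIC gain of the circular system for N_ITER iterations.
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER))

def cordic_vectoring(x, y):
    """Return (magnitude, angle) of the vector (x, y), assuming x > 0.

    Each step rotates the vector toward the x axis; the accumulated
    rotation angle in z approximates atan(y / x).
    """
    z = 0.0
    for i in range(N_ITER):
        sigma = -1.0 if y > 0 else 1.0  # choose the direction that drives y to 0
        x, y, z = (x - sigma * y * 2.0 ** -i,
                   y + sigma * x * 2.0 ** -i,
                   z - sigma * ATAN_TABLE[i])
    return x / GAIN, z

magnitude, angle = cordic_vectoring(3.0, -4.0)
```

The same loop yields both the arctangent and the vector length, which is why one vectoring unit can serve several terms of an inverse kinematic solution.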

The coefficients of iterative Equation (4) differ among trigonometric, inverse trigonometric, and square-root operations, but the implementation principle is the same. Taking the rotation mode of the circular system as an example, the operation unit of iterative Equation (4) consists of three adders, an LUT, and two shift operations. The structure of the iterative processing unit for FPGA implementation is shown in Figure 2, which also labels the input and output parameters of the iterative computing unit.

The computational accuracy of the CORDIC algorithm is determined by the number of iterations. To ensure the computational accuracy of the kinematics, this paper uses 16 iterations and a 24-bit data width for fixed-point computation. To maximize the computational efficiency of the CORDIC algorithm, this paper adopts an FPGA-based parallel pipeline structure, shown in Figure 3 together with the input parameters of the first iteration and the output parameters of the Nth iteration. The number of iterations can be set according to the accuracy requirement, thereby balancing computational accuracy against resource consumption.
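
A software model of the rotation mode shows how 16 shift-and-add iterations produce sine and cosine. This is a minimal sketch under stated assumptions: the split of the 24-bit word into 22 fractional bits (`FRAC_BITS`) is hypothetical, the function names are illustrative, and Python integers are unbounded, so the model does not check for 24-bit overflow.

```python
import math

FRAC_BITS = 22   # assumed fractional bits of the 24-bit word (hypothetical)
N_ITER = 16      # number of iterations, as chosen in this paper
SCALE = 1 << FRAC_BITS

def to_fixed(value):
    """Quantize a real number to the assumed fixed-point format."""
    return int(round(value * SCALE))

# Arctangent look-up table and reciprocal of the CORDIC gain, both quantized.
ATAN_LUT = [to_fixed(math.atan(2.0 ** -i)) for i in range(N_ITER)]
INV_GAIN = to_fixed(
    1.0 / math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER)))

def cordic_sin_cos(angle):
    """Rotation mode, circular system; angle in radians, |angle| <= pi/2.

    Uses only shifts, additions, subtractions, and one table look-up
    per iteration. Returns (sin, cos) converted back to floats.
    """
    x, y, z = INV_GAIN, 0, to_fixed(angle)
    for i in range(N_ITER):
        if z >= 0:   # rotate counterclockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_LUT[i]
        else:        # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_LUT[i]
    return y / SCALE, x / SCALE

sin_value, cos_value = cordic_sin_cos(0.3)
```

Starting from the reciprocal of the gain removes the need for a final multiplication. In a pipelined hardware version each loop pass would become one register stage, so a new angle could enter on every clock cycle.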

4. Hardware Parallel Solution Method for Kinematic Equations

Building on the FPGA-based calculation of transcendental functions, this section presents the parallel pipeline design of the computation process according to Equations (2) and (3), and then gives the parallel computation hardware structures for the forward and inverse kinematic equations, respectively.

4.1. Hardware Parallel Solution Method for the Forward Kinematic Solution

In the computation of the forward kinematic solution, the logical path with the longest delay from input to output is the critical path, and how well the critical path is optimized determines the working speed of the model. The critical paths of the forward kinematic solution are identified among the terms of Equation (2), which is written with shorthand symbols for the trigonometric terms. The computation is divided into modules according to the order of operations, and the modules execute in parallel at different time steps. In this paper, the nodes are computed in parallel using the tree structure method. The entire critical path takes 21 clock cycles to execute and requires 11-bit wide hard-core multipliers. The computational model of the forward kinematic solution is shown in Figure 4, which labels the joint angles, the shorthand trigonometric terms, and the intermediate calculated values.

Since every stage performs a computation of the same structure, the whole computation can be parallel pipelined by inserting pipeline registers. After the structure and timing of the critical path have been designed, register leveling is applied to the other computational paths to balance them against the critical path and thus raise the overall working frequency. Because the critical path already achieves the lowest output latency, the parallel pipeline design of the forward kinematic solution is completed by inserting registers into the shorter computational paths and inserting a pipeline register after each group of operations that share the same computational structure and timing. Because of the inserted pipeline registers, the computation time of the forward kinematic solution increases to 26 clock cycles, but the data throughput rate is multiplied.
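
The benefit of a tree structure can be illustrated with a generic pairwise reduction. The text does not detail its tree construction, so this is only a sketch of the general technique: the function name `tree_reduce` is hypothetical, and the operator is assumed to be associative.

```python
import operator

def tree_reduce(values, op):
    """Combine a non-empty sequence pairwise, level by level.

    Returns (result, levels). All operations within one level are
    independent, so hardware can evaluate them in parallel; the number
    of levels is the depth of the critical path. Operand order is
    preserved, so op needs to be associative but not commutative.
    """
    level = list(values)
    depth = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # unpaired operand passes through (a register in hardware)
        level = nxt
        depth += 1
    return level[0], depth

product, levels = tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], operator.mul)
serial_steps = 8 - 1  # a serial chain needs one step per remaining operand
```

For eight operands the tree needs three levels, whereas a serial chain needs seven dependent steps. The operand that passes through an odd-sized level corresponds to the register leveling described above.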

4.2. Hardware Parallel Computation Method for the Inverse Kinematic Solution

In Equation (3), two of the joint angles are calculated in three successive steps.

Solving the two joint angles involves trigonometric, inverse trigonometric, and square-root calculations. One item of intermediate data is the most time-consuming to compute: its computation includes 16 multiplications, one square root, and one division. In addition, the computation of one of the joint angles is serial. To improve the computational efficiency of the inverse kinematic solution, the constants that do not change during the computation are precomputed. The computational process is parallelized by register leveling, and the computational models of the two joint angles are shown in Figure 5, which labels the joint angles, the shorthand trigonometric terms, and the intermediate calculated values. The denominator term of the intermediate data is converted into a fixed-point multiplication that can be executed by the hard-core multiplier. Several constant results are obtained by precomputation and used directly in the calculation model. The parameters required for the trigonometric function of one joint angle and the solution of another joint angle are calculated in parallel to reduce the time consumed by the joint angle solution.

Following the computational model in Figure 5, the computation of the entire inverse kinematic solution is completed. Its parallel pipeline design is obtained by inserting pipeline registers into the computational model, and the pipeline period, and hence the computation time of the inverse kinematic solution, is 208 clock cycles.
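
The cycle counts can be related to the reported times with simple arithmetic. The sketch below assumes a 50 MHz clock, a value inferred from the reported 0.52 μs for 26 cycles and 4.16 μs for 208 cycles rather than stated in the text. The names `CLOCK_HZ`, `cycles_to_us`, and `pipelined_cycles` are hypothetical, and the one-input-per-cycle assumption follows the statement that the pipeline processes data in each clock cycle.

```python
FORWARD_CYCLES = 26      # pipelined forward solution latency (clock cycles)
INVERSE_CYCLES = 208     # pipelined inverse solution latency (clock cycles)
CLOCK_HZ = 50_000_000    # assumed clock frequency, inferred from the reported times

def cycles_to_us(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to microseconds at the given clock frequency."""
    return cycles * 1e6 / clock_hz

def pipelined_cycles(n_solutions, latency_cycles):
    """Cycles needed for n solutions when a new input enters every cycle."""
    return latency_cycles + (n_solutions - 1)

forward_us = cycles_to_us(FORWARD_CYCLES)
inverse_us = cycles_to_us(INVERSE_CYCLES)
total_us = forward_us + inverse_us

# Streaming 1000 inverse solutions through the pipeline versus one at a time.
streamed = pipelined_cycles(1000, INVERSE_CYCLES)
one_by_one = 1000 * INVERSE_CYCLES
```

Under these assumptions the latencies reproduce the reported 0.52 μs, 4.16 μs, and 4.68 μs. For a long stream of inputs the cost per solution approaches one clock cycle, which is the throughput gain that a finite-state-machine design cannot provide.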

5. Experimental Verification

To verify the effectiveness of the proposed computational method, the algorithm is tested on an XC7A200T FPGA platform. The joint angles and link parameters are represented by 21-bit unsigned fixed-point numbers, and the poses are represented by 24-bit signed fixed-point numbers.

5.1. Experiments on Computational Time and Computational Errors

The computational time is measured by inserting the integrated logic analyzer (ILA) IP into the design, configuring the ILA IP to capture the run signal and the result output signal, and taking the time difference between the two signals. The computation time includes the time of the forward kinematic solution and of the inverse kinematic solution. The computational error is measured against the results of an x86 platform, and the computation time on the x86 platform is averaged over 10,000 computations. The computational results of the algorithm in this paper are obtained by inserting the virtual input/output (VIO) IP core into the FPGA. The input of the forward solution module is the set of robot joint angles after fixed-point conversion, while the inputs of the inverse solution module are the elements of the pose matrix after fixed-point conversion, supplied through the VIO IP core.

The experimental results obtained in this paper are compared in Table 2 with other methods on x86 platforms, Advanced RISC Machine (ARM) platforms, DSP platforms, and FPGA platforms.

5.2. Discussion

The parallel computation method proposed in this paper reduces the computation time by a factor of more than 100 compared with software computation schemes on ARM or DSP platforms, and significantly improves the computational efficiency of the kinematic equations. Compared with the study by Çelik et al. [19], the computation in this paper takes more time. The reason is that [19] targets a 4-degree-of-freedom SCARA robot, whose kinematic equations involve fewer solving steps and lower computational complexity than those of the 6-degree-of-freedom robot in this paper. More importantly, the method proposed in this paper supports parallel pipelined computation, so its computational efficiency is far higher than that of Çelik et al. [19]. The fixed-point results of the forward and inverse kinematic solutions are converted to floating-point numbers and compared with those of a software implementation using single-precision floating-point numbers. The resulting computational error bound is the same as that reported by Çelik et al. [19].

It is worth mentioning that the computational timing of the model proposed in this paper is fixed. When the data bit width is increased, the computation period remains unchanged and the computational accuracy improves accordingly, but the logic resource consumption of the FPGA increases as well.

6. Conclusions

In this paper, we address the problem of the high computation delay of the kinematic equations of multijoint robots in real-time control, study an FPGA-based hardware logic computation method for the kinematic equations, and propose a parallel solution method based on the CORDIC algorithm. The computational delay of the forward and inverse kinematic solutions is reduced to the microsecond level, and the solution process is designed as a parallel pipeline to further improve the computational efficiency.

The parallel solution method proposed in this study is of great importance in the motion control of space robots and ultra-high-speed robots. In future work, the servo control algorithm and the parallel solution method proposed in this paper will be considered for implementation in a single FPGA.

Data Availability

Data were deposited in a public repository.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study is financially supported by the National Natural Science Foundation of China (202014353).