Abstract

This paper studies the construction of a 3D temperature distribution reconstruction system based on binocular vision technology. Because the thermal infrared camera is sensitive only to temperature, a traditional calibration method cannot be applied to it directly; the thermal infrared camera is therefore calibrated separately. The belief propagation algorithm is then investigated, and its smooth model is improved to reduce the mismatching rate in stereo matching. Finally, the 3D temperature distribution model is built by matching the 3D point cloud with the 2D thermal infrared information. Experimental results show that the method can accurately construct the 3D temperature distribution model and has strong robustness.

1. Introduction

Object detection and scene categorization are important research areas in robot vision perception and human-robot collaboration [1]. Owing to recent technological advancements, various devices that can acquire different types of information, such as temperature or spatial 3D geometry, are currently available. Thus, using different devices on common sensor platforms enables multidimensional spatial data collection. In addition, directional emissivity is particularly important in the correction of infrared temperature measurements for complex surfaces [2]. As the shape of the object becomes complex, measuring the 3D temperature distribution is necessary to eliminate geometry-caused errors and obtain precise measurements.

Recent technical developments map thermal infrared images (2D images) onto spatial information (3D images) to obtain a 3D temperature distribution model. Three-dimensional space information can be acquired through various 3D scanning methods, such as omnidirectional vision [3], structured light technology [4, 5], time-of-flight [6], and binocular vision technology [7]. Stereo vision technology can be classified as active or passive according to light conditions. The flaws of active vision are its limited view range and its sensitivity to environment light interference. For this reason, binocular vision is usually used to obtain 3D point cloud information. The principle of binocular vision is based on the parallax between the left and right camera images.

As a simple, noncontact, noninvasive, and inexpensive imaging method, thermography is widely applicable in a variety of industries and research fields. In most of its applications, investigation is performed in a passive manner; that is, the camera observes a scene and detects the thermal radiation emitted by objects. Although radiometric information is typically represented as colored or gray-valued images, a thermal camera can capture complementary information that facilitates the description and analysis of detected objects and observed scenes.

A thermal infrared camera model is a classical pinhole model with intrinsic and external parameters [8]. Given that the traditional chessboard pattern is not visible in the thermal infrared domain, the calibration of thermal cameras is based on a planar pattern with lamps [9, 10]. The lamps are clearly visible in the thermal infrared images and can be detected easily [11]. In real situations, however, active lamps cannot provide heat that is strong enough to be identified by infrared equipment. Skala et al. [12] presented a calibration pattern composed of rectangular holes cut through a solid board. The hole pattern can be distinguished by the thermal cameras through the thermal radiation that is emitted by the objects located behind the board and passes through the holes. However, this calibration board is unsuitable for binocular camera calibration.

For multiple image registrations, González-Aguilera et al. [13, 14] presented solutions for the automated registration of individual 2D images onto 3D range models. These methods are based on feature extraction (i.e., extraction of points, lines, edges, rectangles, or rectangular parallelepipeds). However, extracting these features in the infrared image is difficult. Ben Azouz et al. [15] applied thermal infrared images to determine the approximate position of the teats in an optical image and then applied an image processing algorithm to determine the accurate teat ends within the region of interest. In contrast, Liu et al. [16] presented solutions that combine 2D-to-3D registration with multiview geometry algorithms. This scheme is simple and efficient.

In this paper, thermal information and spatial information are both considered. In particular, a thermal camera is used together with a binocular vision technique to obtain a 3D temperature distribution model. We define a new smooth item in the belief propagation algorithm to address the parallax hollow phenomenon generated during stereo matching. The external parameters of the binocular and infrared cameras can be calculated simultaneously by using a single calibration board. The 3D space temperature distribution model can then be calculated by combining thermal infrared and 3D space information [17].

This paper is organized as follows. Section 2 describes the experimental setup and mathematical description of the calibration method. Stereo matching and 3D temperature distribution model calculation are introduced in Section 3. The analysis of the experimental results is shown in Section 4.

2. System Composition and Calibration

2.1. The 3D Temperature Distribution Imaging System

The hardware of the system consists of an A615 thermal infrared camera produced by FLIR and an MV-VEM120SM visible light camera produced by Veise. The A615 has a focal plane array (FPA) detector with a resolution of 640 × 480 and a spectral range of 7.5–14 μm. Its temperature measurement ranges are −20 to 150°C, 0 to 150°C, and 300 to 2000°C; in this system, we select the object temperature range of 0–150°C. The noise equivalent temperature difference (NETD) is less than 0.05°C at a temperature of 30°C. The binocular camera consists of two imaging devices, each of which provides a visual image of 1920 × 960 pixels. The 3D temperature distribution imaging system is shown in Figure 1.

The projection models of the binocular and thermal infrared cameras are shown in Figure 2. The left and right visible light cameras ( and ) constitute a binocular vision system. The system is used to construct the 3D information of the scene. Binocular vision technology determines the matching points in two images by imitating human eyes and restores the 3D space information of the entire scene according to the triangulation theory. The matching point is defined as the dot pair (DP). The thermal infrared camera is used to acquire the temperature information of the scene. Meanwhile, in Figure 2 represents the world coordinate system after the integration of 3D space information and temperature information. and are the external parameters of the right and left cameras, respectively.

2.2. System Calibration

The calibration method of Zhang is widely used in computer vision [18, 19]. This method can be used even without specialized knowledge of 3D geometry or computer vision. It only requires a camera to observe a planar calibration pattern shown at a few (at least two) different orientations. Either the camera or the planar calibration pattern can be moved freely, and the motion need not be known. When lens distortion is considered, this method improves the camera calibration results through a nonlinear optimization technique based on the maximum likelihood criterion.

The traditional calibration board used in the calibration method of Zhang is composed of black and white chessboard squares. Spatial information can be obtained by extracting the chessboard corners and can be used to calculate the internal and external parameters of the camera. The temperatures of the black and white areas of the traditional calibration board are the same when no special treatment is applied. Thus, the thermal infrared camera cannot distinguish the chessboard corners used in the method of Zhang, and the traditional calibration board cannot be used to calibrate the thermal infrared camera.

We build the black and white calibration chessboard from a cardboard with sheet metal. The chessboard corners can be clearly recognized in the binocular images and can also be identified in the thermal infrared image, as shown in Figure 3. Subsequently, the binocular and thermal infrared cameras are calibrated according to the calibration method of Zhang.

Figures 3(a) and 3(b) show the binocular images (left and right images), and Figure 3(c) shows the thermal infrared image.

The specific steps are as follows: Position is assumed on the calibration board with coordinates of and in the left and right cameras, respectively, and in the thermal infrared camera.where and represent the rotation matrix and translation vector, respectively, between the left and right cameras; , , , , , and are the rotation matrixes and translation vectors of left camera, right camera, and thermal infrared camera. The matrixes and vectors can be set as follows:where is a scale value. Given that , (2) can be converted into

Thus, the following can be obtained by combining (1) and (3):Similarly, under the right camera,

In the calibration process of the binocular vision system, , , , , , , , and can be obtained through calibration. Therefore, the coordinate of under the world coordinate can be completely available. Given the image coordinate of in the thermal infrared camera, DP can be obtained. Several dot pairs can be obtained by using the mobile calibration board. In this instance, the traditional calibration method can be used in the calibration work of the thermal infrared camera by constructing multiple sets of linear equations according to the coordinates of the corresponding points. Unknown parameters in the equation are the internal and external parameters of the thermal infrared camera. These unknown parameters can be obtained by substituting the coordinates of the points, including , , , and , into the equation.
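As a hedged illustration of how the pose between the left and right cameras follows from the per-camera external parameters, the sketch below assumes that each camera maps a board point P to R_c P + T_c. The names `R_l`, `T_l`, `R_r`, `T_r`, `relative_pose`, and `rot_z`, as well as all numeric values, are hypothetical and chosen only for this example.

```python
import numpy as np

def relative_pose(R_l, T_l, R_r, T_r):
    """Pose of the right camera relative to the left camera.

    Assumption: each camera maps a board point P to P_c = R_c @ P + T_c.
    Eliminating P gives P_r = R @ P_l + T with the values returned here.
    """
    R = R_r @ R_l.T        # the inverse of a rotation matrix is its transpose
    T = T_r - R @ T_l
    return R, T

def rot_z(angle):
    """Rotation by `angle` radians about the z axis (demo helper)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Made-up external parameters and one board point, for illustration only.
R_l, T_l = rot_z(0.1), np.array([0.0, 0.0, 1.0])
R_r, T_r = rot_z(-0.2), np.array([-0.12, 0.0, 1.0])
R, T = relative_pose(R_l, T_l, R_r, T_r)

P = np.array([0.3, -0.1, 0.0])   # point on the calibration board
P_l = R_l @ P + T_l              # the same point in the left camera frame
P_r = R_r @ P + T_r              # the same point in the right camera frame
```

With these definitions, `R @ P_l + T` reproduces `P_r`, which is the consistency that stereo calibration relies on.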

The calibration of the thermal infrared camera builds on the calibration of the visible light cameras. The two visible light cameras and the thermal infrared camera simultaneously obtain images of the calibration board. The infrared image and the 3D location of each target point on the calibration board are therefore available at the same time, which allows the mapping relationship between them to be constructed. The parameter information of the thermal infrared camera is then obtained from this mapping.

Meanwhile, the projection model of the thermal infrared camera can be constructed in the same manner as that of an ordinary camera [20]. The parameter calibration process of the thermal infrared camera is similar to the traditional camera calibration method.
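One common way to turn dot pairs into sets of linear equations is the direct linear transformation (DLT). The sketch below is a swapped-in illustration of that general idea, not necessarily the exact formulation used in this paper. It estimates a 3 × 4 projection matrix from 3D points and their thermal image coordinates; the function names and the synthetic camera are assumptions made for the example.

```python
import numpy as np

def dlt_projection_matrix(points_3d, points_2d):
    """Estimate a 3x4 projection matrix (up to scale) from >= 6 dot pairs.

    Each pair (X, Y, Z) <-> (u, v) contributes two linear equations in the
    12 entries of the matrix. The 3D points must not all be coplanar.
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        Pw = [X, Y, Z, 1.0]
        rows.append(Pw + [0.0] * 4 + [-u * c for c in Pw])
        rows.append([0.0] * 4 + Pw + [-v * c for c in Pw])
    A = np.asarray(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

def project(M, points_3d):
    """Project 3D points with projection matrix M and dehomogenize."""
    homog = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    uvw = homog @ M.T
    return uvw[:, :2] / uvw[:, 2:3]

# Synthetic check with a made-up camera placed 5 units from the points.
rng = np.random.default_rng(0)
pts3d = rng.uniform(-1.0, 1.0, size=(12, 3))
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
M_true = K @ np.hstack([np.eye(3), [[0.0], [0.0], [5.0]]])
pts2d = project(M_true, pts3d)
M_est = dlt_projection_matrix(pts3d, pts2d)
```

The estimated matrix reprojects the 3D points onto the given image coordinates. In practice such a linear estimate would normally serve only as the starting point for a nonlinear refinement.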

The Levenberg-Marquardt (LM) algorithm [21, 22] is applied in the refinement process. The LM algorithm is realized as follows:
(1) Select the initial point and the termination control constants.
(2) Calculate the Jacobi matrix and construct the incremental normal equation.
(3) Solve the incremental normal equation to obtain the increment. If the updated point reduces the error, accept it and decrease the damping factor; otherwise, increase the damping factor, solve the normal equation again to obtain a new increment, and repeat the test.
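The steps above can be sketched as a textbook Levenberg-Marquardt loop. This is a minimal illustration under assumed settings: the initial damping `lam`, the scaling factor `nu`, the tolerance, and the exponential test problem are hypothetical choices, not the values used in the experiments.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, p0, lam=1e-3, nu=10.0,
                        tol=1e-10, max_iter=100):
    """Minimize sum(residual(p)**2) with a basic LM iteration."""
    p = np.asarray(p0, dtype=float)
    r = residual(p)
    cost = r @ r
    for _ in range(max_iter):
        J = jacobian(p)
        # Incremental normal equation: (J^T J + lam * I) delta = -J^T r
        N = J.T @ J + lam * np.eye(p.size)
        delta = np.linalg.solve(N, -J.T @ r)
        r_new = residual(p + delta)
        cost_new = r_new @ r_new
        if cost_new < cost:
            # Error decreased: accept the step and reduce the damping.
            p, r, cost = p + delta, r_new, cost_new
            lam /= nu
            if np.linalg.norm(delta) < tol:
                break
        else:
            # Error increased: reject the step and increase the damping.
            lam *= nu
    return p

# Toy problem (assumed): fit y = a * exp(b * x) with true a = 2, b = -1.5.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)

def residual(p):
    return p[0] * np.exp(p[1] * x) - y

def jacobian(p):
    e = np.exp(p[1] * x)
    return np.column_stack([e, p[0] * x * e])

p_fit = levenberg_marquardt(residual, jacobian, p0=[1.0, 0.0])
```

In camera calibration, the residual would be the reprojection error of the chessboard corners and the parameter vector would hold the internal, external, and distortion parameters.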

3. Stereo Matching and 3D Temperature Distribution Model Calculation

3.1. Stereo Matching and Algorithm Improvement

In binocular vision, the 3D structure is recovered from the parallax between the left and right camera images. Stereo matching based on the belief propagation algorithm [23, 24] constructs an energy function so that the parallax calculation is converted into the minimization of that energy function. To solve for the minimum, a Markov random field model is introduced [25], and the matching cost is obtained according to prior scene information. The maximum posterior probability of the Markov random field is then calculated by the belief propagation algorithm. These processes yield the parallax information of the scene.

The energy function of the belief propagation algorithm can be divided into two parts, namely, data item and smooth item. The mapping of the data item is the relationship between known and unknown nodes in the Markov random field, and the smooth item is the relationship between two adjacent unknown nodes in Markov random field. The energy function is defined mathematically as

Negative exponential computing is adopted for the energy function formula. The formula is converted as

The following can be defined:

Then, the energy function is

In the function, represents the smooth item and represents the data item. After the negative exponential change, the minimum value problem of the energy function is converted to the maximum probability distribution of unknown points. The initial matching cost of the belief propagation algorithm is the absolute difference function as follows:

As shown in the formula, the difference between the values of the two pixel points is calculated to obtain the degree of similarity between the two images. The smooth item of the energy function is obtained by using the Potts model [26], in which the penalty constant is determined by the gradient between the two pixels i and j.
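To make the two parts of the energy function concrete, the following sketch evaluates an absolute-difference data item and a Potts smooth item for a given parallax map. It is an illustration under simplifying assumptions: the Potts constant `potts_c` is fixed rather than gradient dependent, `invalid_cost` is a hypothetical fill value for pixels without a match candidate, and the tiny images are synthetic.

```python
import numpy as np

def data_cost_ad(left, right, max_disp, invalid_cost=255.0):
    """Absolute-difference cost volume: cost[y, x, d] = |L(y, x) - R(y, x - d)|."""
    h, w = left.shape
    cost = np.full((h, w, max_disp + 1), invalid_cost)
    for d in range(max_disp + 1):
        cost[:, d:, d] = np.abs(left[:, d:] - right[:, :w - d])
    return cost

def energy(disp, cost, potts_c=1.0):
    """Data item plus Potts smooth item over 4-connected neighbors."""
    h, w = disp.shape
    ys, xs = np.mgrid[0:h, 0:w]
    data = cost[ys, xs, disp].sum()
    # Potts model: 0 if neighboring parallax values agree, a constant otherwise.
    jumps = ((disp[:, 1:] != disp[:, :-1]).sum()
             + (disp[1:, :] != disp[:-1, :]).sum())
    return data + potts_c * jumps

# Synthetic pair: the left image is the right image shifted by 2 pixels.
right = np.arange(32, dtype=float).reshape(4, 8)
left = np.roll(right, 2, axis=1)
cost = data_cost_ad(left, right, max_disp=3)

flat = np.full((4, 8), 2)   # the correct parallax everywhere
noisy = flat.copy()
noisy[1, 4] = 0             # one mismatched pixel
```

The mismatched pixel raises both the data item and the smooth item, so `energy(noisy, cost)` is larger than `energy(flat, cost)`.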

The belief propagation algorithm can update information in two ways. Synchronous updating first calculates the information from all the neighbors of each node and then updates the information in every node [27]. Accelerated updating, in contrast, updates a node immediately when it receives information from a neighbor.

This paper adopts the accelerated updating method to solve maximum joint posterior probability.
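The idea of accelerated updating can be illustrated on a single scanline, where each message is computed from the newest message of the previous node. The sketch below is a min-sum simplification for a one-dimensional chain rather than the full 2D grid; the function name and the toy costs are assumptions for the example.

```python
import numpy as np

def bp_scanline(data_cost, smooth):
    """Min-sum belief propagation on a chain with sequential message updates.

    data_cost: (n, L) array, cost of each of L labels at each of n nodes.
    smooth:    (L, L) array, smooth[a, b] is the cost of neighboring labels a, b.
    """
    n, L = data_cost.shape
    fwd = np.zeros((n, L))   # message arriving at node i from node i - 1
    bwd = np.zeros((n, L))   # message arriving at node i from node i + 1
    for i in range(1, n):
        # The freshly updated message fwd[i - 1] is used immediately.
        h = data_cost[i - 1] + fwd[i - 1]
        fwd[i] = (h[:, None] + smooth).min(axis=0)
    for i in range(n - 2, -1, -1):
        h = data_cost[i + 1] + bwd[i + 1]
        bwd[i] = (h[:, None] + smooth).min(axis=0)
    belief = data_cost + fwd + bwd
    return belief.argmin(axis=1)

# Toy scanline: four nodes prefer label 1, the middle node weakly prefers 0.
data = np.ones((5, 3))
data[[0, 1, 3, 4], 1] = 0.0
data[2] = [0.0, 0.5, 1.0]
potts = 1.0 - np.eye(3)      # Potts smooth item with constant 1
labels = bp_scanline(data, potts)
```

The smooth item overrides the weak local preference of the middle node, so every node receives label 1.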

The traditional belief propagation function uses the Potts model (see (11)) for the smooth item: the smooth item between two adjacent points is 0 when their parallax values are the same and a constant when their parallax values differ. In an area without texture, however, this model causes ambiguous matching or matching errors. This phenomenon is defined as the parallax hollow phenomenon. To overcome this disadvantage, we define a new smooth item in the belief propagation algorithm.

In the formula, λ represents the truncation threshold. When the parallax difference reaches the truncation threshold λ, the smooth item reaches its maximum and stops growing, which protects the smoothness of the weak-texture region. The smooth item is also scaled by a penalty coefficient. According to the continuity of the image, the penalties differ between regions: the coefficient depends on the brightness difference between the matching pixels in the target image and reference image. When the brightness difference is less than a threshold, the smooth item must be multiplied by the penalty coefficient.
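A truncated smooth item with a brightness-dependent penalty could look like the sketch below. The exact functional form is an assumption made for illustration (a truncated absolute parallax difference scaled by a penalty coefficient), and the default values of `lam`, `thresh`, and `penalty` are hypothetical.

```python
def smooth_item(d_i, d_j, brightness_diff, lam=2.0, thresh=10.0, penalty=2.0):
    """Truncated smooth item between two adjacent pixels (assumed form).

    d_i, d_j:        parallax values of the two adjacent pixels
    brightness_diff: brightness difference used to select the penalty
    lam:             truncation threshold; the cost stops growing beyond it
    thresh, penalty: below `thresh` the cost is multiplied by `penalty`
    """
    rho = penalty if brightness_diff < thresh else 1.0
    return rho * min(abs(d_i - d_j), lam)
```

For example, `smooth_item(0, 10, 50)` is capped at the truncation threshold, while `smooth_item(0, 10, 5)` is additionally multiplied by the penalty coefficient because the brightness difference is small.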

The mismatching rate is greatly reduced in stereo matching after the algorithm is improved.

3.2. 3D Temperature Distribution Model

In the 3D temperature distribution reconstruction system, two visible light cameras are used to construct the 3D point cloud of the object according to the binocular vision technology. The thermal infrared camera obtains the temperature information of an object and captures the 2D thermal infrared image. The parameter models of the thermal infrared and visible light cameras can be further acquired during system calibration. On this basis, the paper proposes a 3D temperature distribution model based on matching principle.

The matching principle is shown in Figure 4 and is based on the pinhole camera model. In this model, the origin of the thermal infrared camera is defined as the world coordinate system origin . In addition, and are the origins of the left and right visible light cameras, respectively, while u and v are the coordinate axes in the camera coordinate system. P is the projection point in the cameras from the object . The -axis is along the optical axis of the thermal infrared camera, and the -axis and -axis correspond to the projection plane of the thermal infrared camera. Thus, a scene view is formed by projecting 3D points into the image plane through perspective transformation. The relation between a point in the 3D space and its corresponding point in the camera coordinate system is computed from the internal parameter matrix of the camera, the rotation matrix of the external parameters, and the translation vector of the external parameters. Therefore, three corresponding points, namely, , , and , of space point can be obtained according to the different projection parameters of different sensors.
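The perspective transformation of the pinhole model can be sketched as follows. The symbols `K`, `R`, and `T` stand for the internal parameter matrix, rotation matrix, and translation vector; these names and the numeric values are assumptions for the example, not calibration results from this paper.

```python
import numpy as np

def project_point(K, R, T, P):
    """Project the 3D point P into pixel coordinates (u, v).

    Pinhole model: s * [u, v, 1]^T = K @ (R @ P + T), where s is the depth
    of the point along the optical axis of the camera.
    """
    p_cam = R @ P + T            # world frame -> camera frame
    uvw = K @ p_cam              # camera frame -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]      # divide by the scale s

# Made-up internal parameters; the camera sits at the world origin.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
uv = project_point(K, np.eye(3), np.zeros(3), np.array([0.1, -0.05, 2.0]))
```

Applying the same function with the parameters of each sensor gives the three corresponding image points of one space point.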

The parameters of the rotation matrix and translation vector are obtained by the calibration described in Section 2. The depth value can be calculated according to the difference value between and . After epipolar rectification based on the rotation matrices and , the values of and should be equal. The depth of can be calculated through triangulation as follows:
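For a rectified image pair, triangulation reduces to the standard relation between depth and horizontal parallax, Z = f B / (x_l − x_r). The sketch below illustrates this relation; the focal length in pixels, the baseline, and the pixel coordinates are hypothetical values.

```python
def depth_from_disparity(x_l, x_r, focal_px, baseline):
    """Depth of a point from its columns in a rectified stereo pair.

    x_l, x_r: horizontal pixel coordinates in the left and right images
    focal_px: focal length in pixels (assumed equal for both cameras)
    baseline: distance between the two camera centers
    """
    disparity = x_l - x_r
    if disparity <= 0:
        raise ValueError("a point in front of the cameras needs positive disparity")
    return focal_px * baseline / disparity

# Example with made-up numbers: 40 px of parallax, f = 800 px, B = 0.1 m.
z = depth_from_disparity(420.0, 380.0, focal_px=800.0, baseline=0.1)
```

The returned depth has the same unit as the baseline, so the example yields a depth in meters.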

The calculated depth value is the length from the object to the epipolar plane of the axis. When the thermal infrared pixel is considered, the spatial and temperature information of can be calculated roughly. However, the points of the 3D model do not correspond one-to-one with the pixels of the thermal infrared image. In other words, some points in the thermal infrared image have no corresponding point in the 3D model in the matching process. Therefore, interpolation processing is required for the 3D model.

As shown in Figure 5, is the origin of the thermal infrared camera, and is the origin in the world coordinate system. The adjacent points , , and in the 3D space are transformed to the imaging plane of the thermal infrared camera through projection. The corresponding points in the infrared image are , , , , , and , , and have the similar topological structures. The direct relationship between the 3D model and infrared image is constructed. The corresponding temperature value in the infrared image is returned to the 3D model to obtain the preliminary 3D temperature distribution.

In the thermal infrared image, is assumed to be in the triangle formed by , , and . Then and , , and can be defined in the following equation:

According to the equation, the ray equation of in the 3D space can be defined.

The following relationships of , , and can be calculated by the properties of the coplanar principle:

Therefore, the complete 3D temperature distribution model can be obtained after the interpolation processing of the 3D model.
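The interpolation step can be illustrated by intersecting the viewing ray of a thermal pixel with the plane through three neighboring 3D points. This is a sketch under stated assumptions: the thermal camera is at the origin of the coordinate system, lens distortion is ignored, and the names `K`, `P1`, `P2`, `P3` and all numeric values are hypothetical.

```python
import numpy as np

def backproject_to_triangle(K, pixel, P1, P2, P3):
    """3D point where the viewing ray of `pixel` meets the plane of P1, P2, P3.

    The ray starts at the camera origin and has direction K^-1 [u, v, 1]^T.
    The coplanarity condition n . (t * ray - P1) = 0 is solved for t.
    """
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    normal = np.cross(P2 - P1, P3 - P1)
    denom = normal @ ray
    if abs(denom) < 1e-12:
        raise ValueError("the ray is parallel to the triangle plane")
    t = (normal @ P1) / denom
    return t * ray

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = np.array([0.0, 0.0, 2.0])
P2 = np.array([1.0, 0.0, 2.0])
P3 = np.array([0.0, 1.0, 2.0])
point = backproject_to_triangle(K, (400.0, 320.0), P1, P2, P3)
```

The temperature of the thermal pixel can then be attached to the interpolated 3D point, which fills positions of the 3D model that have no direct match.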

4. Experimental Results and Analysis

4.1. Calibration Results

To validate the calibration and measurement procedure of the system, 25 sets of data are collected with our novel black and white calibration chessboard. The data are divided into two groups. One group is used to calibrate the parameters of the three sensors, and the other is used to validate the accuracy of the matching between space information and thermal infrared information.

The thermal infrared and binocular cameras are calibrated using the common camera calibration toolbox of MATLAB. The calibration results are shown in Table 1.

In Table 1, and are principal points that are usually at the image center, and are the focal lengths expressed in pixel units, and and are the distortion parameters. In Table 2, and are the matrixes of the extrinsic parameters.

Any point in the 3D space can be mapped to the image coordinate system of the thermal infrared image through matrix transformation. Through mapping, the corresponding relationship between 3D points in the space and thermal infrared image can be obtained.

4.2. Experimental Results

In the experiment, a cup filled with hot water is used for 3D reconstruction. The temperatures at the top and bottom of the cup are different in the temperature image. Their color values are also different. After epipolar rectification, two readily matched images are obtained, as shown in Figure 6.

After epipolar rectification, two images have the parallax only in the horizontal direction. Therefore, the belief propagation algorithm is used to obtain the parallax image of two images, as shown in Figure 6(c).

After the parallax image is obtained, 3D reconstruction is carried out according to the triangulation principle shown in Figure 7(a). In this process, the background area is eliminated according to the different parallax between the background and target. The thermal infrared image of the cup is shown in Figure 7(b). The matching between the temperature image and the 3D model is studied after the target 3D model is obtained, as shown in Figure 7(c).

The experimental results show that the temperature image matches the 3D model well. The temperature information on the surface of the target is completely mapped onto the 3D model, but the 3D model still has some holes because of noise in the parallax image and abrupt changes in some areas.

In addition, we use a lid and kettle to perform two experiments. The binocular visual images, depth images, and the thermal infrared images are shown in Figure 8.

In Figure 8, columns (a) and (b) show the stereo images, column (c) shows the depth image calculated according to binocular vision, and column (d) shows the thermal infrared images.

Three-dimensional space information and thermal infrared information of the three objects are fused to calculate the 3D space temperature distribution model.

4.3. Experimental Analysis

In terms of matching accuracy, we propose a new evaluation method based on matching drift. When the matching between the 3D space information and thermal infrared information is not perfect, matching drift is produced, as shown in Figure 9. In Figure 9, (a) is the depth image calculated according to the horizontal parallax of the left and right images. The red dots in the depth image are the chessboard corners estimated from the stereo images, and the blue dots in (b) are the precise corners in the thermal infrared image. When the matching results are not accurate, matching drift is inevitably produced and appears as an offset between the blue and red dots.

We define the pixel average error through the following equation:where represents the image number, represents the corner point number in the th image, represents the total number of images, represents the total number of corner points, represents the abscissa value of the th corner point in the th depth image, represents the ordinate value of the th corner point in the th depth image, represents the abscissa value of the th corner point in the th thermal infrared image, and represents the ordinate value of the th corner point in the th thermal infrared image. Thus, the accuracy of matching algorithm can be determined by calculating the pixel average error.
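A pixel average error over all corner points can be computed as in the sketch below. It assumes that the error is the mean Euclidean distance between corresponding corners in the depth image and the thermal infrared image; the exact formula of the paper may differ, and the array layout is a choice made for this example.

```python
import numpy as np

def pixel_average_error(depth_corners, thermal_corners):
    """Mean Euclidean distance between corresponding corner points.

    Both inputs have shape (M, N, 2): M images, N corners per image, and
    the (abscissa, ordinate) pixel coordinates of each corner.
    """
    d = np.asarray(depth_corners, dtype=float)
    t = np.asarray(thermal_corners, dtype=float)
    return np.linalg.norm(d - t, axis=-1).mean()

# Toy data: every corner is displaced by 3 pixels in x and 4 pixels in y.
depth_pts = np.zeros((2, 3, 2))
thermal_pts = depth_pts + np.array([3.0, 4.0])
err = pixel_average_error(depth_pts, thermal_pts)
```

A constant displacement of (3, 4) pixels gives an average error of 5 pixels.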

In the actual measured object, we cannot determine the match point to verify the match pixel offset. Therefore, we apply 10 sets of samples to measure the matching drift pixel average error according to the calibrated spatial matching results. The matching drift pixel average error of the samples is shown in Figure 10.

The experimental results show that our average pixel error is within four pixels.

The FLIR Tool software of the forward looking infrared (FLIR) camera is used to collect the temperatures of some points on the target in the thermal infrared image. FLIR cameras exhibit high accuracy in temperature measurement, and the accuracy of the temperature is 0.1°C. During temperature information acquisition, the coordinates of the corresponding points in the 3D space can be obtained according to the constructed 3D temperature model. The temperatures and coordinates are shown in Table 3.

Additional Points

Summary. This paper aims to build the 3D temperature reconstruction system using two visible light cameras and a thermal infrared camera. As indicated by the results of our experiments on multiple objects, the method can accurately construct a 3D temperature distribution model. In the future, we can apply this method on a mobile robot platform.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grant U1613214) and is supported in part by the Fundamental Research Funds for the Central Universities of China (Grants N150404010 and N150308001).