Abstract

Panoramic imaging is information-rich, low-cost, and effective. In panoramic image acquisition, unmanned aerial vehicles (UAVs) have a natural advantage owing to their flexibility and relatively large observation ranges. Using a panoramic gimbal and a single camera may be the most common means of capturing gigapixel panoramas. To manage the constraints of UAV power and facilitate the use of a variety of camera lenses, an effective and flexible method for planning UAV gigapixel panorama acquisitions is required. To address this need, a panoramic image acquisition planning method is proposed in this paper. The method defines image overlaps via a ray casting procedure and then generates an acquisition plan according to the constraints of horizontal and vertical overlap thresholds. This method ensures the completeness of the panorama by maintaining the overlap between adjacent images. Two experiments, a simulated case and a field case, were performed to evaluate the proposed method through comparisons with an existing panorama acquisition plan. Results showed that the proposed method can capture complete panoramas with fewer images.

1. Introduction

A panorama is a single wide-angle image of the environment around a camera [1]. A spherical panorama fully covers the surrounding scene with a 360° horizontal and 180° vertical field of view. Panoramic imaging is information-rich, low-cost, and effective—it is one of the best forms of virtual reality. Combining it with geoinformation technologies has facilitated its wide use in many commercial and industrial applications, including street view [2], virtual tours [3, 4], surveillance [5], and risk assessment [6]. Thanks to the availability of high-resolution camera systems, there has been a trend toward gigapixel-sized panoramas. Gigapixel images can include a significant amount of detail [7], which allows their application to traffic or wildlife surveillance and rockfall mapping [8, 9].

Current unmanned aerial vehicles (UAVs), particularly multirotors, are relatively inexpensive and have a high degree of mobility and maneuverability. UAVs have a natural advantage in panoramic image acquisition, owing to their flexibility and large observation range. Combining UAVs and panoramic imaging makes macroscopic landscape analysis possible [10, 11]. Three types of equipment are commonly used for capturing panoramic images on the UAV platform: fisheye panoramic cameras [12], camera arrays [13], and panoramic gimbals [14, 15]. A fisheye camera uses a fisheye lens, which has a relatively large field of view (FoV), to capture the panorama directly. The spatial resolution of the fisheye panorama is limited by the sensor size, so its application is restricted to virtual tours or panoramic videos. A camera array contains several lenses that capture images of different areas around them, so a multilens panorama has a large FoV. Because of the rigid connection between lenses, this type of camera is able to capture seamless panoramas. However, the payload of the multirotor platform is restricted, so most panoramic camera arrays have five or six lenses and are unable to capture gigapixel panoramas. A panoramic gimbal rotates itself to allow the camera to capture multiple images that can be stitched together as a panorama. This may be the most common method of capturing gigapixel panoramas. At present, capturing panoramas with UAVs mostly relies on human operation or programmed gimbals. To ensure that the images can be stitched, redundant images are often captured, which leads to unnecessary use of UAV power. Moreover, the image capture strategy cannot adjust itself to accommodate different cameras and lenses, which limits the flexibility of panoramic gimbals.

Multiple methods and tools have been developed for stitching gigapixel panoramas. The pipeline starts with the correction of lens distortion and interior elements using a pinhole camera model [16, 17]. Feature points are then detected and matched as tie points, which requires sufficient overlap between images. For spherical panoramas, images are projected onto a virtual sphere whose radius is equal to the focal length of the camera [18, 19]. The images are stitched together to create a full panorama. To overcome different lighting and exposure conditions, blending is performed to create consistent and seamless panoramas [20]. Among these steps, tie point matching is the foundation. Therefore, ensuring sufficient overlap between images is the most important criterion for capturing gigapixel panoramas.

Compared with the number of studies on image stitching, there are fewer studies on capturing gigapixel panoramas. Camera settings, rotation axes, and image overlap have been discussed in previous studies. Krishnan and Ahuja [21] proposed a method that requires a minimal number of camera parameter changes to capture a fully focused panorama. During the image capture, the camera must be rotated around a “no-parallax point” in order to maintain the alignment of foreground and background points in overlapping frames, while Littlefield [22] stated that this point should be the center of perspective. However, in previous studies, the optimal overlap between images during the capture of a spherical panorama varied. Kopf et al. [14] selected a 16% overlap threshold in capturing gigapixel images; Carey [23] recommended a 15% overlap while Eisenegger [24] recommended a 20% to 25% overlap. Furthermore, Afshari et al. [25] proposed a guideline for designing multicamera panoramic systems, which considered the FoV and full coverage distance. Akin et al. [26] followed this guideline and built a hemisphere panoramic camera system in which the estimated overlap was 32%. Although all these systems successfully captured panoramas, the definition of overlap is ambiguous and the value differs. In UAV gigapixel panorama capturing, the definition and selection of the overlap must be optimal, because payload and power in UAV platforms are strictly restricted.

In this paper, a gigapixel panorama acquisition planning method for multirotors is proposed. This method is based on a ray casting image overlap calculation, and the resulting overlap constrains the panoramic image capture planning. The plan is designed for panoramic gimbals that can rotate horizontally and vertically; thus, it is generated considering pan and tilt angles separately. Two experiments, a simulated case and a field case, are conducted to compare the planning result with existing panoramic acquisition plans. The image stitching results show that the proposed acquisition plan can ensure the completeness of the panorama while using fewer images.

2. Ray Casting-Based Panoramic Image Acquisition

A spherical panorama is represented in spherical coordinates by a horizontal angle and a vertical angle. It is an equirectangular projected sphere [27] whose latitude and longitude coordinates are mapped onto the horizontal and vertical coordinates of a grid. The spherical panorama is captured by rotating the camera about its center of perspective. This section describes the generation of a panorama acquisition plan for multirotors, based on a ray casting method under the panoramic imaging coordinate system.
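The equirectangular mapping mentioned above can be illustrated with a minimal sketch. The function name, the angle ranges (longitude in [−180°, 180°), latitude in [−90°, 90°]), and the choice of placing latitude +90° on the top row are assumptions made for illustration; they are not taken from the paper.

```python
def equirect_pixel(lon_deg, lat_deg, width, height):
    """Map a longitude/latitude pair (degrees) to (column, row) on an
    equirectangular grid. Assumed convention: longitude -180 deg is the
    left edge and latitude +90 deg is the top edge."""
    col = (lon_deg + 180.0) / 360.0 * width
    row = (90.0 - lat_deg) / 180.0 * height
    return col, row


# The direction (0 deg, 0 deg) lands at the centre of the grid.
centre = equirect_pixel(0.0, 0.0, 3600, 1800)
```

Because both axes are linear in angle, each pixel of the panorama corresponds to a fixed angular step, which is why an orientation can be treated as a grid cell.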

2.1. Ray Casting-Based Image Overlap Calculation

We assume that the center of perspective remains fixed during the capture of a panorama, and that the camera only changes its pan and tilt angles to capture different views. The pan and tilt angles are denoted by α and β. We define to be the panoramic imaging coordinate system of a single camera, while is the image coordinate system, which is shown in Figure 1. Note that denotes that the camera is pointing towards the -axis and denotes that the camera is pointing towards the -axis.

It is assumed that an image has pixels. The distance between the two origins is the focal length, . We first want to calculate the panoramic imaging coordinates () of a point on the image.

In equation (1), , , and , respectively, represent the transformation matrix and the rotation matrices of the - and -axes under the coordinate system as follows:

The coordinates of the point on the image, , , must satisfy the following constraints:

The pixels that fall within the overlap of two images and contribute to the panoramic image stitching; i.e., tie points will be detected from them. According to the imaging principle of perspective cameras, the connection between the ground object and the origin of the panoramic imaging coordinate system is the path of the reflecting ray to the camera. A diagram depicting ray casting is shown in Figure 2.

Note that the ground object, the projection on the image, and the center of perspective are collinear, as shown in and of Figure 2. For two overlapping images and , if a ground object (or point) has projections and on both images, the pixels are defined as overlapping pixels. Otherwise, nonoverlapping pixels are the projections of the rays that pass through only one image. For overlapping pixels, because the projection points on two images are on the same ray that crosses the origin , the set of overlapping pixels between image and can be defined as follows:

Let be the set of all pixels from image ; then, the overlap between images and can be defined as the ratio between the number of overlapping pixels and the number of pixels in an image such as :

The input of equations (1), (4), and (5) contains focal length , sensor size , , and camera orientation , . The overlap between the two images can be calculated with these equations. On the other hand, given an overlap threshold, a series of image orientations can be found with these equations as well.
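The overlap calculation described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the axis convention (Z up, pan 0° looking along +Y, positive tilt looking below the horizon), the function names `camera_axes` and `overlap`, and the coarse sampling grid are all hypothetical. Each sampled pixel of the first image is turned into a ray through the center of perspective, and the ray counts as overlapping if it also passes through the sensor of the second image.

```python
import math


def camera_axes(pan_deg, tilt_deg):
    """Right, down and forward unit vectors of the camera in the world frame
    (assumed convention: Z up, pan 0 looks along +Y, positive tilt looks down)."""
    a, b = math.radians(pan_deg), math.radians(tilt_deg)
    sa, ca, sb, cb = math.sin(a), math.cos(a), math.sin(b), math.cos(b)
    right = (ca, -sa, 0.0)
    down = (-sa * sb, -ca * sb, -cb)
    forward = (sa * cb, ca * cb, -sb)
    return right, down, forward


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def overlap(focal, sensor_w, sensor_h, pose1, pose2, n_w=60, n_h=40):
    """Fraction of sampled pixels of image 1 whose rays also hit image 2.
    Poses are (pan, tilt) in degrees; focal and sensor size share one unit."""
    r1, d1, f1 = camera_axes(*pose1)
    axes2 = camera_axes(*pose2)
    hits = 0
    for i in range(n_w):
        x = ((i + 0.5) / n_w - 0.5) * sensor_w      # sensor x of the sample
        for j in range(n_h):
            y = ((j + 0.5) / n_h - 0.5) * sensor_h  # sensor y of the sample
            # Ray from the center of perspective through the pixel (world frame).
            ray = [x * r1[k] + y * d1[k] + focal * f1[k] for k in range(3)]
            # The same ray expressed in the frame of the second camera.
            x2, y2, z2 = (dot(ray, axis) for axis in axes2)
            inside = (z2 > 0
                      and abs(x2 * focal / z2) <= sensor_w / 2
                      and abs(y2 * focal / z2) <= sensor_h / 2)
            hits += inside
    return hits / (n_w * n_h)


# Example: a 15 mm lens on a 17.3 mm x 13 mm sensor, second image panned by 20 deg.
ratio = overlap(15.0, 17.3, 13.0, (0.0, 0.0), (20.0, 0.0))
```

The sketch uses an exact inside-the-frame test; the relaxed collinearity condition for gimbal inaccuracy is not modelled here.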

In practice, the accuracy and stability of a panoramic imaging system may affect the orientation of the image. Thus, the pixels corresponding to the same ground point on two adjacent images may not be strictly collinear. The collinear condition of equation (4) can be relaxed with :

The value of is related to the controlled angle accuracy of the gimbal. The accuracy varies with the axis —; we define to be the accuracy of roll and pitch angles and to be the accuracy of yaw angles. Typically, is determined by the encoder of the motor while is related to the compass; thus, . Because the absolute values of the uncertainties are small, we can simply let . The computational amount of overlap is related to the image size , , which can be adjusted. It determines the precision of the output as well. The purpose of the overlap calculation is to generate a panoramic image acquisition plan for UAVs, and each pixel on an image represents an orientation. Therefore, the size of the image is determined by the controlled angle accuracy: , , where and are the horizontal and vertical FoVs of the camera.
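As a small worked example of the last point, the sketch below derives a sampling grid from the FoV and the controlled angle accuracy. It assumes that the grid size is simply the FoV divided by the angular accuracy (rounded up) and that the FoV follows from the standard pinhole relation; the function names and the sample numbers are hypothetical.

```python
import math


def field_of_view_deg(sensor_size, focal):
    """Pinhole field of view in degrees; both arguments share one unit."""
    return math.degrees(2.0 * math.atan(sensor_size / (2.0 * focal)))


def grid_size(fov_h_deg, fov_v_deg, accuracy_deg):
    """Number of samples per axis so that one sample spans roughly one
    step of the controlled angle accuracy (assumed relation)."""
    return (math.ceil(fov_h_deg / accuracy_deg),
            math.ceil(fov_v_deg / accuracy_deg))


# Example: a 60 deg x 40 deg view sampled at 0.5 deg per cell.
cells = grid_size(60.0, 40.0, 0.5)
```

A coarser accuracy therefore gives a smaller grid and a cheaper overlap calculation, at the cost of a less precise plan.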

2.2. Panoramic Image Acquisition Planning for UAV

An optimal planning process for UAV panoramic image acquisition should ensure the overlap between adjacent images with a minimum number of images. The overlap can be calculated via equations (1), (4), and (5). This subsection specifies the planning process. A typical panoramic image acquisition plan for panoramic gimbals contains a set of pan and tilt angles and for exposure as follows:

A panoramic gimbal with a single camera should allow exposures at tilt angles and the corresponding pan angles. In equation (7), denotes the acquisition plan with tilt angle and pan angle . The initial position for planning can be any arbitrarily specified position. For simplicity, we can specify it to be (0°,0°). To ensure panoramic image stitching, the overlap between vertical and horizontal adjacent images should be at least and . Given these overlap thresholds, the panoramic image acquisition plan can be derived considering the overlap. The plan contains two parts: the determination of tilt angles and corresponding pan angles . Related algorithms are listed in Algorithms 1 and 2.

The input for tilt angle determination (see Algorithm 1) contains the vertical FoV and the vertical overlap threshold ; the resulting output contains a set of tilt angles for panoramic image acquisition. The heading of the initial image is first set to (0°,0°). We then search for the next optimal tilt angle recursively through a greedy approach. Another virtual image with a different tilt angle is created, and the overlap between and is calculated. This process is repeated until the overlap reaches , at which the optimal tilt angle is acquired. Then, is stored in the set and its value is provided to . The process should be recursive because the density of the ground point will change with ; i.e., the closer the ground point is to the center of perspective, the denser the ground point will be.

The determination of pan angles is similar to that of tilt angles (see Algorithm 2). The main difference is that the search for pan angles is only performed once for each tilt angle; the number of images is then determined. The density of the ground point does not change with . Using the two algorithms listed in Algorithms 1 and 2, a spherical panorama acquisition plan that fulfils the overlap constraints and is acquired.

3. Case Study

This section discusses simulation and field experiments designed to evaluate the proposed panoramic image acquisition planning method in terms of completeness and number of images. In order to evaluate the range of adaptation of the proposed method, it was tested under three different camera and lens scenarios. The 35 mm equivalent focal length ranged from 30 mm to 90 mm. The proposed method was compared with the plan by HDRpano [24] in two of the scenarios. The scenarios and plans for comparison are shown in Table 1.

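Algorithm 1: Tilt angle determination.
Input: vertical FoV, vertical overlap threshold
Output: A set of tilt angles for panoramic image acquisition.
function TILTANGLEDETERMINATION (vertical FoV, vertical overlap threshold)
  tilt angle set ⟵ empty set
  β ⟵ 0°
  image 1 ⟵ Image with angles (0°, β)
  while β < 90° do
    for each candidate tilt angle β2 do
      image 2 ⟵ Image with angles (0°, β2)
      overlap ⟵ overlap between image 1 and image 2
    end for
    store the β2 at which the overlap reaches the vertical overlap threshold in the tilt angle set
    β ⟵ that β2
  end while
  return tilt angle set
end function
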
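Algorithm 2: Pan angle determination.
Input: A set of tilt angles, horizontal FoV, horizontal overlap threshold
Output: Sets of pan angles for panoramic image acquisition.
function PANANGLEDETERMINATION (set of tilt angles, horizontal FoV, horizontal overlap threshold)
  for β in the set of tilt angles do
    pan angle set for β ⟵ empty set
    α ⟵ 0°
    image 1 ⟵ Image with angles (α, β)
    for each candidate pan angle α2 do
      image 2 ⟵ Image with angles (α2, β)
      overlap ⟵ overlap between image 1 and image 2
    end for
    pan step ⟵ the α2 at which the overlap reaches the horizontal overlap threshold
    number of images ⟵ number of images determined by the pan step
    pan angle set for β ⟵ all pan angles for tilt angle β
  end for
  return the pan angle sets
end function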
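
The greedy search in Algorithms 1 and 2 can be sketched in runnable form as follows. This is an illustration under assumptions, not the authors' code: the axis convention, the sampled overlap function, the 1° search step, the use of the FoV as the upper bound of the search range, and the even spacing of pan angles over 360° are all hypothetical choices, and the resulting image counts are not claimed to reproduce Table 1.

```python
import math


def camera_axes(pan_deg, tilt_deg):
    """Right, down, forward unit vectors (assumed: Z up, positive tilt looks down)."""
    a, b = math.radians(pan_deg), math.radians(tilt_deg)
    sa, ca, sb, cb = math.sin(a), math.cos(a), math.sin(b), math.cos(b)
    return (ca, -sa, 0.0), (-sa * sb, -ca * sb, -cb), (sa * cb, ca * cb, -sb)


def overlap(focal, sw, sh, pose1, pose2, n_w=30, n_h=20):
    """Fraction of sampled image-1 pixels whose rays also fall inside image 2."""
    r1, d1, f1 = camera_axes(*pose1)
    axes2 = camera_axes(*pose2)
    hits = 0
    for i in range(n_w):
        x = ((i + 0.5) / n_w - 0.5) * sw
        for j in range(n_h):
            y = ((j + 0.5) / n_h - 0.5) * sh
            ray = [x * r1[k] + y * d1[k] + focal * f1[k] for k in range(3)]
            x2, y2, z2 = (sum(p * q for p, q in zip(ray, ax)) for ax in axes2)
            if z2 > 0 and abs(x2 * focal / z2) <= sw / 2 and abs(y2 * focal / z2) <= sh / 2:
                hits += 1
    return hits / (n_w * n_h)


def largest_step(overlap_at, threshold, limit, step=1.0):
    """Largest rotation (multiple of `step`, at most `limit`) whose overlap still
    meets the threshold; falls back to one step so the search always advances."""
    best, angle = step, step
    while angle <= limit and overlap_at(angle) >= threshold:
        best, angle = angle, angle + step
    return best


def tilt_angles(focal, sw, sh, threshold_v, step=1.0):
    """Greedy tilt search from 0 deg (skyline) down to 90 deg (nadir)."""
    fov_v = math.degrees(2.0 * math.atan(sh / (2.0 * focal)))
    tilts, beta = [0.0], 0.0
    while beta < 90.0:
        delta = largest_step(
            lambda d: overlap(focal, sw, sh, (0.0, beta), (0.0, beta + d)),
            threshold_v, fov_v, step)
        beta = min(beta + delta, 90.0)
        tilts.append(beta)
    return tilts


def pan_angles(focal, sw, sh, tilts, threshold_h, step=1.0):
    """One pan search per tilt angle, then evenly spaced pan angles over 360 deg."""
    fov_h = math.degrees(2.0 * math.atan(sw / (2.0 * focal)))
    plan = {}
    for beta in tilts:
        delta = largest_step(
            lambda d: overlap(focal, sw, sh, (0.0, beta), (d, beta)),
            threshold_h, fov_h, step)
        count = math.ceil(360.0 / delta)
        plan[beta] = [k * 360.0 / count for k in range(count)]
    return plan


# Example: 15 mm lens, 17.3 mm x 13 mm sensor, 15% overlap in both directions.
tilts = tilt_angles(15.0, 17.3, 13.0, 0.15)
plan = pan_angles(15.0, 17.3, 13.0, tilts, 0.15)
```

Rounding the image count up and spacing the pan angles evenly keeps every neighbouring pair, including the wrap-around pair, at or below the step found by the search.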

The proposed plan was generated with both the horizontal and vertical overlap constraints set to 15%. The HDRpano plans were generated with the spherical setting, which gave 8 columns and 4 rows for Scenario 1 and 21 columns and 10 rows for Scenario 2. To clarify those plans, the set in equation (7) was rewritten with tilt angles and the corresponding number of images (Table 1), because the images were uniformly distributed at each tilt angle. Typically, the tilt angle starts from 0°, where the skyline lies. However, the tilt angle of the plans by HDRpano in Scenarios 1 and 2 started from −30°, so the proposed plans were adjusted to be consistent. Scenario 3 used a different camera system that HDRpano does not support, so only the proposed plan was evaluated. In the first experiment, a simulation, the original images were rendered from an existing panorama, and errors related to the accuracy of the gimbal were added. In the second experiment, a field test, images were captured by the drone. Afterwards, the images were stitched by the software PTGui [28] with default settings and no manual intervention. The panoramic image stitching results were evaluated by analyzing their completeness; i.e., a qualified panorama should not have any holes or missing parts. The completeness of the sky was not evaluated, because the sky has too few features to be stitched and is often postprocessed. Thus, the completeness was calculated from the ground pixel count and the hole pixel count.

3.1. Simulation Case

To avoid the influence of the platform, all scenarios were simulated before the field experiment. The synthetic images were rendered from an existing panorama shown in Figure 3. The images were back-projected from the spherical panorama, given the orientation (α, β) of each image. It was assumed that the orientation errors follow a Gaussian distribution. To simulate the control accuracy and interference of the gimbal, errors were added to each orientation; we selected 0.017° for roll and pitch and 0.667° for yaw. Images were then rendered according to the plan and input into the stitching software. The completeness and number of images of each scenario are listed in Table 2.
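The error injection described above could be sketched as follows. The sketch assumes that the quoted values act as standard deviations of zero-mean Gaussian errors, that pan corresponds to yaw and tilt to pitch, and that roll is perturbed around zero; the function name `perturb_orientation` and its parameters are hypothetical.

```python
import random


def perturb_orientation(pan_deg, tilt_deg, rng,
                        sigma_roll_pitch=0.017, sigma_yaw=0.667):
    """Return (pan, tilt, roll) in degrees with Gaussian errors added.
    Assumption: the sigmas are standard deviations in degrees."""
    pan = rng.gauss(pan_deg, sigma_yaw)            # yaw error on the pan angle
    tilt = rng.gauss(tilt_deg, sigma_roll_pitch)   # pitch error on the tilt angle
    roll = rng.gauss(0.0, sigma_roll_pitch)        # roll error around level
    return pan, tilt, roll


# A seeded generator keeps the simulated errors repeatable.
rng = random.Random(42)
noisy = [perturb_orientation(pan, 30.0, rng) for pan in (0.0, 45.0, 90.0)]
```

Passing the generator explicitly makes it easy to rerun the same simulated flight with identical errors when comparing two acquisition plans.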

Considering the number of images of both plans in all scenarios, the proposed plan needs three fewer images in Scenario 1 and six fewer images in Scenario 2. As shown in Table 2, the proposed plan resulted in 100% completeness in all three scenarios, while the HDRpano plan was incomplete in Scenario 2. Although its completeness was 99.97%, there were two holes. Specifically, Figure 4 shows the stitched panorama and zoomed-in views from the HDRpano plan in Scenario 2. Red lines indicate stitching seams. The empty space in the upper halves of the panoramas represents the sky, which was not captured or stitched. The two holes are located between the third-to-last row and the second-to-last row. As can be seen from Table 1, there are four images at tilt angle 78° in the HDRpano plan, while there are five images at the same tilt angle in the proposed plan. The additional image used by the proposed plan made the stitching seamless.

3.2. Field Case

In the field case, all plans in the three scenarios were evaluated. Two different platforms were selected to evaluate the panorama acquisition plans. The platform used in Scenarios 1 and 2 was a DJI Inspire 2 quadcopter with an X5s camera gimbal (M4/3 system); in Scenario 3, a custom ArduPilot [29] hexacopter with a three-axis gimbal carrying a SONY A6000 (APS-C) camera was used. The drone automatically captured images according to the plans described in Table 1. Scenarios 1 and 2 were in urban areas, while Scenario 3 had a more natural landscape. The flight altitude was 100 m in Scenarios 1 and 2 and 30 m in Scenario 3. Images were then stitched by PTGui. The stitching results are shown in Figure 5, and the completeness of each plan is shown in Table 3.

In Figure 5, the sky in the upper halves of the panoramas was not captured or stitched. Note that the image arrangements of all plans are not in a straight line. This phenomenon does not affect the completeness and can be fixed by postprocessing. It may be attributed to interference from airflow and the Earth's magnetic field, which made the drone and the gimbal unstable. In Scenario 1, both plans achieved 100% completeness. The proposed plan requires one fewer image at each of three different tilt angles. The object in the upper right of Figures 5(a) and 5(b) is the drone itself. Both panoramas in Scenario 2 exceeded 3.8 gigapixels (Table 3). The slight difference in the overall numbers of pixels in the panoramas may be caused by the stitching software. The proposed plan was 100% complete, while the HDRpano plan had a completeness of 94.13%. The incompleteness occurred in the first row of Figure 5(d), where three images were missing owing to the lack of tie points. The tilt angles around the skyline were −5° and 7° for the proposed plan and −6° and 6° for the HDRpano plan (Table 1). The slight difference between those tilt angles may have caused the insufficient overlap. In Scenario 3 (Figure 5(e)), the panorama stitched from 49 images of the SONY A6000 camera contained about 1.5 gigapixels and reached 100% completeness.

Figure 6 shows 100% zoom views of the gigapixel panoramas planned by the proposed method in Scenarios 2 and 3. In Figure 6(a), numbers 1, 2, and 3 indicate three zoomed targets at distances of 290, 490, and 1900 meters, respectively. With a lens of 90 mm equivalent focal length mounted on the camera, the spatial resolutions for these targets were 0.10, 0.18, and 0.70 cm. The windows of the buildings in the enlarged views of Figure 6(a) can be identified. Notice that the No. 3 enlarged view contains the skyline, in which the outline and the color of buildings are clear as well. Because of its high spatial resolution, the gigapixel panorama can be used in a wider range of applications, including surveillance. In Figure 6(b), three numbered areas show 100% zoom views of the panorama of Scenario 3. The spatial resolution for these areas was about 0.22 cm, which is sufficient for detecting damage on the cliff and the construction site.

4. Conclusions

In this paper, a gigapixel panoramic image acquisition planning method for UAVs is proposed. This method is based on a ray casting overlap constraint, which ensures panoramic stitching with sufficient overlap between images. During generation of the panoramic acquisition plan, pan and tilt angles are considered separately; the resulting plan can be executed by three-axis gimbals mounted on multirotor UAVs. The proposed method was compared with existing panoramic acquisition plans in different scenarios with equivalent focal lengths ranging from 30 to 90 mm. Results showed that the proposed method could acquire complete and seamless gigapixel panoramas with fewer images than existing plans.

The novelty of the proposed method is that the ray casting overlap is closely related to the panoramic image stitching process. A relatively low overlap (15%) is sufficient for panoramic stitching. In addition, the proposed method is widely adaptable. It can generate panoramic acquisition plans for arbitrary perspective cameras and lenses. Its application is not limited to drones; it can also support the planning of panoramic gimbals mounted on the ground or manual acquisition. The panorama captured by the proposed method with a 90 mm equivalent focal length contains over 3.8 gigapixels. Moreover, the spatial resolution on the buildings near the skyline is finer than 1 cm. This type of panorama contains more information and has a wider range of applications in surveillance. In the proposed method, the horizontal and vertical overlap constraints were input parameters. Future work will focus on determining optimal overlap constraints to further increase the efficiency of panoramic image acquisition.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (grant No. 41771481) and the National Key R&D Program of China (grant No. 2018YFF0215304). The authors would like to thank Mr. Chen’guang Xu and Mr. Doudou Zeng for their help with the experiment.