Abstract

Mobile augmented reality (AR) systems, which demand high computing performance, have recently become popular owing to the improved performance of smartphones. In particular, mobile AR that directly interacts with outdoor environments has been in development because of increasing interest in e-leisure due to improvements in living standards. Therefore, this paper studies tracking and augmentation in mobile AR for e-leisure. To examine the feasibility of human body tracking in mobile AR, we analyzed the performance of human body tracking applications implemented on a mobile system (smartphone) using three methods (marker-based, markerless, and sensor-based). Furthermore, game information augmentation was examined through the implementation of mobile AR using two methods (marker- and sensor-based).

1. Introduction

PC-based augmented reality (AR) is evolving into mobile AR with the popularization of smartphones equipped with high-resolution image sensors, GPS, and gyro sensors. In particular, mobile AR that directly interacts with outdoor environments, such as Pokémon Go [1], is attracting attention because of increasing interest in e-leisure due to improvements in living standards. e-Leisure, the digital leisure culture that includes e-sports, e-games, and interactive media, is becoming a favorite activity because it provides both sociocultural worth and enjoyment [2, 3]. Therefore, research into mobile AR for e-leisure (i.e., e-leisure mobile AR) is required. Through this research, e-leisure mobile AR that can augment game information for outdoor sports such as baseball, basketball, and soccer in real time is being developed [4–7].

Tracking points of interest (POI) in the real world and precisely coordinating game information augmentation have become increasingly important for e-leisure mobile AR, and various studies are underway. Among the studies on mobile AR-based object tracking, there is greater emphasis on those in which the tracked POI is a human body [8–13].

For realistic mobile AR, the precise coordination of real-world objects (human bodies in this study) and virtual data is processed in three steps: positioning, rendering, and merging. In the positioning step, virtual data (known as virtual content) are converted in accordance with the smartphone’s location. In the rendering step, 3D objects involving virtual data are projected into 2D images. Finally, in the merging step, the virtual data projected into 2D images are combined with the real world on the smartphone screen (viewport).
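The three steps can be illustrated with a minimal sketch. It assumes a translation-only camera pose and a pinhole projection, neither of which is specified in this paper; the function names, the focal length, and the character-grid "frame" are hypothetical and chosen only to keep the example self-contained.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def position(point_world: Vec3, camera_position: Vec3) -> Vec3:
    """Positioning: express a world point relative to the smartphone camera.

    Assumption: the camera is only translated, not rotated.
    """
    return tuple(p - c for p, c in zip(point_world, camera_position))


def render(point_camera: Vec3, focal_px: float,
           center: Tuple[float, float]) -> Tuple[int, int]:
    """Rendering: project a 3D camera-frame point to 2D pixel coordinates
    with a pinhole model (an assumed camera model)."""
    x, y, z = point_camera
    if z <= 0:
        raise ValueError("point is behind the camera")
    u = focal_px * x / z + center[0]
    v = focal_px * y / z + center[1]
    return round(u), round(v)


def merge(frame: List[List[str]], pixel: Tuple[int, int],
          symbol: str) -> List[List[str]]:
    """Merging: overlay the projected virtual data on a copy of the frame,
    skipping pixels that fall outside the viewport."""
    out = [row[:] for row in frame]
    u, v = pixel
    if 0 <= v < len(out) and 0 <= u < len(out[0]):
        out[v][u] = symbol
    return out
```

In a real application the positioning step would also account for the device orientation, and the merging step would blend rendered graphics into the camera image rather than write a single symbol.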

Coordinating these three steps requires tracking the 3D locations of POIs with precision. The POI can be tracked via widely used methods based on vision and sensors [8]. Vision-based methods are categorized into marker-based and markerless tracking methods.

Marker-based tracking methods (Figure 1) track color markers or markers of specific forms [8]. Wagner [9] proposed a marker pattern for mobile AR and compared its performance with that of mobile AR using conventional markers. Compared to markerless tracking methods, marker-based tracking methods have the advantages of high stability and ease of implementation, but they are inconvenient because a marker must be attached to the subject.

Markerless tracking (Figure 2) is a method of tracking a target’s natural features without attaching artificial markers [10]. Ziegler [11] evaluated markerless tracking methods suitable for mobile AR. Because it relies on the target’s features, markerless tracking has the advantage of recognizing the target under changes in rotation angle, direction, and lighting, as well as under partial overlap. However, markerless tracking generally requires more computation than marker-based tracking, so it cannot guarantee real-time performance in a smartphone environment, which has lower performance than a PC.

The sensor-based method (Figure 3) tracks a target using various sensors, including magnetic, inertial, optical, and mechanical sensors [8]. Tan [12] and Pryss [13] implemented real-time mobile AR using sensor-based tracking methods. Sensor-based tracking methods generally have the advantage of rapid processing speed. Their disadvantage is that physical problems such as sensor errors and communication delays affect tracking performance, so calibration and matching processes are required.

In this paper, we study human body tracking and game information augmentation in e-leisure mobile AR. We examine human body tracking by performance measurement experiments in mobile applications based on marker-based, markerless, and sensor-based methods. Game information augmentation is examined by the implementation of e-leisure mobile AR using marker- and sensor-based methods.

This paper is organized as follows. Section 2 describes the existing cases of tracking and augmentation in mobile AR for e-leisure. Section 3 presents the experiments and the results of human body tracking and game information augmentation for e-leisure mobile AR applications. Finally, Section 4 presents the paper’s conclusion.

2. Tracking and Augmentation in Mobile AR for e-Leisure

This section introduces conventional studies of human body tracking and game information augmentation methods that can be applied to e-leisure mobile AR. The e-leisure mobile AR is performed through the process of detection area decisions, human body tracking, and game information augmentation, as shown in Figure 4. In the detection area decision, the POI is set at the game information augmentation location, whereas human body tracking follows the POI’s movement based on the three methods (marker, markerless, and sensor) in real time. Finally, the game information augmentation process augments game information via the tracked POI.

Marker-based mobile AR can be implemented in various ways. Among them, the typical implementation method uses a square marker whose three-dimensional (3D) position and rotation are easy to recognize. Marker-based mobile AR implementation is composed of three stages, as shown in Figure 1: (a) acquiring the original image from the camera; (b) estimating the outline of the white boundary area from the connected components; and (c) augmenting the fine location of virtual data for marker patterns using extracted edges and corners.

Tracking human bodies using markers requires attaching the markers to all targets, which is impractical. To overcome this disadvantage, studies have been conducted on using the environmental information of real-world data instead of markers [15]. Markerless tracking methods exploit this environmental information by using the feature data of natural objects in the real world.

Markerless mobile AR implementation methods include Scale-Invariant Feature Transform (SIFT) [13] and Histogram of Oriented Gradients (HOG) [16]. SIFT compares the feature points of objects with those of the original image. Even if an object is partially covered, as shown in Figure 2(a), it can still be identified through its feature points. HOG was proposed for tracking pedestrians using detected features such as color brightness or the directional distribution of an area. This method is suitable if a target has been rotated or if the inner pattern is simple and the object can be identified from the target’s outline. Figure 2(b) shows the results of detecting human bodies with multiple targets using HOG.
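The "directional distribution" that HOG relies on can be illustrated with a simplified sketch that builds the gradient-orientation histogram of a single grayscale cell. This is not the implementation used in this paper (which relies on the OpenCV Library); the nine-bin unsigned histogram is a common choice assumed here, and bin interpolation and block normalization are deliberately omitted. The function name is hypothetical.

```python
import math
from typing import List


def cell_orientation_histogram(cell: List[List[float]],
                               bins: int = 9) -> List[float]:
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) over the interior pixels of one grayscale cell."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Clamp guards against an angle that rounds up to 180.
            index = min(int(angle // bin_width), bins - 1)
            hist[index] += magnitude
    return hist
```

A cell containing a vertical edge places all of its gradient energy in the first bin, whereas a horizontal edge places it in the middle bin; a full detector concatenates such histograms over many cells and feeds them to a classifier.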

Sensor-based mobile AR provides virtual data based on location data acquired through the GPS as shown in Figure 3 [17, 18]. Since sensor-based mobile AR uses location data, it can be easily expanded to various services by replacing the augmented virtual data.

Sensor-based mobile AR provides virtual data based on location data measured by the sensor. Sensor-based mobile AR implementation is performed in two steps: (1) measuring the location of the target through the sensor and (2) matching the sensor coordinate system to the image coordinate system.

A mismatch occurs if the matching error between the sensor coordinate system and the image coordinate system is large [19–24]. A typical example of the mismatch phenomenon is mobile AR using GPS. Because GPS-based mobile AR uses a two-dimensional (2D) coordinate system that consists of latitude and longitude, it does not provide height information between the target and the ground surface. Therefore, GPS-based mobile AR has the problem that the virtual data’s location in the image is inaccurate. Improving the accuracy of sensor-based mobile AR requires calibration to correct the mismatch. Recently, communication technologies have also started being used in this research. An example of a method based on communication networks is the Wi-Fi positioning system (WPS), which tracks the location of devices by searching for their Wi-Fi access point location [25].

Game information augmentation’s general aim is to provide information to users by either highlighting or annotating objects or humans [26]. Figure 5(a) shows an example [27] in which game information augmentation has been applied to a basketball game: the game information is augmented by highlighting the surroundings of the objects of interest using bright circles. Here, augmented game information includes the player number, team color, and play direction. Figure 5(b) shows an example [28] in which game information augmentation has been applied to a material management system. The marker attached to the object of interest provides material information to users via annotations, and the provided material information is used to augment the screen of the mobile system (smartpad) based on a marker that exists in the real world.

3. Experimental Results and Discussion

This section examines the human body tracking and game information augmentation methods for application to e-leisure mobile AR as shown in Table 1. Human body tracking methods are implemented in the first experiment, and the feasibility of applying such methods to mobile devices is examined based on performance measurement experiments. The second experiment examines the feasibility of applying game information visualization to mobile devices through the implementation of e-leisure mobile AR.

We examine the feasibility of human body tracking methods in mobile devices by comparing the performance of applications implemented using various human body tracking methods. Marker-based, markerless, and sensor-based human body tracking were implemented using color markers, HOG, and IMU sensors, respectively.

Three experiments were conducted with respect to three conditions (capability, resolution, and the number of people). The first condition used low-capability (Galaxy S4, 1.6 GHz CPU) and high-capability (Galaxy S6, 2.1 GHz CPU) smartphones to determine the effect of device capability on performance. The second condition used low-resolution (320 × 240) and high-resolution (960 × 720) input images (30 fps) to determine the effect of resolution. The third condition checked the effect of the number of people (single user versus multiple users).

In this experiment, marker-based human body tracking applications were implemented using uniforms as color markers [29, 30], as shown in Figure 6, because people in e-leisure (game and sports) environments often wear uniforms. The color marker method extracts the POI using the HSV color space and a threshold value. Here, the HSV color space was selected because it is robust against shadows and unequal illumination [29, 30]. The marker-based human body tracking application was implemented using the OpenCV Library connected to the Android software development kit (SDK) and native development kit (NDK).
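The color marker extraction can be sketched as follows. This is an illustrative example using only the Python standard library rather than the OpenCV-based Android implementation described in this paper; the hue range and the saturation and value thresholds are hypothetical values, and the function names are introduced only for this sketch.

```python
import colorsys
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # RGB, each channel 0-255


def hsv_mask(image: List[List[Pixel]], hue_range: Tuple[float, float],
             min_saturation: float = 0.5,
             min_value: float = 0.3) -> List[List[bool]]:
    """Mark pixels whose hue (in degrees) lies inside hue_range and whose
    saturation and value exceed the (assumed) thresholds."""
    low, high = hue_range
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            hue = h * 360.0
            # A range such as (340, 20) wraps around 0 degrees (red).
            if low <= high:
                in_hue = low <= hue <= high
            else:
                in_hue = hue >= low or hue <= high
            mask_row.append(in_hue and s >= min_saturation and v >= min_value)
        mask.append(mask_row)
    return mask


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of the marked pixels, or None."""
    hits = [(x, y) for y, row in enumerate(mask)
            for x, hit in enumerate(row) if hit]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return min(xs), min(ys), max(xs), max(ys)
```

Thresholding on hue while only requiring a minimum saturation and value is what makes the approach tolerant of brightness changes such as shadows: a shaded green uniform keeps roughly the same hue even though its value drops.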

Figure 6 shows the results of the color-marker-based human body tracking applications. Figure 6(a) displays the tracking result for a single user, and Figures 6(b)–6(e) display the tracking results of multiple users (two to five people). In the tracking results, a white rectangle indicates a tracked human body, and a group denotes the users identified as belonging together based on the color of the marker (red or green).

Markerless human body tracking requires many calculations. Therefore, the application in this section was implemented using the HOG algorithm, which has been applied to smartphones and is less complex than other markerless human body tracking methods. As with marker-based tracking, the application was implemented using the OpenCV Library combined with the Android SDK and NDK.

Figure 7 shows the results of markerless human body tracking applications. Figure 7(a) displays the tracking results for a single user, and Figure 7(b) displays the tracking results of multiple users (five). In the tracking results, a white rectangle indicates a tracked human body.

Experiments were conducted to measure the recognition rate of marker-based and markerless human body tracking methods in low- and high-resolution images. Because sensor-based human body tracking performance depends on the sensor’s accuracy, which is outside this paper’s scope, the recognition rate experiment was not performed for the sensor-based method. The recognition rate was measured by dividing the number of images that were successfully tracked by the total number of images.
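As a small illustration, the recognition rate can be computed as below, taking it as the share of images that were tracked successfully, which is the reading consistent with the percentages reported in Table 2. The function name and the sample counts in the usage note are hypothetical and are not the experimental data.

```python
def recognition_rate(tracked_images: int, total_images: int) -> float:
    """Fraction of images in which the human body was tracked successfully."""
    if total_images <= 0:
        raise ValueError("total_images must be positive")
    if not 0 <= tracked_images <= total_images:
        raise ValueError("tracked_images must lie between 0 and total_images")
    return tracked_images / total_images
```

For example, 48 successfully tracked images out of 50 would give a recognition rate of 0.96, that is, 96%.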

Table 2 shows the recognition rates of the marker-based and markerless human body tracking methods for low- and high-resolution images. The recognition rate of the marker-based human body tracking method was 96% for both high- and low-resolution images. These results confirm that the recognition rate can be maintained at a low resolution when using a large marker such as a uniform. The recognition rate of the markerless human body tracking method was 82% at high resolution and 62% at low resolution, which confirms that the recognition rate in markerless human body tracking increases with resolution.

Sensor-based human body tracking is implemented in three steps using an IMU, a server, and a smartphone, as shown in Figure 8. In the first step, the IMU attached to the smartphone transmits the measured 3D position data to the server. In general, IMU measurement data have a large cumulative error. In this experiment, measurement data with a small error were selected through repeated experiments in the same environment. In the second step, the server converts the 3D position data received from the IMUs for calibration into 2D position data of the smartphones to be displayed on the screen (via orthographic projection). Converting 3D position data based on a real-world coordinate system into 2D position data on the user screen requires synchronizing the coordinate systems of the 3D position data. In the implemented sensor-based human body tracking, the coordinate system origin of the 3D position data (the IMU’s initial position) was set to be the same for all devices, which synchronized the coordinate systems of the 3D position data. In the final step, the smartphone receives the converted 2D position data from the server.
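The conversion in the second step can be sketched as an orthographic projection relative to a shared origin. This is a minimal sketch under stated assumptions: the depth axis is taken to be the third coordinate, the shared origin is mapped to the screen center, and the scale factor (pixels per meter) is a hypothetical calibration value. None of these details are specified in this paper, and the function name is introduced only for illustration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def to_screen(position_m: Vec3, origin_m: Vec3, pixels_per_meter: float,
              screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """Orthographically project a 3D position (meters) to 2D screen pixels.

    The position is first expressed relative to the shared origin, the depth
    coordinate is dropped, and the result is scaled and centered on screen.
    """
    x = position_m[0] - origin_m[0]
    y = position_m[1] - origin_m[1]
    width, height = screen_size
    u = width / 2 + x * pixels_per_meter
    v = height / 2 - y * pixels_per_meter  # screen y grows downward
    return round(u), round(v)
```

Because every device subtracts the same origin before projecting, positions reported by different IMUs land in one common screen coordinate system, which is the purpose of the synchronization described above.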

Figure 9 displays the sensor-based human body tracking display. The implemented application augments the specific symbol (user identification number and white rectangle) by identifying the user as shown in Figures 9(a) and 9(b).

Table 3 shows the experimental results measuring the performance (FPS) of the human body tracking methods according to the smartphone’s performance, the image resolution, and the number of objects. The measurement results show that the FPS difference according to the smartphone performance and the image resolution was largest for the markerless method, followed by the marker-based and sensor-based methods. An increase in the number of tracked targets made little difference to the FPS in any of the three human body tracking methods. Considering the real-time performance criterion (over 15 fps), marker-based human body tracking at low resolution (over 35 fps on average), markerless human body tracking at low resolution (over 15 fps on average), and sensor-based human body tracking (60 fps or higher on average) could be applied to mobile AR. However, the markerless human body tracking method has a low recognition rate for low-resolution images (Table 2) and the disadvantage that the user cannot be identified. Therefore, we use marker- and sensor-based methods in the experiments that examine game information augmentation methods for e-leisure mobile AR.
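A frame rate of this kind can be derived from frame timestamps and compared against the real-time criterion, as in the following sketch. The function names are hypothetical, the measurement procedure actually used in the experiments is not described in this paper, and the timestamps in the usage note are synthetic rather than measured values.

```python
from typing import Sequence

REAL_TIME_FPS = 15.0  # real-time criterion used in this paper


def average_fps(timestamps_s: Sequence[float]) -> float:
    """Average frames per second from frame completion times in seconds."""
    if len(timestamps_s) < 2:
        raise ValueError("at least two timestamps are required")
    elapsed = timestamps_s[-1] - timestamps_s[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps delimit N - 1 frame intervals.
    return (len(timestamps_s) - 1) / elapsed


def is_real_time(fps: float, threshold: float = REAL_TIME_FPS) -> bool:
    """True if the measured frame rate exceeds the real-time criterion."""
    return fps > threshold
```

For instance, frames completed every 0.02 s correspond to about 50 fps and pass the criterion, whereas frames completed every 0.1 s correspond to about 10 fps and do not.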

We examined the feasibility of applying game information augmentation to mobile devices by implementing an e-leisure mobile AR using marker- and location-based human body tracking. The e-leisure mobile AR system developed for game information augmentation operates through the processes of detection area decision, human body tracking, and game information augmentation. The detection area was determined by categorizing outdoor sports into two groups (offense and defense) based on the actions involved and analyzing the main body parts used by each group, as shown in Figure 10 [31]. According to the analysis, the hand that uses an object such as a sword or ball was derived as the detection area for the offensive group, and the upper body, which carries equipment such as protective gear, was derived as the detection area for the defensive group.

Game information augmentation was achieved in detection areas, including the hands and upper body. Examples of game information augmentation in detection areas were analyzed using conventional contents. Figure 11 shows the selected game information for augmentation from conventional content. The detected game information augmentations for the hands and upper body are as shown in Figure 11(a) (fire, sword, and flag) and Figure 11(b) (target, status, and epaulet), respectively.

Figure 12 shows the overall operation of human body tracking and game information visualization on a mobile AR system through the processes of detection-area decision, human body tracking, and game information visualization. The user’s hands and upper body are set as POIs when deciding the detection area. During human body tracking, the human body is tracked using color markers and sensors. Finally, the game information is augmented for the user of interest among the various users.

As shown in Figure 13, the mobile AR system provides functions for selecting the HOI, visualizing game information, and visualizing the first-person shooter. The function for selecting the HOI selects the activation of user visualization (users 1, 2, and 3), and the game information visualization function activates virtual game symbols (target, status (heart), and epaulet) to the HOI.

Figure 14 shows the results of the game information augmented by the mobile AR. Figure 14(a) shows a screenshot of human body tracking in the marker-based mobile AR. Figure 14(b) shows the game information (target) augmented by identifying users in the marker-based mobile AR. Figure 14(c) shows a screenshot of human body tracking in the sensor-based mobile AR. Finally, Figure 14(d) shows a screenshot of game information (sword) augmented through user identification in the sensor-based mobile AR. Performance measurements of the implemented mobile AR on the high-capability smartphone at high resolution confirmed stable operation (real-time tracking and visualization were possible without breaks) at 20 fps (marker based) and 62 fps (location based).

4. Conclusions

This paper examines methods of human body tracking and game information augmentation in e-leisure mobile AR. The application of human body tracking was examined by implementing mobile AR via three methods. In the experimental results, marker-based human body tracking at low resolution (over 35 fps on average), markerless human body tracking at low resolution (over 15 fps on average), and sensor-based human body tracking (over 60 fps on average) met the real-time performance criterion (over 15 fps) and could therefore be applied to mobile AR. However, we confirmed that the markerless tracking method was not suitable for the e-leisure mobile AR environment (outdoor with multiple users) considering user identification and the dependence of the recognition rate on resolution.

Game information augmentation was examined through the implementation of marker- and location-based mobile AR systems. The performances of the marker- and location-based mobile AR systems were 20 and 62 fps, respectively, and the results confirmed that the human body tracking and game information visualization methods operated stably on the smartphone. Furthermore, we confirmed that the implemented system provides interactions among users that conventional mobile AR could not provide, and we derived a methodology for selecting the game information to be augmented. These results are expected to be helpful in the study of e-leisure mobile AR.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research is supported by the Ministry of Culture, Sports, and Tourism (MCST) and Korea Creative Content Agency (KOCCA) in the Culture Technology (CT) Research & Development Program.