Abstract

User positioning has been developed mainly around Global Positioning System (GPS) technology, and GPS-equipped cellular phones are already used as pedestrian navigation units. However, GPS cannot determine a position underground or indoors, where the transmitted signals do not reach. In addition, the position-specification precision of GPS, that is, its resolution, is on the order of several meters, which is deemed insufficient for pedestrians. In this study, we propose and evaluate a technique for locating a user’s 3D position by installing markers in the navigation space and detecting them in images taken with a cellular phone. Experiments verified the effectiveness and accuracy of the proposed method. Additionally, the positional precision improved when the position was measured using multiple markers.

1. Introduction

In recent years, cellular phones have been used for various applications that enhance daily life. Services that facilitate pedestrian navigation using cellular phones have attracted increasing attention. To develop pedestrian navigation, studies in various fields have examined topics such as information retrieval, searching for the most suitable course, and human interfaces [1–4]. In this paper, we propose a new technique for position specification.

User positioning has been developed primarily around Global Positioning System (GPS) technology, and GPS-equipped cellular phones are already used as pedestrian navigation units. However, GPS cannot determine a position underground or indoors, where the transmitted signals do not reach. In addition, the position-specification precision (resolution) of GPS is on the order of several meters, which has been deemed insufficient for pedestrians. Techniques based on radio-wave communication, such as GPS, represent the mainstream approach. However, the position-specification system for cellular phones desired by society needs to offer multimedia information and to function in both indoor and outdoor environments. M-CubITS [5] is available as a position-specification system that uses the camera function of a cellular phone, but that system must find many markers to specify a position, which makes it difficult to apply with low-resolution cellular phone cameras. The proposed system can specify a position by searching for fewer markers.

We propose a pedestrian navigation system that uses the TV telephone function of third-generation cellular phones, develop a position-specification system as an alternative to GPS, and test it. In the proposed system, we install markers uniformly in the navigation space and detect them in the video, thereby specifying the user’s position. In addition, by using the calibration markers of augmented reality technology as the installed markers, we enable position specification from a search for only a few markers. The power consumption of a TV telephone call is about 3 to 4 times that of a voice call. However, it can be suppressed to about 2 times by the development of a dedicated LSI [6]. We expect the system to become practical with future high-speed communication technology and high-speed server processing.

Furthermore, we improve the precision obtainable from the low-resolution video of a TV telephone and realize position-specification precision on the order of tens of centimeters. We validate the effectiveness of the proposed system by testing it with a cellular phone.

2. Theory

Figure 1 depicts the flow of the pedestrian navigation system proposed in this study. The corresponding steps are listed below.
(1) Take pictures of the scenery around the user using a TV telephone.
(2) Send the obtained video to a server.
(3) Extract the marker region and match it against the registered patterns (matching process).
(4) Specify the three-dimensional camera position using the matching result.
(5) Transmit the navigation map to the user.
(6) Display the map on the cellular phone screen.

Augmented reality technology superimposes a virtual computer graphics object on a marker in the real world [7–9]. We developed our system by applying its three-dimensional position calculation.

The server side is divided into two processes (Figure 2): matching, and position and orientation calculation. The first process extracts a marker region from an input image and determines the coordinates of its four vertices in the observation screen coordinate system. The marker search identifies each marker by pattern matching. The second process computes the three-dimensional position of the camera from the four vertex coordinates. Specifically, it estimates the coordinate transformation matrix from the camera coordinate system to the marker coordinate system. The three-dimensional position of the camera relative to each marker is obtained from this transformation matrix, which specifies the user position (Figure 3).
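The last step of this pipeline can be illustrated with a minimal sketch. It assumes that a rotation matrix and a translation vector for the camera-to-marker transformation have already been estimated, and that the marker axes are aligned with the map axes; the function names and the `marker_origin_on_map` parameter are hypothetical and do not come from the paper.

```python
def to_marker_frame(rotation, translation, point_cam):
    """Apply p_marker = R * p_cam + t (camera frame -> marker frame)."""
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def user_position(rotation, translation, marker_origin_on_map):
    """Camera (user) position on the map.

    Assumption: the marker axes are aligned with the map axes, so the
    camera position in the marker frame is simply added to the known
    installation position of the marker.
    """
    # The camera sits at the origin of its own coordinate system.
    cam_in_marker = to_marker_frame(rotation, translation, (0.0, 0.0, 0.0))
    return tuple(m + c for m, c in zip(marker_origin_on_map, cam_in_marker))
```

With an identity rotation, the camera position in the marker frame is just the translation component, which is why the translation estimate matters most for the final position.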

2.1. System Constitution and Coordinate System
2.1.1. System Constitution

For this study, we used the calibration markers of augmented reality technology, in which a white identification pattern is placed in the center of a black square. The pattern design differs for each marker, so that no marker coincides with another marker under rotation. Therefore, these markers enable us to obtain the three-dimensional camera position.

A digital camera (Cyber-shot DSC-700; Sony Corp.) was used to obtain still images of resolution at a 5.2 mm focal distance without compression. A cellular phone (SH901; NTT DoCoMo Inc.) was used to record video in 3GP format ( pixels), with MPEG-4 video encoding and a frame rate of 15 frames per second (fps).

We performed camera calibration [10] before conducting the experiments with the digital camera and the cellular phone, and input the resulting camera parameters (focal length, principal point, and distortion coefficients) into the program as a perspective transformation matrix.

2.1.2. Coordinate Systems

Figure 4 shows the coordinate systems used in this system. The camera coordinate system has its origin at the focal position; one axis is perpendicular to the image plane, and the other two axes are parallel to the horizontal and vertical axes of the image.

The marker coordinate system has its origin at the marker center; one axis is perpendicular to the marker plane, and the other two axes lie in the marker plane.

A rotation and a translation convert a point expressed in the camera coordinate system into the corresponding point in the marker coordinate system.

We refer to the image plane onto which the scene is projected by the perspective transformation model as the observation screen coordinate system. A point in this coordinate system is expressed by two coordinate values.

2.2. Marker Detection Processing
2.2.1. Preprocessing

The input image is binarized using a threshold, and the area and circumscribed rectangle of every connected component are calculated [11].

2.2.2. Marker Extraction

Using the area value, very large and very small regions are excluded; connected components touching the image border are also excluded using the circumscribed rectangle information. For the remaining connected components, we traced the contour, recorded all pixel positions on it, and approximated the contour data by a polygonal line. A connected component that is approximated with sufficient precision by four line segments is taken as a marker candidate, and the coordinates of its four corner points are recorded at this time [12, 13].

2.2.3. Marker Identification

Template matching with patterns registered beforehand was performed on the identification pattern drawn in the central part of a marker to identify the marker [14]. For this purpose, the image is first normalized. The marker is projected onto the observation screen coordinate system by perspective transformation. A point on the marker plane in the marker coordinate system (a point whose third coordinate is 0) is converted into a point in the observation screen coordinate system by the expressions given below.

Because the physical marker size is known, all values of C in expression (1) can be obtained from this size information and the coordinates of the four vertices detected by marker extraction. The internal pattern of the marker is then normalized using these values. For template matching, we prepared four templates for each pattern to handle rotation of the marker in 90-degree steps and calculated the similarity of each to the input image. The template with the maximum similarity determines the type of the marker and its direction.
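The four-orientation matching step can be sketched as follows. The similarity measure of the original system is not recoverable from the text, so this sketch substitutes normalized cross-correlation as an assumption; the function names and the dictionary of templates are hypothetical, and the patterns are plain nested lists of pixel values.

```python
import math


def rotate90(pattern):
    """Rotate a square pattern 90 degrees clockwise."""
    return [list(row) for row in zip(*pattern[::-1])]


def similarity(a, b):
    """Normalized cross-correlation of two equally sized patterns (assumed measure)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb))
    return num / den if den else 0.0


def identify(patch, templates):
    """Return (marker name, rotation in degrees, score) of the best match.

    Each registered template is compared in its four 90-degree orientations.
    """
    best = (None, None, -2.0)
    for name, template in templates.items():
        rotated = template
        for k in range(4):
            score = similarity(patch, rotated)
            if score > best[2]:
                best = (name, 90 * k, score)
            rotated = rotate90(rotated)
    return best
```

Because every marker pattern is designed to differ from the others under rotation, the best score identifies the marker and its orientation at the same time.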

2.2.4. Vertex Position Search

Using the contour data that correspond to each side of the marker, a straight line is fitted by the least-squares method. The intersections of these straight lines are taken as the vertex coordinates.
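A minimal sketch of this vertex search is given below. Because marker sides can be nearly vertical in the image, the sketch uses orthogonal (total) least squares rather than regression of y on x; this choice, the function names, and the implicit line form a·x + b·y + c = 0 are assumptions made for illustration.

```python
import math


def fit_line(points):
    """Fit a*x + b*y + c = 0 to contour points by orthogonal least squares."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Angle of the principal axis of the point cloud (the line direction).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    a, b = -math.sin(theta), math.cos(theta)  # unit normal of the line
    return a, b, -(a * mx + b * my)


def intersect(line1, line2):
    """Intersection of two lines given as (a, b, c); a marker vertex."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel")
    return ((b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det)
```

Fitting each side from many contour pixels and intersecting the fitted lines gives sub-pixel vertex estimates, which is helpful when the input video has low resolution.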

2.3. Three-Dimensional Position Estimation of the Camera

We estimate the transformation matrix from the camera coordinate system to the marker coordinate system. This transformation matrix consists of a rotation component and a translation component. The observation screen coordinate system and the camera coordinate system are related by the perspective transformation, whose matrix can be found through prior calibration. We denote the relations of these coordinate systems in the following expressions:

We use the coordinate values of the four marker vertices in the marker coordinate system and their coordinate values in the observation screen coordinate system provided by the preceding processing to estimate the transformation matrix in expression (2). The procedure is described in detail below.

2.3.1. Estimation of the Rotation Component

Expression (4) gives the equations of the straight lines through two opposite sides of the marker, obtained from the vertex positions of the marker in the observation screen coordinate system. Expression (5) is obtained by substituting the screen coordinates of expression (3) into expression (4). Expression (5) is the equation of a plane in the three-dimensional space expressed by the camera coordinate system, and the corresponding side of the marker in three-dimensional space lies in this plane.

Because two opposite sides of the marker are parallel, their direction vectors coincide. This common direction lies in both planes of expression (5). In other words, the vector calculated as the cross product of the normal vectors of the two planes of expression (5) is the direction vector of the two parallel sides in the camera coordinate system.
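This step reduces to one cross product, as the following sketch shows. The function names are hypothetical; the plane normals would come from the coefficients of the two plane equations of expression (5).

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def side_direction(normal1, normal2):
    """Unit direction shared by two parallel marker sides.

    Each side lies in a plane through the camera origin; a direction that
    lies in both planes is perpendicular to both plane normals.
    """
    d = cross(normal1, normal2)
    length = math.sqrt(sum(c * c for c in d))
    if length == 0.0:
        raise ValueError("the two planes coincide")
    return tuple(c / length for c in d)
```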

We can obtain the direction vectors of two adjacent sides of the marker by carrying out this calculation for both pairs of parallel sides. These two vectors are ideally perpendicular but in practice are not, because of measurement error.

Therefore, from the two measured direction vectors, we calculate two unit vectors that are perpendicular to each other in the plane containing the measured vectors, and we use these instead. In addition, the unit vector perpendicular to the marker plane is obtained by calculating the cross product of these two unit vectors. Together, these three vectors form the rotation component from the marker coordinate system to the camera coordinate system. Because we seek the rotation component from the camera coordinate system to the marker coordinate system, we take the transpose of this matrix and use it in the subsequent calculation.
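One way to carry out this correction is sketched below. It spreads the error symmetrically over both vectors by working with their sum and difference, which are always perpendicular for unit vectors; the paper does not state which construction was used, so this is an assumption, and all names are hypothetical.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def orthonormal_frame(u1, u2):
    """Replace two nearly perpendicular directions by an orthonormal triple."""
    u1, u2 = normalize(u1), normalize(u2)
    s = normalize(tuple(a + b for a, b in zip(u1, u2)))  # bisector
    d = normalize(tuple(a - b for a, b in zip(u1, u2)))  # perpendicular to s
    k = 1.0 / math.sqrt(2.0)
    v1 = tuple(k * (a + b) for a, b in zip(s, d))  # close to u1
    v2 = tuple(k * (a - b) for a, b in zip(s, d))  # close to u2
    v3 = cross(v1, v2)                             # normal of the marker plane
    return v1, v2, v3


def rotation_camera_to_marker(v1, v2, v3):
    """Transpose of the marker-to-camera rotation whose columns are v1, v2, v3."""
    return [list(v1), list(v2), list(v3)]
```

Since the marker-to-camera rotation has the three vectors as columns, its transpose simply has them as rows.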

2.3.2. Estimation of the Translation Component

When we combine expressions (2) and (3) and substitute the coordinate values of the four marker vertices in the marker coordinate system and their coordinate values in the observation screen coordinate system, eight linear equations in the three translation components are obtained.

Because the perspective transformation matrix and the rotation matrix are known, the three translation components can be calculated from these equations.
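The eight equations form an overdetermined linear system that can be solved in the least-squares sense. The sketch below assumes a simple pinhole model (focal length `f`, principal point `cx`, `cy`) and a transformation written in the marker-to-camera direction, p_c = R p_m + t; both are simplifying assumptions for illustration rather than the paper's exact formulation, and all names are hypothetical.

```python
def project(R, t, pm, f, cx, cy):
    """Pinhole projection of a marker-frame point into the image."""
    xc, yc, zc = (sum(R[i][j] * pm[j] for j in range(3)) + t[i] for i in range(3))
    return (f * xc / zc + cx, f * yc / zc + cy)


def solve3(M, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, 3):
            k = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= k * A[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (A[i][3] - sum(A[i][j] * x[j] for j in range(i + 1, 3))) / A[i][i]
    return x


def estimate_translation(R, marker_pts, image_pts, f, cx, cy):
    """Least-squares translation from four vertices (8 equations, 3 unknowns)."""
    rows, rhs = [], []
    for pm, (u, v) in zip(marker_pts, image_pts):
        r = [sum(R[i][j] * pm[j] for j in range(3)) for i in range(3)]
        du, dv = u - cx, v - cy
        # From du * (r_z + t_z) = f * (r_x + t_x), and likewise for dv.
        rows.append([f, 0.0, -du])
        rhs.append(du * r[2] - f * r[0])
        rows.append([0.0, f, -dv])
        rhs.append(dv * r[2] - f * r[1])
    # Normal equations: (A^T A) t = A^T b.
    AtA = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(row[i] * b for row, b in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)
```

With noise-free vertices the eight equations are consistent and the true translation is recovered; with measured vertices the solution minimizes the squared residual of the linearized equations.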

2.3.3. Refinement of the Transformation Matrix

The transformation matrix is calculated as described above, but a large error can occur in the calculation of the rotation matrix. Therefore, we use the image information again to refine the rotation matrix. In expression (2), the rotation matrix is expressed by nine parameters, but it can be expressed by three rotation angles about the coordinate axes, and each angle can be obtained from the rotation matrix. Using the transformation matrix obtained so far, we can calculate observation screen coordinate values by substituting the coordinate values of the four marker vertices into expressions (2) and (3). We revise the values of the three angles so as to reduce the sum of squared errors between these calculated values and the values actually obtained by image processing. Specifically, we obtain a new rotation component through ten iterations of the hill-climbing method. Furthermore, we estimate the translation component again and update it.
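The refinement loop can be sketched as a coordinate-wise hill climb over the three angles. The Z-Y-X angle convention, the pinhole projection, the initial step size, and the step-halving rule are all assumptions chosen for illustration; only the idea of ten hill-climbing iterations on the reprojection error comes from the text.

```python
import math


def rotation_from_angles(a, b, c):
    """R = Rz(c) * Ry(b) * Rx(a) (assumed angle convention)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [cc * cb, cc * sb * sa - sc * ca, cc * sb * ca + sc * sa],
        [sc * cb, sc * sb * sa + cc * ca, sc * sb * ca - cc * sa],
        [-sb, cb * sa, cb * ca],
    ]


def project(R, t, pm, f, cx, cy):
    xc, yc, zc = (sum(R[i][j] * pm[j] for j in range(3)) + t[i] for i in range(3))
    return (f * xc / zc + cx, f * yc / zc + cy)


def reprojection_error(angles, t, marker_pts, image_pts, f, cx, cy):
    """Sum of squared distances between projected and detected vertices."""
    R = rotation_from_angles(*angles)
    err = 0.0
    for pm, (u, v) in zip(marker_pts, image_pts):
        pu, pv = project(R, t, pm, f, cx, cy)
        err += (pu - u) ** 2 + (pv - v) ** 2
    return err


def refine_angles(angles, t, marker_pts, image_pts, f, cx, cy,
                  step=0.05, iterations=10):
    """Hill climbing: accept a change of one angle only if the error decreases."""
    best = list(angles)
    best_err = reprojection_error(best, t, marker_pts, image_pts, f, cx, cy)
    for _ in range(iterations):
        improved = False
        for i in range(3):
            for delta in (step, -step):
                trial = best[:]
                trial[i] += delta
                err = reprojection_error(trial, t, marker_pts, image_pts, f, cx, cy)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2.0  # assumed rule: shrink the step when stuck
    return best, best_err
```

After the angles are refined, the translation would be re-estimated with the new rotation, as described in the paragraph above.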

3. Experiments

For this study, we performed four experiments to inspect the position-specification precision of the pedestrian navigation system that we developed. We prepared markers of 8 × 8 cm, 50 × 50 cm, and 77 × 77 cm. We added a white frame around the calibration markers to give the markers high contrast.

3.1. Experiment of Accuracy of Depth Direction and Side Direction Measurement
3.1.1. Experimental Method

We performed an experiment to inspect the measurement precision of the position-specification system in the depth direction and the side direction. The experimental method is described below.
(1) Fix the camera position and take still images.
(2) Use an 8 × 8 cm marker.
(3) Capture each image twice and take the average of the measurement values.
(4) To inspect the depth direction, keep the marker in front of the camera, move it in the depth direction at 10 cm intervals, and capture images.
(5) To inspect the side direction, fix the depth at 50 cm from the camera, move the marker sideways at 5 cm intervals, and capture images.

We filmed each situation twice because both the pattern identification and the transformation matrix estimation can fail. The proposed system uses video as input. Therefore, marker identification errors and transformation matrix calculation errors can be reduced by calculating from multiple frames. Accordingly, we limited the error by using multiple captured images. Figure 5 shows an example of an image that we filmed for this experiment.

3.1.2. Results and Considerations

The experimental results for the accuracy of the depth direction measurement are shown in Figure 6, and those for the accuracy of the horizontal direction measurement are shown in Figure 7. These figures show almost equivalent slopes. In Figure 6, the measured values agree with the ideal (desired) values to within a position-specification precision on the order of tens of centimeters. In addition, the precision is better when the marker is at a short distance than when it is at a long distance, because the dispersion of the measurement values is smaller. The limit distance of marker identification in the depth direction is approximately 100 cm. Figure 7 shows that as the marker moves sideways from the camera center, the error from the ideal value increases.

The calculated side-direction value includes a larger error as the marker moves away from the camera center, which is thought to occur because calculation errors accumulate.

3.2. Experiment of Marker Size Inspection
3.2.1. Experimental Method

We used a larger marker and performed an experiment similar to the previous experiment on depth direction measurement precision. This allows us to examine the relationship among three quantities: camera resolution, marker size, and the distance at which a marker can be identified. The experimental method is as follows.
(1) Fix the camera position and capture still images.
(2) Use 10 markers of 77 × 77 cm and 8 × 8 cm.
(3) Take two images and take the average of the measurement values.
(4) Fix a marker in front of the camera, move the camera in the depth direction at 100 cm intervals, and capture the images.

In the previous experiment, we fixed the camera position and moved the marker; in this experiment, we fixed the marker position and moved the camera. The illumination condition was therefore not the same for every image; the brightness differences according to location are thought to influence the measurement results. Figure 8 shows an example image that we filmed during the experiment.

3.2.2. Results and Considerations

The experimental results of the marker size verification are shown in Figure 9. The measurement values show the same tendency as in the experiment on depth direction measurement precision. Enlarging the marker appears to improve precision. In addition, the limit distance of marker identification in the depth direction was 1200 cm, with the marker enlarged approximately 10 times, at a precision similar to that obtained at short distances. These results show that although the identifiable distance with the original calibration marker is as short as 100 cm, it can be extended by enlarging the marker, and they verify that the markers can be used even at long distances. In this experiment, only the depth direction was measured; the measurement in the horizontal direction was omitted.

In a subsequent marker-size verification experiment, we photographed images as in the experiments in Section 3.1.1 at 640 × 480 resolution and obtained similar results. Therefore, future improvements of camera resolution will make it possible to use markers small enough to be installed inside a small navigation space. These results also imply that improvement of position-specification precision is possible.

The size of the marker depends on the resolution of the camera, as confirmed by the experiment. A marker 50 cm in size was necessary because a camera of 300,000 pixels was used. If a camera with ten times the pixel count is used, the area of the marker becomes 1/10. As the resolution of the camera increases, the problem of the size of the marker decreases.
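The scaling argument can be made concrete with a short calculation. The sketch assumes that identification needs a fixed number of image pixels on the marker, so the required marker area is inversely proportional to the pixel count and the side length to its square root; the function name is hypothetical.

```python
import math


def required_marker_side(base_side_cm, base_pixels, new_pixels):
    """Side length needed with a different camera, assuming the marker
    must cover the same number of image pixels (area scales as 1/pixels)."""
    return base_side_cm * math.sqrt(base_pixels / new_pixels)


# A 50 cm marker with a 300,000-pixel camera, moved to a 3,000,000-pixel camera.
side = required_marker_side(50.0, 300_000, 3_000_000)
area_ratio = (side / 50.0) ** 2
print(f"side = {side:.1f} cm, area ratio = {area_ratio:.2f}")
```

Under this assumption, a tenfold pixel count reduces the marker area to 1/10, which corresponds to a side of roughly 16 cm instead of 50 cm.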

3.3. Walking Simulation Experiment
3.3.1. Experimental Method

Based on the experimental results described above, multiple markers were installed in the navigation space, and an experiment simulating a walking user was performed. The experimental method is described below.
(1) Fix the marker positions and obtain still images.
(2) Use markers of size 8 × 8 cm.
(3) Assume a 60 cm wide passage, with markers arranged every 40 cm in the depth direction on both sides of the passage.
(4) Assume straight walking along the passage center: while moving along the center of the passage in the depth direction, capture camera images twice every 25 cm and take the average.
(5) Calculate the relative camera position from every successfully identified marker, and take the point of intersection as the measurement result.

The passage in the experiment represents a hallway approximately 200–300 cm wide in a real building at a scale of approximately 1 : 4. Figure 10 shows an example of an image that we filmed during the experiment.

3.3.2. Results and Considerations

First, we draw a straight line in the camera direction from each marker; the intersection of these lines identifies the respective photography point. Figures 11 and 12 show the walking course results and the gap in the depth direction when the intersection of the straight lines is used as the position of the user. Both figures show that many of the measurement results deviate from the center of the passage, and the difference from the desired values is large because of the influence of imprecise measurement values. With the technique that uses the intersection of straight lines, the precise measurement values from short-range markers are affected by the measurement values from long-range markers, whose precision is poor, thereby causing the large error in this experiment.

Next, we devised a technique that weights each measurement value by its depth-direction value. A chart of the depth-direction weighting is shown in Figure 13. A circle is drawn around each measurement value, and its radius changes with the measured depth-direction value, so that the area of the circle reflects the depth. When the territories demarcated by the circles of two or more measurement values overlap, the center of the overlapping territory is used as the measurement result. When only one marker can be identified, or when no overlapping territory exists, the measurement value of a successfully identified marker is used as the measurement result. The walking route results with depth-direction weighting and the corresponding depth-direction gap are shown in Figures 14 and 15. These two figures show an improvement over the straight-line intersection technique, with an error of about 10 cm in both the cross direction and the depth direction; thus, the technique can guarantee position-specification precision of several tens of centimeters. We show the sample variance as a statistical error. The sample variance was 8.33. Differences between the actual measurement values and desired values are shown in Table 1. From this table, it can be verified that no great improvement is obtained with depth direction weighting.
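For two markers, the circle-overlap rule can be sketched as follows. Several details are assumptions not stated in the text: the radius is taken as proportional to the measured depth (factor `k`), the "center" of the overlapping territory is taken as the midpoint of the overlap along the line joining the two circle centers, and when the circles do not overlap the estimate from the nearer marker is kept. All names are hypothetical.

```python
import math


def fuse_two(est1, est2, k=0.1):
    """Fuse two position estimates given as (x, z, depth).

    Each estimate is surrounded by a circle of radius k * depth, so
    short-range (more precise) estimates get smaller circles.
    """
    (x1, z1, d1), (x2, z2, d2) = est1, est2
    r1, r2 = k * d1, k * d2
    dist = math.hypot(x2 - x1, z2 - z1)
    if dist >= r1 + r2:
        # No overlapping territory: keep the estimate from the nearer marker.
        return (x1, z1) if d1 <= d2 else (x2, z2)
    if dist == 0.0:
        return (x1, z1)
    # Overlap interval measured along the line from center 1 to center 2.
    lo = max(-r1, dist - r2)
    hi = min(r1, dist + r2)
    s = 0.5 * (lo + hi)
    ux, uz = (x2 - x1) / dist, (z2 - z1) / dist
    return (x1 + s * ux, z1 + s * uz)
```

Because the circle of the short-range estimate is smaller, the fused position stays closer to it, which limits the influence of imprecise long-range measurements.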

3.4. Actual Machine Walking Experiment
3.4.1. Experiment Contents

Using a cellular phone, we performed an actual walking experiment. We examined the relationship between the camera resolution of the cellular phone and the position-specification precision of the proposed system. The experimental method is described below.
(1) Video taken while walking is obtained using a cellular phone and transmitted to the server.
(2) In a 200 cm wide passage, 50 × 50 cm markers are placed alternately on both sides of the passage at 150 cm depth intervals.
(3) The user films toward the center of the passage while moving straight down the passage.
(4) A server transmits the map of the current position of the user to another cellular phone.

The map is drawn by averaging the position information calculated from the video frames within a fixed interval and plotting only the present position.
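The averaging step can be sketched as below. The window of 15 frames (one second at 15 fps) and the use of `None` for frames in which no marker was identified are assumptions made for illustration; the function name is hypothetical.

```python
def averaged_positions(frame_positions, window=15):
    """Average per-frame (x, z) positions over consecutive fixed-size windows.

    Frames in which marker identification failed are given as None and
    are skipped; a window with no valid frame produces no position.
    """
    averaged = []
    for start in range(0, len(frame_positions), window):
        valid = [p for p in frame_positions[start:start + window] if p is not None]
        if valid:
            n = len(valid)
            averaged.append((sum(p[0] for p in valid) / n,
                             sum(p[1] for p in valid) / n))
    return averaged
```

Only the most recent element of the returned list would be plotted on the map as the present position.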

3.4.2. Results and Considerations

Figure 16 shows the screen of the cellular phone during the actual machine walking experiment. The results show that it was possible to specify the position with a precision of several tens of centimeters, even from video recorded with the cellular phone. Thus, we verified the effectiveness of the proposed system. However, because the current camera resolution of cellular phones is low, markers of the size used in this experiment are considered necessary to specify the position. For that reason, higher camera resolution of cellular phones will allow reduction of the marker size and improvement of the position-specification precision.

4. Conclusion

In this study, we described a pedestrian navigation system that uses a TV telephone function; we developed a system to specify the user position and performed experiments to verify the system. The position-specification system examined in this study consists of two processes: marker detection and three-dimensional position estimation of the camera. Marker detection first extracts the marker regions, then normalizes each region and identifies the marker by template matching. In the three-dimensional position estimation of the camera, a coordinate transformation matrix is calculated from the four vertex coordinates of the marker region, and the camera position relative to each detected marker is obtained. The cost of the communication is the call charge of the TV telephone. Most of the processing time is communication delay, because a high-speed computer is used for the server. This delay time depends on the resolution of the image. With the current technology, it is almost possible to operate in real time.

To verify the effectiveness of the developed system, we performed four experiments. Using the results of the cross direction and depth direction measurement precision experiments, we verified that the position-specification precision is several tens of centimeters. In the marker size experiment, to verify the relationship among three quantities (camera resolution, marker size, and marker identifiable distance), we used larger markers and evaluated the measurement precision. We confirmed that a larger marker size is needed. In addition, we confirmed that smaller markers and improvement of the position-specification precision are possible through improved camera resolution. From the results of the walking experiment, which assumes that a user is walking inside a space where multiple markers are installed, it was possible to guarantee position-specification precision of several tens of centimeters using the depth direction weighting technique. Finally, from the results of actual cellular phone use in walking experiments, we showed that the proposed system guarantees position-specification precision of several tens of centimeters, thus verifying that the proposed system is effective for pedestrian navigation.

The results described above indicate that future improvement of the resolution of cellular phone cameras can provide sufficient precision for practical use, even with smaller markers than we used in these experiments. In addition, the position-specification system developed in this research was verified to be effective for pedestrian navigation.

Acknowledgment

This research was partially supported by the Ministry of Education, Sports, Science and Technology (Grant-in-Aid for Scientific Research (C)).