Abstract

The automatic detection of road related information using data from sensors while driving has many potential applications such as traffic congestion detection or automatic routable map generation. This paper focuses on the automatic detection of road elements based on GPS data from on-vehicle systems. A new algorithm is developed that uses the total variation distance instead of the statistical moments to improve the classification accuracy. The algorithm is validated for detecting traffic lights, roundabouts, and street crossings in a real scenario, and the obtained accuracy (0.75) improves on the best result obtained with previous approaches that use features based on statistical moments (0.71). Each road element to be detected is characterized as a vector of speeds measured when a driver goes through it. We first eliminate the speed samples recorded in congested traffic conditions, which are not comparable with clear traffic conditions and would contaminate the dataset. Then, we calculate the probability mass function for the speed (in 1 m/s intervals) at each point. The total variation distance is then used to find the similarity among different points of interest (which can contain a similar road element or a different one). Finally, a k-NN approach is used for assigning a class to each unlabelled element.

1. Introduction

The automatic detection of road related information by in-vehicle systems in general using data from different types of sensors while driving has many potential applications. Some major examples from previous research studies are road crash detection [1], traffic congestion estimation [2], potholes and bumps detection [3], traffic lights automatic recognition [4, 5], and automatic routable maps generation [6, 7].

Several sensing technologies have been previously used in order to automatically detect road related information. These technologies can be categorized into 3 major families: laser based systems, vision based recognition algorithms, and smartphone based sensing systems (combining inertial and GPS information). As an example of a laser based system, the authors in [8] used a LIDAR based system to automatically obtain the geometrical inventory of road cross-sections. The study in [9] provides a review of the different mechanisms for assessing the visual condition of vertical and horizontal civil infrastructure based on computer vision algorithms. Other examples of vision based recognition systems can be found in [10], which addressed the problem of automatically reading the rules encoded in road markings and inferring the semantics of road scenes, and in [4, 5], which detected traffic lights based on image processing on mobile devices. An example of previous research that combines the information gathered from different sensors in smartphones for automatic road element detection can be found in [11].

Among the different sensors in smartphones, the accelerometer (alone or in combination with the gyroscope) and the GPS receiver are the most commonly found in previous literature for automatic road related information detection. The research study in [12] addressed the problem of detecting different transport modes by using the acceleration data from a mobile device for accurate and fine-grained detection. The authors used a set of accelerometer features that capture key characteristics of vehicular movement patterns and used a hierarchical decomposition of the detection task. The accelerometer data from a mobile device has also been used to detect infrastructural elements on the road, when either carried by pedestrians or by vehicles. The authors in [13] used acceleration based movement pattern recognition applied to a day-by-day urban street behaviour to be able to detect the pattern of a pedestrian stopping and then crossing a street ruled by a traffic light. The research in [14] made use of on-vehicle inertial measurement units to detect driving behaviour, pedestrians, and particular types of road conditions such as bumps on the road. The sensors in smartphones have also been previously used to automatically detect traffic accidents [1]. The authors in [15] used accelerometers (with additional information containing acoustic data) to immediately notify a central emergency dispatch server after an automatically detected accident.

The GPS sensor on mobile devices has also been used for automatically extracting road related information. The GPS sensor is sometimes used to estimate the vehicle’s speed, which has also been used to detect driving related events or infrastructural elements. The authors in [16] used the driving speed calculated from accelerometer data in order to perform the detection of different traffic levels. The authors in [7] produced high-quality routable maps by using GPS information that allowed the automatic extraction of road network properties such as intersections and traffic rules. The research study in [17] also used information from the GPS sensor while driving in order to detect road traffic congestion and incidents in real-time.

The information gathered from different sensors together can be fused to improve detection rates and accuracy levels. The research study in [11] focused on the automatic detection of certain road related information such as tunnels, bumps, bridges, footbridges, and crosswalks based on the combined use of various sensors on mobile devices. These sensors included inertial sensors (such as accelerometer, gyroscope, and magnetometer) as well as cellular network information. The authors performed a combination of information both from vehicles and pedestrians. A crowdsensing mechanism was also introduced in order to improve the accuracy of results. Using the information of several users provides an additional source of data to validate that a detected element is not an outlier (false positive). The authors in [18] also used crowdsensing techniques in order to detect dangerous road sections.

The information obtained from the underlying sensors has to be automatically analysed to detect common patterns associated with the particular elements to be detected. Several machine learning techniques and approaches have already been used in order to detect road related information from data obtained from mobile sensors. The authors in [19] used k-means to solve the problem of pothole detection. The algorithm was applied to acceleration data while driving. The research in [20] made use of decision trees, logistic regression, Naïve Bayes, k-NN (k-Nearest Neighbours), SVM (support vector machines), and MDA (Mixture Discriminant Analysis) to be able to extract information from GPS traces. The authors in [21] detected traffic incidents by using several classification techniques applied to on-vehicle telemetry data. A related study [22] detected the driver state while driving by use of deep learning methods applied to GPS data obtained from a mobile device and a heart rate wearable sensor.

These machine learning algorithms are applied either to hand-crafted features or to automatically learned features extracted from the sensed data. When solving classification problems, features based on statistical moments are commonly used. The authors in [23] used some statistical central moments based features such as the mean, variance, skewness, and kurtosis in order to provide an automatic sleep scoring method based on the use of single channel electroencephalograms. The research in [24] made use of statistical moments to extract features for a Multilayer Neural Network for the prediction of certain membrane proteins. The authors in [25] used the higher-order moment statistical features for the silhouette of the Gait Energy Image for a natural and normal gait characterization to be used as a biometric cue for human identification. Statistical central moments have also been applied to on-board sensors while driving. The authors in [26] used manually computed features based on the statistical central moments applied to inertial and GPS data to classify drivers according to their driving style.

The major drawbacks in previous related research, as described in the previous paragraphs, can be categorized into two main aspects. On the one hand, high detection accuracy tends to be achieved only when multiple sensors are used, especially in crowdsensing environments. On the other hand, using statistical descriptors, such as the central moments, as the input features for the classification algorithms does not exploit the entire statistical information in the sensed data. This paper focuses on contributing to the automatic detection of road elements based only on GPS data from a smartphone when driving and using the entire probability mass function from the sensed data as the input feature for classification. A new algorithm is developed that makes use of the total variation distance instead of the statistical moments in order to improve the classification accuracy. The algorithm is validated for detecting traffic lights, roundabouts, and street crossings in a real scenario, and the obtained accuracy is compared with previous approaches based on applying machine learning algorithms to features based on statistical moments. The total variation is, together with the relative entropy, one of the two most popular gauges of the distinctness between a pair of probability measures [27]. A paper presenting a general framework for multiclass total variation clustering can be found in [28].

We have characterized each element to be detected as a vector of speeds measured when a driver goes through it. We first eliminate the speed samples recorded in congested traffic conditions, which are not comparable with clear traffic conditions and would contaminate the dataset. Congested traffic conditions are assessed based on a stochastic distance to the mean value of the speed distribution when driving around a particular location of interest. Then, we calculate the probability mass function for the speed (in 1 m/s intervals) at each point. The total variation distance is then used in order to find the similarity among different points of interest (which can contain a similar road element or a different one). The total variation distance provides a measure of how stochastically similar two locations are based on their entire speed patterns. Finally, a k-NN based approach is used for assigning a class to each unlabelled element. The k-NN selects the closest locations based on the distance provided by the total variation as previously described. The results show that using the total variation distance yields a better accuracy than using features based on statistical moments.

The rest of the paper is organized as follows. Section 2 presents the method proposed in this paper, including the data gathering process, the filtering of slow traffic segments, the probability mass function computations, and the similarity based classification algorithm based on the total variation distance. Section 3 describes the scenario implemented in order to record the dataset. The locations of the selected points containing road elements are presented. Section 4 shows the results obtained applying the proposed method to the recorded data. A comparison with other methods found in literature is performed in order to validate the achieved results. Finally, Section 5 captures the conclusions of this research.

2. Method

This section presents the method proposed in this paper in order to detect road elements based on the probability mass function measured at each particular location after filtering slow traffic conditions. The first subsection presents the data gathering process. Then, the slow traffic filtering is presented. The third subsection captures the way in which the probability mass functions are computed. Finally, the method based on the total variation distance and the k-NN classification algorithm is presented. The flowchart of the proposed method is captured in Figure 1.

2.1. Data Gathering

We use the GPS sensor embedded in a mobile device in order to obtain both the vehicle’s location and the estimated driving speed. The speed can be derived from the distance travelled per time unit following (1):

$$v = \frac{d_{12}}{\Delta t} \tag{1}$$

The distance travelled between points 1 and 2 can be calculated from the GPS coordinates as captured in (2):

$$d_{12} = R \arccos\left(\sin\varphi_1 \sin\varphi_2 + \cos\varphi_1 \cos\varphi_2 \cos\Delta\lambda\right) \tag{2}$$

where $\varphi_1$ and $\varphi_2$ represent the latitudes of the two points in radians, $\Delta\lambda$ is the difference of longitudes in radians, and $R$ is the Earth radius. The location errors in the coordinates provided by the GPS sensor will propagate when using (1) and (2) in order to estimate the instant speed. The random errors can be reduced by increasing the size of $\Delta t$. The resolution in time, on the other hand, will decrease when $\Delta t$ increases. As a trade-off between the compensation of the GPS errors and the time resolution, $\Delta t$ has been set to 5 seconds to estimate the vehicle’s speed.
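The speed estimate described above can be sketched as follows. This is a minimal illustration, assuming that the distance in (2) is the spherical law of cosines (which matches the symbols defined in the text) and that fixes arrive at 1 Hz. The function names, the `window_samples` parameter, and the Earth radius value are assumptions, not taken from the paper.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed value)


def great_circle_distance_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Distance between two GPS points using the spherical law of cosines."""
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dlam = math.radians(lon2_deg - lon1_deg)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlam))
    # Clamp to [-1, 1] so rounding errors never push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_M * math.acos(cos_angle)


def estimate_speeds(fixes, window_samples=5):
    """Estimate speed (m/s) from fixes given as (t_seconds, lat_deg, lon_deg).

    Each estimate divides the distance between two fixes that are
    `window_samples` apart by their time difference (5 s at 1 Hz).
    """
    speeds = []
    for i in range(window_samples, len(fixes)):
        t0, lat0, lon0 = fixes[i - window_samples]
        t1, lat1, lon1 = fixes[i]
        distance = great_circle_distance_m(lat0, lon0, lat1, lon1)
        speeds.append(distance / (t1 - t0))
    return speeds
```

The longer window averages out part of the random GPS error, at the cost of smoothing short speed changes, which is the trade-off the text describes.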

Each drive will generate a matrix of (location, speed) samples (where each location will be defined by its GPS latitude and longitude coordinates). The samples from all the different drives for the same location will be used in order to compute the speed probability mass function (after filtering the slow traffic data). The sampling rate and the location coincidence criteria have to be set together so that the distance travelled at the maximum allowed speed between two consecutive samples coincides with the distance between adjacent locations. In this way, a vehicle travelling at the maximum allowed speed will visit each adjacent location once per drive, and a vehicle travelling below the speed limit will visit every location at least once. In our case, we have taken into account only city segments in which the maximum allowed speed (speed limit) is 50 km/h, or around 14 m/s. Two adjacent locations will therefore be set 14 meters apart or, equivalently, all GPS coordinates within a radius of 7 meters around each target road element will be mapped onto that location. This location mapping mechanism also helps to compensate for GPS errors, which cause the same physical location to produce similar but different coordinates in each drive.
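The location mapping step can be illustrated with a short sketch. It assumes an equirectangular approximation for the short distances involved (a simplification chosen here, not stated in the paper). The coordinates, the identifiers, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed value)


def approx_distance_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Equirectangular approximation, adequate for distances of a few metres."""
    mean_lat = math.radians((lat1_deg + lat2_deg) / 2.0)
    x = math.radians(lon2_deg - lon1_deg) * math.cos(mean_lat)
    y = math.radians(lat2_deg - lat1_deg)
    return EARTH_RADIUS_M * math.hypot(x, y)


def map_to_location(fix, targets, radius_m=7.0):
    """Return the id of the nearest target within radius_m, or None.

    fix is (lat_deg, lon_deg); targets maps an id to (lat_deg, lon_deg).
    """
    best_id, best_distance = None, radius_m
    for location_id, (lat, lon) in targets.items():
        distance = approx_distance_m(fix[0], fix[1], lat, lon)
        if distance <= best_distance:
            best_id, best_distance = location_id, distance
    return best_id
```

With targets spaced 14 meters apart, every fix on the path falls within 7 meters of at most one target (up to GPS error), so each sample is assigned to a single location.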

2.2. Slow Traffic Filtering

The speed patterns when travelling in clear traffic conditions are different from the speed patterns in congested or heavy traffic conditions. Therefore, we need to filter out the test drives in which the driver was experiencing speed disturbances due to slow traffic or traffic congestion. For each particular location, the average speed in the time interval from 30 seconds before arriving at that location until 30 seconds after having visited that location is calculated. The mean $\mu$ and standard deviation $\sigma$ are then computed for the average speeds for each location of interest. We will discard the speed data for that location in drives where the following condition is met:

$$d_i = \begin{cases} 1, & \bar{v}_i < \mu - \alpha\sigma \\ 0, & \text{otherwise} \end{cases} \qquad i = 1, \ldots, N \tag{3}$$

where $d_i$ is the discard criterion at a particular drive $i$ (from a total of $N$ different drives) and $\bar{v}_i$ is the average speed for drive $i$ (in the 60-second time span centred at the target location). The value for $\alpha$ will be chosen depending on the percentage of drives in which congestion is expected. For a busy hour, $\alpha$ should be small. For a test set in which congestion is very rare, the value of $\alpha$ should be increased. Increasing the value of $\alpha$ will retain a bigger number of samples and therefore improve the training of the algorithms (as long as the samples are taken in clear traffic conditions).
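A minimal sketch of this filter follows. It assumes that a drive is discarded when its 60-second average speed falls below the mean of all drives by more than a multiple of their standard deviation. The multiplier name `alpha`, the use of the sample standard deviation, and the function name are assumptions made for illustration.

```python
from statistics import mean, stdev


def clear_traffic_drives(avg_speeds, alpha=1.0):
    """Return the indices of drives kept at one location.

    avg_speeds[i] is the average speed of drive i in the 60-second span
    centred at the location. A drive is discarded when its average speed
    is below mean - alpha * standard deviation (assumed discard rule).
    """
    mu = mean(avg_speeds)
    sigma = stdev(avg_speeds)  # sample standard deviation (assumption)
    threshold = mu - alpha * sigma
    return [i for i, v in enumerate(avg_speeds) if v >= threshold]


# Hypothetical average speeds (m/s) for five drives; the last one is congested.
example = [10.0, 11.0, 9.0, 10.0, 2.0]
kept_strict = clear_traffic_drives(example, alpha=1.0)
kept_loose = clear_traffic_drives(example, alpha=2.0)
```

In this example the stricter setting drops the slow drive while the looser one keeps all five, which mirrors the remark that a larger multiplier retains more samples.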

2.3. Probability Mass Functions Computation

After applying (3) to all target locations, considering the data from all the test drives in the dataset, the instant speed for each location for each remaining drive will be considered in order to compute the probability mass function. The instant speed is discretized in 1 m/s intervals (from 0 to 14 m/s, the maximum speed allowed in the considered driving segments). The probability mass function (pmf) at each speed will be computed as shown in (4):

$$p_l(s) = \frac{1}{N_l} \sum_{i=1}^{N_l} \mathbb{1}\left[ s \le v_{l,i} < s + 1 \right] \tag{4}$$

where $s$ is the speed in m/s in the range 0 : 14 m/s, $v_{l,i}$ are the prefiltered speeds for noncongested drives at location $l$, and $N_l$ is the number of noncongested speed samples. For each speed range $[s, s+1)$, only speeds in that range will be added.
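The pmf computation can be sketched in a few lines. The function name is hypothetical, and clamping speeds above the limit into the last bin is an assumption, since the text does not say how such samples are handled.

```python
def speed_pmf(speeds, max_speed=14):
    """Empirical pmf of speeds (m/s) over the 1 m/s bins [s, s + 1).

    Returns a list with max_speed + 1 probabilities, one per bin 0..max_speed.
    """
    if not speeds:
        raise ValueError("at least one speed sample is required")
    counts = [0] * (max_speed + 1)
    for v in speeds:
        if v < 0:
            raise ValueError("speeds must be non-negative")
        # Speeds above the limit are clamped into the last bin (assumption).
        counts[min(int(v), max_speed)] += 1
    total = len(speeds)
    return [c / total for c in counts]


# Hypothetical noncongested speed samples at one location.
pmf = speed_pmf([0.0, 0.4, 3.2, 3.9, 7.0, 7.99, 13.5, 15.0])
```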

A pmf will be computed for each target location. In our case, we will select 24 different locations in 4 different classes and will compute a pmf for each of these 24 locations. We will select 6 traffic light locations, 6 roundabouts, 6 street crossings, and 6 locations describing the null class (containing neither a traffic light nor a roundabout or a street crossing).

2.4. Total Variation Distance and k-NN Classification

The total variation is a way to compute a distance between two distribution functions. For two probability mass functions (pmf) assigning similar mass to the same regions, the distance will be small. For pmf functions assigning probability mass to different regions, the distance will be close to 1. For categorical distributions, the total variation can be computed as captured in (5):

$$\mathrm{TV}(p_a, p_b) = \frac{1}{2} \sum_{s=0}^{14} \left| p_a(s) - p_b(s) \right| \tag{5}$$

where $p_a$, $p_b$ are the probability mass functions at locations $a$ and $b$ and $s$ takes the values from 0 : 14 in order to add all the pmf values.
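As an illustration, the total variation distance between two categorical distributions can be computed as half the sum of the absolute differences of their probabilities, which is the standard definition and gives 1 for pmfs with disjoint support. The function name and the toy pmfs are hypothetical.

```python
def total_variation(p, q):
    """Total variation distance between two pmfs defined on the same bins."""
    if len(p) != len(q):
        raise ValueError("both pmfs must have the same number of bins")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy 3-bin examples: identical, partially overlapping, and disjoint pmfs.
same = total_variation([0.5, 0.5, 0.0], [0.5, 0.5, 0.0])
partial = total_variation([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
disjoint = total_variation([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```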

In order to assign an unlabelled new location to one of the classes (to decide if the candidate location is a traffic light, a roundabout, a street crossing, or none of them), the total variation between the pmf for the candidate point and the pmf functions calculated at the 24 locations in the training set will be compared. With $k = 1$, the k-NN algorithm assigns the candidate to the class of the training location with the smallest total variation distance. A larger value of $k$ assigns the candidate location to the closest class taking into account more members of each class, up to all of them. An intermediate value for $k$ will be used as a trade-off solution in the case of our research.
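The class assignment can be sketched as a majority vote among the nearest training locations under the total variation distance. The tie-breaking rule (prefer the tied class whose nearest member is closest), the function names, and the toy data are assumptions, since the text does not specify them.

```python
from collections import Counter


def total_variation(p, q):
    """Total variation distance between two pmfs on the same bins."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def knn_classify(candidate_pmf, training, k=3):
    """Assign a class by majority vote among the k nearest training pmfs.

    training is a list of (pmf, label) pairs.
    """
    if not training:
        raise ValueError("training set is empty")
    ranked = sorted(training,
                    key=lambda item: total_variation(candidate_pmf, item[0]))
    nearest_labels = [label for _, label in ranked[:k]]
    votes = Counter(nearest_labels)
    top = max(votes.values())
    # nearest_labels is ordered by distance, so the first label that reaches
    # the top vote count belongs to the tied class with the closest member.
    return next(label for label in nearest_labels if votes[label] == top)


# Toy 3-bin training set with two hypothetical classes.
training = [
    ([1.0, 0.0, 0.0], "stop"), ([0.8, 0.2, 0.0], "stop"),
    ([0.0, 0.0, 1.0], "free"), ([0.0, 0.2, 0.8], "free"),
]
label_a = knn_classify([0.9, 0.1, 0.0], training, k=3)
label_b = knn_classify([0.0, 0.1, 0.9], training, k=1)
```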

3. Scenario and Dataset Generation

The scenario implemented to build the driving dataset used to validate the method proposed in this paper comprises two intracity segments connected by a highway segment. Only the intracity paths have been taken into account since all the road elements to be detected are found in them. Figure 2 shows the two driving segments considered in the experiment in one direction. The first one (left part in Figure 2), 2.9 km long, crosses the city of Leganes in the Madrid area in Spain. The second one (right part in Figure 2), 1.2 km long, is located in the adjacent city of Getafe. Figure 3 shows the corresponding driving segments in the opposite direction. The driving path has been travelled 55 times (26 following the path in Figure 2 and the remaining 29 following the path in Figure 3) using 3 different car models (captured in Table 1). A total of 6 different locations for each road element (traffic lights, street crossings, and roundabouts, as well as 6 locations for the null class) have been selected to train and validate our approach.

We have used a Nexus 6 Android mobile device in order to record the GPS traces along the way (as captured in Figure 1). The GPS sensor was sampled at 1 Hz (1 sample per second). This sampling rate produces samples that are separated by less than 14 meters when travelling under the maximum allowed speed of 50 km/h (or around 14 m/s). Each target location is therefore defined by all the points within a radius of 7 meters centred at the target location.

The table with the 6 selected locations for traffic lights is captured in Table 2. Figure 4 shows the 6 locations in a map. The information for the specific locations for the selected roundabouts, street crossings, and the null class is shown in Tables 3, 4, and 5.

4. Validation Results

The method described in Section 2 has been applied to the dataset generated according to the description in Section 3. In order to validate the results, the proposed method based on the total variation distance has been compared with methods based on features obtained from the moments of the pmf functions, as proposed in previous research studies.

4.1. pmf Functions

A pmf function has been generated as described in Section 2 for each of the 24 selected locations using the dataset generated as described in Section 3. The average results for all the elements for each class are captured in Figures 5, 6, 7, and 8.

The pmf for the speed at the traffic lights (Figure 5) shows that the vehicle stops at the traffic light around 41% of the time. On the remaining occasions, either the vehicle stops a bit earlier (if there are some other vehicles already waiting at the traffic light), and therefore the speed is low when crossing the traffic light location, or the traffic light shows the green light and therefore the vehicle crosses at a normal speed.

The pmf for the speed at the roundabouts shows that the vehicle reduces the travelling speed in all cases (the speed is always lower than 10 m/s in a 14 m/s speed limit environment). There are some cases (around 13% of the time) in which the vehicle has to come to a stop, but in clear traffic conditions, it is most likely that the driver approaches the roundabout slowing the travelling speed.

The pmf for the speed at the street crossings again shows that the driver has to reduce the travelling speed. Compared to the roundabout case, the speed reduction is bigger (the visibility conditions for assessing whether there is an oncoming vehicle to which to give way are worse than in the case of the roundabouts, and therefore the speed should be further reduced).

Finally, the pmf for the speed at the null class locations shows that the speed in clear traffic conditions is rarely slow and the vehicle tends to travel at a speed above half of the maximum allowed value.

4.2. Classification Results Using Statistical Moment Based Features

The mean, standard deviation, skewness, and kurtosis have been computed from each pmf function for all the 24 preselected locations. The normalized $n$th central moment is calculated according to (6):

$$\tilde{\mu}_n = \frac{\mu_n}{\sigma^n} = \frac{E\left[(X - \mu)^n\right]}{\sigma^n} \tag{6}$$

where $\mu_n$ is the $n$th central moment, $\mu$ is the mean value, and $\sigma$ is the standard deviation.
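The four moment based features can be computed directly from a pmf, as in the sketch below. It assumes that each 1 m/s bin is represented by its lower edge (0, 1, 2, ...), a choice made here for illustration. The skewness and kurtosis are the normalized third and fourth central moments, and the function name is hypothetical.

```python
def pmf_moment_features(pmf):
    """Return (mean, standard deviation, skewness, kurtosis) of a pmf.

    Bin s is represented by the value s (lower edge of [s, s + 1)),
    which is an assumption made for this sketch.
    """
    values = range(len(pmf))
    mu = sum(s * p for s, p in zip(values, pmf))

    def central_moment(order):
        return sum((s - mu) ** order * p for s, p in zip(values, pmf))

    sigma = central_moment(2) ** 0.5
    if sigma == 0:
        raise ValueError("normalized moments are undefined for zero variance")
    skewness = central_moment(3) / sigma ** 3
    kurtosis = central_moment(4) / sigma ** 4
    return mu, sigma, skewness, kurtosis


# Toy symmetric pmf on bins 0, 1, 2.
features = pmf_moment_features([0.5, 0.0, 0.5])
```

Reducing a 15-bin pmf to these four numbers discards part of the shape of the distribution, which is the limitation that the total variation approach is meant to address.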

The results when using a 20-fold cross-validation technique for all the 4 features (mean, standard deviation, skewness, and kurtosis) for different classification algorithms are presented in Tables 6, 7, and 8. A $k$-fold cross-validation technique divides the dataset into $k$ subsets of equal size and uses $k - 1$ subsets for training the classification algorithm and 1 subset for validation. The procedure is repeated $k$ times so that all subsets are used once for validation. Table 6 captures the best achieved results, which have been obtained when using a support vector machine (SVM) classifier with a Gaussian kernel. The accuracy in this case is 0.708. There are 3 traffic light locations which are classified as roundabouts, 1 roundabout which is classified as a street crossing, 2 street crossings which are classified as roundabouts, and 1 location in the null class which is classified as a street crossing. The results for the linear SVM classifier are shown in Table 7. In this case, the accuracy worsens to 0.542. Table 8 captures the results for the k-NN classifier with an accuracy of 0.583.
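As a quick check of the reported figure, the misclassification counts listed above for the Gaussian kernel SVM add up to 7 of the 24 locations, which reproduces the stated accuracy:

```latex
\[
\text{accuracy} = \frac{24 - (3 + 1 + 2 + 1)}{24} = \frac{17}{24} \approx 0.708
\]
```

By the same arithmetic, the accuracies of 0.542 and 0.583 are consistent with 13 and 14 correctly classified locations out of 24, respectively.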

4.3. Classification Results Using the Total Variation Distance and the k-NN Classifier

Instead of capturing the statistical information in the pmf functions as a set of features such as the central moments, and then using these features to make classification decisions based on computed distances such as in the k-NN and SVM classifiers, the method proposed in this paper uses the entire pmf function in order to compute stochastic distances based on the total variation and uses them in order to classify each location. The flowchart described in Figure 1 is followed. The samples in the generated dataset as described in the previous section are used to generate the pmf functions as described in Section 2. For each particular location of interest, the speed when crossing that location for each drive when not suffering traffic congestion is considered. The pmf for each location is generated by computing the percentage of drives circulating at each speed interval (in 1 m/s increments) for that location. The speed limit is 50 km/h (around 14 m/s). The pmf will contain probability mass from 0 m/s up to that speed limit in 1 m/s intervals. The total variation will provide a distance between the pmf functions of different locations. The smaller the distance, the more similar the two points are considered to be. In order to classify a particular location into one of the 4 target classes (traffic lights, roundabouts, street crossings, and the null class), the total variation distances with all the other locations are computed. An unlabelled new location of interest will be assigned to one of the four classes depending on the number of closest locations (in terms of the total variation distance) that belong to that particular class. The k-NN classification algorithm assigns each location to the class with the largest number of members among those closest locations. The value of $k$ determines how many neighbours are used in the class assignment process. The k-NN with $k = 1$ applied to the total variations will assign each location to the class having a training sample with the smallest total variation to the point to be classified. Increasing the value of $k$ compensates for errors due to similarities with outliers in the training set. In our case, we have chosen a value of $k = 3$.

The total variation of each target location with all the rest of locations is first calculated. The 3 location points with the smallest total variation distances (excluding the self-distance, which is always 0) are then selected. The class with the most representatives among these 3 locations is finally selected for classification. The process is repeated for all the 24 locations in the dataset. The results for the confusion matrix are presented in Table 9. The accuracy in this case is 0.75, which is about 4 percentage points better than the best case in the previous section (0.708).
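The evaluation procedure described above, in which each location is classified from the 3 closest remaining locations, can be sketched as a leave-one-out loop. The function names, the tie-breaking rule (prefer the tied class with the closest member), and the toy data are assumptions. The paper's dataset has 24 locations with 15-bin pmfs, whereas the toy set below has 6 locations with 3 bins.

```python
from collections import Counter


def total_variation(p, q):
    """Total variation distance between two pmfs on the same bins."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def classify_location(index, pmfs, labels, k=3):
    """Classify location `index` from its k nearest other locations."""
    others = [j for j in range(len(pmfs)) if j != index]  # skip self-distance
    others.sort(key=lambda j: total_variation(pmfs[index], pmfs[j]))
    nearest = [labels[j] for j in others[:k]]
    votes = Counter(nearest)
    top = max(votes.values())
    # Ties go to the tied class whose member is closest (assumption).
    return next(label for label in nearest if votes[label] == top)


def leave_one_out_accuracy(pmfs, labels, k=3):
    """Fraction of locations whose predicted class matches their label."""
    hits = sum(classify_location(i, pmfs, labels, k) == labels[i]
               for i in range(len(pmfs)))
    return hits / len(pmfs)


# Toy data: two hypothetical classes with three locations each.
toy_pmfs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.0],
            [0.0, 0.0, 1.0], [0.0, 0.1, 0.9], [0.0, 0.2, 0.8]]
toy_labels = ["stop", "stop", "stop", "free", "free", "free"]
accuracy_k3 = leave_one_out_accuracy(toy_pmfs, toy_labels, k=3)
accuracy_k5 = leave_one_out_accuracy(toy_pmfs, toy_labels, k=5)
```

In the toy set, `k=5` exceeds the two same-class neighbours available to each location, so the other class always wins the vote. This illustrates why the neighbourhood size has to stay below the number of locations per class.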

There are 2 traffic lights that are classified wrongly. One of them is classified as a street crossing and the other as a member of the null class. This is due to the fact that there are 2 traffic lights (ids 3 and 6 in Figure 4) which only turn red if requested by a pedestrian in order to cross the street and therefore tend to show the green light most of the time. Figure 9 shows the pmf computed for one of these traffic lights, showing that no stop was registered in the recorded dataset for this particular traffic light. Table 10 captures the total variation distances for all the traffic lights. Traffic lights with ids 3 and 6 in Figure 4 are captured at the end of the table. All the distances among traffic lights 1, 2, 4, and 5 are smaller than 0.5. All the distances between traffic lights 3 and 6 and each of the other traffic lights are bigger than 0.5. As a further study, we plan to increase the size of the dataset in order to better capture examples in which the user has to stop at all the traffic lights. Moreover, we plan to further subclassify different types of traffic lights and different types of street crossings as well.

5. Conclusions

We have proposed a new method to automatically detect road elements while driving based on the use of GPS estimated locations and instant speeds. The method is based on calculating the total variation distance between the speed probability mass functions (pmf) at each candidate location. Each point to be classified is assigned to the class with the most representative locations among those with the smallest total variation distances to that point.

We have generated a new dataset from driving tests in an urban environment and used it to validate the results. We have selected 24 locations in the dataset, 6 for each of the target classes (traffic lights, roundabouts, street crossings, and the null class). The speed information when crossing each point at each drive, for those drives with clear traffic conditions, has been used in order to generate the speed pmf function for each target location. A classical classification approach based on the use of features based on the central statistical moments has also been implemented in order to compare the best achieved results with our approach.

The results show that, using our approach, a classification accuracy of 0.75 is achieved. Moreover, some of the misclassified locations are in fact singular locations not sharing the same statistical information as other locations in the same class. A further subclassification study will be done in future work. The results also show that our approach based on the total variation distance outperforms the best results based on central statistical moments by about 4 percentage points (0.75 versus 0.708).

The pmf functions for the speed computed at each location, as proposed in this paper, could also have a direct application in order to estimate a stochastic penalty that each road element adds to the total travel time for the journey as opposed to travelling at the maximum allowed speed in clear road conditions. This information could be fed into applications such as Google Maps (https://www.google.com/maps) in order to better estimate the required travel time for a particular journey. As a future work, we also plan to explore and validate this approach.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research leading to these results has received funding from the “HERMES-Smart Driver” Project TIN2013-46801-C4-2-R (MINECO), funded by the Spanish Agencia Estatal de Investigación (AEI), and the “Analytics Using Sensor Data for Flatcity” Project TIN2016-77158-C4-1-R (MINECO/ERDF, EU) funded by the Spanish Agencia Estatal de Investigación (AEI) and the European Regional Development Fund (ERDF).