Mathematical Problems in Engineering
Volume 2015 (2015), Article ID 940624, 10 pages
Research Article

Transportation Mode Detection Based on Permutation Entropy and Extreme Learning Machine

1School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
2School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China

Received 14 August 2015; Revised 3 October 2015; Accepted 8 October 2015

Academic Editor: Michael Small

Copyright © 2015 Lei Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


With the increasing prevalence of GPS devices and mobile phones, transportation mode detection based on GPS data has become a hot topic in GPS trajectory data analysis. Transportation modes such as walking, driving, bus, and taxi are an important characteristic of a mobile user. Longitude, latitude, speed, acceleration, and direction are usually used as features in transportation mode detection. In this paper, first, we explore the possibility of using Permutation Entropy (PE) of speed, a measure of the complexity and uncertainty of a GPS trajectory segment, as a feature for transportation mode detection. Second, we employ Extreme Learning Machine (ELM) to distinguish GPS trajectory segments of different transportation modes. Finally, to evaluate the performance of the proposed method, we conduct experiments on the GeoLife dataset. Experimental results show that we can get more than 50% accuracy when using only PE as a feature to characterize a trajectory sequence, so PE can indeed be used effectively to detect transportation mode from GPS trajectories. The proposed method has much better accuracy and faster running time than the methods based on the other features and an SVM classifier.

1. Introduction

With the increasing prevalence of positioning technologies, devices such as GPS receivers and smartphones are equipped with multiple sensors [1]. This makes it possible to collect human movement data and to implement various location-aware services [2]. Humans travel by different transportation modes, for example, walking, bicycle, car, and train [3]. In ubiquitous and context-aware computing, understanding the mobility of a mobile user is an important research area. The knowledge of the transportation mode is critical for travel behavior research, transport planning, and traffic management [4, 5]. The transportation modes of individuals reflect their past events and help us understand their life patterns [6]. Because the data collected from GPS do not contain the transportation mode, the transportation mode must be detected from the GPS trajectory.

Transportation mode detection from GPS data has been studied in the literature. Different studies use different features (or combinations of features), such as speed, acceleration, maximum or median speed and acceleration, and the distance between GPS fixes. A simple approach is to measure the speed and acceleration of the GPS data and compare them with empirical thresholds [5, 7]. However, for some transport modes, such as cycling and running, speed and acceleration alone are not enough. For example, in traffic jams or in rain and snow, the speed and acceleration under different transportation modes may be similar, so it is hard to differentiate them using only speed and acceleration thresholds. Zheng et al. identified a set of sophisticated features including heading change rate, velocity change rate, and stop rate [6, 8]. Compared with simple velocity and acceleration, these features are more robust to traffic conditions and carry more information about users' motion. Stenneth et al. considered transportation network data, which consist of real-time locations of buses and spatial data on rail lines and bus stops [9]. This approach can achieve over 93.5% accuracy for inferring various transportation modes. However, transportation network data is not available in most cases. Reddy et al. combined GPS sensor data with accelerometer data to detect the modes of transportation [10]. They selected GPS speed, accelerometer variance, and accelerometer DFT as features. Stopher et al. considered the average speed and the maximum and minimum speed as the feature set [11]. Other features used in transportation mode detection are described in [12, 13].

Permutation Entropy (PE), introduced by Bandt and Pompe [14], directly investigates the temporal information contained in a time series. PE is simple, robust, and computationally very cheap. PE has been applied in many areas [15], such as neural applications [16], electroencephalography (EEG) signal analysis [17], electrocardiography (ECG) [18], and stock market analysis [19]. As a new learning algorithm for single-hidden layer feedforward neural networks, Extreme Learning Machine (ELM) has attracted a lot of research interest [20–23]. ELM has shown good performance in classification applications due to its low computational cost, better generalization performance, and faster learning speed than traditional gradient-based learning algorithms [24, 25].

To the best of our knowledge, PE has not been used for analyzing moving object data. Therefore, it is interesting to investigate PE in the mobility analysis of moving objects. In this paper, we propose to use PE as a feature for transportation mode detection. To reduce the training time without compromising accuracy, Extreme Learning Machine (ELM) is used as the classifier. Experiments are conducted on the GeoLife dataset to validate the feasibility of the proposed method and evaluate its effectiveness. We carry out a comprehensive performance evaluation of various feature measures as well as different supervised classifiers in transportation mode detection. The results show that the proposed scheme is capable of detecting transportation modes not only with high accuracy but also very quickly.

The remainder of this paper is organized as follows. In Section 2, we introduce the proposed transportation mode detection algorithm based on PE and ELM. In Section 3, we review the concept of PE. In Section 4, we provide a review of ELM. In Section 5, we present experiments and results to demonstrate the effectiveness of the algorithm. Section 6 contains conclusions.

2. Transportation Mode Detection with PE and ELM

In transportation mode detection from GPS data, different features, such as speed, average speed, acceleration, and more sophisticated features, are used. But the speed of different transportation modes is usually sensitive to traffic conditions and weather. Intuitively, in congestion the average speed of driving can be as slow as walking. The change of speed along a GPS trajectory is an important indicator for describing the trajectory. For example, the speed of a car changes over a wide range, whereas the speed of walking changes less. Permutation Entropy estimates the complexity of a time series through the comparison of neighboring values. It is conceptually simple and computationally very fast. So, we explore whether Permutation Entropy can be used to detect transportation modes.

In this paper, we adopt Permutation Entropy as a measure of complexity due to its fast calculation, robustness, and invariance with respect to nonlinear monotonic transformations.

Up to now, to the best of our knowledge, in the literature, there is no related study about detecting moving objects’ transportation mode by PE. Figure 1 shows the proposed transportation mode detection method with PE and ELM.

Figure 1: The steps of transportation mode detection method based on PE and ELM.

Firstly, GPS trajectory data are collected from GPS sensor. Secondly, trajectory data is segmented into time sequences with the same length. For each trajectory segment, several features, such as speed and PE, are extracted. Finally, a classification model, ELM, is used to detect transportation mode.
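The segmentation step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_fixed_length` and the choice to drop a trailing remainder shorter than the segment length are assumptions made here.

```python
from typing import List, Sequence


def split_fixed_length(points: Sequence, segment_len: int) -> List[Sequence]:
    """Cut a trajectory into consecutive segments of exactly segment_len points.

    Assumption of this sketch: a trailing remainder shorter than
    segment_len is dropped so that all segments have the same length.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    n_full = len(points) // segment_len
    return [points[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


# Toy trajectory of 10 samples cut into segments of 4 samples each.
trajectory = list(range(10))
segments = split_fixed_length(trajectory, 4)
print(segments)
```

Equal-length segments matter here because PE is later computed per segment and compared across segments.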

3. Permutation Entropy (PE)

Permutation Entropy is widely used to study the irregularity and nonlinearity of time series. Because it is highly sensitive to changes over time, it is an effective method to detect the dynamic changes of a complex system.

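For a given time series $\{x(i),\ i = 1, 2, \ldots, N\}$, the calculation steps of PE are described as follows.

Using the time delay embedding theorem to reconstruct the phase space, the data segment $X_i$ is derived from the $i$th point of the original time series:

$$X_i = [x(i), x(i + \tau), \ldots, x(i + (m - 1)\tau)]. \quad (1)$$

In (1), $m$ is the embedding dimension and $\tau$ is the delay time.

Each component in $X_i$ can be arranged in an increasing order to achieve an ordinal pattern:

$$x(i + (j_1 - 1)\tau) \le x(i + (j_2 - 1)\tau) \le \cdots \le x(i + (j_m - 1)\tau), \quad (2)$$

where $j_k$ is the index of the element in the new vector.

If there are two same values in $X_i$, for example, $x(i + (j_p - 1)\tau) = x(i + (j_q - 1)\tau)$, we rearrange them according to the index. Namely, if $j_p < j_q$, then $x(i + (j_p - 1)\tau) \le x(i + (j_q - 1)\tau)$.

Each vector $X_i$ can be mapped into a symbol series $S = (j_1, j_2, \ldots, j_m)$. $S$ is one of the $m!$ permutations of $m$ distinct symbols. The probability distribution for each distinct symbol series can be estimated, $P_1, P_2, \ldots, P_k$, where $k \le m!$.

PE for $\{x(i)\}$ is defined as the Shannon entropy for $k$ different symbols:

$$H_P(m) = -\sum_{j=1}^{k} P_j \ln P_j. \quad (3)$$

When $P_j = 1/m!$, we can get the maximum value of $H_P(m)$, $\ln(m!)$.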
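The calculation steps above can be sketched in Python as follows. This is a minimal, unoptimized illustration under stated assumptions rather than the authors' code: the function name `permutation_entropy`, the use of NumPy, and the default values of `m` and `tau` are choices made here. Ties are broken by index through a stable sort, as described above.

```python
import math
from collections import Counter

import numpy as np


def permutation_entropy(x, m=3, tau=1):
    """Permutation entropy (natural log) of a 1-D series.

    m is the embedding dimension and tau the delay time.
    """
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for the chosen m and tau")

    counts = Counter()
    for i in range(n_vectors):
        # Embedding vector: x(i), x(i + tau), ..., x(i + (m - 1) * tau).
        window = x[i:i + (m - 1) * tau + 1:tau]
        # Ordinal pattern; a stable sort keeps equal values in index order.
        pattern = tuple(np.argsort(window, kind="stable"))
        counts[pattern] += 1

    probs = np.array(list(counts.values()), dtype=float) / n_vectors
    # Shannon entropy over the observed patterns: sum of p * ln(1 / p).
    return float(np.sum(probs * np.log(1.0 / probs)))


# Toy speed series: two rising steps followed by two falling steps.
speeds = [1.0, 2.0, 3.0, 2.0, 1.0]
pe = permutation_entropy(speeds, m=2, tau=1)
print(pe, math.log(2))
```

In the toy series, the rising and falling patterns each occur twice, so the result equals $\ln 2$, the maximum for $m = 2$. A strictly increasing series yields a single pattern and therefore a PE of zero.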

4. Extreme Learning Machine

4.1. Single-Hidden Layer Feedforward Neural Network (SLFN)

Feedforward neural networks have been extensively used in many fields. A single-hidden layer feedforward neural network (SLFN) with at most $N$ hidden nodes and with almost any nonlinear activation function can exactly learn $N$ distinct observations. The activation function of a node defines the output of that node given an input or set of inputs. The input weights (linking the input layer to the hidden layer) and hidden layer biases need to be adjusted.

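For $N$ arbitrary distinct samples $(x_i, t_i)$, where $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T \in R^n$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]^T \in R^m$, SLFN with $\tilde{N}$ hidden nodes and activation function $g(x)$ is modeled as

$$\sum_{i=1}^{\tilde{N}} \beta_i g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \ldots, N, \quad (4)$$

where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is the weight vector connecting the input nodes to the $i$th hidden node, $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ is the weight vector connecting the $i$th hidden node to the output nodes, and $b_i$ is the threshold of the $i$th hidden node. $w_i \cdot x_j$ denotes the inner product of $w_i$ and $x_j$.

The standard SLFN with $\tilde{N}$ hidden neurons can approximate these $N$ samples with zero error; that is, $\sum_{j=1}^{N} \| o_j - t_j \| = 0$, and there exist $\beta_i$, $w_i$, and $b_i$ such that

$$\sum_{i=1}^{\tilde{N}} \beta_i g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N. \quad (5)$$

The above $N$ equations can be written compactly as

$$H\beta = T, \quad (6)$$

where $H$ is defined as

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}}, \quad \beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_{\tilde{N}}^T \end{bmatrix}_{\tilde{N} \times m}, \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}. \quad (7)$$

$H$ is called the hidden layer output matrix of the neural network; the $i$th column of $H$ is the $i$th hidden node output with respect to inputs $x_1, x_2, \ldots, x_N$.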
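As a small illustration of the hidden layer output matrix, the sketch below builds it for random inputs with NumPy. The names `sigmoid` and `hidden_output`, the array shapes, and the random test data are assumptions made for this example only.

```python
import numpy as np


def sigmoid(z):
    """Sigmoid activation, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))


def hidden_output(X, W, b, g=sigmoid):
    """Hidden layer output matrix.

    X: (N, n) inputs, one sample per row.
    W: (L, n) input weights, one hidden node per row.
    b: (L,) hidden node thresholds.
    Returns an (N, L) matrix whose entry (j, i) is g(w_i . x_j + b_i).
    """
    return g(X @ W.T + b)


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 samples with 3 features each
W = rng.normal(size=(4, 3))  # 4 hidden nodes
b = rng.normal(size=4)
H = hidden_output(X, W, b)
print(H.shape)
```

Each column of the result holds the output of one hidden node over all samples, which matches the column-wise reading of the matrix given above.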

Traditionally, all the parameters of the feedforward networks need to be tuned iteratively. Gradient descent-based methods have mainly been used in various learning algorithms of feedforward neural networks. Gradient descent-based learning methods are generally very slow due to improper learning steps or may easily converge to local minima.

4.2. Extreme Learning Machine

Extreme Learning Machine (ELM) is a simple learning algorithm for SLFN. The learning speed of ELM can be thousands of times faster than traditional feedforward network learning algorithms while obtaining better generalization performance.

In most applications, the number of hidden neurons is much smaller than the number of distinct training samples, $\tilde{N} \ll N$, and $H$ is a nonsquare matrix. In the ELM approach, the input weights and the hidden layer biases of SLFNs are not tuned but are assigned randomly and then fixed. This is equivalent to mapping the samples to a random feature space. Then, training an SLFN is equivalent to finding a least squares error solution $\hat{\beta}$ of the linear system $H\beta = T$, namely, $\hat{\beta} = H^{\dagger} T$, where $H^{\dagger}$ is the Moore-Penrose generalized inverse of matrix $H$.

There are many ways of calculating the Moore-Penrose generalized inverse of a matrix such as the orthogonal projection method, iterative method, and singular value decomposition [26]. Singular value decomposition is used to calculate the Moore-Penrose generalized inverse of a matrix.
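The training procedure described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the class name `ELMClassifier`, the uniform range of the random weights, the one-hot target encoding, and the synthetic two-cluster data are all choices made here. `numpy.linalg.pinv` computes the Moore-Penrose generalized inverse through singular value decomposition.

```python
import numpy as np


class ELMClassifier:
    """Minimal ELM sketch: random hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden layer output matrix with a Sigmoid activation.
        return 1.0 / (1.0 + np.exp(-(X @ self.W.T + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T, one column per class (an assumption here).
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Input weights and biases are drawn once and never tuned.
        # The range [-1, 1] is an assumption of this sketch.
        self.W = self.rng.uniform(-1.0, 1.0, size=(self.n_hidden, X.shape[1]))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Least-squares output weights via the SVD-based pseudoinverse.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


# Synthetic toy data (not GeoLife): two well-separated 2-D clusters.
rng = np.random.default_rng(1)
X0 = rng.normal(loc=-2.0, scale=0.5, size=(30, 2))
X1 = rng.normal(loc=2.0, scale=0.5, size=(30, 2))
X_train = np.vstack([X0, X1])
y_train = np.array([0] * 30 + [1] * 30)

model = ELMClassifier(n_hidden=20).fit(X_train, y_train)
train_accuracy = float(np.mean(model.predict(X_train) == y_train))
print(f"training accuracy: {train_accuracy:.2f}")
```

The only trained quantity is the output weight matrix, obtained in a single linear solve, which is why no iterative tuning is needed.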

5. Experiments Evaluation

In this section, we first describe experiment dataset. Second, we present feature extraction. Finally, transportation modes detection based on elementary features, only PE, and combination of PE and the elementary features is discussed.

We do not compare our method directly with previous transportation mode detection methods, for the following reasons. First, a transportation mode detection method is composed of trajectory partition, feature selection, and a learning process. In other methods, subtrajectories are obtained automatically by a trajectory partition algorithm, so their lengths differ. However, subtrajectories in our method need to have the same length to calculate PE, so we obtain fixed-length subtrajectories by partitioning the trajectories into segments of equal length. Second, we compare PE with the features used in other methods to show that PE is a valid indicator. Finally, we compare our learning method, ELM, with SVM, a learning method commonly used in other methods, to validate ELM's efficiency.

5.1. Dataset Description

The experiments are carried out on the Microsoft GeoLife dataset [1] which consists of 17621 moving trajectories of 182 users over three years. These trajectories were recorded by different GPS loggers and GPS phones. A GPS trajectory is represented by a sequence of time-stamped points of a user in a certain time interval and each time-stamped point contains the information of latitude, longitude, and altitude. The trajectories of 73 users have been labeled with transportation mode. The total distance and duration of transportation modes are listed in Table 1.

Table 1: Total distance and duration of transportation modes.

5.2. Feature Extraction

We extract the features from each trajectory. The features are shown in Table 2. The elementary features , and have the same definition in [9].

Table 2: Extracted features of each trajectory segment.

5.3. Experimental Results

5.3.1. Transportation Modes Detection Based on Elementary Features

We select 30 of 73 users and extract 5525 trajectories to perform the experiments. The elementary features , and are calculated to detect trajectory modes. 5525 trajectories segments are partitioned into the training set and the testing set randomly. Table 3 lists different training set sizes and the used features.

Table 3: Different training set sizes and the used features.
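A random partition into training and testing sets at the ratios used in the following tables (1 : 9 through 5 : 5) can be sketched as below. The function name `random_split`, the fixed seed, and the rounding down of the training set size are assumptions of this sketch; the text does not specify how the random partition was drawn.

```python
import numpy as np


def random_split(n_samples, train_fraction, seed=0):
    """Return disjoint (train_idx, test_idx) index arrays covering all samples."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    # Assumption: the training set size is rounded down.
    n_train = int(train_fraction * n_samples)
    return order[:n_train], order[n_train:]


n_segments = 5525  # number of trajectory segments stated above
for train_fraction in (0.1, 0.2, 0.3, 0.4, 0.5):
    train_idx, test_idx = random_split(n_segments, train_fraction)
    print(train_fraction, len(train_idx), len(test_idx))
```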

We choose SVM and ELM as classifiers to detect transportation modes. Tables 4–8 show the running time and accuracy. We observe that the classification accuracy of ELM is about 62% when the number of nodes is larger than 500, and the results are higher and more stable when the number of nodes is 800. The detection accuracy of SVM is about 45%, lower than that of ELM. For ELM, the choice of activation function has a great effect on running time and accuracy. Sigmoid gives the most accurate result. The accuracy of Hardlim is slightly lower than that of Sigmoid, but its training time is much shorter.

Table 4: Running time and accuracy of 1 : 9 data.
Table 5: Running time and accuracy of 2 : 8 data.
Table 6: Running time and accuracy of 3 : 7 data.
Table 7: Running time and accuracy of 4 : 6 data.
Table 8: Running time and accuracy of 5 : 5 data.

5.3.2. Transportation Modes Detection Based on Only PE

To compute the PE of speed for each transportation mode, we extract trajectory segments that each have a single transportation mode. Each trajectory segment contains more than 1000 points. We collect 500 trajectory segments in our experiment. We calculate the speed at each point and then compute the PE of the resulting speed series for each trajectory segment. We use this PE of speed as a feature to detect transportation modes.
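The text does not state how speed is derived from the GPS fixes. One common approach, shown here purely as an assumption, is to divide the great-circle (haversine) distance between consecutive fixes by the time difference. The names `haversine_m` and `point_speeds`, the mean Earth radius value, and the input layout `(latitude, longitude, time in seconds)` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def point_speeds(points):
    """Speeds in m/s between consecutive fixes.

    points: list of (latitude_deg, longitude_deg, time_s) tuples.
    Pairs with a non-positive time difference are skipped (an assumption).
    """
    speeds = []
    for (lat1, lon1, t1), (lat2, lon2, t2) in zip(points, points[1:]):
        dt = t2 - t1
        if dt <= 0:
            continue
        speeds.append(haversine_m(lat1, lon1, lat2, lon2) / dt)
    return speeds


# Two fixes 0.001 degrees of latitude apart, 10 seconds apart.
fixes = [(0.0, 0.0, 0.0), (0.001, 0.0, 10.0)]
speeds = point_speeds(fixes)
print(speeds)
```

The resulting speed series of a segment is what the PE is then computed on.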

Figure 2 shows the average speed of 500 trajectory segments. Figure 3 presents the speed distribution of different transportation modes. The speed varies over a wide range within each transportation mode, and different transportation modes, such as walk, bike, and bus, overlap heavily in average velocity. Consequently, the average velocity alone is not a reliable feature for distinguishing different transportation modes.

Figure 2: Average speed of 500 trajectory segments.
Figure 3: Speed distribution of different transportation modes.

Figure 4 shows the PE of speed for different transportation modes. The PE of speed for cars and buses is lower and spans a smaller range, whereas the PE of speed for walking and bikes is higher and spans a larger range. Under normal traffic conditions, we can easily recognize different transportation modes from the average speed. But in a traffic jam, the average speed of car and bus is almost the same as that of walking and bike. Because the PE of speed for car is lower than that for walking, we can still distinguish car from walking by the PE of speed.

Figure 4: The distribution of velocity PE of different transportation modes.

The average PE of 500 trajectory segments with different $m$ under multiple transportation modes is shown in Table 9. We can see that the average PE of speed increases with $m$. This is possibly because when $m$ is larger, the probability of each distinct symbol is smaller and each row of the reconstruction matrix is more complex.

Table 9: Average PE of 500 trajectory segments with different $m$ under multiple transportation modes.

We use the PE of speed as the feature to detect transportation modes for 500 trajectory segments. We choose SVM and ELM as classifiers. In ELM, we adopt Sigmoid as the activation function and set the number of nodes to 800, since these parameters give relatively good performance for ELM. To demonstrate the effect of the number of training samples, we design the experiments with different training set sizes (10%, 20%, 30%, 40%, and 50%); the remaining samples act as the testing set.

For the embedding dimension $m$, Bandt and Pompe recommend $m = 3, \ldots, 7$ [14, 15], and it was found that $m = 3$ and 4 may still be too small, and a value of $m = 5$, 6, or 7 seems to be the most suitable.
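As a rough worked illustration of why the choice of $m$ matters, the number of possible ordinal patterns $m!$ and the corresponding maximum PE $\ln(m!)$ grow quickly with $m$. The values below follow directly from the definition of PE; the remark after the table additionally assumes a delay of $\tau = 1$, which is not stated in the text.

```latex
\begin{array}{c|c|c}
m & m! & \ln(m!) \\ \hline
3 & 6    & 1.792 \\
4 & 24   & 3.178 \\
5 & 120  & 4.787 \\
6 & 720  & 6.579 \\
7 & 5040 & 8.525
\end{array}
```

A segment of about 1000 points yields roughly 1000 embedding vectors. For $m = 7$ there are then more possible patterns (5040) than vectors, so the pattern probabilities cannot all be estimated reliably, while for $m = 5$ each of the 120 patterns can be observed several times on average.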

We set $m$ to 4, 5, 6, and 7. Tables 10–13 present the experimental results. We find that $m = 4$ is too small to give good results, as shown in [15]. For $m = 5$, 6, or 7, we find that the larger $m$ is, the lower the detection accuracy and the longer the training time. We also note that the larger the dimension is, the more time the PE computation needs. When $m = 5$, we obtain the best experimental results. Comparing the two classifiers, ELM gives better stability and higher accuracy than SVM.

Table 10: Experimental results with $m = 4$.
Table 11: Experimental results with $m = 5$.
Table 12: Experimental results with $m = 6$.
Table 13: Experimental results with $m = 7$.

5.3.3. Transportation Modes Detection Based on PE and the Elementary Features

We then gradually add the other elementary features to PE. In ELM, we adopt Sigmoid as the activation function and the training data size is 50%. Tables 14–18 present the detection results with different feature sets. We obtain about 80% accuracy with PE + AV as features. The accuracy shows no obvious increase after adding HCR, SR, and VCR. When we use the features PE, AV, HCR, SR, VCR, and DV, the accuracy decreases. This is partly because there is a negative correlation between different features.

Table 14: Detection results with PE + AV.
Table 15: Detection results with PE + AV + HCR.
Table 16: Detection results with PE + AV + HCR + SR.
Table 17: Detection results with PE + AV + HCR + SR + VCR.
Table 18: Detection results with PE + AV + HCR + SR + VCR + DV.

6. Conclusions

In this paper, we have proposed a transportation mode detection method based on PE and ELM. We employ the PE of speed as the feature of trajectory segments. The low computational complexity of PE makes it an attractive feature. Experimental results on the GeoLife dataset show that the PE of speed is a valid feature for detecting transportation modes from trajectory segments and, used alone, obtains more than 50% accuracy. We also apply ELM as a classifier and confirm that ELM runs faster and obtains a higher accuracy than SVM in our experiments.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (2014XT04).


  1. K. Waga, A. Tabarcea, M. Chen, and P. Fränti, “Detecting movement type by route segmentation and classification,” in Proceedings of the 8th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing, pp. 508–513, Pittsburgh, Pa, USA, October 2012.
  2. K. Waga, A. Tabarcea, and P. Fränti, “Context aware recommendation of location-based data,” in Proceedings of the 15th International Conference on System Theory, Control, and Computing (ICSTCC '11), pp. 1–6, IEEE, Sinaia, Romania, October 2011.
  3. J.-P. Rodrigue, C. Comtois, and B. Slack, The Geography of Transport Systems, Routledge, New York, NY, USA, 2008.
  4. F. Biljecki, H. Ledoux, and P. van Oosterom, “Transportation mode-based segmentation and classification of movement trajectories,” International Journal of Geographical Information Science, vol. 27, no. 2, pp. 385–407, 2013.
  5. W. Bohte and K. Maat, “Deriving and validating trip purposes and travel modes for multi-day GPS-based travel surveys: a large-scale application in the Netherlands,” Transportation Research Part C: Emerging Technologies, vol. 17, no. 3, pp. 285–297, 2009.
  6. Y. Zheng, L. Liu, L. Wang, and X. Xie, “Learning transportation mode from raw GPS data for geographic applications on the web,” in Proceedings of the 17th International Conference on World Wide Web (WWW '08), New York, NY, USA, April 2008.
  7. J. L. Wolf, M. G. S. Oliveira, P. Troped, C. E. Mathews, E. K. Cromley, and S. J. Melly, “Mode and activity identification using GPS and accelerometer data,” in Proceedings of the 85th Annual Meeting of the Transportation Research Board, Washington, DC, USA, January 2006.
  8. Y. Zheng, Q. Li, Y. Chen, X. Xie, and W.-Y. Ma, “Understanding mobility based on GPS data,” in Proceedings of the 10th International Conference on Ubiquitous Computing (UbiComp '08), pp. 312–321, ACM, Seoul, South Korea, September 2008.
  9. L. Stenneth, O. Wolfson, P. S. Yu, and B. Xu, “Transportation mode detection using mobile phones and GIS information,” in Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (GIS '11), pp. 54–63, ACM, Chicago, Ill, USA, November 2011.
  10. S. Reddy, M. Mun, J. Burke, D. Estrin, M. Hansen, and M. Srivastava, “Using mobile phones to determine transportation modes,” ACM Transactions on Sensor Networks, vol. 6, no. 2, article 13, 2010.
  11. P. Stopher, E. Clifford, J. Zhang, and C. FitzGerald, Deducing Mode and Purpose from GPS Data, Institute of Transport and Logistics Studies, Sydney, Australia, 2008.
  12. A. Bolbol, T. Cheng, I. Tsapakis, and J. Haworth, “Inferring hybrid transportation modes from sparse GPS data using a moving window SVM classification,” Computers, Environment and Urban Systems, vol. 36, no. 6, pp. 526–537, 2012.
  13. M. Garbe, C. Bünnig, A. Gutschmidt, and C. Cap, “Moving type detection without time information,” in Proceedings of the 6th IEEE International Conference on Semantic Computing (ICSC '12), pp. 318–324, Palermo, Italy, September 2012.
  14. C. Bandt and B. Pompe, “Permutation entropy: a natural complexity measure for time series,” Physical Review Letters, vol. 88, no. 17, Article ID 174102, 2002.
  15. Y. Cao, W.-W. Tung, J. B. Gao, V. A. Protopopescu, and L. M. Hively, “Detecting dynamical changes in time series using the permutation entropy,” Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 70, no. 4, Article ID 046217, 2004.
  16. Z. Li, G. Ouyang, D. Li, and X. Li, “Characterization of the causality between spike trains with permutation conditional mutual information,” Physical Review E, vol. 84, no. 2, Article ID 021929, 2011.
  17. A. A. Bruzzo, B. Gesierich, M. Santi, C. A. Tassinari, N. Birbaumer, and G. Rubboli, “Permutation entropy to detect vigilance changes and preictal states from scalp EEG in epileptic patients: a preliminary study,” Neurological Sciences, vol. 29, no. 1, pp. 3–9, 2008.
  18. B. Graff, G. Graff, and A. Kaczkowska, “Entropy measures of heart rate variability for short ECG datasets in patients with congestive heart failure,” Acta Physica Polonica B, Proceedings Supplement, vol. 5, p. 153, 2012.
  19. L. Zunino, M. Zanin, B. M. Tabak, D. G. Pérez, and O. A. Rosso, “Forbidden patterns, permutation entropy and stock market inefficiency,” Physica A: Statistical Mechanics and its Applications, vol. 388, no. 14, pp. 2854–2864, 2009.
  20. G. Feng, G.-B. Huang, Q. Lin, and R. Gay, “Error minimized extreme learning machine with growth of hidden nodes and incremental learning,” IEEE Transactions on Neural Networks, vol. 20, no. 8, pp. 1352–1357, 2009.
  21. N.-Y. Liang, G.-B. Huang, P. Saratchandran, and N. Sundararajan, “A fast and accurate online sequential learning algorithm for feedforward networks,” IEEE Transactions on Neural Networks, vol. 17, no. 6, pp. 1411–1423, 2006.
  22. H.-J. Rong, G.-B. Huang, N. Sundararajan, and P. Saratchandran, “Online sequential fuzzy extreme learning machine for function approximation and classification problems,” IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, pp. 1067–1072, 2009.
  23. Y. Wang, F. Cao, and Y. Yuan, “A study on effectiveness of extreme learning machine,” Neurocomputing, vol. 74, no. 16, pp. 2483–2490, 2011.
  24. G.-B. Huang and L. Chen, “Enhanced random search based incremental extreme learning machine,” Neurocomputing, vol. 71, no. 16–18, pp. 3460–3468, 2008.
  25. G.-B. Huang, D. H. Wang, and Y. Lan, “Extreme learning machines: a survey,” International Journal of Machine Learning and Cybernetics, vol. 2, no. 2, pp. 107–122, 2011.
  26. C. R. Rao and S. K. Mitra, Generalized Inverse of Matrices and Its Applications, John Wiley & Sons, 1972.