Abstract

Machine learning algorithms are among the driving forces towards the success of intelligent road network systems design. Such algorithms allow for the design of systems that provide safe road usage, efficient infrastructure, and traffic flow management. One such application of machine learning in intelligent road networks is classifying different road network types that provide useful traffic information to road users. We propose a deep autoencoder model for representation learning to classify road network types. Each road segment node is represented as a feature vector. Unlike existing graph embedding methods that perform road segment embedding using the neighbouring road segments, the proposed method performs embedding directly on the road segment vectors. The proposed method performs embedding directly on the road segment vectors. Comparison with state-of-the-art graph embedding methods show that the proposed method outperforms graph convolution networks, GraphSAGE-MEAN, graph attention networks, and graph isomorphism network methods, and it achieves similar performance to GraphSAGE-MAXPOOL.

1. Introduction

Throughout the world, the number of vehicles and road users is increasing, and this has created traffic problems such as traffic congestion, accidents, and fuel cost. The rise of such traffic problems has led to the need to design and develop smart cities. Smart city design integrates physical, digital, and human systems in the built environment to facilitate the planning, construction, and management of the city’s infrastructure [1]. Smart cities cover a wide range of applications within the transport and health industries. One key element of smart city design within the transport industry is the intelligent road network system design, which aims to ensure efficient traffic flow by minimising traffic problems. Intelligent road networks have also seen a wide range of applications in the domain of autonomous vehicles.

The use of intelligent road networks system is widely accepted in many countries, and its use is not only limited to traffic flow control and information but also expands to efficient infrastructure and road safety usage. Machine learning algorithms have been the driving force behind the successes of intelligent road networks; indeed, access to big data has opened doors to the development of various intelligent road network models. One such application of machine learning in road networks is classifying different road types. Road-type classification models are becoming important, as they can be embedded in interactive maps to provide helpful traffic information to road users. Other benefits of road-type classification include efficient traffic flow management, avoidance of congested routes, avoidance of routes where accidents are likely to occur, avoidance of routes with many intersections, and model integration to autonomous vehicles.

However, modelling road networks with machine learning is complex due to the lack of available feature extraction methods for representing road-types as feature vectors. Thus, researchers have introduced deep learning embedding methods to learn the spatial information of road networks to automatically extract features in the network data. These embedding methods are termed graph representation learning (GRL) as they rely on the spatial connection of different objects within the road network structure; thus, each object’s feature vectors are constructed by leveraging its spatial connection with neighbouring objects. The main goal of GRL methods is to achieve automatic feature extraction on the non-Euclidean graph data space without relying on the actual object attributes.

In this work, however, a method for representing different road types as feature vectors is proposed, such that machine learning classification algorithms can be trained and evaluated on these features. We take full advantage of the state-of-the-art baseline road network feature extraction method proposed in [2]. Furthermore, we introduce a deep autoencoder (DAE) embedding method to reduce the dimensions of feature vectors obtained by the baseline method. We then pass the feature vectors extracted by our DAE method to several machine learning classification algorithms; we select the classifier with the highest performance measure.

The rest of the paper is organised as follows. In Section 2, we present the literature study and related works. Section 3 provides the materials and methods used to build our model. In Section 4, we present the experimental results obtained by the proposed method, and we further compare the results to some of the state-of-the-art methods found in the literature. Finally, in Section 5, we conclude our work and provide recommendations for future work.

Graph theory is the fittest paradigm for modelling road networks, as it embraces all the topological information of any road network. Apart from the spatial road networks, graphs diagrammatically represent all transport networks, including highway, transits networks, air, and water. Thus, attributes such as speed, travel times, number of lanes, and head ways can be represented. A network’s topological spatial structure is represented by graphs composed of lines and points. Lines are also called edges, while points are nodes or vertices. Therefore, graphs can represent the topology and spatial structure of a road network such that nodes represent intersections, dead-ends, and locations of interest on the roads, while edges represent the road segments between such nodes.

Machine learning in road networks has had many successes in facilitating important traffic information such as traffic forecasting [35], speed limit annotation [68], and travel time estimation [911]. However, machine learning in road networks for modelling road-type classification is often challenging due to a lack of attributes representing different road types. Thus, it sounds reasonable to apply deep learning methods to automatically learn the network’s structure and represent every road segment by aggregating its neighbouring road segments. However, solving a learning problem on graphs is challenging. This is because many widely used data types, such as images and texts, are not structured as graphs. Also, the underlying connectivity patterns on the graph-structured data are more complex and non-Euclidean.

The fundamental solution to modelling complex, non-Euclidean patterns is to learn the graph representation from a low-dimensional Euclidean space using the GRL methods. Once the low-dimensional representations are learned, graphs-related problems such as node and link predictions can be achieved. Also known as graph embedding functions, the main goal of the GRL methods is to pack the properties of every road segment into a vector with a smaller dimension; this enables road segment similarity in the original complex graph space to be quantified in the embedded feature space using standard metrics. Several embedding methods have been proposed in the literature for modelling road networks. In [12], a hybrid graph convolution neural network (HGCN) method is proposed for traffic flow prediction in highway networks, where nodes represent the toll stations while edges represent road segments between two toll stations. In addition to modelling the spatial feature of the highway, the authors achieved better traffic flow prediction by considering factors such as time, space, weather conditions, and data type of each toll station.

It is worth noting that the HGCN method proposed in [12] uses local neighbourhood aggregation to learn the spatial connection of toll stations, and it cannot integrate road segment features into the learning process. This is valid since many state-of-the-art GRL methods rely on node features only. However, road segment features in road networks not only provide the connectivity information of two nodes but can also provide important, descriptive information that could be significant for the learning representation. To tackle this problem, the notion of relational fusion networks (RFN) is proposed in [13], for the speed limit classification and estimation tasks. RFN integrates edge information on the representation learning using the novel graph convolution operator. The RFN operator aggregates information over the relations between nodes instead of aggregating the information over neighbouring nodes.

To the best of our knowledge, the work proposed in [2] is the only available work in the literature that classifies different road types on a graph dataset extracted from Open Street Maps (OSMnx). Similar to RFN, the authors used the dual graph generated by the line graph transformation of the original graph to incorporate the edge features into the learning process. Thereafter, a method for generating road segment features is proposed based on information such as the length of the road segment, speed limit, and midpoint coordinates of the adjacent start and end nodes. The authors further compared the performance of learning representation using several embedding methods, including graph convolution networks (GCN) [14], GraphSAGE [15], graph attention networks (GAT) [16], and graph isomorphism network (GIN) [17] in inductive and transductive tasks, and in supervised and unsupervised learning tasks. In addition, a new GRL method, graph attention isomorphism network (GAIN), is proposed.

In our work, we attempt to improve the robustness of road segment features extracted in [2], by using the deep autoencoder (DAE) model as the embedding function; furthermore, we focus on the transductive and supervised learning settings only since these are the settings that achieved the highest accuracy in [2]. Unlike most graph embedding methods proposed in the literature, our DAE model does not construct the vector representation of the target road segment by aggregating over its neighbouring segments; instead, it operates directly on the high dimensional feature vectors of each road segment and produces compact feature vectors in a much smaller dimensional space. We then pass these compact features into several machine learning algorithms and report the results using the microaveraged f1-score. Finally, we compare our highest f1-score to the f1-score obtained using the methods proposed in [2].

3. Materials and Methods

As depicted in Figure 1, our proposed method for road-type classification comprises 6 steps. First, we extract the original road network graph dataset of Linkoping city from OSMnx. Edges in the original graph represent the road segments, while nodes represent information such as intersections and crossroads. In the second step, we transform the original graph into a line graph representing road segments as nodes. In the third step, we use the original and transformed graphs to derive attributes and represent every road segment as a feature vector. To the best of our knowledge, steps 1 to 3 of our proposed method follow a similar procedure proposed in [2]. In step 4, we introduce the deep autoencoder model as the embedding function, and dimensionality reduction is performed. In step 5, we use the feature vectors obtained by our embedding function to train, validate, and test the deep neural networks, support vector machines, and K-nearest neighbor classifiers. We then select the classifier with the highest microaveraged f1-score and compare our obtained results to some of the state-of-the-art embedding methods for solving a similar task to ours.

3.1. Input Dataset

Similar to the transductive setting in [2], the input dataset used to conduct the experiments in our work is the road network graph dataset of Linkoping city. The dataset was extracted from OSMnx within a 14 km radius of the city centroid. The obtained graph dataset is represented as , where and are set of nodes and set edges, respectively. Edges represent road segments, and nodes represent crossroads, intersections, and junctions. Some of the preprocessing steps on the obtained graph involved transforming into an undirected graph, consolidating parallel edges, and intersections within a 10 m distance.

3.2. Line Graph Transformation

As depicted in Figure 2, the original graph is converted to line graph , such that edges (road segments) in become nodes in and two edges (two road segments) that share a node (intersection) in become an edge in . Transforming to has two significant advantages. Firstly, graph embedding methods in the literature are designed for nodes and not edges; thus, the transformed graph has road segments as nodes. Secondly, nodes (cross-roads, intersections, and junctions) on the original graph do not have the essential information required for road-type classification tasks. Algorithm 1 gives the steps used to transform to .

Require: Original graph
Output: Transformed graph
(1)Construct nodes.
(2)for each edge do
(3) Set as a node in .
(4)end for
(5)Construct edges.
(6)while two edges share a node do
(7) Set as an edge in .
(8)end while
3.3. Class Distribution

Road segments in OSMnx are tagged with their corresponding road-type labels, thus allowing for a supervised classification task to be accomplished. However, 15 road-type labels are obtained, and some of these labels rarely occur on our obtained dataset. Therefore, the distribution of data is highly characterised by extreme class imbalances. To tackle this problem, we follow the same technique as in [2], where the authors merged and relabelled road types as shown in Table 1.

3.4. Feature Engineering

Feature generation of each road segment is conducted by extracting its descriptive attributes from the edges of the original graph and nodes of the transformed graph. Indeed, attributes such as the width, length, number of lanes, and speed limit of light vehicles and heavy vehicles provide useful road segment information required for feature generation. Nevertheless, we generate the road segment feature vectors using four main components as in [2] to compare the results fairly. As shown in Table 2, these four components yield a 58-dimensional feature vector for every road segment.

Let represent the road segment length, be the midpoint coordinates of two nodes in longitude and latitude directions, respectively, and be the one hot encoding vector of speed limits. Then, the final feature vector of each road segment is generated using Algorithm 2.

Require: and
Output: Feature vector for each road segment
(1)for each road segment do
(2)Obtain the length of from .
(3)Compute midpoint coordinates of .
(4)Obtain geometry of .
(5)if does not have a geometry then
(6)  Convert to line geometry.
(7)  Divide into 20 equally spaced distanced points .
(8)  for to 20 do
(9)   Subtract by midpoint coordinates .
(10)  end for
(11)else
(12)  Divide geometry into 20 equally spaced distanced points .
(13)  for to 20 do
(14)   Subtract by midpoint coordinates .
(15)  end for
(16)end if
(17)Obtain one hot encoded vector of the speed limits with standard values.
(18)Concatenate features generated from steps 2 to 17.
(19)end for
3.5. Embedding with Deep AutoEncoder

We introduce the deep autoencoder (DAE) model to achieve the embedding task. In contrast to the graph embedding methods found in the literature, where road segment vector representation is obtained by aggregating over the neighbouring road segments, our DAE model performs embedding directly on the high-dimensional features of each road segment. As shown in Figure 3, our DAE model comprises three crucial components: the encoder, the embedding space, and the decoder. The encoder component takes the -dimensional road segment feature vectors as input and compresses these into the smaller dimension while preserving as much important information as possible. The preserved -dimensional feature vectors (where ) are stored in the embedding space. The decoder component aims to reconstruct the original -dimensional road segment features by decompressing the -dimensional features in the embedding space. Taking the above objectives of each component, we can therefore define the learning process of our DAE model into three steps. First, we compress the -dimensional input road segment features into -dimensional feature space in the encoder component. Then, we reconstruct the output from the small dimension using the decoder component. Finally, we calculate the error difference between the original inputs and the reconstructed outputs and adjust the weight parameters to reduce this difference.

Our DAE model is a fully connected network with an input layer, four hidden layers, and an embedding space layer on the encoder component. The decoder component comprises four hidden layers and an output layer. The output layer has the same size as the input layer in the encoder, while the size of the hidden layers in the decoder is similar to the size of the hidden layers in the encoder. We first normalise the road segment feature vectors on the encoder before feeding them to the input layer. Thereafter, we obtain the value of each neuron in the next compressed layer by computing the sum of products of values in the previous layer and their corresponding weight parameters. We then introduce nonlinearities to the network by applying the rectified linear unit (ReLU) activation function defined as . On the decoder, we decompress values in the embedding space layer and obtain values in the next decompressed layer using a similar procedure; again, the ReLU function is used as the activation function. Furthermore, we normalise the values in the output layer to be between 0 and 1 through the sigmoid function defined as . This normalisation is important since input features are also normalised. Finally, we measure the error difference between values in the input layer and their corresponding values in the output layer. Therefore, our optimisation problem is finding the set of optimal weight parameters on the encoder component that achieves the smallest possible error difference. Finally, we extract features in the embedding space layer which we later use to train, validate, and test the machine learning algorithms. Algorithm 3 shows the step-by-step implementation of our DAE model for an embedding task. The reasons for choosing the number of hidden layers and corresponding sizes will be given in greater detail in Section 4.

Require: Road segment features space:
Outputs: Road segment embedded features space:
(1)Define encoder parameters:
(2)Input layer: .
(3)Hidden layer1: , Activation = ReLu.
(4)Hidden layer2: , Activation = ReLu.
(5)Hidden layer3: , Activation = ReLu.
(6)Hidden layer4: , Activation = ReLu.
(7)Embedding layer , Activation = ReLu.
(8)Define decoder parameters:
(9)Hidden layer1: , Activation = ReLu.
(10)Hidden layer2: , Activation = ReLu.
(11)Hidden layer3: , Activation = ReLu.
(12)Hidden layer4: , Activation = ReLu.
(13)Output layer: , activation = Sigmoid.
(14)Define DAE model: model(encoder, decoder)
(15)fordo
(16)Fit input feature vectors to DAE model.
(17)Initialise weights randomly.
(18)Obtain reconstructed feature vectors .
(19)Compute the error difference:
(20)while error difference is not converging do
(21)  Update weight parameters.
(22)end while
(23)Store weights parameters.
(24)Obtain the embedding features vector
(25)
(26)end for
(27)Return embedding space features
3.6. Road Segment Classification

We use the obtained embedded features, , in -dimensional feature space (where ) to compare the performance of deep neural networks (DNN), support vector machines (SVM), and K-nearest neighbors (K-NN) classifiers for road-type classification of road classes mentioned in Section 3.3. These classifiers were chosen for comparison as they are deemed adequate for multiclass classification tasks across various applications [1823]. Furthermore, these classifiers represent three unique learning methods: the artificial neural networks, the hyperplane-based, and the instance-based learning methods.

The DNN classifier belongs to a family of artificial neural networks where the network’s underlying parameters are fine-tuned to match a given class label for each input vector. The SVM is a hyperplane-based learning method that transforms nonlinearly separable input features into a high-dimensional feature space where input features can be separated linearly. The K-NN classifier belongs to the family of instance-based learning methods; unlike the SVM classifier, where two classes are trained simultaneously, the K-NN achieves multiclass classification tasks in one go, where feature vectors (with class labels) representing multiple classes are stored in a feature space. The K parameter is used to decide the class label of the unlabelled vector. Thus, comparing these three classifiers will signify the best learning method for a road-type classification task.

We initially divided the input features into train and test datasets. We perform the 10-fold cross-validation method on the training dataset to obtain optimal parameters for each classifier; then, we use the test dataset to obtain the microaveraged f1-score of each classifier based on the optimal parameters.

3.6.1. Deep Neural Networks

The DNN classifier is a fully connected network with the input layer, two or more hidden layers, and the output layer. The size of the input layer corresponds to the number of components of road segment feature vectors , and the size of the output layer corresponds to the number of road-type classes . The size and number of hidden layers are often fine-tuned for optimal results. The embedded road segment features are passed into the input layer; the outputs from the input layer are fed into the layer, the outputs from the layer are fed into the layer, and so on, and ultimately the outputs from the layer are fed into the layer; equations (1) and (2) are used to obtain the value of the neuron of the layer, , by taking the sum of products of values previous layer and their corresponding weight parameters , where , and is the size of the layer.where is the bias term, is the size of the layer, and .

In equation (3), Sigmoid function, , is applied to compress outputs to be between 0 and 1 and thus obtain probabilities that a given road segment feature vector belongs to a class.

Equation (3) determines the predicted class label for each road segment input feature vector . Given a training sample representation , such that determines the class of , and is the size of the training sample. The training of our DNN classifier is made by first obtaining the predicted class label, based on the randomly initialised weight parameters for each road segment vector, . The error incurred between predicted outputs and actual class labels for all the training samples is measured by the following formula:

During the training process, the parameters vector will be updated using the following formula:where is the learning rate parameter, and is the gradient calculated using the backpropagation. Algorithm 4 shows the steps used to classify road segment features using the DNN classifier.

Require: : size of input layer; : size of output layer
 Training set:
 Test set:
Output: DNN structure: : Optimal weight, : number of layers
 List of labels such that is the class label of the Test set element :
(1)Training phase:
(2)Initialise weight parameters structure
(3)Define number of hidden layers, , and corresponding sizes .
(4)while optimal parameters are not obtained do
(5)for each training sample do
(6)  Calculate neurons value using equations (1) and (2).
(7)  Obtain the predicted output using equation (3).
(8)end for
(9)Calculate loss using equation (4).
(10)Obtain the updated weight parameters using equation (5).
(11)end while
(12)Store optimal weight parameters .
(13)Classification phase:
(14)for each road segment vector do
(15)Predict class of
(16)end for
(17)Return
3.6.2. Support Vector Machines

We perform the multiclass road-type classification task using the one vs. one support vector machines (SVM) formulation. In one vs. one SVM, we train two road-type classes at a time; thus, for road-type classes, we obtain a total of SVM classifiers; we then assign a class label to the unknown road segment feature vector based on the class with majority counts. For any pair of road-type classes with road segment features from the training dataset and the corresponding class labels and , we construct an optimal hyperplane that separates the two classes with the largest possible margin as shown in Figure 4. This margin is defined as the distance between vectors nearest to the optimal hyperplane (support vectors) from both classes.

The and planes defined by and represent the boundaries for feature vectors that belong to two distinct road-type classes. The margin which must then be maximised for the optimal hyperplane is the distance: between the and planes. We maximize by solving a dual optimisation problem defined as

Subject to the following constraints:

Furthermore, we introduce Lagrangian’s multipliers to eliminate the constraints, and we obtain the dual SVM formulation defined as maximizing

Solving the dual SVM yields the coefficients of , feature vectors where are the support vectors, and they lie directly on the and planes. The dual SVM problem is 0 for . Thus, the SVM optimisation problem is affected only by the support vectors. The optimal hyperplane for assigning a class label to an unknown road segment feature vector is done by evaluating the following function:

Radial basis function (RBF) kernel is used to transform nonlinearly separable features to a higher features space where they can be separated linearly. We compute the transformation by taking the dot product between any pairs of feature vectors using a Kernel function: . The RBF kernel is defined as follows:

Algorithm 5 outlines the steps used to classify road segment features using the SVM classifier.

Require: Training set:
 Test set:
Output: Parameters: ,
 List of labels such that is the class label of the Test set element :
(1)Classifier training.
(2)Find that satisfies equation (8)
(3)Compute
(4)Find the set of indices such that
(5)Compute
(6)Classifier testing
(7)for each road segment feature do
(8)Assign class label to using equation (9)
(9)end for
(10)Return
3.6.3. K-Nearest Neighbors

-nearest road segment feature vectors from the training dataset are used to assign a class label to the unknown feature vector. Thus, given road segment features with their corresponding class labels from the training dataset as , we calculate the distance between the unknown vector and vectors in the training dataset using the Euclidean distance defined aswhere is the value of the component of the vector .

We then define a set of features from the training dataset nearest to the unknown feature to assign its class label. According to equation (12), the unknown feature vector is assigned to a class that appears the most within a set of -nearest feature vectors.where is the class of sample , is the class label of the vector , and the function is defined as follows:

Algorithm 6 lists the steps used to classify road segment features using the K-NN classifier. Inputs are the feature vectors initially divided into training and test datasets. We use the training dataset to train the classifier and obtain the optimal through cross-validation. We use the test dataset to obtain the accuracy of the K-NN classifier based on the optimal K value.

Require: Training set:
 Test set:
Output: List of labels such that is the class label of the Test set element :
(1)for i = 1 todo
(2)for j = 1 to m do
(3)  Compute the distance of to the element of the training set
(4)end for
(5)Compute the set of the -nearest vectors to based on
(6)Compute the class label of using equation (12)
(7)end for
(8)Return

4. Experimental Results and Discussion

The Linkoping city road networks graph dataset [24] is used to train our DAE embedding method. Embedded features are used to train, validate, test, and compare DNN, SVM, and K-NN classifiers for road segment classification tasks. Classifier with the highest microaveraged f1-score is selected, and results achieved are compared to some state-of-the-art embedding methods found in the literature for solving a similar problem. Experiments are designed mainly to obtain optimal parameters on our DAE embedding method and classifiers.

4.1. Embedding with Deep AutoEncoder

Input to our DAE embedding method is a total of 6761, and 58-dimensional road segment feature vectors described in Section 3.3. First, the dataset is divided into a 70/30 split, where of the dataset is used to train the DAE model while the remaining portion of the dataset obtains optimal parameters of the DAE method. Optimisation is achieved through the Adam optimiser, while the batch size and maximum iterations were chosen as 1024 and 500, respectively. As depicted in Table 3, we define several DAE models with varying numbers of hidden layers and sizes on the encoder and decoder component and the layer size on the embedding space. This is done to identify optimal parameters that achieve the lowest reconstruction error and the highest accuracy using the validation dataset.

We train each DAE model listed in Table 3 using different learning rates , and we report the results in terms of the reconstruction error and accuracy based on the validation dataset after 500 iterations. We then select the DAE model with a corresponding learning rate that achieves the lowest average reconstruction error and the highest accuracy as our optimal DAE embedding method.

Figures 57 show the performances of three DAE models at increasing learning rates in terms of reconstruction error and validation accuracy. It can be observed that the DAE model with 5 hidden layers and embedding space of 4 units achieved the lowest reconstruction error and the highest accuracy of 0.0013 and , respectively, at a learning rate of . The DAE model with 4 hidden layers and embedding space of 8 units achieved the lowest reconstruction error and the highest accuracy of 0.000578 and , respectively, at a learning rate of . Finally, the DAE model with 3 hidden layers and embedding space of 10 units achieved the lowest reconstruction error and the highest accuracy of 0.000623 and , respectively, at a learning rate of . Based on these observations, we select our DAE embedding method using the model with 4 hidden layers and embedding space of 8 units as it achieves better performance.

4.2. Road-Type Classification

Input dataset to the DNN, SVM, and K-NN classifiers is 6761 road segment features of 8 dimensions obtained by our DAE embedding method. The dataset comprises 5 classes of merged and relabelled road types according to the method described in Section 3.3. We initially divided the dataset into a 70/30 split, where 70% of the data are used to train and validate the classifiers based on the 10-foldcross-validation method. We use the remaining 30% of the data to test the classifiers’ performance in the microaveraged f1-score based on the optimal parameters obtained by the 10-foldcross-validation method. We obtain the micro f1-score of each classifier by first computing the confusion matrix; thereafter, we calculate the sums of the true positives , false positives , and false negatives across all the classes.

4.2.1. Classification with DNN

Road-type classification with DNN was achieved using the steps mentioned in Algorithm 4. Optimisation was performed using the Adam optimiser, and the ReLU function as the activation function. Various learning rates and the number of hidden layers and sizes were investigated to identify the optimal parameters. The number of hidden layers and sizes was as follows: Layers A = {64, 128, 64}, Layers B = {64, 128, 128, 64}, Layers C = {400, 800, 800, 400}, and Layers D = {300, 500, 800, 800, 500, 300}. Through the 10-foldcross-validation method, we obtained optimal learning rate of and six hidden layers (Layers D), respectively, as indicated in Table 4.

We used the obtained optimal parameters to train and test the DNN classifier, and we report the results obtained from the test dataset using the microaverage f1-score shown in Table 5.

4.2.2. Classification with SVM

Road-type classification with SVM was achieved using the steps mentioned in Algorithm 5. The one vs. one SVM formulation was used, thus giving a total of 10 classifiers. Through the 10-foldcross-validation method, we obtained optimal RBF kernel width of and the error term parameter of as indicated in Table 6.

We then used the obtained optimal parameters to train and test the SVM classifier; we report the results obtained from the test dataset using the microaverage f1-score as shown in Table 7.

4.2.3. Classification with K-NN

Road-type classification with K-NN was performed using the steps mentioned in Algorithm 6. Through the 10-foldcross-validation method, we obtained the optimal K value of 5 as shown in Table 8.

We used the obtained optimal parameter to train and test the K-NN classifier, and we report the results obtained from the test dataset using the microaverage f1-score as shown in Table 9.

Tables 57 show the performances of three classifiers using microaveraged f1-score for road segment classification tasks. It can be observed that the DNN is the best-performing classifier with the micro f1-score of 80.16%. The second best performing classifier is the K-NN, with the micro f1-score of 76.13%. The SVM is the worst-performing classifier with the micro f1-score of 70.98%.

4.3. Comparison to Other Methods

We then compared the highest microaveraged f1-score obtained by our DAE model with some of the state-of-the-art embedding methods presented in [2] for the road-type classification task. Some of the similarities between our study and the study proposed in [2] are as follows: both studies were carried out using the Linkoping City road network graph dataset extracted from OSMnx, both studies transform the original graph to line graph to obtain more descriptive features for each road segment, and both studies similarly construct a 58-dimensional feature vector representing each road segment. Training of embedding methods for both studies is achieved using 500 iterations and 1024 batch size while optimisation is achieved using the Adam optimiser. The major difference between the two studies is how embedding is achieved. Our DAE method achieves embedding by reducing the dimensionality of each road segment feature vector, while the methods proposed in [2] achieve embedding on each road segment vector by aggregating information from its neighbouring road segments. The final embedded vector obtained from our DAE method has 8 dimensions, while the methods proposed in [2] have the final embedded vectors obtained from one of the output dimensions .

Table 10 shows the comparison of the performance of the methods in terms of the micro f1-score. Our DAE embedding method achieves the micro f1-score of more than when compared to raw features (original 58-dimensional features without embedding). Furthermore, our DAE method outperforms the GCN, GSAGE-MEAN, GAT, and GIN methods by micro f1-score of , , , and , respectively. We also observe that our DAE method achieves the same micro f1-score of as the GSAGE-MAXPOOL method. Finally, our DAE methods fall short by micro f1-score of compared to the GSAGE-MEANPOOL and GAIN methods.

One of the reasons why the proposed method outperforms the state-of-the-art graph embedding methods is that the two methods perform embedding differently. Our DAE acknowledges that not all 58 features representing each road segment are necessary; thus, it performs embedding by reducing the dimensionality of each road segment from 58 dimensions to 8 dimensions representing the most prominent features. On the graph embedding methods, embedding on each road segment feature vector is performed by using the feature vectors of neighbouring road segments; while this allows for modelling spatial connection of road segments, the fact that some features in each road segment are not necessary is ignored, thus yielding less performance compared to our DAE method.

5. Discussion

This study presents a novel representation learning method for a road-type classification task. Compared to other methods found in the literature, which normally perform embedding on each road segment by aggregating information from neighbouring road segments, our method performs embedding by reducing the dimensionality of each road segment while preserving only the important features using the deep autoencoder (DAE) model. To compare the methods fairly, we conducted our experiments using the Linkoping city road networks graph dataset extracted from OSMnx. We then used the same line graph transformation and feature engineering methods as in [2] to represent road segments as nodes and obtain more descriptive features of each road segment, respectively. We then passed the road segment vectors to our DAE embedding methods, obtaining more robust features at much smaller dimensions than the original ones.

We then passed the vectors obtained by our DAE embedding method to the deep neural networks (DNN), support vector machines (SVM), and K-nearest neighbor classifier (K-NN) classifiers to select best performing classifier using the microaveraged f1-score. As shown in Tables 59, we demonstrated that the DNN is the best performing classifier for road-type classification task of the vectors obtained by our DAE method. We compared our DAE method to some of the state-of-the-art methods experimented in [2] for solving a similar task. These methods include graph convolution networks (GCN), GraphSAGE (MEAN, MEANPOOL, MAXPOOL, and LSTM), graph attention networks (GAT), graph isomorphism networks (GIN), and graph attention isomorphism networks (GAIN).

In Table 10, we demonstrated that our method outperforms the GCN, GSAGE-MEAN, GAT, and GIN methods while achieving similar performance to GSAGE-MAXPOOPL. Furthermore, we observed that our method falls short by , compared to the GSAGE-MEANPOOL and GAIN methods. It is worth mentioning that GSAGE-MEANPOOL and GAIN embedding methods achieve the best performances at much larger dimensions of the embedded feature vectors compared to our method, which achieves comparable performance at a much smaller dimension of the embedded vectors. We also note from Tables 59 that merging and relabelling different road types using the method shown in Table 1 is not ideal as several classes (class 2 and class 3) are characterised by many false negatives across all three classifiers, thus, resulting in low micro f1-score in all classifiers.

6. Conclusion

This paper proposes a novel deep autoencoder (DAE) embedding method for road-type classification tasks. We used the state-of-the-art feature extraction method found in the literature and represented each road segment as a feature vector. We then applied our DAE embedding method and obtained embedded road segment features which we later used to train, validate, and test several machine learning classifiers. We compared our results to several state-of-the-art graph embedding methods and demonstrated that our method outperforms some of these methods while achieving comparable results to others. It is worth noting that our method performs embedding by reducing the dimensionality of each road segment vector. In contrast, the graph embedding methods in the literature achieve road segment embedding using the neighbouring road segment features. Therefore, future work will employ a double embedding technique where the vectors obtained by our DAE method are fed as inputs to the graph embedding methods proposed in the literature.

Data Availability

The datasets analyzed during the current study are available at https://planet.openstreetmap.org/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank the authors in [2] for publicly making the road network extraction and feature engineering scripts available for experimentation purposes.