A Structural Comparison between the Origin-Destination Matrices Based on Local Windows with Socioeconomic, Land-Use, and Population Characteristics

Afandizadeh Zargari, Shahriar; Memarnejad, Amirmasoud; Mirzahossein, Hamid

doi:https://doi.org/10.1155/2021/9968698

Journal of Advanced Transportation

Special Issue

Road Traffic Performance and Transport Systems in the Cities

Research Article | Open Access

Volume 2021 | Article ID 9968698 | https://doi.org/10.1155/2021/9968698

Shahriar Afandizadeh Zargari,¹ Amirmasoud Memarnejad,¹ and Hamid Mirzahossein²

Academic Editor: Elżbieta Macioszek

Received 31 Mar 2021

Accepted 28 May 2021

Published 14 Jun 2021

Abstract

Origin-destination (OD) matrices express the number and the pattern of trips distributed between OD pairs. Structural comparison of OD matrices can be used to identify different mobility patterns in cities. A comparison of two OD matrices can express their difference in both numerical and structural terms. A limited number of methods, such as the mean structural similarity (MSSIM) index and the geographical window-based structural similarity index (GSSI), have been developed to compare the structural similarity (SSIM) of two matrices. These methods calculate the structural similarity of two OD matrices by grouping the OD pairs into local windows. The results obtained from the MSSIM depend entirely on the dimensions of the chosen windows. Meanwhile, the GSSI method focuses only on the geographical adjacency and correlation of zones when selecting local windows. Accordingly, this paper develops a novel method named the Socioeconomy, Land-use, and Population Structural Similarity Index (SLPSSI), in which local windows are selected according to socioeconomic, land-use, and population properties for SSIM comparison of OD matrices. The proposed method was tested on Tehran’s OD matrix extracted from cell phone Global Positioning System (GPS) data. The advantage of this method over the two previous ones was observed in determining the new pattern of trips on local windows and in more precise detection of the SSIM of the weekdays. The SLPSSI approach is up to 10 percent more accurate than the MSSIM method and up to 5.5 percent more accurate than the GSSI method. The proposed method also performed better on sparse OD matrices: it determines the SSIM of sparse OD matrices better than the GSSI method by up to 8 percent. Finally, the sensitivity analysis results indicate that the suggested method is robust and reliable since it is sensitive to applying both constant and random coefficients.

1. Introduction

An origin-destination (OD) matrix of urban trips indicates the demand for trips between different traffic zones [1]. The matrix provides transportation engineers with important information on the characteristics and patterns of trips in cities. There are two aspects in an OD matrix comparison:

(1) The numerical value of each cell of the matrix, which indicates the number of trips between OD pairs

(2) The structure of trip distribution among different traffic zones, which shows the matrix structure

In comparing OD matrices, both of these aspects are important. A difference between the numerical values of corresponding cells in two OD matrices indicates a difference in the number of trips between them and, consequently, a probable difference in link flows after the matrices are assigned to the network. Likewise, a difference in trip distribution structures raises the possibility of different analyses of the trip patterns. Therefore, it is crucially important to consider both numerical and structural aspects when comparing the estimated matrix with the real matrix in OD matrix estimation processes. Previous studies have employed various methods to compare two OD matrices. Some of these methods, referred to as traditional methods in this paper, consider only the numerical difference or similarity of two matrices, while others, called structural methods, consider numerical and structural differences simultaneously. Two matrices are identical when they are structurally similar and have no difference in the numerical values of their cells. For a better description, Figure 1 presents two sample matrices with structural and numerical differences.

Figures 1(a) and 1(b) illustrate two different OD matrices, T₁ and T₂, which show the number of trips from each zone listed in the left column to the zones listed in the top row. Overall, the two matrices are not structurally similar. As shown in T₁, the preferred order of destinations for trips from origin A is B, C, A, and D. Meanwhile, the preferred order of destinations for trips from origin A in T₂ is A, D, C, and B. Hence, the destination preference of trips from origin A differs between the two matrices. Row B likewise shows no structural similarity (SSIM) between the two matrices. However, in both matrices, C, B, D, and A are the preferred destinations for trips from origin C. Thus, row C is largely consistent in structure across the two matrices while having different numerical values. The fourth rows of both matrices show the trip demand from origin D to the other traffic zones, and they are structurally and numerically identical. From a statistical point of view, however, a slight difference in cell values can be neglected when judging whether two rows are identical.

In previous studies, many traditional methods have been used for comparing OD matrices, including root mean square error (RMSE) [2–4], normalized root mean square error (RMSN) [1, 5], mean square error (MSE) [6], mean absolute error ratio (MAER) [7], mean absolute percentage error (MAPE) [8, 9], Theil’s goodness of fit (GU) [10], R-squared (R²) [11], and entropy measure (E) [12]. These methods use mathematical formulations to compare OD matrices cell by cell and are not capable of comparing groups of origins and destinations. This means that they cannot analyze the structural difference between matrices caused by differences in the choice of trip destinations [13]. For instance, these measures could not detect the structural difference between two matrices when one of them is obtained by multiplying a base OD matrix by a constant and the other is built from random entries [14].
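The limitation described above can be illustrated with a minimal sketch. The matrices below are hypothetical values chosen for illustration, not data from the paper, and `destination_ranking` is an illustrative helper rather than a measure used in the cited studies. Both candidate matrices have the same RMSE against the base matrix, yet only one of them preserves the order of preferred destinations in each row.

```python
import math

def rmse(a, b):
    """Cell-by-cell root mean square error between two equally sized matrices."""
    squared = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return math.sqrt(sum(squared) / len(squared))

def destination_ranking(row):
    """Destination indices ordered from the most to the least trips."""
    return sorted(range(len(row)), key=lambda j: -row[j])

# Hypothetical base matrix (two origins, four destinations).
base = [[40, 30, 20, 10],
        [10, 20, 30, 40]]

# Candidate 1: every cell shifted by +6, so the destination order is unchanged.
shifted = [[v + 6 for v in row] for row in base]

# Candidate 2: alternating -6 / +6, which reorders the preferred destinations.
reshuffled = [[v + (-6 if j % 2 == 0 else 6) for j, v in enumerate(row)] for row in base]

print(rmse(base, shifted), rmse(base, reshuffled))
print([destination_ranking(r) for r in base])
print([destination_ranking(r) for r in reshuffled])
```

A cell-by-cell error measure reports the two candidates as equally distant from the base matrix, which is the blind spot that structural methods aim to address.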

Unlike the traditional methods, structural methods for comparing OD matrices are scarce in the literature and can be summarized as follows:

(1) Mean structural similarity (MSSIM) index [15]

(2) Geographical structural similarity index (GSSI) [13]

(3) Wasserstein distance [16]

(4) Normalized Levenshtein distance for OD matrices (NLOD) [14]

The MSSIM consists of three components. The mean and standard deviation relations determine the numerical similarity between two OD matrices, while the covariance measure detects the SSIM. Therefore, it can define the similarity or difference between OD matrices [15]. However, it has some disadvantages, including the following:

(1) The results depend on the dimensions of the chosen window

(2) Smaller and more precise windows require higher computation time

(3) Since the OD pairs have no correlation with each other in a local window, the SSIM value does not necessarily illustrate any particular concept [17]

The GSSI method, which was designed based on the MSSIM to overcome these problems, has not been able to overcome all of its weaknesses either. For instance, it uses the geographical locations of zones to design local windows, which solves the problem concerning the dimensions of the chosen windows. However, the window boundaries, which are mostly predefined according to the geographical location of zones, could instead be defined based on socioeconomic indices, land use, and population. While the Wasserstein distance method carefully detects the structural and numerical similarity between two OD matrices, it is very time-consuming due to its optimization nature. The NLOD method was developed based on the Levenshtein distance, which was originally used for comparing two strings of text. Like the Wasserstein distance method, it compares the flows of OD matrices using optimization techniques: it calculates the minimum structural distance required to convert one matrix into the other.

This paper focuses on the MSSIM and GSSI methods, with the aim of making effective changes and remedying the mentioned problems. Regardless of the geographical locations of Traffic Analysis Zones (TAZs), this study has considered the following properties to offer a new tool for developing the MSSIM:

(1) The socioeconomic properties of the TAZs, such as the level of residents’ employment and private vehicle ownership per capita

(2) Land-use properties, such as the area of each land-use class

(3) Population properties, that is, the population of each zone

In this research, the computational structure of the MSSIM method was improved with new windows designed according to the socioeconomic indices, land use, and population. Accordingly, Socioeconomy, Land-use, and Population Structural Similarity Index (SLPSSI) is introduced and implemented during this research.

The remainder of the paper is structured as follows: Section 2 reviews the literature on the use of statistical measures for OD matrix comparison; Section 3 presents the methodology and explains in detail the development process of the proposed measure, SLPSSI; Section 4 compares the proposed method with the MSSIM and GSSI methods through a case study using the GPS-OD dataset from the city of Tehran; Section 5 tests its robustness through sensitivity analysis; and finally, Section 6 concludes the paper.

2. Literature Review

Extensive studies have addressed the development of performance assessment indices for transportation using numerical comparison of OD matrices [18]. However, little attention has been devoted to structural comparison methods. As mentioned before, traditional methods are not capable of comparing groups of OD pairs and therefore cannot determine the structural similarity or difference between two matrices [14].

2.1. Structural Comparison Indices

Limited studies have been conducted on the structural comparison of OD matrices. Most indices in this field are from other scientific fields as explained in the following sections.

2.1.1. MSSIM Index

Wang et al. [19] used the MSSIM for the first time to compare two images. They showed that two images with equal MSEs relative to a base image may have different MSSIMs compared to the same original image. Djukic [17] suggested using MSSIM to compare two OD matrices assuming that each cell represents one pixel in the image. In this method, by defining a local virtual window, a group of OD pairs is compared. Finally, the values associated with each window for the whole matrix are averaged. The dimensions of the selected windows are necessarily smaller than the matrix dimensions. For example, for an M × N matrix, the dimensions of the chosen m × n window are such that m ≤ M and n ≤ N. Figure 2 demonstrates a sample window for an assumed constant T matrix and how a 2 × 2 local window moves. The dimensions of the chosen local window do not have to be the same for calculating the MSSIM.

Figure 2: A sample 2 × 2 local window moving over the matrix T, panels (a)–(i).

In order to calculate the local SSIM, each figure (a–i) is compared with a matrix that has a similar local window (the matrix that is decided to evaluate its similarity with matrix T). For instance, matrices T₁ and T₂ illustrated in Figure 1 are compared in the first local window (a), as shown in Figure 3.

Figure 3: Matrices T₁ and T₂ compared in the first local window, panels (a) and (b).
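The way a local window moves over a matrix can be sketched as follows. The sketch assumes a 4 × 4 matrix and a stride of one cell, which would be consistent with the nine panels (a–i) of Figure 2; the matrix values and the function names are hypothetical and only serve the illustration.

```python
def sliding_windows(n_rows, n_cols, win_rows, win_cols):
    """Yield the top-left (row, column) of every window position, moving one cell at a time."""
    for r in range(n_rows - win_rows + 1):
        for c in range(n_cols - win_cols + 1):
            yield r, c

def extract_window(matrix, r, c, win_rows, win_cols):
    """Return the sub-matrix covered by the window whose top-left cell is (r, c)."""
    return [row[c:c + win_cols] for row in matrix[r:r + win_rows]]

# Hypothetical 4 x 4 OD matrix (origins in rows, destinations in columns).
T = [[5, 12, 9, 3],
     [7, 4, 11, 6],
     [2, 8, 6, 10],
     [9, 1, 3, 7]]

positions = list(sliding_windows(4, 4, 2, 2))
first_window = extract_window(T, 0, 0, 2, 2)
print(len(positions), first_window)
```

Comparing two matrices window by window then amounts to extracting the same position from both matrices and evaluating the similarity measure on each pair of sub-matrices.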

The MSSIM calculation consists of three mathematical relations. These relations evaluate the similarity or difference of the chosen window by comparing the mean, standard deviation, and covariance values. Equation (1) compares the means of the selected windows, that is, µ_x and µ_y; this component is called luminance and is denoted by l(x, y). Equation (2) compares the standard deviations (σ_x and σ_y); it is called contrast and is denoted by c(x, y). Finally, equation (3) compares the covariance (σ_xy) between the entries of the chosen local window in the two OD matrices; it is called structure and is denoted by str(x, y). The first two components evaluate the numerical similarity or difference between two matrices, while the third assesses their SSIM. In these equations, x and y are the groups of origins and destinations located in the same window in matrices X and Y. Equation (4) is the product of equations (1)–(3). Parameters c₁, c₂, and c₃ stabilize the solutions when the mean or standard deviation is zero. Generally, c₃ is taken to be c₂⁄2. Previous studies reported the values of 10⁻¹⁰ and 10⁻² for the coefficients c₁ and c₂, respectively [20]. These parameters could be zero unless the mean and standard deviation of the windows are zero. Powers α, β, and γ in equation (4) denote the importance and effects of the three components, that is, mean, standard deviation, and covariance, respectively. These powers are normally set to one. With these assumptions, the SSIM relationship leads to equation (5). Equation (6) gives the average of the SSIM values over L local windows. The MSSIM varies between −1 and 1. A value of 1 indicates two identical matrices, while comparing two inverse matrices leads to −1.

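$$l(x, y) = \frac{2\mu_x \mu_y + c_1}{\mu_x^2 + \mu_y^2 + c_1} \tag{1}$$

$$c(x, y) = \frac{2\sigma_x \sigma_y + c_2}{\sigma_x^2 + \sigma_y^2 + c_2} \tag{2}$$

$$\mathrm{str}(x, y) = \frac{\sigma_{xy} + c_3}{\sigma_x \sigma_y + c_3} \tag{3}$$

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \, [c(x, y)]^{\beta} \, [\mathrm{str}(x, y)]^{\gamma} \tag{4}$$

Assuming $\alpha = \beta = \gamma = 1$ and $c_3 = c_2 / 2$:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{5}$$

$$\mathrm{MSSIM}(X, Y) = \frac{1}{L} \sum_{l=1}^{L} \mathrm{SSIM}(x_l, y_l) \tag{6}$$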
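As a minimal sketch of how the three components described above combine, the following code evaluates the simplified SSIM (powers set to one and c₃ = c₂⁄2) on flattened windows and averages it over several windows. It follows the standard formulation of Wang et al. [19]; the use of sample (n − 1) variances, the function names, and the example values are assumptions made here for illustration, not details confirmed by the paper.

```python
C1 = 1e-10  # stabilizing constant for the mean term, value reported in [20]
C2 = 1e-2   # stabilizing constant for the variance term, value reported in [20]

def ssim(x, y, c1=C1, c2=C2):
    """Simplified SSIM of two equally long lists of window entries."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    denom = max(n - 1, 1)  # sample estimates; an assumption of this sketch
    var_x = sum((v - mu_x) * (v - mu_x) for v in x) / denom
    var_y = sum((v - mu_y) * (v - mu_y) for v in y) / denom
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / denom
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator

def mssim(windows_x, windows_y):
    """Average SSIM over paired windows, each given as a flat list of cells."""
    values = [ssim(x, y) for x, y in zip(windows_x, windows_y)]
    return sum(values) / len(values)

# Hypothetical window entries.
same = ssim([1, 2, 3, 4], [1, 2, 3, 4])
reversed_pattern = ssim([1, 2, 3, 4], [4, 3, 2, 1])
print(same, reversed_pattern)
```

Identical windows score 1, while a window whose pattern is reversed yields a negative value because the covariance term changes sign.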

Although the MSSIM can recognize the structural similarity or difference of two OD matrices, it has the following flaws:

(1) No report has addressed the procedure for choosing the dimensions of local windows and its relation to the result of the comparison between two matrices. Likewise, no article has been found showing how the MSSIM can identify the structural difference of two matrices when each whole matrix is considered as one window. Reducing the dimensions of local windows increases both the accuracy and the computational cost.

(2) Since this method is based on comparing images, it is sensitive to the adjacency of image pixels. In other words, since the numerical order of the origin and destination numbers does not necessarily indicate the closeness of the origin and destination [20], errors may occur when calculating the correlation between zones in the local window. Therefore, previous studies have suggested considering the whole matrix as one window [13].

(3) It is not clear how the constant coefficients should be chosen, and further investigation is required to determine them.

(4) In this method, the OD pairs located in the same window do not necessarily have any correlation with each other, and the SSIM has no particular meaning for each local window. When the matrix is ordered by the value of flow between origins and destinations, the zones with high volumes are positioned beside each other; however, calculating the MSSIM on such ordered matrices is not suitable when comparing days whose flow volumes follow different orders (like Sundays and Fridays). Therefore, it is again suggested to calculate the MSSIM on a window with the dimensions of the whole matrix.

2.1.2. Supplementary Methods for MSSIM

The GSSI [13] and 4D-MSSIM [21] were developed to overcome the flaws of the MSSIM, that is, the lack of clarity in choosing the dimensions of local windows and the missing spatial adjacency between the TAZs in a chosen local window.

The 4D-MSSIM identifies the adjacency of OD pairs using spatial distance estimation. The obtained Euclidean distance is used to calculate the adjacency distance between OD pairs. This method seeks to select and classify the zones adjacent to each other. The classification of these zones depends on selecting the reference zone from which the distances of the other zones would be calculated. Moreover, the Euclidean distance cannot express the spatial neighborhood of zones precisely. For instance, the TAZs separated by natural or artificial features like rivers or highways are spatially close to each other, even though the trip distance between them is longer.

In the GSSI method, the origins and destinations are first sorted according to the geographical locations. Afterward, the local windows used in the MSSIM method are determined according to the previously defined geographical boundaries and located on higher layers. In this method, the size of local windows varies depending on the number of TAZs in each geographical area. Each local window chosen according to the geographical boundaries is called a geographical window. This method provides a new way to organize windows in the SSIM method. However, it still uses the geographical distance index and predefined boundaries for geographical areas to determine local window dimensions.

2.1.3. Optimization-Based Methods

The Wasserstein distance method has been used for the optimal transport of goods, that is, to identify the proper distribution of goods from given origins to destinations. The Wasserstein distance between two OD matrices determines the minimum travel time required to assign the first matrix considering the assignment pattern of the second matrix [16]. The goal of this method is to match two matrices so as to determine the lowest travel time, so it can naturally determine the numerical and structural differences of two matrices. It is a linear programming problem solved with an optimization technique. Although this method properly detects the difference between distributions in various matrices, or in other words recognizes the structural difference between two matrices, it is computationally expensive. Therefore, it cannot be easily used for OD matrices with large dimensions; in other words, it is not suitable for complex urban networks [16].

The normalized Levenshtein distance for OD matrices (NLOD) method [14] calculates the structural and numerical differences between two matrices using an optimization tool that determines the similarity between two strings. The Levenshtein distance is the minimum cost required to transform one string into another by insertion, deletion, or substitution of elements (here, the values of trips between zones), operations that resemble the mutation and crossover operations of the Genetic Algorithm (GA) optimization technique. The normalized Levenshtein distance has rarely been used in transportation engineering. In most studies, it is used to manipulate strings that may consist of characters, variables, numbers, and so on, for example, to compare the license plates of vehicles [22], to compare time series [23], and to cluster sequences of trip purposes and activity-travel patterns [24]. In the NLOD method, each row of the OD matrix is sorted according to its traffic flow. Afterward, the minimum changes needed to convert each row of the intended matrix into the corresponding row of the main matrix are determined, considering both the numbers in the cells and the distribution of trips in each row. Eventually, a normalized number between zero and one is reported for the whole matrix (zero for two identical matrices) [14].
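The string distance underlying this approach can be sketched with the classic dynamic-programming Levenshtein algorithm. This is the plain edit distance on destination orderings, not the NLOD formulation of [14], which also accounts for the numbers in the cells; the division by the longer sequence length is a simple normalization assumed here for illustration. The orderings are the ones described for Figure 1 in the Introduction.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]

def normalised_levenshtein(a, b):
    """Edit distance divided by the longer length (an assumed, simple normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

# Preferred destination orders described for Figure 1.
row_a_t1, row_a_t2 = "BCAD", "ADCB"   # origin A in T1 and T2
row_c_t1, row_c_t2 = "CBDA", "CBDA"   # origin C in T1 and T2

print(normalised_levenshtein(row_a_t1, row_a_t2))  # 1.0: completely different order
print(normalised_levenshtein(row_c_t1, row_c_t2))  # 0.0: identical order
```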

2.2. Summary of the Literature Review

The primary methods for comparing OD matrices, which focus only on the numerical cell-to-cell comparison of two matrices, cannot specify the structural difference of two matrices that are numerically similar overall. Other methods, such as the MSSIM and its developed versions, including GSSI and 4D-MSSIM, can successfully detect the structural similarity or difference between two OD matrices. However, given the points mentioned above, such as the way local windows are chosen in the GSSI method, which ignores socioeconomic, land-use, and population parameters, and the error in selecting adjacent zones in the 4D-MSSIM method, even these developed methods require further improvement. On the other hand, more complex methods like NLOD and the Wasserstein distance have high computational costs because of the optimization nature of the problem. Therefore, improving the MSSIM method with the approach developed in this study provides an opportunity to calculate the structural difference of OD matrices more precisely.

3. Methodology

First, the proposed method classifies traffic zones based on their ability to generate and attract trips. Five indices (FI) define each TAZ: the resident population, car ownership per capita, the population of employees, and the areas of land used for commercial and for administrative purposes. These indices are potential factors for producing and attracting trips in traffic zones. The first three indices produce trips, while the last two, that is, the areas of commercial and administrative land use, attract trips. In this process, TAZs are divided into a given number of classes. To categorize similar zones, each zone is assigned a number between zero and one (the normalized value) for each of the five indices. The total score of each zone is the average of all indices. Then, based on these scores, the zones are clustered with the k-means method so that zones with the highest similarity fall into the same class. Numerous methods have been used to cluster traffic data and zones, among which the k-means method has provided the best results [25, 26]. The TAZs in each class have similar trip production and attraction potential. The local windows are placed on these classes to compare OD matrices (Figure 4).
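The scoring and clustering steps described above can be sketched as follows. The sketch assumes min-max normalization for the zero-to-one scaling and applies a basic one-dimensional k-means (Lloyd's iterations with a deterministic start) to the averaged score; the paper does not specify these details, and the zone data and function names are hypothetical.

```python
def min_max_normalise(values):
    """Scale a list of values to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def zone_scores(indices_by_zone):
    """Average of the normalized indices for each zone (one row per zone)."""
    columns = list(zip(*indices_by_zone))
    normalised = [min_max_normalise(list(col)) for col in columns]
    return [sum(vals) / len(vals) for vals in zip(*normalised)]

def kmeans_1d(scores, k, iterations=100):
    """Cluster scalar scores into k classes; class k-1 holds the highest scores."""
    ordered = sorted(scores)
    # Deterministic start: centres spread evenly over the sorted scores.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(scores)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(s - centres[c])) for s in scores]
        new_centres = []
        for c in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres

# Hypothetical zones with two indices each (the paper uses five).
demo_indices = [[0, 0], [1, 1], [10, 10], [11, 11], [20, 20], [21, 21]]
scores = zone_scores(demo_indices)
labels, centres = kmeans_1d(scores, k=3)
print(labels)
```

Zones sharing a label would then form one class, and the local windows are defined on pairs of such classes.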

According to equation (5), the SSIM value can be calculated on the selected local windows considering the socioeconomic, land-use, and zone population properties. In this paper, that equation is applied on L local windows and denoted by SLPSSI, according to equation (8):

$$\mathrm{SLPSSI}(X, Y) = \frac{1}{L} \sum_{l=1}^{L} \mathrm{SSIM}(x_l, y_l) \tag{8}$$

In this equation, X and Y denote the matrices to be compared. Furthermore, x_l and y_l indicate the series of OD pairs in the l-th local window. The SLPSSI varies from −1 to 1. Equation (7) only gives the structural difference of two matrices, which also varies between −1 and 1.
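A minimal sketch of this windowing idea is given below: each pair of zone classes (origin class, destination class) defines one local window, the simplified SSIM is evaluated on every window, and the values are averaged. The class labels, the matrices, and the function names are hypothetical, and the SSIM helper follows the standard formulation with sample variances, which is an assumption of the sketch rather than a detail taken from the paper.

```python
def ssim(x, y, c1=1e-10, c2=1e-2):
    """Simplified SSIM of two equally long lists of window entries."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    denom = max(n - 1, 1)
    var_x = sum((v - mu_x) * (v - mu_x) for v in x) / denom
    var_y = sum((v - mu_y) * (v - mu_y) for v in y) / denom
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / denom
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2))

def class_windows(labels):
    """Yield (origin zones, destination zones) for every ordered pair of classes."""
    classes = sorted(set(labels))
    members = {g: [i for i, lab in enumerate(labels) if lab == g] for g in classes}
    for g in classes:
        for h in classes:
            yield members[g], members[h]

def windowed_ssim(X, Y, labels):
    """Average SSIM over the windows defined by the zone classes."""
    values = []
    for rows, cols in class_windows(labels):
        x = [X[i][j] for i in rows for j in cols]
        y = [Y[i][j] for i in rows for j in cols]
        values.append(ssim(x, y))
    return sum(values) / len(values)

# Hypothetical 4-zone example with two classes, giving 2 x 2 = 4 windows.
labels = [0, 0, 1, 1]
X = [[5, 12, 9, 3], [7, 4, 11, 6], [2, 8, 6, 10], [9, 1, 3, 7]]
Y = [[6, 10, 3, 9], [7, 5, 6, 12], [8, 2, 6, 9], [1, 9, 7, 4]]
print(windowed_ssim(X, X, labels), windowed_ssim(X, Y, labels))
```

With five classes the same construction yields 25 windows, whose sizes depend on how many zones fall into each class.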

Figure 5 provides the steps of employing the suggested method.

3.1. Implementation of the Suggested Windowing Method on the OD Matrix of Tehran

To investigate the appropriateness of the method developed in this paper, the suggested windowing method is evaluated on the network of Tehran. The city of Tehran and its surroundings have 731 TAZs, 699 of which are inside the internal boundary of the city. This is the finest zoning considered for Tehran, and comprehensive urban and suburban transportation studies of the city have been conducted at this level. This level is called Zone Level 1 (ZL 1) hereafter. On the next level up, the city is divided into 122 urban zones. Most likely, these zones have been separated based on geographical characteristics, including natural features (such as floodways) and manmade ones (like highways). This level is called Zone Level 2 (ZL 2) hereafter. Finally, on the topmost level, Tehran consists of 22 municipal districts, which form the urban management framework. This level is called Zone Level 3 (ZL 3) hereafter. Figure 6 demonstrates the zoning of Tehran at each of these levels.

In order to compare OD matrices, the matrix estimated by GPS data recorded through navigation software is used. In this procedure, the total data of locations recorded by more than 400 thousand Neshan application users in Tehran throughout one month has been used. It is worth mentioning that numerous methods have previously been used to estimate OD matrices using cellphone data [27] and GPS data [28]. In the present study, the OD matrix of the flow estimated from the GPS data is used only to evaluate the suggested method in actual conditions. The matrix is obtained by daily analyses in one month of the users’ location data, including 326 million records. For estimating the matrix, computational rules are employed to detect the stops and movements of individuals. It should be noted that the obtained matrix is not necessarily the actual OD matrix of trips; in other words, the start and end of trips in this matrix are not necessarily the start and end of the main trips of the citizens. The matrix is only used to test the proposed method.

After estimating the OD matrix of Tehran at ZL 1, the OD matrix is aggregated on ZL 2 to be used in the present study. The aggregation eliminates the sparsity of the OD matrix, leading to a 122 × 122 matrix during investigation. The whole stages of choosing and clustering similar zones are determined in ZL 2. The five parameters consist of the ratio of the employees to the total population, car ownership per capita, area of administrative land uses, area of commercial land uses, and population aggregated for each zone on ZL 2. The zones with the highest similarity in producing and attracting trips are clustered, as shown in Table 1. The group of zones with the highest scores is classified as SLP Group5, and the other zones are classified accordingly.
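The aggregation from a finer to a coarser zoning level can be sketched as a sum of cells over a zone-to-parent mapping. The small matrix and the mapping below are hypothetical; in the study the mapping would assign each ZL 1 zone to one of the 122 ZL 2 zones.

```python
def aggregate_od(od, zone_to_parent, n_parents):
    """Sum the trips of a fine-level OD matrix into a coarser-level OD matrix."""
    aggregated = [[0] * n_parents for _ in range(n_parents)]
    for i, row in enumerate(od):
        for j, trips in enumerate(row):
            aggregated[zone_to_parent[i]][zone_to_parent[j]] += trips
    return aggregated

def zero_share(matrix):
    """Fraction of cells with no trips, a simple indicator of sparsity."""
    cells = [v for row in matrix for v in row]
    return sum(1 for v in cells if v == 0) / len(cells)

# Hypothetical fine-level matrix with four zones grouped into two parent zones.
fine = [[0, 1, 2, 0],
        [3, 0, 0, 4],
        [0, 5, 0, 6],
        [7, 0, 8, 0]]
zone_to_parent = [0, 0, 1, 1]

coarse = aggregate_od(fine, zone_to_parent, n_parents=2)
print(coarse, zero_share(fine), zero_share(coarse))
```

The total number of trips is preserved, while the share of empty cells drops, which is the effect on sparsity mentioned above.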

Figure 7 indicates the number of zones in each cluster. The local windows used in the SLPSSI equations are defined on the zones with similarity on ZL 2 (Figure 8).

The suggested method is also compared with basic MSSIM and GSSI to evaluate its performance and advantage. Accordingly, to design local windows in the GSSI method, 122 urban zones (ZL 2) are divided into northern, southern, eastern, western, and central zones in a higher layer as shown in Figure 9.

4. Results

In this section, the results of the proposed method (SLPSSI) are compared with MSSIM and GSSI and its advantages are discussed. It should be noted that the working days in Iran are from Saturday to Wednesday, Thursday is the weekend of many companies and offices, and also Friday is the national weekend.

4.1. Evaluation of Travel Patterns in Local Windows

The local windows suggested by the proposed method allow for comparing the travel patterns between groups of zones with similar socioeconomic, land-use, and population properties. Since the windows are fixed, each window can reveal the nature of the structural difference or similarity within trips. This cannot be concluded from the base MSSIM method because its local windows are not fixed. Moreover, the GSSI method only expresses the travel patterns that coincide with the geographical locations, while the proposed method can extract patterns independent of the geographical locations of zones. For this purpose, the OD matrix of Sunday (a working day), 13 October 2019, was compared with that of Friday (weekend), 18 October 2019, using the SLPSSI method. Table 2 reports the local SLPSSI values of travel from zones with the lowest potential of trip production and attraction (SLP Group1) to the other SLP groups.

According to Table 2, the highest SSIM between the trips on Sunday and Friday belongs to trips from zones with the lowest trip attraction and production (SLP Group1) to those with the highest trip attraction and production (SLP Group5). The properties of SLP Group5 zones seem to be able to attract trips regardless of weekdays. Therefore, it seems that urban traffic planners and policymakers should establish demand management policies between these two groups of zones independent of the weekdays. It is also apparent that, to improve the quality of the public transportation system’s services, a specific program should be considered for public transportation lines between these pairs of zones on holidays. On the other hand, the lowest SSIM exists between the trips of Sunday and Friday from SLP Group1 to SLP Group3. This means that the pattern of trips from the origin of SLP Group1 zones to the destination of SLP Group3 zones on holidays is completely different from working days. Accordingly, the local SSIM between the trips from the origin of SLP Group1 to the destinations of SLP Group3 and SLP Group5 is 0.6345 and 0.9210, respectively.

According to Table 3, the SLP Group3 zones have the lowest SSIM with SLP Group1 zones. Thus, the trips from SLP Group3 zones to SLP Group1 zones follow a different pattern on holidays. It seems that the pattern of trips between the OD pair zones of SLP Group1 and SLP Group3 is completely different on holidays from working days. Therefore, public transportation program between these pairs of zones could be completely different on holidays. The evaluation of local windows in the SLPSSI method provides more information with more precise details about the travel pattern of each pair of zones.

4.2. Dimension Comparison of Local Windows Based on the Results

To show the sensitivity of the MSSIM results to the dimensions of the selected windows, the OD matrix obtained from GPS (GPS-OD) for an arbitrary Sunday is compared with the OD matrices of a Monday and a Friday. In this comparison, the dimensions of the local windows vary from 2 × 2 to 122 × 122, as shown in Figure 10. The horizontal axis represents the chosen window dimensions, and the vertical axis shows the MSSIM values. The upper line indicates the MSSIM results obtained from comparing Sunday and Monday. The lower line shows the MSSIM results of comparing Sunday with Friday. As can be seen, the MSSIM values increase as the local window dimensions increase. In other words, larger local windows lead to lower accuracy of the MSSIM in determining the SSIM between two matrices. The change in the MSSIM value is more noticeable when comparing Sunday and Friday due to the different travel patterns on these two days. It can be concluded that the smaller the selected local window, the better the similarity between the two compared OD matrices is expected to be identified.

The dimensions of the local windows in the SLPSSI method are determined according to the number of zones located in similar groups in terms of socioeconomic, land-use, and population properties. Meanwhile, the dimensions of local windows in the GSSI are determined by the number of zones located in the same geographical location. These windows are fixed but not necessarily square. The dimensions of local windows are variable in both SLPSSI and GSSI methods. For instance, for SLP Group1 and SLP Group2, the local window size is 12 × 20, while it is 29 × 8 for SLP Group3 and SLP Group5. Figure 11 demonstrates the results of comparing the average OD matrix of Sunday and other days of the week within one month of data gathering for the base MSSIM method with local windows of different sizes using SLPSSI and GSSI methods.

As shown in Figure 11, for the SSIM between the averaged OD matrices of Sundays, Thursdays, and Fridays, the GSSI yields values close to those of the MSSIM method with a local window of size 10 × 10, while the SLPSSI yields values close to those of the MSSIM method with a local window of size 5 × 5. Despite using windows with larger dimensions, the GSSI and SLPSSI methods produce results closer to reality due to the smart and logical selection of zones inside a window.

4.3. Structural Comparison of Weekdays

In this section, the GSSI, MSSIM, and SLPSSI methods are compared. For this purpose, the data of one week (from 5 October to 11 October 2019) is selected from the total data of one month. In this process, Sunday, which is an ordinary working day, is compared with the other days of the week using the three methods. As mentioned before, 25 local windows with particular dimensions are defined for GSSI, and 25 local windows with dimensions corresponding to the number of zones in each window are defined for SLPSSI. In the MSSIM method, the whole matrix is considered a single window. Figure 12 and Table 4 provide the results of the comparison. According to Figure 12, all three methods are capable of recognizing the structural and numerical differences between OD matrices of daily trips. All three methods detect the difference in the travel patterns of Thursday and Friday in comparison to Sunday.

According to Table 4, the SLPSSI method captures the dissimilarity between Sunday and Friday more precisely. The SSIM values of Sunday with Friday in MSSIM, GSSI, and SLPSSI are 0.8500, 0.8074, and 0.7630, respectively. Therefore, the SLPSSI method is up to 10% more accurate than MSSIM and up to 5.5% more accurate than the GSSI method. Moreover, the three methods also indicate the SSIM among the working days. Similar trip distribution patterns in Tehran on Sunday, Monday, and Tuesday can be observed using all three methods. The structural difference of the OD matrices of the first working day (Saturday) and the last full working day (Wednesday) compared with Sunday can only be observed using the SLPSSI. It is worth noting that some enterprises are inactive on Thursday, which is considered a weekend day for them.
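The quoted percentages appear consistent with a relative difference between the SSIM values reported for Sunday versus Friday. The short sketch below reproduces that arithmetic; the formula (reference − value) / reference and the function name `relative_reduction` are assumptions made here for illustration, since the text does not state how the percentages were derived.

```python
def relative_reduction(reference, value):
    """Relative drop of `value` with respect to `reference` (assumed formula)."""
    return (reference - value) / reference

# SSIM of Sunday with Friday as reported in the text (Table 4).
mssim, gssi, slpssi = 0.8500, 0.8074, 0.7630

vs_mssim = relative_reduction(mssim, slpssi)
vs_gssi = relative_reduction(gssi, slpssi)

print(f"SLPSSI vs MSSIM: {vs_mssim:.1%}")
print(f"SLPSSI vs GSSI:  {vs_gssi:.1%}")
```

Under this reading, the lower SLPSSI value is interpreted as a stronger detection of the working day versus weekend difference.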

According to Table 4, the structural difference detected by MSSIM for Thursday is negligible (SSIM of 0.9853), although the travel patterns on this day are completely different from those of other working days. This difference has been recognized by GSSI (0.9675) and SLPSSI (0.9324). Moreover, given its precise classification of zones with respect to socioeconomic, land-use, and population properties, the SLPSSI is able to assign zones with more similarity to each group and design more logical local windows compared to GSSI. Therefore, it gives more accurate results in calculating the structural difference/similarity of OD matrices. Although all three methods detect the structural difference between Sunday and Friday’s travel patterns, the details of these differences can only be evaluated by GSSI and SLPSSI. Tables 5 and 6 provide these details considering the local windows for the two methods.

According to Table 5, the lowest SSIM between Sunday (working day) and Friday (weekend) is observed in the trips with the origins and destinations in the south of Tehran. On the other hand, the highest SSIM can be observed in the trips with origins and destinations in the west of Tehran.

According to Table 6, SLPSSI calculates the SSIM of the OD matrices of Sunday and Friday in each local window. It seems that the suggested method better identifies the lack of SSIM of the zones located in each local window. The lowest SSIM can be observed in the zones with medium socioeconomic, land-use, and population properties (SLP Group3) between Sunday and Friday. Furthermore, no noticeable structural difference is observed between the travel patterns of Sunday and Friday in zones with a high potential of trip production and attraction (SLP Group5). The lack of SSIM observed in the SLPSSI between the zones with different properties does not necessarily depend on their geographical locations. Therefore, it shows a new aspect of difference in daily travel patterns.
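The per-window comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard SSIM form of Wang et al. with near-zero stabilising constants, assumes that the overall index is the plain average of the window values, and uses a toy 7 × 7 matrix with three hypothetical zone groups instead of the Tehran zoning. The function names `ssim` and `group_window_ssim` are introduced here for illustration only.

```python
import numpy as np

def ssim(x, y, c1=1e-10, c2=1e-10):
    """SSIM of two equally shaped windows (Wang et al. form with C3 = C2 / 2)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2)
    )

def group_window_ssim(X, Y, groups):
    """SSIM of every (origin group, destination group) window, plus their mean."""
    groups = np.asarray(groups)
    labels = np.unique(groups).tolist()
    per_window = {}
    for g in labels:
        rows = np.flatnonzero(groups == g)
        for h in labels:
            cols = np.flatnonzero(groups == h)
            block = np.ix_(rows, cols)  # window need not be square or contiguous
            per_window[(g, h)] = ssim(X[block], Y[block])
    # Assumption: equal-weight average over all windows.
    return per_window, float(np.mean(list(per_window.values())))

# Toy data: 7 zones in 3 hypothetical groups (not the Tehran zoning).
groups = [1, 1, 2, 2, 2, 3, 3]
rng = np.random.default_rng(0)
X = rng.integers(1, 50, size=(7, 7)).astype(float)
Y = X.copy()
Y[2:5, 2:5] = X[2:5, 2:5][::-1, ::-1]  # rearrange trips inside window (2, 2) only

per_window, overall = group_window_ssim(X, Y, groups)
for key, value in per_window.items():
    print(key, round(value, 4))
print("mean over windows:", round(overall, 4))
```

In this toy case only window (2, 2) departs from 1, which mirrors how Tables 5 and 6 localise the Sunday versus Friday difference to particular windows.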

4.4. Computational Time

In the MSSIM method, sliding an m × n local window over an M × N matrix yields (M − m + 1) × (N − n + 1) local windows. For instance, (122 − 2 + 1) × (122 − 2 + 1) = 14641 local windows of size 2 × 2 can be formed out of a 122 × 122 matrix. Meanwhile, there are only 25 local windows for calculating the SSIM using the GSSI and SLPSSI methods in this study. Calculations for each window are time-consuming, and the required computation time increases with the dimensions of selected windows. The smaller the dimensions of the selected local window, the longer the calculation time for the whole OD matrix. The shortest calculation time is for the 122 × 122 matrix (with a local window of the same size), and the longest calculation time is for a 2 × 2 local window in the MSSIM method. Figure 13 shows the computational time needed to compare 30 GPS-OD matrices with the Sunday OD matrix of 6 October 2019 using different window sizes for MSSIM, GSSI, and SLPSSI. Accordingly, the GSSI and SLPSSI methods are more efficient in terms of computational cost than the MSSIM with smaller window sizes. According to Figure 13, the SLPSSI approach is about 11 times faster than the MSSIM method with a 5 × 5 sliding window. There is no remarkable difference between the GSSI and SLPSSI methods in terms of calculation time.
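The window-count formula above is easy to check numerically. The sketch below evaluates (M − m + 1) × (N − n + 1) for several square window sizes on a 122 × 122 matrix and contrasts the result with the 25 fixed windows used by GSSI and SLPSSI. The function name `sliding_window_count` is chosen here for illustration, and the snippet counts windows only; it does not model the running times shown in Figure 13.

```python
def sliding_window_count(M, N, m, n):
    """Number of m x n sliding windows that fit in an M x N matrix."""
    if not (1 <= m <= M and 1 <= n <= N):
        raise ValueError("window must fit inside the matrix")
    return (M - m + 1) * (N - n + 1)

M = N = 122  # size of the OD matrix at ZL 2
counts = {size: sliding_window_count(M, N, size, size) for size in (2, 5, 10, 122)}
for size, count in counts.items():
    print(f"{size} x {size} window: {count} windows")

# GSSI and SLPSSI in this study: 5 groups of zones -> 5 x 5 fixed windows.
fixed_windows = 5 * 5
print("fixed windows:", fixed_windows)
```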

4.5. Performance Evaluation of the Suggested Method with Sparse OD Matrices

In the previous sections, the performance of SLPSSI was discussed using a matrix aggregated at ZL 2. The OD matrix at this level is very dense. However, in most studies of traffic flow patterns in cities, matrices at lower levels (e.g., ZL 1) with smaller size are considered for calculations. The OD matrix at this level has remarkable sparsity. Thus, evaluating the performance of the suggested approach at this level seems necessary. For this purpose, the OD matrix obtained from GPS data is evaluated at ZL 1. Since this OD matrix shows part of daily trips, it has the required sparsity for evaluation. Twenty-five local windows are used for each of the SLPSSI and GSSI methods. However, in the MSSIM method, the zones should be ordered according to trip production. Since the ordered matrices for the weekdays are different, the local window should be of the same size as the whole matrix.

According to Table 7, the highest similarity between the Sunday and Friday matrices (0.8075) is shown by MSSIM. Meanwhile, the GSSI and SLPSSI methods better recognize the discrepancy between these two days, with values of 0.7671 and 0.7048, respectively. Consequently, the proposed approach captures the structural difference in sparse OD matrices up to 8% better than the GSSI method. The same holds when comparing Sunday with Thursday. Furthermore, with a large local window, MSSIM is not capable of detecting the structural difference between the traffic flow patterns of Sunday and Thursday. According to Table 7, when comparing Sunday and Monday, the MSSIM is 0.9478, which differs only slightly from the MSSIM when comparing Sunday and Thursday (0.9431). The closeness of these values indicates the weakness of the MSSIM method in recognizing the structural difference between the OD matrices of Monday and Thursday. The SLPSSI method also better recognizes the structural difference of two sparse OD matrices. In sparse matrices, choosing zones with similar properties in a local window is more important. The structural similarities for dense matrices are shown in Table 4. The difference between the SSIM values of GSSI and SLPSSI when comparing Sunday and Friday is 0.8074 − 0.7630 = 0.0444 for the dense matrices, while it is 0.7671 − 0.7048 = 0.0623 for the sparse matrices according to Table 7. Therefore, the suggested method is well suited to sparse OD matrices as well.
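The dense and sparse gaps between GSSI and SLPSSI for the Sunday versus Friday comparison can be set side by side as a short worked calculation. The relative form (gap divided by the GSSI value) is an assumption about how the "up to 8%" figure is obtained; the symbols Δ_dense and Δ_sparse are introduced here only for this illustration.

```latex
\begin{align*}
\Delta_{\text{dense}}  &= 0.8074 - 0.7630 = 0.0444, &
\frac{\Delta_{\text{dense}}}{0.8074}  &\approx 0.055 \;(5.5\%),\\
\Delta_{\text{sparse}} &= 0.7671 - 0.7048 = 0.0623, &
\frac{\Delta_{\text{sparse}}}{0.7671} &\approx 0.081 \;(8.1\%).
\end{align*}
```

Both the absolute and the relative gap are larger for the sparse matrices.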

5. Sensitivity Analysis

Sensitivity analysis is a process that measures how the uncertainty and variation in the results of a mathematical model relate to the uncertainty and variation of its input data [29]. To examine the suggested method’s efficiency and robustness, a sensitivity analysis framework is designed in this section, and the efficiency of the model under changes in the input data is evaluated. The model efficiency is evaluated under different conditions, such as a change in the structure of the OD matrix or a change in the numerical values of its cells without any change in the structure. In this way, the sensitivity of the SLPSSI and SLPSTR to changes in the input variables is determined. The sensitivity analysis is performed on the OD matrix obtained from GPS data of Tehran. The OD matrix of the working day Sunday, 6 October 2019, is considered the base matrix and denoted by X. This matrix is aggregated at ZL 2, so it is a 122 × 122 matrix. The matrices for the sensitivity analysis are generated by manipulating the base matrix in two stages:
(1) Applying a constant coefficient: in this mode, a constant coefficient is applied to the base matrix, and the robustness of the suggested method in recognizing the SSIM of the resulting matrix with the base matrix is evaluated. Applying the constant coefficient 1, the base matrix is compared with itself (Mode 1). If a constant coefficient other than one is applied, a difference is formed in the numerical values of the base matrix cells, but the trip distribution structure does not change (Mode 2).
(2) Applying random coefficients to the base matrix: in this case, the changes are applied to both the values of the base matrix cells and their trip distribution structure (Mode 3).

5.1. Sensitivity Analysis Design for SLPSSI

This section discusses the sensitivity analysis of SLPSSI. The suggested method’s robustness is measured in each of the three modes defined in the previous sections.

5.1.1. Applying Constant Coefficients

The sensitivity of SLPSSI and SLPSTR is measured by applying constant coefficients. Matrix Y is obtained from matrix X by multiplying it by a constant factor α between 0.1 and 2, that is, Y = α × X. In this case, the suggested method will be efficient and robust if
(1) SLPSTR equals 1 for any value of α
(2) SLPSSI equals 1 for α = 1, and SLPSSI is lower than 1 for any other factor
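The two conditions above can be checked on a single window with the standard SSIM decomposition of Wang et al. into mean, standard deviation, and structure terms. The sketch below is an illustration under stated assumptions rather than the authors' code: it treats SLPSTR as the structure (correlation) term, uses near-zero stabilising constants, and works on a random 12 × 20 toy window instead of the Tehran matrix. The function name `ssim_components` is hypothetical.

```python
import numpy as np

def ssim_components(x, y, c1=1e-10, c2=1e-10):
    """Mean, standard-deviation and structure terms of SSIM for one window."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    sxy = ((x - mx) * (y - my)).mean()
    c3 = c2 / 2.0
    mean_term = (2 * mx * my + c1) / (mx**2 + my**2 + c1)
    std_term = (2 * sx * sy + c2) / (sx**2 + sy**2 + c2)
    structure = (sxy + c3) / (sx * sy + c3)
    return mean_term, std_term, structure

rng = np.random.default_rng(0)
X = rng.integers(1, 200, size=(12, 20)).astype(float)  # toy window, not real data

results = {}
for alpha in (0.1, 0.5, 1.0, 1.5, 2.0):
    mean_term, std_term, structure = ssim_components(X, alpha * X)
    # Store (full SSIM, structure-only term) for Y = alpha * X.
    results[alpha] = (mean_term * std_term * structure, structure)
    print(f"alpha={alpha}: SSIM={results[alpha][0]:.4f}, structure={structure:.4f}")
```

With negligible constants, the mean and standard-deviation terms each reduce to 2α / (1 + α²) for Y = α × X, so the full index equals 1 only at α = 1 while the structure term stays at 1 for every α. Over the range 0.1 to 2, the smallest factor therefore produces the largest drop in this simplified setting.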

Figure 14 shows the values of SLPSSI and SLPSTR after applying constant values for α. As can be seen, the SLPSTR does not vary with the applied factor, which indicates that applying a constant factor does not change the structure of the matrix. With a factor of one, the SLPSSI remains one, while it drops below one for other factors. Moreover, factors lower than one are more effective in reducing the SSIM between the two matrices than factors greater than one. Thus, the suggested method is efficient and robust in the case of applying constant factors.

5.1.2. Applying Variable Coefficients

In this case, the sensitivity of SLPSSI and SLPSTR is evaluated in three probable scenarios, which are commonly used in modeling traffic demand [30]. For each of these scenarios, four random coefficients are considered, θ = [5%, 10%, 15%, 20%]:
(1) A scenario for OD matrices obtained from old studies (low demand)
(2) A scenario for OD matrices similar to the base matrix (medium demand)
(3) A scenario for OD matrices in heavy traffic conditions in the network (high demand)

For each of the above modes, the base matrix X is compared with 100 replications of matrix Y, and the mean SLPSSI and SLPSTR are calculated for each scenario.

Scenario with low demand: in this scenario, SLPSSI and SLPSTR are calculated for matrices X and Y.

For example, for a given θ, Y varies from 60% to 80% of X.

Scenario with medium demand: in this scenario, SLPSSI and SLPSTR are calculated for matrices X and Y.

For example, for a given θ, Y varies from 80% to 100% of X.

Scenario with high demand: in this scenario, SLPSSI and SLPSTR are calculated for matrices X and Y.

For example, for a given θ, Y varies from 105% to 125% of X.

The proposed method is efficient and robust when the SSIM/difference varies with respect to the value of the simulated matrix in each scenario for both SLPSSI and SLPSTR. This change is appropriate when the difference between the two matrices rises with an increase in the random coefficient; that is, the SSIM between the two OD matrices reduces. The results of calculating the average SLPSSI and SLPSTR for the iterations conducted for different demand scenarios and random coefficients can be observed in Table 8.

According to Table 8, with an increase in θ (in each scenario), the average SLPSSI between the base matrix and the one made by random coefficients decreases. This trend can also be observed in the average of SLPSTR, which only compares the structure of two matrices. Therefore, the suggested method has the required efficiency and robustness to detect different SSIM of matrices by applying random coefficients.
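The trend reported for Table 8 can be illustrated with a small simulation. The sketch below rests on assumptions that are not taken from the paper: the perturbation form Y = X × (centre + θ × u) with u drawn uniformly from [−1, 1], the value centre = 0.9, the random toy base matrix, and the use of a single whole-matrix structure term in place of the windowed SLPSTR are all hypothetical choices made only to show the mechanism.

```python
import numpy as np

def structure_term(x, y, c3=1e-10):
    """Structure (correlation) term of SSIM for one window."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    sxy = ((x - x.mean()) * (y - y.mean())).mean()
    return (sxy + c3) / (x.std() * y.std() + c3)

def mean_structure(X, centre, theta, replications=100, seed=0):
    """Average structure term between X and randomly perturbed copies of X."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(replications):
        u = rng.uniform(-1.0, 1.0, size=X.shape)
        Y = X * (centre + theta * u)  # ASSUMED perturbation form, for illustration
        values.append(structure_term(X, Y))
    return float(np.mean(values))

rng = np.random.default_rng(42)
X = rng.integers(1, 100, size=(122, 122)).astype(float)  # toy base matrix

thetas = (0.05, 0.10, 0.15, 0.20)
means = [mean_structure(X, centre=0.9, theta=t) for t in thetas]
for t, value in zip(thetas, means):
    print(f"theta={t:.2f}: mean structure term={value:.4f}")
```

Under these assumptions, the averaged structure term falls as θ grows, which is the qualitative behaviour described for the average SLPSTR.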

6. Conclusion

This study presents a new method called SLPSSI for the comparison of OD matrices. A comprehensive evaluation of the similarity or difference between two OD matrices should recognize the difference between them numerically (difference in the value of each cell of the matrix) and structurally (the difference between trip distributions in the whole matrix). In general, the traditional methods only calculate the numerical deviation of two matrices and do not consider their structural differences. A limited number of studies have investigated the structural differences of matrices. Most of these methods, including MSSIM, have been taken from scientific fields other than transportation engineering. MSSIM recognizes the SSIM of two matrices by choosing local windows and moving them over the two matrices. The accuracy of the MSSIM results depends on the dimensions of the selected windows. Supplementary methods such as GSSI have also been developed on the basis of MSSIM. In GSSI, local windows are defined only with respect to the geographical locations of zones. The geographical windows cannot optimally group zones with the same characteristics. Thus, this paper has suggested a new method that classifies the zones according to the similarity of their socioeconomic, land-use, and population properties. The following can be concluded:
(1) The suggested method is capable of detecting travel patterns in local windows. This means that, given the logical selection of local windows, it allows for analyzing the travel patterns between zones in each pair of local windows.
(2) The proposed method has a lower computational time compared to the base MSSIM method. By choosing fixed local windows with particular dimensions, as in GSSI, the suggested method achieves higher computational speed. The proposed approach is shown to be about 11 times faster than the MSSIM method with a 5 × 5 sliding window.
(3) The method is capable of identifying the SSIM of the weekdays with higher accuracy compared to the previous methods. The main goal of designing this method is to provide the ability to detect the SSIM (numerical values of cells and trip distribution) between OD matrices. The SLPSSI method precisely recognizes the SSIM/difference between two matrices by evaluating the OD matrices of different days of a week with various patterns. The proposed method is up to 10% and 5.5% more accurate than the MSSIM and GSSI methods, respectively.
(4) The proposed method can also be used for sparse matrices, where it captures the structural difference up to 8% better than the GSSI method.
(5) The sensitivity analysis results showed that the proposed SLPSSI approach is a robust statistical measure and is readily deployable to practical applications that involve OD matrices comparison. The sensitivity analysis results indicate that the suggested method is sensitive to applying both constant and random coefficients. Applying constant coefficients shows that the SLPSSI could be used to compare similar matrices, for example, two working days of a week. The sensitivity of the proposed method to random coefficients showed its ability to recognize the similarity/difference between different matrices. Therefore, it could be used in the validation of predicted matrices. Thus, SLPSSI is a robust index and reliable from a statistical point of view.

Data Availability

The datasets are available from the corresponding author upon reasonable request.

Disclosure

This paper’s conclusions reflect the authors’ understandings, who are responsible for the accuracy of the research findings.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge Tehran Municipality and Rajman Information Structures Company for providing the required data for this research.

References

[1] J. Barceló, L. Montero, M. Bullejos, M. P. Linares, and O. Serch, “Robustness and computational efficiency of Kalman filter estimator of time-dependent origin-destination matrices,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2344, no. 1, pp. 31–39, 2013.
[2] B. Hellinga and M. V. Aerde, “A statistical analysis of the reliability of using RGS vehicle probes as estimators of dynamic O-D departure rates,” IVHS Journal, vol. 2, no. 1, pp. 21–44, 1994.
[3] K. Ashok and M. E. Ben-Akiva, “Estimation and prediction of time-dependent origin-destination flows with a stochastic mapping to path flows and link flows,” Transportation Science, vol. 36, no. 2, pp. 184–198, 2002.
[4] R. Frederix, F. Viti, W. W. E. Himpe, and C. M. J. Tampère, “Dynamic origin-destination matrix estimation on large-scale congested networks using a hierarchical decomposition scheme,” Journal of Intelligent Transportation Systems, vol. 18, no. 1, pp. 51–66, 2014.
[5] C. Antoniou, M. Ben-Akiva, and H. N. Koutsopoulos, “Incorporating automated vehicle identification data into origin-destination estimation,” Transportation Research Record: Journal of the Transportation Research Board, vol. 1882, no. 1, pp. 37–44, 2004.
[6] E. Cascetta, “Estimation of trip matrices from traffic counts and survey data: a generalized least squares estimator,” Transportation Research Part B: Methodological, vol. 18, no. 4-5, pp. 289–299, 1984.
[7] S.-J. Kim, W. Kim, and L. R. Rilett, “Calibration of microsimulation models using nonparametric statistical techniques,” Transportation Research Record: Journal of the Transportation Research Board, vol. 1935, no. 1, pp. 111–119, 2005.
[8] M. Cools, E. Moons, and G. Wets, “Assessing the quality of origin-destination matrices derived from activity travel surveys,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2183, no. 1, pp. 49–59, 2010.
[9] M. Nigro, E. Cipriani, and A. del Giudice, “Exploiting floating car data for time-dependent Origin-Destination matrices estimation,” Journal of Intelligent Transportation Systems, vol. 22, no. 2, pp. 159–174, 2018.
[10] J. Barceló, L. Montero, M. Bullejos, O. Serch, and C. Carmona, “A Kalman filter approach for exploiting Bluetooth traffic data when estimating time-dependent OD matrices,” Journal of Intelligent Transportation Systems, vol. 17, no. 2, pp. 123–141, 2013.
[11] A. Tavassoli, A. Alsger, M. Hickman, and M. Mesbah, “How close the models are to the reality? Comparison of transit origin-destination estimates with automatic fare collection data,” in Proceedings of Australasian Transport Research Forum, Melbourne, Australia, November 2016.
[12] X. Ros-Roca, L. Montero, A. Schneck, and J. Barceló, “Investigating the performance of SPSA in simulation-optimization approaches to transportation problems,” Transportation Research Procedia, vol. 34, pp. 83–90, 2018.
[13] K. N. S. Behara, A. Bhaskar, and E. Chung, “Geographical window based structural similarity index for origin-destination matrices comparison,” Journal of Intelligent Transportation Systems, pp. 1–22, 2020, in press.
[14] K. N. S. Behara, A. Bhaskar, and E. Chung, “A novel approach for the structural comparison of origin-destination matrices: Levenshtein distance,” Transportation Research Part C: Emerging Technologies, vol. 111, pp. 513–530, 2020.
[15] T. Djukic, S. Hoogendoorn, and H. Van Lint, “Reliability assessment of dynamic OD estimation methods based on structural similarity index,” in Proceedings of Transportation Research Board 92nd Annual Meeting, Washington, DC, USA, January 2013.
[16] A. Ruiz de Villa, J. Casas, and M. Breen, “OD matrix structural similarity: Wasserstein metric,” in Proceedings of Transportation Research Board 93rd Annual Meeting, Washington, DC, USA, January 2014.
[17] T. Djukic, “Dynamic OD demand estimation and prediction for dynamic traffic management,” Delft University of Technology, Delft, Netherlands, 2014, PhD Thesis.
[18] Y. Hollander and R. Liu, “The principles of calibrating traffic microsimulation models,” Transportation, vol. 35, no. 3, pp. 347–362, 2008.
[19] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[20] T. Pollard, N. Taylor, and T. V. Vuren, “Comparing the quality of OD matrices in time and between data sources,” in Proceedings of the European Transport Conference, AET, Frankfurt, Germany, January 2013.
[21] T. Van Vuren and T. Day-Pollard, “256 shades of grey-comparing OD matrices using image quality assessment techniques,” in Proceedings of 2015 Scottish Transport Applications Research Conference, Glasgow, Scotland, May 2015.
[22] F. M. Oliveira-Neto, L. D. Han, and M. K. Jeong, “Online license plate matching procedures using license-plate recognition machines and new weighted edit distance,” Transportation Research Part C: Emerging Technologies, vol. 21, no. 1, pp. 306–320, 2012.
[23] I. Markou, K. Kaiser, and F. C. Pereira, “Predicting taxi demand hotspots using automated Internet Search Queries,” Transportation Research Part C: Emerging Technologies, vol. 102, pp. 73–86, 2019.
[24] A. Zhang, J. E. Kang, K. Axhausen, and C. Kwon, “Multi-day activity-travel pattern sampling based on single-day data,” Transportation Research Part C: Emerging Technologies, vol. 89, pp. 96–112, 2018.
[25] H. Dong, M. Wu, X. Ding et al., “Traffic zone division based on big data from mobile phone base stations,” Transportation Research Part C: Emerging Technologies, vol. 58, pp. 278–291, 2015.
[26] S. Afandizadeh Zargari and F. Safari, “Using clustering methods in multinomial logit model for departure time choice,” Journal of Advanced Transportation, vol. 2020, Article ID 7382569, 12 pages, 2020.
[27] Y. Asakura and E. Hato, “Tracking survey for individual travel behaviour using mobile communication instruments,” Transportation Research Part C: Emerging Technologies, vol. 12, no. 3-4, pp. 273–291, 2004.
[28] B. Sana, J. Castiglione, D. Cooper, and D. Tischler, “Using Google’s passive data and machine learning for origin-destination demand estimation,” Transportation Research Record: Journal of the Transportation Research Board, vol. 2672, no. 46, pp. 73–82, 2018.
[29] Z.-P. Du and A. Nicholson, “Degradable transportation systems: sensitivity and reliability analysis,” Transportation Research Part B: Methodological, vol. 31, no. 3, pp. 225–237, 1997.
[30] T. Djukic, M. Bullejos, L. Montero, and S. Hoogendoorn, “Advanced traffic data for dynamic OD demand estimation: the state of the art and benchmark study,” in Proceedings of TRB 94th Annual Meeting Compendium of Papers, Washington, DC, USA, January 2015.

Copyright

Copyright © 2021 Shahriar Afandizadeh Zargari et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
