Abstract

In recent years, free-floating bike-sharing systems (FFBSSs) have developed considerably in China. Because no bike stations need to be constructed, such a system can substantially reduce cost compared with traditional bike-sharing systems. However, FFBSSs have also become a major cause of parking disorder, especially during the morning and evening rush hours. To address this issue, local governments have stipulated that FFBSSs must deploy virtual stations near public transit stations and major establishments. The location assignment of virtual stations therefore needs careful consideration in FFBSSs, because it must resolve parking disorder and satisfy user demand simultaneously. The purpose of this study is to optimize the location assignment of virtual stations so as to meet the growing demand of users by analyzing the usage data of shared bikes. This optimization problem is formulated as a mixed-integer linear programming (MILP) model that maximizes user demand. As an alternative solution method, this article proposes a clustering algorithm that can solve the problem in real time. The experimental results demonstrate that the MILP model and the proposed method are superior to the K-means method. Our method not only provides a solution that maximizes user demand but also gives an optimized design scheme of the FFBSSs that reflects the characteristics of virtual stations.

1. Introduction

Bike-sharing systems (BSSs), which offer a mobility service with public bikes available for shared use, are becoming popular in urban environments. With growing awareness of green transportation, BSSs provide an alternative, sustainable, and carbon-free mode of transportation (especially for short-distance trips) that supports green growth in urban environments and significantly reduces traffic congestion, pollution, and noise. BSSs permit travelers to rent a bike at a station and then return it to any station with vacant lockers. Numerous studies have been conducted to optimize the location of stations [1], design the shared bike network [2, 3], and maximize the capacity levels [4].

In recent years, an innovative system for the management of bike-sharing, called free-floating BSSs (FFBSSs), is gradually developing as an emerging technology [5]. This new system can avoid the necessity of docking stations and kiosk machines with relevant physical and information communication technology infrastructures [6]. Two recent studies have demonstrated the advances of FFBSSs. Caggiani et al. [7] proposed a novel method for the strategic design of FFBSSs whose facilities could be allocated in the territory according to spatial and social equity principles. Leonardo et al. [5] proposed a dynamic and operator-based bike redistribution method that could pursue a decision-making process for the relocation of FFBSSs operating area by predicting the number and position of shared bikes.

In the traditional BSSs (see Figure 1), if people want to go to the bus station from their homes in the morning rush hours, they can unlock a shared bike at the bike station near their home and ride it to the bus station. They can use a smartphone to find the location of a bike station beside the bus station; however, the users may face the following two problems: there is no shared bike at the bike station near their homes, or there is no parking space at the bike station near the bus station. In either case, they have to find another bike station, which is very inconvenient and wastes a lot of time. Moreover, the BSSs’ companies need to build bike stations near the home, bus station, and company, which dramatically increases the maintenance cost of the company. In contrast, FFBSSs can solve such problems, as shown in Figure 2. In the morning rush hours, people can find a shared bike at any virtual station near their home and then park it anywhere near the bus station and company. They do not need to be concerned about empty spaces in the virtual station. As the virtual stations are not physical infrastructures, the FFBSSs’ companies do not need to build and maintain bike stations, which substantially decreases the cost to the company. Furthermore, FFBSSs have other advantages such as dynamic adjustment and easy management.

However, though the FFBSSs are superior to the traditional BSSs, it is still a challenging task to decide the location of virtual stations to maximize the satisfaction of user requirements. To address this issue, this study assumes that the people traveling during the morning rush hours use a similar traffic route when they return during the evening rush hours. They expect to find a shared bike in the same place where they had parked it during morning rush hours. The main parking spot in Figure 2 represents the bike parking place of the majority of people. If the bike parking place of most people changes, the main parking spot will also change. As illustrated in Figure 2, during the morning rush hours, people obtain a shared bike from the virtual station and ride it from their homes to the bus station or from the bus station to their companies. Most people would park the shared bikes at the main parking spot, where there is a high user demand during the evening rush hours because most people expect to find a shared bike at the same location during the evening rush hours. The main parking spot of the morning rush hours will probably become the location of virtual stations of FFBSSs during the evening rush hours. Similarly, the main parking spot of evening rush hours has high user demand during the morning rush hours the following day. Therefore, the main parking spot of evening rush hours will probably become the location of virtual stations of FFBSSs during morning rush hours the following day. Therefore, optimizing the location of virtual stations while maximizing user demand during morning and evening rush hours is required by both the FFBSSs’ companies and local governments.

The problem can be described as a mixed-integer linear programming (MILP) model that maximizes user demand, and we can use CPLEX and the clustering algorithm [8–10] to obtain the solution. As this problem is a classic nondeterministic polynomial-time hard (NP-hard) problem, the computational time rises exponentially as the data size increases. CPLEX cannot obtain the exact solution in real time, whereas the clustering algorithm can find an approximate solution in a finite number of iterations. Compared with other methods, the proposed clustering algorithm not only captures the characteristics of virtual stations of FFBSSs but also outperforms the K-means method [11–15].

The rest of this paper is organized as follows. Section 2 reviews numerous studies related to optimizing the location of shared bike stations. Section 3 describes and formulates the proposed problem. Section 4 presents our clustering algorithm. Section 5 depicts the numerical examples. Finally, Section 6 gives our conclusion.

2. Literature Review

The station location of BSSs has been studied for decades. Several studies have investigated the optimal location of stations [1], the network design of bike lanes [2, 16], and the maximization of capacity levels [4, 17]. The station location of BSSs is a strategic decision that depends on the preliminary goals of the system. The location can be decided efficiently with the support of an optimization model called the facility location model [18, 19]. This model considers various objectives, e.g., the minimization of the overall costs and transportation costs and the maximization of demand coverage. Lin and Yang [2] proposed an integer nonlinear program to determine the optimal location of stations. However, the purpose of that work was to minimize cost, and the authors did not consider the relocation of bicycles. Martinez et al. [1] employed a mixed-integer linear program to optimize the location of shared bike stations through a heuristic process. The main purpose of that study was to maximize revenue. Romero et al. [17] presented a bilevel mathematical programming model to optimize the location of public bicycle stations. That paper considered a simulation optimization method that related public bicycles to private cars. Raviv et al. [20] proposed an inventory model for the management of bike-sharing that uses a user dissatisfaction function to assess the quality of the relocation service. This model identified the initial inventory of each station so as to minimize the dissatisfaction function. Church and ReVelle [21] introduced the maximal covering models to maximize the demand coverage. These models and their applications have been widely used for determining the location of BSSs stations based on the maximization of covered demand [22].

In recent years, FFBSSs were gradually developed instead of BSSs [5]. There are only two state-of-the-art papers related to the location of virtual stations of FFBSSs. Caggiani et al. [7] proposed a strategic designing methodology of FFBSSs whose facilities could be allocated in the territory according to spatial and social equity principles. Leonardo et al. [5] proposed a dynamic and operator-based bike redistribution methodology that starts from the prediction of the number and position of bikes over an FFBSSs operating area and ends with a decision support system for the relocation process. Therefore, the key to solve the problem of the location of FFBSSs’ virtual stations is to create an optimization model that considers the maximization of user demand and characteristics of FFBSSs’ virtual stations. In this study, the location of virtual stations was formulated as a MILP model. We also used a clustering algorithm to solve it. In the field of optimizing location of FFBSSs virtual stations, the user demand during morning and evening rush hours has received limited attention. To our knowledge, there is no research that has discussed the location of virtual stations while taking the effect of user demand during morning and evening rush hours into account.

3. Problem Description and Formulation

3.1. Problem Description

In this section, we explain the use of a MILP model to describe the problem of optimizing the location of FFBSSs’ virtual stations. The shared bikes of FFBSSs can be parked without physical stations. However, we could still obtain the parameters of the model by analyzing the usage data of the shared bikes provided by the FFBSSs’ companies. The model yields the optimal design scheme of virtual stations of FFBSSs under the objective of maximizing user demand. We represented the user demand by the number of shared bikes of all virtual stations. The model assumes that virtual stations cannot exist in isolation: within a certain distance, each virtual station has at least one other virtual station as a support, so that the stations form a mesh structure. We used the concept of adjacent virtual stations to represent this mutual support. A lower bound and an upper bound were placed on the distance between adjacent virtual stations i and j. The lower bound was also the minimum distance between any two virtual stations. When the distance between two virtual stations lay between the lower bound and the upper bound, these virtual stations were considered mutually supportive. When virtual candidate station j was adjacent to virtual candidate station i, the corresponding entry of the adjacency matrix was equal to one; otherwise, it was equal to zero. Owing to the limited area, the quantity of shared bikes in each virtual station was limited.
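The adjacency rule described above can be illustrated with a short Python sketch. The function name `adjacency_matrix`, the argument names, and the example distances are hypothetical, and the distance interval is assumed to be closed at both ends; the text does not state whether the bounds are inclusive.

```python
from typing import List, Sequence


def adjacency_matrix(dist: Sequence[Sequence[float]],
                     lower: float, upper: float) -> List[List[int]]:
    """Return a 0/1 matrix whose (i, j) entry is 1 when candidate stations
    i and j are adjacent, i.e. lower <= dist[i][j] <= upper (assumed closed)."""
    n = len(dist)
    return [[1 if i != j and lower <= dist[i][j] <= upper else 0
             for j in range(n)]
            for i in range(n)]


# Hypothetical pairwise distances in metres between three candidate stations.
example_dist = [[0, 500, 1500],
                [500, 0, 300],
                [1500, 300, 0]]
example_adj = adjacency_matrix(example_dist, lower=400, upper=1000)
```

With these illustrative distances, only the first two candidate stations support each other: the third is too far from the first and too close to the second.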

Figure 4 shows the location of ten shared bikes. Each location could be considered as the location of a virtual candidate station of FFBSSs. We used the concept of service radius to represent the service area of a virtual candidate station. For example, suppose that shared bike No. 1 is a virtual candidate station. Virtual candidate station No. 1 covers shared bike No. 3 within its service radius. Therefore, the number of bikes of virtual candidate station No. 1 is one.

3.2. Formulation

The problem is formulated using the following notation:

Sets/Indices

N: Set of all virtual candidate stations

Parameters

Lower bound of distance between adjacent virtual stations i and j
The adjacency matrix of adjacent virtual candidate stations i and j
The number of bikes of virtual candidate station i
The distance between virtual candidate stations i and j
Maximum capacity of virtual station i
Minimum capacity of virtual station i
K: The number of virtual stations
M: A large positive number

Decision Variables

A binary variable for each virtual candidate station i, equal to one if virtual candidate station i is a virtual station and zero otherwise.

Formulation

The objective function (1) of this linear program maximizes the user demand by maximizing the number of bikes of virtual stations. Constraint (2) indicates that K virtual stations are selected from the virtual candidate stations. Constraint (3) guarantees that each virtual station has at least one adjacent virtual station. Constraint (4) ensures that the distance between virtual stations i and j is greater than or equal to the lower bound of the distance between adjacent virtual stations. Adjacent virtual stations are virtual stations in the network. This lower bound also represents the minimum distance between any two virtual stations. The number of bikes of a virtual station is bounded above by the maximum capacity of the virtual station according to constraint (5) and below by the minimum capacity according to constraint (6). Constraint (7) is the binary constraint for the decision variables.
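As a hedged illustration, one model that is consistent with the verbal description of (1) to (7) can be written as follows. Apart from N, K, M, i, and j, every symbol here is assumed notation introduced only for this sketch: $x_i$ (binary selection variable), $b_i$ (number of bikes of candidate station i), $d_{ij}$ (distance), $a_{ij}$ (adjacency entry), $l$ (lower distance bound), and $Q_i^{\max}$, $Q_i^{\min}$ (capacities). The exact form of the original constraints may differ.

```latex
\begin{align}
\max\ & \sum_{i \in N} b_i x_i \tag{1}\\
\text{s.t.}\ & \sum_{i \in N} x_i = K \tag{2}\\
& \sum_{j \in N,\ j \neq i} a_{ij} x_j \ge x_i, && \forall i \in N \tag{3}\\
& d_{ij} \ge l - M\,(2 - x_i - x_j), && \forall i, j \in N,\ i \neq j \tag{4}\\
& b_i x_i \le Q_i^{\max}, && \forall i \in N \tag{5}\\
& b_i \ge Q_i^{\min} x_i, && \forall i \in N \tag{6}\\
& x_i \in \{0, 1\}, && \forall i \in N \tag{7}
\end{align}
```

In this reading, the big-M term makes constraint (4) binding only when both i and j are selected, and constraint (6) is written so that it is inactive for unselected candidates.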

4. Solution Method

4.1. Clustering Algorithm

According to user demand during the morning and evening rush hours, this study proposed a clustering algorithm to optimize the location of virtual stations of FFBSSs by analyzing the usage data of shared bikes and considering the characteristics of location of virtual stations of FFBSSs. The algorithm randomly generated K centroids and clustered these centroids. We guaranteed that each centroid had a minimum of one adjacent centroid. We calculated the fitness function and selected the optimal solution of the best objective function value. The flowchart is presented in Section 4.2.

The procedure of the clustering algorithm is provided below.

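Step 1. Load and process data D.

Step 2. Set the lower bound and upper bound of the distance between adjacent centroids; the number of centroids K; the maximum number of iterations; the locations of the centroids U; the clusters C; the service radius L; the number of points in each cluster; the repair coefficient; and the repair neighborhoods.

Step 3. While the number of iterations has not reached the maximum, do.

Step 4. Randomly generate K centroids U from data D.

Step 5. Calculate the distance between centroids.

Step 6. If the distances between centroids respect the lower bound and each centroid has at least one adjacent centroid within the upper bound, then go to Step 7; else return to Step 4.

Step 7. Calculate the distance between points and centroids.

Step 8. If the distance between a point and a centroid is within the service radius L, then assign the point to the cluster of that centroid.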

Step 9. Repair strategy: set ; find the and repair it with , until . If the value of is better than before, put it back to cluster C. Step 9 is repeated until the value of does not change.

Step 10. Calculate the fitness function.

Step 11. Compare with the best value found so far.

Step 12. Repeat Steps 4–11 until the maximum number of iterations is reached.

Step 13. Output the best solution.
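The overall loop of Steps 3 to 13 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' C++ implementation: points are treated as planar coordinates in metres, the repair strategy of Step 9 and the capacity limits are omitted, the fitness is taken to be the total number of other points within the service radius of the chosen centroids, and all function and variable names (`is_feasible`, `bikes_in_radius`, `random_search`, `max_tries`) are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # planar coordinates in metres (assumption)


def is_feasible(centroids: Sequence[Point], lower: float, upper: float) -> bool:
    """Steps 5-6 as read here: each centroid's nearest other centroid must lie
    in [lower, upper], so no pair is closer than `lower` and every centroid
    has at least one adjacent centroid."""
    for i, c in enumerate(centroids):
        others = [math.dist(c, o) for j, o in enumerate(centroids) if j != i]
        if not others or not (lower <= min(others) <= upper):
            return False
    return True


def bikes_in_radius(data: Sequence[Point], idx: int, radius: float) -> int:
    """Steps 7-8: count the other points within the service radius of data[idx]."""
    return sum(1 for k, p in enumerate(data)
               if k != idx and math.dist(data[idx], p) <= radius)


def random_search(data: Sequence[Point], k: int, lower: float, upper: float,
                  radius: float, iterations: int, seed: int = 0,
                  max_tries: int = 1000) -> Tuple[Optional[List[int]], int]:
    """Return the indices of the best centroids found and their fitness."""
    rng = random.Random(seed)
    best: Optional[List[int]] = None
    best_value = -1
    for _ in range(iterations):                      # Steps 3 and 12
        chosen = None
        for _try in range(max_tries):                # Step 4, redrawn until Step 6 passes
            candidate = rng.sample(range(len(data)), k)
            if is_feasible([data[i] for i in candidate], lower, upper):
                chosen = candidate
                break
        if chosen is None:
            continue                                 # no feasible draw this round
        value = sum(bikes_in_radius(data, i, radius) for i in chosen)  # Step 10
        if value > best_value:                       # Step 11
            best, best_value = sorted(chosen), value
    return best, best_value                          # Step 13


# Hypothetical data: two groups of bikes about 600 m apart and one outlier.
demo_points = [(0, 0), (50, 0), (0, 50), (-50, 0),
               (600, 0), (650, 0), (600, 50),
               (3000, 3000)]
best_stations, best_demand = random_search(
    demo_points, k=2, lower=400, upper=1000, radius=200, iterations=50)
```

In the demo, any feasible pair consists of one point from each group, because two points of the same group are closer than the lower bound and the outlier is farther than the upper bound from everything else.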

4.2. Flowchart of Clustering Algorithm

See Figure 3.

5. Numerical Studies

In this section, we describe two numerical examples: a small-scale example and a large-scale example, which are used to illustrate the problem properties and the performance of the clustering algorithm. The small-scale example in Section 5.1 was solved using the branch and cut method in IBM-ILOG CPLEX 12.8 and the clustering algorithm. This example demonstrated the location of 10 shared bikes, and each location could be considered as the location of a virtual candidate station in the FFBSS. We selected the location of virtual stations from those virtual candidate stations to maximize user demand. The large-scale example in Section 5.2 was solved with the clustering algorithm to obtain the location of virtual stations of FFBSSs in a main area of Beijing in China. The data for the large-scale example comes from a competition of the Mobike algorithm.

5.1. Small-Scale Example

This section makes use of the small network shown in Figure 4 to demonstrate the solution of the model and the clustering algorithm. The lower and upper bounds of the distance between adjacent virtual stations are adjustable parameters that depend on factors such as terrain and city size. Their empirical values were obtained from the shared bike company Mobike. A scenario was investigated, and the parameter settings were as follows:

(1) The lower bound of distance of adjacent virtual stations was set as 400 m;

(2) The upper bound of distance of adjacent virtual stations was set as 1000 m;

(3) The number of virtual stations K was set as 2;

(4) The maximum capacity of virtual station i was set as 10;

(5) The minimum capacity of virtual station i was set as 1;

(6) The adjacency matrix of adjacent virtual candidate stations was as follows:

(7) The matrix of distances between virtual candidate stations was as follows:

(8) The number of bikes in each virtual candidate station is listed in Table 1.

Each location of shared bike could be considered as a virtual candidate station. We used the number of shared bikes in the service radius of the virtual candidate station as the number of shared bikes for the virtual candidate stations. For example, we assumed that the service radius of the virtual candidate station was 200 m. There is only shared bike No. 3 in the service radius of the virtual candidate station No. 1. Therefore, the number of shared bikes of virtual candidate station No. 1 was one.
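The counting rule in this example can be sketched as follows for bikes given as latitude and longitude pairs. This is an illustrative sketch: the haversine great-circle distance with a mean Earth radius of 6,371 km is an assumption (the text does not say how distances were computed), the function names and coordinates are hypothetical, and a candidate station's own bike is excluded from its count, in line with the example above.

```python
import math
from typing import List, Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumption of this sketch


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bikes_per_candidate(bikes: Sequence[Tuple[float, float]],
                        radius_m: float = 200.0) -> List[int]:
    """For each bike location, treated as a candidate station, count the
    other bikes that lie within the service radius."""
    counts = []
    for i, (lat_i, lon_i) in enumerate(bikes):
        counts.append(sum(
            1 for j, (lat_j, lon_j) in enumerate(bikes)
            if j != i and haversine_m(lat_i, lon_i, lat_j, lon_j) <= radius_m))
    return counts


# Hypothetical coordinates: the first two bikes are about 111 m apart,
# the third is roughly 1 km further north.
example_bikes = [(39.9000, 116.4000), (39.9010, 116.4000), (39.9100, 116.4000)]
example_counts = bikes_per_candidate(example_bikes)
```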

5.1.1. Result

Based on the problem properties and parameter setting in Section 5.1, the problem was solved with the branch and cut method in CPLEX 12.8 and the clustering algorithm.

These two methods obtained the same answer in a short time. CPLEX obtained an optimal solution in less than 1 s, whereas the clustering algorithm took more than 30 s to obtain the same solution. The latter required more time because it could not stop immediately after obtaining an optimal solution but instead could stop only after a predetermined number of iterations (6000 in this case) without further improvement. The results are listed in Table 2. According to Table 2, the virtual candidate station Nos. 3 and 8 are selected as virtual stations of FFBSSs. The value of user demand was 8.

5.1.2. Comparison of the Performance of the Exact Method and the Clustering Algorithm

The clustering algorithm was coded in C++ and all computational experiments were performed using a Lenovo notebook with an Intel Core i7-7700HQ CPU with processor base frequency of 2.80 GHz. The small-scale example is a subset of the large-scale example. This set contains instances of 30-230 virtual candidate stations. The number of virtual stations K is set as 5 in Table 3 and 10 in Table 4. The results obtained from the clustering algorithm were compared to those obtained from CPLEX 12.8 with default setting and a maximum running time of 2 h.

As listed in Table 3, Obj denotes the true optimal objective value from CPLEX. CPU is the running time of CPLEX. Table 3 also lists the Avg. Obj and Avg. CPU obtained by the clustering algorithm in 100 runs. Avg. gap (%) represents the deviation of the average objective value from the Obj. Best gap (%) represents the deviation of the best objective value from the Obj.
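The gap columns can be read as relative deviations from the CPLEX optimum. The sketch below assumes the usual definition for a maximization problem, (Obj − value) / Obj × 100; the text does not give the formula explicitly, and the function name and the numbers in the example are hypothetical rather than taken from Table 3.

```python
def gap_percent(obj: float, value: float) -> float:
    """Relative shortfall of `value` from the optimal objective `obj`, in percent."""
    return (obj - value) / obj * 100.0


# Hypothetical values: a heuristic solution of 31 against an optimum of 32.
example_gap = gap_percent(32, 31)
```

Under this definition, a Best gap of 0% means that at least one of the 100 runs reached the CPLEX optimum.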

When N = 30, as listed in Table 3, CPLEX obtained an optimal solution in a time marginally over 1 min, whereas the clustering algorithm required less than 40 s to obtain the same solution. Further, Avg. Obj was nearly equal to the Obj value. The Avg. gap was 2.67% and the Best gap was 0%. When N = 50, CPLEX achieved an optimal solution in more than 10 min, whereas the clustering algorithm could obtain a good, feasible solution with an Avg. gap of 3.75% and a Best gap of 3.13% in less than 1 min.

For the larger problems (N = 70, 90, 110, 150), CPLEX obtained an optimal solution within 2 h. Meanwhile, the clustering algorithm obtained a feasible solution in only approximately 50-90 s. The Avg. gap and the Best gap obtained by the clustering algorithm increased with the problem size, with a Best gap of less than 5% for N = 150 in Table 3.

When N ≥ 190, as listed in Table 3, the clustering algorithm obtained a feasible solution in about 90-100 s, whereas CPLEX was unable to do so in 2 h. This demonstrates the limitations of CPLEX and the strength of our proposed method in large applications.

Table 4 shows that when K is set as 10, the values of Obj, Avg. Obj, and Best Obj are higher than those in Table 3. However, more time was required than in Table 3. For example, when N = 30, the CPU time is greater than 70 s in Table 4, whereas it is 64.231 s in Table 3. The Avg. gap and the Best gap obtained by the clustering algorithm increased with the problem size in Table 4. Tables 3 and 4 demonstrate that when N ≥ 190, CPLEX could not obtain an optimal solution in 2 h.

To sum up, the clustering algorithm yields a good, feasible solution in shorter running time when N ≥ 30. Overall, this method produces high-quality solutions with short computing times.

5.2. Large-Scale Example

The large-scale example uses data (https://biendata.com/competition/mobike/data/) obtained from a competition of the Mobike algorithm. Figure 5 shows the location of the shared bike that is distributed in the areas of the sixth loop and second loop in Beijing, China. As listed in Table 3, owing to the large-scale problem, CPLEX could not obtain the precise solution in limited time. Therefore, we used the clustering algorithm to analyze the data of shared bikes used by people at 9 am and 9 pm in the area of second loop of Beijing to obtain the location of virtual stations of FFBSSs.

5.2.1. Parameter Setting

This section presents the related parameter settings. We explain the reasoning for setting the quantity of virtual stations to 40 in Section 5.2.4. The service radius of virtual stations and the distance between adjacent virtual stations were derived from relevant data of Mobike. The area of the second loop of Beijing was obtained from Google Maps.

(1) The number of virtual stations K was set as 40;

(2) The service radius of virtual stations L was set as 200 m;

(3) The lower bound of distance of adjacent virtual stations was set as 400 m;

(4) The upper bound of distance of adjacent virtual stations was set as 1000 m;

(5) The fitness function was the objective function of the model;

(6) The area of second loop of Beijing: latitude: [39.8698, 39.9497], longitude: [116.3595, 116.4324];

(7) The number of iterations was 6000;

(8) The initial objective function value was zero.
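Restricting the data to the study area in setting (6) amounts to a bounding-box filter. The sketch below uses the latitude and longitude ranges given above; the function names and the assumption that each record is a (latitude, longitude) pair are hypothetical, since the layout of the Mobike data is not described here.

```python
from typing import List, Sequence, Tuple

# Bounding box of the second loop of Beijing, from setting (6).
LAT_RANGE = (39.8698, 39.9497)
LON_RANGE = (116.3595, 116.4324)


def in_second_loop(lat: float, lon: float) -> bool:
    """True when the point lies inside the bounding box (edges included)."""
    return (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LON_RANGE[0] <= lon <= LON_RANGE[1])


def filter_bikes(bikes: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep only the (lat, lon) records that fall inside the study area."""
    return [(lat, lon) for lat, lon in bikes if in_second_loop(lat, lon)]


# Hypothetical records: the second one lies north of the box.
example_records = [(39.9000, 116.4000), (40.0000, 116.4000)]
example_kept = filter_bikes(example_records)
```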

5.2.2. Result

Figure 6 shows the best solution for location of virtual stations of FFBSSs during morning and evening rush hours. In this figure, the horizontal axis and vertical axis present the longitude and latitude. The black crosses illustrate the location of virtual stations. The red circles illustrate the service radius of virtual stations. The green dots show the quantity of shared bikes in virtual stations. The black dots indicate shared bikes that are not included in the virtual stations. There are 1585 shared bikes in Figure 6(a) and 1520 shared bikes in Figure 6(b).

We used the clustering algorithm to analyze the usage data of shared bikes at 9 am in the main area of Beijing, to obtain the location where the demand of shared bike is high. To maximize the demand of users, 40 virtual stations were deployed as black crosses in Figure 6(a).

Simultaneously, we also used the same clustering algorithm to analyze the usage data of shared bikes at 9 pm to obtain the location where the demand of shared bike was high. To maximize the demand of users, 40 virtual stations were deployed as black crosses in Figure 6(b), which is different from Figure 6(a). Therefore, the results reveal that FFBSSs’ virtual stations have dynamic merit that can satisfy the users demand for different rush hours.

5.2.3. Diagram of Algorithm Convergence

In the experiments, we used the clustering algorithm to analyze the usage data of shared bikes at 9 am and 9 pm in the main area of Beijing and recorded the convergence of the algorithm. Figure 7 is the convergence diagram corresponding to Figure 6(a); it shows that the algorithm converges to a maximum value of 1281. Figure 8 is the convergence diagram corresponding to Figure 6(b); it shows that the algorithm converges to a maximum value of 1242.

5.2.4. Sensitivity Analysis of Quantity of Virtual Stations

Further, we used the clustering algorithm to compute the value of user demand for different quantities of virtual stations in order to estimate the optimal number of virtual stations.

Table 5 shows that user demand increases as the number of virtual stations increases. Avg is the ratio of user demand to the number of virtual stations, that is, the average user demand per virtual station. The value of Avg is maximal when the number of virtual stations is 40. Therefore, we set the number of virtual stations to 40 in the main area of Beijing.
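As a small worked example of the Avg ratio, the sketch below applies it to the two maximum values reported in Section 5.2.3 for 40 virtual stations (1281 at 9 am and 1242 at 9 pm). The function name is hypothetical, and the values in Table 5 itself are not reproduced here.

```python
def average_demand_per_station(user_demand: int, num_stations: int) -> float:
    """Avg: user demand divided by the number of virtual stations."""
    return user_demand / num_stations


avg_9am = average_demand_per_station(1281, 40)  # about 32.0 bikes per station
avg_9pm = average_demand_per_station(1242, 40)  # about 31.1 bikes per station
```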

5.2.5. Comparison of the Performance of Different Algorithms

We also compared our method with K-means algorithm. K-means clustering is one of the most popular methods for data clustering and classification. For a large number of highly dimensional data, K-means can provide an efficient approach to divide similar objects into the same cluster by minimizing the global Euclidean distance. In our numerical experiments, K-means randomly selected the points as initial means, and the means of each cluster was recalculated until it could satisfy the distance requirement of virtual stations of FFBSSs.

In contrast, our method selected a series of feasible points as the initial solution, which can dramatically enhance the coverage rate of FFBSSs’ virtual stations. Furthermore, unlike the clustering process of the K-means method, our method could recluster the objects by maximizing the user demand of each FFBSS’s virtual station. Table 6 shows that, in maximizing user demand, our method outperformed K-means when the number of virtual stations was 40.
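For reference, the standard K-means (Lloyd) iteration used as the baseline can be sketched as follows. This sketch covers only the textbook procedure on planar points; the additional check against the distance requirement of virtual stations mentioned above is not included, the initial means are passed in explicitly rather than drawn at random so that the example is reproducible, and the function name and data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def kmeans(points: Sequence[Point], initial_means: Sequence[Point],
           max_iterations: int = 100) -> Tuple[List[Point], List[List[Point]]]:
    """Alternate nearest-mean assignment and mean recomputation until the
    means stop changing or `max_iterations` is reached."""
    means: List[Point] = [tuple(m) for m in initial_means]
    clusters: List[List[Point]] = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest mean.
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda c: math.dist(p, means[c]))
            clusters[nearest].append(p)
        # Update step: each mean moves to the average of its members.
        new_means: List[Point] = []
        for c, members in enumerate(clusters):
            if members:
                new_means.append((sum(x for x, _ in members) / len(members),
                                  sum(y for _, y in members) / len(members)))
            else:
                new_means.append(means[c])  # an empty cluster keeps its mean
        if new_means == means:
            break
        means = new_means
    return means, clusters


# Hypothetical data: two well-separated groups of four points each.
demo_points = [(0, 0), (0, 2), (2, 0), (2, 2),
               (100, 100), (100, 102), (102, 100), (102, 102)]
demo_means, demo_clusters = kmeans(demo_points, [(0, 0), (100, 100)])
```

K-means minimizes within-cluster distance and is sensitive to the initial means, whereas the objective of this study is the number of bikes covered within a fixed service radius, so the two procedures optimize different quantities.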

6. Conclusions

This article proposed a method for optimizing the location of virtual stations of FFBSSs to meet the demand of people during morning and evening rush hours. Compared with fixed stations, the location of virtual stations can change as the user demand changes, which makes virtual stations more effective than fixed stations. The problem was formulated as a MILP model and solved by a clustering algorithm based on the maximization of user demand. We further tested our proposed method using data from Mobike. The results demonstrated that our method could effectively obtain locations of FFBSSs’ virtual stations that satisfy the user demand during the morning and evening rush hours. We justified the parameter setting through sensitivity analysis. By comparing the performance of different algorithms, our method was determined to be more effective than the K-means method. According to the diagram of algorithm convergence, our method could converge to a satisfactory value in a limited number of iterations.

This article makes three contributions. First, our method is superior to CPLEX in CPU time, especially for large data. Second, our method can better consider the characteristics of virtual stations of FFBSSs than the existing algorithms. Third, our method exhibits computational efficiency: it finds a satisfactory value in a limited number of iterations. This practical method can be useful for strategic planning in shared bike companies.

In the future, we plan to extend our analysis to consider user demand changes at different times and improve our method to fit this dynamic adjustment of FFBSSs’ virtual stations. For this purpose, we will also develop our method to cluster objects without fixed cluster numbers.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China (61304179, 71501021, 71871036, 71431001, U1813203, 71831002, and 71672016), the Science & Technology Innovation Funds of Dalian (2018J11CY022), the Program for Innovative Research Team in University (IRT_17R13), and the Fundamental Research Funds for the Central Universities (3132018301, 3132018304).