Abstract

With the arrival of the big data age, the huge amount of data that cities generate offers an opportunity to "turn waste into treasure." Urban layout is very important for the development of urban transportation and building systems, and once the layout of a city is finalized, it is difficult to start again. Urban architectural layout planning and design therefore have a lasting impact. This paper uses urban architecture layout big data to optimize building layout with computational techniques. First, a big data collection and storage system based on the Hadoop platform is established. Then, an evaluation model of urban building planning based on an improved logit model and the PSO algorithm is established. The PSO algorithm is used to find suitable areas for a given kind of building, and a logit regression model is built on five impact indicators: land price, rail transit, historical protection, road traffic capacity, and commercial potential. The bridge between the logit model and the PSO algorithm is the fitness value of a particle: each particle in the swarm is assigned to the index parameters of the logit model, the logit model in the evaluation system is run, and the performance index corresponding to that set of parameters is obtained. The performance index is passed back to the PSO as the fitness value of the particle, and the swarm searches for the position with the best fitness. In this way, the degree of reasonableness of regional architectural planning is obtained, and the rationality of the urban architectural planning layout is determined.

1. Introduction

Rapid urbanization has brought many problems, such as population management, traffic congestion, environmental protection, and safety, which every city manager must face and which need to be coordinated and standardized. The present pattern of extensive urban development must be stopped and reversed; changing the development model and reducing development cost is an urgent task. Rational planning and layout benefit the sustainable development of a city, the establishment of a modern city, and its overall development. Urban building planning must pay more attention to the design concept of overall planning and comprehensive development. It must inherit the past, serve the present, and foresee the future. Urban building planning is very important for the future development of a city: if planning and design are not done well at the beginning, the scale of the city may be locked in, the existing layout becomes very difficult to break through, and the development space of the city is restricted. Du et al. [1] introduced the PSO algorithm into spatial optimization decision selection, compared an exhaustive method with the PSO algorithm on example spatial optimization decision problems, and reported that the PSO algorithm has better convergence speed and higher result precision, making it an effective method for spatial optimization decision-making. Zhu Ning et al. [2] added an inertia weight to the velocity of discrete particles and applied the improved PSO algorithm to urban land planning. These results suggest that the PSO algorithm is well suited to urban building planning. The planning and design of urban architecture should pay special attention to long-term sustainable development and lay the foundation for the integrated development of the city.
In addition, urban planning and design should grasp the key design points from a scientific and development-oriented perspective, so that the architectural community is unified and coordinated within the overall plan. To truly improve the functions of the city and realize urban sustainability, it is necessary to regard urban building as a part of urban ecological environment construction and to constantly bring out the advantages of urban land and space resources through scientific planning and adjustment of the city.

2. Urban Planning Big Data Acquisition and Storage System Based on Hadoop Platform

2.1. Basic Framework of Hadoop Platform

The Hadoop platform developed by the Apache Software Foundation is a software platform for processing large-scale data. It is a distributed system infrastructure with high reliability, high fault tolerance, high scalability, and high efficiency. It has two core components: HDFS and MapReduce. HDFS provides distributed storage for large-scale data, and MapReduce provides distributed computing over large-scale data.

2.1.1. HDFS

HDFS is a distributed file system that runs on commodity hardware. It uses a master/slave architecture to build a cluster, usually composed of one master node and several slave nodes. The master node, called the NameNode, is usually a high-performance server. It is mainly used for metadata management, including file system namespace management and control of client access to files [3]. A slave node is called a DataNode and is typically a cheap PC in the cluster. It stores and manages the data on that node and responds to client read and write requests.

As shown in Figure 1, the directory structure of the files in HDFS is stored on the NameNode. Each actual data file is split into several blocks, and the blocks are stored redundantly across the set of DataNodes. The NameNode holds the system's file information, directory information, and the corresponding block information.

2.1.2. MapReduce

MapReduce is a computing model for parallel operations on large amounts of data on the Hadoop platform. The model is divided into two stages: map, then reduce. The approximate process is as follows. First, the input data is divided into several parts, and each part is processed independently; this is called map [4]. The results of the map stage are then merged, and the merged data is processed; this is called reduce. The process is shown in Figure 2.
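The two stages can be illustrated with a minimal single-process sketch in plain Python. It does not use Hadoop itself; the record layout (district, land price), the chunking, and all function names are assumptions made only for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input: (district, land_price) records already split into chunks,
# standing in for the blocks that Hadoop would distribute across nodes.
chunks = [
    [("north", 3.0), ("south", 5.0)],
    [("north", 4.0), ("south", 7.0), ("east", 2.0)],
]

def map_phase(chunk):
    """Map: emit one (key, (partial_sum, count)) pair per record."""
    return [(district, (price, 1)) for district, price in chunk]

def shuffle(mapped_chunks):
    """Group intermediate pairs by key, as the framework does between stages."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(values):
    """Reduce: merge the partial (sum, count) pairs and return the mean."""
    total, count = reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)
    return total / count

def mean_price_by_district(all_chunks):
    grouped = shuffle([map_phase(chunk) for chunk in all_chunks])
    return {key: reduce_phase(values) for key, values in grouped.items()}

result = mean_price_by_district(chunks)
```

Each chunk is mapped independently, which is what allows the map stage to run in parallel on different nodes; only the grouped intermediate pairs need to be brought together for the reduce stage.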

2.2. Storage Platform Framework Based on Urban Planning Big Data

In urban planning, people and cities can be analyzed visually through data collection and mining, and a model can then be established for urban planning. According to the characteristics of urban planning big data, the Hadoop distributed storage system [5] is placed in the virtualization pool of the resource management platform. Hadoop slave nodes are deployed dynamically, so distributed storage capacity can be built up quickly.

The newly built big data storage platform [6] has good compatibility and a long life cycle. It can store the data of a particular city in real time, so that the data can be analyzed and processed. The stored data include land price, transportation convenience, road traffic capacity, commercial potential, and other quantifiable indicators; by connecting the platform to the prediction system, detailed planning of urban buildings can be realized.

Vector data include spatial coordinate information, attribute information, and topological information. In order to improve data access performance and isolate data faults effectively, the vector data is stored in layers and blocks, and the data tables in the layers are not related to each other. According to the characteristics of vector data, a storage layer table structure based on HBase is designed.

The spatial coordinate information is stored in the spatial data column family, the attribute information is stored in the attribute column family, and the topological information is stored in the topological column family, as shown in Table 1.
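As a rough illustration of this three-column-family layout, the sketch below builds one logical row as a plain Python dictionary. It does not call any HBase client, and the row-key format, family names, and qualifiers are hypothetical; the actual schema is the one given in Table 1.

```python
# Hypothetical vector feature; every field name below is illustrative only.
def to_hbase_row(layer, block, feature_id, coords, attrs, neighbors):
    """Build a (row_key, columns) pair using 'family:qualifier' column names."""
    # Layer and block come first in the key so that features stored in the
    # same layer and block sort next to each other.
    row_key = f"{layer}:{block}:{feature_id}"
    columns = {"spatial:coords": ";".join(f"{x},{y}" for x, y in coords)}
    for name, value in attrs.items():
        columns[f"attribute:{name}"] = str(value)
    columns["topology:adjacent"] = ",".join(neighbors)
    return row_key, columns

row_key, columns = to_hbase_row(
    layer="parcel",
    block="b012",
    feature_id="f0001",
    coords=[(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    attrs={"land_price": 5, "use": "industrial"},
    neighbors=["f0002", "f0007"],
)
```

Keeping the three kinds of information in separate families means that a query needing only attributes does not have to read coordinate or topology data.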

3. Logit Regression Model

3.1. Establishment of Logit Regression Model

The logit model, also called the classification evaluation model [7], is one of the earliest discrete choice models and belongs to the class of probabilistic nonlinear regression models. The model is simple to apply and does not require assumptions about the sample distribution. It is a common method of empirical statistical analysis in fields such as sociology, economics, and marketing. Its parameters are clear and interpretable. In this paper, the logit model is used to optimize the layout of urban building planning and to find the best location for each kind of building.

The logit regression model is established as follows. Let $y$ be a binary variable whose value is "1" (the building is reasonably placed here) or "0" (the building is unreasonably placed here), and let $x_1, x_2, \ldots, x_m$ be independent variables that have an effect on $y$. Let $P = P(y = 1 \mid x_1, \ldots, x_m)$ denote the probability that the building is reasonably placed here under the influence of the independent variables. The logit regression model can be expressed as

$$P = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m)}. \tag{1}$$

In the formula, $\beta_0$ is a constant term and $\beta_j$ ($j = 1, 2, \ldots, m$) is a regression coefficient. Setting $Z = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$, a linear combination of the variables, the formula can be written as

$$P = \frac{1}{1 + e^{-Z}}. \tag{2}$$

The logit curve between $P$ and $Z$ is shown in Figure 3.

It can be seen from Figure 3 that $P$ ranges from 0 to 1 and follows an S-shaped curve that is symmetric about the point (0, 0.5).

The logit regression model can also be expressed as

$$\ln\frac{P}{1-P} = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m. \tag{3}$$

Here, $\ln\frac{P}{1-P}$ is the natural logarithm of the ratio of the probabilities of reasonable and unreasonable building placement. It is called the logit transformation of $P$ and is written as $\operatorname{logit}(P)$.
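The relationship between the probability form and the logit form can be checked numerically. The following is a minimal sketch; the function names and any coefficient values passed to them are illustrative assumptions, not values from this paper.

```python
import math

def logistic(z):
    """Map a linear score Z to a probability P in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the logistic function: the log-odds ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))

def placement_probability(x, beta0, beta):
    """P(y = 1 | x) for indicator values x, intercept beta0, coefficients beta."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return logistic(z)
```

For example, `logistic(0.0)` returns 0.5, the center of the S-shaped curve, and `logit(logistic(z))` recovers `z` up to rounding error.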

3.2. Parameters of Logit Regression Model

It can be seen in (3) that the constant term $\beta_0$ denotes the natural logarithm of the ratio of the probabilities of reasonable and unreasonable results when all independent variables are zero. The regression coefficient $\beta_j$ indicates the change in $\operatorname{logit}(P)$ caused by a one-unit change in the independent variable $x_j$ when the other independent variables are held constant [8]. There is a correspondence between the regression coefficient and the odds ratio (OR), an index that measures the effect of a risk factor. For the same independent variable $x_j$, the logit before and after adding one unit to it is

$$\operatorname{logit}(P_0) = \beta_0 + \beta_1 x_1 + \cdots + \beta_j x_j + \cdots + \beta_m x_m, \tag{4}$$

$$\operatorname{logit}(P_1) = \beta_0 + \beta_1 x_1 + \cdots + \beta_j (x_j + 1) + \cdots + \beta_m x_m. \tag{5}$$

Subtracting (4) from (5) gives (6):

$$\operatorname{logit}(P_1) - \operatorname{logit}(P_0) = \ln\frac{P_1/(1-P_1)}{P_0/(1-P_0)} = \ln \mathrm{OR}_j = \beta_j. \tag{6}$$

Namely,

$$\mathrm{OR}_j = e^{\beta_j}. \tag{7}$$

When the regression coefficient $\beta_j = 0$, $\mathrm{OR}_j = 1$, which shows that the independent variable $x_j$ has no effect on reasonable placement. When $\beta_j > 0$, $\mathrm{OR}_j > 1$: the odds of reasonable placement increase as $x_j$ increases, so $x_j$ is a favorable factor. When $\beta_j < 0$, $\mathrm{OR}_j < 1$: the odds decrease as $x_j$ increases, so $x_j$ is a restraining factor.
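The standard identity between a regression coefficient and the odds ratio, OR = exp(β), can be verified with a short numerical sketch. The single-variable model and the coefficient values below are hypothetical and chosen only for illustration.

```python
import math

def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)

def prob(x, beta0, beta1):
    """Single-variable logit model: P(y = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

beta0, beta1 = -1.0, 0.7   # hypothetical coefficients
x = 2.0                    # hypothetical indicator value

# Ratio of the odds after and before raising x by one unit.
odds_ratio = odds(prob(x + 1.0, beta0, beta1)) / odds(prob(x, beta0, beta1))
```

The computed `odds_ratio` agrees with `math.exp(beta1)` regardless of the starting value of `x`, which is why the coefficient can be read directly as a log odds ratio.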

3.3. Parameter Estimation of Logit Regression Model

After the logit regression model is established, its parameters need to be estimated from the sample data. The estimators are generally written as $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_m$. The most common method for estimating model parameters from sample data is maximum likelihood estimation (MLE) [9]. The central idea of maximum likelihood estimation is to choose as the estimate the parameter value under which the observed sample has the highest probability. For $n$ independent observations $(x_i, y_i)$ with $P_i = P(y_i = 1 \mid x_i)$, the sample likelihood function is constructed as

$$L(\beta) = \prod_{i=1}^{n} P_i^{y_i} (1 - P_i)^{1 - y_i}. \tag{8}$$

To simplify the calculation, the logarithm of the likelihood function is usually taken:

$$\ln L(\beta) = \sum_{i=1}^{n} \left[ y_i \ln P_i + (1 - y_i) \ln (1 - P_i) \right]. \tag{9}$$

The objective function $\ln L(\beta)$ is called the log-likelihood function. The value of $\beta$ at which the likelihood function attains its maximum, $\hat{\beta} = \arg\max_{\beta} \ln L(\beta)$, is the maximum likelihood estimate of the regression parameters.
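A minimal sketch of maximum likelihood estimation for a logit model is given below. It maximizes the log-likelihood by plain gradient ascent, which is an assumption made for brevity; statistical packages typically use Newton-type iterations instead. The toy data, learning rate, and step count are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, X, y):
    """Sum over samples of y*ln(P) + (1 - y)*ln(1 - P)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def fit_logit(X, y, lr=0.1, steps=5000):
    """Gradient ascent on the log-likelihood; each row of X starts with 1.0."""
    beta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j, v in enumerate(xi):
                grad[j] += (yi - p) * v   # derivative of ln L w.r.t. beta_j
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

# Hypothetical, non-separable toy sample: intercept column plus one indicator.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0], [1.0, 6.0]]
y = [0, 0, 1, 0, 1, 1]
beta_hat = fit_logit(X, y)
```

The toy sample is deliberately not perfectly separable, because for separable data the likelihood has no finite maximizer and the coefficients would grow without bound.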

3.4. Hypothesis Test of Logit Regression Model

After the parameter estimates are obtained from the sample data and the logit regression model is fitted, the regression coefficients still need to be tested to verify whether they are statistically significant. Because the dependent variable in the logit regression model is binary and discrete, the error no longer follows a normal distribution but a binomial distribution, and all of the analysis is based on the binomial distribution. The tests of the model and of the regression coefficients therefore differ from those used for multiple linear regression. The hypothesis test has two parts. First, a test is carried out on the whole model to check whether the regression coefficients of the independent variables are not all 0; the likelihood ratio test [10] is usually used. Second, a test is carried out on each single regression coefficient to check whether the corresponding independent variable has a real effect on the dependent variable; the Wald test is usually used.

3.4.1. Maximum Likelihood Estimation

(1) Principle of MLE. Suppose that a given sample $x_1, x_2, \ldots, x_n$ has the joint probability distribution $f(x_1, x_2, \ldots, x_n; \theta)$. This joint probability density, regarded as a function of the unknown parameters $\theta$, is called the likelihood function $L(\theta)$. The maximum likelihood principle is to estimate the unknown parameters so that the likelihood function reaches its maximum, that is, so that the probability of the observed sample is as large as possible.

(2) Likelihood Ratio Test. In statistical inference, the classical test method is based on the likelihood ratio. The likelihood ratio is defined as the ratio of the maximum of the likelihood function under the constraint conditions to the maximum of the likelihood function under unconstrained conditions. For the linear regression model $y = X\beta + \varepsilon$ with $n$ observations and normally distributed errors, the maximum likelihood estimates of the parameters are

$$\hat{\beta} = (X'X)^{-1} X'y, \qquad \hat{\sigma}^2 = \frac{e'e}{n}, \qquad e = y - X\hat{\beta}.$$

They maximize the likelihood function under unconstrained conditions. The unconstrained maximum likelihood value is obtained by substituting them into the likelihood function:

$$L(\hat{\beta}, \hat{\sigma}^2) = (2\pi\hat{\sigma}^2)^{-n/2} \exp\left(-\frac{e'e}{2\hat{\sigma}^2}\right) = C\,(e'e)^{-n/2}.$$

In the formula, the constant $C = (2\pi/n)^{-n/2} \exp(-n/2)$ is independent of any parameter in the model, and $e'e$ is the residual sum of squares.

On the other hand, if the likelihood function is maximized under the constraint condition $g(\beta) = 0$, the estimates are $\tilde{\beta}$ and $\tilde{\sigma}^2$, and $L(\tilde{\beta}, \tilde{\sigma}^2)$ denotes the maximum likelihood value under the constraint. The constrained maximum cannot exceed the unconstrained maximum, but if the constraint is valid, the constrained maximum should be close to the unconstrained one. This is the basic idea of the likelihood ratio test. The likelihood ratio is defined as

$$\lambda = \frac{L(\tilde{\beta}, \tilde{\sigma}^2)}{L(\hat{\beta}, \hat{\sigma}^2)}.$$

Obviously, $0 \le \lambda \le 1$. If the null hypothesis is true, the value of $\lambda$ should be close to 1; if $\lambda$ is too small, the null hypothesis should be rejected. The likelihood ratio test rejects the null hypothesis when $\lambda \le \lambda_0$, where $\lambda_0$ is chosen so that $P(\lambda \le \lambda_0) = \alpha$ under the null hypothesis ($\alpha$ is the significance level). In some cases, the rejection region can be transformed into the familiar form of a t statistic or an F statistic. In general, however, the test is applied as a large-sample test. It can be shown that, for large samples, the likelihood ratio statistic

$$LR = -2 \ln \lambda = 2\left[\ln L(\hat{\beta}, \hat{\sigma}^2) - \ln L(\tilde{\beta}, \tilde{\sigma}^2)\right]$$

approximately follows a chi-square distribution $\chi^2(q)$ whose degrees of freedom $q$ equal the number of constraints.

Specifically, if LR is large, the null hypothesis should be rejected; the rejection region is $LR > \chi^2_{1-\alpha}(q)$, where $\chi^2_{1-\alpha}(q)$ is the $1-\alpha$ quantile of the chi-square distribution.

The unconstrained maximum likelihood value has been obtained above. To compute LR, the maximum likelihood value under the constraint is also needed. For this purpose, the Lagrangian $\ln L(\beta, \sigma^2) - \mu' g(\beta)$ is maximized, where $\mu$ is the vector of Lagrange multipliers and $\ln L$ is the unconstrained log-likelihood function; this gives $\tilde{\beta}$ under the constraint. Since the maximum likelihood estimate of the coefficients coincides with the least squares estimator, the constrained residual is $\tilde{e} = y - X\tilde{\beta}$, and the constrained maximum likelihood estimate of the variance is $\tilde{\sigma}^2 = \tilde{e}'\tilde{e}/n$, by analogy with the unconstrained case.

The residual sum of squares of the unconstrained model is $RSS_U = e'e$.

The residual sum of squares of the constrained model is $RSS_R = \tilde{e}'\tilde{e}$.

(3) Wald Test. The advantage of the Wald test is that it only requires estimating the unconstrained model. When estimating the constrained model is very difficult, this method is more applicable. Moreover, the Wald test is suitable for testing both linear and nonlinear constraints. The principle of the Wald test is to measure the distance between the unconstrained estimator and the constrained estimator. For example, suppose the unconstrained model is

$$y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon.$$

To test whether the linear constraint $\beta_2 = \beta_3$ holds, the constrained model can be expressed as

$$y = \beta_1 x_1 + \beta_2 (x_2 + x_3) + \varepsilon.$$

Because the constrained estimators necessarily satisfy $\tilde{\beta}_2 - \tilde{\beta}_3 = 0$, the Wald test only needs to estimate the unconstrained model. If the constraint holds, the unconstrained difference $\hat{\beta}_2 - \hat{\beta}_3$ should be approximately zero. If the constraint does not hold, $\hat{\beta}_2 - \hat{\beta}_3$ will differ significantly from zero.
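Both test statistics are simple to compute once the fitted quantities are available. The sketch below covers the single-constraint case (one degree of freedom), for which the 5% chi-square critical value is approximately 3.841. The log-likelihoods, residual sums of squares, estimate, and standard error are hypothetical numbers, and the function names are illustrative.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def likelihood_ratio_stat(loglik_unconstrained, loglik_constrained):
    """LR = -2 ln(lambda) = 2 * (ln L_unconstrained - ln L_constrained)."""
    return 2.0 * (loglik_unconstrained - loglik_constrained)

def lr_from_rss(rss_constrained, rss_unconstrained, n):
    """For a normal linear model, LR reduces to n * ln(RSS_R / RSS_U)."""
    return n * math.log(rss_constrained / rss_unconstrained)

def wald_stat(estimate, std_error, hypothesized=0.0):
    """Wald statistic for one coefficient: squared distance in standard errors."""
    return ((estimate - hypothesized) / std_error) ** 2

# Hypothetical fitted quantities.
lr = likelihood_ratio_stat(-52.1, -55.4)
wald = wald_stat(0.84, 0.35)

CRITICAL_5PCT_1DF = 3.841
reject_by_lr = lr > CRITICAL_5PCT_1DF
reject_by_wald = wald > CRITICAL_5PCT_1DF
```

With these numbers both statistics exceed the critical value, so the null hypothesis would be rejected at the 5% level. For more than one constraint, the critical value must be taken from the chi-square distribution with the matching degrees of freedom.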

4. Optimization of Architectural Planning and Layout Based on PSO Model

4.1. The Selection of Optimization Algorithm

The simulated annealing algorithm is flexible, widely used, and efficient. However, its global search ability is poor, and it is easily affected by its parameters, so it is not suitable for searching for appropriate locations on a city map. Although the genetic algorithm can jump out of a local optimum and reach the global optimum, its ability to search new regions of the space is limited, which makes it unsuitable for the further development of urban areas. In addition, its long training time is a greater limitation in the context of big data. Therefore, the PSO algorithm, with its fast convergence and global search, is more suitable for this kind of problem.

4.2. The Principle of Particle Swarm Optimization Algorithm

In the classical particle swarm optimization algorithm [11], each particle has its own position and velocity. The position of a particle represents a point in the solution space, and the velocity represents the direction and distance the particle moves in each dimension. Usually, $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ indicates the current position of particle $i$, $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$ indicates its current velocity, and $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$ indicates the best position the particle has found so far. Sometimes $X_i$ is also used to represent the particle itself. The quality of a position is measured by the objective function value of the optimization problem, that is, the fitness function value, recorded as $f(X_i)$.

The particle swarm optimization algorithm first randomly initializes the position and velocity of each particle in a swarm of size $N$. Then, in each subsequent iteration, a particle is updated using two pieces of historical best information. The first is the position with the best fitness value in the particle's own history, expressed as $P_i$, and the other is the position with the best fitness value in the history of the whole population, expressed as $P_g$. In the $D$-dimensional solution space, the velocity update and position update of the particle swarm are shown in (18) and (19), respectively, where $d = 1, 2, \ldots, D$ denotes the dimension of position or velocity.

$$v_{id}^{t+1} = v_{id}^{t} + c_1 r_1 \left(p_{id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd}^{t} - x_{id}^{t}\right), \tag{18}$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1}. \tag{19}$$

In (18), $t$ is the iteration number, $r_1$ and $r_2$ are random numbers uniformly distributed on [0, 1], and $c_1$ and $c_2$ are positive constants called learning factors. To prevent a particle from flying beyond the solution space, the constant $V_{max}$ is used to limit the maximum velocity of the particle. If the absolute value of one dimension of the particle's velocity exceeds $V_{max}$, its value is set to $V_{max}$ or $-V_{max}$. The particle position is also restricted to the corresponding search range $[X_{min}, X_{max}]$. If the particle's position exceeds the boundary during flight, boundary treatment is required [12]. When a particle leaves the search space in one dimension, the value in that dimension can be set to the boundary value.

After the new position of the particle is obtained, the corresponding objective function value can be computed, and the historical best position of the particle and the historical best position of the population can be updated.

The right side of formula (18) consists of three parts. The first part is the velocity of particles before updating. This part has the ability to explore new area and expand the search scope so that the particle swarm optimization algorithm is equipped with global optimization ability. The second part is called “cognition model,” which enables particles to have strong local search ability by learning the best information of their history. The third part is called “social model,” which means that particles are influenced by the optimal information of the population history, which reflects the cooperation and information sharing among particles. Together, these three parts determine the particle’s ability to find the optimal solution.
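The three parts of the velocity update and the boundary treatment can be sketched for a single particle as follows. The function and parameter names are illustrative assumptions; the clamping rule is the simple "set to the boundary value" treatment described in the text.

```python
import random

def clamp(value, low, high):
    """Limit value to the interval [low, high]."""
    return max(low, min(high, value))

def update_particle(x, v, pbest, gbest, c1, c2, v_max, x_min, x_max, rng=random):
    """One classical PSO update of a single particle, dimension by dimension."""
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        previous = v[d]                             # part 1: previous velocity
        cognitive = c1 * r1 * (pbest[d] - x[d])     # part 2: own best position
        social = c2 * r2 * (gbest[d] - x[d])        # part 3: swarm best position
        vd = clamp(previous + cognitive + social, -v_max, v_max)
        new_v.append(vd)
        new_x.append(clamp(x[d] + vd, x_min, x_max))
    return new_x, new_v
```

When a particle already sits at both its own best and the swarm best position, the cognitive and social parts vanish and only the previous velocity moves it, which is the exploratory behavior attributed to the first part.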

4.3. Standard Particle Swarm Optimization

In practical optimization problems, a global search is often necessary. The particle swarm optimization algorithm converges quickly, but it easily falls into a local optimum. Shi and Eberhart [13] studied this problem. Their idea is to search globally first and, after the search has converged to a region, to search locally in detail to get a better solution. To this end, the first part of (18), that is, the particle's own velocity, is multiplied by an inertia weight $\omega$. When the inertia weight is large, the algorithm has strong global search ability; when it is small, the algorithm has strong local search ability. In this way, global search is followed by local search. The new velocity update of the particle swarm optimization algorithm is shown in (20).

$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 \left(p_{id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd}^{t} - x_{id}^{t}\right). \tag{20}$$

Shi and Eberhart compared the improved PSO algorithm with the original algorithm and found that the improved algorithm greatly enhances optimization performance. This improved algorithm came to be regarded as the standard PSO algorithm and is the basis of much current research on particle swarm optimization. The inertia weight $\omega$, introduced into the PSO algorithm for the first time in that work, represents how much of its previous velocity a particle inherits. The analysis shows that a larger inertia weight favors global search and a smaller inertia weight favors local search. A linearly decreasing inertia weight is used, that is,

$$\omega(t) = \omega_{start} - (\omega_{start} - \omega_{end}) \frac{t}{T_{max}}, \tag{21}$$

where $\omega_{start}$ is the initial inertia weight, $\omega_{end}$ is the inertia weight at the maximum number of iterations, $t$ is the current iteration number, and $T_{max}$ is the maximum number of iterations. In general, the algorithm performs best with $\omega_{start} = 0.9$ and $\omega_{end} = 0.4$. As the iterations proceed, the inertia weight decreases linearly from 0.9 to 0.4: the large inertia weight in early iterations preserves strong global search ability, and the small inertia weight in late iterations favors a more accurate local search (Figure 4).
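As a worked illustration of the linearly decreasing schedule, write $\omega_{start} = 0.9$ for the initial inertia weight and $\omega_{end} = 0.4$ for the final one (the values quoted in the text), and assume a hypothetical maximum of $T_{max} = 100$ iterations. The weight at the start, midpoint, and end of the run is then:

```latex
\begin{aligned}
\omega(0)   &= 0.9 - (0.9 - 0.4)\cdot\tfrac{0}{100}   = 0.9,\\
\omega(50)  &= 0.9 - (0.9 - 0.4)\cdot\tfrac{50}{100}  = 0.65,\\
\omega(100) &= 0.9 - (0.9 - 0.4)\cdot\tfrac{100}{100} = 0.4.
\end{aligned}
```

Each iteration lowers the weight by the same amount, here $0.5/100 = 0.005$.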

Unless otherwise stated, the particle swarm optimization algorithm mentioned in this paper refers to the standard particle swarm optimization algorithm [14]. The steps of the standard PSO algorithm are as follows:

Step 1. Set $t = 0$. Initialize the position and velocity of each particle in the population, and set the maximum velocity $V_{max}$ and the position boundaries $X_{min}$ and $X_{max}$.

Step 2. Calculate the fitness value of each particle.

Step 3. Update the best position of each particle, $P_i$, and the best position of all particles, $P_g$.

Step 4. Update the particle velocity according to (20). If the particle velocity exceeds the boundary, boundary treatment is carried out.

Step 5. Update the particle position according to (19). If the particle position exceeds the boundary, boundary treatment is carried out.

Step 6. Set $t = t + 1$. If the termination condition is satisfied, the algorithm ends. Otherwise, go to Step 2.

The flow chart of the standard particle swarm optimization algorithm is shown in Figure 5.
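The six steps can be put together in a compact sketch. It maximizes a fitness function with an inertia weight that decreases linearly from 0.9 to 0.4. The learning factors (1.5), swarm size, iteration count, default velocity limit, and the quadratic demo fitness are all assumptions chosen for illustration, not settings taken from this paper.

```python
import random

def pso(fitness, dim, bounds, n_particles=30, n_iter=200,
        c1=1.5, c2=1.5, w_start=0.9, w_end=0.4, v_max=None, seed=0):
    """Maximize `fitness` over a box with standard (inertia-weight) PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    if v_max is None:
        v_max = 0.2 * (hi - lo)          # assumed default velocity limit
    # Step 1: random initial positions and velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    # Steps 2-3 for the initial swarm.
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(n_iter):
        w = w_start - (w_start - w_end) * t / n_iter   # linear decrease
        # Steps 4-5: velocity and position updates with boundary clamping.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
        # Steps 2-3: evaluate fitness and refresh personal and global bests.
        for i in range(n_particles):
            val = fitness(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    # Step 6: a fixed iteration budget serves as the termination condition.
    return gbest, gbest_val

def demo_fitness(p):
    """Hypothetical fitness with a single maximum at (1, -2)."""
    return -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)

best_pos, best_val = pso(demo_fitness, dim=2, bounds=(-5.0, 5.0))
```

The sketch uses a fixed iteration budget as its stopping rule; a tolerance on the improvement of the global best value would be an equally valid termination condition.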

5. Urban Architecture Planning Based on Logit and PSO Algorithms

5.1. The Main Factors Influencing the Layout of Urban Architecture Planning

Urban planning and layout optimization coordinates the internal functions and external environment of buildings, so the layout of land use determines the direction of the whole urban design and is closely related to urban development space. The layout of a building needs to meet the basic requirements of economy and integrity on the premise of meeting the needs of users. Urban planning plans the development of the urban economic, spatial, and social structure on the basis of a development vision, scientific demonstration, and expert decision-making. According to the theory of urban planning, there are common indicators of significance for different types of buildings. Because each city has its own history, geographical location, and characteristics, many factors should be considered in the planning and layout of a particular building. After data review and selection, this paper selects land price (X1), rail transit (X2), historical protection (X3), road traffic capacity (X4), and commercial potential (X5) to measure the distribution of urban building planning.

5.1.1. Land Price

Urban planning strongly affects the value of urban land. At the level of the whole city, the factors that determine the future level of urban land prices include the reasonableness of urban land allocation, the layout of land use functions, the development level of urban infrastructure, the overall capacity of urban construction, and control standards, all of which are decided by urban planning. At the level of local areas, the factors that determine land price include land use function, development intensity, and environmental control, which are determined by specific planning control requirements. Reasonable planning limits the development of some land while giving a large development value to other land, so the market price of each specific plot varies with the control conditions of the plan. Good planning can not only ensure the efficient and reasonable use of urban land but also raise the overall level of urban land prices. Land price also changes across planning stages: it is affected by the planning outline, comprehensive planning, regulatory detailed planning, and the planning implementation process.

5.1.2. Rail Traffic

China has entered a period of rapid urbanization. With the continuous expansion of urban scale, the commuting time and distance have increased correspondingly. Commuting costs have become a major factor in choosing land for residential and commercial areas. Therefore, according to the distance between the plot and the subway and bus stations, the planning of a certain type of building and its land is measured. The construction of rail transit improves the accessibility of traffic along the route and drives economic development.

5.1.3. Historical Protection

Cities are the products of economic development and the crystallization of cultural evolution over the ages. In the long history of urban development, our predecessors left us a rich historical and cultural heritage: ancient castles, ancient buildings (towers, bridges, temples, gardens, houses, etc.), ancient streets, ancient towns, ancient sculptures, ancient city sites, and so on, which embody the cultural ideology, science and technology, and social landscape of their time. They are like a mirror that reflects the characteristics of civilization in a certain historical era, and they serve as the reference and foundation for the further development and evolution of urban construction. The history of a city is its social and cultural connotation; it represents the social habits and cultural values of urban residents and is part of their social life. Only by respecting history can urban planning steer the city in the direction that local residents hope for and build a city with cultural connotation and unique characteristics. Therefore, for areas such as historical monuments and cultural protection zones, physical and spatial planning must give them reasonable consideration, so as to minimize impact on and damage to the protected areas.

5.1.4. Road Traffic Capacity

Road traffic capacity refers to the ability of a lane or road section to pass vehicles or pedestrians within a given time under given road and traffic conditions. For mixed traffic, capacity on ordinary domestic roads is generally expressed in equivalent medium-sized trucks, while on expressways, first-class roads, and urban roads it is expressed in equivalent passenger cars. The main influencing factors are road conditions, vehicle performance, traffic and environmental conditions, management level, driver quality, and climate. According to traffic conditions, capacity includes uninterrupted-flow capacity, namely, the capacity of the road section between two intersections. Roads of different levels influence the surrounding plots differently: the higher the level and the greater the number of roads around a plot, the greater the traffic capacity.

5.1.5. Business Potential

Commercial potential also has a significant impact on architectural planning; areas with large commercial potential generally form business clusters.

5.2. Application of PSO Algorithm in Urban Building Planning and Layout

The particle swarm optimization algorithm is used to optimize the parameters of urban building planning and layout, as shown in Figure 6.

In Figure 6, the bridge between particle swarm optimization and the logit model is the fitness value of the particle (the performance index of the urban planning layout at the particle's current position). The optimization process is as follows: PSO generates a particle swarm (either the initial swarm or an updated swarm); each particle in the swarm is assigned in turn to the index parameters of the logit model; the logit model of the evaluation system is then run to obtain the performance index corresponding to that set of parameters; the performance index is passed back to PSO as the fitness value of the particle; and finally it is decided whether the algorithm can exit.

5.3. Logit Regression Model of Urban Architecture Planning Serving for PSO

Based on the five indexes above, namely land price, rail transit, historical protection, road traffic capacity, and commercial potential (denoted $X_1, X_2, X_3, X_4, X_5$, respectively), the logit model is established as follows:

$$\ln\frac{P}{1-P} = \beta_0 + \sum_{j=1}^{5} \beta_j X_j. \tag{22}$$

With the help of SPSS software, logit regression analysis is carried out on the sample data collected and stored on the Hadoop platform, and the regression coefficients $\beta_1, \ldots, \beta_5$ and the bias term $\beta_0$ are obtained.

Substituting the parameters into (22) gives (23):

Formula (23) is used as the PSO fitness function. Therefore, the objective function is
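The way a fitted logit model can serve as a PSO fitness function is sketched below. The coefficient values, the small grid of plots, and their indicator scores are hypothetical placeholders; the fitted values of (23) are not reproduced here.

```python
import math

# Hypothetical coefficients standing in for the fitted values of (23).
BETA0 = -4.0
BETA = [0.3, 0.5, -0.6, 0.4, 0.2]   # weights for X1..X5

def logit_fitness(indicators):
    """Probability of reasonable placement given indicator scores X1..X5."""
    z = BETA0 + sum(b * x for b, x in zip(BETA, indicators))
    return 1.0 / (1.0 + math.exp(-z))

def make_fitness(indicator_lookup):
    """Wrap a position -> [X1..X5] lookup into a PSO fitness function."""
    def fitness(position):
        return logit_fitness(indicator_lookup(position))
    return fitness

# Hypothetical grid of plots: (row, col) -> seven-point indicator scores.
grid = {
    (0, 0): [4, 6, 1, 5, 3],
    (0, 1): [2, 3, 7, 2, 2],
    (1, 0): [5, 7, 2, 6, 4],
}

def lookup(position):
    """Snap a continuous particle position to the nearest grid cell."""
    cell = (int(round(position[0])), int(round(position[1])))
    return grid[cell]

fitness = make_fitness(lookup)
ranked = sorted(grid, key=lambda cell: fitness(cell), reverse=True)
```

Because particle positions are continuous while plots are discrete, the sketch snaps each position to the nearest cell before looking up its indicators; this is one simple way to connect the two, not necessarily the one used in the evaluation system.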

5.4. PSO Parameters Settings

In (18) and (19), the maximum of the particle's velocity and position is 2, and $[X_{min}, X_{max}]$ gives the lower and upper limits of the search range for urban objects. The particle velocity update constant is determined by the size of the target building. To select the inertia weight $\omega$, dozens of $\omega$ values were used to solve test functions on sample data from the Hadoop platform. The average value of the solution, the number of failures, and the number of runs approaching the optimal value were compared to analyze convergence accuracy and speed (Table 2). Finally, 0.8477 was found to be a generally suitable inertia weight.

5.5. Case Studies

Taking the layout of an industrial building in Hebei Province as an example, the Hadoop-based data acquisition and storage platform constructed in this paper is used to collect the location information of the industrial building in the city, and the index system of the layout data of the industrial building is screened out. After data preprocessing, each index is scored from 1 to 7 points on a Likert seven-point scale, centered on historical protection and rail traffic capacity. The location index data of the city are then screened within a certain area.
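One simple way to place raw indicator values on a 1 to 7 scale is linear rescaling followed by rounding. This is only an assumed preprocessing rule for illustration; the scoring rule actually applied to the data is not specified, and the raw values below are hypothetical.

```python
def to_seven_point(values):
    """Linearly rescale raw values to integer scores from 1 to 7."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: assign the midpoint of the scale.
        return [4 for _ in values]
    return [1 + round(6 * (v - lo) / (hi - lo)) for v in values]

raw_values = [150, 400, 900, 1650]   # hypothetical raw indicator values
scores = to_seven_point(raw_values)
```

For indicators where a smaller raw value is better, such as distance to a station, the scores would need to be reversed (8 minus the score) so that 7 always means most favorable.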

Through regression analysis of the data, the logit regression formula is obtained.

Applying the logit transform to the formula above gives

The PSO algorithm described above is used to find the locations with the greatest fitness. The result is as follows:

The six largest fitness values and their positions are listed in the table above. These positions are chosen as the most suitable locations for industrial land.

Following the algorithm process based on particle swarm optimization and the logit model, the layout optimization of urban industrial land was simulated step by step in MATLAB. The results are as follows: as the number of iterations increases, the optimized set of industrial land plots shrinks, until the credibility reaches 0.8477 and no longer increases. As shown in Figure 7, where the X axis and the Y axis represent the length and the width of the spatial model, the optimization model identifies six spatial plots in this area as planning recommendations for industrial land.

6. Conclusion

This paper establishes a big data storage system for urban layout planning and builds an evaluation model on top of it. A logit regression model is used to evaluate candidate building placement points in the current city, and the model then serves as the fitness function of a particle swarm optimization model that searches for the optimal placement points. Economic feasibility, belonging, natural durability, and other elements are considered in the suitability evaluation of urban architecture. Through a case analysis, this paper constructs, against the background of big data, a method for judging the rationality of urban building planning layout. The approach supports the sustainable development of the urban economy and is in line with the principle that urban building planning must pay more attention to the design concept of unified planning and comprehensive development.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (Nos. 51508334; 51178279) and the Natural Science Foundation of Guangdong (No. 2015A030310276).