Factor Analysis Model Based on the Theory of the TOPSIS in the Application Research
To address the shortcomings of existing panel data factor analysis models in practical applications, this paper establishes a factor analysis model based on the TOPSIS method and applies it to the factor analysis of panel data. Unlike the generalized dynamic factor analysis model, the proposed model does not require the four assumptions of that model to hold simultaneously. For each year, the model takes the highest and lowest factor composite scores of the cross section data as the components of the best and worst vectors, respectively. Using TOPSIS theory, it then obtains each research object's degree of proximity to the optimal factor scheme. Taking the development of China's service industry as an example, we use this proximity degree to describe the development of the service industry in the eastern, central, and western regions. The study finds that service industry development in the eastern provinces differs greatly from that in the central and western regions. Overall, China's service industry still has considerable room for development.
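For the generalized dynamic factor analysis model put forward by Forni, the variable system $\{x_{it}\}$ is described by the following model:
$$x_{it} = b_{i1}(L)u_{1t} + b_{i2}(L)u_{2t} + \cdots + b_{iq}(L)u_{qt} + \xi_{it} = \chi_{it} + \xi_{it},$$
where $L$ represents the lag operator; $u_t = (u_{1t}, \ldots, u_{qt})'$ represents the "common factors" of the model; $\chi_{it}$ is the "common component"; and $\xi_{it}$ represents the "idiosyncratic component" of $x_{it}$. If the following four assumptions hold, this is called the generalized dynamic factor analysis model.
Assumption 1. The $q$-dimensional vector process $\{(u_{1t}, \ldots, u_{qt})'\}$ is an orthonormal white noise process. For any $n$, $\{(\xi_{1t}, \ldots, \xi_{nt})'\}$ is a zero mean stationary vector process; at the same time, $\xi_{it} \perp u_{j,t-k}$ for any $i$, $j$, $t$, and $k$. The filters $b_{ij}(L)$ are one-sided in $L$, and their coefficients are square summable.
Assumption 1 implies that the $n$-dimensional vector process $\{(x_{1t}, \ldots, x_{nt})'\}$ is a zero mean stationary vector process.
Assumption 2. The spectral density matrix of $(x_{1t}, \ldots, x_{nt})'$ is $\Sigma_n(\theta)$, whose elements are $\sigma_{ij}(\theta)$, and for any $i$ there exists a real $c_i > 0$ such that $\sigma_{ii}(\theta) \le c_i$ for any $\theta \in [-\pi, \pi]$.
Assumption 3. The first eigenvalue $\lambda_{n1}^{\xi}$ of the spectral density matrix of the idiosyncratic component is uniformly bounded; that is, there exists a real $\Lambda$ such that $\lambda_{n1}^{\xi}(\theta) \le \Lambda$ for any $\theta \in [-\pi, \pi]$ and any $n$.
Assumption 4. The first $q$ common dynamic eigenvalues diverge almost everywhere in $[-\pi, \pi]$; that is, $\lim_{n \to \infty} \lambda_{nj}^{\chi}(\theta) = \infty$ for $j \le q$, a.e. in $[-\pi, \pi]$. Here $\lambda_{nj}^{\chi}$ denote the dynamic eigenvalues of the spectral density matrix of the common component.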
As a model, the generalized dynamic factor analysis model put forward by Forni has many advantages over the traditional factor analysis model and is well suited to factor analysis of panel data. In practical applications, however, it is sometimes difficult for panel data to satisfy all four assumptions of the generalized dynamic factor analysis model at the same time.
On this basis, and drawing on the traditional theory of factor analysis, this article establishes a factor analysis model improved with TOPSIS. Compared with generalized dynamic factor analysis, this model does not require the variable system to satisfy the four assumptions simultaneously. In addition, this study does not combine the annual scores by a weighted sum based on each year's cumulative variance contribution rate, which avoids the error caused by inconsistencies in the statistical caliber of the annual data. Instead, we take the highest and lowest comprehensive scores of each year's cross-sectional data as the components of the best and worst vectors, respectively, and use the TOPSIS method to obtain the proximity between each research object's comprehensive score vector and the optimal factor vector, so as to describe the development status of the service industry of each research object.
2. The Theory of TOPSIS Factor Analysis Model
TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) works on the normalized original data matrix. It finds the optimal and the worst solutions among the finite set of schemes (represented by the optimal vector and the worst vector, respectively), calculates the distance between each evaluation object and the optimal scheme and the distance between each evaluation object and the worst scheme, and then obtains the relative proximity of each evaluation object to the optimal scheme, which serves as the basis for evaluating the objects' advantages and disadvantages.
Specific steps are as follows.
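Step 1. There are $n$ evaluation objects and $m$ evaluation indexes, and the original data can be written as a matrix $X = (x_{ij})_{n \times m}$.
Step 2. Transform the indexes so that they all point in the same direction (larger-is-better and smaller-is-better indexes are made consistent), and then normalize them.
Step 3. In the normalized matrix $Z = (z_{ij})_{n \times m}$, the maximum and minimum values of each column constitute the best and the worst vectors, denoted by $Z^{+} = (z_1^{+}, \ldots, z_m^{+})$ and $Z^{-} = (z_1^{-}, \ldots, z_m^{-})$, respectively.
Step 4. The distances between the $i$th evaluation object and the optimum and the worst schemes are as follows:
$$D_i^{+} = \sqrt{\sum_{j=1}^{m} \left(z_{ij} - z_j^{+}\right)^2}, \qquad D_i^{-} = \sqrt{\sum_{j=1}^{m} \left(z_{ij} - z_j^{-}\right)^2}.$$
Step 5. The proximity between the $i$th evaluation object and the optimal scheme can be represented as $C_i = D_i^{-} / (D_i^{+} + D_i^{-})$.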
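The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `topsis`, the `benefit` flag, and the toy data are hypothetical, and the Step 2 transformation is an assumption (smaller-is-better columns are negated and every column is scaled to unit Euclidean norm), since several normalization variants are in common use.

```python
import numpy as np


def topsis(X, benefit=None):
    """Return (d_plus, d_minus, closeness) for an n x m decision matrix X.

    benefit[j] is True when a larger value of index j is better.
    """
    X = np.asarray(X, dtype=float)
    n_objects, n_indexes = X.shape
    if benefit is None:
        benefit = np.ones(n_indexes, dtype=bool)
    benefit = np.asarray(benefit, dtype=bool)

    # Step 2 (assumed form): negate smaller-is-better columns so that every
    # index is larger-is-better, then scale each column to unit Euclidean norm.
    same_direction = np.where(benefit, X, -X)
    norms = np.sqrt((same_direction ** 2).sum(axis=0))
    norms[norms == 0] = 1.0  # an all-zero column carries no information
    Z = same_direction / norms

    # Step 3: column-wise maxima and minima form the best and worst vectors.
    z_best = Z.max(axis=0)
    z_worst = Z.min(axis=0)

    # Step 4: Euclidean distance of each object to the best and worst vectors.
    d_plus = np.sqrt(((Z - z_best) ** 2).sum(axis=1))
    d_minus = np.sqrt(((Z - z_worst) ** 2).sum(axis=1))

    # Step 5: relative proximity to the optimal scheme (0 if all rows coincide).
    total = d_plus + d_minus
    closeness = np.divide(d_minus, total, out=np.zeros_like(total), where=total > 0)
    return d_plus, d_minus, closeness


# Toy data (not from the paper): 3 objects, 2 larger-is-better indexes.
d_plus, d_minus, closeness = topsis([[1.0, 2.0], [3.0, 4.0], [2.0, 3.0]])
```

An object that attains the column maximum in every index has a distance of zero to the best vector and a proximity of 1; an object that attains every column minimum has a proximity of 0.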
2.2. Factor Analysis Model
This method attempts to express each variable of the original observations as a linear function of a small number of common factors plus a specific factor, in order to explain the correlation between the original variables reasonably and to reduce the dimension of the variables. Specific steps are as follows.
We assume there are $n$ samples, each with $p$ indicators, written as the original data matrix $X = (x_{ij})_{n \times p}$.
Step 1. Standardize the original data.
Step 2. Calculate the correlation coefficient matrix $R$ and carry out principal component analysis on $R$.
Step 3. Find the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ of $R$ and the corresponding eigenvectors, and determine the number $k$ of common factors. There are two ways to determine $k$. One is by the size of the eigenvalues: generally, the eigenvalues greater than 1 are retained. The other is by the cumulative variance contribution rate, which should generally be above 80%.
Step 4. Extract the loading matrix $A$ of the first $k$ common factors. In actual analysis, in order to give the common factors a relatively clear meaning, the loading matrix is often rotated by the variance-maximizing (varimax) method so that the number of variables with high loadings on each common factor is as small as possible.
Step 5. Calculate the score of each common factor, $f_1, \ldots, f_k$. Once the factor variables are determined, the common factor scores can be calculated for each sample. Because of the error term, the factor scores can only be estimated, and several estimation methods are available, such as the regression method and the Bartlett method.
Step 6. Calculate the comprehensive evaluation index value, that is, the comprehensive factor score:
$$F = \sum_{j=1}^{k} w_j f_j,$$
where the weight $w_j$ is taken from the variance contribution rate of the $j$th common factor.
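Steps 1 to 6 can be illustrated with a short NumPy sketch. It is a simplified sketch under stated assumptions, not the SPSS procedure used later in the paper: it uses the principal component solution without rotation, keeps eigenvalues greater than 1, estimates scores by the regression method, and weights each factor by its share of the retained variance. The function name `composite_factor_scores` and the sign convention for eigenvectors are hypothetical choices made for the example.

```python
import numpy as np


def composite_factor_scores(X, min_eigenvalue=1.0):
    """Return (composite, scores, weights, loadings) for an n x p data matrix X."""
    X = np.asarray(X, dtype=float)

    # Step 1: standardize each indicator (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Step 2: correlation coefficient matrix.
    R = np.corrcoef(Z, rowvar=False)

    # Step 3: eigenvalues in descending order; keep those above the threshold
    # (at least one factor is always retained).
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = max(1, int((eigvals > min_eigenvalue).sum()))
    eigvals, eigvecs = eigvals[:k], eigvecs[:, :k]

    # Eigenvector signs are arbitrary; orient each so its entries sum to >= 0
    # (a convention assumed here, not taken from the paper).
    eigvecs = eigvecs * np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)

    # Step 4: unrotated loading matrix of the first k common factors.
    loadings = eigvecs * np.sqrt(eigvals)

    # Step 5: regression-method scores. For principal component loadings A,
    # Z R^{-1} A simplifies to Z V_k diag(eigvals)^{-1/2}.
    scores = Z @ eigvecs / np.sqrt(eigvals)

    # Step 6: weight each factor by its share of the retained variance.
    weights = eigvals / eigvals.sum()
    return scores @ weights, scores, weights, loadings
```

Because the sign of each eigenvector is arbitrary, the orientation of the composite score depends on a convention such as the one above; in applied work the rotated loadings are usually inspected before the scores are interpreted. The sketch also assumes that no indicator is constant across samples.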
2.3. Factor Analysis Model Based on TOPSIS Theory
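Step 1. Establish the index system. Let $A = \{a_1, a_2, \ldots, a_n\}$ be the set of objects studied and $X = \{x_1, x_2, \ldots, x_p\}$ the set of service industry development evaluation indexes; if the evaluation range is $m$ years, then $t = 1, 2, \ldots, m$.
Step 2. Factor analysis of the cross section data for each year $t$ gives the common factors of the object set $A$ and the factor comprehensive score of each object in that year, where $t$ represents the time intercept point. This yields the matrix of factor comprehensive scores, as shown in formula (8):
$$F = (F_{ij})_{n \times m}, \qquad i = 1, \ldots, n; \; j = 1, \ldots, m. \tag{8}$$
In the above formula, $F_{ij}$ is computed from the common factors of the cross section data factor analysis in the $j$th year.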
Step 3. Make the final evaluation with the TOPSIS method on the factor comprehensive scores of the object set. The steps are as follows.
(A) Take each year's factor comprehensive score from the cross section data as one evaluation index of the $i$th region's development. This forms a new index system with $m$ indicators and $n$ evaluation objects, from which we construct a data matrix with $n \times m$ elements, that is, $F = (F_{ij})_{n \times m}$.
(B) Because the factor composite score is a larger-is-better index, all $m$ indicators are processed with the consistency method for larger-is-better indexes, giving the normalized matrix $Z = (z_{ij})_{n \times m}$.
(C) Find the maximum and minimum values of each column of the matrix $Z$ to constitute the optimal vector $Z^{+}$ and the worst vector $Z^{-}$. For $i = 1, \ldots, n$; $j = 1, \ldots, m$, the optimal vector is $Z^{+} = (z_1^{+}, \ldots, z_m^{+})$ and the worst vector is $Z^{-} = (z_1^{-}, \ldots, z_m^{-})$, where
$$z_j^{+} = \max_{1 \le i \le n} z_{ij}, \qquad z_j^{-} = \min_{1 \le i \le n} z_{ij}.$$
(D) Let $D_i^{+}$ and $D_i^{-}$ represent the distances between the $i$th object and the optimal and worst solutions, respectively:
$$D_i^{+} = \sqrt{\sum_{j=1}^{m} \left(z_{ij} - z_j^{+}\right)^2}, \qquad D_i^{-} = \sqrt{\sum_{j=1}^{m} \left(z_{ij} - z_j^{-}\right)^2}. \tag{14}$$
In formula (14), for $D_i^{+}$ the smaller the better, and for $D_i^{-}$ the bigger the better.
(E) Let $C_i$ denote the $i$th evaluation object's proximity degree to the optimal factor scheme; that is,
$$C_i = \frac{D_i^{-}}{D_i^{+} + D_i^{-}}.$$
The greater $C_i$ is, the better the $i$th object's service industry development over the evaluation interval; conversely, the smaller $C_i$ is, the worse.
Symbol Description. $m$ is the total number of evaluation indexes; $n$ is the total number of evaluation objects. $F_{ij}$ represents the factor comprehensive score of the $i$th object in the $j$th year. $z_{ij}$ is the value of $F_{ij}$ after consistency standardization, $i = 1, \ldots, n$; $j = 1, \ldots, m$. $Z^{+}$ and $Z^{-}$ represent the optimal vector and the worst vector, respectively. $D^{+}$ and $D^{-}$ represent the object's distances to the optimal and worst solutions, respectively. $C$ denotes the proximity degree to the optimal factor scheme.
The Model Description. The model uses the TOPSIS method to calculate the distance between each sample and the best and worst vectors, and then calculates each sample's proximity to the optimal solution as the final evaluation result. In processing the data, the model does not need to deflate the samples' time series data, and it avoids the Dynamic Factor Model's requirement that four preconditions be met.
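The way the yearly factor analyses feed into TOPSIS can be sketched end to end as follows. This is an illustrative sketch only: the helper names `yearly_composite` and `topsis_factor_model` are hypothetical, the factor step is the unrotated principal component solution with eigenvalues above 1, the normalization is assumed to be column-wise scaling to unit Euclidean norm, and the panel is synthetic data generated for the example, not the paper's data.

```python
import numpy as np


def yearly_composite(X):
    """Composite factor score for one year's n x p cross section."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    vals, vecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    k = max(1, int((vals > 1.0).sum()))
    vals, vecs = vals[:k], vecs[:, :k]
    # Assumed sign convention: each eigenvector's entries sum to >= 0.
    vecs = vecs * np.where(vecs.sum(axis=0) < 0, -1.0, 1.0)
    scores = Z @ vecs / np.sqrt(vals)
    return scores @ (vals / vals.sum())


def topsis_factor_model(panel):
    """panel has shape (m years, n objects, p indexes); returns D+, D-, C."""
    # Step 2: one column of composite scores per year gives the n x m matrix F.
    F = np.column_stack([yearly_composite(X) for X in panel])
    # Step 3(B), assumed form: scale each year's column to unit Euclidean norm.
    Z = F / np.sqrt((F ** 2).sum(axis=0))
    # Step 3(C)-(E): best/worst vectors, distances, and proximity degree.
    d_plus = np.sqrt(((Z - Z.max(axis=0)) ** 2).sum(axis=1))
    d_minus = np.sqrt(((Z - Z.min(axis=0)) ** 2).sum(axis=1))
    return d_plus, d_minus, d_minus / (d_plus + d_minus)


# Synthetic panel (not the paper's data): 10 years, 25 objects, 14 indexes,
# where every index grows with a hypothetical object "size".
rng = np.random.default_rng(0)
n, m, p = 25, 10, 14
size = np.linspace(1.0, 5.0, n)
panel = np.stack([
    size[:, None] * (1 + 0.1 * t) + 0.05 * rng.normal(size=(n, p))
    for t in range(m)
])
d_plus, d_minus, closeness = topsis_factor_model(panel)
```

Each year is analyzed separately, so no cross-year weighting of variance contribution rates is needed; the years enter only as the columns of the score matrix that TOPSIS compares.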
This article conducts an empirical case study of service industry development since China's accession to the WTO [7, 8]. Owing to data availability, only the service industries of 25 provinces of China are selected as the research objects; these 25 provinces constitute the object set of the model. The data are taken from the 2015 updated version of the statistics published by the National Bureau of Statistics of the People's Republic of China and by the provincial bureaus of statistics (http://www.stats.gov.cn/tjsj/). Using the TOPSIS factor analysis model, we study the development status of the service industry in these 25 provinces over 2004–2013.
3.1. Establish Indicator System
The index system established in this article divides the service industry into 14 subsectors (China's 14 service industry indexes). The index data values are the annual employment figures of each service subsector in each province, as reported in the national statistics (unit: 10,000 persons).
Symbol Explanation. Symbols are as follows: transportation, storage, and postal services: $x_1$; information transmission, computer services, and software industry: $x_2$; wholesale and retail: $x_3$; accommodation and catering industry: $x_4$; finance: $x_5$; real estate: $x_6$; leasing and business services: $x_7$; scientific research, technical services, and geological prospecting: $x_8$; water conservancy, environment, and public facilities management industry: $x_9$; resident services and other services: $x_{10}$; education: $x_{11}$; health, social security, and social welfare: $x_{12}$; culture, sports, and entertainment: $x_{13}$; public administration and social organizations: $x_{14}$.
In total, 14 service subsector indicators are used to study and evaluate the service industry development of 25 Chinese provinces over 2004–2013.
3.2. Analysis Based on the TOPSIS Factor Analysis Model
3.2.1. Factor Composite Scores of the Cross Section Data
Following the factor analysis theory in Section 2.2, factor analysis was performed on the cross section data of each year in the period 2004–2013. SPSS software was used to calculate the total variance contribution rate explained by the factors and the factor comprehensive scoring function for each year of 2004–2013, as shown in the following.
The factor comprehensive scoring function for the 2013 cross section data is as follows:
The factor comprehensive scoring function for the 2012 cross section data is as follows:
The factor comprehensive scoring functions for the 2004–2011 cross section data are obtained similarly.
The factor comprehensive scoring function for the 2004 cross section data is as follows:
Each factor score function is shown in Table 1.
Substituting the relevant data into the 2004–2013 cross-sectional factor composite score functions with the mathematical software MATLAB gives the factor composite scores of the 25 provinces for 2004–2013, which form a factor composite score matrix with 25 rows and 10 columns.
Firstly, the data in Table 2 show that the maximum D+ value, that is, the maximum distance to the optimal scheme, is 2.4148, with a minimum of 0, a mean of 2.1541, and a variance of 0.364. The maximum D− value, that is, the maximum distance to the worst scheme, is 2.4158, with a minimum of 0.0001, a mean of 0.2684, and a variance of 0.373. The maximum C value, that is, the maximum proximity degree to the optimal scheme, is 1, with a minimum of 0.0001, a mean of 0.1101, and a variance of 0.063. Except for coastal Guangdong, Jiangsu, and Zhejiang, most provinces are far from the optimal scheme. This shows that the degree of service sector development differs greatly among the 25 provinces.
Secondly, Table 2 also shows that the eastern region's C value averages only 0.3213, so there is still a great deal of room for service industry development. At the same time, the eastern region's average (0.3213) is much larger than those of the central and western regions (0.0156 and 0.0063, respectively). This shows that service industry development in the central and western regions lags far behind.
On the one hand, the frequency distribution diagrams in Figures 1–3 show that most provinces are far from the optimal scheme, with D+ values close to the maximum distance, and are correspondingly close to the worst scheme. In most provinces the C value, that is, the proximity to the optimal scheme, is low, at no more than 0.2, which indicates that service industry development in most of the 25 provinces is relatively backward.
On the other hand, Figures 1–3 also show that service industry development is unbalanced within eastern China. In addition, the values of D−, D+, and C differ widely between the east and the west; that is, the central and western regions lag far behind the eastern region in service industry development. Service industry development in the western region is the most backward, with C values close to zero.
Table 2 and Figures 1–3 show that China's service industry development is relatively backward: the overall level of development is not high, development is uneven, the gap between the eastern region and the central and western regions is large, and the western region is the most backward. China's service industry therefore has considerable room for development.
4. Relevant Policy Recommendations and Model Prospect
Using the TOPSIS factor analysis model established in this paper to analyze the development of the service sector in 25 provinces of China, we conclude that, apart from China's coastal provinces of Guangdong, Jiangsu, and Zhejiang, most provinces are far from the optimal scheme, especially in the western region. This shows that the degree of service industry development in most of China is far lower than in the coastal areas. For areas where service industry development is relatively backward, China can promote the resource advantages that each region already has. For example, Guizhou province in the western region has a very good climate, beautiful mountains, and rich mineral resources; Yunnan has rich and colorful ethnic minority cultures and the beautiful Lijiang; Xinjiang has beautiful Uygur dance, abundant fruit, and so on. With government support and vigorous promotion of these existing resource advantages, such regions can attract foreign enterprises and talented people to promote local service industry development.
The model uses the TOPSIS method to calculate the distance between each sample and the best and worst vectors, and then calculates each sample's proximity to the optimal solution as the final evaluation result. In processing the data, the model does not need to deflate the samples' time series data, and it avoids the Dynamic Factor Model's requirement that four preconditions be met. The model can also be applied to research in biological, medical, and other fields.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work is supported by the National Natural Science Foundation of China (Grants nos. 71461027, 71471158, and 71001072); Science and Technology Plan Project of Guizhou Province (no. LH 2015 7055); Guizhou Province Natural Science Foundation in China (Qian Jiao He KY 2014 295); 2013, 2014, and 2015 Zunyi 15851 talents elite project funding; Zunyi innovative talent team (Zunyi KH (2015) 38); Science and Technology Talent Training Object of Guizhou Province Outstanding Youth (Qian Ke He Ren Zi 2015 06); Guizhou province colleges and universities teaching contents and curriculum system reform project (Qian Jiao Gao Fa 2015 337).
D. Victor Utomo, R. Gernowo, and A. Sugiharto, “Ata-based fuzzy TOPSIS for alternative ranking,” Jurnal Sistem Informasi Bisnis, vol. 3, no. 2, pp. 104–108, 2016.
K. Arif-Uz-Zaman, A Fuzzy TOPSIS Based Multi Criteria Performance Measurement Model for Lean Supply Chain, Queensland University of Technology, Brisbane, Australia, 2012.