Discrete Dynamics in Nature and Society

Volume 2017 (2017), Article ID 2387543, 6 pages

https://doi.org/10.1155/2017/2387543

## A PLS Approach to Measuring Investor Sentiment in Chinese Stock Market

^{1}Glorious Sun School of Business and Management, Donghua University, Shanghai 20051, China

^{2}Shanghai Ocean University, Shanghai 201306, China

Correspondence should be addressed to Gang He

Received 8 April 2017; Revised 7 June 2017; Accepted 27 June 2017; Published 6 August 2017

Academic Editor: Alicia Cordero

Copyright © 2017 Gang He et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We select five objective sentiment indicators and one subjective sentiment indicator to build a composite investor sentiment index for the Chinese stock market using partial least squares (PLS). We do so to overcome the shortcomings of principal component analysis, which was adopted to build the composite investor sentiment index in the pioneering research. Moreover, because individual investors make up a large proportion of the Chinese stock market and investor sentiment changes rapidly, we use weekly data, which have finer information granularity and higher frequency. Empirical tests of the index's reasonableness and of its ability to predict the market show that this index fits the data better and improves prediction.

#### 1. Introduction

Recently, investor sentiment measurement has become one of the more widely examined areas in behavioral finance. The key to measuring investor sentiment is to find proxy indicators that express sentiment accurately. Ideally, these proxies are observable and quantifiable and reflect investors' views of the market objectively and comprehensively. Investor sentiment proxies are usually divided into three types: single objective sentiment indicators, single subjective sentiment indicators, and comprehensive sentiment indexes. Single indicators are the basic components of a composite index and are used flexibly in different studies. The composite index has theoretical advantages: if it is constructed properly, it yields a more accurate measure of sentiment. Since the pioneering literature, constructing comprehensive sentiment indexes has become the mainstream approach. Baker and Wurgler [1] used the first principal component of the proxies as their measure of investor sentiment, and this approach has been extensively adopted in subsequent research; for example, Stambaugh et al. [2], Ben-Rephael et al. [3], Chen et al. (2014), Chong et al. [4], Zhigao and Ning [5], and Ma and Zhang [6] all basically adopt this method.

However, the first principal component is the combination of the six proxies that maximally represents their total variation. Every proxy may contain approximation error relative to the true but unobservable investor sentiment, and these errors are part of the proxies' variation, so the first principal component can contain a substantial amount of common approximation error that is not relevant for forecasting returns. Partial least squares (PLS) addresses this problem effectively. Its principal advantage is that it extracts from the sentiment proxies as much as possible of the component that reflects investor sentiment, which ensures that the extracted part is close to true investor sentiment. For example, Huang et al. [7] apply the PLS method to the same six American investor sentiment proxies as Baker and Wurgler [1], namely, the closed-end fund discount rate, share turnover, the number of IPOs, first-day returns of IPOs, the dividend premium, and the equity share in new issues, to propose a new sentiment index. They call the index extracted in this way the aligned investor sentiment index, and they find that it has greater power in predicting the aggregate stock market than the Baker and Wurgler [1] index.
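The concern about the first principal component can be illustrated with a small simulation. The sketch below is not taken from the paper: the loadings, noise level, sample size, and all variable names are hypothetical, chosen so that six synthetic proxies load more heavily on a common approximation error than on sentiment. Under those assumptions, the first principal component tracks the error more closely than the sentiment it is meant to measure.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 2000, 6                              # hypothetical sample size and proxy count

sentiment = rng.standard_normal(T)          # true, unobservable sentiment
common_error = rng.standard_normal(T)       # approximation error shared by all proxies

# Assumed loadings: each proxy loads 0.5 on sentiment and 1.0 on the common error.
load_s = np.full(N, 0.5)
load_e = np.full(N, 1.0)
noise = 0.5 * rng.standard_normal((T, N))   # idiosyncratic noise per proxy
proxies = np.outer(sentiment, load_s) + np.outer(common_error, load_e) + noise

# First principal component from the SVD of the centered proxy matrix.
centered = proxies - proxies.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = centered @ vt[0]

# The sign of a principal component is arbitrary, so compare absolute correlations.
corr_with_sentiment = abs(np.corrcoef(pc1, sentiment)[0, 1])
corr_with_error = abs(np.corrcoef(pc1, common_error)[0, 1])
print(f"|corr(PC1, sentiment)| = {corr_with_sentiment:.2f}")
print(f"|corr(PC1, common error)| = {corr_with_error:.2f}")
```

The result depends entirely on the assumed loadings; if the proxies loaded mainly on sentiment, the first principal component would track sentiment well.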

Huang et al. [7] showed that the PLS method is suitable for constructing an investor sentiment index in the American stock market. In this paper, to predict the Chinese aggregate stock market better, we develop a Chinese market sentiment index using the PLS method. The rest of the paper is organized as follows. Section 2 introduces the principle of the partial least squares (PLS) method used to construct the index. Section 3 constructs the comprehensive index of investor sentiment and then tests its robustness and its power to predict the stock market. Finally, Section 4 concludes the paper.

#### 2. Principles of Partial Least Squares (PLS)

Partial least squares (PLS) was first proposed by Wold and Albano in 1983. It makes multivariate regression modeling feasible in small samples. After the improvement of Kelly and Pruitt [8], it can also be used to solve the problem of variable information extraction. Unlike principal component analysis (PCA), PLS decomposes the predictor variables $X$ and the response variable $Y$ at the same time, extracts components (usually called factors) from both, and then orders the factors from largest to smallest according to the correlation between them. In other words, PLS not only explains the information in the predictor variables well but also summarizes the response variable well and eliminates noise interference in the system. It therefore mitigates the problem that PCA extracts only the information hidden in the predictor variables $X$, which reduces the accuracy of the regression model.

We assume that the one-period-ahead expected log excess stock return explained by investor sentiment follows the standard linear relation

$$E_t(R_{t+1}) = \alpha + \beta S_t, \tag{1}$$

where $S_t$ is the comprehensive investor sentiment index in period $t$ and $R_t$ is the closing price of the China Securities Free Float Index (CSI Free Float) in period $t$. (The CSI Free Float index is composed of the fully circulating shares of the Shanghai and Shenzhen stock markets; its base date is December 30, 2005, its base point is 1000, and it adjusts the market capitalization of the stocks based on all samples.) Equation (1) states that the expected closing price of CSI Free Float in period $t+1$ is related to investor sentiment in period $t$. The realized closing price of CSI Free Float in period $t+1$ is therefore

$$R_{t+1} = E_t(R_{t+1}) + \varepsilon_{t+1} = \alpha + \beta S_t + \varepsilon_{t+1}, \tag{2}$$

where $\varepsilon_{t+1}$ is a residual term that is unpredictable and unrelated to investor sentiment $S_t$. Let $x_t = (x_{1,t}, \ldots, x_{N,t})'$ denote the $N \times 1$ vector of individual investor sentiment proxy variables in period $t$, and assume that each original proxy has the following structure:

$$x_{i,t} = \eta_{i,0} + \eta_{i,1} S_t + \eta_{i,2} E_t + e_{i,t}, \quad i = 1, \ldots, N. \tag{3}$$

We assume that $S_t$ is a linear combination of the $x_{i,t}$, so that the relationship between $S_t$ and $x_{i,t}$ is

$$S_t = \sum_{i=1}^{N} \omega_i x_{i,t}, \tag{4}$$

where $S_t$ represents the investor sentiment information contained in the original proxy variable $x_{i,t}$; $E_t$ represents the deviation information, which is unrelated to investor sentiment but is related to the closing price of CSI Free Float; and $e_{i,t}$ is the idiosyncratic noise contained in proxy variable $x_{i,t}$. The coefficients $\eta_{i,1}$ and $\eta_{i,2}$ represent the sensitivity of proxy variable $x_{i,t}$ to $S_t$ and $E_t$, respectively, and $\omega_i$ represents the weight, in the composite index, of the investor sentiment information contained in proxy variable $x_{i,t}$. The core of the problem is therefore how to separate the investor sentiment information from the structure of each original proxy variable $x_{i,t}$. PLS is better suited to this than PCA because it effectively eliminates the interference of the deviation information and the idiosyncratic noise, and so can construct a comprehensive sentiment index that reflects true investor sentiment.

Integrating (2), (3), and (4), we obtain the following relationship between each individual investor sentiment proxy $x_{i,t-1}$ and the closing price of CSI Free Float $R_t$:

$$x_{i,t-1} = \pi_{i,0} + \pi_i R_t + u_{i,t-1}, \quad i = 1, \ldots, N. \tag{5}$$

Here $\pi_i$ represents the explanatory power of the original proxy variable $x_{i,t-1}$ for the closing price of CSI Free Float $R_t$. Combining (2), (3), and (4), each investor sentiment proxy variable can be expressed as a linear function of the closing price of CSI Free Float, and it is unrelated to the unpredictable deviation $\varepsilon_t$. We therefore take $\pi_i$ in (5) to reflect the contribution of proxy variable $x_{i,t}$ to the comprehensive investor sentiment index $S_t$. This contribution can be determined by the covariance between the proxy variable $x_{i,t-1}$ and the closing price of CSI Free Float $R_t$. Then, based on the PLS method, the comprehensive investor sentiment index can be expressed as

$$S_t^{\mathrm{PLS}} = \sum_{i=1}^{N} \omega_i x_{i,t}, \tag{6}$$

where $x_{i,t}$ is the series of an individual original investor sentiment proxy variable and $\omega_i$ is the weight of that proxy in the comprehensive investor sentiment index.
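The covariance-based weighting described above can be sketched as a two-pass regression in the spirit of Kelly and Pruitt [8] and Huang et al. [7]. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pls_sentiment_index`, the synthetic data, and the loadings are all hypothetical. The first pass estimates, for each proxy, the slope of the lagged proxy on the target series (covariance divided by variance); the second pass is a cross-sectional regression of the proxies on those slopes, which reduces to a weighted sum of the proxies.

```python
import numpy as np

def pls_sentiment_index(proxies, prices):
    """Two-pass PLS sketch (hypothetical helper, not from the paper).

    proxies: (T, N) array, column i holds proxy i over time
    prices:  (T,) array, the target series
    Returns the (T,) index and the (N,) weights.
    """
    x_lag = proxies[:-1]                      # proxies one period earlier
    r_dev = prices[1:] - prices[1:].mean()    # demeaned target
    # First pass: time-series slope of each lagged proxy on the target.
    slopes = (x_lag - x_lag.mean(axis=0)).T @ r_dev / (r_dev @ r_dev)
    # Second pass: a cross-sectional regression of the proxies on the slopes
    # (with intercept) has a slope coefficient that is linear in the proxies.
    dev = slopes - slopes.mean()
    weights = dev / (dev @ dev)
    return proxies @ weights, weights

# Synthetic check with assumed loadings.
rng = np.random.default_rng(0)
T = 2000
sentiment = rng.standard_normal(T)
common_error = rng.standard_normal(T)
load_s = np.array([1.0, 0.8, 0.6, 0.2, 0.1, 0.0])   # assumed sentiment loadings
proxies = (np.outer(sentiment, load_s)
           + common_error[:, None]                   # every proxy loads 1.0 on the error
           + 0.3 * rng.standard_normal((T, 6)))
prices = np.empty(T)
prices[0] = 0.0
prices[1:] = 0.5 * sentiment[:-1] + rng.standard_normal(T - 1)

index, weights = pls_sentiment_index(proxies, prices)
corr_s = np.corrcoef(index, sentiment)[0, 1]
corr_e = np.corrcoef(index, common_error)[0, 1]
print(f"corr(index, sentiment) = {corr_s:.2f}, corr(index, error) = {corr_e:.2f}")
```

The synthetic data are deliberately simple: because every proxy loads equally on the common error and the second-pass weights sum to zero, the error cancels out of the index. Real proxies need not behave this way, and the weights here are estimated in-sample.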

#### 3. Empirical Analysis

##### 3.1. Data

In collecting the indicator data, we take into account that individual investors make up a large proportion of the Chinese stock market; they are easily influenced by short-term market volatility, which leads to irrational speculation. To track changes in investor sentiment more accurately, we use weekly data, which have finer information granularity and higher frequency, to capture immediate investor sentiment, rather than the annual or monthly data used in most of the previous literature. The weekly data from January 4, 2008, to May 30, 2014, serve as the training set for constructing the sentiment index. To test the validity and robustness of the construction method, we take the weekly data from June 6, 2014, to May 29, 2015, as the test set, and we use the CSI Free Float over the corresponding period to represent the overall performance of Chinese A shares.

After screening the candidate proxy indicators, we select five objective indicators, namely, the SWS Low Profit Margin Stock Index (LPM(0)), the SWS High-P/E-Ratio Index (HPEI(0)), the SWS High-P/B-Ratio Index (HPBI(0)), the one-period-lagged Newly Additional Fund Accounts (NAFA(+1)), and the six-period-lagged number of new IPOs (NIPO(+6)), and one subjective indicator over the same period, the New Fortune Analyst Index (CAI(0)). Based on the conclusions of Baker and Wurgler [1], we believe that investor sentiment leads investors to make decisions; at the same time, investor sentiment itself is affected by changes in macroeconomic factors. For example, the number of IPOs changes with fluctuations in the macroeconomic cycle. Such changes rest on an objective analysis of actual macroeconomic conditions. They are a rational component of investors' psychology and are outside the scope of this study. Therefore, we separate out the rational component of investor sentiment with a multivariate regression model, remove it, and retain only the irrational component:

$$x_{i,t} = c_i + \sum_{j} \gamma_{i,j} M_{j,t} + \xi_{i,t}, \tag{7}$$

where $x_{i,t}$ is the value of original proxy variable $i$ in period $t$, the $M_{j,t}$ are a series of indicators reflecting macroeconomic fundamentals, $\gamma_{i,j}$ is the parameter to be estimated, and $c_i$ is a constant. $\xi_{i,t}$ is the residual of the regression equation, which represents irrational sentiment excluding macroeconomic fundamentals. Taking into account the representativeness of the macroeconomic cycle variables and the availability of weekly data, we use China's commodity price index (CCPI) and the Central Bank's weekly net monetary supply (MNS) as proxy variables for macroeconomic fundamentals.
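The step of removing macroeconomic fundamentals amounts to keeping the ordinary least squares residuals of each proxy regressed on the macro variables. The sketch below is an illustration only: the helper name `remove_macro_component` is hypothetical, and the CCPI and MNS series are synthetic stand-ins, not the paper's data.

```python
import numpy as np

def remove_macro_component(proxy, macro):
    """Return the OLS residuals of `proxy` regressed on a constant and `macro`.

    proxy: (T,) array, one sentiment proxy
    macro: (T, K) array, macroeconomic variables (e.g. CCPI and MNS)
    """
    design = np.column_stack([np.ones(len(proxy)), macro])
    coef, *_ = np.linalg.lstsq(design, proxy, rcond=None)
    return proxy - design @ coef

# Synthetic stand-ins for the macro series and one proxy.
rng = np.random.default_rng(1)
T = 300
ccpi = rng.standard_normal(T)
mns = rng.standard_normal(T)
macro = np.column_stack([ccpi, mns])
irrational = rng.standard_normal(T)
proxy = 2.0 + 0.7 * ccpi - 0.3 * mns + irrational

residual = remove_macro_component(proxy, macro)
# By construction, OLS residuals have zero mean and are uncorrelated
# with every regressor in the sample.
print(abs(residual.mean()) < 1e-8, abs(np.corrcoef(residual, ccpi)[0, 1]) < 1e-8)
```

In practice the same function would be applied to each proxy in turn, and the residual series would replace the raw proxies in later steps.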

The five residual series obtained from these regressions represent the proxy variables of irrational investor sentiment after the elimination of macroeconomic fundamentals.

Because the selected original investor sentiment proxy variables are not normally distributed, we standardize them with the 0-1 (min-max) method, which subtracts the minimum value of a variable from each observed value and divides by the variable's range. The specific formula is

$$x^{*}_{i,t} = \frac{x_{i,t} - \min_{t} x_{i,t}}{\max_{t} x_{i,t} - \min_{t} x_{i,t}}. \tag{8}$$

After standardization, the observed values of each variable fall within $[0, 1]$; the standardized data are pure numbers without units and can be used directly in the index construction that follows. The descriptive statistics of the selected investor sentiment proxy indicators after this preprocessing are shown in Table 1.
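The 0-1 standardization can be sketched in a few lines. The function name `min_max_standardize` and the input values are hypothetical; the guard against a constant series is an added safeguard rather than something the paper discusses.

```python
import numpy as np

def min_max_standardize(series):
    """Rescale a series to [0, 1] by subtracting its minimum and dividing by its range."""
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    if hi == lo:
        raise ValueError("a constant series cannot be min-max standardized")
    return (series - lo) / (hi - lo)

example = min_max_standardize([-1.5, 0.0, 0.5, 2.5])
print(example.tolist())  # [0.0, 0.375, 0.5, 1.0]
```

With separate training and test windows, one option is to compute the minimum and maximum on the training window only and reuse them for the test window; the text does not state which choice was made.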