Special Issue: Complexity in Financial Markets
Econometric Modeling to Measure the Efficiency of Sharpe’s Ratio with Strong Autocorrelation Portfolios
Sharpe’s ratio is the most widely used index for ranking the portfolios to which an investor has access. The purpose of this investigation is to verify whether Sharpe’s ratio supports investment portfolio decisions under different financial market conditions. The research models the financial series of returns with an autoregressive (AR) model and uses Sharpe’s ratio to rank the financial assets available to the investor, while observing the effects that autocorrelated series can have on evaluation measures for financial assets. The results presented in this study confirm the proposed hypothesis that Sharpe’s ratio allows decisions to be made in the selection of investment portfolios under normal conditions. This is supported by the definition of a robustness function whose empirical estimation explains, on average, 73% of the variance in the degradation of the Spearman coefficient for each of the performance measures. However, in the presence of autocorrelation in the financial series of returns, the similarity between the rankings produced by Sharpe’s ratio and by the other performance measures is broken.
The assessment of financial assets determines how an investment has behaved against some contrast parameter, providing signals about whether a decision exceeds or falls short of the investor’s expectations. This type of evaluation improves financial activity by making an investment decision based on a set of alternatives, enabling the investor to make an adequate selection regarding the combination of risk and return. The investor, using information about the yields of financial assets, can make decisions about the composition of his or her portfolio.
According to Cesarone et al., the risk parity model is always the most stable in all the cases analysed with respect to the composition of the portfolio. In addition, minimum risk models are often more stable than maximum risk-gain models, and the minimum variance model is usually the preferred one. Bessler et al. analysed diversification benefits using various asset allocation strategies, such as risk parity, minimum variance, and mean variance, and examined whether an industry- or country-based approach provides superior performance; depending on financial market conditions and on the assets that make up the portfolio, one strategy may perform better than another.
According to Bailey et al., a portfolio design based on retrospective tests often fails to deliver real performance. The research indicates that, given any desired performance profile, it is possible to design a portfolio composed of common securities, such as the constituents of the S&P 500 index, that achieves the desired profile on in-sample backtest data.
Since autocorrelation is common in time series, it is necessary to understand its effect on work with financial time series. Autocorrelation makes it difficult to properly estimate both the performance parameters (returns) and the measures used to evaluate portfolio performance; the latter problem is discussed in the financial literature by Lo and Eling [4, 5], among other authors. However, the evaluation of financial asset performance in the presence of autocorrelation is an area that has not been heavily explored.
The first developments in the field of financial asset assessment can be found in the seminal contribution of Sharpe, who developed the metric that has since been considered the main means for investors to evaluate the returns of financial assets. Sharpe’s ratio relates the expected excess return of a given asset to its risk level, which is measured by the standard deviation of the asset’s returns. Amenc et al. mentioned that 80% of managers use Sharpe’s ratio for the evaluation of their portfolios [9, 10]. This measure of market returns and risk is widely used by investors who assume that returns are generated from a normal distribution, and Sharpe’s ratio has been widely accepted because of its direct link to modern portfolio theory, which was first proposed by Markowitz.
For Bao, among others, risk estimation can play an important role in the optimal choice of the portfolio; if the sample distribution of Sharpe’s ratio can be derived, then a risk-adjusted ratio could be designed. However, Sharpe’s ratio does not have a manageable distribution under general conditions. The use of alternative risk and return measures also mitigates the problem that returns on assets are often not normally distributed or are serially correlated.
Even with the strong theoretical properties of Sharpe’s ratio observed when measuring the goodness of a financial asset, as presented by Van Dyk et al. , one can ask why Sharpe’s ratio is more extensively used by investors than other performance measures , such as those of Sortino and Van Der Meer , Sortino et al. , Keating and Shadwick , Dowd , Young , Kestner , and Kaplan and Knowles , which in principle are able to characterise the distributions of the returns that usually appear in financial markets. The performance measures are shown in Table 1.
Eling and Schuhmacher  performed an analysis of Sharpe’s ratio in comparison with 12 other evaluation measures, showing that Sharpe’s ratio exhibits high correlations with these other metrics and implies that the decision criteria of investors do not change if another evaluation measure is used instead of Sharpe’s ratio.
For Bamms and Honarvar , an increase in the standard deviation is needed for each of the following factors: (a) when investors are more risk-averse, they expect a higher return for providing liquidity; (b) when assets are volatile, liquidity shocks create stronger commercial demands and therefore liquidity seekers pay a higher premium; and (c) when assets are highly correlated, the increased risk of overflow from liquidity shocks between assets raises the price of liquidity by raising the expected daily yield of liquidity providers (annualised Sharpe index) by 0.16%, 0.38%, and 0.40% (0.82, 1.27, and 2.10 units).
Similarly, Eling indicated that the use of different valuation measures by investors does not substantially change the rankings assigned to financial assets. Hass et al. repeated the analysis of Eling and incorporated other evaluation measures, showing different results that expand upon the robustness of Sharpe’s ratio regarding the measurement of financial assets and highlight a particular evaluation metric that is superior to the others, the manipulation-proof performance measure (MPPM), which exhibits sensitivity to the parameters with which it is computed.
Geltner , Okunev and White , and Gallais-Hamonno and Nguyen-Thi-Thanh  found that the resulting evaluation differs when the correction-based method is used. Therefore, the purpose of the research is to apply the correction of the methodology of Geltner , Geltner , and Okunev and White  to S&P 500 financial assets for performance measurement under Sharpe’s ratio and other performance methodologies.
It is necessary to mention that, according to Chkrabarti , the estimates of the average profitability of the assets are noisy, and this noise harms Sharpe’s ratio out of sample of the current methods. For Kim et al. , the methodology in simulated economies and a large panel of US equity returns works well in the application in shares and finds that the arbitration portfolio has significant alphas (statistically and economically) in relation to several popular asset pricing models and Sharpe’s ratio.
This study considers the effect of the use of Sharpe’s ratio on the robustness of financial asset evaluations while observing the effects that can cause autocorrelated series in evaluation measures for financial assets. The paper is presented in the following order. The Methodology section shows an analysis of the influence of autocorrelation, an analysis of the data used, and the statistical model used; it is followed by the Results section and the Discussion, Conclusions, Limitations, and Future Research section.
2.1. Autocorrelation and Bias Problem
The phenomenon of serial autocorrelation is present in series of financial asset returns and can cause strong biases during decision-making processes and errors that can impact investor decisions. There are at least two significant problems dealt with constantly by investors: one involves the management of the information possessed by investors about funds, and the second corresponds to the statistical properties of the returns of the funds that impact their investment strategies. In this section, we focus on the second problem, as it affects the decisions made by investors.
One of the main statistical elements that impact the returns of financial assets corresponds to the serial correlation inherent in monthly measurements, as mentioned by Brooks and Kat . In their study, it is shown that many evaluation indexes present strong serial correlations with the parameters of an autoregressive model of order 1, showing a strong level of bias that leads to the underestimation of volatility.
In contrast, Avramov et al.  mentioned in their article that there is strong evidence that the illiquidity of financial markets has an effect on the autocorrelation of returns. In a similar way, we can cite Chordia and Swaminathan  who mentioned that the problems of autocorrelation and cross autocorrelation are related to the trade volumes of financial assets.
Zakamouline  mentioned how the high degree of correlation between the different measures of performance obtained with Sharpe’s ratio represents a puzzle; the study focused on explaining the reasons for this phenomenon, which is described by Eling and Schuhmacher  and Eling . In that same study, Zakamouline concluded that the calculated correlation depends on the properties of the given sample, finding that financial assets with significant Sharpe’s ratios lead to substantial changes in the rankings of assets if other performance measures are used. However, the study used a small sample, which may bias and condition the results.
Thus, a general model of time series is used in the current study for the measurement of performance ratios, and we analyse the effect of autocorrelation on Sharpe’s ratio. The data generation model is described as follows:
$$r_{i,t} = \alpha_i + \sum_{j=1}^{p} \phi_{i,j}\, r_{i,t-j} + \varepsilon_{i,t}, \tag{1}$$
where $r_{i,t}$ corresponds to the time series returns of asset $i$, $\alpha_i$ corresponds to an adjustment parameter, $\phi_{i,j}$ corresponds to a model parameter related to the lags of the time series of $r_{i,t}$, and $\varepsilon_{i,t}$ corresponds to a white noise error term.
This study assumes that the returns of financial assets follow a stationary process and examines the process of autocorrelation using an autoregressive process of order 1 (AR (1)) to exemplify and characterise the effects of autocorrelation on the function that defines the rankings of financial assets using Sharpe’s ratio criterion.
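For the case of an AR (1) model, $r_t = \alpha + \phi\, r_{t-1} + \varepsilon_t$ with $|\phi| < 1$, the mean of the process is
$$\mu = \frac{\alpha}{1-\phi}, \tag{2}$$
and the variance of the process is defined as
$$\gamma_0 = \frac{\sigma_\varepsilon^2}{1-\phi^2}, \tag{3}$$
where $\sigma_\varepsilon^2$ corresponds to the variance of $\varepsilon_t$. We define by $\gamma_0$ the variance of process $r_t$, and the autocovariance at lag $k$ is
$$\gamma_k = \phi^{k}\, \gamma_0, \tag{4}$$
and the autocorrelation is defined as follows:
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \phi^{k}. \tag{5}$$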
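The stationary moments of an AR (1) process can be checked numerically. The following is a minimal sketch, not the procedure used in this study: the function names, the parameter values, and the Gaussian white noise are illustrative assumptions. It simulates a long AR (1) path and compares the sample mean, variance, and lag-1 autocorrelation with the textbook values $\alpha/(1-\phi)$, $\sigma_\varepsilon^2/(1-\phi^2)$, and $\phi$.

```python
import random
import statistics


def simulate_ar1(alpha, phi, sigma_eps, n, seed=0, burn_in=500):
    """Simulate r_t = alpha + phi * r_{t-1} + eps_t with Gaussian white noise."""
    rng = random.Random(seed)
    r = alpha / (1.0 - phi)  # start at the stationary mean
    path = []
    for t in range(n + burn_in):
        r = alpha + phi * r + rng.gauss(0.0, sigma_eps)
        if t >= burn_in:  # discard the burn-in draws
            path.append(r)
    return path


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[1:], x[:-1]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


# Illustrative parameter values (assumptions, not estimates from the paper).
alpha, phi, sigma_eps = 0.001, 0.4, 0.01
path = simulate_ar1(alpha, phi, sigma_eps, n=100_000)

theoretical_mean = alpha / (1.0 - phi)
theoretical_var = sigma_eps ** 2 / (1.0 - phi ** 2)

print("mean:", statistics.fmean(path), "vs", theoretical_mean)
print("variance:", statistics.pvariance(path), "vs", theoretical_var)
print("lag-1 autocorrelation:", lag1_autocorrelation(path), "vs", phi)
```

With a sample of this length the three sample statistics should sit close to their theoretical counterparts; shorter samples show visibly more scatter.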
The effect of autocorrelation is shown in the work of Asness et al.  which demonstrates that the tested funds have positive autocorrelation factors that are statistically significant but generate a bias that underestimates the true variance of the return series. However, in a more thorough analysis of the nature of this bias, we can show that, in a case with negative autocorrelations, the effect is the reverse of that described previously, and the traditional estimate of variance overestimates the true volatility of the series.
For the adequate calculation of financial asset returns, we can describe the procedure developed by Geltner and Blundell and Ward [27, 38], which allows us to correct for the bias produced by the autocorrelation of the time series of financial assets. Here, $r_t^{o}$ corresponds to the observed return during period $t$; this return is a combination, weighted by $\phi$, of the true value of the return $r_t^{*}$ and the observed return during period $t-1$:
$$r_t^{o} = (1-\phi)\, r_t^{*} + \phi\, r_{t-1}^{o}, \tag{6}$$
where $\phi$ corresponds to a model parameter that represents the correlation factor, and $|\phi| < 1$. By isolating the true value of the return, we obtain
$$r_t^{*} = \frac{r_t^{o} - \phi\, r_{t-1}^{o}}{1-\phi}. \tag{7}$$
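The correction can be sketched in a few lines. The example below assumes the first-order smoothing form $r_t^{o} = (1-\phi)\, r_t^{*} + \phi\, r_{t-1}^{o}$ that is commonly attributed to Geltner; the function names `smooth_returns` and `unsmooth_returns` and the sample values are hypothetical and serve only to show that the inversion recovers the underlying returns.

```python
def smooth_returns(true_returns, phi, first_observed=0.0):
    """Apply r_obs[t] = (1 - phi) * r_true[t] + phi * r_obs[t-1]."""
    observed = []
    previous = first_observed
    for r_true in true_returns:
        previous = (1.0 - phi) * r_true + phi * previous
        observed.append(previous)
    return observed


def unsmooth_returns(observed, phi):
    """Invert the smoothing: r_true[t] = (r_obs[t] - phi * r_obs[t-1]) / (1 - phi).

    The first observation has no predecessor, so the result is one
    element shorter than the input.
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    return [
        (current - phi * previous) / (1.0 - phi)
        for previous, current in zip(observed[:-1], observed[1:])
    ]


# Hypothetical true returns, used only to illustrate the round trip.
true_returns = [0.010, -0.004, 0.007, 0.002, -0.011]
observed = smooth_returns(true_returns, phi=0.3)
recovered = unsmooth_returns(observed, phi=0.3)

# recovered matches true_returns[1:] up to floating-point rounding
print(recovered)
```

In practice $\phi$ is not known and has to be estimated, for example from the lag-1 autocorrelation of the observed series, so the recovered returns inherit the estimation error of $\phi$.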
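On the other hand, we define the standard deviation bias as
$$\mathrm{Bias}(\hat{\sigma}) = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(r_t - \bar{r}\right)^2} - \sigma, \tag{8}$$
where $r_t$ corresponds to the process realisations, with sample mean $\bar{r}$; $T$ corresponds to the total number of observations for the calculation of the statistics; and $\sigma^2$ corresponds to the true variance of the process.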
Figure 1 shows the difference between the computed standard deviation and the standard deviation corrected by the procedure described by Geltner and Blundell and Ward for simulated data. It is shown that, at a higher level of negative autocorrelation, the difference between the computed volatilities tends to increase explosively. A similar phenomenon is observed in the presence of positive autocorrelation, where the bias tends to grow very rapidly, corroborating the overestimation of the standard deviation for the case in which the series is negatively autocorrelated and the underestimation for the case when the series is positively correlated.
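The direction of this bias can be made explicit with a short derivation. As a sketch, assume the first-order smoothing form $r_t^{o} = (1-\phi)\, r_t^{*} + \phi\, r_{t-1}^{o}$, with true returns $r_t^{*}$ that are serially uncorrelated with variance $\sigma_*^2$ and an observed series that is stationary; these are assumptions made for illustration. Because $r_t^{*}$ is uncorrelated with $r_{t-1}^{o}$, the variance of the observed series satisfies:

```latex
\begin{aligned}
\operatorname{Var}(r_t^{o}) &= (1-\phi)^2\, \sigma_*^2 + \phi^2 \operatorname{Var}(r_{t-1}^{o}) \\
\Rightarrow\quad
\operatorname{Var}(r_t^{o}) &= \frac{(1-\phi)^2}{1-\phi^2}\, \sigma_*^2
  = \frac{1-\phi}{1+\phi}\, \sigma_*^2 .
\end{aligned}
```

Under these assumptions the factor $(1-\phi)/(1+\phi)$ is below one for $\phi > 0$, so the observed variance understates the true variance, and above one for $\phi < 0$, so it overstates it. The factor grows without bound as $\phi$ approaches $-1$, which is consistent with the explosive behaviour described for strongly negative autocorrelation.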
The bias produced by autocorrelation affects performance measures such as Sharpe’s ratio and can cause strong losses when an inadequate investment strategy is derived from the resulting measurement errors.
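We define Sharpe’s ratio bias as follows:
$$\mathrm{Bias}(SR) = \frac{\bar{r}^{o} - r_f}{\hat{\sigma}^{o}} - \frac{\bar{r}^{*} - r_f}{\sigma}, \qquad \bar{r}^{o} = \frac{1}{T}\sum_{t=1}^{T} r_t^{o}, \quad \bar{r}^{*} = \frac{1}{T}\sum_{t=1}^{T} r_t^{*}, \tag{9}$$
where $r_f$ corresponds to a risk-free rate; $\sigma^2$ corresponds to the variance of the process; $r_t^{o}$ corresponds to the observed return in $t$; $r_t^{*}$ corresponds to the corrected return in $t$; $\hat{\sigma}^{o}$ is the sample standard deviation of the observed returns; and $T$ corresponds to the total number of observations for the calculation of the statistics.
Figure 2 shows the bias in Sharpe’s ratio that results from failing to properly estimate the values of the returns, for simulated data and a risk-free rate equal to 5% per year, indicating the effect of the autocorrelation parameter on the proper estimation of the variance or standard deviation of a time series.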
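The sign of both biases can be reproduced with a small simulation. The sketch below is illustrative only and does not reproduce Figures 1 and 2: it assumes Gaussian true returns, the first-order smoothing form $r_t^{o} = (1-\phi)\, r_t^{*} + \phi\, r_{t-1}^{o}$, and a 5% annual risk-free rate spread evenly over 250 trading days. The helper names and parameter values are hypothetical.

```python
import random
import statistics


def smooth_returns(true_returns, phi):
    """Build an autocorrelated observed series from uncorrelated true returns."""
    observed = [true_returns[0]]
    for r_true in true_returns[1:]:
        observed.append((1.0 - phi) * r_true + phi * observed[-1])
    return observed


def sharpe_ratio(returns, risk_free):
    """Mean excess return divided by the sample standard deviation."""
    return (statistics.fmean(returns) - risk_free) / statistics.stdev(returns)


def bias_report(phi, true_returns, risk_free):
    """Return (standard deviation bias, Sharpe ratio bias) of the observed series."""
    observed = smooth_returns(true_returns, phi)
    sd_bias = statistics.stdev(observed) - statistics.stdev(true_returns)
    sr_bias = sharpe_ratio(observed, risk_free) - sharpe_ratio(true_returns, risk_free)
    return sd_bias, sr_bias


rng = random.Random(42)
risk_free = 0.05 / 250  # assumption: 5% per year over 250 trading days
true_returns = [rng.gauss(0.0006, 0.01) for _ in range(50_000)]

for phi in (-0.5, 0.0, 0.5):
    sd_bias, sr_bias = bias_report(phi, true_returns, risk_free)
    print(f"phi={phi:+.1f}  sd bias={sd_bias:+.5f}  Sharpe bias={sr_bias:+.4f}")
```

With a positive mean excess return, positive autocorrelation shrinks the measured standard deviation and inflates the measured Sharpe ratio, while negative autocorrelation does the opposite; with $\phi = 0$ the observed and true series coincide and both biases vanish.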
2.2. Data Analysis
The data utilised correspond to the 446 financial series of stocks obtained from daily quotations from the New York Stock Exchange evaluated from January 1, 2010, to June 30, 2021. The group of financial assets corresponds to the list belonging to the S&P 500 on July 1, 2021, which was kept active in the time period evaluated. The companies that have been excluded from the analysis correspond to those that were not available on January 1, 2010, but were part of the S&P 500 in July 2021.
The profitability of each of the assets in the sample is determined as follows:where corresponds to the profitability of stock in period and corresponds to the price of stock in period . For the computation of the parameters of the AR (1) model, we use 250 pieces of rolling subsample data for each value reported (1 horizon year), resulting in periods with the parameters of the computed model.
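The rolling estimation can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used in the study: returns are computed as log price differences (the exact return definition is an assumption), the AR (1) parameters are obtained by ordinary least squares on each 250-observation window, and the price series is simulated. The names `log_returns`, `fit_ar1`, and `rolling_ar1` are hypothetical.

```python
import math
import random


def log_returns(prices):
    """Log returns ln(P_t / P_{t-1}); the return definition is an assumption."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices[:-1], prices[1:])]


def fit_ar1(returns):
    """OLS fit of r_t = alpha + phi * r_{t-1} + eps_t.

    Returns (alpha, phi, sigma_eps), where sigma_eps is the residual
    standard deviation.
    """
    x, y = returns[:-1], returns[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    alpha = my - phi * mx
    residuals = [b - alpha - phi * a for a, b in zip(x, y)]
    sigma_eps = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    return alpha, phi, sigma_eps


def rolling_ar1(returns, window=250):
    """Fit the AR(1) model on every window of `window` consecutive returns."""
    return [
        fit_ar1(returns[start:start + window])
        for start in range(len(returns) - window + 1)
    ]


# Simulated prices driven by AR(1) returns (illustrative values only).
rng = random.Random(3)
r, prices = 0.0, [100.0]
for _ in range(3000):
    r = 0.0003 + 0.3 * r + rng.gauss(0.0, 0.01)
    prices.append(prices[-1] * math.exp(r))

returns = log_returns(prices)
estimates = rolling_ar1(returns, window=250)
print(len(estimates), "windows; first window (alpha, phi, sigma):", estimates[0])
```

Each window yields one triple of parameters, so a series of $n$ returns produces $n - 250 + 1$ estimates that can then be averaged across assets at each date.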
The descriptive statistics of the whole sample, including the mean value, standard deviation, skewness, and kurtosis, are shown in Table 2.
We can observe an average return of for all the analysed stocks; however, we observed great variability between the different stocks, represented by a standard deviation of . The average value of the standard deviation measured for each of the stocks is , with which we can characterise the behaviour of the stocks with a view to a risk analysis strategy.
Table 3 shows descriptive statistics for the two tested subperiods. The first period was from January 1, 2010, to December 31, 2013, which was called the postcrisis subprime period. The second period was from January 1, 2014, to December 31, 2016, which was considered a period of economic stability.
The postcrisis subprime period is characterised by a period of high returns on financial assets with an average value of and an average volatility of , which corresponds to the lowest volatility of the revised period. The presence of a positive skewness implies the existence of a tail of the heavier distribution to the right and therefore events of high positive returns can be observed.
On the other hand, the period of economic stability shows a lower level of returns compared with the postcrisis period, with a similar level of volatility. This period shows the highest level of kurtosis with an average value of , which implies that it has fatter tails, allowing us to observe positive and negative extreme yields of returns of financial assets.
Similar to Table 3, Table 4 shows descriptive statistics for two further subperiods. The first period was from January 1, 2017, to December 31, 2019, when a trade war between the USA and China was observed, which generated instability in the global markets. The second period was from January 1, 2020, to June 30, 2021, which corresponds to the period of the COVID-19 crisis, where structural failures of the markets were observed [39, 40].
The period associated with the trade war shows a higher level of returns of financial assets compared with the period of economic stability but with a higher level of volatility, and the negative skewness indicates the existence of extreme values in the negative tail of the distribution of returns of financial assets.
Finally, the period of the COVID-19 crisis is characterised by a period of lower returns of financial assets with an average of and greater volatility with an average value of 0.21.
The potential existence of autocorrelation in the time series in each of the periods evaluated may cause a bias in the estimation of Sharpe’s ratio, as described in Figure 2. This potential problem cannot be appreciated in a direct way, so, to verify the potential problems that may arise from the use of Sharpe’s ratio as the main means of performance evaluation, we proceed to evaluate a sample of 446 stocks from the list of companies belonging to S&P 500 in the same spirit as Eling  and Zakamouline  to observe the phenomena that are occurring in these portfolio assessments and calculate the correlation between Sharpe’s ratio and several performance measures.
In Figure 4, we observe the variation in the Spearman correlation coefficient for each performance method, observing substantial differences between them. In particular, the Calmar, Upside, and Sterling methods present significant differences in the analysis of the portfolio rankings versus the rankings reported by the Sharpe method. By reviewing the evolution of performance measures separately, we are allowed to observe anomalous events that may arise in financial markets and impact the risk management of financial operators.
Table 5 allows us to compare the evolution over time of each of the performance measures as an annual average. We can observe that aggregation as the annual average prevents the detection of specific events that occur in financial markets, such as structural breakdowns or other phenomena that may have real importance in short-term risk management strategies.
The high correlation between the ranking of financial assets generated by Sharpe’s ratio and other performance measures shows that the selections of investment portfolios are similar regardless of the performance measure used. The use of Sharpe’s ratio is therefore the best source of information on the risk of financial assets due to its widespread use and simplicity. However, a degradation of this correlation indicates that we no longer have a single criterion for the selection of financial assets, and further analysis is required for the selection of financial assets to optimise the portfolio.
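The comparison of rankings can be illustrated with a small example. The sketch below is not the computation behind Figure 4 or Table 5: it uses simulated Gaussian returns for a hypothetical set of assets, compares Sharpe’s ratio only with a Sortino-type ratio (with a zero target return, an assumption), and implements the Spearman coefficient as the Pearson correlation of average ranks.

```python
import math
import random
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    return (statistics.fmean(returns) - risk_free) / statistics.stdev(returns)


def sortino_ratio(returns, target=0.0):
    """Mean excess over the target divided by the downside deviation."""
    downside = math.sqrt(
        sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns)
    )
    return (statistics.fmean(returns) - target) / downside


def ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical universe of 40 assets with 500 simulated daily returns each.
rng = random.Random(7)
assets = []
for _ in range(40):
    mu = rng.uniform(-0.0005, 0.0015)
    sigma = rng.uniform(0.008, 0.03)
    assets.append([rng.gauss(mu, sigma) for _ in range(500)])

sharpe_values = [sharpe_ratio(a) for a in assets]
sortino_values = [sortino_ratio(a) for a in assets]
rho = spearman(sharpe_values, sortino_values)
print("Spearman correlation between Sharpe and Sortino rankings:", rho)
```

For uncorrelated Gaussian returns the two rankings are expected to be nearly identical, which is the baseline against which a degradation of the coefficient can be measured.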
In Figure 5, the average Spearman correlation of the ratios is shown as a function of the average scale factor values over the whole sample, and this function exhibits a nonlinear trend. Similarly, we proceed with the autocorrelation factor (Figure 6), where we do not observe a clear relation that describes a particular function. However, autocorrelation cannot be neglected a priori, since the sample has many points concentrated near the zero value.
2.3. The Statistical Model
The reviewed literature shows that the assertion of Sharpe’s ratio as an accurate metric for measuring financial assets is at least questionable and deserves a deeper review due to the large impacts that can be caused when defining investment procedures.
The extensive use of Sharpe’s ratio (unlike the metric proposed by Eling) in the financial industry as a performance criterion can have a critical influence on investment decisions, which may lead to potential arbitrage opportunities with the understanding that the chosen evaluation criterion of investment performance is inadequate over certain time periods. For this purpose, we propose the following functional relationship:
$$\rho_{k,t} = f\left(\bar{\alpha}_t, \bar{\phi}_t, \bar{\sigma}_t\right), \tag{11}$$
where $\rho_{k,t}$ corresponds to the Spearman correlation degree of Sharpe’s ratio with respect to ratio $k$ at time $t$; $\bar{\alpha}_t$ corresponds to the scale factor as the average of the assets within the investment portfolio at time $t$; $\bar{\phi}_t$ corresponds to the value of the correlation factor as the average of the assets within the investment portfolio at time $t$; $\bar{\sigma}_t$ corresponds to the standard deviation of error of the autocorrelated model as the average of the assets within the investment portfolio at time $t$; and $k$ corresponds to the type of ratio to which Sharpe’s ratio is compared.
We continue with the estimation of the model defined in equation (11) through an ordinary least-squares (OLS) model. The model is described as
$$\rho_{k,t} = \beta_0 + \sum_{j=1}^{3}\left(\beta_{1j}\, \bar{\alpha}_t^{\,j} + \beta_{2j}\, \bar{\phi}_t^{\,j} + \beta_{3j}\, \bar{\sigma}_t^{\,j}\right) + u_t, \tag{12}$$
where parameter $\beta_0$ corresponds to an adjustment parameter and represents the specific value of the level corresponding to $\rho_{k,t}$, parameter $\beta_{1j}$ corresponds to the factor describing the different powers of the scale factor $\bar{\alpha}_t$, parameter $\beta_{2j}$ corresponds to the power factor of the correlation factor $\bar{\phi}_t$, $\beta_{3j}$ corresponds to the factor describing the different powers of the standard deviation of error $\bar{\sigma}_t$, and $u_t$ corresponds to an error factor. These power factor parameters are used to capture the nonlinear effect.
For the first 3 models, the measurement is considered the average of the Spearman correlation between all the performance measures used in this study. The first specification of the models, called Model 1, considers the following explanatory variables: (i) scale factor, (ii) correlation factor, and (iii) standard deviation of error. The second specification of the models, called Model 2, considers the following additional explanatory variables: (iv) squared scale factor, (v) squared correlation factor, and (vi) squared standard deviation of error. The third specification of the models, called Model 3, considers the following additional explanatory variables: (vii) third power scale factor, (viii) third power correlation factor, and (ix) third power standard deviation of error.
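The three nested specifications can be sketched with a plain OLS fit. The example below is an illustration on synthetic data, not a reproduction of Table 6: the regressors are drawn uniformly from $[-1, 1]$, the dependent variable is generated from a hypothetical quadratic relationship, and the normal equations are solved by Gaussian elimination. All function names and coefficient values are assumptions.

```python
import random


def design_row(scale, corr, sd_err, degree):
    """Row [1, x1, x2, x3, x1^2, x2^2, x3^2, ...] up to the given power."""
    row = [1.0]
    for power in range(1, degree + 1):
        row += [scale ** power, corr ** power, sd_err ** power]
    return row


def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Return (coefficients, R^2) from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


def fit_model(samples, y, degree):
    rows = [design_row(a, p, s, degree) for a, p, s in samples]
    return ols(rows, y)


# Synthetic data: a hypothetical nonlinear dependence on the scale factor.
rng = random.Random(1)
samples = [
    (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
    for _ in range(300)
]
y = [
    0.9 - 0.3 * a ** 2 - 0.2 * p + 0.1 * s + rng.gauss(0.0, 0.02)
    for a, p, s in samples
]

for degree in (1, 2, 3):  # analogous to Models 1, 2, and 3
    beta, r_squared = fit_model(samples, y, degree)
    print(f"degree {degree}: {len(beta)} parameters, R^2 = {r_squared:.3f}")
```

Because the models are nested, the in-sample $R^2$ cannot decrease when higher powers are added; a large jump between the linear and the quadratic specification indicates that the relationship is nonlinear, which is the reason for including the power terms.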
Table 6 shows a summary of the tested Models 1, 2, and 3. We can observe a high significance for all the parameters of the three models described, where Model 1 manages to capture 57% of the variance, while Model 2 and Model 3 manage to capture approximately 80% of the variance.
Preliminary analysis of these specifications shows that the effect described in equations (6), (8), and (9), where the autocorrelation effect distorts the mean value, standard deviation, and Sharpe’s ratio, also causes a degradation in the correlation between Sharpe’s ratio and other performance measures, which may alter the conclusions described by Eling and Schuhmacher and Eling and complements what is mentioned by Zakamouline.
The effect of the correlation factor is significant at a level of 0.1% for the three models tested; therefore, from a statistical point of view, the autocorrelation factor is part of the robustness model of Sharpe’s ratio when chosen as the evaluation criterion.
The effect of the scale factor is statistically significant at a 0.1% level for the three models tested, showing the strong effect it has on the robustness of Sharpe’s ratio. Finally, the effect of the standard deviation is also significant at a level of 0.1% in Model 1.
For Models 4 to 9, the measurement is considered the Spearman correlation of each performance measure used in this study. All specifications consider the complete set of variables described in Model 3: (i) scale factor, (ii) correlation factor, (iii) standard deviation of error, (iv) squared scale factor, (v) squared correlation factor, (vi) squared standard deviation of error, (vii) third power scale factor, (viii) third power correlation factor, and (ix) third power standard deviation of error.
Table 7 reports the same exercise for each of the performance measures; highly significant parameters are observed, as described in the previous table. No model was estimated for the Dowd ratio because it presents a correlation equal to 1 for all periods, so there is no variability in the dependent variable and an appropriate estimate cannot be made.
We can observe that all models, except for the Upside ratio, show a high $R^2$, so we can say that this strategy manages to capture a large proportion of the variability of the phenomenon, providing an appropriate tool to identify the degradation of Sharpe’s ratio in the presence of autocorrelation in the assets that can compose an investment portfolio.
The analysis of the nature of this function allows for defining criteria for the use of Sharpe’s ratio within this framework of analysis so that the evaluation of financial assets can actively be performed by investors in the context of their portfolios.
4. Discussion, Conclusions, Limitations, and Future Research
The results presented in this study confirm the hypothesis raised about the importance of autoregressive processes in the determination of the performances of financial assets and the care that must be taken when working with such processes. These results allow us to characterise the robustness of Sharpe’s ratio as a means for analysing the yields of these financial assets.
The robustness function described in this paper captures 80% of the variance in the degradation of the Spearman coefficient, allowing for the definition of monitoring and control criteria during the task of tracking the evolution of financial assets and adequately selecting a combination of risk and return.
The results presented confirm the hypothesis proposed in which Sharpe’s ratio allows decisions to be made in the selection of investment portfolios under normal conditions. All models presented in Table 7 show high significance in all parameters; on average, the degree of adjustment is 73% of the variance in the degradation of the Spearman coefficient in the presence of autocorrelation for each of the performance measures.
Among the main findings is the quantification of the serious bias that arises when an autocorrelated process is measured without correcting the mean or standard deviation of the data, which in principle allows us to intuit that working with series that are far from the assumptions of normality can lead to problems during calculations and subsequent investment decisions.
The effects of autocorrelation, variance, and scale are not contradictory but rather complementary, and they generalise the results presented by Eling and Schuhmacher  and Eling , showing in turn that if a financial series approaches a process of normality, it is indifferent to the evaluation method used, as mentioned by Zakamouline ; this provides a global view of the selection of an evaluation method for financial assets while focusing on the phenomenon of autocorrelation and incorporating a dimension of temporality into the assessment of financial assets.
Sharpe’s ratio is used to evaluate the performance of financial assets in different industries, considering the level of risk and return that the investor observes. This evaluation generates a ranking on which the investor bases the decision about his or her investment portfolio. Sharpe’s ratio allows decisions to be made about the selection of investment portfolios, and these decisions are similar to those obtained with other performance measures; however, in the presence of certain phenomena in financial markets, this similarity is broken. Therefore, it cannot be ensured that Sharpe’s ratio delivers the best information on financial risk, and a more in-depth analysis of the selection criteria of investment portfolios is needed in the presence of certain observable events in the financial series.
With respect to future work, we want to expand the analysis to other phenomena that are observed in financial series, such as autoregressive conditional heteroskedasticity (ARCH-GARCH models) and heavy tailed distribution analysis, among other commonly observed phenomena in this type of time series.
The data used to support the findings of this study are available from the corresponding author.
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Hanns de la Fuente-Mella was supported by a grant from Núcleo de Investigación en Data Analytics/VRIEA/PUCV/039.432/2020 from the Vice-Rectory for Research and Advanced Studies of Pontificia Universidad Católica de Valparaíso, Chile.
W. Bessler, G. Taushanov, and D. Wolff, “Optimal asset allocation strategies for international equity portfolios: a comparison of country versus industry optimization,” Journal of International Financial Markets, Institutions and Money, vol. 72, Article ID 101343, 2021.
N. Amenc, F. Goltz, V. Le Sourd, and L. Martellini, Edhec European Investment Practices Survey 2008, EDHEC-Risk Institute, Singapore, 2008.
H. Markowitz, Portfolio Selection, Yale University Press, London, England, 1959.
Y. Bao, “Estimation risk-adjusted sharpe ratio and fund performance ranking under a general return distribution,” Journal of Financial Econometrics, vol. 7, no. 2, pp. 152–173, 2009.
G. Zhang, “Pairs trading with general state space models,” Quantitative Finance, vol. 21, no. 9, pp. 1–21, 2021.
C. Keating and W. F. Shadwick, “A universal performance measure,” Journal of Performance Measurement, vol. 6, no. 3, pp. 59–84, 2002.
T. W. Young, “Calmar ratio: a smoother tool,” Futures, vol. 20, no. 1, p. 40, 1991.
L. Kestner, “Getting a handle on true performance,” Futures, vol. 21, no. 5, 1996.
P. Kaplan and J. Knowles, “Kappa: a generalized downside risk-adjusted performance measure,” Journal of Performance Measurement, vol. 8, pp. 42–54, 2004.
J. Okunev and D. White, “Hedge Fund Risk Factors and Value at Risk of Credit Trading Strategies,” University of New South Wales, Kensington, Australia, 2003.
G. Gallais-Hamonno and H. Nguyen-Thi-Thanh, “The Necessity to Correct Hedge Fund Returns: Empirical Evidence and Correction Method,” Université Libre de Bruxelles, Brussels, Belgium, 2007.