Abstract

The Sharpe ratio is the prominent risk-adjusted performance measure used by practitioners. Statistical testing of this ratio using its asymptotic distribution has lagged behind its use. In this paper, highly accurate likelihood analysis is applied for inference on the Sharpe ratio. Both the one- and two-sample problems are considered. The methodology has $O(n^{-3/2})$ distributional accuracy and can be implemented using any parametric return distribution structure. Simulations are provided to demonstrate the method's superior accuracy over existing methods used for testing in the literature.

1. Introduction

The measurement of fund performance is an integral part of investment analysis. Investments are often ranked and evaluated on the basis of their risk-adjusted returns. Several risk-adjusted performance measures are available to money managers, of which the Sharpe ratio is the most popular. Introduced by William Sharpe in 1966 [1], this ratio provides a measure of a fund’s excess returns relative to its volatility. Expressed in its usual form, the Sharpe ratio for an asset with expected return $\mu$ and standard deviation $\sigma$ is
$$SR = \frac{\mu - R_f}{\sigma},$$
where $R_f$ is the risk-free rate of return. From this expression, it is clear how this ratio provides a measure of a fund’s excess return per unit of risk.

The Sharpe ratio has been extensively studied in the literature. The main criticism leveled against this measure concerns its reliance on only the first two moments of the returns distribution. If investment returns are normally distributed then the Sharpe ratio can be justified. On the other hand, if returns are asymmetric then it can be argued that the measure may not accurately describe the fund’s performance as moments reflecting skewness and kurtosis are not captured by the ratio. To address this issue, several measures exist in the literature which integrate higher moments into the performance measure. The Omega measure is one such measure that uses all the available information in the returns distribution. Keating and Shadwick [2] provide an introduction to this measure. While various methods are available, they are also more complex and often very difficult to implement in practice. To gauge the trade-off between the attractiveness of such measures and their cost, Eling and Schuhmacher [3] compared the Sharpe ratio with 12 other approaches to performance measurement. Eling and Schuhmacher [3] focussed on the returns of 2,763 hedge funds. Hedge funds are known to have return distributions which differ significantly from the normal distribution and, as such, provide a rich and relevant environment for such a comparison. Their study concluded that the Sharpe ratio produced rankings that were largely identical to those obtained from the 12 other performance measures. In other words, the choice of performance measure did not matter much in terms of ranking funds. From a practical perspective, this imparts credibility to the use of the Sharpe ratio for ranking funds with highly nonnormal returns.

Our focus in this paper is on the statistical properties of the Sharpe ratio. While this ratio may be the most widely known and used measure of risk-adjusted performance for an investment fund, fund managers rarely (if ever) indicate any measure of associated statistical significance of their rankings. This point was made in Opdyke [4]. We use a likelihood-based statistical method to obtain highly accurate $p$-values and confidence intervals for evaluating the significance of the Sharpe ratio. While the methodology we use is applicable under any parametric distributional assumption, for expositional clarity we demonstrate the use of the method under the assumption of normal returns. We investigate inference for both the one-sample problem and the two-sample problem, and we compare our method to existing methods in the literature. Simulations indicate the superior performance of the proposed method.

The paper is organized as follows. Section 2 provides an overview of the asymptotic distribution of the Sharpe ratio under various assumptions. Section 3 lays out the framework for the likelihood-based approach we will use for testing. Section 4 applies this methodology for inference on the Sharpe ratio. Examples and simulations for this one-sample case are then presented in Section 5. The two-sample problem is given in Section 6, with example and simulations presented in Section 7. Section 8 concludes.

2. Distribution of the Sharpe Ratio

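Jobson and Korkie [5] and Lo [6] derived the asymptotic distribution of the Sharpe ratio assuming identically and independently normally distributed returns. Consider a fund with a return at time $t$ given by $R_t$, $t = 1, \ldots, T$. Further assume that the returns are identically and independently distributed (IID) as $N(\mu, \sigma^2)$. If we let the vector $\theta = (\mu, \sigma^2)'$, then as given in Lo [6] we have the following asymptotically normal distribution for $\hat{\theta}$:
$$\sqrt{T}\,(\hat{\theta} - \theta) \xrightarrow{d} N(0, V_\theta),$$
where
$$V_\theta = \begin{pmatrix} \sigma^2 & 0 \\ 0 & 2\sigma^4 \end{pmatrix}$$
and $\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2)'$ is the maximum likelihood estimator (MLE) with $\hat{\mu}$ and $\hat{\sigma}^2$ given by the following formulas:
$$\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T} R_t, \qquad \hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T} (R_t - \hat{\mu})^2.$$
Lo’s result is straightforward to prove. For large samples, by the central limit theorem, we have
$$\sqrt{T}\,(\hat{\theta} - \theta) \xrightarrow{d} N\big(0, I^{-1}(\theta)\big),$$
where
$$I(\theta) = \lim_{T \to \infty} -E\left[\frac{1}{T}\frac{\partial^2 l(\theta)}{\partial\theta\,\partial\theta'}\right] \quad \text{and} \quad l(\theta) = \sum_{t=1}^{T} \log f(R_t; \theta).$$
See, for instance, Cox and Hinkley [7] for this result. Given the normal density $f$ and $\theta = (\mu, \sigma^2)'$, we have
$$I^{-1}(\theta) = \begin{pmatrix} \sigma^2 & 0 \\ 0 & 2\sigma^4 \end{pmatrix} = V_\theta.$$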

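Given the Sharpe ratio is a function of $\theta$, the Delta method can be used to derive its asymptotic distribution. Let $g(\theta)$ be a continuous function with Jacobian denoted by $\partial g/\partial\theta'$ and let the Sharpe ratio estimate be given by $\widehat{SR} = g(\hat{\theta})$. Applying a first-order Taylor expansion on $g(\hat{\theta})$ around the true parameter value $\theta$, we have
$$g(\hat{\theta}) \approx g(\theta) + \frac{\partial g}{\partial\theta'}(\hat{\theta} - \theta).$$
Rearranging terms and multiplying by $\sqrt{T}$ yields
$$\sqrt{T}\,\big(g(\hat{\theta}) - g(\theta)\big) \approx \frac{\partial g}{\partial\theta'}\sqrt{T}\,(\hat{\theta} - \theta).$$
Using the above result and applying the Delta method, we have the asymptotic distribution of $g(\hat{\theta})$:
$$\sqrt{T}\,\big(g(\hat{\theta}) - g(\theta)\big) \xrightarrow{d} N\left(0, \frac{\partial g}{\partial\theta'} V_\theta \frac{\partial g}{\partial\theta}\right).$$
Note that the Jacobian term is readily computed as follows:
$$\frac{\partial g}{\partial\theta'} = \left(\frac{1}{\sigma},\; -\frac{\mu - R_f}{2\sigma^3}\right).$$
Combining terms yields the variance of the asymptotic distribution:
$$\frac{\partial g}{\partial\theta'} V_\theta \frac{\partial g}{\partial\theta} = \frac{1}{\sigma^2}\,\sigma^2 + \frac{(\mu - R_f)^2}{4\sigma^6}\,2\sigma^4 = 1 + \frac{1}{2}SR^2.$$
By the central limit theorem, the distribution of the Sharpe ratio assuming identically and independently distributed normal returns is therefore
$$\sqrt{T}\,(\widehat{SR} - SR) \xrightarrow{d} N\left(0,\; 1 + \tfrac{1}{2}SR^2\right). \tag{2.12}$$
From this distribution, confidence intervals for the Sharpe ratio can be constructed in the usual fashion:
$$\widehat{SR} \pm z_{\alpha/2}\sqrt{\frac{1 + \tfrac{1}{2}\widehat{SR}^2}{T}},$$
where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution.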
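As a minimal sketch of the interval just described, the following Python snippet computes the MLE-based Sharpe ratio and the normal-theory confidence interval with asymptotic variance $(1 + SR^2/2)/T$. The function name `sharpe_ratio_ci` and the sample returns are hypothetical and chosen for illustration only; they are not the data of Table 1.

```python
import math
from statistics import NormalDist


def sharpe_ratio_ci(returns, risk_free, level=0.95):
    """Return (estimate, lower, upper) for the Sharpe ratio under IID normal returns."""
    T = len(returns)
    mu_hat = sum(returns) / T
    # The MLE of the variance divides by T, not T - 1.
    var_hat = sum((r - mu_hat) ** 2 for r in returns) / T
    sr_hat = (mu_hat - risk_free) / math.sqrt(var_hat)
    # Delta-method standard error: sqrt((1 + SR^2 / 2) / T).
    se = math.sqrt((1.0 + 0.5 * sr_hat ** 2) / T)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return sr_hat, sr_hat - z * se, sr_hat + z * se


# Hypothetical monthly returns, for illustration only.
example_returns = [0.021, -0.013, 0.034, 0.008, -0.004, 0.017,
                   0.026, -0.019, 0.012, 0.005, 0.030, -0.007]
estimate, lower, upper = sharpe_ratio_ci(example_returns, risk_free=0.003)
print(f"SR = {estimate:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The interval is symmetric about the estimate by construction, which is one reason its tail accuracy can be poor in small samples.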

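The above derivation only holds under the IID normal assumption. Under the IID assumption but without assuming normally distributed returns, Mertens [8] derived the following asymptotic distribution of the Sharpe ratio:
$$\sqrt{T}\,(\widehat{SR} - SR) \xrightarrow{d} N\left(0,\; 1 + \tfrac{1}{2}SR^2 - SR\,\gamma_3 + SR^2\,\frac{\gamma_4 - 3}{4}\right), \tag{2.14}$$
where $\gamma_3$ and $\gamma_4$ are the skewness and kurtosis of the returns, defined as follows:
$$\gamma_3 = \frac{E(R_t - \mu)^3}{\sigma^3}, \qquad \gamma_4 = \frac{E(R_t - \mu)^4}{\sigma^4}.$$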
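The skewness and kurtosis adjustment can be sketched in code as follows, assuming the commonly quoted form of the Mertens variance, $1 + SR^2/2 - SR\,\gamma_3 + SR^2(\gamma_4 - 3)/4$, with $\gamma_3$ and $\gamma_4$ replaced by sample moments. The function name `mertens_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def mertens_ci(returns, risk_free, level=0.95):
    """Return (estimate, lower, upper) using a skewness/kurtosis-adjusted variance."""
    T = len(returns)
    mu_hat = sum(returns) / T
    # Central sample moments (divisor T, matching the MLE convention).
    m2 = sum((r - mu_hat) ** 2 for r in returns) / T
    m3 = sum((r - mu_hat) ** 3 for r in returns) / T
    m4 = sum((r - mu_hat) ** 4 for r in returns) / T
    sigma_hat = math.sqrt(m2)
    sr_hat = (mu_hat - risk_free) / sigma_hat
    skew = m3 / sigma_hat ** 3
    kurt = m4 / m2 ** 2
    # Assumed form of the asymptotic variance of sqrt(T) * (SR_hat - SR).
    avar = 1.0 + 0.5 * sr_hat ** 2 - sr_hat * skew + 0.25 * sr_hat ** 2 * (kurt - 3.0)
    se = math.sqrt(max(avar, 0.0) / T)  # guard against rounding below zero
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return sr_hat, sr_hat - z * se, sr_hat + z * se
```

When the sample skewness is 0 and the sample kurtosis is 3, the adjusted variance reduces to the normal-theory value $1 + SR^2/2$.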

Christie [9] derived the asymptotic distribution of the Sharpe ratio under the more relaxed assumption of stationarity and ergodicity. Opdyke [4] showed that the derivation provided by Christie [9] under the non-IID returns condition is in fact identical to the one provided by Mertens [8]. For a complete discussion and proof see Opdyke [4]. For our purposes, we will use asymptotic distributions (2.12) and (2.14) as comparisons for our proposed likelihood-based approach. In the next section we review the likelihood-based approach to inference, and in the section after that we apply the methodology to the Sharpe ratio.

3. Likelihood-Based Approach to Inference

In this section we review standard first-order likelihood methods and then turn our attention to higher-order methods. For this section, the underlying model under consideration is as follows: $Y = (Y_1, \ldots, Y_n)'$ is a vector of IID random variables with density given by $f(y; \theta)$, where $\theta$ is a $p$-dimensional vector of parameters. Statistical inference concerns drawing conclusions about $\theta$ or a function of $\theta$, namely, $\psi(\theta)$, given an observed sample $y = (y_1, \ldots, y_n)'$, and in this paper we assume that the interest parameter is $\psi = \psi(\theta)$. We further assume that this interest parameter is scalar. Let $\lambda$ be a $(p-1)$-dimensional nuisance parameter vector. The log-likelihood function of $\theta$ is
$$l(\theta) = l(\theta; y) = \sum_{i=1}^{n} \log f(y_i; \theta). \tag{3.1}$$

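Likelihood analysis typically involves the maximum likelihood estimator (MLE) $\hat{\theta} = (\hat{\psi}, \hat{\lambda}')' = \arg\max_{\theta} l(\theta)$ and the constrained maximum likelihood estimator $\hat{\theta}_\psi = (\psi, \hat{\lambda}_\psi')' = \arg\max_{\lambda} l(\theta)$, where the maximum is taken for fixed values of $\psi$. The constrained MLE can be solved by maximizing $l(\theta)$ subject to $\psi(\theta) = \psi$. This is often done using the Lagrange multiplier method where the Lagrangean is given by the following:
$$H(\theta, \alpha) = l(\theta) + \alpha\,[\psi(\theta) - \psi]. \tag{3.2}$$
The Lagrange multiplier is given by $\hat{\alpha}$. The tilted log-likelihood is defined as follows:
$$\tilde{l}(\theta) = l(\theta) + \hat{\alpha}\,[\psi(\theta) - \psi]. \tag{3.3}$$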

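Using the above information, standard first-order departure measures can be obtained. These measures include the Wald departure (the standardized maximum likelihood estimator) and the signed square root log-likelihood ratio statistic:
$$q = \frac{\hat{\psi} - \psi}{\sqrt{\widehat{\operatorname{var}}(\hat{\psi})}}, \tag{3.4}$$
$$r = \operatorname{sgn}(\hat{\psi} - \psi)\,\big\{2\,[\,l(\hat{\theta}) - l(\hat{\theta}_\psi)\,]\big\}^{1/2}. \tag{3.5}$$
Note that $\widehat{\operatorname{var}}(\hat{\psi})$ can be estimated by using the Delta method,
$$\widehat{\operatorname{var}}(\hat{\psi}) \approx \psi_\theta(\hat{\theta})\,\widehat{\operatorname{var}}(\hat{\theta})\,\psi_\theta'(\hat{\theta}), \qquad \psi_\theta(\theta) = \frac{\partial\psi(\theta)}{\partial\theta'},$$
where the variance of $\hat{\theta}$, $\widehat{\operatorname{var}}(\hat{\theta})$, can be estimated by using the inverse of either the Fisher expected information matrix or the observed information matrix evaluated at the MLE,
$$j_{\theta\theta'}(\hat{\theta}) = -\left.\frac{\partial^2 l(\theta)}{\partial\theta\,\partial\theta'}\right|_{\theta = \hat{\theta}}.$$
The latter is preferred, however, because of its simplicity in calculation. The $p$-value functions for these methods are $p(\psi) = \Phi(q)$ and $p(\psi) = \Phi(r)$, where $\Phi(\cdot)$ represents the standard normal cumulative distribution function. These $p$-value functions have order of convergence $O(n^{-1/2})$ and are correspondingly referred to as first-order methods. Note that (3.5) is invariant to reparameterization whereas (3.4) is not. In practice, however, (3.4) is preferred because of its simplicity of use. Doganaksoy and Schmee [10] illustrate that (3.5) has better coverage than (3.4) in the cases they examined. A $(1 - \alpha)100\%$ confidence interval for $\psi$ is then given by the following:
$$\Big(\min\big\{p^{-1}(\alpha/2),\, p^{-1}(1 - \alpha/2)\big\},\; \max\big\{p^{-1}(\alpha/2),\, p^{-1}(1 - \alpha/2)\big\}\Big). \tag{3.9}$$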

Approximations aimed at improving the accuracy of the first-order methods have been worked on during the past three decades. Higher-order asymptotic approximations for the distribution of an estimate can be constructed using expansion methods of which Edgeworth expansions and saddlepoint approximations are most common. Edgeworth approximations tend to produce good approximations near the center of a distribution, with relatively poorer tail approximations, while saddlepoint approximations produce remarkably accurate tail approximations. Given we are interested in tail probabilities, our focus will thus be on approximation methods that use the saddlepoint method. We will not review the saddlepoint approximation literature here but the interested reader is directed to Daniels [11, 12] and Barndorff-Nielsen and Cox [13].

For a canonical exponential family model with canonical parameter vector $\theta$ and density
$$f(y; \theta) = \exp\{\theta' t(y) - c(\theta) + h(t(y))\},$$
where $t(y)$ is a minimal sufficient statistic for $\theta$, the saddlepoint-based approximation for the density of $\hat{\theta}$ is given as follows:
$$f(\hat{\theta}; \theta) \approx k\,\big|j_{\theta\theta'}(\hat{\theta})\big|^{1/2} \exp\{l(\theta) - l(\hat{\theta})\},$$
where $k$ is a normalizing constant. This extremely important approximation to the density of the maximum likelihood estimator is referred to as Barndorff-Nielsen’s formula and can be found in Barndorff-Nielsen [14] (this approximation is valid for more general models but requires the existence of an ancillary statistic). Thus the marginal density for $\hat{\psi}$ can be obtained. In general, however, the closed form of the marginal density is not available.

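Fraser et al. [15] showed that the result by Lugannani and Rice [16] can be applied to approximate the $p$-value function and it takes the form:
$$p(\psi) = \Phi(r) + \phi(r)\left\{\frac{1}{r} - \frac{1}{Q}\right\}, \tag{3.12}$$
where $\phi(\cdot)$ is the standard normal density function. The statistic $Q$ is given by the following:
$$Q = (\hat{\psi} - \psi)\left\{\frac{\big|j_{\theta\theta'}(\hat{\theta})\big|}{\big|j_{\lambda\lambda'}(\hat{\theta}_\psi)\big|}\right\}^{1/2},$$
where $j_{\theta\theta'}(\hat{\theta})$ is the observed information matrix and $j_{\lambda\lambda'}(\hat{\theta}_\psi)$ is the observed nuisance information matrix defined as follows:
$$j_{\theta\theta'}(\hat{\theta}) = -\left.\frac{\partial^2 l(\theta)}{\partial\theta\,\partial\theta'}\right|_{\hat{\theta}}, \tag{3.14}$$
$$j_{\lambda\lambda'}(\hat{\theta}_\psi) = -\left.\frac{\partial^2 l(\theta)}{\partial\lambda\,\partial\lambda'}\right|_{\hat{\theta}_\psi}.$$
The term $|j_{\lambda\lambda'}(\hat{\theta}_\psi)| / |j_{\theta\theta'}(\hat{\theta})|$ can be viewed as an estimate of the variance of $\hat{\psi}$, where the elimination of the nuisance parameter has been taken into consideration. Note that $Q$ is a standardized maximum likelihood departure in the canonical parameter scale. The statistic $r$ is the signed square root log-likelihood ratio given by (3.5). An alternate approximation to (3.12) is given by Barndorff-Nielsen [17]:
$$p(\psi) = \Phi(r^*) = \Phi\left(r - \frac{1}{r}\log\frac{r}{Q}\right). \tag{3.16}$$
These approximations are asymptotically equivalent (see [18]) and have order of convergence $O(n^{-3/2})$. They are correspondingly referred to as third-order methods. Hence a confidence interval for $\psi$ can be obtained from (3.9).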
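Once $r$ and $Q$ are available, combining them is mechanical. The sketch below assumes the usual forms of the two approximations, $\Phi(r) + \phi(r)(1/r - 1/Q)$ for Lugannani and Rice and $\Phi(r - r^{-1}\log(r/Q))$ for Barndorff-Nielsen. The function names are hypothetical, and computing $Q$ itself is model specific and not shown.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _check(r, q):
    # Both formulas require r and Q to be nonzero and of the same sign.
    if r * q <= 0.0:
        raise ValueError("r and Q must be nonzero and share the same sign")


def lugannani_rice_p(r, q):
    """Assumed Lugannani-Rice form: Phi(r) + phi(r) * (1/r - 1/Q)."""
    _check(r, q)
    return _STD_NORMAL.cdf(r) + _STD_NORMAL.pdf(r) * (1.0 / r - 1.0 / q)


def barndorff_nielsen_p(r, q):
    """Assumed Barndorff-Nielsen form: Phi(r*) with r* = r - log(r/Q) / r."""
    _check(r, q)
    r_star = r - math.log(r / q) / r
    return _STD_NORMAL.cdf(r_star)


# Hypothetical departure values, for illustration only.
print(lugannani_rice_p(1.5, 1.3), barndorff_nielsen_p(1.5, 1.3))
```

Both expressions are numerically unstable when $r$ is very close to zero, that is, when the hypothesized value is near the MLE, so implementations often interpolate across that point.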

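Consider now a general full rank exponential family model with parameter vector $\theta$ and density given by the following:
$$f(y; \theta) = \exp\{\varphi'(\theta)\,t(y) - c(\theta) + h(t(y))\},$$
where $\varphi(\theta)$ is the canonical parameter and $t(y)$ is the canonical variable. In order to use either the Lugannani and Rice approximation given in (3.12) or the Barndorff-Nielsen approximation given in (3.16) for inference on our interest parameter $\psi(\theta)$, we need to calculate both $r$ and $Q$ in the $\varphi(\theta)$ scale. Note that $r$ calculated in the original $\theta$ scale is equivalent to $r$ calculated in the $\varphi(\theta)$ scale. Moreover, Fraser and Reid [19] derive a $Q$ in the $\varphi(\theta)$ scale. Their methodology involves replacing the parameter of interest $\psi(\theta)$ by a linear function of the $\varphi(\theta)$ coordinates (for a detailed discussion and derivation see Fraser and Reid [19]). Writing $\psi_\theta(\theta) = \partial\psi(\theta)/\partial\theta'$ and $\varphi_\theta(\theta) = \partial\varphi(\theta)/\partial\theta'$, this newly calibrated parameter is given by the following:
$$\chi(\theta) = \psi_\theta(\hat{\theta}_\psi)\,\varphi_\theta^{-1}(\hat{\theta}_\psi)\,\varphi(\theta). \tag{3.18}$$
Then $|\chi(\hat{\theta}) - \chi(\hat{\theta}_\psi)|$ measures the departure of $|\hat{\psi} - \psi|$ in the $\varphi(\theta)$ scale. The function $Q$ is then constructed as follows:
$$Q = \operatorname{sgn}(\hat{\psi} - \psi)\,\frac{\big|\chi(\hat{\theta}) - \chi(\hat{\theta}_\psi)\big|}{\sqrt{\widehat{\operatorname{var}}\big(\chi(\hat{\theta}) - \chi(\hat{\theta}_\psi)\big)}}. \tag{3.19}$$
Fraser and Reid [19] show that
$$\widehat{\operatorname{var}}\big(\chi(\hat{\theta}) - \chi(\hat{\theta}_\psi)\big) \approx \frac{\psi_\theta(\hat{\theta}_\psi)\,\tilde{j}_{\theta\theta'}^{-1}(\hat{\theta}_\psi)\,\psi_\theta'(\hat{\theta}_\psi)\,\big|\tilde{j}_{\theta\theta'}(\hat{\theta}_\psi)\big|\,\big|\varphi_\theta(\hat{\theta}_\psi)\big|^{-2}}{\big|j_{\theta\theta'}(\hat{\theta})\big|\,\big|\varphi_\theta(\hat{\theta})\big|^{-2}},$$
where $j_{\theta\theta'}(\hat{\theta})$ is defined in (3.14) and $\tilde{j}_{\theta\theta'}(\hat{\theta}_\psi)$ is defined as follows:
$$\tilde{j}_{\theta\theta'}(\hat{\theta}_\psi) = -\left.\frac{\partial^2 \tilde{l}(\theta)}{\partial\theta\,\partial\theta'}\right|_{\hat{\theta}_\psi},$$
where $\tilde{l}(\theta)$ is the tilted log-likelihood function defined in (3.3). The function $Q$ in (3.19) along with $r$ is used in the Lugannani and Rice approximation in (3.12) or the Barndorff-Nielsen expression in (3.16), which implicitly yields a new $p(\psi)$.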

In the next section we apply the above methodology and use Barndorff-Nielsen’s approximation to obtain highly accurate $p$-values to test particular hypothesized values of the Sharpe ratio. We further note that for all our examples and simulations in this paper, the Lugannani and Rice expression given in (3.12) was also used, but we do not report results from this method as they are virtually identical to those produced by the Barndorff-Nielsen expression.

4. Methodology for the Sharpe Ratio

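We emphasize that while the third-order methodology discussed in the previous section is applicable under any parametric distributional assumptions, for expositional clarity we demonstrate the use of the method under the assumption of IID normal returns (see Fraser et al. [20] for the general model setup). Consider a fund with return at time $t$ given by $R_t$, $t = 1, \ldots, T$, and let $R_f$ be the mean return for the risk-free asset. We assume that $R_t \sim$ IID $N(\mu, \sigma^2)$. The interest parameter is the Sharpe ratio: $\psi(\theta) = (\mu - R_f)/\sigma$ and $\theta = (\mu, \sigma^2)'$. The log-likelihood (up to an additive constant) and related first and second derivatives, denoted by $l_\theta(\theta)$ and $l_{\theta\theta'}(\theta)$, respectively, are given as follows:
$$l(\theta) = -\frac{T}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{T}(R_t - \mu)^2,$$
$$l_\mu(\theta) = \frac{1}{\sigma^2}\sum_{t=1}^{T}(R_t - \mu), \qquad l_{\sigma^2}(\theta) = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{t=1}^{T}(R_t - \mu)^2,$$
$$l_{\mu\mu}(\theta) = -\frac{T}{\sigma^2}, \qquad l_{\mu\sigma^2}(\theta) = -\frac{1}{\sigma^4}\sum_{t=1}^{T}(R_t - \mu), \qquad l_{\sigma^2\sigma^2}(\theta) = \frac{T}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{t=1}^{T}(R_t - \mu)^2.$$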

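To obtain the maximum likelihood estimators, the first-order conditions are solved, and we have
$$\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T} R_t, \qquad \hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}(R_t - \hat{\mu})^2.$$
The estimated Sharpe ratio is then
$$\hat{\psi} = \psi(\hat{\theta}) = \frac{\hat{\mu} - R_f}{\hat{\sigma}}.$$
The observed information matrix is given as $j_{\theta\theta'}(\theta) = -l_{\theta\theta'}(\theta)$, which can be expressed as follows:
$$j_{\theta\theta'}(\theta) = \begin{pmatrix} \dfrac{T}{\sigma^2} & \dfrac{1}{\sigma^4}\displaystyle\sum_{t=1}^{T}(R_t - \mu) \\[2ex] \dfrac{1}{\sigma^4}\displaystyle\sum_{t=1}^{T}(R_t - \mu) & -\dfrac{T}{2\sigma^4} + \dfrac{1}{\sigma^6}\displaystyle\sum_{t=1}^{T}(R_t - \mu)^2 \end{pmatrix}.$$
The observed information matrix is then evaluated at the MLE:
$$j_{\theta\theta'}(\hat{\theta}) = \begin{pmatrix} \dfrac{T}{\hat{\sigma}^2} & 0 \\[1ex] 0 & \dfrac{T}{2\hat{\sigma}^4} \end{pmatrix},$$
with determinant equal to
$$\big|j_{\theta\theta'}(\hat{\theta})\big| = \frac{T^2}{2\hat{\sigma}^6}.$$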

To solve for the constrained MLE, the Lagrange multiplier method from (3.2) is used with first derivatives equal to Setting these first derivatives equal to zero, we have: Working on (4.13) we have which further simplifies to If we let and , then we have with

The constrained MLEs are therefore given by: so that . Using these constrained maximum likelihood estimators we are able to obtain the tilted log-likelihood as defined in (3.3), and the corresponding first and second derivatives are: We can use these second derivatives to calculate as For the normal model, the canonical parameter is given by: Two related matrices derived from the canonical parameter are given below With our parameter of interest we have Quantities (4.21), (4.23), and (4.25) can be used to calculate (3.18). The quantity can further be calculated. We now have all the ingredients to calculate given in (3.16).
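For readers who want to experiment, the constrained maximization under normal returns can be sketched by direct substitution rather than through the Lagrangean. This is an illustrative alternative route, not necessarily the parameterization used above. Setting $\mu = R_f + \psi\sigma$ in the normal log-likelihood and differentiating with respect to $\sigma$ gives the quadratic $T\sigma^2 + \psi b\,\sigma - a = 0$, where $a = \sum (R_t - R_f)^2$ and $b = \sum (R_t - R_f)$ are hypothetical shorthand, and the positive root is the constrained estimate of $\sigma$. The snippet then evaluates the first-order statistic $r$ and the corresponding $p$-value $\Phi(r)$.

```python
import math
from statistics import NormalDist


def normal_loglik(returns, mu, sigma):
    """Normal log-likelihood with the additive constant dropped."""
    T = len(returns)
    return -T * math.log(sigma) - sum((x - mu) ** 2 for x in returns) / (2.0 * sigma ** 2)


def constrained_mle(returns, risk_free, psi):
    """Maximize the likelihood subject to (mu - risk_free) / sigma = psi."""
    T = len(returns)
    excess = [x - risk_free for x in returns]
    a = sum(e * e for e in excess)
    b = sum(excess)
    # Positive root of T * sigma^2 + psi * b * sigma - a = 0.
    sigma = (-psi * b + math.sqrt(psi ** 2 * b ** 2 + 4.0 * T * a)) / (2.0 * T)
    return risk_free + psi * sigma, sigma


def signed_root_lr(returns, risk_free, psi):
    """Signed square root of the log-likelihood ratio statistic for psi."""
    T = len(returns)
    mu_hat = sum(returns) / T
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in returns) / T)
    psi_hat = (mu_hat - risk_free) / sigma_hat
    mu_c, sigma_c = constrained_mle(returns, risk_free, psi)
    gap = normal_loglik(returns, mu_hat, sigma_hat) - normal_loglik(returns, mu_c, sigma_c)
    return math.copysign(math.sqrt(max(2.0 * gap, 0.0)), psi_hat - psi)


def first_order_p_value(returns, risk_free, psi):
    """First-order p-value function Phi(r)."""
    return NormalDist().cdf(signed_root_lr(returns, risk_free, psi))
```

A quick sanity check is that the constrained estimates coincide with the unconstrained MLE when $\psi$ is set to $\hat{\psi}$, so that $r = 0$ there.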

5. Examples and Simulations

In this section, we provide examples and simulations for inference on the Sharpe ratio. We compute confidence intervals and $p$-values using our proposed third-order method given in (3.16). We label this method “proposed” in the tables below. We compare our results to those obtained by using the existing methods, in particular, using both the Jobson and Korkie [5] asymptotic distribution in (2.12) and the Mertens [8] distribution given in (2.14). These methods are labeled as “Jobson and Korkie” and “Mertens,” respectively. Results from the signed square root log-likelihood ratio statistic in (3.5) are additionally provided and labeled “likelihood ratio.”

5.1. Examples

The dataset for our examples consists of monthly return prices for three time series. The first series represents return prices for a large-cap mutual fund (Fund), the second for a market index (Market), and the third for 90-day Treasury bills (Cash). This data spans a period of one year. The data is listed in Table 1 and originates from the Matlab Financial Toolbox User’s Guide (data in the User's Guide is provided for a 5-year period. We focus on the most recent year of data. See Section 4: Investment Performance Metrics of the manual for further information).

Table 2 reports 95% confidence intervals for the Sharpe ratio separately for the large-cap mutual fund and the market index for the four methods discussed previously. Monthly returns are calculated from the return prices listed in Table 1 and are used to calculate the Sharpe ratio. The mean of the 90-day Treasury returns is used as the risk-free rate. Table 2 shows that the confidence intervals obtained from the four methods produce rather different results. Theoretically, the proposed method has third-order accuracy whereas the remaining three methods do not. Although the confidence intervals produced by the Jobson and Korkie [5] approximation have the best concordance with those produced by the proposed method, the difference is still noticeable. This result will be borne out in the simulations as well.

The $p$-value functions calculated from the methods discussed in this paper for the Sharpe ratio for the large-cap mutual fund and the Sharpe ratio for the market index are plotted in Figures 1 and 2, respectively. These significance functions can be used to obtain $p$-values for specific hypothesized values of the Sharpe ratio. As we are typically interested in tail probabilities, which tend to be small, it is important to estimate such probabilities with precision. A few hypothesized values with their corresponding $p$-values are provided in Tables 3 and 4 for the mutual fund and market index, respectively. From these tables we can see that the $p$-values vary across the methods. If, for instance, interest is in testing the hypothesized value listed in the last column of Tables 3 and 4 for either fund, then the corresponding $p$-values for such a test are given in that column. Focussing on the market index and using a 5% level of significance, the Sharpe ratio may or may not be statistically significant depending on the method chosen for the hypothesis test.

5.2. Simulations

Two simulation studies of size 10,000 were performed to compare the three existing methods to the proposed third-order method. The first simulation was constructed to mimic the mutual fund return data from the above example and the second to mimic the market index returns: , and . For each sample, 90%, 95%, and 99% confidence intervals for the Sharpe ratio were obtained, and the lower probability error, the upper probability error, and the central coverage were recorded. Lower (upper) probability error refers to the proportion of times the true parameter value falls below (above) the confidence interval. Central coverage is defined as the proportion of times the true parameter value falls within the confidence interval. Tables 5 and 6 report the results from these simulations for the fund and market returns, respectively. For reference, the nominal values and the corresponding standard errors have been included for each of the lower probability error, the upper probability error, and the central coverage.
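The bookkeeping behind such a coverage study can be sketched as follows for the normal-theory interval with variance $(1 + SR^2/2)/T$. The parameter values, sample size, number of replications, and function names below are hypothetical and chosen only for illustration; they are not the values used for Tables 5 and 6.

```python
import math
import random
from statistics import NormalDist


def normal_theory_interval(returns, risk_free, level):
    """Confidence interval based on the asymptotic variance (1 + SR^2 / 2) / T."""
    T = len(returns)
    mu_hat = sum(returns) / T
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in returns) / T)
    sr_hat = (mu_hat - risk_free) / sigma_hat
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt((1.0 + 0.5 * sr_hat ** 2) / T)
    return sr_hat - half_width, sr_hat + half_width


def simulate_coverage(mu, sigma, risk_free, T, n_rep, level=0.95, seed=0):
    """Return (lower error, upper error, central coverage) over n_rep samples."""
    rng = random.Random(seed)
    true_sr = (mu - risk_free) / sigma
    below = above = 0
    for _ in range(n_rep):
        sample = [rng.gauss(mu, sigma) for _ in range(T)]
        lower, upper = normal_theory_interval(sample, risk_free, level)
        if true_sr < lower:
            below += 1  # true value falls below the interval
        elif true_sr > upper:
            above += 1  # true value falls above the interval
    lower_error = below / n_rep
    upper_error = above / n_rep
    return lower_error, upper_error, 1.0 - lower_error - upper_error


# Hypothetical settings, for illustration only.
print(simulate_coverage(mu=0.01, sigma=0.04, risk_free=0.003, T=12, n_rep=2000))
```

With $N$ replications, the standard error of a simulated proportion with nominal value $p$ is $\sqrt{p(1 - p)/N}$, which is how standard errors of this kind are typically computed.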

From these results tables it is clear that the third-order method outperforms the other three methods based on the criteria we examined. It must be noted however, that the Jobson and Korkie [5] method provides surprisingly good results. We note that other values for the parameters of the simulation were chosen (but not reported) and the results were consistent with the reported results. Standard errors for the simulations can easily be computed. This information has been included in the tables and labeled “standard error.” From these standard errors, it can be seen that the proposed method produces results that are uniformly within three standard deviations of the nominal value. The other three methods produce less satisfactory results.

6. The Two-Sample Case

Suppose one is interested in testing hypotheses concerning the Sharpe ratios of two funds and . For instance, one may be interested in testing the null hypothesis or against the alternative hypothesis or , respectively. In this section we apply the third-order methodology described in Section 3 to test the difference between the Sharpe ratios of two funds. Consider two funds with returns given by and . Further assume for expositional clarity that these returns are identically and independently distributed as and , respectively. The mean return for the risk-free asset is given by . Our interest is in testing the difference in Sharpe ratios of the two funds. The interest parameter is then defined as follows: The log-likelihood function for this problem is written as follows: The corresponding first derivatives are These first derivatives are set equal to zero and simultaneously solved for the overall MLE: The second derivatives are required to obtain the observed information matrix: Evaluating this observed information matrix at the MLE produces with corresponding determinant denoted by .

To solve for the constrained MLE, the Lagrange multiplier method is used with the following first derivatives:

Setting these derivatives equal to zero and solving the resulting system produces the constrained MLE. The constrained MLE can be obtained by solving the following iteratively: and Lagrangian So we have . We are now able to define the tilted log-likelihood: with first derivatives: The second derivatives are needed to calculate :

The canonical parameter for this model is given by the following: The matrices associated with this vector can now be calculated: Given our parameter of interest we have Quantities , and can be used to calculate the new parameter given in (3.18). The additional ingredients necessary for the construction of the standardized maximum likelihood departure given in (3.19) are readily available from the above information.

7. Example and Simulations

In this section we provide an example and a simulation study for the two-sample Sharpe ratio case. We compare the third-order likelihood-based inference method to the classical methods used for testing, namely, the maximum likelihood statistic and the likelihood ratio statistic. These latter two statistics are analogous to expressions (3.4) and (3.5).

7.1. Example

For our example, we use the data presented in Table 1 to compare Sharpe ratios. We may, for instance, be interested in whether the mutual fund’s risk-adjusted return as captured by the Sharpe ratio is significantly better than the market’s return. In Table 7 we present the 95% confidence interval for the difference between the Sharpe ratios for the mutual fund and market index. We can see that while the intervals produced using the MLE and likelihood ratio methods are similar, the interval obtained from the third-order method differs from these methods. As this is an example, we cannot comment on which interval is more accurate, but it is relevant to note the differences between the intervals, which may be important in real world settings. The $p$-values for testing a null hypothesis of a zero difference between the Sharpe ratios of the mutual fund and market index are 0.3675, 0.3675, and 0.3777 for the MLE, likelihood ratio, and proposed methods, respectively. As tail probabilities tend to be small, it is important to approximate them as accurately as we can.

7.2. Simulations

In this section we provide a simulation study to assess the performance of the third-order method relative to the MLE and likelihood ratio methods. The size of each simulation is 10,000 and again the parameter values were chosen to mimic the example data: , , and . Confidence intervals for the difference between the Sharpe ratios for funds and were obtained for various sample sizes of each fund. Table 8 records the results from this simulation. Lower and upper probability errors are recorded, as is central coverage. The nominal values and standard errors of the simulation are also reported for reference. As in the one-sample case, these simulation results generally indicate that the proposed method outperforms the other methods based on the criteria we examined.

8. Conclusion

A higher-order likelihood method was applied for inference on the Sharpe ratio. The two-sample problem for comparing Sharpe ratios was also considered. This statistical method is known to be extremely accurate and has $O(n^{-3/2})$ distributional accuracy. The methodology was demonstrated and worked out explicitly for independently and identically normally distributed returns. Simulations were provided to show the exceptional accuracy of the method even for very small sample sizes. While our assumption of independently and identically normally distributed returns may seem restrictive, we stress that the methodology can be applied more generally for any parametric return distribution structure and our choice of normality was made to clearly illustrate the methodology’s merits to the applied practitioner. A natural next step would be to consider inference when returns are not independently and identically distributed. For the two-sample problem, it would be fruitful to compare the proposed methodology with the approach taken by Ledoit and Wolf [21], which involves the studentized bootstrap.

Acknowledgment

It is noted that part of this paper originates from a chapter of Liu's doctoral dissertation.