On Investment Pessimism

Ruanmin Cao^a, Zhenya Liu^b, D.G. Dickinson^c

December 24, 2013
Abstract. We discuss Choquet utility maximization and the equivalence of the pessimistic portfolio and optimization using conditional value-at-risk. An example is presented to show its effectiveness in international equity allocation. Unstable tail risk leads to significant deterioration of out-of-sample robustness; this can be mitigated by a bootstrap method.
1 Introduction
Since von Neumann and Morgenstern (1944) and Savage (1954) axiomatized expected utility under uncertainty, capital allocation among a set of risky assets has become a classical decision-making problem. Pioneered by Markowitz (1952, 1959), the mean-variance (MV) framework has dominated as the standard model in most textbooks, partially due to its simplicity. Just as with the controversies over expected utility, MV has never received irrefutable acceptance. It is mainly criticized for two disadvantages: unrealistic conditions on preferences or return distributions (Quiggin, 1993), and the risk of errors in modeling the expected return and covariance matrix (Bawa et al., 1979; Michaud, 1989). Markowitz (2012) opposed the assertion that either a quadratic utility function or a Gaussian distribution is required; these conditions are sufficient rather than necessary. There are also numerous efforts in the literature dealing with the second issue. Estimating the expected mean has been constructively discussed by Jobson and Korkie (1980) and Michaud (1989), and the stability of correlations has attracted concern as well (e.g., Engle, 2002; Bouchaud and Potters, 2009; Laloux et al., 2000). Black and Litterman (1992) introduced a Bayesian method and belief updating into the optimization procedure, and a continuous-time mean-variance approach has been developed recently (Lindberg, 2009). Parallel to expected utility, Schmeidler (1986, 1989) and Quiggin (1981) introduced the Choquet utility function to incorporate subjectivity. Artzner et al. (1999) laid the axiomatic foundation for coherent risk measures, and several papers contributed to the development of conservative risk measures (Rockafellar and Uryasev, 2000; Jaschke and Kuchler, 2001). Combining rank-dependent utility with coherent risk measures, Koenker and Bassett (2005) proposed the pessimistic portfolio by transforming conditional VaR portfolio optimization into quantile regression.
This section focuses on empirically comparing the pessimistic portfolio with the MV approach using global equity index data, aiming to test the practicality and efficiency of the methods. The following sections are organized as: the theoretical model is developed in the model analysis and efficient frontier sections; experiment details are described in the data preprocessing and computational sections; performance is discussed via the QQ plot, degeneracy, and robustness sections; and the bootstrap, which potentially boosts out-of-sample effectiveness, and the conclusion close the paper.

a Muirhead Tower, University of Birmingham, Edgbaston, Birmingham, UK, B15 2TT. Email: [email protected]
b J.G. Smith, University of Birmingham, Edgbaston, Birmingham, UK, B15 2TT. Email: [email protected]
c J.G. Smith, University of Birmingham, Edgbaston, Birmingham, UK, B15 2TT. Email: [email protected]
2 Quadratic Utility Function and Coherent Risk Measure
Consider a set of lotteries X = [X_1, ..., X_n]' and a weight vector w = [w_1, ..., w_n]'. Their combination can be expressed as X_p = w'X. Denoting the probability distribution of the random variable X by F(X), the expected utility of the compounded lottery is

E_F u(X_p) = \int_{-\infty}^{\infty} u(X_p)\, dF(X)    (2.1)
Using a Taylor expansion to approximate utility around the expectation (Markowitz, 2012):

u(X_p) \approx u(w'EX) + \frac{\partial u(w'EX)}{\partial X}\, w'(X - EX) - \frac{1}{2}\frac{\partial^2 u(w'EX)}{(\partial X)^2}\, w'(X - EX)(X - EX)'w    (2.2)
We assume the utility function is twice differentiable. The expectation of the lotteries and the covariance matrix are defined, as usual, by EX = \int_{-\infty}^{\infty} X\, dF(X) and \Sigma = E(X - EX)(X - EX)'. Substituting (2.2) for utility, we obtain

E_F u(X_p) = u(w'EX) - \frac{1}{2}\frac{\partial^2 u(w'EX)}{(\partial X)^2}\, w'\Sigma w    (2.3)

where \Sigma is the covariance matrix on the probability space of F(X). Equation (2.3) transforms the maximization of expected utility into a quadratic programming problem. Denoting the risk aversion coefficient \delta = \frac{1}{2}\frac{\partial^2 u(w'EX)}{(\partial X)^2}, classical mean-variance analysis becomes

\min_w\; \delta\, w'\Sigma w    (2.4)
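The quadratic program (2.4), with the linear constraints w'EX = μ0 and w'1 = 1, has a closed-form solution via Lagrange multipliers. The sketch below illustrates it; the three-asset numbers are purely illustrative, not taken from the paper's data.

```python
import numpy as np

def min_variance_weights(mu, Sigma, mu0):
    """Solve min_w w' Sigma w  s.t.  w'mu = mu0, w'1 = 1 via Lagrange multipliers."""
    ones = np.ones_like(mu)
    Si_mu = np.linalg.solve(Sigma, mu)     # Sigma^{-1} mu
    Si_one = np.linalg.solve(Sigma, ones)  # Sigma^{-1} 1
    # stationarity gives w = lam * Si_mu + gam * Si_one; the two constraints
    # determine (lam, gam) through a 2x2 linear system
    A = np.array([[mu @ Si_mu,   mu @ Si_one],
                  [ones @ Si_mu, ones @ Si_one]])
    lam, gam = np.linalg.solve(A, np.array([mu0, 1.0]))
    return lam * Si_mu + gam * Si_one

# Illustrative three-asset example (hypothetical weekly moments)
mu = np.array([0.0010, 0.0020, 0.0015])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 9.0, 1.5],
                  [0.5, 1.5, 6.0]]) * 1e-4
w = min_variance_weights(mu, Sigma, mu0=0.0015)
print(w, w @ mu, w.sum())
```

Any feasible perturbation of the solution (one that keeps both constraints) should raise the variance, which is an easy sanity check of optimality.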
given a utility level, which is equivalent to a linear constraint w'EX = \mu_0. Note that we did not pre-assume either a probability distribution for the random variables or a quadratic utility function; this is emphasized to verify the validity of the mean-variance framework. Markowitz (1959, 2012) therefore argued that a Gaussian distribution and a quadratic utility form are sufficient but not necessary conditions. As mentioned earlier, a whole set of distributions is compatible with this approach. Moreover, we assume that the expectation and covariance matrix can be reasonably observed. However, the formulation above relies heavily on the accuracy of the quadratic approximation; non-negligible effects from higher-order moments may significantly deteriorate the applicability of the MV approach. Returning to expected utility, let X = F^{-1}(t); Equation (2.1) becomes

E_F u(F^{-1}(t)) = \int_0^1 u(F^{-1}(t))\, dt    (2.5)
This shows that expected utility is essentially a uniform integral on [0, 1]. Alternatively, if a distortion function is introduced to capture asymmetry of preference, we obtain

E_F u(F^{-1}(t)) = \int_0^1 u(F^{-1}(t))\, d\gamma(t)    (2.6)
This is expected Choquet utility. Specifically, the conditional VaR \varphi_\alpha(X), or \alpha-risk measure, which is under wide investigation, has capacity \gamma(t) = \min(t/\alpha, 1). Koenker (2005) proved that a portfolio selection minimizing \varphi_\alpha(X) - \lambda EX is a quantile regression problem \min_{\xi \in \mathbb{R}} E\rho_\alpha(X - \xi), by the following theorem.

Theorem 2.1 (Koenker, 2005). If EX < \infty, then

\min_{\xi \in \mathbb{R}} E\rho_\alpha(X - \xi) = \alpha(\varphi_\alpha(X) + EX)    (2.7)
Following Koenker and Bassett (2003), any pessimistic risk measure is, by definition, a Choquet integral of \alpha-risk measures for some probability measure \phi:

\varphi(X) = \int_0^1 \varphi_\alpha(X)\, d\phi(\alpha)    (2.8)
And the equivalence between coherence and pessimism makes the quantile solution a tractable paradigm for all coherent risk measures. Despite its universality, we investigate only the \alpha-risk measure in this section.
3 Optimization of Conditional Value-at-Risk and Pessimistic Portfolio
Not surprisingly, the optimization of conditional value-at-risk developed by Rockafellar and Uryasev (2000) is connected to the pessimistic portfolio, since both adopt \alpha-risk in a bi-criteria optimization problem. The former approach, however, first constructs a loss function \rho(z), z \in \mathbb{R}^n, associated with the outcome measured by loss. The following discussion parallels the two methods in the hope of setting up a consistent CVaR optimization procedure. Starting from the general expression of \rho(z) with respect to \alpha and \eta (Pflug, 2000):

\rho_\alpha(z, \eta) = \eta + \frac{1}{1-\alpha} E[z - \eta]^+    (3.1)
where the expectation takes the form E[z - \eta]^+ = \int_\eta^\infty (z - \eta)\, dF(z). To gain an intuitive understanding, a piecewise loss function defined on real-valued random variables is given as

L_\alpha(z) = \begin{cases} z, & z > \eta \\ \eta\left[1 - \dfrac{\alpha}{F(\eta)}\right], & z \le \eta \end{cases}    (3.2)

Without much algebra, we can prove the following proposition.

Proposition 3.1.

\rho_\alpha(z, \eta) = \frac{1}{1-\alpha}\, E L_\alpha(z, \eta)    (3.3)
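Proposition 3.1 can be checked numerically on a finite sample. The sketch below (with an arbitrary simulated loss distribution, not the paper's data) compares \rho_\alpha(z, \eta) from (3.1) against (1/(1-\alpha)) E L_\alpha(z, \eta) from (3.2)-(3.3), with F(\eta) replaced by its empirical counterpart:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(0.0, 1.0, 10_000)   # simulated losses (illustrative)
alpha, eta = 0.95, 1.2             # arbitrary significance level and threshold

F_eta = np.mean(z <= eta)          # empirical F(eta)
# rho_alpha from (3.1): eta + E[z - eta]^+ / (1 - alpha)
rho = eta + np.mean(np.maximum(z - eta, 0)) / (1 - alpha)

# piecewise loss L_alpha from (3.2), then (1/(1-alpha)) E L_alpha from (3.3)
L = np.where(z > eta, z, eta * (1 - alpha / F_eta))
rho_via_L = np.mean(L) / (1 - alpha)
print(rho, rho_via_L)
```

With the empirical F(\eta), the two quantities agree exactly (up to floating-point error), not just asymptotically, since the identity in the proposition is algebraic.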
One remarkable property of \rho_\alpha(z, \eta) is the constant penalty rate imposed on losses below the threshold level \eta, while a linear relation is preserved on the upper side. This asymmetry indicates pessimism, consistent with a coherent risk measure (Koenker, 2005), and leads to the difference from volatility. Also notice that \rho_\alpha(z, \eta) may not be continuous at z = \eta: more precisely, an outcome below the breakpoint always delivers no lower satisfaction, and continuity is achieved if and only if \alpha = 0. Another scenario worth noting is \alpha = 1, where the loss function \rho(z) is no longer bounded, while \rho_\alpha(z, \eta) has an abrupt shift but is still finite. In the Choquet utility setting, the following proposition applies.

Proposition 3.2. Consider the distortion function \gamma_\alpha(X) = \max\{1, X/\eta\} and the linear utility

u(X) = \frac{1}{1-\alpha}\left(X + \frac{\eta(F(\eta) - \alpha)}{1 - F(\eta)}\right)    (3.4)

Then \rho(z) is a Choquet utility; it is coherent and pessimistic. Specifically, if F(\eta) is chosen to be \alpha, \rho(z) reduces to \alpha-risk.
Theorem 3.3 (Rockafellar and Uryasev, 2000). As a function of \eta, \rho_\alpha(z, \eta) is convex and continuously differentiable. The CVaR_\alpha associated with any z \in \mathbb{R}^n is determined by

CVaR_\alpha = \min_{\eta \in \mathbb{R}} \rho_\alpha(z, \eta)    (3.5)

and VaR_\alpha is jointly derived as the value of \eta attaining the minimum,

VaR_\alpha = \arg\min_{\eta \in \mathbb{R}} \rho_\alpha(z, \eta)    (3.6)
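On a finite sample, (3.5)-(3.6) can be checked directly. The sketch below searches \eta over a grid and confirms that the minimizer sits near the empirical \alpha-quantile (VaR) and the minimum near the empirical tail mean (CVaR); the distribution and sample size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(0.0, 1.0, 50_000)   # simulated losses (illustrative)
alpha = 0.95

def rho(eta):
    """Sample version of rho_alpha(z, eta) = eta + E[z - eta]^+ / (1 - alpha)."""
    return eta + np.mean(np.maximum(z - eta, 0)) / (1 - alpha)

grid = np.linspace(0.5, 3.0, 2001)
vals = np.array([rho(e) for e in grid])
eta_star = grid[vals.argmin()]     # ~ VaR_alpha  per (3.6)
cvar = vals.min()                  # ~ CVaR_alpha per (3.5)

var_emp = np.quantile(z, alpha)          # empirical alpha-quantile
cvar_emp = z[z >= var_emp].mean()        # empirical mean of the upper tail
print(eta_star, var_emp, cvar, cvar_emp)
```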
Proof. Expressing the expectation as an integral, Equation (3.1) reads

\rho_\alpha(z, \eta) = \eta + \frac{1}{1-\alpha}\int_\eta^\infty z\, dF(z) - \frac{\eta\,[1 - F(\eta)]}{1-\alpha}    (3.7)
Taking the first-order condition of (3.7) with respect to \eta, as it is continuously differentiable,

\frac{F(\eta) - \alpha}{1-\alpha} = 0    (3.8)
which gives F(\eta^*) = \alpha, or \eta^* = F^{-1}(\alpha). Plugging \eta^* into (3.1), we obtain

\rho_\alpha(z, \eta^*) = F^{-1}(\alpha) + \frac{1}{1-\alpha}\int_{F^{-1}(\alpha)}^\infty [z - F^{-1}(\alpha)]\, dF(z) = F^{-1}(\alpha) + \frac{1}{1-\alpha}\int_{F^{-1}(\alpha)}^\infty z\, dF(z) - F^{-1}(\alpha) = \frac{1}{1-\alpha}\int_{F^{-1}(\alpha)}^\infty z\, dF(z)    (3.9)
Expanding further,

\rho_\alpha(z, \eta^*) = \frac{1}{1-\alpha}\int_{F^{-1}(\alpha)}^\infty z\, dF(z) = \frac{1}{1-\alpha}\left[\int_{-\infty}^\infty z\, dF(z) - \int_{-\infty}^{F^{-1}(\alpha)} z\, dF(z)\right] = \frac{1}{1-\alpha}\left[E(z) - \int_{-\infty}^{F^{-1}(\alpha)} z\, dF(z)\right]    (3.10)

Comparing (3.10) with (2.8) completes the proof. One should notice that Theorem 3.3 does not explicitly assume a bounded expectation of z, which is necessary for the existence of the minimum. This required condition causes fragility in the case of tail distributions whose first moment is not finite; the resulting instability will be discussed later. Most importantly, the optimal loss function under Rockafellar's setting combines an expectation and a risk assessment, which is essentially the right-hand side of (2.7). Thus we restate (3.10) by borrowing the definition of \varphi_\alpha(z):

\rho_\alpha(z, \eta^*) = \frac{1}{1-\alpha}\left[E(z) + \varphi_\alpha(z)\right]    (3.11)
Formulated differently, the risk measure and the loss function share the same nature. If we apply \rho_\alpha(z, \eta) in the context of asset allocation, optimizing CVaR amounts to obtaining the decision vector that minimizes \rho_\alpha(X'w, \eta):

\min_{w, \eta} \rho_\alpha(X'w, \eta) = \min_{w, \eta}\; \eta + \frac{1}{1-\alpha} E[X'w - \eta]^+    (3.12)
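The sample counterpart of (3.12) is a linear program: with q return observations, auxiliary variables u_k \ge X_k'w - \eta, u_k \ge 0 linearize the [\cdot]^+ term. A sketch with simulated data follows; the budget constraint w'1 = 1 and the long-only bounds are added assumptions for illustration, not stated in the text.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
n, q, alpha = 4, 500, 0.95
X = rng.normal(0.0, 0.02, size=(q, n))   # simulated per-asset losses (q x n)

# decision vector: [w_1..w_n, eta, u_1..u_q]
# objective: eta + (1 / ((1 - alpha) q)) * sum_k u_k
c = np.concatenate([np.zeros(n), [1.0], np.full(q, 1.0 / ((1 - alpha) * q))])
# linearized positive part: X_k'w - eta - u_k <= 0
A_ub = np.hstack([X, -np.ones((q, 1)), -np.eye(q)])
b_ub = np.zeros(q)
# budget constraint w'1 = 1 (assumption)
A_eq = np.concatenate([np.ones(n), [0.0], np.zeros(q)]).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * n + [(None, None)] + [(0, None)] * q  # long-only w, free eta

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
w, eta = res.x[:n], res.x[n]
print(res.status, w, eta)
```

At the optimum the u_k coincide with [X_k'w - \eta]^+, so the LP value should equal the sample version of (3.12) evaluated at the solution.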
As CVaR captures expected tail risk, the objective function (3.1) incorporates the first-order moment. A more general version may consider a stochastic programming problem.

Theorem 3.4 (Krokhmal, 2006). Let a mapping \varphi : \chi \to \mathbb{R} satisfy the monotonicity, sub-additivity, and positive homogeneity defined by Artzner et al. (1999), with \varphi(\eta) > \eta for all real \eta. Then the optimum of

\rho(z) = \inf_\eta\; \eta + \varphi(z - \eta)    (3.13)

is coherent.
The contribution of Theorem 3.4 is to bridge a mapping with relaxed properties and coherent risk measures. Solving higher-moment coherent risk measures requires linearization under strict conditions and is beyond our discussion; its computational cost and restrictions on data length prevent wide application. Further exploring the link between the pessimistic portfolio and Rockafellar's idea, we establish the following proposition.

Proposition 3.5. The (w, \eta) that minimize the loss function in the form of (3.12) also solve the quantile regression problem

\min_{w, \eta} E\rho_\alpha(X'w - \eta)    (3.14)
Proof. Expanding the quantile loss function in the continuous case,

E\rho_\alpha(X'w - \eta) = \alpha\int_\eta^\infty (X'w - \eta)\, dF(X) + (\alpha - 1)\int_{-\infty}^\eta (X'w - \eta)\, dF(X)
= (\alpha - 1)\int_{-\infty}^\infty (X'w - \eta)\, dF(X) + \int_\eta^\infty (X'w - \eta)\, dF(X)
= (\alpha - 1)E(X'w) + (1 - \alpha)\eta + \int_\eta^\infty (X'w - \eta)\, dF(X)    (3.15)

Since E(X'w) only contributes to the optimal value, dividing by a scalar factor, (w^*, \eta^*) achieve the minima simultaneously:

\min_{w, \eta} E\rho_\alpha(X'w - \eta) \;\Leftrightarrow\; \min_{w, \eta}\; \eta + \frac{1}{1-\alpha}\int_\eta^\infty (X'w - \eta)\, dF(X)    (3.16)

which is exactly (3.12). In the discrete case, quantile regression applies as follows.
E\rho_\alpha(X'w - \eta) = (\alpha - 1)\frac{1}{q}\sum_{k=1}^q X_k'w + \frac{1}{q}\sum_{k=1}^q [X_k'w - \eta]^+ + (1 - \alpha)\eta    (3.17)

in which q periods of observations replace the integrals in (3.15), up to some notational change; this is consistent with the approximated solution proposed by Rockafellar and Uryasev (2000). To sum up, under both continuous and discrete circumstances, the pessimistic portfolio and Rockafellar's optimization of CVaR share the same risk measure and structure of solution. The former stems from Choquet utility maximization and follows the bi-criteria optimization paradigm; its elegant tractability, which offers additional convenience in the bootstrap analysis, is favored in this section.
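The quantile-regression form behind (3.17) can be illustrated in one dimension: for a fixed w, the \eta minimizing the sample pinball loss E\rho_\alpha(X'w - \eta) is an \alpha-quantile of the portfolio loss. The sketch below confirms this on simulated data (all inputs illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(0.0, 1.0, 20_000)   # portfolio losses X'w for some fixed w (simulated)
alpha = 0.9

def pinball(eta):
    """Sample quantile (pinball) loss: E[ u * (alpha - 1{u < 0}) ], u = z - eta."""
    u = z - eta
    return np.mean(u * (alpha - (u < 0)))

grid = np.linspace(0.0, 2.5, 2501)
eta_star = grid[np.argmin([pinball(e) for e in grid])]
print(eta_star, np.quantile(z, alpha))
```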
4 Efficient Frontier and Maximizing Reward/Risk Ratio
The model discussed does not imply a unique solution: as investors vary in risk tolerance, the optimal risk-reward pairs trace out the efficient frontier. Since all feasible portfolios are bounded by the efficient frontier in the mean-variance setting, we can characterize the pessimistic portfolio by its risk-reward profile. The central problem is which risk measure should be adopted to match expected return; a natural choice, analogously, is the measure consistent with its min-max problem. Figure 1 shows how risk increases as a higher return is targeted: we indeed anticipate a concave frontier and a hump-shaped reward/risk ratio. More generally, maximum drawdown (popular among practitioners), VaR, expected shortfall, and the traditional standard deviation are also candidates. Yet whatever measure is used, a trade-off relationship between risk and return is commonly expected. In practice, we have to fix a portfolio selection for comparison. A common idea is to set the same target return \mu_0; the problem with this simple procedure is that unstable performance at different return levels usually leads to contradictory conclusions. Another empirical difficulty is that we always have to set \mu_0 < \min(EX_t) in order to make the problem feasible.
Figure 1: Efficient Frontier (Conditional Value-at-Risk)
Alternatively, we may consider an unconstrained optimization such as

\min_w \frac{\delta(s)}{s}    (4.1)

Here s = \sigma(w, X)/(w'X), where \sigma(w, X) is the risk measure concerned and w'X the corresponding expected return. It can easily be shown that for any w \in W there is a w_f whose portfolio performance is on the efficient frontier, such that \frac{\delta(s_f)}{s_f} \le \frac{\delta(s)}{s}. Thus, practically, the search for the global minimum is restricted within the boundary. This is the optimization process we adopt; the following proposition justifies the procedure.

Proposition 4.1. For any concave risk measure \rho : w \to \mathbb{R}, concave reward measure \gamma : w \to \mathbb{R}, and convex decision space \chi, there exists a unique vector w^* \in \chi such that

\frac{\gamma(w^*) - \mu}{\rho(w^*)} \ge \frac{\gamma(w) - \mu}{\rho(w)}    (4.2)

for all w \in \chi, w \ne w^*.
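The search suggested by Proposition 4.1 can be sketched by sweeping target returns along a frontier and picking the highest reward/risk ratio. Below, the standard deviation stands in for the risk measure, the constrained minimum-variance weights are computed in closed form, and all numeric inputs are illustrative assumptions.

```python
import numpy as np

# hypothetical three-asset moments (illustrative, not the paper's data)
mu = np.array([0.0010, 0.0020, 0.0015])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 9.0, 1.5],
                  [0.5, 1.5, 6.0]]) * 1e-4

def frontier_w(mu0):
    """Min-variance weights for target return mu0 with w'1 = 1 (Lagrange closed form)."""
    ones = np.ones(3)
    a, b = np.linalg.solve(Sigma, mu), np.linalg.solve(Sigma, ones)
    A = np.array([[mu @ a, mu @ b], [ones @ a, ones @ b]])
    lam, gam = np.linalg.solve(A, [mu0, 1.0])
    return lam * a + gam * b

# sweep target returns; ratio = reward / risk with a zero baseline return
targets = np.linspace(0.0010, 0.0020, 101)
ratios = [m0 / np.sqrt(frontier_w(m0) @ Sigma @ frontier_w(m0)) for m0 in targets]
best = targets[int(np.argmax(ratios))]
print(best, max(ratios))
```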
5 Data Preprocessing
We chose the major indexes of 24 developed economies as representatives of their stock markets. Some countries have more than one candidate, in which case the index with the highest trading volume was selected. All data were collected from DataStream on a weekly basis up to 28 December 2012. As data length varies across countries, we employ a matrix to retain the integrity of the MSCI developed market index: it divides the whole set into sub-periods by start date, and in each interval only actively traded indexes are included, aiming at accurate simulation and the exclusion of selection bias. Table 2 is a summary of descriptive statistics. As expected, all return distributions are negatively skewed and leptokurtic. VaR and CVaR vary little across countries, indicating similar historical tail risk; only Italy experienced an average loss during the past 30 years. One problem arising from international investment is currency risk hedging, because foreign equity values denominated in local currency have to be translated into aggregate performance. Some empirical evidence
Table 1: Countries Covered in MSCI Developed Market

Country        Main index              Region          Start date
Australia      S&P/ASX 200             Oceania         1992/5/29
Austria        ATX 200                 Europe          1986/1/10
Belgium        BEL 20                  Europe          1990/1/5
Canada         S&P/TSX                 North America   1983/1/7
Denmark        OMXC20                  Europe          1989/12/8
France         CAC 40                  Europe          1987/7/10
Finland        OMXH                    Europe          1987/1/2
Germany        DAX 30 PERFORMANCE      Europe          1983/1/7
Greece         ATHEX COMPOSITE         Europe          2001/5/6
Hong Kong      HANG SENG               Asia            1983/1/7
Ireland        ISEQ                    Europe          1983/1/7
Israel         ISRAEL TA 100           Asia            2010/5/6
Italy          FTSE MIB                Europe          1998/1/2
Japan          NIKKEI 225              Asia            1983/1/7
Netherlands    AEX                     Europe          1983/1/7
New Zealand    NZX 50                  Oceania         2000/12/29
Norway         OSLO                    Europe          1983/1/7
Portugal       PSI-20                  Europe          1997/11/4
Singapore      STRAITS TIMES INDEX L   Asia            1999/9/3
Spain          IBEX 35                 Europe          1987/1/9
Sweden         OMXS30                  Europe          1986/1/3
Switzerland    SMI                     Europe          1988/7/1
UK             FTSE 100                Europe          1983/1/7
USA            S&P 500                 North America   1983/1/7
has shown the convenience of currency hedging in global asset allocation (Eun and Resnick, 1988; Perold and Schulman, 1988; Glen and Jorion, 1993). In reality, constructing an optimal portfolio cannot neglect the effect of exchange rate fluctuations; the techniques developed by Black (1990), Adler and Prasad (1992), and Walker (2008) can partially alleviate, if not fully eliminate, the risk. Yet our research should not deviate into such implementation details, and we concentrate purely on returns from speculative capital gains. Finally, the length of the rolling window deserves convincing justification. Empirical studies usually compromise between the risk of introducing additional noise and that of obsolete information, both of which may significantly deteriorate estimation. Here we run the simulation with a 100-day window: the experiment is designed to evaluate one-step-ahead performance after every 100-observation period, iteratively. This is essentially a trade-off between timely adaptation to a heterogeneous data generating process (Giacomini and White, 2006) and the preservation of a stable tail distribution. Robustness will also be tested by varying the rolling window length and the significance level.
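The rolling design described above can be sketched as a generic backtest loop: estimate on each trailing 100-observation window, hold the resulting weights for one step, and record the realized return. The equal-weight `allocate` below is a placeholder for either optimizer, and the return panel is simulated.

```python
import numpy as np

rng = np.random.default_rng(4)
R = rng.normal(0.001, 0.02, size=(600, 5))   # simulated return panel (T x n assets)
window = 100

def allocate(train):
    """Placeholder allocation rule; swap in the MV or pessimistic optimizer here."""
    n = train.shape[1]
    return np.full(n, 1.0 / n)

realized = []
for t in range(window, R.shape[0]):
    w = allocate(R[t - window:t])    # estimate on the trailing 100 observations
    realized.append(R[t] @ w)        # one-step-ahead portfolio return
realized = np.array(realized)
print(realized.shape, realized.mean())
```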
6 Performance
Figure 2 compares performance against the index: the dashed blue line is the cumulative return on the MSCI global index, while the bold red line and the thin blue line are the returns on the pessimistic portfolio and the mean-variance portfolio, respectively. Performance is calculated without transaction costs.
Table 2: Descriptive Statistics of Country Index

Country      Mean     SD      Skewness  Kurtosis  VaR_0.01  CVaR_0.01  VaR_0.05  CVaR_0.05
Australia    0.00093  0.0204  -0.8476   8.815     -0.0604   -0.0807    -0.0339   -0.0500
Austria      0.00092  0.0327  -1.1861   14.540    -0.0988   -0.1425    -0.0491   -0.0818
Belgium      0.00046  0.0268  -1.1633   12.653    -0.0814   -0.1130    -0.0451   -0.0670
Canada       0.00117  0.0222  -0.9475   10.932    -0.0670   -0.0998    -0.0339   -0.0553
Denmark      0.00128  0.0267  -0.9791   10.214    -0.0712   -0.1152    -0.0407   -0.0641
France       0.00066  0.0299  -0.6744   7.747     -0.0860   -0.1143    -0.0475   -0.0690
Finland      0.00127  0.0372  -0.5399   6.263     -0.1087   -0.1446    -0.0578   -0.0912
Germany      0.00167  0.0303  -0.6586   7.743     -0.0838   -0.1171    -0.0486   -0.0719
Greece       0.00086  0.0421  0.0983    6.370     -0.1256   -0.1526    -0.0656   -0.0958
Hong Kong    0.00214  0.0366  -1.2987   14.752    -0.0975   -0.1575    -0.0541   -0.0857
Ireland      0.00150  0.0295  -1.5688   15.735    -0.0928   -0.1406    -0.0447   -0.0739
Israel       0.00243  0.0360  -0.6029   5.678     -0.1205   -0.1392    -0.0617   -0.0873
Italy        -0.0005  0.0348  -0.7602   9.022     -0.1325   -0.1581    -0.0576   -0.0875
Japan        0.00016  0.0285  -0.8494   10.184    -0.0769   -0.1121    -0.0460   -0.0672
Netherlands  0.00125  0.0287  -1.1806   12.248    -0.0959   -0.1255    -0.0451   -0.0721
New Zealand  0.00029  0.0169  -0.8382   7.912     -0.0551   -0.0706    -0.0259   -0.0428
Norway       0.00225  0.0293  -1.0754   10.067    -0.0874   -0.1318    -0.0467   -0.0730
Portugal     0.00060  0.0268  -0.8859   9.995     -0.0917   -0.1209    -0.0416   -0.0650
Singapore    0.00056  0.0286  -0.4028   7.809     -0.0732   -0.1175    -0.0469   -0.0674
Spain        0.00094  0.0315  -0.7408   8.308     -0.0856   -0.1295    -0.0496   -0.0745
Sweden       0.00177  0.0315  -0.4523   7.142     -0.0861   -0.1178    -0.0507   -0.0733
Switzerland  0.00118  0.0259  -0.8744   14.034    -0.0721   -0.1122    -0.0398   -0.0612
UK           0.00125  0.0242  -1.3656   17.338    -0.0633   -0.1004    -0.0351   -0.0565
USA          0.00146  0.0232  -0.8104   9.393     -0.0707   -0.0963    -0.0354   -0.0558

Table 3: Performance Statistics

                                       PP       MV       Index
Sharpe Ratio without Risk-Free Rate    0.591    0.260    0.20
Expected Shortfall (α = 0.05)          -1.50%   -1.48%   -1.61%
Value at Risk (α = 0.05)               -2.10%   -2.11%   -2.30%
Breakeven Transaction Cost             0.49%    0.61%    –
Figure 2: Portfolio Performance
Here are some statistics. We define the information ratio as

IR_p = \frac{\mu_p - \mu_b}{\sigma_{p-b}}    (6.1)

where \mu_b is a benchmark average return, which can be the market portfolio return, the index return, or the risk-free rate (in which case the ratio becomes the Sharpe ratio). If we use the index return as \mu_b, the ratio can be viewed as a relative risk-reward profile. Lai (2011) showed that mean-variance optimization maximizes the information ratio. We nevertheless obtained a higher realized information ratio on the pessimistic portfolio (IR_PP = 0.039 > IR_MV = 0.0139).
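The ratio in (6.1) is straightforward to compute on realized series; the returns below are simulated placeholders.

```python
import numpy as np

rng = np.random.default_rng(5)
rp = rng.normal(0.0015, 0.02, 500)   # portfolio returns (simulated)
rb = rng.normal(0.0010, 0.02, 500)   # benchmark returns (simulated)

def information_ratio(rp, rb):
    """IR = (mu_p - mu_b) / std of the active return r_p - r_b; with a zero
    benchmark (no risk-free rate) this reduces to the Sharpe ratio of Table 3."""
    active = rp - rb
    return active.mean() / active.std(ddof=1)

ir = information_ratio(rp, rb)
sharpe = information_ratio(rp, np.zeros_like(rp))
print(ir, sharpe)
```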
7 Discussion

7.1 Quantile-Quantile Plot
The QQ plot is widely used to investigate the relationship between two probability distributions: a set of quantiles of an empirical distribution is plotted against those of a theoretical distribution. If the points cluster on a line, this supports the assumption that the empirical data follow the given distribution. For financial time series, a straightforward test of fat tails is to plot the return distribution against the normal distribution; deviation from the line passing through the central points is the evidence. The QQ plot is naturally connected to the Sharpe ratio: it is easy to show that a sufficient condition for two distributions to have equal Sharpe ratio (with no risk-free rate) is linearity across all quantiles,

f(x) = k\, g(x), \quad k \ne 0    (7.1)

The first- and second-order moments justify the assertion. Another aspect concerns the sum of random variables, whose distribution is the convolution of the probabilities of these variables. Consider N assets with prices p_i (i = 1, ..., N) and portfolio weights w_i (i = 1, ..., N); the portfolio distribution is

\nu(p) = \int f(p_1, ..., p_N)\, dp_1 \cdots dp_N \quad \text{with constraints} \quad \sum_{i=1}^N w_i p_i = 1, \quad \sum_{i=1}^N w_i = 1    (7.2)
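A QQ plot against the normal pairs sorted standardized returns with normal quantiles at plotting positions. A stdlib-only sketch (no plotting, just the coordinate pairs whose linearity is being judged; the return series is simulated):

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(6)
returns = [random.gauss(0.001, 0.02) for _ in range(1000)]   # simulated returns

m, s = mean(returns), stdev(returns)
standardized = sorted((r - m) / s for r in returns)
n = len(standardized)
# theoretical normal quantiles at plotting positions (k + 0.5) / n
theo = [NormalDist().inv_cdf((k + 0.5) / n) for k in range(n)]
pairs = list(zip(theo, standardized))   # the points of the QQ plot
print(pairs[0], pairs[n // 2], pairs[-1])
```

Fat tails would show up as empirical quantiles below the theoretical ones at the far left and above them at the far right.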
Figure 3: QQ Plot. (a) Pessimistic Portfolio; (b) Mean-Variance
The central limit theorem states that the sum of m independently distributed random variables is asymptotically normal as m grows without bound. If the dimension of the asset space is m and the eigenvectors have comparable variances, the central area around the location parameter, [\mu - \Delta_L, \mu + \Delta_U], is Gaussian. Feller derived the boundary condition on the width rigorously:

|\nu(p) - m_N| \ll \sigma\sqrt{N}\,(N^*)^{1/6}    (7.3)
Given the variance, the normal distribution maximizes differential entropy (see Appendix A), and Graham Wallis provided a strict justification for assigning probabilities by maximizing entropy: the most probable outcome should follow the distribution whose entropy is a local maximum. It is then natural to assert that any deviation from the normal distribution is an inaccurate description of likelihood, given the variance level. Therefore, plotting the empirical distribution against the normal distribution can be viewed as an absolute measure of efficiency in information extraction. To test normality, Figure 3 shows QQ plots of the returns on the pessimistic portfolio and on the mean-variance portfolio. One observation is that returns on both the pessimistic portfolio and the mean-variance optimization exhibit higher probability at both the left and right tails than the normal distribution; the occurrence of extreme events cannot be eliminated by Choquet portfolio optimization, a conclusion consistent with the higher-order moments. However, the advantage of the proposed portfolio lies on the left side: the benchmark distribution begins to deviate negatively at one standard deviation, while the pessimistic portfolio has a prolonged Gaussian area of 1.5 standard deviations. As discussed earlier, it is essential to evaluate the effect of an optimization scheme from a distributional perspective. An alternative graph uses two empirical distributions instead of the normal benchmark, so that active portfolio performance can be compared to the index on a quantile basis. This is what Figure 4 shows: green crosses are pessimistic portfolio quantiles and red circles are MV portfolio quantiles. Thanks to the asymmetric risk measure and Choquet expected utility, the portfolio is improved in that the left-side fat tail is alleviated relative to the same quantiles of index returns, while the right side has not been restricted. From the figure, we can also infer the higher information ratio of the pessimistic portfolio, from its steeper slope.
7.2 Out-of-Sample Degeneration
As Davies and de Servigny (2012) state, the discrepancy between realized return and target return usually exhibits a negatively distorted distribution. The extent of degeneracy, when an in-sample allocation scheme is extended beyond the sample, is a critical part of evaluation. It is also worthy of investigation from the perspective of
Figure 4: QQ Plot Comparison. (a) QQ Plot; (b) Left Tail; (c) Right Tail
Figure 5: Out-of-Sample Degeneracy. (a) Pessimistic Portfolio; (b) Mean-Variance
Table 4: Out-of-Sample Degeneracy Moments

                     PP       MV
Mean                 -0.0015  -0.0013
Standard Deviation   0.0218   0.0194
Skewness             -1.190   -1.581
Kurtosis             11.68    21.67
historical information extraction. Figure 5 shows the estimation error, defined as the difference between realized return and target return:

e_t = W_{t-1}'(X_t - EX_{t-1})    (7.4)

Here W_{t-1}, X_t, and EX_{t-1} denote the portfolio weights at t-1, the asset return at t, and the expected asset return at t-1, respectively. Ideally, Ee_t = 0 indicates the maximal efficiency an active portfolio can reach. As long as EX_{t-1} is an unbiased estimator of EX_t, e_t should be symmetric; a negative mean thus indicates inefficient modeling of returns. More severe degeneracy is present in the pessimistic portfolio, as both magnitude and uncertainty are higher. One explanation is that the tail distribution can be relatively mobile and unstable over time, causing unfavorable misallocation; this phenomenon is well documented by several researchers. Another aspect we observe is that the pessimistic portfolio has a remarkable advantage in the higher-order moments (skewness and kurtosis). This is connected to the risk measure: the pessimistic portfolio minimizes tail risk, while the mean-variance approach neglects higher-order effects in its approximation.
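The estimation error (7.4) can be computed on any panel; the sketch below uses a simulated panel and a hypothetical trailing-mean forecast for EX_{t-1}, with fixed equal weights standing in for W_{t-1}.

```python
import numpy as np

rng = np.random.default_rng(7)
R = rng.normal(0.001, 0.02, size=(300, 4))   # simulated asset returns (T x n)
window, n = 100, 4
w = np.full(n, 1.0 / n)                      # fixed weights, for illustration only

errors = []
for t in range(window, R.shape[0]):
    EX_prev = R[t - window:t].mean(axis=0)   # expected return estimated at t-1
    errors.append(w @ (R[t] - EX_prev))      # e_t = W'_{t-1}(X_t - EX_{t-1})
errors = np.array(errors)
print(errors.mean(), errors.std())
```

On iid simulated data the mean error is near zero; a persistently negative mean on real data is the degeneracy the section describes.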
7.3 Sensitivity Analysis
As White (2000) stated, parameterized models more or less suffer from data snooping: deliberate manipulation usually justifies spurious relations with appealing results and leads to out-of-sample deterioration. In our procedure, the fixed rolling-window length and the loss significance level are ex-ante settings, both of which affect risk as measured by CVaR. By varying these parameters, we can assess the robustness of the optimization process; additionally, the rebalancing frequency is changed to test portfolio efficiency. In all scenarios, in-sample optimal portfolios lose risk-adjusted profitability as capital reallocation is deferred, and tail risk increases, evidenced by the larger magnitudes of both VaR_0.01 and CVaR_0.01. This supports heterogeneity in the data generation and is consistent with the observations in the out-of-sample degeneration section. Moreover, a longer training period shows alleviated degradation, which might indicate an invariant risk measure. Higher performance is anticipated with higher rebalancing frequency in frictionless markets, if tail behavior is properly modeled in finite samples. All four cases with \tau = 0.1 beat their counterparts with \tau = 0.01; this is partially due to investors' risk aversion and the pricing of extreme risk. Realistically, transaction costs need to be taken into consideration. Scaled by the change in weights, turnover is defined as the sum of absolute values,

T_t = \sum_{i=1}^{N_t} |w_{t,i} - w_{t-1,i}|    (7.5)

where T_t is the turnover, N_t the number of assets, and w_{t,i} the weight of asset i at time t. This is the capital base subject to friction. The breakeven transaction cost, an indication of the highest cost rate that preserves active portfolio profitability, is then the ratio of overall return to total turnover,

b = \frac{\sum_{t=0}^T r_t}{\sum_{t=0}^T T_t}    (7.6)

Interestingly, weekly adjustment achieves slightly better performance at the sacrifice of serious turnover, while asset allocation on a monthly basis is much more resistant to friction. We have the following inequalities, empirically derived from the discussion above: \partial T(L, \tau), \partial r(L, \tau) < 0,
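Turnover (7.5) and the breakeven transaction cost (7.6) are simple to compute from a weight path; the path and returns below are simulated placeholders.

```python
import numpy as np

rng = np.random.default_rng(8)
T, n = 50, 5
W = rng.dirichlet(np.ones(n), size=T + 1)    # simulated weight path (rows sum to 1)
r = rng.normal(0.002, 0.01, size=T)          # simulated realized portfolio returns

# T_t = sum_i |w_{t,i} - w_{t-1,i}|, per (7.5)
turnover = np.abs(np.diff(W, axis=0)).sum(axis=1)
# b = sum_t r_t / sum_t T_t, per (7.6)
breakeven = r.sum() / turnover.sum()
print(turnover.mean(), breakeven)
```

Between two fully invested long-only portfolios the per-period turnover is bounded by 2, which the test below exploits as a sanity check.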