Investment decisions when utility depends on wealth and other attributes

The problem of optimal investment under a multivariate utility function allows for an investor to obtain utility not only from wealth, but other (possibly correlated) attributes. In this paper we implement multivariate mixtures of exponential (mixex) utility to address this problem. These utility functions allow for stochastic risk aversions to differing states of the world. We derive some new results for certainty equivalence in this context. By specifying different distributions for stochastic risk aversions, we are able to derive many known, plus several new utility functions, including models of conditional certainty equivalence and multivariate generalisations of HARA utility, which we call dependent HARA utility. Focusing on the case of asset returns and attributes being multivariate normal, we optimise the asset portfolio, and find that the optimal portfolio consists of the Markowitz portfolio and hedging portfolios. We provide an empirical illustration for an investor with a mixex utility function of wealth and sentiment.


Introduction
The typical problem of portfolio choice, in which an investor gains utility from wealth only, is well understood. A higher risk aversion to wealth would lead a particular individual to lower the proportion they wish to invest in a risky asset (e.g. Pratt 1964). Arrow (1971) showed that if risk aversion decreases with wealth, then a higher initial wealth will lead to a higher proportion invested in risky assets. The process by which wealth is transferred into utility is implicit.
As Finkelshtain and Chalfant (1993) note, if investors derive their utility from the consumption of goods purchased with wealth, there are likely to be situations in which their utility functions would be multivariate, rather than dependent on wealth alone. Alternatively, there may be sources of risk to which individuals are exposed that are beyond their control (non-diversifiable, non-tradeable and non-insurable), which are also known as background risks. While multivariate generalisations of risk-aversion have been developed (e.g. Karni 1979, Pratt 1988, Gollier and Pratt 1996), problems of portfolio choice have been studied to a lesser degree. †
*Corresponding author. Email: andrew.grant@sydney.edu.au
† One problem is that when utility functions depend on variables in addition to wealth, separability between wealth and the other attributes may no longer apply. When utility functions are non-separable, complications arise beyond the scope of this paper (Kihlstrom and Mirman 1974, Duncan 1977, Karni 1979).
In this paper, we explore the impact of multiattribute mixex (short for 'mixture of exponentials') utility functions (Tsetlin and Winkler 2009) on investment decisions. Multiattribute utility functions in this context allow the investor to exhibit preferences over asset returns as well as non-financial attributes, which may effectively provide a hedge against poor returns. For example, an investor might exhibit a preference for socially responsible investment, and be willing to accept slightly lower returns or higher risk to invest in psychically pleasing stocks. In general, the hedge provided by the non-financial attributes will depend on the correlation between attributes (e.g. whether socially responsible investments have lower returns), as well as the risk aversion an investor shows towards them (an investor might be quite risk tolerant with respect to market returns, but highly averse to uncertain ethical outcomes). Empirical evidence suggests that investors may be willing to pay a price to invest in socially responsible portfolios (see Auer and Schumacher 2016 for example).
We implement the mixex utility function, which admits stochastic absolute risk aversions to states of the world. The form of the utility function means that 'good-bad' states of the world are preferred to the equally-likely mixture of 'good-good' and 'bad-bad' states of the world. We show that the distributions of the stochastic risk aversions allow for the recovery of other forms of multiattribute utility functions, such as a multiplicative hyperbolic absolute risk aversion (HARA) utility function, or products of power-utility functions (generalised Cobb-Douglas utility). Additionally, we explore the notion of certainty equivalents for the utility function given particular values of the distribution of risk aversions, considering in detail both discrete and continuous specifications for the risk-aversion distribution. Our results benefit from, and are consistent with, the theoretical findings of Brockett and Golden (1987) and Tsetlin and Winkler (2009). The link between risk aversion and the risk-return trade-off was established analytically by Merton (1971, equation (117)), and has led to a large literature on stochastic risk aversion (see Bjork et al. 2012, Guo et al. 2013, and references contained therein). This literature differs markedly from our approach. In this paper, we consider absolute rather than relative risk aversion, and do not explicitly model observable state variables. We believe that the stochastic risk-aversion embodied in mixex models may provide a natural framework for the modelling of preference shocks. A link between preference and volatility shocks has been established theoretically by Xu (2016).
We list some of the advantages of using mixex below. Firstly, the utility functions are defined via assumptions about the distribution of stochastic risk-aversion. We utilise this approach in Section 2 to characterise well-known utility functions, and also to develop some new ones. Secondly, mixex utilities have certain attractive characteristics regarding decreasing risk-aversion. This allows us to characterise (see Tsetlin and Winkler 2009, Theorem 6) increasing/decreasing risk aversion in one attribute as another attribute changes.
We derive optimal portfolios for a general number of assets and attributes, where we assume multivariate normality. The optimal portfolio is approximated by the Markowitz portfolio combined with a hedging term from the other attributes in the utility function. This is broadly similar to results by Merton (1973) for the ICAPM, and Jiang et al. (2010) on optimal portfolios with background risk. We provide a detailed study of a simple asset allocation problem, where we calculate the optimal equity proportions based on U.S. equity data and the Baker and Wurgler (2006) sentiment index, applied to some of the utility specifications derived in our paper. We estimate two models, one where risk aversion to wealth and sentiment are independent, and a second where risk aversions to wealth and sentiment are correlated. The empirical evidence for the sign of this correlation seems somewhat mixed. There appears to be no direct empirical evidence on the correlation between risk aversion to sentiment and risk aversion to wealth. Such evidence would require a separate study, most likely based on survey data. Bams et al. (2017, Table 2) find some evidence of a positive correlation between aggregate risk aversion and an orthogonalised sentiment index, while West and Worthington (2012) show that surveyed attitudes toward risk tolerance seem negatively related to consumer sentiment, which implies a positive correlation between risk aversion and sentiment. However, Sahm (2012), using U.S. HRS panel data, finds some evidence of negative correlation between risk aversion to wealth and the level of sentiment, as measured by the Michigan Consumer Sentiment Index. Moreover, this correlation changes depending on whether sentiment is current or lagged. Whilst a meta-study on the relationship would be interesting, there are differences between the measurement of both sentiment and risk aversion, and the samples of individuals used, which make generalisations challenging.
We do not pursue this further as the focus of the paper is on the implementation of bivariate mixex utility to empirical problems, rather than the empirical problem itself.
Notwithstanding our above comments, we choose sentiment as our non-financial, non-traded attribute that is not measured in wealth terms. There is a considerable amount of research on sentiment in financial markets. See, for example, Baker and Wurgler (2006) and Stambaugh et al. (2012). Furthermore, the data for sentiment are of high quality and readily available. In alternative data sources analysed by the authors, such as Environmental, Social and Governance (ESG) information, short histories and annual updating render implementation challenging. By contrast sentiment indices are typically updated monthly and have 50 years of history. Finally, notions of sentiment underpin various sub-areas of finance (such as behavioural finance and empirical asset pricing) and also related areas of study such as psychology and sociology.
Examples of multivariate utility applied to investment problems are relatively limited. Based on earlier work on multiattribute utility functions by Losq and Chateau (1982), Li and Ziemba (1989) develop a portfolio selection model with two risky assets where returns are bivariate normally distributed, and utility depends on wealth and some other attribute. They find that optimal portfolios under these circumstances can be characterised by a matrix measure of risk aversion involving risky assets and factors. Finkelshtain and Chalfant (1993) derive conditions under which a utility function in wealth and consumption goods is separable, showing that independence between returns and other attributes is not a sufficient condition to ensure that optimal portfolios do not depend on other attributes. Additional restrictions on the form of the utility function are required to meet this criterion.
Empirical studies of portfolio choice in the presence of background risk have typically found that investors would prefer to hold a reduced proportion of wealth in the risky asset, assuming some positive correlation between the two sources of risk. For example, Heaton and Lucas (2000) show that investors with significant entrepreneurial risk or holdings in their employer's stock tend to hold a lower share of their wealth in stocks. Other studies examine the impact of background risk in terms of employment income (Viceira 2001), housing (Cocco 2005, Yao and Zhang 2005), health (Rosen and Wu 2004, Edwards 2008, Yogo 2014), and divorce (Bertocchi et al. 2011, Christiansen et al. 2015), with each different source of risk reducing the share of wealth investors hold in stocks. These results partially explain why observed holdings in liquid stocks are lower than those predicted by traditional single-variable utility models. It would appear that individuals do not make investment decisions as though wealth is the only consideration, and thus extending portfolio choice to a multi-attribute utility setting is important. Many of these approaches, however, treat background risk (e.g. health) as measurable in wealth terms. We propose a framework that lets the attributes, other than wealth, be measurable in non-wealth terms, be non-tradeable, and be non-transferable. Moreover, while prior literature has considered appropriate parameterisations of the Constant Absolute Risk Aversion (CARA) utility functions for investment decisions, it is less well understood how risk attitudes toward non-financial attributes should be estimated. We discuss issues of calibration, as well as correlation among not only the attributes themselves, but also the (stochastic) risk-aversions to attributes.
The remainder of the paper is organised as follows. Section 2 introduces the mixex utility function, demonstrates properties of certainty equivalents under the stochastic risk aversions, and shows how other forms of multiattribute utility function can be recovered through judicious choice of the risk aversion distributions. Section 3 derives optimal portfolios when attributes and asset returns are multivariate normal and the risk aversion distribution is left general. Section 4 introduces a worked example showing how a bivariate utility function containing wealth and sentiment is constructed. Section 5 demonstrates the impact of the risk aversion distributions for the worked example, while Section 6 summarises the main findings and concludes.

Investment under mixex utility
Initially, we present some definitions for mixex utility and derive a number of new results concerning the relevant notion of certainty equivalence. This concept is central to both economics and finance, as it is the basis for optimisation. There is a large literature on univariate certainty equivalence; the literature on multivariate certainty equivalence generally does not produce unique solutions without arbitrary restrictions (see e.g. Courbage 2001, equations 1, 3, and 5, for alternative restrictions). Typically, there are different vectors of certainty equivalents that will give the same expected utility.
While we do not argue that the approach we provide below produces a unique solution for certainty equivalence, we do find that, when the risk aversion distribution is degenerate and univariate, we recover the conventional solution. We name this concept mixex certainty equivalence (MCE). Before we proceed to the definitions of certainty equivalence, we discuss the concept of stochastic risk aversions in the multivariate context.

Stochastic risk aversion to sentiment
The focus of this paper is on the impact on investment decisions of a secondary attribute other than wealth. Risk aversion to non-wealth attributes can often be difficult to interpret. In the case of sentiment, a low level of risk aversion to sentiment could be regarded as 'sentiment-prone,' that is, susceptible to the zeitgeist (i.e. excessive optimism (pessimism) with respect to high (low) states of the world). In contrast, high risk-aversion to sentiment can be considered as 'against prevailing sentiment,' as may be observed in a contrarian who becomes excessively pessimistic in good states of the world.
However, such an investor would not behave as a contrarian in low sentiment states of the world, preferring to continue to scale back their equity position or, equivalently, increase their hedging position. On average, we would expect that individuals exhibit a below average risk aversion to sentiment in societies where herding is present. If the second attribute was status, then risk aversion to status loss is likely to be manifested in choice of financial instruments or intermediaries. In a different literature, educational choice is influenced by risk aversion to status loss (e.g. Jaeger and Holm 2012). To aid readers, we will utilise 'sentiment' as the secondary attribute, although the analysis is more generally applicable.

Mixex certainty equivalence (MCE)
We now proceed to our definition of MCE. Firstly, define $\lambda$ to be $(N \times 1)$ stochastic risk aversions for $(N \times 1)$ attributes $A$ with joint pdf($A$, $\lambda$), $A = (a_1, \ldots, a_N)$, so that $a_j$ is the $j$th element of vector $A$. For the attributes of different individuals we will use a superscript notation $A^i$. When considering observed data, we shall write $A_t$, where $A_t$ is a vector of attributes observed at time $t$. At various points we define $A$ in terms of its first element $a_1$ and its remaining $N - 1$ elements $A_2$, so that Mixex utility (Tsetlin and Winkler 2009) is defined as and expected mixex utility by where pdf($\lambda$) is the marginal probability density function of $\lambda$. This is the distribution of vector risk aversion. The attractive feature of the mixex model is that risk aversion, here defined as a vector of random variables, can be state dependent. Such a representation is particularly valuable in modern markets, where practitioners talk about 'risk-on' and 'risk-off.' A closely related term is risk appetite. These terms mean that markets have differing risk at different times, but have also come to mean that agents' attitudes to risk vary depending on market circumstances (e.g. Lee 2012, Froot et al. 2014). Previous analysis of this class of utility functions has focused on the interpretation of mixex utility as a Laplace transform. Whilst we also continue using this interpretation, we are particularly interested in the underlying distribution of risk aversion. We define MCE of $A$ as some function $C(\lambda)$ whereby the following equation is satisfied.
Also, taking expected values of (1), where m A (−λ) is the joint moment generating function of A evaluated at −λ. Here, we assume that pdf(A, λ) = pdf(A)pdf(λ), i.e. that A and λ are independent. However, it is straightforward to relax this assumption, in which case m A (−λ) would be interpreted as the joint moment generating function of A with respect to the conditional pdf, pdf(A|λ). Equation (4) can be rewritten as Defining K A (s) as the cumulant-generating function (CGF) of A, we see that We now expand K A (−λ) in a Taylor series. Using standard tensor notation, We notice that K i = μ i = E[A i ] is the mean of the attribute distribution, and K ij = σ ij is the covariance between attributes. Hence, we have the following result.
Proposition 1 If A and λ are independent and relevant moments exist such that m A (s) exists, then the mixex certainty equivalent vector C(λ) whose ith element is C i is given by Proof This involves comparing (3) with (7).
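One way to read the chain of steps behind Proposition 1 is sketched below. It assumes the standard mixex form of Tsetlin and Winkler (2009), independence of $A$ and $\lambda$, and summation over repeated indices. The layout is an illustrative assumption, and matching the exponents term by term gives one admissible (not necessarily unique) choice of $C(\lambda)$.

```latex
\begin{align*}
U(A) &= -\int \exp(-\lambda' A)\,\mathrm{pdf}(\lambda)\,d\lambda, \\
E[U(A)] &= -\int m_A(-\lambda)\,\mathrm{pdf}(\lambda)\,d\lambda
         = -\int \exp\{K_A(-\lambda)\}\,\mathrm{pdf}(\lambda)\,d\lambda, \\
K_A(-\lambda) &= -\lambda_i K_i + \tfrac{1}{2}\lambda_i\lambda_j K_{ij}
               - \tfrac{1}{3!}\lambda_i\lambda_j\lambda_k K_{ijk} + \cdots, \\
\lambda' C(\lambda) &= -K_A(-\lambda)
  \quad\Longrightarrow\quad
  C_i = K_i - \tfrac{1}{2}\lambda_j K_{ij}
        + \tfrac{1}{3!}\lambda_j\lambda_k K_{ijk} - \cdots .
\end{align*}
```

With $K_i = \mu_i$, $K_{ij} = \sigma_{ij}$ and all higher cumulants zero, the last line collapses to a mean vector less a covariance-weighted risk adjustment.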
The above argument is subject to the existence of $m_A(s)$ and the validity of (7) in the general case, which we now specialise. We aim to determine $C(\lambda)$ if we assume that $A$ is multivariate normal, denoted by $A \sim N(\mu, \Sigma)$.

Corollary 1 The mixex certainty equivalent C(λ) for multivariate normally distributed values of A and pdf(
In particular, if $\lambda$ is nonstochastic, mixex certainty equivalence is identical to conventional certainty equivalence. In the case that $N = 1$ and pdf($\lambda$) is degenerate, i.e. pdf($\lambda$) = $\lambda$ with probability 1, we get a certainty equivalent This is a familiar result, and is the rationale for mean-variance optimisation used by investment practitioners, which corresponds to the usual concept of certainty equivalence. In effect, $\mu - \tfrac{1}{2}\Sigma\lambda$ can be thought of as a risk-adjusted expected attribute vector where the scale of the risk adjustment ($\lambda$) is a variable depending on the state of the world. This is the only polynomial solution for MCE available as a consequence of a result by Marcinkiewicz (1939). He showed that the normal distribution is the only (non-degenerate) distribution whose cumulant-generating function is a polynomial.
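The normal-case certainty equivalent can be checked numerically. The sketch below uses hypothetical parameter values (the covariance of 0.0002 in particular is an assumption) and compares $\exp(-\lambda' C)$ at $C = \mu - \tfrac{1}{2}\Sigma\lambda$ with the normal moment generating function evaluated at $-\lambda$, plus a Monte Carlo estimate of $E[\exp(-\lambda' A)]$.

```python
import numpy as np

def normal_mce(mu, sigma, lam):
    """Certainty-equivalent vector mu - 0.5 * Sigma @ lambda under normality."""
    return mu - 0.5 * sigma @ lam

def normal_cgf(mu, sigma, s):
    """Cumulant-generating function of N(mu, Sigma) evaluated at s."""
    return s @ mu + 0.5 * s @ sigma @ s

# Hypothetical inputs: two attributes (e.g. wealth and sentiment).
mu = np.array([0.0048, 0.0])
sigma = np.array([[0.0451**2, 0.0002],
                  [0.0002, 0.0451**2]])
lam = np.array([3.0, 1.5])  # one fixed realisation of the risk aversions

c = normal_mce(mu, sigma, lam)
lhs = np.exp(-lam @ c)                     # kernel evaluated at the certainty equivalent
rhs = np.exp(normal_cgf(mu, sigma, -lam))  # m_A(-lambda) in closed form

# Monte Carlo cross-check of E[exp(-lambda'A)]
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(mu, sigma, size=200_000)
mc = np.exp(-draws @ lam).mean()
```

The first two quantities agree exactly by construction; the simulated value agrees up to sampling error.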
Considering the general case under normality, the distribution of state-dependent CE can be found from Corollary 1.
Thus, denoting the pdf of $\lambda$ as $\mathrm{pdf}_\lambda(\cdot)$ and the pdf of $C$ as $\mathrm{pdf}_C(\cdot)$, we see that in this case. In the case of $\lambda$ discrete, and $A$ multivariate normal, different vectors of $C$ will have different probabilities associated with them. Given we have a pdf for $C$, we can of course calculate unconditional moments. In the case of normal where $E_\lambda(\cdot)$ indicates expectation over the $\lambda$ distribution. In the general case, expectations can be taken with respect to (8).
An alternative representation of C(λ) can be put in terms of generalised inverses, similar to Duncan (1977). Duncan derives the generalised inverse A − (following Moore 1920, see also Rao and Mitra 1972) of a matrix A (N × m) by the following equation In our case we need to 'invert' the vector We can define C(λ) = −λ −1 K A (−λ), where λ −1 is the generalised inverse of λ, and we have dropped 1/N. We now discuss issues of estimation. whereK Proof This follows immediately from the consistency properties of the empirical moment generating function (emgf).
We note that the above will generalise to stationary stochastic processes where the emgf consistently estimates the mgf of the stochastic process. Some insight into the properties of MCE can be gained from our numerical examples in Section 5.
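As an illustration of estimation via the empirical moment generating function, the sketch below estimates $K_A(-\lambda)$ from a sample and compares it with the normal closed form. The estimator shown (log of the sample mean of $\exp(s' A_t)$) and all parameter values are assumptions made for illustration.

```python
import numpy as np

def empirical_cgf(samples, s):
    """Log of the empirical mgf: ln( (1/T) * sum_t exp(s'A_t) )."""
    return np.log(np.mean(np.exp(samples @ s)))

# Hypothetical data-generating process for the attribute vector A_t
rng = np.random.default_rng(1)
mu = np.array([0.0048, 0.0])
sigma = np.array([[0.0451**2, 0.0002],
                  [0.0002, 0.0451**2]])
samples = rng.multivariate_normal(mu, sigma, size=100_000)

lam = np.array([3.0, 1.5])
k_hat = empirical_cgf(samples, -lam)          # estimate of K_A(-lambda)
k_true = -lam @ mu + 0.5 * lam @ sigma @ lam  # normal closed form
lam_c_hat = -k_hat                            # implied estimate of lambda'C(lambda)
```

For larger risk aversions the exponential weights become more dispersed, so the estimate would be expected to need longer samples.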
Next, consider a discrete distribution for stochastic risk aversion of the following form, where the $(N \times 1)$ vector $\lambda$ takes the following values, and where the $i$th state has probability $P_i$ of occurrence; $0 \le P_i \le 1$; $i = 1, \ldots, m$; $\sum_{i=1}^{m} P_i = 1$. Thus for $N$ attributes, we have $m$ distinct states of the world. We call this model the conditional certainty equivalence model.
The advantage of this specification is that we can condition on state i to compute MCE. Denote the conditional certainty equivalence of state i by MCE i . This leads to our next proposition: Proposition 3 The Unconditional Certainty Equivalence, where E λ ( ) involves taking expectations over pdf(λ) is: Proof Result follows immediately by the law of total probability.
We note that some of these probabilities (P i ) may be difficult to estimate in practice. However, if we do know the number of states (m) or may be prepared to propose such a number, we can define the 'equi-probable' MCE by the formula:
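A minimal sketch of the conditional certainty equivalence model follows. It assumes normal attributes, so that each state's MCE is $\mu - \tfrac{1}{2}\Sigma\lambda_i$, and it reads the unconditional and equi-probable versions as the $P_i$-weighted and simple averages of the state values. The three states and their probabilities are hypothetical.

```python
import numpy as np

mu = np.array([0.0048, 0.0])
sigma = np.array([[0.0451**2, 0.0002],
                  [0.0002, 0.0451**2]])

# m = 3 hypothetical states; each row is the risk-aversion vector in that state
lam_states = np.array([[2.0, 1.0],
                       [4.0, 2.0],
                       [8.0, 6.0]])
probs = np.array([0.5, 0.3, 0.2])

# Conditional certainty equivalents MCE_i (Sigma is symmetric, so row i is (Sigma @ lambda_i)')
mce_states = mu - 0.5 * lam_states @ sigma

unconditional = probs @ mce_states      # probability-weighted average over states
equiprobable = mce_states.mean(axis=0)  # equal weights 1/m when the P_i are unknown
```

Because the normal-case MCE is linear in $\lambda$, the unconditional value equals the MCE evaluated at the mean risk aversion; that shortcut would not carry over to non-normal attributes.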

Scale gamma distributed risk aversion
We have seen in the above that the specific structure of the risk aversion data determines the properties of MCE. With that in mind, we look for a distribution that is analytically tractable and also flexible enough to admit multiple patterns. In this subsection we consider some examples for pdf($\lambda$), the joint distribution of risk aversions. We shall use some of these models of utility in our empirical section. Of interest to us are the consequences of assuming dependent and independent risk aversions within the same framework. We shall focus on scale gamma distributions for risk aversion due to their tractability, known moment generating functions, and the availability of multivariate generalisations. Suppose that pdf($\lambda$) consists of independent scale gamma distributions. Thus, where $\theta_j$ is the inverse of the scale, $\alpha_j$ is the exponent, and $\Gamma(\alpha_j)$ is the gamma function evaluated at $\alpha_j$. We now have the following, from (19), and noting that A is an This tells us that $U(A)$ is multiplicatively separable and that each attribute has HARA utility, $-(1 + a_j/\theta_j)^{-\alpha_j}$, where the exponent $\alpha_j$ and the scale $(1/\theta_j)$ both determine the risk characteristics of the attribute. More general univariate results of this kind have been provided by Brockett and Golden (1987). This does raise the question as to the form of the utility if we did not assume independence in (19), but assumed, say, a multivariate gamma distribution such as the McKay (1934) bivariate gamma distribution given by for $\lambda_1 > \lambda_2 > 0$, and $c$, $d$, $p$, $q$ all positive. This is equivalent to assuming $\lambda_1 = x_1 + x_2$ and $\lambda_2 = x_2$ where the $x_i$ are independent scale gammas $(\theta_i, \alpha_i)$.
Proposition 4 If λ j are independent scale gamma distributions with probability density functions given by (19) then the mixex utility function is given by (22) which is multiplicative HARA with For the case N = 2, and assuming that λ j have dependent scale gamma distributions, described by where the x i follow independent scale gamma distributions with parameters (θ i , α i ), then the mixex utility function, which we call dependent HARA utility, is given by (23), where Proof The first case, assuming independence is trivial. For the second case of dependence in risk aversion, consider the case with Comparing with (20) we see that we lose the multiplicative HARA structure. Adding further λ's in a hierarchical manner leads to obvious generalisations. The risk aversions λ 1 and λ 2 are now positively affiliated †, and so Theorem 6 of Tsetlin and Winkler (2009) states that absolute risk aversion (ARA) is decreasing in the other attribute.
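Both cases can be checked by simulation. The sketch below draws gamma risk aversions with rate $\theta_j$ and shape $\alpha_j$ and compares the simulated value of $-E[\exp(-\lambda' A)]$ with closed forms. The independent form is the product of $-(1 + a_j/\theta_j)^{-\alpha_j}$ terms. The dependent closed form is derived here from the construction $\lambda_1 = x_1 + x_2$, $\lambda_2 = x_2$ and should be read as an assumption about the shape of dependent HARA utility; all parameter values are hypothetical.

```python
import numpy as np

def independent_hara(a, theta, alpha):
    """-prod_j (1 + a_j/theta_j)^(-alpha_j): independent gamma risk aversions."""
    return -np.prod((1.0 + a / theta) ** (-alpha))

def dependent_hara(a1, a2, theta, alpha):
    """Utility implied by lambda_1 = x1 + x2, lambda_2 = x2 with independent gammas x1, x2."""
    return -((1.0 + a1 / theta[0]) ** (-alpha[0])
             * (1.0 + (a1 + a2) / theta[1]) ** (-alpha[1]))

rng = np.random.default_rng(2)
theta = np.array([2.0, 3.0])  # rates (inverse scales), hypothetical
alpha = np.array([1.5, 2.5])  # shapes (exponents), hypothetical
a = np.array([0.4, 0.7])      # attribute levels at which utility is evaluated
n = 400_000

# Column j holds draws of x_j ~ Gamma(shape=alpha_j, scale=1/theta_j)
x = rng.gamma(shape=alpha, scale=1.0 / theta, size=(n, 2))

mc_indep = -np.mean(np.exp(-x @ a))                      # lambda = (x1, x2)
lam_dep = np.column_stack([x[:, 0] + x[:, 1], x[:, 1]])  # lambda = (x1 + x2, x2)
mc_dep = -np.mean(np.exp(-lam_dep @ a))
```

In the dependent case the second factor depends on $a_1 + a_2$, which is where the multiplicative separability is lost.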
Differentiating the first case (20) twice with respect to $a_j$, we see that (a) we have positive and decreasing marginal utility, and (b) zero cross-derivatives. In particular, absolute risk aversion for $a_j$ is independent of all other attributes so that we have a constant absolute risk aversion with respect to changes in other attributes, and DARA with respect to itself. This is consistent with Theorem 6, Corollary 2 of Tsetlin and Winkler (2009). ‡ Brockett and Golden (1987) also provide a more general version of Theorem 2 of Tsetlin and Winkler (2009) that includes positive measures as a replacement for pdf($\lambda$). This is based on results of Bernstein (1929), Widder (1931), and simplification by Tamarkin (1931). Using such a result and the assumption where $m(\cdot)$ indicates measure, § and in this case, will give us a utility function of the form so that we can also recover products of power utility (generalised Cobb-Douglas utility) as well. Examples of this include making relative risk aversion dependent on state variables (e.g. Brandt and Wang 2003, Guo and Whitelaw 2006, Guo et al. 2013). In our context this would mean that the $a_i$ terms are themselves functions of other variables.
† Karlin and Rinott (1980), Ex. 3.5, p. 482. A detailed derivation of this point is not included for brevity, but would require $\alpha_2 > 1$. If $0 < \alpha_2 < 1$, the attributes would be negatively affiliated.
‡ We delay discussion of (22) until Section 2.4.
§ The results rest on the characterisation of Laplace transforms on $[0, \infty)$ of positive Borel measures (the Bernstein-Widder theorem), where a Borel measure on a topological space is a measure that is defined on all open sets, a measure being a function assigning a nonnegative real number or $+\infty$ to certain subsets of a set.
Furthermore, following Brockett and Golden (1987), we are able to obtain expressions for Stone-Geary utility functions in terms of appropriate measures, as well as many of the multivariate utility functions appearing in the economics literature.

Risk aversion for multiattribute utility functions
In this subsection we will compare risk aversion under the multiplicatively separable form of the utility function (which we will denote V (a 1 , a 2 )), and the dependent attribute risk aversion form (denoted W (a 1 , a 2 )). Consider firstly the bivariate independent HARA utility function as per Proposition 4.
where a 1 is the wealth attribute, and a 2 is the other attribute.
Taking the first and second derivatives of the utility function with respect to wealth gives: We thus obtain the absolute risk aversion of $V$ with respect to $a_1$ ($ARA_V$) for the independent risk-aversion case as: which is declining in $a_1$ as We notice that $ARA_V$ does not depend on $a_2$, so that absolute risk aversion of $a_1$ will be constant with respect to changes in $a_2$. This is because of the assumed independence between the attribute risk aversions.
Similarly, we can find the relative risk aversion ($RRA_V$) as which is increasing in $a_1$, as Now, consider the case where the attribute risk aversions are not independent. We have, following the previous notation Taking the first and second partial derivatives with respect to $a_1$ gives us simplifying, we are able to obtain the absolute risk aversion in the dependent risk aversion case ($ARA^{a_1}_W$) as The partial derivative of $ARA^{a_1}_W(a_1, a_2)$ with respect to $a_1$ is negative, as the numerator is a quadratic expression in $a_1$, and the denominator is a cubic expression in $a_1$, and all terms are positive. Thus we have decreasing absolute risk aversion.
The second term w.r.t. a 2 in ARA 1 W (a 1 , a 2 ) is difficult to directly evaluate but again we can appeal to the same result in Tsetlin and Winkler (2009). Our risk aversions are affiliated by construction (see footnote 2), and so an increase in a 2 must decrease absolute risk aversion in a 1 .
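These comparative statics can be checked with finite differences. The sketch assumes the bivariate forms $V = -(1 + a_1/\theta_1)^{-\alpha_1}(1 + a_2/\theta_2)^{-\alpha_2}$ and $W = -(1 + a_1/\theta_1)^{-\alpha_1}(1 + (a_1 + a_2)/\theta_2)^{-\alpha_2}$; the second is derived from $\lambda_1 = x_1 + x_2$, $\lambda_2 = x_2$ and is an assumption about the dependent case. Under the first form, differentiation gives $ARA_V = (\alpha_1 + 1)/(\theta_1 + a_1)$. Parameter values are hypothetical.

```python
THETA = (2.0, 2.0)  # hypothetical rates, equal as in the special case theta_1 = theta_2
ALPHA = (1.5, 1.5)  # hypothetical exponents, alpha_1 = alpha_2

def v_indep(a1, a2):
    """Independent (multiplicative) HARA utility."""
    return -((1 + a1 / THETA[0]) ** (-ALPHA[0]) * (1 + a2 / THETA[1]) ** (-ALPHA[1]))

def w_dep(a1, a2):
    """Dependent HARA utility implied by lambda_1 = x1 + x2, lambda_2 = x2."""
    return -((1 + a1 / THETA[0]) ** (-ALPHA[0])
             * (1 + (a1 + a2) / THETA[1]) ** (-ALPHA[1]))

def ara_wrt_a1(u, a1, a2, h=1e-4):
    """Absolute risk aversion -u_11 / u_1 via central finite differences."""
    u1 = (u(a1 + h, a2) - u(a1 - h, a2)) / (2 * h)
    u11 = (u(a1 + h, a2) - 2 * u(a1, a2) + u(a1 - h, a2)) / h**2
    return -u11 / u1

a1 = 0.5
ara_v_low, ara_v_high = ara_wrt_a1(v_indep, a1, 0.2), ara_wrt_a1(v_indep, a1, 1.0)
ara_w_low, ara_w_high = ara_wrt_a1(w_dep, a1, 0.2), ara_wrt_a1(w_dep, a1, 1.0)
ara_w_more_wealth = ara_wrt_a1(w_dep, 1.0, 0.2)

closed_form_v = (ALPHA[0] + 1) / (THETA[0] + a1)
```

In this example the independent case returns the same value at both levels of $a_2$, while the dependent case returns a smaller value at the higher $a_2$ and at the higher $a_1$.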
Next, consider the special case of α 1 = α 2 and θ 1 = θ 2 ; this is the case used in our empirical application. We will drop the subscripts for clarity. Then, our expression for ARA 1 W (a 1 , a 2 ) becomes

Equity investment under general risk aversions
In this section, we address portfolio construction for mixex utility. We find that the optimal portfolio depends not just on the usual Markowitz portfolio, but also on a hedging portfolio, which offsets attribute risk. We assume that the attributes $A$ of dimension $(N \times 1)$ and asset returns $R$ of dimension $(m \times 1)$ are jointly multivariate normal, but make no assumptions at this stage regarding pdf($\lambda$). Suppose also that $a_1$ is an equity portfolio, hence a scalar; indeed $A = (a_1, A_2')'$, where $A_2$ is an $(N - 1) \times 1$ vector. Furthermore, we have $a_1 = \omega' R$ and $\mu_1 = \omega' \mu_R$, where $R$ is the rate of return to $m$ assets, $\mu_R$ is an $(m \times 1)$ vector of expected rates of return of the individual assets and $\Omega$ is an $(m \times m)$ positive definite matrix containing the variances and covariances of the $m$ asset rates of return. We define an $(N \times N)$ matrix,
$$\Sigma = \begin{pmatrix} \sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{pmatrix},$$
where $\Sigma_{12}$ is a $(1 \times (N - 1))$ vector. In this setup, $\sigma_{11}$, a scalar, is given by $\omega' \Omega \omega$, and is the variance of $a_1$. The matrix $\Sigma_{12} = \omega' \Omega_{12}$, where $\Omega_{12}$ is $(m \times (N - 1))$ and equals cov($R$, $A_2$), that is, the covariance between the $m$ investible assets and the $N - 1$ attributes other than wealth. We also define the matrix $\Sigma_{22}$, $((N - 1) \times (N - 1))$, as the covariance matrix of $A_2$. Then we partition $\mu$ into $(\mu_1, \mu_2)$ and $\lambda$ into $(\lambda_1, \lambda_2)$, where $\mu_1$ is the mean value of returns and $\mu_2$ is an $(N - 1) \times 1$ vector of the means of the other attributes, and $\lambda_1$ is the risk aversion to wealth and $\lambda_2$ is an $(N - 1) \times 1$ vector of risk aversions to the other $N - 1$ attributes. We then write, with $V$ here defined as the expected utility,
$$V = -\int \exp\left(-\lambda'\mu + \tfrac{1}{2}\lambda'\Sigma\lambda\right)\mathrm{pdf}(\lambda)\,d\lambda. \tag{39}$$
To simplify further analysis, we write the exponential expression inside the integral defined in (39) as $g(\lambda)$, where
$$g(\lambda) = \exp\left(-\lambda_1\omega'\mu_R - \lambda_2'\mu_2 + \tfrac{1}{2}\lambda_1^2\,\omega'\Omega\,\omega + \lambda_1\,\omega'\Omega_{12}\lambda_2 + \tfrac{1}{2}\lambda_2'\Sigma_{22}\lambda_2\right). \tag{40}$$
Taking the first-order conditions of the expression $V$ with respect to the portfolio weights, $\omega$, we obtain
$$\int g(\lambda)\left(\lambda_1\mu_R - \lambda_1^2\,\Omega\,\omega - \lambda_1\Omega_{12}\lambda_2\right)\mathrm{pdf}(\lambda)\,d\lambda = 0, \tag{41}$$
which can be rewritten as
$$\mu_R\,E_\lambda[\lambda_1 g(\lambda)] - \Omega\,\omega\,E_\lambda[\lambda_1^2 g(\lambda)] - \Omega_{12}\,E_\lambda[\lambda_1\lambda_2\,g(\lambda)] = 0. \tag{42}$$
Solving for $\omega$ leaves us with the expression
$$\omega = \Omega^{-1}\mu_R\,\frac{E_\lambda[\lambda_1 g(\lambda)]}{E_\lambda[\lambda_1^2 g(\lambda)]} - \Omega^{-1}\Omega_{12}\,\frac{E_\lambda[\lambda_1\lambda_2\,g(\lambda)]}{E_\lambda[\lambda_1^2 g(\lambda)]}. \tag{43}$$
Now, it is apparent that $g(\lambda)$ depends on $\omega$ as well, so it is best to understand (43) as an updating equation.
Let $\omega_n$ be the $n$th iteration of $\omega$, and thus we have
$$g_n(\lambda) = \exp\left(-\lambda_1\omega_n'\mu_R - \lambda_2'\mu_2 + \tfrac{1}{2}\lambda_1^2\,\omega_n'\Omega\,\omega_n + \lambda_1\,\omega_n'\Omega_{12}\lambda_2 + \tfrac{1}{2}\lambda_2'\Sigma_{22}\lambda_2\right).$$
In the limit, $\omega$ As is usual in many portfolio problems, we have not constrained the weights, and we assume there is an implicit cash position equal to $1 - i'\omega$, where $i$ is a vector of ones. We do not provide a proof that $\omega_n$ converges to $\omega$. Conditions for convergence in such systems are provided by Ljung (1977), although these are difficult to verify in practice (see also Stephens 1948 for a discussion of difference equations). It may be possible to substitute known distributions for pdf($\lambda$) and simplify (45) to (47). Anticipating our empirical work in Section 5, one can compute the expressions $\Omega^{-1}\Omega_{12}$ in (48) and, deriving $\omega$, we find a set of constraints that $h_\infty$ and $p_\infty$ must satisfy. However, in our example, $N = 2$ and $m = 1$, so this does not lead to a unique solution. † This structure consists of the market portfolio and $N - 1$ hedging portfolios weighted by the elements of $p_\infty$. Due to the multivariate normal distribution of attributes, the structure is similar to many models in financial economics, including the single period model of portfolio choice with background risk of Jiang et al. (2010) and the multiperiod ICAPM of Merton (1973). In the former case, the interpretation of the hedging portfolio is essentially similar to ours, where background risk corresponds to the various non-wealth attributes. In the multiperiod case of Merton (1973), however, the hedging is with respect to changes in the investment opportunity set, and thus the interpretation is slightly different. Merton shows (Theorem 1, p. 878) that in a world with a constant investment opportunity set, there is no exposure to the hedging portfolio. ‡ Our (48) bears some resemblance to that of Li and Ziemba (1989, Lemma 5).
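The updating scheme can be sketched for $N = 2$, $m = 1$ and a discrete pdf($\lambda$). Each pass evaluates $g_n$ at the current weight, forms the ratios $E_\lambda[\lambda_1 g_n]/E_\lambda[\lambda_1^2 g_n]$ and $E_\lambda[\lambda_1\lambda_2 g_n]/E_\lambda[\lambda_1^2 g_n]$, and re-solves the first-order condition for the weight. The names `h` and `p` mirror $h_\infty$ and $p_\infty$, but their exact definitions here, the two risk-aversion states, and the return-sentiment correlation of 0.1 are assumptions made for illustration.

```python
import numpy as np

def kernel(w, mu_r, var_r, cov_rs, mu_s, var_s, lam1, lam2):
    """g(lambda) evaluated at weight w for one risky asset and one other attribute."""
    return np.exp(-lam1 * w * mu_r - lam2 * mu_s
                  + 0.5 * lam1**2 * w**2 * var_r
                  + lam1 * w * cov_rs * lam2
                  + 0.5 * lam2**2 * var_s)

def optimal_weight(mu_r, var_r, cov_rs, mu_s, var_s, lam, probs,
                   tol=1e-12, max_iter=500):
    """Successive updating of the first-order condition; returns (w, h, p)."""
    lam1, lam2 = lam[:, 0], lam[:, 1]
    w = 0.0
    for _ in range(max_iter):
        g = kernel(w, mu_r, var_r, cov_rs, mu_s, var_s, lam1, lam2)
        denom = probs @ (lam1**2 * g)
        h = probs @ (lam1 * g) / denom           # scales the Markowitz term
        p = probs @ (lam1 * lam2 * g) / denom    # scales the hedging term
        w_new = (h * mu_r - cov_rs * p) / var_r
        if abs(w_new - w) < tol:
            return w_new, h, p
        w = w_new
    return w, h, p

def foc_residual(w, mu_r, var_r, cov_rs, mu_s, var_s, lam, probs):
    """Left-hand side of the first-order condition at weight w."""
    lam1, lam2 = lam[:, 0], lam[:, 1]
    g = kernel(w, mu_r, var_r, cov_rs, mu_s, var_s, lam1, lam2)
    return probs @ (g * (lam1 * mu_r - lam1**2 * var_r * w - lam1 * lam2 * cov_rs))

# Monthly moments quoted in the data description; the covariance is hypothetical
params = dict(mu_r=0.0048, var_r=0.0451**2, cov_rs=0.1 * 0.0451**2,
              mu_s=0.0, var_s=0.0451**2,
              lam=np.array([[2.0, 1.0], [4.0, 3.0]]),  # two hypothetical states
              probs=np.array([0.5, 0.5]))

w_star, h_star, p_star = optimal_weight(**params)
residual = foc_residual(w_star, **params)
```

With returns of this magnitude $g_n$ changes very little between passes, so the iteration settles quickly in this example; convergence in general is not guaranteed by the sketch.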
To make the result less abstract, we consider a worked example in Section 4. We implement a bivariate utility function, with attributes of wealth and sentiment. Sentiment is used as a proxy for aggregate 'optimism,' with the general understanding that sentiment refers to non-fundamental factors that drive asset prices away from equilibrium values. We stress that our interest in sentiment as an attribute is simply as an example; arguments applied here could be used for other attributes.

A worked example
In this section we explore the impact of bivariate utility functions on portfolio choice. We obtain the optimal portfolio allocation in the classic asset allocation problem of one risky and one riskless good under two alternatives based on (i) independent attribute risk aversion and (ii) dependent attribute risk aversion, where our attributes are wealth and sentiment; this is based on our calculations in Section 2. Whilst our model is a one-period model, we note that the joint distribution of sentiment and wealth may follow a non-i.i.d. stochastic process.
† An alternative approach to trace through the iterative calculations was not available to us as we employed grid search optimisation rather than successive updating.
‡ Practitioner versions of the hedging result might be holding the market portfolio but hedging exposures to value, size, and momentum.

Data
Our empirical data utilise both returns and sentiment data at a monthly level. We use data on U.S. equity returns over the period July 1965 to September 2015 from Ken French's website.† We refer to the excess returns on equity for month t as R_t henceforth, distributed with mean 0.48% and standard deviation 4.51% per month. We obtain data on sentiment from Jeffrey Wurgler's website, as used in Baker and Wurgler (2006).‡ Specifically, we use the monthly version of the sentiment index, SENT⊥, which has been orthogonalised to remove economic fundamentals, over the maximum period of availability, from July 1965 until September 2015, a period of 602 months. This sentiment index is constructed using principal components of five factors: the value-weighted dividend premium, the number of IPOs, first-day returns on IPOs, the closed-end fund discount, and the equity share in new issues. By construction, changes in the value of the sentiment index are distributed with mean 0 and standard deviation 1. We scale the sentiment index by multiplying by 0.0451 so as to equalise its standard deviation with that of the returns index, and denote scaled sentiment in month t as S_t. The scaling ensures that the risk aversions to both attributes (returns and sentiment) will be on a broadly similar scale; α_1 = α_2 and θ_1 = θ_2 in the terminology of Proposition 4, see also (37). Readers should be aware that high sentiment is associated with a prevailing view that stocks are over-valued.

Model estimation
Our calculations will therefore be based upon the conditional joint distribution of sentiment and wealth at time t + 1, conditional upon information known at time t and variations thereof. With that in mind, we initially assume that the stochastic process for A_t, which is a (2 × 1) vector (N = 2), is a low-order vector autoregression of order p (VAR(p)), where the innovation V_t is assumed i.i.d. with mean zero and a constant covariance matrix. We first estimated VARs of orders 1, 2, and 3, and used the Akaike Information Criterion (AIC) to determine the appropriate order p. We find that p = 2 generates the lowest AIC (value 5.292). We rechecked the equations for p = 2 and eliminated any individual variables with insignificant coefficients. The preferred models are that sentiment depends on two lags each of excess returns and sentiment, while excess returns depend on a single lag of sentiment and excess returns. These parsimonious VAR models are presented in table 1. The sentiment index varies less on a monthly basis than do returns. As a result, S_t is highly predictable relative to R_t based on our preferred models; the R² of the resulting VAR equation for sentiment is 0.9674, while for returns it is 0.0115. Using the results of table 1, we can compute both the conditional mean of our joint distribution and also the estimated residuals, which we can test for cross- and autocorrelation.
† The data are available at http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html. We thank Kenneth French for making these data available for research.
‡ The data are available at http://people.stern.nyu.edu/jwurgler/. We thank Jeffrey Wurgler for making these data available for research.
We test for independence of the residuals by sorting the residual pairs into quintiles and calculating the chi-squared statistic for the resulting 5 × 5 contingency table (distributed χ²(16) under the null, with a 5% critical value of 26.296), which returns a test statistic of 26.00. Thus, we do not reject the null hypothesis of independence at the 5% level, but we would reject the null at the 10% level.
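The quintile-based test described above can be sketched as follows. This is a minimal illustration using only the Python standard library and simulated data; the function names and the simulated inputs are assumptions for the example and are not the authors' code or residuals. The 5% critical value is the one quoted in the text.

```python
import random

def quintile_labels(values):
    """Assign each observation a quintile label 0..4 according to its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(5 * rank // n, 4)
    return labels

def chi2_independence(x, y):
    """Pearson chi-squared statistic for the 5x5 quintile table of (x, y)."""
    qx, qy = quintile_labels(x), quintile_labels(y)
    n = len(x)
    table = [[0] * 5 for _ in range(5)]
    for a, b in zip(qx, qy):
        table[a][b] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(5)) for j in range(5)]
    stat = 0.0
    for i in range(5):
        for j in range(5):
            expected = row[i] * col[j] / n  # expected count under independence
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

CRITICAL_5PCT_DF16 = 26.296  # chi-squared(16) critical value quoted in the text

# Hypothetical stand-ins for the two residual series (independent by design).
rng = random.Random(0)
u = [rng.gauss(0.0, 1.0) for _ in range(600)]
v = [rng.gauss(0.0, 1.0) for _ in range(600)]
stat = chi2_independence(u, v)
print(f"statistic = {stat:.2f}, reject at 5%: {stat > CRITICAL_5PCT_DF16}")
```

The degrees of freedom are (5 − 1) × (5 − 1) = 16 because both margins are fixed by the quintile sort.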
In testing for cross- and autocorrelation, we find little evidence of any significant coefficients, noting that a type I error of 5% will throw up occasional rejections of the null. The residual covariance matrix, as described in (50), is given in (51). We observe from the residual covariance in (51) that the cross-correlations for X_t are statistically insignificant (p-value = 0.973).

Estimation of MCEs
We assume that the attributes are jointly normally distributed and use the mean-variance specification of the MCEs, assuming discrete states of the world as in (16) and Proposition 3. We also estimate the empirical moment generating function form of the attribute distribution to provide a comparison with the previous method, and so assess how strong our normality assumptions are. If we assume, as in Proposition 1, Corollary 1, that $C(\lambda) = \mu - \tfrac{1}{2}\Sigma\lambda$ for wealth and sentiment (where Σ denotes the covariance matrix of A), with A bivariate normal and A and λ independent, we can describe the mixex (state-contingent) certainty equivalence. In what follows, we define our various terms; wealth is attribute 1 and sentiment is attribute 2.
The joint pdf(λ) is described in the contingency table (table 2), indicating that the state of low risk aversion to wealth and low risk aversion to sentiment, λ = (α_1, β_1), arises with probability P_11. Likewise, the state corresponding to high risk aversion to wealth and low risk aversion to sentiment, λ = (α_2, β_1), arises with probability P_12. The precise nature of the states is left arbitrary, but we assume that $\sum_{i=1}^{2}\sum_{j=1}^{2} P_{ij} = 1$.
We have four states of the world where risk aversion differs, as set out in table 2. We have four vectors of conditional MCE. As an example, for λ = (α_1, β_1) we have
$$C(\alpha_1, \beta_1) = \begin{pmatrix} \mu_1 - \tfrac{1}{2}\left(\sigma_{11}\alpha_1 + \sigma_{12}\beta_1\right) \\ \mu_2 - \tfrac{1}{2}\left(\sigma_{12}\alpha_1 + \sigma_{22}\beta_1\right) \end{pmatrix}$$
with probability P_11. These certainty equivalents can be thought of as 'risk-adjusted' expected returns for wealth and sentiment, where the risk adjustments depend not just on σ_11 α_1 or σ_22 β_1, but also on the covariance σ_12 and the corresponding risk aversions β_1 and α_1. Thus, in a situation where σ_12 and β_1 are positive and large, the CE of wealth will be low, since our large risk aversion to uncertain future sentiment and its positive dependence with uncertain future wealth reduce the merits of uncertain future wealth.
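As a numerical illustration of the conditional MCEs described above, the sketch below evaluates μ − ½Σλ for each of the four risk-aversion states. It uses the summary statistics the paper reports for its worked example (monthly mean excess return 0.48%, common standard deviation 4.51%, correlation −0.079) and the high and low risk aversions of 9 and 1. The function name and the layout of the loop are assumptions made for this example.

```python
def conditional_mce(mu, cov, lam):
    """Mean-variance certainty equivalent vector mu - 0.5 * cov @ lam (bivariate)."""
    return [mu[i] - 0.5 * sum(cov[i][j] * lam[j] for j in range(2)) for i in range(2)]

# Summary statistics quoted in the paper (monthly, in decimals).
mu = [0.0048, 0.0]          # mean excess return, mean scaled sentiment
sd, rho = 0.0451, -0.079    # common standard deviation and correlation
cov = [[sd * sd, rho * sd * sd],
       [rho * sd * sd, sd * sd]]

# (risk aversion to wealth, risk aversion to sentiment) in each state.
for alpha, beta in [(1.0, 1.0), (1.0, 9.0), (9.0, 1.0), (9.0, 9.0)]:
    ce_wealth, ce_sentiment = conditional_mce(mu, cov, [alpha, beta])
    print(f"alpha={alpha:.0f}, beta={beta:.0f}: "
          f"CE wealth = {100 * ce_wealth:.3f}%, CE sentiment = {100 * ce_sentiment:.3f}%")
```

With high risk aversion to wealth and low risk aversion to sentiment, the wealth component is roughly −0.43% per month, which is in line with the certain loss discussed in the Results; any small difference would presumably come from rounding of the inputs.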

Empirical distribution for certainty equivalence.
An alternative to estimating certainty equivalents assuming normally distributed attributes is to take the empirical distribution and estimate the empirical moment generating function, where A_t = (a_1t, a_2t)′ is (2 × 1) and we do not specify P_ij. This leads to one expression for each pair (α_i, β_j), i, j = 1, 2.
We estimate the linear combinations of certainty equivalents using both the attribute normality specification and the empirical distribution, and calibrate them against each other. We note that this calculation fixes (α_i, β_j), so the results are conditional on them.
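The comparison between the two specifications can be sketched as below. The sketch assumes the cumulant generating function is taken as log E[exp(−λ′A)], so that under normality it equals −λ′μ + ½λ′Σλ; the sign convention, the discrepancy measure, the function names and the simulated (independent, normal) sample are all assumptions for illustration, since the paper's own expressions are not reproduced here.

```python
import math
import random

def empirical_cgf(sample, lam):
    """Log of the sample mean of exp(-lam'A_t) over the observed pairs."""
    vals = [math.exp(-(lam[0] * a1 + lam[1] * a2)) for a1, a2 in sample]
    return math.log(sum(vals) / len(vals))

def normal_cgf(mu, cov, lam):
    """Mean-variance counterpart: -lam'mu + 0.5 * lam'cov lam."""
    lin = lam[0] * mu[0] + lam[1] * mu[1]
    quad = sum(lam[i] * cov[i][j] * lam[j] for i in range(2) for j in range(2))
    return -lin + 0.5 * quad

# Hypothetical sample: independent normal draws with the paper's means and scale.
rng = random.Random(1)
sd = 0.0451
mu = [0.0048, 0.0]
cov = [[sd * sd, 0.0], [0.0, sd * sd]]
sample = [(rng.gauss(mu[0], sd), rng.gauss(mu[1], sd)) for _ in range(20000)]

lam = (9.0, 1.0)  # one fixed state (alpha_i, beta_j)
k_emp = empirical_cgf(sample, lam)
k_mv = normal_cgf(mu, cov, lam)
pct_discrepancy = 100.0 * (k_mv - k_emp) / abs(k_emp)
print(f"empirical = {k_emp:.5f}, mean-variance = {k_mv:.5f}, gap = {pct_discrepancy:.2f}%")
```

On genuinely normal data the two values should be close, so a large gap on real data would point to non-normality, more so when the risk aversions are large.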

Empirical estimation
We treat the residuals as strongly exogenous with respect to each other and use a bootstrap method to generate data based on our conditional means. We sample (with replacement) 300 pairs of residuals from the VAR model given by table 1 and combine these with the conditional means to reconstruct time series of future returns and sentiment. Thus, at a point in time, we have the same conditional mean, but 300 idiosyncratic residuals, which allows us to construct a conditional distribution at any time t. An alternative interpretation of this approach is that we have one history of 50 years of conditional means; attached to each conditional mean in the history are 300 alternative residual returns. Whilst this approach is not fully dynamic, it allows us to myopically optimise our utility functions, period by period. We do not address a full multi-period dynamic optimisation. For simplicity, we fix the interest rate r_f at 2.5% p.a. Further, we implement the assumption that α = α_1 = α_2 and θ = θ_1 = θ_2 so that the risk aversions are directly comparable as per (37). The parameter θ is set to 100 (converting the percentage returns to decimal); the arguments in the utility function are positive for our choices of a. Our choice for α is 9, a high risk aversion case that we discuss in more detail in Section 5.1. Initial wealth is set to unity at each point in time, so our experiment is effectively a series of one-period investment problems, where historical considerations determine the conditional means.
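The resampling step described above can be sketched as follows. This is an illustration only: the function name, the conditional mean and the residual pairs are hypothetical placeholders, not values from table 1. Residuals are drawn as pairs so that any contemporaneous dependence between the return and sentiment residuals is preserved.

```python
import random

def bootstrap_conditional_draws(cond_mean, residual_pairs, n_draws=300, rng=None):
    """Build n_draws simulated (return, sentiment) outcomes for one month.

    cond_mean      -- (conditional mean of returns, conditional mean of sentiment)
    residual_pairs -- list of (return residual, sentiment residual) from the fitted model
    """
    rng = rng or random.Random()
    drawn = rng.choices(residual_pairs, k=n_draws)  # sampling with replacement
    return [(cond_mean[0] + e_r, cond_mean[1] + e_s) for e_r, e_s in drawn]

# Hypothetical inputs for a single month.
residuals = [(0.031, -0.002), (-0.045, 0.004), (0.012, 0.001), (-0.008, -0.003)]
cond_mean = (0.005, 0.010)
draws = bootstrap_conditional_draws(cond_mean, residuals, rng=random.Random(42))
print(len(draws), draws[:2])
```

Repeating this for every month in the sample gives one simulated conditional distribution per month, each centred on that month's conditional mean.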
For the first case, the problem can be written as
$$\max_{\omega} E_t\big(V(a_{1,t+1}, a_{2,t+1})\big). \qquad (56)$$
We define E_t( ) to mean expectation conditional on all information up to time t. The joint process is not Markovian (see table 1): returns are Markovian but sentiment is not. We consider examples of the bivariate utility functions with scale gamma distributed risk aversions. Given the utility function V(a_1t, a_2t) described in (26), and W(a_1t, a_2t) described in (33), for each month we compute the optimal proportion (allocation) of equity ω by grid-searching ω between −0.5 and 1.5 over replications of the bootstrapped expected utility. Our output will be a series of 600 asset allocations.

Table 3. MCEs and CGFs under mean variance (MV) and non-parametric specifications. Example values of state-contingent Mixex Certainty Equivalents corresponding to (resp.) High (α_1 = β_1 = 9) and Low (α_2 = β_2 = 1) Risk Aversions to Returns and Sentiment, where the mean of equity returns (μ_1) is 0.48% per month, and the mean of sentiment (μ_2) is 0.0. (55). Cumulant Generating Function (CGF) Non-Parametric estimation is computed using (53). Percentage Discrepancy is the percentage error in the mean-variance approximation from the Non-parametric estimation of the cumulant generating function in (59).
In the second case, the problem can be written as in (56), except that we consider the conditional distribution of a_{1,t+1} given the historical value of a_{2,t}. The conditional distribution can be calculated as in (50) and the actual values of estimated parameters are listed in table 1. Since we are only interested in the conditional distribution of returns given sentiment, based on (50), and using the strong exogeneity of our residuals, we add a simulated residual from the return distribution to reconstruct the future distribution of returns. We would expect, relative to case (i), that output from this model should be less variable.
In the third case, we solve max_ω E_t(W(a_{1,t+1}, a_{2,t+1})), where W(a_1t, a_2t) is the utility function described in (33), in which risk aversions to wealth and sentiment are positively correlated (as in the assumptions of scale gamma distributions for risk aversions). Here, as in case (i), we assume the joint distribution of returns and sentiment rather than the conditional distribution of returns given sentiment as in case (ii).
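The optimisation used in each of these cases, a grid search over ω between −0.5 and 1.5 on the bootstrapped expected utility, can be sketched as below. The utility function here is a simple bivariate exponential stand-in, not the functions V or W from (26) and (33), and the mapping from ω to the first attribute (r_f plus ω times the excess return) is likewise an assumption made for the example.

```python
import math

def grid_search_weight(draws, utility, rf=0.0, lo=-0.5, hi=1.5, steps=200):
    """Return the equity weight on a regular grid that maximises average utility.

    draws   -- simulated (excess return, sentiment) pairs for the coming month
    utility -- function of (first attribute, second attribute)
    """
    best_w, best_u = None, -math.inf
    for k in range(steps + 1):
        w = lo + k * (hi - lo) / steps
        # Assumed wealth attribute: risk-free rate plus weight times excess return.
        eu = sum(utility(rf + w * r, s) for r, s in draws) / len(draws)
        if eu > best_u:
            best_w, best_u = w, eu
    return best_w, best_u

def stand_in_utility(a1, a2, alpha=9.0, beta=1.0):
    """Hypothetical exponential utility used only to exercise the search."""
    return -math.exp(-alpha * a1 - beta * a2)

# Symmetric two-point excess return with no sentiment shock: the optimum is no equity.
draws = [(0.10, 0.0), (-0.10, 0.0)]
w_star, u_star = grid_search_weight(draws, stand_in_utility)
print(w_star, u_star)
```

A grid search avoids any reliance on derivatives of the expected utility, at the cost of resolution limited by the step size (0.01 in this sketch).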
There is a fourth case, analogous to case (ii), but with correlated risk aversions, which we also report in the fourth column of table 4. To allow a full interpretation of the impact of the contingency table (pseudo-probabilities of states of the world) and of the levels of risk aversion to each attribute, we estimate a conditional certainty equivalence model. We use this to explore the impact of the bivariate utility function in scaling equity positions in periods of high and low sentiment. The model estimated is max E_t(U(a_{1,t+1}, a_{2,t+1})), where α_i is the risk aversion to wealth in state i and β_j is the risk aversion to sentiment in state j.
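One way to make the state-contingent utility concrete is the sketch below. It assumes the discrete mixex form U(a_1, a_2) = −Σ_i Σ_j P_ij exp(−α_i a_1 − β_j a_2), which is the natural discrete analogue of a mixture of exponentials but is not quoted from the paper. The symmetric contingency table (P_11 = P_22, P_12 = P_21 = 0.5 − P_11) and the risk aversions of 1 and 9 follow the paper's description of its conditional certainty equivalence exercise. Function names are hypothetical.

```python
import math

def contingency_table(p11):
    """Symmetric 2x2 table with P11 = P22 and P12 = P21 = 0.5 - P11."""
    p12 = 0.5 - p11
    return [[p11, p12], [p12, p11]]

def mixex_utility(a1, a2, table, alphas=(1.0, 9.0), betas=(1.0, 9.0)):
    """Assumed discrete mixex utility: a probability-weighted sum of exponentials."""
    return -sum(table[i][j] * math.exp(-alphas[i] * a1 - betas[j] * a2)
                for i in range(2) for j in range(2))

for p11 in (0.40, 0.25, 0.10):  # positive, zero and negative affiliation
    table = contingency_table(p11)
    print(p11, round(mixex_utility(0.005, 0.0, table), 4))
```

Under this assumed form the utility equals −1 when both attributes are zero, so values slightly above −1 correspond to small positive attribute outcomes.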

Results
In this section we present the estimation of the conditional MCEs, assuming the mean-variance specification of attribute risk aversion. This is presented in table 3 as described in Section 4.3. We also present two models of bivariate utility as discussed in Sections 2.3 and 2.4. We use two forms of conditioning: conditioning on all information up to time t and, alternatively, forecasting returns based on the conditional distribution of sentiment. Of particular interest is how the share of investment in the risky asset varies over time, measured by the standard deviation of the equity position, σ_ω. We present four cases: (i) model V(a_{1,t+1}, a_{2,t+1}), with sentiment and returns jointly bootstrapped; (ii) model V(a_{1,t+1}, a_{2,t+1}), with returns estimated based on the conditional distribution of sentiment; (iii) model W(a_{1,t+1}, a_{2,t+1}), with sentiment and returns jointly bootstrapped, and correlated risk aversions for sentiment and returns; and (iv) model W(a_{1,t+1}, a_{2,t+1}), with returns estimated based on the conditional distribution of sentiment.

Mixex certainty equivalents
Returning to our discussion in (52), we now illustrate the point numerically. We use the mean and standard deviation of equity returns and sentiment from the worked example in Section 4, μ_1 = 0.0048, μ_2 = 0.0, σ_1 = σ_2 = 0.0451, ρ_12 = −0.079, where ρ_12 represents the correlation between excess returns and the scaled sentiment index, estimated from the data. Risk aversion parameters are set at high and low values (respectively) of α_1 = β_1 = 9, α_2 = β_2 = 1, and P_ij arbitrary for all i, j. There is scant experimental evidence covering parameter estimation for multi-attribute utility functions. In the mixex setup, one might think of the choice of parameters of α_1 = 1 and α_2 = 9 as being similar, on average, to the risk aversion parameter of 4 chosen by Bjork et al. (2012). This worked example corresponds to average equity returns of 0.48% per month, sentiment with mean zero, and standard deviation equal to 4.51% per month for both returns and sentiment. Here, high risk aversion coefficients to both excess returns and sentiment are set to 9, and low risk aversion coefficients are set to 1.
The calculations that we present are conditional certainty equivalents, conditioning on four states of the world; numerical values are shown in table 3. To generate the levels of the certainty equivalents, we use (52), assuming that the attributes are normally distributed. The four cases of risk aversion to attributes we consider are presented in the columns, while the level of the certainty equivalent for the market risk premium (Mkt-Rf) is shown in the first row, and the level of the certainty equivalent for sentiment is shown in the second row. For cases of low risk aversion to returns (the first two columns), the certainty equivalent level is slightly positive, while high risk aversion (the second two columns) leads to a slightly negative certainty equivalent. Sentiment, with a mean of zero, has four negative certainty equivalents, but the magnitude is greater for higher levels of risk aversion (columns 2 and 4). Interestingly, the cross-terms (columns 2 and 3) are quite distinct. We calculate the percentage discrepancy as in (59). The discrepancy reflects the non-normality of the data, which is exacerbated if attribute risk aversions are especially large.
Table 4 notes: We estimate the optimal equity position each month based on bootstrap forecasts of returns (R) and sentiment (S) in the coming month. The four utility models are Case 1: bivariate utility function V(a_{1,t+1}, a_{2,t+1}) from (26) with returns and sentiment estimated jointly using bootstrapped forecasts; Case 2: bivariate utility function V(a_{1,t+1}, a_{2,t+1}) with returns estimated based on the conditional distribution of sentiment; Case 3: bivariate utility function W(a_{1,t+1}, a_{2,t+1}) from (33), which incorporates affiliation between attributes, with returns and sentiment estimated jointly using bootstrapped forecasts; and Case 4: bivariate utility function W(a_{1,t+1}, a_{2,t+1}) with returns distributed based on the conditional distribution of sentiment. Parameter values for the risk aversion distribution are θ = θ_1 = θ_2 = 100 and α = α_1 = α_2 = 9. The Average Equity Position is the time-series average of the step-ahead forecast utility maximising positions in equity. Std. Dev. Equity Position reports the standard deviation over time of the optimal forecast position in equity. Average Utility reports the average utility at the optimal position in equity.
The conditional MCE of sentiment on equity could take either sign, depending on the values of stochastic risk aversion. In particular, the conditional MCE becomes negative for excess returns when α_1 = 9. Here we are so risk averse that we prefer a certain loss of wealth (of −0.426%), conditional on α_1 = 9. Given the mean of sentiment is zero, we would expect that the MCE would always be slightly negative, but more so when β_2 = 9.
In the case of univariate certainty equivalence for an equity index, we might expect certainty equivalence to be positive for plausible values of fixed ARA. We now see that, introducing states of the world and stochastic risk aversion, situations arise when the MCE may be negative. What we have computed in table 3 is the MCE conditional upon a particular state, without specifying P_ij. If we treat the states as equally likely, P_ij = 1/4 for i = 1, 2 and j = 1, 2, then we can average the conditional MCEs to get the equiprobable MCE. The equiprobable MCE of returns is 0.18 bps per month, or 15 bps per annum. The equiprobable MCE of sentiment is −0.46 units per month. The fact that the equiprobable MCE of returns is so low suggests that some of the states of the world are much more likely than others.

Equity positions for bivariate mixex utility with scale-gamma risk aversion
We present the results of asset allocations in table 4. The average equity position, reported in row 1, is the mean proportion held in equity over the 600-month history. The standard deviation of the 600 equity positions is reported in row 2 of table 4. Higher levels of standard deviation would represent higher transaction costs incurred to maintain the desired utility level. The average level of optimised utility is reported in row 3 of table 4. We remind the reader that α in table 4 refers to the degrees of freedom of the scale gamma used in the construction of the utility function, while the αs in table 3, as described in (54), refer to the varying levels of absolute risk aversion to wealth.
Whereas the average equity position changes very little between Case 1 and Case 2, the volatility of the equity proportion is substantially reduced in Case 2, due to the reduction in volatility of the conditional distribution of R_{t+1}|S_t. In Case 3, where risk aversion to sentiment and risk aversion to returns are positively correlated, the utility gained from sentiment partially augments the equity-based utility. In effect, as predicted in (37), the impact of positively correlated risk aversion is to increase absolute risk aversion, and hence reduce the share invested in equity (given a fixed dollar investment). Because of the positive affiliation between the attributes of the utility function, Cases 3 and 4 exhibit lower equity volatility than Cases 1 and 2 (respectively) even though returns are forecast using the same distributions. Case 4 is similar in substance to Case 3, but with a higher average equity position and lower volatility in the equity position due to lower conditional volatility in the conditional equity distribution.
Table 5. Average and standard deviation of equity position conditional on sentiment for full mixex specification.
Case (i): P_11 = 0.40, P_12 = 0.10. Case (ii): P_11 = 0.25, P_12 = 0.25. Case (iii): P_11 = 0.10, P_12 = 0.40.
Notes: Contingency table as specified P_11 = P_22, P_12 = P_21 = 0.5 − P_11, for states of risk aversion to equity α_1 = 1, α_2 = 9, and risk aversion to sentiment β_1 = 1, β_2 = 9. P_11 refers to the pseudo-probability of risk aversion to equity value of α_1 and risk aversion to sentiment of β_1, and similar for other states. Equity returns are simulated as in

Equity positions for the conditional certainty equivalence model
In table 5 we report the average and standard deviation of equity positions conditional on the sentiment in the previous month, S_t. Here, we report the results of the probabilistic specification of the model, using the contingency table in table 2 to specify risk aversions for wealth α_1 = 1, α_2 = 9 and risk aversions to sentiment β_1 = 1, β_2 = 9. For example, P_11 = 0.40 would correspond to a pseudo-probability of 0.40 of the state of the world with low risk aversion to equity (α_1 = 1) and low risk aversion to sentiment (β_1 = 1). We consider three cases of the contingency table (all of which are symmetric): P_11 = 0.4, P_11 = 0.25 and P_11 = 0.1, where the off-diagonal terms are determined by taking P_12 = P_21 = 0.5 − P_11. These three cases indicate (i) positive affiliation between attributes, (ii) no affiliation between attributes, and (iii) negative affiliation between attributes. In each of the three cases we report the equity position for the two alternative return simulation processes as in table 4; either returns are simulated jointly with sentiment ({R_{t+1}, S_{t+1}}) or conditional on the level of sentiment ({R_{t+1}|S_t}).
The results in table 5 demonstrate a number of interesting features. Generally, the features of the mixex model are consistent across the three cases of affiliation between attributes; when returns are simulated jointly with sentiment, {R_{t+1}, S_{t+1}}, in periods of low sentiment the position in equity is scaled up, and in periods of high sentiment, the position in equity is scaled down. For example, the average equity position for the first case is 0.567 in periods of low sentiment, and −0.107 in periods of high sentiment. This is consistent with the interpretation of high sentiment presented in Section 4.1. The level of sentiment thus provides a hedge against the equity position for an individual with the bivariate mixex utility function.
When returns are simulated conditional upon sentiment ({R_{t+1}|S_t}), there is little variation in the equity position for the different levels of sentiment, and there is a lower standard deviation in the optimal equity position. Thus, although the optimal equity position changes over time, the return expectation has already taken the current level of sentiment into account.
The impact of decreasing affiliation between attributes (moving across the three cases we change from positive to negative affiliation) is to make the 'high-low' (off-diagonal values on the contingency table in table 2) risk aversion cases more likely than either 'high-high' or 'low-low' (diagonal values on the contingency table). Overall, this leads to a decreased dispersion in positions in equity across periods of high and low sentiment. For example, in periods of low sentiment, case (iii) has a 0.078 lower average position in equity than case (i), while in periods of high sentiment it is 0.026 higher. Negative affiliation between the attributes somewhat tempers the need for hedging. An examination of the average utility in each state demonstrates the difference between the cases. The average utility is much less variable across the states of sentiment for the conditionally estimated returns, for each of the three cases of attribute affiliation considered. For example, in case (ii), the average utility is essentially identical across the states of sentiment (−0.997) even though there is variation in the equity position. When the return distribution is jointly estimated with sentiment, {R_{t+1}, S_{t+1}}, it is generally observed that utility is higher in periods of high sentiment. For the case of negative affiliation (case (iii)), although the average equity position is less variable across the sentiment states than when attributes are positively affiliated (case (i)), the utility is actually more sensitive to the level of sentiment. Careful consideration must therefore be given to the degree of affiliation between attributes along both dimensions when using the bivariate mixex specification.

Conclusion
In this paper we have considered the modelling of investment decisions using mixex utility. We have analysed in detail the problem of investment choice between a risky and a risk-free asset for an investor with a bivariate mixex utility function. The first attribute in the investor's utility function is wealth, while the second attribute (sentiment, in our example) is non-financial and non-tradeable, and differs from background risk in that it is not measured in dollars. The key question we address is the extent to which an additional attribute affects the level of investment in the risky asset.
In addition, we define a state-dependent concept of certainty equivalence which varies with the stochastic risk aversion in our mixex model of utility. In particular, for the case of normally distributed attributes, we provide a formula for mixex certainty equivalence which is a generalisation of existing results in the mean-variance literature.
We consider the case where stochastic risk aversions follow a discrete distribution and present results for certainty equivalents in Proposition 4. We explore the modelling of risk aversion by the use of scale gamma distributions. These lead to different utility functions based on the dependence of risk aversions, as presented in Proposition 5. These different assumptions lead to different implications for risk aversion. We find that positively correlated risk aversions lead to increases in absolute risk aversion with respect to our first attribute. In the context of investment, where the first attribute is wealth, and where we have fixed initial wealth, we would expect a reduction in risky investment. This result is similar to results in the background risk literature.
We also provide a characterisation of an optimal portfolio for the mixex utility function when the attributes are multivariate normal. The solution for the optimal portfolio takes a familiar form, in that we hold cash, a Markowitz portfolio, as well as hedging portfolios against other attributes. This is similar to the fund separation theorems found elsewhere in portfolio theory.
Finally, we provide a worked example using U.S. equity and sentiment indices, from 1965-2015, where we discuss procedures involving a mix of estimation and simulation.
High risk aversion (to high sentiment) would make an investor highly risk averse to strongly sentimentally priced assets. In particular we find that when stochastic risk aversions to sentiment and wealth are positively correlated, the share optimally invested in equity is reduced relative to the case where the risk aversions are independent, as predicted by the results in Section 2.