#### Abstract

We extend the foundation of probability in samples with rare
events that are potentially catastrophic, called *black swans*, such as natural
hazards, market crashes, catastrophic climate change, and species extinction. Such events are generally treated as “outliers”
and disregarded. We propose a new axiomatization of probability requiring equal treatment in the measurement of rare and frequent events—the Swan Axiom—and characterize the subjective probabilities that the axioms imply: these are neither finitely additive nor countably additive
but a combination of both. They exclude countably additive probabilities
as in De Groot (1970) and Arrow (1971) and are a strict subset of
Savage (1954) probabilities that are finitely additive measures. Our subjective
probabilities are standard distributions when the sample has no
black swans. The finitely additive part, however, assigns more weight to
rare events than do standard distributions and in that sense explains the
persistent observation of “power laws” and “heavy tails” that eludes classic theory. The axioms extend earlier work by Chichilnisky (1996, 2000, 2002, 2009) to
encompass the foundation of subjective probability and axiomatic treatments
of subjective probability by Villegas (1964), De Groot (1963), Dubins and Savage (1965), Dubins
(1975), Purves and Sudderth (1976), and of choice under uncertainty by Arrow (1971).

#### 1. Introduction

*Black swans* are rare events with important consequences, such as market crashes, natural hazards, global warming, and major episodes of extinction. This article is about the foundations of probability when catastrophic events are at stake. It provides a new axiomatic foundation for probability requiring sensitivity to both rare and frequent events. The study culminates in Theorem 6.1, which proves the existence and representation of a probability satisfying three axioms. The last of these axioms requires sensitivity to rare events, a property that is desirable but not respected by standard probabilities. The article shows the connection between those axioms and the Axiom of Choice at the foundation of mathematics. It defines a new type of probability that coincides with standard distributions when the sample is populated only by relatively frequent events. Generally, however, these probabilities are a mixture of countably and finitely additive measures, assigning more weight to black swans than do normal distributions, and predicting more realistically the incidence of “outliers,” “power laws,” and “heavy tails” [1, 2].

The article refines and extends the formulation of probability in an uncertain world. It provides an argument, and a formalization, that probabilities must be additive functionals on $L_\infty(U)$ (where $U$ is a $\sigma$-field of “events” represented by their indicator functions, which are bounded and real valued) that are neither countably additive nor finitely additive. The contribution is to provide an axiomatization showing that subjective probabilities must lie in the full space $L_\infty^{\ast}$ rather than in $L_1$, as the usual formalization (Arrow [3]) forcing countable additivity implies. The new axioms refine both Savage's [4] axiomatization of finitely additive measures, and Villegas' [5] and Arrow's [3] axiomatizations that are based on countably additive measures, and extend both to deal more realistically with catastrophic events.

Savage [4] axiomatized subjective probabilities as finitely additive measures representing the decision makers' beliefs, an approach that can ignore frequent events, as shown in the appendix. To overcome this, Villegas [5] and Arrow [3] introduced an additional continuity axiom (called “*Monotone Continuity*”) that yields countable additivity of the measures. However, Monotone Continuity has unusual implications when the subject is confronted with rare events. For example, it predicts that in exchange for a couple of cents, one should be willing to accept a small risk of death (measured by a countably additive probability), a possibility that Arrow called “outrageous” [3, Pages 48–49]. This article defines a realistic solution: for some very large payoffs and in certain situations, one may be willing to accept a small risk of death, but not in others. This means that Monotone Continuity holds in some cases but not in others, a possibility that leads to the axiomatization proposed in this article and is consistent with the experimental observations reported by Chanel and Chichilnisky [6, 7].

The results are as follows. We show that countably additive measures are insensitive to *black swans*: they assign negligible weight to rare events, no matter how important these may be, treating catastrophes as outliers. Finitely additive measures, on the other hand, may assign no weight to frequent events, which is equally troubling. Our new axiomatization balances the two approaches and extends both, requiring sensitivity in the measurement of rare as well as frequent events. We provide an existence theorem for probabilities that satisfy our axioms, and a characterization of all that do.

The results are based on an axiomatic approach to choice under uncertainty and sustainable development introduced by Chichilnisky [8–10] and illuminate the classic issue of continuity that has always been at the core of “subjective probability” axioms (Villegas, [5], Arrow [3]). To define continuity, we use a topology that tallies with the experimental evidence of how people react to rare events that cause fear (Le Doux [11], Chichilnisky [12]), previously used by Debreu [13] to formalize a market's Invisible Hand, and by Chichilnisky [9, 12, 14] to axiomatize choice under uncertainty with rare events that inspire fear. The new results provided here show that the standard axiom of decision theory, Monotone Continuity, is equivalent to De Groot's Axiom that lies at the foundation of classic likelihood theory (Proposition 2.1) and that both of these axioms underestimate rare events no matter how catastrophic they may be. We introduce here a new Swan Axiom (Section 3) that logically negates them both, show it is a combination of two axioms defined by Chichilnisky [9, 14] and prove that any subjective probability satisfying the Swan Axiom is neither countably additive nor finitely additive: it has elements of both (Theorem 4.1). Theorem 6.1 provides a complete characterization of all subjective probabilities that satisfy linearity and the Swan Axiom, thus extending earlier results of Chichilnisky [1, 2, 9, 12, 14].

There are other approaches to subjective probability, such as the Choquet Expected Utility model (CEU, Schmeidler [15]) and Prospect Theory (Kahneman and Tversky [16, 17]). They use a nonlinear treatment of probabilities or likelihoods (see, e.g., Dreze [18], or Bernstein [19]), while we retain linear probabilities. Both have a tendency to give higher weight to small probabilities, and are theoretical answers to experimental paradoxes found by Allais in 1953 and Ellsberg in 1961, among others, refuting the *Independence Axiom* of the Subjective Expected Utility (SEU) model. Our work focuses instead directly on the foundations of probability by taking the logical negation of the *Monotone Continuity Axiom*. It is striking that weakening or rejecting this axiom (respectively, in decision theory and in probability theory) ends up in probability models that are more in tune with observed attitudes when facing catastrophic events. Presumably each approach has advantages and shortcomings. It seems that the approach offered here may be superior on four counts: (i) it retains linearity of probabilities, (ii) it identifies Monotone Continuity as the reason for underestimating the measurement of catastrophic events, an axiom that depends on a technical definition of continuity and has no other compelling feature, (iii) it seems easier to explain and to grasp, and therefore (iv) it may be easier to use in applications.

#### 2. The Mathematics of Uncertainty

*Uncertainty*

Uncertainty is described by a set of distinctive and exhaustive possible *events* represented by a family of sets $\{U_\alpha\}$, $\alpha \in N$, whose union describes a universe $\mathcal{U} = \bigcup_\alpha U_\alpha$. An event $U$ is identified with its *characteristic function* $\phi_U : \mathcal{U} \to \mathbb{R}$, where $\phi_U(x) = 1$ when $x \in U$ and $\phi_U(x) = 0$ when $x \notin U$. The subjective probability of an event $U$ is a real number $W(U)$ that measures how likely it is to occur according to the subject. Generally we assume that the probability of the universe is $1$ and that of the empty set is zero, $W(\emptyset) = 0$. In this article we make no difference between subjective probabilities and likelihoods, using both terms interchangeably. Classic axioms for subjective probability (resp. likelihoods) are provided by Savage [4] and De Groot [20]. The likelihood of two disjoint events is the sum of their likelihoods: $W(U_1 \cup U_2) = W(U_1) + W(U_2)$ when $U_1 \cap U_2 = \emptyset$, a property called *additivity*. These properties correspond to the definition of a probability or likelihood as a *finitely additive measure* on a family ($\sigma$-algebra) of measurable sets of $\mathcal{U}$, which is Savage's [4] definition of subjective probability. $W$ is countably additive when $W\left(\bigcup_{i=1}^{\infty} U_i\right) = \sum_{i=1}^{\infty} W(U_i)$ whenever $U_i \cap U_j = \emptyset$ if $i \neq j$. A *purely finitely additive probability* is one that is additive but not countably additive. Savage's subjective probabilities can be purely finitely additive or countably additive. In that sense they include all the probabilities in this article. However, as seen below, this article excludes probabilities that are either purely finitely additive or countably additive, and therefore our characterization of a subjective probability is strictly finer than that of Savage [4], and different from the view of a measure as a countably additive set function (e.g., De Groot [21]). The following axioms were introduced by Villegas [5] and others for the purpose of obtaining countable additivity.

*Monotone Continuity Axiom (MC) (Arrow [3])*

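For every two events $f$ and $g$ with $W(f) > W(g)$, and every *vanishing* sequence of events $\{E_\alpha\}_{\alpha = 1, 2, \ldots}$ (defined as follows: for all $\alpha$, $E_{\alpha+1} \subset E_\alpha$ and $\bigcap_{\alpha=1}^{\infty} E_\alpha = \emptyset$) there exists $N$ such that altering arbitrarily the events $f$ and $g$ on the set $E_i$, where $i > N$, does not alter the subjective probability ranking of the events, namely, $W(f') > W(g')$, where $f'$ and $g'$ are the altered events.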

This axiom is equivalent to requiring that the probability of the sets along a vanishing sequence goes to zero. Observe that the decreasing sequence could consist of infinite intervals of the form $(n, \infty)$ for $n = 1, 2, \ldots$. Monotone Continuity therefore implies that the likelihood of this sequence of events goes to zero, even though all its sets are unbounded. A similar example can be constructed with a decreasing sequence of bounded sets, for example $(0, 1/n)$ for $n = 1, 2, \ldots$, which is also a vanishing sequence as it is decreasing and its intersection is empty.
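As a numerical illustration of the remark above, the following minimal Python sketch evaluates two countably additive probabilities on the vanishing sequence of unbounded intervals $(n, \infty)$. The choice of the exponential and standard normal distributions, and the function names, are illustrative assumptions that do not come from the article.

```python
import math


def exponential_tail(n: float, rate: float = 1.0) -> float:
    """P(X > n) for an exponential distribution (hypothetical choice of measure)."""
    return math.exp(-rate * n)


def normal_tail(n: float) -> float:
    """P(X > n) for a standard normal distribution (hypothetical choice of measure)."""
    return 0.5 * math.erfc(n / math.sqrt(2.0))


# Probability assigned to the unbounded event (n, inf) along the vanishing sequence.
tails = [(n, exponential_tail(n), normal_tail(n)) for n in (1, 5, 10, 20)]
for n, exp_p, norm_p in tails:
    print(f"n={n:>2}  exponential: {exp_p:.3e}  normal: {norm_p:.3e}")
```

Under both distributions the weight of $(n, \infty)$ shrinks toward zero as $n$ grows, although every set in the sequence is unbounded.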

*De Groot's Axiom (De Groot, [20], Chapter 6, page 71)*

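If $A_1 \supset A_2 \supset \cdots$ is a decreasing sequence of events and $B$ is some fixed event that is less likely than $A_i$ for all $i$, then the probability of the intersection $\bigcap_{i=1}^{\infty} A_i$ is larger than that of $B$ (both comparisons are understood in the weak sense). We refer to this axiom as $SP_4$.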

The following proposition establishes that the two axioms presented above are one and the same; both imply countable additivity.

Proposition 2.1. *A relative likelihood (subjective probability) satisfies the Monotone Continuity Axiom if and only if it satisfies De Groot's Axiom $SP_4$. Each of the two axioms implies countable additivity.*

*Proof.* Assume that De Groot's Axiom $SP_4$ is satisfied. When the intersection of a decreasing sequence of events is empty, $\bigcap_i A_i = \emptyset$, and the set $B$ is less likely to occur than every set $A_i$, then $B$ must be as likely as the empty set; namely, its probability must be zero. In other words, if $B$ is more likely than the empty set, then regardless of how small the set $B$ is, it is impossible for every set $A_i$ to be as likely as $B$. Equivalently, the probability of the sets that are far away in the vanishing sequence must go to zero. Therefore $SP_4$ implies Monotone Continuity (MC). Conversely, assume that MC is satisfied. Consider a decreasing sequence of events $A_i$, write $A = \bigcap_{j=1}^{\infty} A_j$, and define a new sequence by subtracting from each set the intersection of the family, namely, $A_1 - A, A_2 - A, \ldots$. Let $B$ be a set that is more likely than the empty set but less likely than every $A_i$. Observe that the intersection of the new sequence is empty, $\bigcap_i (A_i - A) = \emptyset$, and since $A_i \supset A_{i+1}$ the new sequence is, by definition, a vanishing sequence. Therefore by MC, $\lim_i W(A_i - A) = 0$. Since $W(B) > 0$, $B$ must be more likely than $A_i - A$ from some $i$ onwards. Furthermore, $A_i = (A_i - A) \cup A$ and $(A_i - A) \cap A = \emptyset$, so that $W(A_i) > W(B)$ is equivalent to $W(A_i - A) + W(A) > W(B)$. Observe that $W(A) < W(B)$ would contradict this inequality for large $i$, since, as we saw above, $\lim_i W(A_i - A) = 0$ by MC. It follows that $W(A) \geq W(B)$, which establishes De Groot's Axiom $SP_4$. Therefore Monotone Continuity is equivalent to De Groot's Axiom $SP_4$. A proof that each of the axioms implies countable additivity is in Villegas [5], Arrow [3] and De Groot [20].
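The additive decomposition used in the second half of the proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the probability is taken to be an exponential distribution on $[0, \infty)$, the decreasing sequence is $A_i = [0, 1] \cup (i, \infty)$ with intersection $A = [0, 1]$, and the fixed event is $B = [0, 0.5]$. None of these choices, nor the function names, come from the article.

```python
import math

RATE = 1.0  # hypothetical rate of the exponential distribution playing the role of W


def w_interval(lo: float, hi: float) -> float:
    """W((lo, hi]) under the exponential distribution on [0, inf)."""
    return math.exp(-RATE * lo) - math.exp(-RATE * hi)


W_INTERSECTION = w_interval(0.0, 1.0)  # W(A), with A = [0, 1]
W_B = w_interval(0.0, 0.5)             # W(B), with B = [0, 0.5]


def w_remainder(i: int) -> float:
    """W(A_i - A) = W((i, inf)): the vanishing part of A_i."""
    return math.exp(-RATE * i)


def w_a(i: int) -> float:
    """W(A_i) = W(A_i - A) + W(A), by additivity on disjoint sets."""
    return w_remainder(i) + W_INTERSECTION


for i in (1, 5, 10, 20):
    print(f"i={i:>2}  W(A_i - A)={w_remainder(i):.3e}  W(A_i)={w_a(i):.6f}")
print(f"W(A)={W_INTERSECTION:.6f} >= W(B)={W_B:.6f}")
```

In this instance every $A_i$ is more likely than $B$, the remainders $A_i - A$ form a vanishing sequence whose probability tends to zero, and the intersection is at least as likely as $B$, as the axiom requires.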

The next section shows that the two axioms, Monotone Continuity and $SP_4$, are biased against rare events no matter how catastrophic these may be.

#### 3. The Value of Life

The best way to explain the role of *Monotone Continuity* is by means of an example provided by Arrow [3, Pages 48–49]. He explains that if $a$ is an action that involves receiving one cent, $b$ is another that involves receiving zero cents, and $c$ is a third action involving receiving one cent and facing a small probability of death, then *Monotone Continuity* requires that the third action involving death and one cent should be preferred to the action with zero cents if the probability of death is small enough. Even Arrow says of his requirement “this may sound outrageous at first blush…” (Arrow [3, Pages 48–49]). Outrageous or not, Monotone Continuity (MC) leads to the neglect of rare events with major consequences, like death. Death is a black swan.

To overcome the bias we introduce an axiom that is the logical negation of MC: this means that sometimes MC holds and at other times it does not. We call this the *Swan Axiom*, and it is stated formally below. To illustrate this, consider an experiment where subjects are offered a certain amount of money to choose a pill at random from a pile, which is known to contain one pill that causes death. It was shown experimentally (Chanel and Chichilnisky [7]) that in some cases people accept a sum of money and choose a pill provided that the pile is large enough (namely, when the probability of death is small enough), thus satisfying the Monotone Continuity axiom and determining the statistical value of their lives. But there are also cases where the subjects will not agree to choose any pill, no matter how large the pile is. Some people refuse the payment of one cent if it involves a small probability of death, no matter how small the probability may be (Chanel and Chichilnisky [6, 7]). This conflicts with the Monotone Continuity axiom, as explicitly presented by Arrow [3].

Our Axiom provides a reasonable resolution to this dilemma that is realistic and consistent with the experimental evidence. It implies that there exist catastrophic outcomes such as the risk of death, so terrible that one is unwilling to face a small probability of death to obtain one cent versus nothing, no matter how small the probability may be. According to our Axiom, no probability of death may be acceptable when one cent is involved. Our Axiom also implies that in other cases there may be a small enough probability that the lottery involving death may be acceptable, for example if the payoff is large enough to justify the small risk. This is a possibility discussed by Arrow [3]. In other words: sometimes one is willing to take a risk with a small enough probability of a catastrophe, in other cases one is not. This is the content of our Axiom, which is formally stated as follows.

*The Swan Axiom*

This axiom is the logical negation of Monotone Continuity: There exist events $f$ and $g$ with $W(f) > W(g)$, and for every vanishing sequence of events $\{E_\alpha\}_{\alpha = 1, 2, \ldots}$ an $N > 0$ such that altering arbitrarily the events $f$ and $g$ on the set $E_i$, where $i > N$, does not alter the probability ranking of the events, namely, $W(f') > W(g')$, where $f'$ and $g'$ are the altered events. For other events $f$ and $g$ with $W(f) > W(g)$, there exists a vanishing sequence of events $\{E_\alpha\}_{\alpha = 1, 2, \ldots}$ where for every $N$, altering arbitrarily the events $f$ and $g$ on the set $E_i$, where $i > N$, does alter the probability ranking of the events, namely, $W(f') < W(g')$, where $f'$ and $g'$ are the altered events.

*Definition 3.1.* A probability is said to be *biased against rare events* or *insensitive to rare events* when it neglects events that are small according to Villegas and Arrow; as stated in Arrow [3, page 48]: “An event that is far out on a *vanishing sequence* is ‘*small*’ by any reasonable standards.” Formally, a probability $W$ is insensitive to rare events when, given two events $f$ and $g$ and any vanishing sequence of events $(E_j)$, there exists $N = N(f, g) > 0$ such that $W(f) > W(g) \Leftrightarrow W(f') > W(g')$ for all $f', g'$ satisfying $f' = f$ and $g' = g$ a.e. on $E_j^c$ when $j > N$, where $E^c$ denotes the complement of the set $E$.

Proposition 3.2. *A subjective probability satisfies Monotone Continuity if and only if it is biased against rare events.*

*Proof. *This is immediate from the definitions of both [3, 12].

Corollary 3.3. *Countably additive probabilities are biased against rare events.*

*Proof. *It follows from Propositions 2.1 and 3.2 [9, 12].

Proposition 3.4. *Purely finitely additive probabilities are biased against frequent events.*

*Proof. *See example in the appendix.

Proposition 3.5. *A subjective probability that satisfies the Swan Axiom is neither biased against rare events, nor biased against frequent events.*

*Proof. *This is immediate from the definition.

#### 4. An Axiomatic Approach to Probability with Black Swans

This section proposes an axiomatic foundation for subjective probability that is unbiased against rare and frequent events. The axioms are as follows:

*Axiom 1. *Subjective probabilities are continuous and additive.

*Axiom 2. *Subjective probabilities are unbiased against rare events.

*Axiom 3. *Subjective probabilities are unbiased against frequent events.

Additivity is a natural condition and *continuity* captures the notion that “nearby” events are thought of as being similarly likely to occur; this property is important to ensure that “sufficient statistics” exist. “Nearby” has been defined by Villegas [5] and Arrow [3] as follows: two events are *close* or *nearby* when they differ on a *small set* as defined in Arrow [3]; see the previous section. We saw in Proposition 3.2 that the notion of continuity defined by Villegas and Arrow, namely Monotone Continuity, conflicts with the Swan Axiom. Indeed, Corollary 3.3 shows that countably additive measures are biased against rare events. On the other hand, Proposition 3.4 and Example A.1 in the appendix show that purely finitely additive measures can be biased against frequent events. A natural question is whether there is anything left after one eliminates both biases. The following theorem addresses this issue.

Theorem 4.1. *A subjective probability that satisfies the Swan Axiom is neither finitely additive nor countably additive; it is a strict convex combination of both.*

*Proof.* This follows from Propositions 3.2, 3.4 and 3.5, Corollary 3.3 above, and the fact that convex combinations of measures are measures. It extends Theorem 6.1 of Section 6 below, which applies to the special case where the events are Borel sets in $\mathbb{R}$ or in an interval $(a, b) \subset \mathbb{R}$.

Theorem 4.1 establishes that neither Savage's approach nor that of Villegas and Arrow satisfies the three axioms stated above. These three axioms require more than the additive subjective probabilities of Savage, since purely finitely additive probabilities are finitely additive and yet they are excluded here. At the same time the axioms require less than the countable additivity of the subjective probabilities of Villegas and Arrow, since countably additive probabilities are biased against rare events. Theorem 4.1 above shows that a strict combination of both does the job.

Theorem 4.1 does not, however, prove the existence of likelihoods that satisfy all three axioms. What is missing is an appropriate definition of continuity that does not conflict with the Swan Axiom. The following section shows that this can be achieved by identifying an event with its characteristic function, so that events are contained in the space $L_\infty(\mathcal{U})$ of bounded real-valued functions on the universe space $\mathcal{U}$, and endowing this space with the sup norm.

#### 5. Axioms for Probability with Black Swans, in $\mathbb{R}$ or $(a, b)$

From here on events are the Borel sets of the real line $\mathbb{R}$ or the interval $(a, b)$. This is a widely used case that makes the results concrete and allows comparison of the results with the earlier axioms on choice under uncertainty of Chichilnisky [9, 12, 14]. We use a concept of “continuity” based on a topology that was used earlier by Debreu [13] and by Chichilnisky [1, 2, 9, 10, 12, 14]: observable events are in the space $L_\infty = L_\infty(\mathbb{R})$ of measurable and essentially bounded functions with the sup norm $\|f\| = \operatorname{ess\,sup}_{x \in \mathbb{R}} |f(x)|$. This is a sharper and more stringent definition of closeness than the one used by Villegas and Arrow, since two events can be close under the Villegas-Arrow definition but not under ours; see the appendix.

A subjective probability satisfying the classic axioms by De Groot [20] is called a *standard probability*, and is countably additive. A classic result is that for any event $f \in L_\infty$ a standard probability has the form $W(f) = \int_{\mathbb{R}} f(x)\phi(x)\,dx$, where $\phi \in L_1(\mathbb{R})$ is an integrable function in $\mathbb{R}$.

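The next step is to introduce the new axioms, show existence and characterize all the distributions that satisfy the axioms. We need more definitions. A subjective probability $W : L_\infty \to \mathbb{R}$ is called *biased against rare events*, or *insensitive to rare events*, when it neglects events that are small according to a probability measure $\mu$ on $\mathbb{R}$ that is absolutely continuous with respect to the Lebesgue measure. Formally, a probability is insensitive to rare events when, given two events $f$ and $g$, there exists $\varepsilon = \varepsilon(f, g) > 0$ such that $W(f) > W(g) \Leftrightarrow W(f') > W(g')$ for all $f', g'$ satisfying $f' = f$ and $g' = g$ a.e. on $A \subset \mathbb{R}$ and $\mu(A^c) < \varepsilon$. Here $A^c$ denotes the complement of the set $A$. $W$ is said to be *insensitive to frequent events* when, given any two events $f, g$, there exists $M(f, g) > 0$ such that $W(f) > W(g) \Leftrightarrow W(f') > W(g')$ for all $f', g'$ satisfying $f' = f$ and $g' = g$ a.e. on $A \subset \mathbb{R}$ and $\mu(A^c) > M(f, g)$. $W$ is called *sensitive* to rare (respectively frequent) events when it is *not insensitive* to rare (respectively frequent) events.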

The following three axioms are the axioms of the last section, specialized to the case at hand (note that Axioms 2 and 3 are listed here in the opposite order).

*Axiom 1.* $W : L_\infty \to \mathbb{R}$ is linear and continuous.

*Axiom 2.* $W$ is sensitive to frequent events.

*Axiom 3.* $W$ is sensitive to rare events.

The first and the second axiom agree with classic theory and standard likelihoods satisfy them. The third axiom is new.

Lemma 5.1. *A standard probability satisfies Axioms 1 and 2, but it is biased against rare events and therefore does not satisfy Axiom 3. *

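*Proof.* Consider $W(f) = \int_{\mathbb{R}} f(x)\phi(x)\,dx$ with $\int_{\mathbb{R}} \phi(x)\,dx = K < \infty$. Then

$$W(f) + W(g) = \int_{\mathbb{R}} f(x)\phi(x)\,dx + \int_{\mathbb{R}} g(x)\phi(x)\,dx = \int_{\mathbb{R}} \bigl(f(x) + g(x)\bigr)\phi(x)\,dx = W(f + g),$$

since $f$ and $g$ are characteristic functions and thus positive. Therefore $W$ is linear. $W$ is continuous with respect to the $L_1$ norm $\|f\|_1 = \int_{\mathbb{R}} |f(x)|\phi(x)\,dx$ because $\|f\|_\infty < \varepsilon$ implies

$$W(f) = \int_{\mathbb{R}} f(x)\phi(x)\,dx \leq \int_{\mathbb{R}} |f(x)|\phi(x)\,dx \leq \varepsilon \int_{\mathbb{R}} \phi(x)\,dx = \varepsilon K.$$

Since the sup norm is finer than the $L_1$ norm, continuity in $L_1$ implies continuity with respect to the sup norm (Dunford and Schwartz [22]). Thus a standard subjective probability satisfies Axiom 1. It is obvious that for every two events $f$, $g$ with $W(f) > W(g)$, the inequality is reversed, namely $W(g') > W(f')$, when $f'$ and $g'$ are appropriate variations of $f$ and $g$ that differ from $f$ and $g$ on sets of sufficiently large Lebesgue measure. Therefore Axiom 2 is satisfied. A standard subjective probability is, however, not sensitive to rare events, as shown in Chichilnisky [1, 2, 9, 10, 12, 14, 23].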
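The insensitivity of a standard probability to rare events can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the density is standard normal, that the two events are the indicators of $[0, 1]$ and $[0, 0.9]$, and that the rare set on which the events are altered is the tail $(n, \infty)$ with $n \geq 1$. The function names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, used here as a hypothetical standard probability."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mass(lo: float, hi: float) -> float:
    """Probability of the interval [lo, hi] under the standard normal."""
    return normal_cdf(hi) - normal_cdf(lo)


W_F = mass(0.0, 1.0)  # f = indicator of [0, 1]
W_G = mass(0.0, 0.9)  # g = indicator of [0, 0.9], so W_F > W_G


def ranking_preserved(n: float) -> bool:
    """Worst-case alteration on the tail (n, inf), n >= 1: f' = 0 and g' = 1 there."""
    tail = 1.0 - normal_cdf(n)
    w_f_altered = W_F          # f already vanishes on (n, inf)
    w_g_altered = W_G + tail   # g' gains the whole tail
    return w_f_altered > w_g_altered


for n in (1, 2, 3, 5):
    print(f"n={n}: tail={1.0 - normal_cdf(n):.3e}  ranking preserved: {ranking_preserved(n)}")
```

Once the tail is rare enough, even the most adverse alteration on it cannot reverse the ranking of the two events, which is the sense in which a standard probability neglects rare events.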

#### 6. Existence and Representation

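Theorem 6.1. *There exists a subjective probability $W : L_\infty \to \mathbb{R}$ satisfying Axioms 1, 2, and 3. A probability satisfies Axioms 1, 2, and 3 if and only if there exist two continuous linear functions on $L_\infty$, denoted $\phi_1$ and $\phi_2$, and a real number $\lambda$, $0 < \lambda < 1$, such that for any observable event $f \in L_\infty$*

$$W(f) = \lambda \int_{x \in \mathbb{R}} f(x)\phi_1(x)\,dx + (1 - \lambda)\,\phi_2(f), \tag{6.1}$$

*where $\phi_1 \in L_1(\mathbb{R}, \mu)$ defines a countably additive measure on $\mathbb{R}$ and $\phi_2$ is a purely finitely additive measure.*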

*Proof. *This result follows from the representation theorem by Chichilnisky [9, 12].

*Example 6.2 (“Heavy” Tails).* The following illustrates the additional weight that the new axioms assign to rare events, in this example in a form suggesting “heavy tails.” The finitely additive measure $\phi_2$ appearing in the second term in (6.1) can be illustrated as follows. On the subspace of events with limiting values at infinity, $L'_\infty = \{f \in L_\infty : \lim_{x \to \infty} f(x) < \infty\}$, define $\phi_2(f) = \lim_{x \to \infty} f(x)$ and extend this to a function on all of $L_\infty$ using the Hahn-Banach theorem. The difference between a standard probability and the likelihood defined in (6.1) is the second term $\phi_2$, which focuses all the weight at infinity. This can be interpreted as a “heavy tail,” a part of the distribution that is not part of the standard density function $\phi_1$ and gives more weight to the sets that contain *terminal* events, namely sets of the form $(x, \infty)$.
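The mixed form described in this example can be sketched in Python for the special case of interval events, whose indicators always have a limit at infinity, so that no Hahn-Banach extension is needed. The standard normal density, the weight `lam = 0.9`, and the names `IntervalEvent` and `swan_probability` are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


def normal_cdf(x: float) -> float:
    """Standard normal CDF; math.erf handles +/- infinity."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


@dataclass(frozen=True)
class IntervalEvent:
    """Indicator of the interval (lo, hi); hi may be math.inf."""
    lo: float
    hi: float

    def standard_mass(self) -> float:
        """Countably additive part: integral of the indicator against the density."""
        return normal_cdf(self.hi) - normal_cdf(self.lo)

    def limit_at_infinity(self) -> float:
        """Purely finitely additive part on this subspace: limit of the indicator as x -> inf."""
        return 1.0 if self.hi == math.inf else 0.0


def swan_probability(event: IntervalEvent, lam: float = 0.9) -> float:
    """Convex combination of the two parts, with 0 < lam < 1 (hypothetical weight)."""
    return lam * event.standard_mass() + (1.0 - lam) * event.limit_at_infinity()


for n in (1, 5, 10):
    tail = IntervalEvent(n, math.inf)
    print(f"(n={n}, inf): standard={tail.standard_mass():.3e}  mixed={swan_probability(tail):.6f}")
print("bounded (-1, 1):", round(swan_probability(IntervalEvent(-1.0, 1.0)), 6))
```

In this sketch the standard part of a terminal set $(n, \infty)$ vanishes as $n$ grows, while the mixed value never falls below $1 - \lambda$; bounded events receive only the standard part scaled by $\lambda$.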

Corollary 6.3. *In samples without rare events, a subjective probability that satisfies Axioms 1, 2, and 3 is consistent with classic axioms and yields a countably additive measure.*

*Proof. *Axiom 3 is an empty requirement when there are no rare events while, as shown above, Axioms 1 and 2 are consistent with standard relative likelihood.

#### 7. The Axiom of Choice

There is a connection between the new axioms presented here and the Axiom of Choice that is at the foundation of mathematics (Godel, [24]), which postulates that there exists a universal and consistent fashion to select an element from every set. The best way to describe the situation is by means of an example, see also Dunford and Schwartz [22], Yosida [25, 26], Chichilnisky and Heal [27], and Kadane and O'Hagan [28].

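*Example 7.1 (illustration of a purely finitely additive measure).* Consider a possible measure $\rho$ satisfying the following: for every interval $A \subset \mathbb{R}$, $\rho(A) = 1$ if $A \supset \{x : x > a\}$ for some $a \in \mathbb{R}$, and otherwise $\rho(A) = 0$. Such a measure would not be countably additive, because the family of countably many disjoint sets $\{V_i\}_{i = 0, 1, \ldots}$ defined as $V_i = (i, i+1] \cup (-i-1, -i]$ satisfies $V_i \cap V_j = \emptyset$ when $i \neq j$ and $\bigcup_{i=0}^{\infty} V_i = \mathbb{R}$, so that $\rho\left(\bigcup_{i=0}^{\infty} V_i\right) = 1$ while $\sum_{i=0}^{\infty} \rho(V_i) = 0$, which contradicts countable additivity. Since the contradiction arises from assuming that $\rho$ is countably additive, such a measure could only be purely finitely additive.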
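The failure of countable additivity in this example can be mirrored in a short Python sketch. It assumes a simplified encoding that is not part of the article: a set is a list of half-open intervals `(lo, hi]`, and the measure returns 1 exactly when one of them is unbounded on the right. A program can only inspect finitely many terms of the sum, so the check below is an illustration rather than a proof.

```python
import math


def rho(intervals):
    """Return 1 if the union of the half-open intervals (lo, hi] contains a right tail, else 0."""
    return 1 if any(hi == math.inf for _, hi in intervals) else 0


def v(i: int):
    """V_i = (i, i+1] union (-i-1, -i]: bounded, pairwise disjoint pieces covering the line."""
    return [(i, i + 1), (-i - 1, -i)]


whole_line = [(-math.inf, math.inf)]
partial_sum = sum(rho(v(i)) for i in range(10_000))

print("sum of rho(V_i) over the first 10000 pieces:", partial_sum)
print("rho of the whole line:", rho(whole_line))
# Finite additivity still holds on a two-piece partition of the line.
print("rho((-inf, 0]) + rho((0, inf)):", rho([(-math.inf, 0)]) + rho([(0, math.inf)]))
```

Every bounded piece receives weight zero, so every partial sum is zero, yet the union of all the pieces is the whole line, which has weight one.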

One can illustrate a function on $L_\infty$ that represents a purely finitely additive measure $\rho$ if we restrict our attention to the closed subspace $L'_\infty$ of $L_\infty$ consisting of those functions $f(x)$ in $L_\infty$ that have a limit when $x \to \infty$, by the formula $\rho(f) = \lim_{x \to \infty} f(x)$, as in Example 6.2 of the previous section. The function $\rho(\cdot)$ can be illustrated as a limit of a sequence of delta functions whose supports increase without bound. The problem, however, is to extend the function $\rho$ to another defined on the entire space $L_\infty$. This could be achieved in various ways but, as we will see, each of them requires the Axiom of Choice.

One can use the Hahn-Banach theorem to extend the function $\rho$ from the closed subspace $L'_\infty \subset L_\infty$ to the entire space $L_\infty$, preserving its norm. However, in its general form the Hahn-Banach theorem requires the Axiom of Choice (Dunford and Schwartz [22]). Alternatively, one can extend the notion of a *limit* to encompass all functions in $L_\infty$, including those with no standard limit. This can be achieved by using the notion of convergence along a *free ultrafilter* arising from compactifying the real line $\mathbb{R}$, as done by Chichilnisky and Heal [27]. However, the existence of a *free ultrafilter* also requires the Axiom of Choice.

This illustrates why any attempt to construct *purely finitely additive measures* requires using the Axiom of Choice. Since our criteria include purely finitely additive measures, this provides a connection between the Axiom of Choice and our axioms for relative likelihood. It is somewhat surprising that the consideration of rare events that are neglected in standard statistical theory conjures up the Axiom of Choice, which is independent from the rest of mathematics (Godel [24]).

#### Appendix

*Example A.1 (Illustration of a probability that is biased against frequent events).* Consider the function $W(f) = \liminf_{x \in \mathbb{R}} f(x)$. This is insensitive to frequent events of arbitrarily large Lebesgue measure (Dunford and Schwartz [22]) and therefore does not satisfy Axiom 2. In addition it is not linear, failing Axiom 1.
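The failure of linearity can be seen on a pair of complementary indicator functions. The sketch below reads the functional as the lim inf as $x \to \infty$, which is an assumption about the intended formula, and approximates it by the minimum over a sampled window far out on the line; the window, the step, and the function names are hypothetical.

```python
import math


def tail_min(func, start: float = 1000.0, stop: float = 1010.0, step: float = 0.25) -> float:
    """Crude stand-in for lim inf at infinity: minimum of func over a far-out sample grid."""
    count = int((stop - start) / step)
    return min(func(start + k * step) for k in range(count))


def f(x: float) -> float:
    """Indicator of the union of intervals [2k, 2k+1)."""
    return 1.0 if math.floor(x) % 2 == 0 else 0.0


def g(x: float) -> float:
    """Indicator of the complementary intervals [2k+1, 2k+2)."""
    return 1.0 - f(x)


def f_plus_g(x: float) -> float:
    """The sum f + g, which equals 1 everywhere."""
    return f(x) + g(x)


print("W(f) =", tail_min(f), " W(g) =", tail_min(g), " W(f + g) =", tail_min(f_plus_g))
```

Each of the two events keeps returning to zero, so each receives the value 0, while their sum is identically 1 and receives the value 1, so that $W(f) + W(g) \neq W(f + g)$.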

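*Example A.2 (two approaches to “closeness”).* Consider the family $\{E^i\}$ where $E^i = [i, \infty)$, $i = 1, 2, \ldots$. This is a vanishing family because for all $i$, $E^i \supset E^{i+1}$ and $\bigcap_{i=1}^{\infty} E^i = \emptyset$. Consider now the events $f^i(t) = K$ when $t \in E^i$ and $f^i(t) = 0$ otherwise, and $g^i(t) = 2K$ when $t \in E^i$ and $g^i(t) = 0$ otherwise. Then for all $i$, $\sup_{E^i} |f^i(t) - g^i(t)| = K$. In the sup norm topology this implies that $f^i$ and $g^i$ are *not* “close” to each other, as the difference $f^i - g^i$ does not converge to zero. No matter how far along the vanishing sequence $E^i$ we go, the two events $f^i$, $g^i$ differ by $K$. Yet since the events $f^i$, $g^i$ differ from $f \equiv 0$ and $g \equiv 0$ respectively only in the set $E^i$, and $\{E^i\}$ is a vanishing sequence, for large enough $i$ they are as “close” as desired according to Villegas-Arrow's definition of “nearby” events.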
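The contrast between the two notions of closeness can be checked numerically. The sketch below takes $K = 1$, samples the functions on a finite grid as a stand-in for the supremum over the whole line, and measures the “size” of the tail $[i, \infty)$ with an exponential distribution; all three choices are illustrative assumptions.

```python
import math

K = 1.0  # hypothetical value of the constant K


def f(i: int, t: float) -> float:
    """f^i(t) = K on E^i = [i, inf), 0 otherwise."""
    return K if t >= i else 0.0


def g(i: int, t: float) -> float:
    """g^i(t) = 2K on E^i = [i, inf), 0 otherwise."""
    return 2.0 * K if t >= i else 0.0


def sup_distance(i: int, horizon: float = 200.0, step: float = 0.5) -> float:
    """Sup-norm distance between f^i and g^i, sampled on a grid reaching past t = i."""
    grid = [k * step for k in range(int(horizon / step) + 1)]
    return max(abs(f(i, t) - g(i, t)) for t in grid)


def tail_weight(i: int) -> float:
    """Weight of E^i under an exponential distribution (hypothetical countably additive measure)."""
    return math.exp(-float(i))


for i in (1, 10, 100):
    print(f"i={i:>3}  sup distance={sup_distance(i)}  weight of E^i={tail_weight(i):.3e}")
```

The set on which the two events differ becomes negligible for any countably additive measure, yet their sup-norm distance stays equal to $K$ for every $i$.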

*The Dual Space $L_\infty^{\ast}$: Countably Additive and Finitely Additive Measures*

The space of continuous linear functions on $L_\infty$ with the sup norm is the “dual” of $L_\infty$ and is denoted $L_\infty^{\ast}$. It has been characterized, for example, in Yosida [25, 26]. $L_\infty^{\ast}$ consists of the sum of two subspaces: (i) $L_1$ functions $g$ that define countably additive measures $\nu$ on $\mathbb{R}$ by the rule $\nu(A) = \int_A g(x)\,dx$, where $\int_{\mathbb{R}} |g(x)|\,dx < \infty$, so that $\nu$ is *absolutely continuous* with respect to the Lebesgue measure, and (ii) a subspace consisting of purely finitely additive measures. A countably additive measure can be identified with an $L_1$ function, called its “density,” but purely finitely additive measures cannot be identified by such functions.

*Example A.3. *Illustration of a Finitely Additive Measure that is not Countably Additive

See Example 7.1 in Section 7.

#### Acknowledgments

This research was conducted at Columbia University's Program on Information and Resources and its Columbia Consortium for Risk Management (CCRM). The author acknowledges support from Grant no 5222-72 of the US Air Force Office of Research directed by Professor Jun Zheng, Arlington VA. The initial results (Chichilnisky [8]) were presented as invited addresses at Stanford University's 1993 Seminar on Reconsideration of Values, organized by Professor Kenneth Arrow, at a 1996 Workshop on Catastrophic Risks organized at the Fields Institute for Mathematical Sciences of the University of Toronto, Canada, at the NBER Conference *Mathematical Economics: The Legacy of Gerard Debreu* at UC Berkeley, October 21, 2005, at the Department of Economics of the University of Kansas National Bureau of Economic Research General Equilibrium Conference, September 2006, at the Departments of Statistics of the University of Oslo, Norway, Fall 2007, at a seminar organized by the late Professor Chris Heyde at the Department of Statistics of Columbia University, Fall 2007, at seminars organized by Drs. Charles Figuieres and Mabel Tidball at LAMETA Universite de Montpellier, France, December 19 and 20, 2008, and at a seminar organized by Professor Alan Kirman at GREQAM Universite de Marseille, December 18, 2008. We are grateful to the above institutions and individuals for supporting the research, and for helpful comments and suggestions. An anonymous referee provided insightful comments that improved this article.