Research Article | Open Access
A Duality Approach to the Genealogies of Discrete Nonneutral Wright-Fisher Models
Discrete ancestral problems arising in population genetics are investigated. In the neutral case, the duality concept has been proved of particular interest in the understanding of backward in time ancestral process from the forward in time branching population dynamics. We show that duality formulae still are of great use when considering discrete nonneutral Wright-Fisher models. This concerns a large class of nonneutral models with completely monotone (CM) bias probabilities. We show that most classical bias probabilities used in the genetics literature fall within this CM class or are amenable to it through some “reciprocal mechanism" which we define. Next, using elementary algebra on CM functions, some suggested novel evolutionary mechanisms of potential interest are introduced and discussed.
Forward evolution of large populations in genetics has a long history, starting in the s; it is closely attached to the names of R. A. Fisher and S. Wright; see Nagylaki  for historical commentaries and on the role played by the French geneticist G. Malécot, starting shortly before the second world war. The book of Ewens  is an excellent modern presentation of the current mathematical theory. Coalescent theory is the corresponding backward problem obtained while running the forward evolution processes backward in time. It was discovered independently by several researchers in the s, but definitive formalization is commonly attributed to Kingman . Major contributions to the development of coalescent theory were made (among others) by P. Donnelly, R. Griffiths, R. Hudson, F. Tajima, and S. Tavaré (see the course of Tavaré in Saint-Flour  for a review). It included incorporating variations in population size, mutation, recombination, selection, and so forth. In (), Pitman  and Sagitov , independently, introduced coalescent processes with multiple collisions of ancestral lineages. Shortly later, the full class of exchangeable coalescent processes with simultaneous multiple mergers of ancestral lineages was discovered by Möhle and Sagitov , and Schweinsberg . All these recent developments and improvements concern chiefly the discrete neutral case and its various scaling limits in continuous time and/or space. As was shown by Möhle [9, 10], neutral forward and backward theories learn much from one another by using a concept of duality introduced by Liggett . Backward theory in the presence of mutations in the forward process is well understood, as it requires the study of a marked Kingman's tree (see Tavaré , for a review). In the works of Neuhauser and Krone , there is also some use of the duality concept in an attempt to understand the genealogies of a Wright-Fisher diffusion (as a limit of a discrete Wright-Fisher model) presenting a selection mechanism; this led these authors to the idea of the ancestral selection graph extending Kingman's coalescent tree of the neutral theory; see also Huillet  for related objectives in the context of Wright-Fisher diffusions with and without drifts. There is, therefore, some evidence that the concept of duality could help one understand the backward theory even in nonneutral situations when various evolutionary forces are the causes of deviation to neutrality (see Crow and Kimura , Maruyama , Gillespie , and Ewens , for a discussion on various models of utmost interest in population genetics).
In this note, we focus on discrete nonneutral Wright-Fisher (WF) models and on the conditions on the bias probabilities under which forward branching population dynamics is directly amenable to a dual discrete ancestral coalescent. We emphasize that duality formulae still are of great use when considering discrete nonneutral Wright-Fisher models, at least for specific deviation forces to neutrality. It is shown that it concerns a large class of nonneutral models involving completely monotone bias probabilities. Several classical examples are supplied in the light of complete monotonicity. In the process leading us to focus on these peculiar bias models, some unsuspected evolutionary mechanisms of potential interest are introduced and discussed, as suggested by elementary algebra on completely monotone functions. We emphasize that the relevance of these novel bias mechanisms in Biology seems to deserve additional work and confrontation with real-world problems is urged for to pinpoint their biological significance.
We will finally briefly outline the content of this manuscript. Section 2 is designed to fix the background and ideas. We introduce some basic facts about the discrete-time forward (Section 2.2) and backward processes (Section 2.3) arising from exchangeable reproduction laws (Section 2.1). In Section 2.4, we introduce a concept of duality and briefly recall its relevance to the study of the neutral case problem. The basic question we address in subsequent sections is whether this notion of duality still makes sense in nonneutral situations. We start supplying important nonneutral examples in Section 3. In Section 4, we show that duality does indeed make sense in the framework of discrete nonneutral Wright-Fisher models, but only for the class of completely-monotone state-dependent transition frequencies. In Section 5, we show that most nonneutrality mechanisms used in the literature fall within this class, or are amenable to it via some “reciprocal transformation,” starting with elementary mechanisms and ending up with more complex ones. In Section 6, we show that duality can be used in nonneutral situations to compute the extinction probabilities (invariant measure) of the dual backward ancestral processif one knows the invariant measure (resp., extinction probabilities) of the forward branching process.
2. Discrete-Time Neutral Coalescent
In this section, to fix the background and notations, we review some well-known facts from the cited literature.
2.1. Exchangeable Neutral Population Models: Reproduction Laws (the Cannings Model )
Consider a population with nonoverlapping generations Assume the population size is constant, say ( individuals (or genes)) over generations. Assume the random reproduction law at generation is satisfying Here, is the number of offspring of gene We avoid the trivial case: One iterates the reproduction over generations, while imposing the following additional assumptions:
(i)exchangeability: for all permutations ;(ii)time-homogeneity: reproduction laws are independent and identically distributed (iid) at each generation This model, therefore, consists of a conservative-conditioned branching Galton-Watson process in , where (see Karlin-McGregor ).
Famous reproduction laws are as follows.
Example 2.1. The multinomial Dirichlet model: Multin-Dirichlet , where is a disorder parameter. With , admits the following joint exchangeable distribution on the simplex : where is the rising factorial of . This distribution can be obtained by conditioning independent mean Pòlya distributed random variables on summing to , that is to say, where When , this distribution reduces to the Wright-Fisher model for which Multin Indeed, admits the following joint exchangeable multinomial distribution on the simplex : This distribution can be obtained by conditioning independent mean Poisson distributed random variables on summing to . When is large, using Stirling formula, it follows that with joint finite-dimensional law: on Thanks to the product form of all finite-dimensional laws of , we get an asymptotic independence property of .
Example 2.2. In the Moran model, random permutation of In such a model, only one new gene per generation may come to life, at the expense of the simultaneous disappearance of some other genes.
2.2. Forward in Time Branching Process
Take a subsample of size from at generation Let This sibship process is a discrete-time homogeneous Markov chain, with transition probability It is a martingale, with state space , initial state , absorbing states and transient states The first hitting time of boundaries , which is is finite with probability and has finite mean. Omitting reference to any specific initial condition , the process has the transition matrix with entries given by (2.6). We have and , and is not irreducible. However, is aperiodic and (apart from absorbing states) cannot be broken down into noncommunicating subsets; as a result it is diagonalizable, with eigenvalues and .
Example 2.3 (Dirichlet binomial). With a -valued random variable with density beta which is a beta mixture of the binomial distribution Bin.
Example 2.4. The Wright-Fisher model has a Bin transition matrix
Remark 2.5 (statistical symmetry). Due to exchangeability of the reproduction law, neutral models are symmetric in the following sense. The transition probabilities of are equal to the transition probabilities of .
2.3. Backward in Time Process (neutral Coalescent)
can be defined as follows. Take a subsample of size from at generation Identify two individuals from at each step if they share a common ancestor one generation backward in time. This defines an equivalence relation between genes from the set . Define the induced ancestral backward process The ancestral process is a discrete-time Markov chain with transition probability: where, with number of equivalence classes of number of equivalence classes of clusters sizes of and a falling factorial is the probability of a -merger. This is the probability that randomly chosen individuals out of have distinct parents, merging classes and cluster sizes ,
If a unique multiple collision occurs of order .If a simple binary collision occurs involving only two clusters.If , simultaneous multiple collisions of orders occur. Thus, the jump's height of a transition is corresponding to a partition of integer into summands, each .
The chain's state-space is equivalence relations on (partitions of) ; it has dimension (a Bell number), where are the second-kind Stirling numbers.
The chain has initial state , and terminal absorbing state .
The Ancestral Count Process
Let The backward ancestral count process is a discrete-time Markov chain with transition probabilities [17, 19] This Markov chain has state-space , initial state , absorbing states The process has the transition matrix with entries given by (2.13). Note, by inclusion-exclusion principle, the alternative alternating expression of :
2.4. Duality (neutral Case)
We start with a definition of the duality concept which is relevant to our purposes.
Definition 2.6 (see ). Two Markov processes with state-spaces are said to be dual with respect to some real-valued function on the product space if :
We then recall basic examples of dual processes from the neutral and exchangeable population models (Möhle). The neutral forward and backward processes introduced in the two preceding subsections are dual with respect to the hypergeometric sampling without replacement kernels Namely, (i) reads Call type individuals the descendants of the first chosen individuals (allele ) in the study of the forward process; type individuals are the remaining ones (allele ). The left-hand-side of the above equality is an expression of the probability that -samples (without replacement) from population of size at time are all of type , given If these -samples are all descendants of ancestors at time , this probability must be equal to the probability that a -sample from population of size at time are all of type . This is the meaning of the right-hand-side.
Moreover, (ii) reads The left-hand-side is the probability that a -sample (without replacement) from population of size at time are all of type , given If these -samples are all descendants of ancestors at time , this probability must be equal to the probability that -samples from population of size at time are themselves all of type .
With the transpose of , a one-step () version of these formulae is where are matrices with entries and respectively, and the transition matrices of forward and backward processes. Note that the matrix is symmetric and left-upper triangular. The matrices and are both invertible, with respective entries The matrix is left-lower triangular, while is symmetric right-lower triangular. Thus, In any case, being similar matrices, and (or ) both share the same eigenvalues. If diagonalizing is known so that the diagonal matrix of the eigenvalues of , then, with or , diagonalizes and is obtained for free (and conversely). is the matrix whose columns are the right-eigenvectors of and is the matrix whose columns (rows) are the right eigenvectors (left-eigenvectors) of (of ). Similarly, if is the matrix whose rows are the left eigenvectors of , is the matrix whose rows (columns) are the left eigenvectors (right eigenvectors) of (of ). With the th row of and the th column of the spectral decomposition of is whereas, with the th column of and the th row of the one of reads In Möhle , a direct combinatorial proof of the duality result can be found (in the general exchangeable or neutral case); it was obtained by directly checking the consistency of (2.6), (2.13), and (2.16).
The duality formulae allow one to deduce the probabilistic structure of one process from the one of the other. The question we address now is does duality still make sense in nonneutral situations? We will see that it does in discrete nonneutral Wright-Fisher models, but only for some class of state-dependent transition frequencies.
3. beyond Neutrality (symmetry Breaking)
Discrete forward nonneutral models (with nonnull drifts) can be obtained by substituting where is the state-dependent Bernoulli bias probability different from identity (as in neutral case).
When particularized to the WF model, this leads to the biased transition probabilities In this binomial -sampling with replacement model, a type individual is drawn with probability which is different from the uniform distribution due to bias effects.
From this, we conclude (a symmetry breaking property) the following: the transition probabilities of , are and so, no longer coincide with the ones of The process , no longer is a martingale. Rather, if is concave (convex), , is a submartingale (supermartingale) because (resp., ).
In the binomial neutral Wright-Fisher transition probabilities, we replaced the success probability by a more general function . However, this replacement leaves open the question what model is in the background and what quantity the process really counts? A concrete model in terms of offspring variables must be provided instead. To address this question, we emphasize that the reproduction law corresponding to the biased binomial model is multinomial and asymmetric, namely, Multin, where and . We note that under our hypothesis, Due to its asymmetry, the law of the biased no longer is exchangeable.
We now recall some well-known bias examples arising in population genetics.
Example 3.1 (homographic model, selection). Assume where is a selection parameter. This model arises when gene (resp., ), with frequency (resp., ), has fitness (resp., ). The case arises when gene of type is selectively advantageous, whereas it is disadvantageous when .
Example 3.2 (selection with dominance). Assume In this model, genotype (resp., and ), with frequency (resp., and ) has fitness (resp., and ). is a measure of the degree of dominance of heterozygote . We impose and Note that the latter quantity can be put into the canonical form of deviation to neutrality: where the ratio appearing in the right-hand-side is the ratio of the difference of marginal fitnesses of and to their mean fitness. The case corresponds to balancing selection with .
Example 3.3 (quadratic model). With a curvature parameter, one may
choose If , ,
this bias appears in a discrete -sex population model [9, 10]. We will give below an interpretation of
this quadratic model when in terms of a joint one-way mutations and
neutrality effects model.
We can relax the assumption by assuming ,
Example 3.4 (affine model). Take, for
example, where are mutation probabilities, satisfying It corresponds to the mutation scheme .
To avoid discussions of intermediate cases, we will assume that and . In this case, the matrix is irreducible and even primitive and all
states of this Markov chain are now recurrent. We have and and the boundaries and no longer are strictly absorbing as there is a
positive reflection probability inside the domain
For reasons to appear now, we will be only interested in functions such that is a completely monotone function (CM) on that is, satisfying for all order- derivatives of . If is such that is CM, we will call it an admissible bias mechanism.
4. Nonneutral Wright-Fisher Models and Duality
Preliminaries. Let be a -vector of -valued numbers. Define the backward difference operator acting on by: We have and so forth, and, starting from the endpoint Let be some continuous function Consider the -vector , sampling at points The function is said to be -completely monotonic if , for all and all Let be two continuous functions on . Let . With the pointwise product of and , assuming both functions to be -completely monotonic, so will be by the Leibniz rule. In particular, if is -completely monotonic, so will be its integral powers . Our main result is as follows.
Theorem 4.2. Consider a nonneutral WF forward model on , with continuous, nondecreasing bias satisfying This process has forward transition matrix There exists a Markov chain on such that are dual with respect to if and only if is completely monotone on . In this case, the transition probability matrix of is is a stochastic matrix if and only if else, if , the matrix is substochastic.
Proof. Developing we easily obtain This entry is nonnegative if and
only if ,
for all But due to the above argument on -complete monotonicity of integral powers, this
will be the case if and only if ,
for all .
As this must be true for arbitrary value of population size
function has to be -completely monotonic. Adapting now the
arguments of Theorem 2 developed in [20, page 223], for absolutely monotone functions on ,
this will be the case if and only if is a completely monotone function on in the sense that Next, since is a simple back-shift, and if is CM, is a stochastic matrix if and only if ; else, if ,
the matrix is substochastic.
We note that the first column of the matrix is whereas its first line is , expressing, as required, that the state of is absorbing.
We show here that most of the simplest nonneutrality mechanisms used in the literature fall within the class which we would like to draw the attention on, or are amenable to it via some “reciprocal transformation” which we define. Elementary algebraic manipulations on CM functions allow to exhibit a vast class of unsuspected mechanisms. Note that in some cases, their biological relevance remains to be elucidated. The results presented in this section seem to be new. They serve as an illustration of our theorem.
5.1. Elementary Mechanisms
Assume first that corresponding to the simple neutral case. Then is completely monotone on . With the second kind Stirling numbers, we get a lower left triangular stochastic transition matrix The diagonal terms (eigenvalues) are all distinct with The matrix is stochastic. Due to triangularity, ancestral process is a pure death Markov process which may be viewed as a discrete coalescence tree.
From Example 3.4 (mutation). Assume that (3.10) holds where are mutation probabilities. Then, with , is completely monotone on if and only . In this case, is again lower left triangular (a pure death process). We have in terms of generalized Stirling numbers . We have and the spectrum of is real. When , this matrix is substochastic with .
A particular case deals with one-way mutations .
If , else. Further, .If , else. The corresponding matrix is stochastic.
From Example 3.3 (quadratic). Assume , as in (3.9). Then which is completely monotone if and only if . The case is the neutral case, whereas appears in a -sex model of Möhle. In this quadratic case, since if then if and so is a Hessenberg-like matrix. Note that .
From the selection Example 3.1, when (3.6) holds is CM whenever selection parameter . The induced matrix is stochastic. It is no longer lower left triangular so that the ancestral no longer is a pure death process, being rather a birth and death process. The induced coalescence pattern no longer is a discrete tree, but rather a graph (a discrete version of the ancestral selection graph of Neuhauser-Krone ).
From Example 3.2 (selection with dominance). The corresponding mechanism (3.7) with parameters () satisfying and is CM if and only if and . The case corresponds to directional selection, where genotype has highest fitness compared to aa's and the heterozygote class has intermediate fitness compared to both homozygote classes. In this situation, marginal fitness of exceeds the one of and selective sweep is expected. When , allele is dominant to , whereas when , allele is recessive to (a stabilizing effect slowing down the sweep). Critical value is a case of pure genic balancing selection.
Example 5.1. Consider the mechanism for some . The function is CM if and only if Although this model seems quite appealing, we could find no reference to it in the specialized mathematical genetics literature.
Example 5.2 (reciprocal mechanism). If is not admissible in that is not CM, it can be that is itself admissible. As observed before, if has transition probabilities given by Bin arises in the transition probabilities of Indeed, such transitions are Bin distributed.
If is the selection mechanism of Example 3.1, (3.6), with (not admissible), is itself an admissible selection mechanism because it has reciprocal selection parameter If is the mechanism of Example 3.2, namely, (3.7), with parameters , then is itself a selection with dominance mechanism with reciprocal parameters and Assuming , is not admissible whereas is because and . Similarly, when , the mechanism is not admissible but, from Example 5.1, is as follows.
5.2. Bias Mechanisms with Mutational Effects
Let be the mutational bias mechanism (with ). Let be a bias mechanism such that is CM with . Then, is such that is CM. It is, therefore, admissible and adds mutational effects to the primary mechanism For example, is a mechanism of selection combined with mutational effects. We have , The mechanisms obtained in this way all share the specificity .
Note that, except for the mutational affine mechanism, it is not true in general that whenever and are two admissible bias mechanisms, then is admissible.
5.3. Joint Bias Effects and Compound Bias
Let and be two admissible bias in that and are both completely monotone. Then Thus, with the probabilistic product in Whenever a WF model is considered with bias obtained from two distinct bias and , we call it a WF model with joint bias effect.
Example 5.3 (joint selection and mutational effects). Let and . We get with , This mechanism differs from the traditional mechanism of selection combined with mutational effects.
Example 5.4 (joint mutation and neutral effects). Let and .
We get with , When (one-way mutations), we recover the quadratic
mechanism with curvature parameter This finding justifies some interest into the
quadratic mechanisms with .
With , the reproduction law of each elementary effect is Multin where Then, Multin, where is easily obtained componentwise by We let Multin. It is the reproduction law of a WF model obtained jointly from the two bias and .
Let be an absolutely monotone function satisfying for all th derivatives of , for all . Such functions are well known to be probability generating functions (pgfs) of -valued random variables, say , that is to say, Clearly, if is CM on , then so is Thus, is an admissible bias mechanism in that is CM. We call it a compound bias.
Example 5.5. The general mechanism with mutational effects is in this class. Indeed, and so which is absolutely monotone as soon as .
Example 5.6. With , taking or , the pgf of a Poisson (or shifted-Poisson) random variable, is admissible if is. Note that if is of the form , where is the pgf of the Poisson random variable, then is admissible, for all , a property reminiscent of infinite divisibility for pgfs. Taking or , , the pgf of a geometric (or shifted-geometric) random variable, or is admissible if is (with ). In the external latter mechanism, one recognizes the one in (3.6) occurring in the model with selection of Example 3.1.
Example 5.7. Let with as in Example 5.1. Then, , where , is admissible. Note that . The reciprocal function also interprets as an absolutely monotone discrete-stable pgf (see Steutel and van Harn ). It is not admissible.
Proceeding in this way, one can produce a wealth of admissible bias probabilities , the signification of which in population genetics remaining though to be pinpointed, in each specific case study.
6. Limit Laws
Consider a WF model on with forward transition matrix with admissible bias Define as the dual Markov chain on with transition probability Then, are dual with respect to , to wit We will distinguish two cases.
Case 1. Assume first that Let and The line vector is the left eigenvector of associated to the eigenvalue It is the (unique) invariant probability
measure (stationary distribution) of
If this stationary distribution exists, then, using duality formula, necessarily, as with probability The numbers are the extinction probabilities of the dual process started at As is well known, is the unique solution to with .
Remark 6.1. Typical situations where has an invariant measure is when mutational
effects are present, and more generally when the bias mechanism satisfies and .
In this situation, the forward stochastic transition matrix has an algebraically simple dominant
By Perron-Frobenius theorem where The invariant probability measure can be
approximated by subsequent iterates of ,
the convergence being exponentially fast, with rate governed by the second
largest eigenvalue. Of course, detailed balance (stating that ) does not hold here and the forward chain in
equilibrium is not time-reversible.
In these recurrent cases, the dual ancestral process started at gets extinct with probability . The numbers are the probabilities that it gets killed before getting extinct; in other words, are the probabilities that first hits an extra coffin state, say before hitting .
In terms of moments, by the duality formula, we conclude that relating -factorial moments of to the extinction probabilities of given We also have and so the probability generating function of can be expressed as (): in terms of the Bernstein-Bézier polynomial of .
Let The vector is the right eigenvector of associated to the eigenvalue In this case, the matrix is sub-stochastic and the extinction probability of given is less than one. Thanks to duality, we have where the matrix is symmetric whereas the matrix is symmetric right lower triangular, with Thus, showing that and are related through The knowledge of the invariant measure of the forward process allows one to compute the extinction probabilities of the dual backward ancestral process (and conversely).
Consider the discrete WF model with mutations of Example 3.4. In this case, as regardless of and has an invariant measure which is difficult to compute. Looking at the backward process, the matrix is substochastic (if ) and lower left triangular. Due to triangularity, the right eigenvector of can easily be computed explicitly in terms of We, therefore, get the following alternating expression for the invariant measure: Concerning moments, for instance, we have so that ; from (5.2) we also have allowing to compute and then the variance of We get suggesting (when ) a central limit theorem for as grows large:
Case 2. Conversely, assume now that given so that boundaries are absorbing. Then, the ancestral process possesses an invariant distribution, in that In terms of moments, the duality formula means that relating -factorial moments of to the extinction probabilities of given Stated differently, the probability generating function of is () Let , with Then, using duality, is the right eigenvector of associated to the eigenvalue Thus, and are related through The knowledge of the extinction probabilities of the forward process allows one to compute the invariant measure of the dual backward ancestral process (and conversely).
Typical situations where boundaries are absorbing to occur when and . The simplest case is the neutral case, but the nonneutral selection and selection with dominance mechanisms or the quadratic mechanism (Examples 3.1, 3.2 and 3.3) are also in this class. For instance,
(i)in the neutral case, Thus, and as the degenerate state reached when the most recent common ancestor (MRCA) is attained,(ii)nondegenerate solutions of are obtained when considering bias mechanisms with and ,(iii)consider any biased WF model with and for which , Then, due to large sample asymptotic independence,where is an iid sequence with Poisson (as it can easily be checked by the Poisson limit to the binomial distribution). In this case, the limiting extinction probability of given is , where is the smallest solution to the fixed point equation is the singleton extinction probability of a supercritical Galton-Watson process with offspring distribution Poisson. More precisely, proceeding as in Möhle [9, Theorem 4.5], we have showing that the convergence of to is of order . As a result, we get the asymptotic normality Intuitively, , showing that and