We consider a population with two types of individuals, distinguished by the resources required for reproduction: type-\(0\) (small) individuals need a fractional resource unit of size \(\vartheta \in (0,1)\), while type-\(1\) (large) individuals require \(1\) unit. The total available resource per generation is \(R\). To form a new generation, individuals are sampled one by one, and if enough resources remain, they reproduce, adding their offspring to the next generation. The probability of sampling an individual whose offspring is
small is \(\rho_{R}(x)\), where \(x\) is the proportion of small individuals in the current generation. We call this discrete-time stochastic model a two-size Wright–Fisher model, where the
function \(\rho_{R}\) can represent mutation and/or frequency-dependent selection. We show that on the evolutionary time scale, i.e. accelerating time by a factor \(R\), the frequency
process of type-\(0\) individuals converges to the solution of a Wright–Fisher-type SDE. The drift term of that SDE accounts for the bias introduced by the function \(\rho_{R}\) and the
consumption strategy, the latter also inducing an additional multiplicative factor in the diffusion term. To prove this, the dynamics within each generation are viewed as a renewal process, with the population size corresponding to the first passage time
\(\tau(R)\) above level \(R\). The proof relies on methods from renewal theory, in particular a uniform version of Blackwell’s renewal theorem for binary, non-arithmetic random variables,
established via \(\varepsilon\)-coupling.
In population genetics, the Wright–Fisher model [1], [2] is one of the most
prominent and widely used models in discrete time when aiming to describe the evolution of the type composition of individuals under the influence of evolutionary forces such as mutation, selection, gene flow and environmental changes. In its original
form, the model considers the dynamics of a population of genes of two allelic types under neutrality, but it has since been generalized in many directions; see e.g. [3]–[6] for recent work of this kind.
In most of these generalizations, the population is assumed to consist of a fixed number of haploid individuals that reproduce asexually, and differences in consumption strategies within the population are ignored. This
simplification is reasonable when all individuals follow the same consumption strategy, or when the resources required for reproduction are negligible in comparison to the total resource pool. For instance, the standard Wright–Fisher model falls
into the first category if we interpret the fixed population size as the constant amount of available resources, assuming that each individual requires exactly one unit of resource to reproduce and is selected at random with replacement.
From a biological perspective, a classic framework addressing reproductive resource trade-offs is the \(r\)- and \(K\)-selection theory introduced in [7]. It postulates a trade-off between offspring quantity and quality: \(r\)-strategists (e.g., mice) produce many offspring at low cost, while \(K\)-strategists (e.g., elephants) produce fewer, costlier offspring. The \(r\)-strategy is favored in unstable environments; the \(K\)-strategy in stable ones.
In a different context, [8] examined the role of resource use in species coexistence. There, species capable of surviving with minimal resources are
favored over those with higher demands, a principle known as the \(R^*\)-rule.
Recently, González Casanova et al. (2020) [9] proposed an extension of the Wright–Fisher model with selection, incorporating reproductive costs to explore
their evolutionary consequences. In their framework, the population consists of two types of individuals, where reproductive success is influenced by both genic selection and a resource consumption strategy. Notably, these consumption strategies depend
solely on the type of the individual. This approach provides a stepping stone toward understanding how resource allocation in reproduction can influence evolutionary fitness, potentially in ways that deviate from traditional biological models.
More specifically, it is assumed in [9] that individuals either require a fractional resource unit \({\vartheta}\in (0,1)\)
or a full resource unit \(1\) to reproduce and that individuals requiring \(1\) resource unit per reproduction have a selective advantage. This differentiation leads to two distinct
reproductive strategies based on resource consumption, which can influence the evolutionary trajectory of the population. Mutation or more complex forms of selection are not considered. As a result of this framework, the population size is no longer fixed
but becomes a stochastic variable that fluctuates over time (see [9] for comparison to other models with variable population sizes). The large population limit
of the model treated in [9] is dual to a branching process with interaction. This duality relation is a special case of a more general result stated in [10]. We refer to [11] and [12] for further discussion on branching processes with interactions, where the implications of stochastic fluctuations in population size and their impact on evolutionary processes are explored in more
depth.
The objective of this paper is to study an extension of the model described above by incorporating general forms of (frequency-dependent) selection and mutation. More precisely, we consider a discrete-time, finite population model with a fixed amount
\({R}\in {\mathbb{R}}_{+}\) of resources available for reproduction in each generation. These are consumed each time a new individual is produced. There are two types of individuals, called type-\(0\) and type-\(1\), whose distinguishing feature is the amount of resources needed to produce them. Namely, it takes a fraction \({\vartheta}\in(0,1)\) of
resource units to produce a type-\(0\) individual and one unit of resources to produce a type-\(1\) individual. To form a new generation, the individuals are sequentially sampled (with
replacement) from the current generation, each time subtracting the required amount of resources from those still available. This continues until the quantity \({R}\) is completely used up or exceeded, where one can imagine
that in the latter case the missing quantity of resources for the production of the last individual is taken from an internal storage. The amount of resources needed to produce a new generation may therefore exceed \({R}\)
(but not \({R}+1\)). We usually interpret the two types as two sizes (see Remark 3 and Figure 2) and therefore refer to this extension as the two-size Wright–Fisher model. We examine the evolutionary impact of different consumption strategies on the frequency process of type-\(0\) individuals within this framework, developing a new method for a comprehensive treatment, based on renewal theory. The limiting process is characterized as the solution of a stochastic differential equation (SDE). It is
also worth noting that in the special case where our model coincides with that of [9], our main theorem shows that the drift term derived in [9] is not correct, as will be further explained after the statement of our result (see Remark 5). A similar family of SDEs was considered in [13], which studied the selective effects of within-generation
variance on the offspring number, see Remark 6.
A more formal introduction of the model is provided in Section 2.1, but already at this point the connection of the one-step dynamics to renewal theory is evident: the amount of consumed resources equals the sum of
iid positive random variables (taking values \({\vartheta}\) or \(1\)), and hence a renewal process with Bernoulli-type increments. The (varying) population size corresponds to the first
passage time \(\tau({R})\) above level \({R}\) of this renewal process, see Section 2.3 for details.
Our main goal is to establish the asymptotic behavior of the frequency process as the amount of resources \(R\) tends to infinity. Let \(X_{t}^{{R}}\), \(t\geq0\), denote the proportion of type-\(0\) individuals at time \(t\) in the model with available resources \({R}\). Our main
result, Theorem 1, states that, under suitable conditions, the time-scaled process \(\big(X_{\lfloor {R}t\rfloor}^{{R}}\big)_{t\geq 0}\) converges, as
\({R}\to\infty\), to the solution of the SDE \[{\rm d}X_{t}\;=\;\big(-(1-{\vartheta}) X_{t}(1-X_{t}) + \rho(X_{t})\big)\,{\rm d}t + \sqrt{X_{t}(1-X_{t})\big(1-(1-{\vartheta}) X_{t}\big)}\, {\rm
d}B_{t},\] where \(B\) denotes a standard Brownian motion and \(\rho:[0,1]\to{\mathbb{R}}\) is an appropriate Lipschitz function summarizing the effects of selection and mutation.
Note that the size parameter \({\vartheta}\) affects both the drift term and the diffusion term of the SDE. We will refer to the solution of this SDE as the two-size Wright–Fisher diffusion.
The result will be obtained by showing uniform convergence of the respective generators. Owing to the connection to renewal theory, this uniform convergence leads to certain uniform renewal-type convergence results for random walks with Bernoulli-type
increments, as indicated above. We establish uniform versions of the elementary renewal theorem and a uniform version of Blackwell’s renewal theorem. These versions cannot be deduced from existing uniform renewal theorems in the literature (see [14], [15]) and may therefore be of interest in their own right.
An obvious modification of the model is to complete a generation with the last individual whose reproduction costs are fully covered by the remaining resources. This variant is briefly described in Section 5
together with a statement of a counterpart of Theorem 1.
We have organized our work as follows. Section 2 introduces the model in detail, states our main result (Theorem 1) and also explains the connection to
renewal theory. In Section 3, the required uniform convergence results from renewal theory are established, which then allow us to give the proof of Theorem 1 in Section 4. We finish with two sections providing short discussions of the aforementioned model variant (Section 5) and of the two-size Wright–Fisher
diffusion (Section 6).
We consider the evolution in discrete time (generations) of a finite population of haploid individuals, each of which can be of type \(0\) or type \(1\). A constant amount \(R \in {\mathbb{R}}_{+}\) of resources is available in each generation, reserved for reproduction and consumed in the formation of the next generation. The cost (in resource units) of placing an offspring of type \(0\) into the next generation is \({\vartheta}\in (0,1)\), while the cost for an offspring of type \(1\) is \(1\). If the
relative proportion of type-\(0\) individuals in the current generation is \(x \in [0,1]\), the next generation is formed by sequentially sampling (with replacement) from the current
generation. Each sampled individual produces an offspring according to the following rules.
(1) If \(k \in {\mathbb{N}}_{0}\) individuals of types \(r_1, \ldots, r_k\) have already been placed in the next generation at cost \[R_k \mathrel{\vcenter{:}}= \sum_{i=1}^{k} \big((1 - r_i){\vartheta}+ r_i\big)\] (with \(R_k \mathrel{\vcenter{:}}= 0\) if \(k = 0\)), and if \(R_k < R\), the \((k+1)\)-th individual is created as follows. First, a parent is randomly selected according to fitness: the parent is of type \(\hat{r}_{k+1} = 0\) with probability \(s_{R}(x)\) and of type \(\hat{r}_{k+1} = 1\) with probability \(1 - s_{R}(x)\). It is natural to assume that \(s_{R}(0) = 0\) and \(s_{R}(1) = 1\). The selected parent then produces the \((k+1)\)-th individual, which mutates to type \(i \in \{0,1\}\) with probability \(\beta_{i,R}\) (so \(r_{k+1} = i\)), or retains the parental type with probability \(1 - \beta_{0,R} - \beta_{1,R}\) (so \(r_{k+1} = \hat{r}_{k+1}\)).
(2) If \(R_{k+1} < R\), replace \(k\) by \(k+1\) and return to Step (1). Otherwise, if \(R_{k+1} \geq R\), the reproduction process terminates, and the next generation consists of the first \(k+1\) individuals.
The function \(s_{R}\) models frequency-dependent selection, and the constants \(\beta_{0,R}\) and \(\beta_{1,R}\) represent mutation probabilities.
Mutations are parent-independent, so silent events are allowed, where an individual of type \(i\) mutates to type \(i\). By construction, the probability of selecting a parent that places an
offspring of type \(0\) in the next generation is \[\label{sel-mut}
\rho_{{R}}(x)\mathrel{\vcenter{:}}= s_{R}(x)\big(1-\beta_{1,R}\big)+\big(1-s_{R}(x)\big)\beta_{0,R}.\tag{1}\] The function \(\rho_{R}:[0,1]\to[0,1]\) will play a key role in our analysis, as will become
apparent in Section 2.3. In fact, except for Remarks 5 and 7, our results will be stated in terms of the function \(\rho_{R}\), without assuming the particular form (1). Moreover, the next remark shows that any function
\(\rho_{R}:[0,1]\to[0,1]\) that attains its maximum and minimum at the boundary points \(0\) and \(1\) can always be interpreted as resulting from a
combination of frequency-dependent selection and mutation.
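The sampling rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `next_generation`, its signature, and the choice to return the consumed resources alongside the population size and the proportion of type-\(0\) individuals are assumptions made here, not part of the model description.

```python
import random

def next_generation(x, R, theta, s_R, beta0, beta1, rng):
    """Build one generation from a parent generation with type-0 proportion x.

    Returns (population size, proportion of type-0 individuals, resources used).
    All names are illustrative; s_R is a callable, beta0/beta1 are probabilities.
    """
    cost, n_small, n_total = 0.0, 0, 0
    while cost < R:  # resources remain, so one more individual is created
        # Parent chosen according to fitness: type 0 with probability s_R(x).
        parent = 0 if rng.random() < s_R(x) else 1
        # Parent-independent mutation, otherwise the parental type is kept.
        u = rng.random()
        if u < beta0:
            child = 0
        elif u < beta0 + beta1:
            child = 1
        else:
            child = parent
        cost += theta if child == 0 else 1.0
        n_small += 1 if child == 0 else 0
        n_total += 1
    return n_total, n_small / n_total, cost

# Example call with hypothetical parameter values.
M, X, used = next_generation(0.5, 50.0, 0.3, lambda x: x, 0.01, 0.02, random.Random(1))
print(M, X, used)
```

By construction the loop stops with the first individual that brings the consumed resources to at least \(R\), so the returned cost lies in \([R, R+1)\).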
Figure 1: Simulation of the population size \(M_{n}^R\) (left) and the proportion of type-\(0\) individuals \(X_{n}^R\)
(right) for six populations of a two-size Wright–Fisher model with parameters \({R}=5\), \({\vartheta}=0.3\), \(\rho_{R}(x)=x\) and \(x_{0}=0.5\). The horizontal gray lines show the respective codomain.
Remark 1. Although the selection-mutation mechanism underlying Eq. (1) is biologically intuitive, our analysis is not restricted to that specific form of the function \(\rho_{R}:[0,1]\to[0,1]\). However, if we are given a function \(\rho_{R}\) such that \[\rho_{R}(0) \wedge \rho_{R}(1) \leq \rho_{R}(x) \leq \rho_{R}(0) \vee
\rho_{R}(1) \quad \text{for all } x \in [0,1],\] we can always express it in the form (1) by setting \[\beta_{0,R} \mathrel{\vcenter{:}}= \rho_{R}(0), \quad \beta_{1,R} \mathrel{\vcenter{:}}= 1 -
\rho_{R}(1), \quad \text{and} \quad s_{R}(x) \mathrel{\vcenter{:}}= \frac{\rho_{R}(x) - \rho_{R}(0)}{\rho_{R}(1) - \rho_{R}(0)},\] provided that \(\rho_{R}(0) \neq \rho_{R}(1)\). In the case where \(\rho_{R}(0) = \rho_{R}(1)\), the function \(s_{R}\) can be chosen arbitrarily. Note that the assumptions made for the function \(\rho_{R}\) imply that \(\beta_{0,R},\beta_{1,R}\in [0,1]\), \(s_{R}:[0,1]\to[0,1]\), \(s_{R}(0)=0\), and \(s_{R}(1)=1\).
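The construction in Remark 1 can be checked numerically. The sketch below uses hypothetical helper names (`decompose`, `recompose`) and a hypothetical affine example \(\rho_{R}(x) = 0.02 + 0.95x\); it only verifies that the recovered triple reproduces \(\rho_{R}\) through the selection-mutation formula.

```python
def decompose(rho_R):
    """Recover (beta0, beta1, s_R) from rho_R as in Remark 1 (illustrative helper)."""
    lo, hi = rho_R(0.0), rho_R(1.0)
    if hi == lo:
        raise ValueError("rho_R(0) == rho_R(1): s_R may be chosen arbitrarily")
    beta0 = lo
    beta1 = 1.0 - hi
    def s_R(x):
        return (rho_R(x) - lo) / (hi - lo)
    return beta0, beta1, s_R

def recompose(beta0, beta1, s_R):
    """Selection-mutation formula: s_R(x)(1 - beta1) + (1 - s_R(x)) beta0."""
    return lambda x: s_R(x) * (1.0 - beta1) + (1.0 - s_R(x)) * beta0

rho_example = lambda x: 0.02 + 0.95 * x   # hypothetical rho_R, monotone on [0, 1]
b0, b1, s = decompose(rho_example)
rebuilt = recompose(b0, b1, s)
max_err = max(abs(rebuilt(i / 100) - rho_example(i / 100)) for i in range(101))
print(b0, b1, max_err)
```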
Note that the population size is random (as \({\vartheta}<1\)) and varies from generation to generation. Let \(M_{n}^{{R}}\) denote the population size in generation \(n\in{\mathbb{N}}\) and \(X_{n}^{{R}}\) the relative proportion of type-\(0\) individuals in this generation. Initially, there are \(M_{0}^{{R}}=m_{0}\) individuals and a relative proportion \(X_{0}^{{R}}=x_{0} \in [0,1]\) of type-\(0\) individuals. We refer to this model as the two-size
Wright–Fisher model with frequency-dependent selection and mutation. Figure 1 shows a realization of the model.
Remark 2. Assume \(X^R_{n}=x\) and that \(R\) is large. From the construction of the model it is easy to see that the population size satisfies \[M_{n}^R= \frac{R}{1-(1-{\vartheta})x}+O(1).\]
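For completeness, here is a short derivation of the estimate in Remark 2, written for a generation \(n \geq 1\) formed by the stopping rule; the shorthand \(M = M_{n}^{R}\) and \(\mu_{x} = 1-(1-{\vartheta})x\) is introduced only for this computation. A generation of size \(M\) with a proportion \(x\) of small individuals has consumed \[{\vartheta}\,xM + (1-x)M \;=\; \mu_{x} M\] resource units, and the stopping rule forces this quantity to lie in \([R, R+1)\). Hence \[\frac{R}{\mu_{x}} \;\leq\; M \;<\; \frac{R+1}{\mu_{x}}, \qquad\text{so that}\qquad 0 \;\leq\; M - \frac{R}{1-(1-{\vartheta})x} \;<\; \frac{1}{\mu_{x}} \;\leq\; \frac{1}{{\vartheta}},\] where the last inequality uses \(\mu_{x} \geq {\vartheta}\) for \(x \in [0,1]\). The error term is therefore bounded uniformly in \(x\).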
Remark 3. We typically interpret types as sizes (lengths), where individuals can be of length \({\vartheta}\) or length \(1\), and refer to them as small (type \(0\)) or large (type \(1\)). During the reproduction step, new individuals are added to the population until the total size reaches \({R}\), which
corresponds to the space available per generation.
Figure 2 illustrates a reproduction step in this model with the above interpretation in mind.
Figure 2: Illustration of a sample construction of generation \(n+1\) in a two-size Wright–Fisher model. The two sizes of the rectangles correspond to the two types: small rectangles represent
small individuals. Each individual in generation \(n\) is assigned a color and a pattern for identification. The individuals from generation \(n\) are sampled, and their offspring are placed
one after another from left to right until the capacity \(R\) is reached. In this example three individuals from generation \(n\) place exactly one offspring in generation \(n+1\), one individual places two offspring and two individuals do not place any offspring. Note that the offspring of the small yellow individual mutates from small to large.
The main objective of this paper is to demonstrate that, as the resource parameter \({R}\) becomes large, the appropriately scaled version of our model converges to a Wright–Fisher-type diffusion. This limiting process
features a drift term that incorporates the effects of the stopping rule, frequency-dependent selection, and mutation, along with a non-standard diffusion coefficient.
Theorem 1. Let \(\rho:[0,1] \to {\mathbb{R}}\) be a Lipschitz continuous function. Suppose that \(X_{0}^{{R}}\to x_{0}\in[0,1]\) in probability and that \({R}(\rho_{{R}}(x)-x) \to \rho(x)\) uniformly in \(x\in[0,1]\), as \({R}\to\infty\). Then the process \(\big(X_{\lfloor
{R}t\rfloor}^{{R}}\big)_{t\geq 0}\) converges in distribution to the solution \((X_{t})_{t\geq 0}\) of the stochastic differential equation \[\label{eq46SDE}
{\rm d}X_{t}\;=\;\big( - (1-{\vartheta}) X_{t} (1-X_{t}) + \rho(X_{t}) \big)\,{\rm d}t\,+\,\sqrt{X_{t}(1-X_{t})\big(1-(1-{\vartheta}) X_{t}\big)}\,{\rm d}B_{t},\tag{2}\] with initial condition \(X_{0}=x_{0}\),
where \(B\) denotes a standard Brownian motion.
We emphasize that, unlike \(\rho_{{R}}\), the parameter \({\vartheta}\) does not scale with \({R}\). Figure 3 shows
simulations of the finite model alongside sample paths of the limiting SDE (2).
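Sample paths of the limiting SDE in Theorem 1 can be approximated with a plain Euler scheme. The following is a minimal sketch, not the code used for the figures: the function name `euler_maruyama`, the seed, and the clipping of the state to \([0,1]\) after each step are assumptions made here for numerical convenience.

```python
import math
import random

def euler_maruyama(x0, theta, rho, T, h, rng):
    """Euler scheme for dX = (-(1-theta) X (1-X) + rho(X)) dt
    + sqrt(X (1-X) (1 - (1-theta) X)) dB on [0, T] with step size h."""
    steps = int(round(T / h))
    x, path = x0, [x0]
    for _ in range(steps):
        drift = -(1.0 - theta) * x * (1.0 - x) + rho(x)
        diff_sq = x * (1.0 - x) * (1.0 - (1.0 - theta) * x)
        # Brownian increment over a step of length h has standard deviation sqrt(h).
        x += drift * h + math.sqrt(max(diff_sq, 0.0)) * rng.gauss(0.0, math.sqrt(h))
        # Clipping keeps the discretized path in [0, 1]; it is not part of the SDE.
        x = min(1.0, max(0.0, x))
        path.append(x)
    return path

# Hypothetical run: theta = 0.6, rho = 0, x0 = 0.5, step size 1/3000 on [0, 1].
path = euler_maruyama(0.5, 0.6, lambda x: 0.0, 1.0, 1.0 / 3000, random.Random(0))
print(len(path), path[-1])
```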
Remark 4. The existence and uniqueness of the solution to the SDE (2) follow from [16]. Moreover,
since \(\rho_{R}(x) \in [0,1]\) for all \(x \in [0,1]\), the limiting function \(\rho\) in Theorem 1 must satisfy the boundary conditions \(\rho(0) \geq 0\) and \(\rho(1) \leq 0\). This ensures that the solution to the SDE (2) remains
within the interval \([0,1]\) for all \(t \geq 0\). Define \(\beta_{0} \mathrel{\vcenter{:}}= \rho(0) \geq 0\), \(\beta_1
\mathrel{\vcenter{:}}= -\rho(1) \geq 0\), and introduce the function \(\sigma(x) \mathrel{\vcenter{:}}= \rho(x) - \beta_{0}(1 - x) + \beta_1 x\). Then \(\rho\) can be decomposed as
\[\label{eq4695rho95decom}
\rho(x) = \sigma(x) + \beta_{0}(1 - x) - \beta_1 x.\tag{3}\] Since \(\rho\) is Lipschitz continuous by assumption, the same holds for \(\sigma\). Additionally, \(\sigma\) satisfies the boundary conditions \(\sigma(0) = \sigma(1) = 0\). Thus, this decomposition highlights that \(\rho\) can be interpreted as comprising a
frequency-dependent selection component (represented by \(\sigma\)) and a mutation component (captured by \(\beta_{0}\) and \(\beta_1\)). Furthermore, if
\(\rho_{R}\) admits the decomposition (1) with \(\beta_{0,R}, \beta_{1,R} \in [0,1]\) and \(s_{R} : [0,1] \to [0,1]\) satisfying
\(s_{R}(0) = 0\) and \(s_{R}(1) = 1\), then the uniform convergence of \(R(\rho_{R}(x) - x)\) to \(\rho(x)\) as \(R \to \infty\) implies that \[R\, \beta_{i,R}\xrightarrow[R\to\infty]{}\beta_i,\,i\in\{0,1\},\quad\text{and}\quad \sup_{x\in[0,1]}\lvert
R\big(s_{R}(x)-x\big)-\sigma(x)\rvert\xrightarrow[R\to\infty]{}0.\]
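As a worked illustration of this decomposition, consider the hypothetical choice of pure parent-independent mutation, \(s_{R}(x) = x\) and \(\beta_{i,R} = \beta_{i}/R\) with constants \(\beta_{0}, \beta_{1} \geq 0\) and \(R \geq \beta_{0}+\beta_{1}\) (so that all quantities are probabilities). Inserting this into the selection-mutation formula for \(\rho_{R}\) gives \[\rho_{R}(x) \;=\; x\Big(1-\frac{\beta_{1}}{R}\Big) + (1-x)\,\frac{\beta_{0}}{R}, \qquad\text{hence}\qquad R\big(\rho_{R}(x)-x\big) \;=\; \beta_{0}(1-x) - \beta_{1}x\] for every such \(R\). In this example the convergence assumption of Theorem 1 holds trivially with \(\rho(x) = \beta_{0}(1-x)-\beta_{1}x\), the selection component vanishes, \(\sigma \equiv 0\), and \(R\,\beta_{i,R} = \beta_{i}\) for \(i \in \{0,1\}\).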
Figure 3: Simulations of the evolution of the proportion of small individuals in the finite model (left) with \(\rho_{{R}}(x) = x\), and sample trajectories of the limiting SDE (right) with
\(\rho \equiv 0\). In both cases, the parameters are \({\vartheta}= 0.6\) and \(x_{0} = 0.5\). The finite model was simulated for 3000 generations with \(R = 3000\). SDE trajectories were generated using the Euler method with step size \(h = 1/3000\).
Remark 5. Theorem 1 extends Theorem 1 in [9], which considers the special case of
genic selection favoring type-1 individuals without mutation, i.e. \[s_{R}(x)\,=\,\frac{(1-sR^{-1})x}{1-sR^{-1}x}\quad \text{and} \quad \beta_{0,R}\,=\,\beta_{1,R}\,=\,0,\] so that \(\rho_{R}(x) =
s_{R}(x)\) and \(\rho(x) = -s x(1 - x)\). In [9], this setup is referred to as the Wright–Fisher model with
efficiency, where “efficient” denotes the small individuals. We avoid the term “efficiency” here, as it suggests an inherent advantage for small individuals – an interpretation not supported by our findings. Theorem 1 reveals that the drift term derived in [9] is in fact incorrect. That work claims that the consumption
strategy – represented by the parameter \(\vartheta\) – has no effect on the drift term in the diffusion limit, which is asserted to coincide with the drift in the classical Wright–Fisher diffusion with genic selection.
However, our results show that the drift consists of two components:
A stopping bias term, explicitly depending on the size parameter \(\vartheta\), which encodes a disadvantage for small individuals introduced by the stopping rule. This effect is analogous to the waiting-time
paradox, which in our context implies that the event that the last individual in a generation is large has at least probability \(1 - \rho_{R}(x)\); see Subsection 3.1. The
magnitude of this disadvantage scales with \(1 - \vartheta\), meaning that smaller values of \(\vartheta\) intensify the effect.
A selection term, accounting for the sampling probabilities \(\rho_{{R}}\), which reduces to the drift term obtained in [9] in their particular setting.
The incorrect conclusion in [9] stems from an assumption of exchangeability, a standard property in classical population genetics models. However, this
assumption fails for the two-size Wright–Fisher model, as the stopping rule introduces a structural bias favoring large individuals. This breakdown of exchangeability will become evident in our renewal-theoretic analysis.
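The structural bias of the stopping rule can be made visible by estimating how often the last individual of a generation is large. The sketch below is illustrative only: function names, seed and sample size are assumptions, and the comparison value \((1-p)/(1-(1-{\vartheta})p)\) is the standard size-biased (inspection paradox) heuristic from renewal theory for large \(R\), which is not a formula taken from this paper.

```python
import random

def last_individual_is_large(p, R, theta, rng):
    """Sample one generation (small offspring with probability p) and report
    whether the individual that completes the generation is large."""
    cost = 0.0
    while True:
        large = rng.random() >= p
        cost += 1.0 if large else theta
        if cost >= R:          # this individual completes the generation
            return large

def estimate_prob_last_large(p, R, theta, n_sim, rng):
    hits = sum(1 for _ in range(n_sim) if last_individual_is_large(p, R, theta, rng))
    return hits / n_sim

# Hypothetical parameters: p plays the role of rho_R(x).
p, R, theta = 0.5, 100.0, 0.3
freq = estimate_prob_last_large(p, R, theta, 5000, random.Random(7))
size_biased = (1.0 - p) / (1.0 - (1.0 - theta) * p)   # heuristic large-R value
print(freq, size_biased, 1.0 - p)
```

The estimated frequency should exceed the sampling probability \(1-p\) of a large offspring, in line with the lower bound stated above.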
To corroborate our theoretical results, we performed simulations whose outcomes are displayed in Figure 4. As predicted by Theorem 1, the quantity
\(R\,{\mathbb{E}}_{x}[X_{1}^{R} - x]\) from the finite model also approximates the drift term in the limiting SDE (2); see also (8).
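A Monte Carlo check of this kind can be sketched as follows. This is not the simulation code behind the figures: the function names, the seed, and the comparatively small values of \(R\) and of the number of runs are assumptions chosen so that the example runs quickly, which makes the estimate correspondingly noisy.

```python
import random

def one_step_proportion(p, R, theta, rng):
    """Proportion of small individuals in one generation, where each sampled
    individual places a small offspring with probability p."""
    cost, n_small, n_total = 0.0, 0, 0
    while cost < R:
        small = rng.random() < p
        cost += theta if small else 1.0
        n_small += 1 if small else 0
        n_total += 1
    return n_small / n_total

def estimate_scaled_drift(x, R, theta, rho_R, n_sim, rng):
    """Monte Carlo estimate of R * E_x[X_1 - x]."""
    p = rho_R(x)
    total = sum(one_step_proportion(p, R, theta, rng) - x for _ in range(n_sim))
    return R * total / n_sim

def limiting_drift(x, theta, rho):
    """Drift of the limiting SDE: -(1-theta) x (1-x) + rho(x)."""
    return -(1.0 - theta) * x * (1.0 - x) + rho(x)

# Hypothetical neutral example: rho_R(x) = x, hence rho = 0.
est = estimate_scaled_drift(0.5, 200.0, 0.3, lambda x: x, 5000, random.Random(2024))
print(est, limiting_drift(0.5, 0.3, lambda x: 0.0))
```

With these settings the standard error of the estimate is of order \(0.1\), so only rough agreement with the limiting drift should be expected; larger \(R\) and more runs sharpen the comparison.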
Figure 4: Approximation (blue) of \(R\,{\mathbb{E}}_{x_{0}}[X_{1}^{R} - x_{0}]\) in the two-size Wright–Fisher model without selection (left) and with genic selection favoring small individuals
(right). For each \(x_{0} = i/100\), \(i \in \{0,\dots,100\}\), we approximated \(R\, {\mathbb{E}}_{x_{0}}[X_{1}^{R} - x_{0}]\) by simulating \(R(X_{1}^{R} - x_{0})\) \(10^6\) times and computing the mean. The small individuals’ size parameter was set to \({\vartheta}= 0.3\), and the resource capacity
was \(R = 1000\). The theoretical drift term \(d(x_{0}) = (-(1 - \vartheta) + s)x_{0}(1 - x_{0})\) from the SDE (2) is plotted in green.
Remark 6. Although the diffusion coefficient in (2) differs from the classical Wright–Fisher form \(\sqrt{x(1 - x)}\), it is not new in the population genetics literature. For instance,
Gillespie [13] derived the stochastic differential equation \[{\rm d}\widetilde{X}_{t}\;=\;\big((\sigma_1^2 -
\sigma_{0}^2) + (\mu_{0} - \mu_1)\big)\widetilde{X}_{t}(1 - \widetilde{X}_{t})\,{\rm d}t + \sqrt{\widetilde{X}_{t}(1 - \widetilde{X}_{t})\big(\sigma_{0}^2 \widetilde{X}_{t} + \sigma_1^2(1 - \widetilde{X}_{t})\big)}\,{\rm d}B_{t}\] as an
approximation to the type composition in a discrete-time population model with two types of individuals. For comparison with our model, we refer to them as type-\(0\) and type-\(1\). The two
types differ in the mean and variance of their offspring numbers: the mean number of offspring for type-\(i\) individuals is \(1 + \mu_i\), and the variance is \(\sigma_i^2\). In this formulation, the limiting process \(\widetilde{X}_{t}\) tracks the proportion of type-\(0\) individuals. Specializing to the case \(\sigma_{0}^2 = {\vartheta}\), \(\sigma_1^2 = 1\), and \(\mu_{0} = \mu_1\), Gillespie’s diffusion reduces to \[{\rm
d}\widetilde{X}_{t}\;=\;(1 - {\vartheta})\widetilde{X}_{t}(1 - \widetilde{X}_{t})\,{\rm d}t + \sqrt{\widetilde{X}_{t}(1 - \widetilde{X}_{t})\big(1 - (1 - {\vartheta})\widetilde{X}_{t}\big)}\,{\rm d}B_{t},\] which differs from our SDE (2) with \(\rho(x) = 0\) only in the sign of the drift term. In our model, type-\(0\) individuals (small-sized) are at a disadvantage due to the stopping rule, whereas
in Gillespie’s model, type-\(0\) individuals (those with lower offspring variance) are favored. One of the central conclusions in [13] is that reduced variance in offspring number can confer a selective advantage.
Despite the distinct modeling assumptions, the agreement in the diffusion terms is not coincidental. It reflects the fact that differences in size (in our model) and differences in offspring variance (in Gillespie’s model) can lead to the same effective
population size. This shared feature explains the identical form of the diffusion coefficient in both settings; see Remark 2 and [13].
Remark 7. We now provide intuition for the sampling and reproduction mechanisms underlying our model. In general, selection governs how parents are sampled, while mutation determines how offspring are produced. The combined effect of these
evolutionary forces is encoded in the function \(\rho_{R}\), which specifies the probability that a sampled individual produces a small offspring.
Below, we present several common scenarios covered by our framework and state the corresponding functions \(\rho_{R}\) (from the finite model) and \(\rho\) (from the drift term in (2)):
No selection, no mutation: \[s_{R}(x)\,=\,x \quad\text{and}\quad\beta_{0,R}\,=\,\beta_{1,R}\,=\,0,\] so that \(\rho_{R}(x)=x\) and \(\rho(x)=0\).
Genic selection (favoring small individuals): \[\begin{align}
s_{R}(x)\,=\, \frac{(1+s{R}^{-1})x}{1+s{R}^{-1}x} \quad\text{and}\quad\beta_{0,R}\,=\,\beta_{1,R}\,=\,0.
\end{align}\] Consequently \(\rho_{R}(x)=s_{R}(x)\) and \(\rho(x)=s x(1-x)\), with \(s\geq0\). The adaptation to genic selection favoring large
individuals is straightforward and yields \(\rho(x)=-sx(1-x)\), again with \(s\geq 0\), see Theorem 1 and compare with
[9].
Fittest-type-wins selection (favoring small individuals): \[s_{R}(x)=\,1-{\mathbb{E}}\big[ (1-x)^G \big] \quad\text{and}\quad\beta_{0,R}\,=\,\beta_{1,R}\,=\,0,\] where \(G\) is a
\({\mathbb{N}}\)-valued random variable with \[{\mathbb{P}}(G=1)\,=\,1-\frac{1}{{R}} \quad \text{and}\quad{\mathbb{P}}(G=k)\,=\,\frac{s_{k-1}}{{R}}\text{ for }k\geq 2,\] and weights \(s_k \geq 0\) satisfying \(\sum_{k = 1}^{\infty} s_k = 1\). In this case, \(\rho_{R}(x) = s_{R}(x)\) and the limiting function is \[\rho(x)\,=\,s(x)x(1-x) \quad \text{ with } s(x)\,=\,\sum_{k=1}^{\infty} s_{k} (1-x)^{k}.\] The variable \(G\) can be interpreted as the number of "potential parents" in the underlying ancestral
picture (see [17], [5]).
Diploid selection: \[s_{R}(x)\,=\,x+\frac{2s}{{R}}\,x(1-x)\big((1-2h)x+h\big) \quad\text{and}\quad\beta_{0,R}\,=\,\beta_{1,R}\,=\,0,\] with \(s \geq 0\) and \(h \in {\mathbb{R}}_+\). Then \(\rho_{R}(x) = s_{R}(x)\) and \[\rho(x)\,=\,2 s\, x(1-x)\big((1-2h)x+h\big).\] In the standard Wright–Fisher model the haploid
population can also be interpreted as a diploid setting where each genotype \(ij \in\{0,1\}^2\) has a fitness value \(w_{ij}\), see [18]. The homozygote \(00\) (resp. \(11\)) reproduces with rate \(w_{00} = 1 + 2s\) (resp. \(w_{11} = 1\)) and the heterozygotes have rate \(w_{01} = w_{10} = 1 + 2hs\). The parameter \(s \geq 0\) controls the strength of selection, and \(h\) (the dominance parameter) measures the contribution of allele \(0\) to the fitness of a heterozygote. The case \(h = 1/2\) corresponds to additive selection
(no dominance), \(h < 1/2\) models the case where allele \(0\) is recessive, \(h > 1/2\) represents a setting where allele \(0\) is dominant, and \(h > 1\) corresponds to balancing selection. Even in the absence of a direct diploid interpretation, the functions \(\rho_{R}\) and
\(\rho\) serve to model various diploid selection regimes in the two-size Wright–Fisher framework.
We will revisit the cases of genic selection and parent-independent mutation in Section 6, where we derive asymptotic properties of the corresponding two-size Wright–Fisher diffusions using classical diffusion
theory.
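The scaling relation \(R(\rho_{R}(x)-x) \to \rho(x)\) behind the scenarios of Remark 7 can be checked numerically for the genic and diploid cases. The sketch below is only a sanity check under hypothetical parameter values (\(R = 10^5\), \(s = 2\), \(h = 0.25\)); the function names are illustrative.

```python
def rho_genic(x, R, s):
    """Finite-model sampling probability for genic selection favoring small individuals."""
    a = s / R
    return (1.0 + a) * x / (1.0 + a * x)

def rho_diploid(x, R, s, h):
    """Finite-model sampling probability for diploid selection."""
    return x + (2.0 * s / R) * x * (1.0 - x) * ((1.0 - 2.0 * h) * x + h)

def limit_genic(x, s):
    return s * x * (1.0 - x)

def limit_diploid(x, s, h):
    return 2.0 * s * x * (1.0 - x) * ((1.0 - 2.0 * h) * x + h)

R, s, h = 1e5, 2.0, 0.25          # hypothetical values
grid = [i / 100 for i in range(101)]
err_genic = max(abs(R * (rho_genic(x, R, s) - x) - limit_genic(x, s)) for x in grid)
err_diploid = max(abs(R * (rho_diploid(x, R, s, h) - x) - limit_diploid(x, s, h)) for x in grid)
print(err_genic, err_diploid)
```

For genic selection the discrepancy is of order \(s^{2}/R\), while for the diploid parametrization the scaled difference equals the limit up to rounding error.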
A natural variant of our model arises by slightly altering the stopping rule: instead of completing a generation with the first individual that causes the total resource consumption to exceed \({R}\), one may
reject this individual and end the generation at the previous one (with the outcome unchanged only when the total exactly equals \({R}\)). Let \({\overline{X}}_{n}^R\) denote the
corresponding process in this variant. A similar analysis to that of the original model yields the following analogue of Theorem 1.
Theorem 2. Under the same assumptions as in Theorem 1, the process \(\big({\overline{X}}_{\lfloor {R}t \rfloor}^{R}\big)_{t \geq 0}\) converges
in distribution to the solution \(({\overline{X}}_{t})_{t \geq 0}\) of the stochastic differential equation \[{\rm d}{\overline{X}}_{t}\;=\;\rho({\overline{X}}_{t})\,{\rm d}t +
\sqrt{{\overline{X}}_{t}(1-{\overline{X}}_{t})\big(1-(1-{\vartheta}) {\overline{X}}_{t}\big)}\,{\rm d}B_{t} ,\] with initial condition \({\overline{X}}_{0} = x_{0}\), where \(B\)
denotes a standard Brownian motion.
The key difference from the SDE (2) lies in the drift term: for \(\rho(x) = 0\), the original model has a drift of \(-(1 - {\vartheta})x(1 - x)\), whereas the
variant has zero drift and is thus neutral. We refer to Section 5 for further details and a proof sketch, which closely parallels the argument used for Theorem 1.
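The modified stopping rule can be sketched in the same style as the original one. The function name `variant_generation` and the requirement \(R \geq 1\) (which guarantees that at least one individual is placed) are assumptions made for this illustration.

```python
import random

def variant_generation(p, R, theta, rng):
    """One generation of the variant: the first sampled individual whose cost
    is not covered by the remaining resources is rejected and sampling stops.

    p is the probability that a sampled individual places a small offspring.
    Returns (population size, proportion of small individuals, resources used).
    """
    if R < 1.0:
        raise ValueError("this sketch assumes R >= 1 so that the generation is non-empty")
    cost, n_small, n_total = 0.0, 0, 0
    while True:
        small = rng.random() < p
        c = theta if small else 1.0
        if cost + c > R:       # does not fit: reject this individual and stop
            break
        cost += c
        n_small += 1 if small else 0
        n_total += 1
    return n_total, n_small / n_total, cost

# Hypothetical parameter values.
M, X, used = variant_generation(0.5, 50.0, 0.3, random.Random(3))
print(M, X, used)
```

In contrast to the original rule, the consumed resources now lie in \((R-1, R]\), so the capacity is never exceeded.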
Remark 8. This variant (with \(\vartheta \in {\mathbb{Q}}\)) was also studied in [9], again in the special case
\(\rho_{R}(x) = \frac{(1 - s {R}^{-1}) x}{1 - s {R}^{-1} x}\). While we show that the consumption strategy has no impact on the drift in this variant, Theorem 2 of [9] incorrectly suggests a selective advantage for small individuals, and their drift term varies significantly with different values of \(\vartheta \in {\mathbb{Q}}\). This
error stems from the same incorrect assumption discussed for the original model. Our simulations, shown in Figure 5, support the theoretical findings presented here.
Figure 5: Approximations (blue) of \(R\, {\mathbb{E}}_{x_{0}}[{\overline{X}}_{1}^{R} - x_{0}]\) (left) and \(R\, {\mathbb{E}}_{x_{0}}[({\overline{X}}_{1}^{R} - x_{0})^2]\)
(right) for the variant of the two-size Wright–Fisher model with \(\rho_{R}(x) = x\). For each \(x_{0} = i/100\), \(i \in \{0, \dots, 100\}\), we
estimated the expectations by simulating \(R({\overline{X}}_{1}^{R} - x_{0})\) and \(R({\overline{X}}_{1}^{R} - x_{0})^2\) over \(n_{\text{sim}}\) runs and
computing the sample means. The size of small individuals was set to \({\vartheta}= 0.3\), and the resource capacity to \(R = 1000\). The drift function \(d(x)\) (left) and the diffusion coefficient \(\sigma^2(x)\) (right) from the SDE in Theorem 2 are
shown in green. Panel (a): \(R \, {\mathbb{E}}_{x_{0}}[{\overline{X}}_{1}^{R}-x_{0}]\), with \(n_{\text{sim}}=10^6\). Panel (b): \(R \,
{\mathbb{E}}_{x_{0}}[({\overline{X}}_{1}^{R}-x_{0})^2]\), with \(n_{\text{sim}}=10^3\).
As already indicated, a key ingredient in our analysis is the connection between the one-step transitions of the process \(\big(X_{n}^{{R}}, M_{n}^{{R}}\big)_{n \geq 0}\) and classical renewal theory. This connection is
formalized via the distributional identity in (5) below. We fix \({\vartheta}\in (0,1)\), and let \((\Omega, \mathcal{F}, {\mathbb{P}})\) be a probability space that
supports both the process \(\big(X_{n}^{{R}},M_{n}^{{R}}\big)_{n \geq 0}\) and a sequence \((\xi_i)_{i \geq 1}\) of \(\{{\vartheta}, 1\}\)-valued random
variables satisfying the following conditions:
The \(\xi_i\) are iid under each \({\mathbb{P}}_{x} := {\mathbb{P}}(\cdot \mid X_{0}^{{R}} = x)\), \(x\in [0,1]\), with distribution \(F_{\rho_{{R}}(x)}\) and mean \({\mathbb{E}}_{x}[\xi_i]=\mu(\rho_{{R}}(x))\), where \[F_{p}\,\mathrel{\vcenter{:}}=\,p\, \delta_{{\vartheta}}+(1-p)\,
\delta_{1}\quad\text{and}\quad\mu(p)\,\mathrel{\vcenter{:}}=\,\int u\,F_{p}({\rm d}u)\,=\,1-(1-{\vartheta})p\quad\text{for }p\in [0,1],\] and \(\delta_{x}\) denotes the Dirac measure at \(x\).
The sequences \((\xi_i)_{i \geq 1}\) and \(\big(X_{n}^{{R}}, M_{n}^{{R}}\big)_{n \geq 0}\) are independent under each \({\mathbb{P}}_{x}\).
To state uniform renewal results later, we also introduce a family of auxiliary probability measures \(({\mathbf{P}}_p)_{p \in [0,1]}\) on \((\Omega, \mathcal{F})\), under which the \((\xi_i)_{i \geq 1}\) are iid with law \(F_p\) and mean \(\mu(p)\). When \(p = \rho_{{R}}(x)\) for some \(x \in [0,1]\), we can take \({\mathbf{P}}_p = {\mathbb{P}}_{x}\). Expectations under \({\mathbb{P}}_{x}\) and \({\mathbf{P}}_p\)
are denoted by \({\mathbb{E}}_{x}\) and \({\mathbf{E}}_{p}\), respectively.
Now define the zero-delayed renewal process \(S = (S_{n})_{n \geq 0}\) by \[S_{0}\,:=\,0, \qquad S_{n}\,:=\,\sum_{j=1}^n \xi_j \quad \text{for } n \geq 1,\] and its first passage time
above level \(a \geq 0\) by \[\label{eq46def95stop}
\tau(a)\,:=\,\inf\{n \in {\mathbb{N}}: S_{n} \geq a\}.\tag{4}\]
We now relate this renewal process to the one-step transitions of our model. Let \(S_i^{{R}}\) denote the total resources consumed to produce the first \(i\) individuals in generation 1.
The total number of individuals in this generation is then given by \[M_{1}^{{R}}\;=\;\inf\{n\in{\mathbb{N}}:{S}_{n}^{{R}}\geq {R}\}.\] Since individuals are either of size \({\vartheta}\)
or 1, the total resources required for generation 1 satisfy \[{S}_{M_{1}^{{R}}}^{{R}}\;=\;{\vartheta}M_{1}^{{R}} X_{1}^{{R}}\,+\,M_{1}^{{R}} (1-X_{1}^{{R}})\;=\;-(1-{\vartheta}) M_1^{{R}}X_{1}^{{R}}\,+\,M_{1}^{{R}}
\quad{\mathbb{P}}_{x}\text{-a.s.}\] It follows directly from the model that the vectors \((S_1^{{R}}, \ldots, S_{M_1^{{R}}}^{{R}})\) and \((S_1, \ldots, S_{\tau({R})})\) have the same
law under \({\mathbb{P}}_{x}\). Therefore, \[\label{l1}
{\mathbb{P}}_{x}\big((X_1^{{R}}, M_1^{{R}}) \in \cdot\,\big)\;=\;{\mathbb{P}}_{x}\Bigg(\bigg(\frac{1}{1 - {\vartheta}} \Big(1 - \frac{S_{\tau({R})}}{\tau({R})}\Big), \tau({R})\bigg) \in \cdot\,\Bigg),\tag{5}\] i.e., the one-step dynamics of
the two-size Wright–Fisher model are fully determined by the renewal process \(S\) and its stopping time \(\tau({R})\). In particular, we have the identity
\[\label{l1a}
X_1^{{R}} - \rho_{{R}}(x)\;=\;-\frac{1}{1 - {\vartheta}} \Big( \frac{S_{\tau({R})}}{\tau({R})} - \mu\big(\rho_{{R}}(x)\big) \Big) \quad {\mathbb{P}}_{x}\text{-a.s.}\tag{6}\]
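The renewal description of one generation can be illustrated by a short simulation. The following is a minimal sketch under our own assumptions, not the code behind Figure 5: the function name `sample_generation`, its arguments and the default choice \(\rho_{R}(x)=x\) are hypothetical. It draws the increments \(\xi_i\) one by one until level \(R\) is reached and returns the resulting frequency of small individuals, the population size and the consumed resources.

```python
import random


def sample_generation(x, R, theta, rho_R=lambda x: x, rng=random):
    """Sample one generation: return (X_1, M_1, S_{M_1}).

    x     : frequency of small individuals in the parent generation
    R     : resource capacity (R > 0)
    theta : size of a small individual, 0 < theta < 1
    rho_R : sampling probability of a small offspring (assumed: identity)
    """
    p = rho_R(x)
    consumed = 0.0  # S_n, resources used by the first n individuals
    n_small = 0
    n_total = 0
    while consumed < R:  # stop at the first passage above level R
        if rng.random() < p:
            consumed += theta
            n_small += 1
        else:
            consumed += 1.0
        n_total += 1
    return n_small / n_total, n_total, consumed


if __name__ == "__main__":
    rng = random.Random(2024)
    x1, m1, s = sample_generation(0.4, 50, 0.3, rng=rng)
    # the frequency can be recovered from S_tau / tau, as in identity (5)
    print(x1, (1.0 - s / m1) / (1.0 - 0.3), m1)
```

For any run, the returned frequency agrees up to rounding with \(\frac{1}{1-\vartheta}\big(1-S_{\tau(R)}/\tau(R)\big)\), which is the content of the distributional identity above.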
To prove Theorem 1, we will show that for any \(f \in C^4([0,1])\), the discrete generator \[\mathcal{A}^{{R}}
f(x)\,\mathrel{\vcenter{:}}=\,{R}\, {\mathbb{E}}_{x}\big[f(X_{1}^{{R}}) - f(x)\big]\] converges uniformly in \(x\) to the infinitesimal generator \(\mathcal{A}f(x)\) of the limiting
diffusion process defined by the SDE (2), as \({R}\to \infty\). The conclusion then follows by a classical convergence result for Markov processes, see Ethier and Kurtz [19] or Kallenberg [20].
By Itô’s formula, the generator \(\mathcal{A}\) acts on functions \(f \in C^2([0,1])\) as \[\label{eq46gen95SDE}
\mathcal{A}f(x) = \big(-(1 - {\vartheta})x(1 - x) + \rho(x)\big) f'(x) + \frac{1}{2}x(1 - x)\big(1 - (1 - {\vartheta})x\big) f''(x).\tag{7}\] To relate this to the discrete generator, we perform a fourth-order Taylor expansion,
yielding \[\begin{align}
\mathcal{A}^{{R}} f(x)\;=&\;{R}\, {\mathbb{E}}_{x}\big[ X_{1}^{{R}} - x \big] f'(x) + \frac{1}{2} {R}\, {\mathbb{E}}_{x}\big[ (X_{1}^{{R}} - x)^2 \big] f''(x) + \frac{1}{6} {R}\, {\mathbb{E}}_{x}\big[ (X_{1}^{{R}} - x)^3 \big]
f'''(x) \\
&+ \frac{1}{24} {R}\, {\mathbb{E}}_{x}\big[ (X_{1}^{{R}} - x)^4 f^{(4)}(Z_{x}) \big],
\end{align}\] for some random point \(Z_{x}\) in \([0,1]\). Using the bound \[\big\vert {\mathbb{E}}_{x}\big[ (X_{1}^{{R}}-x)^4
f^{(4)}(Z_{x})\big]\big\vert\;\leq\; {\mathbb{E}}_{x}\big[ (X_{1}^{{R}}-x)^4\big] \, \Vert f^{(4)}\Vert_{\infty},\] where \(\Vert\cdot\Vert_{\infty}\) denotes the uniform norm, it suffices to prove that
\[\begin{align}
{R}\,{\mathbb{E}}_{x}[X_{1}^{{R}}-x]\,&\xrightarrow[{R}\to\infty]{}\, -(1-{\vartheta}) x (1-x)+\rho(x),\tag{8}\\
{R}\,{\mathbb{E}}_{x}\big[(X_{1}^{{R}}-x)^2\big]\,&\xrightarrow[{R}\to\infty]{}\, x(1-x)\big(1-(1-\vartheta) x\big),\tag{9}\\
{R}\,{\mathbb{E}}_{x}\big[ (X_{1}^{{R}}-x)^3\big]\,&\xrightarrow[{R}\to\infty]{}\,0,\tag{10}\\
{R}\,{\mathbb{E}}_{x}\big[ (X_{1}^{{R}}-x)^4\big]\,&\xrightarrow[{R}\to\infty]{}\,0,\tag{11}
\end{align}\] uniformly in \(x\in[0,1]\).
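For rational \(\vartheta\), the first two scaled moments can be computed exactly for finite \(R\), which gives a quick plausibility check of the first two limits. The sketch below is an illustration under our own assumptions: \(\vartheta=a/b\) with integers \(a<b\), integer \(R\), and \(\rho_{R}(x)=x\) (so that \(\rho\equiv 0\)); the names `passage_law` and `scaled_moments` are hypothetical. It propagates the law of the renewal process on the lattice of span \(1/b\) until level \(R\) is crossed and then evaluates the moments through identity (5).

```python
def passage_law(p, r, a, b):
    """Exact joint law of (tau, S_tau) in lattice units of size 1/b.

    Increments equal a (probability p) or b (probability 1 - p), and tau is
    the first n with S_n >= r. Returns a dict {(n, s): probability}.
    """
    law = {}
    alive = {0: 1.0}  # alive[k] = P(S_n = k, tau > n), only k < r is stored
    n = 0
    while alive:
        n += 1
        nxt = {}
        for k, w in alive.items():
            for step, q in ((a, p), (b, 1.0 - p)):
                if q == 0.0:
                    continue
                s = k + step
                if s >= r:  # level r is reached at step n
                    law[(n, s)] = law.get((n, s), 0.0) + w * q
                else:
                    nxt[s] = nxt.get(s, 0.0) + w * q
        alive = nxt
    return law


def scaled_moments(x, R, a=1, b=2):
    """Return (R*E_x[X_1 - x], R*E_x[(X_1 - x)^2]) assuming rho_R(x) = x."""
    theta = a / b
    first = second = 0.0
    for (n, s), w in passage_law(x, R * b, a, b).items():
        x1 = (1.0 - (s / b) / n) / (1.0 - theta)  # identity (5)
        first += w * (x1 - x)
        second += w * (x1 - x) ** 2
    return R * first, R * second


if __name__ == "__main__":
    theta, x, R = 0.5, 0.5, 400
    drift, diffusion = scaled_moments(x, R)
    print(drift, -(1 - theta) * x * (1 - x))                # compare with (8)
    print(diffusion, x * (1 - x) * (1 - (1 - theta) * x))   # compare with (9)
```

The two printed pairs should be close but not identical, since the limits are only attained as \(R\to\infty\).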
These convergence statements will be established in Section 4, after preparing the necessary tools from renewal theory. The connection to the latter becomes evident by observing that, via identity (6), the centered moments \({\mathbb{E}}_{x}[(X_{1}^{{R}} - \rho_{{R}}(x))^n]\) for \(n = 1,2,3,4\) can be written in terms of \(S_{\tau({R})}\) and \(\tau({R})\) as \[\label{eq:moments-62renewal32theory}
{\mathbb{E}}_{x}\big[\big(X_{1}^{{R}} - \rho_{{R}}(x)\big)^n\big]\;=\;\frac{(-1)^n}{(1 - {\vartheta})^n} {\mathbb{E}}_{x}\left[ \Big( \frac{S_{\tau({R})}}{\tau({R})} - \mu\big(\rho_{{R}}(x)\big) \Big)^n \right].\tag{12}\] Two key ingredients
in proving the uniform convergence of the generator are:
a uniform version of an \(L^p\)-type elementary renewal theorem (see 14 below), and
a uniform weak convergence result for the stopping summand \(\xi_{\tau({R})}\).
In our setting, the classical (pointwise) elementary renewal theorem states that, for each \(p \in [0,1]\), \[\lim_{{R}\to\infty}\frac{{R}}{\tau({R})}\;=\;\mu(p)
\quad{\mathbf{P}}_{p}\text{-almost surely},\] see e.g. [21]. Moreover, since \({\vartheta}\le \xi_j \le 1\) for all
\(j\), we obtain the uniform bounds \[\label{eq:stopping32bounds}
{R}\,\le\,\tau({R})\,\le\,\frac{{R}+1}{{\vartheta}}\quad\text{for all }{R}>0\tag{13}\] which, combined with dominated convergence, yield the following \(L^\beta\) version
\[\label{eq:ERT32L95p32version}
\lim_{{R}\to\infty}{\mathbf{E}}_{p}\Bigg[\bigg(\frac{{R}}{\tau({R})}\bigg)^{\beta}\Bigg]\,=\,\mu(p)^{\beta}
\quad\text{for all }\beta>0\text{ and }p\in [0,1].\tag{14}\] Additionally, it is well-known (see [22]) that the law \(Q_{p}^{{R}}\) of the stopping summand \(\xi_{\tau({R})}\) under \({\mathbf{P}}_p\) converges weakly to a limiting law \(Q_{p}\),
which can be identified by solving a renewal equation and is given by \[\label{eq:form32of32Qp}
Q_{p}\,=\,\frac{{\vartheta}\, p}{\mu(p)}\delta_{{\vartheta}}\;+\;\frac{1-p}{\mu(p)}\delta_{1}.\tag{15}\] This is immediate for \(p = 0\) and \(p = 1\), and follows for \(p \in (0,1)\) by a standard coupling argument. Since the support of the \(Q_{p}^{{R}}\) is a two-point set, the weak convergence may be equivalently written as
\[\label{eq:QpR-62Qp}
\lim_{{R}\to \infty} \big| Q_{p}^{{R}}(\{\vartheta\}) - Q_{p}(\{\vartheta\}) \big|\;=\;0.\tag{16}\] No lattice-type considerations are needed because the support of \(\xi_{\tau({R})}\) does not vary with \({R}\), unlike the support of the excess over the boundary \(S_{\tau({R})}-{R}\) in the case when \({\vartheta}\in{\mathbb{Q}}\) and thus \((S_{n})_{n\ge 0}\) is arithmetic.
In the next section, we will prove that this convergence holds uniformly in \(p \in [0,1]\), for fixed \({\vartheta}\in (0,1)\) (see Proposition 3). This will allow us to establish a uniform extension of 14 involving the stopping summand (Proposition 4). These results together will imply uniform convergence of the centered moments of \(S_{\tau({R})}/\tau({R})\)
under \({\mathbf{P}}_p\) for \(p \in [0,1]\), which, through identity (12), will yield (8)–(11) and thus complete the proof of Theorem 1 in Section 4.
Let \(Q_{p}^{{R}}\) and \(Q_{p}\) be as introduced previously, and denote by \(\xi_{\infty}\) a random variable with law \(Q_{p}\) under \({\mathbf{P}}_{p}\), independent of all other relevant random variables. Let \(\xi\) be a random variable with distribution \(F_p\) under \({\mathbf{P}}_{p}\) and independent of the \(\xi_i\). Note that for all \(p \in [0,1]\), \[{\mathbf{E}}_{p}[\xi_{\infty}]\;=\;\frac{1-p(1-{\vartheta})(1+{\vartheta})}{\mu(p)}\;=\;\frac{{\mathbf{E}}_{p}[\xi^{2}]}{\mu(p)}.\]
Proposition 3. Fix \({\vartheta}\in (0,1)\). Let \(S\) be a renewal process with stopping time \(\tau({R})\) as defined in 4 , and let \(\xi_{\tau({R})}\) denote the corresponding stopping summand. Then \(\xi_{\tau({R})}\) converges in distribution as \({R}\to \infty\), uniformly in \(p \in [0,1]\). Specifically, \[\label{eq1:uniform32stopping32summand}
\lim_{{R}\to \infty}\;\sup_{p \in [0,1]}\left|{\mathbf{P}}_{p}(\xi_{\tau({R})} = {\vartheta}) - \frac{p{\vartheta}}{\mu(p)}\right|\;=\;0.\qquad{(1)}\] As a direct consequence, for all \(\beta > 0\), \[\label{eq2:uniform32stopping32summand}
\lim_{{R}\to \infty}\;\sup_{p \in [0,1]} \left| {\mathbf{E}}_{p}\big[\xi_{\tau({R})}^\beta\big] - {\mathbf{E}}_{p}\big[\xi_{\infty}^\beta\big] \right|\;=\;0.\qquad{(2)}\]
Our proof of Proposition 3 is purely probabilistic and based on a coupling argument. Fixing \(\nu \in (0, \tfrac{1}{2})\), we construct a coupling process whose distribution is the same under any \({\mathbf{P}}_{p}\) with \(p \in [\nu, 1 - \nu]\), implying that the coupling time has the same distribution across this range. This yields uniform convergence on \([\nu, 1 - \nu]\). Moreover, as shown in Lemma 1 and supported by a simple intuitive argument, if \(\nu\) is chosen
sufficiently small, then for \(p \in (0, \nu)\), the distribution \({\mathbf{P}}_{p}(\xi_{\tau({R})} \in \cdot)\) is nearly \(\delta_1\), i.e., the law of
\(\xi_{\tau({R})}\) under \({\mathbf{P}}_{0}\). Similarly, for \(p \in (1 - \nu, 1)\), it is close to \(\delta_{{\vartheta}}\), the law of \(\xi_{\tau({R})}\) under \({\mathbf{P}}_1\). These two ingredients combine to establish the uniform convergence asserted in
Proposition 3.
In the arithmetic case (i.e., \({\vartheta}\in {\mathbb{Q}}\)), the uniform convergence in Proposition 3 can also be derived from Lemma 1 and a result by Borovkov and Foss [14], after verifying their Fourier-analytic condition: for some continuous function \(\psi\) on \([0, 2\pi]\) (assuming lattice span one), \[\label{eq46inequ}
\Big|{\mathbf{E}}_{p}[e^{i u \xi}] -1\Big|\, \geq\, \psi(u)\quad\text{for all }u\in (0,2\pi).\tag{17}\] To the best of our knowledge, the result in the non-arithmetic case is new.
Let us now introduce the necessary notation and auxiliary results used in the proof of Proposition 3, which will follow at the end of this
subsection.
Let \({\mathbf{U}}_{p}\mathrel{\vcenter{:}}= \sum_{n=0}^{\infty}{\mathbf{P}}_{p}(S_{n} \in \cdot)\) denote the renewal measure of the process \(S\). The classical version of Blackwell’s
renewal theorem ([21]) states that \[\begin{gather}
\lim_{{R}\to \infty} {\mathbf{U}}_{p}\big([{R}-t,{R})\big)\;=\;\frac{t}{\mu(p)}\quad\text{if }S\text{ is non-arithmetic}
\shortintertext{and}
\lim_{n \to \infty} {\mathbf{U}}_{p}\Big(\Big\{\frac{n}{b}\Big\}\Big)\;=\;\frac{1}{b\,\mu(p)}\quad \text{if }S\text{ is arithmetic with lattice-span }\frac{1}{b}.
\end{gather}\] Moreover, a standard renewal argument gives \[\begin{gather}
{\mathbf{P}}_{p}(\xi_{\tau(R)}={\vartheta})\;=\;{\mathbf{P}}_{p}(\xi_{1}={\vartheta})\,{\mathbf{U}}_{p}\big([R-{\vartheta},R)\big)\;=\;p\,{\mathbf{U}}_{p}\big([R-{\vartheta},R)\big)
\shortintertext{and}
{\mathbf{P}}_{p}(\xi_{\tau(R)}=1)\;=\;{\mathbf{P}}_{p}(\xi_{1}=1)\,{\mathbf{U}}_{p}\big([R-1,R)\big)\;=\;(1-p)\,{\mathbf{U}}_{p}\big([R-1,R)\big)
\end{gather}\] for all \(p\in [0,1]\) and \(R\ge 0\), which implies the identity \[\label{eq:ren32measure32id} p\,{\mathbf{U}}_{p}\big([R-{\vartheta},R)\big)\,+\,(1-p)\,{\mathbf{U}}_{p}\big([R-1,R)\big)\,=\,1.\tag{18}\] From this, the limiting distribution \(Q_{p}\) in
15 follows both in the arithmetic case (\({\vartheta}\in {\mathbb{Q}}\)) and the non-arithmetic case (\({\vartheta}\notin {\mathbb{Q}}\)). In the arithmetic
setting, note that \(Q_{p}^{{R}}\), the law of \(\xi_{\tau({R})}\) under \({\mathbf{P}}_{p}\), remains constant between consecutive lattice points. The form
of \(Q_{p}\) remains valid also at the boundary values \(p = 0\) and \(p = 1\), where the limiting distribution coincides trivially with the increment laws
\(F_{0} = \delta_1\) and \(F_1 = \delta_{{\vartheta}}\), respectively, as already noted.
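In the arithmetic case, the renewal representation of \(Q_{p}^{R}\) can be evaluated exactly by a simple recursion for the renewal masses. The following sketch is illustrative only and rests on our own assumptions: \(\vartheta=a/b\) with integers \(a<b\), all quantities measured in lattice units of size \(1/b\), and hypothetical function names. It computes both probabilities of the stopping summand, so that identity (18) and the closeness to \(Q_{p}\) over a grid of \(p\) can be inspected numerically.

```python
def renewal_mass(p, a, b, r):
    """u[k] = U_p({k}) for k = 0, ..., r-1 (lattice units), r >= 1."""
    u = [0.0] * r
    u[0] = 1.0  # S_0 = 0 contributes one renewal at the origin
    for k in range(1, r):
        if k >= a:
            u[k] += p * u[k - a]
        if k >= b:
            u[k] += (1.0 - p) * u[k - b]
    return u


def stopping_summand_law(p, a, b, r):
    """Return (P(xi_tau = theta), P(xi_tau = 1)) for level r."""
    u = renewal_mass(p, a, b, r)
    small = p * sum(u[max(0, r - a):r])          # p * U([R - theta, R))
    large = (1.0 - p) * sum(u[max(0, r - b):r])  # (1 - p) * U([R - 1, R))
    return small, large


def limit_small(p, theta):
    """Q_p({theta}) = p * theta / mu(p), see (15)."""
    return p * theta / (1.0 - (1.0 - theta) * p)


if __name__ == "__main__":
    a, b, r = 1, 2, 200  # theta = 1/2 and R = 100
    worst = max(
        abs(stopping_summand_law(i / 100, a, b, r)[0] - limit_small(i / 100, a / b))
        for i in range(101)
    )
    print("largest deviation from Q_p over the grid:", worst)
```

For \(\vartheta=1/2\) the recursion can be solved in closed form, and the deviation equals \(p(1-p)^{r}/(2-p)\), with \(r=2R\) the level in lattice units; it is small uniformly in \(p\) although the geometric rate degenerates as \(p\downarrow 0\).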
With the help of Lemma 1 below, we may restrict attention to \(p \in [\nu, 1 - \nu]\) for any sufficiently small \(\nu > 0\), reducing the uniformity claim to showing \[\label{eq:reduced32assertion}
\lim_{{R}\to \infty}\;\sup_{p \in [\nu, 1 - \nu]}\| Q_{p}^{{R}} - Q_{p}\| = 0,\tag{19}\] where \(\|\cdot\|\) denotes the total variation distance (normalized). Since both \(Q_{p}^{{R}}\) and \(Q_{p}\) are supported on the two-point set \(\{{\vartheta}, 1\}\), this distance simplifies to \[\big\|Q_{p}^{{R}}-Q_{p}\big\|\;=\;\big|Q_{p}^{{R}}(\{\vartheta\})-Q_{p}(\{\vartheta\})\big|\;=\;\big|Q_{p}^{{R}}(\{1\})-Q_{p}(\{1\})\big|.\]
Lemma 1. Let \(Q_{p}^{{R}}\) and \(Q_{p}\) be as above. Then \[\lim_{p(1-p)\to 0}\,\sup_{{R}\geq 0}\big\|Q_{p}^{{R}}-Q_{p}\big\|\;=\;0.\]
Proof. By 18 and the fact that \(\sup_{p\in [0,1]}\sup_{{R}\geq 0}{\mathbf{U}}_{p}([{R}-\vartheta,{R}))\leq1\), we have \[\begin{align}
\lim_{p\downarrow 0}{\mathbf{U}}_{p}\big([{R}-1,{R})\big)\;=\;\lim_{p\downarrow 0}\frac{1-p\,{\mathbf{U}}_{p}\big([{R}-\vartheta,{R})\big)}{1-p}\;=\;1
\shortintertext{and}
\lim_{p\uparrow 1}{\mathbf{U}}_{p}\big([{R}-\vartheta,{R})\big)\;=\;\lim_{p\uparrow 1}\frac{1-(1-p)\,{\mathbf{U}}_{p}\big([{R}-1,{R})\big)}{p}\;=\;1,
\end{align}\] both uniformly in \({R}\geq 0\). Using this and the explicit form of \(Q_{p}^{{R}}\) from the renewal representation, we obtain \[\begin{gather}
\sup_{{R}}\big\|Q_{p}^{{R}}-Q_{p}\big\|\;=\;\sup_{{R}}\bigg|{\mathbf{P}}_{p}(\xi_{\tau({R})}=\vartheta)-\frac{p\vartheta}{\mu(p)}\bigg|\;=\;\sup_{{R}}\bigg|p\,{\mathbf{U}}_{p}\big([{R}-\vartheta,{R})\big)-\frac{p\vartheta}{\mu(p)}\bigg|\;\xrightarrow{p\uparrow
1}\;0
\shortintertext{and similarly}
\sup_{{R}}\big\|Q_{p}^{{R}}-Q_{p}\big\|\;=\;\sup_{{R}}\bigg|(1-p)\,{\mathbf{U}}_{p}\big([{R}-1,{R})\big)-\frac{1-p}{\mu(p)}\bigg|\;\xrightarrow{p\downarrow 0}\;0.
\end{gather}\] This completes the proof. ◻
In the following, we restrict attention to the case \(\vartheta\notin{\mathbb{Q}}\) so that the renewal process \(S\) is non-arithmetic under each \({\mathbf{P}}_{p}\), \(p\in (0,1)\). The arguments in the arithmetic case are very similar and, in fact, simpler, since one can use exact coupling instead of an approximate \(\varepsilon\)-coupling. In the boundary cases \(p=0\) and \(p=1\), the process \(S\) becomes deterministic and is therefore
arithmetic. In these cases, the limiting distribution is trivial, with \(Q_{0}=F_{0}=\delta_{1}\) and \(Q_{1}=F_{1}=\delta_{\vartheta}\), respectively.
Let \(F_{p}^{*}\) denote the stationary renewal distribution of \(S\) under \({\mathbf{P}}_{p}\), given by
\[\label{eq:stationary32delay}
F_{p}^{*}({\rm d}x)\;=\;\mu(p)^{-1}{\mathbf{P}}_{p}(\xi_{1}>x){\mathbb{1}}_{(0,\infty)}(x)\,{\rm d}x.\tag{20}\] In the non-arithmetic case, \(F_{p}^{*}\) is characterized by \[F_{p}^{*}*{\mathbf{U}}_{p}=\mu(p)^{-1}\lambda\lambda^{+}\] where \(\lambda\lambda^{+}\) denotes Lebesgue measure on the positive halfline. It is also the limiting distribution of the overshoot
\(S_{\tau({R})}-{R}\) as \({R}\to\infty\) and hence the stationary law of the continuous-time Markov process \((S_{\tau({R})}-{R})_{{R}\geq 0}\) under \({\mathbf{P}}_{p}\).
We now fix, as indicated above, an arbitrarily small \(\nu\in (0,\frac{1}{2})\) and restrict attention to \(p\in [\nu,1-\nu]\). Let \((\xi_{1}',\xi_{1}''),(\xi_{2}',\xi_{2}''),\ldots\) be iid under every \({\mathbf{P}}_{p}\) with common joint law defined by \[\begin{gather}
{\mathbf{P}}_{p}(\xi_{1}'=\vartheta,\xi_{1}''=0)\,=\,{\mathbf{P}}_{p}(\xi_{1}'=0,\xi_{1}''=\vartheta)=\frac{\nu}{4},\\
{\mathbf{P}}_{p}(\xi_{1}'=1,\xi_{1}''=0)\,=\,{\mathbf{P}}_{p}(\xi_{1}'=0,\xi_{1}''=1)\,=\,\frac{\nu}{4},\\
{\mathbf{P}}_{p}(\xi_{1}'=\xi_{1}''=\vartheta)\,=\,\frac{p}{2}-\frac{\nu}{4},\quad {\mathbf{P}}_{p}(\xi_{1}'=\xi_{1}''=1)\,=\,\frac{1-p}{2}-\frac{\nu}{4}
\shortintertext{and}
{\mathbf{P}}_{p}(\xi_{1}'=\xi_{1}''=0)\,=\,\frac{1-\nu}{2}.
\end{gather}\] It follows that \(\xi_{n}'\) and \(\xi_{n}''\) have the same law under \({\mathbf{P}}_{p}\), namely \[\frac{1}{2}\delta_{0}+\frac{p}{2}\delta_{\vartheta}+\frac{1-p}{2}\delta_{1}\;=\;\frac{1}{2}\big(\delta_{0}+F_{p}\big).\] Moreover, the law of the difference \(\xi_{n}'-\xi_{n}''\)
is symmetric and independent of \(p\in [\nu,1-\nu]\), namely \[\label{eq:symmetrization32law}
{\mathbf{P}}_{p}(\xi_{n}'-\xi_{n}''\in\cdot)\;=\;(1-\nu)\delta_{0}\,+\,\frac{\nu}{4}\big(\delta_{\vartheta}+\delta_{-\vartheta}+\delta_{1}+\delta_{-1}\big)\tag{21}\] for each \(p\in [\nu,1-\nu]\).
Note that this law is non-arithmetic under our assumption that \(\vartheta\notin{\mathbb{Q}}\).
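The stated properties of the coupling law can be checked mechanically. The sketch below is only an illustration with hypothetical names: it encodes the values \(0\), \(\vartheta\) and \(1\) as integer pairs (coefficient of \(\vartheta\), coefficient of \(1\)), so that no numerical value of the irrational \(\vartheta\) is needed, and it uses exact rational arithmetic for the probabilities.

```python
from fractions import Fraction

# Values a*theta + b are encoded as pairs (a, b).
ZERO, SMALL, LARGE = (0, 0), (1, 0), (0, 1)


def coupling_law(p, nu):
    """Joint pmf of (xi', xi'') as a dict {(value', value''): probability}."""
    return {
        (SMALL, ZERO): nu / 4, (ZERO, SMALL): nu / 4,
        (LARGE, ZERO): nu / 4, (ZERO, LARGE): nu / 4,
        (SMALL, SMALL): p / 2 - nu / 4,
        (LARGE, LARGE): (1 - p) / 2 - nu / 4,
        (ZERO, ZERO): (1 - nu) / 2,
    }


def marginal(law, index):
    """Law of the first (index 0) or second (index 1) coordinate."""
    out = {}
    for pair, w in law.items():
        out[pair[index]] = out.get(pair[index], 0) + w
    return out


def difference_law(law):
    """Law of xi' - xi'', again in the (a, b) encoding."""
    out = {}
    for (u, v), w in law.items():
        d = (u[0] - v[0], u[1] - v[1])
        out[d] = out.get(d, 0) + w
    return out


if __name__ == "__main__":
    nu = Fraction(1, 10)
    for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
        law = coupling_law(p, nu)
        print(p, marginal(law, 0), difference_law(law))
```

For every \(p\in[\nu,1-\nu]\), both marginals come out as \(\frac{1}{2}(\delta_{0}+F_{p})\), while the printed difference law is the same for all \(p\), in line with (21).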
Now let \((\xi_{0}',\xi_{0}'')\) be independent of \((\xi_{n}',\xi_{n}'')_{n\geq 1}\) and distributed according to \(\delta_{0}\otimes
F_{p}^{*}\) under \({\mathbf{P}}_{p}\). Define the bivariate random walk \[\begin{gather}
(S_{n}',S_{n}'')\,\mathrel{\vcenter{:}}=\,\sum_{k=0}^{n}(\xi_{k}',\xi_{k}''),\quad n\geq 0
\shortintertext{and its symmetrization}
W_{n}\,\mathrel{\vcenter{:}}=\,S_{n}''-S_{n}'\,=\,\sum_{k=0}^{n}(\xi_{k}''-\xi_{k}'),\quad n\geq 0.
\end{gather}\] By 21 , the law of the sequence \((W_{n}-W_{0})_{n \geq 0}\) under \({\mathbf{P}}_{p}\) is the same for every \(p \in [\nu, 1-\nu]\).
Define the \(\varepsilon\)-coupling time \[T_{\varepsilon}\,=\,\inf\{n\geq 0:|W_{n}|\le\varepsilon\},\] and more generally, \[T_{\varepsilon, x}\;=\;\inf \{ n
\geq 0: \vert W_{n} -W_{0}+x \vert \leq \varepsilon\}\] for \(\varepsilon>0\) and \(x >0\). Using \({\mathbf{P}}_{\bullet}\) for probabilities
that are independent of \(p\), we then have \[{\mathbf{P}}_{p}(T_\varepsilon\in \cdot)\;=\;F_{p} (\{ {\vartheta}\})\, {\mathbf{P}}_{\bullet} (T_{\varepsilon, {\vartheta}}\in \cdot) +
\big(1-F_{p}(\{{\vartheta}\})\big)\, {\mathbf{P}}_{\bullet}(T_{\varepsilon, 1}\in \cdot).\] This shows that the law of \(T_\varepsilon\) under \({\mathbf{P}}_{p}\) depends on \(p\) only through \(F_{p}(\{ {\vartheta}\})\), and is bounded by the larger of the two laws on the right-hand side, in the sense that \[{\mathbf{P}}_{p}(T_\varepsilon\in \cdot)\;\leq\;{\mathbf{P}}_{\bullet}(T_{\varepsilon, {\vartheta}}\in \cdot) \vee {\mathbf{P}}_{\bullet}(T_{\varepsilon,1}\in \cdot).\] We also observe that, if \(\sigma_{0}'=0\) and \[\sigma_{n}'\,=\,\inf\{k>\sigma_{n-1}':\xi_{k}'>0\}\,=\,\inf\{k>\sigma_{n-1}':S_{k}'>S_{\sigma_{n-1}'}'\}\] for \(n\geq 1\) denote the jump epochs (strictly ascending ladder epochs) of \(S'\), then the process \((S_{\sigma_{n}'}')_{n\geq 0}\) has the same law as
the original process \(S\) under every \({\mathbf{P}}_{p}\). Furthermore, the increments of \((\sigma_{n}')_{n\geq 0}\) are iid and geometrically
distributed on \({\mathbb{N}}\) with parameter \(\frac{1}{2}\). Now define \[\tau'({R})\,=\,\inf\{n\geq 1:S_{n}'\geq{R}\},\quad {R}\geq 0.\] Since
level exceedance by \(S'\) can only occur at a jump time, it follows that \(\tau'({R})=\sigma_{g({R})}'\) for some suitable index function \(g({R})\).
For \(n\in{\mathbb{N}}_{0}\) and measurable \(A\subset [0,\infty)\), define the counting processes \[\begin{align}
N_{n}(A)\,\mathrel{\vcenter{:}}=\,\sum_{k=0}^{n}{\mathbb{1}}^{}_{A}(S_{k})\quad\text{and}\quad N(A)\,\mathrel{\vcenter{:}}=\,\sum_{k\geq 0}{\mathbb{1}}^{}_{A}(S_{k})
\end{align}\] and define \(N_{n}'(A),N_{n}''(A),N'(A),N''(A)\) accordingly for \(S'\) and \(S''\), respectively.
Then, by definition of the renewal measure, we have \({\mathbf{U}}_{p}(A)={\mathbf{E}}_{p}[N(A)]\). We continue with some auxiliary lemmata used in the uniform coupling argument for the proof of
Proposition 3. The first of them shows that augmenting the increment law by an atom at zero (i.e., replacing \(F_{p}\) with \(\frac{1}{2}(\delta_{0}+F_{p})\)) changes the renewal measure only by a constant factor.
Lemma 2. Let \({\mathbf{U}}_{p}'\) and \({\mathbf{U}}_{p}''\) denote the renewal measures of \(S'\) and \(S''\) under \({\mathbf{P}}_{p}\), respectively. Then \[{\mathbf{U}}_{p}'\,=\,2{\mathbf{U}}_{p}\quad\text{and}\quad{\mathbf{U}}_{p}''\,=\,F_{p}^{*}*{\mathbf{U}}_{p}'\,=\,2\mu(p)^{-1}\lambda\lambda^{+}\] for each \(p\in (0,1)\).
Proof. Since \(S''\) has the same increment law as \(S'\) and delay distribution \(F_{p}^{*}\), only the first identity needs to be
verified. Let \(\varphi_{p}\) denote the Laplace transform of \(F_{p}\). Then the Laplace transform of \((\delta_{0}+F_{p})/2\) equals \((1+\varphi_{p})/2\). It follows that \({\mathbf{U}}_{p}'=\sum_{n\geq 0}2^{-n}(\delta_{0}+F_{p})^{*n}\) has Laplace transform \[\frac{1}{1-(1+\varphi_{p})/2}\;=\;\frac{2}{1-\varphi_{p}}\] which is also the Laplace transform of \(2{\mathbf{U}}_{p}\). ◻
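The identity \({\mathbf{U}}_{p}'=2{\mathbf{U}}_{p}\) can also be confirmed numerically without Laplace transforms. The following sketch is an illustration under our own assumptions (rational \(\vartheta=a/b\), lattice units of size \(1/b\), hypothetical function name): it sums the laws of \(S_{0},S_{1},\ldots\) directly, once for the original increments and once for the increments with the additional atom at zero, and truncates the series after `n_max` terms.

```python
def renewal_mass_series(p, a, b, K, lazy=False, n_max=500):
    """Approximate U({k}), k = 0..K, by summing the laws of S_0, ..., S_{n_max}.

    With lazy=True the increment law is (delta_0 + F_p) / 2 instead of F_p.
    """
    dist = [0.0] * (K + 1)
    dist[0] = 1.0  # law of S_0
    total = dist[:]
    for _ in range(n_max):
        new = [0.0] * (K + 1)
        for k, w in enumerate(dist):
            if w == 0.0:
                continue
            if lazy:  # zero jump with probability 1/2
                new[k] += 0.5 * w
                w *= 0.5
            if k + a <= K:
                new[k + a] += p * w
            if k + b <= K:
                new[k + b] += (1.0 - p) * w
        dist = new  # mass that has moved beyond K is dropped
        for k in range(K + 1):
            total[k] += dist[k]
    return total


if __name__ == "__main__":
    u = renewal_mass_series(0.4, 3, 10, 30)
    u_lazy = renewal_mass_series(0.4, 3, 10, 30, lazy=True)
    print(max(abs(y - 2.0 * x) for x, y in zip(u, u_lazy)))
```

The truncation is harmless here because the lazy walk stays below the level \(K\) for `n_max` steps only with negligible probability.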
Lemma 3. For all \({R}\geq 0\), \(n\in{\mathbb{N}}_{0}\) and \(p\in [\nu,1-\nu]\), \[\begin{gather}
N_{n}'\big([{R},{R}+\vartheta)\big)\;\leq\;(\sigma_{g({R})+1}'-\sigma_{g({R})}'){\mathbb{1}}_{\{\tau'({R})\leq n\}}\quad{\mathbf{P}}_{p}\text{-a.s.}\tag{22}
\shortintertext{and}
{\mathbf{E}}_{p}\big[N_{n}'\big([{R},{R}+\vartheta)\big)\big]\;\le\;2\,{\mathbf{P}}_{p}(\tau'({R})\leq n).\tag{23}
\end{gather}\]
Proof. By definition of \(S'\), the walk can only visit the interval \([{R},{R}+\vartheta)\) within \(n\) steps if \(\tau'({R})=\sigma_{g({R})}'\leq n\). In that case, \(S'\) will exit the interval at the next positive jump, which is of size at least \(\vartheta\).
Thus, 22 follows immediately. Since \(\big(\sigma_{g({R})+k}'-\sigma_{g({R})}'\big)_{k\geq 0}\) and \(\tau'({R})\) are independent and \({\mathbf{E}}_{p}\big[\sigma_{g({R})+1}'-\sigma_{g({R})}'\big]=2\), we obtain 23 by taking expectations in 22 . ◻
Remark 9. Since \(S'\) differs from the original walk \(S\) only by the inclusion of additional zero jumps, it is immediate that
\[\label{eq:tau40a4162tau3940a41}
{\mathbf{P}}_{p}\big(\tau'({R}) \leq n\big) \,\leq\, {\mathbf{P}}_{p}\big(\tau({R}) \leq n\big)\tag{24}\] for all \({R}\geq 0\), \(n \in {\mathbb{N}}_{0}\), and \(p \in [\nu, 1 - \nu]\). Moreover, we have \(\tau({R}) \geq {R}\) because the maximal jump size of \(S\) is \(1\) (see 13 ).
Remark 10. Since \[{\mathbf{E}}_{p}\big[N_{n}''\big([{R}, {R}+ \vartheta)\big)\big]
\;=\;\int_{(0,1]} {\mathbf{E}}_{p}\big[N_{n}'\big([{R}- x, {R}+ \vartheta - x)\big)\big] \, F_{p}^{*}({\rm d}x),\] the previous lemma, combined with 24 , implies that \[\begin{align}
{\mathbf{E}}_{p}\big[N_{n}''\big([{R}, {R}+ \vartheta)\big)\big]\;
&\leq\;\int_{(0,1]} 2\, {\mathbf{P}}_{p}\big(\tau'({R}- x) \leq n\big)\, F_{p}^{*}({\rm d}x)\;\leq\;2\, {\mathbf{P}}_{p}\big(\tau({R}- 1)\leq n\big)
\end{align}\] for all \({R}\geq 0\), \(n \in{\mathbb{N}}_{0}\), and \(p \in [\nu,1-\nu]\).
To formulate the next lemma, let \(({\mathcal{F}}_{n})_{n \geq 0}\) denote the canonical filtration of the bivariate random walk \((S_{n}', S_{n}'')_{n \geq 0}\). Note that
the \(\varepsilon\)-coupling time is a stopping time with respect to this filtration.
Lemma 4. For all \(\varepsilon> 0\), \({R}\geq 0\), and \(p \in [\nu, 1 - \nu]\), \[\label{eq:aux4461}
{\mathbf{E}}_{p}\big[N_{T_{\varepsilon}}'\big([{R}, {R}+ \vartheta)\big)\big]\;\leq\;2\,{\mathbf{P}}_{\bullet}(T_{\varepsilon} \geq {R}),\tag{25}\] where \({\mathbf{P}}_{\bullet}\) indicates that this
probability is independent of \(p\). As a consequence, \[\label{eq:aux4462}
\lim_{{R}\to \infty} \sup_{p \in [\nu, 1 - \nu]} {\mathbf{E}}_{p}\big[N_{T_{\varepsilon}}'\big([{R}, {R}+ \vartheta)\big)\big]\;=\;0,\tag{26}\] and the same uniform convergence holds for \({\mathbf{E}}_{p}\big[N_{T_{\varepsilon}}''([{R}, {R}+ \vartheta))\big]\).
Proof. From 22 , we know that \[N_{T_\varepsilon}'\big([{R}, {R}+ \vartheta)\big)\;\leq\;(\sigma_{g({R})+1}' - \sigma_{g({R})}')\, {\mathbb{1}}_{\{\tau'({R}) \leq
T_{\varepsilon}\}}\quad {\mathbf{P}}_{p}\text{-a.s. for all }p \in [\nu, 1 - \nu].\] Since \(\{\tau'({R}) \leq T_{\varepsilon}\} \in {\mathcal{F}}_{\tau'({R})}\) and the increment \(\sigma_{g({R})+1}' - \sigma_{g({R})}'\) is independent of \({\mathcal{F}}_{\tau'({R})}\), it follows that \[{\mathbf{E}}_{p}\big[N_{T_{\varepsilon}}'\big([{R}, {R}+ \vartheta)\big)\big]
\,\leq\, 2\, {\mathbf{P}}_{\bullet}\big(T_{\varepsilon} \geq \tau'({R})\big).\] Using the fact that \(\tau'({R})\geq{R}\) and that the law of \(T_{\varepsilon}\) under \({\mathbf{P}}_{p}\) does not depend on \(p\), we obtain 25 . The remaining statements follow immediately. ◻
We now have all the ingredients to prove Proposition 3.
Proof of Proposition 3. Fix any irrational \(\vartheta\), and recall that \[Q_{p}^{{R}}(\{\vartheta\})\;=\;{\mathbf{P}}_{p}(\xi_{\tau({R})}\;=\;\vartheta)\;=\;p\,{\mathbf{U}}_{p}\big([{R}- \vartheta, {R})\big)\;=\;1 - {\mathbf{P}}_{p}(\xi_{\tau({R})}\;=\;1).\] Further recalling 15 , we obtain \[\big|Q_{p}^{{R}}(\{\vartheta\}) - Q_{p}(\{\vartheta\})\big|\;=\;p \left| {\mathbf{U}}_{p}\big([{R}- \vartheta, {R})\big) - \frac{\vartheta}{\mu(p)} \right|.\] Hence, to show
\[\sup_{p \in [0,1]} \left| Q_{p}^{{R}}(\{\vartheta\}) - Q_{p}(\{\vartheta\}) \right| \xrightarrow{{R}\to \infty} 0,\] it suffices to establish that \({\mathbf{U}}_{p}([{R}- \vartheta, {R})) \to
\vartheta / \mu(p)\) uniformly in \(p\). By Lemma 1, we may restrict to \(p \in [\nu, 1 - \nu]\) for
arbitrary \(\nu \in (0, \tfrac{1}{2})\). Thus, it remains to prove \[\label{eq:to32be32shown}
\lim_{{R}\to \infty} \sup_{p \in [\nu, 1 - \nu]} \left| {\mathbf{U}}_{p}\big([{R}- \vartheta, {R})\big) - \frac{\vartheta}{\mu(p)} \right|\;=\;0.\tag{27}\]
To this end, fix any \(\varepsilon\in (0, \tfrac{\vartheta}{2})\). Observe that \[{\mathbf{U}}_{p}\big([{R}- \vartheta, {R})\big) - \frac{\vartheta}{\mu(p)}\;=\;\frac{1}{2} \Big(
{\mathbf{U}}_{p}'\big([{R}- \vartheta, {R})\big) - {\mathbf{U}}_{p}''\big([{R}- \vartheta, {R})\big) \Big),\] and recall that \({\mathbf{U}}_{p}''\;=\;\frac{2}{\mu(p)} \lambda\lambda^{+}\). We
estimate \[\begin{align}
{\mathbf{U}}_{p}'\big([{R}- \vartheta, {R})\big)\;
&=\;{\mathbf{E}}_{p} \big[N_{T_{\varepsilon}}'\big([{R}- \vartheta, {R})\big)\big] +\;{\mathbf{E}}_{p} \Bigg[ \sum_{n > T_{\varepsilon}} {\mathbb{1}}_{[{R}- \vartheta, {R})}(S_{n}') \Bigg] \\
&\leq\;{\mathbf{E}}_{p} \big[N_{T_{\varepsilon}}'\big([{R}- \vartheta, {R})\big)\big] +\;{\mathbf{E}}_{p} \Bigg[ \sum_{n > T_{\varepsilon}} {\mathbb{1}}_{[{R}- \vartheta - \varepsilon, {R}+ \varepsilon)}(S_{n}'') \Bigg] \\
&=\;{\mathbf{E}}_{p} \big[N_{T_{\varepsilon}}'\big([{R}- \vartheta, {R})\big)\big] -\;{\mathbf{E}}_{p} \big[N_{T_{\varepsilon}}''\big([{R}- \vartheta - \varepsilon, {R}+ \varepsilon)\big)\big] +\;{\mathbf{U}}_{p}''\big([{R}-
\vartheta - \varepsilon, {R}+ \varepsilon)\big) \\
&\leq\;o(1)\;+\;\frac{2(\vartheta + 2\varepsilon)}{\mu(p)}\quad \text{as } {R}\to \infty,
\end{align}\] where the \(o(1)\) term is uniform in \(p \in [\nu, 1 - \nu]\) by Lemma 4. Therefore,
\[\label{eq:limsup32estimate}
\limsup_{{R}\to \infty} \sup_{p \in [\nu, 1 - \nu]}
\left( {\mathbf{U}}_{p}\big([{R}- \vartheta, {R})\big) - \frac{\vartheta}{\mu(p)} \right)\;\leq\;\frac{2\varepsilon}{\mu(1-\nu)}.\tag{28}\]
A similar argument yields the lower bound \[\begin{align}
{\mathbf{U}}_{p}'\big([{R}- \vartheta, {R})\big)
&\geq\;{\mathbf{E}}_{p} \Bigg[ \sum_{n > T_{\varepsilon}} {\mathbb{1}}_{[{R}- \vartheta, {R})}(S_{n}') \Bigg] \\
&\geq\;{\mathbf{U}}_{p}''\big([{R}- \vartheta + \varepsilon, {R}- \varepsilon)\big) \; -\;{\mathbf{E}}_{p} \big[N_{T_{\varepsilon}}''\big([{R}- \vartheta + \varepsilon, {R}- \varepsilon)\big)\big] \\
&=\;\frac{2(\vartheta - 2\varepsilon)}{\mu(p)}\;-\;o(1)\quad \text{as } {R}\to \infty,
\end{align}\] again with uniform remainder \(o(1)\) in \(p \in [\nu, 1 - \nu]\). Thus, \[\label{eq:liminf32estimate}
\liminf_{{R}\to \infty} \inf_{p \in [\nu, 1 - \nu]}
\left( {\mathbf{U}}_{p}\big([{R}- \vartheta, {R})\big) - \frac{\vartheta}{\mu(p)} \right)
\;\geq\;-\frac{2\varepsilon}{\mu(1-\nu)}.\tag{29}\] Combining 28 and 29 , and noting that \(\varepsilon> 0\) was arbitrary, we
obtain 27 . This completes the proof of Proposition 3. ◻
3.2 Uniform \(L^p\)-convergence of \({R}/\tau({R})\)
Before giving the proof of Theorem 1, we state the second announced result.
Proposition 4. For any \(m \in {\mathbb{N}}\) and \(\beta > 0\), \[\label{eq0:uniform32key32renewal}
\lim_{{R}\to \infty} \sup_{p \in [0,1]} \left| {R}^{m} \,{\mathbf{E}}_{p}\left[ \frac{\xi_{\tau({R})}^{\beta}}{\big(\mu(p)\tau({R})\big)^{m}} \right] - {\mathbf{E}}_{p}[\xi_{\infty}^{\beta}] \right|\;=\;0,\qquad{(3)}\] and additionally, for
\(\beta = 0\), \[\label{eq1:uniform32key32renewal}
\lim_{{R}\to \infty} \sup_{p \in [0,1]} {\mathbf{E}}_{p} \left[\left|\frac{R^{m}}{\big(\mu(p)\tau({R})\big)^{m}}\,-\,1 \right| \right]\;=\;0,\qquad{(4)}\] which is equivalent to \[\label{eq2:uniform32key32renewal}
\lim_{{R}\to\infty}\sup_{p \in [0,1]}{\mathbf{E}}_{p} \left[\left| \frac{R}{\mu(p)\tau({R})}\,-\,1\right|^m \right]\;=\;0.\qquad{(5)}\]
Proof. For the proof of the first limit in Proposition 4, we note that \[{R}^{m}\,{\mathbf{E}}_{p}\left[\frac{\xi_{\tau({R})}^{\beta}}{\big(\mu(p)\tau({R})\big)^{m}}\right]\,-\,{\mathbf{E}}_{p}[\xi_{\infty}^{\beta}]
\;=\;{\mathbf{E}}_{p}\left[\xi_{\tau({R})}^{\beta} \left( \frac{{R}^{m}}{\big(\mu(p)\tau({R})\big)^{m}}\,-\,1 \right) \right]\,+\,{\mathbf{E}}_{p}[\xi_{\tau({R})}^{\beta}]\,-\,{\mathbf{E}}_{p}[\xi_{\infty}^{\beta}].\] Since \(0<\xi_{\tau({R})}\le 1\), applying the second limit in Proposition 4 (the case \(\beta=0\)) and Proposition 3, we obtain \[\sup_{p \in [0,1]} \left| {\mathbf{E}}_{p} \left[ \xi_{\tau({R})}^{\beta} \left(
\frac{{R}^{m}}{\big(\mu(p)\tau({R})\big)^{m}}\,-\,1 \right) \right] \right|
\;\leq\;\sup_{p \in [0,1]} {\mathbf{E}}_{p} \left[ \left| \frac{{R}^{m}}{\big(\mu(p)\tau({R})\big)^{m}}\,-\,1 \right| \right]\;\xrightarrow{{R}\to \infty}\;0,\] and \[\sup_{p \in [0,1]} \left|
{\mathbf{E}}_{p}[\xi_{\tau({R})}^{\beta}]\,-\,{\mathbf{E}}_{p}[\xi_{\infty}^{\beta}]\right|\;\xrightarrow{{R}\to \infty}\;0,\] so the first limit follows.
Since \({R}/ \tau({R}) \to \mu(p)\) \({\mathbf{P}}_p\)-a.s. for each \(p\), the equivalence of the second and third limits in Proposition 4 follows by a theorem of Riesz; see [23]. We now prove the third limit. From (13), we have for all \(m \in {\mathbb{N}}\) \[{\mathbf{E}}_{p} \left[ \left| \frac{R}{\mu(p)\tau({R})}\,-\,1 \right|^{m+1} \right]\;\leq\;C\,{\mathbf{E}}_{p} \left[ \left| \frac{R}{\mu(p)\tau({R})}\,-\,1 \right|^{m} \right]\] for some constant \(C > 0\). Hence, it suffices to consider even \(m\). Expanding the \(m\)-th power gives \[\begin{align}
{\mathbf{E}}_{p} \left[ \left( \frac{R}{\mu(p)\tau({R})} - 1 \right)^m \right]
\;=\;1\,+\,\sum_{k=1}^m (-1)^{m-k} \binom{m}{k} {\mathbf{E}}_{p} \left[ \frac{R^k}{\big(\mu(p)\tau({R})\big)^k} \right].
\end{align}\] Using \(\sum_{k=1}^m (-1)^{m-k} \binom{m}{k} = -1\), we see that it suffices to prove that for all \(k \in {\mathbb{N}}\),
\[\label{eq:Gerold321a}
\lim_{{R}\to \infty} \sup_{p \in [0,1]} \left| {\mathbf{E}}_{p}\left[ \frac{R^k}{\big(\mu(p)\tau({R})\big)^k} \right]\,-\,1 \right|\;=\;0.\tag{30}\] To this end, we stipulate that all subsequent convergence statements (including big O
symbols) are meant to hold uniformly in \(p\). We expand \(\tau({R})^{-k}\) to second order via Taylor’s theorem around \({\mathbf{E}}_{p}[\tau({R})]\), noting that the first-order term vanishes after taking expectations: \[\begin{align}
\label{eq:Gerold322}
{\mathbf{E}}_{p} \left[ \frac{R^k}{\big(\mu(p)\tau({R})\big)^k} \right]
\;=\;\frac{R^k}{\big(\mu(p){\mathbf{E}}_{p}[\tau({R})]\big)^k}\,+\, \frac{k(k+1)R^k}{2 \mu(p)^k} \cdot {\mathbf{E}}_{p} \left[ \frac{\big(\tau({R}) - {\mathbf{E}}_{p}[\tau({R})]\big)^2}{\zeta^{k+2}} \right],
\end{align}\tag{31}\] where \(\zeta\) is between \(\tau({R})\) and \({\mathbf{E}}_{p}[\tau({R})]\). For suitable \(0
< c_1 \le 1 \le c_2\), we have that \[{\mathbf{P}}_{p}(c_1 R \le \zeta \le c_2 R)\;=\;1 \quad \text{for all } R \ge 1, \; p \in [0,1].\] Thus, \[\begin{align}
\frac{\boldsymbol{Var}_{p}[\tau({R})]}{c_{2}^{k+2}R^{k+2}}\;\le\;{\mathbf{E}}_{p}\bigg[\frac{\big(\tau({R})-{\mathbf{E}}_{p}[\tau({R})]\big)^{2}}{\zeta^{k+2}}\bigg]\;\le\;\frac{\boldsymbol{Var}_{p}[\tau({R})]}{c_{1}^{k+2}R^{k+2}}.
\end{align}\] From Wald’s first identity, \[\label{eq46wald}
\mu(p){\mathbf{E}}_{p}[\tau({R})]\;=\;{\mathbf{E}}_{p}[S_{\tau({R})}]\;\in\;[R,R+1]\tag{32}\] Using this, an application of Wald’s second identity yields \[\begin{align}
\mu(p)^{2}\,\boldsymbol{Var}_{p}[\tau({R})]\;&=\;{\mathbf{E}}_{p}\Big[\Big((\mu(p)\tau({R})-S_{\tau({R})})+(S_{\tau({R})}-{\mathbf{E}}_{p}S_{\tau({R})})\Big)^2\Big]\\
&=\;{\mathbf{E}}_{p}\big[\big(S_{\tau({R})}-\mu(p)\tau({R})\big)^{2}\big]\,+\,{\mathbf{E}}_{p}\big[\big(S_{\tau({R})}-R+O(1)\big)^2\big]\\
& -\,2\,{\mathbf{E}}_{p}\big[\big(S_{\tau({R})}-\mu(p)\tau({R})\big)\big(S_{\tau({R})}-R+O(1)\big)\big]\\
&=\;\boldsymbol{Var}_{p}[\xi]\,{\mathbf{E}}_{p}[\tau({R})]\,+\,O(1)\,+\,O\Big(\sqrt{\boldsymbol{Var}_{p}[\xi]\,{\mathbf{E}}_{p}[\tau({R})]}\, \Big)\\
&=\;\boldsymbol{Var}_{p}[\xi]\frac{R}{\mu(p)}\,+\,O(R^{1/2}),
\end{align}\] where the Cauchy-Schwarz inequality has been used for the last two equalities to deduce \[\begin{align}
\Big|{\mathbf{E}}_{p}\Big[\big(S_{\tau({R})}&-\mu(p)\tau({R})\big)\big(S_{\tau({R})}-R+O(1)\big)\Big]\Big|\\
&\le\;\sqrt{{\mathbf{E}}_{p}\big[\big(S_{\tau({R})}-\mu(p)\tau({R})\big)^{2}\big]\,{\mathbf{E}}_{p}\big[\big(S_{\tau({R})}-R+O(1)\big)^{2}\big]}\\
&=\;\sqrt{\boldsymbol{Var}_{p}[\xi]\,{\mathbf{E}}_{p}[\tau({R})]}\,O(1)\;=\;O(R^{1/2})\quad\text{as }R\to\infty.
\end{align}\] Returning to 31 and using 32 , we conclude \[{\mathbf{E}}_{p} \left[ \frac{R^k}{\big(\mu(p)\tau({R})\big)^k}
\right]\;=\;\frac{R^k}{\big(\mu(p){\mathbf{E}}_{p}[\tau({R})]\big)^k}\,+\,O(R^{-1}) \;=\;1\,+\,O(R^{-1}) \quad \text{as } R \to \infty,\] uniformly in \(p \in [0,1]\), proving 30 ,
hence also ?? . This completes the proof of Proposition 4. ◻
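The Wald bound 32 and the convergence in 30 can be checked numerically for moderate \(R\). The following is a minimal sketch, assuming that under \({\mathbf{P}}_p\) the steps equal \(\vartheta\) with probability \(p\) and \(1\) with probability \(1-p\), so that \(\mu(p) = 1 - (1-\vartheta)p\), and that \(\tau(R) = \inf\{n : S_n \ge R\}\). The function names `tau_pmf` and `inverse_moment` are hypothetical and chosen only for illustration.

```python
from math import comb

def tau_pmf(R, theta, p):
    """Exact law of tau(R) = inf{n : S_n >= R}; steps are theta w.p. p and 1 w.p. 1 - p."""
    def survival(n):
        # P(tau(R) > n) = P(S_n < R), summed over the number j of small steps
        return sum(comb(n, j) * p**j * (1.0 - p)**(n - j)
                   for j in range(n + 1) if j * theta + (n - j) < R)
    pmf, n, prev = {}, 0, 1.0
    while prev > 0.0:          # survival(n) is exactly 0 once n * theta >= R
        n += 1
        cur = survival(n)
        pmf[n] = prev - cur
        prev = cur
    return pmf

def inverse_moment(R, theta, p, k):
    """E_p[(R / (mu(p) * tau(R)))**k], computed from the exact law of tau(R)."""
    mu = 1.0 - (1.0 - theta) * p   # assumed form of mu(p)
    return sum(w * (R / (mu * n))**k for n, w in tau_pmf(R, theta, p).items())

if __name__ == "__main__":
    theta = 2**-0.5                # an arbitrary irrational size parameter
    for R in (25, 50, 100):
        print(R, [inverse_moment(R, theta, 0.5, k) for k in (1, 2, 3)])
```

Since the law of \(\tau(R)\) is computed exactly, the printed values show how fast the \(k\)-th inverse moments approach \(1\) for one choice of \(\vartheta\) and \(p\); the sketch says nothing about uniformity in \(p\).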
In view of the strategy outlined in Subsection 2.4, we must verify conditions (8 – 11 ).
The condition \[\label{eq46rho}
\rho_{{R}}(x)\;=\;x\,+\,\frac{\rho(x)}{{R}}\,+\,o\left(\frac{1}{{R}}\right) \quad \text{as } {R}\to \infty\tag{33}\] from Theorem 1 implies that, for any \(n \in {\mathbb{N}}\), \[\begin{align}
{\mathbb{E}}_{x}[(X_{1}^{{R}}\,-\,x)^n]\;
&=\;\sum_{k=0}^{n} \binom{n}{k} {\mathbb{E}}_{x}\big[\big(X_{1}^{{R}}\,-\,\rho_{{R}}(x)\big)^k\big] \cdot \big(\rho_{{R}}(x)\,-\,x\big)^{n-k} \\
&=\;\sum_{k=0}^{n} \binom{n}{k} {\mathbb{E}}_{x}\big[\big(X_{1}^{{R}}\,-\,\rho_{{R}}(x)\big)^k\big] \cdot \left( \frac{\rho(x)}{R} \right)^{n-k}\,+\,o(R^{-1}) \quad \text{as } {R}\to \infty,
\end{align}\] where, throughout this section, all convergence statements involving \({\mathbb{E}}_{x}\) are understood to hold uniformly in \(x \in [0,1]\), and those involving \({\mathbf{E}}_{p}\) uniformly in \(p \in [0,1]\).
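As a consequence of the uniform convergence \(R(\rho_{R}(x) - x) \to \rho(x)\), only the terms \(k = n-1\) and \(k = n\) contribute at order \(R^{-1}\), since every term with \(k \le n-2\) carries a factor of order \(R^{-2}\). This gives the following expansion, valid as \({R}\to \infty\): \[\label{moment32formula}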
{\mathbb{E}}_{x}[(X_{1}^{R}\,-\,x)^n]\;=\;n\,{\mathbb{E}}_{x}\big[\big(X_{1}^{R}\,-\,\rho_{R}(x)\big)^{n-1}\big] \cdot \frac{\rho(x)}{R}
\,+\,{\mathbb{E}}_{x}[(X_{1}^{R}\,-\,\rho_{R}(x))^n]\,+\,o(R^{-1}).\tag{34}\]
Combining 35 with 34 will pave the way for the proof of Theorem 1, which is presented at the end of this
section.
Proposition 5. For any \(m \in {\mathbb{N}}\), we have \[\label{eq:moment32m}
{R}\bigg( {\mathbf{E}}_{p} \bigg[ \bigg( \frac{S_{\tau({R})}}{\tau({R})} \bigg)^m \bigg]\,-\,\mu(p)^m \bigg)\;=\;\frac{m(m+1)}{2} \mu(p)^{m-1} \, \mathbf{Var}_{p}[\xi]\,+\,O\big(R^{-1}\big),\qquad{(6)}\] where the \(O(R^{-1})\) term is uniform in \(p \in [0,1]\).
For the proof of this result, we require the following auxiliary lemma, which provides a somewhat tedious but useful expansion for the integral moments of the ratio \(\tau({R})^{-1} S_{\tau({R})}\).
Lemma 5. In the given notation, for all \(p \in [0,1]\) and \(m \in {\mathbb{N}}\), we have
\[\begin{align}
{\mathbf{E}}_{p}\bigg[\bigg(\frac{S_{\tau({R})}}{\tau({R})}\bigg)^{m}\bigg]
\;=\;\sum_{k=0}^{m} \sum_{\substack{\alpha_{1},\ldots,\alpha_{k} \ge 1,\;\beta \ge 0 \\ \alpha_{1}\,+\,\cdots\,+\,\alpha_{k}\,+\,\beta = m}}
\frac{m!}{\alpha_{1}!\cdots\alpha_{k}! \, \beta! \, k!} \, J_{m,k}^{(p)}(\alpha_{1},\ldots,\alpha_{k} \mid \beta), \label{eq:cases95123n125ew}
\end{align}\tag{36}\] where, with \(s_k \mathrel{\vcenter{:}}= x_1+\cdots+x_k\), the term \(J_{m,k}^{(p)}\) is defined as \[\begin{align}
&J_{m,k}^{(p)}(\alpha_{1},\ldots,\alpha_{k} \mid \beta)\\
&\mathrel{\vcenter{:}}=\;\int \cdots \int {\mathbf{E}}_{p} \Bigg[\xi_{\tau({R}- s_k)}^{\beta} \cdot \frac{ \prod_{j=0}^{k-1} \big(\tau({R}- s_k)\,+\,j\big)}{ \big(\tau({R}- s_k)\,+\,k\big)^{m} }\Bigg]\Bigg(\prod_{j=1}^{k} x_j^{\alpha_j}
\Bigg)\;F_{p}(\mathrm{d}x_k) \cdots F_{p}(\mathrm{d}x_1),
\end{align}\] with the convention that when \(k = 0\), the term reduces to \[J_{m,0}^{(p)}(m)\;=\;{\mathbf{E}}_{p}\bigg[\bigg( \frac{\xi_{\tau({R})}}{\tau({R})} \bigg)^m
\bigg].\]
Proof. Let \(n\in{\mathbb{N}}\). By the multinomial theorem we obtain \[\begin{align}
{\mathbf{E}}_{p}&\bigg[{\mathbb{1}}_{\{\tau({R})=n\}}\bigg(\frac{S_{\tau({R})}}{\tau({R})}\bigg)^{m}\bigg]\\
&=\;\sum_{k=0}^{m}\sum_{\alpha_{1},\ldots,\alpha_{k}\ge 1|\beta\ge 0\atop \alpha_{1}+\ldots+\alpha_{k}+\beta=m}\frac{m!}{\alpha_{1}!\cdots\alpha_{k}!\beta!}\sum_{1\le i_{1}<\ldots<i_{k}<n}{\mathbf{E}}_{p}\Bigg[{\mathbb{1}}_{\{S_{n-1}<{R}\le
S_{n}\}}\Bigg(\prod_{j=1}^{k}\xi_{i_{j}}^{\alpha_{j}}\Bigg)\cdot\frac{\xi_{n}^{\beta}}{n^{m}}\Bigg]\\
&=\;\sum_{k=0}^{m}\sum_{\alpha_{1},\ldots,\alpha_{k}\ge 1|\beta\ge 0\atop \alpha_{1}+\ldots+\alpha_{k}+\beta=m}\frac{m!}{\alpha_{1}!\cdots\alpha_{k}!\beta!}\,\binom{n-1}{k}J_{m,k,n}^{(p)}(\alpha_{1},\ldots,\alpha_{k}|\beta),
\end{align}\] where \[\begin{gather}
J_{m,k,n}^{(p)}(\alpha_{1},\ldots,\alpha_{k}|\beta)\;\mathrel{\vcenter{:}}=\;\int\!\!\cdots\!\!\int{\mathbf{E}}_{p}\Bigg[{\mathbb{1}}_{\{S_{n-1-k}<{R}-s_{k}\le S_{n-k}\}}\frac{\xi_{n-k}^{\beta}}{n^{m}}\Bigg]\Bigg(\prod_{j=1}^{k}
x_{j}^{\alpha_{j}}\Bigg)\,F_{p}(\mathrm{d}x_{k})\ldots F_{p}(\mathrm{d}x_{1})\\
=\;\int\!\!\cdots\!\!\int{\mathbf{E}}_{p}\Bigg[{\mathbb{1}}_{\{\tau({R}-s_{k})=n-k\}}\frac{\xi_{\tau({R}-s_{k})}^{\beta}}{(\tau({R}-s_{k})+k)^{m}}\Bigg]\Bigg(\prod_{j=1}^{k} x_{j}^{\alpha_{j}}\Bigg)\,F_{p}(\mathrm{d}x_{k})\ldots F_{p}(\mathrm{d}x_{1}).
\end{gather}\] Summing over all \(n \in {\mathbb{N}}\) yields 36 . ◻
The next auxiliary lemma provides the final ingredient in the proof of Proposition 5, namely suitable asymptotic expansions for the
functions \(J_{m,k}^{(p)}\).
Lemma 6. For any \(m,k \in {\mathbb{N}}\) with \(k \le m-2\), and for \(\alpha_1, \ldots, \alpha_k \in {\mathbb{N}}\), \(\beta \ge 0\) satisfying \(\alpha_1 + \cdots + \alpha_k + \beta = m\), we have \[\label{Jsmall}
J_{m,k}^{(p)}(\alpha_{1},\ldots,\alpha_{k}|\beta)\;=\;O(R^{-2}).\tag{37}\] Moreover, for each \(m \in {\mathbb{N}}\), the following asymptotics hold: \[\begin{gather}
J_{m,m}^{(p)}(1,1,\ldots,1|0) \;=\;\mu(p)^m \left( 1 - \frac{m(m+1)}{2R} \mu(p) \right)\,+\,O(R^{-2}), \tag{38} \\
J_{m,m-1}^{(p)}(1,1,\ldots,1|1)\;=\;\frac{\mu(p)^{m-1}\,{\mathbf{E}}_{p}[\xi^2]}{R}\,+\, O(R^{-2})\;=\;J_{m,m-1}^{(p)}(2,1,\ldots,1|0). \tag{39}
\end{gather}\]
Proof. Note first that \[\begin{align}
J_{m,k}^{(p)}(\alpha_{1},\ldots,\alpha_{k}|\beta)\;\le\;{\mathbf{E}}_{p}\Bigg[
\prod_{j=0}^{k-1} \left(1 - \frac{k-j}{\tau({R}- k) + k} \right)\cdot \frac{1}{\big(\tau({R}- k) + k\big)^{m - k}} \Bigg]\;\le\;(R - m)^{-(m - k)},
\end{align}\] which is of order \(O(R^{-2})\) for \(k \le m - 2\), uniformly in \(p\), hence proving 37 . To establish 38 and 39 , set \(\tau_k \mathrel{\vcenter{:}}= \tau(R-s_k)\). By definition of the functions \(J_{m,k}^{(p)}\), we obtain \[J_{m,m}^{(p)}(1,1,\ldots, 1\vert 0) = \int \cdots \int \mathbf{E}_p\bigg[ \frac{\prod_{j=0}^{m-1} (\tau_m+j)}{(\tau_m+m)^m} \bigg] \Big(\prod_{j=1}^m x_j\Big) F_p({\rm d}x_1) \ldots F_p({\rm d}x_m).\] By combining the
asymptotic expansion for large \(x\) \[\frac{\prod_{j=0}^{m-1} (x+j)}{(x+m)^m}=1-\frac{m(m+1)}{2 (x+m)}+ O\big((x+m)^{-2}\big),\] with Proposition 4, we get \[\mathbf{E}_p\bigg[ \frac{\prod_{j=0}^{m-1} (\tau_m+j)}{(\tau_m+m)^m} \bigg] = 1-\frac{m(m+1)\mu(p)}{2R}+O(R^{-2}),\] which proves 38 . Similarly, since \(\xi_{\tau_{m-1}} \le 1\) and \(\tau_{m-1}+m-1 \ge R-(m-1)\), \[\mathbf{E}_p\bigg[ \xi_{\tau_{m-1}}\frac{\prod_{j=0}^{m-2}(\tau_{m-1}+j)}{(\tau_{m-1}+m-1)^m}\bigg] = \mathbf{E}_p \bigg[ \frac{\xi_{\tau_{m-1}}}{\tau_{m-1}+m-1} \bigg] + O\big(R^{-2}\big),\] and, by applying Proposition 3, we conclude that \[\begin{align}
J_{m,m-1}^{(p)}(1,1,\ldots,1 \vert 1) =& \int \cdots \int \mathbf{E}_p\bigg[ \xi_{\tau_{m-1}} \frac{\prod_{j=0}^{m-2}(\tau_{m-1}+j)}{(\tau_{m-1}+m-1)^m} \bigg] \Big( \prod_{j=1}^{m-1} x_j \Big) F_p({\rm d}x_1) \cdots F_p({\rm d}x_{m-1}) \\
=&\frac{\mu(p)^{m-1} \mathbf{E}_p[\xi_\infty]}{R}+O(R^{-2})=\frac{\mu(p)^{m-1}\,{\mathbf{E}}_{p}[\xi^2]}{R}\,+\, O(R^{-2}).
\end{align}\] The asymptotic expansion for \(J_{m,m-1}^{(p)}(2,1,\ldots,1 \vert 0)\) follows analogously, which completes the proof. ◻
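The elementary expansion of \(\prod_{j=0}^{m-1}(x+j)/(x+m)^m\) used in this proof can be confirmed in exact rational arithmetic. The sketch below is only an illustration for integer \(x\); the helper names `ratio` and `scaled_remainder` are hypothetical. It rescales the remainder by \((x+m)^2\), which should stay bounded if the first-order term is correct.

```python
from fractions import Fraction

def ratio(x, m):
    """prod_{j=0}^{m-1} (x + j) / (x + m)**m as an exact fraction (x a positive integer)."""
    num = 1
    for j in range(m):
        num *= x + j
    return Fraction(num, (x + m)**m)

def scaled_remainder(x, m):
    """(x + m)**2 * (ratio - [1 - m(m+1) / (2(x+m))]); bounded in x if the expansion holds."""
    y = x + m
    first_order = 1 - Fraction(m * (m + 1), 2 * y)
    return (ratio(x, m) - first_order) * y**2

if __name__ == "__main__":
    for x in (10, 100, 1000, 10000):
        print(x, float(scaled_remainder(x, 3)))
```

Writing the ratio as \(\prod_{i=1}^{m}(1 - i/y)\) with \(y = x+m\) shows that the rescaled remainder tends to \(\sum_{i<i'} i\,i'\), which equals \(11\) for \(m = 3\).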
Proof of Proposition 5. Recall that all asymptotic expansions stated below are understood to hold uniformly in \(p\). Note also that for any \(m \in {\mathbb{N}}\), the symmetry of the integrals implies \[J_{m,m-1}^{(p)}(2,1,\ldots,1|0)\;=\;J_{m,m-1}^{(p)}(1,2,1,\ldots,1| 0)\;=
\cdots
=\;J_{m,m-1}^{(p)}(1,\ldots,1,2| 0).\] Using Lemmata 5 and 6, we expand \[\begin{align}
{\mathbf{E}}_{p}\Bigg[\bigg(\frac{S_{\tau({R})}}{\tau({R})}\bigg)^{m}\Bigg]
&=\;m \cdot J_{m,m-1}^{(p)}(1,\ldots,1|1)
\,+\,\frac{m(m-1)}{2} \cdot J_{m,m-1}^{(p)}(2,1,\ldots,1 \mid 0) \\
&\quad +\,J_{m,m}^{(p)}(1,\ldots,1|0) \,+\,O(R^{-2}) \\
&=\;\frac{m \mu(p)^{m-1} \, {\mathbf{E}}_{p}[\xi^2]}{R}
\,+\,\frac{m(m-1) \mu(p)^{m-1} \, {\mathbf{E}}_{p}[\xi^2]}{2R} \\
&\quad +\, \mu(p)^m \left(1 - \frac{m(m+1)}{2R} \mu(p) \right)
\,+\,O(R^{-2}) \\
&=\;\mu(p)^m
\,+\,\frac{m(m+1)\mu(p)^{m-1}}{2R}
\left( {\mathbf{E}}_{p}[\xi^2]\,-\,\mu(p)^2 \right)
\,+\,O(R^{-2}),
\end{align}\] which completes the proof. ◻
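For a numerical impression of Proposition 5, the joint law of \((S_{\tau(R)}, \tau(R))\) can be enumerated exactly for moderate \(R\). The following sketch assumes binary steps (\(\vartheta\) with probability \(p\), \(1\) with probability \(1-p\)), \(\mu(p) = 1-(1-\vartheta)p\) and \(\mathbf{Var}_p[\xi] = p(1-p)(1-\vartheta)^2\); all function names are hypothetical. It compares the left-hand side of the proposition with its limit and makes no claim about the size of the error term.

```python
from math import comb

def stopped_expectation(R, theta, p, f):
    """E_p[f(S_tau, tau)] for tau = inf{n : S_n >= R}, by enumerating pre-stopping states."""
    q = 1.0 - p
    total = 0.0
    for j in range(int(R / theta) + 2):            # j small steps taken so far
        for l in range(int(R) + 2):                # l large steps taken so far
            s = j * theta + l
            if s >= R:
                continue                           # not a pre-stopping state
            path = comb(j + l, j) * p**j * q**l    # P(walk visits this state)
            n = j + l + 1
            if s + theta >= R:                     # a small step crosses the level
                total += path * p * f(s + theta, n)
            if s + 1.0 >= R:                       # a large step crosses the level
                total += path * q * f(s + 1.0, n)
    return total

def prop5_lhs(R, theta, p, m):
    """R * (E_p[(S_tau / tau)**m] - mu(p)**m)."""
    mu = 1.0 - (1.0 - theta) * p
    return R * (stopped_expectation(R, theta, p, lambda s, n: (s / n)**m) - mu**m)

def prop5_limit(theta, p, m):
    """m(m+1)/2 * mu(p)**(m-1) * Var_p[xi]."""
    mu = 1.0 - (1.0 - theta) * p
    var = p * (1.0 - p) * (1.0 - theta)**2
    return 0.5 * m * (m + 1) * mu**(m - 1) * var

if __name__ == "__main__":
    theta = 2**-0.5
    for R in (50, 100, 200):
        print(R, prop5_lhs(R, theta, 0.5, 2), prop5_limit(theta, 0.5, 2))
```

The enumeration costs on the order of \(R^2/\vartheta\) operations, so it is practical only for small and moderate \(R\).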
We are now ready to prove the main result.
Proof of Theorem 1. As outlined in Section 2.4, the first step is to verify conditions (8
– 11 ). All convergence statements below are understood to hold uniformly in \(x \in [0,1]\).
We begin with the first moment. Setting \(n = 1\) in 34 , and using 35 together with Proposition 5, we find \[\begin{align} \label{first95moment}
R\,{\mathbb{E}}_{x}[X_{1}^{R} - x]\;&=\;- \frac{\boldsymbol{Var}_{\rho_{R}(x)}[\xi]}{1 - {\vartheta}}\,+\,\rho(x)\,+\,o(1)
\;=\; - \frac{\boldsymbol{Var}_{x}[\xi]}{1\,-\, {\vartheta}}\,+\,\rho(x)\,+\,o(1),
\end{align}\tag{40}\] where the second identity follows from the uniform convergence in 33 . This establishes 8 .
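The link between 40 and the drift of the limiting SDE is a one-line computation, which can be verified in exact arithmetic. The sketch below assumes that \(\xi\) takes the value \(\vartheta\) with probability \(x\) and \(1\) with probability \(1-x\) under the relevant law; the names `var_xi` and `limiting_drift` are hypothetical.

```python
from fractions import Fraction

def var_xi(p, theta):
    """Variance of xi when xi = theta w.p. p and xi = 1 w.p. 1 - p (assumed step law)."""
    mean = p * theta + (1 - p)
    second = p * theta**2 + (1 - p)
    return second - mean**2

def limiting_drift(x, theta, rho):
    """Right-hand side of (40) without the o(1) term."""
    return -var_xi(x, theta) / (1 - theta) + rho(x)

if __name__ == "__main__":
    x, theta = Fraction(1, 3), Fraction(1, 4)
    # Both expressions should agree: -(1 - theta) x (1 - x) + rho(x) with rho = 0
    print(limiting_drift(x, theta, lambda y: 0), -(1 - theta) * x * (1 - x))
```

Under this assumption \(\mathbf{Var}_x[\xi] = x(1-x)(1-\vartheta)^2\), so the first term in 40 reduces to \(-(1-\vartheta)x(1-x)\).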
Using the Taylor expansion argument from Section 2.4 and combining with (8 – 11 ), we conclude that for any \(f
\in C^4([0,1])\), \[\lim_{R \to \infty} \sup_{x \in [0,1]}
\lvert \mathcal{A}^R f(x) - \mathcal{A} f(x) \rvert\;=\;0.\]
Since the diffusion coefficient \(x \mapsto a(x) = x(1 - x)(1 - (1 - \vartheta)x)\) is non-negative, twice continuously differentiable, and vanishes at \(x\in\{0,1\}\), and since the
drift term \(x \mapsto d(x) = -(1 - \vartheta)x(1 - x) + \rho(x)\) is Lipschitz continuous with \(d(0) = \rho(0) = \beta_{0} \geq 0\) and \(d(1) = \rho(1) =
-\beta_1 \leq 0\), we may invoke [19] to conclude that \(X\) is Feller and that \(C^\infty([0,1])\) is a core for its generator \(\mathcal{A}\). The result then follows from [19]. ◻
We now briefly return to the model variant described in Subsection 2.2, where the stopping rule for each generation is to reject the first individual that would cause an overshoot of the available resources.
We have already noted that if \(\rho(x) \equiv 0\), the limiting diffusion model given by Theorem 2 is neutral. The intuitive
reason for this is the absence of the effect of the last individual. Specifically, the only reason small individuals experience a disadvantage in the original two-size Wright–Fisher model is the size-biased law of the stopping summand \(\xi_{\tau({R})}\), which does not apply in the variant, as the individual associated with \(\xi_{\tau({R})}\) is rejected.
The proof of Theorem 2 is very similar to the proof of Theorem 1, but instead of \(\tau({R})\), it requires considering the modification \[\overline{\tau}({R}) \mathrel{\vcenter{:}}= \inf \{ n \in \mathbb{N} : S_n > {R}\}.\] As the counterpart to 5 ,
we then have \[\label{l2}
{\mathbb{P}}_{x}\big(({\overline{X}}_{1}^{{R}},{\overline{M}}_{1}^{{R}})\in\cdot\big)\;=\;{\mathbb{P}}_{x}\Bigg(\bigg(\frac{1}{1-{\vartheta}}\Big(1-\frac{S_{{\overline{\tau}}({R})-1}}{{\overline{\tau}}({R})-1}\Big),{\overline{\tau}}({R})-1\bigg)\in\cdot\Bigg).\tag{42}\]
With the help of this relation, the expression \({R}\, \mathbb{E}_x [ ({\overline{X}}_1^{{R}} - x)^k]\) as \({R}\to \infty\) can, for \(k \in \{1, 2, 3,
4\}\), be analyzed in the same way as in the previous section, without the need for new arguments. For \(k \in \{2, 3, 4\}\), the same results as in the original model are obtained, and for \(k = 1\), we even have an explicit result, as the following lemma shows.
Lemma 7. For any fixed \({\vartheta}\in (0,1)\) and \(\rho(x) = 0\), \[\mathbb{E}_{x}\big[{\overline{X}}_{1}^{R} - x\big] = 0.\]
Proof. Since, by 42 , \[\mathbb{E}_{x}\big[ {\overline{X}}_{1}^{R} - x \big]
\;=\;\frac{1}{1 - {\vartheta}} \left(\mu\big(\rho_{R}(x)\big)\,-\,\mathbb{E}_{x}\left[ \frac{S_{\overline{\tau}(R) - 1}}{\overline{\tau}(R) - 1}\right] \right),\] it suffices to show that, for any \(p \in [0,1]\),
\[\label{eq:Stau-1}
\mathbb{E}_{p} \left[ \frac{S_{\overline{\tau}(R) - 1}}{\overline{\tau}(R) - 1} \right]\;=\;\mu(p).\tag{43}\] To this end, let \(\xi\) be a generic copy of the \(\xi_{i}\),
independent of all other random variables under each \(\mathbb{P}_{p}\). Then, \[\begin{align}
\mathbb{E}_{p} \left[ \frac{S_{\overline{\tau}(R) - 1}}{\overline{\tau}(R) - 1} \right]\;
&=\;\sum_{n\ge 1} \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{p} \big[ \xi_{i}\,\mathbf{1}_{\{\overline{\tau}(R) - 1 = n\}}\big] \\
&=\;\sum_{n\ge 1} \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{p} \big[\xi\,\mathbf{1}_{\{{\overline{T}}(R - \xi) = n\}}\big] \;=\;\sum_{n \ge 1}\mathbb{E}_{p} \big[\xi\,\mathbf{1}_{\{ {\overline{T}}(R - \xi) = n \}}\big]\;=\; \mu(p),
\end{align}\] as required. ◻
Let us note in passing that Lemma 7 can also be derived by observing that the sequence \((n^{-1}
S_{n})_{n \ge 1}\) forms a reverse martingale, and that \(\overline{\tau}(R)-1=\sup \{n\ge 0:S_{n}\le R\}\) is an associated reverse stopping time. This implies, see e.g. [24] for more details, that \[\mathbb{E}_{p} \left[ \frac{S_{\overline{\tau}(R) - 1}}{\overline{\tau}(R) - 1} \right]\;=\;
\mathbb{E}_{p}[S_1]\;=\;\mu(p),\] and thus we obtain 43 once again.
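Identity 43 is exact for every \(R \ge 1\), which makes it easy to test by enumeration. The sketch below assumes binary steps (\(\vartheta\) with probability \(p\), \(1\) with probability \(1-p\)) and \(\mu(p) = 1-(1-\vartheta)p\); the function name is hypothetical. It sums over the numbers \(j\) and \(l\) of small and large steps taken before the first overshoot.

```python
from math import comb

def mean_ratio_before_overshoot(R, theta, p):
    """E_p[S_N / N] with N = taubar(R) - 1 = sup{n : S_n <= R}, assuming R >= 1."""
    q = 1.0 - p

    def level(j, l):
        return j * theta + l                      # S_n after j small and l large steps

    total = 0.0
    for j in range(int(R / theta) + 2):
        for l in range(int(R) + 2):
            n = j + l
            if n == 0 or level(j, l) > R:
                continue
            path = comb(n, j) * p**j * q**l       # P(j small, l large among the first n steps)
            # probability that the next step overshoots R, i.e. that N = n
            stop = p * (level(j + 1, l) > R) + q * (level(j, l + 1) > R)
            total += path * stop * level(j, l) / n
    return total

if __name__ == "__main__":
    theta = 2**-0.5
    for p in (0.1, 0.5, 0.9):
        print(p, mean_ratio_before_overshoot(30, theta, p), 1.0 - (1.0 - theta) * p)
```

The two printed columns should agree up to floating-point error for any admissible \(R\), \(\vartheta\) and \(p\), in contrast to the original model, where the corresponding identity holds only asymptotically.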
We conclude with a brief discussion of the long-term behavior of the solution to SDE 2 and its interpretation in the context of the underlying two-size Wright–Fisher model. This analysis does not require new theoretical
developments, but instead relies on standard methods, as described in [25] and [26].
A key object in characterizing the long-term behavior is the scale function \(S(x)\), defined by \[\label{eq46scale}
S(x) \mathrel{\vcenter{:}}= \int_{x_{0}}^{x} \exp\left( -\int_{\eta}^{y} \frac{2\,d(z)}{\sigma^{2}(z)} \,{\rm d}z \right) {\rm d}y,\tag{44}\] where \(d\) and \(\sigma^{2}\) denote the
drift and diffusion coefficients, respectively, of the SDE under study, and \(x_{0}, \eta\) are arbitrary points in the interval \((0,1)\). In what follows, we use the scale function to
describe certain aspects of the long-term behavior of the SDE 2 .
We begin with the extinction probability of the small individuals as a function of their initial proportion, which is meaningful only in the absence of mutation, i.e., when \(\rho(x) = s(x)\,x(1 - x)\) for some Lipschitz
function \(s : [0,1] \to \mathbb{R}\). With this in mind, define \(T_a\), for \(a \in \{0,1\}\), as the first hitting time of \(a\) by the process \(X\). The case \(a = 0\) corresponds to extinction, and \(a = 1\) to fixation. By standard results for
one-dimensional SDEs (see [26]), we have \[\mathbb{P}_{x}(T_{0} < T_1) = \frac{S(1) - S(x)}{S(1) - S(0)},\]
where \(S(x)\) denotes the scale function. Substituting the drift and diffusion coefficients from the SDE 2 into the definition 44 of the scale function yields \[S(x)\;=\;\int_{x_{0}}^{x} \exp\left( 2 (1 - {\vartheta}) \int_{\eta}^{y} \frac{1}{1 - (1 - {\vartheta}) z} \, \mathrm{d}z
- 2 \int_{\eta}^{y} \frac{s(z)}{1 - (1 - {\vartheta}) z} \, \mathrm{d}z
\right) \, \mathrm{d}y.\] The first integral with respect to \(z\) can always be computed explicitly; the second depends on the specific form of the selection function \(s\). In the
case of genic selection, i.e., \(\rho(x) = s\,x(1 - x)\) with constant \(s\), the extinction probability is given by \[\label{eq46extinction}
\mathbb{P}_{x}(T_{0} < T_1)\;=\;
\begin{cases}
\displaystyle\frac{{\vartheta}^{-1 + 2s(1 - {\vartheta})^{-1}} - \big(1 - (1 - {\vartheta})x\big)^{-1 + 2s(1 - {\vartheta})^{-1}}}{{\vartheta}^{-1 + 2s(1 - {\vartheta})^{-1}} - 1} & \text{if } 1 - {\vartheta}\neq 2s, \\[10pt]
\displaystyle\frac{\ln({\vartheta}) - \ln\big(1 - (1 - {\vartheta})x\big)}{\ln({\vartheta})} & \text{if } 1 - {\vartheta}= 2s.
\end{cases}\tag{45}\]
Figure 6 illustrates the extinction probability 45 as a function of \(x\), the initial proportion of small individuals, for various values of \(s\) and \({\vartheta}\). In the absence of selection (\(s = 0\)), the extinction probability exceeds \(1 - x\), and this
disadvantage increases as the size parameter \({\vartheta}\) decreases. When \(s = 1 - {\vartheta}\), the model becomes neutral, and the extinction probability equals \(1 - x\) (see Figure 6 (b), where \(s = {\vartheta}= 0.5\)).
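The closed form 45 can be cross-checked against a direct quadrature of the scale function, which also covers a non-constant selection function \(s\). The following is a minimal sketch: the function names are hypothetical, the inner integrand is \(2(s(z) - (1-\vartheta))/(1-(1-\vartheta)z)\) as in the displayed expression for \(S(x)\), the base point \(\eta = 0\) is chosen for convenience (the ratio does not depend on it), and a plain trapezoidal rule is used rather than any particular library routine.

```python
import math

def extinction_prob(x, theta, s):
    """Closed form (45) for genic selection rho(x) = s * x * (1 - x)."""
    c = 1.0 - theta
    u = 1.0 - c * x
    if math.isclose(c, 2.0 * s):
        return (math.log(theta) - math.log(u)) / math.log(theta)
    g = -1.0 + 2.0 * s / c
    return (theta**g - u**g) / (theta**g - 1.0)

def extinction_prob_numeric(x, theta, s_fun, n=4000):
    """(S(1) - S(x)) / (S(1) - S(0)) by trapezoidal quadrature, for a general s_fun."""
    c = 1.0 - theta
    h = 1.0 / n
    grid = [i * h for i in range(n + 1)]
    # integrand of the inner integral: 2 d(z) / sigma^2(z) = 2 (s(z) - c) / (1 - c z)
    f = [2.0 * (s_fun(z) - c) / (1.0 - c * z) for z in grid]
    inner = [0.0]
    for i in range(1, n + 1):
        inner.append(inner[-1] + 0.5 * h * (f[i - 1] + f[i]))
    dS = [math.exp(-v) for v in inner]            # S'(y) with base point eta = 0
    cum = [0.0]
    for i in range(1, n + 1):
        cum.append(cum[-1] + 0.5 * h * (dS[i - 1] + dS[i]))
    k = round(x * n)                              # nearest grid point to x
    return (cum[n] - cum[k]) / cum[n]

if __name__ == "__main__":
    for s in (0.0, 0.5, 1.5, 2.0):
        print(s, extinction_prob(0.3, 0.5, s),
              extinction_prob_numeric(0.3, 0.5, lambda z, s=s: s))
```

For \(s = \vartheta = 0.5\) the closed form returns \(1 - x\), in line with the neutral case discussed above.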
Interestingly, for \(s \in \{1.5, 2\}\) and sufficiently large \(x\), the extinction probability no longer decreases with increasing \({\vartheta}\). To
explain this, assume first that \({\vartheta}\in [0,1]\) with \(s < 1\). In this regime, the drift term \[d(x)\;=\;\big(-(1 - {\vartheta}) + s\big) \, x(1 -
x)\] decreases as \({\vartheta}\) decreases. At the same time, the diffusion coefficient decreases as well, further limiting the process’s deviation from its drift. The combined effect – a stronger push toward \(0\) for \({\vartheta}< 1 - s\) and a weaker push toward \(1\) for \({\vartheta}> 1 - s\), along with reduced stochastic
noise – leads to an increased likelihood of extinction.
In contrast, when \(s > 1\) and \({\vartheta}\in [0,1]\), the drift is always positive, pushing the process toward \(1\). Decreasing \({\vartheta}\) weakens this drift, which might suggest, as before, that extinction becomes more likely. However, in this case the diffusion coefficient contains the additional factor \(1 - (1 -
{\vartheta})x\), which vanishes as \((1 - {\vartheta})x \to 1\). Consequently, if the process starts close to \(1\), the reduced noise – despite the weaker drift – makes it harder to
escape the vicinity of \(1\). This results in a lower probability of reaching \(0\).
Figure 6: The extinction probability \({\mathbb{P}}_{x}(T_{0} < T_1)\) from 45 for solutions of the SDE 2 with \(\rho(x) = s x(1 - x)\), for different values of \(s\). For each \(s\), a fixed set of values for \({\vartheta}\) is considered. The function \(f(x) = 1 - x\) (black, solid line) is plotted for reference. Panels: (a) \(s = 0\), (b) \(s = 0.5\), (c) \(s = 1.5\), (d) \(s = 2\).
Another quantity of interest in biological applications is the mean time to absorption, \({\mathbb{E}}_{x}[T_{0,1}]\), as a function of \(x\), where \(T_{0,1}
\mathrel{\vcenter{:}}= T_{0} \wedge T_1\). This is given by \[\label{eq46fixation95def}
{\mathbb{E}}_{x}[T_{0,1}] = \int_{0}^{1} G(x,\nu)\, \mathrm{d}\nu,\tag{46}\] where \(G(x, \nu)\) is the Green’s function, defined as \[\begin{align}
G(x, \nu) =
\begin{cases}
\displaystyle 2\,\frac{S(1) - S(x)}{S(1) - S(0)} \cdot \frac{S(\nu) - S(0)}{\sigma^2(\nu)\, S'(\nu)} &\text{if } 0 < \nu < x, \\[6pt]
\displaystyle 2\,\frac{S(x) - S(0)}{S(1) - S(0)} \cdot \frac{S(1) - S(\nu)}{\sigma^2(\nu)\, S'(\nu)} &\text{if } x < \nu < 1,
\end{cases}
\end{align}\] with \(S(x)\) the scale function from 44 .
Even in the case of genic selection, i.e., when \(d(x) = (-(1 - {\vartheta}) + s)x(1 - x)\), the integral in 46 generally cannot be evaluated analytically. However, in the absence of
selection (\(s = 0\)), a straightforward computation yields the explicit formula: \[\begin{align}
\label{eq46fix46analytic}
{\mathbb{E}}_{x}[T_{0,1}]\;=\;2 \ln(1 - x) \cdot \frac{{\vartheta}^{-1} - \big(1 - (1 - {\vartheta})x\big)^{-1}}{1 - {\vartheta}^{-1}} + 2 \ln(x) \cdot \frac{1 - \big(1 - (1 - {\vartheta})x\big)^{-1}}{1 - {\vartheta}}.
\end{align}\tag{47}\] A plot of this expression is shown in Figure 7. Although the integral in 46 cannot be solved in closed form for \(s \neq
0\), it can be evaluated numerically. We provide such a plot for the case \(s = 2\)—where the extinction probabilities exhibit non-monotonic dependence on \({\vartheta}\)—in the same
figure. In both cases, one observes that the expected time to absorption increases as \({\vartheta}\) decreases.
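A numerical evaluation of 46 along these lines can be sketched without external packages. The code below is an illustration only and not the computation behind the figure: it uses a midpoint rule instead of a library quadrature, fixes the scale function by \(S(0) = 0\) and \(S'(y) = (1-(1-\vartheta)y)^{-2+2s/(1-\vartheta)}\) for constant \(s\), and all function names are hypothetical.

```python
import math

def scale(y, theta, s):
    """Scale function with S(0) = 0 and S'(y) = (1 - c y)**(g - 1), g = -1 + 2 s / c."""
    c = 1.0 - theta
    g = -1.0 + 2.0 * s / c
    u = 1.0 - c * y
    if abs(g) < 1e-12:
        return -math.log(u) / c
    return (1.0 - u**g) / (c * g)

def mean_absorption_time(x, theta, s, n=4000):
    """Integral (46) of the Green's function, by the midpoint rule on (0, x) and (x, 1)."""
    c = 1.0 - theta
    g = -1.0 + 2.0 * s / c
    S1, Sx = scale(1.0, theta, s), scale(x, theta, s)

    def weight(v):
        # 1 / (sigma^2(v) S'(v)) with sigma^2(v) = v (1 - v) (1 - c v)
        return 1.0 / (v * (1.0 - v) * (1.0 - c * v)**g)

    def midpoint(f, a, b):
        h = (b - a) / n
        return h * sum(f(a + (i + 0.5) * h) for i in range(n))

    left = midpoint(lambda v: scale(v, theta, s) * weight(v), 0.0, x)
    right = midpoint(lambda v: (S1 - scale(v, theta, s)) * weight(v), x, 1.0)
    return 2.0 * (S1 - Sx) / S1 * left + 2.0 * Sx / S1 * right

def mean_absorption_time_s0(x, theta):
    """Closed form (47) for s = 0."""
    u_inv = 1.0 / (1.0 - (1.0 - theta) * x)
    return (2.0 * math.log(1.0 - x) * (1.0 / theta - u_inv) / (1.0 - 1.0 / theta)
            + 2.0 * math.log(x) * (1.0 - u_inv) / (1.0 - theta))

if __name__ == "__main__":
    for theta in (0.25, 0.5, 0.75):
        print(theta, mean_absorption_time(0.3, theta, 0.0),
              mean_absorption_time_s0(0.3, theta), mean_absorption_time(0.3, theta, 2.0))
```

The midpoint rule avoids evaluating the integrand at \(\nu \in \{0, 1\}\), where numerator and denominator both vanish; for \(s = 0\) the quadrature can be compared directly with 47.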
Figure 7: Mean time to absorption \({\mathbb{E}}_{x}[T_{0,1}]\) for the solution of the SDE 2 with drift term \(d(x) = (-(1 - {\vartheta}) + s)x(1 - x)\). Panels: (a) analytical result from 47 for the case \(s = 0\); (b) result for \(s = 2\), obtained via numerical integration (using the scipy package).
Finally, we briefly comment on the stationary distribution of \(X\) in the case where \(\beta_{0}, \beta_1 > 0\). According to [26], the density of the stationary distribution is given by the ratio \(m(x) / \int_{0}^1 m(x)\, \mathrm{d}x\), where \[m(x)\;\mathrel{\vcenter{:}}=\;\frac{1}{\sigma^2(x)\, S'(x)},\] and \(S(x)\) is again the scale function. In the setting \[\rho(x)\;=\;\beta_{0}(1 - x) - \beta_1 x
+ s\, x(1 - x)\] (which corresponds to genic selection favoring small individuals and bi-directional mutation), the density of the stationary distribution simplifies to \[C(\beta_{0}, \beta_1, \vartheta, s)\,
x^{2\beta_{0} - 1}\,
(1 - x)^{2\beta_1 {\vartheta}^{-1} - 1}\,
\big(1 - (1 - {\vartheta})x\big)^{-2\beta_{0} - 2\beta_1 {\vartheta}^{-1} - 2s(1 - {\vartheta})^{-1} + 1},\] where \(C(\beta_{0}, \beta_1, \vartheta, s) > 0\) is a normalizing constant. This constant can be
expressed in terms of hypergeometric functions and evaluated numerically.
Acknowledgments. The authors are very grateful to two anonymous reviewers for their careful reading and insightful comments, which significantly improved the manuscript. We also thank Ellen Baake for numerous helpful discussions and valuable insights.
Gerold Alsmeyer acknowledges financial support by the German Research Foundation (DFG) under Germany’s Excellence Strategy EXC 2044–390685587, Mathematics Münster: Dynamics–Geometry–Structure. Fernando Cordero and Hannah Dopmeyer gratefully acknowledge
financial support by the German Research Foundation (DFG) – Project-ID 317210226 – SFB1283.
[1]
R. A. Fisher. The Genetical Theory of Natural Selection. Oxford University Press, 1st edition, 1930.
[2]
S. Wright. Evolution in mendelian populations. Genetics, 16(2):97–159, 1931.
[3]
F. Cordero, A. González Casanova, and J. Schweinsberg. Two waves of adaptation: Speciation induced by dormancy in a model with changing environment. arXiv e-prints, 2024.
[4]
A. González Casanova and C. Smadi. On \(\Lambda\)-Fleming–Viot processes with general frequency-dependent
selection. J. Appl. Probab., 57(4):1162–1197, 2020.
[5]
A. González Casanova and D. Spanò. Duality and fixation in \({\Xi}\)-Wright–Fisher processes with frequency-dependent selection.
Ann. Appl. Probab., 28(1):250–284, 2018.
[6]
A. González Casanova, D. Spanò, and M. Wilke-Berenguer. The effective strength of selection in random environment. Ann. Appl. Probab., 35(1):701–748, 2025.
[7]
R. H. MacArthur and E. O. Wilson. The Theory of Island Biogeography, volume 1. Princeton University Press, 2001.
[8]
D. Tilman. Resource Competition and Community Structure. Monographs in Population Biology. Princeton Univ. Press, Princeton, NJ, 1st edition, 1982.
[9]
A. González Casanova, V. Miró Pina, and J. C. Pardo. The Wright–Fisher model with efficiency. Theor. Popul. Biol., 132:33–46, 2020.
[10]
A. González Casanova, J. C. Pardo, and J. L. Perez. Branching processes with interactions: subcritical cooperative regime. Adv. Appl. Probab., 53(1):251–278, 2021.
[11]
G. Berzunza Ojeda and J. C. Pardo. Branching processes with pairwise interactions. arXiv e-prints, 2024.
[12]
M. E. Caballero, A. González Casanova, and J. L. Perez. The relative frequency between two continuous-state branching processes with immigration and their genealogy. Ann. Appl.
Probab., 34(1B):1271–1318, 2024.
[13]
J. H. Gillespie. Natural selection for within-generation variance in offspring number. Genetics, 76(4):601–606, 1974.
[14]
A. A. Borovkov and S. G. Foss. Estimates for the excess of a random walk over an arbitrary boundary and their applications. Theory Probab. Appl., 44(2):249–277, 1999.
[15]
T. L. Lai. Uniform Tauberian theorems and their applications to renewal theory and first passage problems. Ann. Probab., 4(4):628–643, 1976.
[16]
T. Yamada and S. Watanabe. On the uniqueness of solutions of stochastic differential equations. J. Math. Kyoto Univ., 11:155–167, 1971.
[17]
E. Baake, L. Esercito, and S. Hummel. Lines of descent in a Moran model with frequency-dependent selection and mutation. Stoch. Process. Appl., 160:409–457,
2023.
[18]
W. J. Ewens. Mathematical Population Genetics I: Theoretical Introduction. Springer, New York, 2004.
[19]
S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. Wiley, New York, 1986.
[20]
O. Kallenberg. Foundations of Modern Probability. Probability Theory and Stochastic Modelling. Springer International Publishing, Cham, 3rd edition, 2021.
[21]
A. Gut. Stopped Random Walks: Limit Theorems and Applications. Springer Series in Operations Research and Financial Engineering. Springer, New York, 2nd edition, 2009.
[22]
H. Thorisson. Coupling, Stationarity, and Regeneration. Springer, New York, 1st edition, 2000.
[23]
H. Bauer. Measure and Integration Theory, volume 26 of de Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, 2001. Translated from the German by
Robert B. Burckel.
[24]
G. Alsmeyer. On first and last exit times for curved differentiable boundaries. Sequential Anal., 7(4):345–362, 1988.
[25]
R. Durrett. Probability Models for DNA Sequence Evolution. Probability and its applications. Springer, New York, 2nd edition, 2008.
[26]
A. Etheridge. Some Mathematical Models from Population Genetics. Lecture Notes in Mathematics. Springer, Heidelberg, 1st edition, 2011.