On the "Second" Kahn–Kalai Conjecture: cliques, cycles, and trees


Abstract

We prove a few simple cases of a random graph statement that would imply the "second" Kahn–Kalai Conjecture. Even these cases turn out to be reasonably challenging, and it is hoped that the ideas introduced here may lead to further interest in, and further progress on, this natural problem.

1 Introduction

For graphs \(G\) and \(J\), a copy of \(J\) in \(G\) is an (unlabeled) subgraph of \(G\) isomorphic to \(J\). (We will, a little abusively, use “\(G\supseteq H\)” to mean \(G\) contains a copy of \(H\).) We use \(N(G,J)\) for the number of such copies, and \(\mathbb{E}_pX_J\) for \(\mathbb{E}N(G_{n,p},J)\) (where \(G_{n,p}\) is the usual Erdős–Rényi random graph). See the end of this section for other definitions and notation.

For \(q \in [0,1]\), say a graph \(J\) is \(q\)-sparse if \[\mathbb{E}_qX_{I} \ge 1 \,\,\,\forall I\subseteq J.\]
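For orientation (this example is ours and is not used later), consider \(J=K_3\):

```latex
% Illustration: when is the triangle q-sparse? The subgraphs to check
% (beyond isolated vertices) are the edge, the two-edge path, and K_3:
\mathbb{E}_q X_{K_2} = \binom{n}{2}q, \qquad
\mathbb{E}_q X_{P_3} = 3\binom{n}{3}q^{2}, \qquad
\mathbb{E}_q X_{K_3} = \binom{n}{3}q^{3}.
% The last condition is the binding one, so (up to constant factors)
% K_3 is q-sparse precisely when q \gtrsim n^{-1}.
```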

We are interested here in the following conjecture from [1].

Conjecture 1. [1] There is a fixed \(K\) such that if \(H\) is \(q\)-sparse and \(p=Kq\), then \[N(H,F) < \mathbb{E}_pX_F \quad \forall F \subseteq H.\]

(Note the restriction “\(\subseteq H\)” is unnecessary, since \(N(H,F)=0\) unless \(F\subseteq H\).)

This simple statement is our preferred form of the conjecture of [1], which would imply the “second” Kahn–Kalai Conjecture [2]. We will not go into background here, just referring to the discussion in [1], but for minimal context we recall the original conjecture of [2], though it will not be needed below.

Define the threshold for \(H\)-containment, \(p_c(H)= p_c(n,H)\), to be the unique \(p\) for which \(\mathbb{P}(G_{n,p}\supseteq H)=1/2\), and set \[p_{\mathbb{E}}(H) =p_{\mathbb{E}}(n,H)=\min\{p:\mathbb{E}_pX_I \ge 1/2 ~ \forall I \subseteq H\}.\] This is essentially what [2] calls the expectation threshold, though the name was repurposed in [3]. It is, trivially, a lower bound on \(p_c(H)\) since, for any \(I\subseteq H\), \(\mathbb{P}(G_{n,p}\supseteq H) \leq \mathbb{P}(G_{n,p}\supseteq I) \leq \mathbb{E}_pX_I\). The “second Kahn–Kalai Conjecture” (so called in [4]), which was in fact the starting point for [2], is then

Conjecture 2. [2] There is a fixed \(K\) such that for any graph \(H\), \[p_c(H) < K p_{\mathbb{E}}(H) \log v_H.\]

(That this is implied by Conjecture 1 follows from the main result of [3]; again, see [1].) In the limited setting to which it applies, Conjecture 2 is considerably stronger than the main conjecture of [2] (called the “Kahn–Kalai Conjecture” in [5]), which is now a result of Pham and the third author [6].
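As a small concrete instance (our illustration, not used in what follows): for \(H=K_4\), the binding condition in the definition of \(p_{\mathbb{E}}\) comes from \(K_4\) itself.

```latex
% Our illustration: the expectation threshold of H = K_4.
\mathbb{E}_p X_{K_4} = \binom{n}{4}p^{6} \asymp n^{4}p^{6},
% which is at least 1/2 once p \gtrsim n^{-2/3}; every proper subgraph
% imposes a weaker condition (e.g. the triangle needs only p \gtrsim n^{-1}).
% Hence
p_{\mathbb{E}}(K_4) \asymp n^{-2/3},
% which in this case matches the classical threshold p_c(K_4) \asymp n^{-2/3}.
```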

At this writing the best we know in the direction of Conjecture 1 is the main result of [1], viz.

Theorem 3. [1] There is a fixed \(K\) such that if \(H\) is \(q\)-sparse and \(p=Kq\log^2 n\), then \[\label{eq.MT} N(H,F) < \mathbb{E}_pX_F \quad \forall F \subseteq H.\qquad{(1)}\] Furthermore, there is a fixed \(\alpha >0\) such that (1) holds if \(H\) is \(q\)-sparse with \(q=\alpha p\le 1/(3n)\).

In this paper we show that Conjecture 1 is correct for a few simple families of \(F\)’s, as follows. (Note that the sizes of these \(F\)’s can depend on \(n\).)

Theorem 4. There is a fixed \(L\) such that the following holds. Suppose \(H\) is \(q\)-sparse and \(p=Lq\). If \(F\) is a clique or a cycle, then \[N(H,F) < \mathbb{E}_pX_F.\]

Theorem 5. For any \(\Delta\), there exists an \(L=L(\Delta)\) such that if \(H\) is \(q\)-sparse, \(p=Lq\), and \(F\) is a tree with maximum degree \(\Delta\), then \[N(H,F) < \mathbb{E}_pX_F.\]

Remarks. (a) It is easy to see (see [1]) that if Conjecture 1 is true for each component of \(F\) then it is true for \(F\); in particular Theorem 5 implies the conjecture for forests as well as trees.

(b) Even the above elementary cases are, to date, not so easy, and the present work is meant partly to highlight this, and partly to give some first ideas on how to proceed. One may of course wonder whether this (seeming) difficulty is telling us the conjecture is simply wrong, but (and somewhat contrary to our initial opinion) we now tend to think it is true.

Outline and preview. Section 2 includes definitions and a few initial observations, following which the clique portion of Theorem 4, Theorem 5, and the cycle portion of Theorem 4 are proved in Sections 3, 4 and 5 respectively. Of these:

Cliques are our easiest case and may serve as a warm-up for what follows. Theorem 4 for cycles is postponed to Section 5 since it depends on the result for paths, a first case of Theorem 5. While the proof of 27 seems to us quite interesting (as does the fact that getting from paths to cycles seems not at all immediate), we regard the proof of Theorem 5 as the heart of the paper. Here it may be helpful to think of the (prototypical) case of paths. A simpler argument for even this very simple case would be welcome, as (of course) would be a proof of Theorem 5 without the degree restriction.

Usage

For a graph \(J\) we use \(v_J\) and \(e_J\) for \(|V(J)|\) and \(|E(J)|\), and \(\Delta _J\) for the maximum degree in \(J\). The identity of \(H\) (in Theorems 4 and 5) is fixed throughout, and we often use copy of \(J\) for copy of \(J\) in \(H\).

As usual, \(J[U]\) is the subgraph of \(J\) induced by \(U\subseteq V(J)\), and \(v\sim w\) denotes adjacency of \(v,w\in V(J)\). For \(A, B \subseteq V(J)\) (here always disjoint), \(\nabla_J(A, B):=\{\{v,w\} \in E(J):v \in A, w \in B\}\) and \(\nabla_J(A):=\nabla_J(A, V(J) \setminus A)\). We also use \(\nabla_J(v)\) for \(\nabla_J(\{v\})\) (and similarly for \(\nabla_J(v,\cdot)\)) and \(d_J(v)=|\nabla_J(v)|\).

Recall (see e.g. [7]) that the density of a graph \(J\) with \(v_J \neq 0\) is \(d(J)=e_J/v_J\), and the maximum density of \(J\) is \(m(J) =\max\{d(I):I \subseteq J\}\).
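A small example (ours) showing that \(m(J)\) may strictly exceed \(d(J)\):

```latex
% Example: let J be K_4 with one pendant edge attached, so v_J = 5, e_J = 7.
d(J) = \tfrac{7}{5} \;<\; \tfrac{3}{2} = d(K_4) = m(J),
% i.e. the maximum density of J is attained on the proper subgraph K_4.
```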

Throughout the paper, \(\log\) means \(\log_2\). For positive integers \(a\) and \(b\), we use \([a]=\{1,2, \ldots a\}\), \([a,b]=\{a, a+1, \ldots, b\}\), and \((a)_b=a(a-1)\cdots(a-b+1)\). We make no effort to keep our constant factors small, and, in line with common practice, often pretend large numbers are integers.

2 Preliminaries

Note that, in proving Conjecture 1, we may assume \(n\) is somewhat large, since otherwise the conjecture is vacuous for large enough \(L\). We may also assume that \(L\) is somewhat large, so \[\label{qL} q =p/L \le 1/L\tag{1}\] is somewhat small.

We will make occasional, usually tacit, use of the familiar fact that for positive integers \(a,b\), \[\label{elementary} (a)_b > (a/e)^b.\tag{2}\]
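For completeness, here is the standard one-line derivation of (2), via \(\binom{a}{b}\ge (a/b)^b\) and \(b!>(b/e)^b\):

```latex
(a)_b \;=\; \binom{a}{b}\,b!
\;\ge\; \Big(\frac{a}{b}\Big)^{\!b}\, b!
\;>\; \Big(\frac{a}{b}\Big)^{\!b}\Big(\frac{b}{e}\Big)^{\!b}
\;=\; \Big(\frac{a}{e}\Big)^{\!b}.
```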

Proposition 6. If \(H\) is \(q\)-sparse, then \(\Delta _H\le \max\{\log n, 2enq\}.\) In particular, if \(q\ge \log n/n\), then \(\Delta _H\le 2enq\).

Proof. If \(R\) is a \(k\)-star with \(k> \max\{\log n, 2enq\}\), then (using (2) for the second inequality) \[\mathbb{E}_qX_R < n \binom{{n}}{{k}}q^k < n\left(\frac{enq}{k}\right)^k< n 2^{-k}<1,\] so \(R\not\subseteq H\). ◻

Proposition 7. If \(H\) is \(q\)-sparse, then \(m(H)<\log n\). If in addition \(q \le n^{-c}\), then \(m(H)< 1/c\).

Proof. If \(d(R)\ge \log n\) (that is, \(e_R \ge v_R\log n\)), then \[\mathbb{E}_qX_R < n^{v_R}q^{e_R}\; \le \left(nq^{\log n}\right)^{v_R} \le (nL^{-\log n})^{v_R}<1\] (see (1)); so \(H\not\supseteq R\).

Similarly, if \(q\le n^{-c}\) and \(d(R) \geq 1/c\), then \[\mathbb{E}_qX_R < n^{v_R}q^{e_R}\le \left(nq^{1/c}\right)^{v_R}\le 1\] (so \(H\not\supseteq R\)). ◻

Corollary 8. If \(H\) is \(q\)-sparse, then \(e_H< n\log n\), and \(e_H < n/c\,\) if \(\,q \le n^{-c}\).

We denote by \(\nu(H,J)\) the maximum size of an edge-disjoint collection of copies of \(J\) in \(H\). The following simple observation will be important.

Proposition 9. If \(H\) is \(q\)-sparse, then for any \(J\), \(\nu(H,J) \le e\mathbb{E}_qX_J.\)

This is helpful because (roughly): trivially, \[\label{N.nu.B} N(H,J)\le \nu(H,J)\cdot B\tag{3}\] for any bound \(B\) on the number of copies of \(J\) (in \(H\)) sharing an edge with a given copy; and possible bounds \(B\) should be better than bounds on \(N(H,J)\) itself, since the number of starting points for a copy of \(J\) meeting a given copy is at most \(v_J\), rather than the usually much larger \(n\). This idea plays a main role below, and again in [1], which was inspired in large part by the ideas introduced here.

Proof. Let \(R\) be the edge-disjoint union of \(\nu\) copies of \(J\), with \(\nu>e\mathbb{E}_qX_J\). Then \[\mathbb{E}_qX_R =N(K_n,R)q^{e_R}\le {N(K_n,J) \choose \nu} q^{\nu\cdot e_J}< \left(\frac{eN(K_n,J)q^{e_J}}{\nu}\right)^{\nu}=\left(\frac{e\mathbb{E}_qX_J}{\nu}\right)^\nu < 1,\] so \(H\not\supseteq R\). ◻
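To spell out the (easy) reason (3) holds:

```latex
% Why (3) is "trivial": let J_1, \ldots, J_\nu (with \nu = \nu(H,J)) be a
% maximum edge-disjoint family of copies of J in H. By maximality, every
% copy of J in H shares an edge with some J_i, so
N(H,J) \;\le\; \sum_{i=1}^{\nu}
\#\{\text{copies of } J \text{ sharing an edge with } J_i\}
\;\le\; \nu(H,J)\cdot B.
```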

3 Cliques

Here we prove the clique portion of Theorem 4. Recall that \(p=Lq\), with \(L\) fixed and somewhat large (large enough to support the assertions below), and let \(F=K_{r+1}\) (\(r \in [2,n-1]\)) (noting that \(r =1\), which could easily be included here, is immediate from Proposition 9). We divide possibilities for \(q\) into two ranges, for which we use different arguments.

Small \(q\). Suppose \[\label{eq:p.ub} q< n^{-2/(r+1)}.\tag{4}\]

This is our first use of the strategy sketched following Proposition 9; the desired bound \(B\) (in (3)) is provided by the next observation.

Lemma 10. If \(H\) is \(q\)-sparse (with \(q\) as in (4)) and \(K\) is a copy of \(F\) (in \(H\)), then the number of copies of \(F\) that share edges with \(K\) is less than \((er)^{r+1}\).

Proof. Let \(R\) be the union of the copies of \(F\) that share edges with \(K\). We show that \(v_R\) can’t be too large, and, given this, use the crudest possible bound on \(N(R,F)\).

Set \(K=R_0\) and choose copies \(R_1, R_2,\ldots ,R_m\) of \(F\) that share edges with \(K\) and satisfy \[E(R_i) \not\subseteq\cup_{j<i} E(R_j) \,\,\forall i \in [m]\,\,\,\, \text{and} \,\,\,\,\cup_{i=0}^m R_i=R.\] We claim that \[\label{mler} m\le r.\tag{5}\]

Proof. Set \(v_i=|V(R_i)\setminus\cup_{j <i} V(R_j)|\) and \(e_i=|E(R_i) \setminus \cup_{j <i} E(R_j)|\). Then (since \(H\) is \(q\)-sparse, and using (4) for the third inequality) \[\label{eq:expR} 1\le \mathbb{E}_qX_R< n^{v_R}q^{e_R}= n^{r+1}q^{\binom{{r+1}}{{2}}}\prod_{i=1}^m \left(n^{v_i}q^{e_i}\right) < n\prod_{i=1}^m \left(n^{v_i}q^{e_i}\right).\tag{6}\] Again by (4), we have \(n^{v_i}q^{e_i} < n^{-2/(r+1)}\) if \(v_i = 0\), while \(v_i \in [r-1]\) gives \(e_i \ge {{r+1} \choose 2}-{r+1 -v_i \choose 2}\) and \[n^{v_i}q^{e_i} < n^{v_i -2\left({{r+1} \choose 2} - {r+1 -v_i \choose 2}\right)/(r+1)}.\] Here the exponent on the r.h.s. is maximized (over \(v_i \in [r-1]\)) at \(v_i = 1\) and \(v_i = r-1\), yielding \[n^{v_i}q^{e_i} < n^{-1 + 2/(r+1)}.\] So in any case, \(n^{v_i}q^{e_i} < n^{-1/(r+1)}\) (since \(r \geq 2\)), and the r.h.s. of (6) is less than \(n \cdot n^{-m/(r+1)},\) yielding (5). ◻

Thus \(v_R \leq r^2+1\) (say) and \(N(R,F) \leq \binom{{r^2+1}}{{r+1}}< (er)^{r+1}.\) ◻

Finally, the combination of Proposition 9 and Lemma 10 gives (for slightly large \(L\)) \[N(H,F)\le \nu(H,F) \cdot (er)^{r+1} \le e\cdot \mathbb{E}_qX_F(er)^{r+1}<L^{{r+1} \choose 2}\mathbb{E}_qX_F= \mathbb{E}_pX_F,\] so we have Theorem 4 in this case. ◻

Large \(q\). Now suppose \[q \ge n^{-2/(r+1)}.\] Then \[\label{eq:lb1} \mathbb{E}_pX_F= \binom{{n}}{{r+1}}p^{\binom{{r+1}}{{2}}} \ge \left(\frac{np^{r/2}}{r+1}\right)^{r+1} \ge n\left(\frac{L^{r/2}}{r+1}\right)^{r+1}>n\cdot (L/4)^{r(r+1)/2}.\tag{7}\] We will argue by contradiction, showing that if \(N(H,F) \ge \mathbb{E}_pX_F\), then there is an \(R \subseteq H\) with \(\mathbb{E}_qX_R<1\).

Let \[\label{eq:a.def} a=\mathbb{E}_pX_F/n~\left(>(L/4)^{r(r+1)/2}\right);\tag{8}\] so we are assuming \[N(H,F)\ge an.\]

Recall that a hypergraph, \(\mathcal{H}\), on (vertex set) \(V\) is a collection of subsets (edges) of \(V\); degree for hypergraphs is defined as for graphs. We recall a standard fact:

Observation 11. Let \(\mathcal{H}\) be the hypergraph on \(V(H)\) whose edges are the vertex sets of copies of \(F\) in \(H\). With \(a\) as in (8), there is a \(W \subseteq V(H)\) such that \[\label{eq:min.deg} \mathcal{H}[W] \text{ has minimum degree at least } a.\tag{9}\]

Proof. Set \(\mathcal{H} _0=\mathcal{H}\) and for \(i\geq 1\) until no longer possible, let \(\mathcal{H} _i\) be gotten from \(\mathcal{H} _{i-1}\) by removing a vertex of degree less than \(a\) (and the edges containing it). The final hypergraph is nonempty (since we delete fewer than \(an\le N(H,F)=|\mathcal{H} |\) edges) and has minimum degree at least \(a\). ◻

Fix \(W\) as in Observation 11 and set \(R=H[W]\); so each vertex of \(R\) is contained in at least \(a\) copies of \(F\) in \(R\). Write \(\delta\) for the minimum degree in \(R\) and \(w\) for \(|W|\) (\(=v_R\)). Then \[\mathbb{E}_qX_R< n^wq^{e_R}\le \left(nq^{\delta /2}\right)^w,\] so we will have the desired contradiction \(\mathbb{E}_qX_R <1\) if we show \[q^{\delta /2}<1/n.\] To this end, we find a suitable lower bound on \(\delta\) and upper bound on \(q\).

For the first of these, our choice of \(W\) and definition of \(\delta\) give \(a \le {\delta \choose r}< \left(\frac{e\delta }{r}\right)^r\), so \[\label{gd.lb}\delta > ra^{1/r}/e.\tag{10}\]

For an upper bound on \(q\), in view of (8) and (7), we have \(an=\mathbb{E}_pX_F > \left(\frac{np^{r/2}}{r+1}\right)^{r+1}> \left(nq^{r/2}\right)^{r+1}\) (provided \(L^{r/2}> r+1\)), whence \[\label{q.ub} q<(a^{1/r}/n)^{2/(r+1)}.\tag{11}\]

Since \(R\subseteq H\), Proposition 7 promises \(\delta < 2\log n\), which with (10) gives \[\label{eq:a.ub} a^{1/r}< 2e\log n/r;\tag{12}\] and inserting this in (11) (and again using (10)) we have (with room) \[q^{\delta /2}<\left(\frac{2e\log n}{rn}\right)^{\delta /(r+1)}< \left(\frac{2e\log n}{rn}\right)^{a^{1/r}/(2e)} < 1/n,\] where the last inequality uses \(a >(L/4)^{r(r+1)/2}\) (see (8)).

This completes the proof of Theorem 4 for cliques.

4 Trees

Here we prove Theorem 5. We now use \(\varepsilon\) for what will be \(1/L\); so \(\varepsilon\) is a function of \(\Delta =\Delta _F\) and \(q= \varepsilon p\). We assume \(\varepsilon\) is small enough to support what we do, and, as usual, don’t try to give it a good value.

Let \(F\) be a tree, say with \(e_F=j\,\) (\(\in [n-1]\)), and set \[d=np.\]

It will be convenient to work with labeled copies (a labeled copy of \(J\) in \(G\) being an injection from \(V(J)\) to \(V(G)\) that takes edges to edges). We use \(\tilde{N}(G,J)\) for the number of labeled copies of \(J\) in \(G\) and \(\mathbb{E}_p\tilde{X}_J\) for \(\mathbb{E}\tilde{N}(G_{n,p},J)\). Then \(N(H,F)=\tilde{N}(H,F) / aut(F)\) and \(\mathbb{E}_p X_F = \mathbb{E}_p\tilde{X}_F/aut(F)\) (where \(aut(\cdot):=|Aut(\cdot)|\)), and the inequality of Theorem 5 is the same as \(\tilde{N}(H,F) < \mathbb{E}_p \tilde{X}_F\); so, since \[\label{EEptd} \mathbb{E}_p\tilde{X}_F=(n)_{j+1}p^j > e^{-(j+1)}nd^j\tag{13}\] (see (2)), the theorem will follow from \[\label{goal:tree} \tilde{N}(H,F) \le \varepsilon^{0.1j}nd^j\tag{14}\] (provided \(\varepsilon< e^{-20}\)), which is what we will prove.
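For instance (a sanity check of the labeled/unlabeled bookkeeping, ours): for the two-edge path \(P_3\) we have \(aut(P_3)=2\) (the two endpoints may be swapped), so

```latex
\tilde{N}(G,P_3) = 2\,N(G,P_3),
\qquad
\mathbb{E}_p\tilde{X}_{P_3} = (n)_3\,p^{2} = 2\,\mathbb{E}_p X_{P_3}.
```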

4.1 Set-up and definitions.

Let \(V(F)=\{v_0, \ldots, v_j\}\), where we think of \(F\) rooted at \(v_0\) and \((v_1,\ldots ,v_j)\) is some breadth-first order. Let \(f_i\) be the number of children of \(v_i\) (so \(f_i\) is \(d_F(v_i)\) if \(i=0\) and \(d_F(v_i) - 1\) otherwise).

Before turning to the main line of argument, we dispose of two easy cases.

Proposition 12. The inequality in (14) holds if \(d \le 1/(3\varepsilon)\) or \(d \ge\varepsilon^{-1/3}\log n\).

Proof. The assertion for \(d \le 1/(3\varepsilon)\) follows from (the second part of) Theorem 3. (In more detail: assume \(\varepsilon< \alpha ^{10/9}\) with \(\alpha\) as in the theorem, and let \(p'=q/\alpha\); so \(\alpha p'=q \le 1/(3n)\) and \(p' < \varepsilon^{0.1}p\), implying \(N(H,F)< \mathbb{E}_{p'}X_F < \varepsilon^{0.1j}\mathbb{E}_pX_F\).)

If, on the other hand, \(d \ge \varepsilon^{-1/3}\log n\), then Proposition 6 gives \[\Delta _H \le \max\{\log n, 2enq\} \le \max\{\varepsilon^{1/3}d,2e\varepsilon d\}=\varepsilon^{1/3}d ,\] whence \[\tilde{N}(H,F) \le 2e_H \Delta _H^{j-1} \le 2n\log n \cdot (\varepsilon^{1/3}d)^{j-1} \le 2\varepsilon^{j/3}nd^j\] (where \(2e_H\) bounds the number of embeddings of \(v_0v_1\), \(\Delta _H^{j-1}\) bounds the number of ways to extend to the rest of \(F\), and the second inequality uses Corollary 8); so we have (14). ◻

So we assume from now on that \[\label{WMAd} 1/(3\varepsilon) < d <\varepsilon^{-1/3}\log n.\tag{15}\]

Remark. With small modifications, the following argument goes through without the lower bound in (15), and, in cases where \(q < 1/n\), without the bounded degree assumption in Theorem 5. In particular, since for \(q<1/n\) a \(q\)-sparse \(H\) is acyclic, this gives an alternate proof of a slight strengthening of the second part of Theorem 3 (namely, replacing \(q<1/(3n)\) by \(q<1/n\)), which, strangely, we don’t see how to squeeze out of the argument in [1].

Definition 13 (Legal degree sequence). Say \(\underline d=(d_0, \ldots, d_{j})\) is legal if, for all \(i \in [0, j]\), either \(d_i \ge \sqrt \varepsilon d\) (\(i\) is big) or \(d_i=f_i\) (\(i\) is small).

Note that \[\label{figD} f_i \leq \Delta < \sqrt \varepsilon d \,\,\,\forall i,\tag{16}\] since (15) gives \(\sqrt \varepsilon d > 1/(3 \sqrt \varepsilon)\), which we may assume is greater than \(\Delta\); so “big” and “small” do not overlap. From now on \({\underline{d}}\) is always a legal degree sequence.

Definition 14 (Partially labeled \(R\)). For a legal \(\underline d\), define \(\mathcal{R} _{\underline{d}}\) to be the set of partially labeled graphs \(R\) that consist of

(i) \(F\) (with its labels) plus

(ii) for each \(i \in [0, j]\), \(d_i\) edges joining \(v_i\) to vertices not in \(\{v_0, \ldots, v_{i-1}\}\) (which may still be in \(F\) but should be thought of as mostly new); vertices of \(R\setminus F\) are unlabeled.

A copy (in \(H\)) of such an \(R\) is then partially labeled in the same way.

Set \(\mathcal{R} =\cup \mathcal{R} _{\underline{d}}\). We use \(\hat{R}\) for a copy of \(R\), \(\hat{F}\) for a (labeled) copy of \(F\), \(\hat{\mathcal{R}} _{\underline{d}}\) for the set of copies of \(R\)’s in \(\mathcal{R} _{{\underline{d}}}\), and \(\hat{\mathcal{R} }=\cup \hat{\mathcal{R} }_{{\underline{d}}}\). We write \(\hat{R} \sim \hat{F}\) if \(\hat{F}\) is the "\(F\)-part" of \(\hat{R}\).

Definition 15 (Fit). For \(\hat{R} \subseteq H\) a copy of \(R \in \mathcal{R} _{\underline{d}}\), with \(w_i \in V(\hat{R})\) the copy of \(v_i\), say \(\hat{R}\) fits \(H\) if, for all \(i \in [0,j]\), \[\label{fit} |N_H(w_i) \setminus \{w_0, \ldots, w_{i-1}\}|\,\, \left\{\begin{array}{cl} =d_i & \text{if } i \text{ is big,}\\ < \sqrt \varepsilon d & \text{if } i \text{ is small.} \end{array}\right.\tag{17}\]

Observation 16. For each labeled \(\hat{F} \subseteq H\), there is a unique \(\hat{R} \in \mathcal{R}\) such that \(\hat{R} \sim \hat{F}\) and \(\hat{R}\) fits \(H\).

(With \(w_i\) the copy of \(v_i\) in \(\hat{F}\), the desired \(\hat{R}\) consists of \(\hat{F}\) plus all edges \(w_iu\) with \(u\in N_H(w_i) \setminus \{w_0, \ldots, w_{i-1}\}\) and \(|N_H(w_i) \setminus \{w_0, \ldots, w_{i-1}\}|\ge \sqrt \varepsilon d\) (and the vertices in these edges).)

For \(R \in \mathcal{R}\), let \(N^*(H, R)\) be the number of copies of \(R\) that fit \(H\). By Observation 16, \[\label{N*bd} \tilde{N}(H,F) = \sum_{R \in \mathcal{R} } N^*(H, R).\tag{18}\]

Plan. We will give two upper bounds on \(\sum_{R \in \mathcal{R} _{\underline{d}}} N^*(H,R)\) and show that, for each \({\underline{d}}\), one of these is small. Which bound we use will depend on how \[\label{D(d)} D({\underline{d}}):=\sum_{\text{i big}} d_i\tag{19}\] compares to \(j\log d\), but in either case will be small enough relative to the bound of (14) that even summing over \({\underline{d}}\) causes no trouble.

We conclude this section by showing that the cost of "decomposing" \(D\) is small.

Proposition 17. For any \(D\), the number of \({\underline{d}}\)’s with \(D({\underline{d}})=D\) is \[\label{count.bd} \begin{array}{ll} \exp\left[O\left(j\log^2 d/(\sqrt{\varepsilon} d) \right)\right] & \text{ if } D \leq j \log d,\\ \exp\left[O\left(D\log d/(\sqrt{\varepsilon} d) \right)\right] & \text{ if } D > j \log d. \end{array}\qquad{(2)}\]

Proof. The number of big \(i\)’s for a \({\underline{d}}\) with \(D({\underline{d}})=D\) is at most \[s_0 := \min\{j, D/(\sqrt \varepsilon d)\} < \sqrt{jD/2}\]

(see (15)), and the number of such \({\underline{d}}\)’s with exactly \(s\) big \(i\)’s is less than \[\label{si's} \binom{{j}}{{s}}\binom{{D-1}}{{s-1}} < \exp_2[s\log (e^2jD/s^2)];\qquad{(3)}\] so (since the r.h.s. of this last bound increases rapidly with \(s\)), the number of \({\underline{d}}\)’s in the proposition is less than \[\label{0th.bd} \sum_{s \le s_0} \exp_2[s\log (e^2jD/s^2)] < 2 \exp_2\left[s_0\log (e^2jD/s_0^2)\right],\tag{20}\] which is \[\begin{align} 2 \exp_2\left[D/(\sqrt{\varepsilon} d)\log (e^2j\varepsilon d^2/D)\right] & \text{if } j > D/(\sqrt{\varepsilon}d), \tag{21}\\ 2\exp_2\left[j\log (e^2D/j)\right] & \text{if } j \leq D/(\sqrt{\varepsilon} d). \tag{22} \end{align}\]

Now for the claimed bounds: If \(D \leq j \log d\), then (21) applies (since \(d\) is somewhat large; see (15)), so the bound in (20) is at most \[2 \exp_2\left[j \log d/(\sqrt{\varepsilon} d)\log (e^2\varepsilon d^2/\log d)\right] = \exp\left[O\left(j \log^2 d/(\sqrt{\varepsilon} d)\right)\right]\] (using the fact that \(x\log(\alpha /x)\) is increasing in \(x\) up to \(\alpha /e\)). And if \(D > j \log d\), then: if \(j \leq D/(\sqrt \varepsilon d)\) then the version (22) of the bound in (20) is at most \[\exp\left[O\left(D/(\sqrt \varepsilon d)\log (e^2 \sqrt \varepsilon d)\right)\right] = \exp\left[O\left(D\log d/(\sqrt{\varepsilon} d)\right)\right];\] and otherwise we use \(j < D/\log d\) to say the bound in (21) is less than \[2 \exp_2\left[D/(\sqrt{\varepsilon} d)\log (e^2\varepsilon d^2/\log d)\right] = \exp\left[O\left(D\log d/(\sqrt{\varepsilon} d)\right)\right].\qedhere\] ◻

4.2 First bound.

The goal of this section is to show \[\label{1st.goal} \sum_{D({\underline{d}}) \le j\log d}\,\sum_{R \in \mathcal{R} _{\underline{d}}}N^*(H,R) < n (\varepsilon^{1/3} d)^j.\tag{23}\] We first bound the inner sums and then invoke Proposition 17.

Proposition 18. For any \({\underline{d}}\), \[\label{first.bd} \sum_{R \in \mathcal{R} _{\underline{d}}}N^*(H,R) \le n \prod_{\text{i small}}(\sqrt \varepsilon d)^{f_i} \prod_{\text{i big}} d_i^{f_i}.\qquad{(4)}\]

Proof. This is just the naive bound on the number of \(\hat{F}\)’s for which the unique \(\hat{R}\sim\hat{F}\) that fits \(H\) (see Observation 16) is in \(\mathcal{R} _{\underline{d}}\). With \(w_i\) again the copy of \(v_i\) in \(\hat{F}\), we choose \(w_0,\ldots ,w_j\) in turn. The number of choices for \(w_0\) is at most \(n\), and, since \(\hat{R}\) is in \(\mathcal{R} _{\underline{d}}\) and fits \(H\), the number of choices for the children of \(w_i\) (which are all chosen with \((w_0,\ldots ,w_i)\) known) is at most \[(|N_H(w_i) \setminus\{w_0, \ldots, w_{i-1}\}|)_{f_i},\] which with (17) gives the bound of the proposition. ◻

Proposition 19. If \[\label{D.ub} D:= D({\underline{d}}) \le j\log d,\qquad{(5)}\] then \[\label{prodprod} \prod_{\text{i small}}(\sqrt \varepsilon d)^{f_i}\prod_{\text{i big}}d_i^{f_i} \le (\sqrt \varepsilon d)^j e^{(\Delta j\log^2 d)/(\sqrt \varepsilon d)}.\qquad{(6)}\]

Proof. Since \(\sum f_i=j\), the first product is less than \((\sqrt \varepsilon d)^j\). For the second, with \(s\) the number of big \(i\)’s, we have \(s \le D/(\sqrt \varepsilon d) ~(<D/e),\) whence (using the assumption \(D \le j\log d\) for the last inequality) \[\prod_{\text{i big}}d_i^{f_i}\le \prod_{\text{i big}} d_i^{\Delta} \le (D/s)^{s\Delta} \le (\sqrt \varepsilon d)^{\Delta D/(\sqrt \varepsilon d)}\le e^{(\Delta j\log^2 d)/(\sqrt \varepsilon d)}.\qedhere\] ◻

Finally, inserting the bound of Proposition 19 in that of Proposition 18 and using Proposition 17, we find that the l.h.s. of (23) is at most \[\left\{(j\log d)\cdot \exp\left[(\Delta +O(1))j\log^2 d/(\sqrt \varepsilon d)\right] \right\}\cdot n\cdot ( \sqrt{\varepsilon} d)^j < n ( \varepsilon^{1/3} d)^j.\] Here the inequality holds because, since \(d > 1/(3\varepsilon)\) (see (15)), the expression in \(\{\,\}\)’s is much smaller than \(\varepsilon^{-j/6}\) for a small enough \(\varepsilon\) (\(=\varepsilon(\Delta )\)).

4.3 Second bound.

Here we have more room and will show \[\label{2nd.goal} \sum_{D({\underline{d}}) > j\log d}\,\sum_{R \in \mathcal{R} _{\underline{d}}}N^*(H,R) \le n \varepsilon^{(j/3)\log d}.\tag{24}\] (What we say here applies to any \({\underline{d}}\) until we get to the end of the section, where we finally use \(D({\underline{d}}) > j\log d\).)

For this discussion \(R\) is always in some \(\mathcal{R} _{\underline{d}}\), so, as in 14, copies of \(R\) are partially labeled; with this understanding, we again use \(N(G,R)\) for the number of copies of \(R\) in \(G\), \(\mathbb{E}_pX_R\) for \(\mathbb{E}N(G_{n,p},R)\), and \(\nu(H,R)\) for the maximum size of an edge-disjoint collection of copies of \(R\) in \(H\).

Like the treatment of small \(q\) in Section 3, the proof of (24) uses the approach previewed following Proposition 9; thus we hope for a bound on the inner sum in (24) of the form \[\label{form} \sum_{R \in \mathcal{R} _{\underline{d}}} \nu(H,R) \cdot B(R),\tag{25}\] where \(B(R)\) is some bound on the number of copies of \(R\) that fit \(H\) and share edges with a given copy. (Here: (i) we would be entitled to insist that in the definition of \(\nu(H,R)\) we restrict to copies of \(R\) that fit \(H\), but we don’t need—and anyway don’t know how to use—this; (ii) rather than \(B(R)\), we will use a single bound, \(\beta ({\underline{d}})\), on the number of all copies of \(R\)’s in \(\mathcal{R} _{\underline{d}}\) that fit \(H\) and share edges with a given copy—though a bound on the number of such copies of a single \(R\) could in principle be much smaller.)

We observe that Proposition 9 trivially (and with some sacrifice) extends to copies of \(R\):

Corollary 20. If \(H\) is \(q\)-sparse, then for any \(R\in \mathcal{R}\), \(\nu(H,R) \le e\mathbb{E}_qX_R.\)

Proof. With \(S\) the unlabeled graph underlying \(R\), we have (using 9) \[\nu(H,R)=\nu(H,S) \le e\mathbb{E}_qX_S\leq e\mathbb{E}_qX_R.\qedhere\] ◻

Lemma 21. For any \({\underline{d}}\), \[\label{bd1.1'} \sum_{R \in \mathcal{R} _{\underline{d}}} \nu(H,R) \,<\, e n \prod_{\text{i small}} (\varepsilon d)^{f_i} \prod_{i \text{ big}} \left[\left(\frac{e\varepsilon d}{d_i}\right)^{d_i} d_i^{f_i}\right]\,=:\,\alpha ({\underline{d}}).\qquad{(7)}\]

Proof. By Corollary 20 it is enough to show \[\label{bd1.1} \sum_{R \in \mathcal{R} _{\underline{d}}} \mathbb{E}_qX_R <\alpha ({\underline{d}})/e.\tag{26}\] Here the main point is to show \[\label{sumN} \sum_{R \in \mathcal{R} _{\underline{d}}} N(K_n,R) < n \prod_{\text{i small}} n^{f_i} \prod_{i \text{ big}} \binom{{n}}{{d_i}}d_i^{f_i},\tag{27}\] which, since \(\sum d_i =e_R\) for any \(R\in \mathcal{R} _{\underline{d}}\), implies that the l.h.s. of (26) is less than \[n \prod_{\text{i small}} (nq)^{f_i} \prod_{i \text{ big}} \binom{{n}}{{d_i}}d_i^{f_i}q^{d_i} \le n \prod_{\text{i small}} (\varepsilon d)^{f_i} \prod_{i \text{ big}} \left[\left(\frac{e\varepsilon d}{d_i}\right)^{d_i} d_i^{f_i}\right].\]

For the proof of (27) we continue to use \(w_i\) for the copy of \(v_i\) in \(\hat{F}\), and now write \(p(w_i)\) for the parent of \(w_i\). We think of choosing \(w_0\) (in at most \(n\) ways) and then "processing" (in order) \(w_0,\ldots ,w_j\).

If \(i\) is small, then "processing" \(w_i\) means choosing its \(f_i\) (labeled) children, the number of possibilities for which is less than \((n)_{f_i}\leq n^{f_i}\).

If \(i\) is big, then "processing" \(w_i\) means choosing the set of its \(d_i\) neighbors not in \(\{w_0,\ldots ,w_{i-1}\}\), and its children (with labels) from this set. The number of ways to do this (not all of which will lead to legitimate \(\hat{R}\)’s) is at most \[\label{naive} \binom{{n}}{{d_i}}d_i^{f_i}.\tag{28}\]  ◻

We next bound the number of members of \(\hat{\mathcal{R}} _{\underline{d}}\) that fit \(H\) and share edges with a given copy \(\hat{R}_0\). We first slightly refine \(\hat{\mathcal{R} }_{\underline{d}}\). For \(\hat{R} \in \hat{\mathcal{R} }\), set \({\underline{b}}={\underline{b}}(\hat{R})=(b_1, b_2, \ldots, b_j)\), with \[b_i =b_i(\hat{R})= |N_H(w_i) \cap \left(\{w_0, \ldots, w_{i-1}\} \setminus \{p(w_i)\}\right)|\] (recall \(p(w_i)\) is the parent of \(w_i\)) and define \(\hat{\mathcal{R} }_{{\underline{d}},{\underline{b}}}\) in the natural way. The next observation will allow us to more or less ignore edges of \(\hat{R}\setminus\hat{F}\) with both ends in \(\hat{F}\).

Proposition 22. If \(\hat{R}\subseteq H\), with \({\underline{b}}(\hat{R}) = {\underline{b}}\), then \[\label{bi.ub} \sum_{i \in [j]} b_i \le \max\{1, 3j\log\log n/\log n\} =\max\{1,o(j)\}.\qquad{(8)}\]

Proof. Set \(\sum b_i = \delta j\) and \(S=H[V(\hat{F})]\). Then \(v_S=j+1\), \(e_S= (1+\delta )j,\) and \[\label{EqXS} \mathbb{E}_qX_S\le n^{j+1}q^{(1+\delta )j}= n^{j+1}(\varepsilon d/n)^{(1+\delta )j}=n^{1-\delta j}(\varepsilon d)^{(1+\delta )j}\le n^{1-\delta j} (\log n)^{(1+\delta )j},\tag{29}\] where the last inequality uses (15) (weakly) to say \(\varepsilon d <\log n\).

If \(\delta j \ge 2\) (i.e. \(\sum b_i>1\)), then the r.h.s. of (29) is at most \((\log^{1+\delta } n/n^{\delta /2})^j\), which is less than 1 (in fact \(o(1)\)) if \(\delta >3 \log\log n/\log n\); and the bound of the proposition follows since \(H\) is \(q\)-sparse. ◻

Proposition 23. The number of possibilities for \({\underline{b}}\) is at most \(j+ e^{o(j)}\).

Proof. Let \(B=\sum_{i \in [j]} b_i\); so Proposition 22 says \(B \le \max\{1,o(j)\}\). Given \(B\), the number of possibilities for \({\underline{b}}\) is at most \(\binom{{B +j-1}}{{B}}\), which is \(j\) if \(B=1\) and \(e^{o(j)}\) if \(B=o(j)\); so, crudely, the number of possible \({\underline{b}}\)’s is at most \(j+o(j)e^{o(j)}=j+e^{o(j)}\). ◻

If \(\hat{R}\in \hat{\mathcal{R}} _{{\underline{d}},{\underline{b}}}\) fits \(H\), then (for any \(i\)) \[\label{H.degree} d_H(w_i)\le \max\{d_i,\sqrt{\varepsilon}d\}+b_i+1_{\{i\neq 0\}} \,=:\, B_i,\tag{30}\] where we note that the max is \(d_i\) if \(i\) is big (in which case (30) holds with equality), and \(\sqrt{\varepsilon}d\) if \(i\) is small (in which case (30) is strict).

For \(\ell \in [j]\), let \[Q(\ell)=\{i \in [j]: v_i \text{ is an internal vertex of the path in } F \text{ connecting } v_0 \text{ and } v_\ell\}\] (that is, \(Q(\ell)\) is the set of indices of non-root ancestors of \(v_\ell\)).
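For example (our illustration, in the prototypical case of paths mentioned earlier): if \(F\) is the path \(v_0v_1\cdots v_j\) (which is in breadth-first order), then the \(v_0\)–\(v_\ell\) path is just the initial segment, so

```latex
Q(\ell) = \{1, 2, \ldots, \ell-1\},
\qquad\text{e.g.}\qquad
Q(3) = \{1, 2\}.
```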

Lemma 24. For any \(e \in H\), \({\underline{d}}\) and \({\underline{b}}\), the number of \(\hat{R} \in \hat{\mathcal{R}} _{{\underline{d}}, {\underline{b}}}\) that contain \(e\) and fit \(H\) is at most \[\label{cases.intersection} 2\sum_{\ell=0}^{j}\prod_{\text{i small}} (\sqrt \varepsilon d)^{f_i} \prod_{\text{i big}} d_i^{f_i} \cdot K(\ell),\qquad{(9)}\] where \[\label{Kell} K(\ell)\, = \, \prod_{\substack{\text{i big} \\ i \in Q(\ell)}} \left(\frac{d_i+b_i+1}{d_i}\right){\prod_{\substack{\text{i small}\\ i \in Q(\ell) }}}\left(\frac{\sqrt \varepsilon d+b_i+1}{\sqrt \varepsilon d}\right) \cdot \frac{B_\ell}{\max\{d_0,\sqrt{\varepsilon}d\}}.\qquad{(10)}\]

Proof. We first choose an end, \(w\), of \(e\) in \(V(\hat{F})\) (where \(\hat{R}\sim \hat{F}\); this gives the factor 2 in the bound of the lemma), and the role, \(w_\ell\), of \(w\) in \(\hat{F}\). It is then enough to bound the number of possibilities for the rest of \(\hat{R}\) by the \(\ell^{\text{th}}\) summand in the bound of the lemma.

Note that for \(\ell=0\) (where \(K(\ell)=1\)), the summand is just the bound of Proposition 18, except that we no longer need the factor \(n\) since we already know \(w_0\).

For a general \(\ell\) we first specify \((w_i:i\in Q(\ell)\cup \{0\})\), the number of possibilities for which is, by (30), at most \[\label{prodB} \prod_{i\in Q(\ell)\cup \{\ell\}}B_i.\tag{31}\] Then, for the number of ways to choose the rest of \(\hat{R}\), we again argue as in Proposition 18, now skipping terms in the bound corresponding to choosing the already known \(w_i\)’s with \(i\in Q(\ell)\cup \{0, \ell\}\); this bounds the number of possibilities by the double product in the lemma's bound divided by \[\prod_{i\in Q(\ell)\cup \{0\}} \max\{d_i,\sqrt{\varepsilon}d\},\] and multiplying by (31) gives the promised \(\ell^{\text{th}}\) summand. ◻

From now until the last paragraph of this section, we fix \({\underline{d}}\) and let \(D=D({\underline{d}})\) (\(:=\sum_{\text{i big}} d_i\); see (19)). We have included the \(K(\ell)\)’s in Lemma 24 to help keep track of what the proof is doing, but will use only the simplifying \[\label{d.b.bd} K(\ell)\le K:= (D+j)\cdot\prod_{i \in [j]}\left(1+(b_i+1)/\sqrt \varepsilon d\right) < (D+j)\exp\left[(j+\sum_{i\in [j]}b_i)/\sqrt \varepsilon d\right] <(D+j)\exp[O(j)/\sqrt \varepsilon d],\tag{32}\] where \(D+j\) corresponds to the trivial \(B_\ell \leq D+j\), and the last inequality uses Proposition 22 (and the \(O(j)\) in the final exponent is actually \((1+o(1))j\)).

With the substitution of \(K\) for \(K(\ell)\), the summands in (9) no longer depend on \({\underline{b}}\) or \(\ell\), and, using 23, we have a simpler version of 24:

Corollary 25. For any \(e \in H\) and \({\underline{d}}\), the number of \(\hat{R} \in \hat{\mathcal{R}} _{{\underline{d}}}\) that contain \(e\) and fit \(H\) is at most \[\label{cases.intersection'} \beta ({\underline{d}}):= 2(j+e^{o(j)})(j+1)K\cdot\prod_{\text{i small}} (\sqrt \varepsilon d)^{f_i} \prod_{\text{i big}} d_i^{f_i}.\qquad{(11)}\]

So, since for any \(R \in \mathcal{R} _{\underline{d}}\), \[\label{deg.sum} e_R=\sum\{ d_i :i \in [0, j]\} \le D+j,\tag{33}\] we may overestimate the number of \(\hat{R}\)’s in \(\hat{\mathcal{R}} _{\underline{d}}\) that fit \(H\) and share edges with a given \(\hat{R}_0\) by \(e_R \beta ({\underline{d}})\), which with 21 and 33 gives \[\label{eq:upshot} \sum_{R \in \mathcal{R} _{{\underline{d}}}} N^*(H,R) ~<~ \sum_{R \in \mathcal{R} _{{\underline{d}}}} \nu(H,R) \cdot e_R \cdot \beta({\underline{d}}) ~<~ \alpha ({\underline{d}}) (D+j) \beta({\underline{d}}).\tag{34}\] Now inserting the values of \(\alpha ({\underline{d}})\) from ?? and \(\beta ({\underline{d}})\) from (11) (with the bound on \(K\) in 32 ), and (slightly) simplifying, we find that the l.h.s. of 34 is at most \[\label{total.sum} n \cdot\left\{e(D+j)^2 2(j+ e^{o(j)})(j+1)e^{O(j)/(\sqrt \varepsilon d)} \right\}\cdot \prod_{\text{i small}} (\varepsilon^{3/2}d^2)^{f_i} \prod_{i \text{ big}} \left[\left(\frac{e\varepsilon d}{d_i}\right)^{d_i} d_i^{2f_i}\right].\tag{35}\] This looks unpleasant but is actually simple, since the terms \((\frac{e\varepsilon d}{d_i})^{d_i}\) dominate the rest (apart from \(n\)): since \(d_i\geq \sqrt{\varepsilon} d\) when \(i\) is big, the product of these terms is less than \[\label{main.term} (e\sqrt{\varepsilon})^{D},\tag{36}\] whereas: the expression in \(\{\,\}\)’s is \(O(D^2)e^{O(j)}\); since \(\sum f_i = j\), the first product (even sacrificing the terms with \(\varepsilon^{3/2}\)) is at most \(d^{2j}\); and what’s left of the second product is \[\prod_{i \text{ big}} d_i^{2f_i} \leq \prod_{i \text{ big}} d_i^{2\Delta } < 2^{2D\Delta }\] (using \(d_i < 2^{d_i}\) for the second inequality). So the bound in 35 is no more than \[n D^2 e^{O(j)} d^{2j} 2^{2D\Delta } (e\sqrt{\varepsilon})^{D} < n 2^{O(D)}(e\sqrt{\varepsilon})^{D},\] where the inequality (finally) uses \(D > j \log d\) (and the implied constant in \(2^{O(D)}\) depends on \(\Delta\)).
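For the reader's convenience, the estimate 36 is simply the observation that each big factor has base at most \(e\sqrt{\varepsilon}\): since \(d_i\ge \sqrt{\varepsilon}d\) for big \(i\),

```latex
\prod_{i \text{ big}} \left(\frac{e\varepsilon d}{d_i}\right)^{d_i}
\;\le\; \prod_{i \text{ big}} \left(\frac{e\varepsilon d}{\sqrt{\varepsilon}\,d}\right)^{d_i}
\;=\; \prod_{i \text{ big}} \bigl(e\sqrt{\varepsilon}\bigr)^{d_i}
\;=\; \bigl(e\sqrt{\varepsilon}\bigr)^{D},
```

using \(\sum_{\text{i big}} d_i = D\).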

Finally, now fixing \(D > j \log d\) and letting \({\underline{d}}\) vary, and recalling from ?? that the number of \({\underline{d}}\)’s with \(D({\underline{d}}) = D\) is \[\exp\left[O\left(\frac{D}{\sqrt \varepsilon d} \log (d)\right)\right] = 2^{O(D)},\] we have \[\sum_{D({\underline{d}}) = D}\sum_{R \in \mathcal{R} _{{\underline{d}}}} N^*(H,R) < n 2^{O(D)}(e\sqrt{\varepsilon})^{D},\] which (with a small enough \(\varepsilon\)) gives 24 . ◻

5 Cycles

Here we prove the cycle portion of 4; to repeat: We assume that \(p=Lq\) with \(L\) a large constant, \(H\) is \(q\)-sparse, and \(F=C_k\) for some \(k\in [3,n]\), and want to show \[\label{cycle.show} N(H,F) < \mathbb{E}_pX_F.\tag{37}\] Since \[\label{eq:EpXF.lb'} \mathbb{E}_pX_{F}=N(K_n, F)p^{e(F)}=\frac{(n)_k}{2k}p^k > \frac{1}{2k}\left(\frac{np}{e}\right)^k =\frac{1}{2k}\left(\frac{L}{e}\right)^k (nq)^k \ge L^{.9k}(nq)^k,\qquad{(12)}\] it is enough to show that \(N(H,F)\) is at most the r.h.s. of (12). To begin we eliminate easy ranges for \(q\):

Proposition 26. If \(q\not\in (1/n , 1/\sqrt n)\) then 37 holds.

Proof. If \(q \le 1/n\), then \(\mathbb{E}_qX_F < n^kq^k \le1\); so \(q\)-sparsity of \(H\) forces \(N(H,F)=0\). (Note that when \(q<1/(3n)\), the second part of 3 gives 1 for a general \(F\).)

For \(q\ge 1/\sqrt{n}\) we use the naive bound \[\label{eq:cycle.dense.ub} N(H,F)\le e_H \cdot \Delta_H^{k-2},\tag{38}\] in which the r.h.s. (over)counts ways to choose \(xy \in E(H)\) and a \((k-1)\)-edge path (in \(H\)) joining \(x\) and \(y\). We then recall that 6 promises \(\Delta _H\le 2enq\), while 8 bounds \(e_H\) by \(n\log n\) in general, and by \(3n\) if \(q < (\log n/n)^{1/2}\); so \[N(H,F) \le \left\{\begin{array}{ll} n\log n(2enq)^{k-2} &\text{in general},\\ 3n(2enq)^{k-2}&\text{if } q< (\log n/n)^{1/2}, \end{array}\right.\] and the r.h.s. of (12) exceeds the first bound if \(q \ge (\log n/n)^{1/2}\) and the second if \(q\ge 1/\sqrt n\). ◻
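To spell out the first of these comparisons (the second is analogous, using \((nq)^2\ge n\ge (1/3)\cdot 3n\) when \(q\ge 1/\sqrt n\)): if \(q\ge (\log n/n)^{1/2}\) then \((nq)^2\ge n\log n\), so

```latex
n\log n\,(2enq)^{k-2}
\;\le\;(nq)^2\,(2enq)^{k-2}
\;=\;(2e)^{k-2}\,(nq)^k
\;<\;L^{.9k}(nq)^k
```

for \(L\) a large enough constant (\(L^{.9}\ge 2e\) suffices), and the last quantity is the r.h.s. of (12).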

So for the rest of this discussion we assume \[\label{q.bound.cycle} 1/n<q<1/\sqrt n.\tag{39}\]

Our approach here is simple (the trivial 38 is a first example), but turns out to be rather delicate: for some carefully chosen \(m\) we use 5 to bound the number of \(P_m\)’s (\(m\)-edge paths) in \(H\), and then bound the number of extensions to copies of \(F\) using 27, which is the main new point in this section. What we need from 5 is \[\label{eq:paths.ub} \text{there is a fixed } L_1 \text{ such that if } H \text{ is } q\text{-sparse then, for any } m, \; N(H, P_m)< n^{m+1}(L_1 q)^m.\tag{40}\]

For the rest of this discussion we work with the following definitions and assumptions. We assume \[nq=n^{c}\] (so, by 39 , \(c\in (0,1/2)\)). For distinct \(x,y \in V(H)\), we use "\((x,y)\)-\(P_\ell\)" for a \(P_\ell\) in \(H\) with endpoints \(x\) and \(y\), and set \[\label{gam.def} \gamma(\ell)=\max_{\substack{x,y \in V(H)\\ x \ne y}} |\{(x,y)\text{-}P_\ell\text{'s} \}|.\tag{41}\] For \(\delta \in (0,1)\), we define \(\hat{\ell}(\delta )\) to be the largest integer \(\ell\) for which \[\label{eq:ell.def} (nq)^\ell<n^{1-\delta c}.\tag{42}\]
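For orientation, since \(nq=n^c\), the condition 42 just reads \(\ell c<1-\delta c\), so \(\hat{\ell}(\delta )\) admits the closed form

```latex
\hat{\ell}(\delta )\;=\;\Bigl\lceil \tfrac{1}{c}-\delta \Bigr\rceil-1
\;=\;\Theta(1/c)\;=\;\Theta\!\left(\frac{\log n}{\log (nq)}\right)
```

(the order of magnitude uses \(c<1/2\), which guarantees \(\hat{\ell}(\delta )\ge 1\)).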

Lemma 27. Suppose \(H\) is \(q\)-sparse. Let \(\delta \in (0,1)\) be given and \(\hat{\ell}=\hat{\ell}(\delta )\). If \(\ell\) satisfies 42 (i.e. \(\ell\le \hat{\ell}\)), then \(\gamma(\ell)=O(\hat{\ell}/\delta )\), and if \[\label{eq:ell.def'} (nq)^\ell<n^{1-\delta },\qquad{(13)}\] then \(\gamma(\ell)=O(1/\delta )\).

Proof. For this discussion we fix distinct \(x,y \in V(H)\). We usually use \(K\), often subscripted, for \((x,y)\)-\(P_\ell\)’s, and \(\gamma =\gamma (x,y)\) for the number of these; so we should show \(\gamma=O(\hat{\ell}/\delta )\) if 42 holds and \(\gamma=O(1/\delta )\) if we assume (13). We will often treat \(K\)’s as sets of edges.

Choose \(K_0, K_1, \ldots, K_m\) so that \[\label{eq:K_i.minimal} e_i:= |K_i \setminus \left(\cup_{j<i} K_j\right)| =\min\{|K \setminus \cup_{j <i} K_j|: K \not\subseteq\cup_{j <i} K_j\} \quad \forall i \ge 1\tag{43}\] and \[\label{eq:covering} R ~:= ~\bigcup_{i=0}^m K_i\,\,\text{ contains all } (x,y)\text{-}P_\ell\text{'s}.\tag{44}\] We use \(R_i\) for the subgraph of \(H\) consisting of (the edges of) \(K_i \setminus (\cup_{j < i} K_j)\) and their vertices (e.g. \(R_0=K_0\)), and set \[v_i=|V(R_i)\setminus V(\cup_{j<i}R_j)|.\] Thus \[\label{viei} v_0=e_0 +1=\ell +1\,\,\text{ and }\,\,v_i\leq e_i-1 \leq \ell -1\,\,\text{ for } i\in [m]\tag{45}\] (since for \(i\in [m]\), \(E(R_i)\) consists of edge-disjoint paths with ends in \(V(\cup_{j<i}R_j)\)), which, with the \(q\)-sparsity of \(H\), gives \[\label{eq:E.hat.R} 1\le \mathbb{E}_qX_{R}< n^{v_R}q^{e_R}=n^{\ell+1}q^{\ell} \prod_{i=1}^m n^{v_i}q^{e_i} \le n^{\ell+1}q^{\ell}\prod_{i=1}^m [n^{-1}(nq)^{e_i}].\tag{46}\]

The lemma will follow from Claims 28-30; the first of these bounds \(m\), and others bound the number of \((x,y)\)-\(P_\ell\)’s not in \(\{K_0,\ldots ,K_m\}\).

Claim 28. If 42 holds then \(m=O(\hat{\ell}/\delta )\), and if (13) holds then \(m<2/\delta\).

Proof. For the second part just notice that (13) bounds the r.h.s. of 46 by \[n^{\ell+1}q^{\ell}(n^{-1}(nq)^{\ell})^m <n^2\cdot n^{-\delta m}.\]

The first part will follow from density considerations. We may rewrite 42 as \(q<n^{-(\ell-1+\delta c)/\ell}\), which with the second bound in 7 (that is, \(m(H)<1/c\) if \(q\le n^{-c}\)) gives \[\label{eq:e/v.ub} \frac{v_R}{e_R} \ge \frac{\ell-1+\delta c}{\ell}.\tag{47}\] But \(e_R=\sum e_i\) and, in view of 45 , \[v_R =\sum v_i \,\, \le \,\, e_0+1 +\sum_{i=1}^m(e_i-1) \,\, = \,\, e_R-m+1;\] so with 47 we have \[\frac{\ell-1+\delta c}{\ell} \le \frac{e_R-m+1}{e_R},\] which, with \(e_R \le \ell(m+1)\) (and a little rearranging), gives \[m\le 2/(\delta c)-1.\] The claim follows since \(n^{c(\hat{\ell}+1)} \ge n^{1-\delta c}\) (by the maximality of \(\hat{\ell}\)) and \(c<1/2\) (by 39 ) give \(\hat{\ell} =\Omega(1/c)\). ◻
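For the record, the "little rearranging" in the proof of 28: from 47 and \(v_R\le e_R-m+1\) we get \((m-1)/e_R\le (1-\delta c)/\ell\), and then \(e_R\le \ell(m+1)\) gives

```latex
m-1\;\le\;\frac{(1-\delta c)\,e_R}{\ell}\;\le\;(1-\delta c)(m+1),
\qquad\text{whence}\qquad
\delta c\,(m+1)\;\le\;2,
\quad\text{i.e. } m\le 2/(\delta c)-1.
```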

Claim 29. If \(K\not\in \{K_0,\ldots ,K_m\}\), then there is \(i \in [0,m]\) such that \(|K \cap K_i|\ge \ell/8\).

Proof. Let \(j_0\) and \(j_1\) (possibly equal) be minimum with \(|K \setminus (\cup_{j \le j_0} K_j)| \le \ell/2\) and \(K \subseteq(\cup_{j \le j_1} K_j)\) (with existence given by 44 ). Then \(|K \cap (\cup_{j \in [j_0, j_1]} K_j)| >\ell/2\), so \[\label{eq:exist} \text{there is } i \in [j_0, j_1] \text{ such that } |K \cap K_i| \ge \ell/(2(j_1-j_0+1)).\tag{48}\] This immediately gives the claim if \(j_0=j_1\). For \(j_0<j_1\), we return to 46 , observing that (justification to follow) \[\label{eq:R_i} n^{-1}(nq)^{e_i} \le \left\{\begin{array}{ll} n^{-1}(nq)^{\ell} <n^{-\delta c}<1 & \text{for any } i\in [m],\\ n^{-1}(nq)^{\ell/2} <n^{-1}n^{(1-\delta c)/2}<n^{-1/2} &\text{if } i\in [j_0+1,j_1]. \end{array}\right.\tag{49}\] Here both lines use 42 , and—the main point—the second uses \[e_i\leq \ell/2,\] which holds since otherwise 43 would have forbidden choosing \(K_i\) when we could have chosen \(K\).

The \(q\)-sparsity of \(H\), with 46 , 49 and, again, 42 , now gives \[\label{1EqXR} 1 \le \mathbb{E}_qX_R < n^{\ell+1}q^\ell \cdot n^{-(j_1-j_0)/2} <n^{2-(j_1-j_0)/2};\tag{50}\] so \(j_1-j_0\leq 3\) and 48 completes the proof. ◻
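In detail, 50 combines 46 and 49 : each factor with \(i\in [j_0+1,j_1]\) contributes at most \(n^{-1/2}\), the remaining factors are less than 1, and \(n^{\ell+1}q^\ell = n(nq)^\ell<n^{2-\delta c}<n^2\) by 42 ; thus

```latex
1\;\le\;\mathbb{E}_qX_R
\;<\;n^{\ell+1}q^{\ell}\prod_{i=1}^m\bigl[n^{-1}(nq)^{e_i}\bigr]
\;<\;n^{2}\cdot\bigl(n^{-1/2}\bigr)^{\,j_1-j_0}
\;=\;n^{2-(j_1-j_0)/2}.
```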

For the next claim, to avoid confusion, we use \(Q\) in place of \(K\) for \((x,y)-P_\ell\)’s.

Claim 30. For any \(Q_0\), the number of \(Q\)’s sharing at least \(\ell/8\) edges with \(Q_0\) is \(O(1)\).

Proof. Choose \(Q_1, \ldots, Q_m\) so that \[|Q_i\cap Q_0|\ge \ell/8\,\,\,\text{ and }\,\,\, Q_i \not\subseteq\cup_{j <i} Q_j \quad \forall i \in [m]\] and \[R ~:= ~\bigcup_{i=0}^m Q_i\,\,\text{ contains all } Q\text{'s with } |Q\cap Q_0|\geq \ell/8.\] We again use \(R_i\) for the subgraph of \(H\) consisting of the edges of \(Q_i \setminus (\cup_{j < i} Q_j)\) and their vertices, and for \(i\in [m]\) set \[e_i= |Q_i \setminus \left(\cup_{j<i} Q_j\right)|,\,\, \,v_i=|V(R_i)\setminus V(\cup_{j<i}R_j)|,\] and \[\label{f(i)} f(i) = |\{(v,e):e\in Q_i\setminus(\cup_{j<i}Q_j), v\in e\cap V(\cup_{j<i}R_j)\}|.\tag{51}\] The main point here is \[\label{f(i)main} \sum f(i)=O(1).\tag{52}\] (Arguing as for 29 gives \(m=O(1)\), but we now need a little more.)

Here we observe that, for each \(i\), \(R_i\) is an edge-disjoint union of (say) \(a_i\) paths, each of which shares precisely its endpoints with \(\cup_{j<i} R_j\). Each of these paths contributes (exactly) two pairs \((v,e)\) to \(f(i)\), and each \((v,e)\) counted by \(f(i)\) arises in this way; so \[\label{fiai} f(i)=2a_i.\tag{53}\]

Proof of 52 . Noting that \[v_i=e_i-a_i\,\,\text{ and }\,\, e_i\le 7\ell/8,\] we have (more or less as in 46 and 50 ) \[1 \le \mathbb{E}_qX_{R} < n^{v_R}q^{e_R}=n^{\ell+1}q^{\ell} \prod_{i=1}^m n^{v_i}q^{e_i} < n^2\prod_{i=1}^m n^{v_i}q^{e_i}\] and \[n^{v_i}q^{e_i} = n^{-a_i}(nq)^{e_i} \le n^{-a_i}n^{7(1-\delta c)/8},\] which with 53 easily give 52 . ◻
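Spelled out, the two displays above combine as follows:

```latex
1\;<\;n^{2}\prod_{i=1}^m n^{-a_i}\,n^{7(1-\delta c)/8}
\;\le\;n^{\,2+\frac{7m}{8}-\sum_i a_i};
```

since each \(a_i\ge 1\), every factor in the product is at most \(n^{-1/8}\), forcing \(m<16\); the display then gives \(\sum a_i<2+7m/8=O(1)\), which with 53 yields 52 .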

Now, with \(v\) running over \(V(R)\setminus\{x,y\}\) (so \(d_R(v)\ge 2\)), we have \[\label{dRdR} (d_R(x)+d_R(y)-2) +\sum_v(d_R(v)-2) = 2\sum a_i\tag{54}\] (since if \(\Sigma_i\) is the l.h.s.of 54 with \(R\) replaced by \(\cup_{j\le i}Q_j\), then \(\Sigma_0=0\) and \(\Sigma_i-\Sigma_{i-1}= 2a_i\) for \(i\geq 1\)). Combined with 52 (and 53 ) this gives 30, since any \((x,y)-P_\ell\) in \(R\) is determined by what it does at \(x\), \(y\) and the \(v\)’s of degree greater than 2. ◻

Finally, the combination of 29 and 30 (used with \(Q_0=K_i\) for \(i\in [0,m]\)) bounds the number of \((x,y)\)-\(P_\ell\)’s by \(O(m)\); and adding the bounds on \(m\) from 28 then gives 27. ◻

With 40 and 27 in hand, we return to 4, setting (for the rest of our discussion) \[\label{ell0.1} \tilde{\ell}= \hat{\ell}(0.1).\tag{55}\]

To begin, we observe that the theorem is easy when \(k\) is fairly small (here we don’t need 40 ):

Lemma 31. If \(k \le \tilde{\ell}+1\), then \(N(H,F)< \mathbb{E}_pX_F.\)

Proof. We again use the approach sketched following 9, beginning by observing that \[N(H,F)\leq \nu(H,F)\cdot k \cdot \gamma(k-1),\] since each of the \(k\) edges of a given copy of \(F\) is contained in fewer than \(\gamma(k-1)\) other copies. (Recall \(\gamma\) and \(\nu\) were defined in 41 and following 8.)

Since \(\nu(H,F) \le e\mathbb{E}_qX_F\) (see 9), the lemma will follow if we show \[\label{k-1,k} \gamma(k-1)=O(k),\tag{56}\] since then \[N(H,F)\le e\mathbb{E}_qX_F\cdot O(k^2)< L^k\mathbb{E}_qX_F=\mathbb{E}_pX_F.\]

Proof of 56 . This is two applications of 27: if \(k \le \tilde{\ell}/2\), then \[(nq)^{k-1}\le (nq)^{\tilde{\ell}/2} \le n^{(1-0.1 c)/2} <n^{1/2},\] so the second part of the lemma gives \(\gamma(k-1)=O(1)\); and if \(k\in [\tilde{\ell}/2, \tilde{\ell}+1]\), then \[(nq)^{k-1}\le (nq)^{\tilde{\ell}}<n^{1-0.1 c}\] and the first part of the lemma gives \(\gamma(k-1)=O(\tilde{\ell})=O(k)\). ◻

 ◻

For the rest of this section, we assume \[\label{eq:kbig} k\ge \tilde{\ell}+2,\tag{57}\] and divide the argument according to the value of \(q\,\) (\(\in (1/n , 1/\sqrt n)\); see 39 ).

Small \(q\). We first assume \[1/n < q \le \log n/n.\] (The upper bound could be considerably relaxed.)

Setting \(m=k-\tilde{\ell}~( \ge 2)\), we have \[N(H,F)\le N(H, P_{m})\cdot\gamma(\tilde{\ell})\] (since \(\gamma(\tilde{\ell})\) bounds the number of completions of any given \(P_m\) in \(H\) to a \(C_k\)); and inserting the bounds from 40 and 27 (namely, \(N(H,P_{m})< n^{m+1}(L_1q)^{m}\) and \(\gamma(\tilde{\ell}) = O(\tilde{\ell})\)), and letting \(L= L_1^2\) (and \(p=Lq\)), gives \[N(H,F) < n^{m+1}(L_1q)^{m}\cdot O(\tilde{\ell}) \leq L^{k/2} n(nq)^{m}\cdot O(\tilde{\ell}).\] To bound the r.h.s. of this, we first observe that maximality of \(\tilde{\ell}\,\) (\(=\hat{\ell}(0.1)\)) gives \[\label{eq:hat.ell.plus2} (nq)^{\tilde{\ell}+2}=n^c(nq)^{\tilde{\ell}+1}\ge n^c n^{1-0.1 c}>n.\tag{58}\] So \(n(nq)^m<(nq)^{\tilde{\ell}+2}(nq)^m=(nq)^{k+2}\), and \(N(H,F)\) is less than \[L^{k/2}(nq)^{k}O((nq)^2\tilde{\ell})<L^{.9k}(nq)^k<\mathbb{E}_pX_F,\] where the first inequality uses the easy \[\label{kdhlgO} k> \tilde{\ell}=\Omega(\log n/\log\log n)\tag{59}\] (see 55 and 42 ; here \(\tilde{\ell}>\log\log n\) would suffice), and the second is (12). ◻

Large \(q\). Here we are in the complementary range \[\log n/n < q < 1/ \sqrt n.\] We again use \[\label{eq:factor} N(H,F) \leq N(H,P_m)\cdot\gamma (\ell'),\tag{60}\] with a suitable \(\ell'\) and \(m = k - \ell'\). In this case we bound the first factor by the trivial \[N(H,P_m) \leq 2e_H\Delta _H^{m-1}\] (cf. 38 ; curiously this now does better than 40 ), which with 8 and 6 gives \[\label{NHPm} N(H,P_m) = O(n (2enq)^{k - \ell' - 1}).\tag{61}\] So we will mainly be interested in \(\gamma (\ell')\).

Let \(\ell'\) be the largest integer \(\ell\) satisfying \(n^{\ell - 1} q^\ell \leq L^{-1/4}\), noting that \[\label{ell'lb} \ell' + 1 > (1 - o(1))\log n/\log(nq)\qquad{(14)}\] (since \(n^{\ell'}q^{\ell' + 1} > L^{-1/4}\)), and set \[f = f(n) = 1/(n^{\ell'-1}q^{\ell'})\] and \[\delta=\log f/\log (L^{1/4}nq);\] noting that \(L^{1/4} \leq f < L^{1/4}nq\) implies \(\delta \in (0,1)\) and \[\label{gd.bds} \log f/\log(nq)> \delta > (1-o(1))\log f/\log(nq)\tag{62}\] (the latter since here \(nq=\omega(1)\)).

We first check that 27, used with \(\ell=\ell'\) (and \(\delta = \delta\)), gives \[\label{eq:gambound} \gamma (\ell') = O(\log n/ \log f).\tag{63}\]

Proof. The upper bound in 62 implies 42 in the form \((nq)^{\delta} < (n^{\ell' - 1}q^{\ell'} )^{-1} = f\) (recall \(nq=n^c\)); so (the first part of) 27 gives \(\gamma (\ell') = O(\hat{\ell}(\delta)/\delta)\), and 63 then follows from the lower bound in 62 and the trivial \(\hat{\ell} (\delta)= O(\log n/\log (nq)).\) ◻

We should also note that (\(m :=\)) \(k - \ell'\geq 1\), which is given by 57 since \(n^{\tilde{\ell}+ 1}q^{\tilde{\ell}+ 2} \ge n^{-0.1c}nq > 1 > L^{-1/4}\) implies \(\ell' \leq \tilde{\ell}+1\). We may thus insert 61 and 63 in 60 , yielding \[N(H,F)= O(n (2enq)^{k - \ell' - 1}\log n/ \log f),\] which, for large enough \(L\), is less than \[(n^kq^k L^{.9k}) \cdot \left(f\log n/(L^{k/2}nq\log f)\right).\] In view of (12), it is thus enough to show \((f/\log f)\log n < L^{k/2}nq\), which, rewritten as \[(f/\log f)(\log n/\log (nq)) < L^{k/2} nq/\log(nq),\] is true because \(f/\log f < L^{1/4}nq /\log(nq)\) (since \(f < L^{1/4}nq\) and \(x/\log x\) is increasing for \(x\ge e\)) and, with plenty of room, \(\log n/\log(nq) < L^{k/4}\) follows from (14).
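For completeness, the last two facts assemble as

```latex
\frac{f}{\log f}\cdot\frac{\log n}{\log (nq)}
\;<\;\frac{L^{1/4}nq}{\log (nq)}\cdot L^{k/4}
\;\le\;\frac{L^{k/2}\,nq}{\log (nq)},
```

where the final inequality is just \(L^{1/4+k/4}\le L^{k/2}\) (valid since \(k\ge 1\)).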

Acknowledgments

QD was supported by NSF Grants DMS-1954035 and DMS-1928930. JK was supported by NSF Grants DMS-1954035 and DMS-2452069. JP was supported by NSF Grant DMS-2324978, NSF CAREER Grant DMS-2443706 and a Sloan Fellowship.

References

[1]
Quentin Dubroff, Jeff Kahn, and Jinyoung Park. On the "second" Kahn–Kalai conjecture. arXiv preprint arXiv:2508.14269, 2025.
[2]
Jeff Kahn and Gil Kalai. Thresholds and expectation thresholds. Combinatorics, Probability and Computing, 16(3):495–502, 2007.
[3]
Keith Frankston, Jeff Kahn, Bhargav Narayanan, and Jinyoung Park. Thresholds versus fractional expectation-thresholds. Annals of Mathematics, 194(2):475–495, 2021.
[4]
Elchanan Mossel, Jonathan Niles-Weed, Nike Sun, and Ilias Zadik. On the second Kahn–Kalai conjecture. arXiv preprint arXiv:2209.03326, 2022.
[5]
Michel Talagrand. Are many small sets explicitly small? In Proceedings of the forty-second ACM symposium on Theory of computing, pages 13–36, 2010.
[6]
Jinyoung Park and Huy Tuan Pham. A proof of the Kahn–Kalai conjecture. Journal of the American Mathematical Society, 37(1):235–243, 2024.
[7]
Svante Janson, Tomasz Łuczak, and Andrzej Ruciński. Random graphs. John Wiley & Sons, 2011.