The Computational Complexity of Almost Stable Clustering with Penalties


Abstract

We investigate the complexity of stable (or perturbation-resilient) instances of \(k\)-Means and \(k\)-Median clustering problems in metrics with small doubling dimension. While these problems have been extensively studied under multiplicative perturbation resilience in low-dimensional Euclidean spaces (e.g., [1], [2]), we adopt a more general notion of stability, termed “almost stable”, which is closer to the notion of \((\alpha, \varepsilon)\)-perturbation resilience introduced by [3]. Additionally, we extend our results to \(k\)-Means/\(k\)-Median with penalties, where each data point is either assigned to a cluster centre or incurs a penalty.

We show that certain special cases of almost stable \(k\)-Means/\(k\)-Median (with penalties) are solvable in polynomial time. To complement this, we examine the hardness of almost stable instances and \((1 + \frac{1}{poly(n)})\)-stable instances of \(k\)-Means/\(k\)-Median (with penalties), proving super-polynomial lower bounds on the runtime of any exact algorithm under the widely believed Exponential Time Hypothesis (ETH).

Clustering, \(k\)-Means, \(k\)-Median, Stability, Perturbation Resilience, Hardness Results

1 Introduction↩︎

A fundamental challenge in statistical data analysis is to organize an unlabelled set of data points into groups of similar objects. Clustering, a prominent technique of unsupervised learning, is widely used across scientific disciplines to address this challenge. The computational complexity of various clustering objectives has been extensively studied in the literature. While many well-known objectives, such as \(k\)-Means, \(k\)-Median, and \(k\)-Center, are \(\mathsf{NP}\)-hard to optimize [4][6], simple heuristics like local search have long been known to perform effectively on real-world data. This disparity has sparked significant interest in beyond worst-case analysis, aiming to identify structural properties of the input data points that enable heuristic algorithms to produce nearly optimal solutions.

[7] first introduced the perturbation resilience condition in the context of max-cut clustering, where the maximum cut would remain the unique optimum after multiplying the edge weights by a factor between \(1\) and \(1 + \gamma\). [8] studied stable instances of centre-based clustering objectives (including \(k\)-Means and \(k\)-Median) and proved that \(3\)-stable instances of these problems are solvable in polynomial time. Since then, the challenge has been to determine the smallest stability value for which the problem is in \(\mathsf{P}\). [3] improved the bound to \(1 + \sqrt{2}\), and [9] showed that with a slightly weaker promise of metric perturbation resilience, \(2\)-stable instances of a class of centre-based clustering objectives can be optimized in polynomial time. [1] focused on metrics of bounded doubling dimension (which include \(\mathbb{R}^d\) for constant \(d\)) and proved that for any constant \(\varepsilon\), \((1 + \varepsilon)\)-stable instances of \(k\)-Means and \(k\)-Median are solvable in polynomial time. They complemented this result by showing that when the dimension \(d\) is part of the input, a universal constant \(\varepsilon_\circ\) exists such that even a PTAS for \((1 + \varepsilon_\circ)\)-stable instances of \(k\)-Means in \(\mathbb{R}^d\) would imply \(\mathsf{NP}=\mathsf{RP}\).

The definition of stable clustering problems is motivated by the intuition that most real-world instances of such problems behave in a “stable” manner for the right objective [7], i.e., the structure of the optimum solution is resilient against small perturbations in input values. Here, the objective value of a solution is used as a proxy for its resilience to perturbation, i.e., for stable instances the solution with the optimum objective value must maintain its optimality under a set of valid perturbations. To address cases when this model of stability is too restrictive, the notion of \(\alpha\)-stability can be relaxed to ensure that perturbing the metric keeps the optimum solution “\(\beta\)-close” (for some notion of closeness) to the optimum solution of the original metric, for some \(\beta \geq 0\). [10] viewed stability as a property of the clustering algorithm itself, and allowed for various optimum solutions that are structurally similar under Hamming distance. [3] introduced \((\alpha, \beta)\)-stable instances of centre-based clustering (see Section 2 for details) and provided an approximation scheme for \((\alpha, \beta)\)-stable instances of \(k\)-Median when \(\alpha > 2 + \sqrt{3}\) and \(\beta\) is a function of the minimum size of a cluster in the optimal solution.

1.1 Our Contribution↩︎

We study the perturbation resilience of \(k\)-Means and \(k\)-Median in metrics of bounded doubling dimension. Our ultimate goal is to give tight bounds on the \((\alpha, \beta)\)-stability parameter of efficiently solvable problem instances. While we do not fully close the gap, we make significant progress towards this goal by establishing both upper and lower bounds, thereby providing a more refined picture of the complexity of the problem. Our results also extend to the penalty version of \(k\)-Means and \(k\)-Median, in which points may not be assigned to a cluster, thus incurring a penalty.

We first consider a generalization of \(\alpha\)-perturbation resilience, where the given instance is not required to have a unique optimum (see Section 2 for details). We show that such instances of \(k\)-Means and \(k\)-Median (with penalties) are solvable in polynomial time over doubling metrics:

Theorem [thm:warmup]. Fix any \(\varepsilon' > 0\). \((1 + \varepsilon')\)-stable instances of \(k\)-Means in doubling metrics can be solved in polynomial time.

Theorem [thm:almost_stable_pen]. Fix any \(\varepsilon' > 0\). \((1 + \varepsilon')\)-stable instances of \(k\)-Means with penalties in doubling metrics can be solved in polynomial time.

Next, we focus on the hardness of \((\alpha, \beta)\)-stable instances of the problem (which we call “almost stable” instances). Assuming ETH, we prove that an optimal solution cannot be obtained even if we allow the algorithm a moderately super-polynomial runtime of the form \(f(k) n^{o(\sqrt{k})}\), and even if we restrict the instance to a bounded-dimensional Euclidean space.

Theorem. For any \(\alpha\in(1,1.2)\), \(\beta>0\), there is no \(f(k) n^{o(\sqrt{k})}\)-time algorithm for \((\alpha, \beta)\)-stable Euclidean \(k\)-Median with penalties in \(\mathbb{R}^2\), unless ETH fails.

Theorem [thm:k-median-alpha-beta-stable-hardness-mult-mult]. For any \(\alpha\in(1,1.2)\), \(\beta>0\), there is no \(f(k) n^{o(\sqrt{k})}\)-time algorithm for \((\alpha, \beta)\)-stable \(k\)-Median in metrics of doubling dimension \(3\), unless ETH fails.

Finally, to complement our algorithmic upper bounds for \(\alpha\)-stable instances of \(k\)-Means/\(k\)-Median (with penalties), we ask: how close to \(1\) can \(\alpha\) be while the instance remains efficiently solvable? While we do not obtain APX-hardness, we show that for inverse-polynomial values of \(\varepsilon\), the problem is unlikely to admit even an FPT exact algorithm.

Theorem [thm:k-median-alpha-hardness-panlty]. There is no \(f(k) n^{o(k)}\)-time algorithm for \((1 + \mathcal{O}\left(\frac{1}{n^{12}}\right))\)-stable Euclidean \(k\)-Median with penalties in \(\mathbb{R}^4\), unless ETH fails.

Theorem. There is no \(f(k) n^{o(k)}\)-time algorithm for \((1 + \mathcal{O}\left(\frac{1}{n^{16}}\right))\)-stable Euclidean \(k\)-Median in \(\mathbb{R}^6\), unless ETH fails.

2 Preliminaries↩︎

Consider a doubling metric space \((V, \delta)\), where \(V\) is a set of points and \(\delta\) is a distance function over \(V\) with doubling dimension \(d\). We focus on stable instances of the discrete clustering problems \(\mathrm{\small k-Means}\) and \(\mathrm{\small k-Median}\), with and without penalties, over \(V\) with metric \(\delta\). Let \(X \subseteq V\) be a finite set of data points, \(C \subseteq V\) a finite set of candidate centres, \(p: X \to \mathbb{R}\) a penalty function, and \(k\) an integer. We denote a \(k\)-Means/\(k\)-Median instance as a triple \((X, C, \delta)\), and a \(k\)-Means/\(k\)-Median with penalties instance as a four-tuple \((X, C, p, \delta)\). A feasible solution to the instance is a set \(S \subseteq C\) with \(|S|=k\). For \(k\)-Means (\(k\)-Median) the cost of a solution is defined as \(\operatorname{cost}(S):=\sum_{j \in X} \delta(j, S)^2\) (respectively \(\operatorname{cost}(S):=\sum_{j \in X} \delta(j, S)\)), and for \(k\)-Means (\(k\)-Median) with penalties the cost of a solution is defined as \(\operatorname{cost}_{\mathrm{pen}}(S):=\sum_{j \in X} \min\left(\delta(j, S)^2, p(j)\right)\) (respectively \(\operatorname{cost}_{\mathrm{pen}}(S):=\sum_{j \in X} \min\left(\delta(j, S), p(j)\right)\)), where \(\delta(j, S):=\min_{i \in S} \delta(j, i)\).
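To make these objectives concrete, the following is a minimal Python sketch of \(\operatorname{cost}\) and \(\operatorname{cost}_{\mathrm{pen}}\) on a toy one-dimensional instance; the helper names (cost, cost_pen, dist) are ours and serve illustration only.

```python
# Sketch of the k-Means / k-Median objectives from this section (illustrative only).
# dist(j, i) plays the role of the metric delta; squared=True gives the k-Means cost.

def cost(S, X, dist, squared=True):
    """cost(S) = sum_j min_{i in S} delta(j, i)^2  (or delta(j, i) for k-Median)."""
    total = 0.0
    for j in X:
        d = min(dist(j, i) for i in S)
        total += d ** 2 if squared else d
    return total

def cost_pen(S, X, dist, p, squared=True):
    """cost_pen(S) = sum_j min( min_{i in S} delta(j, i)^2 , p(j) )."""
    total = 0.0
    for j in X:
        d = min(dist(j, i) for i in S)
        total += min(d ** 2 if squared else d, p(j))
    return total

if __name__ == "__main__":
    X = [0.0, 1.0, 5.0, 9.0]
    dist = lambda a, b: abs(a - b)
    p = lambda j: 4.0                      # a uniform penalty, purely for illustration
    S = [0.0, 5.0]                         # a feasible solution with k = 2
    print(cost(S, X, dist), cost_pen(S, X, dist, p))   # 17.0  5.0
```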

Next, we define \(\alpha\)-stable clustering. Intuitively, \(\alpha\)-stability for \(\mathrm{\small k-Means}/ \mathrm{\small k-Median}\) ensures that the optimal solution for a perturbed metric remains optimal for the original metric. We follow the concept of multiplicative stability, first introduced by [8] and [7], but present a more general version here. Let \(\alpha \geq 1\). We call an instance \(\mathcal{I}=(X, C, \delta)\) of \(k\)-Means/\(k\)-Median \(\alpha\)-stable if for any \(\delta \leq \delta^\prime \leq \alpha \delta\), every optimum solution of the \(k\)-Means/\(k\)-Median instance \(\mathcal{I}^{\prime}=\left(X, C, \delta^{\prime} \right)\) is also an optimum solution for \(\mathcal{I}\). Similarly, an instance \(\mathcal{I}=(X, C, \delta, p)\) of \(k\)-Means/\(k\)-Median with penalties is called \(\alpha\)-stable if for any \(\delta \leq \delta^\prime \leq \alpha \delta\) and any \(p'\) with \(p \leq p' \leq \alpha^2 \cdot p\) for \(k\)-Means and \(p \leq p^\prime \leq \alpha \cdot p\) for \(k\)-Median, every optimum solution of the \(k\)-Means/\(k\)-Median with penalties instance \(\mathcal{I}^\prime=(X, C, \delta^\prime, p^\prime)\) is also an optimum solution for \(\mathcal{I}\).

Let \(\operatorname{cost}(S)\) and \(\operatorname{cost}'(S)\) denote the cost of \(S\) before and after perturbation, respectively. We also study \((\alpha, \beta)\)-stable instances of \(k\)-Means and \(k\)-Median, as defined by [3]. Recall that in this generalization, \(\beta\) is a measure of proximity between feasible solutions. Various distance notions can be used to define \(\beta\)-closeness; we adopt \(\delta^{bij}_{\mathcal{I}}: \binom{C}{k} \times \binom{C}{k} \to \mathbb{R}^{\geq 0}\), defined below, but all our results easily extend to other natural notions of distance. For any \(S_1, S_2 \subseteq C\) with \(|S_1| = |S_2| = k\), define: \[\delta^{bij}_{\mathcal{I}}(S_1, S_2) := \min \left\{ \sum\nolimits_{s_1 \in S_1 } \delta(s_1, f(s_1)) \mid f \text{ is a bijection between } S_1 \text{ and } S_2 \right\}.\] We also generalize this notion to clustering with penalties. Let \(\alpha \geq 1\) and \(\beta \geq 0\). We call an instance \(\mathcal{I}=(X, C, \delta)\) (respectively \(\mathcal{I}=(X, C, \delta, p)\)) of \(k\)-Means/\(k\)-Median (with penalties) \((\alpha, \beta)\)-stable if for any optimum solution \(O \subseteq C\) of \(\mathcal{I}\), and any optimum solution \(O^\prime \subseteq C\) of the \(k\)-Means/\(k\)-Median (with penalties) instance \(\mathcal{I}^{\prime}=\left(X, C, \delta^{\prime}\right)\) (respectively \(\mathcal{I}^\prime=(X, C, \delta^\prime, p^\prime)\)), where \(\delta^\prime\) is any symmetric function such that \(\delta \leq \delta^\prime \leq \alpha \delta\) (and \(p'\) is any function such that \(p \leq p' \leq \alpha^2 \cdot p\) for \(k\)-Means, and \(p \leq p^\prime \leq \alpha \cdot p\) for \(k\)-Median), we have \(\delta^{bij}_{\mathcal{I}}(O, O^\prime) \leq \beta\).
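For small \(k\), \(\delta^{bij}_{\mathcal{I}}\) can be evaluated by brute force over all bijections; the sketch below (our own illustration, not an optimized routine) does exactly that, and a min-cost matching algorithm would replace the enumeration for larger \(k\).

```python
from itertools import permutations

def delta_bij(S1, S2, dist):
    """Minimum total matching distance over all bijections between S1 and S2
    (brute force over permutations; adequate for the small k used in examples)."""
    S1, S2 = list(S1), list(S2)
    assert len(S1) == len(S2)
    return min(sum(dist(a, b) for a, b in zip(S1, perm))
               for perm in permutations(S2))

# Example: two solutions whose centres differ only slightly.
dist = lambda a, b: abs(a - b)
print(delta_bij([0.0, 5.0], [5.1, 0.2], dist))   # 0.2 + 0.1 = 0.3
```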

Observe that setting \(\beta = 0\) yields the definition of \(\alpha\)-stable instances. If \(\beta > 0\), we call the instance of \(k\)-Means/\(k\)-Median an almost stable instance. Below, we study the performance of a local search algorithm for cases where \(\alpha\) is close to 1, that is \(\alpha = 1 + \varepsilon'\). To avoid confusion, we use \(\varepsilon'\) to refer to the stability parameter, and \(\varepsilon\) to refer to a parameter of our local search analysis. It should be mentioned that \(\varepsilon\) only affects the runtime of the local search and is a function of \(\varepsilon'\).

3 Polynomial Algorithms for Stable Clusterings↩︎

In this section, we build upon the exact polynomial-time algorithm of [1] and design a version that incorporates penalties. Despite high-level similarities, our analysis departs where needed to handle penalty terms. We first show how to obtain optima for \(\mathrm{\small k-Means}\) without penalties in doubling metrics (see Theorem [thm:warmup]). Next, we provide a polynomial-time algorithm for \(\mathrm{\small k-Means}\) with penalties (see Theorem [thm:almost_stable_pen]). Let \((X, C, p, \delta)\) be the given instance of \(k\)-Means with penalties. Define \(\mathcal{F}_k = \{S \subseteq C:|S|=k\}\) to be the set of feasible solutions. Throughout this section, we use Algorithm 1, a generalization of the local search method introduced by [1] to \(\mathrm{\small k-Means}\) with penalties. Their original algorithm is simply the standard \(\rho\)-swap local search algorithm, modified to perform the swap that provides the best improvement at each step. We set \(\rho=d^{O(d)} \cdot \varepsilon^{-O(d / \varepsilon)}\).

Figure 1: \rho-Swap Local Search
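Since we only use the behaviour of Algorithm 1 through the guarantees proved below, we do not reproduce its pseudocode; as a rough illustration, the following sketch implements a best-improvement \(\rho\)-swap local search of the kind described above (with an optional set of protected centres, used later for the dummy centre \(z^*\)). The function and parameter names are ours, and the exhaustive enumeration of swaps takes \(n^{O(\rho)}\) time.

```python
from itertools import combinations

def rho_swap_local_search(C, k, rho, cost_fn, protected=()):
    """Best-improvement rho-swap local search (illustrative sketch).

    Starts from an arbitrary feasible solution containing every protected
    centre, and in each round scans all swaps of at most rho centres, moving
    to the swap that decreases cost_fn the most.  Protected centres are never
    swapped out."""
    others = [c for c in C if c not in protected]
    S = set(protected) | set(others[: k - len(protected)])
    while True:
        best, best_cost = None, cost_fn(S)
        swappable = S - set(protected)
        for r in range(1, min(rho, len(swappable)) + 1):
            for out in combinations(swappable, r):
                for inc in combinations([c for c in C if c not in S], r):
                    cand = (S - set(out)) | set(inc)
                    c = cost_fn(cand)
                    if c < best_cost:
                        best, best_cost = cand, c
        if best is None:
            return S
        S = best
```

For example, it can be run with cost_fn set to the cost or cost_pen sketch from Section 2, e.g. `rho_swap_local_search(C, k=2, rho=1, cost_fn=lambda S: cost(S, X, dist))`.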

To expand Friggstad et al.’s terminology to the case with penalties, for any \(S, O \in \mathcal{F}_k\), define:

  • For \(j \in X\), let \(\sigma(j, S)\) be the centre in \(S\) nearest to \(j\), breaking ties by a fixed ordering of \(C\).

  • \(\overline{X}_{S, O}=\{j \in X: \sigma(j, S) \in S-O\) and \(\sigma(j, O) \in O-S\}\).

  • \(\overline{X}^{pen}_{S, O}=\{j \in X: \sigma(j, S) \in S-O\) and \(\sigma(j, O) \in O-S\) and \(\delta(j, S)^2, \delta(j, O)^2 < p(j)\) \(\}\).

  • \(\Psi(S, O)=\sum_{j \in \overline{X}_{S, O}} \left(\delta(j, \sigma(j, S))^2+\delta(j, \sigma(j, O))^2\right)\), \(\Psi_{pen}(S, O)=\sum_{j \in \overline{X}^{pen}_{S, O}} \left(\delta(j, \sigma(j, S))^2+\delta(j, \sigma(j, O))^2\right)\).

[1] first showed that Algorithm 1 visits a “nearly-good” solution in a polynomial number of rounds and returns such a solution upon termination. Next, they used the properties of stable instances to show that any nearly-good solution must be the unique optimal solution, hence Algorithm 1 is an exact polynomial-time algorithm to solve the problem. We slightly modify the notion of nearly-good solutions to handle technical aspects of our generalized definition of stability.


Definition 1. We call \(S \in \mathcal{F}_k\) a nearly-good solution if for every optimum solution \(F \in \mathcal{F}_k\) we have \(\operatorname{cost}(S) \leq \operatorname{cost}(F)+2 \varepsilon \cdot \Psi(S, F)\) for \(\mathrm{\small k-Means}\) and \(\operatorname{cost}_{\mathrm{pen}}(S) \leq \operatorname{cost}_{\mathrm{pen}}(F)+ 2 \varepsilon \cdot \Psi_{pen}(S, F)\) for \(\mathrm{\small k-Means}\) with penalties.
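As an illustration of this definition (under the same kind of toy helpers as before; sigma, psi_pen, and is_nearly_good are our own names), near-goodness can be checked directly once the set of optimum solutions is known:

```python
def sigma(j, S, dist):
    # nearest centre of S to j, ties broken by a fixed ordering of the centres
    return min(sorted(S), key=lambda i: dist(j, i))

def psi_pen(S, F, X, dist, p):
    """Psi_pen(S, F): sum over the points of X-bar^pen_{S,F}, i.e. points whose
    closest centres lie in S-F and F-S and which are served (not penalised) in both."""
    S, F = set(S), set(F)
    total = 0.0
    for j in X:
        sS, sF = sigma(j, S, dist), sigma(j, F, dist)
        if (sS in S - F and sF in F - S
                and dist(j, sS) ** 2 < p(j) and dist(j, sF) ** 2 < p(j)):
            total += dist(j, sS) ** 2 + dist(j, sF) ** 2
    return total

def is_nearly_good(S, optima, X, dist, p, eps, cost_fn):
    # Definition 1: the inequality must hold against every optimum solution F
    return all(cost_fn(S) <= cost_fn(F) + 2 * eps * psi_pen(S, F, X, dist, p)
               for F in optima)
```

Here cost_fn would be, for instance, the cost_pen sketch from Section 2 with \(X\), dist, and \(p\) fixed.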

Using a by now standard technique introduced by [11], we can show the existence of a desirable partitioning of the set of centres via a probabilistic argument. Fix any \(S', O', T \subseteq C\) and \(X' \subseteq X\) such that \(S'\) and \(O'\) are disjoint, \(T \subseteq O' \cup S'\), and \(|T| = |O'| = |S'|\). Define \(\Delta_j^T\) for each \(j \in X'\) to be \(\delta\left(j, \sigma\left(j, S^{\prime} \triangle T\right)\right)^2-\delta\left(j, \sigma\left(j, S^{\prime}\right)\right)^2\). \(\Delta_j^T\) denotes the change in \(j\)’s assignment cost when solution \(S'\) is replaced by \(S' \triangle T\). One obtains the following.

Theorem 1 (Theorem 5 in [1]). There is a randomized algorithm that samples a partition \(\pi\) of \(S^{\prime} \cup O^{\prime}\) such that \(\left|T \cap S^{\prime}\right|=\left|T \cap O^{\prime}\right| \leq \rho\) for each \(T \in \pi\) and \[\mathbf{E}_\pi\left[\sum\nolimits_{T \in \pi} \sum\nolimits_{j \in X'} \Delta_j^T\right] \leq \sum\nolimits_{j \in X'}(1+\varepsilon) \cdot \delta\left(j, O^{\prime}\right)^2-(1-\varepsilon) \cdot \delta\left(j, S^{\prime}\right)^2\,.\]

Next, we observe that if a feasible solution \(S\) is not nearly-good according to our stricter definition, the local search algorithm can still make a significant improving step from \(S\):

Theorem 2 (Theorem 6 in [1]). For any \(S, O \in \mathcal{F}_k\), if \(\operatorname{cost}(S)>\operatorname{cost}(O)+\varepsilon \cdot \Psi(S, O)\), then there exists an \(S^{\prime} \in \mathcal{F}_k\) with \(\left|S-S^{\prime}\right| \leq \rho\) and \(\operatorname{cost}\left(S^{\prime}\right) \leq \operatorname{cost}(S)+\big(\operatorname{cost}(O)-\operatorname{cost}(S) + \varepsilon \cdot \Psi(S, O)\big)/k\).

Rather than recreating Friggstad et al.’s proof of the theorem, we simply observe that at no point does their proof require \(O\) to be globally optimal or \(S\) to be locally optimal. Theorem 2 implies that Algorithm 1 converges to a nearly-good solution in polynomially many iterations. Due to space constraints, we defer the proof to the appendix.

Lemma 1. Given any instance of the \(k\)-Means problem, Algorithm 1 with \(\rho=d^{O(d)} \cdot \varepsilon^{-O(d / \varepsilon)}\) terminates at a nearly-good solution. Also, within \(2 k \cdot \ln (n \Delta)\) iterations, the algorithm produces a nearly-good solution, where \(\Delta=\max _{j \in X, i \in C} \delta^2(i, j)\).

3.1 Stable \(k\)-Means without Penalties↩︎

Lemma 1 alone does not guarantee that Algorithm 1 finds an optimal solution for \((1 + \varepsilon')\)-stable instances of \(k\)-Means. We use the stability condition to show that a nearly-good solution must be one of the optimal ones for the instance. Combined with the fact that the returned solution of Algorithm 1 is nearly-good, we get Theorem [thm:warmup]:

Proof. We refrain from recreating the argument of [1] in its entirety and refer interested readers to the proof of Lemma 2 in [1]; here we only describe the modifications our more general stability notion requires. We first perturb the distances as follows:

\[\label{eq:perturbed-dist} \delta^{\prime}(i, j)=\begin{cases} \left(1+\varepsilon^{\prime}\right) \cdot \delta(i, j) & \text{if } i \neq \sigma(j, S)\,, \\ \delta(i, j) & \text{otherwise.} \end{cases}\tag{1}\]
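As a quick sanity check of this perturbation, the following sketch (toy helpers, our own naming) builds \(\delta'\) and confirms the identity \(\operatorname{cost}'(S) = \operatorname{cost}(S)\) used below: distances to a point’s own centre \(\sigma(j, S)\) are left untouched, so the cost of \(S\) itself does not change.

```python
def perturbed_dist(dist, S, X, eps_prime):
    """delta' of (1): every point-centre distance is stretched by (1+eps'),
    except the distance from each data point j to its own centre sigma(j, S)."""
    assigned = {j: min(sorted(S), key=lambda i: dist(j, i)) for j in X}
    return lambda j, i: dist(j, i) if i == assigned[j] else (1 + eps_prime) * dist(j, i)

# Toy check that cost'(S) = cost(S): only non-assigned distances were stretched.
X, S = [0.0, 1.0, 5.0, 9.0], [0.0, 5.0]
dist = lambda a, b: abs(a - b)
dist_p = perturbed_dist(dist, S, X, eps_prime=0.1)
kmeans_cost = lambda d: sum(min(d(j, i) for i in S) ** 2 for j in X)
print(kmeans_cost(dist), kmeans_cost(dist_p))   # identical values (17.0, 17.0)
```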

We present a slightly modified version of Lemma 2 in [1]:

Lemma 2 (Lemma 2 of [1], Restatement). Let \(S, O \in \mathcal{F}_k\) be feasible solutions. If \(cost(S) \leq cost(O) + 2\varepsilon \Psi(S, O)\), then \(cost'(S) \leq cost'(O)\).

In this lemma, we let \(S\) be the nearly-good solution returned by Algorithm 1. Let \(O\) be any optimum solution of \(\mathcal{I}^\prime\). By the stability condition, \(O\) is also an optimum solution for \(\mathcal{I}\). We observe that \[\operatorname{cost}'(S) = \operatorname{cost}(S) \leq \operatorname{cost}'(O),\] where \(\operatorname{cost}'\) indicates the cost of a solution under \(\delta'\) and the inequality follows from Lemma 2. Therefore \(S\) is an optimum solution for \(\mathcal{I}^\prime\). Using the stability condition one more time, we conclude that \(S\) must also be an optimum solution of \(\mathcal{I}\), and thus \(S \in \mathcal{O}= \{O_1, \ldots, O_\ell\}\), the set of optimum solutions of \(\mathcal{I}\). Assume that Algorithm 1 encounters \(S\) in iteration \(t \leq \left\lfloor 2k\cdot \ln (\Delta n)\right\rfloor\). Since \(S\) has the lowest possible cost for the instance \(\mathcal{I}\), the local search must terminate at iteration \(t\) and return \(S\). ◻

3.2 Stable \(k\)-Means with Penalties↩︎

In this section, we provide polynomial-time exact algorithms for stable instances of \(k\)-Means with penalties in doubling metrics. All the results in this section extend to \(k\)-Median with minor modifications. Specifically, we show how to solve \((1 + \varepsilon')\)-stable instances of \(k\)-Means with penalties exactly in polynomial time when the metric has bounded doubling dimension. Recall that for such instances, every optimal solution of any validly perturbed instance is also optimal for the original instance.

The local search in [1] is oblivious to penalties and cannot guarantee nearly-good solutions in this new setting. Our core idea for addressing this issue is to capture the penalties for the points of \(X\) by adding a dummy centre \(z^*\) to \(V\) (specifically, to \(C\)) and defining \(C'=C \cup \{z^*\}\). We extend \(\delta\) to \(V \cup \{z^*\}\) by setting \(\delta(z^*, j)^2 = \delta(j, z^*)^2 = p(j)\) for every \(j \in X\), and \(\delta(z^*, c) = 0\) for all \(c \in C'\). Note that \(\delta\) no longer satisfies the triangle inequality, making the analysis more challenging. For any solution \(S\) of the \((k+1)\)-Means instance \((X, C^{\prime}, \delta)\) that contains \(z^*\), the cost of \(S\) equals the cost of \(S \backslash \{z^*\}\) for the original \(k\)-Means with penalties instance \((X, C, p, \delta)\). Thus, the task of finding the optimum solution of the \((X, C, p, \delta)\) instance is reduced to finding the optimum solution that contains \(z^*\) for the \((k+1)\)-Means instance \((X, C^{\prime}, \delta)\). Our key insight here is to modify the \(\rho\)-swap algorithm to ensure it always makes the most improving swap among those that do not evict \(z^*\). This eliminates the need to invoke the triangle inequality on \(z^*\).
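A minimal sketch of this reduction, assuming \(X\) and \(C\) are disjoint and using our own names (add_dummy_centre, Z_STAR):

```python
import math

def add_dummy_centre(X, C, p, dist):
    """Reduction sketch: turn the k-Means-with-penalties instance (X, C, p, dist)
    into a (k+1)-Means instance whose candidate set contains a dummy centre z*;
    serving a data point j from z* costs exactly its penalty p(j)."""
    Z_STAR = "z*"                          # hypothetical sentinel for the dummy centre
    C_prime = list(C) + [Z_STAR]
    def dist2(a, b):
        if a == Z_STAR or b == Z_STAR:
            other = b if a == Z_STAR else a
            if other == Z_STAR or other in C:
                return 0.0                 # centre-centre distances never enter the cost
            return math.sqrt(p(other))     # squared distance to z* equals p(j)
        return dist(a, b)
    return C_prime, dist2
```

The \(\rho\)-swap sketch from earlier in this section would then be run with \(k+1\) centres and \(z^*\) marked as protected, mirroring the modification described above.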

We first show that our modified local search converges to a nearly-good solution in polynomial time. Then, we prove that the stability condition ensures that any such nearly-good solution is, in fact, the optimal one. While the structure of our proof is similar to Friggstad et al.’s, we need to tackle challenges in each step. We will highlight how we address these challenges in what follows.

Before rigorously proving Theorem [thm:almost_stable_pen], we prove that a nearly-good solution can be found in polynomial time for \(\mathrm{\small k-Means}\) with penalties as well. To do so, it suffices to replicate Theorem 2 for the penalty setting. Consider the original instance \((X, C, p, \delta)\) and fix any \(S, O \in \mathcal{F}_k\). Define \(S^{\prime} = S \backslash O\) and \(O^{\prime} = O \backslash S\), and fix some \(T \subseteq O^{\prime} \cup S'\) such that \(|T| = |O'| = |S'|\). Moreover, for any \(j \in \overline{X}^{pen}_{S, O}\) define \(\tilde{\Delta}_j^T\) to be \(\min(\delta\left(j, \sigma\left(j, S^{\prime} \triangle T\right)\right)^2, p(j))-\delta\left(j, \sigma\left(j, S^{\prime}\right)\right)^2\). Using Theorem 1 with \(X' = \overline{X}^{pen}_{S, O}\), there exists a randomized algorithm that samples a partition \(\pi\) of \(S^{\prime} \cup O^{\prime}\) with \(\left|T \cap S^{\prime}\right|=\left|T \cap O^{\prime}\right| \leq \rho\) for each \(T \in \pi\) such that \[\label{eq:rand-partition-penalty} \begin{align} \mathbf{E}_\pi\left[\sum\nolimits_{T \in \pi} \sum\nolimits_{j \in \overline{X}^{pen}_{S, O}} \tilde{\Delta}_j^T\right] &\leq \mathbf{E}_\pi\left[\sum\nolimits_{T \in \pi} \sum\nolimits_{j \in \overline{X}^{pen}_{S, O}} \delta\left(j, \sigma\left(j, S^{\prime} \triangle T\right)\right)^2 -\delta\left(j, \sigma\left(j, S^{\prime}\right)\right)^2\right] \\ &\leq \sum\nolimits_{j \in \overline{X}^{pen}_{S, O}}(1+\varepsilon) \cdot \delta\left(j, O^{\prime}\right)^2-(1-\varepsilon) \cdot \delta\left(j, S^{\prime}\right)^2 \end{align}\tag{2}\]

We defer the proof of the following theorem to the appendix.

Theorem 3. For any \(S, O \in \mathcal{F}_k\), if \(\operatorname{cost}_{\mathrm{pen}}(S)>\operatorname{cost}_{\mathrm{pen}}(O)+\varepsilon \cdot \Psi_{pen}(S, O)\), then there exists an \(S' \in \mathcal{F}_k\) with \(\left|S-S^{\prime}\right| \leq \rho\) such that \[\operatorname{cost}_{\mathrm{pen}}\left(S^{\prime}\right) \leq \operatorname{cost}_{\mathrm{pen}}(S)+\frac{\operatorname{cost}_{\mathrm{pen}}(O)-\operatorname{cost}_{\mathrm{pen}}(S)+\varepsilon \cdot \Psi_{pen}(S, O)}{k}\]

Now similarly to how one obtains Lemma 1, we can derive the following lemma.

Lemma 3. For \(k\)-Means with penalties, when Algorithm 1 terminates, the returned solution is nearly-good. Also, within \(2 k \cdot \ln (n \Delta)\) iterations, Algorithm 1 encounters an iteration in which \(S\) is a nearly-good solution, where \(\Delta=\max _{j \in X, i \in C} \delta^2(i, j)\).


3.2.0.1 Proof of Theorem [thm:almost_stable_pen]

Let \(\varepsilon\) be such that \(1+6 \varepsilon=\left(1+\varepsilon^{\prime}\right)^2\); roughly speaking, \(\varepsilon \approx \varepsilon^{\prime} / 3\) for small \(\varepsilon^{\prime}\). Moreover, let \(S\) be the nearly-good solution that Algorithm 1 encounters within \(2 k \cdot \ln (n \Delta)\) iterations (by Lemma 3). Define a perturbed distance function \(\delta^{\prime}\) as in 1 and a perturbed penalty function via \[\label{eq:perturbed-pen} p^{\prime}(j) = \begin{cases} \left(1+\varepsilon^{\prime}\right)^2 p(j) & \text{if } \delta(j, S)^2 < p(j)\,, \\ p(j) & \text{otherwise.} \end{cases}\tag{3}\] Moreover, for any \(S^{\prime} \in \mathcal{F}_k\), let \(\operatorname{cost}_{\mathrm{pen}}^{\prime}\left(S^{\prime}\right)=\sum_{j \in X} \min(\min _{i \in S^{\prime}} \delta^{\prime}(i, j)^2, p^{\prime}(j))\) be the cost of \(S^{\prime}\) under \(\delta^{\prime}\) and \(p^\prime\). Clearly, \(\delta \leq \delta^\prime \leq (1 + \varepsilon^\prime) \cdot \delta\) and \(p \leq p^\prime \leq (1 + \varepsilon^\prime)^2 \cdot p\). Let \(O_q\) be any optimum solution w.r.t. \(\operatorname{cost}_{\mathrm{pen}}^{\prime}\); by the stability condition, \(O_q\) is also an optimum solution w.r.t. \(\operatorname{cost}_{\mathrm{pen}}\). Therefore, \(S\) is nearly-good for \(O_q\). Partition \(X\) as follows:

  • \(X^1=\{j \in X: \sigma(j, S) \in S-O_q\) and \(\sigma(j, O_q) \in S \cap O_q\) and \(\delta(j, S)^2 < p(j)\}\)

  • \(X^2=\{j \in X: \sigma(j, S) \in S \cap O_q\) and \(\sigma(j, O_q) \in O_q-S\) and \(\delta(j, O_q)^2 < p(j)\}\)

  • \(X^3=\{j \in X: \sigma(j, S), \sigma(j, O_q) \in S \cap O_q\) or \(\delta(j, S)^2, \delta(j, O_q)^2 \geq p(j)\}\)

  • \(X^4=\{j \in X: \sigma(j, S) \in S-O_q\) and \(\sigma(j, O_q) \in O_q-S\) and \(\delta(j, S)^2, \delta(j, O_q)^2 < p(j)\}\).

Observe that \(X^4=\overline{X}^{pen}_{S, O_q}\). As in the proof of Theorem 3, let \(c_j^*=\min\{\delta(j, \sigma(j, O_q))^2, p(j)\}\) be the cost incurred by point \(j\) in the optimum solution, and analogously \(c_j=\min\{\delta(j, \sigma(j, S))^2, p(j)\}\) for \(S\). To calculate \(\operatorname{cost}_{\mathrm{pen}}^{\prime}(O_q)\), we consider each \(j \in X\) case by case.

i) For \(j \in X^1\) or \(j \in X^4\) we have \(p'(j) = (1 + \varepsilon^{\prime})^2 p(j)\) and \(\delta^{\prime}(i, j) = (1 + \varepsilon^{\prime}) \delta(i, j)\) for all \(i \in O_q\). Thus, \(\min\{\min _{i \in O_q} \{\delta^{\prime}(i, j)^2\}, p^{\prime}(j)\} = (1 + \varepsilon^{\prime})^2 c^*_j\). ii) For \(j \in X^2\), only \(i = \sigma(j, S) \in O_q\) does not have its distance \(\delta^{\prime}(i, j)\) multiplied by \((1 + \varepsilon^{\prime})\). Thus, if \(\delta^2(j, S) \leq p(j)\) we have \(\min\{\min _{i \in O_q} \{\delta^{\prime}(i, j)^2\}, p^{\prime}(j)\} = \min\{ (1 + \varepsilon^{\prime})^2 c^*_j, \delta(j, S)^2\}\). Otherwise, we have that \(\min\{\min _{i \in O_q} \{\delta^{\prime}(i, j)^2\}, p^{\prime}(j)\} = \min\{ (1 + \varepsilon^{\prime})^2 c^*_j, p(j)\}\). Therefore, \(\min\{\min _{i \in O_q} \delta^{\prime}(i, j)^2, p^{\prime}(j)\} = \min\{ (1 + \varepsilon^{\prime})^2 c^*_j, c_j\}\). iii) For \(j \in X^3\) clearly \(\min\{\min _{i \in O_q} \delta^{\prime}(i, j)^2, p^{\prime}(j)\} = c_j = c^*_j\). Thus we derive

\[\label{eq:k-means-alpha-solution-penalty-polynomial-1} \begin{align} \operatorname{cost}_{\mathrm{pen}}^{\prime}(O_q)=\sum\nolimits_{j \in X^1} \left(1+\varepsilon^{\prime}\right)^2 \cdot c_j^* +\sum\nolimits_{j \in X^2} \min \left\{\left(1+\varepsilon^{\prime}\right)^2 \cdot c_j^*, c_j \right\} \\ +\sum\nolimits_{j \in X^3} c_j^*+\sum\nolimits_{j \in X^4} \left(1+\varepsilon^{\prime}\right)^2 \cdot c_j^*. \end{align}\tag{4}\]

Finally, we make one last observation, the proof of which is deferred to the appendix.

Lemma 4. If \(S\) is nearly-good for \(O_q\), then \(\sum_{j \in X^4} c_j \leq(1+6 \varepsilon) \cdot\left(\sum_{j \in X^1} c_j^*+\sum_{j \in X^4} c_j^*\right)\).

By Lemma 4, we bound \(\operatorname{cost}_{\mathrm{pen}}(S)\) in the following way: \[\label{eq:k-means-alpha-solution-penalty-polynomial-3} \begin{align} \operatorname{cost}_{\mathrm{pen}}(S) & \leq \operatorname{cost}_{\mathrm{pen}}(O_q)+2 \varepsilon \cdot \Psi_{pen}(S, O_q)=\operatorname{cost}_{\mathrm{pen}}(O_q)+2 \varepsilon \sum_{j \in X^4} c_j^*+2 \varepsilon \sum_{j \in X^4} c_j \\ & \leq \operatorname{cost}_{\mathrm{pen}}(O_q)+2 \varepsilon \sum_{j \in X^4} c_j^*+2 \varepsilon \cdot(1+6 \varepsilon) \cdot \left(\sum_{j \in X^1} c_j^*+\sum_{j \in X^4} c_j^* \right) \\ & \leq \sum_{j \in X^1}(1+6 \varepsilon) \cdot c_j^*+\sum_{j \in X^2} c_j^*+\sum_{j \in X^3} c_j^*+\sum_{j \in X^4}(1+6 \varepsilon) \cdot c_j^* \\ & \leq \sum_{j \in X^1}(1+6 \varepsilon) \cdot c_j^*+\sum_{j \in X^2} \min \left\{(1+6 \varepsilon) \cdot c_j^*, c_j\right\}+\sum_{j \in X^3} c_j^*+\sum_{j \in X^4}(1+6 \varepsilon) \cdot c_j^*. \end{align}\tag{5}\] The last bound again uses \(c_j^* \leq c_j\) for \(j \in X^2\). Recall we chose \(\varepsilon\) so that \((1+6 \varepsilon)=\left(1+\varepsilon^{\prime}\right)^2\). Thus, combining 4 and 5 with the simple observation that \(\operatorname{cost}_{\mathrm{pen}}^{\prime}(S)=\operatorname{cost}_{\mathrm{pen}}(S)\), we get that \(\operatorname{cost}_{\mathrm{pen}}^{\prime}(S)=\operatorname{cost}_{\mathrm{pen}}(S) \leq \operatorname{cost}_{\mathrm{pen}}^{\prime}(O_q)\), which means that \(S\) is an optimum solution for \((X, C, p^{\prime}, \delta^{\prime})\). Applying the stability condition once more, \(S\) is then also an optimum solution of the original instance \((X, C, p, \delta)\). This completes the proof.

4 Hardness of \((\alpha, \beta)\)-stable \(k\)-Median↩︎

Section 3 discussed steps towards finding optimum and near-optimum solutions of stable \(k\)-Means/\(k\)-Median with or without penalties. To complement that, in this section we establish hardness results for \((\alpha, \beta)\)-stable \(k\)-Median; the question of whether the exact optimum solution of \((\alpha, \beta)\)-stable \(k\)-Means/\(k\)-Median with penalties can be obtained efficiently remains open. In Section 4.1 and Section 4.2, we obtain hardness results for \((\alpha, \beta)\)-stable \(k\)-Median, respectively with and without penalties, for any \(\alpha\in(1,1.2)\), \(\beta>0\). While, for ease of presentation, our hardness results are shown for \(k\)-Median, every result in this section readily extends to \(k\)-Means as well.

The results of this section are inspired by a reduction from the Grid Tiling Inequality problem to \(k^2\)-Median with penalties, introduced by [12]. Here we present our adapted version of this reduction.

Consider any Grid Tiling Inequality problem with integer \(n\) and collection \(S\) of \(k^2\) non-empty sets. We define a Euclidean \(k^2\)-Median instance with penalties similar to the one introduced in Theorem 6.2 of [12]. Let \(\mathcal{I}(S, n) = (X^{grid}, C^{grid}, p^{grid}, \delta)\) denote this instance. Fix \(\varepsilon= O(\beta / n^3)\) for any \(\beta > 0\). Define a set of data points \(X^{grid}\) in the region consisting of a square of side length \(2 k+\varepsilon(n-1)\), where the lower left corner of the square is on the origin (i.e., \(A=\{(x, y) \mid\) \(0 \leq x, y \leq 2 k+\varepsilon(n-1)\})\). They are spaced evenly in a grid \(G\), with two consecutive (horizontal or vertical) data points at distance \(\varepsilon\) from each other; thus there are \(\varSigma=(2 k / \varepsilon+n)^2\) data points. Each data point \(\vec{j}\) has a penalty of \(p^{grid}(\vec{j}) = 1\). Thinking of this grid as a discrete approximation of the uniform measure on the square \(A\), we work with the discrete measure \(\mu\) carried by the data points, where each data point is weighted 1, so that \(\iint_A d \mu=\varSigma\).

For each set \(S_{i, j}\), we introduce \(\left|S_{i, j}\right| \leq n^2\) candidate centres, and let \(C_{i, j}\) denote the set of such candidate centres (note that there are \(k^2\) such sets), where \(C_{i, j}=\{(2 i-1,2 j-1)+\varepsilon(u-1, v-1) \mid\) \(\left.(u, v) \in S_{i, j}\right\}\). Note that the candidate centres are also placed on vertices of \(G\), and that if \(S_{i, j}\) has all possible pairs so that \(S_{i, j}=[n] \times[n]\), then \(C_{i, j}\) precisely forms a subgrid of \(n^2\) evenly spaced points in which consecutive points are at distance \(\varepsilon\) from each other and the lower left point of the subgrid lies at \((2 i-1,2 j-1)\). The final set of candidate centres is given by \(C^{grid}=\cup_{1 \leq i, j \leq k} C_{i, j}\).
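The construction can be summarized by the following sketch (our own naming; the concrete choice \(\varepsilon = \beta/n^3\) is one admissible instantiation of \(\varepsilon = O(\beta/n^3)\)). Note that for realistic parameters the instance is enormous, so the toy call below uses tiny values.

```python
def build_grid_instance(n, k, S, beta=1.0):
    """Sketch of I(S, n): data points on an eps-grid over the square of side
    2k + eps*(n-1) with unit penalties, plus the candidate-centre sets C_{i,j}."""
    eps = beta / n ** 3                    # one admissible choice of eps = O(beta / n^3)
    m = int(round(2 * k / eps)) + n        # grid points per axis, so |X^grid| = m^2
    X_grid = [(a * eps, b * eps) for a in range(m) for b in range(m)]
    penalties = {x: 1.0 for x in X_grid}
    C_grid = {}
    for (i, j), pairs in S.items():        # S[(i, j)] is a non-empty subset of [n] x [n]
        C_grid[(i, j)] = [(2 * i - 1 + eps * (u - 1), 2 * j - 1 + eps * (v - 1))
                          for (u, v) in pairs]
    return X_grid, C_grid, penalties, eps

# Toy call with n = k = 2 and a large beta (so that eps = 1):
X_grid, C_grid, penalties, eps = build_grid_instance(
    n=2, k=2, S={(1, 1): {(1, 1)}, (1, 2): {(1, 2)},
                 (2, 1): {(2, 1)}, (2, 2): {(1, 1), (2, 2)}}, beta=8.0)
print(len(X_grid), eps, C_grid[(2, 2)])
```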

Theorem 4 (Theorem 6.2 of [12]). There exists a \(\nu \geq 0\) such that the Grid Tiling Inequality problem with integer \(n\) and collection \(S\) has a solution if and only if there exists a solution to \(\mathcal{I}(S, n)\) with cost at most \(\nu\).

Previously, we mentioned that \(\mu\) approximates the uniform measure. Let us elaborate on what we mean by this. Define \(R =[-0.5\varepsilon, 2k + \varepsilon (n - 0.5)] \times [-0.5\varepsilon, 2k + \varepsilon (n - 0.5)]\) to be a square with bottom left corner \((-0.5\varepsilon, -0.5\varepsilon)\) and top right corner \((2k + \varepsilon (n - 0.5), 2k + \varepsilon (n - 0.5))\). Denote by \(u_R\) the uniform probability measure on \(R\) and define \(\mu^\star := \varepsilon^2 \varSigma \cdot u_R\), i.e., \(\mu^\star\) is the Lebesgue (area) measure restricted to \(R\). In other words, \(\mu^\star\) has constant density one on \(R\), so that \(\mu^\star(R) = \varepsilon^2 \varSigma\) and \(\mu^\star(M) = 1\) for any unit square \(M \subseteq R\). Notice that for any measurable set \(D\) and measurable function \(f\), \(\varepsilon^2 \iint_D f d \mu\) can be viewed as a two-dimensional Riemann sum for \(\iint_{D} f d \mu^\star\). Therefore, \(\iint_D f d \mu\) approximates \(\frac{1}{\varepsilon^2} \iint_D f d \mu^\star\). The following lemma formalizes this approximation for the two specific choices of \(f\) and \(D\) that will be used later. We defer the proof to the appendix.

Lemma 5. Fix any \(\vec{a} \in C\). Define \(D_r\) to be the circle with centre \(\vec{a}\) and radius \(r \leq 1\). Then
(i) \(\frac{1}{\varepsilon^2 }\iint_{D_{r - \varepsilon}} d \mu^\star \leq \iint_{D_{r}} 1 d \mu \leq \frac{1}{\varepsilon^2 }\iint_{D_{r + \varepsilon}} d \mu^\star\),
(ii) \(\iint_{D_r} \delta(\vec{x}, \vec{a}) d \mu \leq \frac{1}{\varepsilon^2} \left(\iint_{D_{r + \varepsilon}} \delta(\vec{x}, \vec{a}) d \mu^\star + \varepsilon \iint_{D_{r + \varepsilon}} d \mu^\star \right).\)
Here \(\iint_{D_{r}} 1 d \mu^\star = \pi r^2\) and \(\iint_{D_r} \delta(\vec{x}, \vec{a}) d \mu^\star = \frac{2}{3} \pi r^3\).
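As a numeric illustration of part (i) of the lemma, the following sketch (our own helper names) counts \(\varepsilon\)-grid points inside a disk and compares the count with the rescaled areas \(\pi(r \mp \varepsilon)^2 / \varepsilon^2\):

```python
import math

def grid_disk_count(centre, r, eps, extent):
    """Count eps-grid points inside the disk of radius r (the discrete measure mu)."""
    cx, cy = centre
    cnt = 0
    steps = int(extent / eps) + 1
    for a in range(steps):
        for b in range(steps):
            if (a * eps - cx) ** 2 + (b * eps - cy) ** 2 <= r ** 2:
                cnt += 1
    return cnt

# Lemma 5 (i) in action: the grid count is sandwiched between pi*(r -/+ eps)^2 / eps^2.
eps, r, centre, extent = 0.01, 0.8, (2.0, 2.0), 4.0
count = grid_disk_count(centre, r, eps, extent)
print(math.pi * (r - eps) ** 2 / eps ** 2, count, math.pi * (r + eps) ** 2 / eps ** 2)
```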

4.1 Stable Euclidean \(k\)-Median with Penalties↩︎

In this section, we prove that for any \(1 < \alpha < 1.2\) and \(\beta > 0\), \((\alpha, \beta)\)-stable Euclidean \(k\)-Median with penalties in \(\mathbb{R}^2\) is hard. The proof idea is to show that \(\mathcal{I}(S, n)\) is also stable. Since any two centres in any \(C_{i, j}\) for \(1 \leq i, j \leq k\) are close, to show \((\alpha, \beta)\)-stability it is enough to show that, even if we perturb distances by a constant factor, the optimum solution consists of \(k^2\) candidate centres, one from each \(C_{i, j}\) with \(i, j \leq k\). Intuitively, this is not hard to see, as the cost of switching from \(\vec{a} \in \mathcal{C}_{i, j}\) to another \(\vec{b} \neq \vec{a} \in \mathcal{C}_{i, j}\) is relatively small. Conversely, by adding a centre \(\vec{c}\) in a cluster \(\mathcal{C}_{i', j'}\) that currently lacks a centre, we can significantly reduce the costs for points located near \(\vec{c}\). We now prove the theorem stated in Section 1.1.

Proof. By Theorem 4, it suffices to show that \(\mathcal{I}(S, n)\) is also \((\alpha, \beta)\)-stable. Consider any \(\delta \leq \delta^\prime \leq \alpha \cdot \delta\) and \(p \leq p^\prime \leq \alpha \cdot p\). Define \(\operatorname{cost}_{\mathrm{pen}}^\prime(S) = \sum_{j \in X^{grid}} \min_{i \in S} \left\{\min \left\{\delta^\prime(i, j), p^\prime(j)\right\}\right\}\) for every \(S \subseteq C^{grid}\). First, we prove that every optimal solution \(O\) of the \(k^2\)-Median with penalties instance \((X^{grid}, C^{grid}, \delta^{\prime}, p')\) cannot have two centres in a single \(C_{i, j}\) for \(1 \leq i, j \leq k\). For the sake of contradiction, assume that there exist \(1 \leq i, j \leq k\) and \(\vec{a}, \vec{b} \in C_{i, j} \cap O\) with \(\vec{a} \neq \vec{b}\). By the pigeonhole principle, there exist \(i'\) and \(j'\) with \(1 \leq i^{\prime}, j^{\prime} \leq k\) such that \(C_{i^{\prime}, j^{\prime}} \cap O = \varnothing\). Let \(O^{\prime} = (O \backslash \{\vec{a}\}) \cup \{\vec{c}\}\) for any \(\vec{c} \in C_{i^{\prime}, j^{\prime}}\). We then argue that \(\operatorname{cost}_{\mathrm{pen}}^\prime(O^{\prime})\) is smaller than \(\operatorname{cost}_{\mathrm{pen}}^\prime(O)\), which is a contradiction. Let \(D\) be the set of all points in a circle with centre \(\vec{c}\) and radius \(\frac{1}{\alpha}\). Every \(\vec{x} \in D\) satisfies \(\delta^{\prime}(\vec{x}, \vec{c}) \leq \alpha \delta(\vec{x}, \vec{c}) \leq 1\). Also, since every centre \(\vec{d} \in O\) satisfies \(\delta(\vec{c}, \vec{d}) \geq 2 - (n - 1) \varepsilon\), we have \(\delta^\prime(\vec{d}, \vec{x}) \geq \delta(\vec{d}, \vec{x}) \geq \delta(\vec{c}, \vec{d}) - \delta(\vec{c}, \vec{x}) \geq 2 - \frac{1}{\alpha} - (n - 1) \varepsilon\), which is greater than 1 for large enough \(n\) (for \(n\) polynomial in \(\frac{1}{\alpha - 1}\)). Also, \(p'(\vec{x}) \geq p^{grid}(\vec{x}) = 1 \geq \delta^{\prime}(\vec{x}, \vec{c})\). Therefore, every point in \(D\) will be assigned to \(\vec{c}\) in \(O^\prime\). Thus, adding \(\vec{c}\) to \(O\) will decrease \(\operatorname{cost}_{\mathrm{pen}}^\prime(O)\) at least by \[\label{eq:k-median-penalty-alpha-beta-stable-hardness-1} \begin{align} \iint_{D} (1 - \delta^{\prime}(\vec{x}, \vec{c})) d\mu &\geq \iint_{D} \left(1 - \alpha \delta(\vec{x}, \vec{c})\right) d\mu \\ & \stackrel{(i)}{\geq} \frac{1}{\varepsilon^2} \left( \pi \left(\frac{1}{\alpha} - \varepsilon\right)^2 - \pi \alpha \left( \frac{2}{3} \cdot \left(\frac{1}{\alpha} + \varepsilon\right)^3 + \varepsilon \left(\frac{1}{\alpha} + \varepsilon\right)^2 \right) \right), \end{align}\tag{6}\] where (i) is due to Lemma 5. Next, note that in order for any \(\vec{x} \in A\) to be assigned to \(\vec{a}\), \(\delta^{\prime}(\vec{x}, \vec{a})\) should be less than \(p'(\vec{x}) \leq \alpha\). Consequently, all such \(\vec{x}\) should have \(\delta(\vec{x}, \vec{a}) \leq \alpha\), i.e., they should belong to a circle with radius \(\alpha\) and centre \(\vec{a}\), which we denote by \(D^{\prime}\). Next, notice that \[\label{eq:k-median-penalty-alpha-beta-stable-hardness-2} \begin{align} \delta^{\prime}(\vec{x}, \vec{b}) & \leq \alpha \delta(\vec{x}, \vec{b}) \leq \alpha (\delta(\vec{x}, \vec{a}) + \varepsilon (n - 1)) = \delta(\vec{x}, \vec{a}) + \left(\alpha - 1\right) \delta(\vec{x}, \vec{a}) + \alpha \varepsilon (n - 1) \\ & \leq \delta^{\prime}(\vec{x}, \vec{a}) + \left(\alpha - 1\right) \delta(\vec{x}, \vec{a}) + \alpha \varepsilon (n - 1). \end{align}\tag{7}\] Thus, removing \(\vec{a}\) from \(O\) will increase \(\operatorname{cost}_{\mathrm{pen}}^\prime(O)\) by at most \[\label{eq:k-median-penalty-alpha-beta-stable-hardness-3} \begin{align} \iint_{D^{\prime}} &\max \left[ \delta^{\prime}(\vec{x}, \vec{b}) - \delta^{\prime}(\vec{x}, \vec{a}), 0\right] d\mu \stackrel{(i)}{\leq} \iint_{D^{\prime}} \left[\left(\alpha - 1\right) \delta(\vec{x}, \vec{a}) + \alpha \varepsilon (n-1)\right] d\mu \\ &\leq \frac{1}{\varepsilon^2} \left( \left(\alpha - 1\right) \frac{2\pi}{3} (\alpha + \varepsilon)^3 + (\alpha - 1)\varepsilon \pi (\alpha + \varepsilon)^2 + \alpha \varepsilon (n-1) \pi ( \alpha + \varepsilon)^2 \right), \end{align}\tag{8}\] where (i) follows from 7 and the last inequality from Lemma 5. Next we combine 6 and 8 to derive \(\operatorname{cost}_{\mathrm{pen}}^\prime(O^{\prime}) - \operatorname{cost}_{\mathrm{pen}}^\prime(O) \leq \frac{1}{\varepsilon^2} \left((\alpha^4 - \alpha^3) \frac{2 \pi}{3} - \frac{ \pi}{3 \alpha^2} + O(n \varepsilon) \right)\), which is negative for \(1 < \alpha < 1.2\) and large enough \(n\). This contradicts the optimality of \(O\) w.r.t. \(\operatorname{cost}_{\mathrm{pen}}^\prime\).

Consider \(O^\prime \in \mathcal{F}_{k^2}\) to be any optimum solution of \(\mathcal{I}(S, n)\). So far, we have established that \(O\) contains \(k^2\) centres, one in each \(C_{i, j}\) for \(1 \leq i, j \leq k\). The same argument (applied with \(\delta^\prime = \delta\) and \(p^\prime = p\)) shows that \(O^\prime\) has the same property. Let \(f:O^\prime \rightarrow O\) be the unique function such that for any \(o \in O^\prime\), if \(o \in C_{i, j}\) for some \(1 \leq i, j \leq k\), then also \(f(o) \in C_{i, j}\). Therefore, \(\sum\nolimits_{o \in O^\prime} \delta(o, f(o)) \leq k^2 \varepsilon (n-1) \leq \beta\). This implies that \(\delta^{bij}_{\mathcal{I}}(O, O^\prime) \leq \beta\). Thus, \(\mathcal{I}(S, n)\) is \((\alpha, \beta)\)-stable. ◻
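The sign condition used in the proof can be checked numerically; the following short sketch (our own naming) evaluates the leading term \((\alpha^4 - \alpha^3)\frac{2\pi}{3} - \frac{\pi}{3\alpha^2}\) of the bound derived above:

```python
import math

def marginal_gain(alpha):
    """Leading term of cost'(O') - cost'(O), ignoring the O(n*eps) error."""
    return (alpha ** 4 - alpha ** 3) * 2 * math.pi / 3 - math.pi / (3 * alpha ** 2)

for alpha in (1.01, 1.1, 1.19, 1.2, 1.25):
    print(alpha, round(marginal_gain(alpha), 4))
# Negative for alpha < 1.2 (so exchanging a duplicated centre strictly improves the
# solution); the sign flips just above 1.2, which is where the theorem statement stops.
```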

4.2 Stable \(k\)-Median in Doubling Metrics↩︎

Focusing on the setting without penalties, we prove that \((\alpha, \beta)\)-stable \(k\)-Median in metrics of doubling dimension \(3\) is hard, for any \(1 < \alpha < 1.2\) and \(\beta > 0\) (Theorem [thm:k-median-alpha-beta-stable-hardness-mult-mult], stated in Section 1.1). The idea is that while the point set is located in \(\mathbb{R}^3\), introducing a metric \(\delta^{pen}\) with doubling dimension \(3\) (different from the standard norm-two metric) allows us to replicate the effect of penalties. The proof is deferred to the appendix.

5 Hardness of \(\alpha\)-stable \(k\)-Median in Bounded-Dimensional Metrics↩︎

[1] set out to resolve the complexity of \((1 + \varepsilon)\)-stable instances of \(k\)-Median and \(k\)-Means for constant values of \(\varepsilon\), which is highly desirable since this is the range where the \(\rho\)-swap local search can produce solutions in polynomial time. However, the computational complexity of the problem remains unknown when \(\varepsilon\) is allowed to tend to zero with the input size. This question is of particular interest when discussing Fixed-Parameter Tractable (FPT) algorithms, which allow slightly super-polynomial running times. Therefore, we examine the computational complexity of \((1 + \varepsilon)\)-stable instances of \(k\)-Median (and \(k\)-Means) when \(\varepsilon\) is an inverse polynomial in \(n\). As the main result of this section, we show that for \(\varepsilon = \frac{1}{poly(n)}\), the problem is still hard.

Inspired by [12], our general approach is to reduce Partial Vertex Cover (PVC, defined in Section 8) to \(\mathrm{\small k-Median}\) using moment curves. We first establish some useful properties regarding moment curves and spheres. These properties resemble those shown in Section 2.1 of [12], but are modified for our purposes. The proofs, in particular, utilize Descartes’ rule of signs (see [13]), and are deferred to the appendix.

Figure 2: In \mathbb{R}^4 (left), the 3-sphere that goes through the points p_1, p_2, p_3 on the moment curve and is tangent to the moment curve on p_2 and p_3, has no other intersections with the moment curve after the origin. In \mathbb{R}^3 (right), the sphere that is tangent to the moment curve on p_1 and p_2 has no other intersections with the moment curve.

Lemma 6. Fix any 3 positive values \(0<t_1<t_2<t_3\). Consider the 4-dimensional moment curve \(\left(t, t^2, t^3, t^4\right)\) and the 3-sphere in \(\mathbb{R}^4\) that goes through the moment curve at \(t = t_1\) and is tangent to the curve at \(t = t_2, t_3\). The segments on the moment curve corresponding to \(t \in\left(t_1, t_2\right) \cup\left(t_2, t_3\right) \cup (t_3, \infty)\) all lie outside of the sphere (i.e., the distance of all such points from the centre of the 3-sphere is strictly more than its radius).

Lemma 7. Fix any 2 positive values \(0<t_1<t_2\). Consider the 3-dimensional moment curve \(\left(t, t^2, t^3\right)\) and the unique 2-sphere in \(\mathbb{R}^3\) that is tangent to the moment curve at \(t = t_1, t_2\). The sphere only contacts the curve at \(t_1\) and \(t_2\).

5.1 \((1 + \frac{1}{poly(n)})\)-Stable Euclidean \(k\)-Median with Penalties↩︎

In this subsection, we focus on \(\mathrm{\small k-Median}\) with penalties and show that solving \((1 + \frac{1}{poly(n)})\)-stable Euclidean \(k\)-Median with penalties in \(\mathbb{R}^4\) is hard. The structure of the proof is as follows. Just like [12], we reduce from Partial Vertex Cover (PVC) by creating a candidate centre on a moment curve for each vertex of the input graph \(G = (V, E)\), along with a data point for each edge. For each edge, we ensure that its data point is closer to the two centres representing its endpoints than to any other candidate centre. This allows us to reduce a PVC instance to a \(\mathrm{\small k-Median}\) instance.
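For illustration, the vertex side of such a reduction might look as follows; the parameter values \(t_v = 1, 2, 3, \ldots\) are an arbitrary choice made here for concreteness, and the actual placement of the edge data points, the penalties, and the curve parameters (which is what guarantees stability) is given in the appendix.

```python
def moment_curve(t, d=4):
    """Point (t, t^2, ..., t^d) on the d-dimensional moment curve (Definition 2)."""
    return tuple(t ** i for i in range(1, d + 1))

def embed_vertices(vertices, d=4):
    """Sketch of the vertex side of the reduction: each vertex receives a candidate
    centre at a distinct parameter on the moment curve.  The data points for edges
    are placed by the (omitted) construction in the appendix."""
    return {v: moment_curve(idx + 1, d) for idx, v in enumerate(sorted(vertices))}

# Example: candidate centres for the vertices of a triangle graph, in R^4.
print(embed_vertices({"a", "b", "c"}))
```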

However, to ensure the stability of the resulting \(\mathrm{\small k-Median}\) with penalties instance, we need to guarantee that, for each edge, the gap between the distance from its data point to the centres of its endpoints and its distance to every other candidate centre is significant. Moreover, with the reduction introduced by [12], perturbing the metric risks having the optimum solution switch from one maximal partial cover of the graph to another. To avoid this, we ensure that the distance between each edge’s data point and the centres of its endpoints is the same fixed constant for all edges. These are the key properties in our proof, which is deferred to the appendix.

5.2 Hardness of \((1 + \frac{1}{poly(n)})\)-Stable Euclidean \(k\)-Median↩︎

Next, we focus on \(\mathrm{\small k-Median}\) without penalties and show that \((1 + \frac{1}{poly(n)})\)-stable Euclidean \(k\)-Median in \(\mathbb{R}^6\) is hard. The proof is similar to that of Theorem [thm:k-median-alpha-hardness-panlty], and is deferred to the appendix.

6 Conclusion↩︎

This paper addresses the complexity of stable instances of \(k\)-Means and \(k\)-Median under generalized notions of stability, and in the generalized setting with penalties. We show that, under our most general definition of stability, i.e., \((\alpha, \beta)\)-stability, the problem appears to be highly intractable, even in low-dimensional Euclidean spaces. We fell short of answering the following question.

Open Problem: Does a \((1 + \varepsilon)\)-approximation for any constant \(\varepsilon > 0\) exist for almost stable (i.e., \((1+\varepsilon_\circ, \beta_\circ)\)-stable) instances of \(k\)-Means/\(k\)-Median?

Note that in the almost-stable setting, a \((1 + \varepsilon)\)-approximation would mean finding a solution that is both \((1 + \varepsilon)\)-close in cost and \(\beta\)-close in solution structure to an optimum solution of the given instance. This second requirement implies that, for example, the known PTASs for \(k\)-Median [11], [14] do not necessarily produce the desired solution in this setting.

7 Omitted Proofs of Section 3↩︎

7.1 Proof of Lemma 1↩︎


Proof. To establish this result, our analysis requires extra care compared to [1]. Friggstad et al. introduce the notion of being nearly-good with respect to a fixed reference point, namely the unique optimal solution. In our setting, we generalize stability to accommodate instances with multiple optimal solutions. As a result, the absence of a single fixed reference point introduces new complications, which we discuss below.

We first argue that the algorithm terminates at a nearly-good solution. According to Definition 1, a solution \(S\) fails to be nearly-good as soon as the condition is violated for some optimum solution; we say that \(S\) is nearly-good for \(F\) if \(cost(S) - cost(F) \leq 2\varepsilon \Psi(S, F)\). Algorithm 1 clearly terminates, as in each round it selects a new \(S\) with strictly smaller cost. Assume that the set \(S\) in the last round is not nearly-good, meaning that there exists an optimum \(F \in \mathcal{F}_k\) for which \(S\) is not nearly-good. By Theorem 2, there then exists \(S' \in \mathcal{F}_k\) such that \(\left|S-S^{\prime}\right| \leq \rho\) and \(\operatorname{cost}\left(S^{\prime}\right) \leq \operatorname{cost}(S)+\frac{\operatorname{cost}(F)-\operatorname{cost}(S)+\varepsilon \cdot \Psi(S, F)}{k} < \operatorname{cost}(S)\), since the numerator of the fraction is negative. This contradicts the assumption that \(S\) was the returned solution, because the \(\rho\)-swap local search could discover \(S'\) and move to it in the next iteration.

It only remains to prove that a nearly-good solution is found within \(2 k \cdot \ln (n \Delta)\) iterations. Let \(\mathcal{O}= \{O_1, O_2, \ldots, O_\ell\}\) be the set of optimal solutions. For the sake of contradiction, suppose that after \(K=\left\lceil 2 k \cdot \ln (n \Delta)\right\rceil\) iterations Algorithm 1 has still not encountered a nearly-good solution. Let \(S_0, S_1, \ldots, S_K \in \mathcal{F}_k\) be the sequence of sets held by the algorithm after the first \(K\) iterations, where \(S_0\) is the initial set. Then, in particular, for each \(i \in [K]\) there is some \(O_{i} \in \mathcal{O}\) for which \(S_{i-1}\) is not nearly-good. Therefore, by Theorem 2, for each \(i \in [K]\) we have

\[\begin{align} \operatorname{cost}\left(S_i\right)-\operatorname{cost}(O_i) & \leq \operatorname{cost}(S_{i - 1})-\operatorname{cost}(O_i)+\frac{\operatorname{cost}(O_i)-\operatorname{cost}(S_{i - 1})+\varepsilon \Psi(S_{i - 1}, O_i)}{k} \\ & < \operatorname{cost}(S_{i - 1})-\operatorname{cost}(O_i)+\frac{\operatorname{cost}(O_i)-\operatorname{cost}(S_{i - 1})}{2 k} \\ & = \left(1-\frac{1}{2 k}\right) \cdot(\operatorname{cost}(S_{i - 1})-\operatorname{cost}(O_i)) \\ & = \left(1-\frac{1}{2 k}\right) \cdot(\operatorname{cost}(S_{i - 1})-\textsf{OPT}) \end{align}\] The second inequality follows since \(S_{i-1}\) is not nearly-good for \(O_i\), i.e., \(\operatorname{cost}(S_{i - 1}) - \operatorname{cost}(O_i) > 2\varepsilon \Psi(S_{i - 1}, O_i)\), and hence \(\varepsilon \Psi(S_{i - 1}, O_i) < \frac{1}{2}(\operatorname{cost}(S_{i - 1}) - \operatorname{cost}(O_i))\). Iterating this bound over all \(K\) iterations and using \(\operatorname{cost}(S_0) - \textsf{OPT}\leq n \Delta\) gives \(\operatorname{cost}(S_K) - \textsf{OPT}< \left(1-\frac{1}{2 k}\right)^{K} \cdot n \Delta \leq 1\). Because costs are integral, \(\operatorname{cost}(S_K) = \textsf{OPT}\); but then \(S_K\) is an optimum (hence nearly-good) solution, which contradicts the assumption that \(S_K\) is not a nearly-good solution. ◻

7.2 Proof of Theorem 3↩︎

Proof. Sample a random partition \(\pi\) of \(O^{\prime} \cup S^{\prime}\) with mentioned properties and consider the effect of the swap \(S \rightarrow S \triangle T\) for each part \(T \in \pi\). We bound \(\mathbf{E}_\pi\left[\sum_{T \in \pi} \operatorname{cost}_{\mathrm{pen}}(S \triangle T)-\operatorname{cost}_{\mathrm{pen}}(S)\right]\) from above by describing a valid way to redirect each \(j \in X\) in each swap. We use the shorthand \(c_j^*=\min(\delta(j, \sigma(j, O))^2, p(j))\) for the cost of connecting \(j\) in \(O\) and \(c_j=\min(\delta(j, \sigma(j, S))^2, p(j))\) similarly for \(S\). Based on possible reassignments of \(j \in X\), we break the analysis into four cases:

Case 1: Either \(\{\sigma(j, S), \sigma(j, O)\} \subseteq S \cap O\) or both \(\delta(j, O)^2, \delta(j, S)^2 \geq p(j)\). The former means that \(\sigma(j, S)\) remains open after each swap, and not reassigning \(j\) is a valid option; the latter means paying the penalty for \(j\) is the better option. Thus, the reassignment cost is bounded by \(c_j^* - c_j = 0\).

Case 2: \(\sigma(j, S) \in S^{\prime}\) and \(\sigma(j, O) \in S \cap O\) and \(\delta(j, S)^2 < p(j)\) (note that if the last condition does not hold, we are in the previous case). Move \(j\) to \(\sigma(j, O)\) when swapping the part \(T\) with \(\sigma(j, S) \in T\). As \(\sigma(j, S)\) remains open when swapping all other \(T^{\prime} \neq T\), we can leave \(j\) assigned to \(\sigma(j, S)\) to upper-bound its cost change for swaps \(T^{\prime} \neq T\) by 0. The total cost assignment for \(j\) is then bounded by \(c_j^*-c_j\).

Case 3: \(j\) with \(\sigma(j, S) \in S \cap O\) and \(\sigma(j, O) \in O^{\prime}\) and \(\delta(j, O)^2 < p(j)\). Move \(j\) to \(\sigma(j, O)\) when swapping the part \(T\) with \(\sigma(j, O) \in T\) and do not move \(j\) when swapping any other part \(T^{\prime} \neq T\). This places an upper bound of \(c_j^*-c_j\) on the total assignment cost change for \(j\).

Case 4: Finally, consider \(j\) with \(\sigma(j, S) \in S^{\prime}\) and \(\sigma(j, O) \in O^{\prime}\) and \(\delta(j, O)^2, \delta(j, S)^2 < p(j)\) (note that if the last condition doesn’t hold, we are in one of the previous cases). Note these are precisely the points \(j \in \overline{X}^{pen}_{S, O}\). From 2 , \[\mathbf{E}_\pi\left[\sum\nolimits_{T \in \pi} \tilde{\Delta}_j^T\right] \leq(1+\varepsilon) \cdot c_j^*-(1-\varepsilon) \cdot c_j=c_j^*-c_j+\varepsilon \cdot\left(c_j+c_j^*\right)\] Aggregating this cost bound for all clients, we see \[\mathbf{E}_\pi\left[\sum\nolimits_{T \in \pi} \operatorname{cost}_{\mathrm{pen}}(S \triangle T)-\operatorname{cost}_{\mathrm{pen}}(S)\right] \leq \operatorname{cost}_{\mathrm{pen}}(O)-\operatorname{cost}_{\mathrm{pen}}(S) +\varepsilon \cdot \Psi_{pen}(S, O)\] Therefore there is some \(\pi\) and some \(T \in \pi\) with \[\begin{align} \operatorname{cost}_{\mathrm{pen}}(S \triangle T)-\operatorname{cost}_{\mathrm{pen}}(S) &\leq \frac{\operatorname{cost}_{\mathrm{pen}}(O)-\operatorname{cost}_{\mathrm{pen}}(S)+\varepsilon \cdot \Psi_{pen}(S, O)}{|\pi|} \\ &\leq \frac{\operatorname{cost}_{\mathrm{pen}}(O)-\operatorname{cost}_{\mathrm{pen}}(S)+\varepsilon \cdot \Psi_{pen}(S, O)}{k} \end{align}\] The last inequality relies on \(|\pi| \leq k\) and \(\operatorname{cost}_{\mathrm{pen}}(O)-\operatorname{cost}_{\mathrm{pen}}(S)+\varepsilon \cdot \Psi_{pen}(S, O) < 0\). ◻

7.3 Proof of Lemma 4↩︎

Proof. The proof is similar to that of [1]. As \(S\) is a nearly-good solution, \(c_j^* \leq c_j\) for \(j \in X^2\), and \(c_j^*=c_j\) for \(j \in X^3\), we have: \[\begin{align} \sum\nolimits_{j \in X^4} c_j & \leq \sum\nolimits_{j \in X^1} c_j+\sum\nolimits_{j \in X^4} c_j=\operatorname{cost}_{\mathrm{pen}}(S)-\sum\nolimits_{j \in X^2} c_j-\sum\nolimits_{j \in X^3} c_j \\ & \leq \operatorname{cost}_{\mathrm{pen}}(S)-\sum\nolimits_{j \in X^2} c_j^*-\sum_{j \in X^3} c_j^* \\ & \leq \operatorname{cost}_{\mathrm{pen}}(O_q)+2 \varepsilon \cdot \Psi_{pen}(S, O_q)-\sum\nolimits_{j \in X^2} c_j^*-\sum\nolimits_{j \in X^3} c_j^* \\ & =\sum\nolimits_{j \in X^1} c_j^*+\sum_{j \in X^4} c_j^*+2 \varepsilon\sum\nolimits_{j \in X^4}\left( c_j^*+c_j\right) \end{align}\] Rearranging, \[\label{eq:k-means-alpha-solution-penalty-polynomial-2} \sum_{j \in X^4} c_j \leq \frac{1}{1-2 \varepsilon} \left(\sum_{j \in X^1} c_j^*+(1+2 \varepsilon) \cdot \sum_{j \in X^4} c_j^*\right) \leq(1+6 \varepsilon) \cdot\left(\sum_{j \in X^1} c_j^*+\sum_{j \in X^4} c_j^*\right)\tag{9}\]  ◻

8 Tools for Reduction Proofs↩︎

We frequently use the moment curve throughout our reductions, which we define as follows.

Definition 2. The curve \(\mathbb{R}^{+} \rightarrow \mathbb{R}^d\) defined by \(t \mapsto\left(t, t^2, \ldots, t^d\right)\) is called the moment curve.

All of our lower bounds are conditioned on the Exponential Time Hypothesis (ETH), which was formulated in [15].

Definition 3 (Exponential Time Hypothesis (ETH)). There exists a positive real value \(s>0\) such that 3-CNF-SAT, parameterized by \(n\), has no \(2^{sn}(n+m)^{O(1)}\)-time algorithm (where \(n\) denotes the number of variables and \(m\) denotes the number of clauses).

The following problem, Partial Vertex Cover, plays a critical role in our reductions, in particular for showing hardness of \((1 + \frac{1}{poly(n)})\)-stable \(\mathrm{\small k-Median}\) (Section 5).

Definition 4 (Partial Vertex Cover (PVC)). Input: A graph \(G=(V, E)\), an integer \(s \in \mathbb{N}\).
Parameter: Integer \(k\).
Output: YES if and only if there exists a set of \(k\) vertices that covers at least \(s\) edges.


The following lower bound has already been shown for PVC [16].

Theorem 5 (PVC Hardness [16]). There is no \(f(k) n^{o(k)}\)-time algorithm for the Partial Vertex Cover problem unless ETH fails (for any computable function \(f\)), where \(n\) is the size of the input.

The Grid Tiling Inequality problem also plays a critical role in our reductions; in particular, it is used to show hardness of \((\alpha, \beta)\)-stable instances of \(\mathrm{\small k-Median}\) (Section 4).

Definition 5 (Grid Tiling Inequality). Input: Integer \(n\), collection \(S\) of \(k^2\) nonempty sets \(S_{i, j} \subseteq[n] \times[n]\) (where \(1 \leq i, j \leq k\) ).
Parameter: Integer \(k\).
Output: YES if and only if there exists a set of \(k^2\) pairs \(s_{i, j} \in S_{i, j}\) such that

  • If \(s_{i, j}=(a, b)\) and \(s_{i+1, j}=\left(a^{\prime}, b^{\prime}\right)\), then \(a \leq a^{\prime}\).

  • If \(s_{i, j}=(a, b)\) and \(s_{i, j+1}=\left(a^{\prime}, b^{\prime}\right)\), then \(b \leq b^{\prime}\).

We call this problem Grid Tiling Inequality; it is known that it admits no \(f(k)n^{o(k)}\)-time algorithm unless ETH fails [17].
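
For concreteness, the YES condition of Grid Tiling Inequality is spelled out by the following brute-force Python sketch (illustrative only, and exponential in \(k^2\); the cells are 0-indexed here).

```python
from itertools import product

def grid_tiling_inequality(S, k):
    """S[i][j] is the set of allowed pairs for cell (i, j); brute-force YES/NO check."""
    cells = [(i, j) for i in range(k) for j in range(k)]
    for choice in product(*(S[i][j] for i, j in cells)):
        s = dict(zip(cells, choice))
        if all((i + 1 == k or s[(i, j)][0] <= s[(i + 1, j)][0]) and
               (j + 1 == k or s[(i, j)][1] <= s[(i, j + 1)][1])
               for i, j in cells):
            return True
    return False

# Tiny example with k = 2; every cell has a single allowed pair.
S = [[{(1, 1)}, {(1, 2)}],
     [{(2, 1)}, {(2, 2)}]]
print(grid_tiling_inequality(S, 2))  # True
```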

9 Omitted Proofs of Section 4↩︎

9.1 Proof of Lemma 5↩︎

Proof. Claim 1. For every \(\vec{y}\) in \(A\), define \(M_{\vec{y}}\) to be the square centred at \(\vec{y}\) with side length \(\varepsilon\). Notice that every point in \(R\) is in \(M_{\vec{y}}\) for some \(\vec{y} \in A\), and since every point in each square is at most \(\varepsilon\) away from its centre, we immediately derive \(D_{r - \varepsilon} \subseteq \bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}} \subseteq D_{r + \varepsilon}\). Therefore, \[\iint_{D_{r - \varepsilon}} d \mu^\star \leq \iint_{\bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}}} d \mu^\star = \varepsilon^2 |A \cap D_r| = \varepsilon^2 \iint_{D_r} d \mu\] This yields the first inequality, and the second inequality can be attained in a similar manner.

Claim 2. Again, using the fact that \(\bigcup_{\vec{y} \in A \cap D_{r}} M_{\vec{y}} \subseteq D_{r + \varepsilon}\) we derive \[\begin{align} \iint_{D_{r + \varepsilon}} \delta(\vec{a}, \vec{x}) d \mu^\star &\geq \iint_{\bigcup_{\vec{y} \in A \cap D_{r }} M_{\vec{y}}} \delta(\vec{a}, \vec{x}) d \mu^\star \\ & \geq \iint_{\bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}}} \left(\delta(\vec{a}, \vec{y}) - \delta(\vec{x}, \vec{y})\right) d \mu^\star \\ & \geq \iint_{\bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}}} \delta(\vec{a}, \vec{y}) \, d \mu^\star - \varepsilon \iint_{\bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}}} d \mu^\star \\ & = \sum_{\vec{y} \in A \cap D_r}\iint_{M_{\vec{y}}} \delta(\vec{a}, \vec{y}) \, d \mu^\star - \varepsilon \iint_{\bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}}} d \mu^\star \\ & \stackrel{(i)}{\geq} \varepsilon^2 \sum_{\vec{y} \in A \cap D_r} \delta(\vec{a}, \vec{y}) - \varepsilon \iint_{D_{r + \varepsilon}} d \mu^\star \\ & = \varepsilon^2 \iint_{D_r} \delta(\vec{a}, \vec{x}) d \mu - \varepsilon \iint_{D_{r + \varepsilon}} d \mu^\star \end{align}\] where (i) uses \(\iint_{M_{\vec{y}}} d \mu^\star = \varepsilon^2\) for each square together with the fact that \(\bigcup_{\vec{y} \in A \cap D_r} M_{\vec{y}} \subseteq D_{r + \varepsilon}\).

Finally, note that \(\iint_{D_{r}} 1 \, d \mu^\star\) is simply the area of a disk of radius \(r\), which is \(\pi r^2\). Moreover, \(\iint_{D_r} \delta(\vec{x}, \vec{a}) d \mu^\star\) is the volume of a cylinder with radius and height \(r\) from which a cone with radius and height \(r\) has been carved out. Therefore, \(\iint_{D_r} \delta(\vec{x}, \vec{a}) d \mu^\star = \pi r^3 - \frac{1}{3} \pi r^3 = \frac{2}{3} \pi r^3\). This completes the proof. ◻
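
As a quick sanity check (not needed for the proof), both quantities also follow from a direct polar-coordinate computation around \(\vec{a}\): \[\iint_{D_r} \delta(\vec{x}, \vec{a}) \, d\mu^\star = \int_0^{2\pi}\!\int_0^r \rho \cdot \rho \, d\rho \, d\theta = 2\pi \cdot \frac{r^3}{3} = \frac{2}{3}\pi r^3, \qquad \iint_{D_r} 1 \, d\mu^\star = \int_0^{2\pi}\!\int_0^r \rho \, d\rho \, d\theta = \pi r^2.\]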

9.2 Proof of Theorem [thm:k-median-alpha-beta-stable-hardness-mult-mult]↩︎

Proof. Define \(\delta^{pen} (\vec{x}, \vec{y}) := \max \left(\|(x_1, x_2) - (y_1, y_2)\|_2, |x_3 - y_3|\right)\) for \(\vec{x} = (x_1, x_2, x_3)\) and \(\vec{y} = (y_1, y_2, y_3)\). Note that this is a metric, because \[\begin{align} \delta^{pen}(\vec{x}, \vec{z}) &= \max \left(\|(x_1, x_2) - (z_1, z_2)\|_2, |x_3 - z_3|\right) \\ & \leq \max \left(\|(x_1, x_2) - (y_1, y_2)\|_2 + \|(y_1, y_2) - (z_1, z_2)\|_2, |x_3 - y_3| + |y_3 - z_3|\right) \\ & \leq \max\left(\|(x_1, x_2) - (y_1, y_2)\|_2, |x_3 - y_3|\right) + \max \left(\|(y_1, y_2) - (z_1, z_2)\|_2, |y_3 - z_3|\right) \\ & = \delta^{pen}(\vec{x}, \vec{y}) + \delta^{pen}(\vec{y}, \vec{z}) \end{align}\] Moreover, \(\delta^{pen}\) has constant doubling dimension, because any ball of radius \(r\) w.r.t. \(\delta^{pen}\) is a cylinder of radius \(r\) and height \(2r\), which can be covered by a constant number of balls of radius \(r/2\) (each such ball is a cylinder of radius \(r/2\) and height \(r\)).
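
As a side remark, the metric \(\delta^{pen}\) is straightforward to implement, and the triangle inequality above can be spot-checked numerically; the following Python sketch (illustrative only, with hypothetical names) does so on random triples.

```python
import math, random

def delta_pen(x, y):
    """max of the planar Euclidean distance (first two coordinates) and |x3 - y3|."""
    return max(math.dist(x[:2], y[:2]), abs(x[2] - y[2]))

random.seed(0)
for _ in range(10_000):
    x, y, z = (tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(3))
    assert delta_pen(x, z) <= delta_pen(x, y) + delta_pen(y, z) + 1e-9
print("triangle inequality held on all sampled triples")
```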

Define \(X', C'\) to extend \(X^{grid}, C^{grid}\) into 3 dimensions by setting their third coordinate to zero: \[X' := \{ (u, v, 0) \mid (u, v) \in X^{grid} \}, \quad C' := \{ (u, v, 0) \mid (u, v) \in C^{grid} \}\] Then define \(C'' := \bigcup_{1 \leq i, j \leq k} \{ (2i - 1, 2j, 1), (2i, 2j - 1, 1)\}\) and \(C := C' \cup C''\). Also, define \(X\) to be \(X'\) together with \(\varSigma\) points in each of \((2i - 1, 2j, 1)\) and \((2i, 2j - 1, 1)\) for \(1 \leq i, j \leq k\).

We prove that solving the \(3k^2\)-Median instance \((X, C, \delta^{pen})\) is equivalent to solving the \(k^2\)-Median instance \((X^{grid}, C^{grid}, \delta, p^{grid})\). First, we show all \(2k^2\) candidate centres in \(C''\) must be selected in every optimum solution of \((X, C, \delta^{pen})\). Note that the third coordinate of every candidate centre in \(C''\) is 1. Thus, every data point in \(X'\) has distance at least 1 to them. Moreover, note that for every \((u, w) \in X^{grid}\) such that \(\|(u, w) - (2i, 2j - 1)\|_2 \leq 1\), we have \(\delta^{pen}((u, w, 0), (2i, 2j - 1, 1)) = 1\). Similarly, for every \((u, w) \in X^{grid}\) such that \(\|(u, w) - (2i - 1, 2j)\|_2 \leq 1\), we have \(\delta^{pen}((u, w, 0), (2i - 1, 2j, 1)) = 1\). In other words, the disk of radius 1 around every \((2i, 2j - 1, 0)\) and \((2i - 1, 2j, 0)\) is at distance exactly 1 from some centre in \(C''\). These disks cover all of \(X'\). Thus, the cost of any solution containing \(C''\) will be at most \(\varSigma\). On the other hand, for any solution that leaves out some \(\vec{a} \in C''\), the cost of the data points located at \(\vec{a}\) alone is at least \(\varSigma\).

Next, since the aforementioned disks cover \(X'\), any data point whose distance to every selected candidate centre in \(C'\) exceeds 1 will be assigned to some centre in \(C''\) and will have a cost of 1. This means that solving \(3k^2\)-Median for \((X, C, \delta^{pen})\) is equivalent to solving \(k^2\)-Median for \((X^{grid}, C^{grid}, \delta, p^{grid})\). Finally, with an argument identical to our argument in Theorem [thm:k-median-penalty-alpha-beta-stable-hardness-mult-mult] we can also prove that \((X, C, \delta^{pen})\) is \((\alpha, \beta)\)-stable for any \(0 < \beta, 1 < \alpha < 1.2\). ◻

10 Omitted Proofs of Section 5↩︎

10.1 Proof of Lemma 6↩︎

Proof. Let the sphere be centred at \((a, b, c, d)\) and radius \(r\). Consider the function \(f(t)=(t-a)^2+\left(t^2-b\right)^2+\left(t^3-c\right)^2+\left(t^4-d\right)^2-r^2\). Note that \(t_1\), \(t_2\), and \(t_3\) are roots of this polynomial. Moreover, \(t_2, t_3\) are also roots of \(f'(t)\). We will apply Descartes’ rule of signs to upper-bound the number of strictly positive roots of \(f'(t)\). The rule says that the number of strictly positive roots of a polynomial is upper-bounded by the number of sign changes between non-zero coefficients (assuming the coefficients are arranged in decreasing order of the degree of their corresponding term). To this end, we expand the polynomial \(f(t)\) : \[\begin{align} f(t) & =t^2-2 a t+a^2+t^4-2 b t^2+b^2+t^6-2 c t^3+c^2+t^8-2 d t^4+d^2-r^2 \\ & =t^8+t^6+(1-2 d) t^4-2 c t^3+(1-2 b) t^2-2 a t+\left(a^2+b^2+c^2+d^2-r^2\right) \end{align}\] Thus, \(f'(t) = 8 t^7+ 6t^5 + 4(1-2 d) t^3 - 6 c t^2 + 2 (1-2 b) t -2 a\).
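
The expansion of \(f'(t)\) stated above can be double-checked symbolically; the following sympy sketch (illustrative only) confirms it.

```python
import sympy as sp

# Symbolic check that f'(t) expands to the claimed coefficient form.
t, a, b, c, d, r = sp.symbols("t a b c d r", real=True)
f = (t - a)**2 + (t**2 - b)**2 + (t**3 - c)**2 + (t**4 - d)**2 - r**2
claimed = 8*t**7 + 6*t**5 + 4*(1 - 2*d)*t**3 - 6*c*t**2 + 2*(1 - 2*b)*t - 2*a
print(sp.simplify(sp.diff(f, t) - claimed) == 0)  # True
```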

Hence, the coefficient sequence is given by \(\left(8,6,4(1-2 d),-6 c,2(1-2 b), -2a\right)\). Clearly, there are (at most) 4 changes in sign in this sequence, which implies that the number of strictly positive roots of \(f'\) is upper-bounded by 4. However, we already know of 4 such roots. Two of them correspond to \(t_2, t_3\). Using the mean value theorem and observing that \(f(t_1) = f(t_2) = f(t_3)\), there must exist a \(t'_1 \in (t_1, t_2)\) and a \(t'_2 \in (t_2, t_3)\) such that \(f'(t'_1) = f'(t'_2) = 0\). Therefore \(f'\) cannot be zero anywhere else. Note that for \(t \rightarrow \infty\), the moment curve must be outside the sphere. Thus, the only way \(t_2, t_3, t'_1, t'_2\) could be the only roots of \(f'\) is for the moment-curve point at every \(t \in\left(t_1, t_2\right) \cup\left(t_2, t_3\right) \cup (t_3, \infty)\) to lie outside of the sphere. ◻

10.2 Proof of Lemma 7↩︎

Proof. Similarly to the proof of Lemma 6 , let the sphere have centre \((a, b, c)\) and radius \(r\). We then analyze the derivative of the function

\[\begin{align} f(t) & =(t-a)^2+\left(t^2-b\right)^2+\left(t^3-c\right)^2-r^2 \\ & =t^2-2 a t+a^2+t^4-2 b t^2+b^2+t^6-2 c t^3+c^2-r^2 \\ & =t^6+t^4-2 c t^3+(1-2 b) t^2-2 a t+\left(a^2+b^2+c^2-r^2\right) \end{align}\] Thus, \(f'(t) = 6 t^5+4 t^3- 6 c t^2+2(1-2 b) t-2 a\).

The coefficients are \(\left(6,4,-6 c, 2(1-2 b),-2 a\right)\), which has (at most) 3 changes of sign. Hence Descartes' rule implies that there are at most 3 strictly positive roots. We already know of 3 roots of this polynomial, two of which correspond to \(t_1, t_2\). From the mean value theorem, since \(f(t_1) = f(t_2)\), there must exist a \(t'_1 \in (t_1, t_2)\) such that \(f'(t'_1) = 0\). Similar to Lemma 6, the only way \(t_1, t_2, t'_1\) could be the only roots of \(f'\) is for the moment-curve point at every \(t \in\left(0, t_1\right) \cup(t_1, t_2) \cup \left(t_2, \infty\right)\) to lie outside of the sphere. ◻

10.3 Proof of Theorem [thm:k-median-alpha-hardness-panlty]↩︎

Proof. For a fixed parameter \(k\), consider a graph \(G=(V, E)\) on \(n=|V|\) vertices and \(m=|E|\) edges, along with an integer \(s\). Arbitrarily index the vertices \(v_1, \ldots, v_n\). Construct a Euclidean \(k\)-Median instance with penalties in \(\mathbb{R}^4\) denoted by \(\mathcal{I}(G, k)\) as follows. Define \(A_3 := \{ (x_1, x_2, x_3, x_4) \in \mathbb{R}^4: x_4 = 0\}\), i.e., the affine subspace of \(\mathbb{R}^4\) with the fourth coordinate of all points being zero. Every point we define initially lies on \(A_3\).

Consider the 3-dimensional moment curve \(\left(t, t^2, t^3\right)\). For each vertex \(v_i\), we define \(\tilde{v}_i = \left(i ,i^2,i^3, 0 \right) \in A_3\). For each edge \(e_{i, j}=\left(v_i, v_j\right)\) in \(G\), consider the unique 2-sphere in \(A_3\), which we denote by \(\mathbb{S}_{i, j}\), that is perpendicular to the moment curve at points \(\tilde{v}_{i}, \tilde{v}_j\). Let \(c_{i, j}\) and \(r_{i, j}\) denote the centre and radius of the 2-sphere \(\mathbb{S}_{i, j}\), respectively. Let \(c_{i, j} = (a, b, c, 0)\). Then the equation system that uniquely solves \(a, b, c\) is as follows. \[\begin{align} & (i) \quad (i - a)^2+(i^2 - b)^2+(i^3 - c)^2 = (j - a)^2+(j^2 - b)^2+(j^3 - c)^2\\ & (ii) \quad 6 i^5+4 i^3- 6 c i^2+2(1-2 b) i-2 a = 0\\ & (iii) \quad 6 j^5+4 j^3- 6 c j^2+2(1-2 b) j-2 a = 0 \end{align}\] Solving the equation system, we get \[\label{eq:hardness-alpha-k-median-penalty-abc} \begin{align} a &= i\cdot j\cdot (i + j)\cdot (3\cdot i^2 + 3\cdot i\cdot j + 3\cdot j^2 + 1) \\ b &= -(3\cdot i^4 + 12\cdot i^3\cdot j + 15\cdot i^2\cdot j^2 + i^2 + 12\cdot i\cdot j^3 + 4\cdot i\cdot j + 3\cdot j^4 + j^2 - 1)/2 \\ c &= (i + j)\cdot (2\cdot i^2 + i\cdot j + 2\cdot j^2 + 1) \end{align}\tag{10}\]
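
The closed form in 10 can be reproduced symbolically, since the three conditions above are linear in \((a, b, c)\); the following sympy sketch (illustrative only; symbol and variable names are ours) verifies it.

```python
import sympy as sp

i, j, a, b, c = sp.symbols("i j a b c", real=True)
eqs = [sp.Eq((i - a)**2 + (i**2 - b)**2 + (i**3 - c)**2,
             (j - a)**2 + (j**2 - b)**2 + (j**3 - c)**2),            # (i)   equidistance
       sp.Eq(6*i**5 + 4*i**3 - 6*c*i**2 + 2*(1 - 2*b)*i - 2*a, 0),   # (ii)  perpendicular at i
       sp.Eq(6*j**5 + 4*j**3 - 6*c*j**2 + 2*(1 - 2*b)*j - 2*a, 0)]   # (iii) perpendicular at j
sol = sp.solve(eqs, [a, b, c], dict=True)[0]

claimed_a = i*j*(i + j)*(3*i**2 + 3*i*j + 3*j**2 + 1)
claimed_b = -(3*i**4 + 12*i**3*j + 15*i**2*j**2 + i**2 + 12*i*j**3
              + 4*i*j + 3*j**4 + j**2 - 1)/2
claimed_c = (i + j)*(2*i**2 + i*j + 2*j**2 + 1)
print(all(sp.simplify(sol[v] - w) == 0
          for v, w in [(a, claimed_a), (b, claimed_b), (c, claimed_c)]))  # True
```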

Let \(q\) be an index pair that gives rise to the maximum \(r_{i, j}\), namely \(q=\operatorname{argmax}_{i, j} \{r_{i, j}\}\) (i.e., \(q\) is of the form " \(i, j\) "). Next, define \(c^\prime_{i, j} := (a, b, c, \sqrt{r_q^2 - r^2_{i, j}})\) so that \(\delta(\tilde{v}_i, c^\prime_{i, j}) = \delta(\tilde{v}_j, c^\prime_{i, j}) = r_q\).

As we discussed in the proof of Lemma 7, for all \(t \in [n]\) we have \(\delta( \tilde{v}_t, c_{i, j})^2 = f(t)\) where \(f(t) := (t-a)^2+\left(t^2-b\right)^2+\left(t^3-c\right)^2\). Using Lemma 7 for any \(t \neq i, j\), we have \(\delta(\tilde{v}_t, c_{i, j})^2 = f(t) > r_{i, j}^2\). Examining equation (10 ) more closely reveals that \(f(t)\) is an integer multiple of \(1/4\) for all \(t \in [n]\). Thus, we derive \(\delta(\tilde{v}_t, c_{i, j})^2 \geq r_{i, j}^2 + \frac{1}{4}\). Therefore, \(\delta(\tilde{v}_t, c^\prime_{i, j})^2 \geq r_q^2 - r_{i, j}^2 + r_{i, j}^2 + \frac{1}{4} = r_q^2 + \frac{1}{4}\).

Next, we notice that in equation 10 , every coordinate of every \(c_{i, j}\) is a polynomial of degree at most \(5\) with respect to \(i\) and \(j\), while every coefficient is an integer divided by two. This immediately yields that every \(r^2_{i, j} = O(n^{10})\). In particular, \(r^2_{q} = O(n^{10})\). Denoting \(\varepsilon = \sqrt{\frac{r_{q}^2 + \frac{1}{4}}{r_{q}^2} } - 1\) and using \(\sqrt{1+x} \geq 1 + \frac{x}{3}\) for \(0 \leq x \leq 3\), we would have \(\varepsilon \geq \frac{1}{12 r_q^2} = \Omega(\frac{1}{n^{10}})\). As a result, \(\delta\left(\tilde{v}_t, c^\prime_{i, j} \right) \geq (1+ \varepsilon) r_q\) for every \(t \neq i, j\).

We are finally ready to introduce the data points, candidate centres, and penalty function of \(\mathcal{I}(G, k)\). Define the set of data points to be \(X := \{c^\prime_{i, j} \mid i, j \in [n], (v_i, v_j) \in E\}\), the set of candidate centres to be \(C := \{\tilde{v}_i \mid i \in [n]\}\), and the penalty function to be \(p(c^\prime_{i, j}) := r_q (1 + \varepsilon)\).
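
For reference, the objective evaluated on this instance is the penalized cost \(\operatorname{cost}_{\mathrm{pen}}\): each data point pays the cheaper of its distance to the nearest open centre and its penalty. A minimal Python sketch (illustrative only, with hypothetical names) is given below.

```python
import math

def cost_pen(data_points, open_centres, penalty):
    """Penalized cost: each point pays min(distance to nearest open centre, its penalty)."""
    return sum(min(min(math.dist(x, c) for c in open_centres), penalty(x))
               for x in data_points)

# Toy usage: two points, one open centre, uniform penalty 1.5.
pts = [(0.0, 0.0), (10.0, 0.0)]
print(cost_pen(pts, [(1.0, 0.0)], lambda x: 1.5))  # 1.0 + 1.5 = 2.5
```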

Lemma 8. Let \(S = \{\tilde{v}_{i_1}, \tilde{v}_{i_2}, ..., \tilde{v}_{i_k}\} \subseteq C\). Assume that \(\{v_{i_1}, v_{i_2}, ..., v_{i_k}\}\) covers \(s^*\) edges in \(G\). Then \(\operatorname{cost}_{\mathrm{pen}}(S) = r_q (m + (m - s^*) \varepsilon)\).

Proof. Fix any \(i, j \in [n]\) where \((v_i, v_j) \in E\). We first calculate the cost of \(c^\prime_{i,j}\), which is given by \(\min_{\vec{t} \in S} \left\{ \min\left\{ \delta(c^\prime_{i, j}, \vec{t}), p(c^\prime_{i, j})\right\} \right\}\). Then, \(\operatorname{cost}_{\mathrm{pen}}(S)\) is the sum of the costs of all such \(c^\prime_{i, j}\). In case \(\tilde{v}_i\) or \(\tilde{v}_j\) is in \(S\), we derive \(\min_{\vec{t} \in S} \left\{ \min\left\{ \delta(c^\prime_{i, j}, \vec{t}), p(c^\prime_{i, j})\right\} \right\} = \delta(c^\prime_{i, j}, \tilde{v}_i) = r_{q}\). Otherwise, \(\min_{\vec{t} \in S} \left\{ \min\left\{ \delta(c^\prime_{i, j}, \vec{t}), p(c^\prime_{i, j})\right\} \right\} = p(c^\prime_{i, j}) = r_{q} ( 1 + \varepsilon)\). Therefore, the total cost of \(S\) is \(r_q\) multiplied by the number of edges \(\{v_{i_1}, v_{i_2}, ..., v_{i_k}\}\) covers plus \(r_q ( 1 + \varepsilon)\) multiplied by the number of edges \(\{v_{i_1}, v_{i_2}, ..., v_{i_k}\}\) does not cover. Hence, \(\operatorname{cost}_{\mathrm{pen}}(S) = r_q s^* + r_q (m - s^*) (1 + \varepsilon)\). This completes the proof. ◻

Firstly, Lemma 8 immediately implies that the graph \(G\) has a partial vertex cover of size \(k\), covering at least \(s\) edges, if and only if \(\mathcal{I}(G, k)\) has a solution of cost at most \(r_q (m + (m - s) \varepsilon)\). Therefore, the only thing left to prove is that \(\mathcal{I}(G, k)\) is stable. Secondly, from Lemma 8, we also derive that, assuming that a maximal partial vertex cover of size \(k\) of \(G\) covers \(s^*\) edges, the set of optimum solutions of \(\mathcal{I}(G, k)\) is the set of all \(\{\tilde{v}_{i_1}, \tilde{v}_{i_2}, ..., \tilde{v}_{i_k}\}\) such that \(\{v_{i_1}, v_{i_2}, ..., v_{i_k}\}\) covers \(s^*\) edges in \(G\).

Fix \(\varepsilon^{\prime} = \varepsilon / 2m\). We prove that \(\mathcal{I}(G, k)\) is \((1+\varepsilon^{\prime})\)-stable to complete the proof. Consider any \(\delta \leq \delta^\prime \leq (1 + \varepsilon^\prime) \cdot \delta\) and \(p \leq p^\prime \leq (1 + \varepsilon^\prime) \cdot p\). Assume the maximal partial vertex cover of size \(k\) of \(G\) covers \(s^*\) edges. We only need to show that any optimum solution of the \(k\)-Median instance \((X, C, \delta^\prime, p')\) corresponds to a set of vertices that cover \(s^*\) edges. For every \(S \subseteq C\) define \(\operatorname{cost}_{\mathrm{pen}}^\prime(S) := \sum_{j \in X} \min_{i \in S} \left\{\min \left\{\delta^\prime(i, j), p'(j)\right\}\right\}\). Note that by definition, \(\operatorname{cost}_{\mathrm{pen}}(S) \leq \operatorname{cost}_{\mathrm{pen}}^\prime(S) \leq \operatorname{cost}_{\mathrm{pen}}(S) ( 1 + \varepsilon^\prime)\).

Denote by \(O^\prime\) an optimum solution of \((X, C, \delta^\prime, p')\). For the sake of contradiction assume that \(O^\prime\) corresponds to a set of vertices that cover \(s < s^*\) edges of \(G\), and let \(O \in \mathcal{F}_k\) correspond to any set of \(k\) vertices that cover \(s^*\) edges in \(G\). By Lemma 8 we have \[\begin{align} \operatorname{cost}_{\mathrm{pen}}^\prime(O) &\leq (1 + \varepsilon^\prime) \operatorname{cost}_{\mathrm{pen}}(O) = r_q ( m (1 + \varepsilon^\prime) + (m - s^*) \varepsilon (1 + \varepsilon^\prime) ) \\ & < r_q ( m + \frac{\varepsilon}{2} + (m - s^*) \varepsilon + \frac{\varepsilon^2}{2} ) < r_q ( m + (m - s^* + 1) \varepsilon )\\ & \leq r_q ( m + (m - s) \varepsilon) = \operatorname{cost}_{\mathrm{pen}}(O^\prime) \leq \operatorname{cost}_{\mathrm{pen}}^\prime(O^\prime) \end{align}\] This is a contradiction. Therefore, every optimum solution of \((X, C, \delta^\prime, p')\) is also an optimum solution of \((X, C, \delta, p)\), which completes the proof. ◻

10.4 Proof of Theorem [thm:k-median-alpha-hardness]↩︎

Proof. For a fixed parameter \(k\), we are given a graph \(G=(V, E)\) on \(n=|V|\) vertices and \(m=|E|\) edges, along with an integer \(s\). Arbitrarily index the vertices \(v_1, \ldots, v_n\). Inspired by Theorem 5.1 of [12], we construct a Euclidean \(k^\prime\)-Median instance in \(\mathbb{R}^6\) with \(k^\prime = k + 1\) denoted by \(\mathcal{I}(G, k)\). We emphasize that the proof has been significantly modified to fit our stability setting while also being simplified. Define \(A_4 := \{ (x_1, x_2, x_3, x_4, x_5, x_6) \in \mathbb{R}^6: x_5 = 0, x_6 = 0\}\), i.e., the affine subspace of \(\mathbb{R}^6\) with the fifth and sixth coordinates of all points being zero. Every point we define initially lies on \(A_4\).

Consider the 4-dimensional moment curve \(\left(t, t^2, t^3, t^4\right)\). For each vertex \(v_i\), we define \(\tilde{v}_i = \left(i + 1,(i+1)^2,(i+1)^3,(i+1)^4, 0, 0 \right) \in A_4\), and \(z^{*} = (1, 1, 1, 1, 0, 0) \in A_4\). For each edge \(e_{i, j}=\left(v_i, v_j\right)\) in \(G\), consider the unique 3-sphere in \(A_4\), which we denote by \(\mathbb{S}_{i, j}\), that is perpendicular to the moment curve at points \(\tilde{v}_{i}, \tilde{v}_j\) and also has \(z^*\) on its surface. Let \(c_{i, j}\) and \(r_{i, j}\) denote the centre and radius of the 3-sphere \(\mathbb{S}_{i, j}\), respectively. Let \(c_{i, j} = (a, b, c, d, 0, 0)\) and \(\tilde{i} = (i + 1), \tilde{j} = (j + 1)\). Then the equation system that uniquely solves \(a, b, c, d\) is as follows. \[\begin{align} & (i) \quad (\tilde{i} - a)^2+(\tilde{i}^2 - b)^2+(\tilde{i}^3 - c)^2+(\tilde{i}^4 - d)^2 \\ & \quad \quad \quad= (\tilde{j} - a)^2+(\tilde{j}^2 - b)^2+(\tilde{j}^3 - c)^2+(\tilde{j}^4 - d)^2 \\ & \quad \quad \quad = (1 - a)^2+(1 - b)^2+(1 - c)^2+(1 - d)^2 \\ & (ii) \quad a + 2\tilde{i} b + 3\tilde{i}^2 c + 4\tilde{i}^3 d = \tilde{i} + 2\tilde{i}^3 + 3\tilde{i}^5 + 4\tilde{i}^7, \\ & (iii) \quad a + 2\tilde{j} b + 3\tilde{j}^2 c + 4\tilde{j}^3 d = \tilde{j} + 2\tilde{j}^3 + 3\tilde{j}^5 + 4\tilde{j}^7. \end{align}\] One can verify, using any computer algebra system, that the solution to the system of equations is \[\label{eq:hardness-alpha-k-median-abcd} \begin{align} a &= -\tilde{i}\cdot \tilde{j}\cdot(\tilde{i} + \tilde{j})\cdot(2\cdot \tilde{i}^3\cdot \tilde{j} + 4\cdot \tilde{i}^3 + \tilde{i}^2\cdot \tilde{j}^2 + 6\cdot \tilde{i}^2\cdot \tilde{j} + 3\cdot \tilde{i}^2 + 2\cdot \tilde{i}\cdot \tilde{j}^3 + 6\cdot \tilde{i}\cdot \tilde{j}^2 + 5\cdot \tilde{i}\cdot \tilde{j} \\ & + 4\cdot \tilde{i} + 4\cdot \tilde{j}^3 + 3\cdot \tilde{j}^2 + 4\cdot \tilde{j} + 2) \\ b &= (8\cdot \tilde{i}^5\cdot \tilde{j} + 4\cdot \tilde{i}^5 + 17\cdot \tilde{i}^4\cdot \tilde{j}^2 + 22\cdot \tilde{i}^4\cdot \tilde{j} + 3\cdot \tilde{i}^4 + 20\cdot \tilde{i}^3\cdot \tilde{j}^3 + 34\cdot \tilde{i}^3\cdot \tilde{j}^2 + 20\cdot \tilde{i}^3\cdot \tilde{j} \\ & + 4\cdot \tilde{i}^3 +17\cdot \tilde{i}^2\cdot \tilde{j}^4 + 34\cdot \tilde{i}^2\cdot \tilde{j}^3 + 29\cdot \tilde{i}^2\cdot \tilde{j}^2 + 20\cdot \tilde{i}^2\cdot \tilde{j} + 2\cdot \tilde{i}^2 + 8\cdot \tilde{i}\cdot \tilde{j}^5 + 22\cdot \tilde{i}\cdot \tilde{j}^4 \\ & + 20\cdot \tilde{i}\cdot \tilde{j}^3 + 20\cdot \tilde{i}\cdot \tilde{j}^2+ 8\cdot \tilde{i}\cdot \tilde{j} + 4\cdot \tilde{j}^5 + 3\cdot \tilde{j}^4 + 4\cdot \tilde{j}^3 + 2\cdot \tilde{j}^2 + 1)/2 \\ c &= -(\tilde{i} + \tilde{j})\cdot (2\cdot \tilde{i}^4 + 6\cdot \tilde{i}^3\cdot \tilde{j} + 4\cdot \tilde{i}^3 + 5\cdot \tilde{i}^2\cdot \tilde{j}^2 + 6\cdot \tilde{i}^2\cdot \tilde{j} + 4\cdot \tilde{i}^2 + 6\cdot \tilde{i}\cdot \tilde{j}^3 + 6\cdot \tilde{i}\cdot \tilde{j}^2 \\ &+ 7\cdot \tilde{i}\cdot \tilde{j} + 4\cdot \tilde{i} + 2\cdot \tilde{j}^4 + 4\cdot \tilde{j}^3 + 4\cdot \tilde{j}^2 + 4\cdot \tilde{j} + 2)\\ d &= (5\cdot \tilde{i}^4 + 8\cdot \tilde{i}^3\cdot \tilde{j} + 4\cdot \tilde{i}^3 + 9\cdot \tilde{i}^2\cdot \tilde{j}^2 + 6\cdot \tilde{i}^2\cdot \tilde{j} + 6\cdot \tilde{i}^2 + 8\cdot \tilde{i}\cdot \tilde{j}^3 + 6\cdot \tilde{i}\cdot \tilde{j}^2 + 8\cdot \tilde{i}\cdot \tilde{j} \\ & + 4\cdot \tilde{i} + 5\cdot \tilde{j}^4 + 4\cdot \tilde{j}^3 + 6\cdot \tilde{j}^2 + 4\cdot \tilde{j} + 3)/2 \\ \end{align}\tag{11}\]
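
As with 10 , the system above is linear in \((a, b, c, d)\), so 11 can be reproduced with a computer algebra system; the following sympy sketch (illustrative only; the names ti, tj stand for \(\tilde{i}, \tilde{j}\), and \(z^*\) is the moment-curve point at parameter \(1\)) prints the factored solution.

```python
import sympy as sp

ti, tj, a, b, c, d = sp.symbols("ti tj a b c d", real=True)

def sq(t):  # squared distance from the moment-curve point (t, t^2, t^3, t^4) to the centre
    return (t - a)**2 + (t**2 - b)**2 + (t**3 - c)**2 + (t**4 - d)**2

eqs = [sp.Eq(sq(ti), sq(tj)),
       sp.Eq(sq(ti), sq(1)),   # z* = (1, 1, 1, 1) lies on the same sphere
       sp.Eq(a + 2*ti*b + 3*ti**2*c + 4*ti**3*d, ti + 2*ti**3 + 3*ti**5 + 4*ti**7),
       sp.Eq(a + 2*tj*b + 3*tj**2*c + 4*tj**3*d, tj + 2*tj**3 + 3*tj**5 + 4*tj**7)]
sol = sp.solve(eqs, [a, b, c, d], dict=True)[0]
for v in (a, b, c, d):
    print(v, "=", sp.factor(sol[v]))
```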

Similar to Theorem [thm:k-median-alpha-hardness-panlty], let \(q:=\operatorname{argmax}_{i, j} \{r_{i, j}\}\) and define \(c^\prime_{i, j} := (a, b, c, d, \sqrt{r_q^2 - r^2_{i, j}}, 0)\) so that \(\delta(z^*, c^\prime_{i, j}) = \delta(\tilde{v}_i, c^\prime_{i, j}) = \delta(\tilde{v}_j, c^\prime_{i, j}) = r_q\). Again, for any \(t \neq i, j \in [n]\), we have \(\delta(\tilde{v}_t, c^\prime_{i, j})^2 \geq r_q^2 + \frac{1}{4}\). We also define \(\tilde{z}^{*} = \left(1, 1, 1, 1, 0, \frac{1}{2} \right)\), and subsequently \(\delta^2(\tilde{z}^*, c^\prime_{i, j}) = r^2_q + \frac{1}{4}\).

Moreover, in 11 every coordinate of every \(c_{i, j}\) is a polynomial of degree at most \(7\) with respect to \(\tilde{i}\) and \(\tilde{j}\), while every coefficient is an integer divided by two. This immediately yields that \(r^2_{i, j} = O(n^{14})\). In particular, \(r^2_{q} = O(n^{14})\). Setting \(\varepsilon = \sqrt{\frac{r_q^2 + \frac{1}{4}}{r_{q}^2} } - 1,\) we would have \(\delta(\tilde{z}^*, c^\prime_{i, j}) = (1 + \varepsilon) r_q\) and \(\varepsilon = \Omega(\frac{1}{n^{14}})\).

We are finally ready to introduce the data points and candidate centres of \(\mathcal{I}(G, k)\). Define the set of candidate centres to be \(C := \{\tilde{v}_1, \tilde{v}_2, ..., \tilde{v}_n, \tilde{z}^*\}\), and the set of data points \(X\) to be \(\{c^\prime_{i, j} \mid i, j \in [n], (v_i, v_j) \in E\}\) in addition to \(\lceil m r_q \rceil\) points in \(\tilde{z}^*\).

Lemma 9. Let \(S = \{\tilde{v}_{i_1}, \tilde{v}_{i_2}, ..., \tilde{v}_{i_k}\} \cup \{ \tilde{z}^* \} \subseteq C\). Assume that \(\{v_{i_1}, v_{i_2}, ..., v_{i_k}\}\) covers \(s^*\) edges in \(G\). Then \(\operatorname{cost}(S) = r_q (m + (m - s^*) \varepsilon)\).

Proof. Note that all data points located at \(\tilde{z}^*\) will be assigned to \(\tilde{z}^*\), and their cost is zero. Consider any \((v_i, v_j) \in E\). Then, if \(\tilde{v}_i\) or \(\tilde{v}_j\) is in \(S\), we have \(\min_{\vec{t} \in S} \delta(c^\prime_{i, j}, \vec{t}) = r_{q}\). Otherwise, \(\min_{\vec{t} \in S} \delta(c^\prime_{i, j}, \vec{t}) = \delta(c^\prime_{i, j}, \tilde{z}^*) = r_{q} ( 1 + \varepsilon)\). This completes the proof. ◻

Lemma 10. Assuming a maximal partial vertex cover of size \(k\) of \(G\) covers \(s^*\) edges, the set of optimum solutions of \(\mathcal{I}(G, k)\) is the set of all \(\{\tilde{v}_{i_1}, \tilde{v}_{i_2}, ..., \tilde{v}_{i_k}, \tilde{z}^*\}\) such that \(\{v_{i_1}, v_{i_2}, ..., v_{i_k}\}\) covers \(s^*\) edges in \(G\).

Proof. First consider any \(O \in \mathcal{F}_{k^\prime}\) that does not contain \(\tilde{z}^*\). We will prove that such an \(O\) cannot be an optimum solution. The closest centre to \(\tilde{z}^*\) is \(\tilde{v}_1\) and \(\delta(\tilde{z}^*, \tilde{v}_1) \geq 2\). If all data points located at \(\tilde{z}^*\) were assigned to \(\tilde{v}_1\), their cost alone would be at least \(2 \lceil m r_q \rceil\). Therefore, due to Lemma 9, \(\operatorname{cost}(O)\) would be bigger than the cost of any \(S \in \mathcal{F}_{k^\prime}\) that contains \(\tilde{z}^*\), which is a contradiction. The rest of the proof immediately follows from Lemma 9. ◻

Lemma 9 and Lemma 10 imply that the graph \(G\) has a partial vertex cover of size \(k\) covering at least \(s\) edges if and only if \(\mathcal{I}(G, k)\) has a solution of cost at most \(r_q (m + (m - s) \varepsilon)\). Therefore, the only thing left to prove is that \(\mathcal{I}(G, k)\) is stable.

Fix \(\varepsilon^{\prime} = \varepsilon / 2m\). We prove that \(\mathcal{I}(G, k)\) is \((1 + \varepsilon^{\prime})\)-stable, which completes the proof. Consider any \(\delta \leq \delta^\prime \leq (1 + \varepsilon^\prime) \cdot \delta\). Assume that a maximal partial vertex cover of size \(k\) of \(G\) covers \(s^*\) edges. It suffices to show that any optimum solution of \((X, C, \delta^\prime)\) corresponds to a set of vertices that covers \(s^*\) edges. For every \(S \in \mathcal{F}_{k^\prime}\), define \(\operatorname{cost}^\prime(S) = \sum_{j \in X} \min_{i \in S} \delta^\prime(i, j)\). Note that by definition \(\operatorname{cost}(S) \leq \operatorname{cost}^\prime(S) \leq \operatorname{cost}(S) ( 1 + \varepsilon^\prime)\).

Let \(O^\prime\) be an optimum solution of the \(k^\prime\)-Median instance \((X, C, \delta^\prime)\). For the sake of contradiction assume that \(O^\prime\) corresponds to a set of vertices that cover \(s < s^*\) edges of \(G\), and let \(O\) correspond to any set of vertices that cover \(s^*\) edges in \(G\). Similarly to the proof of Theorem [thm:k-median-alpha-hardness-panlty], and using Lemma 9, we derive \[\begin{align} \operatorname{cost}^\prime(O) &\leq (1 + \varepsilon^\prime) \operatorname{cost}(O) < r_q \left( m + (m - s^* + 1) \varepsilon \right) \leq r_q ( m + (m - s) \varepsilon) \\ & = \operatorname{cost}(O^\prime) \leq \operatorname{cost}^\prime(O^\prime) \end{align}\] This is a contradiction. Therefore, every optimum solution of \((X, C, \delta^\prime)\) is also an optimum solution of \((X, C, \delta)\), which completes the proof. ◻

References↩︎

[1]
Zachary Friggstad, Kamyar Khodamoradi, and Mohammad R Salavatipour. Exact algorithms and lower bounds for stable instances of Euclidean k-Means. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2958–2972. SIAM, 2019.
[2]
Vincent Cohen-Addad and Chris Schwiegelshohn. On the local structure of stable clustering instances. In Chris Umans, editor, 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 49–60. IEEE Computer Society, 2017. . URL https://doi.org/10.1109/FOCS.2017.14.
[3]
Maria-Florina Balcan and Yingyu Liang. Clustering under perturbation resilience. SIAM J. Comput., 45 (1): 102–155, 2016. . URL https://doi.org/10.1137/140981575.
[4]
Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. J. Algorithms, 31 (1): 228–248, 1999. . URL https://doi.org/10.1006/jagm.1998.0993.
[5]
Vijay V. Vazirani. Approximation algorithms. Springer, 2001. ISBN 978-3-540-65367-7. URL http://www.springer.com/computer/theoretical+computer+science/book/978-3-540-65367-7.
[6]
Petros Drineas, Alan M. Frieze, Ravi Kannan, Santosh S. Vempala, and V. Vinay. Clustering large graphs via the singular value decomposition. Mach. Learn., 56 (1-3): 9–33, 2004. . URL https://doi.org/10.1023/B:MACH.0000033113.59016.96.
[7]
Yonatan Bilu and Nathan Linial. Are stable instances easy? Comb. Probab. Comput., 21 (5): 643–660, 2012. . URL https://doi.org/10.1017/S0963548312000193.
[8]
Pranjal Awasthi, Avrim Blum, and Or Sheffet. Stability yields a PTAS for k-Median and k-Means clustering. In 51th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, October 23-26, 2010, Las Vegas, Nevada, USA, pages 309–318. IEEE Computer Society, 2010. . URL https://doi.org/10.1109/FOCS.2010.36.
[9]
Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. Approximation algorithms for semi-random partitioning problems. In Howard J. Karloff and Toniann Pitassi, editors, Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, May 19 - 22, 2012, pages 367–384. ACM, 2012. . URL https://doi.org/10.1145/2213977.2214013.
[10]
Shai Ben-David, Ulrike von Luxburg, and Dávid Pál. A sober look at clustering stability. In Gábor Lugosi and Hans Ulrich Simon, editors, Learning Theory, 19th Annual Conference on Learning Theory, COLT 2006, Pittsburgh, PA, USA, June 22-25, 2006, Proceedings, volume 4005 of Lecture Notes in Computer Science, pages 5–19. Springer, 2006. . URL https://doi.org/10.1007/11776420_4.
[11]
Zachary Friggstad, Mohsen Rezapour, and Mohammad R Salavatipour. Local search yields a PTAS for k-Means in doubling metrics. SIAM Journal on Computing, 48 (2): 452–480, 2019.
[12]
Vincent Cohen-Addad, Arnaud De Mesmay, Eva Rotenberg, and Alan Roytman. The bane of low-dimensionality clustering. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 441–456. SIAM, 2018.
[13]
D. R. Curtiss. Recent extensions of Descartes' rule of signs. Annals of Mathematics, 19 (4): 251–278, 1918.
[14]
Vincent Cohen-Addad, Philip N. Klein, and Claire Mathieu. Local search yields approximation schemes for k-Means and k-Median in Euclidean and Minor-Free metrics. SIAM J. Comput., 48 (2): 644–667, 2019. . URL https://doi.org/10.1137/17M112717X.
[15]
Russell Impagliazzo and Ramamohan Paturi. Complexity of k-SAT. In Proceedings of the 14th Annual IEEE Conference on Computational Complexity, Atlanta, Georgia, USA, May 4-6, 1999, pages 237–240. IEEE Computer Society, 1999. URL https://doi.org/10.1109/CCC.1999.766282.
[16]
Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? Journal of Computer and System Sciences, 63 (4): 512–530, 2001.
[17]
Marek Cygan, Fedor V Fomin, Łukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Parameterized Algorithms, volume 5. Springer, 2015.