Identification and Inference for Synthetic Control Methods with Spillover Effects: Estimating the Economic Cost of the Sudan Split


Abstract

The synthetic control method (SCM) is widely used for causal inference with panel data, particularly when there are few treated units. SCM assumes the stable unit treatment value assumption (SUTVA), which posits that potential outcomes are unaffected by the treatment status of other units. However, interventions often impact not only treated units but also untreated units, known as spillover effects. This study introduces a novel panel data method that extends SCM to allow for spillover effects and estimate both treatment and spillover effects. The method leverages a spatial autoregressive panel data model to account for spillover effects. The study also proposes a Bayesian inference method using Bayesian horseshoe priors for regularization. The proposed method is applied to two empirical studies: one evaluating the effect of the California tobacco tax on consumption, and the other estimating the economic impact of the 2011 division of Sudan on GDP per capita.
Keywords: Synthetic control; Spillover effect; Spatial data; Bayesian inference; Sudan split.

0.0.1 Introduction

The synthetic control method (SCM) [1], [2] is a causal inference approach used to estimate treatment effects when the number of units undergoing intervention is very small, such as a single unit. This approach identifies treatment effects from panel data by substituting the counterfactual outcomes of the treated unit in the absence of treatment with weighted averages of outcomes of untreated units. The method has been widely used in various fields, including political economy and marketing. [3] describe SCM as “arguably the most important innovation in the policy evaluation literature in the last 15 years.”

The SCM is typically applied to country- and district-level data. For instance, [4] apply SCM to country-level data to estimate the economic impact of the 1990 German reunification, while [5] apply this method to school-district-level data to evaluate the effect of an educational program, among other examples. In such contexts, interventions may impact untreated units due to geographic or socioeconomic connections among countries or districts, which are known as spillovers. However, SCM relies on the stable unit treatment value assumption (SUTVA) [6], which posits that one’s outcome is not dependent on the treatment status of others, excluding spillovers. If spillovers occur to untreated units, SUTVA is violated, which can lead to biased estimates of treatment effects in the SCM.

This paper tackles this problem by extending the standard SCM to allow for spillover effects. We leverage the spatial autoregressive (SAR) model, a workhorse model in spatial data analysis, to address this issue with its capability to capture the correlation of outcomes among units. We propose a new approach to identify and estimate treatment effects by incorporating a SAR model into SCM, where we characterize the outcomes of untreated units with the SAR panel data model. This approach allows for spillover effects arising from the dependence of outcomes among treated and untreated units, thereby relaxing SUTVA in SCM. Furthermore, the extended SCM can identify and estimate spillover effects on untreated units, which are also parameters of interest in many empirical studies adopting SCM.

Building on the identification results, we propose a Bayesian inference method for the proposed SCM. Data targeted by SCM often involves panel data, but the number of units or the length of pretreatment periods is not always sufficiently large. In such cases, frequentist approaches may not provide accurate inference. Bayesian inference offers several advantages. First, it can yield more accurate estimates than frequentist methods when the sample size is small. Second, it facilitates statistical inference via the Markov Chain Monte Carlo (MCMC) procedure. Third, Bayesian modeling is flexible, allowing models to be less dependent on the assumptions about prior distributions.

We apply Bayesian regularization in the construction of synthetic controls. Following [7], we use Bayesian horseshoe priors [8] to model the synthetic weights. The Bayesian horseshoe prior has strong regularization effects, and hence is effective in avoiding overfitting bias. This is particularly important in SCM, as it is often applied to data with a relatively large number of control units and short pretreatment periods, which can easily cause overfitting bias in the estimation of synthetic control outcomes. The simulation study in the paper examines the finite sample performance of the proposed Bayesian inference.

This paper presents two substantive empirical studies with the application of the proposed SCM. One study evaluates the impact of California’s tobacco tax on its consumption [2], and the other estimates the impact of Sudan’s north-south split in 2011 on GDP per capita. Both studies involve spillover effects among US states and African countries. Regarding California’s tobacco tax, our empirical results show a negative impact on tobacco consumption in California, supporting the findings of [2]. We also estimate the spillover effects on other states, showing that California’s tobacco tax reduced tobacco consumption in other states.

To the best of our knowledge, the second empirical study is the first to examine the economic impact of Sudan’s north-south split on the Sudans (the region of the former united Sudan). In contrast, [9] estimate the economic impact of the 2012 oil production halt in South Sudan on South Sudan itself using the standard SCM. [9] intentionally exclude countries neighboring South Sudan from the control group to address spillover concerns. This is a common practice in dealing with spillover in SCM [10]; however, excluding some untreated units, particularly neighboring countries, can lead to poor fitting of synthetic controls and cause additional bias. The SCM proposed in this paper does not need to exclude any untreated units. The results of our empirical study show that Sudan’s north-south split decreased GDP per capita in the Sudans by about 9.5%, with a cumulative reduction of 34% from 2011 to 2015. Our empirical results also show the negative economic impacts of Sudan’s split on other African countries with strong economic connections to Sudan, such as Egypt and Kenya.

Related Literature

Since [1] and [2] pioneered SCM, along with its broad application in the social sciences, SCM has been theoretically and methodologically advanced by many works. [11] present a penalized synthetic control method to reduce interpolation biases. [12] combine SCM with difference-in-differences (DID) and propose a new estimator with robustness properties. [13] propose an augmented synthetic control method to correct bias resulting from imperfect pre-treatment fit. [14] and [15] propose inference methods for SCM. [16] point out a potential bias in the SCM when the perfect fit assumption is not satisfied.

Several works have proposed Bayesian estimation in SCM to facilitate statistical inference through MCMC sampling and/or to apply flexible modeling. [7] introduce a Bayesian synthetic control method that uses the Bayesian horseshoe prior proposed by [8] for the synthetic weights. Their simulations demonstrate more accurate predictions of counterfactual outcomes than the standard SCM, particularly when the number of units is relatively large compared to the length of the pretreatment periods. [17] propose a method based on Bayesian structural time-series models. [18] and [19] propose approaches to correct bias in SCM induced by the dynamic characteristics of untreated units. [18] propose a Bayesian posterior predictive approach with a latent factor term that incorporates unit-specific time trends. [19] incorporate time-varying parameters using a state-space framework.

While the related literature mentioned so far assumes SUTVA in SCM, few studies have considered its relaxation. [20] allow for spillover effects in SCM by assuming a specified structure of spillover effects. Our approach differs from theirs in two aspects: (i) our approach supposes a spatial correlation structure for the outcomes of units rather than a direct structure for spillover effects, and (ii) our approach supposes the existence of a synthetic control only for the treated unit, whereas their approach supposes synthetic controls for both the treated and untreated units. [21] assume that a unit’s potential outcome depends on its own and neighboring treatment assignments. They propose a Bayesian structural time series method to identify and estimate treatment and spillover effects by estimating the predictive distribution of counterfactual outcomes. [22] focus on the average spillover effects within specified clusters and provide a method for constructing confidence intervals for treatment and average spillover effects. In the context of DID, [23] propose an approach for estimating spillover effects using RCT data, which utilizes non-compliance in treatment groups to identify spillover effects.

Structure of the Paper

The remainder of the paper proceeds as follows. Section 0.0.2 describes the SCM setup with spillover effects. Section 0.0.3 introduces the SAR panel data model, and presents the identification results. Section 0.0.4 presents our proposal for Bayesian SCM. Section 0.0.5 shows simulation results to demonstrate the finite-sample performance of the proposed SCM. Section 0.0.6 presents the results of two empirical applications. Section 0.0.7 concludes the paper.

0.0.2 Setup

Consider units \(i=0,1,\cdots,N\) and time points \(t=1,2,\cdots,T\). We suppose that \(i=0\) is the unit receiving the treatment, and \(i=1,\cdots,N\) are the units in the control group. Let \(T_{0}\) be the number of pretreatment periods with \(1 \leq T_0 < T\). The treatment status of a unit \(i\) at time \(t\) is denoted as \(d_{it} \in \{0,1\}\), where \(d_{it}=1\) means that unit \(i\) receives treatment at time \(t\), and \(d_{it}=0\) means otherwise. In our context, \(d_{it}=1\) when \(i=0\) and \(t > T_0\) and \(d_{it}=0\) otherwise. We define \(\boldsymbol{d}_{t} \equiv (d_{0t}, d_{1t},\cdots,d_{Nt})\in\{0,1\}^{N+1}\), the treatment status vector for time \(t\).

We consider the potential outcome framework. While many studies assume that one’s potential outcome is not influenced by the treatment status of others (SUTVA) [6], this study allows for the potential outcome to be affected by the treatment status of others. Thus, for treatment status \(\boldsymbol{d}_{t}\) at time \(t\), the potential outcome of unit \(i\) is denoted as \(Y_{it}(\boldsymbol{d}_{t})\). The observed outcomes are \[\begin{align} Y_{it} \equiv \begin{cases} Y_{it}(0,0,\cdots,0) & \text{if } t\leq T_{0}, \\ Y_{it}(1,0,\cdots,0) & \text{if } t>T_{0} \end{cases} \end{align}\] for all \(i=0,1,\cdots,N\) and \(t=1,2,\ldots,T\).

We define the treatment effect \(\xi_{0t}\) for the treated unit \(0\) at time \(t\) \((> T_{0})\) as follows: \[\begin{align} \xi_{0t} \equiv \underbrace{Y_{0t}(1,0,\cdots,0)}_{\text{observed outcome}} - \underbrace{Y_{0t}(0,0,\cdots,0)}_{\text{counterfactual outcome}},\nonumber \end{align}\] where \(Y_{0t}(1,0,\cdots,0)\) is the outcome when unit \(0\) is treated, and \(Y_{0t}(0,0,\cdots,0)\) is the outcome when the unit is not treated, a counterfactual outcome not observed in the data.

The framework of this study can capture the spillover effects on units in the control group. Although no unit in the control group receives treatment, the influence of unit \(0\) receiving treatment may affect the outcomes of the control group units, which is referred to as spillover effects. We define the spillover effect \(\xi_{it}\) for any untreated unit \(i\;(\geq 1)\) and time \(t\;(>T_{0})\) as follows: \[\begin{align} \xi_{it} \equiv \underbrace{Y_{it}(1,0,\cdots,0)}_{\text{observed outcome}} - \underbrace{Y_{it}(0,0,\cdots,0)}_{\text{counterfactual outcome}}.\nonumber \end{align}\] The spillover effect is the counterfactual difference in the outcomes of an untreated unit when unit \(i=0\) receives treatment versus when unit \(i=0\) does not. For the outcomes of the treated unit, similar to [2], we suppose the following assumption holds.

Assumption 1 (Perfect Fit). There exists a vector of weights \(\boldsymbol{\alpha} = (\alpha_{1},\alpha_{2},\cdots,\alpha_{N})^{\top}\in\mathbb{R}^{N}\) that satisfies the following: For each \(t=1,2,\cdots,T\), \[\begin{align} Y_{0t}(0,0,\cdots,0) = \sum_{i=1}^{N}\alpha_{i}Y_{it}(0,0,\cdots,0) \;\;a.s. \label{eq:perfect95fit} \end{align}\tag{1}\]

This assumption implies that, in the absence of treatment, the control outcome of the treated unit can be replicated by the weighted average of the control outcomes of the untreated units. As pointed out by [16], this is a fundamental assumption for the SCM and is explicitly or implicitly assumed in SCM studies. Because we allow outcomes to depend on the treatment status of others, Assumption 1 pertains to a state where all units are untreated.

Under Assumption 1, we can estimate the synthetic weights \(\boldsymbol{\alpha}\) by solving the following least-squares problem: \[\begin{align} \widehat{\boldsymbol{\alpha}} = \mathop{\rm arg~min}\limits_{\boldsymbol{\alpha}\in\mathbb{R}^{N}} \sum_{t=1}^{T_{0}} \bigg(Y_{0t}-\sum_{i=1}^{N}\alpha_{i}Y_{it}\bigg)^{2}. \label{estimate32alpha} \end{align}\tag{2}\] Using \(\widehat{\boldsymbol{\alpha}}\), the standard SCM [2] estimates the treatment effect \(\xi_{0t}\) (\(t > T_0\)) by \(Y_{0t} - \sum_{i=1}^{N}\hat{\alpha}_{i}Y_{it}\). Under SUTVA and Assumption 1 (perfect fit), [2] show the identification and consistent estimation of treatment effects by SCM. However, when SUTVA is violated, the standard SCM can be biased in its estimation of treatment effects. The subsequent section clarifies the sources of this bias and proposes a new approach to identify treatment and spillover effects while allowing for spillover.
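The least-squares problem (2) and the resulting standard SCM estimate can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `scm_least_squares`, the array layout (rows are periods, columns are control units), and the toy data with a constant effect of 2.0 are all hypothetical. Note that when \(N > T_{0}\) the problem is underdetermined and `np.linalg.lstsq` returns the minimum-norm solution, which is one reason regularization is attractive in practice.

```python
import numpy as np

def scm_least_squares(y0, Yc, T0):
    """Least-squares weights from pretreatment data and post-treatment gaps.

    y0 : (T,) outcomes of the treated unit
    Yc : (T, N) outcomes of the control units
    T0 : number of pretreatment periods
    """
    # Solve problem (2) on the first T0 periods (weights are unconstrained).
    alpha_hat, *_ = np.linalg.lstsq(Yc[:T0], y0[:T0], rcond=None)
    # Standard SCM estimate of xi_{0t} for each t > T0.
    effects = y0[T0:] - Yc[T0:] @ alpha_hat
    return alpha_hat, effects

# Toy data that satisfy Assumption 1 exactly and contain no spillover.
rng = np.random.default_rng(0)
T, T0, N = 30, 20, 5
alpha_true = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
Yc = rng.normal(size=(T, N))
y0 = Yc @ alpha_true
y0[T0:] += 2.0  # hypothetical constant treatment effect

alpha_hat, effects = scm_least_squares(y0, Yc, T0)
print(np.round(alpha_hat, 3), np.round(effects, 3))
```

Because the toy data satisfy the perfect-fit assumption and SUTVA, the weights and the effect are recovered up to numerical precision; the next section explains why this no longer holds once control outcomes respond to the treatment.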

0.0.3 Identification

This section first clarifies the bias in standard SCM induced by spillover effects. Subsequently, we introduce the SAR panel data model and discuss its capability to capture spillover effects. We then discuss the identification of the treatment and spillover effects.

0.0.3.1 Bias for the Standard SCM

We first show that when the SUTVA is not satisfied, the standard SCM can be biased due to spillover. Assumption 1 implies that the treatment effects \(\xi_{0t}\) can be expressed as \(Y_{0t}(1,0,\cdots,0)-\sum_{i=1}^{N}\alpha_{i}Y_{it}(0,0,\cdots,0)\). However, when the outcomes depend on the treatment statuses of other units, we cannot observe counterfactual control outcomes \(Y_{it}(0,0,\cdots,0)\) after the pre-treatment periods (\(t > T_0\)). This causes bias in the standard SCM as follows: \[\begin{align} Y_{0t} - \sum_{i=1}^{N}\alpha_{i}Y_{it} &= Y_{0t}(1,0,\cdots,0) - \sum_{i=1}^{N}\alpha_{i}Y_{it}(1,0,\cdots,0) \\ & = \xi_{0t}+\underbrace{\sum_{i=1}^{N}\alpha_{i}\big(Y_{it}(0,0,\cdots,0) - Y_{it}(1,0,\cdots,0)\big)}_{\text{bias}}, \end{align}\] where \(Y_{0t} - \sum_{i=1}^{N}\alpha_{i}Y_{it}\) is the standard SCM estimator of \(\xi_{0t}\) given knowledge of \(\boldsymbol{\alpha}\). This result shows the existence of bias in the standard SCM when SUTVA is not satisfied.
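The bias term can be restated in terms of the spillover effects \(\xi_{it}\) defined in Section 0.0.2. The following line uses only those definitions and the decomposition above: \[\begin{align} Y_{0t} - \sum_{i=1}^{N}\alpha_{i}Y_{it} = \xi_{0t} + \sum_{i=1}^{N}\alpha_{i}\big(Y_{it}(0,0,\cdots,0) - Y_{it}(1,0,\cdots,0)\big) = \xi_{0t} - \sum_{i=1}^{N}\alpha_{i}\xi_{it}. \nonumber \end{align}\] As a purely illustrative example with hypothetical numbers, suppose \(\xi_{0t}=-10\), two control units carry weights \(\alpha_{1}=\alpha_{2}=0.5\), and both experience spillovers \(\xi_{1t}=\xi_{2t}=-2\). The standard SCM quantity then equals \(-10-(0.5\cdot(-2)+0.5\cdot(-2))=-8\), which understates the magnitude of the treatment effect. More generally, the bias vanishes only when the weighted sum of spillover effects is zero.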

0.0.3.2 Spatial Autoregressive Model

To address the issue of bias caused by spillover, we introduce a model that captures spillover effects between treated and untreated units. Specifically, for the outcomes of units in the control group, we assume the following SAR panel data model for each \(t\geq 1\): \[\begin{align} \boldsymbol{Y}_{t}^{c}(\boldsymbol{d}_{t}) = \rho \big(\boldsymbol{w} Y_{0t}(\boldsymbol{d}_{t}) + \boldsymbol{W} \boldsymbol{Y}_{t}^{c}(\boldsymbol{d}_{t})\big) + \boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}^{\boldsymbol{d}_{t}}, \label{network32model} \end{align}\tag{3}\] where \(\boldsymbol{Y}_{t}^{c}(\boldsymbol{d}_{t}) \equiv (Y_{1t}(\boldsymbol{d}_{t}), Y_{2t}(\boldsymbol{d}_{t}),\cdots, Y_{Nt}(\boldsymbol{d}_{t}))^{\top}\in\mathbb{R}^{N}\). The vector \(\boldsymbol{w}\in\mathbb{R}^{N}\) and matrix \(\boldsymbol{W}\in\mathbb{R}^{N\times N}\) comprise spatial weights, which are to be specified a priori. For \(i=1,2,\ldots,N\) and \(j=0,1,\ldots,N\), we denote by \(w_{ij}\) the spatial weight between units \(i\) and \(j\); that is, the \((i,j+1)\)-th element of \((\boldsymbol{w},\boldsymbol{W}) \in \mathbb{R}^{N\times (1+N)}\). Typical examples of spatial weights include adjacency weights, where \(w_{ij}\) is 1 if units \(i\) and \(j\) are adjacent and 0 otherwise. Geographic distance (e.g., distance between the capitals of two countries) and economic distance (e.g., trade amount between two countries) are also often employed as spatial weights in the literature on spatial data analysis [24]. \(\boldsymbol{X}_{t}\equiv (\boldsymbol{X}_{1t},\ldots,\boldsymbol{X}_{Nt})^{\top}\in\mathbb{R}^{N\times k}\) denotes the covariate matrix for control units, where \(\boldsymbol{X}_{it}\) denotes a vector of covariates for unit \(i\) at time \(t\). \(\boldsymbol{u}_{t}^{\boldsymbol{d}_{t}} \equiv (u_{1t}^{\boldsymbol{d}_{t}},\cdots,u_{Nt}^{\boldsymbol{d}_{t}})^{\top}\in\mathbb{R}^{N}\) is a vector of error terms.

The SAR panel data model (3) captures the spillovers arising from the spatial dependence of the outcomes among treated and untreated units. The magnitude of the spillover effects depends on the value of \(\rho\) as well as those of the spatial weights \(\boldsymbol{w}\) and \(\boldsymbol{W}\), which represent the strength of spatial correlation. Several methods have been proposed to estimate \(\rho\) in SAR panel data models [25][29].

0.0.3.3 Identification

We discuss the identification of the treatment and spillover effects under Assumption 1 and the SAR panel data model (3). For simplicity, since the treatment state \(d_{it}\) for all \(i \geq 1\) and \(t\) is always \(0\), we write \[\begin{align} Y_{it}(1) & = Y_{it}(1,0,\cdots,0) \;\;\;\;(\forall i,t), \nonumber \\ Y_{it}(0) & = Y_{it}(0,0,\cdots,0) \;\;\;\;(\forall i,t). \nonumber \end{align}\] We assume the following condition holds.

Assumption 2. For \(t=1,2,\cdots,T\) and \(i=1,2,\cdots,N\), the error term \(u_{it}^{\boldsymbol{d}_{t}}\) depends only on unit \(i\)'s own treatment state \(d_{it}\); that is, \(u_{it}^{\boldsymbol{d}_{t}} = u_{it}^{\boldsymbol{d}_{t}^{\prime}}\) a.s. for any \(\boldsymbol{d}_{t}=(d_{0t},d_{1t},\ldots,d_{Nt})^{\top}\) and \(\boldsymbol{d}_{t}^{\prime}=(d_{0t}^{\prime},d_{1t}^{\prime},\ldots,d_{Nt}^{\prime})^{\top}\) such that \(d_{it} = d_{it}^{\prime}\).

This assumption implies that, since \(d_{it}=0\) for any control unit \(i\) at any time \(t\), the error terms for the control units always satisfy \(\boldsymbol{u}_{t}^{\boldsymbol{d}_{t}}=\boldsymbol{u}_{t}^{\boldsymbol{0}_{N+1}}\), where \(\boldsymbol{0}_{N+1}\) represents an \((N+1)\)-dimensional vector of zeros. For simplicity, we denote \(\boldsymbol{u}_{t}^{\boldsymbol{d}_{t}}\)(\(=\boldsymbol{u}_{t}^{\boldsymbol{0}_{N+1}}\)) as \(\boldsymbol{u}_{t}\).

We further assume that the model (3) and the synthetic weights \(\boldsymbol{\alpha}\) in Assumption 1 satisfy the following assumption.

Assumption 3. \(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\) is full rank.

This assumption ensures that \(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\) has an inverse matrix. Assumption 3 is testable given estimators of \(\boldsymbol{\alpha}\) and \(\rho\). Given the parameters \(\boldsymbol{\alpha}\) and \(\rho\), the treatment effects and spillover effects can be identified as follows. For each \(t>T_{0}\), under Assumption 1 and the SAR panel data model (3), we obtain \[\begin{align} \boldsymbol{Y}_{t}^{c}(0) & = \rho \boldsymbol{w} Y_{0t}(0) + \rho \boldsymbol{W} \boldsymbol{Y}_{t}^{c}(0) + \boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t} \nonumber \\ & =\rho \boldsymbol{w} \boldsymbol{\alpha}^{\top} \boldsymbol{Y}_{t}^{c}(0) + \rho \boldsymbol{W} \boldsymbol{Y}_{t}^{c}(0) + \boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}.\nonumber \end{align}\] Thus, because the inverse matrix of \((\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W})\) exists under Assumption 3, rearranging the above equation yields \[\begin{align} \boldsymbol{Y}_{t}^{c}(0) & = \big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}(\boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}),\nonumber \end{align}\] which leads to \[\begin{align} \xi_{0t} & = Y_{0t}(1) - \boldsymbol{\alpha}^{\top} \boldsymbol{Y}_{t}^{c}(0) \nonumber \\ & =Y_{0t}(1) - \boldsymbol{\alpha}^{\top} \big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}\big).\nonumber \end{align}\] Furthermore, under the model (3) and Assumption 2, the same \(\boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}\) appears in the equation for the outcomes under treatment, so that for each \(t > T_0\), \[\begin{align} \big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c}(1) - \rho \boldsymbol{w}Y_{0t}(1)= \boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}.\nonumber \end{align}\] Therefore, for \(t > T_0\), we obtain \[\begin{align} \xi_{0t} & =Y_{0t}(1) - \boldsymbol{\alpha}^{\top} \big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c}(1) - \rho \boldsymbol{w}Y_{0t}(1)\big) \nonumber \\ &= Y_{0t} - \boldsymbol{\alpha}^{\top} \big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c} - \rho \boldsymbol{w}Y_{0t}\big). \label{identification} \end{align}\tag{4}\] If the parameters \(\boldsymbol{\alpha}\) and \(\rho\) are estimated from the sample of the pretreatment periods (\(t\leq T_{0}\)), \(\xi_{0t}\) (\(t> T_{0}\)) can be estimated via equation (4). Note that \(\boldsymbol{\alpha}\) can be consistently estimated by the least-squares method (2) under Assumption 1, and several methods from the literature on SAR panel data models are applicable to estimating \(\rho\) [25][29].

The following theorem summarizes the identification result of \(\xi_{0t}\).

Theorem 1 (Identification of the Treatment Effect). Suppose that Assumptions 1, 2, and 3 hold. Then, given \(\rho\) and \(\boldsymbol{\alpha}\), the treatment effect \(\xi_{0t}\) for \(t>T_{0}\) can be identified as follows: \[\begin{align} \xi_{0t} = Y_{0t} - \boldsymbol{\alpha}^{\top} \big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c} - \rho \boldsymbol{w}Y_{0t}\big). \label{eq:identification95treatment95effect} \end{align}\tag{5}\]

Proof. The result follows from the discussion above. ◻

The discussion above shows that the counterfactual outcome \(\boldsymbol{Y}_{t}^{c}(0)\) for each untreated unit for \(t>T_{0}\) can be identified as \(\big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c} - \rho \boldsymbol{w}Y_{0t}\big)\). Therefore, given the parameters \(\boldsymbol{\alpha}\) and \(\rho\), the spillover effects on the untreated units are also identifiable.

Theorem 2 (Identification of the Spillover Effects). Suppose that Assumptions 1, 2, and 3 hold. Then, given \(\rho\) and \(\boldsymbol{\alpha}\), the spillover effects \(\boldsymbol{\xi}_{t}^{c}=(\xi_{1t},\xi_{2t},\cdots,\xi_{Nt})^{\top}\in\mathbb{R}^{N}\) on the \(N\) control units for \(t>T_{0}\) can be identified as follows: \[\begin{align} \boldsymbol{\xi}_{t}^{c} & =\boldsymbol{Y}_{t}^{c} - \big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c} - \rho \boldsymbol{w}Y_{0t}\big).\label{eq:identification95spillover95effect} \end{align}\tag{6}\]

Proof. Note that \(\boldsymbol{Y}_{t}^{c} = \boldsymbol{Y}_{t}^{c}(1)\) for \(t>T_0\) and that \(\big(\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} -\rho \boldsymbol{W}\big)^{-1}\big(\big(\boldsymbol{I}_{N} - \rho \boldsymbol{W}\big)\boldsymbol{Y}_{t}^{c} - \rho \boldsymbol{w}Y_{0t}\big) = \boldsymbol{Y}_{t}^{c}(0)\). This leads to the result (6). ◻
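The identification formulas (5) and (6) translate directly into a linear solve. The following NumPy sketch is an illustration under stated assumptions rather than the authors' code: the function name `identify_effects`, the four-unit ring network, and the treatment effect of 3.0 are hypothetical, and the random vector `z` stands in for \(\boldsymbol{X}_{t}\boldsymbol{\beta}+\boldsymbol{u}_{t}\). The toy data are generated from model (3) under both treatment states so that the true effects are known.

```python
import numpy as np

def identify_effects(y0t, yct, alpha, rho, w, W):
    """Treatment effect (5) and spillover effects (6) for one period t > T0."""
    N = alpha.shape[0]
    I = np.eye(N)
    M = I - rho * np.outer(w, alpha) - rho * W      # must be invertible (Assumption 3)
    rhs = (I - rho * W) @ yct - rho * w * y0t       # equals X_t beta + u_t under model (3)
    yc0 = np.linalg.solve(M, rhs)                   # counterfactual Y_t^c(0)
    return y0t - alpha @ yc0, yct - yc0

# Hypothetical toy network: four control units on a ring, unit 1 linked to the treated unit.
rng = np.random.default_rng(1)
N, rho = 4, 0.2
W = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
w = np.array([1.0, 0.0, 0.0, 0.0])
alpha = np.array([0.5, 0.3, 0.2, 0.0])
I = np.eye(N)
z = rng.normal(size=N)  # stands in for X_t beta + u_t

# Outcomes without treatment (Assumption 1 plus model (3)).
yc_untreated = np.linalg.solve(I - rho * np.outer(w, alpha) - rho * W, z)
y0_untreated = alpha @ yc_untreated
# Outcomes with treatment: the treated unit shifts by 3.0 and controls respond through (3).
y0_treated = y0_untreated + 3.0
yc_treated = np.linalg.solve(I - rho * W, rho * w * y0_treated + z)

xi0, xic = identify_effects(y0_treated, yc_treated, alpha, rho, w, W)
print(xi0, xic)
```

Only observed post-treatment outcomes, \(\boldsymbol{\alpha}\), \(\rho\), and the spatial weights enter the function; neither \(\boldsymbol{\beta}\) nor the errors are needed, and setting `rho=0.0` returns the standard SCM gap with zero spillovers.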

Theorems 1 and 2 show that by incorporating the SAR model into SCM, we can identify treatment and spillover effects without relying on SUTVA. The results in Theorems 1 and 2 also suggest that although the SAR model (3) includes the covariates \(\boldsymbol{X}_t\) and error term \(\boldsymbol{u}_{t}\), it is unnecessary to estimate \(\boldsymbol{\beta}\) or the distribution of \(\boldsymbol{u}_{t}\) for the estimation of treatment and spillover effects. Only \(\rho\) and the spatial weights \((\boldsymbol{w},\boldsymbol{W})\) in the SAR model (3) are relevant for the estimation of treatment and spillover effects.

Notably, when there is no spillover (i.e., \(\rho=0\)), our identification result (5) for \(\xi_{0t}\) corresponds to the identification result of the standard SCM [2]. That is, when \(\rho=0\), the identification result (5) simplifies to \[\begin{align} \xi_{0t} & = Y_{0t} - \boldsymbol{\alpha}^{\top} (\boldsymbol{I}_{N}-\rho \boldsymbol{w}\boldsymbol{\alpha}^{\top} - \rho \boldsymbol{W})^{-1}\big((\boldsymbol{I}_{N}-\rho \boldsymbol{W})\boldsymbol{Y}_{t}^{c} - \rho \boldsymbol{w} Y_{0t}\big) \\ & = Y_{0t} - \boldsymbol{\alpha}^{\top} \boldsymbol{I}_{N}(\boldsymbol{Y}_{t}^{c}-0) \\ & = Y_{0t} - \boldsymbol{\alpha}^{\top} \boldsymbol{Y}_{t}^{c}, \end{align}\] where the final line is identical to the identification equation for the standard SCM [2]. This result indicates that the identification result obtained from the standard SCM is a special case of our identification result when \(\rho=0\) (no spillover).

0.0.4 Inference

In this section, we propose a Bayesian inference method for the treatment effects \(\xi_{0t}\) and spillover effects \(\xi_{it}\) (\(i =1,2,\ldots,N\)), building on the setup in Section 0.0.2 and the identification results in Section 0.0.3. SCM is typically applied to data with short pretreatment periods, where frequentist methods may not provide accurate estimates. Bayesian methods, by contrast, can provide statistical inference (e.g., credible intervals) through MCMC procedures even with short pretreatment periods.

0.0.4.1 Synthetic Weights

Regarding the estimation of the synthetic weights \(\boldsymbol{\alpha}\), we adopt the method of [7], which utilizes the Bayesian horseshoe prior as the prior distribution for parameters. The Bayesian horseshoe prior places a hierarchical prior distribution on parameters, characterized by a strong shrinkage effect around zero. [30] show that LASSO regression in linear models corresponds to estimation with a Laplace prior distribution on coefficients, while the Bayesian horseshoe prior proposed by [8] yields an even more selective choice of coefficients. Thus, by placing this prior distribution on the synthetic weights, it is anticipated that the estimated weights will select the units in the control group that are more related to the treated unit.

The following hierarchical prior distributions are placed on \(\alpha_{1},\alpha_{2},\cdots,\alpha_{N}\): \[\begin{align} \alpha_{i} \mid \lambda_{i} & \sim N(0,\lambda_{i}^{2}),\;\;\text{for}\;i=1,2,\cdots,N, \\ \lambda_{i} \mid \tau & \sim \text{Half-Cauchy}(0,\tau),\;\;\text{for}\;i=1,2,\cdots,N, \\ \tau \mid \sigma_{1} & \sim \text{Half-Cauchy}(0,\sigma_{1}), \\ \sigma_{1} & \sim \text{Half-Cauchy}(0,10). \end{align}\] When these prior distributions are set, as shown by [31], the full conditionals of each parameter can be derived analytically, enabling parameter sampling using the Gibbs sampler. The Appendix shows the full conditionals for each parameter.
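To build intuition for the shrinkage behavior of this hierarchy, one can draw from the prior directly. The sketch below is illustrative only and is not the Gibbs sampler referred to above: the helper names `half_cauchy` and `draw_horseshoe_prior` are hypothetical, and a Half-Cauchy\((0,s)\) draw is obtained as the absolute value of \(s\) times a standard Cauchy draw.

```python
import numpy as np

def half_cauchy(rng, scale, size=None):
    """Draw from Half-Cauchy(0, scale) as |scale * standard Cauchy|."""
    return np.abs(scale * rng.standard_cauchy(size))

def draw_horseshoe_prior(rng, N):
    """One joint draw of (alpha, lambda, tau, sigma_1) from the hierarchical prior."""
    sigma1 = half_cauchy(rng, 10.0)      # sigma_1 ~ Half-Cauchy(0, 10)
    tau = half_cauchy(rng, sigma1)       # global scale: tau | sigma_1
    lam = half_cauchy(rng, tau, size=N)  # local scales: lambda_i | tau
    alpha = rng.normal(0.0, lam)         # alpha_i | lambda_i ~ N(0, lambda_i^2)
    return alpha, lam, tau, sigma1

rng = np.random.default_rng(0)
alpha, lam, tau, sigma1 = draw_horseshoe_prior(rng, N=16)
print(np.round(alpha, 3))
```

Repeated draws typically show most weights close to zero together with a few very large ones, reflecting the heavy Cauchy tails of the local scales \(\lambda_{i}\).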

0.0.4.2 Spatial Autoregressive Panel Data Model

Regarding the model (3), we place the Bayesian horseshoe prior on \(\boldsymbol{\beta}\), as for the synthetic weights; that is, \[\begin{align} \beta_{k}\mid\kappa_{k} & \sim N(0,\kappa_{k}^{2}), \;\;\text{for}\;k=1,2,\cdots,K, \\ \kappa_{k}\mid\psi & \sim \text{Half-Cauchy}(0,\psi),\;\;\text{for}\;k=1,2,\cdots,K, \\ \psi\mid\sigma_{2} & \sim \text{Half-Cauchy}(0,\sigma_{2}), \\ \sigma_{2} & \sim \text{Half-Cauchy}(0,10). \end{align}\]

We assume that \(\boldsymbol{u}_{t}\) follows a multilevel latent factor model \(\boldsymbol{u}_{t}=\boldsymbol{\eta}\boldsymbol{\gamma}_{t}+\boldsymbol{e}_{t}\), where \(\boldsymbol{\gamma}_{t}\in\mathbb{R}^{p}\) denotes the factors at time \(t\), \(\boldsymbol{\eta}\in\mathbb{R}^{N \times p}\) is the matrix of factor loadings, and \(\boldsymbol{e}_{t} \in\mathbb{R}^{N}\) is an error vector. We assume that \(\boldsymbol{\gamma}_{t}\) follows the AR(\(1\)) model \(\boldsymbol{\gamma}_{t}=\phi_{\gamma}\boldsymbol{\gamma}_{t-1} + \boldsymbol{v}_{t}\), where \(\boldsymbol{v}_{t}\sim N(\boldsymbol{0}_{p},\sigma^{2}_{\gamma}\boldsymbol{I}_{p})\). For each control unit \(i\) (\(=1,\cdots,N\)), the loading vector \(\boldsymbol{\eta}_{i}\) (the \(i\)-th row of \(\boldsymbol{\eta}\)) follows a \(p\)-dimensional normal distribution \(N_{p}(\boldsymbol{0}_{p},\sigma^{2}_{\eta}\boldsymbol{\Sigma}_{\eta})\), where \(\boldsymbol{\Sigma}_{\eta}=\mathrm{diag}(\omega_{1}^{2},\cdots,\omega_{p}^{2})\). The priors of \(\sigma_{\eta}\) and \(\omega_{1}, \cdots,\omega_{p}\) are \(\text{Half-Cauchy}(0,10)\). These settings are the same as those used by [18].

We adopt the method proposed by [32] and [24] for modeling \(\rho\). Given all parameters except \(\rho\), the conditional distribution of \(\rho\) is \[\begin{align} p(\rho\mid\text{rest}) \propto \prod_{t=1}^{T_{0}}|\boldsymbol{I}_{N} - \rho\boldsymbol{W}|\exp\left(-\dfrac{1}{2\sigma^{2}}\tilde{\boldsymbol{u}}_{t}^{\top}\tilde{\boldsymbol{u}}_{t}\right), \end{align}\] where \(\tilde{\boldsymbol{u}}_{t} = \boldsymbol{A}\boldsymbol{Y}_{t}^{c} - \rho\boldsymbol{w}Y_{0t}-\boldsymbol{X}_{t}\boldsymbol{\beta} - \boldsymbol{\eta}\boldsymbol{\gamma}_{t}\) and \(\boldsymbol{A}=\boldsymbol{I}_{N}-\rho\boldsymbol{W}\). Because this is not the kernel of a standard distribution, we cannot sample from the full conditional distribution directly. Therefore, sampling is conducted using the Metropolis algorithm. If the value of \(\rho\) in the \(m\)-th sampling is \(\rho^{(m)}\), then the proposal value \(\rho^{*}\) for the (\(m+1\))-th sampling is determined by \(\rho^{*}=\rho^{(m)}+ k \varepsilon\) with \(\varepsilon\sim N(0,1)\), where \(k\) is a tuning parameter adjusted to achieve an acceptance rate of approximately \(40\%\) to \(60\%\). The MCMC procedure for all parameters is described in the Appendix.
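The random-walk Metropolis update for \(\rho\) can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: all other parameters are held fixed (collected in the hypothetical array `mean_part`, standing in for \(\boldsymbol{X}_{t}\boldsymbol{\beta}+\boldsymbol{\eta}\boldsymbol{\gamma}_{t}\)), \(\sigma^{2}\) is treated as known, no bounds are imposed on \(\rho\), the determinant enters through its absolute value, and the toy network is a row-normalized ring chosen only for this example. The full sampler in the Appendix alternates this step with updates of the remaining parameters.

```python
import numpy as np

def log_cond_rho(rho, Yc, y0, w, W, mean_part, sigma2):
    """Log of the conditional kernel of rho, up to an additive constant.

    Yc : (T0, N) control outcomes, y0 : (T0,) treated outcomes,
    mean_part : (T0, N) stand-in for X_t beta + eta gamma_t (held fixed here).
    """
    N = W.shape[0]
    A = np.eye(N) - rho * W
    _, logabsdet = np.linalg.slogdet(A)  # -inf if A is singular
    resid = Yc @ A.T - rho * np.outer(y0, w) - mean_part  # rows are u_tilde_t
    return Yc.shape[0] * logabsdet - 0.5 * np.sum(resid ** 2) / sigma2

def metropolis_step(rho, k, log_target, rng):
    """One random-walk Metropolis update with proposal rho + k * N(0, 1)."""
    proposal = rho + k * rng.normal()
    log_ratio = log_target(proposal) - log_target(rho)
    if np.log(rng.uniform()) < log_ratio:
        return proposal, True
    return rho, False

# Hypothetical toy data; y0 is treated as given, as in the conditional above.
rng = np.random.default_rng(2)
T0, N, rho_true, sigma2 = 50, 4, 0.3, 1.0
W = 0.5 * np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
w = np.array([1.0, 0.0, 0.0, 0.0])
y0 = rng.normal(size=T0)
mean_part = rng.normal(size=(T0, N))
noise = rng.normal(scale=np.sqrt(sigma2), size=(T0, N))
A_true = np.eye(N) - rho_true * W
Yc = np.linalg.solve(A_true, (rho_true * np.outer(y0, w) + mean_part + noise).T).T

target = lambda r: log_cond_rho(r, Yc, y0, w, W, mean_part, sigma2)
rho, k, accepted, draws = 0.0, 0.05, 0, []
for _ in range(2000):
    rho, ok = metropolis_step(rho, k, target, rng)
    accepted += ok
    draws.append(rho)
print("acceptance rate:", accepted / 2000, "mean of later draws:", np.mean(draws[1000:]))
```

In practice, `k` would be adjusted during a burn-in phase until the printed acceptance rate falls in the desired range.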

Using the sampled parameters from the posterior distributions, we can estimate the treatment and spillover effects through the identification results (5) and (6) in Theorems 1 and 2. Specifically, if \(M\) posterior draws are obtained, the values of the treatment and spillover effects for the \(m\)-th draw are as follows: \[\begin{align} \xi_{0t}^{(m)} & = Y_{0t} - \boldsymbol{\alpha}^{\top(m)}\big(\boldsymbol{I}_{N}-\rho^{(m)}\boldsymbol{w}\boldsymbol{\alpha}^{\top(m)}-\rho^{(m)}\boldsymbol{W}\big)^{-1}\big((\boldsymbol{I}_{N}-\rho^{(m)}\boldsymbol{W})\boldsymbol{Y}_{t}^{c}-\rho^{(m)}\boldsymbol{w}Y_{0t}\big), \\ \boldsymbol{\xi}_{t}^{(m)} & = \boldsymbol{Y}_{t}^{c} - \big(\boldsymbol{I}_{N}-\rho^{(m)}\boldsymbol{w}\boldsymbol{\alpha}^{\top(m)}-\rho^{(m)}\boldsymbol{W}\big)^{-1}\big((\boldsymbol{I}_{N}-\rho^{(m)}\boldsymbol{W})\boldsymbol{Y}_{t}^{c}-\rho^{(m)}\boldsymbol{w}Y_{0t}\big). \end{align}\] Calculating this for each \(m\;(=1,2,\cdots,M)\) allows us to construct the posterior distribution of the treatment and spillover effects. The posterior means of \(\xi_{0t}\) and \(\boldsymbol{\xi}_{t}\) can then be obtained by \((1/M)\sum_{m=1}^{M}\xi_{0t}^{(m)}\) and \((1/M)\sum_{m=1}^{M}\boldsymbol{\xi}_{t}^{(m)}\), respectively. Given the estimators of \(\boldsymbol{\alpha}\) and \(\rho\), the estimation of the treatment effect \(\xi_{0t}\) and spillover effect \(\boldsymbol{\xi}_{t}\) depends only on the observed outcomes \(Y_{0t}\) and \(\boldsymbol{Y}_{t}^{c}\) over time. This suggests that time-varying components in the SAR panel data model, such as the factors \(\boldsymbol{\gamma}_{t}\), need not be estimated in the post-treatment periods \(t\;(> T_{0})\).
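The per-draw computation and the posterior summaries can be sketched as below. The draws used here are placeholders generated around hypothetical values, not output from an actual sampler, and the function name `effects_from_draws`, the four-unit ring network, and the 95% level are illustrative assumptions.

```python
import numpy as np

def effects_from_draws(y0t, yct, alpha_draws, rho_draws, w, W):
    """Evaluate xi_0t^(m) and xi_t^(m) for every posterior draw m."""
    N = W.shape[0]
    I = np.eye(N)
    xi0 = np.empty(len(rho_draws))
    xic = np.empty((len(rho_draws), N))
    for m, (alpha, rho) in enumerate(zip(alpha_draws, rho_draws)):
        M = I - rho * np.outer(w, alpha) - rho * W
        yc0 = np.linalg.solve(M, (I - rho * W) @ yct - rho * w * y0t)
        xi0[m] = y0t - alpha @ yc0   # treatment effect for draw m
        xic[m] = yct - yc0           # spillover effects for draw m
    return xi0, xic

# Placeholder draws (hypothetical values standing in for MCMC output).
rng = np.random.default_rng(3)
n_draws, N = 500, 4
W = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
w = np.array([1.0, 0.0, 0.0, 0.0])
alpha_draws = np.array([0.5, 0.3, 0.2, 0.0]) + 0.02 * rng.normal(size=(n_draws, N))
rho_draws = 0.1 + 0.02 * rng.normal(size=n_draws)
y0t, yct = 1.5, rng.normal(size=N)  # observed outcomes in one period t > T0

xi0, xic = effects_from_draws(y0t, yct, alpha_draws, rho_draws, w, W)
post_mean = xi0.mean()                                # (1/M) * sum over draws
cred_int = np.quantile(xi0, [0.025, 0.975])           # equal-tailed 95% interval
spill_mean = xic.mean(axis=0)
print(post_mean, cred_int, spill_mean)
```

Because the effects are nonlinear in \((\boldsymbol{\alpha},\rho)\), evaluating the formulas draw by draw and then averaging differs from plugging posterior means of the parameters into (5) and (6).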

0.0.5 Simulation Study↩︎

0.0.5.1 Simulation Design

We conduct a simulation study to examine the finite sample performance of the proposed Bayesian inference method. Three scenarios are considered for the number of untreated units \(N\): \(N=16\), \(36\), and \(64\). For the length of time periods, we consider two scenarios: \((T,T_{0})=(30,20)\) and \((60,50)\), where each scenario has a post-treatment length of ten.

In each scenario of \((N,T,T_0)\), we set up the DGPs as follows. We consider a network of \(N\) untreated units represented by a rook matrix.4 We set the spatial weight \(\omega_{ij}\) of any two untreated units to \(1\) if units \(i\) and \(j\) are connected in the network and \(0\) otherwise. The \(i\)-th element of the adjacency vector \(\boldsymbol{w}\) takes the value \(1\) if \(i \in \{1,2,3,4\}\) and \(0\) otherwise. The synthetic weights \(\boldsymbol{\alpha}\) are set to give large weights only to control units adjacent to the treated unit, as follows: \[\begin{align} \alpha_{i} = \begin{cases} 0.5 & \mathrm{if}\;i=1, \\ -0.2 & \mathrm{if}\;i=2, \\ 0.4 & \mathrm{if}\;i \in \{3,4\}, \\ 0.1/6 & \mathrm{if}\;i \in \{5,6,\ldots,10\}, \\ 0 & \mathrm{otherwise}. \end{cases} \end{align}\]
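For concreteness, a rook matrix of this kind can be built as in the following sketch. The function name `rook_matrix` and the row-major numbering of the cells on the board are our own assumptions; the paper does not specify how units are indexed.

```python
import numpy as np

def rook_matrix(r):
    """0/1 adjacency matrix of an r x r board under rook contiguity.

    Cells are numbered row by row, so cell (row, col) has index row * r + col.
    Two cells are neighbors if they share an edge.
    """
    N = r * r
    W = np.zeros((N, N))
    for row in range(r):
        for col in range(r):
            i = row * r + col
            if col + 1 < r:              # neighbor to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if row + 1 < r:              # neighbor below
                W[i, i + r] = W[i + r, i] = 1.0
    return W
```

With `r = 4`, `6`, and `8`, this yields the networks for \(N=16\), \(36\), and \(64\), respectively.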

For each \(t\) \((=1,2,\cdots,T)\), the control outcomes of untreated units \(\boldsymbol{Y}_{t}^{c}(0)\) are generated from a SAR panel data model as follows: \(\boldsymbol{Y}_{t}^{c}(0) = \rho\boldsymbol{w}Y_{0t}(0) + \rho\boldsymbol{W}\boldsymbol{Y}_{t}^{c}(0) + \boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}\), where \(\boldsymbol{X}_{t} = (X_{1t},\ldots,X_{Nt})^{\top}\) with each element being i.i.d. as \(N(0,1)\), \(\boldsymbol{u}_{t} = (u_{1t},\ldots,u_{Nt})^{\top}\) with each element being i.i.d. as \(N(0,1)\), and \(\beta = 1.0\). The control outcome of the treated unit \(Y_{0t}(0)\) is generated as \(Y_{0t}(0) =\sum_{i=1}^{N}\alpha_{i}Y_{it}(0)\). We consider seven scenarios of \(\rho\): \(\rho=-0.8,\;-0.3,\;-0.1,\;0.0,\;0.1,\;0.3,\;0.8\). A larger absolute value of \(\rho\) implies a stronger spatial correlation among the units.

We generate the treatment outcomes of the treated unit \(Y_{0t}(1)\) as \(Y_{0t}(1) = Y_{0t}(0) + N(1, 1)\). We also set \(\boldsymbol{Y}_{t}^{c}(1) = \rho\boldsymbol{w}Y_{0t}(1) + \rho\boldsymbol{W}\boldsymbol{Y}_{t}^{c}(1) + \boldsymbol{X}_{t}\boldsymbol{\beta} + \boldsymbol{u}_{t}.\) The observed outcome for each \(i\) and \(t\) is \(Y_{it} = Y_{it}(0)\cdot 1\{t\leq T_0\} + Y_{it}(1)\cdot 1\{t > T_0\}\).
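The DGP can be simulated by substituting \(Y_{0t}(0)=\boldsymbol{\alpha}^{\top}\boldsymbol{Y}_{t}^{c}(0)\) into the SAR equation and solving the resulting linear system for \(\boldsymbol{Y}_{t}^{c}(0)\). The sketch below is one way to do this and is not taken from the authors' code: the function name, the seed handling, and the row-major unit numbering are assumptions, and the example uses \(\rho=0.1\) with \(N=16\).

```python
import numpy as np

def simulate_panel(W, w, alpha, rho, T, T0, beta=1.0, seed=0):
    """Simulate observed outcomes under the SAR design; returns (Y0, Yc, xi0)."""
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    A = np.eye(N) - rho * W
    B = A - rho * np.outer(w, alpha)
    Y0, Yc, xi0 = np.empty(T), np.empty((T, N)), np.zeros(T)
    for t in range(T):
        X_t, u_t = rng.standard_normal(N), rng.standard_normal(N)
        e_t = beta * X_t + u_t
        # Untreated outcomes: plug Y_0t(0) = alpha' Y_t^c(0) into the SAR
        # equation and solve for Y_t^c(0).
        Yc_untreated = np.linalg.solve(B, e_t)
        Y0_untreated = alpha @ Yc_untreated
        if t < T0:
            Y0[t], Yc[t] = Y0_untreated, Yc_untreated
        else:
            xi0[t] = rng.normal(1.0, 1.0)      # treatment effect drawn from N(1, 1)
            Y0[t] = Y0_untreated + xi0[t]
            # Controls respond to the treated outcome through rho * w.
            Yc[t] = np.linalg.solve(A, rho * w * Y0[t] + e_t)
    return Y0, Yc, xi0

# Example with N = 16: a 4 x 4 rook grid built from path-graph adjacency P.
r = 4
P = np.eye(r, k=1) + np.eye(r, k=-1)
W = np.kron(np.eye(r), P) + np.kron(P, np.eye(r))
w = np.zeros(r * r)
w[:4] = 1.0
alpha = np.zeros(r * r)
alpha[:4] = [0.5, -0.2, 0.4, 0.4]
alpha[4:10] = 0.1 / 6
Y0, Yc, xi0 = simulate_panel(W, w, alpha, rho=0.1, T=30, T0=20)
```

The returned `xi0` holds the true treatment effects in the post-treatment periods, which are needed later to evaluate bias, RMSE, and coverage.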

Following the DGPs, we conduct 1000 Monte Carlo simulations. For each simulation \(r\;(=1,\cdots,1000)\), we draw \(M\;(=5000)\) samples by MCMC and compute the bias and root mean squared error (RMSE) of the posterior mean of the treatment effect as follows: \[\begin{align} \mathrm{Bias} &= \dfrac{1}{1000}\sum_{r=1}^{1000}\dfrac{1}{T-T_{0}}\sum_{t=T_{0}+1}^{T}\big(\xi_{0t}^{(r)}-\widehat{\xi}_{0t}^{(r)}\big), \\ \mathrm{RMSE} &= \sqrt{\dfrac{1}{1000}\sum_{r=1}^{1000}\dfrac{1}{T-T_{0}}\sum_{t=T_{0}+1}^{T}\big(\xi_{0t}^{(r)}-\widehat{\xi}_{0t}^{(r)}\big)^{2}}, \end{align}\] where \(\xi_{0t}^{(r)}\) denotes the true treatment effect in the \(r\)-th simulation, and \(\widehat{\xi}_{0t}^{(r)}=\sum_{m=1}^{M}\xi_{0t}^{(r,m)}/M\) is the estimated treatment effect, with \(\xi_{0t}^{(r,m)}\) being the estimate of the treatment effect in the \(m\)-th iteration for the \(r\)-th simulation.
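The two performance measures amount to averaging the errors over simulations and post-treatment periods. A minimal sketch, assuming the true and estimated effects are stored as arrays of shape (number of simulations, \(T-T_{0}\)), is given below; the function name is hypothetical. Note that the error is defined as the true effect minus the estimate, matching the sign convention of the formulas above.

```python
import numpy as np

def bias_and_rmse(xi_true, xi_hat):
    """Bias and RMSE over simulations (rows) and post-treatment periods (columns)."""
    err = np.asarray(xi_true, dtype=float) - np.asarray(xi_hat, dtype=float)
    # Every simulation has the same number of periods, so the double average
    # in the formulas equals the mean over all entries.
    return err.mean(), np.sqrt((err ** 2).mean())
```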

We compare the performance of the proposed method (labeled “Proposed”) with those of the standard SCM (labeled “SCM”) [2] and the Bayesian SCM of [7] (labeled “BSCM”).5 We also calculate the coverage rate of the 95% credible interval of the treatment effect for the proposed method.

0.0.5.2 Results

Table 2 presents the simulation results for bias and RMSE. The key finding is that the error in estimating the treatment effect using SCM and BSCM becomes large as the absolute value of \(\rho\) increases. Particularly in the case of a strong positive spatial correlation (\(\rho=0.8\)), SCM and BSCM exhibit substantial biases and large RMSEs. These results indicate that, when spillovers exist, the treatment effects estimated by SCM and BSCM are biased. Conversely, the proposed method exhibits a small bias and RMSE in each simulation scenario. The bias and RMSE of the proposed method do not increase with the magnitude of the spatial correlation. These results indicate that the proposed method is robust to the spillover effects arising from the spatial correlation of the outcomes.

[Table 2 about here.]

Table 3 presents the coverage rate of the \(95\)% credible interval of the treatment effect for the proposed method. In each scenario, the coverage rate is close to 95%, indicating that the proposed inference method performs adequately. The inference performs well even when the number of pretreatment periods is not very large (\(T_0 = 20\)) and/or the spatial correlation is strong (i.e., \(|\rho|\) is large).
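A coverage rate of this kind can be computed from the posterior draws as in the sketch below. The paper does not state how the credible intervals are constructed or how coverage is aggregated, so two assumptions are made here: the intervals are equal-tailed, and coverage is evaluated for each post-treatment period and then averaged over periods and simulations. The function name and array layout are also our own.

```python
import numpy as np

def coverage_rate(xi_true, draws, level=0.95):
    """Share of true effects that fall inside equal-tailed credible intervals.

    draws: (R, M, P) posterior draws for R simulations, M draws, P periods.
    xi_true: (R, P) true treatment effects.
    """
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail], axis=1)
    xi_true = np.asarray(xi_true, dtype=float)
    return np.mean((lower <= xi_true) & (xi_true <= upper))
```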

[Table 3 about here.]

0.0.6 Empirical Application↩︎

We conduct two empirical studies applying the proposed method: one estimates the impact of the California tobacco tax on consumption [2], and the other estimates the economic impact of the 2011 division of Sudan on GDP per capita. This section presents the results of the two empirical studies.

0.0.6.1 Application I: California Tobacco Tax

In this section, we apply the proposed method to estimate the effect of the California cigarette tax (Proposition 99) on consumption [2]. Proposition 99 is an anti-tobacco law enacted in California in 1988 to promote awareness of the health risks associated with tobacco. It increased the excise tax on cigarettes in California by 25 cents per pack. [2] estimate the treatment effect of Proposition 99 on cigarette sales by SCM without accounting for spillovers. Applying the proposed method, we estimate both the treatment and spillover effects of Proposition 99.

0.0.6.1.1 Data

We use annual cigarette sales data (per state) for the U.S. from 1970 to 2000, as used in [2].6 The outcome of interest is state-level annual per capita cigarette consumption. We compare the proposed method with the standard SCM [2] (labeled as “SCM”), which is not robust to the existence of spillover effects. For SCM, we use the synthetic weights estimated by [2].7

In the estimation using the proposed method, the average retail cigarette prices for each year \(t\) are included as covariates \(\boldsymbol{X}_{t}\). As for the spatial weights \(w_{ij}\) in the SAR model (3), we use the adjacency weights of the states in the dataset. We also normalize the spatial weights matrix \((\boldsymbol{w},\boldsymbol{W})\) such that each row of \((\boldsymbol{w},\boldsymbol{W})\) sums to one.
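The row normalization of \((\boldsymbol{w},\boldsymbol{W})\) can be sketched as follows. The function name is hypothetical, and the handling of units without any neighbor (their rows are left at zero) is our own assumption, since the text does not discuss this case.

```python
import numpy as np

def row_normalize(w, W):
    """Scale the N x (N+1) matrix (w, W) so that each row sums to one."""
    full = np.column_stack([w, W]).astype(float)
    sums = full.sum(axis=1, keepdims=True)
    sums[sums == 0] = 1.0        # assumption: rows without neighbors stay zero
    full /= sums
    return full[:, 0], full[:, 1:]
```

For example, a control state bordering the treated state and one other control state receives a weight of 0.5 on each.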

0.0.6.1.2 Results

Figure 1 shows the estimation results for the synthetic control outcomes for California and the treatment effects of the California tobacco tax on consumption, where the posterior means for the proposed method are shown. Figure 1 (a) illustrates that both the SCM and the proposed method fit well with per-capita cigarette sales in California during the pretreatment periods. As shown in Figure 1 (b), each estimation method suggests that the California cigarette tax (Proposition 99) decreased cigarette consumption in California. The proposed method exhibits higher estimates of treatment effects compared to the SCM. The 90% credible interval for the proposed method is negative overall for any year following the intervention, which began in 1988. This finding suggests that Proposition 99 negatively impacted cigarette sales for a decade, supporting the findings of [2].

[Figure 1 about here]

Figure 2 shows the estimates of spillover effects for all states in the control group. The results suggest that the California tobacco tax reduced tobacco consumption in many other states. The particularly affected states are Louisiana, Texas, Mississippi, and Oklahoma, all of which are located in the Southern U.S.

[Figure 2 about here]

0.0.6.2 Application II: The Economic Cost of the 2011 Sudan Split

In this section, we assess the impact of Sudan’s north-south split in 2011 on GDP per capita in the Sudans (the region of the former united Sudan) and other African countries.

0.0.6.2.1 Background

Sudan has long been divided along ethnic and religious lines, with Arabs (primarily Muslims) predominantly in the north and Africans (primarily Christians) in the south, leading to many conflicts. Particularly, the Darfur conflict, driven by the Arab versus non-Arab ethnic divide, has persisted for many years in western Sudan. This conflict has been marked by large-scale atrocities, including mass killings carried out by Arab militias known as “Janjaweed.”

Amid these unending conflicts, the Sudanese government and the Sudan People’s Liberation Army (SPLA), the main rebel force in Southern Sudan, signed the Comprehensive Peace Agreement (CPA) in 2005. This agreement, aimed at ending Sudan’s civil wars, allowed South Sudan to establish its own government, achieve autonomy, and pursue independence through a referendum. Following the CPA, South Sudan voted for independence in January 2011 and was officially recognized as a nation on July 9, 2011.

Since 2011, South Sudan’s independence has caused several economic disturbances in both Sudan and South Sudan. In particular, South Sudan’s oil production shutdown in 2012 and its relapse into conflict in 2013 provoked a severe macroeconomic crisis in the region [34]. This study estimates the economic impact of Sudan’s south-north split, with a focus on GDP per capita in the Sudans (the region of the former united Sudan) and other African countries.8

0.0.6.2.2 Data

We use data from African countries obtained from the World Bank DataBank. Our outcome of interest is “GDP per capita (constant 2015 US$)” post-division. The covariates we use include “exports of goods and services (% of GDP)”, “merchandise trade (% of GDP)”, “access to electricity (% of population)”, “inflation measured by the consumer price index (annual %)”, “net migration”, and “trade (% of GDP)”. Countries with missing values for the outcome or covariates were excluded. We focus on GDP per capita in the Sudans (the region of the former united Sudan) as the outcome \(Y_{0t}\) of the treated unit.9 The control group comprises \(N=29\) African countries with complete observations from 2000 to 2015.10 Since South Sudan’s independence occurred in July 2011, we set pre-treatment periods to be 2000 to 2010 and post-treatment periods to be 2011 to 2015. We cannot compute GDP per capita for the Sudans in 2011 because of incomplete data caused by the Sudan split; however, this does not affect the estimation of synthetic control outcomes in the pre-treatment periods or the estimation of treatment effects from 2012 onward.

Regarding the spatial weights in the model (3), we specify \(w_{ij}\) as the share of country \(j\) in the average international trade of country \(i\): \[\begin{align} w_{ij} &= \dfrac{\text{the average amount of trade between countries } i \text{ and } j}{\sum_{j'}\text{the average amount of trade between countries } i \text{ and } j'}, \end{align}\] where the trade data come from the IMF and the average amounts of trade are computed from the data of the pre-intervention periods. Each weight \(w_{ij}\), measured by the amount of trade, reflects the strength of the economic connection between the two countries. Countries with stronger economic connections to the former united Sudan are expected to experience stronger spillover effects from the Sudan split.

0.0.6.2.3 Results

Figure 3 illustrates the estimation results for the synthetic outcomes of the Sudans and the treatment effects of the Sudan split. For each method, the estimates indicate a negative impact of southern independence on GDP per capita in the Sudans. In particular, the estimation results from the proposed method show that GDP per capita decreased by about 100 USD in 2012, which represents approximately \(7.8\%\) of the GDP per capita for that year. In 2015, the estimated GDP per capita in the synthetic Sudan is about \(9.5\%\) higher than the actual GDP per capita. In addition, the cumulative losses in the Sudans are estimated to have reached \(34\%\) from 2011 to 2015.11

[Figure 3 about here]

Figure 4 (a) shows the estimated spillover effects of the Sudan split on other African countries. Countries with significant trade volumes with the former united Sudan, such as Egypt and Kenya, experienced substantial negative spillover effects from the split.

The political instability and north-south split in Sudan significantly altered its industrial structure, which, in turn, profoundly impacted trade. For example, while South Sudan is rich in oil and other natural resources, the conflict between the North and South Sudans halted oil production in South Sudan in 2012, resulting in the loss of crucial export commodities. This disruption extended to countries with close economic ties through trade channels. Thus, the north-south split in Sudan inflicted negative economic impacts on the Sudans themselves and caused negative spillover effects on other countries. This empirical study illustrates that political and economic changes in one country can potentially affect other countries with close economic ties.

[Figure 4 (a) about here]

0.0.7 Conclusion↩︎

This study extends SCM to allow for spillover effects. While SCM is often applied to spatial data that may involve spillovers, conventional SCM relies on SUTVA and can produce biased estimates of treatment effects when spillovers are present. We propose a novel SCM that leverages the SAR panel data model to address these spillover effects. We also introduce a Bayesian inference method for estimating both treatment and spillover effects, while employing horseshoe priors for regularization. We apply the method to two empirical studies: evaluating the impact of the California tobacco tax on its consumption [2] and assessing the economic impact of the 2011 Sudan division on GDP per capita. The first study demonstrates the negative impact of the tax on tobacco consumption in California and other US states, while the second study shows that the Sudan split reduced GDP per capita in the Sudans and negatively affected other African countries with strong economic connections to the former united Sudan.

Acknowledgments↩︎

We thank Ryo Okui, Yasuyuki Sawada, Kaoru Irie, and participants in various seminars and workshops for valuable comments. The authors gratefully acknowledge the financial support from JSPS KAKENHI Grant (number 24K16342).

Tables↩︎

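Table 1: Simulation Results for Bias and RMSE

Panel (a): \(T_{0}=20\)

| Metric | \(\rho\) | Proposed (\(N=16\)) | SCM (\(N=16\)) | BSCM (\(N=16\)) | Proposed (\(N=36\)) | SCM (\(N=36\)) | BSCM (\(N=36\)) | Proposed (\(N=64\)) | SCM (\(N=64\)) | BSCM (\(N=64\)) |
|---|---|---|---|---|---|---|---|---|---|---|
| Bias | -0.8 | 0.003 | 0.875 | 1.022 | 0.002 | 0.904 | 1.422 | 0.002 | 0.752 | 1.410 |
| Bias | -0.3 | 0.004 | 0.404 | 0.421 | 0.000 | 0.366 | 0.490 | -0.002 | 0.307 | 0.484 |
| Bias | -0.1 | 0.007 | 0.144 | 0.150 | 0.003 | 0.113 | 0.166 | 0.003 | 0.101 | 0.165 |
| Bias | 0.0 | 0.009 | -0.006 | 0.000 | 0.004 | -0.004 | 0.000 | 0.004 | -0.010 | 0.000 |
| Bias | 0.1 | 0.011 | -0.156 | -0.163 | 0.003 | -0.145 | -0.173 | 0.002 | -0.139 | -0.171 |
| Bias | 0.3 | 0.011 | -0.557 | -0.550 | 0.009 | -0.482 | -0.556 | 0.002 | -0.449 | -0.549 |
| Bias | 0.8 | 0.061 | -2.642 | -2.613 | 0.018 | -2.051 | -2.109 | 0.007 | -2.002 | -2.100 |
| RMSE | -0.8 | 0.350 | 1.274 | 1.043 | 0.053 | 1.354 | 1.449 | 0.057 | 1.330 | 1.440 |
| RMSE | -0.3 | 0.067 | 0.871 | 0.429 | 0.066 | 0.923 | 0.501 | 0.064 | 0.987 | 0.497 |
| RMSE | -0.1 | 0.085 | 0.738 | 0.153 | 0.069 | 0.806 | 0.174 | 0.075 | 0.869 | 0.179 |
| RMSE | 0.0 | 0.091 | 0.713 | 0.000 | 0.076 | 0.783 | 0.038 | 0.070 | 0.832 | 0.052 |
| RMSE | 0.1 | 0.098 | 0.714 | 0.166 | 0.079 | 0.758 | 0.180 | 0.075 | 0.822 | 0.183 |
| RMSE | 0.3 | 0.132 | 0.878 | 0.561 | 0.094 | 0.879 | 0.568 | 0.086 | 0.895 | 0.562 |
| RMSE | 0.8 | 0.350 | 2.777 | 2.664 | 0.180 | 2.188 | 2.150 | 0.167 | 2.150 | 2.142 |

Panel (b): \(T_{0}=50\)

| Metric | \(\rho\) | Proposed (\(N=16\)) | SCM (\(N=16\)) | BSCM (\(N=16\)) | Proposed (\(N=36\)) | SCM (\(N=36\)) | BSCM (\(N=36\)) | Proposed (\(N=64\)) | SCM (\(N=64\)) | BSCM (\(N=64\)) |
|---|---|---|---|---|---|---|---|---|---|---|
| Bias | -0.8 | -0.002 | 0.962 | 1.023 | -0.002 | 1.109 | 1.456 | -0.001 | 1.036 | 1.461 |
| Bias | -0.3 | 0.001 | 0.443 | 0.423 | -0.001 | 0.444 | 0.500 | -0.001 | 0.427 | 0.499 |
| Bias | -0.1 | 0.002 | 0.158 | 0.149 | 0.001 | 0.152 | 0.170 | 0.001 | 0.146 | 0.170 |
| Bias | 0.0 | 0.002 | 0.002 | 0.000 | 0.003 | 0.000 | 0.000 | -0.001 | 0.001 | 0.000 |
| Bias | 0.1 | 0.009 | -0.175 | -0.162 | 0.004 | -0.163 | -0.175 | 0.002 | -0.163 | -0.176 |
| Bias | 0.3 | 0.008 | -0.578 | -0.551 | 0.003 | -0.531 | -0.559 | 0.002 | -0.526 | -0.558 |
| Bias | 0.8 | 0.041 | -2.665 | -2.611 | 0.015 | -2.116 | -2.118 | 0.009 | -2.111 | -2.118 |
| RMSE | -0.8 | 0.021 | 1.274 | 1.044 | 0.026 | 1.401 | 1.485 | 0.020 | 1.358 | 1.491 |
| RMSE | -0.3 | 0.041 | 0.826 | 0.431 | 0.035 | 0.837 | 0.510 | 0.026 | 0.838 | 0.509 |
| RMSE | -0.1 | 0.052 | 0.671 | 0.152 | 0.039 | 0.692 | 0.173 | 0.028 | 0.693 | 0.173 |
| RMSE | 0.0 | 0.057 | 0.638 | 0.000 | 0.041 | 0.648 | 0.000 | 0.028 | 0.665 | 0.000 |
| RMSE | 0.1 | 0.062 | 0.655 | 0.166 | 0.044 | 0.671 | 0.179 | 0.033 | 0.670 | 0.179 |
| RMSE | 0.3 | 0.081 | 0.854 | 0.562 | 0.052 | 0.821 | 0.570 | 0.038 | 0.829 | 0.569 |
| RMSE | 0.8 | 0.215 | 2.777 | 2.663 | 0.112 | 2.233 | 2.160 | 0.086 | 2.230 | 2.160 |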

Notes: Panels (a) and (b) show the simulation results for the bias and RMSE for \(T_0=20\) and \(50\), respectively. Each panel shows the simulation results for each of the proposed method, SCM, and BSCM, and each of \(\rho \in \{-0.8,-0.3, -0.1, 0.0, 0.1, 0.3,0.8\}\) and \(N \in \{16,36,64\}\).

Table 2: Coverage Rate of 95% Credible Interval for the Proposed Method

| \(\rho\) | \(T_{0}=20\), \(N=16\) | \(T_{0}=20\), \(N=36\) | \(T_{0}=20\), \(N=64\) | \(T_{0}=50\), \(N=16\) | \(T_{0}=50\), \(N=36\) | \(T_{0}=50\), \(N=64\) |
|---|---|---|---|---|---|---|
| -0.8 | 0.955 | 0.963 | 0.961 | 0.960 | 0.951 | 0.939 |
| -0.3 | 0.936 | 0.962 | 0.967 | 0.956 | 0.943 | 0.950 |
| -0.1 | 0.938 | 0.975 | 0.975 | 0.947 | 0.951 | 0.951 |
| 0.0 | 0.951 | 0.970 | 0.975 | 0.950 | 0.943 | 0.968 |
| 0.1 | 0.954 | 0.968 | 0.975 | 0.952 | 0.950 | 0.944 |
| 0.3 | 0.949 | 0.965 | 0.975 | 0.961 | 0.953 | 0.943 |
| 0.8 | 0.947 | 0.963 | 0.979 | 0.970 | 0.941 | 0.952 |

Notes: This table shows the coverage rate of \(95\)% credible interval for the proposed method for each scenario of \(\rho\), \(T_0\), and \(N\). The coverage rate is computed over 1000 simulations.

Figures↩︎

Figure 1: Estimates of the Counterfactual Outcomes and Treatment Effects of California Tobacco Tax
Figure 2: Estimates of the Spillover Effects of California Tobacco Tax
Figure 3: Estimates of the Counterfactual Outcomes and Treatment Effects of Sudan Split
Figure 4: (a) Estimates of the Spillover Effects of Sudan Split

Appendix↩︎

To derive full conditional distributions, we use the following proposition:

Proposition 1 ([35]). \[\require{physics} \begin{align} X\mid a \sim \text{Half-Cauchy}(0,a) \Longleftrightarrow \begin{cases} X^{2}\mid b \sim \text{Inverse-Gamma}\qty(\dfrac{1}{2},\dfrac{1}{b}) \\ b\mid a \sim \text{Inverse-Gamma}\qty(\dfrac{1}{2},\dfrac{1}{a^{2}}) \end{cases} \end{align}\]

This proposition implies that a half-Cauchy distribution can be represented hierarchically as a mixture of inverse-gamma distributions by introducing an auxiliary variable \(b\).
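Proposition 1 can be checked numerically by sampling from the two-stage inverse-gamma hierarchy and comparing the result with a known property of the half-Cauchy distribution. The sketch below (with a hypothetical function name) uses the fact that if \(G\sim\text{Gamma}(s,1)\), then \(c/G\sim\text{Inverse-Gamma}(s,c)\), and that the median of \(\text{Half-Cauchy}(0,a)\) equals \(a\).

```python
import numpy as np

def half_cauchy_via_mixture(a, size, rng):
    """Draw from Half-Cauchy(0, a) through the inverse-gamma hierarchy."""
    b = (1.0 / a ** 2) / rng.gamma(0.5, size=size)   # b | a ~ IG(1/2, 1/a^2)
    x2 = (1.0 / b) / rng.gamma(0.5, size=size)       # X^2 | b ~ IG(1/2, 1/b)
    return np.sqrt(x2)

rng = np.random.default_rng(0)
x = half_cauchy_via_mixture(a=2.0, size=200_000, rng=rng)
sample_median = np.median(x)   # should be close to a = 2
```

This representation is what makes all hyperparameter updates in the sampler conditionally inverse-gamma.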

0.0.8 Synthetic Weights↩︎

By using auxiliary variables, we can derive the full conditional distributions as follows: \[\require{physics} \begin{align} \boldsymbol{\alpha} \mid \text{rest} & \sim \mathcal{N}_{N}\big(\boldsymbol{A}^{-1}(\boldsymbol{Y}^{c}(0))^{\top}\boldsymbol{Y}_{0}(0),\;\sigma^{2}_{1}\boldsymbol{A}^{-1}\big), \\ \text{where}\;\;\boldsymbol{A} & = (\boldsymbol{Y}^{c}(0))^{\top}\boldsymbol{Y}^{c}(0) + \sigma^{2}_{1}\mathrm{diag}(1/\lambda^{2}_{1},1/\lambda^{2}_{2},\cdots,1/\lambda^{2}_{N})\in\mathbb{R}^{N\times N}, \\ \lambda^{2}_{i}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{\alpha_{i}^{2}}{2} + \dfrac{1}{\nu_{\lambda}}),\;\;\text{for}\;i=1,2,\cdots,N, \\ \nu_{\lambda}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\sum_{i=1}^{N}\dfrac{1}{\lambda^{2}_{i}}+\dfrac{1}{\tau^{2}}), \\ \tau^{2}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\nu_{\lambda}} + \dfrac{1}{\nu_{\tau}}), \\ \nu_{\tau}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\tau^{2}}+\dfrac{1}{\sigma_{1}^{2}}), \\ \sigma^{2}_{1}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1+\dfrac{T_{0}}{2},\;\dfrac{1}{\nu_{\tau}}+\dfrac{1}{\nu_{\sigma_{1}}}+\dfrac{1}{2}(\boldsymbol{Y}_{0}(0)-\boldsymbol{\alpha}^{\top}\boldsymbol{Y}^{c}(0))^{\top}(\boldsymbol{Y}_{0}(0)-\boldsymbol{\alpha}^{\top}\boldsymbol{Y}^{c}(0))), \\ \nu_{\sigma_{1}}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\sigma_{1}^{2}}+\dfrac{1}{10^{2}}). \end{align}\]
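To illustrate how these conditionals are used, the sketch below transcribes the updates of \(\boldsymbol{\alpha}\) and \(\lambda_{i}^{2}\) from the display above into one partial Gibbs sweep, holding \(\sigma_{1}^{2}\) and \(\nu_{\lambda}\) fixed. It is not the authors' code: the function names are hypothetical, \(\boldsymbol{Y}^{c}(0)\) is assumed to be stored as a \(T_{0}\times N\) array, and the remaining hyperparameter updates are omitted for brevity. The matrix \(\boldsymbol{A}\) here is the posterior precision-type matrix of this subsection, not \(\boldsymbol{I}_{N}-\rho\boldsymbol{W}\).

```python
import numpy as np

def alpha_conditional(Yc, Y0, lam2, sigma2):
    """Mean and covariance of the Gaussian full conditional of alpha."""
    A = Yc.T @ Yc + sigma2 * np.diag(1.0 / lam2)
    A_inv = np.linalg.inv(A)
    cov = sigma2 * A_inv
    return A_inv @ Yc.T @ Y0, (cov + cov.T) / 2.0   # symmetrize for stability

def draw_inverse_gamma(shape, scale, rng):
    """If G ~ Gamma(shape, rate=scale), then 1/G ~ Inverse-Gamma(shape, scale)."""
    return 1.0 / rng.gamma(shape, 1.0 / scale)

def gibbs_sweep(lam2, nu_lam, Yc, Y0, sigma2, rng):
    """Update alpha, then the local scales lambda_i^2, holding the rest fixed."""
    mean, cov = alpha_conditional(Yc, Y0, lam2, sigma2)
    alpha = rng.multivariate_normal(mean, cov)
    lam2 = draw_inverse_gamma(1.0, alpha ** 2 / 2.0 + 1.0 / nu_lam, rng)
    return alpha, lam2
```

When every \(\lambda_{i}^{2}\) is very large, the penalty term vanishes and the conditional mean of \(\boldsymbol{\alpha}\) approaches the least-squares coefficients, which provides a quick sanity check.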

0.0.9 Spatial Autoregressive Panel Data Model↩︎

For the remaining parameters, the full conditional distributions implied by their priors are as follows.

  1. Full conditionals for \(\boldsymbol{\beta}\) and its shrinkage parameters: \[\require{physics} \begin{align} \boldsymbol{\beta} \mid \text{rest} & \sim \mathcal{N}_{k}(\boldsymbol{A}_{\beta}^{-1}\tilde{u},\;\sigma^{2}_{2}\boldsymbol{A}_{\beta}^{-1}) \\ \text{where}\;\;\boldsymbol{A}_{\beta} & =\boldsymbol{X}^{\top}\boldsymbol{X} + \sigma^{2}_{2}\mathrm{diag}(1/\lambda_{\beta_{1}}^{2},1/\lambda_{\beta_{2}}^{2},\cdots,1/\lambda_{\beta_{k}}^{2}) \\ \lambda_{\beta_{j}}^{2}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{\beta_{j}^{2}}{2}+\dfrac{1}{\nu_{\lambda_{\beta}}}),\;\;\text{for}\;j=1,2,\cdots,k \\ \nu_{\lambda_{\beta}}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\tau^{2}_{\beta}} + \sum_{j=1}^{k}\dfrac{1}{\lambda^{2}_{\beta_{j}}}) \\ \tau^{2}_{\beta} \mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\nu_{\lambda_{\beta}}} + \dfrac{1}{\nu_{\tau_{\beta}}}) \\ \nu_{\tau_{\beta}}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\tau^{2}_{\beta}} + \dfrac{1}{\sigma_{2}^{2}}). \end{align}\]

  2. Full conditionals for \(\phi_{\gamma}\) and \(\sigma^{2}_{\gamma}\): \[\require{physics} \begin{align} \phi_{\gamma}\mid\text{rest} & \sim N\qty(\dfrac{\sum_{t=1}^{T_{0}}\boldsymbol{\gamma}_{t-1}^{\top}\boldsymbol{\gamma}_{t}}{\sum_{t=1}^{T_{0}}\boldsymbol{\gamma}_{t-1}^{\top}\boldsymbol{\gamma}_{t-1}},\;\dfrac{\sigma^{2}_{\gamma}}{\sum_{t=1}^{T_{0}}\boldsymbol{\gamma}_{t-1}^{\top}\boldsymbol{\gamma}_{t-1}}), \\ \sigma^{2}_{\gamma}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(\dfrac{1}{2}+\dfrac{T_{0}}{2},\;\dfrac{1}{\nu_{\sigma_{\gamma}}} + \dfrac{1}{2}\sum_{t=1}^{T_{0}}(\boldsymbol{\gamma}_{t}-\phi_{\gamma}\boldsymbol{\gamma}_{t-1})^{\top}(\boldsymbol{\gamma}_{t}-\phi_{\gamma}\boldsymbol{\gamma}_{t-1})), \\ \nu_{\sigma_{\gamma}}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\sigma^{2}_{\gamma}}+\dfrac{1}{10^{2}}). \end{align}\] After drawing samples from the above full conditional distributions, we update \(\boldsymbol{\gamma}_{t}=\phi_{\gamma}\boldsymbol{\gamma}_{t-1}+\boldsymbol{v}_{t}\).

  3. Full conditionals for \(\boldsymbol{\eta}^{0}\), \(\sigma^{2}_{\eta}\), and \(\omega_{1}^{2},\cdots,\omega_{p}^{2}\): \[\require{physics} \begin{align} \underset{p\times 1}{\boldsymbol{\eta}_{i}} \mid\text{rest} & \sim \mathcal{N}_{p}\qty(\boldsymbol{C}^{-1}\sum_{t=1}^{T_{0}}\boldsymbol{\gamma}_{t}^{\top}(\boldsymbol{Y}^{c}(0)-u_{it}),\;\sigma^{2}_{2}\boldsymbol{C}^{-1}),\;\;\text{for}\;i=1,2,\cdots,N \\ \text{where}\;\;\boldsymbol{C} & = \sum_{t=1}^{T_{0}}\boldsymbol{\gamma}_{t}\boldsymbol{\gamma}_{t}^{\top} + \sigma^{2}_{2}\sigma^{-2}_{\eta}\boldsymbol{\Sigma}_{\eta}^{-1}, \\ \text{and}\;\;u_{it} & = \rho w_{i} Y_{0t}(0)+\rho\sum_{j=1}^{N}W_{ij}Y_{jt}(0) + \boldsymbol{X}_{it}\boldsymbol{\beta}_{0} \\ \sigma^{2}_{\eta}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(\dfrac{1}{2}+\dfrac{pN}{2},\;\dfrac{1}{\nu_{\sigma_{\eta}}}+\dfrac{1}{2}\sum_{i=1}^{N}\boldsymbol{\eta}_{i}^{0\top}\boldsymbol{\Sigma}_{\eta}^{-1}\boldsymbol{\eta}^{0}_{i}) \\ \omega_{j}^{2}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(\dfrac{1}{2}+\dfrac{N}{2},\;\dfrac{1}{\nu_{\omega_{j}}} + \dfrac{1}{2}\sum_{i=1}^{N}\dfrac{(\eta_{ij}^{0})^{2}}{\sigma^{2}_{\eta}}),\;\;\text{for}\;j=1,2,\cdots,p \\ \nu_{\omega_{j}}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\omega_{j}^{2}}+\dfrac{1}{10^{2}}),\;\;\text{for}\;j=1,2,\cdots,p. \end{align}\]

  4. Prior of \(\sigma^{2}_{2}\): \[\require{physics} \begin{align} \sigma^{2}_{2}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1 + \dfrac{NT_{0}}{2},\;\dfrac{1}{\nu_{\beta}} + \dfrac{1}{\nu_{\sigma_{2}}} +\sum_{t=1}^{T_{0}}\boldsymbol{u}_{t}^{\top}\boldsymbol{u}_{t}) \\ \text{where}\;\;\boldsymbol{u}_{t} & = \boldsymbol{Y}_{t}^{c}(0)-\rho\boldsymbol{w}Y_{0t}(0)-\rho\boldsymbol{W}\boldsymbol{Y}_{t}^{c}(0)-\boldsymbol{X}_{t}\boldsymbol{\beta}_{0}-\boldsymbol{\eta}^{0}\boldsymbol{\gamma}_{t} \\ \nu_{\sigma_{2}}\mid\text{rest} & \sim \text{Inverse-Gamma}\qty(1,\;\dfrac{1}{\sigma^{2}_{2}} + \dfrac{1}{10^{2}}). \end{align}\]

  5. Regarding \(\rho\), because we cannot analytically derive the full conditional distribution, we use the Metropolis algorithm:

    1. Let \(\rho^{(c)}\) be the current value. Generate \(\rho^{*}\) from \(N(\rho^{(c)},k^{2})\).

    2. We evaluate the conditional density, up to a normalizing constant, at \(\rho^{(c)}\) and \(\rho^{*}\) using \[\begin{align} &p(\rho\mid\text{rest}) \\ &\propto \prod_{t=1}^{T_{0}}|\boldsymbol{I}_{N}-\rho\boldsymbol{W}|\exp(-\dfrac{1}{2\sigma^{2}}\big(\boldsymbol{A}\boldsymbol{Y}_{t}^{c}-\rho\boldsymbol{w}Y_{0t}-\boldsymbol{X}_{t}\boldsymbol{\beta}-\boldsymbol{\eta}^{0}\boldsymbol{\gamma}_{t}\big)^{\top}\big(\boldsymbol{A}\boldsymbol{Y}_{t}^{c}-\rho\boldsymbol{w}Y_{0t}-\boldsymbol{X}_{t}\boldsymbol{\beta}-\boldsymbol{\eta}^{0}\boldsymbol{\gamma}_{t}\big)), \end{align}\] where \(\boldsymbol{A}=\boldsymbol{I}_{N}-\rho\boldsymbol{W}\).

    3. Let \(r=\min\{1,p(\rho^{*}\mid\text{rest})/p(\rho^{(c)}\mid\text{rest})\}\) be the acceptance probability. We then generate \(u\sim U(0,1)\) and set \(\rho^{(c)}\) to \(\rho^{*}\) if \(r>u\); otherwise, we keep \(\rho^{(c)}\) unchanged.

    This algorithm follows [24].

References↩︎

[1]
A. Abadie and J. Gardeazabal, “The economic costs of conflict: A case study of the Basque Country,” American Economic Review, vol. 93, no. 1, pp. 113–132, 2003.
[2]
A. Abadie, A. Diamond, and J. Hainmueller, “Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program,” Journal of the American Statistical Association, vol. 105, no. 490, pp. 493–505, 2010.
[3]
S. Athey and G. W. Imbens, “The state of applied econometrics: Causality and policy evaluation,” Journal of Economic Perspectives, vol. 31, no. 2, pp. 3–32, 2017.
[4]
A. Abadie, A. Diamond, and J. Hainmueller, “Comparative politics and the synthetic control method,” American Journal of Political Science, vol. 59, no. 2, pp. 495–510, 2015.
[5]
R. Bifulco, R. Rubenstein, and H. Sohn, “Using synthetic controls to evaluate the effect of unique interventions: The case of say yes to education,” Evaluation Review, vol. 41, no. 6, pp. 593–619, 2017.
[6]
D. B. Rubin, “Bayesian inference for causal effects: The role of randomization,” The Annals of Statistics, pp. 34–58, 1978.
[7]
S. Kim, C. Lee, and S. Gupta, “Bayesian synthetic control methods,” Journal of Marketing Research, vol. 57, no. 5, pp. 831–852, 2020.
[8]
C. M. Carvalho, N. G. Polson, and J. G. Scott, “The horseshoe estimator for sparse signals,” Biometrika, vol. 97, no. 2, pp. 465–480, 2010.
[9]
J. Mawejje and P. McSharry, “The economic cost of conflict: Evidence from South Sudan,” Review of Development Economics, vol. 25, no. 4, pp. 1969–1990, 2021.
[10]
A. Abadie, “Using synthetic controls: Feasibility, data requirements, and methodological aspects,” Journal of Economic Literature, vol. 59, no. 2, pp. 391–425, 2021.
[11]
A. Abadie and J. L’hour, “A penalized synthetic control estimator for disaggregated data,” Journal of the American Statistical Association, vol. 116, no. 536, pp. 1817–1834, 2021.
[12]
D. Arkhangelsky, S. Athey, D. A. Hirshberg, G. W. Imbens, and S. Wager, “Synthetic difference-in-differences,” American Economic Review, vol. 111, no. 12, pp. 4088–4118, 2021.
[13]
E. Ben-Michael, A. Feller, and J. Rothstein, “The augmented synthetic control method,” Journal of the American Statistical Association, vol. 116, no. 536, pp. 1789–1803, 2021.
[14]
K. T. Li, “Statistical inference for average treatment effects estimated by synthetic control methods,” Journal of the American Statistical Association, vol. 115, no. 532, pp. 2068–2083, 2020.
[15]
V. Chernozhukov, K. Wüthrich, and Y. Zhu, “An exact and robust conformal inference method for counterfactual and synthetic controls,” Journal of the American Statistical Association, vol. 116, no. 536, pp. 1849–1864, 2021.
[16]
B. Ferman and C. Pinto, “Synthetic controls with imperfect pretreatment fit,” Quantitative Economics, vol. 12, no. 4, pp. 1197–1221, 2021.
[17]
K. H. Brodersen, F. Gallusser, J. Koehler, N. Remy, and S. L. Scott, “Inferring causal impact using Bayesian structural time-series models,” The Annals of Applied Statistics, vol. 9, no. 1, pp. 247–274, 2015.
[18]
X. Pang, L. Liu, and Y. Xu, “A Bayesian alternative to synthetic control for comparative case studies,” Political Analysis, vol. 30, no. 2, pp. 269–288, 2022.
[19]
D. Klinenberg, “Synthetic control with time varying coefficients: A state space approach with Bayesian shrinkage,” Journal of Business & Economic Statistics, vol. 41, no. 4, pp. 1065–1076, 2023.
[20]
J. Cao and C. Dowd, “Estimation and inference for synthetic control methods with spillover effects,” arXiv preprint arXiv:1902.07343, 2019.
[21]
F. Menchetti and I. Bojinov, “Estimating the effectiveness of permanent price reductions for competing products using multivariate Bayesian structural time series models,” The Annals of Applied Statistics, vol. 16, no. 1, pp. 414–435, 2022.
[22]
G. Grossi, P. Lattarulo, M. Mariani, A. Mattei, and O. Oner, “Synthetic control group methods in the presence of interference: The direct and spillover effects of light rail on neighborhood retail activity,” arXiv preprint arXiv:2004.05027, 2020.
[23]
E. Miguel and M. Kremer, “Worms: Identifying impacts on education and health in the presence of treatment externalities,” Econometrica, vol. 72, no. 1, pp. 159–217, 2004.
[24]
J. LeSage and R. K. Pace, Introduction to spatial econometrics. Chapman and Hall/CRC, 2009.
[25]
B. Fingleton, “A generalized method of moments estimator for a spatial panel model with an endogenous spatial lag and spatial moving average errors,” Spatial Economic Analysis, vol. 3, no. 1, pp. 27–44, 2008.
[26]
L. Lee and J. Yu, “Estimation of spatial autoregressive panel data models with fixed effects,” Journal of Econometrics, vol. 154, no. 2, pp. 165–185, 2010.
[27]
L. Su, “Semiparametric GMM estimation of spatial autoregressive models,” Journal of Econometrics, vol. 167, no. 2, pp. 543–560, 2012.
[28]
A. J. Glass, K. Kenjegalieva, and R. C. Sickles, “A spatial autoregressive stochastic frontier model for panel data with asymmetric efficiency spillovers,” Journal of Econometrics, vol. 190, no. 2, pp. 289–300, 2016.
[29]
X. Liang, J. Gao, and X. Gong, “Semiparametric spatial autoregressive panel data model with fixed effects and time-varying coefficients,” Journal of Business & Economic Statistics, vol. 40, no. 4, pp. 1784–1802, 2022.
[30]
T. Park and G. Casella, “The Bayesian lasso,” Journal of the American Statistical Association, vol. 103, no. 482, pp. 681–686, 2008.
[31]
E. Makalic and D. F. Schmidt, “A simple sampler for the horseshoe estimator,” IEEE Signal Processing Letters, vol. 23, no. 1, pp. 179–182, 2015.
[32]
J. P. LeSage, “Bayesian estimation of spatial autoregressive models,” International Regional Science Review, vol. 20, no. 1–2, pp. 113–129, 1997.
[33]
S. Cunningham, Causal inference: The mixtape. Yale University Press, 2021.
[34]
J. Mawejje, The macroeconomic environment for jobs in South Sudan: Jobs, recovery, and peacebuilding in urban South Sudan-technical report II. World Bank, 2020.
[35]
M. P. Wand, J. T. Ormerod, S. A. Padoan, and R. Frühwirth, “Mean field variational Bayes for elaborate distributions,” Bayesian Analysis, vol. 6, no. 4, pp. 847–900, 2011.

  1. Faculty of Economics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.
    Email: sakaguchi@e.u-tokyo.ac.jp.↩︎

  2. Nowcast Inc, Sumitomo RD Kudan Bldg. 9F, 1-8-10 Kudankita, Chiyoda-ku, Tokyo 102-0073, Japan.
    Email: h.tagawa8810@gmail.com.↩︎

  3. SUTVA means that the potential outcome of each unit \(i\) is represented by \(Y_{it}(d_{it})\), which does not depend on the treatment status of others \(\boldsymbol{d}_{t} \backslash d_{it}\).↩︎

  4. We use a rook matrix based on an \(r \times r\) board (so that \(N=r^2\)) to represent the network of \(N\) untreated units. The rook matrix represents a square tessellation in which each inner field of the chessboard has four neighbors, each corner field has two, and each remaining border field has three.↩︎

  5. SCM does not involve MCMC sampling.↩︎

  6. This dataset is available from [33] on GitHub. It has undergone preprocessing as described in [2].↩︎

  7. The estimated values are listed in Table 2 of [2].↩︎

  8. Using the standard SCM, [9] estimate the economic losses in South Sudan, owing to the oil production halt in 2012, finding a nearly 70% loss in per capita real GDP from 2012 to 2018. They excluded neighboring countries of South Sudan from their SCM analysis to avoid bias caused by spillovers; however, this practice can result in a poorly fitting synthetic control, making the perfect-fit assumption (Assumption 1) less plausible and potentially causing additional bias.↩︎

  9. The GDP per capita in the Sudans after the Sudan split is calculated by dividing the sum of GDP in North and South Sudan by the sum of their populations. Before the split, this measure corresponds to the GDP per capita in (the united) Sudan.↩︎

  10. The list of countries used in the control group is as follows: Algeria, Angola, Benin, Botswana, Burundi, Cameroon, Central African Republic, Chad, Egypt, Gabon, Ghana, Ivory Coast, Kenya, Madagascar, Mali, Mauritania, Mauritius, Morocco, Niger, Nigeria, Republic of the Congo, Rwanda, Senegal, South Africa, Tanzania, Togo, Tunisia, Uganda, Zambia.↩︎

  11. The losses in the Sudans are defined by \(100\times(\xi_{0t}/Y_{0t}(0))\) (%) for each \(t>T_{0}\) and those in control countries are defined by \(100\times\xi_{it}/Y_{it}(0)\) (%) for each \(i=1,2,\cdots,N\) and \(t>T_{0}\). The cumulative losses are computed by \(100\times \sum_{t=2011}^{2015}\xi_{it}/Y_{it}(0)\) for each \(i=0,1,\cdots,29\). We estimate these by the posterior means of these losses.↩︎