Yong Cai's Webpage

Yong Cai

Hi! I'm a an Assistant Professor at the University of Wisconsin-Madison. I often go by "Chai".

My research interests lie in theoretical and applied econometrics.

CV | Email

Working Papers

Yes, Standing Committee: Majority Rule in Non-Democracies

With Haokun Sun.

Abstract

The Chinese Communist Party (CCP) espouses a principle of collective leadership, in which the Politburo Standing Committee (PSC) makes important decisions by consensus. However, it is not known whether such a majority rule is employed in practice. This paper studies the appointment of party cadres into positions of power as a means of uncovering a general decision-making mechanism within the CCP. We provide reduced-form results showing that appointments are decided by the PSC, who selectively promote candidates in their social networks. This motivates a novel model of collective leadership in which PSC members play a coalition game to promote their preferred candidates. The majority rule is represented by a minimum constraint on the size of winning coalitions. Estimating our model, we show that appointments to positions above the vice-provincial minister level requires support from 75% of the PSC members on average. This cutoff varies depending on the President in power, ranging from 50% under Deng to 80% under Jiang and Hu. Estimating political factions using modularity clustering, we find that factional penalties operate in parallel to the majority rule. Our method can be useful for understanding decision-making in non-democracies more generally.

Regression Discontinuity Design with Spillovers

With Eric Auerbach and Ahnaf Rafi.

Abstract

Researchers who estimate treatment effects using a regression discontinuity design (RDD) typically assume that there are no spillovers between the treated and control units. This may be unrealistic. We characterize the estimand of RDD in a setting where spillovers occur between units that are close in their values of the running variable. Under the assumption that spillovers are linear-in-means, we show that the estimand depends on the ratio of two terms: (1) the radius over which spillovers occur and (2) the choice of bandwidth used for the local linear regression. Specifically, RDD estimates direct treatment effect when radius is of larger order than the bandwidth, and total treatment effect when radius is of smaller order than the bandwidth. In the more realistic regime where radius is of similar order as the bandwidth, the RDD estimand is a mix of the above effects. To recover direct and spillover effects, we propose incorporating estimated spillover terms into local linear regression -- the local analog of peer effects regression. We also clarify the settings under which the donut-hole RD is able to eliminate the effects of spillovers.

Linear Regression with Centrality Measures

Abstract

This paper studies the properties of linear regression on centrality measures when network data is sparse -- that is, when there are many more agents than links per agent -- and when they are measured with error. We make three contributions in this setting: (1) We show that OLS estimators can become inconsistent under sparsity and characterize the threshold at which this occurs, with and without measurement error. This threshold depends on the centrality measure used. Specifically, regression on eigenvector is less robust to sparsity than on degree and diffusion. (2) We develop distributional theory for OLS estimators under measurement error and sparsity, finding that OLS estimators are subject to asymptotic bias even when they are consistent. Moreover, bias can be large relative to their variances, so that bias correction is necessary for inference. (3) We propose novel bias correction and inference methods for OLS with sparse noisy networks. Simulation evidence suggests that our theory and methods perform well, particularly in settings where the usual OLS estimators and heteroskedasticity-consistent/robust t-tests are deficient. Finally, we demonstrate the utility of our results in an application inspired by De Weerdt and Dercon (2006), in which we study the relationship between consumption smoothing and informal insurance in Nyakatoke, Tanzania.

Identifying Socially Disruptive Policies

With Eric Auerbach.
[Online Appendix] | [R Package]

Abstract

Social disruption occurs when a policy creates or destroys many network connections between agents. It is a costly side effect of many interventions and so a growing empirical literature recommends measuring and accounting for social disruption when evaluating the welfare impact of a policy. However, there is currently little work characterizing what can actually be learned about social disruption from data in practice. In this paper, we consider the problem of identifying social disruption in a research design that is popular in the literature. We provide two sets of identification results. First, we show that social disruption is not generally point identified, but informative bounds can be constructed using the eigenvalues of the network adjacency matrices observed by the researcher. Second, we show that point identification follows from a theoretically motivated monotonicity condition, and we derive a closed form representation. We apply our methods in two empirical illustrations and find large policy effects that otherwise might be missed by alternatives in the literature.

It's not always about the money, sometimes it's about sending a message: Evidence of Informational Content in Monetary Policy Announcements

With Santiago Camara, and Nicholas Capel.

Abstract

This paper introduces a transparent framework to identify the informational content of FOMC announcements. We do so by modelling the expectations of the FOMC and private sector agents using state of the art computational linguistic tools on both FOMC statements and New York Times articles. We identify the informational content of FOMC announcements as the projection of high frequency movements in financial assets onto differences in expectations. Our recovered series is intuitively reasonable and shows that information disclosure has a significant impact on the yields of short-term government bonds.

Panel Data with Unknown Clusters

Abstract

Clustered standard errors and approximate randomization tests are popular inference methods that allow for dependence within observations. However, they require researchers to know the cluster structure ex ante. We propose a procedure to help researchers discover clusters in panel data. Our method is based on thresholding an estimated long-run variance-covariance matrix and requires the panel to be large in the time dimension, but imposes no lower bound on the number of units. We show that our procedure recovers the true clusters with high probability with no assumptions on the cluster structure. The estimated clusters are independently of interest, but they can also be used in the approximate randomization tests or with conventional cluster-robust covariance estimators. The resulting procedures control size and have good power.

Some Finite Sample Properties of the Sign Test

Abstract

This paper contains two finite-sample results concerning the sign test. First, we show that the sign-test is unbiased with independent, non-identically distributed data for both one-sided and two-sided hypotheses. The proof for the two-sided case is based on a novel argument that relates the derivatives of the power function to a regular bipartite graph. Unbiasedness then follows from the existence of perfect matchings on such graphs. Second, we provide a simple theoretical counterexample to show that the sign test over-rejects when the data exhibits correlation. Our results are useful for understanding the properties of approximate randomization tests in settings with few clusters.

Publications

On the Performance of the Neyman Allocation with Small Pilots

Journal of Econometrics (2024)
With Ahnaf Rafi.

Abstract

The Neyman Allocation is used in many papers on experimental design, which typically assume that researchers have access to large pilot studies. This may be unrealistic. To understand the properties of the Neyman Allocation with small pilots, we study its behavior in an asymptotic framework that takes pilot size to be fixed even as the size of the main wave tends to infinity. Our analysis shows that the Neyman Allocation can lead to estimates of the ATE with higher asymptotic variance than with (non-adaptive) balanced randomization. In particular, this happens when the outcome variable is relatively homoskedastic with respect to treatment status or when it exhibits high kurtosis. We provide a series of empirical examples showing that such situations can arise in practice. Our results suggest that researchers with small pilots should not use the Neyman Allocation if they believe that outcomes are homoskedastic or heavy-tailed. Finally, we examine some potential methods for improving the finite sample performance of the FNA via simulations.

A Modified Randomization Test for the Level of Clustering

Journal of Business & Economic Statistics (2024)
[R Package]

Abstract

Suppose a researcher observes individuals within a county within a state. Given concerns about correlation across individuals, at which level should they cluster their observations for inference? This paper proposes a modified randomization test as a robustness check for their chosen specification in a linear regression setting. Existing tests require either the number of states or number of counties to be large. Our method is designed for settings with few states and few counties. While the method is conservative, it has competitive power in settings that may be relevant to empirical work.

On the Implementation of Approximate Randomization Tests in Linear Models with a Small Number of Clusters

Journal of Econometric Methods (2023)
With Ivan Canay, Deborah Kim and Azeem Shaikh.
[Stata Package] | [R Package] | [Replication Files]

Abstract

This paper provides a user’s guide to the general theory of approximate randomization tests developed in Canay et al. (2017a) when specialized to linear regressions with clustered data. An important feature of the methodology is that it applies to settings in which the number of clusters is small – even as small as five. We provide a step-by-step algorithmic description of how to implement the test and construct confidence intervals for the parameter of interest. In doing so, we additionally present three novel results concerning the methodology: we show that the method admits an equivalent implementation based on weighted scores; we show the test and confidence intervals are invariant to whether the test statistic is studentized or not; and we prove convexity of the confidence intervals for scalar parameters. We also articulate the main requirements underlying the test, emphasizing in particular common pitfalls that researchers may encounter.