At the forefront of social science research, novel techniques are being developed that allow researchers to make robust inferences from complex data. These tools and methods rest on proof of their performance, which in turn relies on using the right kind of data to test them. Such tests are hard to conduct well because real social science data is so complex: tests on parametrically simulated data may not comport well with actual social science applications. Conversely, benchmarking on well-known studies leaves researchers unable to determine true performance, since the population parameters are unknown. In this paper we introduce a new solution using synthetic data: a strategy in which the underlying relationships between variables in real-world data are learned by deep generative neural networks, from which an arbitrary number of entirely new but realistic observations can be generated. We demonstrate our method by synthesising realistic-looking, but entirely novel, survey data. We discuss how this synthetic data can be used to benchmark statistical designs and methods, and contribute new open-source software for researchers to use.
Voters keep electing corrupt politicians despite the consequences of corruption. One common explanation is that voters simply lack information on whether candidates are corrupt, yet studies that deliberately provide such information find electoral accountability is weak. Can we root out corrupt politicians? We take a novel approach: first, we employ machine learning to identify widely available political and personal characteristics that are predictive of corrupt practices in Colombia. We then design an experiment that varies the provision of these predictors to study whether voters discriminate corrupt from non-corrupt politicians. Results indicate that the presence of candidate images, the presence of a large donor, and information on candidates' political experience decrease the likelihood of choosing a corrupt politician. Moreover, voters behave differently according to their understanding of what corruption is and their attitudes towards it. In contrast to established findings, we show that, with this novel approach, it is possible for voters to root out corrupt politicians.
Active learning can improve the efficiency of training prediction models by identifying the most informative new labels to acquire. However, non-response to label requests can impact active learning's effectiveness in real-world contexts. We conceptualise this degradation by considering the type of non-response present in the data, demonstrating that biased non-response is particularly detrimental to model performance. We argue that this sort of non-response is particularly likely in contexts where the labelling process, by nature, relies on user interactions. To mitigate the impact of biased non-response, we propose a cost-based correction to the sampling strategy – the Upper Confidence Bound of the Expected Utility (UCB-EU) – that can, plausibly, be applied to any active learning algorithm. Through experiments, we demonstrate that our method successfully reduces the harm from labelling non-response in many settings. However, we also characterise settings where the non-response bias in the annotations remains detrimental under UCB-EU for particular sampling methods and data-generating processes. Finally, we evaluate our method on a real-world dataset from the e-commerce platform Taobao. We show that UCB-EU yields substantial performance improvements to conversion models that are trained on clicked impressions. Most generally, this research serves both to better conceptualise the interplay between types of non-response and model improvements via active learning, and to provide a practical, easy-to-implement correction that mitigates model degradation.
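The paper's exact acquisition rule is not reproduced in the abstract. As a purely hypothetical sketch (the function name, bonus form, and candidate fields below are illustrative assumptions, not the authors' specification), a UCB-style correction might discount a candidate label's expected utility by its estimated response probability and add an exploration bonus for rarely queried items:

```python
import math

def ucb_eu_score(expected_utility, response_prob, n_queries, n_total, c=1.0):
    """Hypothetical UCB-style acquisition score: expected utility of a new
    label, weighted by the chance the request is answered, plus an
    exploration bonus that shrinks as an item is queried more often."""
    bonus = c * math.sqrt(math.log(max(n_total, 2)) / max(n_queries, 1))
    return response_prob * expected_utility + bonus

# Prefer informative points that are also likely to be labelled.
candidates = [
    {"id": "a", "eu": 0.9, "p_resp": 0.2, "queries": 10},  # informative, rarely answered
    {"id": "b", "eu": 0.5, "p_resp": 0.9, "queries": 10},  # less informative, reliably answered
]
ranked = sorted(candidates,
                key=lambda x: ucb_eu_score(x["eu"], x["p_resp"], x["queries"], 20),
                reverse=True)
```

On this toy data the reliably answered candidate "b" outranks the more informative but rarely answered "a", which is the intuition behind correcting for biased non-response.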
No democratic country has been shielded from the governance challenge posed by the COVID-19 pandemic. Importantly, governments have had to manage not only a public health crisis, but also an economic crisis precipitated by the virus. Emerging evidence suggests that the COVID-19 crisis has affected support for, or satisfaction with the performance of, incumbent governments. In this paper, we seek to understand, from a comparative perspective, how citizens evaluate the performance of governments in managing the crisis. We field a conjoint experiment in 18 countries across six continents where individuals are asked to evaluate incumbents using information on both their economic and public health-related performance. In total, our data contains responses from over 22,000 individuals, who collectively made over 177,000 choices in the experiment. Leveraging the large amount of data collected, we explore treatment effect heterogeneity using machine learning techniques.
This pre-registration report sets out an analysis plan for the public goods game experiment embedded within the second wave of the COVID-19 Vaccine Preference and Opinion Survey (the CANDOUR II study). The study involves a sample of 1,200 residents from each of 18 countries, interviewed via an anonymous online survey.
We introduce an extension of simple forced-choice conjoint experimental designs, which we call the ranked-choice conjoint design. In this pre-analysis plan we jointly describe research designs for a series of experiments evaluating ranking-based conjoint designs against traditional choice-based designs. We investigate the properties of the fully randomized ranked conjoint design with respect to efficiency, practicality, estimation, and inference. We hypothesize that (i) ranked conjoint designs are superior in recovering transitive rank orderings of individual preferences, (ii) ranked designs exhibit greater estimation accuracy, and (iii) ranked designs produce greater response quality from experimental subjects. If so, ranked designs should be superior to forced-choice designs and should be used instead in survey experimental research. We test these predictions in a series of experiments with various treatment arms and differing emphases.
Early in the pandemic, individuals in numerous countries experienced quite different rates of COVID-19 infections and deaths depending on where they lived. This within-country variation offers an opportunity to study how the intensity of a catastrophic shock to systems affects individuals' economic preferences – a topic without consensus in the literature. In April 2020, we conducted an online survey with approximately 1500 subjects in China, 800 in Chile, and 800 in Italy. Our sampling strategy deliberately sampled subjects with exposure to different levels of local COVID-19 infections. We find that respondents condition their behavior and economic preferences on this intensity – levels of COVID-19 preventive behavior are correlated with the intensity of community infections; exposure to intense infection rates correlates positively with risk aversion and patience, and negatively with other-regarding preferences. Using machine learning to estimate individual-level effects, we find notable effect heterogeneity with respect to education levels. Finally, using multilevel regression and poststratification (MRP), we produce province-level estimates of economic preferences for 107 Italian provinces.
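The poststratification step of MRP has a standard form: model predictions for each demographic cell are averaged using each cell's share of the target population as weights. A minimal sketch, with hypothetical cell names, predictions, and census counts:

```python
def poststratify(cell_predictions, cell_counts):
    """Poststratification: population-weighted average of cell-level model
    predictions for one geographic unit (e.g. a province)."""
    total = sum(cell_counts.values())
    return sum(cell_predictions[c] * cell_counts[c] / total for c in cell_counts)

# Hypothetical province with two education cells from a multilevel model
preds = {"low_edu": 0.40, "high_edu": 0.70}    # predicted mean preference per cell
counts = {"low_edu": 6000, "high_edu": 4000}   # census counts per cell
estimate = poststratify(preds, counts)         # 0.6*0.40 + 0.4*0.70 = 0.52
```

The first stage (the multilevel regression that produces the cell predictions) is omitted here; any model that yields per-cell predictions can feed this step.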
Conjoint experiments are fast becoming one of the dominant experimental methods within the social sciences. Despite several scholars advancing novel ways to model heterogeneity within this type of design, the relationship between these new quantities and the conjoint design is underdeveloped. In this note, we clarify how conjoint heterogeneity can be construed as a set of nested, causal parameters that correspond to the levels of the conjoint design. We then use this framework to propose a new estimation strategy that allows researchers to evaluate treatment effect heterogeneity and which exhibits good statistical properties. Replicating two conjoint experiments, we first demonstrate our theoretical argument, and then show how this method helps uncover interesting heterogeneity. To accompany this paper, we provide a new R package, cjbart, that allows researchers to model heterogeneity in their experimental conjoint data.
This paper introduces software packages for efficiently imputing missing data using deep learning methods in Python (MIDASpy) and R (rMIDAS). The packages implement a recently developed approach to multiple imputation known as MIDAS, which involves introducing additional missing values into the dataset, attempting to reconstruct these values with a type of unsupervised neural network known as a denoising autoencoder, and using the resulting model to draw imputations of originally missing data. These steps are executed by a fast and flexible algorithm that expands both the quantity and the range of data that can be analyzed with multiple imputation. To help users optimize the algorithm for their particular application, MIDASpy and rMIDAS offer a host of user-friendly tools for calibrating and validating the imputation model. We provide a detailed guide to these functionalities and demonstrate their usage on a large real dataset.
In recent American elections, political candidates have actively emphasized features of their fundraising profiles when campaigning. Yet, surprisingly, we know comparatively little about how financial information specifically affects vote choice, whether effects differ across types of election, and how robust any effects are to other relevant political signals. Using a series of conjoint experimental designs, I compare the effects of campaigns' financial profiles on vote choice across direct democratic and representative elections, randomizing subjects' exposure to additional political cues. I find that while the financial profile of candidates can affect vote choice, these effects are drowned out by non-financial signals. In ballot initiative races, the explicit policy focus of the election appears to swamp any effect of financial information. This paper is the first to explore the comparative effects of financial disclosure across election type, contributing to our understanding of how different heuristics interact across electoral contexts.
How does the public want a COVID-19 vaccine to be allocated? We conducted a conjoint experiment asking 15,536 adults in 13 countries to evaluate 248,576 profiles of potential vaccine recipients who varied randomly on five attributes. Our sample includes diverse countries from all continents. The results suggest that in addition to giving priority to health workers and to those at high risk, the public favors giving priority to a broad range of key workers and to those with lower income. These preferences are similar across respondents of different education levels, incomes, and political ideologies, as well as across most surveyed countries. The public favored COVID-19 vaccines being allocated solely via government programs but were highly polarized in some developed countries on whether taking a vaccine should be mandatory. There is a consensus among the public on many aspects of COVID-19 vaccination, which needs to be taken into account when developing and communicating rollout strategies.
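Preferences in conjoint designs like this are typically summarized by average marginal component effects (AMCEs). Under full randomization of attributes, an AMCE can be estimated as a simple difference in selection rates between two levels of an attribute. A toy sketch with hypothetical profile data (the attribute names and values below are illustrative, not the study's actual attributes):

```python
def amce(profiles, attribute, level, baseline):
    """Difference in mean selection rates between two levels of a randomly
    assigned attribute: a simple AMCE estimator under full randomization."""
    def rate(value):
        chosen = [p["chosen"] for p in profiles if p[attribute] == value]
        return sum(chosen) / len(chosen)
    return rate(level) - rate(baseline)

# Hypothetical vaccine-recipient profiles; chosen = 1 if the profile was selected
profiles = [
    {"occupation": "health worker", "chosen": 1},
    {"occupation": "health worker", "chosen": 1},
    {"occupation": "office worker", "chosen": 1},
    {"occupation": "office worker", "chosen": 0},
]
effect = amce(profiles, "occupation", "health worker", "office worker")  # 1.0 - 0.5 = 0.5
```

In practice the estimator is usually implemented as a regression with clustered standard errors, but the difference-in-means form above conveys the estimand.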
An established finding on ballot design is that top positions on the ballot improve the electoral performance of parties or candidates because voters respond behaviorally to salient information. This article presents evidence on an additional, unexplored mechanism: campaigns, which act before voters do, can also adjust their behavior when allocated a top position on the ballot. We use a constituency-level lottery of ballot positions in Colombia to establish, first, that a ballot-order effect exists: campaigns randomly placed at the top earn larger vote and seat shares. Second, we show that campaigns react to being placed at the top of the ballot: they raise and spend more money on their campaign, and spending itself is correlated with higher vote shares. Our results provide the first evidence for a campaign-side mechanism behind the ballot-order effects examined in many previous studies.
Principled methods for analyzing missing values, based chiefly on multiple imputation, have become increasingly popular yet can struggle to handle the kinds of large and complex data that are also becoming common. We propose an accurate, fast, and scalable approach to multiple imputation, which we call MIDAS (Multiple Imputation with Denoising Autoencoders). MIDAS employs a class of unsupervised neural networks known as denoising autoencoders, which are designed to reduce dimensionality by corrupting and attempting to reconstruct a subset of data. We repurpose denoising autoencoders for multiple imputation by treating missing values as an additional portion of corrupted data and drawing imputations from a model trained to minimize the reconstruction error on the originally observed portion. Systematic tests on simulated as well as real social science data, together with an applied example involving a large-scale electoral survey, illustrate MIDAS's accuracy and efficiency across a range of settings. We provide open-source software for implementing MIDAS.
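The corrupt-and-reconstruct idea can be illustrated without a neural network: observed values are randomly dropped as additional "missingness", and the training loss is computed only over originally observed cells. A simplified sketch of these two ingredients (the real MIDAS uses a denoising autoencoder; these toy functions only mirror its corruption and masked-loss steps):

```python
import random

def corrupt(row, drop_prob=0.5, rng=random):
    """Denoising-style corruption: randomly drop observed entries (None marks
    missing) so a model must learn to reconstruct them; cells that were
    originally missing stay missing."""
    return [None if (v is not None and rng.random() < drop_prob) else v
            for v in row]

def reconstruction_loss(original, reconstructed):
    """Mean squared error over originally observed cells only: missing
    entries never contribute to the training signal."""
    pairs = [(o, r) for o, r in zip(original, reconstructed) if o is not None]
    return sum((o - r) ** 2 for o, r in pairs) / len(pairs)
```

Training a model to minimize this masked loss on corrupted inputs, then sampling several completed datasets from it, yields the multiple imputations.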
Firms in the USA rely on highly skilled immigrants, particularly in the science and engineering sectors. Yet the recent politics of immigration marks a substantial change to US immigration policy. We implement a conjoint experiment that isolates the causal effect of nativist, anti-immigrant pronouncements on where skilled potential migrants choose to move. While these policies have a significantly negative effect on the destination choices of Chilean and UK student subjects, they have little effect on the choices of Indian and Chinese student subjects. These results are confirmed through an unobtrusive test of subjects' general immigration destination preferences. Moreover, there is some evidence that the negative effect of these nativist policies is particularly salient for those who self-identify with the Left.
Experiments should be designed to facilitate the detection of experimental measurement error. To this end, we advocate the implementation of identical experimental protocols employing diverse experimental modes. We suggest iterative nonparametric estimation techniques for assessing the magnitude of heterogeneous treatment effects across these modes. And we propose two diagnostic strategies – measurement metrics embedded in experiments, and measurement experiments – that help assess whether any observed heterogeneity reflects experimental measurement error. To illustrate our argument, first we conduct and analyze results from four identical interactive experiments: in the lab; online with subjects from the CESS lab subject pool; online with an online subject pool; and online with MTurk workers. Second, we implement a measurement experiment in India with CESS Online subjects and MTurk workers.
Objective: Immigration is a highly salient political issue. We examine the migration preferences of potential emigrants from the United Kingdom to determine whether the migration calculus is primarily economic or political. Methods: A conjoint survey experiment was conducted with U.K. subjects drawn from the CESS (Nuffield College, Oxford University) student subject pool to identify causal drivers of emigration preferences. Results: Logit estimation of emigration preferences indicates that both economics and politics matter. Anti-immigrant rhetoric, "Trumpian policies," and the United States deter high-skilled U.K. potential emigrants; economic growth, education, and social benefits attract them. Politics and social benefits are more important for those on the political left, while economics and education weigh more heavily for those on the right. Conclusion: What will attract highly skilled migrants from a post-Brexit United Kingdom? Economics matters, of course, but for many of these potential emigrants politics is important – they are particularly sensitive to anti-immigrant rhetoric. [Revise and Resubmit, Journal of Politics]
Policy referendums around the world succeed regularly and on important policy areas. But why do these policies pass by direct democracy and not through the legislature? While previous work has explored mechanisms that help explain policy incongruence, less work has considered how such incongruence shapes policymaking in systems where citizens have alternative venues to pass legislation. I test two novel theories – exploring institutional and behavioral factors respectively – using a combination of district-level voting data, campaign finance information, and a survey of state legislators to understand why policymaking occurs via ballot initiative and not the legislature. I find successful initiatives tend not to be fully captured by the partisan dimension and are supported by more ideologically extreme donors than successful legislative candidates in the same cycles. Taken together, the evidence suggests that initiatives succeed when policies have not taken root in the mainstream policy networks that regulate conventional policymaking.
When researchers suspect that error terms are correlated by group in observational research, the standard correction is to cluster the standard errors. But what about in experimental contexts where treatment is randomised? Despite their ubiquity in analyses with group-constant variables, the rationale for using clustered standard errors in experimental contexts remains underdeveloped. In this paper I present an intuitive and applied explanation of when clustering is appropriate, building on recent contributions in the statistics and econometrics literatures. I demonstrate why randomisation does not lead to identical variance estimates across estimation strategies, and conduct a review of experimental studies published between 2017 and 2019 to show that these differences can be considerable. Finally, I provide practical guidance for when and why to cluster standard errors for common experimental designs.
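The cluster-robust ("sandwich") variance estimator referenced here has a standard form: within-cluster score contributions are summed before forming the middle ("meat") matrix. A minimal numpy sketch of the basic CR0 version (no small-sample correction), with a hand-checkable toy example:

```python
import numpy as np

def cluster_robust_se(X, y, clusters):
    """OLS coefficients with CR0 cluster-robust standard errors:
    V = (X'X)^-1 [ sum_g X_g' e_g e_g' X_g ] (X'X)^-1."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        Xg, eg = X[clusters == g], resid[clusters == g]
        score = Xg.T @ eg              # within-cluster score sum
        meat += np.outer(score, score)
    V = bread @ meat @ bread
    return beta, np.sqrt(np.diag(V))

# Toy example: estimating a mean with two clusters of perfectly correlated errors
X = np.ones((4, 1))
y = np.array([1.0, 1.0, 3.0, 3.0])
beta, se = cluster_robust_se(X, y, np.array([0, 0, 1, 1]))  # beta = 2, se = sqrt(0.5)
```

In this example the clustered standard error (about 0.707) exceeds the classical OLS standard error (about 0.577), illustrating the paper's point that estimation strategies yield non-identical variance estimates even under randomisation. Production analyses would use a small-sample adjustment (e.g. CR1) rather than this raw CR0 form.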