Kosuke Imai’s Research on Computational Social Science

Overview

Over the last two decades, the amount and variety of data available to social scientists have increased dramatically. While in the 1990s most researchers analyzed a handful of national surveys and government data sets, today’s quantitative social scientists conduct their own randomized experiments and surveys and analyze a diverse array of large-scale data sets, ranging from textual to spatial data. This trend demands new statistical methodologies that enable social scientists to overcome the resulting data-analytic and computational challenges.

I have developed fast and reliable computational methods for popular Bayesian models such as the multinomial probit and ecological inference models. I have also developed computational methods for large-scale data sets in social science research. These include fast and scalable estimation of various ideal point models for massive data, a dynamic clustering method for large-scale product-level trade data, a dynamic regression model for networks, analyses of textual and video data, simulation and enumeration methods for redistricting, and a method for record linkage with large-scale administrative data.

Manuscripts and Publications

Algorithm-assisted human decision-making:
Imai, Kosuke, Zhichao Jiang, D. James Greiner, Ryan Halen, and Sooahn Shin. “Experimental Evaluation of Algorithm-Assisted Human Decision-Making: Application to Pretrial Public Safety Assessment.” (with discussion) Journal of the Royal Statistical Society, Series A (Statistics in Society), Forthcoming. To be read before the Royal Statistical Society.
Ben-Michael, Eli, D. James Greiner, Kosuke Imai, and Zhichao Jiang. “Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment.”
Imai, Kosuke and Zhichao Jiang. “Principal Fairness for Human and Algorithmic Decision-Making.”
Heterogeneous treatment effects:
Imai, Kosuke, and Aaron Strauss. (2011). “Estimation of Heterogeneous Treatment Effects from Randomized Experiments, with Application to the Optimal Planning of the Get-out-the-vote Campaign.” Political Analysis, Vol. 19, No. 1 (Winter), pp. 1-19. (lead article) Winner of Political Analysis Editors’ Choice Award.
Imai, Kosuke and Marc Ratkovic. (2013). “Estimating Treatment Effect Heterogeneity in Randomized Program Evaluation.” Annals of Applied Statistics, Vol. 7, No. 1 (March), pp. 443-470. Winner of the Tom Ten Have Memorial Award.
Imai, Kosuke and Michael Lingzhi Li. “Experimental Evaluation of Individualized Treatment Rules.” Journal of the American Statistical Association, Forthcoming.
High-dimensional treatments:
Egami, Naoki, and Kosuke Imai. (2019). “Causal Interaction in Factorial Experiments: Application to Conjoint Analysis.” Journal of the American Statistical Association, Vol. 114, No. 526 (June), pp. 529-540.
de la Cuesta, Brandon, Naoki Egami, and Kosuke Imai. (2022). “Experimental Design and Statistical Inference for Conjoint Analysis: The Essential Role of Population Distribution.” Political Analysis, Vol. 30, No. 1 (January), pp. 19-45.
Goplerud, Max, Kosuke Imai, and Nicole E. Pashley. “Estimating Heterogeneous Causal Effects of High-Dimensional Treatments: Application to Conjoint Analysis.”
Ham, Dae Woong, Kosuke Imai, and Lucas Janson. “Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis.”
High-dimensional propensity score:
Ning, Yang, Sida Peng, and Kosuke Imai. (2020). “Robust Estimation of Causal Effects via High-Dimensional Covariate Balancing Propensity Score.” Biometrika, Vol. 107, No. 3 (September), pp. 533–554.
Clustering and scaling methods for large-scale data:
Imai, Kosuke, James Lo, and Jonathan Olmsted. (2016). “Fast Estimation of Ideal Points with Massive Data.” American Political Science Review, Vol. 110, No. 4 (December), pp. 631-656.
Kim, In Song, Steven Liao, and Kosuke Imai. (2020). “Measuring Trade Profile with Granular Product-level Trade Data.” American Journal of Political Science, Vol. 64, No. 1 (January), pp. 102-117.
Olivella, Santiago, Tyler Pratt, and Kosuke Imai. “Dynamic Stochastic Blockmodel Regression for Network Data: Application to International Conflicts.” Journal of the American Statistical Association, Forthcoming.
Analysis of unstructured data: texts, video, and maps:
McCartan, Cory, Jacob Brown, and Kosuke Imai. “Measuring and Modeling Neighborhoods.”
Tarr, Alexander, June Hwang, and Kosuke Imai. “Automated Coding of Political Campaign Advertisement Videos: An Empirical Validation Study.”
Eshima, Shusei, Kosuke Imai, and Tomoya Sasaki. “Keyword Assisted Topic Models.”
Algorithms for legislative redistricting:
Kenny, Christopher T., Shiro Kuriwaki, Cory McCartan, Evan Rosenman, Tyler Simko, and Kosuke Imai. (2021). “The Use of Differential Privacy for Census Data and its Impact on Redistricting: The Case of the 2020 U.S. Census.” Science Advances, Vol. 7, No. 7 (October), pp. 1-17.
McCartan, Cory and Kosuke Imai. “Sequential Monte Carlo for Sampling Balanced and Compact Redistricting Plans.”
Fifield, Benjamin, Michael Higgins, Kosuke Imai, and Alexander Tarr. (2020). “Automated Redistricting Simulation Using Markov Chain Monte Carlo.” Journal of Computational and Graphical Statistics, Vol. 29, No. 4, pp. 715-728.
Fifield, Benjamin, Kosuke Imai, Jun Kawahara, and Christopher T. Kenny. (2020). “The Essential Role of Empirical Validation in Legislative Redistricting Simulation.” Statistics and Public Policy, Vol. 7, No. 1, pp. 52-68.
Record linkage methods:
Enamorado, Ted, Benjamin Fifield, and Kosuke Imai. (2019). “Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records.” American Political Science Review, Vol. 113, No. 2 (May), pp. 353-371.
Enamorado, Ted, and Kosuke Imai. (2019). “Validating Self-reported Turnout by Linking Public Opinion Surveys with Administrative Records.” Public Opinion Quarterly, Vol. 83, No. 4 (Winter), pp. 723–748.
Multinomial probit models:
Imai, Kosuke, and David A. van Dyk. (2005). “A Bayesian Analysis of the Multinomial Probit Model Using Marginal Data Augmentation.” Journal of Econometrics, Vol. 124, No. 2 (February), pp. 311-334.
Imai, Kosuke, and David A. van Dyk. (2005). “MNP: R Package for Fitting the Multinomial Probit Model.” Journal of Statistical Software, Vol. 14, No. 3 (May), pp. 1-32. Abstract reprinted in Journal of Computational and Graphical Statistics, (2005) Vol. 14, No. 3 (September), p. 747.
Ecological inference and racial prediction models:
Imai, Kosuke, and Gary King. (2004). “Did Illegal Overseas Absentee Ballots Decide the 2000 U.S. Presidential Election?” Perspectives on Politics, Vol. 2, No. 3 (September), pp. 537-549. Our analysis is a part of The New York Times article, “How Bush Took Florida: Mining the Overseas Absentee Vote” by David Barstow and Don Van Natta Jr., July 15, 2001, Page 1, Column 1.
Imai, Kosuke, Ying Lu, and Aaron Strauss. (2008). “Bayesian and Likelihood Inference for 2 x 2 Ecological Tables: An Incomplete Data Approach.” Political Analysis, Vol. 16, No. 1 (Winter), pp. 41-69.
Imai, Kosuke, Ying Lu, and Aaron Strauss. (2011). “eco: R Package for Ecological Inference in 2 x 2 Tables.” Journal of Statistical Software, Vol. 42, No. 5 (Special Volume on Political Methodology), pp. 1-23.
Imai, Kosuke and Kabir Khanna. (2016). “Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Record.” Political Analysis, Vol. 24, No. 2 (Spring), pp. 263-272.

Statistical Software

Imai, Kosuke, Ying Lu, and Aaron Strauss. “eco: R Package for Ecological Inference in 2 x 2 Tables.” available through The Comprehensive R Archive Network. 2004-2009.
Imai, Kosuke, and David A. van Dyk. “MNP: R Package for Fitting the Multinomial Probit Model.” available through The Comprehensive R Archive Network. 2004-2008.
Khanna, Kabir, and Kosuke Imai. “wru: Who Are You? Bayesian Predictions of Racial Category Using Surname and Geolocation.” available through GitHub. 2015.
Fifield, Benjamin, Christopher T. Kenny, Cory McCartan, Alexander Tarr, and Kosuke Imai. “redist: Computational Algorithms for Redistricting Simulation.” available through The Comprehensive R Archive Network and GitHub.
Imai, Kosuke, James Lo, and Jonathan Olmsted. “emIRT: EM Algorithms for Estimating Item Response Theory Models.” available through The Comprehensive R Archive Network and GitHub. 2015.