References

Abadie, Alberto, Susan Athey, Guido W Imbens, and Jeffrey M Wooldridge. 2023. “When Should You Adjust Standard Errors for Clustering?” The Quarterly Journal of Economics 138 (1): 1–35.
Amrhein, Valentin, and Sander Greenland. 2022. “Rewriting Results in the Language of Compatibility.” Trends in Ecology & Evolution 37 (7): 567–68.
Amrhein, Valentin, Sander Greenland, and Blake McShane. 2019. “Scientists Rise up Against Statistical Significance.” Nature 567 (7748): 305–7.
Angrist, Joshua D, and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
Aronow, Peter M, and Benjamin T Miller. 2019. Foundations of Agnostic Statistics. Cambridge University Press.
Blair, Graeme, Jasper Cooper, Alexander Coppock, Macartan Humphreys, and Luke Sonnet. 2025. Estimatr: Fast Estimators for Design-Based Inference. https://declaredesign.org/r/estimatr/.
Bloom, Howard S. 1995. “Minimum Detectable Effects: A Simple Way to Report the Statistical Power of Experimental Designs.” Evaluation Review 19 (5): 547–56.
Bowers, Jake. 2011. “Making Effects Manifest in Randomized Experiments.” In Cambridge Handbook of Experimental Political Science, edited by James N. Druckman, Donald P. Green, James H. Kuklinski, and Arthur Lupia. New York, NY: Cambridge University Press.
Bowers, Jake, Mark M Fredrickson, and Costas Panagopoulos. 2013. “Reasoning about Interference Between Units: A General Framework.” Political Analysis 21 (1): 97–124.
Bowers, Jake, Mark Fredrickson, and Ben Hansen. 2016. RItools: Randomization Inference Tools.
Bowers, Jake, and Paul F. Testa. 2019. “Better Government, Better Science: The Promise of and Challenges Facing the Evidence-Informed Policy Movement.” Annual Review of Political Science 22.
Cameron, A Colin, and Pravin K Trivedi. 2005. Microeconometrics: Methods and Applications. Cambridge University Press.
Cochran, William G. 1954. “Some Methods for Strengthening the Common χ² Tests.” Biometrics 10 (4): 417.
Conneely, Karen N, and Michael Boehnke. 2007. “So Many Correlated Tests, so Little Time! Rapid Adjustment of p Values for Multiple Correlated Tests.” The American Journal of Human Genetics 81 (6): 1158–68.
Coppock, Alexander. 2022. Ri2: Randomization Inference for Randomized Experiments. https://doi.org/10.32614/CRAN.package.ri2.
———. 2023. Randomizr: Easy-to-Use Tools for Common Forms of Random Assignment and Sampling. https://declaredesign.org/r/randomizr/.
Dunning, Thad. 2012. Natural Experiments in the Social Sciences: A Design-Based Approach. Strategies for Social Inquiry. New York, NY: Cambridge University Press.
Esarey, Justin, and Andrew Menger. 2019. “Practical and Effective Approaches to Dealing with Clustered Data.” Political Science Research and Methods 7 (3): 541–59.
Fisher, Ronald A. 1935. The Design of Experiments. Edinburgh: Oliver and Boyd.
Freedman, David A. 2008a. “On Regression Adjustments to Experimental Data.” Advances in Applied Mathematics 40 (2): 180–93.
———. 2008b. “Randomization Does Not Justify Logistic Regression.” Statistical Science 23 (2): 237–49.
Gelman, Andrew, and Jennifer Hill. 2006. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
Gerber, Alan S., and Donald P. Green. 2012. Field Experiments: Design, Analysis, and Interpretation. New York, NY: W. W. Norton & Company.
Gomila, Robin. 2021. “Logistic or Linear? Estimating Causal Effects of Experimental Treatments on Binary Outcomes Using Regression Analysis.” Journal of Experimental Psychology: General 150 (4): 700.
Greenland, Sander. 2019. “Valid p-Values Behave Exactly as They Should: Some Misleading Criticisms of p-Values and Their Resolution with s-Values.” The American Statistician 73 (sup1): 106–14.
———. 2023. “Divergence Versus Decision p-Values: A Distinction Worth Making in Theory and Keeping in Practice: Or, How Divergence p-Values Measure Evidence Even When Decision p-Values Do Not.” Scandinavian Journal of Statistics 50 (1): 54–88.
Greifer, Noah. 2023. “Estimating Effects After Weighting.” https://ngreifer.github.io/WeightIt ….
Guo, Kevin, and Guillaume Basse. 2023. “The Generalized Oaxaca-Blinder Estimator.” Journal of the American Statistical Association 118 (541): 524–36.
Hansen, Ben B., and Jake Bowers. 2008. “Covariate Balance in Simple, Stratified and Clustered Comparative Studies.” Statistical Science 23 (2): 219–36.
Hartman, Erin, and F Daniel Hidalgo. 2018. “An Equivalence Approach to Balance and Placebo Tests.” American Journal of Political Science 62 (4): 1000–1013.
Holland, Paul W. 1986. “Statistics and Causal Inference (with Discussion).” Journal of the American Statistical Association 81: 945–70.
Hothorn, Torsten, Henric Winell, Kurt Hornik, Mark A. van de Wiel, and Achim Zeileis. 2023. Coin: Conditional Inference Procedures in a Permutation Test Framework. http://coin.r-forge.r-project.org.
Imai, Kosuke, Gary King, and Elizabeth A Stuart. 2008. “Misunderstandings Between Experimentalists and Observationalists about Causal Inference.” Journal of the Royal Statistical Society Series A: Statistics in Society 171 (2): 481–502.
Imbens, Guido W. 2004. “Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review.” Review of Economics and Statistics 86 (1): 4–29.
Ioannidis, John PA, Tom D Stanley, and Hristos Doucouliagos. 2017. “The Power of Bias in Economics Research.” The Economic Journal 127 (605): F236–F265.
Kerwin, Jason, Nada Rostom, and Olivier Sterck. 2024. “Striking the Right Balance: Why Standard Balance Tests Over-Reject the Null, and How to Fix It.” Institute of Labor Economics (IZA).
Lakens, Daniel. 2021. “The Practical Alternative to the p Value Is the Correctly Used p Value.” Perspectives on Psychological Science 16 (3): 639–48.
Lin, Winston. 2013. “Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman’s Critique.” The Annals of Applied Statistics 7 (1): 295–318.
Lin, Winston, Scott D Halpern, Meeta Prasad Kerlin, and Dylan S Small. 2017. “A ‘Placement of Death’ Approach for Studies of Treatment Effects on ICU Length of Stay.” Statistical Methods in Medical Research 26 (1): 292–311.
Mantel, N, and W Haenszel. 1959. “Statistical Aspects of the Analysis of Data from Retrospective Studies of Disease.” Journal of the National Cancer Institute 22: 719–48.
McShane, Blakeley B, David Gal, Andrew Gelman, Christian Robert, and Jennifer L Tackett. 2019. “Abandon Statistical Significance.” The American Statistician 73 (sup1): 235–45.
Middleton, Joel A, and Peter M Aronow. 2015. “Unbiased Estimation of the Average Treatment Effect in Cluster-Randomized Experiments.” Statistics, Politics, and Policy 6 (1–2): 39–75.
Miratrix, Luke W, Jasjeet S Sekhon, and Bin Yu. 2013. “Adjusting Treatment Effect Estimates by Post-Stratification in Randomized Experiments.” Journal of the Royal Statistical Society Series B: Statistical Methodology 75 (2): 369–96.
Moore, Ryan T. 2012. “Multivariate Continuous Blocking to Improve Political Science Experiments.” Political Analysis 20 (4): 460–79.
Moore, Ryan T., and Keith Schnakenberg. 2016. blockTools: Blocking, Assignment, and Diagnosing Interference in Randomized Experiments. http://www.ryantmoore.org/html/software.blockTools.html.
Muralidharan, Karthik, Mauricio Romero, and Kaspar Wüthrich. 2023. “Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments.” Review of Economics and Statistics, 1–44.
Negi, Akanksha, and Jeffrey M Wooldridge. 2021. “Revisiting Regression Adjustment in Experiments with Heterogeneous Treatment Effects.” Econometric Reviews 40 (5): 504–34.
Neyman, Jerzy. 1923. “On the Application of Probability Theory to Agricultural Experiments: Essay on Principles, Section 9.” Annals of Agricultural Sciences, 101–151 (in Polish).
Oberfichtner, Michael, and Harald Tauchmann. 2021. “Stacked Linear Regression Analysis to Facilitate Testing of Hypotheses Across OLS Regressions.” The Stata Journal 21 (2): 411–29.
Pustejovsky, James. 2019. clubSandwich: Cluster-Robust (Sandwich) Variance Estimators with Small-Sample Corrections. https://CRAN.R-project.org/package=clubSandwich.
Rainey, Carlisle. 2014. “Arguing for a Negligible Effect.” American Journal of Political Science 58 (4): 1083–91.
Reichardt, Charles S, and Harry F Gollob. 1999. “Justifying the Use and Increasing the Power of a t Test for a Randomized Experiment with a Convenience Sample.” Psychological Methods 4 (1): 117.
Robins, James M, Miguel Angel Hernan, and Babette Brumback. 2000. “Marginal Structural Models and Causal Inference in Epidemiology.” Epidemiology 11 (5): 550–60.
Rosenbaum, Paul R. 2002. “Covariance Adjustment in Randomized Experiments and Observational Studies.” Statistical Science 17 (3): 286–327.
———. 2008. “Testing Hypotheses in Order.” Biometrika 95: 248–52.
———. 2012. “Testing One Hypothesis Twice in Observational Studies.” Biometrika 99 (4): 763–74.
———. 2017. Observation and Experiment: An Introduction to Causal Inference. Cambridge, MA: Harvard University Press.
Rubin, Donald B. 1986. “Comment: Which Ifs Have Causal Answers.” Journal of the American Statistical Association 81 (396): 961–62.
Rubin, Mark. 2024. “Inconsistent Multiple Testing Corrections: The Fallacy of Using Family-Based Error Rates to Make Inferences about Individual Hypotheses.” Methods in Psychology, 100140.
Samii, Cyrus, and Peter M. Aronow. 2012. “On Equivalences Between Design-Based and Regression-Based Estimators for Random Experiments.” Statistics and Probability Letters 82: 365–70.
Small, D. S., K. G. Volpp, and P. R. Rosenbaum. 2011. “Structured Testing of 2 × 2 Factorial Effects: An Analytic Plan Requiring Fewer Observations.” The American Statistician 65 (1): 11–15.
Snowden, Jonathan M, Sherri Rose, and Kathleen M Mortimer. 2011. “Implementation of g-Computation on a Simulated Data Set: Demonstration of a Causal Inference Technique.” American Journal of Epidemiology 173 (7): 731–38.
Stuart, Elizabeth A. 2010. “Matching Methods for Causal Inference: A Review and a Look Forward.” Statistical Science: A Review Journal of the Institute of Mathematical Statistics 25 (1): 1.
Weesie, Jeroen. 2000. “Seemingly Unrelated Estimation and the Cluster-Adjusted Sandwich Estimator.” Stata Technical Bulletin 9 (52).
Zou, Hui. 2006. “The Adaptive Lasso and Its Oracle Properties.” Journal of the American Statistical Association 101 (476): 1418–29.