{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Inference after ranking\n", "\n", "This is a template for regression analysis after ranking. It estimates the parameters using conditionally quantile-unbiased estimates and \"almost\" quantile-unbiased hybrid estimates.\n", "\n", "Click the badge below to use this template on your own data. This will open the notebook in a Jupyter binder.\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gl/dsbowen%2Fconditional-inference/HEAD?urlpath=lab/tree/docs/examples/rank_conditions.ipynb)\n", "\n", "Instructions:\n", "\n", "1. Upload a file named `data.csv` to this folder with your conventional estimates. Open `data.csv` to see an example. In this file, we named our dependent variable \"dep_variable\", and have estimated parameters named \"policy0\",..., \"policy9\". The first column of `data.csv` contains the conventional estimates $m$ of the unknown parameters. The remaining columns contain consistent estimates of the covariance matrix $\\Sigma$. In `data.csv`, $m=(0, 1,..., 9)$ and $\\Sigma = I$.\n", "2. Modify the code if necessary.\n", "3. 
Run the notebook.\n", "\n", "### Citations\n", "\n", " @techreport{andrews2019inference,\n", " title={Inference on winners},\n", " author={Andrews, Isaiah and Kitagawa, Toru and McCloskey, Adam},\n", " year={2019},\n", " institution={National Bureau of Economic Research}\n", " }\n", "\n", " @article{andrews2022inference,\n", " Author = {Andrews, Isaiah and Bowen, Dillon and Kitagawa, Toru and McCloskey, Adam},\n", " Title = {Inference for Losers},\n", " Journal = {AEA Papers and Proceedings},\n", " Volume = {112},\n", " Year = {2022},\n", " Month = {May},\n", " Pages = {635-42},\n", " DOI = {10.1257/pandp.20221065},\n", " URL = {https://www.aeaweb.org/articles?id=10.1257/pandp.20221065}\n", " }\n", "\n", "### Runtime warnings and long running times\n", "\n", "If you are estimating the effects of many policies or the policy effects are close together, you may see `RuntimeWarning` messages and experience long runtimes. Runtime warnings are common, usually benign, and can be safely ignored." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "from multiple_inference.bayes import Improper\n", "from multiple_inference.rank_condition import RankCondition\n", "\n", "data_file = \"data.csv\"\n", "alpha = .05\n", "\n", "conventional_model = Improper.from_csv(data_file, sort=True)\n", "ranked_model = RankCondition.from_csv(data_file, sort=True)\n", "sns.set()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We'll start by summarizing and plotting the conventional estimates." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conventional_results = conventional_model.fit(title=\"Conventional estimates\")\n", "conventional_results.summary(alpha=alpha)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conventional_results.point_plot(alpha=alpha)\n", "plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "One property we want our estimators to have is *quantile-unbiasedness*. An estimator is quantile-unbiased if the true parameter falls below its $\\alpha$-quantile estimate with probability $\\alpha$ given its estimated rank. For example, the true effect of the top-performing treatment should fall below its median estimate half the time.\n", "\n", "Similarly, we want confidence intervals to have *correct conditional coverage*. Correct conditional coverage means that the parameter should fall within our level $1-\\alpha$ confidence interval with probability $1-\\alpha$ given its estimated rank. For example, the true effect of the top-performing treatment should fall within its 95% confidence interval 95% of the time.\n", "\n", "Below, we compute the optimal quantile-unbiased estimates and conditionally correct confidence intervals for each parameter given its rank." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conditional_results = ranked_model.fit(title=\"Conditional estimates\")\n", "conditional_results.summary(alpha=alpha)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conditional_results.point_plot(alpha=alpha)\n", "plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Conditional inference is a strict requirement. Conditionally quantile-unbiased estimates can be highly variable, and conditionally correct confidence intervals can be unrealistically long. 
We can often obtain more reasonable estimates by focusing on *unconditional* inference instead of *conditional* inference.\n", "\n", "Imagine we ran our randomized controlled trial 10,000 times and want to estimate the effect of the top-performing treatment. We need *conditional* inference if we're interested in the subset of trials where a specific parameter $k$ was the top performer. However, we can use *unconditional* inference if we're only interested in being right \"on average\" across all 10,000 trials.\n", "\n", "Below, we use *hybrid estimates* to compute approximately quantile-unbiased estimates and unconditionally correct confidence intervals for each parameter.\n", "\n", "If you don't know whether you need conditional or unconditional inference, use unconditional inference." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hybrid_results = ranked_model.fit(beta=.005, title=\"Hybrid estimates\")\n", "hybrid_results.summary(alpha=alpha)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hybrid_results.point_plot(alpha=alpha)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "interpreter": { "hash": "a31fe93114e6fe9c0b874076e62df141d5b35f609e1bfa94ca168a298e55e549" }, "kernelspec": { "display_name": "Python 3.9.0 ('conditional-inference')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }