Data: chall.txt
Contents
 Collaborators
 Question 1: We’ll Always Have that Night Sampling in Monte Carlo
 Question 2: The Consequences of O-ring Failure can be Painful and Deadly
 Question 3: Maximum Uniformity – Frequentist Bootstraps and the Bayesian Posterior
Harvard University
Fall 2018
Instructors: Rahul Dave
**Due Date:** Saturday, October 13th, 2018 at 11:59pm
Instructions:

Upload your final answers in the form of a Jupyter notebook containing all work to Canvas.

Structure your notebook and your work to maximize readability.
Collaborators
**Place the name of everyone who’s submitting this assignment here**
import numpy as np
import scipy.stats
import scipy.special
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
from matplotlib import cm
import pandas as pd
%matplotlib inline
Question 1: We’ll Always Have that Night Sampling in Monte Carlo
Coding required
Let $X$ be a random variable with distribution described by the following pdf:
Let $h$ be the following function of $X$:
Compute $\mathbb{E}[h(X)]$ via Monte Carlo simulation using the following sampling methods:
1.1. Inverse Transform Sampling
1.2. Rejection Sampling with a uniform proposal distribution (rejection sampling in a rectangular box with uniform probability of sampling any x)
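The pdf of $X$ and the function $h$ did not survive conversion above, so as a sketch of the two samplers only, the block below uses stand-in choices — $f(x) = 3x^2$ on $[0, 1]$ and $h(x) = x$, for which $\mathbb{E}[h(X)] = 3/4$ exactly. Substitute the assignment's actual $f$ and $h$; the structure of both methods is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in problem (NOT the assignment's f and h):
# f(x) = 3x^2 on [0, 1], h(x) = x, so E[h(X)] = 3/4 exactly.
def h(x):
    return x

# 1.1 Inverse transform: F(x) = x^3 on [0, 1], so F^{-1}(u) = u^(1/3).
u = rng.uniform(size=100_000)
x_it = u ** (1.0 / 3.0)
est_it = h(x_it).mean()

# 1.2 Rejection in the box [0, 1] x [0, 3] (3 bounds f on [0, 1]):
# keep proposals falling under the pdf curve, with uniform x-sampling.
n_target = 100_000
xs = rng.uniform(0.0, 1.0, size=4 * n_target)
ys = rng.uniform(0.0, 3.0, size=4 * n_target)
x_rej = xs[ys < 3.0 * xs ** 2][:n_target]
est_rej = h(x_rej).mean()

print(est_it, est_rej)  # both near 0.75 for the stand-in problem
```

Note the rejection sampler's acceptance rate equals (area under $f$) / (box area) $= 1/3$ here, which is why it proposes $4\times$ the target count.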
Question 2: The Consequences of O-ring Failure can be Painful and Deadly
Coding required
In 1986, the space shuttle Challenger exploded during takeoff, killing the seven astronauts aboard. It is believed that the explosion was caused by the failure of an O-ring (a rubber ring that seals parts of the solid-fuel rockets together), and that the failure was caused by the cold weather at the time of launch (31°F).
In the file chall.txt, you will find temperature (in Fahrenheit) and failure data from 23 shuttle launches, where 1 stands for O-ring failure and 0 for no failure. We assume that the observed temperatures are fixed and that, at temperature $t$, an O-ring fails with probability $f(\theta_{1}+\theta_{2}t)$ conditionally on $\Theta = (\theta_1, \theta_2)$.
$f(z)$ is defined to be the logistic function: $f(z) = 1/(1 + \exp(z))$
2.1. Based on your own knowledge and experience, suggest a prior distribution for the regression parameters ($\theta_1, \theta_2$). Make sure to explain your choice of prior.
2.2. Produce 10000 samples from the posterior distribution of $\Theta$ using rejection sampling, and plot them and their marginals. (This may take a while.)
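One workable rejection scheme is to propose from the prior and accept with probability $L(\theta)/L_{\max}$, where $L_{\max}$ bounds the likelihood. The sketch below assumes a flat prior over a box (you should justify your own prior in 2.1) and uses stand-in launch data, since chall.txt is not embedded here; swap in the real columns. Note the sign convention $f(z) = 1/(1+\exp(z))$ from above, under which the slope $\theta_2$ comes out positive.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data -- replace with the temperature/failure columns of chall.txt.
temps = np.array([53., 57., 63., 66., 67., 68., 70., 70., 72., 75., 76., 79., 81.])
fails = np.array([1., 1., 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])

def log_lik(theta):
    # theta: (m, 2); failure prob is f(theta1 + theta2*t), f(z) = 1/(1+exp(z))
    z = theta[:, :1] + theta[:, 1:] * temps            # shape (m, n_launches)
    p = 1.0 / (1.0 + np.exp(z))
    return (fails * np.log(p) + (1.0 - fails) * np.log1p(-p)).sum(axis=1)

# Flat prior over a box (an assumption; tune to your prior from 2.1).
lo = np.array([-30.0, 0.0])
hi = np.array([10.0, 0.6])

# Crude log-likelihood bound from a coarse grid, padded for safety;
# in practice use an optimizer for a tight bound.
g1, g2 = np.meshgrid(np.linspace(lo[0], hi[0], 150), np.linspace(lo[1], hi[1], 150))
ll_max = log_lik(np.column_stack([g1.ravel(), g2.ravel()])).max() + 0.5

# Propose from the prior, accept with prob exp(log_lik - ll_max).
samples = np.empty((0, 2))
while len(samples) < 2000:
    props = rng.uniform(lo, hi, size=(50_000, 2))
    keep = np.log(rng.uniform(size=len(props))) < log_lik(props) - ll_max
    samples = np.vstack([samples, props[keep]])
samples = samples[:2000]
print(samples.mean(axis=0))
```

Scale the sample count up to the assignment's 10000 in your notebook; acceptance will be low, which is why the problem warns it may take a while.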
2.3. Use `Logit` from the statsmodels library to compute 68% confidence intervals on the $\theta$ parameters. Compare those intervals with the 68% credible intervals from the posterior above, and overlay both on the marginal plots from 2.2.
2.4. Use the MLE values from statsmodels and the posterior mean from 2.2 to plot, as a function of temperature, the probability of failure in both the frequentist and Bayesian settings. What do you see?
2.5. Compute the mean posterior probability of an O-ring failure at $t = 31^\circ F$. To do this, evaluate the failure probability at $31^\circ F$ for each posterior sample and take the mean.
2.6. You can instead obtain this probability from the posterior predictive: use the posterior samples to draw samples from the posterior predictive at $31^\circ F$ and calculate the fraction of failures.
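Both 2.5 and 2.6 are one-liners given posterior draws. To keep the sketch self-contained it fabricates a Gaussian blob of stand-in posterior samples; in your notebook, use the actual rejection-sampling draws from 2.2 instead.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in posterior draws (use your samples from 2.2; mean/cov here are assumptions).
theta = rng.multivariate_normal([-15.0, 0.23],
                                [[9.0, -0.12], [-0.12, 0.0025]], size=10_000)

# 2.5: mean posterior failure probability at t = 31, with f(z) = 1/(1+exp(z))
p31 = 1.0 / (1.0 + np.exp(theta[:, 0] + theta[:, 1] * 31.0))
print(p31.mean())

# 2.6: posterior predictive draws (one Bernoulli flip per posterior sample),
# then the fraction of simulated failures
y31 = rng.uniform(size=len(p31)) < p31
print(y31.mean())
```

The two numbers agree up to Monte Carlo noise, since the predictive failure probability is exactly the posterior mean of $f(\theta_1 + 31\theta_2)$.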
2.7. The day before a new launch, meteorologists predict that the temperature will be $T \sim N(68, 1)$ during takeoff. Estimate the probability of an O-ring failure during this takeoff. (You will need to calculate predictives at multiple temperatures for this purpose.)
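Temperature uncertainty can be folded in by pairing each posterior draw with its own draw of $T$, then averaging. As in the previous sketch, the posterior samples below are a fabricated stand-in for your draws from 2.2.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in posterior draws (use your samples from 2.2; mean/cov are assumptions).
theta = rng.multivariate_normal([-15.0, 0.23],
                                [[9.0, -0.12], [-0.12, 0.0025]], size=10_000)

# One temperature draw per posterior sample: integrates over T ~ N(68, 1)
T = rng.normal(68.0, 1.0, size=len(theta))
p_launch = 1.0 / (1.0 + np.exp(theta[:, 0] + theta[:, 1] * T))
print(p_launch.mean())
```

This estimates $\int P(\text{fail}\mid t)\, p(t)\, dt$ by Monte Carlo, jointly over the posterior and the temperature forecast.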
Question 3: Maximum Uniformity – Frequentist Bootstraps and the Bayesian Posterior
Coding required
Recall that in HW 3 Question 1 we explored an edge case in using the nonparametric bootstrap to construct confidence intervals. Let’s revisit the setup of that problem. Suppose you have datapoints $\{X_1, X_2, \ldots, X_n\}$ such that the $X_i$ are independently and identically drawn from $\mathrm{Unif}(0, \theta)$. Consider the extreme order statistic $Y = X_{(n)} = \max(X_1, X_2, \ldots, X_n)$.
1.1. Derive (or possibly rewrite from HW3) expressions for $F_Y(y\ \vert\ n, \theta)$, the CDF of $Y$, and $f_Y(y\ \vert\ n, \theta)$, the pdf of $Y$.
1.2. In HW3 we had difficulty constructing confidence intervals for $\theta$ from percentiles in the usual way, so we introduced pivot confidence intervals instead. Let’s reframe the problem so that we can use percentiles to construct our confidence intervals. Define $Z \equiv n \cdot (\theta - Y)$ and use an elementary calculation to write expressions for $F_Z(z\ \vert\ n, \theta)$, the CDF of $Z$, and $f_Z(z\ \vert\ n, \theta)$, the pdf of $Z$.
1.3. What is the limiting distribution of Z (as $n \rightarrow \infty$)? Plot that limiting distribution.
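You can check your answer to 1.3 numerically before committing to it analytically. The sketch below simulates $Z$ at a moderately large $n$ and compares it, via a Kolmogorov–Smirnov statistic, against the candidate limit suggested by letting $n \to \infty$ in the CDF from 1.2 (an exponential with scale $\theta$ — verify this yourself).

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(3)

n, theta = 500, 100.0

# Simulate Z = n * (theta - max(X_1..X_n)) many times
Y = rng.uniform(0.0, theta, size=(10_000, n)).max(axis=1)
Z = n * (theta - Y)

# KS distance to the candidate limiting distribution Exponential(scale=theta)
D, p = scipy.stats.kstest(Z, scipy.stats.expon(scale=theta).cdf)
print(D, p)
```

A small KS statistic at large $n$ is consistent with (but not a proof of) the limit; the analytic argument from $F_Z$ is still required.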
1.4. Use scipy/numpy to generate 100000 samples $\{X_i\}$ from $\mathrm{Unif}(0, 100)$ (i.e. let $\theta = 100$) and store them for later use. Based on your data sample, what is $\hat{\theta}$, the empirical estimate of $\theta$?
1.5. Use the nonparametric bootstrap to generate a sampling distribution of 10000 estimates of $Z$ by substituting $\hat{\theta}$ for $\theta$. Plot a histogram of your sampling distribution; make sure to title and label the plot. Use percentiles to construct the 10% and 68% bootstrap confidence intervals and mark them on your plot.
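A sketch of 1.4–1.5, scaled down from the assignment's 100000 samples and 10000 replicates so it runs quickly; use the full sizes in your notebook.

```python
import numpy as np

rng = np.random.default_rng(4)

# Scaled-down sizes (assignment: n = 100000 samples, B = 10000 replicates)
theta, n, B = 100.0, 10_000, 2_000

X = rng.uniform(0.0, theta, size=n)
theta_hat = X.max()                  # 1.4: empirical estimate of theta

# 1.5: nonparametric bootstrap of Z = n * (theta_hat - max of resample)
boot_max = np.array([X[rng.integers(0, n, size=n)].max() for _ in range(B)])
Z_boot = n * (theta_hat - boot_max)

ci10 = np.percentile(Z_boot, [45, 55])   # 10% interval
ci68 = np.percentile(Z_boot, [16, 84])   # 68% interval
print(ci10, ci68)

# Each resample contains the sample maximum with probability 1 - (1 - 1/n)^n,
# roughly 1 - 1/e, so Z_boot has a large atom at exactly zero.
print((Z_boot == 0).mean())
```

The atom at zero is worth staring at when you answer the hint below and question 1.6.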
Hint: Should the confidence intervals be symmetric around the estimate $\hat{\theta}$?
1.6. Make an argument that the bootstrap sampling distribution of $Z$ constructed this way will always mismatch the limiting distribution.
1.8. Let’s switch to being Bayesian. In 1.1 we came up with an expression for the likelihood $f_Y(y\ \vert\ n, \theta)$. Use the Pareto distribution to construct a prior for $\theta$. What are some reasonable values to use for the scale and shape?
1.9. Write down an expression for the posterior distribution $f(\theta\ \vert\ n, y)$.
1.10. Draw 10000 posterior samples and plot a histogram of the posterior distribution. Use percentiles to construct the 68% HPD. Plot the posterior distribution and mark the HPD on your plot.
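The Pareto prior is conjugate here: the likelihood from 1.1 is proportional to $\theta^{-n}$ on $\theta \ge y$, so a Pareto(shape $\alpha$, scale $m$) prior yields a Pareto(shape $\alpha + n$, scale $\max(m, y)$) posterior. The sketch below assumes that conjugate form (check it against your answer to 1.9) and an illustrative prior choice $\alpha = 1$, $m = 50$.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(5)

# Observed extreme order statistic from n Unif(0, theta) draws (theta = 100)
theta_true, n = 100.0, 100_000
y = rng.uniform(0.0, theta_true, size=n).max()

# Conjugate update: Pareto(alpha, scale=m) prior -> Pareto(alpha + n, scale=max(m, y))
alpha, m = 1.0, 50.0                       # illustrative prior choice (an assumption)
post = scipy.stats.pareto(b=alpha + n, scale=max(m, y))
samples = post.rvs(size=10_000, random_state=rng)

# The Pareto density is strictly decreasing on its support, so the 68% HPD
# starts at the left edge of the posterior support.
hpd = np.percentile(samples, [0, 68])
print(samples.mean(), hpd)
```

Note how tightly the posterior hugs $y$ from above at this $n$; that one-sidedness is the key contrast with the bootstrap interval in 1.11.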
1.11. How does the 68% HPD compare with the confidence interval generated from bootstrapping? Why doesn’t the Bayesian interval construction suffer from the same concerns you noted in 1.6?