Trying Out PyStan
Installation Environment
Python 3.6.4 |Anaconda, Inc.| (default, Dec 21 2017, 21:42:08) [GCC 7.2.0] on linux
Install it by running:
pip install pystan
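After installation, a quick sanity check is to import the package and print its version. This is a minimal sketch assuming the classic PyStan 2 interface installed by the command above, plus a working C++ toolchain, which PyStan needs in order to compile models:

import pystan
print(pystan.__version__)  # should print the installed PyStan version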
Example: parallel experiments in eight schools
This example is from Section 5.5 of Gelman et al. (2013).
The Educational Testing Service (ETS) performed a study to analyze the effects of special coaching programs on test scores in eight high schools. The results of the eight experiments are summarized in the following table:

School   Estimated treatment effect, y_j   Standard error of effect estimate, \sigma_j
A        28                                15
B         8                                10
C        -3                                16
D         7                                11
E        -1                                 9
F         1                                11
G        18                                10
H        12                                18
A pooled estimate
Treating all eight experiments as estimating a single common effect, the pooled (inverse-variance-weighted) estimate is 7.7, and its posterior variance is \left(\sum_{j=1}^{8} 1/\sigma_j^2\right)^{-1} = 16.6. Thus, we would estimate the common effect to be 7.7 points with standard error equal to \sqrt{16.6} = 4.1, which would lead to the 95\% posterior interval [-0.5, 15.9].
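The pooled estimate can be reproduced directly from the data with NumPy; a minimal sketch (the y and sigma values are the same ones passed to Stan as schools_dat below):

import numpy as np

y = np.array([28, 8, -3, 7, -1, 1, 18, 12], dtype=float)        # estimated treatment effects
sigma = np.array([15, 10, 16, 11, 9, 11, 10, 18], dtype=float)  # standard errors

w = 1.0 / sigma**2              # inverse-variance weights
pooled_mean = w @ y / w.sum()   # approximately 7.7
pooled_var = 1.0 / w.sum()      # approximately 16.6
print(pooled_mean, pooled_var, pooled_var**0.5)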
Posterior simulation under the hierarchical model
Compute the posterior distribution of \theta_1,\ldots,\theta_8.
The Bayesian analysis of this example not only allows straightforward inferences about many parameters that may be of interest, but the hierarchical model is flexible enough to adapt to the data, thereby providing posterior inferences that account for the partial pooling as well as the uncertainty in the hyperparameters.
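Written out, the hierarchical model in the Stan program below is

y_j \sim \mathrm{N}(\theta_j, \sigma_j^2), \qquad \theta_j \sim \mathrm{N}(\mu, \tau^2), \qquad j = 1, \ldots, 8,

coded in the non-centered parameterization \theta_j = \mu + \tau\,\eta_j with \eta_j \sim \mathrm{N}(0, 1); since no explicit priors are given for \mu and \tau, they get flat (improper uniform) priors, with \tau restricted to be non-negative.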
import pystan
schools_code = """
data {
    int<lower=0> J;            // number of schools
    vector[J] y;               // estimated treatment effects
    vector<lower=0>[J] sigma;  // s.e. of effect estimates
}
parameters {
    real mu;                   // population mean effect
    real<lower=0> tau;         // between-school standard deviation
    vector[J] eta;             // standardized school-level effects
}
transformed parameters {
    vector[J] theta;           // school treatment effects
    theta = mu + tau * eta;    // non-centered parameterization
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);  // likelihood of the observed effects
}
"""
schools_dat = {'J': 8,
               'y': [28, 8, -3, 7, -1, 1, 18, 12],
               'sigma': [15, 10, 16, 11, 9, 11, 10, 18]}
sm = pystan.StanModel(model_code=schools_code)            # compile the Stan program (this takes a while)
fit = sm.sampling(data=schools_dat, iter=1000, chains=4)  # 4 chains of 1000 iterations; the first half is warmup

# extract(permuted=True) returns a dict of arrays, with the warmup draws dropped
# and the chains merged and permuted
la = fit.extract(permuted=True)
mu = la['mu']

# extract(permuted=False) returns an array of three dimensions: iterations, chains, parameters
a = fit.extract(permuted=False)

# plot the marginal posteriors and trace plots (uses matplotlib; call plt.show() when running as a script)
fit.plot()
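To see the partial pooling at work, it helps to summarize the posterior draws. A minimal sketch, assuming the objects created above (fit and la) are still in scope:

import numpy as np

# Posterior means of the school effects; compared with the raw y values,
# they are shrunk toward the overall mean mu.
print(np.mean(la['theta'], axis=0))
print(np.mean(la['mu']), np.mean(la['tau']))

# PyStan's printed summary also reports quantiles, n_eff and Rhat for every parameter.
print(fit)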