Probabilistic programs implement statistical models. Commonly, probabilistic programs follow the Bayesian generative pattern:
\begin{equation} \begin{aligned} x & \sim \mathrm{Prior} \\ y & \sim \mathrm{Conditional}(x) \end{aligned} \end{equation}
A prior is imposed on the latent variable $x$. Then, observations $y$ are drawn from a distribution conditioned on $x$. The program and the observations are passed to an inference algorithm, which infers the posterior of the latent variable $x$.
The question is: what is observed?
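As a minimal sketch of this pattern, assume a normal prior on $x$ and normal observations $y$ with known noise. These distributions, the function names, and the default parameters are illustrative assumptions, not taken from the post. The model is conjugate, so the posterior is available in closed form and stands in for a general inference algorithm.

```python
import math
import random


def generate(n, rng, prior_mean=0.0, prior_std=1.0, noise_std=1.0):
    """Run the generative program forward: draw x from the prior, then y given x."""
    x = rng.gauss(prior_mean, prior_std)              # x ~ Prior
    ys = [rng.gauss(x, noise_std) for _ in range(n)]  # y ~ Conditional(x)
    return x, ys


def posterior(ys, prior_mean=0.0, prior_std=1.0, noise_std=1.0):
    """Closed-form posterior of x for the normal-normal model (assumed here)."""
    prior_prec = 1.0 / prior_std ** 2        # precision of the prior
    data_prec = len(ys) / noise_std ** 2     # precision contributed by the data
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + sum(ys) / noise_std ** 2) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)


if __name__ == "__main__":
    rng = random.Random(0)
    x_true, ys = generate(20, rng)
    mean, std = posterior(ys)
    print(f"latent x = {x_true:.3f}, posterior mean = {mean:.3f}, std = {std:.3f}")
```

Here the observations are simply the list `ys` passed to `posterior`. In a general probabilistic program, the closed form would be replaced by an approximate inference algorithm.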
Read More →
I taught a course on Bayesian data analysis, closely following the book by Andrew Gelman et al., but with the twist of using probabilistic programming, either Stan or Infergo, for all examples and exercises. However, it turned out that at least one important problem in the book is beyond the capabilities of Stan.
This case study is inspired by Section 7.6 in Bayesian Data Analysis, which is based on a paper published in 1983 by Donald Rubin.
Read More →
Gaussian processes are great for time series forecasting. The time series does not have to be regular; ‘missing data’ is not an issue. A kernel can be chosen to express trend, seasonality, various degrees of smoothness, and non-stationarity. External predictors can be added as input dimensions. A prior can be chosen to provide a reasonable forecast when little or even no data is available.
However, a Gaussian process rests on the assumption that all observations come from a Gaussian distribution with constant noise and with the mean lying on a smooth function of time.
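In symbols, this assumption can be sketched as follows. The notation is an assumption of this sketch rather than the post's own: $f$ is the latent smooth function, $m$ and $k$ are the mean and kernel functions, $t_i$ are the observation times, and $\sigma$ is the constant noise level.
\begin{equation} \begin{aligned} f & \sim \mathcal{GP}(m, k) \\ y_i & \sim \mathrm{Normal}(f(t_i), \sigma^2) \end{aligned} \end{equation}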
Read More →