A sample text widget

Etiam pulvinar consectetur dolor sed malesuada. Ut convallis euismod dolor nec pretium. Nunc ut tristique massa.

Nam sodales mi vitae dolor ullamcorper et vulputate enim accumsan. Morbi orci magna, tincidunt vitae molestie nec, molestie at mi. Nulla nulla lorem, suscipit in posuere in, interdum non magna.

Hierarchical Priors for Multinomial Data

As part of an ongoing project to understand autoregressive structures in hierarchical models I came across some WinBUGS code code that puts a non-Wishart prior on a Multivariate Normal precision matrix for a Multinomial-logistic model of pig weight histograms. Why might one do such a thing? Well, often when putting together histograms of things the distribution doesn’t look as smooth as one would expect. In the case of these pigs, the distribution of weights is slightly uneven:

Pig data

The objective in this model (from pg.177 of Congdon’s 2001 Bayesian Statistical Modelling book) is to smooth the observed histogram so that it looks more like the density of the population from which the sample is drawn, q(y).

Because histograms are broken into bins into which the observations falling are stacked, Congdon’s approach was to use a Multinomial distribution to estimate expected frequencies in weight bins ranging from 19 to 39. The key twist is that the smoothing occurs by assuming that adjacent points on the histogram have similar frequencies and cannily using an AR1 process to structure this belief.

Prior specification using matrices is a tricky business, not one that I’m an expert in by a long shot. But it is important for some water quality modelling I’ve been working on that will have a spatial AR process in the mix.

So on to the PyMC implementation…


First, the observed frequency of pig weight gains from Table 5.8 of Congdon:

import numpy as np

# Observed frequencies
weights = np.array([1,1,0,7,5,10,30,30,41,48,66,72,56,46,45,22,24,12,5,0,1])

# Total number of observations
n = sum(weights)

# Total number of bins
s = len(weights)


Next in our Bayesian model comes prior specification. For this model the covariance matrix $V=P^{-1}$ (that is, the inverse of the precision matrix, $P$) assumes a smoothness structure:

\[ V_{ij}=\sigma^{2}\rho^{|i-j|} \]

which means the covariance between bins i and j is equal to some common variance, $\sigma^{2}$, times a correlation coefficient $\rho$, and that this relationship diminishes exponentially by how far the bins are away from each other (i.e. $|i-j|$). This, Congdon says, is akin to the smoothing we’d see in an AR1 time series and expresses our prior belief as to the histograms’s smoothness. If $\tau = \sigma^{-2}(1-\rho^{2})^{-1}$, then the resulting precision matrix $P$ has the form:

$$ \begin{align} P_{ij} = P_{ji}=0, & for & i=1,…,J-2; j=2+i, J \end{align}$$

$$ \begin{align} P_{jj}=\tau(1+\rho^{2}) & & j=2,…,J-1 \end{align}$$

$$ P_{11,JJ} = \tau $$

$$ \begin{align} P_{j,j+1} = P_{j+1,j} = -\rho\tau & & j=2,…,J-1 \end{align} $$

To build this precision matrix in PyMC requires first defining the priors for $\rho$ and $\tau$:

# Prior on elements of AR Precision Matrix 
rho = Uniform('rho', lower=0, upper=1)
tau = Uniform('tau', lower=0.5, upper=10)

Variable $\rho$ is the correlation and, because we expect adjacent bins to be similar, it is bounded positively between zero and one; $\tau$ is a precision bounded between 0.5 and 10 due to prior information that Congdon cites in his text; we could easily use $\tau\sim U(0,100)$ priors instead, the results are the same. Next the precision matrix itself:

# Define precision matrix
def P(tau=tau, rho=rho):
    M = (np.zeros(s**2)).reshape(s,s)
    M[range(1,s-1),range(1,s-1)] = np.ones(s-2)*(tau*(1+rho**2))
    M[(0,s-1),(0,s-1)] = tau
    M[range(0,s-1),range(1,s)] = np.ones(s-1)*(-tau*rho)
    M[range(1,s),range(0,s-1)] = np.ones(s-1)*(-tau*rho)
    return M

This uses the decorator instantiation of PyMC (@deterministic) that allows the matrix to be specified as a regular Python function (def). A bit about the code here to explain; in the same order as the algebra above and for each line of PyMC code in function P:

  1. Sets up an empty matrix M that is the same dimension as the number of bins;
  2. Sets the diagonal elements to $\tau(1+\rho^{2})$ except for the upper left and lower right elements;
  3. Sets the upper left and lower right elements to $\tau$;
  4. Sets the off-diagonal elements to $-\rho\tau$;
  5. Return the matrix M.


Now that we have priors for P and have P itself is set up, we can estimate the multiple logit model parameters using a Multivariate Normal distribution:

# MVN for logit parameters
theta = MvNormal('theta', mu=np.ones(s)*-np.log(s), tau=P)

The mu=np.ones(s)*-np.log(s) part of this model correspond to a neutral (equal) prior for the Multinomial probabilities $\pi_{j}$ that will be used to estimate the frequencies in the smoothed histogram. From this, we can apply a multiple logit tranformation to get the bin probabilities:

# Bin probabilities
pi = Lambda('pi', lambda g=theta: np.exp(g)/sum(np.exp(g)))


These probabilities are then used with the data to complete the likelihood:

# Likelihood
Yi = Multinomial('Yi', n=n, p=pi, value=weights, observed=True)

Finally, we can calculate the expected bin frequencies for the smoothed histogram from the posterior probabilities for each bin and the total number of pigs weighed:

# Smoothed frequencies
Sm = Lambda('Sm', lambda pi=pi: n*pi)

Put this all together and run for $10e^{5}$ iterations, we get a rather satisfactory result, quite near the estimates from Congdon’s original Table 5.8 and the WinBUGS example:

Pig smoothed

Comments are closed.