# Two tricks for change of variables in MCMC

2017/05/23

A change of variables is sometimes advantageous, and occasionally unavoidable, for MCMC: you may want more efficient sampling, or need to model a distribution that was obtained by a transformation. A classic example is the lognormal distribution: when

$\log(y) \sim N(\mu, \sigma^2)$

one has to adjust the log posterior by $$-\log y$$ since

$\frac{\partial \log(y)}{\partial y} = \frac{1}{y}$

and

$\log(1/y) = -\log(y).$

In Stan, one would accomplish this as

```stan
target += -log(y);
```
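To make the adjustment concrete, here is a short Python sketch (not the author's code; function names are illustrative) verifying numerically that the normal log density of $$\log y$$, adjusted by $$-\log y$$, equals the lognormal log density written out directly:

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2)."""
    return -math.log(sigma * math.sqrt(2 * math.pi)) - (x - mu) ** 2 / (2 * sigma ** 2)

def lognormal_logpdf_via_adjustment(y, mu, sigma):
    """Normal log density of log(y), plus the -log(y) Jacobian adjustment."""
    return normal_logpdf(math.log(y), mu, sigma) - math.log(y)

def lognormal_logpdf_direct(y, mu, sigma):
    """Lognormal log density written out from the textbook formula."""
    return (-math.log(y * sigma * math.sqrt(2 * math.pi))
            - (math.log(y) - mu) ** 2 / (2 * sigma ** 2))

y, mu, sigma = 2.5, 0.3, 1.2
assert abs(lognormal_logpdf_via_adjustment(y, mu, sigma)
           - lognormal_logpdf_direct(y, mu, sigma)) < 1e-12
```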

In general, when you transform using a multivariate function $$f$$, you would adjust by

$\log\det J_f(y)$

which is the log of the determinant of the Jacobian — some texts simply refer to this as "the Jacobian". (Strictly speaking, the adjustment uses the absolute value of the determinant; for monotone increasing transformations like the one above, the determinant is positive and the absolute value can be dropped.)

The above is well-known, but the following two tricks are worth mentioning.

## Chaining transformations

Suppose that you are changing a variable by using a chain of two functions $$f \circ g$$. Then

$\log\det J_{f \circ g}(y) = \log \bigl(\det J_f(g(y)) \cdot \det J_g(y)\bigr) \\\\ = \log\det J_f(g(y)) + \log\det J_g(y)$

which means that you can simply add the log determinants of the individual Jacobians, evaluated at the appropriate points: $$J_g$$ at $$y$$, and $$J_f$$ at $$g(y)$$.

This is very useful when $$f \circ g$$ is complicated and $$J_{f\circ g}$$ is tedious to derive, or if you want to use multiple $$f$$s or $$g$$s and economize on the algebra. From the above, it is also easy to see that this generalizes to arbitrarily long chains of functions $$f_1 \circ f_2 \circ \dots$$.
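A minimal scalar sketch of the chaining identity (my own illustrative choice of functions, not from the post): take $$g(y) = e^y$$ and $$f(u) = \log(1+u)$$, so that $$f \circ g$$ is the softplus; the sum of the individual log Jacobians matches the log Jacobian of the composite.

```python
import math

# g: y -> exp(y), f: u -> log(1 + u); the composite f(g(y)) is the softplus.
def log_jac_g(y):
    return y                      # log |g'(y)| = log exp(y) = y

def log_jac_f(u):
    return -math.log(1 + u)       # log |f'(u)| = log (1 / (1 + u))

def log_jac_composite(y):
    # d/dy log(1 + exp(y)) = exp(y) / (1 + exp(y)), so its log is:
    return y - math.log(1 + math.exp(y))

y = 0.7
# evaluate J_g at y, and J_f at g(y), then add
chained = log_jac_f(math.exp(y)) + log_jac_g(y)
assert abs(chained - log_jac_composite(y)) < 1e-12
```

Note that $$J_f$$ is evaluated at $$g(y)$$, not at $$y$$; getting this wrong is the typical bug when chaining adjustments by hand.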

This trick turned out to be very useful when I was fitting a model in which a transformation was common to both equilibrium concepts I was using (a noncooperative game and a social planner), so I could save on code. Of course, since #2224 is WIP, I had to copy-paste the code, but it still saved quite a bit of work.

## Transforming a subset of variables

Suppose $$x \in \mathbb{R}^m$$ and $$y \in \mathbb{R}^n$$ are vectors, and you are interested in transforming to

$z = f(x,y)$

where $$z \in \mathbb{R}^m$$ has the same dimension as $$x$$. It is useful to think about this transformation as

$g(x,y) = [f(x,y), y]^\top$

where $$g : \mathbb{R}^{m+n} \to \mathbb{R}^{m+n}$$. Since $$y$$ is mapped to itself,

$J_g = \begin{bmatrix} J_{f,x} & J_{f,y} \\\\ 0 & I \end{bmatrix}$

has a block structure, where

$J_{f,x} = \frac{\partial f(x,y)}{\partial x}$

and similarly for $$J_{f,y}$$. Since $$J_g$$ is block upper triangular, its determinant is the product of the determinants of its diagonal blocks, so $$J_{f,y}$$ can be safely ignored; moreover $$\log\det I = 0$$, so

$\log\det J_g = \log\det J_{f,x}$
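A quick numerical sanity check of this shortcut, as a Python sketch with a hypothetical $$f(x, y) = xy$$ and $$m = n = 1$$ (my own toy choice): the determinant of the full finite-difference Jacobian of $$g$$ agrees with $$\det J_{f,x}$$ alone, which here is just $$\partial(xy)/\partial x = y$$.

```python
def g(x, y):
    # keep y, transform x to z = f(x, y) = x * y (hypothetical f for illustration)
    return (x * y, y)

def num_jacobian(x, y, h=1e-6):
    """2x2 Jacobian of g at (x, y) by central differences."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j, (dx, dy) in enumerate([(h, 0.0), (0.0, h)]):
        plus = g(x + dx, y + dy)
        minus = g(x - dx, y - dy)
        for i in range(2):
            J[i][j] = (plus[i] - minus[i]) / (2 * h)
    return J

x, y = 1.5, 2.0
J = num_jacobian(x, y)
det_full = J[0][0] * J[1][1] - J[0][1] * J[1][0]
# the shortcut: only the x-block matters, and J_{f,x} = d(x*y)/dx = y
assert abs(det_full - y) < 1e-6
```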
