Probabilistic Principal Component Analysis (PPCA)


Probabilistic Principal Component Analysis (PPCA) is a probabilistic formulation of PCA. It assumes a linear relationship between the observed variables and the latent variables. As a generative model, PPCA can generate samples and fill in missing values; as a probabilistic model, it can also quantify the uncertainty of its predictions.

Assume we have a set of data points $X=\{x_1, \cdots, x_N\}$ where $x_n \in \mathbb{R}^d$. We model each observation as a linear function of a latent variable plus noise:

$x_n = Wt_n + \mu + \epsilon$

where the latent variable $t_n \in \mathbb{R}^q$ has the prior $t_n \sim \mathcal{N}(0, \mathcal{I})$, the noise is $\epsilon \sim \mathcal{N}(0, \sigma^2\mathcal{I})$, and $W$ is a $d \times q$ matrix. Since this is a linear Gaussian model, the marginal distribution of each $x_n$ can easily be derived by integrating out $t_n$:

$p(x_n) = \mathcal{N}(\mu, WW^T+\sigma^2\mathcal{I})$
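The generative process and its marginal covariance can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `sample_ppca` and the toy values of `W`, `mu`, and `sigma2` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_ppca(W, mu, sigma2, n_samples, rng):
    """Draw samples x = W t + mu + eps from the PPCA generative model."""
    d, q = W.shape
    t = rng.standard_normal((n_samples, q))                      # t_n ~ N(0, I_q)
    eps = np.sqrt(sigma2) * rng.standard_normal((n_samples, d))  # eps ~ N(0, sigma^2 I_d)
    x = t @ W.T + mu + eps
    return x, t

# Toy parameters (assumed for illustration): d = 3 observed dims, q = 2 latent dims.
rng = np.random.default_rng(0)
W = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
mu = np.array([1.0, -1.0, 0.5])
sigma2 = 0.1

X, T = sample_ppca(W, mu, sigma2, 200_000, rng)
model_cov = W @ W.T + sigma2 * np.eye(3)   # covariance of the marginal p(x_n)
empirical_cov = np.cov(X, rowvar=False)    # should be close to model_cov
```

With enough samples, `empirical_cov` should approach `model_cov` and the sample mean should approach `mu`, which is what the marginal distribution predicts.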

Then, under the i.i.d. assumption, the log-likelihood of $X$ is

$\begin{align*} \mathcal{L}(W, \mu, \sigma^2) &= \log p(X) = \sum_{n=1}^N \log p(x_n) \\ &= -\frac{N}{2}\Big\{d\log(2\pi)+\log|\Sigma|+\mathrm{tr}(\Sigma^{-1}S)\Big\} \end{align*}$

where $\Sigma = WW^T + \sigma^2\mathcal{I}$ and $S=\frac{1}{N}\sum_{n=1}^N(x_n-\bar{x})(x_n-\bar{x})^T$ is the sample covariance matrix. The second line holds with $\mu$ set to the sample mean $\bar{x}$, which is its maximum likelihood estimate. Maximizing this log-likelihood gives closed-form solutions for $W$, $\mu$, and $\sigma^2$, although the derivation is tedious. The maximum likelihood (ML) solutions are as follows.
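As a sanity check on the trace form of the log-likelihood, the sketch below evaluates it in two ways: through $\mathrm{tr}(\Sigma^{-1}S)$ and as a direct sum of per-point Gaussian log densities. It assumes NumPy and fixes $\mu$ to the sample mean; the function names and the random test data are hypothetical.

```python
import numpy as np

def ppca_log_likelihood(X, W, sigma2):
    """Closed-form log-likelihood with mu fixed to the sample mean."""
    N, d = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / N                                  # sample covariance (1/N convention)
    Sigma = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(Sigma)               # numerically stable log|Sigma|
    trace_term = np.trace(np.linalg.solve(Sigma, S))   # tr(Sigma^{-1} S)
    return -0.5 * N * (d * np.log(2 * np.pi) + logdet + trace_term)

def direct_log_likelihood(X, W, sigma2):
    """Sum of per-point Gaussian log densities, for comparison."""
    N, d = X.shape
    Xc = X - X.mean(axis=0)
    Sigma = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(Sigma)
    # Squared Mahalanobis distance of each centered point.
    maha = np.einsum("ni,ni->n", Xc, np.linalg.solve(Sigma, Xc.T).T)
    return np.sum(-0.5 * (d * np.log(2 * np.pi) + logdet + maha))

# Arbitrary test data and parameters (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))
W = rng.standard_normal((4, 2))
closed = ppca_log_likelihood(X, W, 0.5)
direct = direct_log_likelihood(X, W, 0.5)
```

The two values agree up to floating-point error because $\sum_n (x_n-\bar{x})^T\Sigma^{-1}(x_n-\bar{x}) = N\,\mathrm{tr}(\Sigma^{-1}S)$.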

$\begin{align*} W_{ML} &= U_q(\Lambda_q - \sigma^2\mathcal{I})^{1/2}R \\ \mu_{ML} &= \frac{1}{N} \sum_{n=1}^N x_n \\ \sigma^2_{ML} &= \frac{1}{d-q} \sum_{j=q+1}^d \lambda_j \end{align*}$

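where $U_q$ is a $d \times q$ matrix whose columns are the $q$ principal eigenvectors of $S$, $\Lambda_q$ is the $q \times q$ diagonal matrix of the corresponding eigenvalues $\lambda_1, \cdots, \lambda_q$, and $R$ is an arbitrary $q \times q$ orthogonal rotation matrix. The eigenvalues are sorted in decreasing order, so $\sigma^2_{ML}$ is the average of the $d-q$ discarded eigenvalues, and $\sigma^2$ in $W_{ML}$ is set to $\sigma^2_{ML}$.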
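The closed-form solutions translate almost directly into an eigendecomposition of $S$. The following is a minimal sketch, assuming NumPy, $q < d$, and the arbitrary rotation fixed to $R = \mathcal{I}$; the function name `fit_ppca` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form ML estimates for PPCA, taking R = I (requires q < d)."""
    N, d = X.shape
    mu = X.mean(axis=0)                        # mu_ML: sample mean
    Xc = X - mu
    S = Xc.T @ Xc / N                          # sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # re-sort in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                # average of the d - q discarded eigenvalues
    U_q = eigvecs[:, :q]
    # Scale column j of U_q by sqrt(lambda_j - sigma2); clip guards against round-off.
    W = U_q * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return W, mu, sigma2

# Synthetic data from a known model (assumed for illustration): d = 4, q = 2.
rng = np.random.default_rng(0)
W_true = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]])
T = rng.standard_normal((100_000, 2))
X = T @ W_true.T + np.sqrt(0.1) * rng.standard_normal((100_000, 4))

W_ml, mu_ml, sigma2_ml = fit_ppca(X, q=2)
```

Because of the rotation ambiguity, `W_ml` itself need not match `W_true`, but `W_ml @ W_ml.T + sigma2_ml * np.eye(4)` should be close to the true covariance `W_true @ W_true.T + 0.1 * np.eye(4)`.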

In order to make predictions of $t_n$, for example for visualization, we can derive the posterior distribution of $t_n$ given $x_n$ by Bayes’ rule.

$p(t_n|x_n) = \mathcal{N}(M^{-1}W^T(x_n-\mu), \sigma^2M^{-1})$

where $M = W^TW + \sigma^2\mathcal{I}$ is a $q \times q$ matrix.
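A minimal sketch of the latent posterior, assuming NumPy and the standard PPCA form with the $q \times q$ matrix $M = W^TW + \sigma^2\mathcal{I}$, so that the posterior mean is $M^{-1}W^T(x_n-\mu)$ and the covariance is $\sigma^2M^{-1}$. The function name `ppca_posterior` and the random inputs are hypothetical.

```python
import numpy as np

def ppca_posterior(X, W, mu, sigma2):
    """Posterior mean (one row per data point) and shared covariance of t given x."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)                 # q x q
    mean = np.linalg.solve(M, W.T @ (X - mu).T).T    # rows are M^{-1} W^T (x_n - mu)
    cov = sigma2 * np.linalg.inv(M)                  # identical for every data point
    return mean, cov

# Arbitrary parameters and data (assumed for illustration): d = 5, q = 2.
rng = np.random.default_rng(0)
W = rng.standard_normal((5, 2))
mu = rng.standard_normal(5)
X = rng.standard_normal((10, 5))
sigma2 = 0.3

post_mean, post_cov = ppca_posterior(X, W, mu, sigma2)

# Cross-check against generic Gaussian conditioning on the joint of (t, x).
Sigma = W @ W.T + sigma2 * np.eye(5)
ref_mean = (W.T @ np.linalg.solve(Sigma, (X - mu).T)).T
ref_cov = np.eye(2) - W.T @ np.linalg.solve(Sigma, W)
```

The rows of `post_mean` are the low-dimensional coordinates one would plot for visualization, and `post_cov` gives the uncertainty around each of them. Working with $M$ only requires inverting a $q \times q$ matrix rather than the $d \times d$ matrix $\Sigma$.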


Ingyo Chung
