Probabilistic Principal Component Analysis (PPCA)

Probabilistic Principal Component Analysis (PPCA) is a probabilistic formulation of PCA. It assumes a linear relationship between observed variables and latent variables. As a generative model, PPCA can generate samples and fill in missing values; as a probabilistic model, it also provides the uncertainty of its predictions.

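Assume we have a set of data points $X = \{x_n\}_{n=1}^{N}$ where $x_n \in \mathbb{R}^D$. We assume a linear relationship between observed variables and latent variables with noise as

$$x = Wz + \mu + \epsilon$$

where the latent variable $z \in \mathbb{R}^K$ has a prior $p(z) = \mathcal{N}(z \mid 0, I)$, where $W \in \mathbb{R}^{D \times K}$ and $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$. Since this is a linear Gaussian model, the marginal likelihood of $x$ with parameters $\theta = \{W, \mu, \sigma^2\}$ can easily be derived as follows.

$$p(x \mid \theta) = \mathcal{N}(x \mid \mu, WW^\top + \sigma^2 I)$$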
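The generative process can be illustrated with a short NumPy sketch. It assumes the usual PPCA notation (loading matrix `W`, mean `mu`, isotropic noise variance `sigma2`). The function name `sample_ppca` and all numeric values are hypothetical choices for illustration only. The script draws samples and compares their empirical covariance with the marginal covariance `W W^T + sigma2 I`.

```python
import numpy as np

def sample_ppca(W, mu, sigma2, n_samples, rng):
    """Draw x = W z + mu + eps with z ~ N(0, I) and eps ~ N(0, sigma2 * I)."""
    D, K = W.shape
    z = rng.standard_normal((n_samples, K))                      # latent variables
    eps = np.sqrt(sigma2) * rng.standard_normal((n_samples, D))  # isotropic noise
    x = z @ W.T + mu + eps                                       # observed variables
    return x, z

# Illustrative parameters (assumed values, not taken from the post): D = 3, K = 2
rng = np.random.default_rng(0)
W = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
mu = np.array([1.0, -1.0, 0.5])
sigma2 = 0.1

X, Z = sample_ppca(W, mu, sigma2, 200_000, rng)

C_model = W @ W.T + sigma2 * np.eye(3)   # covariance of the marginal p(x)
C_empirical = np.cov(X, rowvar=False)    # covariance estimated from the samples
print("max abs difference:", np.abs(C_model - C_empirical).max())
```

The printed difference should shrink as the number of samples grows, which is a quick sanity check that marginalizing out the latent variable gives a Gaussian with that covariance.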

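Then, with the i.i.d. assumption, the log-likelihood of $X$ is

$$\ln p(X \mid \theta) = \sum_{n=1}^{N} \ln p(x_n \mid \theta) = -\frac{N}{2}\left\{ D \ln(2\pi) + \ln |C| + \mathrm{tr}\left(C^{-1} S\right) \right\}$$

where $C = WW^\top + \sigma^2 I$ and $S = \frac{1}{N}\sum_{n=1}^{N} (x_n - \mu)(x_n - \mu)^\top$. We can obtain closed-form solutions for $W$, $\mu$ and $\sigma^2$ by maximizing the log-likelihood above after a tedious calculation. The maximum likelihood (ML) solutions are as follows.

$$\begin{aligned}
\mu_{ML} &= \frac{1}{N}\sum_{n=1}^{N} x_n \\
W_{ML} &= U_K \left(\Lambda_K - \sigma^2_{ML} I\right)^{1/2} R \\
\sigma^2_{ML} &= \frac{1}{D-K}\sum_{i=K+1}^{D} \lambda_i
\end{aligned}$$

where $U_K$ is a $D \times K$ matrix composed of the $K$ principal eigenvectors of $S$ (evaluated at $\mu_{ML}$) with corresponding eigenvalues in a diagonal matrix $\Lambda_K$, $\lambda_{K+1}, \dots, \lambda_D$ are the remaining eigenvalues of $S$, and $R$ is an arbitrary orthogonal rotation matrix.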
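A minimal NumPy sketch of the closed-form ML fit follows. It assumes the standard PPCA result: the mean is the sample mean, the noise variance is the average of the discarded eigenvalues of the sample covariance, and the loading matrix is built from the top eigenvectors. The function name `fit_ppca`, the choice of the identity for the arbitrary rotation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_ppca(X, K):
    """Closed-form ML estimates (mu, W, sigma2) for PPCA with K latent dims, K < D."""
    N, D = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / N                          # sample covariance (1/N convention)
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[K:].mean()                # average of the D - K discarded eigenvalues
    # Arbitrary rotation taken as the identity (an assumption; any orthogonal R works).
    # np.maximum only guards against tiny negative values from round-off.
    scale = np.sqrt(np.maximum(eigvals[:K] - sigma2, 0.0))
    W = eigvecs[:, :K] * scale                 # scales each eigenvector column
    return mu, W, sigma2

# Synthetic data from a known model (illustrative values): D = 5, K = 2
rng = np.random.default_rng(1)
W_true = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -1.0], [-1.0, 0.5]])
mu_true = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
sigma2_true = 0.1
N = 100_000
Z = rng.standard_normal((N, 2))
X = Z @ W_true.T + mu_true + np.sqrt(sigma2_true) * rng.standard_normal((N, 5))

mu_hat, W_hat, sigma2_hat = fit_ppca(X, K=2)
C_true = W_true @ W_true.T + sigma2_true * np.eye(5)
C_hat = W_hat @ W_hat.T + sigma2_hat * np.eye(5)
print("sigma2 estimate:", sigma2_hat)
print("max abs covariance error:", np.abs(C_hat - C_true).max())
```

Because the loading matrix is identified only up to a rotation, the sketch compares the implied covariances rather than the loading matrices themselves.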

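In order to make predictions of $z$ for visualization, we can derive the probability distribution of $z$ given $x$ by Bayes’ rule.

$$p(z \mid x) = \mathcal{N}\left(z \mid M^{-1} W^\top (x - \mu),\ \sigma^2 M^{-1}\right)$$

where $M = W^\top W + \sigma^2 I$ is a $K \times K$ matrix.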
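The latent posterior can be sketched in NumPy as below, assuming the standard PPCA result that it is Gaussian with mean `M^{-1} W^T (x - mu)` and covariance `sigma2 * M^{-1}`, where `M = W^T W + sigma2 * I`. The function name `ppca_posterior` and the numeric values are hypothetical. As a consistency check, the result is compared with the generic Gaussian conditioning formulas written in terms of `C = W W^T + sigma2 * I`.

```python
import numpy as np

def ppca_posterior(X, W, mu, sigma2):
    """Return the posterior means (one row per data point) and the shared covariance."""
    K = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(K)                  # K x K matrix
    # Solve M a = W^T (x - mu) for all points at once instead of inverting M for the mean
    mean = np.linalg.solve(M, W.T @ (X - mu).T).T     # shape (N, K)
    cov = sigma2 * np.linalg.inv(M)                   # same for every data point
    return mean, cov

# Illustrative parameters (assumed values): D = 3, K = 2
W = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
mu = np.array([1.0, -1.0, 0.5])
sigma2 = 0.1
X = np.array([[3.0, 0.0, 2.0],
              [0.0, -2.0, -1.0]])

post_mean, post_cov = ppca_posterior(X, W, mu, sigma2)

# Cross-check with Gaussian conditioning on the joint of (z, x)
C = W @ W.T + sigma2 * np.eye(3)
ref_mean = (W.T @ np.linalg.solve(C, (X - mu).T)).T
ref_cov = np.eye(2) - W.T @ np.linalg.solve(C, W)
print(np.allclose(post_mean, ref_mean), np.allclose(post_cov, ref_cov))
```

The posterior means give the low-dimensional coordinates typically plotted for visualization, while the covariance quantifies how uncertain each projection is.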
