The Gaussian distribution has the nice property that, under a linear transformation, the resulting distribution is still Gaussian. This property is successfully exploited by the Kalman Filter. In a previous post I explored the details of the Extended Kalman Filter.

Today I am exploring the intuitive meaning of Marginalization vs Conditioning for Gaussian distributions. It is very easy to confuse these two terms. While reading up on Gaussian Processes (a common technique in robotics) I came across an excellent post on distill.pub. Most of the material in this post is borrowed from it (this is essentially a summary of it). I highly recommend reading that article, and especially playing with its interactive animations.

Note that *Marginalization* and *Conditioning* both operate on a joint distribution, but their effects are different.

## Multivariate Gaussian Distribution Definition
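For reference, a joint Gaussian over two blocks of variables $X$ and $Y$ can be written in partitioned form (this is the standard notation, matching the distill.pub article):

$$
\begin{bmatrix} X \\ Y \end{bmatrix}
\sim
\mathcal{N}\left(
\begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix},
\begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix}
\right)
$$

Here $\mu_X, \mu_Y$ are the means of the two blocks, $\Sigma_{XX}, \Sigma_{YY}$ their covariances, and $\Sigma_{XY} = \Sigma_{YX}^\top$ the cross-covariance between them. Both marginalization and conditioning below are stated in terms of these blocks.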

## Marginalization

In plain English, what this means is that we are interested in the probability distribution of $X$ alone. For this we need to consider all possible values of $Y$ and average (integrate) over them. Mathematically this is written as:

$$
p_X(x) = \int_y p_{X,Y}(x, y) \, dy
$$

For jointly Gaussian $X$ and $Y$, **the marginal is obtained by simply picking out the mean and covariance blocks from the joint distribution** corresponding to this partition. Mathematically one would write this as:

$$
X \sim \mathcal{N}(\mu_X, \Sigma_{XX})
$$

It is, however, easy to confuse and overlook. The important point is that the resulting distribution is still a Gaussian distribution, just with reduced dimensionality.
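As a quick sanity check, here is a minimal numpy sketch (with toy numbers I made up for illustration) showing that marginalizing a joint Gaussian really is just slicing out the relevant blocks, with no computation at all:

```python
import numpy as np

# Toy joint Gaussian over 3 dimensions: X = dims [0, 1], Y = dim [2].
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([
    [2.0, 0.3, 0.5],
    [0.3, 1.0, 0.2],
    [0.5, 0.2, 1.5],
])

x_idx = [0, 1]  # indices belonging to X

# Marginalizing out Y: just pick X's mean entries and covariance block.
mu_X = mu[x_idx]
Sigma_XX = Sigma[np.ix_(x_idx, x_idx)]
```

`np.ix_` selects the sub-matrix formed by the given rows and columns, so `Sigma_XX` is the top-left 2x2 block of `Sigma`.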

## Conditioning

In plain English, conditioning asks: what does the distribution of one variable look like when the other variable is observed to take a particular value? The resulting distribution after conditioning is also a Gaussian distribution. The mean and covariance of this new distribution are as follows:

$$
\mu_{X|Y} = \mu_X + \Sigma_{XY}\Sigma_{YY}^{-1}(y - \mu_Y)
$$

$$
\Sigma_{X|Y} = \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}
$$

Unlike marginalization, conditioning actually shifts the mean (toward the observation) and shrinks the covariance.
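The conditioning formulas can be sketched in a few lines of numpy. This uses the same toy 3-dimensional joint as in the marginalization example (X = first two dims, Y = last dim), with a made-up observation of Y:

```python
import numpy as np

# Toy joint Gaussian: X = dims [0, 1], Y = dim [2].
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([
    [2.0, 0.3, 0.5],
    [0.3, 1.0, 0.2],
    [0.5, 0.2, 1.5],
])

# Partition into blocks.
mu_X, mu_Y = mu[:2], mu[2:]
S_XX, S_XY = Sigma[:2, :2], Sigma[:2, 2:]
S_YX, S_YY = Sigma[2:, :2], Sigma[2:, 2:]

y_obs = np.array([3.5])  # hypothetical observed value of Y

# Conditional distribution of X | Y = y_obs.
K = S_XY @ np.linalg.inv(S_YY)        # gain-like term, as in the Kalman update
mu_cond = mu_X + K @ (y_obs - mu_Y)
Sigma_cond = S_XX - K @ S_YX
```

Note that the mean is pulled toward the observation in proportion to the cross-covariance, and `Sigma_cond` is smaller than `S_XX`: observing Y reduces our uncertainty about X. The same algebra appears as the update step of the Kalman Filter.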

Hopefully this post brings this subtle difference to the front. These identities are the foundation stone for a class of methods called Gaussian Processes, which are commonly used in robotics for estimation problems.