Just a quick cheatsheet on derivatives of scalars and vectors with respect to a vector. This is borrowed from the Wikipedia page on Matrix Calculus.

In print, the following notation is usually used:

A : Matrix (capital and bold)
b : Vector (small and bold)
c : Scalar (small and not bold)

The rules for derivatives of a scalar by a vector

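The original screenshots are no longer available, so here is a sketch of the identities most commonly used, taken from the Wikipedia Matrix Calculus page (assuming denominator layout, where the derivative of a scalar with respect to a column vector is itself a column vector):

\[
\frac{\partial (\mathbf{a}^\mathsf{T}\mathbf{x})}{\partial \mathbf{x}}
= \frac{\partial (\mathbf{x}^\mathsf{T}\mathbf{a})}{\partial \mathbf{x}}
= \mathbf{a},
\qquad
\frac{\partial (\mathbf{x}^\mathsf{T}\mathbf{x})}{\partial \mathbf{x}} = 2\mathbf{x},
\qquad
\frac{\partial (\mathbf{x}^\mathsf{T}\mathbf{A}\,\mathbf{x})}{\partial \mathbf{x}}
= (\mathbf{A} + \mathbf{A}^\mathsf{T})\,\mathbf{x}
\]

Note that when \(\mathbf{A}\) is symmetric, the last rule reduces to \(2\mathbf{A}\mathbf{x}\).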

The rules for derivatives of a vector by a vector

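Again, the screenshot is unavailable; as a sketch, the standard vector-by-vector identities in denominator layout are:

\[
\frac{\partial \mathbf{x}}{\partial \mathbf{x}} = \mathbf{I},
\qquad
\frac{\partial (\mathbf{A}\mathbf{x})}{\partial \mathbf{x}} = \mathbf{A}^\mathsf{T},
\qquad
\frac{\partial (\mathbf{x}^\mathsf{T}\mathbf{A})}{\partial \mathbf{x}} = \mathbf{A}
\]

(In numerator layout, these last two swap: \(\partial(\mathbf{A}\mathbf{x})/\partial\mathbf{x} = \mathbf{A}\).)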

Using all these rules, I have derived the commonly used least squares equation; the derivation is sketched below. The error function is a scalar, and the optimization variable (the unknown) is a vector. Since we are trying to minimize the error, we take the gradient (the vector derivative) of the error with respect to the optimization variable and set it equal to the zero vector. The value that makes the gradient zero is also the optimal (here, minimum) value for the error.
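The original photo of the derivation is not available, so here is a sketch, assuming the usual setup where we solve \(\mathbf{A}\mathbf{x} \approx \mathbf{b}\) by minimizing the squared error \(E(\mathbf{x}) = \lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert^2\). Expanding the error and using the scalar-by-vector rules above:

\[
E(\mathbf{x}) = (\mathbf{A}\mathbf{x} - \mathbf{b})^\mathsf{T}(\mathbf{A}\mathbf{x} - \mathbf{b})
= \mathbf{x}^\mathsf{T}\mathbf{A}^\mathsf{T}\mathbf{A}\,\mathbf{x}
- 2\,\mathbf{b}^\mathsf{T}\mathbf{A}\,\mathbf{x}
+ \mathbf{b}^\mathsf{T}\mathbf{b}
\]

\[
\frac{\partial E}{\partial \mathbf{x}}
= 2\,\mathbf{A}^\mathsf{T}\mathbf{A}\,\mathbf{x} - 2\,\mathbf{A}^\mathsf{T}\mathbf{b}
= \mathbf{0}
\quad\Longrightarrow\quad
\mathbf{A}^\mathsf{T}\mathbf{A}\,\mathbf{x} = \mathbf{A}^\mathsf{T}\mathbf{b}
\quad\Longrightarrow\quad
\mathbf{x} = (\mathbf{A}^\mathsf{T}\mathbf{A})^{-1}\mathbf{A}^\mathsf{T}\mathbf{b}
\]

The middle expression is the familiar normal equations; the closed-form solution on the right assumes \(\mathbf{A}^\mathsf{T}\mathbf{A}\) is invertible, i.e. \(\mathbf{A}\) has full column rank.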


Hope this helps!
