Matrix Calculus





Notation

Derivatives

In the main part of this page we express results in terms of differentials rather than derivatives for two reasons: they avoid notational disagreements and they cope easily with the complex case. In most cases, however, the differentials have been written in the form dY: = dY/dX dX: so that the corresponding derivative may be easily extracted.

Derivatives with respect to a real matrix

If X is p#q and Y is m#n, then dY: = dY/dX dX: where the derivative dY/dX is a large mn#pq matrix. If X and/or Y are column vectors or scalars, then the vectorization operator : has no effect and may be omitted. dY/dX is also called the Jacobian Matrix of Y: with respect to X: and, when dY/dX is square (mn = pq), det(dY/dX) is the corresponding Jacobian. The Jacobian occurs when changing variables in an integration: Integral(f(Y)dY:)=Integral(f(Y(X)) |det(dY/dX)| dX:).
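As a concrete check of dY: = dY/dX dX:, the sketch below takes the linear map Y = A X B, for which the standard identity (A X B): = KRON(BT, A) X: gives dY/dX as the Kronecker product of BT and A. It builds the mn#pq matrix column by column from finite differences and compares it with that product. The helper names (vec, unvec, kron, numerical_jacobian), the test matrices and the step size h are illustrative assumptions, and the : operator is assumed to stack columns. Plain Python lists are used only to keep the example self-contained.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def vec(M):
    """The ':' operator, assumed here to stack the columns of M."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]

def unvec(v, rows, cols):
    """Inverse of vec for a rows#cols matrix."""
    return [[v[j * rows + i] for j in range(cols)] for i in range(rows)]

def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def numerical_jacobian(func, X, h=1e-6):
    """Finite-difference estimate of dY/dX for Y = func(X)."""
    p, q = len(X), len(X[0])
    x0 = vec(X)
    y0 = vec(func(X))
    columns = []
    for k in range(p * q):
        x = list(x0)
        x[k] += h  # perturb one element of X:
        y = vec(func(unvec(x, p, q)))
        columns.append([(yi - y0i) / h for yi, y0i in zip(y, y0)])
    return transpose(columns)  # rows follow Y:, columns follow X:

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]            # 3#2
B = [[1.0, 0.0, 2.0, -1.0], [0.5, 1.0, 0.0, 3.0]]   # 2#4
X = [[0.3, -1.2], [2.0, 0.7]]                       # 2#2, so Y = A X B is 3#4

J_num = numerical_jacobian(lambda M: matmul(matmul(A, M), B), X)
J_ref = kron(transpose(B), A)                       # 12#4
max_err = max(abs(a - b) for ra, rb in zip(J_num, J_ref) for a, b in zip(ra, rb))
print(len(J_num), len(J_num[0]), max_err)
```

Because Y is linear in X, the finite-difference estimate agrees with the Kronecker product up to rounding error.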

Although they do not generalise so well, other authors use alternative notations for the cases when X and Y are both vectors or when one is a scalar. In particular:

Derivatives with respect to a complex matrix

If X is complex then dY: = dY/dX dX: holds if and only if Y(X) is an analytic function, which implies in particular that Y(X) does not depend on XC or XH.

Even for non-analytic functions we can write uniquely dY: = dY/dX dX: + dY/dXC dXC: provided that Y is analytic with respect to X and XC individually (or equivalently with respect to XR and XI individually). dY/dX is the Generalized Complex Derivative and dY/dXC is the Complex Conjugate Derivative [R.3, R.8]. We have the following relationships:
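A small numerical sketch may help make the two derivatives concrete for a scalar function of a scalar complex x. It assumes the standard identities df/dx = (df/dxR - j df/dxI)/2 and df/dxC = (df/dxR + j df/dxI)/2. The function names, the two test functions and the step sizes are illustrative assumptions.

```python
def wirtinger(func, x, h=1e-6):
    """Estimate (df/dx, df/dxC) from central differences along xR and xI."""
    d_re = (func(x + h) - func(x - h)) / (2 * h)            # df/dxR
    d_im = (func(x + 1j * h) - func(x - 1j * h)) / (2 * h)  # df/dxI
    return 0.5 * (d_re - 1j * d_im), 0.5 * (d_re + 1j * d_im)

def analytic(x):
    return x ** 2                    # depends on x only

def non_analytic(x):
    return x ** 2 * x.conjugate()    # depends on both x and xC

x0 = 0.8 - 0.3j

dx_a, dxc_a = wirtinger(analytic, x0)       # compare with 2*x0 and 0
dx_n, dxc_n = wirtinger(non_analytic, x0)   # compare with 2*x0*xC0 and x0**2

# First-order check of df = df/dx dx + df/dxC dxC for a small complex step
step = 1e-5 * (1 + 2j)
predicted = dx_n * step + dxc_n * step.conjugate()
actual = non_analytic(x0 + step) - non_analytic(x0)
print(abs(dx_a - 2 * x0), abs(dxc_a), abs(actual - predicted))
```

For the analytic function the conjugate derivative vanishes to within rounding error, while the non-analytic function needs both terms to reproduce the change in f.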

Complex Gradient Vector

If f(x) is a real function of a complex vector then df/dxC= (df/dx)C and we can define grad(f(x)) = 2 (df/dx)C =df/dxR+j df/dxI as the Complex Gradient Vector [R.8] with the following properties:
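The definition grad(f(x)) = df/dxR + j df/dxI can be evaluated numerically, as in the sketch below for the illustrative real function f(x) = |aH x - b|^2, whose Complex Gradient Vector works out to 2 a (aH x - b). The vectors a and x, the scalar b, the step size mu and the function names are assumptions chosen for the example. The final lines take one small step against the gradient, relying on the standard result that the gradient points in the direction of steepest ascent.

```python
def residual(x, a, b):
    """e = aH x - b for complex lists a, x and a complex scalar b."""
    return sum(ai.conjugate() * xi for ai, xi in zip(a, x)) - b

def f(x, a, b):
    """Real-valued function of the complex vector x."""
    return abs(residual(x, a, b)) ** 2

def complex_gradient(func, x, h=1e-6):
    """grad_i = df/dxR_i + j df/dxI_i, estimated by central differences."""
    grad = []
    for i in range(len(x)):
        def shifted(delta):
            y = list(x)
            y[i] += delta
            return func(y)
        d_re = (shifted(h) - shifted(-h)) / (2 * h)
        d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        grad.append(d_re + 1j * d_im)
    return grad

a = [1 + 2j, -0.5j]
b = 0.3 - 1j
x = [0.2 + 0.1j, -1 + 0.4j]

g_num = complex_gradient(lambda v: f(v, a, b), x)
e = residual(x, a, b)
g_ref = [2 * ai * e for ai in a]               # closed form 2 a (aH x - b)
grad_err = max(abs(u - v) for u, v in zip(g_num, g_ref))

mu = 0.05                                      # hypothetical step size
x_new = [xi - mu * gi for xi, gi in zip(x, g_ref)]
print(grad_err, f(x, a, b), f(x_new, a, b))    # f should decrease
```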

Basic Properties

Differentials of Linear Functions

Differentials of Quadratic Products

Differentials of Cubic Products

Differentials of Inverses

Differentials of Trace

Note: matrix dimensions must result in an n#n argument for tr().

Differentials of Determinant

Note: matrix dimensions must result in an n#n argument for det(). Some of the expressions below involve inverses: these forms apply only if the quantity being inverted is square and non-singular; alternative forms involving the adjoint, ADJ(), do not have the non-singular requirement.
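The role of the non-singular requirement can be seen in a 2#2 sketch of the standard result d(det(X)) = tr(ADJ(X) dX) = det(X) tr(X^-1 dX) (Jacobi's formula). The adjoint form is evaluated at a singular X, where the inverse form is unavailable, and both forms are compared at a non-singular X. The helper names and matrices are illustrative assumptions.

```python
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def adj2(M):
    """Adjoint (adjugate) of a 2#2 matrix; defined even when M is singular."""
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

def trace_of_product(A, B):
    """tr(A B) for 2#2 matrices without forming the product."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

dX = [[1e-6, -2e-6], [3e-6, 0.5e-6]]

# Singular X: only the adjoint form applies
X_sing = [[1.0, 2.0], [2.0, 4.0]]
predicted = trace_of_product(adj2(X_sing), dX)
actual = det2(add(X_sing, dX)) - det2(X_sing)

# Non-singular X: the inverse form gives the same differential
X_reg = [[2.0, 1.0], [1.0, 3.0]]
d = det2(X_reg)
X_inv = [[v / d for v in row] for row in adj2(X_reg)]
via_inverse = d * trace_of_product(X_inv, dX)
via_adjoint = trace_of_product(adj2(X_reg), dX)
print(predicted, actual, via_inverse, via_adjoint)
```

For a 2#2 matrix the difference between actual and predicted is exactly det(dX), which is second order in dX.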

Jacobian

dY/dX is called the Jacobian Matrix of Y: with respect to X: and JX(Y)=det(dY/dX) is the corresponding Jacobian. The Jacobian occurs when changing variables in an integration: Integral(f(Y)dY:)=Integral(f(Y(X)) |det(dY/dX)| dX:).
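A familiar instance is the change from Cartesian to polar coordinates, sketched below under the assumption Y = [r cos t; r sin t] with X = [r; t], so that det(dY/dX) = r. The integrand f(Y) = y1^2 + y2^2 over the unit disc, the grid size and the function names are illustrative choices. The absolute value of the determinant is taken, and the midpoint rule is used only for simplicity.

```python
import math

def jacobian_matrix(r, t):
    """dY/dX for Y = [r cos t, r sin t] and X = [r, t]."""
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def f(y1, y2):
    return y1 ** 2 + y2 ** 2

def integrate_over_disc(func, n=200):
    """Midpoint rule in (r, t) for the integral of func over the unit disc."""
    dr, dt = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for k in range(n):
            t = (k + 0.5) * dt
            y1, y2 = r * math.cos(t), r * math.sin(t)
            jac = abs(det2(jacobian_matrix(r, t)))   # equals r here
            total += func(y1, y2) * jac * dr * dt
    return total

approx = integrate_over_disc(f)
exact = math.pi / 2   # integral of r^3 over r in [0, 1] and t in [0, 2*pi]
print(approx, exact)
```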

Hessian matrix

If f is a real function of x then the Hermitian matrix Hx f = d/dx (df/dx)H is the Hessian matrix of f(x). A value of x for which grad f(x) = 0 corresponds to a minimum, maximum or saddle point according to whether Hx f is positive definite, negative definite or indefinite; if Hx f is only semi-definite, this test is inconclusive.
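For real x the classification can be sketched numerically. The example below assumes a real two-element x, estimates the Hessian by central differences, and uses the closed-form eigenvalues of a symmetric 2#2 matrix to test definiteness. The test functions, the tolerance and the function names are illustrative assumptions; a Hessian that is neither definite nor indefinite is reported as inconclusive.

```python
import math

def hessian(func, x, h=1e-3):
    """Central-difference estimate of the Hessian of func at the real vector x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def at(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return func(y)
            H[i][j] = (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4 * h * h)
    return H

def classify(H, tol=1e-6):
    """Classify a stationary point from a symmetric 2#2 Hessian."""
    mean = 0.5 * (H[0][0] + H[1][1])
    radius = math.hypot(0.5 * (H[0][0] - H[1][1]), H[0][1])
    lo, hi = mean - radius, mean + radius   # the two eigenvalues
    if lo > tol:
        return "minimum"
    if hi < -tol:
        return "maximum"
    if lo < -tol and hi > tol:
        return "saddle point"
    return "inconclusive"

def bowl(x):
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2

def saddle(x):
    return x[0] ** 2 - x[1] ** 2

def dome(x):
    return -(x[0] ** 2 + x[1] ** 2)

origin = [0.0, 0.0]   # grad f = 0 here for all three functions
results = [classify(hessian(g, origin)) for g in (bowl, saddle, dome)]
print(results)
```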


This page is part of The Matrix Reference Manual. Copyright © 1998-2005 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: calculus.html,v 1.14 2005/08/17 10:42:09 dmb Exp $