The transpose is used for computing (first order) derivatives with respect to the initial condition

Consider a nonlinear function ("model") \(M\) of the initial state \(x^0\) (the state at time 0), giving the state \(x^t\) at time \(t\):
\[ x^t = M(x^0) \]
\(x^0\) and \(x^t\) are vectors with \(I\) components, respectively \(x_i^0\) and \(x_i^t\). \(M\) is then a vector function.
Let \(\mathbf{M}(x^0)\) be the Jacobian matrix of the vector function \(M\), containing its first derivatives with respect to the initial variables:
\[ \mathbf{M}(x^0) \equiv \frac{\partial M}{\partial x^0}(x^0) \]
Since \(M\) is a nonlinear function, its derivatives depend on the initial state, \(\mathbf{M} = \mathbf{M}(x^0)\). This dependence is not indicated hereafter for simplicity. As a matrix, \(\mathbf{M}\) is:
\[ \mathbf{M} = \begin{bmatrix}
\dfrac{\partial M_1}{\partial x_1^0} & \dfrac{\partial M_1}{\partial x_2^0} & \cdots & \dfrac{\partial M_1}{\partial x_I^0} \\
\dfrac{\partial M_2}{\partial x_1^0} & \dfrac{\partial M_2}{\partial x_2^0} & \cdots & \dfrac{\partial M_2}{\partial x_I^0} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial M_I}{\partial x_1^0} & \dfrac{\partial M_I}{\partial x_2^0} & \cdots & \dfrac{\partial M_I}{\partial x_I^0}
\end{bmatrix} \]
The first order variation is then obtained as a row-by-column product: \(\delta x^t = \mathbf{M}\,\delta x^0\), or componentwise
\[ \delta x_i^t = \sum_{j=1}^{I} \frac{\partial M_i}{\partial x_j^0}\,\delta x_j^0 \]
The Jacobian matrix \(\mathbf{M}\) is also called the tangent linear operator: it is applied linearly to variations of the initial state (tangent vectors), and it depends on the initial state because \(M\) is nonlinear.
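As a concrete illustration, the tangent linear step above can be sketched in NumPy. The two-component model M, its Jacobian, and the numbers below are illustrative assumptions, not taken from the text:

```python
import numpy as np

# A toy nonlinear "model" M: R^2 -> R^2 (illustrative assumption)
def M(x):
    return np.array([x[0] * x[1], np.sin(x[0]) + x[1] ** 2])

# Its Jacobian matrix, evaluated at x0 (entries dM_i / dx_j^0)
def jacobian(x):
    return np.array([
        [x[1],          x[0]],
        [np.cos(x[0]),  2.0 * x[1]],
    ])

x0  = np.array([0.8, 1.5])
dx0 = 1e-6 * np.array([1.0, -2.0])   # small perturbation of the initial state

# Tangent linear step: delta_xt = M' delta_x0 (row-by-column product)
dxt = jacobian(x0) @ dx0

# To first order this matches the difference of two nonlinear runs
diff = M(x0 + dx0) - M(x0)
print(np.allclose(dxt, diff, rtol=1e-4))
```

The comparison with `M(x0 + dx0) - M(x0)` is the usual finite-difference check of a tangent linear code.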
Now consider a scalar function \(J\) of the state at time \(t\):
\[ J = J(x^t) \]
Its first variation is obtained by multiplying the row vector obtained by transposing its gradient with the vector \(\delta x^t\), which can be expressed as above:
\[ \delta J = \left(\frac{\partial J}{\partial x^t}\right)^{\mathrm{T}} \delta x^t = \left(\frac{\partial J}{\partial x^t}\right)^{\mathrm{T}} \mathbf{M}\,\delta x^0 \]
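The first equality, \(\delta J = (\partial J/\partial x^t)^{\mathrm{T}}\delta x^t\), can be checked numerically; here is a minimal sketch with an assumed quadratic \(J\) (illustrative, not from the text):

```python
import numpy as np

# Illustrative scalar function of the final state: J(xt) = 0.5 * |xt|^2
def J(xt):
    return 0.5 * np.sum(xt ** 2)

xt   = np.array([1.2, 3.0])
grad = xt                        # dJ/dxt = xt for this choice of J
dxt  = np.array([1e-6, -3e-6])   # small variation of the final state

# Transposed gradient (row vector) times the variation (column vector)
dJ = grad @ dxt

# First order check against the nonlinear difference of J
print(np.isclose(dJ, J(xt + dxt) - J(xt), rtol=1e-4))
```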
\(J\), through \(M\), is a composite function of the initial state:
\[ J(x^t) = J\!\left(M(x^0)\right) \]
So its first variation can also be expressed by means of its gradient with respect to the initial condition:
\[ \delta J = \left(\frac{\partial J}{\partial x^0}\right)^{\mathrm{T}} \delta x^0 \]
Since this holds for every \(\delta x^0\), by equating the two expressions of \(\delta J\) one obtains:
\[ \left(\frac{\partial J}{\partial x^0}\right)^{\mathrm{T}} = \left(\frac{\partial J}{\partial x^t}\right)^{\mathrm{T}} \mathbf{M} \]
By taking the transpose of this expression:
\[ \frac{\partial J}{\partial x^0} = \mathbf{M}^{\mathrm{T}}\,\frac{\partial J}{\partial x^t} \]
So the transpose of the Jacobian matrix, the transpose operator, is applied linearly to the gradient with respect to the final-time variables to give the gradient with respect to the initial-time variables.
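The whole chain can be sketched numerically: a toy model, its transposed Jacobian applied to the final-time gradient, and a finite-difference check of the resulting initial-time gradient. The model M and the function J below are illustrative assumptions, not from the text:

```python
import numpy as np

# Toy nonlinear model M: R^2 -> R^2 and its Jacobian (illustrative assumptions)
def M(x):
    return np.array([x[0] * x[1], np.sin(x[0]) + x[1] ** 2])

def jacobian(x):
    return np.array([
        [x[1],          x[0]],
        [np.cos(x[0]),  2.0 * x[1]],
    ])

# Illustrative scalar function of the final state: J(xt) = 0.5 * |xt|^2
def J(xt):
    return 0.5 * np.sum(xt ** 2)

x0 = np.array([0.8, 1.5])
grad_t = M(x0)                 # dJ/dxt = xt = M(x0) for this choice of J

# Adjoint step: the transposed Jacobian maps the final-time gradient
# back to the gradient with respect to the initial condition
grad_0 = jacobian(x0).T @ grad_t

# Central finite-difference gradient of J(M(x0)) for comparison
eps = 1e-7
fd = np.array([
    (J(M(x0 + eps * e)) - J(M(x0 - eps * e))) / (2 * eps)
    for e in np.eye(2)
])
print(np.allclose(grad_0, fd, rtol=1e-5))
```

Note that the transpose gives the full gradient in one application, while the finite-difference check needs one pair of model runs per component of \(x^0\); this is why adjoint codes are used for high-dimensional initial states.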
The transpose operator is sometimes called the "adjoint" operator, though the two are not exactly the same: the adjoint operator depends on the definition of a scalar product. The gradients are then "adjoint vectors". Remark that if the state components have physical dimensions, the tangent vectors have the same dimensions, while the components of the adjoint vectors have the inverse physical dimensions (apart from possible physical dimensions of \(J\)) of their corresponding tangent or state components.
Creative Commons License, Francesco Uboldi 2014, 2015, 2016, 2017