How is the vector-Jacobian product invoked in Neural ODEs?

This post tries to explicate the claim in Deriving the Adjoint Equation for Neural ODEs Using Lagrange Multipliers that the vector-Jacobian product $\lambda^\intercal \frac{\partial f}{\partial z}$ can be calculated efficiently without explicitly constructing the Jacobian $\frac{\partial f}{\partial z}$. The claim is made in the Solving $\partial L$, $\partial G$, $\partial M$ with a Good Lagrange Multiplier section. This post was inspired by a question asked about this topic in the comments section there....
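
As a concrete aside (a minimal sketch of my own, not code from the post): reverse-mode autodiff frameworks expose exactly this operation. In JAX, `jax.vjp` returns a closure that maps a cotangent vector $\lambda$ to $\lambda^\intercal \frac{\partial f}{\partial z}$ in a single backward pass, never materializing the Jacobian. The dynamics function `f` and the dimensions below are made up for illustration.

```python
import jax
import jax.numpy as jnp

# Hypothetical dynamics f(z): R^3 -> R^3, standing in for the
# neural network that defines the ODE's right-hand side.
def f(z):
    return jnp.tanh(z) * jnp.sum(z ** 2)

z = jnp.array([0.1, -0.4, 0.7])    # state
lam = jnp.array([1.0, 2.0, 3.0])   # adjoint (cotangent) vector

# jax.vjp returns f(z) and a closure computing lambda^T (df/dz)
# in one reverse-mode pass -- the Jacobian is never built.
f_z, vjp_fn = jax.vjp(f, z)
(vjp_result,) = vjp_fn(lam)

# Sanity check against the explicit Jacobian (the O(n^2) object
# the vJp avoids constructing).
J = jax.jacobian(f)(z)
assert jnp.allclose(vjp_result, lam @ J)
```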

February 21, 2020 · 9 min · 1787 words · Vaibhav Patel

Deriving the Adjoint Equation for Neural ODEs using Lagrange Multipliers

A Neural ODE [1] expresses its output as the solution to a dynamical system whose evolution function is a learnable neural network. In other words, a Neural ODE models the transformation from input to output as a learnable ODE. Since our model is a learnable ODE, we use an ODE solver to evolve the input to an output in the forward pass and calculate a loss. For the backward pass, we would like to simply store the function evaluations of the ODE solver and then backprop through them to calculate the loss gradient....
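
As a hedged sketch of that setup (my example, not the post's code): JAX's `jax.experimental.ode.odeint` solves the ODE in the forward pass and implements its backward pass with the adjoint method rather than by backpropagating through stored solver steps. The single-layer `dynamics`, the loss, and all values below are placeholders.

```python
import jax
import jax.numpy as jnp
from jax.experimental.ode import odeint

# Hypothetical "neural network" dynamics: a single linear layer
# parameterized by a weight matrix W (stand-in for a real network).
def dynamics(z, t, W):
    return jnp.tanh(W @ z)

def model(W, z0):
    ts = jnp.array([0.0, 1.0])        # integrate from t=0 to t=1
    zs = odeint(dynamics, z0, ts, W)  # forward pass: solve the ODE
    return zs[-1]                     # output = state at final time

def loss(W, z0, target):
    return jnp.sum((model(W, z0) - target) ** 2)

W = 0.5 * jnp.eye(3)
z0 = jnp.array([1.0, 0.0, -1.0])
target = jnp.zeros(3)

# Backward pass: JAX's odeint differentiates via the adjoint ODE
# instead of storing and backpropping through every solver step.
grads = jax.grad(loss)(W, z0, target)
```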

February 4, 2020 · 14 min · 2932 words · Vaibhav Patel

Semilandmarks: Abridged

In Geometric Morphometrics, the study of the shape of a species begins with the identification of homologous landmarks across specimens in a dataset. Think of the tip of the nose, or the corner of the eye. The patterns of landmark variation that survive after the specimens' locations and orientations are factored out are taken to represent true shape variation. But, as explained in Semilandmarks in Three Dimensions [1], homologous regions don't always neatly fit into discrete landmarks....
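
To make "factored out" concrete (a minimal sketch of my own, not from the post): the standard tool is Procrustes superimposition, which centers each landmark configuration and rotates it onto a reference so that only shape differences remain. The landmark arrays below are hypothetical, and scaling and the reflection check are omitted for brevity.

```python
import jax.numpy as jnp

def procrustes_align(X, Y):
    """Rigidly align landmark set Y onto X (both (k, d) arrays),
    removing the effects of location and orientation."""
    # Remove location: center each configuration at the origin.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Remove orientation: the rotation R minimizing |Yc R - Xc| comes
    # from the SVD of the cross-covariance (orthogonal Procrustes).
    # (Scale and the reflection/determinant check are skipped here.)
    U, _, Vt = jnp.linalg.svd(Yc.T @ Xc)
    R = U @ Vt
    return Yc @ R, Xc

# Two toy 3-landmark configurations in 2D; Y is X shifted and
# rotated, so after alignment the shape difference is (near) zero.
X = jnp.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
theta = 0.7
rot = jnp.array([[jnp.cos(theta), -jnp.sin(theta)],
                 [jnp.sin(theta),  jnp.cos(theta)]])
Y = X @ rot.T + jnp.array([2.0, -1.0])

Y_aligned, X_centered = procrustes_align(X, Y)
residual = jnp.linalg.norm(Y_aligned - X_centered)  # ~0 here
```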

January 7, 2020 · 7 min · 1387 words · Vaibhav Patel