<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Machine Learning on Vaibhav Patel</title><link>https://vaipatel.com/tags/machine-learning/</link><description>Recent content in Machine Learning on Vaibhav Patel</description><image><title>Vaibhav Patel</title><url>https://vaipatel.com/</url><link>https://vaipatel.com/</link></image><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 21 Feb 2020 10:09:59 +0000</lastBuildDate><atom:link href="https://vaipatel.com/tags/machine-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>How is the vector-Jacobian product invoked in Neural ODEs</title><link>https://vaipatel.com/posts/how-is-the-vector-jacobian-product-invoked-in-neural-odes/</link><pubDate>Fri, 21 Feb 2020 10:09:59 +0000</pubDate><guid>https://vaipatel.com/posts/how-is-the-vector-jacobian-product-invoked-in-neural-odes/</guid><description>This post just tries to explicate the claim in Deriving the Adjoint Equation for Neural ODEs Using Lagrange Multipliers that the vector-Jacobian product $\lambda^\intercal \frac{\partial f}{\partial z}$ can be calculated efficiently without explicitly constructing the Jacobian $\frac{\partial f}{\partial z}$. The claim is made in the Solving PL, PG, PM with Good Lagrange Multiplier section.
This post is inspired by a question asked about this topic in the comments of that post.</description></item><item><title>Deriving the Adjoint Equation for Neural ODEs using Lagrange Multipliers</title><link>https://vaipatel.com/posts/deriving-the-adjoint-equation-for-neural-odes-using-lagrange-multipliers/</link><pubDate>Tue, 04 Feb 2020 07:18:43 +0000</pubDate><guid>https://vaipatel.com/posts/deriving-the-adjoint-equation-for-neural-odes-using-lagrange-multipliers/</guid><description>A Neural ODE [1] expresses its output as the solution to a dynamical system whose evolution function is a learnable neural network. In other words, a Neural ODE models the transformation from input to output as a learnable ODE.
Since our model is a learnable ODE, we use an ODE solver to evolve the input to an output in the forward pass and compute a loss. For the backward pass, we would like to simply store the solver&#39;s function evaluations and then backprop through them to calculate the loss gradient.</description></item></channel></rss>