5 - Backpropagation in ODENets [ID:60592]

Backpropagation.

We know that ODESolve is equivalent to a forward pass through the neural network, where f inside ODESolve is the neural network that needs to be learned.
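To make this correspondence concrete, here is a minimal sketch, assuming a fixed-step Euler integrator and a tiny MLP standing in for f; the names (`f`, `odesolve_euler`) and the sizes are illustrative, not from the lecture.

```python
import torch

# Learnable vector field f of the ODE dx/dt = f(x, t, theta).
# A tiny MLP is used purely for illustration; it ignores t
# for simplicity (an autonomous system).
f = torch.nn.Sequential(
    torch.nn.Linear(2, 16),
    torch.nn.Tanh(),
    torch.nn.Linear(16, 2),
)

def odesolve_euler(x0, t0, t1, steps=100):
    """Forward pass as an ODE solve: fixed-step Euler integration."""
    x = x0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        x = x + dt * f(x)  # each Euler step acts like one residual layer
    return x

x1 = odesolve_euler(torch.randn(1, 2), 0.0, 1.0)  # x(t1): the prediction
```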

In a regular neural network we adjust the weight parameters θ based on the predictions and our loss criterion. The same should happen in our ODE network: the parameters θ of f need to be adjusted so that we arrive at a better f. We need dL/dθ to update θ and reduce the loss.
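Spelled out, this is the usual gradient-descent update, with η an assumed learning rate:

$$\theta \;\leftarrow\; \theta - \eta \, \frac{dL}{d\theta}$$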

But how would that work? At each depth layer, i.e. at each time step, we would have to save θ as well as the hidden states and any intermediate time steps, depending on the ODE solver you are using. The authors propose the adjoint method for solving this problem, which we will discuss now.
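To see why naively backpropagating through the solver is costly, consider this continuation of the sketch above: autograd keeps every intermediate state of the integration loop on its tape, so memory grows linearly with the number of solver steps. The adjoint method is designed to avoid exactly this.

```python
# Reuses the hypothetical f and odesolve_euler from the earlier sketch.
x0 = torch.randn(1, 2)
x1 = odesolve_euler(x0, 0.0, 1.0, steps=100)  # graph retains ~100 intermediate states
loss = x1.pow(2).sum()                        # stand-in loss criterion
loss.backward()                               # walks back through every stored step
```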

Let us first consider how our loss changes with respect to x(t), that is, with respect to the hidden state at a certain depth t. Here x(t) is treated as a continuous function. We want to define a(t) for all depths, for all t, to see how the loss relates to all the intermediate hidden states.
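In symbols, the quantity we are after is what the Neural ODE paper calls the adjoint state:

$$a(t) \;:=\; \frac{dL}{dx(t)} \qquad \text{for all } t \in [t_0, t_1]$$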

At the output this is pretty straightforward, since we already know the predicted x(t₁). But to calculate all the internal a(t), that is, all the gradients of the loss with respect to the hidden states, we need to propagate them back in depth, or in time, depending on how you see it, and for that we need da(t)/dt. This is quite complex.
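For orientation, the result this derivation is heading toward is the adjoint ODE from the Neural ODE paper (Chen et al., 2018):

$$\frac{da(t)}{dt} \;=\; -\, a(t)^{\top} \frac{\partial f\big(x(t), t, \theta\big)}{\partial x(t)}$$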

So instead of attacking it directly in the continuous form, we move to the discrete formulation. From the chain rule we know that

$$\frac{dL}{dx(t)} \;=\; \frac{dL}{dx(t+\epsilon)} \cdot \frac{\partial x(t+\epsilon)}{\partial x(t)},$$

where ε is a small bump in time or depth.

Taking that to the continuous form, we get the equation you see here on the right.
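Presumably the equation on the slide is the derivative of the adjoint written as a limit, with the chain-rule identity above substituted in; reconstructed, it reads:

$$\frac{da(t)}{dt}
  \;=\; \lim_{\epsilon \to 0^{+}} \frac{a(t+\epsilon) - a(t)}{\epsilon}
  \;=\; \lim_{\epsilon \to 0^{+}} \frac{a(t+\epsilon) - a(t+\epsilon)\,\frac{\partial x(t+\epsilon)}{\partial x(t)}}{\epsilon}$$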

But we know that x(t), the continuous function, follows our original ODE with the vector field f. So we can write it down as an initial value problem, where x(t+ε) is basically x(t) plus the integral of f from t to t+ε.
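A minimal completion of this step, following the derivation in the Neural ODE paper: write the solution as an integral, Taylor-expand to first order in ε, and substitute the resulting Jacobian into the limit above.

$$x(t+\epsilon) \;=\; x(t) + \int_{t}^{t+\epsilon} f\big(x(s), s, \theta\big)\, ds
  \;\approx\; x(t) + \epsilon\, f\big(x(t), t, \theta\big)
  \;\;\Longrightarrow\;\;
  \frac{\partial x(t+\epsilon)}{\partial x(t)} \;\approx\; I + \epsilon\, \frac{\partial f}{\partial x(t)}$$

Plugging this into the limit recovers the adjoint ODE stated earlier, da(t)/dt = −a(t)ᵀ ∂f/∂x(t), which can be integrated backward in time from the known a(t₁) without storing the forward trajectory.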
