Domain shifts, Ethics, and Bias

Dear students, welcome again from my side.

This is already lecture number eight and I'm

going to talk to you today about domain shifts, ethics and biases.

These are problems that

arise when we want to bring our models to the real world. It's basically about now testing our models on new domains for true generalization, outside of the well-controlled benchmarking setup that we have used so far in the laboratory.

But there's also the ethics and bias part: it's not only about that, it also involves some of the preliminary actions that we need to take, because those are more related to early stages like data collection.

But we will see this together.

Let's get started with the usual organizational information.

So this is the last time I talk to you in a lecture.

The remaining lectures will be given by Christopher.

And I for sure will see you again in the workshop at the end of the seminar.

So we are in fact at the eighth lecture and you still have two lectures ahead.

So you will see the second part of XAI and active learning after this lecture.

But today, as I said, we will dive into different topics.

More specifically, we will talk about domain shifts and domain adaptation techniques.

And after that, we will dive into the topic of ethics and bias, always with a focus on time series.

Let's move directly to the first of the two sections.

So domain shifts and domain adaptation.

And we start from the typical machine learning setup, the one we have used so far.

So in a typical setup, what we have is a training set that is sampled from a distribution, here denoted by Q(X, Y).

A training set in a supervised setting is made of pairs (x, y): the x's are the training samples and the y's are the training labels.
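
In notation, this is something like the following (a hedged reconstruction from the spoken description; the exact symbols on the slide may differ):

    \mathcal{D}_{\text{train}} = \{(x_i, y_i)\}_{i=1}^{N}, \qquad (x_i, y_i) \sim Q(X, Y)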

And again, we typically want to learn a mapping from X to Y. Therefore, we optimize a problem defined like this: we want to find the parameters that minimize the error function, or loss function L, on the entire data set.

And these parameters we denote by theta star, θ*, and they are the optimal parameters for the training set.
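
Written out, again as a sketch in the same notation (assuming an average of the loss over the N training pairs, with f_θ the model):

    \theta^{*} = \arg\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} L\big(f_{\theta}(x_i),\, y_i\big)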

And then, when we want to test our model, we consider the held-out set, the held-out test set. It's a test set that is sampled from the same distribution Q(X, Y) and has more or less the same structure.

So pairs of samples and corresponding labels.

And for that, we compute the same loss function, normally, or other metrics, to evaluate the test error, which tells us how well our function now maps x's to y's on the test set.

And our aim is to make this error small.
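
As a sketch in the same notation (assuming the same loss L is also used as the evaluation metric, over M test pairs):

    \text{Err}_{\text{test}} = \frac{1}{M} \sum_{j=1}^{M} L\big(f_{\theta^{*}}(x_j),\, y_j\big), \qquad (x_j, y_j) \sim Q(X, Y)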

And we have seen that we have techniques to try to improve generalization, for example considering a validation set during training, and things you are already well familiar with.

So let me summarize.

We have a training set that is sampled from a distribution Q(X, Y). We minimize the loss function on the training set to get the optimal parameters θ*. Then we consider a test set, which has a similar shape and, more importantly, is sampled from the same distribution Q(X, Y).

And we can compute the error on the test set to evaluate the generalization performance.

So the key assumption, maybe you have noticed it already, is that both the training set and the test set come from the very same probability distribution.

But the question, the important question, is: is this a realistic assumption? In most cases it is not, because what we have in practice is that the training distribution and the test distribution are oftentimes not the same.

And I'll give you an example.

So imagine we train an image classifier on a database of photos that are taken with a professional camera.
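
To make this mismatch concrete, here is a minimal synthetic sketch (illustrative only, not from the lecture; all names and numbers are assumptions): a classifier is trained on samples from one distribution and then evaluated both on held-out samples from that same distribution and on samples from a shifted one.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    def sample(n_per_class, shift=0.0):
        # Two Gaussian classes in 2D; `shift` translates both class means,
        # simulating a change in the input distribution (a covariate shift).
        x0 = rng.normal(loc=[0.0 + shift, 0.0], size=(n_per_class, 2))
        x1 = rng.normal(loc=[2.0 + shift, 2.0], size=(n_per_class, 2))
        X = np.vstack([x0, x1])
        y = np.array([0] * n_per_class + [1] * n_per_class)
        return X, y

    # The usual setup: training and test data from the SAME distribution Q(X, Y).
    X_train, y_train = sample(500)
    X_test_iid, y_test_iid = sample(500)

    # A domain shift: the test data come from a DIFFERENT distribution.
    X_test_shift, y_test_shift = sample(500, shift=3.0)

    clf = LogisticRegression().fit(X_train, y_train)
    print("i.i.d. test accuracy: ", accuracy_score(y_test_iid, clf.predict(X_test_iid)))
    print("shifted test accuracy:", accuracy_score(y_test_shift, clf.predict(X_test_shift)))

In this toy setup, accuracy on the i.i.d. test set stays high, while accuracy on the shifted test set drops noticeably, even though the labeling rule itself never changed.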
