Okay, the quiz is over I think.
So no real disasters, that's good.
Yeah, I was wondering: we moved the quiz, and the number of people taking it went down considerably.
I may be wrong, but the number of people starting the quiz late seemed to have increased.
And I can understand that.
Would it be useful if we tried to publish a calendar that has kind of the lecture times and
the quizzes, the homeworks and all of those kind of things in it that you could just
subscribe to on your phone?
Would anybody want that?
Or you still have everything you need up here?
Okay, good. That would be the first thing I would be asking for.
But yeah, okay, maybe we'll do it anyway so that I can actually subscribe to the calendar.
Okay, good.
So we kind of did the last bit on agents for static, timeless,
partially observable, stochastic environments.
Right, so it was essentially an extension of these agents with information-foraging
behavior. The key step here was to give information gathering
a value, a price, right? What is the estimated value of the information we get? As such, we can
treat it just like any other action for which we have an expected utility.
Okay, so we can just use our good old mechanism of grinding through
the maximization of expected utility with information gathering. We have a formula.
It's a relatively big one, right, but still something we can compute, and
the agent is really simple. It does what the normal agent does,
plus it estimates the expected utility of each of the information-gathering actions,
and if the information gathering pays off, it does it.
Okay, the expected utility of information gathering comes from
the better decisions we can make later.
And so that's really what we did. So we have in summary an agent design
plus all the necessary stuff we need for implementing it
for static environments where time doesn't play a role.
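As a rough illustration (not the lecture's own notation), the value-of-information computation can be sketched for a tiny discrete decision problem. All names and numbers below are made-up assumptions purely for illustration:

```python
# Sketch: estimating the value of (perfect) information for one observable
# variable in a tiny decision problem. All names and numbers are illustrative
# assumptions, not the lecture's notation.

def expected_utility(action, belief, utility):
    """Expected utility of `action` under the current belief over states."""
    return sum(p * utility[(state, action)] for state, p in belief.items())

def best_eu(belief, actions, utility):
    """Expected utility of the best action under `belief`."""
    return max(expected_utility(a, belief, utility) for a in actions)

def value_of_information(belief, actions, utility, likelihood, observations):
    """Expected best utility after observing, minus best utility now."""
    eu_now = best_eu(belief, actions, utility)
    eu_after = 0.0
    for obs in observations:
        # P(obs) under the current belief, then the posterior via Bayes' rule.
        p_obs = sum(likelihood[(obs, s)] * p for s, p in belief.items())
        if p_obs == 0.0:
            continue
        posterior = {s: likelihood[(obs, s)] * p / p_obs
                     for s, p in belief.items()}
        eu_after += p_obs * best_eu(posterior, actions, utility)
    return eu_after - eu_now

# Illustrative numbers: rain with probability 0.3, a perfect weather sensor.
belief = {"rain": 0.3, "sun": 0.7}
actions = ["umbrella", "none"]
utility = {("rain", "umbrella"): 20, ("rain", "none"): -100,
           ("sun", "umbrella"): 0, ("sun", "none"): 50}
likelihood = {("wet", "rain"): 1.0, ("wet", "sun"): 0.0,
              ("dry", "rain"): 0.0, ("dry", "sun"): 1.0}
vpi = value_of_information(belief, actions, utility, likelihood, ["wet", "dry"])
```

If the estimated value of the information exceeds the cost of gathering it, the agent performs the information-gathering action first; otherwise it acts directly.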
Okay, for environments that are much more complicated and much more challenging than the
ones we had last semester. Last semester we insisted on complete observability and on static
actions and we all know how unrealistic that is. So we've basically upgraded to dealing with
uncertainty. We kind of did a little bit of time and changes and so on in the planning chapter
and so we basically need to kind of combine time and planning, time and uncertainty
and that's what we undertook. Next, we have a very simple one:
the probability model. The example was the umbrella example; we're going to keep looking at that
example because it's so simple. The idea is to take random variables, just like before,
keeping them small, and then look at kind of timelines of those variables.
And the main problem with that is that, computationally, this completely explodes.
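To put a rough number on that explosion, here is a small sketch, assuming binary random variables: a full joint over T time steps, where each variable may depend on all of its predecessors, needs exponentially many parameters, while the first-order Markov simplification needs only linearly many.

```python
# Sketch (assuming binary random variables): a full joint over T time steps,
# where each variable may depend on all predecessors, needs 2**T - 1 free
# parameters; a first-order Markov chain needs a prior plus one transition
# row per previous value.

def full_joint_params(T):
    return 2**T - 1

def markov_chain_params(T):
    # 1 parameter for P(X_1), and 2 parameters for each P(X_t | X_{t-1}).
    return 1 + 2 * (T - 1)

print(full_joint_params(30))    # over a billion parameters
print(markov_chain_params(30))  # 59 parameters
```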
So we spent all of yesterday kind of making up simplifications that make it tractable at all.
So the first thing we noted was that, in principle, a random variable can depend on all
of its predecessors.
That messes up the computation big time. So what do we do? Well, what we always do: we find
fancy, scientific words for the influence, for the size of the influence, and then we say,
ah, but we're only going to allow influence depth one. Right? First-order Markov models, and they're so
important that we give them a separate name. We call them Markov chains. That's these guys. Right?
These guys we don't like as much, so we don't cover them here, but I've told you that they exist out
there. Right? Just like on old medieval maps, where people didn't know what's happening
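A first-order Markov chain over the rain variable from the umbrella example can be sketched like this; the transition probabilities below are illustrative assumptions, not values from the lecture:

```python
# Sketch of a first-order Markov chain over a binary Rain variable
# (illustrative transition probabilities, not from the lecture).
P_RAIN_GIVEN_RAIN = 0.7  # P(Rain_t = true | Rain_{t-1} = true)
P_RAIN_GIVEN_SUN = 0.3   # P(Rain_t = true | Rain_{t-1} = false)

def predict(p_rain):
    """One prediction step: P(Rain_t) from P(Rain_{t-1})."""
    return p_rain * P_RAIN_GIVEN_RAIN + (1 - p_rain) * P_RAIN_GIVEN_SUN

def predict_n(p0, steps):
    """Push the belief `steps` steps into the future."""
    p = p0
    for _ in range(steps):
        p = predict(p)
    return p
```

Because each state depends only on its direct predecessor, predicting forward is just repeated application of the transition model; with these particular numbers the belief converges toward the chain's stationary distribution (here 0.5) regardless of the starting belief.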
Accessible via: Open access
Duration: 01:23:08 min
Recording date: 2025-05-21
Uploaded on: 2025-05-22 21:29:06
Language: en-US