We talked about Bayesian networks yesterday. Bayesian networks are probably the most successful framework for probabilistic reasoning. The idea there is that we want to replace the full joint probability distribution, which is typically a huge thing, with something more compact: something easier for human modelers to understand, and something we can compute with more efficiently. For the math alone, for computation in principle, the full joint probability distribution is fine. But we want to actually model, and we want to compute efficiently, and remember, Bayesian networks can get big, or rather your model can get big. You typically have a couple of hundred random variables that you want to keep track of. Maybe you should at some point take a minute to think about all the random variables you regularly deal with over the course of a day; I suspect you'll get into five figures relatively easily, just over one day. So we want something that is easy to model, or comparatively easy to model, and where we can actually debug the model: where we can look at the model and see, does that look about right? If you just have a high-dimensional hypercube of numbers, that's not something you can do easily. If you have a graph structure, that helps.
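To put rough numbers on "huge" versus "compact", here is a minimal sketch; this is my own illustration rather than anything from the lecture, assuming n binary variables and at most k parents per node:

```python
# Minimal sketch (illustration, not from the lecture): compare the size of the
# full joint distribution with the total CPT size of a Bayesian network over
# n binary variables where every node has at most k parents.

def full_joint_size(n: int) -> int:
    """Entries in the full joint table over n binary variables."""
    return 2 ** n

def bayes_net_rows(n: int, k: int) -> int:
    """Upper bound on total CPT rows: each node has at most 2**k rows,
    one per assignment to its (at most k) binary parents."""
    return n * 2 ** k

n, k = 100, 3  # a couple of hundred variables, sparse structure
print(full_joint_size(n))   # 2**100, about 1.3e30 entries: hopeless to write down
print(bayes_net_rows(n, k)) # 800 rows: something a human can actually inspect
```

The exponential-versus-linear gap is the whole point: 800 numbers can be read and debugged, 2^100 cannot.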
So we've looked at the semantics of Bayesian networks, and there are basically two answers. One is that the graph structure tells you something about dependencies in the world. That is also something we can relatively easily check for plausibility and thereby debug. The conditional probability tables, on the other hand, have essentially just one virtue: they are relatively small. So the quantitative aspect is that we have small CPTs, and the qualitative aspect is that we have a model of the structure of the world that we can debug.
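Stated compactly, in the standard textbook notation rather than anything shown here, the two answers combine into one equation: the network asserts that the full joint factorizes into the CPTs,

\[
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Parents}(X_i)\bigr),
\]

where the graph supplies the Parents function (the qualitative part) and the CPTs supply the factors (the quantitative part).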
Now the question, of course, is how do you construct those, and the upshot is: it's all in the variable ordering. The second upshot is: put the causes in front and the symptoms in the back. Then you can basically just come up with a good ordering, and from the moment you've fixed that ordering, you just add edges to the otherwise edgeless graph. The edges are the causal dependencies, or in general any dependencies; but if they are causal, they tend to be sparser. And if you do it wrong, we've seen a couple of examples: this one is worse than the one we had, this one is even worse, and it can get even worse than that. More edges mean fewer conditional independencies, which in the computation means bigger conditional probability tables, which means more values, bigger CPTs, and more computation.
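To make that concrete, here is a small sketch in Python; the two graph structures are my reconstruction of the standard alarm example with a causes-first ordering versus a symptoms-first ordering, so treat the exact edge sets as an assumption on my part:

```python
# Sketch (assumed structures, all variables binary): total CPT rows for a
# causes-first ordering versus a symptoms-first ordering of the alarm example.
from math import prod

def total_cpt_rows(parents: dict[str, list[str]], dom: dict[str, int]) -> int:
    # One CPT row per assignment to a node's parents, summed over all nodes.
    return sum(prod(dom[p] for p in ps) for ps in parents.values())

dom = {v: 2 for v in ("Burglary", "Earthquake", "Alarm", "John", "Mary")}

# Causes in front: only the sparse causal edges are needed.
causal = {"Burglary": [], "Earthquake": [],
          "Alarm": ["Burglary", "Earthquake"],
          "John": ["Alarm"], "Mary": ["Alarm"]}

# Symptoms in front: each new node ends up depending on everything before it.
bad = {"Mary": [], "John": ["Mary"], "Earthquake": ["Mary", "John"],
       "Burglary": ["Mary", "John", "Earthquake"],
       "Alarm": ["Mary", "John", "Earthquake", "Burglary"]}

print(total_cpt_rows(causal, dom))  # 1 + 1 + 4 + 2 + 2 = 10
print(total_cpt_rows(bad, dom))     # 1 + 2 + 4 + 8 + 16 = 31
```

Same distribution either way, but roughly three times the numbers to specify, just from a bad ordering.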
Any questions so far? Now, I've been kind of hand-waving about the size of the Bayesian network, how bad it gets, and so on. The question is: can we also make this formal? Can we reason about the size of a Bayesian network, where we put a number to this one and a number to that one and see that this one is much worse than that one? The obvious idea is to estimate the size of the CPTs. And if you think about it, the size of a CPT really depends on how many parents you have; the size of the CPT is multiplicative in the sizes of the parents' domains.
We did this in the example where we had an alarm node with two incoming edges, one from burglary and the other from earthquake. We had a binary domain here, true/false, and a binary domain there. And the CPT had one row for each combination, burglary and earthquake from true/true down to false/false; in other words, four numbers. Now we can generalize that easily: if we have a domain of size n here and a domain of size m there, then we have n times m rows here. And if we have more incoming edges, then you essentially get this product: we take the sizes of the parents' domains and multiply over all parents. And then, if you sum that up over all the nodes, you get a single number for the whole network.
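Written as a formula, in my notation rather than the lecture's, the rule is

\[
\mathrm{size}\bigl(\mathrm{CPT}(X)\bigr) \;=\; \prod_{U \in \mathrm{Parents}(X)} \bigl|\mathrm{dom}(U)\bigr|,
\qquad
\mathrm{size}(\mathcal{B}) \;=\; \sum_{X \in \mathcal{B}} \mathrm{size}\bigl(\mathrm{CPT}(X)\bigr),
\]

so the alarm node above contributes 2 times 2, exactly the four numbers we counted. (For a node with a non-binary domain you would additionally multiply by the number of entries per row; the count here treats the binary case, where one number per row suffices.)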