We are in the final stretch, so we just have the month of July to work on pattern analysis, algorithms and methods. And before we finalize the chapter on kernels, let's go through the big picture again. Now it's up to you to collect all the information that basically characterizes the contents of this lecture. So we talk about pattern analysis. And what is it all about?
It's all about the posterior. What is y? What is x?
y is the probability of the class and x is the probability of...
There's no probability here. y is the random variable denoting the class number and x is the feature vector. We have seen that feature vectors can be computed for patterns.
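In symbols (just to fix notation; the particular symbols are my choice, not necessarily the ones on the lecture slides), the object of interest is the posterior

```latex
p(y \mid \boldsymbol{x}), \qquad y \in \{1, \dots, K\}, \qquad \boldsymbol{x} \in \mathbb{R}^{d},
```

the probability of class number y given the observed feature vector x.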
We have also seen that if we use speech signals or images, quite often we cannot associate a feature vector of fixed dimension with a pattern, but instead we get feature sets. For instance, the corners of an object in an image depend on the viewing direction, on the segmentation results, and on the illumination, so we get a different number of features. Or if you think about speech signals: depending on the speed of the speaker, you get a longer or a shorter feature sequence. So the limitation that we had in the winter semester, that we are required to have a feature vector of fixed dimension, is something that we have to resolve within this lecture. And last time we already started to do that. What different topics did we cover so far? So Johannes, you're a wild card, you can choose whatever you want out of the list.
The Bayesian classifier. What is the Bayesian decision rule telling us? It's what? In which case? Well, if the components are Poisson distributed, or Gaussian with different variances, then it will not be a linear decision boundary. Actually, that's what we have shown. I mean, the important thing here is that we decide for the class that maximizes what? The a posteriori probability. And why? And here is a warning: if you show up in the oral exam and you're not able to write down the decision rule of the Bayesian classifier, we both have a serious problem. I run into legal issues, because I'm close to hitting you at that point, and you run into a serious issue, because you tend to fail the exam. That's the situation we should avoid, unless you want to see me in prison. So, the Bayesian classifier. And then we basically know the Bayesian classifier is optimal in which sense? In which sense is the Bayesian classifier optimal? With a 0-1 cost function, it is optimal.
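For reference, a compact sketch of the rule in standard notation (the symbols are mine, not necessarily the lecture's):

```latex
\hat{y} = \operatorname*{arg\,max}_{y} \; p(y \mid \boldsymbol{x})
        = \operatorname*{arg\,max}_{y} \; p(y)\, p(\boldsymbol{x} \mid y)
```

With the 0-1 cost function, which charges nothing for a correct decision and one unit for any misclassification, this rule minimizes the expected cost, i.e. the probability of error.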
And then we have different ways to model the posterior probability. We had the direct way and the indirect way: in the indirect way we factorize the posterior into the prior and the class-conditional density, and in the direct way we model the posterior itself. Can somebody tell me what the prior is basically doing to the decision boundary? It's just shifting the- Right. It's just an offset. It's just an intercept.
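A short sketch of why the prior only acts as an offset (assuming the indirect, generative factorization):

```latex
\log p(y \mid \boldsymbol{x}) = \log p(\boldsymbol{x} \mid y) + \log p(y) - \log p(\boldsymbol{x})
```

The term log p(y) does not depend on x, so changing the prior only adds a class-dependent constant to each log-discriminant; the functional form of the decision boundary is unchanged, only its intercept moves.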
And how did we model the posterior probability directly? How did we do that directly? Who wants to make the big points here? By logistic regression. Logistic regression. What was the idea of logistic regression? We are given a decision boundary and we put the function of the decision boundary into another function. So we get directly-
Into the sigmoid function and that's directly the-
That was an important bridge that closes the gap between the statistical classifiers and all the geometric classifiers we have seen in the winter semester, where we have straight decision boundaries, for instance. How is the logistic regression approach related to the Bayesian classifier? We have a very, very good understanding of that.
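A sketch of that bridge for two classes (notation is my choice): the posterior can always be written as a sigmoid of the log posterior odds,

```latex
p(y = 1 \mid \boldsymbol{x}) = \frac{1}{1 + e^{-F(\boldsymbol{x})}},
\qquad
F(\boldsymbol{x}) = \log \frac{p(\boldsymbol{x} \mid y = 1)\, p(y = 1)}{p(\boldsymbol{x} \mid y = 0)\, p(y = 0)}.
```

Logistic regression models F directly, typically as a linear decision function; and for Gaussian class-conditionals with a shared covariance matrix, F is indeed linear in x, which recovers the straight decision boundaries of the geometric classifiers.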
And then we modeled the class-conditional density, for instance by Gaussians. What have we seen and discussed in this context for the class conditionals? What did we discuss in this context? Well, if we do not know exactly what that is, we assume that p of x given y is normally distributed, with a mean vector and a covariance matrix depending on the class number. And what did we conclude out of that? Out of this assumption? Sir?
The decision boundaries are quadratic. Right. If you have the normality assumption, the Gaussian distribution assumption, you get curved, quadratic decision boundaries. In which-
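A sketch of why the boundary is quadratic, for two classes under the Gaussian assumption just mentioned (my notation):

```latex
\log \frac{p(\boldsymbol{x} \mid y=1)\, p(y=1)}{p(\boldsymbol{x} \mid y=0)\, p(y=0)}
= -\tfrac{1}{2}\, \boldsymbol{x}^{\mathsf{T}} \big(\boldsymbol{\Sigma}_1^{-1} - \boldsymbol{\Sigma}_0^{-1}\big)\, \boldsymbol{x}
+ \big(\boldsymbol{\Sigma}_1^{-1}\boldsymbol{\mu}_1 - \boldsymbol{\Sigma}_0^{-1}\boldsymbol{\mu}_0\big)^{\mathsf{T}} \boldsymbol{x}
+ c
```

where c collects everything that does not depend on x (the log priors, the log determinants, and the quadratic terms in the means). Setting this expression to zero gives a quadric as decision boundary; if the two covariance matrices are equal, the quadratic term cancels and the boundary degenerates to a hyperplane.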