[MUSIC]
So good morning, good morning everybody, to our Tuesday morning session. Unfortunately I was not able to teach yesterday, but Thomas Köhler explained to you the core idea of maximum likelihood estimation, where you basically set up a likelihood function or a log-likelihood function and then maximize the log-likelihood with respect to the parameters we are looking for. We will see this over and over again during the lecture, so you will more or less grow into this concept.
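To make this recap concrete, here is a minimal sketch in Python, assuming a one-dimensional Gaussian model; the data and the closed-form estimates are an invented illustration, not material from the lecture.

```python
import numpy as np

# Hypothetical illustration: maximum likelihood estimation for a 1-D Gaussian.
# The log-likelihood of samples x_1..x_n under N(mu, sigma^2) is
#   L(mu, sigma^2) = -n/2 * log(2*pi*sigma^2) - sum((x_i - mu)^2) / (2*sigma^2),
# and maximizing it yields the sample mean and the (biased) sample variance.

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)  # toy data: true mu=2, sigma=1.5

mu_ml = x.mean()                       # maximizer of the log-likelihood in mu
sigma2_ml = ((x - mu_ml) ** 2).mean()  # maximizer in sigma^2 (divides by n, not n-1)

def log_likelihood(mu: float, sigma2: float) -> float:
    """Gaussian log-likelihood of the sample x under the given parameters."""
    n = len(x)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - ((x - mu) ** 2).sum() / (2 * sigma2)

print(mu_ml, sigma2_ml, log_likelihood(mu_ml, sigma2_ml))
```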
So what is Pattern Recognition in the winter semester about? Let's look again at the big picture, to make you aware of the topics we are discussing and of the storyline of the whole lecture. We started out basically with the motivation of the problems we are considering. We talked about the intention to compute a mapping from feature vectors to class numbers; that is basically what we discuss this semester: how can I transform feature vectors, computed out of a signal, into class numbers? We also know that this whole mapping is called classification, and we know that if y is a continuous parameter, a real-valued number, then we have a regression problem. So classification and regression are the topics we will discuss in detail.
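As a small illustration of these two kinds of mappings, here is a toy sketch in Python; the nearest-mean classifier and the linear regressor below are invented examples, not methods from the lecture.

```python
import numpy as np

# Hypothetical illustration of the two mappings discussed above.
# Classification: f maps a feature vector x in R^d to a class number in {0, ..., K-1}.
# Regression:     f maps a feature vector x in R^d to a real value in R.

def classify_nearest_mean(x: np.ndarray, class_means: np.ndarray) -> int:
    """Toy classifier: assign x to the class whose mean is closest (made-up rule)."""
    distances = np.linalg.norm(class_means - x, axis=1)
    return int(np.argmin(distances))

def regress_linear(x: np.ndarray, w: np.ndarray, b: float) -> float:
    """Toy regressor: a linear function of the features (made-up weights)."""
    return float(w @ x + b)

means = np.array([[0.0, 0.0], [3.0, 3.0]])           # two classes in R^2
x = np.array([2.5, 2.8])
print(classify_nearest_mean(x, means))                # -> 1, a class number
print(regress_linear(x, np.array([0.5, 0.5]), 0.1))   # -> 2.75, a real value
```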
Then we looked into the concept of the Bayesian classifier, which tells us how much loss is generated by a misclassification, by an object or a feature vector belonging to class y being assigned to class y'. So the loss function is defined as l(y, y'), and these are the costs I generate if an object or a pattern belonging to class y is assigned to class y'. And we have looked into a very concrete loss function, the so-called 0-1 loss function.
The 0-1 loss function says: correct classifications are for free; wrong classifications cost 1 euro, 1 dollar, or whatever, 1 unit. If you use this cost function, then you end up with a decision rule that is optimal with respect to the loss function in a statistical framework, and that is the so-called Bayesian classifier. The Bayesian classifier is optimal with respect to the 0-1 loss function: if I have the posterior probabilities for all the classes given the feature vector, then I can build an optimal classifier that minimizes the average loss. And the Bayesian decision rule says: we decide for the class y*, the optimal class, by computing the argmax, the argument for which the posterior is maximized, that is, y* = argmax_y p(y | x). That is the Bayesian decision rule: we compute the posteriors, and at the end of the day we decide for the class with the highest posterior. This decision guarantees that we minimize the average loss under the 0-1 loss function. Very cool.
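A minimal sketch of this decision rule in Python, assuming the posteriors for one feature vector are already given; the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical sketch of the Bayesian decision rule under the 0-1 loss:
# given the posteriors p(y|x) for all classes, decide for the argmax.

def bayes_decision(posteriors: np.ndarray) -> int:
    """Return y* = argmax_y p(y|x), which is optimal on average under the 0-1 loss."""
    return int(np.argmax(posteriors))

def zero_one_loss(y_true: int, y_pred: int) -> int:
    """Correct classifications are free; every error costs one unit."""
    return 0 if y_true == y_pred else 1

p_y_given_x = np.array([0.1, 0.7, 0.2])  # made-up posteriors for classes 0, 1, 2
y_star = bayes_decision(p_y_given_x)     # -> 1
print(y_star, zero_one_loss(1, y_star))  # loss is 0 if the true class is 1
```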
Later we will discuss other loss functions; even today we will see a different loss function, and based on these loss functions you of course get different classifiers. Since the mid-1960s, pattern recognition researchers have basically been looking for techniques to find the posterior probability, because they know that, from a theoretical point of view, there is nothing better than that. And we have seen two ways to compute it. One is the direct way, where we model the posterior probability right away; these are the so-called discriminative models, the discriminative approach, where we compute p(y | x) right away. And the generative model makes use of the Bayes formula, which expresses the posterior in terms of the class-conditional density and the prior.
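To contrast the two routes, here is a small sketch in Python for a two-class problem with one-dimensional features, assuming Gaussian class-conditionals for the generative model and a logistic form for the discriminative one; all parameter values are invented for illustration.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical sketch of the two routes to the posterior p(y|x) for two classes.

# Generative route: model the class-conditionals p(x|y) and the priors p(y),
# then apply Bayes' rule  p(y|x) = p(x|y) p(y) / sum_y' p(x|y') p(y').
priors = np.array([0.4, 0.6])                 # made-up priors p(y)
means, stds = np.array([0.0, 2.0]), np.array([1.0, 1.0])

def posterior_generative(x: float) -> np.ndarray:
    likelihoods = norm.pdf(x, loc=means, scale=stds)  # p(x|y) for y = 0, 1
    joint = likelihoods * priors                       # p(x, y)
    return joint / joint.sum()                         # normalize to get p(y|x)

# Discriminative route: model p(y|x) directly, here with a logistic function
# whose weights would normally be learned from data (made-up values below).
w, b = 2.0, -2.0

def posterior_discriminative(x: float) -> np.ndarray:
    p1 = 1.0 / (1.0 + np.exp(-(w * x + b)))           # p(y=1|x) modeled directly
    return np.array([1.0 - p1, p1])

print(posterior_generative(1.0), posterior_discriminative(1.0))
```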