[MUSIC]
And I put up again one important formula where yesterday, I guess, I was not clear enough in explaining it. What we see here is the average loss of a classifier. Once again, the loss function basically tells us how much a misclassification costs, and the simplest loss function is: we pay 1 euro for a misclassification, and if we make the decision correctly, then everything is fine and we weight it by 0. The average loss for a given feature vector and a given class is given here by the sum over the loss function multiplied with the posterior.
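In symbols (my own shorthand, not copied verbatim from the slide, with l the loss for deciding class y when the true class is y', and p(y' | x) the posterior):

\[
\bar{l}(x, y) \;=\; \sum_{y'} l(y', y)\, p(y' \mid x),
\qquad
l(y', y) \;=\;
\begin{cases}
0 & \text{if } y' = y,\\
1 & \text{if } y' \neq y
\end{cases}
\quad\text{(0-1 loss)}.
\]

Under this 0-1 loss, the average loss of deciding class y is simply \(1 - p(y \mid x)\).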
And if you look at the posterior probabilities, these are positive numbers between 0 and 1, okay, between 0 and 1. And when is this sum the smallest? Well, we sum up the posteriors here, and the posteriors are weighted by the loss. If we decide for the class y with the highest a posteriori probability, what does that mean? It means that we cancel out of the sum the element with the highest a posteriori probability, and due to the fact that all the components are non-negative, the overall sum, the overall average loss, is smallest if we take out the largest value, right, the one with the highest a posteriori probability. And that's the idea: the Bayesian classifier will decide for the class with the highest a posteriori probability. Once again, it sounds so trivial in this audience here in the lecture: the Bayesian decision rule is nothing else but computing the posteriors for all classes given a feature vector, and then we decide for the class with the highest a posteriori probability.
Okay, good, and today we are going to continue. Welcome, you are late. Okay, if you have trouble finding the room, can you imagine what kind of trouble you will have answering my questions? It's always nice to see people walking around with a GPS to find the lecture halls. You know, times have changed, times have changed. That's cool.

Okay, so besides all the math, today we will have a lot of technical discussion. We will massage a lot of formulas using plus, minus, multiplication, and ratio computation, so basic math, but it's a lot we're going to do. But keep in mind the story we have in the background, what I'm going to consider in the following; and if I consider this, you will also consider this.

Let's consider the following problem. We have a feature vector with two elements, so 2-D features, and we have a labeled training set, so we do supervised learning: we have features here and features here, one set of features belonging to one class and the other to the other class, and we will call the classes 0 and 1. In addition to that we have the decision boundary, and this decision boundary is usually defined by the zero level set of a function F, capital F. That is the zero level set: the set of all points x where this function ends up with the value 0. Okay, and the question we are considering is: what is the a posteriori probability for y equal to 0 given x, and for y equal to 1 given x, for the decision boundary defined by this zero level set? That sounds very complicated, that sounds very complicated, but we will see: if you do it the right way, it's not so hard.