11 - Musteranalyse/Pattern Analysis (formerly Mustererkennung 2) (PA) [ID:386]

So welcome to the Monday morning session. Before we continue with the topics of the pattern analysis lecture, let me briefly give an overview of the topics that we have considered so far. The summer semester pattern analysis is mostly about statistical modeling of objects, of pattern classes, and so on. What we basically want to do is find a way to characterize the a posteriori probability of a given class for a given feature vector, for a given set of features, or for a sequence of features. That is the core of the lecture. We started out the lecture by considering, or reconsidering, the reason why this a posteriori probability is so important for us.

And we learned about the Bayesian classifier and its optimality properties. You know the Bayesian classifier is optimal with respect to the zero-one cost or loss function, and it decides according to the maximization of the a posteriori probability: we decide for the class that maximizes this a posteriori probability.
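To make the decision rule concrete, here is a minimal sketch of such a maximum a posteriori decision in Python; the two-class priors and the Gaussian class-conditional densities are invented example values, not anything taken from the lecture.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical two-class problem: priors and Gaussian class-conditional densities.
priors = np.array([0.4, 0.6])
likelihoods = [
    multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2)),
    multivariate_normal(mean=[2.0, 1.0], cov=np.eye(2)),
]

def map_decision(x):
    """Decide for the class that maximizes p(class | x), which is proportional
    to p(x | class) * p(class)."""
    posteriors = np.array([p * pdf.pdf(x) for p, pdf in zip(priors, likelihoods)])
    posteriors /= posteriors.sum()          # normalize by the evidence p(x)
    return int(np.argmax(posteriors)), posteriors

label, post = map_decision(np.array([1.0, 0.5]))
print(label, post)
```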

Then we looked into the naive Bayes approach, where we basically assumed that the components of the feature vector are mutually independent. We derived a statistical model using this assumption, and we also considered some theoretical facts about the cases in which naive Bayes does a good job and the cases in which it is actually quite risky to apply this independence assumption.
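As an illustration of what the independence assumption buys us computationally, here is a small sketch where the class-conditional density factorizes into per-component Gaussians; all parameters are made up for the example.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical per-component Gaussian parameters for two classes and a 3D feature vector.
means = np.array([[0.0, 1.0, -1.0],
                  [2.0, 0.0,  1.0]])    # shape (num_classes, num_dimensions)
stds = np.ones_like(means)
priors = np.array([0.5, 0.5])

def naive_bayes_posterior(x):
    """Naive Bayes: log p(class) plus the sum over components of log p(x_d | class)."""
    log_joint = np.log(priors) + norm.logpdf(x, loc=means, scale=stds).sum(axis=1)
    log_joint -= np.logaddexp.reduce(log_joint)   # normalize in the log domain
    return np.exp(log_joint)

print(naive_bayes_posterior(np.array([1.0, 0.5, 0.0])))
```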

Then we thought about something that is obvious to many people who start to work on pattern recognition and pattern analysis: they look at feature vectors, they put some hyperplanes into the feature space, and they expect that the hyperplanes separate the classes. The question that we considered was how hyperplanes, or geometrical structures in the feature space, are related to the Bayesian classifier. For instance, I give you a set of 2D features and a decision boundary that is a linear function, a straight line, and I want you to write down the a posteriori probability. Based on this idea, we talked about logistic regression, which gives us an immediate relationship between the geometry of decision boundaries and the a posteriori probabilities.
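That relationship is short enough to sketch directly: a linear decision function pushed through the sigmoid yields the a posteriori probability, and the hyperplane itself is the level set where that probability equals 0.5. The weights below are invented for illustration.

```python
import numpy as np

# Hypothetical linear decision boundary w^T x + w0 = 0 in a 2D feature space.
w, w0 = np.array([1.5, -2.0]), 0.5

def posterior_class1(x):
    """Logistic regression: p(class 1 | x) = sigmoid(w^T x + w0).
    On the hyperplane w^T x + w0 = 0 the posterior is exactly 0.5."""
    return 1.0 / (1.0 + np.exp(-(w @ x + w0)))

x = np.array([0.3, 0.8])
print(posterior_class1(x), 1.0 - posterior_class1(x))
```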

So logistic regression was a very, very important topic. Then we talked about feature transforms: PCA, principal component analysis, and LDA, linear discriminant analysis. There we have seen two options: one was just generating spherically distributed data, and the other one also included a reduction of the dimension. And please remember that if we apply LDA, the number of classes basically decides the dimension that is required for the feature vectors to distinguish the various classes.
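As a reminder of the first of those two options, here is a small sketch of a whitening transform that produces spherically distributed data from an invented toy covariance; and for LDA with K classes, the discriminative subspace has at most K - 1 dimensions, which is where the dimension reduction comes from.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[4.0, 1.5], [1.5, 1.0]], size=500)

# Whitening: rotate onto the eigenvectors of the covariance and rescale,
# so that the transformed data has identity covariance ("spherical" data).
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
X_white = Xc @ eigvecs / np.sqrt(eigvals)

print(np.cov(X_white, rowvar=False).round(2))   # identity matrix up to numerics

# Dimension reduction would keep only the columns with the largest eigenvalues;
# LDA with K classes needs at most K - 1 such directions to separate the classes.
```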

Then we dug into more mathematical issues and talked about norms. For many of you, this chapter might have appeared a little theoretical, but it is a basic chapter providing concepts that are important if you look into the modern pattern recognition and pattern analysis literature: you will find a lot of recent results where the optimization problems that have to be solved make use of different norms.
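Just to recall what is meant by different norms, a few lines suffice: the same residual vector gets very different lengths under the L1, L2, and L-infinity norms, which is why the choice of norm changes the solution of an optimization problem. The vector is an arbitrary example.

```python
import numpy as np

r = np.array([3.0, -0.5, 0.0, 4.0])   # hypothetical residual vector

print(np.linalg.norm(r, 1))        # L1 norm: sum of absolute values (promotes sparsity as a penalty)
print(np.linalg.norm(r, 2))        # L2 norm: Euclidean length (classical least squares)
print(np.linalg.norm(r, np.inf))   # L-infinity norm: largest absolute component
```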

Then we looked a little bit into statistical learning theory, basically without going into the details, just noting the fact that we are in a huge theoretical field. We learned about Rosenblatt's perceptron and its update rule, and we also considered a convergence proof of the iteration scheme that Rosenblatt introduced to find linear decision boundaries. We have seen that the number of iterations is bounded if the classes are linearly separable, and that the bound is independent of the dimension of the feature space, a very important theoretical result in our field.
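For reference, here is a minimal sketch of Rosenblatt's update rule as just described: whenever a sample is misclassified, the sign-weighted sample is added to the weight vector. The toy data and iteration limit are invented for the example.

```python
import numpy as np

def perceptron_train(X, y, max_iter=1000):
    """Rosenblatt's perceptron for labels y in {-1, +1}; converges in a bounded
    number of updates if the classes are linearly separable."""
    X_aug = np.hstack([X, np.ones((len(X), 1))])    # append a bias component
    w = np.zeros(X_aug.shape[1])
    for _ in range(max_iter):
        errors = 0
        for x_i, y_i in zip(X_aug, y):
            if y_i * (w @ x_i) <= 0:    # misclassified (or on the boundary)
                w += y_i * x_i          # Rosenblatt's update rule
                errors += 1
        if errors == 0:                 # all samples classified correctly
            break
    return w

# Hypothetical linearly separable toy data.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
print(perceptron_train(X, y))
```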

What we started to consider in the last lecture, two weeks ago already, were ideas on optimization routines. We talked a little bit about optimization methods. For students of mathematics or technical mathematics this is nothing new; you have special lectures on optimization theory and optimization methods. But all the other students who work in the field of pattern recognition mostly have less experience with optimization methods and with the relationship between various optimization routines and norms. That is why I have introduced this chapter: to briefly give you a good idea of what we are basically doing, or what is basically required, for solving pattern recognition problems.
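As a taste of what such an optimization routine looks like, here is a minimal steepest descent sketch on an invented quadratic objective; the step size, tolerance, and matrix are illustrative choices, not values from the lecture.

```python
import numpy as np

def steepest_descent(grad, x0, step=0.1, tol=1e-6, max_iter=1000):
    """Plain steepest descent: repeatedly step against the gradient
    until the gradient becomes negligible."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x -= step * g
    return x

# Quadratic objective f(x) = 0.5 x^T A x - b^T x with gradient A x - b.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 2.0])
x_min = steepest_descent(lambda x: A @ x - b, x0=[0.0, 0.0])
print(x_min, np.linalg.solve(A, b))   # the two should roughly agree
```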

In pattern recognition, in the winter semester when I teach it, I usually say to people,

Accessible via: Open access
Duration: 01:25:08 min
Recording date: 2009-06-01
Uploaded on: 2017-07-05 12:50:32
Language: en-US
Tags: PA Optimization Descent Steepest Methods norms Support Vector Machine Margin