Thank you. Thank you for the nice introduction. It's a pleasure to be here and to contribute
a presentation to this wonderful seminar series. I should say that for me this presentation
is entirely new; it's the first time I have ever talked about this line of work, and it's
quite a fascinating line of work. In contrast to what I originally announced, this talk
will not be on 3D reconstruction but solely on 3D shape analysis. And I should say upfront
the work that I'm presenting is not really my own work but the work of our students,
current and former. In particular, these are Marvin Eisenberger, Aysim Toker, and
Zorah Lähner; these three did most of the work that I'm going to present
in the following. Where should I start? Well, this is about deep learning. So let's start
with deep learning. I think this is a slide that should be familiar to most of you.
If you've looked into the field of deep learning, this is at least for me what marks the breakthrough
of deep neural networks. And this was the ImageNet challenge that Fei-Fei Li and her
students set up for recognizing objects in images. I guess you're quite familiar with this challenge.
And at the time, around 2011, the best-performing methods on that benchmark that they put out
made roughly 26 percent incorrect classifications. Whereas humans, in terms of
seeing what is in an image, airplanes, cars, bicycles, et cetera, made only about
5 percent mistakes on average on the same data set. And so there was a huge gap between the performance
of computer vision algorithms and the performance of human observers. And it seemed at the time
pretty much impossible to ever achieve human level performance on this task, which is a
task that humans, one could argue, were designed for: recognizing things in images.
But then something surprising happened: Alex Krizhevsky proposed this deep neural
network architecture. Neural networks have been around for decades, but this one was
significantly deeper than previous ones and more sophisticated in its architecture with
multiple layers, et cetera. I think it's an architecture that everyone knows by now:
it's called AlexNet. And what he achieved in 2012 was an error of roughly 16 percent. So
a dramatic drop in error, a dramatic boost in performance. And in the wake of this work,
there came a whole series of deep network architectures, further refined and deeper networks, et cetera,
to a level where the deep network-based approaches, and I'm saying now, this is 2016, so it
isn't even fully up to date here, make fewer errors than humans. And this I
find is quite impressive and to me far more impressive than say computers playing chess,
right? For chess, you could say it's no surprise that a machine is better,
because humans were not designed by nature to play chess, whereas you could argue they were
designed to recognize objects in images. And so this for me is the real breakthrough
in artificial intelligence that was achieved here on this benchmark. In the wake of this
work, the community, including our team, focused on how we can carry the
success of deep networks beyond object recognition to more low-level image processing tasks
like optical flow estimation. There we proposed a technique called FlowNet, where
we basically train a network to predict optical flow from two given images.
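To make that idea concrete, here is a minimal sketch of such a network, not the actual FlowNet architecture: it simply stacks the two frames along the channel axis and regresses a dense two-channel flow field. All layer sizes and names below are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TinyFlowNet(nn.Module):
        """Minimal sketch of the idea behind FlowNet: concatenate two RGB frames
        along the channel axis and regress a dense 2-channel flow field (u, v).
        Layer sizes are illustrative, not the published architecture."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 2, kernel_size=4, stride=2, padding=1),  # (u, v) flow
            )

        def forward(self, img1, img2):
            x = torch.cat([img1, img2], dim=1)    # B x 6 x H x W
            return self.decoder(self.encoder(x))  # B x 2 x H x W

    # Training would then regress against ground-truth flow, e.g. on synthetic data:
    # loss = torch.nn.functional.l1_loss(net(img1, img2), gt_flow)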
We also showed in this video object segmentation paper that you can train neural networks to track
an object pixel-accurately across a video in very challenging sequences such as this one.
We also showed more recently that you can leverage the power of deep learning to boost
visual simultaneous localization and mapping and get completely unprecedented precision
and robustness for large-scale mapping from just a single moving camera. And most recently,
this is actually this year, upcoming next week at CVPR, we showed that you can train
neural networks to recover, from just a single moving camera, dense and faithful reconstructions
of the world, computed with a suitably trained deep network. So all of these are great achievements.
I think we see that deep learning can not only be deployed for classification as it
was traditionally designed, but for all sorts of data analysis challenges: segmentation,
optical flow, reconstruction, et cetera. What I want to talk about today, though, is how to
extend the power of deep networks to the analysis of 3D shapes. This is a topic that's ...
Daniel Cremers on "Self-supervised Learning for 3D Shape Analysis":
While neural networks have swept the field of computer vision and replaced classical methods in most areas of image analysis and beyond, extending their power to the domain of 3D shape analysis remains an important open challenge. In my presentation, I will focus on the problems of shape matching, correspondence estimation and shape interpolation and develop suitable deep learning approaches to tackle these challenges. In particular, I will focus on the difficult problem of computing correspondence and interpolation for pairs of shapes from different classes — say a human and a horse — where traditional isometry assumptions no longer hold.
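As a rough illustration of what correspondence estimation and interpolation mean in this setting, and explicitly not the approach presented in the talk, one could match two shapes by comparing per-vertex descriptors (however those descriptors are obtained) and then blend matched vertex positions. All function names and the linear blending below are illustrative assumptions.

    import torch

    def soft_correspondence(feat_x, feat_y, temperature=0.05):
        """Soft correspondence between two shapes from per-vertex descriptors.

        feat_x: Nx x D descriptors of shape X, feat_y: Ny x D descriptors of shape Y.
        Returns an Nx x Ny row-stochastic matrix Pi with Pi[i, j] ~ P(x_i matches y_j).
        """
        fx = torch.nn.functional.normalize(feat_x, dim=1)
        fy = torch.nn.functional.normalize(feat_y, dim=1)
        sim = fx @ fy.t()                       # cosine similarities between descriptors
        return torch.softmax(sim / temperature, dim=1)

    def interpolate(verts_x, verts_y, Pi, t):
        """Naive shape interpolation at time t in [0, 1]: linearly blend each vertex
        of X with its (soft) match on Y. Only a toy baseline for illustration."""
        matched = Pi @ verts_y                  # Nx x 3 matched positions on Y
        return (1.0 - t) * verts_x + t * matched

Such straight-line blending of matched points quickly distorts the geometry, which is exactly why dedicated interpolation and deformation models are needed, in particular for pairs from different classes where the isometry assumption no longer holds.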