
Welcome back to deep learning, and this is it. This is the final video. So today I want to show you a couple more applications of this known operator paradigm, and also some ideas of where I believe future research could actually go.

So let's see what I have here for you. Well, one thing that I would like to demonstrate is the simplified modern hearing aid pipeline. This is a collaboration with a company that produces hearing aids. They typically have a signal processing pipeline with two microphones.

They collect a speech signal, which is then run through an analysis filter bank, essentially a short-time Fourier transform. This is then run through directional microphone processing in order to focus on sound sources in front of you.

Then you apply noise reduction in order to get better intelligibility for the person wearing the hearing aid. After that, an automatic gain control is applied, and finally the frequency analysis is synthesized back into a speech signal that is played back on a loudspeaker inside the hearing aid.

There is also a recurrent connection, because you want to suppress feedback loops. You can find this kind of pipeline in modern hearing aids of various manufacturers; here you see some examples. The key problem in all of this processing is the noise reduction.

This is the difficult part. All the other steps we essentially know how to address with traditional signal processing, but the noise reduction remains a huge problem.

So what can we do? Well, we can map this entire hearing aid pipeline onto a deep recurrent network, and all of those steps can be expressed in terms of differentiable operations.
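As a rough sketch (not the actual implementation from this collaboration), the analysis, noise reduction, and synthesis stages could be phrased as differentiable PyTorch operations like this. The directional microphone, the automatic gain control, and the feedback path are omitted, the STFT parameters are illustrative assumptions, and for simplicity the network here predicts one gain per STFT bin rather than the 48 filter-bank channels described below.

```python
import torch

n_fft, hop = 512, 128  # illustrative STFT parameters, not from the lecture

def denoise_pipeline(signal: torch.Tensor, gain_net: torch.nn.Module) -> torch.Tensor:
    """One pass through analysis, gain-based noise reduction, and synthesis."""
    window = torch.hann_window(n_fft)
    # Analysis filter bank: short-time Fourier transform (differentiable).
    spec = torch.stft(signal, n_fft, hop_length=hop, window=window,
                      return_complex=True)              # shape: (freq, frames)
    # Noise reduction: a network predicts a gain in [0, 1] per time-frequency bin.
    gains = gain_net(spec.abs().T).T                    # shape: (freq, frames)
    # Synthesis filter bank: inverse STFT back to the time domain.
    return torch.istft(spec * gains, n_fft, hop_length=hop, window=window)
```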

So if we do so, we end up with the following outline. Actually, our network here is not so deep, because we only have three hidden layers, but each with 2,024 hidden nodes and ReLUs.

This is then used to predict the coefficients of a Wiener filter gain in order to suppress channels that contain noise.

So this is the setup. We have an input of 7,714 nodes from our normalized spectrum.

This is run through three hidden layers, which are fully connected with ReLUs.

In the end, we have an output of 48 channels produced by a sigmoid, yielding our Wiener gain.
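A minimal PyTorch sketch of this architecture, using only the layer sizes stated here (the recurrent context and the exact input normalization are not specified in the lecture and are left out):

```python
import torch.nn as nn

gain_net = nn.Sequential(
    nn.Linear(7714, 2024), nn.ReLU(),   # 7,714 inputs from the normalized spectrum
    nn.Linear(2024, 2024), nn.ReLU(),   # three fully connected hidden layers
    nn.Linear(2024, 2024), nn.ReLU(),
    nn.Linear(2024, 48), nn.Sigmoid(),  # 48 Wiener gains in [0, 1]
)
```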

We evaluated this on a data set of 259 clean speech signals.

We then had 48 non-stationary noise signals, and we mixed them with the speech.

So you could argue that what we are essentially training here is a kind of recurrent autoencoder, actually a denoising autoencoder, because as input we take the clean speech signal plus the noise, and at the output we want to produce the clean speech signal.
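As a sketch, one training step of this denoising setup could look as follows; the mean squared error loss and the time-domain comparison are assumptions, since the lecture does not state the exact objective, and `denoise_pipeline` refers to the earlier sketch.

```python
import torch

def training_step(clean, noise, gain_net, optimizer):
    """Denoising-autoencoder step: noisy input in, clean speech as the target."""
    noisy = clean + noise                         # input: clean speech plus noise
    estimate = denoise_pipeline(noisy, gain_net)  # sketch from above
    n = min(clean.shape[-1], estimate.shape[-1])  # STFT round-trip may pad/trim
    loss = torch.nn.functional.mse_loss(estimate[..., :n], clean[..., :n])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```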

Now, this is the example.

And then you may say, which guesthouses are near me?

And let's try a non-stationary noise pattern.

And this is an electric drill. Also note that the network has never heard an electric drill before.

This typically kills your hearing aid.

Can you tell me which guesthouses are near me? I live in the Höhlesprache. I would like to try something completely new.

And let's listen to the output.

Can you tell me which guesthouses are near me? I live in the Höhlesprache. I would like to try something completely new.

Maybe Japanese kitchen, but a little bit of vegetarian.

So you can hear that the non-stationary noise is also suppressed very well.

Wow, so that's pretty cool. Of course, there are many more applications of this.

And let's look into one more idea. Can we derive networks?

So here, let's say you have a scenario where you collect data in a format that you don't like, but you know the formal relation between the data and the desired projection.

So the example that I'm showing here is a cone-beam acquisition.

Let's say this is simply a typical X-ray geometry. So you take an X-ray, and this is typically acquired in cone-beam geometry.

Now, the cone-beam geometry can be described entirely using a linear operator, as we've already seen in the previous video. So we can express the relation between the object x and our projection p_CB through the cone-beam system matrix A_CB as p_CB = A_CB x.

Now, the cone-beam acquisition is not so great, because you have magnification in there.

So an object close to the X-ray source will be magnified more than an object close to the detector.

So this is not so great for diagnosis. In orthopedics, they would prefer parallel projections, because then structures are orthogonally projected and not magnified.

This would be really great for diagnosis: you would have metric projections, so you could simply measure in the image, and everything would have the same size as inside the body.

So this would be really nice for diagnosis, but typically we can't measure it with the systems that we have.

So in order to create such an image, you would have to compute a full reconstruction of the object: do a full CT scan from all sides, reconstruct the object, and then project it again.

Typically, people in orthopedics don't like slice volumes because they are far too complicated to read; projection images are much easier to read.

Well, what can we do? We know that the factor connecting the two equations is x. So we can simply solve the first equation for x: once we have the matrix inverse of A_CB, we would simply multiply it onto our projection image to obtain x.

But we are not interested in the reconstruction; we are interested in the parallel projection. So let's plug x into the second equation, and we see that by applying this series of matrices, we can convert our cone-beam projections into a parallel-beam projection.
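Written out, the three steps are as follows, where A_CB^{-1} stands for a suitable (pseudo-)inverse, since the system matrix is generally not square:

```latex
p_{\mathrm{CB}} = A_{\mathrm{CB}}\, x
\quad\Rightarrow\quad
x = A_{\mathrm{CB}}^{-1}\, p_{\mathrm{CB}}
\quad\Rightarrow\quad
p_{\mathrm{P}} = A_{\mathrm{P}}\, x = A_{\mathrm{P}}\, A_{\mathrm{CB}}^{-1}\, p_{\mathrm{CB}}
```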

So no explicit reconstruction is required as the final output; it only appears as an intermediate step.

Of course, you don't just acquire a single projection here. You may want to acquire a couple of those projections.

Let's say three or four projections, but not thousands as you would in a CT scan.

Now, if you look at this set of equations, we know all of the operations.
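As a toy numerical check (a sketch with random stand-in matrices rather than real scan geometries), the whole conversion chain can be written directly:

```python
import numpy as np

n_voxels, n_cb, n_par = 64, 96, 48
A_cb = np.random.rand(n_cb, n_voxels)   # cone-beam system matrix (known operator)
A_p = np.random.rand(n_par, n_voxels)   # parallel-beam system matrix (known operator)

x_true = np.random.rand(n_voxels)       # the unknown object
p_cb = A_cb @ x_true                    # the measured cone-beam projection

# Convert cone-beam to parallel-beam; the reconstruction x only appears implicitly.
# pinv stands in for a proper regularized inverse, which in the known-operator
# view could be refined by training.
p_par = A_p @ np.linalg.pinv(A_cb) @ p_cb
print(np.allclose(p_par, A_p @ x_true))  # True: A_cb has full column rank here
```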

Part of a video series
Accessible via: Open Access
Duration: 00:24:50 min
Recording date: 2020-10-12
Uploaded on: 2020-10-13 00:36:19
Language: en-US

Deep Learning - Known Operator Learning Part 4

This is the final video of the class in which we present more applications of known operator learning. Furthermore, the video also contains my personal interpretation of where the field is heading and what kind of research will follow next.

For reminders to watch new videos, follow me on Twitter or LinkedIn.

Further Reading:
A gentle Introduction to Deep Learning

References
[1] Florin Ghesu et al. Robust Multi-Scale Anatomical Landmark Detection in Incomplete 3D-CT Data. Medical Image Computing and Computer-Assisted Intervention (MICCAI), Quebec, Canada, pp. 194-202, 2017. MICCAI Young Researcher Award.
[2] Florin Ghesu et al. Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans. IEEE Transactions on Pattern Analysis and Machine Intelligence, ePub ahead of print, 2018.
[3] Bastian Bier et al. X-ray-transform Invariant Anatomical Landmark Detection for Pelvic Trauma Surgery. MICCAI 2018. MICCAI Young Researcher Award.
[4] Yixing Huang et al. Some Investigations on Robustness of Deep Learning in Limited Angle Tomography. MICCAI 2018.
[5] Andreas Maier et al. Precision Learning: Towards Use of Known Operators in Neural Networks. ICPR 2018.
[6] Tobias Würfl, Florin Ghesu, Vincent Christlein, Andreas Maier. Deep Learning Computed Tomography. MICCAI 2016.
[7] Kerstin Hammernik et al. A Deep Learning Architecture for Limited-Angle Computed Tomography Reconstruction. Bildverarbeitung für die Medizin 2017, Springer Vieweg, Berlin, Heidelberg, pp. 92-97, 2017.
[8] Marc Aubreville et al. Deep Denoising for Hearing Aid Applications. 16th International Workshop on Acoustic Signal Enhancement (IWAENC), IEEE, 2018.
[9] Christopher Syben, Bernhard Stimpel, Jonathan Lommen, Tobias Würfl, Arnd Dörfler, Andreas Maier. Deriving Neural Network Architectures using Precision Learning: Parallel-to-fan beam Conversion. GCPR 2018. https://arxiv.org/abs/1807.03057
[10] Weilin Fu et al. Frangi-net. Bildverarbeitung für die Medizin 2018, Springer Vieweg, Berlin, Heidelberg, pp. 341-346, 2018.
[11] Weilin Fu, Lennart Husvogt, Stefan Ploner, James G. Fujimoto, Andreas Maier. Lesson Learnt: Modularization of Deep Networks Allow Cross-Modality Reuse. arXiv preprint arXiv:1911.02080, 2019.
