12 - Deep Learning - Plain Version 2020/ClipID:21059

Recording date: 2020-10-11
Via: Free
Language: English
Organisational Unit: Lehrstuhl für Informatik 5 (Mustererkennung)
Producer: Lehrstuhl für Informatik 5 (Mustererkennung)
Format: lecture

Deep Learning - Loss and Optimization Part 3

This video covers optimization in more detail, including variants of the gradient descent procedure such as momentum and Adam.
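
As a rough illustration of the two update rules named above, the following is a minimal sketch in Python with NumPy. It is not taken from the lecture: the toy quadratic loss, the function names (`grad_f`, `momentum_step`, `adam_step`, `run_momentum`, `run_adam`) and all hyperparameter values are assumptions chosen only for the example. The momentum rule is written in its common "velocity" form, and the Adam rule uses the usual bias-corrected moment estimates.

```python
import numpy as np


def grad_f(w):
    """Gradient of the assumed toy loss f(w) = 0.5 * (w[0]**2 + 10 * w[1]**2)."""
    return np.array([1.0, 10.0]) * w


def momentum_step(w, v, grad, lr=0.05, mu=0.9):
    """One gradient descent step with momentum (velocity form)."""
    v = mu * v - lr * grad  # decay the old velocity, add the new gradient step
    return w + v, v


def adam_step(w, m, s, grad, t, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step; t is the 1-based iteration counter."""
    m = beta1 * m + (1.0 - beta1) * grad       # running mean of gradients
    s = beta2 * s + (1.0 - beta2) * grad ** 2  # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)             # bias correction for m
    s_hat = s / (1.0 - beta2 ** t)             # bias correction for s
    w = w - lr * m_hat / (np.sqrt(s_hat) + eps)
    return w, m, s


def run_momentum(steps=300):
    """Minimize the toy loss with momentum, starting from (1, 1)."""
    w = np.array([1.0, 1.0])
    v = np.zeros_like(w)
    for _ in range(steps):
        w, v = momentum_step(w, v, grad_f(w))
    return w


def run_adam(steps=300):
    """Minimize the toy loss with Adam, starting from (1, 1)."""
    w = np.array([1.0, 1.0])
    m = np.zeros_like(w)
    s = np.zeros_like(w)
    for t in range(1, steps + 1):
        w, m, s = adam_step(w, m, s, grad_f(w), t)
    return w
```

Calling `run_momentum()` or `run_adam()` returns the final parameter vector, which should lie close to the minimum at the origin for this toy loss. Adam divides each coordinate by its own gradient scale, which is why the steep and the flat direction of the quadratic are treated alike in this sketch.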

References

[1] Christopher M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2006.
[2] Anna Choromanska, Mikael Henaff, Michael Mathieu, et al. “The Loss Surfaces of Multilayer Networks”. In: AISTATS. 2015.
[3] Yann N Dauphin, Razvan Pascanu, Caglar Gulcehre, et al. “Identifying and attacking the saddle point problem in high-dimensional non-convex optimization”. In: Advances in neural information processing systems. 2014, pp. 2933–2941.
[4] Yichuan Tang. “Deep learning using linear support vector machines”. In: arXiv preprint arXiv:1306.0239 (2013).
[5] Sashank J. Reddi, Satyen Kale, and Sanjiv Kumar. “On the Convergence of Adam and Beyond”. In: International Conference on Learning Representations. 2018.
[6] Katarzyna Janocha and Wojciech Marian Czarnecki. “On Loss Functions for Deep Neural Networks in Classification”. In: arXiv preprint arXiv:1702.05659 (2017).
[7] Jeffrey Dean, Greg Corrado, Rajat Monga, et al. “Large scale distributed deep networks”. In: Advances in neural information processing systems. 2012, pp. 1223–1231.
[8] Maren Mahsereci and Philipp Hennig. “Probabilistic line searches for stochastic optimization”. In: Advances in neural information processing systems. 2015, pp. 181–189.
[9] Jason Weston, Chris Watkins, et al. “Support vector machines for multi-class pattern recognition”. In: ESANN. Vol. 99. 1999, pp. 219–224.
[10] Chiyuan Zhang, Samy Bengio, Moritz Hardt, et al. “Understanding deep learning requires rethinking generalization”. In: arXiv preprint arXiv:1611.03530 (2016).

Further Reading:
A gentle Introduction to Deep Learning
