Optimization
References
- http://videolectures.net/deeplearning2015_goodfellow_network_optimization/ (Ian Goodfellow's tutorial on neural network optimization at Deep Learning Summer School 2015)
- http://int8.io/comparison-of-optimization-techniques-stochastic-gradient-descent-momentum-adagrad-and-adadelta (implementation and comparison of popular methods)
- http://www.deeplearningbook.org/contents/numerical.html (basic intro to gradient-based optimization in section 4.3)
- http://www.deeplearningbook.org/contents/optimization.html (8.1 generalization, 8.2 problems, 8.3 algorithms, 8.4 initialization, 8.5 adaptive learning rates, 8.6 approximate second-order methods, 8.7 meta-algorithms)
- http://andrew.gibiansky.com/blog/machine-learning/gauss-newton-matrix/ (great posts on optimization)
- https://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf (Shewchuk's excellent tutorial on conjugate gradient, gradient descent, eigenvectors, etc.)
- http://arxiv.org/abs/1412.6544 (Goodfellow, Vinyals and Saxe, Qualitatively Characterizing Neural Network Optimization Problems)
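- https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides/lec6.pdf (Hinton's slides, lecture 6: mini-batch gradient descent, momentum, rmsprop)
- https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides/lec8.pdf (Hinton's slides, lecture 8: includes an overview of Hessian-free optimization)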
- http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html (Alec Radford's animations of optimization algorithms)
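- http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf (Martens, Deep Learning via Hessian-Free Optimization, ICML 2010)
- http://arxiv.org/abs/1503.05671 (Martens and Grosse, Optimizing Neural Networks with Kronecker-Factored Approximate Curvature)
- http://arxiv.org/abs/1412.1193 (Martens, New Insights and Perspectives on the Natural Gradient Method)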
- http://www.springer.com/us/book/9780387303031 (Nocedal and Wright, Numerical Optimization)
- http://www.nrbook.com (Numerical Recipes)
- https://maths-people.anu.edu.au/~brent/pub/pub011.html (Brent, Algorithms for Minimization without Derivatives)
- http://stanford.edu/~boyd/cvxbook/ (Boyd and Vandenberghe, Convex Optimization; covers convex problems only)