How does high-order information accelerate neural network training?

Xunpeng Huang
Bytedance AI Lab
MLNLC, Nov 2020

Outline
    Preliminaries
    Accelerate neural network training with second moments
    Approximate Hessian with a low computational complexity
    Summary
    References

Preliminaries

A brief introduction to optimizers and their applications

Figure: Logistic regression and SVM classification

Most optimization problems are solved with optimizers, and different problems use different ones.
Many tasks in CV and NLP can be formulated as optimization problems.

A brief introduction to optimizers and their applications

Optimization problems
    Stochastic problems: $\arg\min_x f(x) = \mathbb{E}_{\xi}[F(x, \xi)]$
    Finite-sum problems: $\arg\min_x f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)$
where $x$ denotes the parameters and $f(x)$ the objective function.

Optimizer        Update rule
GD               $x_{t+1} = x_t - \eta_t \nabla f(x_t)$
Newton method    $x_{t+1} = x_t - [\nabla^2 f(x_t)]^{-1} \nabla f(x_t)$

Table: Deterministic optimizers

OptimizersUp