A module for gradient descent optimizers
structure that stores optimizer state, e.g., momentum
CPU version of cudnn.transform
public API to update a target model
fills gradient arrays with zeros
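
Zeroing gradients between updates could look roughly like the sketch below. The function name `zero_gradients` and the `Vec<f32>` buffer layout are assumptions made for illustration; the module's actual signature is not shown here.

```rust
/// Hypothetical helper (not the module's real API): overwrite every
/// gradient buffer with zeros so the next backward pass starts clean.
fn zero_gradients(grads: &mut [Vec<f32>]) {
    for g in grads.iter_mut() {
        // `fill` sets every element of the buffer to the given value.
        g.fill(0.0);
    }
}
```

Without this reset, a framework that accumulates gradients would add each new backward pass onto the values left over from the previous one.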
AdaDelta optimizer: http://www.matthewzeiler.com/pubs/googleTR2012/googleTR2012.pdf
AdaGrad optimizer: http://jmlr.org/papers/v12/duchi11a.html
Adam optimizer: https://arxiv.org/pdf/1412.6980v8.pdf
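
For the Adam reference above (arXiv 1412.6980), a single update step can be sketched as follows. `AdamState` and `adam_step` are hypothetical names, and the flat `f32` slices are an assumption; the sketch follows the update rule from the paper, not necessarily this module's implementation.

```rust
/// Hypothetical per-parameter state for Adam: first and second moment
/// estimates plus the step counter used for bias correction.
struct AdamState {
    m: Vec<f32>,
    v: Vec<f32>,
    t: i32,
}

impl AdamState {
    fn new(len: usize) -> Self {
        AdamState { m: vec![0.0; len], v: vec![0.0; len], t: 0 }
    }
}

/// One Adam step over a flat parameter slice (illustrative signature).
fn adam_step(
    params: &mut [f32],
    grads: &[f32],
    state: &mut AdamState,
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
) {
    assert_eq!(params.len(), grads.len());
    assert_eq!(params.len(), state.m.len());
    state.t += 1;
    // Bias-correction denominators for the zero-initialized moments.
    let bc1 = 1.0 - beta1.powi(state.t);
    let bc2 = 1.0 - beta2.powi(state.t);
    for i in 0..params.len() {
        let g = grads[i];
        state.m[i] = beta1 * state.m[i] + (1.0 - beta1) * g;
        state.v[i] = beta2 * state.v[i] + (1.0 - beta2) * g * g;
        let m_hat = state.m[i] / bc1;
        let v_hat = state.v[i] / bc2;
        params[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly `lr` regardless of the gradient's magnitude.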
stochastic gradient descent optimizer
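
As a minimal sketch of how stochastic gradient descent might use stored state such as momentum, consider the following. `MomentumState` and `sgd_momentum_step` are hypothetical names, and the classical momentum formulation is an assumption; the module may use a different variant.

```rust
/// Hypothetical state carried between steps: one velocity per parameter.
struct MomentumState {
    velocity: Vec<f32>,
}

/// One SGD step with classical momentum (illustrative signature):
/// v <- momentum * v - lr * g, then p <- p + v.
fn sgd_momentum_step(
    params: &mut [f32],
    grads: &[f32],
    state: &mut MomentumState,
    lr: f32,
    momentum: f32,
) {
    assert_eq!(params.len(), grads.len());
    assert_eq!(params.len(), state.velocity.len());
    for ((p, g), v) in params
        .iter_mut()
        .zip(grads)
        .zip(state.velocity.iter_mut())
    {
        *v = momentum * *v - lr * *g;
        *p += *v;
    }
}
```

With `momentum` set to `0.0` this reduces to plain SGD, which is why a single state structure can serve both cases.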
trait to identify optimizer
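
One way such a trait could let training code stay independent of the concrete optimizer is sketched below. The source does not say which methods the trait declares, so the `update` method, the `PlainSgd` type, and `train_step` are all assumptions for illustration.

```rust
/// Hypothetical optimizer trait; the real trait's methods are not shown
/// in this listing and may differ (it may even be a marker trait).
trait Optimizer {
    fn update(&mut self, params: &mut [f32], grads: &[f32]);
}

/// Illustrative implementor: plain SGD with a fixed learning rate.
struct PlainSgd {
    lr: f32,
}

impl Optimizer for PlainSgd {
    fn update(&mut self, params: &mut [f32], grads: &[f32]) {
        for (p, g) in params.iter_mut().zip(grads) {
            *p -= self.lr * *g;
        }
    }
}

/// Training code that depends only on the trait: apply the update,
/// then zero the gradients for the next iteration.
fn train_step(opt: &mut dyn Optimizer, params: &mut [f32], grads: &mut [f32]) {
    opt.update(params, grads);
    grads.fill(0.0);
}
```

Taking `&mut dyn Optimizer` means a different implementor can be swapped in without touching `train_step`.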