ref: http://rnduja.github.io/2015/10/26/deep_learning_with_torch_step_7_optim/
doc: https://github.com/torch/optim/blob/master/doc/intro.md
Previously, we implemented the gradient descent (GD) update step by defining a
gradientUpdate function and calling it in a loop.
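As a framework-free illustration of that pattern (a minimal sketch, not the code from the referenced tutorial), the update w <- w - learningRate * gradient can be written as a small function that is called in a loop. The names gradient, gradientStep and descend, and the toy loss f(w) = (w - 3)^2, are hypothetical and chosen only for this example:

```lua
-- Hypothetical example: minimize f(w) = (w - 3)^2 with plain gradient descent.
-- Plain Lua, no Torch required; every name here is illustrative.

-- Gradient of f(w) = (w - 3)^2 with respect to w.
function gradient(w)
  return 2 * (w - 3)
end

-- One gradient descent step: move w against the gradient.
function gradientStep(w, learningRate)
  return w - learningRate * gradient(w)
end

-- Call the update step in a loop and return the final parameter.
function descend(w, learningRate, steps)
  for i = 1, steps do
    w = gradientStep(w, learningRate)
  end
  return w
end

print(descend(0, 0.1, 100))  -- approaches the minimizer w = 3
```

The Torch version follows the same shape, with the model and criterion supplying the gradient: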
function gradientUpdate(model, x, y, criterion, learningRate)
local pred = model:forward(x ...