Super Kai (Kazuya Ito) @superkai_kazuya

Implementing Adam Optimizer In PyTorch

Adam() optimizer explained in 250 characters: "Adam() optimizes gradient descent with Momentum & RMSProp. Args: params, lr, betas, eps, weight_decay, amsgrad, foreach, maximize, capturable, differentiable, fused."
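For reference, this is the standard Adam update rule (as described in the PyTorch docs and the original Adam paper): a Momentum-style moving average of the gradient plus an RMSProp-style moving average of the squared gradient, with bias correction.

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t                 % Momentum: 1st-moment estimate
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2               % RMSProp: 2nd-moment estimate
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t}                      % bias correction
\theta_t = \theta_{t-1} - lr \cdot \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + eps}   % parameter update
```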

*Memos:

My post explains Adam.
My post explains Module().

Adam() can do gradient descent with Momentum and RMSProp as shown below:
*Memos:

The 1st argument for initialization is params(Required-Type:generator).
The 2nd argument for initialization is lr(Optional-Default:0.001-Type:int or float). *It must be 0 <= x.
The 3rd argument for initialization is betas(Optional-Default:(0.9, 0.999)-Type:tuple or list of int or float). *It must be 0 <= x < 1.
The 4th argument for initialization is eps(Optional-Default:1e-08-Type:int or float). *It must be 0 <= x.
The 5th argument for initialization is weight_decay(Optional-Default:0-Type:int or float). *It must be 0 <= x.
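Below is a minimal sketch of using Adam() in PyTorch with the arguments listed above, assuming the standard torch.optim.Adam API. The toy linear model, the dummy data, and the short training loop are my own illustration, not code from the original post.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(in_features=4, out_features=1)  # toy model for illustration

optimizer = torch.optim.Adam(
    params=model.parameters(),  # 1st argument (required, generator)
    lr=0.001,                   # 2nd argument (default 0.001)
    betas=(0.9, 0.999),         # 3rd argument (Momentum & RMSProp decay rates)
    eps=1e-08,                  # 4th argument (added to the denominator for stability)
    weight_decay=0,             # 5th argument (L2 penalty)
    amsgrad=False,              # use the AMSGrad variant if True
    maximize=False,             # maximize the objective instead of minimizing
)

x = torch.randn(8, 4)           # dummy inputs
y = torch.randn(8, 1)           # dummy targets
loss_fn = nn.MSELoss()

for step in range(3):           # a few gradient-descent steps with Adam
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    print(step, loss.item())
```

The remaining arguments (foreach, capturable, differentiable, fused) are left at their defaults in this sketch.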