jittor.optim
This is the API documentation for Jittor's optimizer module. You can access it via from jittor import optim.
- class jittor.optim.Adam(params, lr, eps=1e-08, betas=(0.9, 0.999), weight_decay=0)[source]
Adam Optimizer.
Example:
    optimizer = nn.Adam(model.parameters(), lr, eps=1e-8, betas=(0.9, 0.999))
    optimizer.step(loss)
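For context, here is a hedged end-to-end sketch of driving Adam on a toy regression problem; the model, data, and hyper-parameter values are illustrative assumptions, not part of the documented API:

    import jittor as jt
    from jittor import nn

    # toy model and data (illustrative assumptions)
    model = nn.Linear(3, 1)
    optimizer = nn.Adam(model.parameters(), lr=1e-3, eps=1e-8, betas=(0.9, 0.999))

    x = jt.randn(16, 3)
    y = jt.randn(16, 1)

    for epoch in range(5):
        pred = model(x)
        loss = ((pred - y) ** 2).mean()
        # step(loss) computes the gradients of loss and applies the Adam update
        optimizer.step(loss)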
- class jittor.optim.AdamW(params, lr, eps=1e-08, betas=(0.9, 0.999), weight_decay=0)[source]
AdamW Optimizer.
Example:
    optimizer = nn.AdamW(model.parameters(), lr, eps=1e-8, betas=(0.9, 0.999))
    optimizer.step(loss)
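AdamW's distinguishing feature is its decoupled weight decay, configured with the weight_decay argument. A hedged construction sketch (the value 0.01 is an illustrative assumption; model, lr, and loss are as in the examples above):

    # weight_decay=0.01 is illustrative; 0 disables decay (the default)
    optimizer = nn.AdamW(model.parameters(), lr, eps=1e-8,
                         betas=(0.9, 0.999), weight_decay=0.01)
    optimizer.step(loss)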
- class jittor.optim.Optimizer(params, lr, param_sync_iter=10000)[source]
Base class of all optimizers.
Example:
    optimizer = nn.SGD(model.parameters(), lr)
    optimizer.step(loss)
- backward(loss, retain_graph=False)[source]
optimizer.backward(loss) is used to accumulate gradients over multiple steps. It can be used as follows:
Original source code:

    n_iter = 10000
    batch_size = 100
    ...
    for i in range(n_iter):
        ...
        loss = calc_loss()
        optimizer.step(loss)

Accumulation version:

    n_iter = 10000
    batch_size = 100
    accumulation_steps = 10
    n_iter *= accumulation_steps
    batch_size //= accumulation_steps
    ...
    for i in range(n_iter):
        ...
        loss = calc_loss()
        # if loss is a mean across the batch, divide it by accumulation_steps
        optimizer.backward(loss / accumulation_steps)
        if (i + 1) % accumulation_steps == 0:
            optimizer.step()
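The loop above is schematic (calc_loss() is a placeholder). Below is a hedged, self-contained sketch of the same accumulation pattern on a toy model; the model, data, and accumulation_steps value are illustrative assumptions:

    import jittor as jt
    from jittor import nn

    model = nn.Linear(4, 1)
    optimizer = nn.SGD(model.parameters(), lr=0.01)

    accumulation_steps = 4
    x = jt.randn(32, 4)
    y = jt.randn(32, 1)

    for i in range(8):
        xb = x[i * 4:(i + 1) * 4]            # micro-batch of 4 samples
        yb = y[i * 4:(i + 1) * 4]
        loss = ((model(xb) - yb) ** 2).mean()
        # loss is a mean over the micro-batch, so divide by accumulation_steps
        optimizer.backward(loss / accumulation_steps)
        if (i + 1) % accumulation_steps == 0:
            optimizer.step()                 # apply the accumulated gradients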
- clip_grad_norm(max_norm: float, norm_type: int = 2)[source]
Clips gradient norm of this optimizer. The norm is computed over all gradients together.
Args:
    max_norm (float or int): max norm of the gradients
    norm_type (int): 1-norm or 2-norm
Example:
    a = jt.ones(2)
    opt = jt.optim.SGD([a], 0.1)
    loss = a*a
    opt.zero_grad()
    opt.backward(loss)
    print(opt.param_groups[0]['grads'][0].norm())  # output: 2.83
    opt.clip_grad_norm(0.01, 2)
    print(opt.param_groups[0]['grads'][0].norm())  # output: 0.01
    opt.step()
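The call order matters: zero the gradients, accumulate them with backward, clip, then step. A hedged sketch of that ordering inside a loop, reusing the variable from the example above; the max_norm value of 1.0 is an illustrative assumption:

    import jittor as jt

    a = jt.ones(2)
    opt = jt.optim.SGD([a], 0.1)

    for _ in range(3):
        loss = (a * a).sum()
        opt.zero_grad()
        opt.backward(loss)
        opt.clip_grad_norm(1.0, norm_type=2)  # clip before the update is applied
        opt.step()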
- property defaults
- post_step()[source]
Work that should be done after the update step, such as zeroing the gradients.
Example:
    class MyOptimizer(Optimizer):
        def step(self, loss):
            self.pre_step(loss)
            ...
            self.post_step()
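A hedged sketch of a complete subclass following this pattern. It assumes the bookkeeping used by the built-in optimizers (param_groups entries holding "params" and "grads", Var.update for in-place parameter updates, and is_stop_grad for frozen parameters); check these names against the linked source before relying on them:

    from jittor.optim import Optimizer

    class MySGD(Optimizer):
        def step(self, loss=None):
            self.pre_step(loss)                # compute and collect gradients
            for pg in self.param_groups:
                for p, g in zip(pg["params"], pg["grads"]):
                    if p.is_stop_grad():
                        continue               # skip frozen parameters
                    p.update(p - self.lr * g)  # plain gradient-descent update
            self.post_step()                   # post-update bookkeeping

    # usage: opt = MySGD(model.parameters(), lr=0.01); opt.step(loss)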
- class jittor.optim.RMSprop(params, lr=0.01, eps=1e-08, alpha=0.99)[source]
RMSprop Optimizer.
Args:
    params(list): parameters of model.
    lr(float): learning rate.
    eps(float): term added to the denominator to avoid division by zero, default 1e-8.
    alpha(float): smoothing constant, default 0.99.
Example:
    optimizer = nn.RMSprop(model.parameters(), lr)
    optimizer.step(loss)
- class jittor.optim.SGD(params, lr, momentum=0, weight_decay=0, dampening=0, nesterov=False)[source]
SGD Optimizer.
Example:
    optimizer = nn.SGD(model.parameters(), lr, momentum=0.9)
    optimizer.step(loss)
- jittor.optim.opt_grad(v: jittor_core.jittor_core.Var, opt: jittor.optim.Optimizer)[source]
Get the gradient of a certain variable in the optimizer. Example:
    model = Model()
    optimizer = SGD(model.parameters(), lr)
    ...
    optimizer.backward(loss)
    for p in model.parameters():
        grad = p.opt_grad(optimizer)
The following is the API documentation for Jittor's learning rate scheduler module. The schedulers are meant to be used together with an optimizer, and you can access the module via from jittor import lr_scheduler.
- class jittor.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1)[source]
The learning rate is multiplied by gamma at each step.
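A hedged usage sketch, assuming the scheduler is advanced once per epoch with scheduler.step(), as in the usual scheduler pattern; the toy model, gamma value, and epoch count are illustrative assumptions:

    import jittor as jt
    from jittor import nn, lr_scheduler

    model = nn.Linear(3, 1)
    optimizer = nn.SGD(model.parameters(), lr=0.1)
    scheduler = lr_scheduler.ExponentialLR(optimizer, gamma=0.9)

    x = jt.randn(8, 3)
    y = jt.randn(8, 1)

    for epoch in range(5):
        loss = ((model(x) - y) ** 2).mean()
        optimizer.step(loss)
        scheduler.step()   # multiplies the learning rate by gamma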