Switching RTDETR to the Lion Optimizer
Paper: https://arxiv.org/abs/2302.06675
Code: https://github.com/google/automl/blob/master/lion/lion_pytorch.py

Overview: Lion is a gradient-based optimization algorithm designed to improve the effectiveness of gradient descent in deep learning. Its key characteristics:

- Uniform, sign-based step size: Lion updates every parameter by the same magnitude, lr * sign(update), so the step size does not depend on the scale of the gradient. This keeps training insensitive to gradient magnitude and helps the model converge quickly without getting trapped by a few coordinates with outsized gradients.
- Momentum acceleration: Lion maintains an exponential moving average of past gradients and interpolates it with the current gradient to form the update direction. Accumulating this history stabilizes parameter updates and damps oscillation.
- Balanced parameter updates: because every coordinate moves by the same amount per step, no parameter's update is dominated by unusually large or small gradients; the authors argue this acts as a form of regularization and can improve generalization.
Compared with AdamW and other adaptive optimizers, which must track both first and second moments, Lion keeps only the momentum, halving the extra memory footprint. Because of this simplicity, Lion also runs faster in practice: in the authors' experiments its throughput (steps/s) is typically 2-15% higher than that of AdamW and Adafactor.
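The update rule behind these numbers is small enough to sketch in a few lines. Below is a torch-free, scalar sketch of a single Lion step (the function and variable names are mine, not from the official implementation); note the single momentum buffer and the sign-based step:

```python
def lion_step(p, grad, exp_avg, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """Return (parameter, momentum) after one Lion step on a scalar."""
    # Decoupled weight decay, as in AdamW
    p = p * (1 - lr * weight_decay)
    # Update direction: sign of an interpolation of momentum and gradient
    update = beta1 * exp_avg + (1 - beta1) * grad
    sign = 1.0 if update > 0 else (-1.0 if update < 0 else 0.0)
    p = p - lr * sign
    # Momentum tracks an EMA of the gradient -- the only state Lion keeps
    exp_avg = beta2 * exp_avg + (1 - beta2) * grad
    return p, exp_avg

p, m = lion_step(1.0, grad=0.5, exp_avg=0.0, lr=0.1)
print(p, m)  # -> 0.9 0.005
```

The parameter moved by exactly lr (not by lr * grad), and the only stored state is the single momentum value, which is where the memory saving over AdamW comes from.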
Optimizer code:
# Copyright 2023 Google Research. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
"""PyTorch implementation of the Lion optimizer."""
import torch
from torch.optim.optimizer import Optimizer


class Lion(Optimizer):
  r"""Implements Lion algorithm."""

  def __init__(self, params, lr=1e-4, betas=(0.9, 0.99), weight_decay=0.0):
    """Initialize the hyperparameters.

    Args:
      params (iterable): iterable of parameters to optimize or dicts defining
        parameter groups
      lr (float, optional): learning rate (default: 1e-4)
      betas (Tuple[float, float], optional): coefficients used for computing
        running averages of gradient and its square (default: (0.9, 0.99))
      weight_decay (float, optional): weight decay coefficient (default: 0)
    """
    if not 0.0 <= lr:
      raise ValueError('Invalid learning rate: {}'.format(lr))
    if not 0.0 <= betas[0] < 1.0:
      raise ValueError('Invalid beta parameter at index 0: {}'.format(betas[0]))
    if not 0.0 <= betas[1] < 1.0:
      raise ValueError('Invalid beta parameter at index 1: {}'.format(betas[1]))
    defaults = dict(lr=lr, betas=betas, weight_decay=weight_decay)
    super().__init__(params, defaults)

  @torch.no_grad()
  def step(self, closure=None):
    """Performs a single optimization step.

    Args:
      closure (callable, optional): A closure that reevaluates the model
        and returns the loss.

    Returns:
      the loss.
    """
    loss = None
    if closure is not None:
      with torch.enable_grad():
        loss = closure()

    for group in self.param_groups:
      for p in group['params']:
        if p.grad is None:
          continue

        # Perform stepweight decay
        p.data.mul_(1 - group['lr'] * group['weight_decay'])

        grad = p.grad
        state = self.state[p]
        # State initialization
        if len(state) == 0:
          # Exponential moving average of gradient values
          state['exp_avg'] = torch.zeros_like(p)

        exp_avg = state['exp_avg']
        beta1, beta2 = group['betas']

        # Weight update
        update = exp_avg * beta1 + grad * (1 - beta1)
        p.add_(update.sign_(), alpha=-group['lr'])
        # Decay the momentum running average coefficient
        exp_avg.mul_(beta2).add_(grad, alpha=1 - beta2)

    return loss

Copy the code above into a new file, lion_pytorch.py, under ultralytics/engine. Then import Lion in ultralytics/engine/trainer.py:
from ultralytics.engine.lion_pytorch import Lion
Next, add the following branch inside the build_optimizer method:

elif name == 'Lion':
    optimizer = Lion(g[2])

After that, you can select the Lion optimizer at training time:

results = model.train(data='ultralytics/cfg/datasets/coco.yaml', epochs=500, batch=16, workers=8,
                      resume=False, close_mosaic=10, name='cfg', patience=500, pretrained=False,
                      cos_lr=True, optimizer='Lion', device=1)  # train the model
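One practical consequence of the sign update is worth keeping in mind when picking the learning rate: every parameter moves by exactly lr per step, regardless of gradient magnitude, and the Lion paper accordingly recommends a learning rate roughly 3-10x smaller than you would use with AdamW. A tiny pure-Python sketch (my own names, not part of ultralytics) illustrates this scale invariance:

```python
lr, beta1 = 1e-4, 0.9

def step_size(grad, exp_avg=0.0):
    """Signed displacement of one Lion step for a scalar parameter."""
    update = beta1 * exp_avg + (1 - beta1) * grad
    return lr * (1 if update > 0 else -1 if update < 0 else 0)

# A tiny gradient and a huge gradient produce the same step magnitude:
print(step_size(1e-8), step_size(1e3))  # -> 0.0001 0.0001
```

Because the step is always +/- lr, the learning rate directly sets the per-step displacement, which is why tuning it downward relative to your AdamW schedule is usually necessary.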