YOLOv11/v10/v8 tutorial: "YOLOv11 Tutorial: From Getting Started to the Grave"
YOLOv11 improvements index: "YOLOv11 and Self-Developed Model Updates"

"SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy"

1. Module Introduction

Paper link: SCConv: Spatial and Channel Reconstruction Convolution...
Code link (community reimplementation): https://github.com/cheng-haha/ScConv
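Paper overview: Convolutional neural networks (CNNs) have achieved remarkable performance in a wide range of computer vision tasks, but at the cost of substantial computational resources, partly because convolutional layers extract redundant features. Recent work either compresses well-trained large models or explores well-designed lightweight models. The paper instead exploits the spatial and channel redundancy among features for CNN compression and proposes an efficient convolution module called SCConv (Spatial and Channel reconstruction Convolution), which reduces redundant computation and promotes representative feature learning. SCConv consists of two units: the Spatial Reconstruction Unit (SRU) and the Channel Reconstruction Unit (CRU). SRU uses a separate-and-reconstruct method to suppress spatial redundancy, while CRU uses a split-transform-and-fuse strategy to reduce channel redundancy. SCConv is also a plug-and-play architectural unit that can directly replace standard convolutions in various convolutional neural networks. Experimental results show that SCConv-embedded models achieve better performance by reducing redundant features, with significantly lower complexity and computational cost.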
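Summary: a lightweight module that works well.

2. Adding It to YOLO

2.1 Create the script file

First, create a blocks.py script under ultralytics/nn to hold the module code.

2.2 Copy the code

Copy the code below and paste it into the blocks.py script you just created: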
    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class GroupBatchnorm2d(nn.Module):
        """Hand-written group normalisation, used when torch_gn is False."""

        def __init__(self, c_num: int,
                     group_num: int = 16,
                     eps: float = 1e-10):
            super(GroupBatchnorm2d, self).__init__()
            assert c_num >= group_num
            self.group_num = group_num
            self.weight = nn.Parameter(torch.randn(c_num, 1, 1))
            self.bias = nn.Parameter(torch.zeros(c_num, 1, 1))
            self.eps = eps

        def forward(self, x):
            N, C, H, W = x.size()
            x = x.view(N, self.group_num, -1)
            mean = x.mean(dim=2, keepdim=True)
            std = x.std(dim=2, keepdim=True)
            x = (x - mean) / (std + self.eps)
            x = x.view(N, C, H, W)
            return x * self.weight + self.bias


    class SRU(nn.Module):
        """Spatial Reconstruction Unit."""

        def __init__(self,
                     oup_channels: int,
                     group_num: int = 16,
                     gate_treshold: float = 0.5,
                     torch_gn: bool = True):
            super().__init__()
            self.gn = nn.GroupNorm(num_channels=oup_channels, num_groups=group_num) if torch_gn else GroupBatchnorm2d(
                c_num=oup_channels, group_num=group_num)
            self.gate_treshold = gate_treshold
            self.sigomid = nn.Sigmoid()

        def forward(self, x):
            gn_x = self.gn(x)
            w_gamma = self.gn.weight / sum(self.gn.weight)
            w_gamma = w_gamma.view(1, -1, 1, 1)
            reweigts = self.sigomid(gn_x * w_gamma)
            # Gate
            # Set to 1 where above the threshold, otherwise keep the original value
            w1 = torch.where(reweigts > self.gate_treshold, torch.ones_like(reweigts), reweigts)
            # Set to 0 where above the threshold, otherwise keep the original value
            w2 = torch.where(reweigts > self.gate_treshold, torch.zeros_like(reweigts), reweigts)
            x_1 = w1 * x
            x_2 = w2 * x
            y = self.reconstruct(x_1, x_2)
            return y

        def reconstruct(self, x_1, x_2):
            # Split each branch in half along the channel axis and cross-add
            x_11, x_12 = torch.split(x_1, x_1.size(1) // 2, dim=1)
            x_21, x_22 = torch.split(x_2, x_2.size(1) // 2, dim=1)
            return torch.cat([x_11 + x_22, x_12 + x_21], dim=1)


    class CRU(nn.Module):
        """Channel Reconstruction Unit.

        alpha: 0 < alpha < 1
        """

        def __init__(self,
                     op_channel: int,
                     alpha: float = 1 / 2,
                     squeeze_radio: int = 2,
                     group_size: int = 2,
                     group_kernel_size: int = 3,
                     ):
            super().__init__()
            self.up_channel = up_channel = int(alpha * op_channel)
            self.low_channel = low_channel = op_channel - up_channel
            self.squeeze1 = nn.Conv2d(up_channel, up_channel // squeeze_radio, kernel_size=1, bias=False)
            self.squeeze2 = nn.Conv2d(low_channel, low_channel // squeeze_radio, kernel_size=1, bias=False)
            # up
            self.GWC = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=group_kernel_size, stride=1,
                                 padding=group_kernel_size // 2, groups=group_size)
            self.PWC1 = nn.Conv2d(up_channel // squeeze_radio, op_channel, kernel_size=1, bias=False)
            # low
            self.PWC2 = nn.Conv2d(low_channel // squeeze_radio, op_channel - low_channel // squeeze_radio, kernel_size=1,
                                  bias=False)
            self.advavg = nn.AdaptiveAvgPool2d(1)

        def forward(self, x):
            # Split
            up, low = torch.split(x, [self.up_channel, self.low_channel], dim=1)
            up, low = self.squeeze1(up), self.squeeze2(low)
            # Transform
            Y1 = self.GWC(up) + self.PWC1(up)
            Y2 = torch.cat([self.PWC2(low), low], dim=1)
            # Fuse
            out = torch.cat([Y1, Y2], dim=1)
            out = F.softmax(self.advavg(out), dim=1) * out
            out1, out2 = torch.split(out, out.size(1) // 2, dim=1)
            return out1 + out2


    class ScConv(nn.Module):
        """SRU followed by CRU; input and output have the same number of channels."""

        def __init__(self,
                     op_channel: int,
                     group_num: int = 4,
                     gate_treshold: float = 0.5,
                     alpha: float = 1 / 2,
                     squeeze_radio: int = 2,
                     group_size: int = 2,
                     group_kernel_size: int = 3,
                     ):
            super().__init__()
            self.SRU = SRU(op_channel,
                           group_num=group_num,
                           gate_treshold=gate_treshold)
            self.CRU = CRU(op_channel,
                           alpha=alpha,
                           squeeze_radio=squeeze_radio,
                           group_size=group_size,
                           group_kernel_size=group_kernel_size)

        def forward(self, x):
            x = self.SRU(x)
            x = self.CRU(x)
            return x

2.3 Modify tasks.py

Open ultralytics/nn/tasks.py and import the module in a blank area of the script.
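    from ultralytics.nn.blocks import *

Next, find the model-parsing function parse_model (around line 940 of tasks.py; the exact position may vary with the code version) and add the following parsing code just above the function's last else branch:

    elif m is ScConv:
        c2 = ch[f]
        args = [ch[f], ]

2.4 Modify the YAML file

For an explanation of the YAML format, see: "YOLO series: reading the .yaml file" (CSDN blog).

Open yolo11.yaml under ultralytics/cfg/models/11 and replace the original module. Placing ScConv at this position only demonstrates that the module can be inserted; its actual effect is untested. The author has limited time and has only tested the fusion with other modules (secondary innovations); the structure diagram is at the end of the article and the code is in the group files.

    # Ultralytics YOLO 🚀, AGPL-3.0 license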
    # YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

    # Parameters
    nc: 80 # number of classes
    scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
      # [depth, width, max_channels]
      n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
      s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
      m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
      l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
      x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

    # YOLO11n backbone
    backbone:
      # [from, repeats, module, args]
      - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
      - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
      - [-1, 2, C3k2, [256, False, 0.25]]
      - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
      - [-1, 2, C3k2, [512, False, 0.25]]
      - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
      - [-1, 2, ScConv, []]
      - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
      - [-1, 2, C3k2, [1024, True]]
      - [-1, 1, SPPF, [1024, 5]] # 9
      - [-1, 2, C2PSA, [1024]] # 10

    # YOLO11n head
    head:
      - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
      - [[-1, 6], 1, Concat, [1]] # cat backbone P4
      - [-1, 2, C3k2, [512, False]] # 13

      - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
      - [[-1, 4], 1, Concat, [1]] # cat backbone P3
      - [-1, 2, C3k2, [256, False]] # 16 (P3/8-small)

      - [-1, 1, Conv, [256, 3, 2]]
      - [[-1, 13], 1, Concat, [1]] # cat head P4
      - [-1, 2, C3k2, [512, False]] # 19 (P4/16-medium)

      - [-1, 1, Conv, [512, 3, 2]]
      - [[-1, 10], 1, Concat, [1]] # cat head P5
      - [-1, 2, C3k2, [1024, True]] # 22 (P5/32-large)

      - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)

2.5 Modify train.py

Create a train.py script for training.
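    from ultralytics.models import YOLO
    import os

    # Work around duplicate OpenMP runtime errors on some setups
    os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

    if __name__ == '__main__':
        # Build the model from the modified YAML
        model = YOLO(model='ultralytics/cfg/models/11/yolo11.yaml')
        # model.load('yolov8n.pt')
        model.train(data='./data.yaml', epochs=2, batch=1, device='0', imgsz=640, workers=2, cache=False,
                    amp=True, mosaic=False, project='runs/train', name='exp')

Fill in the path to the modified YAML file in train.py and run the script to start training. For a tutorial on creating the dataset, see the link below.

"YOLOv11 Tutorial: From Getting Started to the Grave (with structure diagrams)" (CSDN blog)

3. Related Improvement Ideas (group files, 2024/11/16)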
Given its characteristics, the ScConv module can replace the Bottleneck part of the C2f and C3 modules. The code is in the group files, and the structure is shown in the figure.
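Before placing ScConv at a new position, such as inside a Bottleneck, it can help to check that the channel count there satisfies the divisibility constraints implied by the blocks.py code in section 2.2. The following is a minimal, framework-free sketch of that bookkeeping. `scconv_channel_plan` is a hypothetical helper, not part of the original post or of ultralytics, and the checks are read off the layer definitions under the default arguments rather than taken from the paper.

```python
def scconv_channel_plan(c, group_num=4, alpha=0.5, squeeze_radio=2, group_size=2):
    """Trace channel counts through ScConv (SRU, then CRU) for c input channels.

    Hypothetical helper for illustration only. Raises ValueError when a layer
    in blocks.py could not be built or run with this channel count.
    """
    if c % group_num != 0:
        raise ValueError(f"GroupNorm needs {c} channels to be divisible by group_num={group_num}")
    if c % 2 != 0:
        raise ValueError("SRU.reconstruct splits the channels into two equal halves")
    up = int(alpha * c)   # channels sent to the upper (GWC + PWC1) branch
    low = c - up          # channels sent to the lower (PWC2) branch
    up_s, low_s = up // squeeze_radio, low // squeeze_radio
    if up_s < 1 or low_s < 1:
        raise ValueError("each squeezed branch must keep at least one channel")
    if up_s % group_size != 0 or c % group_size != 0:
        raise ValueError("the grouped conv needs in/out channels divisible by group_size")
    y1 = c                      # GWC(up) + PWC1(up): both map up_s -> c
    y2 = (c - low_s) + low_s    # cat([PWC2(low), low]): PWC2 maps low_s -> c - low_s
    fused = y1 + y2             # channels after concatenating Y1 and Y2
    return {"up": up, "low": low, "up_squeezed": up_s, "low_squeezed": low_s,
            "y1": y1, "y2": y2, "fused": fused, "out": fused // 2}


if __name__ == "__main__":
    # 128 is what 512 x 0.25 gives for the n scale in the YAML of section 2.4.
    print(scconv_channel_plan(128))
```

Under these assumptions the output channel count always equals the input count, which matches the `c2 = ch[f]` line in the parse_model branch of section 2.3.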