💡💡💡 Exclusive improvement, hands-on tutorial: how to fix the common situation where adding an attention mechanism to YOLOv8 yields no gain on your own dataset. Five improvement variants are provided.

ContextAggregation | verified to improve results on a blood cell detection project: across the five variants, the best mAP@0.5 rises from the baseline 0.895 to 0.916.
1. Dataset used for validation
1.1 Blood cell detection: background
The data come from a medical imaging dataset aimed at blood cell detection. The task is to detect, in each microscope image, all red blood cells (RBC), white blood cells (WBC) and platelets (Platelets): three classes in total.
Why this dataset: the densities of RBCs, WBCs and platelets in our blood carry a great deal of information about the immune system and hemoglobin, which can give a first indication of whether a person is healthy. If any abnormality is found in the blood, action can be taken quickly to move on to further diagnosis. Manually inspecting samples under a microscope, however, is a tedious process, and this is where deep learning can play an important role: YOLOv8 can classify and detect blood cells in microscope images with high accuracy.
1.2 The blood cell detection dataset
Dataset size: 364 images. Detection challenges: (1) class imbalance; (2) occlusion both within and across classes; (3) large variation in object size and aspect ratio.
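The class imbalance above is easy to quantify from the validation split reported later in this post (968 RBC instances versus 87 WBC and 83 Platelets across 87 images). As a rough sketch (an illustration only, not a step this post itself performs), inverse-frequency class weights could be derived like this:

```python
# Instance counts taken from the validation results reported later in this post.
counts = {"WBC": 87, "RBC": 968, "Platelets": 83}

total = sum(counts.values())
n_classes = len(counts)

# Inverse-frequency weights, normalized so a perfectly balanced
# dataset would give every class a weight of 1.0.
weights = {cls: total / (n_classes * n) for cls, n in counts.items()}

for cls, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{cls:10s} count={counts[cls]:4d} weight={w:.2f}")
```

The rare classes (Platelets, WBC) end up weighted roughly ten times higher than the dominant RBC class, which matches the imbalance visible in the per-class results below.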
2. Context Aggregation attention: analysis and examples
2.1 Context Aggregation overview. Paper: https://arxiv.org/abs/2106.01401
Abstract (translated): Convolutional neural networks (CNNs) are ubiquitous in computer vision, with a myriad of effective and efficient variants. Recently, Transformers, originally introduced in natural language processing, have been increasingly adopted in computer vision as well. While early adopters kept CNN backbones, the latest networks are end-to-end, CNN-free Transformer solutions. A recent surprising finding showed that a simple MLP-based solution, without any traditional convolutional or Transformer components, can also produce effective visual representations. While CNNs, Transformers and MLP-Mixers may be regarded as entirely different architectures, the authors provide a unified view showing that they are in fact special cases of a more general method of aggregating spatial context in a neural network stack. They propose CONTAINER (CONText AggregatIon NEtwoRK), a general-purpose building block for multi-head context aggregation that can exploit long-range interactions like a Transformer while still retaining the inductive bias of local convolution operations, which is what often gives CNNs their fast convergence.

With only 22M parameters, CONTAINER reaches 82.7% Top-1 accuracy on ImageNet, 2.8 points above DeiT-Small, and converges to 79.9% Top-1 in just 200 epochs. Unlike Transformer-based methods, which scale poorly to downstream tasks that depend on higher-resolution inputs, the efficient variant, named CONTAINER-LIGHT, can be plugged into object detection and instance segmentation frameworks such as DETR, RetinaNet and Mask R-CNN, reaching detection mAPs of 38.9, 43.8 and 45.1 and a mask mAP of 41.3, improvements of 6.6, 7.3, 6.9 and 6.6 points respectively over a ResNet-50 backbone of comparable compute and parameter count. Under the DINO self-supervised framework, the method also performs well compared with DeiT.
The paper makes the following contributions:
- A unified view of the mainstream vision architectures (CNN, Transformer and MLP-Mixer) as special cases of spatial context aggregation.
- A novel module, CONTAINER, whose architecture mixes a static, learnable affinity matrix with a dynamic, input-conditioned affinity matrix, and which achieves strong results on image classification.
- An efficient and effective variant, CONTAINER-LIGHT, which delivers significant performance gains on detection and segmentation.
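To make the static/dynamic affinity-matrix idea concrete, here is a minimal NumPy sketch (an illustration of the concept, not the paper's code): the output is a mixture of a dynamic, attention-style affinity matrix and a static, input-independent one, applied to the projected input features.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, C = 16, 8  # N flattened spatial positions (H*W), C channels

X = rng.normal(size=(N, C))  # input feature map, flattened
Wq, Wk, Wv = (rng.normal(size=(C, C)) for _ in range(3))

# Dynamic affinity: a standard scaled dot-product attention matrix,
# computed from the input (Transformer-like behaviour).
A_dyn = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(C), axis=-1)

# Static affinity: a learnable, input-independent N x N matrix
# (playing the role of a convolution's fixed local kernel).
A_static = rng.normal(size=(N, N))

# CONTAINER mixes the two with learnable coefficients; fixed here for the sketch.
alpha, beta = 0.5, 0.5
Y = (alpha * A_dyn + beta * A_static) @ (X @ Wv)

print(Y.shape)  # same spatial/channel shape as the input
```

Setting beta to 0 recovers pure self-attention, while setting alpha to 0 recovers a static (convolution-like) aggregation, which is the unified view the paper argues for.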
The module's implementation code is available in the companion CSDN post: 《Yolov8涨点神器：用于微小目标检测的上下文增强和特征细化网络ContextAggregation助力小目标检测暴力涨点》.
3. Validating multiple network configurations
Results for each configuration are analyzed below.
3.1 YOLOv8_ContextAggregation1.yaml
```yaml
# Ultralytics YOLO 🚀, GPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 1  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]    # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]   # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]   # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]   # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]     # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12
  - [-1, 1, ContextAggregation, [512]]  # 13
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 16 (P3/8-small)
  - [-1, 1, ContextAggregation, [256]]  # 17
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 20 (P4/16-medium)
  - [-1, 1, ContextAggregation, [512]]  # 21
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 24 (P5/32-large)
  - [-1, 1, ContextAggregation, [1024]]  # 25
  - [[17, 21, 25], 1, Detect, [nc]]  # Detect(P3, P4, P5)
```

mAP@0.5 improves from the baseline 0.895 to 0.897:

```
YOLOv8_ContextAggregation1 summary (fused): 204 layers, 3009125 parameters, 0 gradients, 8.1 GFLOPs
                Class  Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 6/6 [00:04<00:00, 1.36it/s]
                  all      87       1138  0.816  0.893  0.897     0.602
                  WBC      87         87  0.971  0.989  0.985     0.771
                  RBC      87        968  0.699  0.836  0.841     0.584
            Platelets      87         83  0.777  0.855  0.865     0.452
```
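One practical pitfall this variant illustrates: every ContextAggregation layer inserted into the head shifts the indices of all subsequent layers, so the Concat `from` indices and the Detect inputs must be renumbered by hand (this is exactly what differs between the five YAML variants). A small pure-Python sanity check of the numbering, with module names transcribed from the YAML above:

```python
# Layer sequence of YOLOv8_ContextAggregation1.yaml, transcribed by hand.
backbone = [
    "Conv", "Conv", "C2f", "Conv", "C2f",
    "Conv", "C2f", "Conv", "C2f", "SPPF",          # indices 0-9
]
head = [
    "nn.Upsample", "Concat", "C2f", "ContextAggregation",  # 10-13
    "nn.Upsample", "Concat", "C2f", "ContextAggregation",  # 14-17
    "Conv", "Concat", "C2f", "ContextAggregation",         # 18-21
    "Conv", "Concat", "C2f", "ContextAggregation",         # 22-25
    "Detect",                                              # 26
]
layers = backbone + head

# Print the global index of each layer, as Ultralytics would number them.
for i, name in enumerate(layers):
    print(i, name)

# Detect reads from layers 17, 21 and 25: the three ContextAggregation
# outputs at the P3, P4 and P5 scales.
for idx in (17, 21, 25):
    assert layers[idx] == "ContextAggregation"
```

If a module is added or removed, re-running a check like this before training catches off-by-one errors in the `from` lists immediately, instead of producing a confusing channel-mismatch error at model build time.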
3.2 YOLOv8_ContextAggregation2.yaml

```yaml
# Ultralytics YOLO 🚀, GPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 1  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]    # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]   # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]   # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]   # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]     # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)
  - [-1, 1, ContextAggregation, [256]]  # 16
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 19 (P4/16-medium)
  - [-1, 1, ContextAggregation, [512]]  # 20
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 23 (P5/32-large)
  - [-1, 1, ContextAggregation, [1024]]  # 24
  - [[16, 20, 24], 1, Detect, [nc]]  # Detect(P3, P4, P5)
```

mAP@0.5 improves from the baseline 0.895 to 0.907:

```
YOLOv8_ContextAggregation2 summary (fused): 195 layers, 3008482 parameters, 0 gradients, 8.1 GFLOPs
                Class  Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 2/2 [00:03<00:00, 1.59s/it]
                  all      87       1138  0.824  0.892  0.907     0.613
                  WBC      87         87  0.984      1  0.988     0.785
                  RBC      87        968  0.727  0.836  0.851     0.596
            Platelets      87         83   0.76   0.84  0.881     0.457
```
3.3 YOLOv8_ContextAggregation3.yaml
```yaml
# Ultralytics YOLO 🚀, GPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 1  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]    # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]   # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]   # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]   # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]     # 9
  - [-1, 1, ContextAggregation, [1024]]  # 10

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 13
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 16 (P3/8-small)
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 19 (P4/16-medium)
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 22 (P5/32-large)
  - [[16, 19, 22], 1, Detect, [nc]]  # Detect(P3, P4, P5)
```

mAP@0.5 improves from the baseline 0.895 to 0.904:

```
YOLOv8_ContextAggregation3 summary (fused): 177 layers, 3007516 parameters, 0 gradients, 8.1 GFLOPs
                Class  Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 2/2 [00:03<00:00, 1.91s/it]
                  all      87       1138  0.835  0.874  0.904      0.61
                  WBC      87         87  0.979  0.989  0.993     0.779
                  RBC      87        968  0.722  0.841   0.86     0.597
            Platelets      87         83  0.804  0.792   0.86     0.453
```
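Comparing the three configurations so far, a small script (numbers copied from the summaries above; 0.895 is the baseline mAP@0.5 quoted in this post) makes the accuracy/parameter trade-off explicit:

```python
baseline_map50 = 0.895  # YOLOv8n baseline quoted in this post

# mAP@0.5 and fused parameter counts from the three summaries above.
variants = {
    "ContextAggregation1": {"map50": 0.897, "params": 3_009_125},
    "ContextAggregation2": {"map50": 0.907, "params": 3_008_482},
    "ContextAggregation3": {"map50": 0.904, "params": 3_007_516},
}

for name, v in variants.items():
    gain = v["map50"] - baseline_map50
    print(f"{name}: +{gain:.3f} mAP@0.5 at {v['params']:,} params")

best = max(variants, key=lambda k: variants[k]["map50"])
print("best so far:", best)
```

All three cost roughly the same number of parameters, so placement, not capacity, is what separates them: the neck-only placement of variant 2 is ahead at this point.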
3.4 YOLOv8_ContextAggregation4.yaml
```yaml
# Ultralytics YOLO 🚀, GPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 1  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]    # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]   # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, ContextAggregation, [128]]   # 3
  - [-1, 1, Conv, [256, 3, 2]]   # 4-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, ContextAggregation, [256]]   # 6
  - [-1, 1, Conv, [512, 3, 2]]   # 7-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, ContextAggregation, [512]]   # 9
  - [-1, 1, Conv, [1024, 3, 2]]  # 10-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, ContextAggregation, [1024]]  # 12
  - [-1, 1, SPPF, [1024, 5]]     # 13

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 9], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 16
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 5], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 19 (P3/8-small)
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 16], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 22 (P4/16-medium)
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 13], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 25 (P5/32-large)
  - [[19, 22, 25], 1, Detect, [nc]]  # Detect(P3, P4, P5)
```

mAP@0.5 improves from the baseline 0.895 to 0.896:

```
YOLOv8_ContextAggregation4 summary (fused): 204 layers, 3008645 parameters, 0 gradients, 8.1 GFLOPs
                Class  Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 6/6 [00:04<00:00, 1.37it/s]
                  all      87       1138  0.829  0.884  0.896     0.609
                  WBC      87         87  0.988      1   0.99     0.796
                  RBC      87        968  0.741  0.796  0.843     0.581
            Platelets      87         83  0.759  0.855  0.854     0.451
```

3.5 YOLOv8_ContextAggregation5.yaml
```yaml
# Ultralytics YOLO 🚀, GPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 1  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]    # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]   # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]   # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, ContextAggregation, [256]]   # 5
  - [-1, 1, Conv, [512, 3, 2]]   # 6-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, ContextAggregation, [512]]   # 8
  - [-1, 1, Conv, [1024, 3, 2]]  # 9-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, ContextAggregation, [1024]]  # 11
  - [-1, 1, SPPF, [1024, 5]]     # 12

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 8], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 15
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 18 (P3/8-small)
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 15], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 21 (P4/16-medium)
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 24 (P5/32-large)
  - [[18, 21, 24], 1, Detect, [nc]]  # Detect(P3, P4, P5)
```

mAP@0.5 improves from the baseline 0.895 to 0.916:

```
YOLOv8_ContextAggregation5 summary (fused): 195 layers, 3008482 parameters, 0 gradients, 8.1 GFLOPs
                Class  Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 2/2 [00:03<00:00, 1.74s/it]
                  all      87       1138  0.837  0.912  0.916     0.622
                  WBC      87         87  0.971      1   0.99     0.791
                  RBC      87        968  0.737  0.851  0.862     0.607
            Platelets      87         83  0.803  0.887  0.897     0.469
```

4. More in this series