Attention Mechanisms: Adding CBAM, GAM, and Resnet_CBAM to YOLOv5/YOLOv7

Original article by AI小怪兽
Published 2023-11-30 16:04:09
Part of the column: YOLO大作战

1. Attention mechanisms in computer vision

Attention mechanisms are generally divided into four basic categories:

Channel attention

Spatial attention

Temporal attention

Branch attention

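1.1 CBAM: combining channel attention and spatial attention

CBAM is a lightweight convolutional attention module that combines a channel attention module with a spatial attention module.

Paper: "CBAM: Convolutional Block Attention Module"
Paper URL: https://arxiv.org/pdf/1807.06521.pdf

As the CBAM architecture figure shows, CBAM contains two submodules, CAM (Channel Attention Module) and SAM (Spatial Attention Module), which apply attention along the channel dimension and the spatial dimensions in turn. This keeps the added parameters and computation small, and it lets CBAM be integrated into existing network architectures as a plug-and-play module.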
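The two-step refinement described above can be written compactly. The notation follows the CBAM paper: F is the input feature map, ⊗ is element-wise multiplication with broadcasting, σ is the sigmoid, and f^{7×7} is a 7×7 convolution. In M_c the pooling runs over the spatial dimensions, while in M_s it runs along the channel axis.

```latex
\begin{aligned}
F'     &= M_c(F) \otimes F, \\
F''    &= M_s(F') \otimes F', \\
M_c(F) &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \\
M_s(F) &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\big).
\end{aligned}
```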

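1.2 GAM: Global Attention Mechanism

GAM is a newer attention module that aims to surpass CBAM by raising accuracy regardless of the added cost.
Paper: "Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions"
Paper URL: https://paperswithcode.com/paper/global-attention-mechanism-retain-information

Overall, GAM closely resembles CBAM: both apply channel attention followed by spatial attention. The difference lies in how each of the two is computed. In the implementation in section 2.2, the channel branch runs a two-layer MLP over the channel vector at every spatial position instead of pooling first, and the spatial branch uses two 7×7 convolutions instead of a single convolution over pooled maps.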
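To make the cost difference concrete, here is a rough weight count for the spatial branches only. It is a back-of-the-envelope sketch that takes the layer shapes from the implementations in section 2 and ignores biases and BatchNorm parameters; the channel width of 512 is an arbitrary example value.

```python
def conv_weights(c_in, c_out, k, groups=1):
    # Weight count of a 2D convolution without bias:
    # each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


c, rate = 512, 4  # example channel width and GAM reduction rate

# CBAM spatial branch: one 7x7 convolution from 2 pooled maps to 1 mask.
cbam_spatial = conv_weights(2, 1, 7)

# GAM spatial branch: 7x7 conv c -> c/rate, then 7x7 conv c/rate -> c.
gam_spatial = conv_weights(c, c // rate, 7) + conv_weights(c // rate, c, 7)

# Same two convolutions with groups=rate, as in the group=True path.
gam_spatial_grouped = (
    conv_weights(c, c // rate, 7, groups=rate)
    + conv_weights(c // rate, c, 7, groups=rate)
)

print(cbam_spatial, gam_spatial, gam_spatial_grouped)  # 98 6422528 1605632
```

The grouped variant cuts the spatial branch to a quarter of its dense size, which is presumably why the implementation in section 2.2 enables it by default.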

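1.3 ResBlock_CBAM

ResBlock_CBAM applies CBAM's channel attention and spatial attention inside a single residual block.

To implement CBAM in ResNet, pass the output of the original block through channel attention and then spatial attention before it is added to the residual (shortcut) branch.

1.4 Performance evaluation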

2. Adding CBAM and GAM to YOLOv5

2.1 Adding CBAM to common.py

import torch
import torch.nn as nn


class ChannelAttentionModule(nn.Module):
    """Channel attention: rescales every channel by a learned weight in (0, 1)."""

    def __init__(self, c1, reduction=16, light=False):
        super().__init__()
        self.light = light
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        if self.light:
            # Shared two-layer MLP applied to both the avg- and max-pooled descriptors.
            # max(..., 1) guards against a zero-width bottleneck when c1 < reduction.
            mid_channel = max(c1 // reduction, 1)
            self.max_pool = nn.AdaptiveMaxPool2d(1)
            self.shared_MLP = nn.Sequential(
                nn.Linear(in_features=c1, out_features=mid_channel),
                nn.LeakyReLU(0.1, inplace=True),
                nn.Linear(in_features=mid_channel, out_features=c1),
            )
        else:
            # Default path: a single 1x1 convolution on the avg-pooled descriptor.
            self.shared_MLP = nn.Conv2d(c1, c1, 1, 1, 0, bias=True)
        self.act = nn.Sigmoid()

    def forward(self, x):
        if self.light:
            # (B, C, 1, 1) -> (B, C) for the MLP, then back to (B, C, 1, 1).
            avgout = self.shared_MLP(self.avg_pool(x).view(x.size(0), -1)).unsqueeze(2).unsqueeze(3)
            maxout = self.shared_MLP(self.max_pool(x).view(x.size(0), -1)).unsqueeze(2).unsqueeze(3)
            fc_out = avgout + maxout
        else:
            fc_out = self.shared_MLP(self.avg_pool(x))
        return x * self.act(fc_out)


class SpatialAttentionModule(nn.Module):
    """Spatial attention: rescales every pixel by a learned weight in (0, 1)."""

    def __init__(self, kernel_size=7):
        super().__init__()
        assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
        padding = 3 if kernel_size == 7 else 1
        self.cv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.act = nn.Sigmoid()

    def forward(self, x):
        # Stack the channel-wise mean and max maps, then convolve them into one mask.
        pooled = torch.cat([torch.mean(x, 1, keepdim=True), torch.max(x, 1, keepdim=True)[0]], 1)
        return x * self.act(self.cv1(pooled))


class CBAM(nn.Module):
    # c2 is accepted to match YOLOv5's (c1, c2, ...) constructor convention.
    # It is unused because attention keeps the channel count unchanged.
    def __init__(self, c1, c2, k=7):
        super().__init__()
        self.channel_attention = ChannelAttentionModule(c1)
        self.spatial_attention = SpatialAttentionModule(k)

    def forward(self, x):
        # Channel attention first, then spatial attention.
        return self.spatial_attention(self.channel_attention(x))
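As a quick check of the spatial branch, the snippet below walks through the same steps outside a module and prints the intermediate shapes. It is only an illustrative sketch: the batch size, channel count, and 20×20 feature map are arbitrary example values, and the convolution is freshly initialised rather than trained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 64, 20, 20)  # example feature map: (batch, channels, H, W)

# Collapse the channel axis two ways and stack the results.
avg_map = torch.mean(x, 1, keepdim=True)    # (2, 1, 20, 20)
max_map = torch.max(x, 1, keepdim=True)[0]  # (2, 1, 20, 20)
desc = torch.cat([avg_map, max_map], 1)     # (2, 2, 20, 20)

# A 7x7 convolution with padding 3 keeps H and W and yields one mask channel.
conv = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
mask = torch.sigmoid(conv(desc))            # (2, 1, 20, 20)

# The single-channel mask broadcasts over all 64 channels.
out = x * mask
print(desc.shape, mask.shape, out.shape)
```

Because the mask has one channel, every channel at a given pixel is scaled by the same factor, which is what makes this branch "spatial".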

2.2 Adding GAM to common.py

def channel_shuffle(x, groups=2):
    """Interleave channels across groups: reshape -> transpose -> flatten."""
    B, C, H, W = x.size()
    out = x.view(B, groups, C // groups, H, W).permute(0, 2, 1, 3, 4).contiguous()
    return out.view(B, C, H, W)


class GAM_Attention(nn.Module):
    # https://paperswithcode.com/paper/global-attention-mechanism-retain-information
    # Requires c2 == c1 (the spatial mask multiplies x element-wise). With group=True,
    # c1, c1 // rate, and c2 must all be divisible by rate; c2 must be divisible by 4
    # for the channel shuffle in forward().
    def __init__(self, c1, c2, group=True, rate=4):
        super().__init__()

        # Two-layer MLP applied to the channel vector at every spatial position.
        self.channel_attention = nn.Sequential(
            nn.Linear(c1, c1 // rate),
            nn.ReLU(inplace=True),
            nn.Linear(c1 // rate, c1),
        )

        # Two 7x7 convolutions; group=True uses grouped convolutions to cut parameters.
        groups = rate if group else 1
        self.spatial_attention = nn.Sequential(
            nn.Conv2d(c1, c1 // rate, kernel_size=7, padding=3, groups=groups),
            nn.BatchNorm2d(c1 // rate),
            nn.ReLU(inplace=True),
            nn.Conv2d(c1 // rate, c2, kernel_size=7, padding=3, groups=groups),
            nn.BatchNorm2d(c2),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # (B, C, H, W) -> (B, H*W, C) so the MLP acts on the channel axis.
        # reshape also handles non-contiguous inputs.
        x_permute = x.permute(0, 2, 3, 1).reshape(b, -1, c)
        x_att_permute = self.channel_attention(x_permute).view(b, h, w, c)
        x_channel_att = x_att_permute.permute(0, 3, 1, 2)
        x = x * x_channel_att

        x_spatial_att = self.spatial_attention(x).sigmoid()
        # Shuffle the mask channels so information mixes across the conv groups.
        x_spatial_att = channel_shuffle(x_spatial_att, 4)
        out = x * x_spatial_att
        return out
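The effect of `channel_shuffle` is easiest to see on a tiny tensor whose channels are labelled with their own index. The sketch below repeats the helper so that it runs on its own; the 8-channel input is an arbitrary example.

```python
import torch


def channel_shuffle(x, groups=2):
    # Same reshape -> transpose -> flatten steps as the helper above.
    B, C, H, W = x.size()
    out = x.view(B, groups, C // groups, H, W).permute(0, 2, 1, 3, 4).contiguous()
    return out.view(B, C, H, W)


# Channel i holds the value i, so the output shows where each channel ends up.
x = torch.arange(8, dtype=torch.float32).view(1, 8, 1, 1)
order = channel_shuffle(x, groups=4).flatten().tolist()
print(order)  # [0.0, 2.0, 4.0, 6.0, 1.0, 3.0, 5.0, 7.0]
```

With 4 groups of 2 channels each, the output takes the first channel of every group, then the second channel of every group, so neighbouring output channels come from different convolution groups.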

2.3 Adding ResBlock_CBAM to common.py

class ResBlock_CBAM(nn.Module):
    """ResNet bottleneck block with CBAM applied before the residual addition."""

    def __init__(self, in_places, places, stride=1, downsampling=False, expansion=4):
        super().__init__()
        self.expansion = expansion
        self.downsampling = downsampling

        # 1x1 reduce -> 3x3 (optionally strided) -> 1x1 expand to places * expansion channels.
        self.bottleneck = nn.Sequential(
            nn.Conv2d(in_channels=in_places, out_channels=places, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(places),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(in_channels=places, out_channels=places, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(places),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(in_channels=places, out_channels=places * self.expansion, kernel_size=1, stride=1,
                      bias=False),
            nn.BatchNorm2d(places * self.expansion),
        )
        self.cbam = CBAM(c1=places * self.expansion, c2=places * self.expansion)

        # Set downsampling=True whenever in_places != places * expansion or stride != 1,
        # so the shortcut is projected to the same shape as the bottleneck output.
        if self.downsampling:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels=in_places, out_channels=places * self.expansion, kernel_size=1, stride=stride,
                          bias=False),
                nn.BatchNorm2d(places * self.expansion)
            )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = x
        out = self.bottleneck(x)
        out = self.cbam(out)  # channel attention, then spatial attention
        if self.downsampling:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)
        return out

2.4 Adding CBAM, GAM, and ResBlock_CBAM to yolo.py

# In parse_model() in yolo.py, append the new modules to the existing set:
if m in {
        Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
        BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, C2f,
        CBAM, ResBlock_CBAM, GAM_Attention}:
    ...  # the existing body of this branch stays unchanged
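Registering the classes in this set matters because of how the model parser builds constructor arguments. The sketch below is a simplified, standalone imitation of that step, assuming the behaviour of `parse_model` in YOLOv5 v6/v7-era code. `resolve_args`, the example YAML row, and the value `no=255` are hypothetical and introduced only for illustration; the real function also handles depth scaling and repeats.

```python
import math


def make_divisible(x, divisor):
    # Round x up to the nearest multiple of divisor.
    return math.ceil(x / divisor) * divisor


def resolve_args(c_in, yaml_args, width_multiple, no):
    # Hypothetical helper: mirrors what the parser is assumed to do for modules
    # in the set. It prepends the input channel count and scales the requested
    # output channel count by the model's width multiple.
    c1, c2 = c_in, yaml_args[0]
    if c2 != no:  # the detection head's output width is left unscaled
        c2 = make_divisible(c2 * width_multiple, 8)
    return [c1, c2, *yaml_args[1:]]


# Example: a YAML row such as [-1, 1, CBAM, [1024, 7]] placed after a
# 512-channel layer in a model with width_multiple = 0.5.
args = resolve_args(c_in=512, yaml_args=[1024, 7], width_multiple=0.5, no=255)
print(args)  # [512, 512, 7]
```

Under this assumption, CBAM and GAM_Attention receive `(c1, c2, ...)`, which matches their constructors. ResBlock_CBAM outputs `places * expansion` channels, so with the default `expansion=4` its output width would differ from the `c2` the parser records, and a config would likely need to account for that. This is an inference from the code above rather than something the article states.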

For details, see:

CSDN post by AI小怪兽: https://blog.csdn.net/m0_63774211/article/details/129611391

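Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com for removal.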