导读
一个非常好用的git仓库,封装了非常全面的计算机视觉中的自注意力构建块,直接调用,无需重复造轮子了。

用einsum和einops在PyTorch中实现计算机视觉的自我注意机制。专注于计算机视觉自注意模块。
$ pip install self-attention-cv
如果你没有GPU,最好是在环境中预装好pytorch。
import torch
from self_attention_cv import MultiHeadSelfAttention
model = MultiHeadSelfAttention(dim=64)
x = torch.rand(16, 10, 64) # [batch, tokens, dim]
mask = torch.zeros(10, 10) # tokens X tokens
mask[5:8, 5:8] = 1
y = model(x, mask)
import torch
from self_attention_cv import AxialAttentionBlock
model = AxialAttentionBlock(in_channels=256, dim=64, heads=8)
x = torch.rand(1, 256, 64, 64) # [batch, tokens, dim, dim]
y = model(x)
import torch
from self_attention_cv import TransformerEncoder
model = TransformerEncoder(dim=64,blocks=6,heads=8)
x = torch.rand(16, 10, 64) # [batch, tokens, dim]
mask = torch.zeros(10, 10) # tokens X tokens
mask[5:8, 5:8] = 1
y = model(x,mask)
import torch
from self_attention_cv import ViT, ResNet50ViT
model1 = ResNet50ViT(img_dim=128, pretrained_resnet=False,
blocks=6, num_classes=10,
dim_linear_block=256, dim=256)
# or
model2 = ViT(img_dim=256, in_channels=3, patch_dim=16, num_classes=10,dim=512)
x = torch.rand(2, 3, 256, 256)
y = model2(x) # [2,10]
import torch
from self_attention_cv.transunet import TransUnet
a = torch.rand(2, 3, 128, 128)
model = TransUnet(in_channels=3, img_dim=128, vit_blocks=8,
vit_dim_linear_mhsa_block=512, classes=5)
y = model(a) # [2, 5, 128, 128]
import torch
from self_attention_cv.bottleneck_transformer import BottleneckBlock
inp = torch.rand(1, 512, 32, 32)
bottleneck_block = BottleneckBlock(in_channels=512, fmap_size=(32, 32), heads=4, out_channels=1024, pooling=True)
y = bottleneck_block(inp)
import torch
from self_attention_cv.pos_embeddings import AbsPosEmb1D,RelPosEmb1D
model = AbsPosEmb1D(tokens=20, dim_head=64)
# batch heads tokens dim_head
q = torch.rand(2, 3, 20, 64)
y1 = model(q)
model = RelPosEmb1D(tokens=20, dim_head=64, heads=3)
q = torch.rand(2, 3, 20, 64)
y2 = model(q)
import torch
from self_attention_cv.pos_embeddings import RelPosEmb2D
dim = 32 # spatial dim of the feat map
model = RelPosEmb2D(
feat_map_size=(dim, dim),
dim_head=128)
q = torch.rand(2, 4, dim*dim, 128)
y = model(q)

—END—
英文原文:https://github.com/The-AI-Summer/self-attention-cv
个人微信(如果没有备注不拉群!)请注明:地区+学校/企业+研究方向+昵称
下载1:何恺明顶会分享
在「AI算法与图像处理」公众号后台回复:何恺明,即可下载。总共有6份PDF,涉及 ResNet、Mask RCNN等经典工作的总结分析
下载2:终身受益的编程指南:Google编程风格指南
在「AI算法与图像处理」公众号后台回复:c++,即可下载。历经十年考验,最权威的编程规范!下载3 CVPR2021
在「AI算法与图像处理」公众号后台回复:CVPR,即可下载1467篇CVPR 2020论文 和 CVPR 2021 最新论文靓仔,靓妹 点亮在看吧
