AttentionFreeTransformer Core Structure Diagrams (Redrawn with GraphViz)

ApacheCN_飞龙
Published 2024-01-11 09:23:22
Included in the column: 信数据得永生

AFTFull

digraph AFTFull {
    // Lay the graph out bottom-to-top: input at the bottom, output at the top.
    rankdir=BT
    node [
        style=filled,
        color=Black,
        fontcolor=White,
        fillcolor="#30638e",
        fontname="SimHei",
        fontsize=32,
        width=5, height=2,
    ]

    // Tensors use Mrecord (rounded) nodes; operations and layers use boxes.
    inp [label="Input\n[BatchSize,\n SeqLen,\n HidSize]", shape="Mrecord"]
    llq [label="LinearQ\n[HidSize, ProjSize]", shape="box"]
    llk [label="LinearK\n[HidSize, ProjSize]", shape="box"]
    llv [label="LinearV\n[HidSize, ProjSize]", shape="box"]
    w [label="W:Param\n[SeqLen, SeqLen]", shape="Mrecord"]
    q [label="Q\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    k [label="K\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    v [label="V\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    σ [label="Sigmoid", shape="box", width=3]
    // The denominator is a matrix product too: exp(W) @ exp(K).
    atten_op [label="exp(W) @ (exp(K) * V)\n/ (exp(W) @ exp(K))", shape="box"]
    atten [label="[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    mul [label="*", shape="box", width=3]
    llo [label="LinearO\n[ProjSize, HidSize]", shape="box"]
    oup [label="Output\n[BatchSize,\n SeqLen,\n HidSize]", shape="Mrecord"]

    inp -> llq
    inp -> llk
    inp -> llv
    llq -> q
    llk -> k
    llv -> v
    q -> σ
    w -> atten_op
    k -> atten_op
    v -> atten_op
    atten_op -> atten
    σ -> mul
    atten -> mul
    mul -> llo
    llo -> oup
}
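To make the `atten_op` and `mul` nodes of the AFTFull diagram concrete, here is a minimal sketch of that computation in plain Python. It is an illustration under assumptions, not the author's implementation: it handles a single sequence (no batch dimension), takes Q, K and V as already-projected nested lists, skips the LinearO projection, and does nothing about overflow in `exp`. The names `sigmoid` and `aft_full` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def aft_full(q, k, v, w):
    """AFT-full for one sequence (assumed shapes, no batch dimension).

    q, k, v: [SeqLen][ProjSize] nested lists, already projected.
    w:       [SeqLen][SeqLen] learned pairwise position bias.
    """
    seq_len, proj_size = len(k), len(k[0])
    out = []
    for t in range(seq_len):
        row = []
        for d in range(proj_size):
            # exp(W)[t, s] * exp(K)[s, d] == exp(w[t][s] + k[s][d])
            weights = [math.exp(w[t][s] + k[s][d]) for s in range(seq_len)]
            num = sum(wt * v[s][d] for s, wt in enumerate(weights))  # exp(W) @ (exp(K) * V)
            den = sum(weights)                                       # exp(W) @ exp(K)
            row.append(sigmoid(q[t][d]) * num / den)                 # gate with Sigmoid(Q)
        out.append(row)
    return out


# Toy inputs: V is constant, so the weighted average is 2.0 at every position,
# and Q = 0 gives a gate of 0.5.
q = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
k = [[0.1, -0.2], [0.3, 0.0], [-0.5, 0.4]]
v = [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
w = [[0.0, 0.5, -1.0], [0.2, 0.0, 0.3], [-0.4, 0.1, 0.0]]
y = aft_full(q, k, v, w)
```

Because the numerator and denominator share the same weights, each output is a gated weighted average of V over the sequence axis. With the toy inputs above every entry of `y` is approximately 1.0.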

AFTSimple

digraph AFTSimple {
    // Lay the graph out bottom-to-top: input at the bottom, output at the top.
    rankdir=BT
    node [
        style=filled,
        color=Black,
        fontcolor=White,
        fillcolor="#30638e",
        fontname="SimHei",
        fontsize=32,
        width=5, height=2,
    ]

    // Tensors use Mrecord (rounded) nodes; operations and layers use boxes.
    inp [label="Input\n[BatchSize,\n SeqLen,\n HidSize]", shape="Mrecord"]
    llq [label="LinearQ\n[HidSize, ProjSize]", shape="box"]
    llk [label="LinearK\n[HidSize, ProjSize]", shape="box"]
    llv [label="LinearV\n[HidSize, ProjSize]", shape="box"]
    q [label="Q\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    k [label="K\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    v [label="V\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    σ [label="Sigmoid", shape="box", width=3]
    // No position bias W here: softmax and sum both run over the sequence axis (axis 1).
    atten_op [label="sum(softmax(K, 1) * V, 1)", shape="box"]
    atten [label="[BatchSize, 1, ProjSize]", shape="Mrecord"]
    mul [label="*", shape="box", width=3]
    llo [label="LinearO\n[ProjSize, HidSize]", shape="box"]
    oup [label="Output\n[BatchSize,\n SeqLen,\n HidSize]", shape="Mrecord"]

    inp -> llq
    inp -> llk
    inp -> llv
    llq -> q
    llk -> k
    llv -> v
    q -> σ
    k -> atten_op
    v -> atten_op
    atten_op -> atten
    σ -> mul
    atten -> mul
    mul -> llo
    llo -> oup
}
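The AFTSimple diagram collapses the sequence into a single context vector of shape `[BatchSize, 1, ProjSize]` and broadcasts it against `Sigmoid(Q)`. The sketch below shows one way to read that, again for a single sequence with plain nested lists and without the linear projections; the function name `aft_simple` is hypothetical and this is only an illustration of the labelled operation.

```python
import math


def aft_simple(q, k, v):
    """AFT-simple for one sequence (assumed shapes, no batch dimension).

    q, k, v: [SeqLen][ProjSize] nested lists, already projected.
    """
    seq_len, proj_size = len(k), len(k[0])

    # softmax(K, 1) * V summed over the sequence axis: one value per feature.
    context = []
    for d in range(proj_size):
        peak = max(k[s][d] for s in range(seq_len))  # subtract the max for stability
        weights = [math.exp(k[s][d] - peak) for s in range(seq_len)]
        total = sum(weights)
        context.append(sum(wt * v[s][d] for s, wt in enumerate(weights)) / total)

    # Broadcast the single context vector over every position, gated by Sigmoid(Q).
    return [
        [context[d] / (1.0 + math.exp(-q[t][d])) for d in range(proj_size)]
        for t in range(seq_len)
    ]


# With K all zeros the softmax is uniform, so the context is the mean of V.
q = [[0.0, 0.0], [0.0, 0.0]]
k = [[0.0, 0.0], [0.0, 0.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
y = aft_simple(q, k, v)
```

Since the context vector does not depend on the position `t`, it only has to be computed once per sequence, which is what the `[BatchSize, 1, ProjSize]` node in the diagram reflects.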

AFTLocal

digraph AFTLocal {
    // Lay the graph out bottom-to-top: input at the bottom, output at the top.
    rankdir=BT
    node [
        style=filled,
        color=Black,
        fontcolor=White,
        fillcolor="#30638e",
        fontname="SimHei",
        fontsize=32,
        width=5, height=2,
    ]

    // Tensors use Mrecord (rounded) nodes; operations and layers use boxes.
    inp [label="Input\n[BatchSize,\n SeqLen,\n HidSize]", shape="Mrecord"]
    llq [label="LinearQ\n[HidSize, ProjSize]", shape="box"]
    llk [label="LinearK\n[HidSize, ProjSize]", shape="box"]
    llv [label="LinearV\n[HidSize, ProjSize]", shape="box"]
    w [label="W:Param\n[SeqLen, SeqLen]", shape="Mrecord"]
    // W passes through a local window mask of size S before the attention op.
    mask [label="mask\n[SeqLen, SeqLen]\nabs(i - j) < S? 1: 0", shape="box"]
    q [label="Q\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    k [label="K\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    v [label="V\n[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    σ [label="Sigmoid", shape="box", width=3]
    // The denominator is a matrix product too: exp(W) @ exp(K).
    atten_op [label="exp(W) @ (exp(K) * V)\n/ (exp(W) @ exp(K))", shape="box"]
    atten [label="[BatchSize,\n SeqLen,\n ProjSize]", shape="Mrecord"]
    mul [label="*", shape="box", width=3]
    llo [label="LinearO\n[ProjSize, HidSize]", shape="box"]
    oup [label="Output\n[BatchSize,\n SeqLen,\n HidSize]", shape="Mrecord"]

    inp -> llq
    inp -> llk
    inp -> llv
    llq -> q
    llk -> k
    llv -> v
    q -> σ
    w -> mask
    mask -> atten_op
    k -> atten_op
    v -> atten_op
    atten_op -> atten
    σ -> mul
    atten -> mul
    mul -> llo
    llo -> oup
}
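The only new element in the AFTLocal diagram is the `mask` node between `W` and the attention op. A minimal sketch of that step follows, assuming the mask is multiplied elementwise into `W` before `exp` is taken, which is how the `abs(i - j) < S? 1: 0` label reads. The function name `local_bias` and the parameter name `s` (the window size S from the diagram) are hypothetical.

```python
def local_bias(w, s):
    """Keep w[i][j] inside the window abs(i - j) < s and zero it elsewhere.

    w: [SeqLen][SeqLen] learned pairwise position bias.
    s: window size (the S in the diagram's mask label).
    """
    n = len(w)
    return [
        [w[i][j] if abs(i - j) < s else 0.0 for j in range(n)]
        for i in range(n)
    ]


w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
masked = local_bias(w, 2)
# masked == [[1.0, 2.0, 0.0], [4.0, 5.0, 6.0], [0.0, 8.0, 9.0]]
```

Under this reading, a masked entry becomes 0 rather than negative infinity, so `exp` of it is 1: positions outside the window still contribute to the weighted average, just without a learned bias. The masked `W` then feeds the same `exp(W) @ (exp(K) * V) / (exp(W) @ exp(K))` operation as in AFTFull.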

Originally published 2024-01-10 on the author's personal site/blog.