Vanilla Transformer
2024-06-18 00:39:04
Vanilla Transformer architecture diagram
Outline/Content
L×
Multi-Head Attention
Inputs
Linear & Softmax
Token Embedding
Add & Norm
Position-wise FFN
Positional Encodings
(Shifted) Outputs
Output Probabilities
×L
(Masked) Multi-Head Attention
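
Two of the blocks labeled above, the positional encodings and the (masked) attention, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the diagram author's implementation: it assumes the sinusoidal variant of positional encodings (the diagram does not say which variant is used), it shows a single attention head without the learned projections of full multi-head attention, and the function names `positional_encoding`, `softmax`, and `attention` are hypothetical.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings (assumed variant, not stated in the diagram)."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # Even dimensions use sine, the following odd dimension uses cosine.
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v, causal=False):
    """Single-head scaled dot-product attention over lists of vectors.

    With causal=True, position i may only attend to positions j <= i,
    which corresponds to the "(Masked)" block on the decoder side.
    """
    d_k = len(k[0])
    out, weights = [], []
    for i, qi in enumerate(q):
        scores = []
        for j, kj in enumerate(k):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: no attention to future positions
            else:
                dot = sum(a * b for a, b in zip(qi, kj))
                scores.append(dot / math.sqrt(d_k))
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        out.append([sum(w[j] * v[j][c] for j in range(len(v)))
                    for c in range(len(v[0]))])
    return out, weights
```

Calling `attention(x, x, x, causal=True)` on a short sequence `x` gives attention weights that are zero above the diagonal, so the first position attends only to itself. In the full architecture, the output of each such block would then pass through the Add & Norm step shown in the diagram.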