Hi, reading code is the best way to really understand a concept, so here is a simple PyTorch implementation of dot_attention.
import numpy as np
import torch.nn.functional as F

def dot_attention(self, seq, cond, lens):
    """
    Dot-product attention.
    :param seq: (b_s, m_s, h_s) the sequence to attend over
    :param cond: (b_s, h_s) the query (condition) vector
    :param lens: [len_1, len_2, ...] real lengths of each sequence, used to mask out the padding/eos
    :return: context (b_s, h_s), scores (b_s, m_s)
    """
    # dot product between the query and every time step of seq -> (b_s, m_s)
    scores = cond.unsqueeze(1).expand_as(seq).mul(seq).sum(2)
    # seq = self.dropout(seq)
    # set padded positions to -inf so softmax gives them zero weight
    # (writing through .data bypasses autograd; masked_fill_ would be the more idiomatic choice)
    max_len = max(lens)
    for i, l in enumerate(lens):
        if l < max_len:
            scores.data[i, l:] = -np.inf
    scores = F.softmax(scores, dim=1)
    # weighted sum of seq with the attention weights -> context vector
    context = scores.unsqueeze(2).expand_as(seq).mul(seq).sum(1)
    return context, scores  # context: (b_s, h_s), scores: (b_s, m_s)
The returned context is the context vector produced by the attention, and scores holds the attention weights; the lens argument is there to mask out the padded positions so they do not affect the result.
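Here is a minimal usage sketch; the tensor shapes, the toy lengths, and passing self=None are my own assumptions for illustration (None works here only because the dropout call inside the function is commented out):

import torch

# hypothetical shapes: batch of 2, max sequence length 4, hidden size 8
seq = torch.randn(2, 4, 8)   # (b_s, m_s, h_s)
cond = torch.randn(2, 8)     # (b_s, h_s)
lens = [4, 2]                # the second sequence has 2 padded steps

context, scores = dot_attention(None, seq, cond, lens)
print(context.shape)  # torch.Size([2, 8])
print(scores.shape)   # torch.Size([2, 4])
print(scores[1])      # last two weights are 0 because those positions were masked with -inf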