1 Formulas

1.1 Notation

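Symbol Meaning
\(K\) Vocabulary size (8000 in the dimensions below)
\(T\) Sentence length
\(H\) Number of hidden units (100 in the dimensions below)
\(N\) Number of training examples
\(x \in \mathbb{R}^{T \times 8000}\) Input
\(o \in \mathbb{R}^{T \times 8000}\) Output
\(s \in \mathbb{R}^{T \times 100}\) Hidden states, not counting \(s_{-1}\)
\(U \in \mathbb{R}^{100 \times 8000}\) Input weights
\(V \in \mathbb{R}^{8000 \times 100}\) Output weights
\(W \in \mathbb{R}^{100 \times 100}\) Hidden-state transition weights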
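
The forward pass is not written out in these notes. The shapes above, together with the \(1 - s_t^2\) factors and the \(\hat{y}_t - y_t\) terms in the gradients of section 1.2, are consistent with a vanilla RNN that uses a tanh hidden layer, a softmax output, and a cross-entropy loss. Under that assumption (and assuming \(\hat{y}_t\) denotes row \(t\) of \(o\), \(x_t\) and \(y_t\) are one-hot vectors, and \(s_{-1} = 0\)), the forward pass would read:

\[ \begin{aligned} s_t &= \tanh\big(U x_t + W s_{t-1}\big) \\ z_t &= V s_t \\ \hat{y}_t &= \mathrm{softmax}(z_t) \\ E_t &= -\, y_t \cdot \log \hat{y}_t \end{aligned} \]

With this choice, \(\partial E_t / \partial z_t = \hat{y}_t - y_t\) and \(\partial s_t / \partial (U x_t + W s_{t-1}) = 1 - s_t^2\) (element-wise), which are the two factors that appear repeatedly in the gradient formulas.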

1.2 Equations

\[ \begin{aligned} d_3 & \triangleq \big(\hat{y}_3 - y_3 \big) \cdot V \cdot \big(1 - s_3 ^ 2 \big) \\ d_2 & \triangleq d_3 \cdot W \cdot \big(1 - s_2 ^ 2 \big) \\ d_1 & \triangleq d_2 \cdot W \cdot \big(1 - s_1 ^ 2 \big) \\ d_0 & \triangleq d_1 \cdot W \cdot \big(1 - s_0 ^ 2 \big) \\ \\ \frac{\partial{E_3}}{\partial{V}} &= \frac{\partial{E_3}}{\partial{\hat{y}_3}} \frac{\partial{\hat{y}_3}}{\partial{z_3}} \frac{\partial{z_3}}{\partial{V}} \\ &= (\hat{y}_{3} - y_3) s_3 \\ \\ \frac{\partial{E_3}}{\partial{W}} &= \frac{\partial{E_3}}{\partial{\hat{y}_3}} \frac{\partial{\hat{y}_3}}{\partial{z_3}} \frac{\partial{z_3}}{\partial{s_3}} \frac{\partial{s_3}}{\partial{W}} \\ & \triangleq d_3 s_2 + d_2 s_1 + d_1 s_0 + d_0 \cdot s_{-1} \\ \\ \frac{\partial{E_3}}{\partial{U}} &= \frac{\partial{E_3}}{\partial{\hat{y}_3}} \frac{\partial{\hat{y}_3}}{\partial{z_3}} \frac{\partial{z_3}}{\partial{s_3}} \frac{\partial{s_3}}{\partial{U}} \\ & \triangleq d_3 x_3 + d_2 x_2 + d_1 x_1 + d_0 \cdot x_0 \\ \end{aligned} \]
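
The recursion above can be sketched in NumPy. This is a minimal illustration and not the notebook's code. It assumes a tanh hidden layer, a softmax output with cross-entropy loss, inputs given as word indices (so \(U x_t\) becomes a column lookup), and \(s_{-1} = 0\). The function names `forward`, `loss_last_step`, and `grad_last_step` are hypothetical. The code uses column vectors, so \(d \cdot W\) in the formulas appears as `W.T @ d`, and products such as \(d_3 s_2\) appear as outer products.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def forward(x, U, V, W):
    """x: list of T word indices. Returns s of shape (T+1, H) and o of shape (T, K)."""
    T, H = len(x), W.shape[0]
    s = np.zeros((T + 1, H))  # the last row stays zero and serves as s_{-1}
    o = np.zeros((T, V.shape[0]))
    for t in range(T):
        # U[:, x[t]] equals U @ one_hot(x[t]); s[t - 1] is s[-1] = 0 when t == 0
        s[t] = np.tanh(U[:, x[t]] + W @ s[t - 1])
        o[t] = softmax(V @ s[t])
    return s, o

def loss_last_step(x, y_last, U, V, W):
    """Cross-entropy loss at the final time step only (E_3 when T = 4)."""
    _, o = forward(x, U, V, W)
    return -np.log(o[-1][y_last])

def grad_last_step(x, y_last, U, V, W):
    """Gradients of the final-step loss with respect to U, V and W."""
    s, o = forward(x, U, V, W)
    t = len(x) - 1
    dU, dV, dW = np.zeros_like(U), np.zeros_like(V), np.zeros_like(W)

    delta_o = o[t].copy()
    delta_o[y_last] -= 1.0                 # y_hat_t - y_t
    dV += np.outer(delta_o, s[t])          # (y_hat_t - y_t) s_t

    d = (V.T @ delta_o) * (1 - s[t] ** 2)  # d_t
    for k in range(t, -1, -1):
        dW += np.outer(d, s[k - 1])        # d_k s_{k-1}; s[-1] is s_{-1}
        dU[:, x[k]] += d                   # d_k x_k with one-hot x_k
        d = (W.T @ d) * (1 - s[k - 1] ** 2)  # d_{k-1}
    return dU, dV, dW

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, H = 8, 5                            # toy sizes instead of 8000 and 100
    U = rng.normal(scale=0.5, size=(H, K))
    V = rng.normal(scale=0.5, size=(K, H))
    W = rng.normal(scale=0.5, size=(H, H))
    dU, dV, dW = grad_last_step([1, 4, 2, 7], 3, U, V, W)
    print(dU.shape, dV.shape, dW.shape)
```

With a sentence of length 4, the loop visits \(k = 3, 2, 1, 0\) and accumulates exactly the four terms in the sums for \(\partial E_3 / \partial W\) and \(\partial E_3 / \partial U\). A finite-difference check on small matrices is a cheap way to confirm that a hand-derived gradient like this one matches the loss.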

2 Notebook

Datasets (Baidu Cloud Drive)

3 References