RL_LSTM

Understanding LSTM — a tutorial into Long Short-Term Memory Recurrent Neural Networks

Paper Source
Journal:
Year: 2019
Institute:
Author: Ralf C. Staudemeyer, Eric Rothstein Morris
#Long Short-Term Memory #Recurrent Neural Networks

Recurrent Neural Network

An RNN model is typically used to process long sequential data like video, audio, etc. A simple RNN looks like:

However, simple perceptron neurons that linearly combine the current input element and the last unit state may easily lose the long-term dependencies. Standard RNN cannot bridge more than 5–10 time steps. This is due to that back-propagated error signals tend to either grow or shrink with every time, which makes it struggle to learn. There is where LSTM comes into the picture.

LSTM Networks

LSTMs is the short name of Long Short Term Memory networks that is explicitly designed to avoid the long-term dependency problem.

We can see from the above picture that it has multi-neural network layer in each recurrent unit.

$z_t = \sigma(W_{z} \dot [h_{t-1}, x_t]) \\ r_t = \sigma(W_{r} \dot [h_{t-1}, x_t]) \\ \hat{h_t} = \tanh(W \dot [r_t * h_{t-1}, x_t]) \\ h_t = (1 - z_t) * h_{t-1} + z_t * \hat{h_t}$

Understanding LSTM — a tutorial into Long Short-Term Memory Recurrent Neural Networks

Recurrent Neural Network

LSTM Networks

Reference