Connections form cycles.

x_t: input at time t
h_t: hidden state at time t
f: activation function for the hidden layer (e.g., tanh)
U, V, W: network parameters; the RNN shares the same parameters across all time steps
g: activation function for the output layer

Long-Term Dependency
x_1 ~ x_(t-1) are encoded into h_(t-1).
h_(t-1) holds the information on the past: it is the context used to process x_t.
However, it ..
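The definitions above (U, V, W shared across steps, f for the hidden layer, g for the output) are consistent with the standard recurrence h_t = f(U x_t + W h_{t-1}) and y_t = g(V h_t), although the equations themselves are not written out here. The NumPy sketch below assumes that formulation; the dimensions, random initialization, and the choice of an identity g are illustrative placeholders, not part of the original notes.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the notes).
input_dim, hidden_dim, output_dim = 4, 8, 3
rng = np.random.default_rng(0)

# Shared parameters, reused at every time step.
U = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input  -> hidden
W = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (the recurrent cycle)
V = rng.normal(scale=0.1, size=(output_dim, hidden_dim))  # hidden -> output

def rnn_step(x_t, h_prev):
    """One step of the assumed recurrence: h_t = tanh(U x_t + W h_{t-1}), y_t = V h_t."""
    h_t = np.tanh(U @ x_t + W @ h_prev)  # f = tanh for the hidden layer
    y_t = V @ h_t                        # g = identity here; a softmax is another common choice
    return h_t, y_t

# Unrolling over a short sequence: h_{t-1} accumulates the context of x_1 .. x_{t-1},
# and that context is combined with the current input x_t at each step.
xs = rng.normal(size=(5, input_dim))
h = np.zeros(hidden_dim)
for x_t in xs:
    h, y = rnn_step(x_t, h)
```

Because the same U, W, V are applied at every step, the hidden state is the only channel through which information about early inputs can reach later steps, which is exactly the long-term-dependency issue the notes begin to raise.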