This post explains what a recurrent neural network (RNN) is in machine learning.

An RNN is a model that works with input *sequences*. The difference between a sequence and ordinary data is that in a sequence, order matters, much like a time series in discrete time. This also means the inputs are not independent of each other.

An RNN takes its input as a sequence, for example a sequence of words. At each step it produces a hidden state vector and an output.

The hidden state vector is a function of the previous hidden state and the current input $x_t$. It serves as the memory of the network. The formula is $s_t = \tanh(U x_t + W s_{t-1})$.

The output is a probability distribution across the vocabulary: $o_t = \mathrm{softmax}(V s_t)$.
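Using the notation of the referenced tutorial ($U$, $W$, $V$ as the parameter matrices, $s_t$ the hidden state, $o_t$ the output distribution), a single step of the recurrence can be sketched in NumPy. The function name and the toy sizes below are illustrative, not from the original post:

```python
import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    """One RNN step: update the hidden state from the previous state and
    the current input, then emit a distribution over the vocabulary."""
    s_t = np.tanh(U @ x_t + W @ s_prev)   # new hidden state (the "memory")
    logits = V @ s_t
    e = np.exp(logits - logits.max())     # numerically stable softmax
    o_t = e / e.sum()
    return s_t, o_t

# Toy sizes: n = input dim, m = hidden dim, h = vocabulary size
rng = np.random.default_rng(0)
n, m, h = 4, 3, 5
U = rng.standard_normal((m, n))
W = rng.standard_normal((m, m))
V = rng.standard_normal((h, m))

s = np.zeros(m)                           # initial hidden state
x = rng.standard_normal(n)                # one input vector
s, o = rnn_step(x, s, U, W, V)
print(o.sum())  # the output sums to 1: a probability distribution
```

Feeding each new hidden state back in as `s_prev` on the next call is what gives the network its memory across the sequence.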

Here are the sizes of the parameters:

- $U$ is an $m \times n$ matrix, where $m$ is the size of the hidden state vector and $n$ is the size of the input vector.
- $W$ is an $m \times m$ matrix, where $m$ is the size of the hidden state vector.
- $V$ is an $h \times m$ matrix, where $h$ is the size of the output vector and $m$ is the size of the hidden state vector.
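To make these shapes concrete, here is a hedged sketch of a forward pass over a whole sequence; the sizes `n`, `m`, `h` and the sequence length `T` are toy values chosen for illustration:

```python
import numpy as np

n, m, h, T = 4, 3, 5, 6            # input dim, hidden dim, output dim, sequence length
rng = np.random.default_rng(1)
U = rng.standard_normal((m, n))    # m x n: input-to-hidden
W = rng.standard_normal((m, m))    # m x m: hidden-to-hidden
V = rng.standard_normal((h, m))    # h x m: hidden-to-output

xs = rng.standard_normal((T, n))   # a toy input sequence of T vectors
s = np.zeros(m)                    # initial hidden state
outputs = []
for x_t in xs:
    s = np.tanh(U @ x_t + W @ s)   # recurrence: state depends on previous state
    logits = V @ s
    e = np.exp(logits - logits.max())
    outputs.append(e / e.sum())    # softmax over the vocabulary
outputs = np.array(outputs)
print(outputs.shape)  # (T, h): one distribution per time step
```

Note that the same $U$, $W$, and $V$ are reused at every step: the parameters are shared across time.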

Depending on the application, the output of an RNN can also be a single vector called the *context* vector. This will be discussed in the next post.

Reference:

Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs