# What is a recurrent neural network (RNN)?

This post explains what a recurrent neural network – or RNN model – is in machine learning.

An RNN is a model that works with input sequences. The difference between sequences and ordinary data is that in a sequence, order matters. It is similar to a time series, but over discrete steps. This also means the inputs are not independent of each other.

An RNN consumes its input one element at a time, for example a sequence of words. At each step it produces two things: a hidden state vector and an output.

The hidden state vector is a function of the previous hidden state $s_{t-1}$ and the current input $x_t$. It is the memory of the network. The formula is $s_t = f(Ux_t + Ws_{t-1})$, where $f$ is typically a nonlinearity such as $\tanh$.

The output is a probability distribution over the vocabulary: $o_t = \text{softmax}(Vs_t)$.

The parameter sizes are:

• $U$ is an $m \times n$ matrix, where $m$ is the size of the hidden state vector and $n$ is the size of the input vector.
• $W$ is an $m \times m$ matrix, where $m$ is the size of the hidden state vector.
• $V$ is an $h \times m$ matrix, where $h$ is the size of the output vector and $m$ is the size of the hidden state vector.
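The recurrence and the parameter shapes above can be sketched in a few lines of NumPy. This is a minimal illustration, not a trained model: the sizes $n$, $m$, $h$ and the random initialization are arbitrary choices, and $\tanh$ is assumed as the nonlinearity $f$.

```python
import numpy as np

# Hypothetical sizes: n-dim input, m-dim hidden state, h-dim output (vocabulary).
n, m, h = 4, 8, 10
rng = np.random.default_rng(0)

# Parameters with the shapes listed above (randomly initialized for illustration).
U = rng.normal(0, 0.1, (m, n))  # input -> hidden, shape m x n
W = rng.normal(0, 0.1, (m, m))  # hidden -> hidden, shape m x m
V = rng.normal(0, 0.1, (h, m))  # hidden -> output, shape h x m

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def rnn_step(x_t, s_prev):
    """One step: s_t = tanh(U x_t + W s_{t-1}),  o_t = softmax(V s_t)."""
    s_t = np.tanh(U @ x_t + W @ s_prev)
    o_t = softmax(V @ s_t)
    return s_t, o_t

# Unroll over a sequence of 5 input vectors, starting from a zero hidden state.
xs = rng.normal(size=(5, n))
s = np.zeros(m)
for x_t in xs:
    s, o = rnn_step(x_t, s)
```

Note that the same $U$, $W$, $V$ are reused at every step; only the hidden state $s$ changes as the sequence is consumed, which is what makes it the network's memory.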

Depending on the application, the output of an RNN can also be a single vector called the context vector. This will be discussed in the next post.

Reference:

Recurrent Neural Networks Tutorial, Part 1 – Introduction to RNNs