Inter Frame - Inter Frame Prediction

An inter coded frame is divided into blocks known as macroblocks. Then, instead of directly encoding the raw pixel values of each block, the encoder tries to find a block similar to the one it is encoding in a previously encoded frame, referred to as a reference frame. This search is performed by a block matching algorithm. If the search succeeds, the block can be encoded as a vector, known as a motion vector, which points to the position of the matching block in the reference frame. The process of determining motion vectors is called motion estimation.
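As an illustration, block matching can be sketched as an exhaustive search over a small window of the reference frame, scoring each candidate with the sum of absolute differences (SAD). This is only a minimal sketch under stated assumptions: the function names, the 4x4 block size, the search range of 2 pixels and the SAD criterion are illustrative choices not taken from the text, and practical encoders generally use faster search strategies.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def motion_estimate(cur, ref, bx, by, n=4, search=2):
    """Exhaustive block matching in a +/- `search` pixel window.

    Returns ((mvx, mvy), best_sad), where the motion vector is the offset
    from the block position in `cur` to the best match in `ref`.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for mvy in range(-search, search + 1):
        for mvx in range(-search, search + 1):
            rx, ry = bx + mvx, by + mvy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost


# Toy frames with arbitrary sample values (hypothetical data): the current
# frame is the reference with its content moved one pixel left and one down.
ref_frame = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur_frame = [[ref_frame[(y - 1) % 8][(x + 1) % 8] for x in range(8)]
             for y in range(8)]

mv, cost = motion_estimate(cur_frame, ref_frame, bx=2, by=2)
# mv == (1, -1) and cost == 0: the match lies one pixel to the right and
# one pixel up in the reference frame.
```

In this toy case the match is exact, so the SAD is zero. With real video the best candidate usually leaves a nonzero difference.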

In most cases the encoder will succeed, but the block found is unlikely to be an exact match for the block being encoded. The encoder therefore computes the differences between the two blocks. These residual values are known as the prediction error and must be transformed and sent to the decoder.
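The prediction error can be pictured as a per-pixel subtraction between the current block and the block the motion vector points to. Adding it back to that same reference block recovers the original pixels. The sketch below assumes hypothetical helper names and deliberately leaves out the transform step, so the round trip here is exact.

```python
def predict_block(ref, bx, by, mv, n):
    """Fetch the n x n block of `ref` that the motion vector points to."""
    mvx, mvy = mv
    return [[ref[by + mvy + dy][bx + mvx + dx] for dx in range(n)]
            for dy in range(n)]


def residual_block(cur, ref, bx, by, mv, n):
    """Encoder side: prediction error = current block - predicted block."""
    pred = predict_block(ref, bx, by, mv, n)
    return [[cur[by + dy][bx + dx] - pred[dy][dx] for dx in range(n)]
            for dy in range(n)]


def reconstruct_block(ref, bx, by, mv, residual, n):
    """Decoder side: predicted block + prediction error."""
    pred = predict_block(ref, bx, by, mv, n)
    return [[pred[dy][dx] + residual[dy][dx] for dx in range(n)]
            for dy in range(n)]


# Hypothetical toy data: the current frame is the reference shifted by one
# pixel, plus a small checkerboard disturbance so the match is not exact.
ref = [[(3 * x + 5 * y) % 50 for x in range(6)] for y in range(6)]
cur = [[ref[y][min(x + 1, 5)] + (x + y) % 2 for x in range(6)]
       for y in range(6)]

res = residual_block(cur, ref, bx=1, by=1, mv=(1, 0), n=2)
rec = reconstruct_block(ref, bx=1, by=1, mv=(1, 0), residual=res, n=2)
# res == [[0, 1], [1, 0]]; rec equals the original 2x2 block of `cur`.
```

The residual holds only small values, which is what makes it cheaper to code than the raw block.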

To sum up, if the encoder finds a matching block in a reference frame, it obtains a motion vector pointing to the matched block and a prediction error. Using both elements, the decoder can recover the raw pixels of the block. The following image shows the whole process graphically:

This kind of prediction has some pros and cons:

If everything goes well, the algorithm finds a matching block with little prediction error, so that, once the error is transformed, the overall size of motion vector plus prediction error is smaller than the size of a raw encoding.
If the block matching algorithm fails to find a suitable match, the prediction error will be considerable, and the overall size of motion vector plus prediction error will exceed that of the raw encoding. In this case the encoder makes an exception and sends a raw encoding for that specific block.
If the matched block in the reference frame has itself been encoded using inter frame prediction, the errors made in its encoding are propagated to the next block. If every frame were encoded using this technique, a decoder would have no way to synchronize to a video stream, because it would be impossible to obtain the reference images.
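The size comparison behind the first two points can be sketched as a per-block mode decision. The bit counts below are a crude proxy chosen for illustration: both the motion vector components and the residual samples are costed with signed Exp-Golomb code lengths, and the raw block is assumed to cost 8 bits per pixel. The function names and the cost model are assumptions, not the procedure of any particular codec.

```python
def se_golomb_bits(v):
    """Length in bits of the signed Exp-Golomb code for integer v."""
    k = 2 * abs(v) - (1 if v > 0 else 0)   # map signed value to code number
    return 2 * (k + 1).bit_length() - 1


def choose_mode(residual, mv, bits_per_pixel=8):
    """Pick 'inter' when motion vector + residual is estimated to be
    cheaper than sending the block's raw pixels, otherwise 'raw'.

    Returns (mode, inter_bits, raw_bits).
    """
    n_pixels = sum(len(row) for row in residual)
    raw_bits = n_pixels * bits_per_pixel
    mv_bits = sum(se_golomb_bits(c) for c in mv)
    res_bits = sum(se_golomb_bits(r) for row in residual for r in row)
    inter_bits = mv_bits + res_bits
    mode = "inter" if inter_bits < raw_bits else "raw"
    return mode, inter_bits, raw_bits


good_match = choose_mode([[0, 1], [1, 0]], mv=(1, 0))
# good_match == ("inter", 12, 32)

poor_match = choose_mode([[100, -90], [80, -120]], mv=(1, 0))
# poor_match == ("raw", 64, 32)
```

With a good match the residual costs a handful of bits. With a poor one the large error values cost more than the 32 bits of the raw 2x2 block, so the fallback is chosen.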

These drawbacks underline the need for a reliable, periodic reference frame if this technique is to be efficient and useful. That reference frame is known as an I-frame, which is strictly intra coded (every block is coded from its own pixel values, without reference to any other frame), so it can always be decoded without additional information.

In most designs, there are two types of inter frames: P-frames and B-frames. These two kinds of frames and the I-frames (Intra-coded pictures) are usually grouped into a GOP (Group Of Pictures). The I-frame needs no additional information to be decoded and can be used as a reliable reference. This structure also makes it possible to achieve I-frame periodicity, which is needed for decoder synchronization.
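To make the periodicity concrete, the sketch below builds the frame types of a GOP in display order and finds where a decoder joining mid-stream could start. The GOP length of 12 with two B-frames between anchor frames is a commonly cited layout used here only as an assumption, and the function names are hypothetical.

```python
def gop_frame_types(gop_length=12, b_between=2):
    """Frame types of one GOP in display order: an I-frame first, then
    runs of `b_between` B-frames, each followed by a P-frame."""
    types = []
    for i in range(gop_length):
        if i == 0:
            types.append("I")
        elif i % (b_between + 1) == 0:
            types.append("P")
        else:
            types.append("B")
    return types


def next_sync_point(frame_types, start):
    """Index of the first I-frame at or after `start`, or None if the
    stream contains no further I-frame."""
    for i in range(start, len(frame_types)):
        if frame_types[i] == "I":
            return i
    return None


stream = gop_frame_types() * 3          # three consecutive GOPs
pattern = "".join(gop_frame_types())    # "IBBPBBPBBPBB"
sync = next_sync_point(stream, 5)       # 12: start of the second GOP
```

A decoder that tunes in at frame 5 has to wait until frame 12, the next I-frame, so the GOP length bounds the synchronization delay.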
