Intra prediction in H.264/AVC is a spatial-domain directional prediction: different intra prediction modes represent different prediction directions, such as horizontal, vertical, and diagonal. An intra-coded macroblock (MB) can be partitioned into 4×4, 8×8, or 16×16 intra prediction blocks. The 4×4 and 8×8 blocks each have nine prediction modes, and the 16×16 block has four, so H.264/AVC uses 22 (9+9+4) intra prediction modes in total. The residue usually has high energy along the direction of prediction, because edges are harder to predict than smooth areas.

Mode-dependent directional transform (MDDT) was proposed to compact the residue produced by intra prediction. It consists of a set of pre-defined separable transforms, each of which is efficient at compacting energy along one prediction direction and therefore favors one intra mode. The transform to use is determined by the selected intra prediction mode, so it does not need to be signaled explicitly.
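To make the coupling between mode and transform concrete, here is a minimal sketch in Python with NumPy. It is not the KTA implementation. The table `TRANSFORM_TABLE`, the helper names, and the choice of DCT and DST matrices as stand-ins are all assumptions made for illustration; the real MDDT matrices are trained and are not reproduced here. The point is only that encoder and decoder both look up the transform pair from the intra mode, so no extra syntax element is needed.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def dst_matrix(n):
    """Orthonormal DST-I matrix; each row is one basis vector."""
    k = np.arange(1, n + 1).reshape(-1, 1)
    i = np.arange(1, n + 1).reshape(1, -1)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * k * i / (n + 1))

N = 4
DCT, DST = dct_matrix(N), dst_matrix(N)

# PLACEHOLDER table (assumption): mode index -> (column transform, row transform).
# Real MDDT would hold one trained matrix pair per mode; the DST/DCT pairing
# for modes 0 and 1 below is arbitrary and only makes the entries differ.
TRANSFORM_TABLE = {mode: (DCT, DCT) for mode in range(9)}
TRANSFORM_TABLE[0] = (DST, DCT)
TRANSFORM_TABLE[1] = (DCT, DST)

def forward_transform(residual, intra_mode):
    """Separable transform: columns first, then rows, chosen by the mode."""
    col_t, row_t = TRANSFORM_TABLE[intra_mode]
    return col_t @ residual @ row_t.T

def inverse_transform(coeffs, intra_mode):
    """Inverse for orthonormal matrices; the decoder reuses the decoded mode."""
    col_t, row_t = TRANSFORM_TABLE[intra_mode]
    return col_t.T @ coeffs @ row_t

rng = np.random.default_rng(0)
residual = rng.integers(-16, 17, size=(N, N)).astype(np.float64)
mode = 0  # already known to the decoder from the intra mode syntax
coeffs = forward_transform(residual, mode)
reconstructed = inverse_transform(coeffs, mode)
print(np.allclose(reconstructed, residual))  # True
```

The inverse here relies on the stand-in matrices being orthonormal; with integer, not exactly orthogonal matrices the inverse would need separate handling.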

Inter prediction errors also contain directional information, but MDDT cannot be applied to them unless the edge directions are explicitly detected and transmitted. The side information this requires is significant and erodes the overall gain. Hence, MDDT is proposed only for intra-coded MBs.

Twenty-two separable transforms are pre-defined for the 22 intra prediction modes; each consists of two transform matrices, one for the horizontal and one for the vertical transform. The memory needed to store all the MDDT matrices is about 1.5Kb. The transform matrices are derived from a large set of video sequences, all of which are intra-coded. The blocks are classified into 22 categories according to their intra prediction modes. For each category, the horizontal and vertical correlation matrices of the prediction errors are calculated, and their eigenvectors are used to construct the horizontal and vertical transform matrices, respectively. The derivation is similar to that of the KLT, but MDDT is not optimal, because it is separable and is designed from general statistics, which may not match the local statistics of a particular video sequence. Furthermore, the basis vectors of MDDT contain only integers: they are scaled and rounded versions of the eigenvectors, and so are not exactly orthogonal to each other. The risks of non-orthogonal transforms were discussed in an earlier post (here).
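The derivation described above can be sketched for a single category of blocks as follows. This is an illustration under stated assumptions, not the training code used for KTA: the function names, the scale factor of 128, and the synthetic "prediction errors" are all hypothetical. The convention assumed here is that the vertical transform acts on columns and is built from the average of X·Xᵀ, while the horizontal transform acts on rows and is built from the average of Xᵀ·X.

```python
import numpy as np

def correlation_matrices(blocks):
    """Average vertical and horizontal correlation over residual blocks.

    blocks has shape (num_blocks, n, n); all blocks share one intra mode.
    """
    blocks = np.asarray(blocks, dtype=np.float64)
    count = blocks.shape[0]
    vertical = np.einsum("bij,bkj->ik", blocks, blocks) / count    # mean of X @ X.T
    horizontal = np.einsum("bji,bjk->ik", blocks, blocks) / count  # mean of X.T @ X
    return vertical, horizontal

def eigen_basis(corr):
    """Eigenvectors as rows, ordered from largest to smallest eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(corr)  # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T

def to_integer(basis, scale=128):
    """Scale and round to integers; the scale of 128 is an assumption."""
    return np.rint(basis * scale).astype(np.int64)

# Synthetic stand-in for the prediction errors of one mode (NOT real data).
rng = np.random.default_rng(0)
blocks = rng.standard_normal((500, 4, 4)).cumsum(axis=1).cumsum(axis=2)

vertical_corr, horizontal_corr = correlation_matrices(blocks)
col_float = eigen_basis(vertical_corr)    # vertical (column) transform
row_float = eigen_basis(horizontal_corr)  # horizontal (row) transform
col_int, row_int = to_integer(col_float), to_integer(row_float)

print(np.allclose(col_float @ col_float.T, np.eye(4)))  # True
print(col_int @ col_int.T)  # Gram matrix of the integer basis
```

The floating-point eigenvectors are orthonormal, whereas the Gram matrix of the rounded integer basis generally shows small non-zero off-diagonal entries, which is the loss of exact orthogonality mentioned above.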

It is well known that separable transforms deal efficiently with horizontal and vertical edges, because their basis images contain only horizontal and vertical edges, like checkerboards. MDDT, although a separable transform, is used to compact energy along arbitrary directions, which seems contradictory. To look into this, the basis images of MDDT for different intra prediction modes were studied. Although these basis images also have checkerboard patterns, the positions of their zero-crossings differ from those of the DCT or ICT. Figs. 1 and 2 show the basis images for the 4th mode (diagonal down right) of intra 8×8 and 4×4 prediction, respectively. In the basis image at the second row and second column, which is typical, the two squares along the diagonal down right direction have larger areas than the other two squares. Such differences may be what makes MDDT more efficient than the DCT or ICT in dealing with arbitrary edges. Another observation is that intra prediction modes with close directions, such as (diagonal down right, vertical right, horizontal down) and (diagonal down left, vertical left, horizontal up), have similar MDDT basis image sets. A complete set of basis images of MDDT can be downloaded here.

Fig. 1 Basis images of MDDT for the 4th mode of intra 8×8 prediction — diagonal down right

Fig. 2 Basis images of MDDT for the 4th mode of intra 4×4 prediction — diagonal down right
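The link between 1-D basis vectors and 2-D basis images can be sketched in a few lines. For a separable transform, the basis image for coefficient (u, v) is the outer product of column basis vector u and row basis vector v, so its sign pattern is always a checkerboard, and the position of the zero-crossing in each 1-D vector decides how large each square is. The vector `toy_vec` below is hand-made for illustration and is not an actual MDDT basis vector; the helper names are likewise assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one basis vector."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def basis_image(col_t, row_t, u, v):
    """Basis image of coefficient (u, v) for a separable transform."""
    return np.outer(col_t[u], row_t[v])

def quadrant_sizes(image):
    """Sample counts of the four sign regions of a 2x2 checkerboard image.

    Assumes one sign change per axis and a positive top-left sample.
    Returns (top-left, top-right, bottom-left, bottom-right).
    """
    n, m = image.shape
    p = int(np.sum(image[:, 0] > 0))  # rows before the vertical sign change
    q = int(np.sum(image[0, :] > 0))  # columns before the horizontal sign change
    return (p * q, p * (m - q), (n - p) * q, (n - p) * (m - q))

N = 8
DCT = dct_matrix(N)
dct_image = basis_image(DCT, DCT, 1, 1)  # second row, second column

# TOY vector (assumption, not from MDDT): the sign changes after the fifth
# sample rather than after the fourth, as it does for the DCT.
toy_vec = np.array([4.0, 3.0, 2.0, 1.0, 0.5, -1.0, -2.0, -3.0])
toy_image = np.outer(toy_vec, toy_vec)

print(quadrant_sizes(dct_image))  # (16, 16, 16, 16)
print(quadrant_sizes(toy_image))  # (25, 15, 15, 9)
```

With the DCT the four squares are equal. In the toy case the two same-sign regions on the main diagonal cover 34 of the 64 samples, against 30 for the other two. This only shows how moving a zero-crossing reshapes the checkerboard; it does not reproduce the trained basis images in Figs. 1 and 2.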

MDDT has been adopted in KTA. The relevant documents include proposals (AF15, AG11, AH20, AJ24, AI36) and a conference paper [1].

[1] Y. Ye and M. Karczewicz, “Improved H.264 intra coding based on bi-directional intra prediction, directional transform, and adaptive coefficient scanning,” *IEEE Int’l Conf. Image Process.’08 (ICIP08)*, San Diego, U.S.A., Oct. 2008.
