## Analysis of Coding Tools in HEVC Test Model (HM 1.0) – Inter Prediction

2010-12-07

*[Update on 2011-02-15] The 12-tap DCT-based interpolation filter (high efficiency configuration) and the 6-tap directional interpolation filter (low complexity configuration) for 1/4 luma samples will be replaced by an 8-tap DCT-based interpolation filter for both HE and LC (JCTVC-D344) in the upcoming HM2.0. In addition, the bilinear interpolation filter for 1/8 chroma samples will be replaced by a 4-tap DCT-based interpolation filter (JCTVC-D347) in HM2.0.*

Tools adopted in HM0.9 include a 12-tap DCT-based interpolation filter (high efficiency configuration) and a 6-tap directional interpolation filter (low complexity configuration) for 1/4 luma samples, a bilinear interpolation filter for 1/8 chroma samples (both HE and LC), advanced motion vector prediction, bi-direction rounding control, and bi-directional prediction for temporal level 0.

**12-tap DCT-based Interpolation Filter for luma (HE)**

Motion compensation is a key factor in high-efficiency video compression, and fractional-pel accuracy requires interpolation of the reference frame. In H.264/AVC, a 6-tap fixed Wiener filter is first used for half-pel interpolation, and a bilinear combination of integer and half-pel values then provides ¼-pel interpolation. Rather than an adaptive interpolation filter, HM adopts a fixed 12-tap DCT-based interpolation filter, which replaces the combination of Wiener and bilinear filters with a set of interpolation filters at the desired fractional accuracy. More specifically, a single filtering procedure yields the interpolated pixel at any fractional position, instead of the combination of 6-tap and bilinear filtering procedures in H.264/AVC. Thus, the motion compensation process is simpler from an implementation point of view, and the complexity can also be reduced for quarter-pel accuracy. Figure 1 shows the 12-tap DCT-based interpolation filter for luma (HE).

Figure 1. 12-tap DCT-based interpolation filter for luma (HE)

The DCT-based interpolation filtering process uses the horizontal neighboring integer pixels I(x,0), where x = -5, …, 6, to interpolate the horizontal fractional pixels a, b and c, using the following equations,

$$a = \sum_{x=-5}^{6} I(x,y)\, f(x, 1/4) \tag{1}$$

$$b = \sum_{x=-5}^{6} I(x,y)\, f(x, 2/4) \tag{2}$$

$$c = \sum_{x=-5}^{6} I(x,y)\, f(x, 3/4) \tag{3}$$

where y=0.

The vertical neighboring integer pixels I(0,y), where y = -5, …, 6, are used to interpolate the vertical fractional pixels d(0), h(0), and n(0), using the following equations,

$$d(x) = \sum_{y=-5}^{6} I(x,y)\, f(y, 1/4) \tag{4}$$

$$h(x) = \sum_{y=-5}^{6} I(x,y)\, f(y, 2/4) \tag{5}$$

$$n(x) = \sum_{y=-5}^{6} I(x,y)\, f(y, 3/4) \tag{6}$$

where x=0.
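Eqs. (1)-(6) are the same 1-D filter applied along a row or a column. The following is a minimal Python sketch of that step, using the coefficients listed in Table 1. The function names, the `frame[y][x]` list-of-lists layout, and the floating-point division by 256 are assumptions made for illustration; the text above does not specify rounding or clipping for the HE filter, so none is applied.

```python
# 12-tap coefficients as listed in Table 1; each set is divided by 256.
F12 = {
    1: [-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1],    # 1/4 position
    2: [-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1],  # 2/4 position
    3: [-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1],    # 3/4 position
}
OFFSETS = range(-5, 7)  # integer offsets -5..6 used in Eqs. (1)-(6)


def interp_1d(samples, quarter):
    """Weighted sum of 12 integer samples (offsets -5..6), normalised by 256.

    Returns an unrounded float; rounding and clipping are left out on purpose
    because the text does not specify them (assumption of this sketch).
    """
    if len(samples) != 12:
        raise ValueError("expected 12 samples at offsets -5..6")
    acc = sum(s * c for s, c in zip(samples, F12[quarter]))
    return acc / 256.0


def horizontal_abc(frame, x, y):
    """Eqs. (1)-(3): positions a, b, c to the right of integer pixel (x, y)."""
    row = [frame[y][x + dx] for dx in OFFSETS]
    return tuple(interp_1d(row, q) for q in (1, 2, 3))


def vertical_dhn(frame, x, y):
    """Eqs. (4)-(6): positions d, h, n below integer pixel (x, y)."""
    col = [frame[y + dy][x] for dy in OFFSETS]
    return tuple(interp_1d(col, q) for q in (1, 2, 3))
```

On a horizontal ramp (`frame[y][x] = x`), the half-pel value b lands exactly halfway between its two neighbours, while a and c come out close to, but not exactly at, the quarter points.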

In order to interpolate the other fractional pixels, e, f, g, i, j, k, p, q, and r, the corresponding auxiliary vertical fractional pixels, d(x), h(x) and n(x), where x = -5, …, 6, should first be interpolated using Eqs. (4)-(6), respectively. For example, in order to interpolate the fractional pixels e, f, and g, the values d(x), where x = -5, …, 6, should first be interpolated using Eq. (4). Then e, f, and g are interpolated using the following equations,

$$e = \sum_{x=-5}^{6} d(x)\, f(x, 1/4) \tag{7}$$

$$f = \sum_{x=-5}^{6} d(x)\, f(x, 2/4) \tag{8}$$

$$g = \sum_{x=-5}^{6} d(x)\, f(x, 3/4) \tag{9}$$

The same procedure is applied to the other fractional pixels, i, j, k, p, q, and r, using the following equations,

$$i = \sum_{x=-5}^{6} h(x)\, f(x, 1/4) \tag{10}$$

$$j = \sum_{x=-5}^{6} h(x)\, f(x, 2/4) \tag{11}$$

$$k = \sum_{x=-5}^{6} h(x)\, f(x, 3/4) \tag{12}$$

$$p = \sum_{x=-5}^{6} n(x)\, f(x, 1/4) \tag{13}$$

$$q = \sum_{x=-5}^{6} n(x)\, f(x, 2/4) \tag{14}$$

$$r = \sum_{x=-5}^{6} n(x)\, f(x, 3/4) \tag{15}$$

In the above equations (1)-(15), f(x,1/4), f(x,2/4) and f(x,3/4) denote the 12-tap DCT-based interpolation filter coefficients at fractional positions 1/4, 2/4, and 3/4, respectively, as listed in Table 1.
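The two-stage procedure of Eqs. (4)-(15) can be sketched as follows: filter vertically in each of the 12 neighboring columns, then filter the 12 intermediate values horizontally. This is only an illustrative Python sketch. The names `filt`, `luma_2d` and `POSITIONS`, the `frame[y][x]` layout, and the choice to keep the intermediate sums unnormalised and divide once by 256×256 are assumptions; the intermediate precision actually used by HM is not described in the text.

```python
# 12-tap coefficients as listed in Table 1; each set is divided by 256.
F12 = {
    1: [-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1],    # 1/4 position
    2: [-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1],  # 2/4 position
    3: [-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1],    # 3/4 position
}
OFFSETS = range(-5, 7)  # integer offsets -5..6

# (horizontal quarter, vertical quarter) for each 2-D position,
# read off Eqs. (7)-(15): e, f, g use d(x); i, j, k use h(x); p, q, r use n(x).
POSITIONS = {
    "e": (1, 1), "f": (2, 1), "g": (3, 1),
    "i": (1, 2), "j": (2, 2), "k": (3, 2),
    "p": (1, 3), "q": (2, 3), "r": (3, 3),
}


def filt(samples, quarter):
    """Unnormalised 12-tap weighted sum (no division by 256 yet)."""
    return sum(s * c for s, c in zip(samples, F12[quarter]))


def luma_2d(frame, x, y, name):
    """Interpolate one of e, f, g, i, j, k, p, q, r next to pixel (x, y)."""
    fx, fy = POSITIONS[name]
    # Stage 1, Eqs. (4)-(6): auxiliary vertical values d(x), h(x) or n(x)
    # for the 12 columns at horizontal offsets -5..6.
    aux = [filt([frame[y + dy][x + dx] for dy in OFFSETS], fy)
           for dx in OFFSETS]
    # Stage 2, Eqs. (7)-(15): horizontal filter over the auxiliary values.
    # Both stages carry a factor of 256, hence the single final division.
    return filt(aux, fx) / (256.0 * 256.0)
```

For a plane `frame[y][x] = x + y`, the central position j evaluates to exactly one more than the integer pixel at (x, y), i.e. half a step in each direction.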

Table 1. Filter coefficients for 12-tap DCT-based interpolation filter in HE configuration

| Position | 12-tap filter coefficients |
|---|---|
| ¼ | {-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1}/256 |
| ½ | {-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1}/256 |
| ¾ | {-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1}/256 |
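Two properties of Table 1 are easy to verify numerically: each coefficient set sums to 256, so a flat area is reproduced unchanged, and the ¾ set is the ¼ set in reverse order. The short Python check below is an illustrative sketch; the helper names `dc_gain` and `is_mirror` are hypothetical.

```python
# Coefficient sets copied from Table 1 (numerators only; divisor is 256).
F12 = {
    "1/4": [-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1],
    "1/2": [-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1],
    "3/4": [-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1],
}


def dc_gain(taps, divisor=256):
    """Gain applied to a constant signal; 1.0 means flat areas are preserved."""
    return sum(taps) / divisor


def is_mirror(taps_a, taps_b):
    """True if taps_b is taps_a read in reverse order."""
    return list(taps_a) == list(reversed(taps_b))


for position, taps in F12.items():
    print(position, "taps:", len(taps), "DC gain:", dc_gain(taps))
print("3/4 mirrors 1/4:", is_mirror(F12["1/4"], F12["3/4"]))
print("1/2 is symmetric:", is_mirror(F12["1/2"], F12["1/2"]))
```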


**6-tap Directional Interpolation Filter for luma (LC)**

The set of interpolation filters for the low complexity configuration is referred to as the Directional Interpolation Filter (DIF) and is used for all 15 quarter-pixel positions, as shown in Figure 2. All interpolated sample values can be calculated using 16-bit arithmetic. The resulting interpolated values are scaled, clipped and stored using 8 bits per sample. Figure 2 shows the support pixels for each quarter-pixel position in different colours: the blue integer pixels support the three horizontal fractional pixels a, b and c; the light-blue integer pixels support the three vertical fractional pixels d, h and n; the deep-yellow integer pixels support the two down-right fractional pixels e and r; the light-yellow integer pixels support the two down-left fractional pixels g and p; and the purple integer pixels support the central fractional pixel j.

Figure 2. 6-tap directional interpolation filter for luma (LC)

For each of the three horizontal fractional positions, a, b and c, and the three vertical fractional positions, d, h and n, which are aligned with full pixel positions, a single 6-tap filter is used. The filter coefficients of DIF are {3, -15, 111, 37, -10, 2}/128 for the ¼ position (mirrored for the ¾ position) and {3, -17, 78, 78, -17, 3}/128 for the ½ position.

a = (3H-15I+111J+37K-10L+2M+64)>>7

b = (3H-17I+78J+78K-17L+3M+64)>>7

c = (2H-10I+37J+111K-15L+3M+64)>>7

d = (3B-15E+111J+37O-10S+2W+64)>>7

h = (3B-17E+78J+78O-17S+3W+64)>>7

n = (2B-10E+37J+111O-15S+3W+64)>>7
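The six equations above share one integer pattern: a 6-tap weighted sum, a rounding offset of 64, and a right shift by 7. A minimal Python sketch of that pattern is shown below. The function names are hypothetical, and the clip to the 0..255 range is added on the basis of the statement that DIF results are clipped and stored with 8 bits per sample.

```python
# DIF 6-tap coefficients from the text; each set is divided by 128.
DIF_QUARTER = [3, -15, 111, 37, -10, 2]   # 1/4 position
DIF_HALF = [3, -17, 78, 78, -17, 3]       # 1/2 position
DIF_TAPS = {
    1: DIF_QUARTER,
    2: DIF_HALF,
    3: DIF_QUARTER[::-1],                 # 3/4 position is the mirrored 1/4 set
}


def clip8(value):
    """Clip to the 8-bit sample range (assumed from the 8 bit/sample note)."""
    return max(0, min(255, value))


def dif_1d(samples, quarter):
    """6-tap DIF along a row (H..M) or a column (B, E, J, O, S, W).

    `samples` holds the six integer pixels in the order used by the
    equations, e.g. [H, I, J, K, L, M] for a, b, c.
    """
    if len(samples) != 6:
        raise ValueError("expected 6 samples")
    acc = sum(s * c for s, c in zip(samples, DIF_TAPS[quarter]))
    return clip8((acc + 64) >> 7)   # +64 rounds, >>7 divides by 128


row = [90, 100, 110, 120, 130, 140]       # hypothetical H, I, J, K, L, M
a, b, c = (dif_1d(row, q) for q in (1, 2, 3))
```

Python's `>>` on negative integers is an arithmetic shift, so negative sums round toward minus infinity before being clipped to 0.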

For four of the innermost quarter-pixel positions, e, g, p, and r, 6-tap filters at +45 degree and -45 degree angles are used: e and r are filtered along the down-right diagonal, and g and p along the down-left diagonal.

e = (3A-15D+111J+37P-10U+2X+64)>>7

g = (3C-15G+111K+37O-10R+2V+64)>>7

p = (2C-10G+37K+111O-15R+3V+64)>>7

r = (2A-10D+37J+111P-15U+3X+64)>>7

For the other four innermost quarter-pixel positions, f, i, k, and q, a combination of the 6-tap filters at +45 degree and -45 degree angles is used, which is equivalent to a 12-tap filter.

f = (e+g+1)>>1 = ((3A-15D+111J+37P-10U+2X)+(3C-15G+111K+37O-10R+2V)+128)>>8

i = (e+p+1)>>1 = ((3A-15D+111J+37P-10U+2X)+(2C-10G+37K+111O-15R+3V)+128)>>8

k = (g+r+1)>>1 = ((3C-15G+111K+37O-10R+2V)+(2A-10D+37J+111P-15U+3X)+128)>>8

q = (p+r+1)>>1 = ((2C-10G+37K+111O-15R+3V)+(2A-10D+37J+111P-15U+3X)+128)>>8
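The diagonal positions and their pairwise combinations can be sketched in Python as below. To avoid guessing the pixel layout of Figure 2, the sketch takes a dictionary keyed by the letter labels used in the equations; the dictionary interface, the function names and the clip to 8 bits are assumptions. For f, i, k and q the sketch follows the right-hand, single-rounding form (`+128`, `>>8`), which can differ by one from first rounding e, g, p, r and then averaging.

```python
DIF_QUARTER = (3, -15, 111, 37, -10, 2)   # /128, as given in the text
DOWN_RIGHT = "ADJPUX"                     # support of e and r
DOWN_LEFT = "CGKORV"                      # support of g and p


def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def tap_sum(px, labels, taps):
    """Weighted sum over the pixels named by `labels` (no rounding yet)."""
    return sum(px[name] * c for name, c in zip(labels, taps))


def dif_diagonal(px):
    """Return e, g, p, r and f, i, k, q from a dict of labelled integer pixels."""
    mirrored = DIF_QUARTER[::-1]
    sums = {
        "e": tap_sum(px, DOWN_RIGHT, DIF_QUARTER),
        "g": tap_sum(px, DOWN_LEFT, DIF_QUARTER),
        "p": tap_sum(px, DOWN_LEFT, mirrored),
        "r": tap_sum(px, DOWN_RIGHT, mirrored),
    }
    # Single diagonal filter: round with +64 and divide by 128.
    out = {name: clip8((s + 64) >> 7) for name, s in sums.items()}
    # Combination of two diagonal filters: round with +128 and divide by 256.
    for name, (first, second) in {"f": "eg", "i": "ep",
                                  "k": "gr", "q": "pr"}.items():
        out[name] = clip8((sums[first] + sums[second] + 128) >> 8)
    return out
```

With every labelled pixel set to 100, all eight positions evaluate to 100, which is a quick check that both rounding paths are normalised correctly.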

The exception is the central position, j, where a 12-tap non-separable filter is used. The filter coefficients of DIF are {0, 5, 5, 0; 5, 22, 22, 5; 5, 22, 22, 5; 0, 5, 5, 0}/128 for the central position j.

j = ((5E+5F)+(5I+22J+22K+5L)+(5N+22O+22P+5Q)+(5S+5T)+64)>>7
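As a sketch of the central-position filter, the Python function below groups the twelve support pixels by weight. As before, the dictionary keyed by the letter labels of the equation and the function name are assumptions made for illustration.

```python
OUTER = "EFILNQST"   # eight pixels weighted by 5
INNER = "JKOP"       # four pixels weighted by 22


def dif_central_j(px):
    """Central position j: 12-tap non-separable DIF with rounding and >>7.

    The weights are non-negative and sum to 8*5 + 4*22 = 128, so for 8-bit
    inputs the result already lies in 0..255 and no clip is needed.
    """
    outer = sum(px[name] for name in OUTER)
    inner = sum(px[name] for name in INNER)
    return (5 * outer + 22 * inner + 64) >> 7
```

For example, if all twelve support pixels equal 100, the function returns 100.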

**[To be continued...]**
