Witness the development of H.265

Analysis of Coding Tools in HEVC Test Model (HM 1.0) – Inter Prediction

2010-12-07 H.265/HEVC

[Update on 2011-02-15] The 12-tap DCT-based interpolation filter (high efficiency configuration) and the 6-tap directional interpolation filter (low complexity configuration) for 1/4 luma samples will be replaced by an 8-tap DCT-based interpolation filter for both HE and LC (JCTVC-D344) in the upcoming HM2.0. In addition, the bilinear interpolation filter for 1/8 chroma samples will be replaced by a 4-tap DCT-based interpolation filter (JCTVC-D347) in HM2.0.

Tools adopted in HM0.9 include a 12-tap DCT-based interpolation filter (high efficiency configuration) and a 6-tap directional interpolation filter (low complexity configuration) for 1/4 luma samples, a bilinear interpolation filter for 1/8 chroma samples (both HE and LC), advanced motion vector prediction, bi-direction rounding control, and bi-directional prediction for temporal level 0.

  • 12-tap DCT-based Interpolation Filter for luma (HE)
  • Motion compensation is a key factor in highly efficient video compression, and fractional-pel accuracy requires interpolation of the reference frame. In H.264/AVC, a fixed 6-tap Wiener filter first produces the half-pel samples, and a bilinear combination of integer and half-pel values then produces the ¼-pel samples. Rather than an adaptive interpolation filter, HM adopts a fixed 12-tap DCT-based interpolation filter, replacing the combination of Wiener and bilinear filters with one set of interpolation filters defined directly at each fractional position. More specifically, a single filtering procedure produces the interpolated pixel at any fractional position, instead of the combination of 6-tap and bilinear filtering procedures in H.264/AVC. This simplifies the motion compensation process from an implementation point of view, and the complexity can also be reduced for quarter-pel accuracy. Figure 1 shows the 12-tap DCT-based interpolation filter for luma (HE).

    Figure 1. 12-tap DCT-based interpolation filter for luma (HE)

    The DCT-based interpolation filtering process uses the horizontal neighboring integer pixels I(x,0), where x=-5…6, to interpolate the horizontal fractional pixels a, b and c, using the following equations,
    a = \sum_{x=-5}^{6} I(x,y)*f(x, 1/4)            (1)
    b = \sum_{x=-5}^{6} I(x,y)*f(x, 2/4)            (2)
    c = \sum_{x=-5}^{6} I(x,y)*f(x, 3/4)            (3)
    where y=0.

    The vertical neighboring integer pixels I(0,y), where y=-5…6, are used to interpolate the vertical fractional pixels d(0), h(0) and n(0), using the following equations,
    d(x) = \sum_{y=-5}^{6} I(x,y)*f(y, 1/4)            (4)
    h(x) = \sum_{y=-5}^{6} I(x,y)*f(y, 2/4)            (5)
    n(x) = \sum_{y=-5}^{6} I(x,y)*f(y, 3/4)            (6)
    where x=0.
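
Eqs. (1)-(6) can be sketched as a single 1-D filtering routine applied along a row or a column. The following is a minimal Python sketch, not the HM implementation: the function names, the frame layout (`frame[y][x]`), and the final normalisation (add 128, shift right by 8, clip to 8 bits) are assumptions made for illustration; only the coefficients come from Table 1.

```python
# Table 1 coefficients (scaled by 256), keyed by quarter-pel phase 1, 2, 3.
# Tap k multiplies the integer sample at offset k - 5, i.e. offsets -5..6.
DCTIF_12TAP = {
    1: [-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1],
    2: [-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1],
    3: [-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1],
}


def filter_1d(samples, phase):
    """Weighted sum of 12 integer samples (offsets -5..6), still scaled by 256."""
    taps = DCTIF_12TAP[phase]
    assert len(samples) == len(taps)
    return sum(s * t for s, t in zip(samples, taps))


def round_shift(value, shift, bit_depth=8):
    """Assumed normalisation: round to nearest, then clip to the sample range."""
    rounded = (value + (1 << (shift - 1))) >> shift
    return max(0, min((1 << bit_depth) - 1, rounded))


def interp_horizontal(frame, x0, y0, phase):
    """a, b or c (phase 1, 2, 3) next to integer sample (x0, y0), Eqs. (1)-(3)."""
    row = [frame[y0][x0 + dx] for dx in range(-5, 7)]
    return round_shift(filter_1d(row, phase), 8)


def interp_vertical(frame, x0, y0, phase):
    """d, h or n (phase 1, 2, 3) below integer sample (x0, y0), Eqs. (4)-(6)."""
    col = [frame[y0 + dy][x0] for dy in range(-5, 7)]
    return round_shift(filter_1d(col, phase), 8)
```

Under these assumptions, on a horizontal ramp that rises by 10 per pixel, the half-pel sample b between the integer values 60 and 70 evaluates to 65.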

    In order to interpolate the other fractional pixels, e, f, g, i, j, k, p, q, and r, the corresponding auxiliary vertical fractional pixels, d(x), h(x) and n(x), where x=-5…6, should first be interpolated using Eqs. (4)-(6), respectively. For example, in order to interpolate the fractional pixels e, f, and g, the values d(x), where x=-5…6, should first be interpolated using Eq. (4). Then e, f, and g are interpolated using the following equations,
    e = \sum_{x=-5}^{6} d(x)*f(x, 1/4)           (7)
    f = \sum_{x=-5}^{6} d(x)*f(x, 2/4)            (8)
    g = \sum_{x=-5}^{6} d(x)*f(x, 3/4)            (9)

    The same procedure is applied to the other fractional pixels, i, j, k, p, q, and r, using the following equations,
    i = \sum_{x=-5}^{6} h(x)*f(x, 1/4)            (10)
    j = \sum_{x=-5}^{6} h(x)*f(x, 2/4)            (11)
    k = \sum_{x=-5}^{6} h(x)*f(x, 3/4)            (12)

    p = \sum_{x=-5}^{6} n(x)*f(x, 1/4)            (13)
    q = \sum_{x=-5}^{6} n(x)*f(x, 2/4)            (14)
    r = \sum_{x=-5}^{6} n(x)*f(x, 3/4)            (15)
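
The two-stage procedure of Eqs. (4)-(15) (vertical filtering to obtain the auxiliary values, then horizontal filtering of those values) can be sketched as below. This is an illustrative Python sketch under stated assumptions: the name `interp_2d`, the `frame[y][x]` layout, keeping the auxiliary values at full precision, and the single final normalisation by 2^16 with rounding and 8-bit clipping are all assumptions, since the text does not specify intermediate precision. The coefficients are those of Table 1.

```python
# Table 1 coefficients (scaled by 256), keyed by quarter-pel phase 1, 2, 3.
DCTIF_12TAP = {
    1: [-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1],
    2: [-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1],
    3: [-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1],
}
OFFSETS = range(-5, 7)  # the 12 integer positions -5..6 used by each filter


def interp_2d(frame, x0, y0, phase_x, phase_y, bit_depth=8):
    """Sample at (x0 + phase_x/4, y0 + phase_y/4), both phases in 1..3.

    (phase_x, phase_y): e=(1,1) f=(2,1) g=(3,1)
                        i=(1,2) j=(2,2) k=(3,2)
                        p=(1,3) q=(2,3) r=(3,3)
    """
    fx, fy = DCTIF_12TAP[phase_x], DCTIF_12TAP[phase_y]
    # Stage 1: vertical filtering gives the auxiliary values d(x), h(x) or
    # n(x) for x = -5..6 (Eqs. 4-6), kept unrounded here (an assumption).
    aux = [
        sum(frame[y0 + dy][x0 + dx] * t for dy, t in zip(OFFSETS, fy))
        for dx in OFFSETS
    ]
    # Stage 2: horizontal filtering of the auxiliary values (Eqs. 7-15).
    total = sum(v * t for v, t in zip(aux, fx))
    # Both stages are scaled by 256, so normalise once by 2**16 (assumed).
    rounded = (total + (1 << 15)) >> 16
    return max(0, min((1 << bit_depth) - 1, rounded))
```

For example, `interp_2d(frame, x0, y0, 2, 2)` corresponds to the position j, obtained from h(x) with the half-pel filter.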

    In the above equations (1)-(15), f(x,1/4), f(x,2/4) and f(x,3/4) denote the 12-tap DCT-based interpolation filter coefficients at fractional positions 1/4, 2/4, and 3/4, respectively, as listed in Table 1.

    Table 1. Filter coefficients for 12-tap DCT-based interpolation filter in HE configuration

    Position    12-tap filter coefficients
    ¼ {-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1}/256
    ½ {-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1}/256
    ¾ {-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1}/256
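
Two properties of Table 1 are easy to check mechanically: each filter sums to 256 (unity gain after dividing by 256), and the ¾ filter is the ¼ filter mirrored, while the ½ filter is symmetric. A small Python sketch of that check follows; the names `TABLE_1` and `check_table` are hypothetical and chosen only for this illustration.

```python
# Coefficients copied from Table 1 (each set is divided by 256 when applied).
TABLE_1 = {
    "1/4": [-1, 5, -12, 20, -40, 229, 76, -32, 16, -8, 4, -1],
    "1/2": [-1, 8, -16, 24, -48, 161, 161, -48, 24, -16, 8, -1],
    "3/4": [-1, 4, -8, 16, -32, 76, 229, -40, 20, -12, 5, -1],
}


def check_table(table):
    """Return True if every filter has 12 taps, unity gain and the expected symmetry."""
    for position, taps in table.items():
        assert len(taps) == 12, position
        assert sum(taps) == 256, position          # flat areas are preserved
    assert table["3/4"] == table["1/4"][::-1]       # 3/4 mirrors 1/4
    assert table["1/2"] == table["1/2"][::-1]       # 1/2 is symmetric
    return True
```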

  • 6-tap Directional Interpolation Filter for luma (LC)
  • The set of interpolation filters for the low complexity configuration is referred to as the Directional Interpolation Filter (DIF) and is used for all 15 quarter-pixel positions, as shown in Figure 2. All interpolated sample values can be calculated using 16-bit arithmetic. The resulting interpolated values are scaled, clipped and stored at 8 bits per sample. Figure 2 shows the support pixels for each quarter-pixel position in different colours: the blue integer pixels support the three horizontal fractional pixels a, b and c; the light-blue integer pixels support the three vertical fractional pixels d, h and n; the deep-yellow integer pixels support the two down-right fractional pixels e and r; the light-yellow integer pixels support the two down-left fractional pixels g and p; and the purple integer pixels support the central fractional pixel j.

    Figure 2. 6-tap directional interpolation filter for luma (LC)

    For each of the three horizontal fractional positions, a, b and c, and the three vertical fractional positions, d, h and n, which are aligned with full pixel positions, a single 6-tap filter is used. The DIF coefficients are {3, -15, 111, 37, -10, 2}/128 for the ¼ position (mirrored for the ¾ position) and {3, -17, 78, 78, -17, 3}/128 for the ½ position.
    a = (3H-15I+111J+37K-10L+2M+64)>>7
    b = (3H-17I+78J+78K-17L+3M+64)>>7
    c = (2H-10I+37J+111K-15L+3M+64)>>7

    d = (3B-15E+111J+37O-10S+2W+64)>>7
    h = (3B-17E+78J+78O-17S+3W+64)>>7
    n = (2B-10E+37J+111O-15S+3W+64)>>7
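
The six axis-aligned equations share one pattern: a 6-tap weighted sum, a rounding offset of 64, a right shift by 7, and a clip to 8 bits. A minimal Python sketch of that pattern is given below; the function names are hypothetical, and the caller is assumed to pass the six support pixels in the order used by the equations (H, I, J, K, L, M for a row, or B, E, J, O, S, W for a column).

```python
DIF_QUARTER = [3, -15, 111, 37, -10, 2]      # 1/4 position, scaled by 128
DIF_HALF = [3, -17, 78, 78, -17, 3]          # 1/2 position, scaled by 128
DIF_THREE_QUARTER = DIF_QUARTER[::-1]        # 3/4 position mirrors 1/4


def dif_6tap(samples, taps):
    """6-tap weighted sum, rounded with +64, shifted by 7, clipped to 8 bits."""
    acc = sum(s * t for s, t in zip(samples, taps))
    return max(0, min(255, (acc + 64) >> 7))


def dif_axis(samples):
    """Return the (1/4, 1/2, 3/4) samples along one row or column.

    Row    [H, I, J, K, L, M] gives (a, b, c).
    Column [B, E, J, O, S, W] gives (d, h, n).
    """
    assert len(samples) == 6
    return tuple(
        dif_6tap(samples, taps)
        for taps in (DIF_QUARTER, DIF_HALF, DIF_THREE_QUARTER)
    )
```

For a step edge with H=I=J=0 and K=L=M=128, this sketch gives a=29, b=64 and c=99.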

    For the 4 innermost quarter-pixel positions, e, g, p, and r, the 6-tap filters at +45 degree and -45 degree angles are used respectively.
    e = (3A-15D+111J+37P-10U+2X+64)>>7
    g = (3C-15G+111K+37O-10R+2V+64)>>7
    p = (2C-10G+37K+111O-15R+3V+64)>>7
    r = (2A-10D+37J+111P-15U+3X+64)>>7
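
The diagonal filters can be sketched on a 6x6 patch of integer pixels. Figure 2 is not reproduced here, so the pixel layout below is inferred from the equations and should be read as an assumption: J sits at row 2, column 2 of the patch, K is to its right, O is below it and P is diagonally below-right, which puts A, D, J, P, U, X on the main diagonal and C, G, K, O, R, V on the anti-diagonal. The function name `dif_diagonal` is hypothetical.

```python
DIF_QUARTER = [3, -15, 111, 37, -10, 2]      # 1/4 position, scaled by 128
DIF_THREE_QUARTER = DIF_QUARTER[::-1]        # 3/4 position mirrors 1/4


def dif_6tap(samples, taps):
    """6-tap weighted sum, rounded with +64, shifted by 7, clipped to 8 bits."""
    acc = sum(s * t for s, t in zip(samples, taps))
    return max(0, min(255, (acc + 64) >> 7))


def dif_diagonal(patch):
    """Compute e, g, p, r from a 6x6 patch indexed as patch[row][col].

    Assumed layout (inferred from the equations): J = patch[2][2],
    K = patch[2][3], O = patch[3][2], P = patch[3][3].
    """
    main = [patch[k][k] for k in range(6)]        # A, D, J, P, U, X
    anti = [patch[k][5 - k] for k in range(6)]    # C, G, K, O, R, V
    return {
        "e": dif_6tap(main, DIF_QUARTER),
        "r": dif_6tap(main, DIF_THREE_QUARTER),
        "g": dif_6tap(anti, DIF_QUARTER),
        "p": dif_6tap(anti, DIF_THREE_QUARTER),
    }
```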

    For the other 4 innermost quarter-pixel positions, f, i, k, and q, a combination of the 6-tap filters at +45 degree and -45 degree angles, which is equivalent to a 12-tap filter, is used.
    f = (e+g+1)>>1 = ((3A-15D+111J+37P-10U+2X)+(3C-15G+111K+37O-10R+2V)+128)>>8
    i = (e+p+1)>>1 = ((3A-15D+111J+37P-10U+2X)+(2C-10G+37K+111O-15R+3V)+128)>>8
    k = (g+r+1)>>1 = ((3C-15G+111K+37O-10R+2V)+(2A-10D+37J+111P-15U+3X)+128)>>8
    q = (p+r+1)>>1 = ((2C-10G+37K+111O-15R+3V)+(2A-10D+37J+111P-15U+3X)+128)>>8
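
A sketch of the combined positions f, i, k and q in Python follows. It uses the right-hand form of the equations, where the two unrounded 6-tap sums are added and rounded once with +128 and a shift by 8; averaging the already rounded values of e, g, p and r is not guaranteed to give bit-identical results. The patch layout (J at row 2, column 2 of a 6x6 patch, main diagonal A, D, J, P, U, X, anti-diagonal C, G, K, O, R, V) is inferred from the equations, and the function names are hypothetical.

```python
DIF_QUARTER = [3, -15, 111, 37, -10, 2]      # 1/4 position, scaled by 128
DIF_THREE_QUARTER = DIF_QUARTER[::-1]        # 3/4 position mirrors 1/4


def tap_sum(samples, taps):
    """Unrounded 6-tap weighted sum (still scaled by 128)."""
    return sum(s * t for s, t in zip(samples, taps))


def combine(sum_1, sum_2):
    """Add two 6-tap sums, round once (+128, >>8) and clip to 8 bits."""
    return max(0, min(255, (sum_1 + sum_2 + 128) >> 8))


def dif_combined(patch):
    """Compute f, i, k, q from a 6x6 patch indexed as patch[row][col]."""
    main = [patch[n][n] for n in range(6)]        # A, D, J, P, U, X
    anti = [patch[n][5 - n] for n in range(6)]    # C, G, K, O, R, V
    sum_e = tap_sum(main, DIF_QUARTER)
    sum_r = tap_sum(main, DIF_THREE_QUARTER)
    sum_g = tap_sum(anti, DIF_QUARTER)
    sum_p = tap_sum(anti, DIF_THREE_QUARTER)
    return {
        "f": combine(sum_e, sum_g),
        "i": combine(sum_e, sum_p),
        "k": combine(sum_g, sum_r),
        "q": combine(sum_p, sum_r),
    }
```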

    The exception is the central position, j, where a 12-tap non-separable filter is used. The filter coefficients of DIF are {0, 5, 5, 0; 5, 22, 22, 5; 5, 22, 22, 5; 0, 5, 5, 0}/128 for the central position j.
    j = ((5E+5F)+(5I+22J+22K+5L)+(5N+22O+22P+5Q)+(5S+5T)+64)>>7
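
The central position can be sketched as a small 2-D weighted sum over a 4x4 block. In the Python sketch below, the block rows are assumed to be ordered as in the coefficient matrix, that is [·, E, F, ·], [I, J, K, L], [N, O, P, Q], [·, S, T, ·], where the corner entries carry zero weight; the names `J_KERNEL` and `dif_center_j` are hypothetical.

```python
# Non-separable kernel for position j, scaled by 128; 12 taps are non-zero.
J_KERNEL = [
    [0, 5, 5, 0],
    [5, 22, 22, 5],
    [5, 22, 22, 5],
    [0, 5, 5, 0],
]


def dif_center_j(block):
    """Compute j from a 4x4 block of integer pixels indexed as block[row][col].

    The corner pixels have zero weight, so their values do not matter.
    """
    acc = sum(
        J_KERNEL[r][c] * block[r][c]
        for r in range(4)
        for c in range(4)
    )
    # Same rounding and clipping pattern as the other DIF positions.
    return max(0, min(255, (acc + 64) >> 7))
```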

    [To be continued...]
