4. Rate-Distortion Optimized Quantization
Previously, adaptive rounding was proposed to improve quantization, which captures the statistics of the incoming residual signal and adjusts the rounding offsets accordingly. However, the adaptive rounding quantization is still based on the criterion which minimizes the mean-squared quantization error between the original signal and the quantization reconstructed signal. From the sense of rate-distortion optimization, the cost from the rate should also be considered.
The basic idea underlying the rate-distortion optimized quantization is to minimize a cost function D+ λR such that both the rate R and the distortion D are considered in coding decisions. For quantization case, the RD optimal coding is to solve a minimization problem of
(7)
where S is the original signal, and T-1 denotes the inverse transform operation. Consider that the DCT is a unitary transform, which maintains the Euclidean distance. We use the Euclidean distance for D(.) to avoid the inverse DCT computation. Thus, the above equation becomes
(8)
where W=T(S), and T is the forward transform operation.
A simplified rate-distortion optimized quantization scheme is proposed and implemented in KTA software. In the RDO-Q, assuming the transform coefficients before quantization are Wi, (i=0,…,M-1), then the quantized coefficients/levels Zi (i=0,…,M-1) are calculated as follows:
- For a given coefficient position k, k=M-1,…,0, assume that coefficient Wk is the last significant coefficient in the block:
- For each coefficient Wi, i=k-1,…,0, calculate its Lagrangian cost when the quantized value Zi is equal to 0, Zfloor and Zceil, as shown in Figure 7. The Lagrangian cost Jk,i(λ,Zi) when coefficientWi is quantized to Zi is calculated as:
(9)
where err(Wi , Zi) is the quantization error if the coefficient Wi is quantized to value Zi, measured by Euclidean distance; And bits(Zi) is the number of bits needed to code Zi. The value of Zfloor andZceil are defined as:
(10)
(11)
- For each coefficient Wi, i=k-1,…,0, calculate its Lagrangian cost when the quantized value Zi is equal to 0, Zfloor and Zceil, as shown in Figure 7. The Lagrangian cost Jk,i(λ,Zi) when coefficientWi is quantized to Zi is calculated as:
- Let the final quantized level Zi,opt=argmin Jk,i(λ,Zi) and update Lagrangian cost Jk(λ) usingJk,i(λ,Zopt).
- The final quantized vector of quantized coefficients Zopt=argmin Jk(λ).

Figure 7. Possible quantized values in RDO-Q
To speed up the algorithm the following simplifications are made:
- If coefficient Wk is closer (measured as absolute difference between the coefficient and its reconstructed value) to Zfloor than to Zceil only value Zfloor is considered.
- If coefficient Wk is closer to Zfloor than to Zceil, and Zfloor is equal to zero, coefficient Wk can not be the last nonzero coefficient. Hence the calculation of Lagrangian cost Jk(λ) is skipped for this value of k.
- The calculation of Jk(λ) is stopped when Jk(λ) starts to increase with decreasing k.
Here, we only discuss the entropy coding by using CABAC, and for the CAVLC a similar estimation method for the number of bits can be applied. To estimate the number of bits required to code a coefficients, we use the tabularized values of entropy of the probabilities corresponding to states in CABAC arithmetic coding engine. But, the CABAC algorithm uses the context modeling. That is, the coding of current coefficient in a block is related to the state of previous coded coefficients.
In general, CABAC consists of three steps, i.e.,
- Binarization. The so-called UEG0 algorithm is used to convert non-zero transform coefficients into a binary representation so that the binary arithmetic coding engine can be used to code them.
- Context modeling. CABAC defines a probability model for each binary bit.
- Binary arithmetic coding. The binary representation is encoded bit by bit using corresponding models.
The RDO-Q is closely related to the context modeling for residual coding. Residual coding by CABAC includes two parts, i.e., coding a so-called significance map and coding non-zero coefficients. Given a zigzag ordered sequence of transform coefficients, its significance map is a binary sequence which indicates the occurrence and location of the non-zero coefficients. The context modeling for coding the significance map is associated with the zig-zag order, and is easy to be included in RDO-Q. The context modeling for coding non-zero coefficients, however, is complicated. For a given sequence, there are in total 10 contexts for coding non-zero coefficients, with 5 of them for coding the first bit of a binary representation and the other 5 dedicated to coding the second to 14th bits. Briefly, contexts are selected as follows,
1. Scan the sequence in the inverse order to initiate two parameters as NumLg1 and NumEq1. NumLg1 is the number of coefficients that are greater than 1 while NumEq1 accords to those equal to one.
2. The context for the first bit is determined by
(12)
3. The context for the 2-14th bits is selected by
(13)
There is also a bypass mode with a fixed distribution. Other bits in the binary representation use the bypass mode.

Figure 8. The graph structure for RDO quantization based on CABAC in H.264/AVC
In order to solve the minimization problem, a graph-based algorithm is used to address the computation of the rate function R(.) of CABAC. As shown in Figure 8, the graph is constructed based on coding features of CABAC. Basically, states are defined based on the context model selection. Thus, states are named by values of NumEq1and NumLg1, as of NumEq1_NumLg1 , e.g., 2_0 accords to NumEq1 = 2 and NumLg1 = 0. When NumLg1 > 0, the context is irrelevant to NumEq1. Thus, there are three states as X_1, X_2, and X_3. The context is fixed for allNumLg1 ≥4. Accordingly, one state X_X is defined. For a 4×4 luma block, there are 16 columns, each of them corresponding to one coefficient. In each column there are up to 8 states. Transitions are established between states according to the increase of NumEq1 and NumLg1 , e.g., the state 1_0 is connected to 1_0, 2_0 and X_1according to a quantization output of 0, 1, or greater than 1, respectively. In case that the quantization output is greater than 1, parallel transitions are established so that each accords to a unique value. In practice, because the distortion is a quadratic function with respect to the quantization output, it is sufficient to investigate only a few parallel transitions. Thus the complexity is greatly reduced without sacrificing the optimality. Finally, a graph structure as shown in Figure 8 is obtained.
The optimal RDO quantization design now becomes a problem to search for a path in the graph for the minimal RD cost. It is not hard to see that the above graph design allows an element-wise additive computation of the RD cost. The Viterbi algorithm is then used to do the search, which leads to the solution of the minimization problem.
References:
[1]. T Wedi, S Wittmann, “Quantization offsets for video coding”, IEEE International Symposium on Circuits and Systems, 2005.
[2]. Gary J. Sullivan, “Adaptive quantization encoding technique using an equal expected-value rule”, Hong Kong JVT meeting contribution JVT-N011, January 2005.
[3]. E.-h. Yang and X. Yang, “Rate distortion optimization of H.264 with main profile compatibility,” pp282 – 286, ISIT, July 2006
[4]. M. Karczewicz, Y. Ye and I. Chong, Rate distortion optimized quantization, VCEG-AH21, Qualcomm,Jan. 2008
Permanent Link: Quantization Techniques in JM/KTA – Part 4
Post Comment