Opened 4 years ago
Closed 4 years ago
#1261 closed defect (fixed)
mismatch on inverse transform shift in 1-D transform
Reported by: | zhaoyin | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | spec | Version: | VVC D10 vE |
Keywords: | Cc: | ksuehring, bbross, XiangLi, fbossen, jvet@… |
Description
In Spec vE 8.7.4.1, the bdShift after inverse transform, i.e., equation (1154), is "20 - BitDepth", no matter whether it is 1-D or 2-D transform.
5. The (nTbW)x(nTbH) array res[ x ][ y ] of residual samples with x = 0..nTbW − 1, y = 0..nTbH − 1 is derived as follows: bdShift = Max( 20 − BitDepth, 0 ) (1154) res[ x ][ y ] = ( r[ x ][ y ] + ( 1 << ( bdShift − 1 ) ) ) >> bdShift (1155)
In VTM software, function xIT(), shift = shift_2nd = 20 - BitDepth. But in the cases of 1-D transform, the shift value input to the fastInvTrans function is “shift + 1”, while in 2-D transform, the input value is shift_2nd.
if( width > 1 && height > 1 ) //2-D transform { const int shift_1st = TRANSFORM_MATRIX_SHIFT + 1 + COM16_C806_TRANS_PREC; // 1 has been added to shift_1st at the expense of shift_2nd const int shift_2nd = ( TRANSFORM_MATRIX_SHIFT + maxLog2TrDynamicRange - 1 ) - bitDepth + COM16_C806_TRANS_PREC; CHECK( shift_1st < 0, "Negative shift" ); CHECK( shift_2nd < 0, "Negative shift" ); TCoeff *tmp = ( TCoeff * ) alloca( width * height * sizeof( TCoeff ) ); fastInvTrans[trTypeVer][transformHeightIndex](pCoeff.buf, tmp, shift_1st, width, skipWidth, skipHeight, clipMinimum, clipMaximum); fastInvTrans[trTypeHor][transformWidthIndex] (tmp, block, shift_2nd, height, 0, skipWidth, clipMinimum, clipMaximum); } else if( width == 1 ) //1-D vertical transform { int shift = ( TRANSFORM_MATRIX_SHIFT + maxLog2TrDynamicRange - 1 ) - bitDepth + COM16_C806_TRANS_PREC; CHECK( shift < 0, "Negative shift" ); CHECK( ( transformHeightIndex < 0 ), "There is a problem with the height." ); fastInvTrans[trTypeVer][transformHeightIndex]( pCoeff.buf, block, shift + 1, 1, 0, skipHeight, clipMinimum, clipMaximum ); } else //if(iHeight == 1) //1-D horizontal transform { const int shift = ( TRANSFORM_MATRIX_SHIFT + maxLog2TrDynamicRange - 1 ) - bitDepth + COM16_C806_TRANS_PREC; CHECK( shift < 0, "Negative shift" ); CHECK( ( transformWidthIndex < 0 ), "There is a problem with the width." ); fastInvTrans[trTypeHor][transformWidthIndex]( pCoeff.buf, block, shift + 1, 1, 0, skipWidth, clipMinimum, clipMaximum ); }
Therefore, there seems a mismatch between the spec and VTM on the shift value in 1-D transform.
Change history (3)
comment:1 Changed 4 years ago by deluxan
comment:2 Changed 4 years ago by fbossen
I agree with Santi.
It indeed looks like a mismatch.
For 2-d transforms, we shift down by 7 bits between vertical and horizontal transforms:
g[ x ][ y ] = Clip3( CoeffMin, CoeffMax, ( e[ x ][ y ] + 64 ) >> 7 ) (1153)
which is one more bit than the transform coefficients require.
For 1-d transforms, we also need to shift this extra bit at some point, which is why the software uses shift + 1 as input to fastInvTrans.
The text should be changed to reflect that behaviour:
if nTbH and nTbW are both greater than 1, bdShift = 20 - BitDepth
otherwise, bdShift = 21 - BitDepth
(no need for a Max function here)
comment:3 Changed 4 years ago by bbross
- Resolution set to fixed
- Status changed from new to closed
Thanks for brining that up and confirming. Fixed for JVET-S2001-vG as follows:
bdShift = ( nTbH > 1 && nTbW > 1 ) ? ( 20 − BitDepth ) : ( 21 − BitDepth )
The scaling for the 2-D inverse transform is given by:
Therefore, the inverse quantizer expects an input signal that is scaled by the amount
2-15+B+(log2(W)+log2(H))/2.
If the transform is only 1-D (for example, H=1), then the quantizer will expect a scaling of
2-15+B+(log2(W)+log2(1))/2=2(-15+B+log2(W)/2)
Now let us take a look at the scaling factors for a 1-D transform (H=1):
It can be observed that the total scaling is 2-14+B+(log2(W))/2, which is obviously different from the result expected by the inv. quantizer, 2-15+B+(log2(W))/2. Therefore, it is necessary to add a "+1" in the scaling for the 1-D case, so that the resulting scaling factor is identical regardless of being a 1-D or 2-D transform.
The reason why there is this "+1" difference is due to the fact that the shift of the first transform "borrows" a "+1" from the second shift (where it is subtracted). This is the reason why the scaling after the first inverse transform stage in the 2-D case is "2-7" instead of "2-6". In a 1-D case this is evidently not possible. Hence the "+1" in the shift variable for the 1-D path.
I would suggest to modify the spec text by making the variable bdShift depend on nTbW and nTbH. For example,
s = (nTbH == 1 | | nTbW == 1) ? 1 : 0.
bdShift = Max( 20 − BitDepth + s, 0 )