Opened 4 years ago

Closed 4 years ago

# mismatch on inverse transform shift in 1-D transform

Reported by: Owned by: zhaoyin major spec VVC D10 vE ksuehring, bbross, XiangLi, fbossen, jvet@…

### Description

In Spec vE 8.7.4.1, the bdShift after inverse transform, i.e., equation (1154), is "20 - BitDepth", no matter whether it is 1-D or 2-D transform.

```5.	The (nTbW)x(nTbH) array res[ x ][ y ] of residual samples with x = 0..nTbW − 1, y = 0..nTbH − 1 is derived as follows:
bdShift = Max( 20 − BitDepth, 0 )		(1154)
res[ x ][ y ] = ( r[ x ][ y ] + ( 1  <<  ( bdShift − 1 ) ) )  >>  bdShift	(1155)

```

In VTM software, function xIT(), shift = shift_2nd = 20 - BitDepth. But in the cases of 1-D transform, the shift value input to the fastInvTrans function is “shift + 1”, while in 2-D transform, the input value is shift_2nd.

```  if( width > 1 && height > 1 ) //2-D transform
{
const int      shift_1st              =   TRANSFORM_MATRIX_SHIFT + 1 + COM16_C806_TRANS_PREC; // 1 has been added to shift_1st at the expense of shift_2nd
const int      shift_2nd              = ( TRANSFORM_MATRIX_SHIFT + maxLog2TrDynamicRange - 1 ) - bitDepth + COM16_C806_TRANS_PREC;
CHECK( shift_1st < 0, "Negative shift" );
CHECK( shift_2nd < 0, "Negative shift" );
TCoeff *tmp = ( TCoeff * ) alloca( width * height * sizeof( TCoeff ) );
fastInvTrans[trTypeVer][transformHeightIndex](pCoeff.buf, tmp, shift_1st, width, skipWidth, skipHeight, clipMinimum, clipMaximum);
fastInvTrans[trTypeHor][transformWidthIndex] (tmp,      block, shift_2nd, height,         0, skipWidth, clipMinimum, clipMaximum);
}
else if( width == 1 ) //1-D vertical transform
{
int shift = ( TRANSFORM_MATRIX_SHIFT + maxLog2TrDynamicRange - 1 ) - bitDepth + COM16_C806_TRANS_PREC;
CHECK( shift < 0, "Negative shift" );
CHECK( ( transformHeightIndex < 0 ), "There is a problem with the height." );
fastInvTrans[trTypeVer][transformHeightIndex]( pCoeff.buf, block, shift + 1, 1, 0, skipHeight, clipMinimum, clipMaximum );
}
else //if(iHeight == 1) //1-D horizontal transform
{
const int      shift              = ( TRANSFORM_MATRIX_SHIFT + maxLog2TrDynamicRange - 1 ) - bitDepth + COM16_C806_TRANS_PREC;
CHECK( shift < 0, "Negative shift" );
CHECK( ( transformWidthIndex < 0 ), "There is a problem with the width." );
fastInvTrans[trTypeHor][transformWidthIndex]( pCoeff.buf, block, shift + 1, 1, 0, skipWidth, clipMinimum, clipMaximum );
}
```

Therefore, there seems a mismatch between the spec and VTM on the shift value in 1-D transform.

### comment:1 Changed 4 years ago by deluxan

The scaling for the 2-D inverse transform is given by:

Exponent of the Scale factor 6+log2(W)/2 -7 6+log2(H)/2 -20+B -15+B+(log2(W)+log2(H))/2

Therefore, the inverse quantizer expects an input signal that is scaled by the amount

2-15+B+(log2(W)+log2(H))/2.

If the transform is only 1-D (for example, H=1), then the quantizer will expect a scaling of

2-15+B+(log2(W)+log2(1))/2=2(-15+B+log2(W)/2)

Now let us take a look at the scaling factors for a 1-D transform (H=1):

Exponent of the Scale factor 6+log2(W)/2 -20+B -14+B+(log2(W))/2

It can be observed that the total scaling is 2-14+B+(log2(W))/2, which is obviously different from the result expected by the inv. quantizer, 2-15+B+(log2(W))/2. Therefore, it is necessary to add a "+1" in the scaling for the 1-D case, so that the resulting scaling factor is identical regardless of being a 1-D or 2-D transform.

The reason why there is this "+1" difference is due to the fact that the shift of the first transform "borrows" a "+1" from the second shift (where it is subtracted). This is the reason why the scaling after the first inverse transform stage in the 2-D case is "2-7" instead of "2-6". In a 1-D case this is evidently not possible. Hence the "+1" in the shift variable for the 1-D path.

I would suggest to modify the spec text by making the variable bdShift depend on nTbW and nTbH. For example,

s = (nTbH == 1 | | nTbW == 1) ? 1 : 0.
bdShift = Max( 20 − BitDepth + s, 0 )

### comment:2 Changed 4 years ago by fbossen

I agree with Santi.

It indeed looks like a mismatch.

For 2-d transforms, we shift down by 7 bits between vertical and horizontal transforms:

g[ x ][ y ] = Clip3( CoeffMin, CoeffMax, ( e[ x ][ y ] + 64 ) >> 7 ) (1153)

which is one more bit than the transform coefficients require.

For 1-d transforms, we also need to shift this extra bit at some point, which is why the software uses shift + 1 as input to fastInvTrans.

The text should be changed to reflect that behaviour:
if nTbH and nTbW are both greater than 1, bdShift = 20 - BitDepth
otherwise, bdShift = 21 - BitDepth
(no need for a Max function here)

### comment:3 Changed 4 years ago by bbross

• Resolution set to fixed
• Status changed from new to closed

Thanks for brining that up and confirming. Fixed for JVET-S2001-vG as follows:

bdShift = ( nTbH > 1 && nTbW > 1 ) ? ( 20 − BitDepth ) : ( 21 − BitDepth )

Note: See TracTickets for help on using tickets.