Opened 3 years ago

Closed 3 years ago

#1495 closed defect (fixed)

Encoder crashes for multi-layer encoding and GDR_ENABLED=1

Reported by: ksuehring Owned by:
Priority: minor Milestone:
Component: VTM Version: VTM-13.0
Keywords: Cc: ksuehring, XiangLi, fbossen, jvet@…

Description

A crash occurs when encoding when encoding multiple layer. Example config:

EncoderApp -c ../cfg/encoder_randomaccess_vtm.cfg -c ../cfg/multi-layer/two_layers.cfg -l0 -c ../cfg/per-sequence/RaceHorses.cfg -l1 -c ../cfg/per-sequence/RaceHorsesC.cfg -ipp ../../../../svn/origCfP/ -f 33 --SEIDecodedPictureHash=1 --Level=4

Crash as follows:

POC   16 LId:  1 TId: 1 ( RASL, B-SLICE, QP 34 )      88432 bits [Y 34.1698 dB    U 36.8642 dB    V 37.7834 dB] [ET   194 ] [L0 32 16(0.50x, 0.50x).0] [L1 32c 16(0.50x, 0.50x).0] [MD5:132868e47dec0e204c688abd15f532c2,58f23c1d14b7d67cc709a602858a1e13,43a1f2dffd1359704d7dc3ae9ec1d58d]
EncoderApp(88093,0x1153e9e00) malloc: Incorrect checksum for freed object 0x7fd857d8e200: probably modified after being freed.
Corrupt value: 0x0
EncoderApp(88093,0x1153e9e00) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6

It seems setting GDR_ENABLED=0 avoids the crash.

Change history (8)

comment:1 Changed 3 years ago by ksuehring

This crash is related to using more than one PictureHeader object and thus related to MRs !2054 and !2055.

The code is trying to access a picture header to get the GDR pic flag (EncGOP.cpp:3336), but that picture header was just freed a bit earlier in the code when the list of scaled reference pictures is deleted (EncGOP.cpp:3302)

I think, I would prefer to stick with the logic of keeping only one picture header and duplicate the necessary fields into the picture (as discussed as one option in the linked MRs), because creating multiple headers seems to have unexpected side effects.

If we decide to switch to one picture header per picture, existing duplications should be removed as well.

Full address sanitiser log below:

==91506==ERROR: AddressSanitizer: heap-use-after-free on address 0x6220060bd11e at pc 0x000100d0a95d bp 0x7ffeefbe83e0 sp 0x7ffeefbe83d8
READ of size 1 at 0x6220060bd11e thread T0
    #0 0x100d0a95c in PicHeader::getGdrPicFlag() const Slice.h:2451
    #1 0x100cfdc29 in EncGOP::compressGOP(int, int, std::__1::list<Picture*, std::__1::allocator<Picture*> >&, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, bool, bool, InputColourSpaceConversion, bool, bool, bool, int) EncGOP.cpp:3336
    #2 0x100db771f in EncLib::encode(InputColourSpaceConversion, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, int&) EncLib.cpp:646
    #3 0x100042d0c in EncApp::encode() EncApp.cpp:1348
    #4 0x1001a2e0f in main encmain.cpp:266
    #5 0x7fff204c5f5c in start+0x0 (libdyld.dylib:x86_64+0x15f5c)

0x6220060bd11e is located 30 bytes inside of 5136-byte region [0x6220060bd100,0x6220060be510)
freed by thread T0 here:
    #0 0x10444105d in wrap__ZdlPv+0x7d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x5505d)
    #1 0x1005d7467 in PicHeader::~PicHeader() Slice.cpp:2677
    #2 0x10022b3c9 in CodingStructure::destroy() CodingStructure.cpp:141
    #3 0x100456d9b in Picture::destroy() Picture.cpp:111
    #4 0x1005fd690 in Slice::freeScaledRefPicList(Picture**) Slice.cpp:4459
    #5 0x100cfd0ef in EncGOP::compressGOP(int, int, std::__1::list<Picture*, std::__1::allocator<Picture*> >&, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, bool, bool, InputColourSpaceConversion, bool, bool, bool, int) EncGOP.cpp:3302
    #6 0x100db771f in EncLib::encode(InputColourSpaceConversion, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, int&) EncLib.cpp:646
    #7 0x100042d0c in EncApp::encode() EncApp.cpp:1348
    #8 0x1001a2e0f in main encmain.cpp:266
    #9 0x7fff204c5f5c in start+0x0 (libdyld.dylib:x86_64+0x15f5c)

previously allocated by thread T0 here:
    #0 0x104440c3d in wrap__Znwm+0x7d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x54c3d)
    #1 0x100db407e in EncLib::encodePrep(bool, PelStorage*, PelStorage*, PelStorage*, InputColourSpaceConversion, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, int&) EncLib.cpp:597
    #2 0x100041f8a in EncApp::encodePrep(bool&) EncApp.cpp:1330
    #3 0x1001a2c3b in main encmain.cpp:239
    #4 0x7fff204c5f5c in start+0x0 (libdyld.dylib:x86_64+0x15f5c)

SUMMARY: AddressSanitizer: heap-use-after-free Slice.h:2451 in PicHeader::getGdrPicFlag() const
Shadow bytes around the buggy address:
  0x1c4400c179d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4400c179e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4400c179f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4400c17a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4400c17a10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x1c4400c17a20: fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4400c17a30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4400c17a40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4400c17a50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4400c17a60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4400c17a70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
2021-07-16 19:40:30.582410+0200 EncoderApp[91506:25561364] =================================================================
2021-07-16 19:40:30.582549+0200 EncoderApp[91506:25561364] ==91506==ERROR: AddressSanitizer: heap-use-after-free on address 0x6220060bd11e at pc 0x000100d0a95d bp 0x7ffeefbe83e0 sp 0x7ffeefbe83d8
2021-07-16 19:40:30.582596+0200 EncoderApp[91506:25561364] READ of size 1 at 0x6220060bd11e thread T0
2021-07-16 19:40:30.582656+0200 EncoderApp[91506:25561364]     #0 0x100d0a95c in PicHeader::getGdrPicFlag() const Slice.h:2451
2021-07-16 19:40:30.582720+0200 EncoderApp[91506:25561364]     #1 0x100cfdc29 in EncGOP::compressGOP(int, int, std::__1::list<Picture*, std::__1::allocator<Picture*> >&, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, bool, bool, InputColourSpaceConversion, bool, bool, bool, int) EncGOP.cpp:3336
2021-07-16 19:40:30.582800+0200 EncoderApp[91506:25561364]     #2 0x100db771f in EncLib::encode(InputColourSpaceConversion, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, int&) EncLib.cpp:646
2021-07-16 19:40:30.582878+0200 EncoderApp[91506:25561364]     #3 0x100042d0c in EncApp::encode() EncApp.cpp:1348
2021-07-16 19:40:30.582941+0200 EncoderApp[91506:25561364]     #4 0x1001a2e0f in main encmain.cpp:266
2021-07-16 19:40:30.583005+0200 EncoderApp[91506:25561364]     #5 0x7fff204c5f5c in start+0x0 (libdyld.dylib:x86_64+0x15f5c)
2021-07-16 19:40:30.583066+0200 EncoderApp[91506:25561364] 
2021-07-16 19:40:30.583125+0200 EncoderApp[91506:25561364] 0x6220060bd11e is located 30 bytes inside of 5136-byte region [0x6220060bd100,0x6220060be510)
2021-07-16 19:40:30.583182+0200 EncoderApp[91506:25561364] freed by thread T0 here:
2021-07-16 19:40:30.583270+0200 EncoderApp[91506:25561364]     #0 0x10444105d in wrap__ZdlPv+0x7d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x5505d)
2021-07-16 19:40:30.583354+0200 EncoderApp[91506:25561364]     #1 0x1005d7467 in PicHeader::~PicHeader() Slice.cpp:2677
2021-07-16 19:40:30.583459+0200 EncoderApp[91506:25561364]     #2 0x10022b3c9 in CodingStructure::destroy() CodingStructure.cpp:141
2021-07-16 19:40:30.583560+0200 EncoderApp[91506:25561364]     #3 0x100456d9b in Picture::destroy() Picture.cpp:111
2021-07-16 19:40:30.583633+0200 EncoderApp[91506:25561364]     #4 0x1005fd690 in Slice::freeScaledRefPicList(Picture**) Slice.cpp:4459
2021-07-16 19:40:30.583696+0200 EncoderApp[91506:25561364]     #5 0x100cfd0ef in EncGOP::compressGOP(int, int, std::__1::list<Picture*, std::__1::allocator<Picture*> >&, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, bool, bool, InputColourSpaceConversion, bool, bool, bool, int) EncGOP.cpp:3302
2021-07-16 19:40:30.583772+0200 EncoderApp[91506:25561364]     #6 0x100db771f in EncLib::encode(InputColourSpaceConversion, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, int&) EncLib.cpp:646
2021-07-16 19:40:30.583846+0200 EncoderApp[91506:25561364]     #7 0x100042d0c in EncApp::encode() EncApp.cpp:1348
2021-07-16 19:40:30.583914+0200 EncoderApp[91506:25561364]     #8 0x1001a2e0f in main encmain.cpp:266
2021-07-16 19:40:30.583984+0200 EncoderApp[91506:25561364]     #9 0x7fff204c5f5c in start+0x0 (libdyld.dylib:x86_64+0x15f5c)
2021-07-16 19:40:30.584049+0200 EncoderApp[91506:25561364] 
2021-07-16 19:40:30.584103+0200 EncoderApp[91506:25561364] previously allocated by thread T0 here:
2021-07-16 19:40:30.584158+0200 EncoderApp[91506:25561364]     #0 0x104440c3d in wrap__Znwm+0x7d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x54c3d)
2021-07-16 19:40:30.584196+0200 EncoderApp[91506:25561364]     #1 0x100db407e in EncLib::encodePrep(bool, PelStorage*, PelStorage*, PelStorage*, InputColourSpaceConversion, std::__1::list<UnitBuf<short>*, std::__1::allocator<UnitBuf<short>*> >&, int&) EncLib.cpp:597
2021-07-16 19:40:30.584270+0200 EncoderApp[91506:25561364]     #2 0x100041f8a in EncApp::encodePrep(bool&) EncApp.cpp:1330
2021-07-16 19:40:30.584338+0200 EncoderApp[91506:25561364]     #3 0x1001a2c3b in main encmain.cpp:239
2021-07-16 19:40:30.584388+0200 EncoderApp[91506:25561364]     #4 0x7fff204c5f5c in start+0x0 (libdyld.dylib:x86_64+0x15f5c)
2021-07-16 19:40:30.584484+0200 EncoderApp[91506:25561364] 
2021-07-16 19:40:30.584549+0200 EncoderApp[91506:25561364] SUMMARY: AddressSanitizer: heap-use-after-free Slice.h:2451 in PicHeader::getGdrPicFlag() const
2021-07-16 19:40:30.584610+0200 EncoderApp[91506:25561364] Shadow bytes around the buggy address:
2021-07-16 19:40:30.584667+0200 EncoderApp[91506:25561364]   0x1c4400c179d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
2021-07-16 19:40:30.584728+0200 EncoderApp[91506:25561364]   0x1c4400c179e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
2021-07-16 19:40:30.584787+0200 EncoderApp[91506:25561364]   0x1c4400c179f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
2021-07-16 19:40:30.584834+0200 EncoderApp[91506:25561364]   0x1c4400c17a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
2021-07-16 19:40:30.585002+0200 EncoderApp[91506:25561364]   0x1c4400c17a10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
2021-07-16 19:40:30.585052+0200 EncoderApp[91506:25561364] =>0x1c4400c17a20: fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd
2021-07-16 19:40:30.585093+0200 EncoderApp[91506:25561364]   0x1c4400c17a30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
2021-07-16 19:40:30.585156+0200 EncoderApp[91506:25561364]   0x1c4400c17a40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
2021-07-16 19:40:30.585251+0200 EncoderApp[91506:25561364]   0x1c4400c17a50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
2021-07-16 19:40:30.585397+0200 EncoderApp[91506:25561364]   0x1c4400c17a60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
2021-07-16 19:40:30.585475+0200 EncoderApp[91506:25561364]   0x1c4400c17a70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
2021-07-16 19:40:30.585526+0200 EncoderApp[91506:25561364] Shadow byte legend (one shadow byte represents 8 application bytes):
2021-07-16 19:40:30.585564+0200 EncoderApp[91506:25561364]   Addressable:           00
2021-07-16 19:40:30.585597+0200 EncoderApp[91506:25561364]   Partially addressable: 01 02 03 04 05 06 07
2021-07-16 19:40:30.585640+0200 EncoderApp[91506:25561364]   Heap left redzone:       fa
2021-07-16 19:40:30.585724+0200 EncoderApp[91506:25561364]   Freed heap region:       fd
2021-07-16 19:40:30.585776+0200 EncoderApp[91506:25561364]   Stack left redzone:      f1
2021-07-16 19:40:30.585811+0200 EncoderApp[91506:25561364]   Stack mid redzone:       f2
2021-07-16 19:40:30.585886+0200 EncoderApp[91506:25561364]   Stack right redzone:     f3
2021-07-16 19:40:30.585958+0200 EncoderApp[91506:25561364]   Stack after return:      f5
2021-07-16 19:40:30.586019+0200 EncoderApp[91506:25561364]   Stack use after scope:   f8
2021-07-16 19:40:30.586074+0200 EncoderApp[91506:25561364]   Global redzone:          f9
2021-07-16 19:40:30.586131+0200 EncoderApp[91506:25561364]   Global init order:       f6
2021-07-16 19:40:30.586176+0200 EncoderApp[91506:25561364]   Poisoned by user:        f7
2021-07-16 19:40:30.586231+0200 EncoderApp[91506:25561364]   Container overflow:      fc
2021-07-16 19:40:30.586320+0200 EncoderApp[91506:25561364]   Array cookie:            ac
2021-07-16 19:40:30.586376+0200 EncoderApp[91506:25561364]   Intra object redzone:    bb
2021-07-16 19:40:30.586415+0200 EncoderApp[91506:25561364]   ASan internal:           fe
2021-07-16 19:40:30.586448+0200 EncoderApp[91506:25561364]   Left alloca redzone:     ca
2021-07-16 19:40:30.586500+0200 EncoderApp[91506:25561364]   Right alloca redzone:    cb
2021-07-16 19:40:30.586544+0200 EncoderApp[91506:25561364]   Shadow gap:              cc
==91506==ABORTING

comment:2 Changed 3 years ago by seuhong

I agree with your observation. However, I believe it is necessary to keep one picture header per picture for the following reason.

Position of virtual boundary, used as a border between clean/dirty area, moves over GDR period to progressively refresh clean area. In VVC, virtual boundary position is signaled thru picture header therefore one picture header cannot be shared by multiple GDR recovering pictures.

I will fix the crash and submit a MR when it is ready. Thank you.

comment:3 Changed 3 years ago by ksuehring

Can't these virtual boundaries be copied into the Picture structure?

comment:4 Changed 3 years ago by seuhong

Since virtual boundary is part of picHeader, moving it to Picture structure requires many changes.

The root cause of this crash is that same picHeader is being shared by two different picture structures (scaledRefPic, curPic).
Once the scaledRefPic is deallocated by freeScaledRefPicList(), the picHeader of curPic will become invalid and consequently it will get crash when any field of picHeader gets accessed by codePicHeader().

The fix for this ticket is allocate a picHeader to scaledRefPic instead of sharing same picHeader with curPic.

MR for this change could be found from
https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/merge_requests/2095

comment:5 Changed 3 years ago by ksuehring

My suggestion was to copy the boundaries, not to move them. We already copy several parameters from SPS/PPS/PH which are required for filtering or output, when the original structures are not available anymore. So this would not be something new.

Also in the common use case, where virtual boundaries are not used, this would only add an empty vector, which has almost no additions memory requirement.

On the other hand, storing PHs for each Picture causes increased memory usage, also for all cases where GDR is not enabled.

comment:6 Changed 3 years ago by seuhong

I believe it is more natural to have PH for each picture than copy fields to picture structure for following reasons:

  1. I believe PH was designed for 1 PH for 1 Picture, just like 1 SH for 1 Slice
  2. Copying VB to Picture structure is redundant to the picture which does not need VB
  3. Even though VB are not used, VB position may add empty vector but VBEnabledFlag, VBEnabledFlag, numVerVB, numHorVB still take up some space
  4. Storing PH for each picture may increase memory usage a little bit but up to the PH * GOP size because the life cycle of PH is same as Picture structure

comment:8 Changed 3 years ago by fbossen

  • Resolution set to fixed
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.