Opened 3 years ago
Closed 3 years ago
#1495 closed defect (fixed)
Encoder crashes for multi-layer encoding and GDR_ENABLED=1
Reported by: | ksuehring | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | VTM | Version: | VTM-13.0 |
Keywords: | Cc: | ksuehring, XiangLi, fbossen, jvet@… |
Description
A crash occurs when encoding when encoding multiple layer. Example config:
EncoderApp -c ../cfg/encoder_randomaccess_vtm.cfg -c ../cfg/multi-layer/two_layers.cfg -l0 -c ../cfg/per-sequence/RaceHorses.cfg -l1 -c ../cfg/per-sequence/RaceHorsesC.cfg -ipp ../../../../svn/origCfP/ -f 33 --SEIDecodedPictureHash=1 --Level=4
Crash as follows:
POC 16 LId: 1 TId: 1 ( RASL, B-SLICE, QP 34 ) 88432 bits [Y 34.1698 dB U 36.8642 dB V 37.7834 dB] [ET 194 ] [L0 32 16(0.50x, 0.50x).0] [L1 32c 16(0.50x, 0.50x).0] [MD5:132868e47dec0e204c688abd15f532c2,58f23c1d14b7d67cc709a602858a1e13,43a1f2dffd1359704d7dc3ae9ec1d58d] EncoderApp(88093,0x1153e9e00) malloc: Incorrect checksum for freed object 0x7fd857d8e200: probably modified after being freed. Corrupt value: 0x0 EncoderApp(88093,0x1153e9e00) malloc: *** set a breakpoint in malloc_error_break to debug Abort trap: 6
It seems setting GDR_ENABLED=0 avoids the crash.
Change history (8)
comment:1 Changed 3 years ago by ksuehring
comment:2 Changed 3 years ago by seuhong
I agree with your observation. However, I believe it is necessary to keep one picture header per picture for the following reason.
Position of virtual boundary, used as a border between clean/dirty area, moves over GDR period to progressively refresh clean area. In VVC, virtual boundary position is signaled thru picture header therefore one picture header cannot be shared by multiple GDR recovering pictures.
I will fix the crash and submit a MR when it is ready. Thank you.
comment:3 Changed 3 years ago by ksuehring
Can't these virtual boundaries be copied into the Picture structure?
comment:4 Changed 3 years ago by seuhong
Since virtual boundary is part of picHeader, moving it to Picture structure requires many changes.
The root cause of this crash is that same picHeader is being shared by two different picture structures (scaledRefPic, curPic).
Once the scaledRefPic is deallocated by freeScaledRefPicList(), the picHeader of curPic will become invalid and consequently it will get crash when any field of picHeader gets accessed by codePicHeader().
The fix for this ticket is allocate a picHeader to scaledRefPic instead of sharing same picHeader with curPic.
MR for this change could be found from
https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/merge_requests/2095
comment:5 Changed 3 years ago by ksuehring
My suggestion was to copy the boundaries, not to move them. We already copy several parameters from SPS/PPS/PH which are required for filtering or output, when the original structures are not available anymore. So this would not be something new.
Also in the common use case, where virtual boundaries are not used, this would only add an empty vector, which has almost no additions memory requirement.
On the other hand, storing PHs for each Picture causes increased memory usage, also for all cases where GDR is not enabled.
comment:6 Changed 3 years ago by seuhong
I believe it is more natural to have PH for each picture than copy fields to picture structure for following reasons:
- I believe PH was designed for 1 PH for 1 Picture, just like 1 SH for 1 Slice
- Copying VB to Picture structure is redundant to the picture which does not need VB
- Even though VB are not used, VB position may add empty vector but VBEnabledFlag, VBEnabledFlag, numVerVB, numHorVB still take up some space
- Storing PH for each picture may increase memory usage a little bit but up to the PH * GOP size because the life cycle of PH is same as Picture structure
comment:7 Changed 3 years ago by seuhong
Fix is in below MR
https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/merge_requests/2115
comment:8 Changed 3 years ago by fbossen
- Resolution set to fixed
- Status changed from new to closed
This crash is related to using more than one PictureHeader object and thus related to MRs !2054 and !2055.
The code is trying to access a picture header to get the GDR pic flag (EncGOP.cpp:3336), but that picture header was just freed a bit earlier in the code when the list of scaled reference pictures is deleted (EncGOP.cpp:3302)
I think, I would prefer to stick with the logic of keeping only one picture header and duplicate the necessary fields into the picture (as discussed as one option in the linked MRs), because creating multiple headers seems to have unexpected side effects.
If we decide to switch to one picture header per picture, existing duplications should be removed as well.
Full address sanitiser log below: