Windows 32-bit DLL crashes in dav1d_ipred_*_ssse3 functions
The dav1d decoder is used in a complex 32-bit Windows application. While decoding an AV1 stream, libdav1d.dll crashes in the SSSE3 intra prediction functions. The problem is reproducible only on certain PCs, and only when the DLL is used inside that application. I was unable to reproduce it in any other scenario.
Examining the source code and debugging with GDB led me to believe that the problem lies in how the target addresses for the indirect jumps in those functions are calculated. The addresses are stored in jump tables as offsets, with the table itself acting as the base, e.g.:
offset for ipred_h_pred_ssse3.w4 = ipred_h_pred_ssse3.w4 - i_pred_h_ssse3_table
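As a rough illustration of this encoding, here is a toy Python model of a table-relative entry and of what the `add esi,edx` / `jmp esi` pair in the disassembly further down computes. The addresses and the helper names `encode_entry` and `resolve` are made up for illustration and are not taken from dav1d. The point is only that each entry is a 32-bit difference that is meaningful solely together with the base it was computed against.

```python
MASK32 = 0xFFFFFFFF

def encode_entry(target_addr: int, table_addr: int) -> int:
    """Table entry: the difference target - table, truncated to 32 bits."""
    return (target_addr - table_addr) & MASK32

def resolve(entry: int, base: int) -> int:
    """What `add esi,edx` / `jmp esi` computes: base + entry, modulo 2**32."""
    return (base + entry) & MASK32

# Hypothetical layout: the code in .text sits below the table in .rdata,
# so the stored difference is negative and wraps around as an unsigned dword.
table_addr = 0x100A5000   # made-up address of the table
w4_addr = 0x10010400      # made-up address of the .w4 label

entry = encode_entry(w4_addr, table_addr)
jump_target = resolve(entry, table_addr)
print(hex(entry), hex(jump_target))

# If code and table move together by the same amount, the entry stays valid.
shift = 0x1000
print(resolve(entry, table_addr + shift) == w4_addr + shift)
```

The entry never changes at run time, so the jump lands correctly only if the base loaded into the register and the target code are displaced by the same amount.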
The issue with defining offsets this way is that the two symbols are defined in different sections: ipred_h_pred_ssse3.w4 in .text and i_pred_h_ssse3_table in .rdata. This presumably produces a complex relocation entry in the final DLL, and resolving it incorrectly leads to wrong jump addresses. Below is a GDB session stopped at a breakpoint just before the jump to the invalid address in the dav1d_pal_pred_ssse3 function:
(gdb) i sh libdav1d.dll
From To Syms Read Shared Object Library
0x552a1000 0x553bb89c Yes c:\users\....\libdav1d.dll
0x61321000 0x6143b89c Yes c:\users\....\libdav1d.dll
(gdb) disas
Dump of assembler code for function dav1d_pal_pred_ssse3:
0x61331520 <+0>: push ebx
0x61331521 <+1>: push esi
0x61331522 <+2>: push edi
0x61331523 <+3>: mov eax,DWORD PTR [esp+0x10]
0x61331527 <+7>: mov ecx,DWORD PTR [esp+0x14]
0x6133152b <+11>: mov edx,DWORD PTR [esp+0x18]
0x6133152f <+15>: mov ebx,DWORD PTR [esp+0x1c]
0x61331533 <+19>: movdqa xmm4,XMMWORD PTR [edx]
0x61331537 <+23>: mov edx,0x55346570 <===== address of dav1d_pal_pred_ssse3_table
0x6133153c <+28>: tzcnt esi,DWORD PTR [esp+0x20]
0x61331542 <+34>: mov edi,DWORD PTR [esp+0x24]
0x61331546 <+38>: mov esi,DWORD PTR [edx+esi*4]
0x61331549 <+41>: packuswb xmm4,xmm4
0x6133154d <+45>: add esi,edx
=> 0x6133154f <+47>: lea edx,[ecx+ecx*2]
0x61331552 <+50>: jmp esi
End of assembler dump.
(gdb) i r
eax 0x336fa0c0 862953664
ecx 0xa80 2688
edx 0x55346570 1429497200
ebx 0xf61b800 258062336
esp 0x241ed220 0x241ed220
ebp 0x10 0x10
esi 0x552b15d0 1428886992 <===== invalid jump address
edi 0x10 16
eip 0x6133154f 0x6133154f <dav1d_pal_pred_ssse3+47>
eflags 0x203 [ CF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x53 83
gs 0x2b 43
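The numbers in the session above can be cross-checked with a few lines of Python. This is only arithmetic on the addresses printed by GDB. The one assumption, flagged in the code, is that the two libdav1d.dll mappings listed by `i sh` have an identical layout, which their equal sizes suggest but do not prove.

```python
# Address ranges of the two libdav1d.dll mappings reported by `i sh`.
first = (0x552A1000, 0x553BB89C)
second = (0x61321000, 0x6143B89C)

eip = 0x6133154F          # current instruction, inside dav1d_pal_pred_ssse3
func_start = 0x61331520   # dav1d_pal_pred_ssse3 <+0>
edx = 0x55346570          # table base loaded by `mov edx,...`
esi = 0x552B15D0          # computed jump target

def inside(addr, rng):
    """True if addr lies within the inclusive range rng = (start, end)."""
    return rng[0] <= addr <= rng[1]

# Which mapping does each address belong to?
print(inside(eip, second), inside(edx, first), inside(esi, first))

# Offset that was read from the table (target minus table base).
entry = esi - edx
print(hex(entry))

# ASSUMPTION: both mappings have the same layout, so an address can be moved
# from one to the other by adding the difference of their start addresses.
delta = second[0] - first[0]
rebased_target = esi + delta
print(hex(rebased_target), hex(rebased_target - func_start))
```

The code is executing in the second mapping, while both the table base in edx and the resulting target in esi fall inside the first one. Under the stated assumption, the same target rebased into the second mapping would be 0x613315d0, which is 0xb0 bytes past the start of dav1d_pal_pred_ssse3. This is an observation from the dump, not a confirmed diagnosis.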
The simplest fix is to define the tables in the .text section rather than in .rdata, so that the offsets are calculated between symbols in the same section:
%define ipred_dc_splat_ssse3_table (ipred_dc_ssse3_table + 10*4)
%define ipred_cfl_splat_ssse3_table (ipred_cfl_ssse3_table + 8*4)
+SECTION .text
+
JMP_TABLE ipred_h, ssse3, w4, w8, w16, w32, w64
JMP_TABLE ipred_dc, ssse3, h4, h8, h16, h32, h64, w4, w8, w16, w32, w64, \
s4-10*4, s8-10*4, s16-10*4, s32-10*4, s64-10*4
@@ -102,7 +104,7 @@ JMP_TABLE ipred_filter, ssse3, w4, w8, w16, w32
cextern filter_intra_taps
-SECTION .text
+
Alternatively, keeping the jump tables in the .rdata section but using a base symbol from the .text section also works.
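Why both variants avoid the cross-section difference can be sketched with a toy layout model. The section start addresses and label offsets below are hypothetical, and the model ignores how COFF relocations are actually encoded. It only shows that a difference between two labels in the same section does not depend on where the sections end up, whereas a difference across sections does.

```python
def label_addr(section_starts, section, offset):
    """Absolute address of a label given its section and offset within it."""
    return section_starts[section] + offset

# Hypothetical labels: (section, offset inside that section).
W4_LABEL = (".text", 0x0400)
TEXT_BASE = (".text", 0x0000)     # a base symbol placed in .text
RDATA_TABLE = (".rdata", 0x0570)  # the table when it lives in .rdata

def diff(section_starts, a, b):
    """Difference between two labels under a given section placement."""
    return label_addr(section_starts, *a) - label_addr(section_starts, *b)

# Two possible section placements chosen by a linker.
layout_a = {".text": 0x1000, ".rdata": 0xA5000}
layout_b = {".text": 0x1000, ".rdata": 0xB0000}

layouts = (layout_a, layout_b)
same_section = [diff(l, W4_LABEL, TEXT_BASE) for l in layouts]
cross_section = [diff(l, W4_LABEL, RDATA_TABLE) for l in layouts]

print(same_section)   # label minus a base in the same section
print(cross_section)  # label minus a table in another section
```

In the same-section case the assembler can compute the offset itself. In the cross-section case the value is only known once the sections have been placed, so it has to be carried through the object file as a relocation.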