
Windows 32-bit DLL crashes in dav1d_ipred_*_ssse3 functions

The dav1d decoder is used in a complex 32-bit Windows application. While decoding an AV1 stream, libdav1d.dll crashes in the SSSE3 intra prediction functions. The problem is reproducible only on certain PCs, and only when the decoder is used in that application. I was unable to reproduce it in any other scenario.

Examining the source code and debugging with GDB led me to believe that there is a problem with how the target addresses of the indirect jumps in those functions are calculated. The addresses are stored as offsets in jump tables, with the table itself acting as the base, e.g.:

offset for ipred_h_pred_ssse3.w4 = ipred_h_pred_ssse3.w4 - i_pred_h_ssse3_table
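
The encoding can be modelled in a few lines of Python. This is only an illustrative sketch: the addresses and the `build_table` and `dispatch` helpers are made up for the example and are not taken from dav1d.

```python
def build_table(table_addr, label_addrs):
    """Store each label as a signed offset from the table's own address."""
    return [label - table_addr for label in label_addrs]


def dispatch(base, table, index):
    """Model of: mov esi,[edx+esi*4] / add esi,edx / jmp esi."""
    return base + table[index]


# Hypothetical link-time addresses: labels in .text, table in .rdata.
labels = [0x1000, 0x1040, 0x1080]
table_addr = 0x9000
table = build_table(table_addr, labels)

# With the correct base, the original label addresses come back.
resolved = [dispatch(table_addr, table, i) for i in range(len(table))]

# If the base register holds an address that is off by `delta`,
# every computed target is off by the same `delta`.
delta = 0x200
skewed = [dispatch(table_addr + delta, table, i) for i in range(len(table))]

print([hex(a) for a in resolved])
print([hex(a) for a in skewed])
```

The point of the model is that the scheme only works when the base added at run time is the same address the offsets were computed against.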

The issue with defining offsets this way is that the two symbols are defined in different sections: ipred_h_pred_ssse3.w4 in .text and i_pred_h_ssse3_table in .rdata. I suspect this produces a complex relocation entry in the final DLL, and resolving it incorrectly leads to wrong jump addresses. Here is the state in GDB, with a breakpoint just before the jump to the invalid address in the dav1d_pal_pred_ssse3 function:

(gdb) i sh libdav1d.dll
From        To          Syms Read   Shared Object Library
0x552a1000  0x553bb89c  Yes         c:\users\....\libdav1d.dll
0x61321000  0x6143b89c  Yes         c:\users\....\libdav1d.dll
(gdb) disas
Dump of assembler code for function dav1d_pal_pred_ssse3:
   0x61331520 <+0>:     push   ebx
   0x61331521 <+1>:     push   esi
   0x61331522 <+2>:     push   edi
   0x61331523 <+3>:     mov    eax,DWORD PTR [esp+0x10]
   0x61331527 <+7>:     mov    ecx,DWORD PTR [esp+0x14]
   0x6133152b <+11>:    mov    edx,DWORD PTR [esp+0x18]
   0x6133152f <+15>:    mov    ebx,DWORD PTR [esp+0x1c]
   0x61331533 <+19>:    movdqa xmm4,XMMWORD PTR [edx]
   0x61331537 <+23>:    mov    edx,0x55346570            <===== address of dav1d_pal_pred_ssse3_table
   0x6133153c <+28>:    tzcnt  esi,DWORD PTR [esp+0x20]
   0x61331542 <+34>:    mov    edi,DWORD PTR [esp+0x24]
   0x61331546 <+38>:    mov    esi,DWORD PTR [edx+esi*4]
   0x61331549 <+41>:    packuswb xmm4,xmm4
   0x6133154d <+45>:    add    esi,edx
=> 0x6133154f <+47>:    lea    edx,[ecx+ecx*2]
   0x61331552 <+50>:    jmp    esi
End of assembler dump.
(gdb) i r
eax            0x336fa0c0       862953664
ecx            0xa80    2688
edx            0x55346570       1429497200
ebx            0xf61b800        258062336
esp            0x241ed220       0x241ed220
ebp            0x10     0x10
esi            0x552b15d0       1428886992               <===== invalid jump address 
edi            0x10     16
eip            0x6133154f       0x6133154f <dav1d_pal_pred_ssse3+47>
eflags         0x203    [ CF IF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x53     83
gs             0x2b     43
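
One way to sanity-check the dump is to work out which of the two libdav1d.dll mappings listed by `i sh` each address belongs to. The sketch below is plain arithmetic on the values shown above. The helper names `image_index` and `rebase` are hypothetical, and treating the `From`/`To` columns as a half-open address range is an assumption.

```python
# Load ranges reported by `i sh libdav1d.dll` in the session above.
IMAGES = [(0x552A1000, 0x553BB89C), (0x61321000, 0x6143B89C)]

EIP = 0x6133154F          # current instruction
FUNC_START = 0x61331520   # dav1d_pal_pred_ssse3
TABLE_BASE = 0x55346570   # value loaded into edx
JUMP_TARGET = 0x552B15D0  # value in esi before `jmp esi`


def image_index(addr):
    """Return the index of the mapping that contains addr, or None."""
    for i, (lo, hi) in enumerate(IMAGES):
        if lo <= addr < hi:
            return i
    return None


def rebase(addr, to_index):
    """Translate addr into another mapping, keeping its offset."""
    src = image_index(addr)
    if src is None:
        raise ValueError("address is outside every listed mapping")
    return addr - IMAGES[src][0] + IMAGES[to_index][0]


for name, addr in [("eip", EIP), ("table base", TABLE_BASE),
                   ("jump target", JUMP_TARGET)]:
    print(f"{name:12} {addr:#x} -> mapping {image_index(addr)}")

# Table entry that was added to the base to form the jump target.
print("table entry:", hex(JUMP_TARGET - TABLE_BASE))

# Where the same offsets would land in the mapping that is executing.
here = image_index(EIP)
print("rebased table base: ", hex(rebase(TABLE_BASE, here)))
print("rebased jump target:", hex(rebase(JUMP_TARGET, here)))
```

With the values from this session, the table base and the jump target both fall inside the first mapping while eip is in the second, and rebasing the target into the second mapping gives dav1d_pal_pred_ssse3+0xb0. This is only arithmetic on the dump and does not by itself show where the wrong base comes from.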

The simplest fix is to define the tables in the .text section rather than in .rdata, so that offsets are calculated between symbols in the same section:

 %define ipred_dc_splat_ssse3_table (ipred_dc_ssse3_table + 10*4)
 %define ipred_cfl_splat_ssse3_table (ipred_cfl_ssse3_table + 8*4)

+SECTION .text
+
 JMP_TABLE ipred_h,          ssse3, w4, w8, w16, w32, w64
 JMP_TABLE ipred_dc,         ssse3, h4, h8, h16, h32, h64, w4, w8, w16, w32, w64, \
                                 s4-10*4, s8-10*4, s16-10*4, s32-10*4, s64-10*4
@@ -102,7 +104,7 @@ JMP_TABLE ipred_filter,     ssse3, w4, w8, w16, w32
 cextern filter_intra_taps


-SECTION .text
+

Alternatively, keeping the jump tables in the .rdata section but using a base symbol from the .text section also works.
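
The difference between the two choices of base can be illustrated with a small Python model. All section addresses, offsets and helper names here are hypothetical. The sketch only shows which table entries depend on where .rdata sits relative to .text, and is not a description of how the assembler or linker actually processes the file.

```python
def entries(base_addr, label_addrs):
    """Jump table entries: each label minus the chosen base."""
    return [label - base_addr for label in label_addrs]


# Hypothetical offsets inside each section.
LABEL_OFFSETS = [0x00, 0x40, 0x80]  # labels within .text
TEXT_BASE_OFFSET = 0x00             # base symbol placed in .text
TABLE_OFFSET = 0x10                 # table placed in .rdata


def layout(text_start, rdata_start):
    """Compute entries for both base choices under a given section layout."""
    labels = [text_start + off for off in LABEL_OFFSETS]
    via_table = entries(rdata_start + TABLE_OFFSET, labels)
    via_text = entries(text_start + TEXT_BASE_OFFSET, labels)
    return via_table, via_text


a_table, a_text = layout(0x1000, 0x9000)
b_table, b_text = layout(0x1000, 0xA000)  # .rdata placed elsewhere

print("table-relative entries change:", a_table != b_table)
print(".text-relative entries change:", a_text != b_text)
```

In this model the entries taken relative to a .text base are fixed by the offsets inside .text alone, while the table-relative entries depend on the distance between the two sections.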
