tsan warnings when n_tile_threads >= n_tile_cols

tools/dav1d -i ducks.444.4x4tiles.aomenc.ivf -o /tmp/ducks.dav1d.y4m --tilethreads=4 --framethreads=1:

WARNING: ThreadSanitizer: data race (pid=54618)
  Write of size 4 at 0x7ba800005258 by main thread:
  * #0 submit_frame decode.c:2802 (libdav1d.0.dylib:x86_64+0x2655d)
    #1 parse_obus obu.c:1067 (libdav1d.0.dylib:x86_64+0x9bab)
    #2 dav1d_decode lib.c:195 (libdav1d.0.dylib:x86_64+0x12b9fe)
    #3 main dav1d.c:110 (dav1d:x86_64+0x1000025fc)

  Previous read of size 4 at 0x7ba800005258 by thread T1:
  * #0 dav1d_tile_task thread_task.c:94 (libdav1d.0.dylib:x86_64+0x12dc72)

  Issue is caused by frames marked with "*".

  Location is heap block of size 23440 at 0x7ba800000000 allocated by main thread:
    #0 posix_memalign <null>:1439069344 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x49215)
    #1 dav1d_open lib.c:84 (libdav1d.0.dylib:x86_64+0x12a1bc)
    #2 main dav1d.c:105 (dav1d:x86_64+0x100002574)

  Thread T1 (tid=44573671, running) created by main thread at:
    #0 pthread_create <null>:1439069392 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2a72d)
    #1 dav1d_open lib.c:121 (libdav1d.0.dylib:x86_64+0x12ad3c)
    #2 main dav1d.c:105 (dav1d:x86_64+0x100002574)

SUMMARY: ThreadSanitizer: data race decode.c:2802 in submit_frame
==================
==================
WARNING: ThreadSanitizer: data race (pid=54618)
  Write of size 4 at 0x00010c189c2c by main thread:
  * #0 setup_tile decode.c:2039 (libdav1d.0.dylib:x86_64+0x24676)
    #1 decode_frame decode.c:2509 (libdav1d.0.dylib:x86_64+0x2188d)
    #2 submit_frame decode.c:2909 (libdav1d.0.dylib:x86_64+0x28390)
    #3 parse_obus obu.c:1067 (libdav1d.0.dylib:x86_64+0x9bab)
    #4 dav1d_decode lib.c:195 (libdav1d.0.dylib:x86_64+0x12b9fe)
    #5 main dav1d.c:110 (dav1d:x86_64+0x1000025fc)

  Previous read of size 4 at 0x00010c189c2c by thread T2:
  * #0 dav1d_tile_task thread_task.c:93 (libdav1d.0.dylib:x86_64+0x12db2e)

  Issue is caused by frames marked with "*".

  Location is heap block of size 260160 at 0x00010c16a000 allocated by main thread:
    #0 realloc <null>:1439063616 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x49042)
    #1 decode_frame decode.c:2274 (libdav1d.0.dylib:x86_64+0x1d5b0)
    #2 submit_frame decode.c:2909 (libdav1d.0.dylib:x86_64+0x28390)
    #3 parse_obus obu.c:1067 (libdav1d.0.dylib:x86_64+0x9bab)
    #4 dav1d_decode lib.c:195 (libdav1d.0.dylib:x86_64+0x12b9fe)
    #5 main dav1d.c:110 (dav1d:x86_64+0x1000025fc)

  Thread T2 (tid=44573672, running) created by main thread at:
    #0 pthread_create <null>:1439063664 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2a72d)
    #1 dav1d_open lib.c:121 (libdav1d.0.dylib:x86_64+0x12ad3c)
    #2 main dav1d.c:105 (dav1d:x86_64+0x100002574)

SUMMARY: ThreadSanitizer: data race decode.c:2039 in setup_tile

The relevant entries are ts->tiling.row_end and f->sb_step. I'm confused by this, because both are set long before the relevant cond_signal and read after the relevant cond_wait. The accesses are indeed outside a mutex, but my understanding of cond wait/signal pairing is that it still acts as an acceptable thread barrier. tsan appears to disagree with me...
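For discussion, here is a minimal sketch of the handoff pattern I would expect tsan to accept. It is not dav1d code: `struct shared`, `worker`, `run_frames`, `generation` and `consumed` are all made-up names, and only `sb_step` is borrowed from the report as a stand-in payload. Both reports above show a write by the main thread racing with a *previous* read by a worker, so one possible reading (an assumption, not a diagnosis) is that the missing ordering is in the reverse direction: nothing orders the worker's read before the main thread's write for the next frame. As far as I know, the ordering tsan recognises comes from unlocking and locking the mutex around the predicate, not from the signal itself. In the sketch the payload is still accessed outside the mutex, but each direction of the handoff passes through a predicate updated under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state; sb_step stands in for fields like f->sb_step. */
struct shared {
    int sb_step;            /* payload: written by main, read by worker */
    int generation;         /* bumped under lock when a new payload is ready */
    int consumed;           /* set under lock to the last generation read */
    bool stop;
    long sum;               /* worker-only until pthread_join */
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

static void *worker(void *arg) {
    struct shared *s = arg;
    int seen = 0;
    pthread_mutex_lock(&s->lock);
    for (;;) {
        /* Wait for a new generation (or a stop request). */
        while (s->generation == seen && !s->stop)
            pthread_cond_wait(&s->cond, &s->lock);
        if (s->generation == seen)
            break;                      /* stop requested, nothing pending */
        seen = s->generation;
        pthread_mutex_unlock(&s->lock);

        /* Read outside the mutex: ordered after main's write because
         * we observed the generation bump while holding the lock. */
        s->sum += s->sb_step;

        /* Acknowledge under the lock so main's next write is ordered
         * after this read. */
        pthread_mutex_lock(&s->lock);
        s->consumed = seen;
        pthread_cond_broadcast(&s->cond);
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Hands n_frames payloads to one worker; returns the sum of what it read. */
long run_frames(int n_frames) {
    struct shared s = { .sb_step = 0 };  /* remaining fields are zeroed */
    pthread_t t;
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    int rc = pthread_create(&t, NULL, worker, &s);
    assert(rc == 0);

    for (int i = 1; i <= n_frames; i++) {
        /* Wait until the worker has finished reading the previous payload. */
        pthread_mutex_lock(&s.lock);
        while (s.consumed != s.generation)
            pthread_cond_wait(&s.cond, &s.lock);
        pthread_mutex_unlock(&s.lock);

        s.sb_step = i * 16;             /* write outside the mutex */

        pthread_mutex_lock(&s.lock);
        s.generation++;                 /* publish under the lock */
        pthread_cond_broadcast(&s.cond);
        pthread_mutex_unlock(&s.lock);
    }

    pthread_mutex_lock(&s.lock);
    s.stop = true;
    pthread_cond_broadcast(&s.cond);
    pthread_mutex_unlock(&s.lock);
    pthread_join(t, NULL);

    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return s.sum;
}
```

Building this with `-fsanitize=thread -pthread` and calling `run_frames()` from a small `main` should show whether tsan accepts the pattern. If the `consumed` wait on the main-thread side is dropped, I would expect a report of the same shape as the two above (write by main thread, previous read by the worker), though I have not checked that against dav1d itself.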

Edited by Ronald S. Bultje