1. 25 Oct, 2018 1 commit
  2. 23 Oct, 2018 1 commit
      arm64: mc: Make the jump tables local symbols · 0bb53898
      Martin Storsjö authored
      For MachO, this makes sure that the label difference actually is
      evaluated at assembly time (as it already was for ELF and COFF);
      evaluating it at link time failed when the difference is stored in
      a .hword.
      
      This fixes linking errors like these:
      ld: in section __TEXT,__text reloc 0: ARM64_RELOC_SUBTRACTOR must have r_length of 2 or 3 file 'src/src@@dav1d_bitdepth_8@sta/arm_64_mc.S.o' for architecture arm64
      
      This adds an asm.S macro that decorates a symbol to make it a
      local symbol. For armasm64 with gas-preprocessor, this doesn't
      actually create a local label (nor do the local numbered
      labels currently), which might be slightly inconsistent
      if it became necessary to make the distinction for that assembler
      as well.
      
      Alternatively, the table symbol could be made into a plain local
      numbered label, like all the other labels.
  3. 21 Oct, 2018 1 commit
      arm64: Don't use uxth for extending a register · 91e0b478
      Martin Storsjö authored
      armasm64 fails to assemble this:
      error A2173: syntax error in expression
              sub             x7,  x7,  w4, uxth
      
      This clearly is a bug in armasm64, and will be reported. For now,
      this workaround should be harmless though, as we've just loaded
      the register with ldrh, so the upper parts of the register should
      be zeroed.
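
The reasoning above can be checked with a small model in C. This is only a sketch: the helper names (`load_u16_zero_extended`, `uxth`, `sub_with_uxth`, `sub_plain`) are hypothetical and not part of dav1d. They model `ldrh` as a zero-extending 16-bit load and the `uxth` operand extension as masking to the low 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `ldrh w4, [...]`: a 16-bit load that
 * zero-extends into the full register. */
static uint64_t load_u16_zero_extended(const uint16_t *p)
{
    return (uint64_t)*p;
}

/* Hypothetical model of the `uxth` operand extension: keep the low
 * 16 bits of the register and zero everything above them. */
static uint64_t uxth(uint64_t reg)
{
    return reg & 0xFFFFu;
}

/* Models `sub x7, x7, w4, uxth`. */
static uint64_t sub_with_uxth(uint64_t x7, uint64_t w4)
{
    return x7 - uxth(w4);
}

/* Models the workaround: a plain subtraction without the extension. */
static uint64_t sub_plain(uint64_t x7, uint64_t w4)
{
    return x7 - w4;
}
```

In this model the two subtractions agree for any value produced by `load_u16_zero_extended`, and differ only when the register holds bits above bit 15, which cannot happen directly after the load.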
  4. 20 Oct, 2018 1 commit
      arm64/mc: add 8-bit neon asm for avg, w_avg and mask · 80e47425
      Janne Grunau authored
      checkasm --bench on a Qualcomm Kryo (Snapdragon 820):
      nop: 33.0
      avg_w4_8bpc_c: 450.5
      avg_w4_8bpc_neon: 20.1
      avg_w8_8bpc_c: 438.6
      avg_w8_8bpc_neon: 45.2
      avg_w16_8bpc_c: 1003.7
      avg_w16_8bpc_neon: 112.8
      avg_w32_8bpc_c: 3249.6
      avg_w32_8bpc_neon: 429.9
      avg_w64_8bpc_c: 7213.3
      avg_w64_8bpc_neon: 1299.4
      avg_w128_8bpc_c: 16791.3
      avg_w128_8bpc_neon: 2978.4
      w_avg_w4_8bpc_c: 605.7
      w_avg_w4_8bpc_neon: 30.9
      w_avg_w8_8bpc_c: 545.8
      w_avg_w8_8bpc_neon: 72.9
      w_avg_w16_8bpc_c: 1430.1
      w_avg_w16_8bpc_neon: 193.5
      w_avg_w32_8bpc_c: 4876.3
      w_avg_w32_8bpc_neon: 715.3
      w_avg_w64_8bpc_c: 11338.0
      w_avg_w64_8bpc_neon: 2147.0
      w_avg_w128_8bpc_c: 26822.0
      w_avg_w128_8bpc_neon: 4596.3
      mask_w4_8bpc_c: 604.6
      mask_w4_8bpc_neon: 37.2
      mask_w8_8bpc_c: 654.8
      mask_w8_8bpc_neon: 96.0
      mask_w16_8bpc_c: 1663.0
      mask_w16_8bpc_neon: 272.4
      mask_w32_8bpc_c: 5707.6
      mask_w32_8bpc_neon: 1028.9
      mask_w64_8bpc_c: 12735.3
      mask_w64_8bpc_neon: 2533.2
      mask_w128_8bpc_c: 31027.6
      mask_w128_8bpc_neon: 6247.2
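
To read the figures above as speedups, divide each C timing by the matching NEON timing. The sketch below assumes the reported numbers are directly comparable as printed (whether the `nop` overhead has already been subtracted is not stated here); `speedup` is a hypothetical helper, not part of checkasm.

```c
#include <assert.h>

/* Hypothetical helper: ratio of the C timing to the NEON timing,
 * taking the checkasm figures at face value. */
static double speedup(double c_time, double neon_time)
{
    assert(neon_time > 0.0); /* guard against division by zero */
    return c_time / neon_time;
}
```

For example, `speedup(450.5, 20.1)` for `avg_w4_8bpc` is roughly 22, while `speedup(31027.6, 6247.2)` for `mask_w128_8bpc` is just under 5.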