Loongarch: multiple SIMD optimization functions are added

Merged Hecai Yuan requested to merge HecaiYuan/dav1d:master into master
1 unresolved thread

This MR adds multiple LSX optimizations and a few LASX optimizations, which greatly improve decoding performance on the LoongArch platform. Reviews are welcome. @jbk @jamrial @gramner

Merge request reports

Pipeline #518544 passed

Pipeline passed for ed004fe9 on HecaiYuan:master

Test coverage 91.53% (-0.11%) from 1 job

Merged by Ronald S. Bultje 5 months ago (Sep 30, 2024 11:12am UTC)

Merge details

  • Changes merged into master with ed004fe9.
  • Did not delete the source branch.

Pipeline #518589 passed

Pipeline passed for ed004fe9 on master

Test coverage 91.55% (-0.11%) from 1 job

Activity

  • Hecai Yuan added 1 commit

    • bd2825e7 - LoongArch: Add save_tmvs_lsx

  • Ronald S. Bultje approved this merge request

  • @HecaiYuan could you please click "Edit" (top/right) and then select "Allow commits from members who can merge to the target branch." (at the bottom)? That way I can rebase & merge for you.

    (Or rebase manually, if that was already selected.)

    Edited by Ronald S. Bultje
  • Author Contributor

    I tried to turn this option on and found that it was not selectable. I'm not sure what caused this. @rbultje

  • Author Contributor

    I looked at the revised code and found it cleaner and more reasonable. We plan to keep maintaining our work in the community and to deliver high-quality code to the best of our ability. The modified code will come in a separate patch.

  • Hecai Yuan resolved all threads

  • Hecai Yuan added 2 commits

    • 0f618ba1 - loongarch: rewrite optimization functions in loongarch/itx.S
    • 6973f5fc - loongarch: minor improvement on decode_symbol_adapt

  • Hecai Yuan added 2 commits

    • 489225ba - loongarch: rewrite optimization functions in loongarch/itx.S
    • 55c54ad3 - loongarch: minor improvement on decode_symbol_adapt

  • Author Contributor

    The relevant functions in loongarch/itx.S have been adjusted, and the number of lines of code is significantly reduced. @rbultje @jbk

  • Please rebase this set on top of master.

  • Hecai Yuan added 56 commits

    • 55c54ad3...f2c3ccd6 - 21 commits from branch videolan:master
    • f2c3ccd6...7c63bb1b - 25 earlier commits
    • 411fc219 - Loongarch: Optimized ipred_z1 8bpc functions by LSX
    • 90a9549b - Loongarch: Optimized load_tmvs_c function by LSX
    • af11a10a - loongarch: add lasx implementation of wiener filter for 8 bpc
    • b9e9a0ef - loongarch: Refine prep_8tap_8bpc_lasx
    • 96d6e472 - loongarch: rewirte warp_8x8/8x8t_lsx for 8 bpc
    • 70582027 - loongarch: add lasx implementation of sgr_3x3 for 8 bpc
    • 3d96175d - loongarch: refactor loopfilter
    • 757f294a - LoongArch: Add save_tmvs_lsx
    • 62a51df1 - loongarch: rewrite optimization functions in loongarch/itx.S
    • ed004fe9 - loongarch: minor improvement on decode_symbol_adapt

  • Thanks @HecaiYuan & team!

  • changed milestone to %1.5.0

    • @HecaiYuan sorry to poke you, but would it be possible for you to run the argon test suite with these optimizations merged? These samples test things like clipping behaviour in the inverse transforms or potential overflows in MC better than the standard conformance samples. There's a script in the repo to run it. Thanks!

    • In particular, these tests currently do fail, after this MR:

      $ ../tests/dav1d_argon.bash -f
      Mismatch in profile0_core/streams/test10571_10597_10562.obu                

      This breakage seems to have started since commit 13a857d0.

  • The suite itself is here

  • mentioned in issue #448 (closed)

  • Author Contributor

    The problem has been solved in issue #448 (closed). @mstorsjo Thank you for the suite. @lu_zero
