Commit Graph

4 Commits

Author SHA1 Message Date
Andreas Rheinhardt
f4a87d8ca4 avcodec/x86/mpegvideoencdsp_init: Use xmm registers in SSSE3 functions
Improves performance and no longer breaks the ABI (by forgetting
to call emms).

Old benchmarks:
add_8x8basis_c:                                         43.6 ( 1.00x)
add_8x8basis_ssse3:                                     12.3 ( 3.55x)

New benchmarks:
add_8x8basis_c:                                         43.0 ( 1.00x)
add_8x8basis_ssse3:                                      6.3 ( 6.79x)

Notice that the output of try_8x8basis_ssse3 changes a bit:
Before this commit, it computes certain values and adds the values
for i,i+1,i+4 and i+5 before right shifting them; now it adds
the values for i,i+1,i+8,i+9. The second pair in these lists
could be avoided (by shifting xmm0 and xmm1 before adding both together
instead of only shifting xmm0 after adding them), but the former
i,i+1 is inherent in using pmaddwd. This is the reason that this
function is not bitexact.
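
The rounding effect described above can be illustrated with a small scalar
sketch. It assumes only that pmaddwd sums two adjacent signed 16-bit products
into one 32-bit lane. The helper names and operand values are hypothetical and
are not taken from the FFmpeg sources.

```c
#include <stdint.h>

/* Scalar model of one 32-bit output lane of pmaddwd: the products of two
 * adjacent signed 16-bit elements are summed into a single 32-bit value. */
static int32_t pmaddwd_lane(const int16_t *a, const int16_t *b, int k)
{
    return (int32_t)a[2 * k] * b[2 * k] +
           (int32_t)a[2 * k + 1] * b[2 * k + 1];
}

/* Per-element style: each product is right-shifted on its own, then the
 * two shifted values are added. */
static int32_t shift_then_add(const int16_t *a, const int16_t *b,
                              int k, int shift)
{
    return (((int32_t)a[2 * k] * b[2 * k]) >> shift) +
           (((int32_t)a[2 * k + 1] * b[2 * k + 1]) >> shift);
}

/* pmaddwd style: the pair is already summed before any shift can be
 * applied, so the low bits of both products are combined first. */
static int32_t add_then_shift(const int16_t *a, const int16_t *b,
                              int k, int shift)
{
    return pmaddwd_lane(a, b, k) >> shift;
}
```

With a = {3, 5}, b = {1, 1} and shift = 2, shift_then_add gives 0 + 1 = 1,
whereas add_then_shift gives 8 >> 2 = 2. The low bits discarded separately in
the first form carry into the result in the second, which is the kind of
difference that prevents bit-exact output.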

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 08:55:13 +02:00
Andreas Rheinhardt
ce499ebf96 tests/checkasm/mpegvideoencdsp: Add test for add_8x8basis
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 08:55:13 +02:00
Ramiro Polla
6aafe61285 avcodec/mpegvideoencdsp: convert stride parameters from int to ptrdiff_t
2024-09-01 13:42:30 +02:00
Ramiro Polla
834964ce1a checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges
2024-08-26 12:48:09 +02:00