ffmpeg/tests/checkasm
Andreas Rheinhardt f4a87d8ca4 avcodec/x86/mpegvideoencdsp_init: Use xmm registers in SSSE3 functions
Improves performance and no longer breaks the ABI (the old code
forgot to call emms).

Old benchmarks:
add_8x8basis_c:                                         43.6 ( 1.00x)
add_8x8basis_ssse3:                                     12.3 ( 3.55x)

New benchmarks:
add_8x8basis_c:                                         43.0 ( 1.00x)
add_8x8basis_ssse3:                                      6.3 ( 6.79x)

Notice that the output of try_8x8basis_ssse3 changes a bit:
Before this commit, it computed certain values and added the values
for i, i+1, i+4 and i+5 before right-shifting them; now it adds
the values for i, i+1, i+8 and i+9. The second pair in each list
could be avoided (by shifting xmm0 and xmm1 before adding them together
instead of shifting only xmm0 after the addition), but the first pair,
i and i+1, is inherent in using pmaddwd. This is why this
function is not bitexact.
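
The rounding effect described above can be sketched in plain C. This is
a simplified model, not the FFmpeg code: the function names, the shift
amount and the input values are assumptions chosen for illustration. It
only shows that right-shifting each squared value separately can give a
different total than adding adjacent squares first (as pmaddwd does)
and shifting the pair sum.

```c
#include <stdint.h>

#define SHIFT 4 /* illustrative shift amount (assumption) */

/* Scalar-style model: shift every squared value on its own, then add. */
static unsigned sum_shift_each(const int16_t *v, int n)
{
    unsigned sum = 0;
    for (int i = 0; i < n; i++)
        sum += (unsigned)(v[i] * v[i]) >> SHIFT;
    return sum;
}

/* pmaddwd-style model: add the squares of elements i and i+1 first,
 * then shift the pair sum. n is assumed to be even. */
static unsigned sum_shift_pairs(const int16_t *v, int n)
{
    unsigned sum = 0;
    for (int i = 0; i + 1 < n; i += 2) {
        int32_t pair = v[i] * v[i] + v[i + 1] * v[i + 1];
        sum += (unsigned)pair >> SHIFT;
    }
    return sum;
}
```

For the input { 3, 3, 5, 1 }, the first function discards the low bits
of 9, 9, 25 and 1 individually, while the second discards them from 18
and 26, so the two totals differ. Adding further elements before the
shift (such as i+8 and i+9) moves the result further in the same way.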

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-10-15 08:55:13 +02:00