ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-01-06 14:15:29 +01:00

Author	SHA1	Message	Date
James Almer	00caeba050	avcodec: rename avcodec_receive_frame2 to avcodec_receive_frame_flags It's a name that communicates its functionality in a better way. Since the function was introduced very recently, we can safely rename it. Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-07 12:47:46 -03:00
Michael Niedermayer	88f26718a0	avcodec/decode: Fix build due to ff_thread_receive_frame() Regression since: `5e56937b74` Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-12-07 11:58:01 +01:00
Kacper Michajłow	6a14a93af5	checkasm/sw_xyz2rgb: fix function type Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-05 21:55:03 +00:00
Niklas Haas	929a2ced9b	fftools/ffmpeg_dec: decode the first frame synchronously In an ideal world, a filtergraph like `-i A -i B ... -filter_complex concat` would only keep resources in memory related to the file currently being output by the concat filter. So ideally, we'd open and fully decode A, then open and fully decode B, and so on. Practically, however, fftools wants to get one frame from each input file in order to initialize the filter graph (buffersrc parameters). So what happens currently is that fftools will request a single frame from each input A, B, etc that is plugged into the filtergraph. When using frame threading, the design of the decoder (ff_thread_receive_frame) is that it will not output any frames until we have received enough packets to saturate all threads. This, however, forces the decoder to buffer at least as many frames for each input file as we have threads, before outputting anything. By decoding the first frame synchronously, we avoid this issue and allow configuring the filter graph more quickly and without wasting excess resources on frames that will not (yet) be used.	2025-12-05 19:42:45 +01:00
Niklas Haas	5e56937b74	avcodec: allow bypassing frame threading with an optional flag Normally, this function tries to make sure all threads are saturated with work to do before returning any frames; and will continue requesting packets until that is the case. However, this significantly slows down initial decoding latency when only requesting a single frame (to e.g. configure the filter graph), and also wastes a lot of unnecessary memory in the event that the user does not intend to decode more frames until later. By introducing a new `flags` parameter and a new flag `AV_CODEC_RECEIVE_FRAME_FLAG_SYNCHRONOUS` to go along with it, we can allow users to temporarily bypass this logic.	2025-12-05 19:42:41 +01:00
Araz Iusubov	077864dfd6	avcodec/amf: fix hw_device_ctx handling	2025-12-05 15:53:19 +00:00
Zhao Zhili	d3953237d1	avcodec/h264_slice: don't force ff_get_format unconditionally after flush h->context_initialized is zero after flush, which triggers a call to ff_get_format unconditionally. ff_get_format can be heavy because of ff_hwaccel_uninit and hwaccel_init. For example, it takes 20 ms on macOS with videotoolbox. ff_get_format should not be called if nothing changed. ff_get_format is guaranteed to be called the first time and whenever the video information changes (must_reinit || needs_reinit). Fix #20760.	2025-12-05 13:54:08 +00:00
Andreas Rheinhardt	1d47ae65bf	avcodec/tableprint_vlc: Unbreak hardcoded tables Forgotten in `d8ffec5bf9`. Fixes issue #21102. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-05 11:31:23 +01:00
Arpad Panyik	1f30ff30fb	swscale: Add AArch64 Neon path for xyz12Torgb48 LE Add optimized Neon code path for the little endian case of the xyz12Torgb48 function. The innermost loop processes the data in 4x2 pixel blocks using software gathers with the matrix multiplication and clipping done by Neon. Relative runtime of micro benchmarks after this patch on some Cortex and Neoverse CPU cores: xyz12le_rgb48le X1 X3 X4 X925 V2 16x4_neon: 2.55x 4.34x 3.84x 3.31x 3.22x 32x4_neon: 2.39x 3.63x 3.22x 3.35x 3.29x 64x4_neon: 2.37x 3.31x 2.91x 3.33x 3.27x 128x4_neon: 2.34x 3.28x 2.91x 3.35x 3.24x 256x4_neon: 2.30x 3.17x 2.91x 3.32x 3.10x 512x4_neon: 2.26x 3.10x 2.91x 3.30x 3.07x 1024x4_neon: 2.26x 3.07x 2.96x 3.30x 3.05x 1920x4_neon: 2.26x 3.06x 2.93x 3.28x 3.04x xyz12le_rgb48le A76 A78 A715 A720 A725 16x4_neon: 2.33x 2.28x 2.53x 3.33x 3.19x 32x4_neon: 2.35x 2.18x 2.45x 3.23x 3.24x 64x4_neon: 2.35x 2.16x 2.42x 3.15x 3.21x 128x4_neon: 2.35x 2.13x 2.39x 3.00x 3.09x 256x4_neon: 2.36x 2.12x 2.35x 2.85x 2.99x 512x4_neon: 2.35x 2.14x 2.35x 2.78x 2.95x 1024x4_neon: 2.31x 2.09x 2.33x 2.80x 2.91x 1920x4_neon: 2.30x 2.07x 2.32x 2.81x 2.94x xyz12le_rgb48le A55 A510 A520 16x4_neon: 2.09x 1.92x 2.36x 32x4_neon: 2.05x 1.89x 2.38x 64x4_neon: 2.02x 1.77x 2.35x 128x4_neon: 1.96x 1.74x 2.25x 256x4_neon: 1.90x 1.72x 2.19x 512x4_neon: 1.83x 1.75x 2.16x 1024x4_neon: 1.83x 1.62x 2.15x 1920x4_neon: 1.82x 1.60x 2.15x Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>	2025-12-05 10:28:18 +00:00
Arpad Panyik	a13871ae19	checkasm: Add xyz12Torgb48le test Add checkasm coverage for the XYZ12LE to RGB48LE path via the ctx->xyz12Torgb48 hook. Integrate the test into the build and runner, exercise a variety of widths/heights, compare against the C reference, and benchmark when width is multiple of 4. This improves test coverage for the new function pointer in preparation for architecture-specific implementations in subsequent commits. Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>	2025-12-05 10:28:18 +00:00
Arpad Panyik	ef651b84ce	swscale: Refactor XYZ+RGB state and add function hooks Prepare for xyz12Torgb48 architecture-specific optimizations in subsequent patches by: - Grouping XYZ+RGB gamma LUTs and 3x3 matrices into SwsColorXform (ctx->xyz2rgb and ctx->rgb2xyz), replacing scattered fields. - Dropping the unused last matrix column giving the same or smaller SwsInternal size. - Renaming ff_xyz12Torgb48 and ff_rgb48Toxyz12 and routing calls via the new per-context function pointer (ctx->xyz12Torgb48 and ctx->rgb48Toxyz12) in graph.c and swscale.c. - Adding ff_sws_init_xyzdsp and invoking it in swscale init paths (normal and unscaled). - Making fill_xyztables public to ease its setup later in checkasm. These modifications do not introduce any functional changes. Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>	2025-12-05 10:28:18 +00:00
Andreas Rheinhardt	9e038fd959	swscale/tests/swscale: Fix typo Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-05 10:42:01 +01:00
James Almer	52c84b06d5	avfilter/f_sidedata: also handle global side data in filter links Should fix issue #21071 Signed-off-by: James Almer <jamrial@gmail.com>	2025-12-04 13:50:45 -03:00
Andreas Rheinhardt	e0845ec2cf	avformat/movenc: Fix leak of IAMFContext on error Forgotten in `5b87869c09`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 16:15:09 +00:00
Lynne	f80addbb07	ffv1enc_vulkan: fix encoding with large contexts When RGB_LINECACHE == 2, then top2 is not the current line.	2025-12-04 16:53:58 +01:00
Andreas Rheinhardt	4b6e40a298	avcodec/vp8dsp: Don't compile unused functions The width 16 epel functions never use four taps in any direction, so don't build said functions. Saves 4352B of .text and 89B of .text.unlikely here. : mx and my in vp8_mc_luma() are always even. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	9cff236e2f	avcodec/riscv/vp8dsp_rvv: Remove unused functions Only the sixtap functions are used for size 16. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	050c80a526	avcodec/x86/vp8dsp: Don't use saturated addition when unnecessary For the epel functions, there can be no overflow as long as the sum contains only one of the two large central coefficients; for bilinear functions, there can be no overflow whatsoever. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	575e9e9c08	avcodec/x86/vp8dsp: Reduce number of coefficient tables By changing the permutations used in the epel8_h{4,6} case we can simply reuse the coefficient tables from the vertical epel filters. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	99fb257f58	avcodec/x86/vp8dsp: Don't use MMX registers in ff_put_vp8_epel4_h6_ssse3 Doubling the register width allowed to avoid a pshufb and a pmaddubsw. Old benchmarks: vp8_put_epel4_h6_c: 115.9 ( 1.00x) vp8_put_epel4_h6_ssse3: 20.2 ( 5.74x) vp8_put_epel4_h6v4_c: 276.3 ( 1.00x) vp8_put_epel4_h6v4_ssse3: 58.6 ( 4.71x) vp8_put_epel4_h6v6_c: 363.6 ( 1.00x) vp8_put_epel4_h6v6_ssse3: 62.5 ( 5.82x) New benchmarks: vp8_put_epel4_h6_c: 116.4 ( 1.00x) vp8_put_epel4_h6_ssse3: 16.0 ( 7.29x) vp8_put_epel4_h6v4_c: 280.9 ( 1.00x) vp8_put_epel4_h6v4_ssse3: 44.3 ( 6.33x) vp8_put_epel4_h6v6_c: 365.6 ( 1.00x) vp8_put_epel4_h6v6_ssse3: 53.1 ( 6.89x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	3135bc0d3a	avcodec/x86/vp8dsp: Don't use MMX registers in ff_put_vp8_epel4_h4_ssse3 Doubling the register width allows to use only one pshufb and pmaddubsw. Old benchmarks: vp8_put_epel4_h4_c: 82.8 ( 1.00x) vp8_put_epel4_h4_ssse3: 13.9 ( 5.96x) New benchmarks: vp8_put_epel4_h4_c: 82.7 ( 1.00x) vp8_put_epel4_h4_ssse3: 11.7 ( 7.08x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	714cbf1c70	avcodec/x86/vp8dsp: Don't use MMX registers in ff_put_vp8_epel4_v4_ssse3 Switching to xmm registers allows to process two rows in parallel, leading to speedups. It is also ABI compliant (no more missing emms). Old benchmarks: vp8_put_epel4_v4_c: 96.8 ( 1.00x) vp8_put_epel4_v4_ssse3: 28.2 ( 3.43x) New benchmarks: vp8_put_epel4_v4_c: 95.1 ( 1.00x) vp8_put_epel4_v4_ssse3: 22.8 ( 4.17x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	f017806829	avcodec/x86/vp8dsp: Don't use MMX registers in ff_put_vp8_epel4_v6_ssse3 Switching to xmm registers allows to process two rows in parallel, leading to speedups. It is also ABI compliant (no more missing emms). Old benchmarks: vp8_put_epel4_v6_c: 132.8 ( 1.00x) vp8_put_epel4_v6_ssse3: 34.3 ( 3.87x) New benchmarks: vp8_put_epel4_v6_c: 131.5 ( 1.00x) vp8_put_epel4_v6_ssse3: 27.1 ( 4.86x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	7411998757	avcodec/x86/vp8dsp: Avoid unpacking multiple times Always pair row i with row i+2 for the vertical four-tap filter and row i+3 for the vertical six-tap filter (instead of pairing the first with the sixth, the second with the third and the fourth and the fifth). This allows to unpack each row only once instead of (at most) three times. Old benchmarks: vp8_put_epel4_v4_c: 98.4 ( 1.00x) vp8_put_epel4_v4_ssse3: 28.6 ( 3.44x) vp8_put_epel4_v6_c: 131.6 ( 1.00x) vp8_put_epel4_v6_ssse3: 38.5 ( 3.42x) vp8_put_epel8_v4_c: 362.5 ( 1.00x) vp8_put_epel8_v4_sse2: 63.8 ( 5.68x) vp8_put_epel8_v4_ssse3: 44.4 ( 8.16x) vp8_put_epel8_v6_c: 538.3 ( 1.00x) vp8_put_epel8_v6_sse2: 86.5 ( 6.22x) vp8_put_epel8_v6_ssse3: 57.0 ( 9.44x) vp8_put_epel16_v6_c: 1044.6 ( 1.00x) vp8_put_epel16_v6_sse2: 158.0 ( 6.61x) vp8_put_epel16_v6_ssse3: 106.7 ( 9.79x) New benchmarks: vp8_put_epel4_v4_c: 100.0 ( 1.00x) vp8_put_epel4_v4_ssse3: 28.4 ( 3.52x) vp8_put_epel4_v6_c: 131.7 ( 1.00x) vp8_put_epel4_v6_ssse3: 34.3 ( 3.84x) vp8_put_epel8_v4_c: 364.4 ( 1.00x) vp8_put_epel8_v4_sse2: 63.7 ( 5.72x) vp8_put_epel8_v4_ssse3: 43.3 ( 8.42x) vp8_put_epel8_v6_c: 550.2 ( 1.00x) vp8_put_epel8_v6_sse2: 86.4 ( 6.37x) vp8_put_epel8_v6_ssse3: 52.9 (10.40x) vp8_put_epel16_v6_c: 1052.5 ( 1.00x) vp8_put_epel16_v6_sse2: 158.3 ( 6.65x) vp8_put_epel16_v6_ssse3: 98.9 (10.64x) Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	24cdd4100d	avcodec/x86/vp8dsp_init: Remove unused macro Forgotten in `6a551f1405`. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	76900089fb	avcodec/x86/vp8dsp: Avoid reload Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	86aa1b81ec	avcodec/x86/vp8dsp: Increment src pointer earlier Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	e59ed3470d	avcodec/x86/vp8dsp: Directly use negated stride There is a register available. No change in benchmarks here. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:37 +01:00
Andreas Rheinhardt	8fb6b0c733	avcodec/x86/vp8dsp: Don't use MMX registers in put_vp8_pixels8 Use GPRs on x64 and xmm registers else (using GPRs reduces codesize). This avoids clobbering the floating point state and therefore no longer breaks the ABI. No change in benchmarks here. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:36 +01:00
Andreas Rheinhardt	ed5e0f9c68	avcodec/x86/vp8dsp: Remove MMXEXT functions overridden by SSSE3 SSSE3 is already quite old (introduced 2006 for Intel, 2011 for AMD), so that the overwhelming majority of our users (particularly those that actually update their FFmpeg) will be using the SSSE3 versions. This commit therefore removes the MMX(EXT) functions overridden by them (which don't abide by the ABI) to get closer to a removal of emms_c. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-04 15:17:36 +01:00
Lynne	9b14ea0aa1	vulkan_dpx: fix alignment issue 12-bit images apparently require mod-32 alignment for each line. Go figure.	2025-12-04 15:08:46 +01:00
Oliver Chang	d6458f6a8b	avcodec/aacdec: Fix heap-use-after-free in USAC decoding A heap-use-after-free vulnerability was identified in `libavcodec/aac/aacdec.c`. When `che_configure` frees a `ChannelElement` (`ac->che[type][id]`), it failed to clear all references to it in `ac->tag_che_map`. `ac->tag_che_map` caches pointers to `ChannelElement`s and can contain cross-type mappings (e.g., a `TYPE_SCE` tag mapping to a `TYPE_LFE` element). In a USAC stream reconfiguration scenario, an LFE element was freed, but a stale pointer remained in `ac->tag_che_map`. Subsequent calls to `ff_aac_get_che` returned this dangling pointer, leading to a crash in `decode_usac_core_coder`. This commit fixes the issue by iterating over the entire `ac->tag_che_map` in `che_configure` and clearing any entries that point to the `ChannelElement` about to be freed, ensuring no dangling pointers remain. Fixes: https://issues.oss-fuzz.com/issues/440220467	2025-12-04 09:34:32 +00:00
Xia Tao	7922d4ca7d	avcodec/wasm/hevc: fix typo in butterfly macro Signed-off-by: Xia Tao <xiatao@gmail.com>	2025-12-04 08:40:43 +00:00
stevxiao	7b2ae2ccf7	avcodec/d3d12va_encode: add intra refresh support for d3d12va encode Intra refresh is a technique that gradually refreshes the video by encoding rows or regions as intra macroblocks/CTUs spread over multiple frames, rather than using periodic I-frames. This provides better error resilience for video streaming while maintaining more consistent bitrate. Disable Intra Refresh (This is the default) ffmpeg -init_hw_device d3d12va -hwaccel d3d12va -hwaccel_output_format d3d12 \ -i input.mp4 \ -c:v h264_d3d12va \ -intra_refresh_mode none \ -intra_refresh_duration 30 \ -g 60 \ output.h264 Enable Intra Refresh ffmpeg -init_hw_device d3d12va -hwaccel d3d12va -hwaccel_output_format d3d12 \ -i input.mp4 \ -c:v h264_d3d12va \ -intra_refresh_mode row_based \ -intra_refresh_duration 30 \ -g 60 \ output.h264 Parameters - `-intra_refresh_mode`: Set to `row_based` to enable row-based intra refresh, or `NONE` to disable - `-intra_refresh_duration`: Number of frames over which to spread the intra refresh (default: 0 = use GOP size) - `-g`: GOP size (should typically be larger than intra refresh duration)	2025-12-04 08:26:26 +00:00
Michael Niedermayer	12e7d095b1	Revert "avformat/rawdec: set framerate in codec parameters" Fixes single image videos this works and creates our single image video ./ffmpeg -i lena.pnm /tmp/file.m2v this fails after `3d96d83a0a`: ./ffmpeg -i /tmp/file.m2v /tmp/file.jpg -y This reverts commit `3d96d83a0a`.	2025-12-04 01:59:04 +00:00
Kacper Michajłow	9ed71a837b	avutil/vulkan: fix device memory size truncation size_t cannot fit VK_WHOLE_SIZE on 32-bit builds. Fixes: warning: conversion from 'long long unsigned int' to 'size_t' {aka 'unsigned int'} changes value from '18446744073709551615' to '4294967295' Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-03 23:45:44 +00:00
Kacper Michajłow	3cc10b5ff6	fftools/cmdutils: use strcpy directly, the length is computed already There is no need to scan for NULL, if we inject it ourselves. Fixes: warning: 'strncat' specified bound 10 equals source length [-Wstringop-overflow=] Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-03 23:45:44 +00:00
Kacper Michajłow	f7b7972f78	avdevice/gdigrab: suppress int to pointer cast warning Fixes: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-12-03 23:45:44 +00:00
wutno	f4312ea138	avformat/xmv: Handle zero sized packet at end of file Some XMVs introduce a blank packet at the end of the stream. Previously, we didn't account for this and returned AVERROR_INVALIDDATA, indicating an issue with the file. Instead, let's check for this and close out with AVERROR_EOF.	2025-12-03 22:09:20 +00:00
Lynne	a8e8daa276	hwcontext_vulkan: fix final error to let old header files work ........	2025-12-03 22:34:32 +01:00
Jack Lau	c4b050fd67	tests/fate/filter-video: add two feedback tests - Add fate-filter-feedback-yadif - add fate-filter-feedback-hflip Signed-off-by: Jack Lau <jacklau1222gm@gmail.com>	2025-12-03 21:23:51 +00:00
Jack Lau	3f0842294f	avfilter/vf_feedback: fix feedback block Fix #20940. The feedback filter and its sub-filter both request frames from each other, causing blocking since `4440e499ba`. The feedback filter should request inputs[1] only once rather than requesting frames continuously, which causes the blocking. This patch adds a check for whether feedback has already requested inputs[1] via ff_outlink_frame_wanted(ctx->outputs[1]); if so, it exits and waits for inputs[0], because that means more input frames are needed to proceed. Signed-off-by: Jack Lau <jacklau1222gm@gmail.com>	2025-12-03 21:23:51 +00:00
Lynne	bce14bb160	hwcontext_vulkan: fix compilation with older header versions	2025-12-03 21:22:54 +01:00
Oliver Chang	041d4f010e	libavcodec/prores_raw: Fix heap-buffer-overflow in decode_frame Fixes a heap-buffer-overflow in `decode_frame` where `header_len` read from the bitstream was not validated against the remaining bytes in the input buffer (`gb`). This allowed `gb_hdr` to be initialized with a size exceeding the actual packet data, leading to an out-of-bounds read. The fix adds a check to ensure `bytestream2_get_bytes_left(&gb)` is greater than or equal to `header_len - 2` before initializing `gb_hdr`. Fixes: https://issues.oss-fuzz.com/issues/439711053	2025-12-03 16:40:02 +00:00
Andreas Rheinhardt	e3e3265034	tests/checkasm/mpegvideo_unquantize: Add missing const Fixes this test under UBSan: runtime error: call to function dct_unquantize_mpeg1_intra_c through pointer to incorrect function type 'void (*)(struct MpegEncContext *, short *, int, int)' I don't know how I could forget this. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-03 14:17:58 +01:00
Martin Storsjö	b98179cec6	avcodec/{arm,neon}/mpegvideo: Readd a missed initialization This was accidentally removed in `357fc5243c`. This fixes test failures when built with Clang and MSVC; surprisingly, the checkasm test did seem to pass when built with GCC. Clang and MSVC also warn about the use of the uninitialized variable, while GCC didn't.	2025-12-03 13:53:54 +02:00
Andreas Rheinhardt	5d9270df7f	libavutil/internal: Remove {SIZE,PTRDIFF}_SPECIFIER Possible since `222127418b`. Reviewed-by: Kacper Michajłow <kasper93@gmail.com> Reviewed-by: Lynne <dev@lynne.ee> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-03 11:52:54 +01:00
Andreas Rheinhardt	c22c2c5e03	avcodec/mpegvideo: Port dct_unquantize_mpeg2_intra_mmx to SSE2 Benefits from wider registers. Benchmarks: dct_unquantize_mpeg2_intra_c: 228.2 ( 1.00x) dct_unquantize_mpeg2_intra_mmx: 28.2 ( 8.10x) dct_unquantize_mpeg2_intra_sse2: 18.4 (12.37x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-03 10:23:43 +01:00
Andreas Rheinhardt	6e2153111d	avcodec/x86/mpegvideo: Port dct_unquantize_mpeg2_inter_mmx to SSSE3 Benefits from wider registers, pabsw and psignw. Benchmarks: dct_unquantize_mpeg2_inter_c: 131.2 ( 1.00x) dct_unquantize_mpeg2_inter_mmx: 50.2 ( 2.62x) dct_unquantize_mpeg2_inter_ssse3: 20.5 ( 6.38x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-03 10:23:43 +01:00
Andreas Rheinhardt	60084b1369	avcodec/x86/mpegvideo: Port MPEG-1 unquantize functions to SSSE3 Benefits from wider registers and pabsw, psignw. Benchmarks: dct_unquantize_mpeg1_inter_c: 343.0 ( 1.00x) dct_unquantize_mpeg1_inter_mmx: 50.6 ( 6.78x) dct_unquantize_mpeg1_inter_ssse3: 17.2 (19.94x) dct_unquantize_mpeg1_intra_c: 352.1 ( 1.00x) dct_unquantize_mpeg1_intra_mmx: 48.8 ( 7.22x) dct_unquantize_mpeg1_intra_ssse3: 19.5 (18.03x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-12-03 10:23:43 +01:00
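
The use-after-free fix described in `d6458f6a8b` follows a general pattern: before freeing an object, walk the whole non-owning cache and clear every entry that points at it, including entries filed under a different type. The following is a minimal sketch of that pattern only, not the actual aacdec code. The `Decoder` and `Element` structs, the array sizes, and `free_element()` are hypothetical names chosen for illustration; only `che` and `tag_che_map` mirror field names mentioned in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define N_TYPES 4   /* hypothetical: number of element types */
#define N_IDS   16  /* hypothetical: element ids per type */

typedef struct Element { int dummy; } Element;

typedef struct Decoder {
    Element *che[N_TYPES][N_IDS];         /* owning pointers */
    Element *tag_che_map[N_TYPES][N_IDS]; /* non-owning cache; may map across types */
} Decoder;

/* Free dec->che[type][id] and drop every cached reference to it. */
static void free_element(Decoder *dec, int type, int id)
{
    assert(dec && type >= 0 && type < N_TYPES && id >= 0 && id < N_IDS);

    Element *victim = dec->che[type][id];
    if (!victim)
        return;

    /* Scan the entire map, not just [type][id]: a tag of another
     * type may have been mapped to this element. */
    for (int t = 0; t < N_TYPES; t++)
        for (int i = 0; i < N_IDS; i++)
            if (dec->tag_che_map[t][i] == victim)
                dec->tag_che_map[t][i] = NULL;

    free(victim);
    dec->che[type][id] = NULL;
}
```

Clearing only `tag_che_map[type][id]` would miss the cross-type case the commit message describes, which is why the sketch scans every slot.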


122018 Commits
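
The bounds check described in `041d4f010e` can be illustrated in isolation: a length field read from the input must be validated against the bytes that actually remain before a sub-reader is set up over it. The sketch below is not the prores_raw decoder. It assumes, purely for illustration, a 16-bit big-endian length field that counts its own two bytes, and `split_header()` is a hypothetical helper; only the comparison of the remaining bytes against `header_len - 2` comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: on success returns 0 and points *hdr at the
 * header payload (*hdr_size bytes); returns -1 if the declared
 * header length does not fit in the buffer. */
static int split_header(const uint8_t *buf, size_t size,
                        const uint8_t **hdr, size_t *hdr_size)
{
    if (size < 2)
        return -1;

    /* Assumed layout: 16-bit big-endian length that includes itself. */
    unsigned header_len = ((unsigned)buf[0] << 8) | buf[1];
    size_t left = size - 2; /* bytes remaining after the length field */

    /* Reject lengths shorter than the field itself (avoids underflow
     * in header_len - 2) and lengths that exceed the remaining data. */
    if (header_len < 2 || left < header_len - 2)
        return -1;

    *hdr      = buf + 2;
    *hdr_size = header_len - 2;
    return 0;
}
```

With this check in place, a packet that declares a 16-byte header but carries only two payload bytes is rejected instead of producing a reader that extends past the end of the buffer.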