12305 Commits

Author SHA1 Message Date
Rémi Denis-Courmont
7ee5c907e5 lavfi/vf_blackdetect: R-V V count_pixels_16
SpacemiT X60:
blackdetect16_c:                                      7171.0 ( 1.00x)
blackdetect16_rvv_i32:                                 383.6 (18.69x)
2025-12-16 17:30:29 +02:00
Rémi Denis-Courmont
570908af09 lavfi/vf_blackdetect: R-V V count_pixels_8
SpacemiT X60:
blackdetect8_c:                                      14911.0 ( 1.00x)
blackdetect8_rvv_i32:                                  369.5 (40.35x)
2025-12-16 17:30:23 +02:00
Niklas Haas
4ac3b3a6da avfilter/vf_libplacebo: rotate all input frames, not just reference
In commit 6e0034ab7e, the image crop adjustment
was moved after the fitting logic. However, this moved the adjustment inside the
`if (src == ref)` branch, so the same un-rotation was no longer applied to input
frames that are *not* the reference frame.

Fix this by pulling the logic back out of the branch again. While we could just
move it after the fitting logic, I think the intent of the code is clearer if we
preserve the (rotated) crop rect as a separate variable `crop_orig`.

Fixes: 6e0034ab7e
2025-12-15 11:29:58 +01:00
Gyan Doshi
b89f1581b0 lavfi/sidedata: fix typo
S12M_TIMECOD --> S12M_TIMECODE

Old version is marked deprecated.
Should be removed at lavfi bump to 12.
2025-12-14 12:41:00 +05:30
Ruikai Peng
cc43670268 avfilter/x86/vf_noise: Use unaligned access
Regression since: 3ba570de8b (port from MMX to SSE2).

The SSE2 inline asm in libavfilter/x86/vf_noise.c (line_noise_sse2 and
line_noise_avg_sse2) uses aligned loads/stores (movdqa, movntdq) but never
checks pointer alignment. When the filter reuses an input frame (common
path when av_frame_is_writable() is true), it may receive misaligned data
from upstream filters that adjust frame->data[i] in place, notably vf_crop:

- vf_crop adjusts plane pointers by arbitrary byte offsets
(frame->data[plane] += ...), so an x offset of 1 on 8-bit formats produces
a 1-byte misalignment.
- The noise filter then calls the SSE2 path directly on those pointers
without realigning or falling back.

Repro on x86_64/SSE2 (current HEAD at that commit):

./ffmpeg -v error -f lavfi -i testsrc=s=320x240:rate=1 \
-vf "format=yuv420p,crop=w=319:x=1:h=240:exact=1,noise=alls=50" \
-frames:v 1 -f null -

This crashes with SIGSEGV at the aligned load in line_noise_sse2 (movdqa
(%r9,%rax),%xmm0; effective address misaligned by 1 byte).

Impact: denial of service via crafted filtergraphs (e.g., crop + noise).
Applies to planar 8-bit formats where upstream filters can shift data
pointers without reallocating.

Found-by: Pwno OSS Team
2025-12-12 19:25:21 +00:00
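The alignment problem described in cc43670268 can be illustrated with a small portable sketch. It does not use the filter's actual SSE2 code; `is_aligned` and `crop_plane` are hypothetical helpers that only model how a 1-byte crop offset breaks the 16-byte alignment that aligned loads such as movdqa require.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is aligned to `align` bytes (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical stand-in for an in-place crop on an 8-bit plane:
 * the plane pointer is simply advanced by x_offset bytes. */
static const uint8_t *crop_plane(const uint8_t *data, size_t x_offset)
{
    return data + x_offset;
}
```

With a 16-byte aligned buffer, `crop_plane(buf, 1)` yields a pointer for which `is_aligned(p, 16)` is false, so code operating on it needs unaligned loads/stores or a fallback path.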
Niklas Haas
440d58f5b1 avfilter/avfiltergraph: add missing newlines to format printing
2025-12-09 21:31:58 +00:00
Niklas Haas
978a0821ee avfilter/avfiltergraph: always retry format negotiation after auto-filters
There is an edge case not covered by the current logic: If there is only
a single auto-filter inserted, but the auto-inserted filter is incompatible
with a *different* format attribute (after settling the previous formats),
we may need a second auto-filter (e.g. `scale`) to settle the newly introduced
incompatibility.

A regression test demonstrating the issue is added.
2025-12-09 21:31:58 +00:00
Kacper Michajłow
cca872b6fd avfilter/vf_libopencv: make sure there is space for null-terminator in shape_str
Fixes: warning: 'sscanf' may overflow; destination buffer in argument 7 has size 32, but the corresponding specifier may require size 33 [-Wfortify-source]
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-08 21:31:13 +00:00
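The warning quoted in cca872b6fd comes from a general sscanf rule: a `%32s`-style conversion may store 32 characters plus a terminating null byte, so the destination needs 33 bytes. The following is a minimal sketch of that rule; `parse_shape` and its simplified "COLSxROWS/SHAPE" format are assumptions for illustration, not the filter's actual parser.

```c
#include <stdio.h>

/* Parses a hypothetical "COLSxROWS/SHAPE" spec; returns 1 on success.
 * The field width (32) is one less than the buffer size (33), which
 * leaves room for the null terminator that sscanf always appends. */
static int parse_shape(const char *spec, int *cols, int *rows, char shape[33])
{
    return sscanf(spec, "%dx%d/%32s", cols, rows, shape) == 3;
}
```

An over-long shape name is truncated to 32 characters instead of overflowing the buffer.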
Kacper Michajłow
1fa5e001bc avfilter/vf_neighbor_opencl: add error condition when filter name doesn't match
This cannot really happen, but to suppress compiler warnings, we can
just return AVERROR_BUG here.

Fixes: warning: variable 'kernel_name' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-08 21:31:13 +00:00
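The pattern behind 1fa5e001bc can be sketched as follows: an if/else-if chain that assigns a variable gets a final else branch that returns an error, so the variable is never read uninitialized. The names below (`pick_kernel`, `ERR_BUG`, and the filter and kernel strings) are illustrative assumptions; `ERR_BUG` merely stands in for AVERROR_BUG.

```c
#include <string.h>

#define ERR_BUG (-1) /* hypothetical stand-in for AVERROR_BUG */

/* Selects a kernel name from the filter name. The final else branch makes
 * every path either set *kernel_name or return an error. */
static int pick_kernel(const char *filter_name, const char **kernel_name)
{
    if (!strcmp(filter_name, "erosion_opencl")) {
        *kernel_name = "erosion_global";
    } else if (!strcmp(filter_name, "dilation_opencl")) {
        *kernel_name = "dilation_global";
    } else {
        return ERR_BUG; /* should be unreachable, but silences the warning */
    }
    return 0;
}
```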
Timo Rothenpieler
338889c0d9 avfilter/vf_scale_d3d12: fix integer overflow in input framerate calculation
Also removes pointless intermediate variables that caused
the overflow and truncation to happen in the first place.

Fixes #YWH-PGM40646-1
2025-12-08 14:22:16 +01:00
Araz Iusubov
c4d22f2d2c avfilter: D3D12 scale video filter support
This filter allows scaling of video frames using Direct3D 12 acceleration.

Example:
    ffmpeg -hwaccel d3d12va -hwaccel_output_format d3d12 \
           -i input.mp4 -vf scale_d3d12=1920:1280 \
           -c:v hevc_d3d12va -y output_1920x1280.mp4
2025-12-07 21:22:23 +00:00
Marton Balint
315446da2f avfilter/af_amerge: fix indentation
Signed-off-by: Marton Balint <cus@passwd.hu>
2025-12-07 19:36:49 +00:00
Marton Balint
3d667b147a avfilter/af_amerge: add layout_mode option to control output channel layout
Signed-off-by: Marton Balint <cus@passwd.hu>
2025-12-07 19:36:49 +00:00
Marton Balint
4e0a8b745a avfilter/af_amerge: rework routing calculation
No change in functionality.

Signed-off-by: Marton Balint <cus@passwd.hu>
2025-12-07 19:36:49 +00:00
Marton Balint
e8b10a9b09 avfilter/af_amerge: fix possible crash with custom layouts
The check whether a native layout can be created from the sources was incomplete and
caused a crash with custom layouts if the layout contained a native channel
multiple times, as in this example command line:

ffmpeg -lavfi "sine[a0];sine,pan=FL+FL[a1];[a0][a1]amerge[aout]" -map "[aout]" -t 1 -f framecrc -

Signed-off-by: Marton Balint <cus@passwd.hu>
2025-12-07 19:36:49 +00:00
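One way to picture the kind of check described in e8b10a9b09 is a bitmask test: a native layout is a mask with one bit per channel, so it cannot represent the same channel twice (as in FL+FL). This is only a sketch under that assumption; `channels_fit_native_mask` is a hypothetical helper and not the code used in af_amerge.

```c
#include <stdint.h>

/* Returns 1 and writes the mask if every channel id is in [0, 64) and
 * appears at most once; returns 0 otherwise. */
static int channels_fit_native_mask(const int *ids, int nb, uint64_t *mask_out)
{
    uint64_t mask = 0;

    for (int i = 0; i < nb; i++) {
        uint64_t bit;

        if (ids[i] < 0 || ids[i] >= 64)
            return 0;              /* not representable as a mask bit */
        bit = UINT64_C(1) << ids[i];
        if (mask & bit)
            return 0;              /* duplicate channel, e.g. FL+FL */
        mask |= bit;
    }
    *mask_out = mask;
    return 1;
}
```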
James Almer
52c84b06d5 avfilter/f_sidedata: also handle global side data in filter links
Should fix issue #21071

Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-04 13:50:45 -03:00
Jack Lau
3f0842294f avfilter/vf_feedback: fix feedback block
Fix #20940

The feedback filter and its sub-filter both request frames
from each other, causing them to block since 4440e499ba.

The feedback filter should request inputs[1] only once
rather than requesting frames continuously, which causes the blocking.

This patch adds a check for whether feedback has already requested
inputs[1], via ff_outlink_frame_wanted(ctx->outputs[1]);
if so, it exits and waits for inputs[0], because that means
more input frames are needed to proceed.

Signed-off-by: Jack Lau <jacklau1222gm@gmail.com>
2025-12-03 21:23:51 +00:00
Andreas Rheinhardt
5d9270df7f libavutil/internal: Remove {SIZE,PTRDIFF}_SPECIFIER
Possible since 222127418b.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-03 11:52:54 +01:00
Andreas Rheinhardt
7356981bec avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled
To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.

This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows removing the HAVE_X86ASM checks
in these init files.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Kacper Michajłow
2456a39581 avfilter/avfiltergraph: fix constant string comparision
It's not guaranteed that the conversion filter name string will be
deduplicated to the same memory location. While this is a common
optimization, we cannot rely on it always happening.

Fixes regression since 8b375b2ffd.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-11-30 03:02:41 +01:00
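The issue fixed by 2456a39581 is the difference between comparing string addresses and comparing string contents. A minimal sketch, with hypothetical helper names:

```c
#include <string.h>

/* Compares addresses only: true just when both point to the same object. */
static int same_name_by_pointer(const char *a, const char *b)
{
    return a == b;
}

/* Compares contents: true whenever the two strings hold the same text. */
static int same_name_by_content(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Two separate arrays that both contain "scale" compare equal by content but not by pointer. Whether two identical string literals share one address is left to the compiler, which is why the pointer comparison is unreliable.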
Niklas Haas
04eeaeed11 avfilter/vf_libplacebo: also rotate SAR when fitting
2025-11-29 08:45:24 +00:00
Niklas Haas
f83fdad550 avfilter/vf_libplacebo: fix math when AVRationals are undefined
2025-11-29 08:45:24 +00:00
Niklas Haas
6e0034ab7e avfilter/vf_libplacebo: un-rotate image crop after fitting
When combining rotation with a FIT_ mode other than FIT_FILL, the fitting
logic was operating on the un-rotated rects, when it should have been
operating on the rotated (output) rects.
2025-11-29 08:45:24 +00:00
Piotr Pawlowski
372dab2a4d All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if
2025-11-28 19:52:51 +01:00
Diego de Souza
75b8567591 avfilter/scale_cuda: Add support for 4:2:2 chroma subsampling
The supported YUV pixel formats were separated between planar
and semiplanar. This approach reduces the number of CUDA kernels
for all pixel formats.

This patch:
1. Adds support for YUV 4:2:2 planar and semi-planar formats:
        yuv422p, yuv422p10, nv16, p210, p216
2. Implements new conversion structures and kernel definitions
        for planar and semi-planar formats

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2025-11-27 22:11:57 +01:00
Diego de Souza
04b5e25d35 avfilter/hwupload_cuda: Expands pixel formats support
Add support for uploading additional pixel formats to NVIDIA GPUs:
- Planar formats (yuv420p10, yuv422p, yuv422p10, yuv444p10)
- Semiplanar formats (nv16, p210, p216)

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2025-11-27 22:11:57 +01:00
Niklas Haas
623669a02c avfilter/buffersrc: add av_buffersrc_get_status()
There is currently no way for API users to know that a buffersrc is no longer
accepting input, except by trying to feed it a frame and seeing what happens.

Of course, this is not possible if the user does not *have* a frame to feed,
but may still wish to know if the filter is still accepting input or not.

Since passing `frame == NULL` to `av_buffersrc_add_frame()` is already treated
as closing the input, we are left with no choice but to introduce a new
function for this.

We don't explicitly return the result of `ff_outlink_get_status()` to avoid
leaking internal status codes, and instead translate them all to AVERROR(EOF).
2025-11-26 13:15:16 +00:00
Niklas Haas
f3346ca6f7 avfilter/x86/f_ebur128: only use filter_channels_avx for >= 2 channels
The approach of this ASM routine is to process two channels at a time using
AVX instructions. Obviously, there is no point in doing this if there is only
a single channel; in which case the scalar loop would be better.

Fixes a performance regression when filtering mono audio on certain CPUs,
notably e.g. the Intel N100.
2025-11-25 22:13:57 +00:00
Kacper Michajłow
a75b15a4ab avfilter/vf_drawvg: round color values to avoid differences on some platforms
This ensures consistent color conversion between double and u8 and
guarantees that values remain consistent across different platforms,
especially when x87 math is used.

Note that libcairo also performs rounding internally when converting
doubles to integers, see _cairo_color_double_to_short().

Fixes: fate-filter-drawvg-interpreter
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-11-25 22:32:50 +01:00
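The effect of rounding versus truncating in a double-to-u8 conversion, as discussed in a75b15a4ab, can be sketched like this. Both helpers are hypothetical and assume a color component in the range [0, 1]; they are not the drawvg code itself.

```c
#include <stdint.h>

/* Truncating conversion: a result that lands slightly below an integer
 * (for example because of different intermediate precision) drops by one. */
static uint8_t color_to_u8_trunc(double v)
{
    return (uint8_t)(v * 255.0);
}

/* Rounding conversion with clamping: adds 0.5 before truncating, so small
 * floating-point differences no longer change the result. */
static uint8_t color_to_u8_round(double v)
{
    if (v <= 0.0)
        return 0;
    if (v >= 1.0)
        return 255;
    return (uint8_t)(v * 255.0 + 0.5);
}
```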
Gyan Doshi
2b221fdb4a avfilter/zscale: add support for resize filter spline64
Fixes #20928
2025-11-25 12:42:41 +05:30
Andreas Rheinhardt
2a90e7d725 av{codec,util}/tests: Remove pointless undefs
Before commit e96d90eed6, lavu/internal.h
redefined various discouraged/forbidden functions to induce
compilation failures upon use, e.g.
 #define malloc please_use_av_malloc
In order to use these functions, some files had to undefine these
macros. This commit removes the remaining pointless undefs.

Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-24 16:48:31 +01:00
Anders Rein
7411e902da avfilter/f_select: Added activate for aselect
During migration to the activation filter API the aselect filter was
accidentally turned into a no-op filter.
2025-11-22 18:36:41 +00:00
Zhao Zhili
a5cc0e5c9e avfilter/vf_drawtext: fix call GET_UTF8 with invalid argument
For GET_UTF8(val, GET_BYTE, ERROR), val has type uint32_t, so
GET_BYTE must yield an unsigned value; otherwise sign
extension happens in val = (GET_BYTE) and GET_UTF8 takes
the error path.

This bug incidentally masked another bug, where hb_buffer_add_utf8
was being called with an incorrect argument, allowing drawtext to
function correctly on x86 and macOS ARM, which define char as
signed. However, on Linux and Android ARM environments, where
char is unsigned by default, GET_UTF8 returns the correct
value, which unexpectedly revealed issue #20906.
2025-11-19 17:46:06 +00:00
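The sign-extension effect described in a5cc0e5c9e can be reproduced in isolation. The helpers below are hypothetical; `signed char` is used explicitly so that the behavior does not depend on the platform's default char signedness.

```c
#include <stdint.h>

/* Reads a byte through a signed char: bytes >= 0x80 become negative and
 * are sign-extended when widened to uint32_t. */
static uint32_t byte_via_signed_char(const char *s)
{
    return (uint32_t)(signed char)*s;
}

/* Reads a byte through an unsigned char: the value stays in 0..255. */
static uint32_t byte_via_unsigned_char(const char *s)
{
    return (uint32_t)(unsigned char)*s;
}
```

For the UTF-8 lead byte 0xC3, the signed variant yields 0xFFFFFFC3 on common two's-complement platforms, which a UTF-8 decoder would reject, while the unsigned variant yields 0xC3.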
Zhao Zhili
9bc3c572ea avfilter/vf_drawtext: fix incorrect text length
According to the HarfBuzz documentation, hb_buffer_add_utf8 expects the
number of bytes, not the number of Unicode characters:
hb_buffer_add_utf8(buf, text, strlen(text), 0, strlen(text));

Fix issue #20906.
2025-11-19 17:46:06 +00:00
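The distinction that matters in 9bc3c572ea, bytes versus Unicode characters, shows up for any non-ASCII UTF-8 text. The sketch below does not call HarfBuzz; `utf8_codepoints` is a hypothetical helper that counts code points in valid UTF-8 so the two lengths can be compared.

```c
#include <stddef.h>
#include <string.h>

/* Counts code points in a valid UTF-8 string by skipping continuation
 * bytes, which all have the bit pattern 10xxxxxx. */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For "h\xC3\xA9llo" (the word "héllo"), strlen() returns 6 bytes while the code point count is 5; passing the smaller number where a byte length is expected would cut the text short.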
Stefan Breunig
f8bfc20281 avfilter/vf_frei0r: fix time when input is realigned
av_frame_copy doesn't copy the input's PTS property, which resulted
in the frei0r filter always receiving the same static time.

Example that has a static distortion without patch:

ffmpeg -filter_complex "testsrc2=s=328x240:d=5,frei0r=distort0r" out.mp4
2025-11-18 21:26:36 +00:00
Carl Hetherington via ffmpeg-devel
1eb2cbd865 avfilter/f_ebur128: Fix incorrect ebur128 peak calculation.
Since 3b26b782ee it would only look at the
first channel.

Signed-off-by: Carl Hetherington <cth@carlh.net>
Reviewed-by: Niklas Haas <ffmpeg@haasn.xyz>
2025-11-18 08:40:08 +01:00
Andreas Rheinhardt
ddf443f1e9 avfilter/vf_fsppdsp: Fix left shifts of negative numbers
They are undefined behavior and UBSan warns about them
(in the checkasm test). Put the shifts in the constants
instead. This even gives a tiny speedup here:

Old benchmarks:
column_fidct_c:                                       3369.9 ( 1.00x)
column_fidct_sse2:                                     829.1 ( 4.06x)
New benchmarks:
column_fidct_c:                                       3304.2 ( 1.00x)
column_fidct_sse2:                                     827.9 ( 3.99x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
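The idea in ddf443f1e9 of putting the shifts into the constants can be sketched as follows. Left-shifting a negative signed value is undefined behavior in C, whereas multiplying by a constant that already includes the shift is well defined as long as the result fits. `K` and `SHIFT` are hypothetical values chosen for illustration, not constants from vf_fsppdsp.

```c
#include <stdint.h>

#define K     3 /* hypothetical multiplier */
#define SHIFT 2 /* hypothetical shift amount */

/* Instead of computing (x << SHIFT) * K, which is undefined for negative x,
 * fold the shift into the constant: K * (1 << SHIFT) is evaluated on
 * non-negative compile-time values. */
static int32_t mul_preshifted(int16_t x)
{
    return x * (K * (1 << SHIFT));
}
```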
Andreas Rheinhardt
f8bcea4946 avfilter/vf_fsppdsp: Remove pointless cast
Also don't cast const away and use a smaller scope.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
0c556a6b09 avfilter/vf_fspp: Pre-reorder threshold table
Avoids reordering at runtime.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
778ff97efa avfilter/vf_fspp: Make output endian-independent
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
f442145729 avfilter/vf_fspp: Avoid casts, effective-type violations
Maybe uint64_t has been used as a poor man's alignment specifier?
Anyway, reading a uint64_t via an lvalue of type int16_t (as happens
in the C versions of the dsp functions) is undefined behavior.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
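The effective-type problem mentioned in f442145729 arises when storage declared as uint64_t is read through an int16_t pointer. One well-defined alternative, shown here only as a sketch, is to copy the bytes with memcpy; `word_at` is a hypothetical helper, and the commit itself may instead have changed the declared types.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads the i-th 16-bit word (in memory order) out of a uint64_t object
 * without dereferencing an int16_t pointer to it; memcpy may legally
 * access the object representation of any type. */
static int16_t word_at(const uint64_t *packed, size_t i)
{
    int16_t w;

    memcpy(&w, (const unsigned char *)packed + i * sizeof(w), sizeof(w));
    return w;
}
```

Which word index maps to which bits of the uint64_t depends on the platform's byte order, so this sketch suits data that was also written word by word.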
Andreas Rheinhardt
c0648b2004 avfilter/x86/vf_spp: Fix comment
Forgotten in dcb28ed860.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
06b0dae51b avfilter/vf_fsppdsp: Constify
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
cc97f1e276 avfilter/vf_fspp: Fix effective type violation
Also don't use unnecessarily large alignment; it avoids having to align
the stack.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
3cd452cbf1 avfilter/x86/vf_fspp: Avoid stack on x64
Possible due to the amount of registers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:12 +01:00
Andreas Rheinhardt
ddd74276f8 avfilter/x86/vf_fspp: Port ff_column_fidct_mmx() to SSE2
It gains a lot because it has to operate on eight words;
it also saves 608B of .text here.

Old benchmarks:
column_fidct_c:                                       3365.7 ( 1.00x)
column_fidct_mmx:                                     1784.6 ( 1.89x)

New benchmarks:
column_fidct_c:                                       3361.5 ( 1.00x)
column_fidct_sse2:                                     801.1 ( 4.20x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:11 +01:00
Andreas Rheinhardt
63493bf0e0 avfilter/x86/vf_fspp: Put shifts into constants
This avoids some shift instructions and also gives us more headroom
in the registers. In fact, I have proven to myself that everything
that is supposed to fit into 16bits now actually does so.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:11 +01:00
Andreas Rheinhardt
66af18d06a avfilter/x86/vf_fspp: Make ff_column_fidct_mmx() bitexact
It currently is not, because the shortcut mode uses different rounding
than the C code (as well as the non-shortcut code).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 12:18:11 +01:00
Andreas Rheinhardt
1049a5fba8 avfilter/vf_fsppdsp: Reduce discrepancies between C code and x86 asm
The x86 assembly uses the following pattern to zero all
the values with abs<threshold:
    x -= threshold;
    x satu+= threshold (unsigned saturated addition)
    x += threshold
    x satu-= threshold (unsigned saturated subtraction)
The reference C code meanwhile zeroed everything
with abs <= threshold. This commit makes the C code behave
like the x86 assembly to reduce discrepancies between the two.

An alternative would be to require SSSE3, so that
one can use pabsw, pcmpgtw for abs>threshold, followed by
a pand with the original data. Or one could modify the thresholds
to make both equal.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 11:28:04 +01:00
Andreas Rheinhardt
d19050a1ae avfilter/vf_fsppdsp: Use restrict
It is possible because the requirements are fulfilled;
it is also beneficial performance and code-size wise.
For GCC 14 (with -O3), this reduced codesize by 26750B
here; for Clang 20, it was 432B.

Old benchmarks:
mul_thrmat_c:                                            4.3 ( 1.00x)
mul_thrmat_sse2:                                         4.3 ( 1.00x)
store_slice_c:                                        2810.8 ( 1.00x)
store_slice_sse2:                                      542.5 ( 5.18x)
store_slice2_c:                                       3817.0 ( 1.00x)
store_slice2_sse2:                                     410.4 ( 9.30x)

New benchmarks:
mul_thrmat_c:                                            4.3 ( 1.00x)
mul_thrmat_sse2:                                         4.3 ( 1.00x)
store_slice_c:                                        1510.1 ( 1.00x)
store_slice_sse2:                                      545.2 ( 2.77x)
store_slice2_c:                                       1763.5 ( 1.00x)
store_slice2_sse2:                                     408.3 ( 4.32x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-17 11:28:04 +01:00
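As a rough illustration of what d19050a1ae relies on: `restrict` promises the compiler that the qualified pointers do not alias, so it does not have to reload source values after every store and can vectorize more freely. `add_rows` is a hypothetical function, not one of the fspp DSP routines.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds src to dst element-wise. The caller must guarantee that the two
 * arrays do not overlap; that is the requirement restrict expresses. */
static void add_rows(int16_t *restrict dst, const int16_t *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int16_t)(dst[i] + src[i]);
}
```

Calling it with overlapping buffers would violate the promise and is undefined behavior, which is why the commit notes that the requirements have to be fulfilled first.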