122011 Commits

Author SHA1 Message Date
Andreas Rheinhardt
e7a629049f avcodec/{arm,neon}/mpegvideo: Use intra scantable to unquant H263 intra
Forgotten in 70a7df049c.

Using the wrong scantable matters for codecs for which both scantables
can differ, namely the MPEG-4 decoder and the WMV1/2 codecs.

For WMV1 it can lead to wrong output in case the IDCT permutation
is FF_IDCT_PERM_PARTTRANS, because in this case the entries of
of the intra scantable's raster end are not always <= the corresponding
entries of the inter scantable's raster end when the former is
initialized via ff_wmv1_scantable[1] and the latter via ff_wmv1_scantable[0].
FF_IDCT_PERM_PARTTRANS is used iff the Neon IDCT is used (for both arm
and aarch64).* Said IDCT is not used during FATE, so that this issue
went unnoticed.

WMV2 uses the same scantables, but uses a custom IDCT
which always uses FF_IDCT_PERM_NONE for which the inter_scantable,
so that the output is always correct for it.

The scantable for MPEG-4 can change mid-stream (for the decoder),
but since c41818dc5d only the intra
scantable is updated, so that both scantables can get out of sync.
In such a case the unquantize intra functions could unquantize
an incorrect number of coefficients.

Using raster_end of the wrong scantable can also lead to an
unnecessarily large amount of coefficients unquantized.

*: FF_IDCT_PERM_SIMPLE and FF_IDCT_PERM_TRANSPOSE would also not work,
but they are not used at all by arm and aarch64.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-03 10:20:42 +01:00
Andreas Rheinhardt
5d41d3e21d avcodec/ppc/mpegvideo_altivec: Reindent after the previous commit
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-03 10:20:42 +01:00
Andreas Rheinhardt
011ef7fc65 avcodec/ppc/mpegvideo_altivec: Split intra/inter unquantizing
Don't use a single function that checks mb_intra. Forgotten
in d50635cd24.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-03 10:20:42 +01:00
Andreas Rheinhardt
358c569b05 avcodec/mpegvideo_unquantize: Constify MPVContext pointee
Also use MPVContext instead of MpegEncContext.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-03 10:20:41 +01:00
yuanhecai
f7551e7505 avcodec: fix checkasm-hpeldsp failed on LA 2025-12-03 01:36:01 +00:00
Zhao Zhili
413346bd06 tests/fate/ffmpeg: add test for -force_key_frames scd_metadata 2025-12-02 03:03:55 +00:00
Zhao Zhili
540aacf759 fftools/ffmpeg: add force key frame by scdet metadata support
For example:

./ffmpeg -hwaccel videotoolbox \
	-i input.mp4 -c:a copy \
	-vf scdet=threshold=10 \
	-c:v h264_videotoolbox \
	-force_key_frames scd_metadata \
	-g 1000 -t 30 output.mp4
2025-12-02 03:03:55 +00:00
Thomas Gritzan
27e94281d1 libavdevice/decklink: add support for DeckLink SDK 14.3
This patch adds support for DeckLink SDK 14.3 and newer by using
the legacy interfaces in the header <DeckLinkAPI_v14_2_1.h>.

The missing QueryInterface implementations are also provided.
2025-12-01 21:37:12 +00:00
averne
1e90047fe6 vulkan: fix host copy stride
memoryRowLength is is texels, not bytes
2025-12-01 15:40:40 +01:00
llyyr
7043522fe0 avutil/hwcontext_d3d12va: use hwdev context for logging
This fixes warning about av_log being called with NULL AVClass. This is
also an API violation

Fixes: https://trac.ffmpeg.org/ticket/11335
2025-12-01 03:15:25 +00:00
Lynne
932a872dbc hwcontext_vulkan: fix VkImageToMemoryCopyEXT.sType
It was copy pasted from the upload path.
Somehow, it was missed, despite god knows how many validation layer runs.
2025-11-30 23:11:46 +01:00
Kacper Michajłow
17456c553e tests/checkasm: fix check for 32-bit Windows build
With --disable-asm, ARCH_X86_32 is set to 0, but we still build the
checkasm binary. Update the check so it is config.h agnostic.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-11-30 22:07:39 +00:00
Russell Greene
3beaa2d70f hwcontext_vulkan: remove VK_HOST_IMAGE_COPY_MEMCPY flag
Reading the spec for what this flag means, it copies the data verbatim, including any swizzling/tiling, this has two issues

1. the format may not be what ffmpeg expects elsewhere, as it is expecing normal pitch linear host memeory in `swf`
2. the size of the copied data may not match the size of buffer provided, causing heap buffer overflow

It seems like addition of this flag is an oversight as it seems to be for caching/backups of image data, just to be used with copying back to the GPU with the MEMCPY flag, which is *not* how its used in ffmpeg.

Additionally, set memoryRowLength as if it isn't set, it assumes pitch = width_in_bytes, which I don't think is necessarily the case
2025-11-30 21:47:12 +00:00
Andreas Rheinhardt
59d75bf9e4 avutil/x86/Makefile: Only compile ASM init files when X86ASM is enabled
To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.

This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows to remove HAVE_X86ASM checks
in these init files.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Andreas Rheinhardt
d5a47bf2b3 swresample/x86/Makefile: Only compile ASM init files when X86ASM is enabled
To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.

This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows to remove HAVE_X86ASM checks
in these init files.

(x86/ops.c has already been put in X86ASM-OBJS.)

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Andreas Rheinhardt
7356981bec avfilter/x86/Makefile: Only compile ASM init files when X86ASM is enabled
To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.

This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows to remove HAVE_X86ASM checks
in these init files.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Andreas Rheinhardt
eccf130fdb {lib{avcodec,swscale}/x86/,}Makefile: Kill MMX-OBJS
Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Andreas Rheinhardt
ba94177242 avcodec/x86/Makefile: Only compile ASM init files when X86ASM is enabled
To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.

This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows to remove HAVE_X86ASM checks
in these init files.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Andreas Rheinhardt
65b4feb782 avcodec/x86/Makefile: Remove redundant WebP decoder->vp8dsp dependencies
Redundant since 35b02732b9.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
averne
1d1643b42a vulkan/prores: use cached bitstream reader
Speedup is around 75% on NVIDIA 3050, 20% on AMD 6700XT, 5% on Intel TigerLake.
2025-11-30 22:01:17 +01:00
averne
fd2fd3828c libavcodec/vulkan: remove unnessary member in GetBitContext
The number of remaining bits can be calculated using existing state.
This simplifies calculations and frees up one register.
2025-11-30 19:21:08 +01:00
averne
ef7354d471 libavcodec/vulkan: introduce cached bitstream reader
This stores a small buffer in shared memory per decode thread (16 bytes),
which helps reduce the number of memory accesses.
The bitstream buffer is first aligned to a 4 byte boundary, so that the
buffer can be filled with a single memory request.
2025-11-30 19:21:04 +01:00
Kacper Michajłow
2456a39581 avfilter/avfiltergraph: fix constant string comparision
It's not guaranteed that the conversion filter name string will be
deduplicated to the same memory location. While this is common
optimization to do, we cannot rely on it always happening.

Fixes regression since 8b375b2ffd.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-11-30 03:02:41 +01:00
Andreas Rheinhardt
89f984e3d1 avcodec/x86/h264_idct: Fix ff_h264_luma_dc_dequant_idct_sse2 checkasm failures
ff_h264_luma_dc_dequant_idct_sse2() does not pass checkasm for certain
seeds, because the input to packssdw no longer fits into an int16_t,
leading to saturation, where the C code just truncates. I don't know
whether the spec contains provisions that ensure that valid input
must not exceed 16 bit or whether the such inputs (even if invalid)
can be triggered by the actual code and not only the test.

This commit adapts the behavior of the function to the C reference code
to fix the test. packssdw is avoided, instead the lower words are
directly transfered to GPRs to be written out. This has unfortunately
led to a slight performance regression here (14.5 vs 15.1 cycles).

Fixes issue #20835.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
e6ae2802a3 avcodec/x86/h264_idct: Deduplicate generating constant
pw_1 is currently loaded in both codepaths. Generate it earlier instead.
Gives tiny speedups (15 vs 14.5 cycles) and reduces codesize.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
ada0a81577 avcodec/x86/h264_idct: Don't use MMX registers in ff_h264_luma_dc_dequant_idct_sse2
It is ABI compliant and gives a tiny speedup here (and is 16B smaller).

Old benchmarks:
h264_luma_dc_dequant_idct_8_c:                          33.2 ( 1.00x)
h264_luma_dc_dequant_idct_8_sse2:                       16.0 ( 2.07x)

New benchmarks:
h264_luma_dc_dequant_idct_8_c:                          33.0 ( 1.00x)
h264_luma_dc_dequant_idct_8_sse2:                       15.0 ( 2.20x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
012c25bac4 avcodec/x86/h264_idct: Zero with full-width stores
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
b9cbbd9074 avcodec/x86/h264_idct: Use tail call where advantageous
It is possible on UNIX64.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
0ec9c1b68d avutil/x86/x86inc: Use parentheses in has_epilogue
Prevents surprises.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
01ff05e4bc avcodec/x86/h264_idct: Avoid call where possible
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
b51cbd4116 avcodec/x86/h264_idct: Remove redundant movsxdifnidn
Only exported (i.e. cglobal) functions need it; stride is already
sign-extended when it reaches any of the internal functions used here,
so don't sign-extend again.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Andreas Rheinhardt
18019f177e avcodec/x86/h264idct: Remove dead MMX macros
Forgotten in 4618f36a24.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Kacper Michajłow
9cd4be6d7c tools/sofa2wavs: fix build on Windows
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-11-29 21:43:12 +00:00
averne
1c5bb1b12d vulkan/prores: normalize coefficients during IDCT
This allows increased internal precision.
In addition, we can introduce an offset to the DC coefficient
during the second IDCT step, to remove a per-element addition
in the output codepath.
Finally, by processing columns first we can remove the barrier
after loading coefficients.

Signed-off-by: averne <averne381@gmail.com>
2025-11-29 17:56:28 +01:00
averne
1982add485 vulkan/prores: fix dequantization for 4:2:2 subsampling
Bug introduced in d00f41f due to an oversight.
2025-11-29 17:27:21 +01:00
Niklas Haas
04eeaeed11 avfilter/vf_libplacebo: also rotate SAR when fitting 2025-11-29 08:45:24 +00:00
Niklas Haas
f83fdad550 avfilter/vf_libplacebo: fix math when AVRationals are undefined 2025-11-29 08:45:24 +00:00
Niklas Haas
6e0034ab7e avfilter/vf_libplacebo: un-rotate image crop after fitting
When combining rotation with a FIT_ mode other than FIT_FILL, the fitting
logic was operating on the un-rotated rects, when it should have been
operating on the rotated (output) rects.
2025-11-29 08:45:24 +00:00
Piotr Pawlowski
372dab2a4d All: Removed reliance on compiler performing dead code elimination, changed various macro constant checks from if() to #if 2025-11-28 19:52:51 +01:00
Hao Chen
a6206a31ea swscale: Fix out-of-bounds write errors in yuv2rgb_lasx.c file.
The patch adds support for dstw values ending in 2, 4, 6, 8, 10, 12, and 14,
which fixes the out-of-bounds write problem.
2025-11-28 03:40:47 +00:00
Ayose
c1b86a009e tests/fate-filter-drawvg-video: copy drawvg.lines file to tests/data.
If the SRC_PATH variable contains certain characters (like a `:`, which may
happen when FATE is executed on Windows), the value for the `file` option is
broken, so `make fate-filter-drawvg-video` always fails.

The solution in this commit is to copy the `drawvg.lines` to the `tests/data`
directory (which already has temp files), so the value for `file` is a fixed
string with no problematic characters.

Signed-off-by: Ayose <ayosec@gmail.com>
2025-11-28 02:38:09 +00:00
Gavin Li
3d96d83a0a avformat/rawdec: set framerate in codec parameters
Commit ba4b73c977 caused a regression in
the usage of avg_frame_rate to detect the frame rate of raw h264/hevc
bitstreams: after the commit, avg_frame_rate is always the value of the
-framerate option (which is set to 25 by default) instead of the actual
frame rate derived from the bitstream SPS/VPS NALUs.

This commit fixes the regression by setting the framerate codec
parameter to the value of the framerate option instead. After this
change, bitstreams without timing information will derive avg_frame_rate
from the -framerate option, while bitstreams with timing information
will derive avg_frame_rate from the bitstream itself.

The h264-bsf-dts2pts test now returns the correct frame durations for a
bitstream with a mix of single-field and double-field frames.

Signed-off-by: Gavin Li <git@thegavinli.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-11-27 20:01:54 -03:00
James Almer
69534d4e7e avcodec/cavs_parser: parse sequence headers for stream parameters
Signed-off-by: James Almer <jamrial@gmail.com>
2025-11-27 20:01:54 -03:00
Diego de Souza
75b8567591 avfilter/scale_cuda: Add support for 4:2:2 chroma subsampling
The supported YUV pixel formats were separated between planar
and semiplanar. This approach reduces the number of CUDA kernels
for all pixel formats.

This patch:
1. Adds support for YUV 4:2:2 planar and semi-planar formats:
        yuv422p, yuv422p10, nv16, p210, p216
2. Implements new conversion structures and kernel definitions
        for planar and semi-planar formats

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2025-11-27 22:11:57 +01:00
Diego de Souza
04b5e25d35 avfilter/hwupload_cuda: Expands pixel formats support
Add support for uploading additional pixel formats to NVIDIA GPUs:
- Planar formats (yuv420p10, yuv422p, yuv422p10, yuv444p10)
- Semiplanar formats (nv16, p210, p216)

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2025-11-27 22:11:57 +01:00
Diego de Souza
9c76d7db86 avutil/hwcontext_cuda: Expands pixel formats support
Add support for additional pixel formats in CUDA hardware context:
- Planar formats (yuv420p10, yuv422p, yuv422p10, yuv444p10)
- Semiplanar formats (nv16, p210, p216)

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2025-11-27 22:11:57 +01:00
Thomas Gritzan
0cd75dbfa0 libavdevice/decklink: Implement QueryInterface to support newer driver
Playback to a decklink device with a newer version of the
DeckLink SDK (14.3) stalls because the driver code calls
IDeckLinkVideoFrame::QueryInterface, which is not
implemented by ffmpeg.
This patch implements decklink_frame::QueryInterface,
so that playback works with both older (12.x) and
newer (>= 14.3) drivers.

Note: The patch still does not allow the code to compile
with DeckLink SDK 14.3 or newer, as the API has changed.
2025-11-27 20:12:03 +00:00
Frank Plowman
5169b0c3dc lavc/vvc: Ensure seq_decode is always updated with SPS
seq_decode is used to ensure that a picture and all of its reference
pictures use the same SPS. Any time the SPS changes, seq_decode should
be incremented. Prior to this patch, seq_decode was incremented in
frame_context_setup, which is called after the SPS is potentially
changed in decode_sps. Should the decoder encounter an error between
changing the SPS and incrementing seq_decode, the SPS could be modified
while seq_decode was not incremented, which could lead to invalid
reference pictures and various downstream issues. By instead updating
seq_decode within the picture set manager, we ensure seq_decode and the
SPS are always updated in tandem.
2025-11-27 14:51:52 +00:00
Anthony Bajoua
93ccca22bb libavformat/mov: Fixes individual track duration on fragmented files 2025-11-27 14:05:33 +00:00
mux47
618fc15e65 libavcodec/opus/parser: Fix spurious 'Error parsing Opus packet header'
When PARSER_FLAG_COMPLETE_FRAMES is set, opus_parse() calls
set_frame_duration even on flush (buf_size==0), which triggers
a spurious "Error parsing Opus packet header" at EOF.

Match streaming-path behavior by skipping duration parsing on empty buffers.
Fixes #20954
2025-11-27 14:04:20 +00:00