122011 Commits

Author SHA1 Message Date
Zhao Zhili
cac5018eb9 avformat/mov: fix crash when stsz_sample_size is zero and sample_sizes is null
Co-Authored-by: James Almer <jamrial@gmail.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-11-27 14:03:06 +00:00
Andreas Rheinhardt
f0f7834726 avcodec/cbs_apv: Use ff_cbs_{read,write}_simple_unsigned()
Avoids checks and makes the calls cheaper.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 14:00:45 +00:00
Andreas Rheinhardt
7018ce14df avcodec/x86/vp6dsp: Avoid packing+unpacking
Store the intermediate values as words, clipped to the 0..255 range
instead.

Old benchmarks:
filter_diag4_c:                                        353.4 ( 1.00x)
filter_diag4_sse2:                                      57.5 ( 6.15x)

New benchmarks:
filter_diag4_c:                                        350.6 ( 1.00x)
filter_diag4_sse2:                                      55.1 ( 6.36x)

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:49 +01:00
Andreas Rheinhardt
300cd2c2f2 avcodec/x86/vp6dsp: Avoid saturated addition
Only the two middle coefficients are so huge that overflow can happen.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:46 +01:00
Andreas Rheinhardt
dcc101167c avcodec/x86/vp6dsp: Simplify splatting
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:43 +01:00
Andreas Rheinhardt
111fabf5b4 avcodec/x86/vp6dsp: Don't align the stack manually
For most systems (particularly all x64), the stack is already
guaranteed to be sufficiently aligned. So just use x86inc's
stack feature which does the right thing.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:40 +01:00
Andreas Rheinhardt
363a34a7cb avcodec/x86/vp6dsp: Fix outdated comment
Forgotten in 6cb3ee80b3.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:37 +01:00
Andreas Rheinhardt
aabaab10d2 tests/checkasm: Test VP6DSP
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:34 +01:00
Andreas Rheinhardt
962858169a avcodec/vp6dsp: Constify source in vp6_filter_diag4
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:32 +01:00
Andreas Rheinhardt
f397fe86c3 avcodec/vp56dsp: Separate VP5DSP and VP6DSP
They don't have anything in common since
160ebe0a8d.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:29 +01:00
Andreas Rheinhardt
5dadae9feb avcodec/vp56: Fix indentation
Forgotten in 160ebe0a8d.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:10:26 +01:00
Andreas Rheinhardt
8443940002 avcodec/arm/vp6dsp: Remove VP6 edge filter functions
Forgotten in 160ebe0a8d.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 12:08:45 +01:00
Andreas Rheinhardt
0ea961c070 avcodec/vp3: Redo updating frames
VP3's frame management is actually simple: It has three frame slots:
current, last and golden. After having decoded the current frame,
the old last frame will be freed and replaced by the current frame.
If the current frame is a keyframe, it also takes over the golden slot.

The VP3 decoder handled this like this: In single-threaded mode,
the above procedure was carried out (on success). Doing so with
frame-threading is impossible, as it would lead to data races.
Instead vp3_update_thread_context() created new references
to these frames and then carried out said procedure.

This means that vp3_update_thread_context() is not just a "dumb"
function that only copies certain fields from src to dst; instead
it actually processes them. E.g. trying to copy the decoding state
from A to B and then from B to C (with no decode_frame call in between)
will not be equivalent to copying from A to C, as both current and last
frames will be blank in the first case.

This commit changes this: Because last_frame won't be needed after
decoding, no reference to it will be created in
vp3_update_thread_context(); instead it is now always unreferenced
after decoding it (even on error). Replacing last_frame with the new
frame is now always performed when the new frame is allocated.
Replacing the golden frame is now done earlier, namely in decode_frame()
before ff_thread_finish_setup(), so that update_thread_context only
has to reference current frame and golden frame. Being dumb means
that update_thread_context also no longer checks whether the current
frame is valid, so that it can no longer error out.

This unifies the single- and multi-threaded codepaths; it can lead
to changes in output in single threaded mode: When erroring out,
the current frame would be discarded and not be put into one
of the reference slots at all in single-threaded mode. The new
code meanwhile does everything as the frame-threaded code already did
in order to reduce discrepancies between the two. It would be possible
to keep the old single-threaded behavior (one would need to postpone
replacing the golden frame to the end of vp3_decode_frame and would
need to swap the current frame and the last frame on error,
unreferencing the former).
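
The three-slot model described at the top of this message can be sketched in a few lines of C. This is only an illustration under stated assumptions: ToyFrame, ToySlots, toy_ref(), toy_unref() and toy_finish_frame() are hypothetical names, the reference counting is a toy stand-in for the real frame references, and the sketch does not reproduce the ordering the commit introduces around ff_thread_finish_setup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted frame. */
typedef struct ToyFrame {
    int id;
    int refs;
} ToyFrame;

/* The three slots named in the commit message. */
typedef struct ToySlots {
    ToyFrame *current;
    ToyFrame *last;
    ToyFrame *golden;
} ToySlots;

static ToyFrame *toy_ref(ToyFrame *f)
{
    if (f)
        f->refs++;
    return f;
}

static void toy_unref(ToyFrame **f)
{
    if (*f) {
        assert((*f)->refs > 0);
        (*f)->refs--;
        *f = NULL;
    }
}

/* Rotate the slots once the current frame has been decoded:
 * the old last frame is dropped, the current frame becomes the
 * last frame, and a keyframe additionally becomes the golden frame. */
static void toy_finish_frame(ToySlots *s, bool keyframe)
{
    toy_unref(&s->last);
    s->last    = s->current; /* ownership moves, refcount unchanged */
    s->current = NULL;
    if (keyframe) {
        toy_unref(&s->golden);
        s->golden = toy_ref(s->last);
    }
}
```

After a keyframe, last and golden point at the same frame, which then holds two references; a following non-keyframe replaces only the last slot.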

Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 11:34:25 +01:00
Andreas Rheinhardt
2ca072e168 avcodec/vp3: Remove always-false checks
The dimensions are only set at two places: theora_decode_header()
and vp3_decode_init(). These functions are called during init
and during dimension changes, but the latter is only supported
(and attempted) when frame threading is not active. This implies that
the dimensions of the various worker threads in
vp3_update_thread_context() always coincide, so that these checks
are dead and can be removed.

(These checks would of course need to be removed when support
for dimension changes during frame threading is implemented;
and in any case, a dimension change is not an error.)

Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 11:33:00 +01:00
Andreas Rheinhardt
d52bca36ef avcodec/vp3: Move last_qps from context to stack
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 11:32:54 +01:00
Andreas Rheinhardt
90551b7d80 avcodec/vp3: Sync VLCs once during init, fix crash
6c7a344b65 made the VLCs shared between
threads and did so in a way that was designed to support stream
reconfigurations, so that the structure containing the VLCs was
synced in update_thread_context. The idea was that the currently
active VLCs would just be passed along between threads.

Yet this was broken by 5acbdd2264:
Before this commit, submit_packet() was a no-op during flushing
for VP3, as it is a no-delay decoder, so it won't produce any output
during flushing. This meant that prev_thread in pthread_frame.c
contained the last dst thread that update_thread_context()
was called for (so that these VLCs could be passed along between
threads). Yet after said commit, submit_packet was no longer
a no-op during flushing and changed prev_thread in such a way
that it did not need to contain any VLCs at all*. When flushing,
prev_thread is used to pass the current state to the first worker
thread which is the one that is used to restart decoding.
It could therefore happen that the decoding thread did not contain
the VLCs at all any more after decoding restarts after flushing
leading to a crash (this scenario was never anticipated and
must not happen at all).

There is a simple, easily backportable fix given that we do not
support stream reconfigurations (yet) when using frame threading:
Don't sync the VLCs in update_thread_context(), instead do it once
during init.
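
A rough sketch of the idea, assuming a simplified model: every worker context receives the shared tables once at init, and the per-frame sync copies only per-frame state. ToyTables, ToyThreadCtx, toy_init_threads() and toy_update_thread_context() are hypothetical names for illustration, not the decoder's actual structures, and the lifetime of the shared tables is ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared, read-only tables (stand-in for the VLCs). */
typedef struct ToyTables {
    int nb_codes;
} ToyTables;

/* Hypothetical per-thread decoder context. */
typedef struct ToyThreadCtx {
    const ToyTables *tables; /* set once at init, never synced later */
    int frame_number;        /* per-frame state, synced between threads */
} ToyThreadCtx;

/* Hand the same tables to every worker context during init. */
static void toy_init_threads(ToyThreadCtx *ctx, size_t nb_threads,
                             const ToyTables *tables)
{
    assert(ctx && tables);
    for (size_t i = 0; i < nb_threads; i++) {
        ctx[i].tables       = tables;
        ctx[i].frame_number = 0;
    }
}

/* Per-frame sync: copies per-frame state only. The tables pointer is
 * deliberately left alone, so a source context without tables cannot
 * clobber the destination's tables. */
static void toy_update_thread_context(ToyThreadCtx *dst,
                                      const ToyThreadCtx *src)
{
    dst->frame_number = src->frame_number;
}
```

With this split, syncing from a context that never received the tables leaves the destination's tables intact.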

This fixes forgejo issue #20346 and trac issue #11592.

(I don't know why 5acbdd2264
changed submit_packet() to no longer be a no-op when draining
no-delay decoders.)

*: The exact condition for the crash is nb_threads > 2*nb_frames.

Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-27 11:30:55 +01:00
Zhao Zhili
61b034a47c avcodec/rkmppenc: add h264/hevc rkmpp encoder
Bump rockchip_mpp to 1.3.8.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-11-27 15:54:49 +08:00
Zhao Zhili
d8e095b56d configure: cleanup rkmpp check
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2025-11-27 15:54:42 +08:00
Lynne
7d0483e6a7 vulkan_dpx: fix compilation with older headers
Fixes #21028
2025-11-27 03:12:30 +01:00
Timo Rothenpieler
f7aaa8ecb5 forgejo/workflows: make test shared/static mode more human readable
2025-11-26 23:21:11 +00:00
Lynne
231f735d55 vulkan_dpx: use host visible allocation for host image copy buffer
Fixes black screen on Nvidia.
2025-11-26 18:33:10 +01:00
Lynne
162e07da61 vulkan_dpx: fix "upoad" typo
2025-11-26 18:32:25 +01:00
James Almer
faa382e5b1 avformat/iamf_parse: ensure the stream count in a scalable channel representation is equal to the audio element's stream count
Signed-off-by: James Almer <jamrial@gmail.com>
2025-11-26 12:01:17 -03:00
James Almer
554ae5ada9 avformat/iamf_parse: ensure each layout in a scalable channel representation has an increasing number of channels
Fixes issue #21013
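
As an illustration only, a check of this kind might look like the sketch below. It assumes "increasing" means strictly increasing from one layer to the next; toy_layers_increasing() and its parameters are hypothetical and are not the parser's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every layer has more channels than the one before it.
 * nb_channels[i] is the (hypothetical) channel count of layer i. */
static bool toy_layers_increasing(const int *nb_channels, size_t nb_layers)
{
    assert(nb_channels || nb_layers == 0);
    for (size_t i = 1; i < nb_layers; i++)
        if (nb_channels[i] <= nb_channels[i - 1])
            return false;
    return true;
}
```

A parser would reject the audio element when such a check fails, rather than continue with inconsistent layouts.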

Signed-off-by: James Almer <jamrial@gmail.com>
2025-11-26 12:01:17 -03:00
Lynne
3bcf2be06c Changelog: bump lavc minor and add entry for the DPX Vulkan hwaccel
2025-11-26 15:16:43 +01:00
Lynne
531ce713a0 dpxdec: add a Vulkan hwaccel
2025-11-26 15:16:43 +01:00
Lynne
a9acae202a dpxdec: add hardware decoding hooks
2025-11-26 15:16:42 +01:00
Lynne
61ae1ec85f dpxdec: move data parsing into a separate function
2025-11-26 15:16:42 +01:00
Lynne
bc07d03d06 dpx: add a context
This simply adds a context with 4 fields to enable hardware unpacking.
2025-11-26 15:16:42 +01:00
Lynne
7af5b5cec3 vulkan_prores_raw: use the native image representation
It allows us to easily synchronize the software and hardware
decoders, by removing the abstraction the Vulkan layer added by changing
the values written.
2025-11-26 15:16:42 +01:00
Lynne
a811a6885a vulkan_prores_raw: read the header length rather than assuming it's 8
In all known samples, it is equal to 8.
2025-11-26 15:16:42 +01:00
Lynne
0db891366d vulkan_prores_raw: fix dynamically non-uniform accesses to pushconsts
The Vulkan spec requires that all accesses to push data are uniform for
all invocations (e.g. can't be based on gl_WorkGroupID or gl_LocalInvocationID).
2025-11-26 15:16:41 +01:00
Lynne
edb844510e vulkan_prores_raw: use regular descriptors for tile data instead of BDA
Regular descriptors are faster.
2025-11-26 15:16:41 +01:00
Lynne
bb30a0d0d8 vulkan_prores_raw: split up decoding and DCT
This commit optimizes the Vulkan decoder by splitting up decoding
from iDCT, and merging the few tables needed directly into the shader.

The speedup on Intel is 10x.
2025-11-26 15:16:41 +01:00
Lynne
0c20edaa7a vulkan_prores: initialize only the necessary shaders on init
2025-11-26 15:16:41 +01:00
Lynne
a160e4a9e2 prores_raw: call ff_get_format if the version changes
2025-11-26 15:16:41 +01:00
Lynne
3934089de2 vulkan_prores: initialize only the necessary shaders on init
2025-11-26 15:16:41 +01:00
Lynne
8c0314d44a proresdec: call ff_get_format if the interlacing changes
Decoders need to track all state that hwaccels may be interested in,
and trigger a reconfiguration if it changes.
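
The pattern can be sketched as follows, assuming a simplified decoder that keeps a snapshot of the properties its hwaccel was configured for. ToyHwState, ToyDecoder and toy_check_reconfig() are hypothetical names; the real decoders trigger the reconfiguration through ff_get_format, which this sketch only stands in for with a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the stream properties a hwaccel depends on. */
typedef struct ToyHwState {
    int  width, height;
    int  pix_fmt;
    bool interlaced;
} ToyHwState;

typedef struct ToyDecoder {
    ToyHwState active;       /* state the hwaccel was last set up for */
    bool       configured;
    int        nb_reconfigs; /* stand-in for the actual reconfiguration */
} ToyDecoder;

static bool toy_state_equal(const ToyHwState *a, const ToyHwState *b)
{
    return a->width == b->width && a->height == b->height &&
           a->pix_fmt == b->pix_fmt && a->interlaced == b->interlaced;
}

/* Call after parsing each frame header. Returns true when the tracked
 * state changed (or nothing was configured yet) and a reconfiguration
 * was triggered. */
static bool toy_check_reconfig(ToyDecoder *d, const ToyHwState *parsed)
{
    assert(d && parsed);
    if (d->configured && toy_state_equal(&d->active, parsed))
        return false;
    d->active     = *parsed;
    d->configured = true;
    d->nb_reconfigs++;
    return true;
}
```

Any property missing from the tracked snapshot would change silently, which is the situation this commit addresses for interlacing.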
2025-11-26 15:16:41 +01:00
Lynne
56dea1a9e8 vulkan_ffv1: initialize only the necessary shaders on init
The decoder will reinit the hwaccel upon pixfmt/dimension changes,
so we can remove the f->use32bit and is_rgb variants of all shaders.

This speeds up init time.
2025-11-26 15:16:40 +01:00
Lynne
a1154b74a4 ffv1dec: call ff_get_format if the EC coding changes
Decoders need to track all state that hwaccels may be interested in,
and trigger a reconfiguration if it changes.
2025-11-26 15:16:40 +01:00
Lynne
be9998674a vulkan_ffv1/prores: remove unnecessary slice buffer unref
The slice buffer is already unref'd by ff_vk_decode_free_frame().
2025-11-26 15:16:40 +01:00
Lynne
615b26f1b1 vulkan_ffv1: fix swapped colors for x2bgr10
2025-11-26 15:16:40 +01:00
Lynne
3ddcf042b2 ffv1enc_vulkan: add support for x2bgr10/x2rgb10
2025-11-26 15:16:40 +01:00
Lynne
23cfcf93d2 vulkan: change ff_vk_frame_barrier access and stage type to sync2
Cleans up a compiler warning.
2025-11-26 15:16:40 +01:00
Lynne
d36d88dcbb vulkan/common: add reverse2 endian reversal macro
2025-11-26 15:16:39 +01:00
Lynne
6c3984db7f vulkan/common: add a function to flush/invalidate a buffer and use it
Just for convenience.
2025-11-26 15:16:39 +01:00
Lynne
d288d4a24e hwcontext_vulkan: use vkTransitionImageLayoutEXT to switch layouts
Falls back to regular submit-based layout switching if unsupported.
2025-11-26 15:16:39 +01:00
Lynne
5c89528342 hwcontext_vulkan: disable host image transfers for Nvidia devices
Nvidia's binary drivers have a very buggy implementation that is
yet to be fixed.
2025-11-26 15:16:39 +01:00
Lynne
686951849b hwcontext_vulkan: re-enable host image copy extension
We'll slowly start to use it in the code in safe places
rather than globally.
2025-11-26 15:16:39 +01:00
Lynne
fc2dd6c751 hwcontext_vulkan: enable runtime descriptor sizing
We were already using this in places, but it seems validation
layers finally got support to detect it.
2025-11-26 15:16:39 +01:00