It's a name that communicates its functionality in a better way.
Since the function was introduced very recently, we can safely rename it.
Signed-off-by: James Almer <jamrial@gmail.com>
In an ideal world, a filtergraph like `-i A -i B ... -filter_complex concat`
would only keep resources in memory related to the file currently being output
by the concat filter. So ideally, we'd open and fully decode A, then open and
fully decode B, and so on.
Practically, however, fftools wants to get one frame from each input file in
order to initialize the filter graph (buffersrc parameters). So what happens
currently is that fftools will request a single frame from each input A, B, etc
that is plugged into the filtergraph.
When using frame threading, the design of the decoder (ff_thread_receive_frame)
is that it will not output any frames until we have received enough packets to
saturate all threads. This, however, forces the decoder to buffer at least as
many frames for each input file as we have threads, before outputting anything.
By decoding the first frame synchronously, we avoid this issue and allow
configuring the filter graph more quickly and without wasting excess resources
on frames that will not (yet) be used.
Normally, this function tries to make sure all threads are saturated with
work to do before returning any frames; and will continue requesting packets
until that is the case.
However, this significantly slows down initial decoding latency when only
requesting a single frame (to e.g. configure the filter graph), and also
wastes a lot of unnecessary memory in the event that the user does not intend
to decode more frames until later.
By introducing a new `flags` paramater and a new flag
`AV_CODEC_RECEIVE_FRAME_FLAG_SYNCHRONOUS` to go along with it, we can allow
users to temporarily bypass this logic.
h->context_initialized is zero after flush, which triggers call to
ff_get_format unconditionally. ff_get_format can be heavy with
ff_hwaccel_uninit and hwaccel_init. For example, it takes 20 ms on
macOS with videotoolbox. ff_get_format should not be called if
nothing changed. ff_get_format is guarantee to be called at the
first time and when video information changed with
(must_reinit || needs_reinit).
Fix#20760.
Add checkasm coverage for the XYZ12LE to RGB48LE path via the
ctx->xyz12Torgb48 hook. Integrate the test into the build and runner,
exercise a variety of widths/heights, compare against the C reference,
and benchmark when width is multiple of 4.
This improves test coverage for the new function pointer in preparation
for architecture-specific implementations in subsequent commits.
Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
Prepare for xyz12Torgb48 architecture-specific optimizations in
subsequent patches by:
- Grouping XYZ+RGB gamma LUTs and 3x3 matrices into SwsColorXform
(ctx->xyz2rgb and ctx->rgb2xyz), replacing scattered fields.
- Dropping the unused last matrix column giving the same or smaller
SwsInternal size.
- Renaming ff_xyz12Torgb48 and ff_rgb48Toxyz12 and routing calls via
the new per-context function pointer (ctx->xyz12Torgb48 and
ctx->rgb48Toxyz12) in graph.c and swscale.c.
- Adding ff_sws_init_xyzdsp and invoking it in swscale init paths
(normal and unscaled).
- Making fill_xyztables public to ease its setup later in checkasm.
These modifications do not introduce any functional changes.
Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
The width 16 epel functions never use four taps in any direction*,
so don't build said functions. Saves 4352B of .text and 89B of
.text.unlikely here.
*: mx and my in vp8_mc_luma() are always even.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For the epel functions, there can be no overflow as long as the sum
contains only one of the two large central coefficients; for bilinear
functions, there can be no overflow whatsoever.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
By changing the permutations used in the epel8_h{4,6} case
we can simply reuse the coefficient tables from the vertical epel
filters.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Doubling the register width allows to use only one pshufb and pmaddubsw.
Old benchmarks:
vp8_put_epel4_h4_c: 82.8 ( 1.00x)
vp8_put_epel4_h4_ssse3: 13.9 ( 5.96x)
New benchmarks:
vp8_put_epel4_h4_c: 82.7 ( 1.00x)
vp8_put_epel4_h4_ssse3: 11.7 ( 7.08x)
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Switching to xmm registers allows to process two rows in parallel,
leading to speedups. It is also ABI compliant (no more missing emms).
Old benchmarks:
vp8_put_epel4_v4_c: 96.8 ( 1.00x)
vp8_put_epel4_v4_ssse3: 28.2 ( 3.43x)
New benchmarks:
vp8_put_epel4_v4_c: 95.1 ( 1.00x)
vp8_put_epel4_v4_ssse3: 22.8 ( 4.17x)
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Switching to xmm registers allows to process two rows in parallel,
leading to speedups. It is also ABI compliant (no more missing emms).
Old benchmarks:
vp8_put_epel4_v6_c: 132.8 ( 1.00x)
vp8_put_epel4_v6_ssse3: 34.3 ( 3.87x)
New benchmarks:
vp8_put_epel4_v6_c: 131.5 ( 1.00x)
vp8_put_epel4_v6_ssse3: 27.1 ( 4.86x)
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use GPRs on x64 and xmm registers else (using GPRs reduces codesize).
This avoids clobbering the floating point state and therefore no longer
breaks the ABI.
No change in benchmarks here.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
SSSE3 is already quite old (introduced 2006 for Intel, 2011 for AMD),
so that the overwhelming majority of our users (particularly those
that actually update their FFmpeg) will be using the SSSE3 versions.
This commit therefore removes the MMX(EXT) functions overridden
by them (which don't abide by the ABI) to get closer to a removal
of emms_c.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
A heap-use-after-free vulnerability was identified in
`libavcodec/aac/aacdec.c`. When `che_configure` frees a
`ChannelElement` (`ac->che[type][id]`), it failed to clear all
references to it in `ac->tag_che_map`. `ac->tag_che_map` caches
pointers to `ChannelElement`s and can contain cross-type mappings (e.g.,
a `TYPE_SCE` tag mapping to a `TYPE_LFE` element).
In a USAC stream reconfiguration scenario, an LFE element was freed, but
a stale pointer remained in `ac->tag_che_map`. Subsequent calls to
`ff_aac_get_che` returned this dangling pointer, leading to a crash in
`decode_usac_core_coder`.
This commit fixes the issue by iterating over the entire
`ac->tag_che_map` in `che_configure` and clearing any entries that point
to the `ChannelElement` about to be freed, ensuring no dangling pointers
remain.
Fixes: https://issues.oss-fuzz.com/issues/440220467
Intra refresh is a technique that gradually refreshes the video by encoding rows or regions as intra macroblocks/CTUs spread over multiple frames, rather than using periodic I-frames.
This provides better error resilience for video streaming while maintaining more consistent bitrate.
Disable Intra Refresh (This is the default)
ffmpeg -init_hw_device d3d12va -hwaccel d3d12va -hwaccel_output_format d3d12 \
-i input.mp4 \
-c:v h264_d3d12va \
-intra_refresh_mode none \
-intra_refresh_duration 30 \
-g 60 \
output.h264
Enable Intra Refresh
ffmpeg -init_hw_device d3d12va -hwaccel d3d12va -hwaccel_output_format d3d12 \
-i input.mp4 \
-c:v h264_d3d12va \
-intra_refresh_mode row_based \
-intra_refresh_duration 30 \
-g 60 \
output.h264
Parameters
- `-intra_refresh_mode`: Set to `row_based` to enable row-based intra refresh, or `NONE` to disable
- `-intra_refresh_duration`: Number of frames over which to spread the intra refresh (default: 0 = use GOP size)
- `-g`: GOP size (should typically be larger than intra refresh duration)
Fixes single image videos
this works and creates our single image video
./ffmpeg -i lena.pnm /tmp/file.m2v
this fails after 3d96d83a0a:
./ffmpeg -i /tmp/file.m2v /tmp/file.jpg -y
This reverts commit 3d96d83a0a.
size_t cannot fit VK_WHOLE_SIZE on 32-bit builds.
Fixes: warning: conversion from 'long long unsigned int' to 'size_t' {aka 'unsigned int'} changes value from '18446744073709551615' to '4294967295'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
There is no need to scan for NULL, if we inject it ourselves.
Fixes: warning: 'strncat' specified bound 10 equals source length [-Wstringop-overflow=]
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Some XMVs introduce a blank packet at the end of the stream. Previously, we
didn't account for this and returned AVERROR_INVALIDDATA, indicating an issue
with the file. Instead, let's check for this and close out with AVERROR_EOF.
Fix#20940
The feedback and its sub-filter both request frame
from each other, casuing block since 4440e499ba
The feedback should only request inputs[1] once
rather than continuously request frame cause blocking.
This patch add check whether feedback already request
inputs[1] via ff_outlink_frame_wanted(ctx->outputs[1]),
if true, then exit and waiting inputs[0] because it means
we need more frames input to proceed.
Signed-off-by: Jack Lau <jacklau1222gm@gmail.com>
Fixes a heap-buffer-overflow in `decode_frame` where `header_len` read
from the bitstream was not validated against the remaining bytes in the
input buffer (`gb`). This allowed `gb_hdr` to be initialized with a size
exceeding the actual packet data, leading to an out-of-bounds read.
The fix adds a check to ensure `bytestream2_get_bytes_left(&gb)` is
greater than or equal to `header_len - 2` before initializing `gb_hdr`.
Fixes: https://issues.oss-fuzz.com/issues/439711053
Fixes this test under UBSan:
runtime error: call to function dct_unquantize_mpeg1_intra_c through pointer to incorrect function type 'void (*)(struct MpegEncContext *, short *, int, int)'
I don't know how I could forget this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This was accidentally removed in
357fc5243c.
This fixes test failures when built with Clang and MSVC;
surprisingly, the checkasm test did seem to pass when built with
GCC. Clang and MSVC also warn about the use of the uninitialized
variable, while GCC didn't.