The 16-bit kernel is dispatched for every non-8-bit pixel format
(9/10/12/16-bit content, all stored in uint16_t). It's supposed to
undo the Q16 scaling that set_filter_param() applies to `amount`:
    fp->amount = amount * 65536.0;
but the shift written in the kernel is `>> (8+nbits)`, which for the
nbits=16 instantiation of the macro comes out to `>> 24` instead of
`>> 16`. Because of this, on any non-8-bit input, unsharp applies ~1/256
of the user's requested strength and is effectively a no-op. The
8-bit kernel (nbits=8) happens to be correct because 8+8 == 16.
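
The scaling mismatch can be illustrated with a small standalone sketch. The helper `apply_amount` is hypothetical and is not the kernel macro from vf_unsharp; it only mirrors the `(diff * amount) >> shift` arithmetic described above, using a 64-bit product so the shift is the only variable.

```c
#include <stdint.h>

/* Hypothetical helper, not the actual vf_unsharp macro: scales a
 * pixel/blur difference by a Q16 amount and shifts the result back.
 * Only non-negative diffs are considered here, since right-shifting a
 * negative value is implementation-defined in C. */
static int64_t apply_amount(int32_t diff, double amount, int shift)
{
    /* Same Q16 scaling that set_filter_param() applies to `amount`. */
    int64_t amount_q16 = (int64_t)(amount * 65536.0);

    return ((int64_t)diff * amount_q16) >> shift;
}
```

With `amount` = 1.0 and a difference of 256, a shift of 16 returns 256, while a shift of 24 (the nbits=16 case of `8+nbits`) returns 1, i.e. 1/256 of the intended strength.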
This commit also widens the intermediate product to int64 before the
shift, to avoid a potential overflow. Take a 16-bit pixel at the
edge of a sharp white/black region, with the user-facing `amount`
set to its declared maximum of 5.0.
    *srx       = 65535
    blur       = 32768
    diff       = *srx - blur = 32767
    amount_q16 = 5.0 * 65536 = 327680
Then the kernel computes:
    product = diff * amount_q16
            = 32767 * 327680 = 10,737,090,560 (~1.07e10)
which overflows INT32_MAX. Widening to int64 keeps the
multiplication in range; the subsequent `>> 16` brings it back to
sample range and the final cast to int32 is then safe. The widening
is a semantic no-op for 8/9/10/12-bit content where the product
always fits in int32 (worst case at 12-bit: 4095 * 327680 ~ 1.34e9).
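
The arithmetic above can be checked with a minimal sketch. Both helper names are hypothetical and are not taken from vf_unsharp; they only reproduce the widened multiply-then-shift and the int32 range test from the worked example.

```c
#include <stdint.h>

/* Hypothetical helper: the widened form of the sharpening delta.
 * The multiplication is done in 64 bits; the shift by 16 undoes the
 * Q16 scaling. Right-shifting a negative value is implementation-defined
 * in C, so the example only uses src >= blur. */
static int32_t sharpen_delta(int32_t src, int32_t blur, int32_t amount_q16)
{
    int64_t product = (int64_t)(src - blur) * amount_q16;

    return (int32_t)(product >> 16);
}

/* Returns 1 if diff * amount_q16 would not fit in an int32_t. */
static int product_overflows_int32(int32_t diff, int32_t amount_q16)
{
    int64_t product = (int64_t)diff * amount_q16;

    return product > INT32_MAX || product < INT32_MIN;
}
```

For the 16-bit edge case, `product_overflows_int32(32767, 327680)` reports an overflow, while the 12-bit worst case `product_overflows_int32(4095, 327680)` does not.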
Introduced by ee792ebe08 (2019-11-08, "avfilter/vf_unsharp: add 10bit
support"). The fate-filter-unsharp-yuv420p10 reference added in the
same series was generated from the broken kernel and is regenerated
here. fate-filter-unsharp (8-bit) is unaffected.
Repro:
    python3 -c "import numpy as np; y=np.tile(np.where(np.arange(128)//8 & 1, 512, 256).astype('<u2'), (128,1)); c=np.full((64,64), 512, '<u2'); open('in.yuv','wb').write(y.tobytes()+c.tobytes()*2)"
    ffmpeg -f rawvideo -pix_fmt yuv420p10le -s 128x128 -i in.yuv \
        -lavfi "split=2[a][b];[b]unsharp=la=1[bs];[a][bs]psnr" \
        -f null - 2>&1 | grep PSNR
Before: `PSNR y:66.50 ...` -- the filter is effectively a no-op,
so the sharpened output matches the input almost exactly.
After: `PSNR y:28.27 ...` -- the filter actually sharpens, so
output and input differ as expected.
Signed-off-by: Nil Fons Miret <nilf@netflix.com>
Made-with: Cursor
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive. Check the return
value and fail fast instead of continuing with the unadjusted result.
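
The caller-side pattern might look roughly like the sketch below. It does not use the real libavfilter API: `adjust_dimensions_stub` is a hypothetical stand-in for ff_scale_adjust_dimensions() (which takes more parameters), `config_output_stub` is a hypothetical caller, and `-EINVAL` is assumed as the error code.

```c
#include <errno.h>

/* Hypothetical stand-in for ff_scale_adjust_dimensions(). It only
 * models the behaviour described above: a negative error code when
 * the evaluated dimensions are non-positive. */
static int adjust_dimensions_stub(int *w, int *h)
{
    if (*w <= 0 || *h <= 0)
        return -EINVAL; /* assumed error code */
    return 0;
}

/* Hypothetical caller in the style of a filter's output configuration. */
static int config_output_stub(int w, int h, int *out_w, int *out_h)
{
    int ret = adjust_dimensions_stub(&w, &h);

    if (ret < 0)
        return ret; /* fail fast instead of using the unadjusted result */

    *out_w = w;
    *out_h = h;
    return 0;
}
```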
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
When scale filter expressions evaluate to zero or negative output
dimensions (e.g. cascaded scale=...:-2 on extreme aspect ratios),
ff_scale_adjust_dimensions() only checked for int32 overflow and
passed them through, potentially hanging downstream components.
Reject them explicitly so the pipeline fails fast.
Callers that currently ignore the return value will be updated in
the following patches to propagate the error.
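
As a rough illustration of how a derived dimension can reach zero, the sketch below uses a simplified model, not the libavfilter code. `derive_height` is a hypothetical approximation of resolving a `-2` height (keep the aspect ratio, round to the nearest multiple of 2), and `check_dimensions` is a hypothetical check that rejects non-positive values in addition to int32 overflow.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified, hypothetical model of resolving a "-2" height: keep the
 * input aspect ratio and round to the nearest multiple of 2.
 * Assumes all inputs are positive. */
static int64_t derive_height(int64_t out_w, int64_t in_w, int64_t in_h)
{
    int64_t den = in_w * 2;

    return (out_w * in_h + den / 2) / den * 2;
}

/* Hypothetical validation: reject overflow and non-positive values. */
static int check_dimensions(int64_t w, int64_t h)
{
    if (w > INT32_MAX || h > INT32_MAX)
        return -EINVAL; /* the pre-existing kind of check */
    if (w <= 0 || h <= 0)
        return -EINVAL; /* the explicit rejection described above */
    return 0;
}
```

Under this model, scaling a 4096x16 input to a width of 64 yields a derived height of 0, which `check_dimensions` rejects.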
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
When duplicate frames are forced to be kept, forward the input frame
without cloning instead of creating an unnecessary extra reference.
This removes the leak path introduced when clone allocation fails.
For frames that become the new reference, keep using a clone for
forwarding.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This patch adds the transpose_cuda video filter.
It's similar to the existing transpose filter but accelerated by CUDA.
It supports the same pixel formats as the scale_cuda filter.
This also supersedes the deprecated transpose_npp filter.
Example usage:
    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i <INPUT> -vf "transpose_cuda=dir=clock" <OUTPUT>
Signed-off-by: nyanmisaka <nst799610810@gmail.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
tape_length * 8 overflows a 32-bit int for large input widths. As a
result, av_malloc_array() allocates a tiny buffer while the subsequent
loop writes tape_length*8 BilinearMap entries, causing a
heap-buffer-overflow.
Validate the value in float before converting to int and left
shifting, to avoid both float-to-int and signed left shift
overflow UB. Also split av_malloc_array() arguments to avoid
the multiplication overflow.
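
The validation order can be sketched as follows. This is not the filter's code: `MapEntry`, `validated_count` and `alloc_map` are hypothetical names, the range limits are assumptions, and standard `calloc()` stands in for av_malloc_array() with split arguments.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry type standing in for BilinearMap. */
typedef struct MapEntry {
    int32_t src[4];
    float   weight[4];
} MapEntry;

/* Validates `value` while it is still a float, then converts.
 * Returns 0 and stores value * 8 in *count, or -1 if the value is
 * NaN, below 1, or so large that value * 8 would not fit in an int. */
static int validated_count(float value, int *count)
{
    /* Converting an out-of-range float to int is undefined behaviour,
     * so the comparison has to happen before the cast. */
    if (!(value >= 1.0f) || value >= (float)(INT_MAX / 8))
        return -1;

    *count = (int)value * 8;
    return 0;
}

/* Passing count and element size separately lets the allocator check
 * the multiplication, instead of overflowing it at the call site. */
static MapEntry *alloc_map(int count)
{
    return calloc((size_t)count, sizeof(MapEntry));
}
```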
Fixes: #21511
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This was originally introduced by commit 05d6cc116e. During the FFmpeg-libav
split, this function was refactored by commit 7e350379f8 into
av_buffersrc_add_frame(), replacing av_buffersrc_add_ref(). The new function
did not include the overflow warning, despite the same being done for
buffersink.
Then, when commit a05a44e205 merged the two functions back together, the
libav implementation was favored over the FFmpeg implementation, silently
removing the overflow warning in the process.
This commit re-adds that missing warning.
Signed-off-by: Niklas Haas <git@haasn.dev>
Fixes a memory leak caused by AVMEDIA_TYPE_VIDEO == 0 being excluded by
the !pool->type check. We can just remove the entire check because
av_buffer_pool_uninit() is already safe on NULL.
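
A minimal standalone sketch of the pitfall, using hypothetical types rather than the real pool struct: the enum mirrors only the relevant property of AVMediaType (video has the value 0), and `free()` stands in for av_buffer_pool_uninit() as a cleanup call that is safe on NULL.

```c
#include <stdlib.h>

/* Hypothetical enum: like AVMediaType, the video type is 0. */
enum MediaType {
    MEDIA_TYPE_UNKNOWN = -1,
    MEDIA_TYPE_VIDEO   = 0,
    MEDIA_TYPE_AUDIO   = 1,
};

/* Hypothetical pool with a single owned buffer. */
typedef struct Pool {
    enum MediaType type;
    void *buf;
} Pool;

/* Buggy variant: `!p->type` is true for video, so the buffer is
 * never released. Returns 1 if cleanup ran, 0 if it was skipped. */
static int uninit_buggy(Pool *p)
{
    if (!p->type)
        return 0;
    free(p->buf);
    p->buf = NULL;
    return 1;
}

/* Fixed variant: no type check; free(NULL) is a no-op anyway. */
static int uninit_fixed(Pool *p)
{
    free(p->buf);
    p->buf = NULL;
    return 1;
}
```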
Fixes: fe2691b3bb
Reported-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
Saves a pointless free/alloc cycle on reinit. For the vast majority of filter
links, this is going to be allocated anyway; and on the occasions that it's not,
the waste is marginal.
Signed-off-by: Niklas Haas <git@haasn.dev>
As per the FFmpeg coding style guidelines, braces should be avoided on
isolated single-line statement bodies.
Signed-off-by: Niklas Haas <git@haasn.dev>
FFALIGN(..., pool->align) = (...) & ~(pool->align - 1), so this condition
equates to: ((...) & ~(align - 1) & (align - 1)), which is trivially 0.
(Note that all expressions are of type `int`)
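
The identity can be checked with a short sketch. `ALIGN_UP` is a local macro written to mirror the usual round-up definition of FFALIGN, and `low_bits_after_align` is a hypothetical helper; the alignment is assumed to be a power of two.

```c
/* Local copy of the round-up-to-alignment expression; assumes `a` is
 * a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper: the low bits that the removed condition tested.
 * The aligned value was masked with ~(align - 1), so masking it again
 * with (align - 1) always gives 0. */
static int low_bits_after_align(int x, int align)
{
    return ALIGN_UP(x, align) & (align - 1);
}
```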
Signed-off-by: Niklas Haas <git@haasn.dev>
This struct is overall pretty trivial and there is little to no internal
state or invariants that need to be protected.
Making it public allows e.g. libswscale to allocate buffers for individual
planes directly.
Signed-off-by: Niklas Haas <git@haasn.dev>