ffmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2026-01-02 12:20:03 +01:00

Author	SHA1	Message	Date
Niklas Haas	4ede75b5f4	swscale/graph: fix double-free when legacy pass fails initializing If this function returns an error after ff_sws_graph_add_pass() has been called, and the pass->free callback is therefore already set up to free the context, the graph will end up freed twice: once by the pass->free callback (during ff_sws_graph_free()), and once before that by failure path of the caller (e.g. add_legacy_sws_pass(), or init_legacy_subpass() itself for cascaded contexts.) The solution is to redefine the ownership of SwsGraph to pass clearly from the caller of add_legacy_sws_pass() to init_legacy_subpass(), which can then deal with appropriately freeing the context conditional on whether or not the pass was already registered in the pass list. Reported-by: 김영민 <kunshim@naver.com> Signed-off-by: Niklas Haas <git@haasn.dev>	2025-08-29 13:22:03 +00:00
Michael Niedermayer	ca20d42cd7	swscale/swscale_internal: Use more precise gamma Avoids failure of xyz12 fate tests on mingw and linux x86-32 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-08-18 19:12:46 +00:00
Marton Balint	b61e510e75	swscale/swscale_unscaled: use 8 line alignment for planarCopyWrapper with dithering Dithering relies on an 8 line dithering table and the code always uses it from the beginning. So in order to make dithering independent of the height of the slices used we must enforce an 8 line alignment. Fixes issue #20071. Signed-off-by: Marton Balint <cus@passwd.hu>	2025-08-12 21:56:09 +00:00
Dash Santosh	ca2a88c1b3	swscale/output: Implement yuv2nv12cx neon assembly yuv2nv12cX_2_512_accurate_c: 3540.1 ( 1.00x) yuv2nv12cX_2_512_accurate_neon: 408.0 ( 8.68x) yuv2nv12cX_2_512_approximate_c: 3521.4 ( 1.00x) yuv2nv12cX_2_512_approximate_neon: 409.2 ( 8.61x) yuv2nv12cX_4_512_accurate_c: 4740.0 ( 1.00x) yuv2nv12cX_4_512_accurate_neon: 604.4 ( 7.84x) yuv2nv12cX_4_512_approximate_c: 4681.9 ( 1.00x) yuv2nv12cX_4_512_approximate_neon: 603.3 ( 7.76x) yuv2nv12cX_8_512_accurate_c: 7273.1 ( 1.00x) yuv2nv12cX_8_512_accurate_neon: 1012.2 ( 7.19x) yuv2nv12cX_8_512_approximate_c: 7223.0 ( 1.00x) yuv2nv12cX_8_512_approximate_neon: 1015.8 ( 7.11x) yuv2nv12cX_16_512_accurate_c: 13762.0 ( 1.00x) yuv2nv12cX_16_512_accurate_neon: 1761.4 ( 7.81x) yuv2nv12cX_16_512_approximate_c: 13884.0 ( 1.00x) yuv2nv12cX_16_512_approximate_neon: 1766.8 ( 7.86x) Benchmarked on: Snapdragon(R) X Elite - X1E80100 - Qualcomm(R) Oryon(TM) CPU 3417 Mhz, 12 Core(s), 12 Logical Processor(s)	2025-08-12 09:05:00 +00:00
Logaprakash Ramajayam	49477972b7	swscale/aarch64/output: Implement neon assembly for yuv2planeX_10_c_template() yuv2yuvX_8_2_0_512_accurate_c: 2213.4 ( 1.00x) yuv2yuvX_8_2_0_512_accurate_neon: 147.5 (15.01x) yuv2yuvX_8_2_0_512_approximate_c: 2203.9 ( 1.00x) yuv2yuvX_8_2_0_512_approximate_neon: 154.1 (14.30x) yuv2yuvX_8_2_16_512_accurate_c: 2147.2 ( 1.00x) yuv2yuvX_8_2_16_512_accurate_neon: 150.8 (14.24x) yuv2yuvX_8_2_16_512_approximate_c: 2149.7 ( 1.00x) yuv2yuvX_8_2_16_512_approximate_neon: 146.8 (14.64x) yuv2yuvX_8_2_32_512_accurate_c: 2078.9 ( 1.00x) yuv2yuvX_8_2_32_512_accurate_neon: 139.0 (14.95x) yuv2yuvX_8_2_32_512_approximate_c: 2083.7 ( 1.00x) yuv2yuvX_8_2_32_512_approximate_neon: 140.5 (14.84x) yuv2yuvX_8_2_48_512_accurate_c: 2010.7 ( 1.00x) yuv2yuvX_8_2_48_512_accurate_neon: 138.2 (14.55x) yuv2yuvX_8_2_48_512_approximate_c: 2012.6 ( 1.00x) yuv2yuvX_8_2_48_512_approximate_neon: 141.2 (14.26x) yuv2yuvX_10LE_16_0_512_accurate_c: 7874.1 ( 1.00x) yuv2yuvX_10LE_16_0_512_accurate_neon: 831.6 ( 9.47x) yuv2yuvX_10LE_16_0_512_approximate_c: 7918.1 ( 1.00x) yuv2yuvX_10LE_16_0_512_approximate_neon: 836.1 ( 9.47x) yuv2yuvX_10LE_16_16_512_accurate_c: 7630.9 ( 1.00x) yuv2yuvX_10LE_16_16_512_accurate_neon: 804.5 ( 9.49x) yuv2yuvX_10LE_16_16_512_approximate_c: 7724.7 ( 1.00x) yuv2yuvX_10LE_16_16_512_approximate_neon: 808.6 ( 9.55x) yuv2yuvX_10LE_16_32_512_accurate_c: 7436.4 ( 1.00x) yuv2yuvX_10LE_16_32_512_accurate_neon: 780.4 ( 9.53x) yuv2yuvX_10LE_16_32_512_approximate_c: 7366.7 ( 1.00x) yuv2yuvX_10LE_16_32_512_approximate_neon: 780.5 ( 9.44x) yuv2yuvX_10LE_16_48_512_accurate_c: 7099.9 ( 1.00x) yuv2yuvX_10LE_16_48_512_accurate_neon: 761.0 ( 9.33x) yuv2yuvX_10LE_16_48_512_approximate_c: 7097.6 ( 1.00x) yuv2yuvX_10LE_16_48_512_approximate_neon: 754.6 ( 9.41x) Benchmarked on: Snapdragon(R) X Elite - X1E80100 - Qualcomm(R) Oryon(TM) CPU 3417 Mhz, 12 Core(s), 12 Logical Processor(s)	2025-08-12 09:05:00 +00:00
Kacper Michajłow	98c4b9dbbd	swscale/input: don't generate unused functions Fixes: input.c:1271:1: warning: unused function 'planar_rgb16_s12_to_a' Fixes: input.c:1272:1: warning: unused function 'planar_rgb16_s10_to_a' Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-08-11 19:29:53 +00:00
Michael Niedermayer	638b521c7b	Bump versions for master after release/8.0 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-08-09 18:03:05 +02:00
Michael Niedermayer	7eaa0f799a	Bump versions for release/8.0 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-08-09 17:30:39 +02:00
Timo Rothenpieler	262d41c804	all: fix typos found by codespell	2025-08-03 13:48:47 +02:00
Michael Niedermayer	aca41d3d93	swscale/output: Fix all bilinear integer overflows Ticket11686 hinted at one of these overflows; this fixes them all. Issue in line 1325/1326 found by HAORAN FANG <xfanghaoran@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-08-02 16:26:33 +00:00
Michael Niedermayer	c44d237d80	swscale/output: Fix integer overflow with lum/chr/alpha filter Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-08-02 16:26:33 +00:00
Niklas Haas	b7946098b1	swscale/alphablend: don't overread alpha plane on subsampled odd size This function overreads the input plane for odd dimensions, because the chroma plane is always rounded up, which means (xy << subsample) + 1 exceeds the actual alpha plane size. To verify: valgrind ffmpeg -pix_fmt yuva420p -f lavfi -i color -vf \ "scale=1x1,format=yuva420p,scale=alphablend=uniform_color,format=yuv420p \ -vframes 1 -f null - Fixes: https://trac.ffmpeg.org/ticket/11692	2025-07-31 11:32:20 +00:00
Kacper Michajłow	22da57c444	swscale/lut3d: remove unused function Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2025-07-22 19:56:34 +02:00
James Almer	11032d819d	swscale/swscale_unscaled: don't add offsets to more NULL pointers Continuation of `af9b43455a`. Signed-off-by: James Almer <jamrial@gmail.com>	2025-07-18 21:35:26 -03:00
James Almer	af9b43455a	swscale/swscale_unscaled: don't add offsets to NULL pointers Fixes: libswscale/swscale_unscaled.c:916:20: runtime error: applying zero offset to null pointer Signed-off-by: James Almer <jamrial@gmail.com>	2025-07-18 14:23:10 -03:00
Timo Rothenpieler	02a7c85753	swscale: add support for new 10/12 bit MSB formats	2025-07-11 17:49:58 +02:00
Michael Niedermayer	38ead08815	swscale/output: Fix integer overflows in yuv2rgba64_1_c_template() Fixes: signed integer overflow: -132524 * 16525 cannot be represented in type 'int' Fixes: 414862270/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-4869083202125824 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-07-06 19:24:07 +02:00
Andreas Rheinhardt	54c865fbec	swscale/utils: Fix potential race when initializing xyz tables Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-05-27 13:49:26 +02:00
Ramiro Polla	d028cf03b8	swscale/swscale_unscaled: fix planarRgbToplanarRgbWrapper() for formats with bpc between 9-14 bits Currently, planarRgbToplanarRgbWrapper() always sets the alpha value to 255, without taking the bit depth into consideration. This commit restricts the alpha value to the bit depth.	2025-05-23 00:07:56 +02:00
Ramiro Polla	748e960e04	swscale/swscale_unscaled: fix packed16togbra16() for formats with bpc between 9-14 bits Currently, packed16togbra16() always sets the alpha value to 0xFFFF, without taking the bit depth into consideration. This causes a bug on x86, which can be reproduced with: ./libswscale/tests/swscale -unscaled 1 -src xyz12le -dst gbrap12be The problem arises in ff_hscale14to15_4_ssse3(), in the conversion from gbrap12be to yuva444p, which comes after the conversion from xyz12le to gbrap12be. It has something to do with pmaddwd not working on unsigned values. There is some code to deal with 0xFFFF if the input has a bit depth of 16, but not for bit depths < 16. We could fix ff_hscale14to15_4_ssse3() to also work correctly with 0xFFFF on bit depths < 16, or we could just not write 0xFFFF there in the first place, which is what this commit does.	2025-05-23 00:01:04 +02:00
Ramiro Polla	0c1d87d1e6	swscale/swscale_unscaled: fix packed30togbra10() for formats with bpc between 9-14 bits Currently, packed30togbra10() always sets the alpha value to 0xFFFF, without taking the bit depth into consideration. This commit restricts the alpha value to the bit depth.	2025-05-23 00:00:05 +02:00
Ramiro Polla	a16c053a33	swscale/swscale_unscaled: fix planarCopyWrapper() for yuv444p => yuva444p Currently, planarCopyWrapper() assumes that src[3] must be NULL when the source format has no alpha plane. This commit updates the condition for filling the alpha plane based on the number of components available in the source format as well.	2025-05-22 23:59:39 +02:00
Niklas Haas	6072e27e9a	swscale/graph: prefer bools to ints This is more consistent with the rest of the newly added code, which universally switched to using bools for boolean values.	2025-05-18 15:00:45 +02:00
Niklas Haas	d95944786e	swscale/graph: move vshift() and shift_img() to shared header I need to reuse these inside `ops.c`.	2025-05-18 14:39:57 +02:00
Niklas Haas	bc9696bff8	swscale/graph: make noop loop more robust The current loop only works if the input and output have the same number of planes. However, with the new scaling logic, we can also optimize into a noop the case where the input has extra unneeded planes. For the memcpy fallback to work in these cases we have to instead check if the output pointer is set, rather than the input pointer.	2025-05-18 14:37:33 +02:00
Niklas Haas	51e912466f	swscale/graph: expose ff_sws_graph_add_pass So we can move pass-adding business logic outside of graph.c.	2025-05-18 14:37:33 +02:00
Niklas Haas	f297ebf97a	tests/swscale: improve colorization of speedup The old limits were a bit too tightly clustered around 1.0. Make the value range much more generous, and also introduce a new highlight for speedups above 10.0 (order of magnitude improvement).	2025-05-18 14:37:33 +02:00
Michael Niedermayer	23592f942d	swscale/output: fix integer overflow in yuv2rgba64_full_1_c_template() Fixes: signed integer overflow: -293650 * 16525 cannot be represented in type 'int' Fixes: 408304111/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-4762210299871232 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-05-15 03:03:58 +02:00
Andreas Rheinhardt	35fcdb2132	swscale/x86/rgb2rgb: Deduplicate ASM constants Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-04-13 22:49:21 +02:00
Michael Niedermayer	d16a058dbc	swscale/swscale: Do not crash on floats Fixes: shift exponent 32 is too large for 32-bit type 'unsigned int' Fixes: division by zero Fixes: 391981061/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6691017763389440 Fixes: 392929028/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5142088307507200 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-04-10 03:01:32 +02:00
Michael Niedermayer	ce538ef97a	swscale/output: Fix integer overflow in yuv2gbrp_full_X_c() Fixes: signed integer overflow: 1966895953 + 210305024 cannot be represented in type 'int' Fixes: 391921975/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5916798905548800 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2025-04-10 03:01:32 +02:00
Andreas Rheinhardt	435be31ef5	swscale/csputils: Remove unused ff_sws_matrix3x3_rmul() Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-04-03 06:04:57 +02:00
Andreas Rheinhardt	4da84d5c2b	swscale/swscale_unscaled: Actually use X2->RGBA64 conversions The conversion functions were added in `e7382b4d01`, yet they were never really enabled. Found via -ffunction-sections and --gc-sections. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-03-31 21:45:20 +02:00
Niklas Haas	3e32dc8b08	tests/swscale: allow setting log verbosity Helpful for debugging the new swscale code, since it dumps the operations list in verbose logging mode.	2025-03-31 12:19:26 +02:00
Niklas Haas	92a57f1cfd	tests/swscale: constrain reference SSIM for low bit depth formats Sometimes, the reference SSIM is significantly higher than the SSIM level expected for the test. This is the case when the source format has a much lower bit depth than the destination format. In this case, the fact that legacy swscale does not accurately preserve the source dither pattern gives it an unfair advantage in a direct comparison, leading to false positives. For example, conversion like rgb4 -> rgb565 should be lossless, but swscale low passes / downscales the input chroma, throwing away massive amounts of detail. This gives it a higher SSIM score since the lowpassed result removes some of the dither noise that was present in the source.	2025-03-31 12:19:26 +02:00
Niklas Haas	8fc9808f18	tests/swscale: calculate theoretical expected SSIM We can calculate with some confidence the theoretical expected SSIM from an "ideal" conversion, by computing the reference SSIM level for an image dithered with uniformly distributed quantization noise. This gives us an additional safety net to check for regressions even in the absence of a reference to compare against.	2025-03-31 12:19:26 +02:00
Niklas Haas	9549daa996	tests/swscale: remove stray whitespace in scanf format	2025-03-31 12:19:24 +02:00
Niklas Haas	a22faeb992	tests/swscale: check supported inputs for legacy swscale separately The new code path supports more formats, so we can't test them all against the legacy implementation.	2025-03-31 12:19:08 +02:00
Niklas Haas	e1736d0d0b	tests/swscale: print performance stats on exit	2025-03-31 12:19:08 +02:00
Niklas Haas	6c12b1535a	tests/swscale: switch from MSE to SSIM And bias it towards Y. This is much better at ignoring errors due to differing dither patterns, and rewards algorithms that lower luma noise at the cost of higher chroma noise. The (0.8, 0.1, 0.1) weights for YCbCr are taken from the paper: "Understanding SSIM" by Jim Nilsson and Tomas Akenine-Möller (https://arxiv.org/abs/2006.13846)	2025-03-31 12:19:07 +02:00
Niklas Haas	1707e81073	tests/swscale: use yuva444p as reference Instead of the lossy yuva420p. This does change the results compared to the status quo, but is more reflective of the actual strength of a conversion, since it will faithfully measure the round-trip error from subsampling and upsampling.	2025-03-31 12:18:35 +02:00
Niklas Haas	f438f3f8cd	tests/swscale: print speedup numbers in color	2025-03-31 12:18:35 +02:00
Niklas Haas	995986e512	tests/swscale: allow testing only unscaled convertors I need this to be able to test the new unscaled conversion code more quickly. We re-order the flags order to make 0 the first entry, so we don't set any flags when performing unscaled tests.	2025-03-31 12:18:35 +02:00
Niklas Haas	d467ceaa9b	tests/swscale: use hex format for flags values	2025-03-31 12:18:11 +02:00
Niklas Haas	0e2742a693	tests/swscale: allow choosing specific flags and dither mode So I can quickly iterate on the new swscale code.	2025-03-31 12:16:10 +02:00
James Almer	b338d1b35b	libs: bump major version for all libraries Signed-off-by: James Almer <jamrial@gmail.com>	2025-03-28 14:44:34 -03:00
Shreesh Adiga	26f2f03e0d	swscale/x86/rgb2rgb: optimize AVX2 version of uyvytoyuv422 Currently the AVX2 version of uyvytoyuv422 in the SIMD loop does the following: 4 vinsertq to have interleaving of the vector lanes during load from memory. 4 vperm2i128 inside 4 RSHIFT_COPY calls to achieve the desired layout. This patch replaces the above 8 instructions with 2 vpermq and 2 vpermd with a vector register similar to AVX512ICL version. Observed the following numbers on various microarchitectures: On AMD Zen3 laptop: Before: uyvytoyuv422_c: 51979.7 ( 1.00x) uyvytoyuv422_sse2: 5410.5 ( 9.61x) uyvytoyuv422_avx: 4642.7 (11.20x) uyvytoyuv422_avx2: 4249.0 (12.23x) After: uyvytoyuv422_c: 51659.8 ( 1.00x) uyvytoyuv422_sse2: 5420.8 ( 9.53x) uyvytoyuv422_avx: 4651.2 (11.11x) uyvytoyuv422_avx2: 3953.8 (13.07x) On Intel Macbook Pro 2019: Before: uyvytoyuv422_c: 185014.4 ( 1.00x) uyvytoyuv422_sse2: 22800.4 ( 8.11x) uyvytoyuv422_avx: 19796.9 ( 9.35x) uyvytoyuv422_avx2: 13141.9 (14.08x) After: uyvytoyuv422_c: 185093.4 ( 1.00x) uyvytoyuv422_sse2: 22795.4 ( 8.12x) uyvytoyuv422_avx: 19791.9 ( 9.35x) uyvytoyuv422_avx2: 12043.1 (15.37x) On AMD Zen4 desktop: Before: uyvytoyuv422_c: 29105.0 ( 1.00x) uyvytoyuv422_sse2: 3888.0 ( 7.49x) uyvytoyuv422_avx: 3374.2 ( 8.63x) uyvytoyuv422_avx2: 2649.8 (10.98x) uyvytoyuv422_avx512icl: 1615.0 (18.02x) After: uyvytoyuv422_c: 29093.4 ( 1.00x) uyvytoyuv422_sse2: 3874.4 ( 7.51x) uyvytoyuv422_avx: 3371.6 ( 8.63x) uyvytoyuv422_avx2: 2174.6 (13.38x) uyvytoyuv422_avx512icl: 1625.1 (17.90x) Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>	2025-03-23 15:25:48 +00:00
Andreas Rheinhardt	c94143350f	avutil/libm: Only include intfloat.h when needed Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-03-22 03:35:28 +01:00
Andreas Rheinhardt	65154ba994	swscale/tests/swscale: Fix potential buffer overflow The field width in a %s directive gives the number of characters to read from the input and not the size of the receiving buffer; the latter must of course also have space for the trailing \0, which has been forgotten here. The commit adds it (and fixes a -Wfortify-source warning from Clang). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-03-21 04:30:09 +01:00
Andreas Rheinhardt	dff498fddf	avutil/csp: Improve enum range comparisons The underlying integer type of an enumeration is implementation-defined (see C11, 6.7.2.2 (4)); GCC defaults to unsigned if there are no negative values like for all enums from pixfmt.h except enum AVPixelFormat. This means that tests like "if (csp >= AVCOL_SPC_NB)" for invalid colorspaces need not work as expected (namely if enum AVColorSpace is signed). It also means that testing for such an enum variable to be >= 0 may be tautologically true. Clang emits a -Wtautological-unsigned-enum-zero-compare warning for this. Fix both of these issues by casting to unsigned. Also do the same in libswscale/format.c. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2025-03-21 04:30:09 +01:00

2811 Commits