Commit Graph

2873 Commits

Author SHA1 Message Date
Niklas Haas
7b773aba82 swscale/format: merge fmt_* helpers into a single fmt_analyze()
Handles the split between "regular" and "irregular" pixel formats in a single
place, and opens up the door for more complicated formats.
2025-12-09 09:47:48 +00:00
Niklas Haas
a8dc3a543c swscale/format: consolidate format information into a single struct
I use a switch/case instead of an array because this is only needed for
irregular formats, which are very sparse.
2025-12-09 09:47:48 +00:00
Niklas Haas
3160ace20a swscale/format: derive fmt_read_write() for regular formats
2025-12-09 09:47:48 +00:00
Niklas Haas
004127f00b swscale/format: explicitly test for unsupported subsampled formats
This includes semiplanar formats. Note that the first check typically
subsumes the second check, but I decided to keep both for clarity.
2025-12-09 09:47:48 +00:00
Niklas Haas
748855b227 swscale/format: derive fmt_shift() from AVPixFmtDescriptor
XV36 is the odd one out, being a byte-shifted packed format whose components
don't actually cross any byte boundaries.
2025-12-09 09:47:48 +00:00
Niklas Haas
2feb848252 swscale/format: derive fmt_swizzle() from AVPixFmtDescriptor when possible
Unfortunately, this is exceptionally difficult to handle in the general case,
when packed/bitstream formats come into play - the actual interpretation of
the offset, shift etc. is so difficult to deal with in a general case that
I think it's simpler to continue falling back to a static table of variants
for these exceptions. They are fortunately small in number.
2025-12-09 09:47:48 +00:00
Niklas Haas
83d572e6f6 swscale/format: check SwsPixelType in fmt_read_write()
This is the only function that actually has the ability to return an
error, so just move the pixel type assignment here and add a check to
ensure a valid pixel type is found.
2025-12-09 09:47:48 +00:00
Niklas Haas
ef2ce57c31 swscale/format: exclude U32 from sws_pixel_type()
This function is supposed to give us representable pixel types; but U32 is not
representable (due only to the AVRational range limit).
2025-12-09 09:47:48 +00:00
Martin Storsjö
69b4474367 swscale/tests: Fix fate-sws-ops-list on Windows
Set stdout to binary mode, to avoid platform specific differences
in the output that is hashed.
2025-12-08 23:02:30 +02:00
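The binary-mode change described in 69b4474367 can be sketched roughly as follows. This is a minimal illustration, not the code from the test itself: `set_stdout_binary` is a hypothetical helper name, and the sketch assumes the Microsoft C runtime's `_setmode()`, `_fileno()` and `_O_BINARY`.

```c
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

/* Hypothetical helper: switch stdout to binary mode so that "\n" is not
 * expanded to "\r\n" on Windows, which would change the hashed output.
 * Returns 0 on success and -1 on failure; it is a no-op on other platforms. */
static int set_stdout_binary(void)
{
#ifdef _WIN32
    return _setmode(_fileno(stdout), _O_BINARY) == -1 ? -1 : 0;
#else
    return 0;
#endif
}
```

Calling such a helper once at the start of `main()`, before anything is printed, keeps the byte stream identical across platforms.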
Niklas Haas
c94e8afe5d swscale/ops: clarify SwsOpList.src/dst semantics
Turns out these are not, in fact, purely informative - but the optimizer
can take them into account. This should be documented properly.

I tried to think of a way to avoid needing this in the optimizer, but any
way I could think of would require shoving this to SwsReadWriteOp, which I
am particularly unwilling to do.
2025-12-08 20:09:37 +00:00
Niklas Haas
f39fe6380c swscale/ops_optimizer: set correct value range for subpixel reads
e.g. rgb4 only reads values up to 15, not 255.

Setting this correctly eliminates a number of redundant clamps in cases
like e.g. rgb4 -> monow.
2025-12-08 20:09:37 +00:00
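The reasoning in f39fe6380c (a narrower input range makes later clamps redundant) can be illustrated with a small sketch. The helper names below are hypothetical and do not correspond to the optimizer's actual data structures; the sketch only shows the arithmetic behind the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest value an unsigned read of `bits` bits can produce (1 <= bits <= 32).
 * Hypothetical helper, for illustration only. */
static uint32_t read_max_value(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return bits == 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
}

/* A clamp to [0, limit] is a no-op when the input can never exceed limit. */
static bool clamp_is_redundant(uint32_t input_max, uint32_t limit)
{
    return input_max <= limit;
}
```

With a 4-bit read the maximum is 15, so a later clamp to 255 can never trigger; if the read were assumed to produce up to 255, a clamp to 15 would have to stay.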
Kacper Michajłow
a8df08f628 swscale/x86/yuv2yuvX: don't use deprecated hexadecimal prefix
Fixes: warning: $ prefix for hexadecimal is deprecated [-w+number-deprecated-hex]
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-08 17:43:29 +00:00
Niklas Haas
016749f2e1 swscale/tests: add new test for generated operation lists
This is similar to swscale/tests/swscale.c, but significantly cheaper - it
merely prints the generated (optimized) operation list for every format
conversion.

Mostly useful for my own purposes as a regression test when making changes
to the ops optimizer. Note the distinction between this and tests/swscale.c,
the latter of which tests the result of *applying* an operation list for
equality.

There is an argument to be made that the two tests could be merged, but
I think the overlap is too small to be worth reconciling the
differences.
2025-12-08 16:58:53 +00:00
Niklas Haas
d54090fe9f swscale/format: add assertion to prevent nan/inf matrix coeffs
2025-12-08 16:58:53 +00:00
Niklas Haas
868426814a swscale/format: handle YA format swizzles more robustly
This code was previously broken, since YAF32BE/LE were not included as
part of the format enumeration. However, since we *always* know the correct
swizzle for YA formats, we can just special-case this by the number of
components instead.
2025-12-08 16:58:53 +00:00
Arpad Panyik
1f30ff30fb swscale: Add AArch64 Neon path for xyz12Torgb48 LE
Add optimized Neon code path for the little endian case of the
xyz12Torgb48 function. The innermost loop processes the data in 4x2
pixel blocks using software gathers with the matrix multiplication
and clipping done by Neon.

Relative runtime of micro benchmarks after this patch on some
Cortex and Neoverse CPU cores:

 xyz12le_rgb48le    X1      X3      X4    X925      V2
 16x4_neon:       2.55x   4.34x   3.84x   3.31x   3.22x
 32x4_neon:       2.39x   3.63x   3.22x   3.35x   3.29x
 64x4_neon:       2.37x   3.31x   2.91x   3.33x   3.27x
 128x4_neon:      2.34x   3.28x   2.91x   3.35x   3.24x
 256x4_neon:      2.30x   3.17x   2.91x   3.32x   3.10x
 512x4_neon:      2.26x   3.10x   2.91x   3.30x   3.07x
 1024x4_neon:     2.26x   3.07x   2.96x   3.30x   3.05x
 1920x4_neon:     2.26x   3.06x   2.93x   3.28x   3.04x

 xyz12le_rgb48le   A76     A78    A715    A720    A725
 16x4_neon:       2.33x   2.28x   2.53x   3.33x   3.19x
 32x4_neon:       2.35x   2.18x   2.45x   3.23x   3.24x
 64x4_neon:       2.35x   2.16x   2.42x   3.15x   3.21x
 128x4_neon:      2.35x   2.13x   2.39x   3.00x   3.09x
 256x4_neon:      2.36x   2.12x   2.35x   2.85x   2.99x
 512x4_neon:      2.35x   2.14x   2.35x   2.78x   2.95x
 1024x4_neon:     2.31x   2.09x   2.33x   2.80x   2.91x
 1920x4_neon:     2.30x   2.07x   2.32x   2.81x   2.94x

 xyz12le_rgb48le   A55    A510    A520
 16x4_neon:       2.09x   1.92x   2.36x
 32x4_neon:       2.05x   1.89x   2.38x
 64x4_neon:       2.02x   1.77x   2.35x
 128x4_neon:      1.96x   1.74x   2.25x
 256x4_neon:      1.90x   1.72x   2.19x
 512x4_neon:      1.83x   1.75x   2.16x
 1024x4_neon:     1.83x   1.62x   2.15x
 1920x4_neon:     1.82x   1.60x   2.15x

Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
2025-12-05 10:28:18 +00:00
Arpad Panyik
ef651b84ce swscale: Refactor XYZ+RGB state and add function hooks
Prepare for xyz12Torgb48 architecture-specific optimizations in
subsequent patches by:
 - Grouping XYZ+RGB gamma LUTs and 3x3 matrices into SwsColorXform
   (ctx->xyz2rgb and ctx->rgb2xyz), replacing scattered fields.
 - Dropping the unused last matrix column, giving the same or smaller
   SwsInternal size.
 - Renaming ff_xyz12Torgb48 and ff_rgb48Toxyz12 and routing calls via
   the new per-context function pointers (ctx->xyz12Torgb48 and
   ctx->rgb48Toxyz12) in graph.c and swscale.c.
 - Adding ff_sws_init_xyzdsp and invoking it in swscale init paths
   (normal and unscaled).
 - Making fill_xyztables public to ease its setup later in checkasm.

These modifications do not introduce any functional changes.

Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
2025-12-05 10:28:18 +00:00
Andreas Rheinhardt
9e038fd959 swscale/tests/swscale: Fix typo
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-05 10:42:01 +01:00
Andreas Rheinhardt
eccf130fdb {lib{avcodec,swscale}/x86/,}Makefile: Kill MMX-OBJS
Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Hao Chen
a6206a31ea swscale: Fix out-of-bounds write errors in yuv2rgb_lasx.c file.
The patch adds support for dstw values ending in 2, 4, 6, 8, 10, 12, and 14,
which fixes the out-of-bounds write problem.
2025-11-28 03:40:47 +00:00
Martin Storsjö
3cc1dc3358 swscale: Remove the unused ff_sws_pixel_type_to_uint
This function uses ff_sws_pixel_type_size to switch on the
size of the provided type. However, ff_sws_pixel_type_size returns
a size in bytes (from sizeof()), not a size in bits. Therefore,
this would previously never return the right thing but always
hit the av_unreachable() below.

As the function is entirely unused, just remove it.

This fixes compilation with MSVC 2026 18.0 when targeting ARM64,
which previously hit an internal compiler error [1].

[1] https://developercommunity.visualstudio.com/t/Internal-Compiler-Error-targeting-ARM64-/10962922
2025-11-21 21:07:34 +00:00
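The bytes-versus-bits mismatch described in 3cc1dc3358 can be reproduced in a few lines. The enum and functions below are hypothetical stand-ins rather than the removed swscale code; they only show why a switch on a `sizeof()` result never matches bit counts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pixel type enum, for illustration only. */
typedef enum { PIXEL_NONE, PIXEL_U8, PIXEL_U16, PIXEL_U32 } PixelType;

/* Returns the size in BYTES, as sizeof() does. */
static size_t pixel_type_size(PixelType t)
{
    switch (t) {
    case PIXEL_U8:  return sizeof(uint8_t);
    case PIXEL_U16: return sizeof(uint16_t);
    case PIXEL_U32: return sizeof(uint32_t);
    default:        return 0;
    }
}

/* Buggy pattern: the case labels are bit counts, but the switch value is
 * a byte count (1, 2 or 4), so no case ever matches. */
static PixelType to_uint_buggy(PixelType t)
{
    switch (pixel_type_size(t)) {
    case 8:  return PIXEL_U8;
    case 16: return PIXEL_U16;
    case 32: return PIXEL_U32;
    }
    return PIXEL_NONE; /* always reached */
}

/* Consistent variant: convert bytes to bits before switching. */
static PixelType to_uint_fixed(PixelType t)
{
    switch (pixel_type_size(t) * 8) {
    case 8:  return PIXEL_U8;
    case 16: return PIXEL_U16;
    case 32: return PIXEL_U32;
    }
    return PIXEL_NONE;
}
```

The commit itself removes the unused function instead of fixing it; the fixed variant is shown only to make the unit mismatch explicit.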
James Almer
06b3a20761 swscale/ops_tmpl_int: fix signed integer related UB when shifting values
Fixes:
src/libswscale/ops_tmpl_int.c:292:23: runtime error: left shift of 188 by 24 places cannot be represented in type 'int'
src/libswscale/ops_tmpl_int.c:290:23: runtime error: left shift of 158 by 24 places cannot be represented in type 'int'
src/libswscale/ops_tmpl_int.c:293:23: runtime error: left shift of 136 by 24 places cannot be represented in type 'int'
src/libswscale/ops_tmpl_int.c:291:23: runtime error: left shift of 160 by 24 places cannot be represented in type 'int'

Signed-off-by: James Almer <jamrial@gmail.com>
2025-11-21 18:40:58 +00:00
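The sanitizer reports quoted in 06b3a20761 come from shifting a small unsigned value that has been promoted to `int`. A minimal sketch of the usual remedy follows; `pack_u8x4` is a hypothetical helper and not the code at the reported lines of ops_tmpl_int.c.

```c
#include <stdint.h>

/* Pack four 8-bit components into one 32-bit word.
 * A uint8_t operand is promoted to (signed) int before the shift, so
 * `a << 24` cannot be represented in int whenever a >= 128. Casting to
 * uint32_t first keeps the whole expression in unsigned arithmetic,
 * which is well defined. */
static uint32_t pack_u8x4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c <<  8) |  (uint32_t)d;
}
```

The value 188 from the first report is a convenient check: shifted by 24 places as an unsigned value it yields 0xBC000000.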
James Almer
30d66be21a swscale/x86/ops: fix signed integer related UB in normalize_clear()
Signed-off-by: James Almer <jamrial@gmail.com>
2025-11-21 18:40:58 +00:00
Lynne
d916803290 swscale: allow extended primaries
2025-11-10 21:50:58 +00:00
Lynne
b982b2a2a3 Revert "swscale: add support for 10/12-bit grayscale MSB pixfmts"
This reverts commit a5be0ecbfd.
2025-11-06 21:46:41 +01:00
Lynne
72a19a1c4a Revert "swscale: add support for 10/12-bit 422 and 444 MSB pixfmts"
This reverts commit bc0ee8b7cc.
2025-11-06 21:44:13 +01:00
Lynne
deaece6e56 Revert "swscale/format: add missing fmt_shift for gray12/12 msb formats"
This reverts commit c9710dae3c.
2025-11-06 21:44:13 +01:00
Ramiro Polla
4bee010844 swscale/range_convert: fix truncation bias in range conversion
384fe39623 introduced a regression in the
range conversion offset calculation, resulting in a slight green tint
in full-range RGB to YUV conversions of grayscale values.

The offset being calculated was not taking into consideration a bias
needed for correctly rounding the result from the multiplication stage,
leading to a truncated value.

Fixes issue #11646.
2025-11-06 20:36:08 +00:00
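The rounding bias mentioned in 4bee010844 can be illustrated with a generic fixed-point range conversion. The constants and function names below are assumptions chosen for the example (an 8-bit full-to-limited luma mapping with a 14-bit shift); they are not the coefficients or code used in swscale/range_convert.

```c
/* Illustrative constants, not taken from swscale. */
#define RANGE_SHIFT  14
#define RANGE_COEFF  14071                 /* round(219.0 / 255.0 * (1 << 14)) */
#define RANGE_OFFSET (16 << RANGE_SHIFT)   /* limited-range black level */

/* Without a bias, the right shift truncates the fractional part. */
static int to_limited_truncating(int y)
{
    return (y * RANGE_COEFF + RANGE_OFFSET) >> RANGE_SHIFT;
}

/* Adding half of the divisor before the shift rounds to nearest. */
static int to_limited_rounding(int y)
{
    return (y * RANGE_COEFF + RANGE_OFFSET + (1 << (RANGE_SHIFT - 1))) >> RANGE_SHIFT;
}
```

For an input of 128 the exact result is about 125.93: the truncating version returns 125, while the biased version returns 126.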
Niklas Haas
9a386078cc tests/swscale: use av_log() where appropriate
We can't use ANSI color codes inside av_log(), so fall back to printf()
for these; but match the INFO verbosity level.

Also change the format slightly to drop SSIM numbers down to just below
VERBOSE level, since VERBOSE tends to generate a lot of swscale related
spam.
2025-11-06 20:34:51 +00:00
Niklas Haas
c9710dae3c swscale/format: add missing fmt_shift for gray12/12 msb formats
The MSB YUV formats were added, but the gray formats were not. Seems to have
been an oversight.

Fixes: a5be0ecbfd
2025-11-06 15:56:24 +01:00
Lynne
aeb9b19ebc lavu: add support for Panasonic V-Log
2025-10-28 20:46:21 +01:00
Lynne
bc0ee8b7cc swscale: add support for 10/12-bit 422 and 444 MSB pixfmts
2025-10-27 22:59:41 -03:00
Lynne
a5be0ecbfd swscale: add support for 10/12-bit grayscale MSB pixfmts
2025-10-27 22:59:40 -03:00
Michael Niedermayer
566e9032b1 swscale/output: Fix unsigned cast position in yuv2*
Fixes: signed overflow

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-14 20:55:54 +02:00
Michael Niedermayer
0c6b7f9483 swscale/output: Fix integer overflow in yuv2ya16_X_c_template()
Found-by: colod colod <colodcolod7@gmail.com>

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-10-14 20:55:53 +02:00
Andreas Rheinhardt
8a34faa250 swscale/ppc/swscale_ppc_template: Fix av_unused placement
Forgotten in d6cb0d2c2b.

Reviewed-by: Sean McGovern <gseanmcg@gmail.com>
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-26 22:38:13 +02:00
Kacper Michajłow
d6cb0d2c2b ALL: move av_unused to conform with standard requirement
This is required placement by standard [[maybe_unused]] attribute, works
the same for __attribute__((unused)).

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-09-26 16:15:46 +00:00
Kacper Michajłow
1294ab5db1 swscale/ops_tmpl_int: remove unused arguments from wrap read decl
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-09-13 19:12:44 +02:00
Kacper Michajłow
66faef3dbe swscale/ops_chain: add type removed ff_sws_op_chain_free_cb
to avoid pointer casting and the UB of calling a function through a
different pointer type.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-09-13 18:14:02 +02:00
Andreas Rheinhardt
a4fd3f27f4 swscale/x86/ops: Fix leak
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-12 22:42:30 +02:00
Andreas Rheinhardt
c74ee4ceff swscale/ops_chain: Free correct pointer on error
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-12 22:41:24 +02:00
Andreas Rheinhardt
2451e06f19 all: Use "" instead of <> to include internal headers
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-04 22:20:58 +02:00
Andreas Rheinhardt
7dc4c4f6f5 swscale/ops: Fix linking with x86 assembly disabled
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-04 22:14:39 +02:00
Martin Storsjö
a3360c0eaf swscale: Don't pass a concrete SwsOpPriv as parameter
This fixes the following compiler error, if compiling with MSVC
for ARM (32 bit):

    src/libswscale/ops_chain.c(48): error C2719: 'priv': formal parameter with requested alignment of 16 won't be aligned

This change shouldn't affect the performance of this operation
(which in itself probably isn't relevant); instead of copying the
contents of the SwsOpPriv struct from the stack as parameter,
it gets copied straight from the caller function's stack frame
instead.

Separately from this issue, MSVC 17.8 and 17.9 end up in an
internal compiler error when compiling libswscale/ops.c, but
older and newer versions do compile it successfully.
2025-09-03 20:18:03 +00:00
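The change in a3360c0eaf (passing an over-aligned struct by pointer rather than by value) can be sketched as follows. `OpPriv`, `ChainEntry` and `chain_set_priv` are hypothetical names standing in for the real types, and the 16-byte alignment is an assumption based on the quoted error message.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-byte aligned private data blob. */
typedef union {
    alignas(16) uint8_t u8[16];
    uint32_t            u32[4];
} OpPriv;

typedef struct {
    OpPriv priv;
} ChainEntry;

/* A by-value signature such as
 *     void chain_set_priv(ChainEntry *e, OpPriv priv);
 * is the kind of declaration that 32-bit MSVC rejects with C2719, since
 * it cannot guarantee 16-byte alignment for a stack parameter.
 * Taking a pointer avoids the problem; the data is still copied once. */
static void chain_set_priv(ChainEntry *e, const OpPriv *priv)
{
    e->priv = *priv;
}
```

The caller keeps its own `OpPriv` on the stack and passes its address, so the copy is made directly from the caller's frame.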
Zhao Zhili
a0e00bed88 swscale/ops: fix build with msvc
x86/ops.c(500): error C2099: initializer is not a constant
2025-09-02 10:28:56 +00:00
Niklas Haas
4ec2bffe62 configure: allow disabling experimental swscale code
In theory we can also expand this to disable e.g. experimental codecs.
2025-09-01 19:28:36 +02:00
Niklas Haas
cc42bc1f4b swscale/graph: allow experimental use of new format handler
The humor originally contained in this commit message has been
redacted to comply with the strict FFmpeg code quality standards.
2025-09-01 19:28:36 +02:00
Niklas Haas
f8f7935a97 swscale/format: add new format decode/encode logic
This patch adds format handling code for the new operations. This entails
fully decoding a format to standardized RGB, and the inverse.

Handling it this way means we can always guarantee that a conversion path
exists from A to B without having to explicitly cover logic for each path;
and choosing RGB instead of YUV as the intermediate (as was done in swscale
v1) is more flexible with regard to enabling further operations such as
primaries conversions, linear scaling, etc.

In the case of YUV->YUV transform, the redundant matrix multiplication will
be canceled out anyway.
2025-09-01 19:28:36 +02:00
Niklas Haas
982d3a98d0 swscale/x86: add SIMD backend
This covers most 8-bit and 16-bit ops, and some 32-bit ops. It also covers all
floating point operations. While this is not yet 100% coverage, it's good
enough for the vast majority of formats out there.

Of special note is the packed shuffle fast path, which uses pshufb at vector
sizes up to AVX512.
2025-09-01 19:28:36 +02:00
Niklas Haas
a151b426f9 swscale/ops_memcpy: add 'memcpy' backend for plane->plane copies
Provides a generic fast path for any operation list that can be decomposed
into a series of memcpy and memset operations.

25% faster than the x86 backend for yuv444p -> yuva444p
33% faster than the x86 backend for gray -> yuvj444p
2025-09-01 19:28:36 +02:00
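The idea behind a151b426f9, decomposing a conversion into memcpy and memset calls, can be sketched for the yuv444p -> yuva444p case on a single row. The function below is a hypothetical illustration for 8-bit planes and does not reflect the actual ops_memcpy interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one row of 8-bit yuv444p to yuva444p: the Y, U and V planes are
 * copied unchanged and the new alpha plane is filled with opaque (0xFF). */
static void yuv444p_to_yuva444p_row(uint8_t *const dst[4],
                                    const uint8_t *const src[3],
                                    size_t width)
{
    for (int p = 0; p < 3; p++)
        memcpy(dst[p], src[p], width);
    memset(dst[3], 0xFF, width);
}
```

No per-pixel arithmetic is involved, which is why a path like this can outperform a general SIMD kernel for such conversions.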