Commit Graph

21822 Commits

Author SHA1 Message Date
Martin Storsjö
c513fcd7d2 aarch64: vp8: Fix a typo in a comment
Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:46:00 +02:00
Martin Storsjö
f1011ea28a aarch64: vp8: Reorder the function pointer inits to match the arm original
Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:56 +02:00
Martin Storsjö
b4b27dce95 aarch64: vp8: Move the vp8dsp makefile entries to the right places
Even if NEON is disabled, the init functions should be built,
as they are called as long as ARCH_AARCH64 is set.

These functions are part of a generic DSP subsystem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:53 +02:00
Martin Storsjö
ad32f7b126 aarch64: vp8: Remove superfluous includes
This fixes building with MSVC, which lacks unistd.h.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:50 +02:00
Martin Storsjö
85bfaa4949 aarch64: vp8: Use the proper aarch64 form for conditional branches
The previous form does seem to assemble with current tools,
but I think it might fail with some older aarch64 tools.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:47 +02:00
Martin Storsjö
2eeac79936 aarch64: vp8: Fix assembling with armasm64
Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:44 +02:00
Martin Storsjö
26d7af4c38 aarch64: vp8: Fix assembling with clang
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).

The movrel macro invocations need to pass the offset via a separate
parameter. Mach-O and COFF relocations don't allow a negative
offset to a symbol, which is handled properly if the offset is passed
via the parameter. If no offset parameter is given, the macro
evaluates to something like "adrp x17, subpel_filters-16+(0)", which
older clang versions also fail to parse (the older clang versions
only support one single offset term, although it can be a parenthesis).

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:41 +02:00
Magnus Röös
0801853e64 libavcodec: vp8 neon optimizations for aarch64
Partial port of the ARM NEON code to aarch64.

Benchmarks from fate:

benchmarking with Linux Perf Monitoring API
nop: 58.6
checkasm: using random seed 1760970128
NEON:
 - vp8dsp.idct       [OK]
 - vp8dsp.mc         [OK]
 - vp8dsp.loopfilter [OK]
checkasm: all 21 tests passed
vp8_idct_add_c: 201.6
vp8_idct_add_neon: 83.1
vp8_idct_dc_add_c: 107.6
vp8_idct_dc_add_neon: 33.8
vp8_idct_dc_add4y_c: 426.4
vp8_idct_dc_add4y_neon: 59.4
vp8_loop_filter8uv_h_c: 688.1
vp8_loop_filter8uv_h_neon: 216.3
vp8_loop_filter8uv_inner_h_c: 649.3
vp8_loop_filter8uv_inner_h_neon: 195.3
vp8_loop_filter8uv_inner_v_c: 544.8
vp8_loop_filter8uv_inner_v_neon: 131.3
vp8_loop_filter8uv_v_c: 706.1
vp8_loop_filter8uv_v_neon: 141.1
vp8_loop_filter16y_h_c: 668.8
vp8_loop_filter16y_h_neon: 242.8
vp8_loop_filter16y_inner_h_c: 647.3
vp8_loop_filter16y_inner_h_neon: 224.6
vp8_loop_filter16y_inner_v_c: 647.8
vp8_loop_filter16y_inner_v_neon: 128.8
vp8_loop_filter16y_v_c: 721.8
vp8_loop_filter16y_v_neon: 154.3
vp8_loop_filter_simple_h_c: 387.8
vp8_loop_filter_simple_h_neon: 187.6
vp8_loop_filter_simple_v_c: 384.1
vp8_loop_filter_simple_v_neon: 78.6
vp8_put_epel8_h4v4_c: 3971.1
vp8_put_epel8_h4v4_neon: 855.1
vp8_put_epel8_h4v6_c: 5060.1
vp8_put_epel8_h4v6_neon: 989.6
vp8_put_epel8_h6v4_c: 4320.8
vp8_put_epel8_h6v4_neon: 1007.3
vp8_put_epel8_h6v6_c: 5449.3
vp8_put_epel8_h6v6_neon: 1158.1
vp8_put_epel16_h6_c: 6683.8
vp8_put_epel16_h6_neon: 831.8
vp8_put_epel16_h6v6_c: 11110.8
vp8_put_epel16_h6v6_neon: 2214.8
vp8_put_epel16_v6_c: 7024.8
vp8_put_epel16_v6_neon: 799.6
vp8_put_pixels8_c: 112.8
vp8_put_pixels8_neon: 78.1
vp8_put_pixels16_c: 131.3
vp8_put_pixels16_neon: 129.8

This contains a fix to include guards by Carl Eugen Hoyos.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-02-19 11:45:33 +02:00
Janne Grunau
156ea66c91 h264/x86: sign extend int stride in deblock functions
Fixes checkasm errors after adding the h264 deblock tests.
2019-01-27 11:16:31 +01:00
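The stride fix above can be illustrated in plain C. This is only a sketch, assuming the usual 64-bit calling conventions, in which an `int` argument occupies the low 32 bits of a register and the upper bits are not guaranteed to hold a sign extension. The function names are hypothetical and model what the assembly does; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models asm that uses the full 64-bit register as if the upper half were
 * zero: a negative 32-bit stride turns into a huge positive offset. */
static int64_t stride_zero_extended(int stride)
{
    return (int64_t)(uint32_t)stride;
}

/* Models the fix: widen the 32-bit stride with sign extension first
 * (the job of sxtw on aarch64 or movsxd on x86-64). */
static int64_t stride_sign_extended(int stride)
{
    return (int64_t)stride;
}

/* Pointer to a neighbouring row, computed with a properly widened stride. */
static const uint8_t *row_at(const uint8_t *pix, int stride, int row)
{
    assert(pix != NULL);
    return pix + (ptrdiff_t)stride * row;
}
```

With a stride of -16, the sign-extended form yields -16 while the zero-extended form yields 4294967280, which is the kind of out-of-range address a test with negative strides would expose.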
Martin Storsjö
eec93e5709 libopenh264dec: Use a newer decoding entry point function
The "new" entry point actually has existed since OpenH264 1.4 in
2015 and is the recommended decoding entry point.

The name of this function, DecodeFrameNoDelay, is rather backwards
considering that it doesn't return the latest decoded frame immediately,
but actually does proper delaying and reordering of frames.

Signed-off-by: Martin Storsjö <martin@martin.st>
2019-01-26 21:13:03 +02:00
Janne Grunau
28a8b5413b h264/aarch64: add intra loop filter neon asm
Add my neon asm from x264 relicensed under the LGPL 2.1 or later. Ported
(x264 uses nv12 chroma) and optimized.

Cycle count for checkasm --bench on a Snapdragon 820e:
h264_h_loop_filter_luma_intra_8bpp_c: 60.0
h264_h_loop_filter_luma_intra_8bpp_neon: 54.2
h264_v_loop_filter_luma_intra_8bpp_c: 148.3
h264_v_loop_filter_luma_intra_8bpp_neon: 73.8
h264_h_loop_filter_chroma_intra_8bpp_c: 27.8
h264_h_loop_filter_chroma_intra_8bpp_neon: 21.4
h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 15.8
h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon: 15.7
h264_v_loop_filter_chroma_intra_8bpp_c: 45.8
h264_v_loop_filter_chroma_intra_8bpp_neon: 17.3
2019-01-26 12:05:10 +01:00
Janne Grunau
846c3d6aca h264/aarch64: optimize neon loop filter
Exit as soon as possible if no filtering will be done.

Improves the checkasm --bench cycle count on a Snapdragon 820e:
h264_h_loop_filter_luma_8bpp_c:      72.4 ->  72.5
h264_h_loop_filter_luma_8bpp_neon:   97.1 ->  56.3
h264_v_loop_filter_luma_8bpp_c:     174.0 -> 173.5
h264_v_loop_filter_luma_8bpp_neon:   62.9 ->  60.9
h264_h_loop_filter_chroma_8bpp_c:    30.2 ->  30.3
h264_h_loop_filter_chroma_8bpp_neon: 51.6 ->  25.7
h264_v_loop_filter_chroma_8bpp_c:    57.3 ->  57.3
h264_v_loop_filter_chroma_8bpp_neon: 28.0 ->  24.0
2019-01-26 12:05:10 +01:00
Janne Grunau
bb515e3a73 h264/aarch64: sign extend int stride in loop filter asm
2019-01-26 12:05:10 +01:00
James Almer
ca44fa5d7f avcodec/libdav1d: properly free all output picture references
Dav1dPictures contain more than one buffer reference, so we're forced to use the
API properly to free them all.

Signed-off-by: James Almer <jamrial@gmail.com>
2019-01-23 17:39:20 -03:00
Luca Barbato
90adbf4abf cook: Use the correct table for 6-bit stereo coupling
Thanks to Kostya for digging it out and telling me.
2019-01-17 14:58:03 +01:00
James Almer
70ab2778be libdav1d: update API usage to the first stable release
The color fields were moved to another struct, and a way to propagate
timestamps and other input metadata was introduced, so the packet
fifo can be removed.

Add support for 12bit streams, an option to disable film grain, and
read the profile from the sequence header referenced by the output
picture instead of guessing based on output pix_fmt.

Signed-off-by: James Almer <jamrial@gmail.com>
2018-12-12 19:56:16 -03:00
James Almer
56f50183f3 libdav1d: fix build after a recent API break
Signed-off-by: James Almer <jamrial@gmail.com>
2018-11-14 22:04:35 -03:00
Linjie Fu
e716323fa8 qsvenc: Add VDENC support for H264 and HEVC
Add VDENC (low-power mode) support for QSV H264 and HEVC.

It's an experimental feature (like lowpower in vaapi) with
some limitations:
- CBR/VBR require HuC, which should be explicitly loaded via the i915
module parameter (i915.enable_guc=2 for Linux kernel version >= 4.16)
- HEVC VDENC is supported on Ice Lake and later

Use the option "-low_power 1" to enable VDENC.

Signed-off-by: Linjie Fu <linjie.fu@intel.com>
2018-11-13 16:36:04 +00:00
James Almer
9bf9358b61 avcodec: libdav1d AV1 decoder wrapper.
Originally written by Ronald S. Bultje, with fixes, optimizations and
improvements by James Almer.

Signed-off-by: James Almer <jamrial@gmail.com>
2018-11-06 12:40:27 -03:00
Martin Storsjö
80f85a95da libx264: Pass the reordered_opaque field through the encoder
libx264 does have a field for opaque data to pass along with frames
through the encoder, but it is a pointer, while the libavcodec
reordered_opaque field is an int64_t. Therefore, allocate an array
within the libx264 wrapper, where reordered_opaque values in flight
are stored, and pass a pointer to this array to libx264.

Update the public libavcodec documentation for the AVCodecContext
field to explain this usage, and add a codec capability that allows
detecting whether an encoder handles this field.

Signed-off-by: Martin Storsjö <martin@martin.st>
2018-11-05 15:41:14 +02:00
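The approach described in 80f85a95da (keep the int64_t values in an array owned by the wrapper and hand the encoder a pointer into it) might look roughly like the following sketch. The `OpaqueRing` type, the `SLOTS` size and the helper names are hypothetical; the actual wrapper may size and index its array differently.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 4 /* hypothetical: must cover the encoder's maximum frame delay */

typedef struct OpaqueRing {
    int64_t values[SLOTS]; /* reordered_opaque values currently in flight */
    int     next;          /* next slot to hand out */
} OpaqueRing;

/* Store a value and return a pointer that fits a void* opaque field. */
static void *opaque_store(OpaqueRing *r, int64_t reordered_opaque)
{
    int64_t *slot;
    assert(r->next >= 0 && r->next < SLOTS);
    slot    = &r->values[r->next];
    *slot   = reordered_opaque;
    r->next = (r->next + 1) % SLOTS;
    return slot;
}

/* Recover the value from the pointer the encoder hands back with a frame. */
static int64_t opaque_load(const void *opaque)
{
    return *(const int64_t *)opaque;
}
```

Because each in-flight frame points at its own slot, values come back correctly even when the encoder returns frames in a different order than they were submitted, as long as no more than `SLOTS` frames are pending.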
James Almer
8d80046a0f libaom: remove references to yuva444p pixfmt
Support for it was apparently never in the codebase, and the enum
value was recently removed from the public headers [1]

[1] https://aomedia.googlesource.com/aom/+/f1570f0c2f70832dd170285f8de60bd2379c8efa

Signed-off-by: James Almer <jamrial@gmail.com>
2018-10-27 00:02:17 -03:00
James Almer
cacb62f9cb Revert "decode: copy the output parameters from the last bsf in the chain back to the AVCodecContext"
This reverts commit 662558f985.

The avcodec_parameters_to_context() call was freeing and reallocating
AVCodecContext->extradata, essentially taking ownership of it, which according
to the doxy is user owned. This is an API break and has produced crashes in
some library users like Firefox.
Revert until a better solution is found to internally propagate the filtered
extradata back into the decoder context.

Signed-off-by: James Almer <jamrial@gmail.com>
2018-10-27 00:02:13 -03:00
Zhong Li
1ff6cb2ca6 lavc/qsvenc_jpeg: set a default quality
Keep alignment with vaapi mjpeg encoder.

Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-13 15:57:06 +02:00
Zhong Li
4c5e77e0bf lavc/qsvenc_jpeg: add async_depth support
Currently qsv (m)jpeg encoding is broken.
This is a regression introduced by the commit (id: c1bcd3): fix async support,
which requires the minimum async_depth to be 1 instead of the previous zero.
But the default async_depth of qsv (m)jpeg encoding is still initialized
(mostly) as zero.

This patch also obviously improves qsv (m)jpeg encoding performance,
because the default async_depth is changed to 4.

Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-13 15:57:06 +02:00
James Almer
04e8b8b053 avcodec/libaomenc: export the Sequence Header OBU as extradata
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-11 11:37:53 +02:00
James Almer
97c9a50844 avcodec/libaomenc: remove AVOption related to frame partitions
Support for it was apparently never in the codebase, and the enum
value was recently removed from the public headers [1]

[1] https://aomedia.googlesource.com/aom/+/df4ffb73140fe31bebdabd17c1a7b53721e74838

Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-11 11:32:46 +02:00
James Almer
3365e91b1e avcodec/extract_extradata: don't uninitialize the H2645Packet on every processed packet
Based on hevc_parser code. This prevents repeated unnecessary allocations
and frees on every packet processed by the bsf.

Reviewed-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-06 21:31:54 +02:00
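The idea behind 3365e91b1e, keeping per-packet working storage alive in the filter context instead of tearing it down after every packet, can be sketched as follows. The `Scratch` type and its functions are hypothetical stand-ins, not the H2645Packet API; the `allocations` counter exists only to make the effect visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Scratch {
    unsigned char *data;
    size_t         capacity;
    int            allocations; /* counts real (re)allocations, for illustration */
} Scratch;

/* Copy a packet into the scratch buffer, growing it only when needed. */
static int scratch_process(Scratch *s, const unsigned char *pkt, size_t size)
{
    assert(s != NULL);
    if (size > s->capacity) {
        unsigned char *tmp = realloc(s->data, size);
        if (!tmp)
            return -1;
        s->data     = tmp;
        s->capacity = size;
        s->allocations++;
    }
    if (size)
        memcpy(s->data, pkt, size);
    return 0;
}

/* Release the buffer once, when the filter itself is closed. */
static void scratch_close(Scratch *s)
{
    free(s->data);
    s->data     = NULL;
    s->capacity = 0;
}
```

Processing three packets of 8, 4 and 8 bytes through one `Scratch` allocates once; freeing after each packet would allocate three times.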
James Almer
6442d8bad6 avcodec/extract_extradata: Move the reference in the bsf internal buffer
There is no need to allocate a new packet for it.

Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-06 21:31:54 +02:00
James Almer
70f11ee998 avcodec/extract_extradata: Do not allocate more space than needed when removing NALUs in h264/hevc
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-06 21:31:54 +02:00
James Almer
be65995d23 avcodec/extract_extradata: Zero-initialize the padding bytes in all allocated buffers
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-06 21:31:43 +02:00
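A minimal sketch of what zero-initializing the padding means in practice: the buffer is allocated larger than its payload, and the extra tail is cleared so that readers which deliberately overread never see uninitialized memory. `PADDING` and `alloc_padded_copy` are hypothetical names; the library uses its own padding constant and allocation helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PADDING 64 /* hypothetical stand-in for the library's padding constant */

/* Allocate size + PADDING bytes, copy the payload, and zero the padding. */
static uint8_t *alloc_padded_copy(const uint8_t *src, size_t size)
{
    uint8_t *buf;
    assert(src != NULL || size == 0);
    buf = malloc(size + PADDING);
    if (!buf)
        return NULL;
    if (size)
        memcpy(buf, src, size);
    memset(buf + size, 0, PADDING); /* the part that is easy to forget */
    return buf;
}
```

The caller owns the returned buffer and releases it with `free()`.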
Nikolas Bowe
aaf7fd1680 avcodec/extract_extradata_bsf: Fix leak discovered via fuzzing
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-03 21:03:31 +02:00
Jan Sebechlebsky
87d5686151 avcodec/bsf: Add ff_bsf_get_packet_ref() function
Use of this function can save an unnecessary malloc operation
in a bitstream filter.

Signed-off-by: Jan Sebechlebsky <sebechlebskyjan@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-10-03 21:03:30 +02:00
Maxym Dmytrychenko
a2041a6522 qsvenc: AV_PIX_FMT_QSV path should release frame
Fixes high memory usage and prevents over allocation of the frames via
proper unref.

Can be checked as:
-hwaccel qsv -c:v h264_qsv -i ../h264-conformance/CANL2_Sony_E.jsv -c:v
h264_qsv -b:v 2000k -y qsv.mp4
2018-09-18 17:53:37 +02:00
Martin Storsjö
2a9e1c122e libfdk-aac: Don't use defined() in a #define
MSVC expands the preprocessor directives differently, making the
version check fail in the previous form.

Clang can warn about this with -Wexpansion-to-defined (not currently
enabled by default):
warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]

Signed-off-by: Martin Storsjö <martin@martin.st>
2018-09-13 22:11:50 +03:00
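The problem fixed in 2a9e1c122e can be shown with a small, self-contained sketch. The `MYLIB_*` macros are hypothetical stand-ins for the version defines a library header would provide; the pattern, not the names, is the point.

```c
/* Hypothetical version macros, standing in for those of a library header. */
#define MYLIB_VL0 3
#define MYLIB_VL1 1

/* Fragile form (shown only as a comment): here "defined" would be produced
 * by macro expansion inside #if, which is undefined behavior and is handled
 * differently by different preprocessors:
 *
 *   #define MYLIB_AT_LEAST(a, b) (defined(MYLIB_VL0) && (MYLIB_VL0 > a || ...))
 */

/* Portable form: test for the macro's existence once, outside any #define. */
#ifdef MYLIB_VL0
#define MYLIB_AT_LEAST(a, b) \
    ((MYLIB_VL0 > (a)) || (MYLIB_VL0 == (a) && MYLIB_VL1 >= (b)))
#else
#define MYLIB_AT_LEAST(a, b) 0
#endif

/* The check can now be used in #if without relying on undefined behavior. */
#if MYLIB_AT_LEAST(3, 0)
#define HAVE_NEW_FEATURE 1
#else
#define HAVE_NEW_FEATURE 0
#endif

static int have_new_feature(void)
{
    return HAVE_NEW_FEATURE;
}
```

With the hypothetical version 3.1 above, `MYLIB_AT_LEAST(3, 0)` is true and `MYLIB_AT_LEAST(3, 2)` is false on every conforming preprocessor, because `defined` never appears in an expansion.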
Martin Storsjö
7e929dac10 libfdk-aacenc: Allow enabling the ELDv2 profile
This is a new feature in FDK v2.

Signed-off-by: Martin Storsjö <martin@martin.st>
2018-09-05 22:40:54 +03:00
Martin Storsjö
2edaafe5b9 libfdk-aacdec: Allow setting the new dynamic range control effect setting
This is a new setting in FDK v2.

Signed-off-by: Martin Storsjö <martin@martin.st>
2018-09-05 22:40:50 +03:00
Martin Storsjö
ffb9b7a6ba libfdk-aac: Consistently use a proper version check macro for detecting features
The previous version checks checked explicitly for the version
where the version define was added to the installed headers,
making an "#ifdef AACDECODER_LIB_VL0" enough. Now that we have
a need for more diverse version checks than this, convert all checks
to such checks.

Signed-off-by: Martin Storsjö <martin@martin.st>
2018-09-05 22:40:46 +03:00
Martin Storsjö
141c960e21 libfdk-aacenc: Fix building with libfdk-aac v2
When flushing the encoder, we now need to provide non-null buffer
parameters for everything, even if they are unused.

The encoderDelay parameter has been replaced by two, nDelay and
nDelayCore.

Signed-off-by: Martin Storsjö <martin@martin.st>
2018-09-03 10:50:51 +03:00
Zhong Li
c8bca9fe46 lavc/qsvenc: dump BufferSizeInKB message
Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Maxym Dmytrychenko <maxim.d33@gmail.com>
2018-09-02 20:02:02 +02:00
Zhong Li
e16b20782a lavc/qsvenc: allow to set qp range for h264 BRC
Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Maxym Dmytrychenko <maxim.d33@gmail.com>
2018-09-02 20:01:42 +02:00
Martin Storsjö
83678dbbae libopenh264dec: Export the decoded profile and level in AVCodecContext
Signed-off-by: Martin Storsjö <martin@martin.st>
2018-08-31 13:25:25 +03:00
Zhong Li
69caad8959 qsvdec: Release packet on decoding failure for mpeg2/vp8/vc1
This issue has already been fixed for H264/265 in commit
559370f2c4.
A similar fix is needed for the other codecs.

Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-08-23 08:22:46 +02:00
Zhong Li
76eef04f30 qsvenc: Fix a misleading log message
Signed-off-by: Zhong Li <zhong.li@intel.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2018-08-23 08:22:45 +02:00
James Almer
662558f985 decode: copy the output parameters from the last bsf in the chain back to the AVCodecContext
Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:33:44 -03:00
James Almer
ad99cbc9b3 decode: flush the internal bsfs instead of constantly reinitializing them
Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:33:43 -03:00
James Almer
0e27e27670 h264_redundant_pps_bsf: implement an AVBSFContext.flush() callback
Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:33:43 -03:00
James Almer
7f01c209f2 vp9_superframe_bsf: implement an AVBSFContext.flush() callback
Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:33:43 -03:00
James Almer
eb1d1c764c vp9_superframe_split_bsf: implement an AVBSFContext.flush() callback
Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:33:25 -03:00
James Almer
d6321851ba h264_mp4toannexb_bsf: implement an AVBSFContext.flush() callback
Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:06:21 -03:00
James Almer
e1e1e8dbb2 bsf: add a flushing mechanism to AVBSFContext
Meant to reset the internal bsf state without the need to reinitialize it.

Signed-off-by: James Almer <jamrial@gmail.com>
2018-08-17 14:06:21 -03:00
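The flushing mechanism introduced in e1e1e8dbb2 resets per-stream state while keeping the one-time setup. The sketch below illustrates that split with a hypothetical `ToyFilter`; none of these names come from the bsf API, and the real context and callback signatures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical filter context; field names are illustrative only. */
typedef struct ToyFilter {
    int  configured_option;             /* set once at init, survives a flush */
    int  packets_seen;                  /* per-stream state, reset by flush */
    int  have_pending;                  /* buffered-input flag, reset by flush */
    void (*flush)(struct ToyFilter *f); /* optional per-filter callback */
} ToyFilter;

/* Filter-specific callback: drop stream state, keep configuration. */
static void toy_flush(ToyFilter *f)
{
    f->packets_seen = 0;
    f->have_pending = 0;
}

static void toy_init(ToyFilter *f, int option)
{
    f->configured_option = option;
    f->packets_seen      = 0;
    f->have_pending      = 0;
    f->flush             = toy_flush;
}

static void toy_send_packet(ToyFilter *f)
{
    f->packets_seen++;
    f->have_pending = 1;
}

/* Generic entry point: reset state via the callback if the filter has one. */
static void filter_flush(ToyFilter *f)
{
    assert(f != NULL);
    if (f->flush)
        f->flush(f);
}
```

After a seek, a caller would invoke `filter_flush()` and continue sending packets, rather than closing the filter and running `toy_init()` again.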