dcadsp: add int8x8_fmul_int32 to dsp context

It is currently declared as a macro that is set to one of several inlinable
functions, among them a NEON and a default C implementation.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Christophe Gisquet
2012-05-11 11:17:36 +02:00
committed by Janne Grunau
parent e3fec3f095
commit 2bd44cb705
4 changed files with 16 additions and 7 deletions


@@ -24,6 +24,14 @@
 #include "libavutil/intreadwrite.h"
 #include "dcadsp.h"
 
+static void int8x8_fmul_int32_c(float *dst, const int8_t *src, int scale)
+{
+    float fscale = scale / 16.0;
+    int i;
+    for (i = 0; i < 8; i++)
+        dst[i] = src[i] * fscale;
+}
+
 static void dca_lfe_fir_c(float *out, const float *in, const float *coefs,
                           int decifactor, float scale)
 {
@@ -78,5 +86,6 @@ av_cold void ff_dcadsp_init(DCADSPContext *s)
 {
     s->lfe_fir = dca_lfe_fir_c;
     s->qmf_32_subbands = dca_qmf_32_subbands;
+    s->int8x8_fmul_int32 = int8x8_fmul_int32_c;
     if (ARCH_ARM) ff_dcadsp_init_arm(s);
 }
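
The pattern described in the commit message (an inline wrapper that takes a DSP parameter and, in the default C case, calls through the DSP context) can be sketched roughly as follows. This is only an illustration under assumptions: SketchDSPContext, sketch_dsp_init and sketch_int8x8_fmul_int32 are hypothetical names, not the identifiers used in the other changed files, and the real context holds more function pointers than the one shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for DCADSPContext, reduced to the one
 * function pointer this commit adds. */
typedef struct SketchDSPContext {
    void (*int8x8_fmul_int32)(float *dst, const int8_t *src, int scale);
} SketchDSPContext;

/* Same arithmetic as int8x8_fmul_int32_c in the diff above:
 * scale eight int8 samples by scale / 16 and store them as floats. */
static void sketch_fmul_c(float *dst, const int8_t *src, int scale)
{
    float fscale = scale / 16.0;
    int i;

    for (i = 0; i < 8; i++)
        dst[i] = src[i] * fscale;
}

/* Mirrors the assignment added to ff_dcadsp_init(): install the C
 * version as the default; an arch-specific init could overwrite it. */
static void sketch_dsp_init(SketchDSPContext *s)
{
    s->int8x8_fmul_int32 = sketch_fmul_c;
}

/* Default inline wrapper (hypothetical name). It receives the DSP
 * context and forwards to whatever the context points at. An
 * arch-specific inline version would take the same dsp argument
 * and simply ignore it. */
static inline void sketch_int8x8_fmul_int32(SketchDSPContext *dsp,
                                            float *dst,
                                            const int8_t *src, int scale)
{
    assert(dsp && dsp->int8x8_fmul_int32);
    dsp->int8x8_fmul_int32(dst, src, scale);
}
```

A caller would run sketch_dsp_init() once and then call sketch_int8x8_fmul_int32(&dsp, dst, src, scale) wherever the macro was used before. The extra indirection in the default path is the cost the cycle counts in the commit message refer to.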