h264: Don't store intra pcm samples in h->mb

Instead, keep them in the bitstream buffer until we read them verbatim.
This saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
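
For illustration only, the idea can be sketched roughly as below. This is
not the code from this commit: SketchContext, PCM_BYTES and the four helper
functions are hypothetical names, the layout assumes an 8-bit 4:2:0
macroblock (256 luma + 2 * 64 chroma bytes), and writing into the picture
is flattened to a single memcpy() instead of row-by-row stores with a
stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 8-bit 4:2:0, so 256 luma + 64 Cb + 64 Cr raw sample bytes. */
#define PCM_BYTES 384

/* Hypothetical stand-in for the decoder context; not the real H264Context. */
typedef struct SketchContext {
    uint8_t        mb[PCM_BYTES];  /* scratch buffer, kept zeroed between MBs */
    const uint8_t *intra_pcm_ptr;  /* points into the bitstream buffer */
} SketchContext;

/* Before: copy the samples into the scratch buffer at parse time... */
static void parse_pcm_copy(SketchContext *c, const uint8_t *bitstream)
{
    memcpy(c->mb, bitstream, PCM_BYTES);
}

/* ...then copy them out again and clear the scratch buffer afterwards. */
static void reconstruct_pcm_copy(SketchContext *c, uint8_t *dst)
{
    memcpy(dst, c->mb, PCM_BYTES);
    memset(c->mb, 0, PCM_BYTES);
}

/* After: only remember where the samples start in the bitstream. */
static void parse_pcm_ptr(SketchContext *c, const uint8_t *bitstream)
{
    c->intra_pcm_ptr = bitstream;
}

/* Read the samples straight from the bitstream; nothing to clear. */
static void reconstruct_pcm_ptr(const SketchContext *c, uint8_t *dst)
{
    assert(c->intra_pcm_ptr != NULL);
    memcpy(dst, c->intra_pcm_ptr, PCM_BYTES);
}
```

Both variants produce the same output samples. The pointer variant drops
one memcpy() and the memset(), but relies on the bitstream buffer staying
valid until the macroblock has been reconstructed.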

Signed-off-by: Martin Storsjö <martin@martin.st>
Ronald S. Bultje
2013-02-17 14:52:24 -08:00
committed by Martin Storsjö
parent 9918f57dcf
commit 7ebfb466ae
4 changed files with 21 additions and 23 deletions

@@ -416,6 +416,7 @@ typedef struct H264Context {
     GetBitContext *intra_gb_ptr;
     GetBitContext *inter_gb_ptr;
 
+    const uint8_t *intra_pcm_ptr;
     DECLARE_ALIGNED(16, int16_t, mb)[16 * 48 * 2]; ///< as a dct coeffecient is int32_t in high depth, we need to reserve twice the space.
     DECLARE_ALIGNED(16, int16_t, mb_luma_dc)[3][16 * 2];
     int16_t mb_padding[256 * 2];        ///< as mb is addressed by scantable[i] and scantable is uint8_t we can either check that i is not too large or ensure that there is some unused stuff after mb