Unfortunately, the benchmark function cannot return
a corrected parallel cost, so it must fail.
Note that some backends (like OpenSSL) also limit the maximum thread count,
so until now it was capped at 4 for luksFormat and 8 for benchmark.
This patch sets both to the PBKDF internal parallel limit.
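
The idea of a single shared cap can be sketched roughly as below. This is
only an illustration, not the code from this patch: the constant
EXAMPLE_PBKDF_MAX_PARALLEL, its value, and the helper
example_clamp_parallel() are hypothetical names chosen for the example.

```c
#include <stdint.h>

/* Hypothetical value standing in for the PBKDF internal parallel limit. */
#define EXAMPLE_PBKDF_MAX_PARALLEL 4u

/*
 * Clamp a requested thread count to the single shared limit.
 * A request of 0 is treated as 1, so the result is always in
 * the range 1..EXAMPLE_PBKDF_MAX_PARALLEL.
 */
static uint32_t example_clamp_parallel(uint32_t requested)
{
	if (requested == 0)
		return 1;
	if (requested > EXAMPLE_PBKDF_MAX_PARALLEL)
		return EXAMPLE_PBKDF_MAX_PARALLEL;
	return requested;
}
```

If both the format path and the benchmark path call the same helper, they
cannot end up with different caps for the same request.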