Use the interleaved method of Kadatch and Jenkins in order to make
use of pipelined instructions through multiple ALUs in a single
core. This also speeds up and simplifies the combination of CRCs,
and updates the functions to pre-calculate and use an operator for
CRC combination.
Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>
Performance benchmarks have so far not shown any platform benefiting from UNROLL_MORE,
although it might help with older compilers or CPUs, or when compiling without optimizations.
The extra UNROLL_MORE code should be considered for removal, since we never enable it
and it will likely only confuse readers and contribute to bitrot.
When the same len2 is used repeatedly, it is faster to call
crc32_combine_gen() once to generate an operator, and then use that
operator to combine CRCs with crc32_combine_op().
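
The idea of generating an operator once and reusing it can be sketched as
follows. This is a minimal, self-contained illustration and not the library
code: sketch_combine_gen() and sketch_combine_op() are hypothetical stand-ins
for crc32_combine_gen() and crc32_combine_op(), and the bitwise CRC is there
only to check the result. It assumes the standard reflected CRC-32 polynomial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_POLY 0xedb88320u /* reflected CRC-32 polynomial (assumed) */

/* Plain bitwise CRC-32, used here only to verify the combination. */
static uint32_t sketch_crc32(uint32_t crc, const unsigned char *buf, size_t len) {
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc & 1) ? (crc >> 1) ^ SKETCH_POLY : crc >> 1;
    }
    return ~crc;
}

/* Multiply a(x) by b(x) modulo the CRC polynomial; a must be non-zero. */
static uint32_t sketch_multmodp(uint32_t a, uint32_t b) {
    uint32_t m = (uint32_t)1 << 31, p = 0;
    assert(a != 0);
    for (;;) {
        if (a & m) {
            p ^= b;
            if ((a & (m - 1)) == 0)
                break;
        }
        m >>= 1;
        b = (b & 1) ? (b >> 1) ^ SKETCH_POLY : b >> 1;
    }
    return p;
}

/* Hypothetical stand-in for crc32_combine_gen(): returns x^(8*len2) mod p. */
static uint32_t sketch_combine_gen(uint64_t len2) {
    uint32_t sq = (uint32_t)1 << 30; /* x^1 */
    uint32_t p = (uint32_t)1 << 31;  /* x^0 == 1 */
    /* Square three times to reach x^8, the factor for one byte. */
    for (int k = 0; k < 3; k++)
        sq = sketch_multmodp(sq, sq);
    /* Square-and-multiply over the bits of len2. */
    while (len2) {
        if (len2 & 1)
            p = sketch_multmodp(sq, p);
        sq = sketch_multmodp(sq, sq);
        len2 >>= 1;
    }
    return p;
}

/* Hypothetical stand-in for crc32_combine_op(): one multiply and one XOR. */
static uint32_t sketch_combine_op(uint32_t crc1, uint32_t crc2, uint32_t op) {
    return sketch_multmodp(op, crc1) ^ crc2;
}

/* Returns 1 if combining the CRCs of "hello, " and "world" matches the
 * CRC of "hello, world" computed in one pass. */
static int sketch_combine_demo(void) {
    const unsigned char *a = (const unsigned char *)"hello, ";
    const unsigned char *b = (const unsigned char *)"world";
    const unsigned char *ab = (const unsigned char *)"hello, world";
    uint32_t op = sketch_combine_gen(5); /* generated once for len2 == 5 */
    uint32_t combined = sketch_combine_op(sketch_crc32(0, a, 7),
                                          sketch_crc32(0, b, 5), op);
    return combined == sketch_crc32(0, ab, 12);
}
```

The operator depends only on len2, so the cost of generating it is paid once
and every later combination with the same len2 is a single modular multiply.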
to co-exist in an application that has been linked to something that
depends on stock zlib. Previously, that would cause unpredictable
problems, since there is no way to guarantee which zlib version is
used for each dynamically linked function.
Add the corresponding zlib-ng.h.
Tests, example and minigzip will not compile until they have been
adapted to use the correct functions as well.
Either duplicate them, so that we have minigzip-ng.c for example, or add
compile-time detection in the source code.
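
One possible shape for the compile-time detection option is a name-prefixing
macro. The sketch below is illustrative only: SKETCH_COMPAT, SKETCH_PREFIX and
the two version functions are hypothetical names standing in for whatever the
real headers export, so that the example compiles without either library.

```c
/* Hypothetical stand-ins for the same entry point in the two libraries. */
const char *sketch_version(void)     { return "stock"; }
const char *zng_sketch_version(void) { return "ng"; }

/* Select the symbol name at compile time (SKETCH_COMPAT is an assumed flag). */
#ifdef SKETCH_COMPAT
#  define SKETCH_PREFIX(name) name
#else
#  define SKETCH_PREFIX(name) zng_ ## name
#endif

/* Shared source such as a test or minigzip would call through the macro. */
const char *sketch_linked_version(void) {
    return SKETCH_PREFIX(sketch_version)();
}
```

With this approach a single source file serves both builds, at the cost of
wrapping every library call in the macro.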
See the comment for more details. This is in response to an issue
raised as a result of a security audit of the zlib code by Trail
of Bits and TrustInSoft, in support of the Mozilla Foundation.
There was a small optimization for PowerPCs to pre-increment a
pointer when accessing a word, instead of post-incrementing. This
required prefacing the loop with a decrement of the pointer,
possibly pointing before the object passed. This is not compliant
with the C standard, for which decrementing a pointer before its
allocated memory is undefined. When tested on a modern PowerPC
with a modern compiler, the optimization no longer has any effect.
Due to all that, and per the recommendation of a security audit of
the zlib code by Trail of Bits and TrustInSoft, in support of the
Mozilla Foundation, this "optimization" was removed, in order to
avoid the possibility of undefined behavior.
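
The difference between the two access patterns can be sketched as follows.
This is an illustration, not the code that was removed: sketch_xor_words() is
a hypothetical helper, and the non-compliant variant appears only in a comment
so that nothing undefined is executed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Non-compliant pattern (comment only, never executed):
 *
 *     buf4--;               // may point before the object: undefined
 *     while (n--)
 *         c ^= *++buf4;     // pre-increment access
 *
 * Compliant pattern: post-increment, so the pointer never leaves the
 * range [start, one past the end] of the object.
 */
static uint32_t sketch_xor_words(const uint32_t *buf4, size_t n) {
    uint32_t c = 0;
    while (n--)
        c ^= *buf4++;
    return c;
}

/* XOR of the words 1, 2 and 4, which is 7. */
static uint32_t sketch_xor_demo(void) {
    static const uint32_t words[3] = { 1u, 2u, 4u };
    return sketch_xor_words(words, 3);
}
```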
Solaris has neither sys/endian.h nor endian.h. It provides
sys/byteorder.h instead, which does not define BYTE_ORDER but defines
either _LITTLE_ENDIAN or _BIG_ENDIAN.
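
A detection sequence along these lines could look like the sketch below. The
SKETCH_LITTLE_ENDIAN macro and the runtime probe are hypothetical names, the
__sun test is an assumed way of recognizing Solaris, and the fallback relies
on the __BYTE_ORDER__ macro predefined by GCC and Clang. None of these
choices come from the change itself.

```c
#include <stdint.h>

#if defined(__sun) || defined(sun)
#  include <sys/byteorder.h> /* defines _LITTLE_ENDIAN or _BIG_ENDIAN */
#  if defined(_LITTLE_ENDIAN)
#    define SKETCH_LITTLE_ENDIAN 1
#  elif defined(_BIG_ENDIAN)
#    define SKETCH_LITTLE_ENDIAN 0
#  endif
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
   /* Assumed fallback: compiler-predefined macros (GCC, Clang). */
#  define SKETCH_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#endif

#ifndef SKETCH_LITTLE_ENDIAN
#  error "byte order not detected in this sketch"
#endif

/* Runtime probe, used only to cross-check the compile-time result. */
static int sketch_runtime_is_little_endian(void) {
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```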