Reduce development burden by removing the NMake files, which had to be
kept up to date manually. For continued NMake support, please generate
NMake project files using CMake.
This patch adds the ability to run the zlib-ng test suite against the
original zlib as follows:

    cmake -DZLIB_COMPAT=ON -DZLIBNG_ENABLE_TESTS=OFF .
    make
    LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu make test

The benefit of this is that modifications to the original zlib can be
tested with zlib-ng's more extensive test suite, and the assumptions
that the zlib-ng tests make can be validated against the original zlib.
In addition to a number of tests that exercise purely zlib-ng-specific
APIs, there are a few that expect zlib-ng-specific behavior from the
original zlib API:
- deflate() (obviously) emits different streams
- zlib-ng's deflatePrime() can take more than 16 bits
- zVersion() returns a different string
Adjust or disable the respective tests for ZLIBNG_ENABLE_TESTS=OFF.
This variant uses the psadbw instruction, which takes fewer cycles, in
place of pmaddubsw for the running sum that does not need multiplication.
This allows this sum to be done independently, partially overlapping the
running "sum2" half of the checksum. We have also moved the shift
outside of the loop, breaking a small data dependency chain. The code
now also does a vectorized horizontal sum without having to rebase to
the adler32 base: NMAX is defined as the maximum number of scalar
sums that can be performed before overflow, so this is safe without
upgrading to higher precision. A partial horizontal sum is enough
because psadbw only accumulates 16-bit words in 2 vector lanes;
the other two can safely be assumed to be 0.
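
The idea above can be illustrated with a minimal sketch. This is not the
code from this patch: the names sum16_sketch and adler32_sketch are
hypothetical, the weighted "sum2" part is left scalar for brevity, and the
SSE2 path is only compiled when the compiler defines __SSE2__. It shows
psadbw (_mm_sad_epu8) against zero producing two partial byte sums, the
partial horizontal sum over just those two lanes, and the modulo being
deferred for up to NMAX bytes.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define BASE 65521u /* largest prime below 2^16 */
#define NMAX 5552   /* bytes that can be summed before 32-bit overflow */

/* Sum 16 bytes. With SSE2, psadbw against zero leaves one small sum in
 * the low 16 bits of each 64-bit half; everything else is zero. */
static uint32_t sum16_sketch(const unsigned char *p) {
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    __m128i sad = _mm_sad_epu8(v, _mm_setzero_si128());
    /* Partial horizontal sum: only two lanes carry data. */
    return (uint32_t)_mm_cvtsi128_si32(sad) +
           (uint32_t)_mm_extract_epi16(sad, 4);
#else
    uint32_t s = 0;
    for (int i = 0; i < 16; i++)
        s += p[i];
    return s;
#endif
}

/* Adler-32 with the modulo deferred to once per NMAX-byte chunk. */
uint32_t adler32_sketch(uint32_t adler, const unsigned char *buf, size_t len) {
    uint32_t s1 = adler & 0xffff;
    uint32_t s2 = adler >> 16;

    while (len > 0) {
        size_t n = len < NMAX ? len : NMAX;
        len -= n;

        while (n >= 16) {
            /* Scalar stand-in for the multiply-based "sum2" half. */
            uint32_t weighted = 0;
            for (int i = 0; i < 16; i++)
                weighted += (uint32_t)(16 - i) * buf[i];
            s2 += 16 * s1 + weighted;
            /* The plain byte sum needs no multiplication. */
            s1 += sum16_sketch(buf);
            buf += 16;
            n -= 16;
        }
        while (n--) {
            s1 += *buf++;
            s2 += s1;
        }
        /* Safe to reduce only here because at most NMAX bytes were added. */
        s1 %= BASE;
        s2 %= BASE;
    }
    return (s2 << 16) | s1;
}
```

Calling adler32_sketch(1, data, len) starts a fresh checksum, following the
usual Adler-32 convention of an initial value of 1.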
The new adler32 checksum uses the VNNI instructions where available,
with appreciable gains. Where they are not available, a pure avx512f
variant is used instead, which still gives appreciable gains.
While DFLTCC takes care of accelerating compression on level 1, other
levels can be sped up too by computing CRC32 using various vector
instructions.
Take the Linux kernel assembly code that does this. Its original
author (Hendrik Brueckner) works for IBM at the time of writing and has
allowed reusing the code under the zlib license. Rewrite it in C for
better maintainability, but keep the original structure, variable names
and comments.
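
For orientation, the checksum being accelerated is the standard CRC-32
used by zlib (reflected polynomial 0xEDB88320). Below is a minimal
bitwise reference, not the vectorized code from this patch; the name
crc32_ref_sketch is hypothetical. A slow reference like this can be
useful for checking that a vectorized implementation produces the same
values.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, processing one bit per inner-loop step.
 * Pass crc = 0 to start, or a previous result to continue a stream. */
uint32_t crc32_ref_sketch(uint32_t crc, const unsigned char *buf, size_t len) {
    crc = ~crc; /* undo the final inversion of the previous call */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```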
Update the documentation.
Add CI configurations.
README.md changes:
- Added a related projects section at the end.
- Added blank line after header where missing.
- Added extra blank line before headers to make them easier to spot as plain-text.
- Changed line length for Contributing section, to make it more readable as plain-text.