The comment said that the OS field is set to 255, when in fact it has
been set to the current OS since zlib 1.2.3, or at least to our best
guess at the OS made at compile time.
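
For reference, the OS value is byte 9 of the gzip header (RFC 1952), where 255 means "unknown". A minimal sketch for checking what a given build writes, assuming Python's standard zlib module is linked against the library under test; the helper name gzip_os_byte is hypothetical:

```python
import zlib


def gzip_os_byte(payload: bytes = b"example") -> int:
    """Return the OS byte from the gzip header produced by the linked zlib."""
    # wbits=31 asks for a gzip wrapper instead of a zlib wrapper.
    comp = zlib.compressobj(wbits=31)
    stream = comp.compress(payload) + comp.flush()
    # Fixed gzip header layout: ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
    if stream[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return stream[9]


if __name__ == "__main__":
    # The value depends on how the library was compiled (3 means Unix,
    # 255 means unknown), so no particular output is assumed here.
    print("gzip header OS byte:", gzip_os_byte())
```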
Changes since 2.1.0-Beta1:
- Fix missing exported z_size_t type in zlib.h (zlib-compat mode).
- Fix two Coverity warnings
- Fix CMake GNUInstallDirs usage
- Configure/CMake improvements for compilers with early AVX512-VNNI support (GCC 8.0, etc.)
- Micro-optimization for AVX512 implementation of CRC32
- Optimized deflate_rle compression, also added related test and benchmark.
- Add testing of file_compress/file_uncompress in minigzip/minideflate
- Add emulated RISC-V to CI test workflow
- Add deflate_fast to switchlevels test
- Fix abicheck CI test not ignoring version string
- Fix MinGW CI test, broken by GitHub Actions VM image updates
This release contains two years of development and improvements to zlib-ng,
as well as fixes and changes inherited from zlib.
The 2.1.x version series has new targeted minimum buildsystem versions, as detailed on the Wiki https://github.com/zlib-ng/zlib-ng/wiki
Buildsystem:
- Many improvements to the CMake scripts.
- Improved support for detecting memory alignment functions.
- Improved support for unaligned access by letting the compiler promote code to unaligned if supported by the CPU.
- Remove x86 CPU feature detection for TZCNT, safely fall back to BSF.
- Enable using AVX512 intrinsics with GCC <9.
Optimizations and Enhancements:
- Decompression is a lot faster (56% faster measured on AVX2-capable x86-64)
- Compression is improved for Level 9, at the cost of a little performance.
- Compression is improved for Level 3, by switching from deflate_fast to deflate_medium.
- Levels 3 and 4 have been reconfigured to provide a better gradual tradeoff for speed/compression between levels 2 and 5.
- Deflate_quick (Level 1) has been improved to default to a bigger window size and support changing the window size like the other levels.
New instruction set optimizations:
- Adler32 implementation using AVX512, AVX512-VNNI, VMX.
- CRC32-B implementation using VPCLMULQDQ & IBM-Z.
- Slide hash implementation using VMX.
- Compare256 implementations using SSE2, Neon, & POWER9.
- Inflate chunk copying using SSSE3 & VSX.
Compatibility and Porting:
- CRC-32 computation changes from madler/zlib. zlib-ng/zlib-ng#a6155234
- Compatible and up-to-date with zlib 1.2.13.
- Removed the usage of macros in zlib-ng.h, making life easier for languages that want to call the C functions without having the C preprocessor (Python, etc).
Improved support for more environments:
- Apple M1
- vcpkg
- Emscripten
Testing:
- Tests have been converted to use GTest. Many new tests have also been added.
- Gbench support has been added to easily benchmark changes to performance-critical functions.
Misc:
- Several pieces of core code have been restructured or rewritten.
- Too many changes to list here; see the git commit log for the full list of changes.
Deprecations:
- Configure no longer has the full range of tests.
- NMake is no longer actively supported and tested; it is now community supported.
- See the wiki for minimum build system versions and deprecations https://github.com/zlib-ng/zlib-ng/wiki
Use the interleaved method of Kadatch and Jenkins in order to make
use of pipelined instructions through multiple ALUs in a single
core. This also speeds up and simplifies the combination of CRCs,
and updates the functions to pre-calculate and use an operator for
CRC combination.
Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>
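
To illustrate the "pre-calculate and use an operator" idea for CRC combination, here is a minimal Python sketch. It is a reimplementation for illustration only, not the zlib-ng code: the names multmodp, x2nmodp, combine_gen and combine_op are hypothetical (loosely modeled on zlib's crc32_combine_gen/crc32_combine_op), and the interleaved CRC computation itself is not shown. The operator for a block of len2 bytes is x^(8*len2) modulo the CRC-32 polynomial, so it can be computed once and reused for every block of that length.

```python
import zlib

POLY = 0xEDB88320  # CRC-32 polynomial, reflected bit order (bit 31 is x^0)


def multmodp(a: int, b: int) -> int:
    """Multiply polynomials a and b modulo POLY. Requires a != 0."""
    m = 1 << 31
    p = 0
    while True:
        if a & m:
            p ^= b
            if (a & (m - 1)) == 0:  # no lower terms of a remain
                break
        m >>= 1
        # Multiply b by x, reducing modulo POLY when a term overflows.
        b = (b >> 1) ^ POLY if b & 1 else b >> 1
    return p


# X2N[n] holds x^(2^n) mod POLY, built by repeated squaring from x^1.
X2N = [1 << 30]
for _ in range(31):
    X2N.append(multmodp(X2N[-1], X2N[-1]))


def x2nmodp(n: int, k: int) -> int:
    """Return x^(n * 2^k) mod POLY."""
    p = 1 << 31  # x^0 == 1
    while n:
        if n & 1:
            p = multmodp(X2N[k & 31], p)
        n >>= 1
        k += 1
    return p


def combine_gen(len2: int) -> int:
    """Pre-calculate the operator for appending a block of len2 bytes."""
    return x2nmodp(len2, 3)  # 2^3 bits per byte


def combine_op(crc1: int, crc2: int, op: int) -> int:
    """Combine crc1 (first block) and crc2 (second block) using op."""
    return multmodp(op, crc1) ^ (crc2 & 0xFFFFFFFF)


if __name__ == "__main__":
    first, second = b"hello, ", b"world"
    op = combine_gen(len(second))  # reusable for any block of this length
    combined = combine_op(zlib.crc32(first), zlib.crc32(second), op)
    print(combined == zlib.crc32(first + second))
```

Separating generation of the operator from its application means that combining many equal-sized blocks costs one polynomial multiplication per block after the operator has been built.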
This is useful when zlib-ng is embedded into another library,
such as ITK: https://itk.org/
Closes #1025.
Co-authored-by: Mika Lindqvist <postmaster@raasu.org>