This is ~25-30% faster than the SSE2 variant on a Core 2 Quad. The main
reason is that, while far fewer shifts are incurred, an entirely separate
stack buffer has to be managed that is the size of the L1 cache on most
CPUs. That was one of the main reasons the specialized 32k function was
slower for the scalar counterpart despite auto-vectorizing: the
auto-vectorized loop was setting up the stack buffer at unaligned offsets,
which is detrimental to performance pre-Nehalem. Additionally, we were
losing a fair bit of time to zero initialization, which we now do more
selectively.
There are a ton of loads and stores happening, and for sure we are bound
on the fill buffer and store forwarding. An SSE2 version of this code is
probably possible by simply replacing the shifts with unpacks against zero
and the palignr's with shufpd's; I'm just not sure it would be all that
worth it, though. We gate on SSE4.1 not because we use a 4.1-specific
instruction, but because SSE4.1 marks the Wolfdale generation, in which
palignr became a lot faster.
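As a rough illustration of that substitution (a hedged sketch with
hypothetical helper names, not the actual code): a palignr at an 8-byte
offset can be expressed with shufpd, and a whole-qword shift can be
expressed as an unpack with zero, all of which is plain SSE2.

```
#include <emmintrin.h>   /* SSE2 */
#include <tmmintrin.h>   /* SSSE3: palignr */

/* {a.hi64, b.lo64} via SSSE3 palignr. */
static inline __m128i concat_hi_lo_ssse3(__m128i a, __m128i b) {
    return _mm_alignr_epi8(b, a, 8);
}

/* Same result with SSE2 shufpd; only works at 8-byte granularity. */
static inline __m128i concat_hi_lo_sse2(__m128i a, __m128i b) {
    return _mm_castpd_si128(
        _mm_shuffle_pd(_mm_castsi128_pd(a), _mm_castsi128_pd(b), 1));
}

/* Shift right by 8 bytes expressed as an unpack with zero. */
static inline __m128i shift_right_8_sse2(__m128i a) {
    return _mm_unpackhi_epi64(a, _mm_setzero_si128());
}
```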
Improve the speed of sub-16-byte matches by first using a 128-bit
intrinsic; after that, use only 512-bit intrinsics.
This requires us to overlap on the last run, but that is cheaper than
processing the tail with a 256-bit and then a 128-bit run.
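A minimal sketch of the overlap idea, assuming this describes a
fixed-length, compare256-style loop (names and offsets are illustrative,
not the actual implementation; compile with AVX512F/BW and BMI):

```
#include <immintrin.h>
#include <stdint.h>
#include <stddef.h>

/* Hedged sketch: one 128-bit compare for bytes 0..15, then 512-bit runs.
   Offsets 16, 80, 144 cover bytes 16..207; the last run starts at 192
   instead of 208 so it ends exactly at byte 256, overlapping the previous
   run by 16 bytes rather than needing 256-bit and 128-bit tail code. */
static uint32_t compare256_sketch(const uint8_t *src0, const uint8_t *src1) {
    __m128i a = _mm_loadu_si128((const __m128i *)src0);
    __m128i b = _mm_loadu_si128((const __m128i *)src1);
    uint32_t neq16 = (uint16_t)~_mm_movemask_epi8(_mm_cmpeq_epi8(a, b));
    if (neq16)
        return _tzcnt_u32(neq16);

    static const size_t offs[4] = {16, 80, 144, 192};
    for (int i = 0; i < 4; i++) {
        __m512i va = _mm512_loadu_si512((const void *)(src0 + offs[i]));
        __m512i vb = _mm512_loadu_si512((const void *)(src1 + offs[i]));
        uint64_t neq = _mm512_cmpneq_epi8_mask(va, vb);
        if (neq)  /* mismatches in the overlapped 16 bytes were caught earlier */
            return (uint32_t)(offs[i] + _tzcnt_u64(neq));
    }
    return 256;
}
```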
Change the benchmark steps so they hit the chunk boundaries of one
function or the other less often; this gives fairer benchmarks. When
running the benchmarks for correctness checking, run each benchmark for
only 1 iteration instead of thousands or hundreds of thousands.
Add a separate CI step to crashtest benchmarks without collecting any coverage data.
Activate benchmarks on more arches.
Disable some warnings to avoid errors when compiling Google Benchmark.
Remove separate benchmark CI job, now included in other jobs instead.
The version currently in the generic implementation for 32768-byte
buffers leverages the stack. It manages to autovectorize, but
unfortunately the trips to the stack hurt its performance on the CPUs
that need this the most. This version is explicitly SIMD vectorized and
makes no trips to the stack. In my testing it's ~10% faster than the
"small" variant, and about 42% faster than the "32768" variant.
While testing a SIMD vectorization for this, I wrote a gtest that
stumbled onto the fact that this had a bug on big-endian: the initial
CRC needed to be byte-swapped before being mixed in.
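A hedged sketch of the fix, using the GCC/Clang byte-swap builtin (the
macro test and helper name are illustrative):

```
#include <stdint.h>

/* On big-endian targets, byte-swap the incoming CRC before folding it
   into the little-endian vector state. */
static inline uint32_t crc_begin_fixup(uint32_t crc) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    crc = __builtin_bswap32(crc);
#endif
    return crc;
}
```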
A lot of alterations had to be made just to keep this from being worse,
and so far it's not really better, either. I had to force inlining for
the adler routine, remove the x4 load instruction because it stalled
pipelining, and use restrict pointers with a copy idiom so that GCC
inlines a copy routine for the tail.
Still, we see a small benefit in benchmarks, particularly when run with
buffers the size of our window or larger. There's also the added benefit
that this will fix #1824.
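The copy idiom mentioned above looks roughly like this (a sketch with
illustrative names): with both pointers qualified restrict, GCC can prove
the ranges don't overlap and will inline the loop, or lower it to a short
memcpy, for the tail.

```
#include <stddef.h>
#include <stdint.h>

static inline void copy_tail(uint8_t *restrict dst,
                             const uint8_t *restrict src, size_t len) {
    /* restrict promises no overlap, so GCC can vectorize/inline freely. */
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}
```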
Mark crc32_c and crc32_braid functions as internal, and remove prefix.
Reorder contents of generic_functions, and remove Z_INTERNAL hints from declarations.
Add test/benchmark output to indicate whether Chorba is used.
- Remove obsolete checks
- Fix checks that are inconsistent
- Stop compiling compare256/longest_match variants that never get called
- Improve how the generic compare256 functions are handled.
- Allow overriding OPTIMAL_CMP (see the sketch after this list)
This simplifies the code and avoids having a lot of code in the compiled library that can never get executed.
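The OPTIMAL_CMP override works along these lines (a hedged sketch; the
actual defaults and platform checks may differ): only supply a default
when the build hasn't already defined one.

```
/* Provide a default only if the build didn't define OPTIMAL_CMP itself;
   the platform checks here are illustrative. */
#ifndef OPTIMAL_CMP
#  if defined(__x86_64__) || defined(_M_X64) || defined(__aarch64__)
#    define OPTIMAL_CMP 64
#  else
#    define OPTIMAL_CMP 32
#  endif
#endif
```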
This fixes a rightful complaint from the alignment sanitizer that we
alias memory in an unaligned fashion. A nice added bonus is that this
improves performance a tiny bit on the larger buffers, perhaps due to
loops that idiomatically decrement a count and increment a single buffer
pointer rather than the maze of conditional pointer reassignments.
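The usual shape of such a fix, as a sketch (the helper name is
illustrative): route unaligned word loads through memcpy instead of
casting the byte pointer, which compilers lower to a single unaligned
load while keeping the sanitizer happy.

```
#include <stdint.h>
#include <string.h>

static inline uint64_t load_u64(const unsigned char *p) {
    uint64_t v;
    memcpy(&v, p, sizeof(v));  /* no unaligned uint64_t* dereference */
    return v;
}
```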
While here, let's write a unit test just for this. Since this is the only
variant that accesses memory in a potentially unaligned fashion without
explicitly going byte by byte or using intrinsics that tolerate
misalignment, we'll enable the test only for this function for now.
Adding more tests later, if need be, should be possible. For everything
else that isn't CRC, we're relying on UBSan to hopefully catch things by
chance.
Fixes the following error when building with the MSVC compiler:
```
test_compress_bound.cc
D:\zlib-ng\test\test_compress_bound.cc(41,50): error C2220: the following warning is treated as an error
D:\zlib-ng\test\test_compress_bound.cc(41,50): warning C4267: 'argument': conversion from 'size_t' to 'unsigned long', possible loss of data
D:\zlib-ng\test\test_compress_bound.cc(43,68): warning C4267: 'argument': conversion from 'size_t' to 'unsigned long', possible loss of data
```
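The fix amounts to an explicit cast where the size_t value is known to
fit (a sketch; the variable name and the zlib-style compressBound call
are illustrative):

```
#include <stddef.h>
#include <zlib.h>

static uLong bound_for(size_t src_len) {
    /* Explicit narrowing documents intent and silences C4267. */
    return compressBound((uLong)src_len);
}
```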
The recommended `FetchContent_MakeAvailable()` was introduced in CMake
3.14, which is newer than the version set by our
`cmake_minimum_required()`. CMake policies also affect subdirectories,
and the `cmake_minimum_required(VERSION)` command implicitly calls
`cmake_policy(VERSION)`.
Closes https://github.com/zlib-ng/zlib-ng/issues/1788
Inflate used to allocate its state during init, but the window would be
allocated when/if needed, and it could be resized, which required a new
free/alloc round.
- Now we allocate the state and a 32K window during init, so the latency cost
  of the allocations is paid during init instead of one or more times later.
- Total memory allocation is about the same when a 32K window is requested, but
  if no window or a smaller window was requested, this is an increase.
- While doing alloc(), we now store a pointer to the corresponding free()
  (sketched below), avoiding crashes with applications that incorrectly change
  the alloc/free pointers after running the init function.
- After init has succeeded, inflate can no longer fail due to a failing malloc.
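The stored-free idea can be sketched like this (types and layout are
illustrative, not zlib-ng's actual implementation):

```
typedef void *(*alloc_fn)(void *opaque, unsigned items, unsigned size);
typedef void  (*free_fn)(void *opaque, void *ptr);

typedef struct {
    free_fn zfree;   /* the free() matching the alloc() actually used */
    void   *opaque;
} alloc_hdr;

static void *tracked_alloc(void *opaque, alloc_fn zalloc, free_fn zfree,
                           unsigned size) {
    alloc_hdr *hdr = zalloc(opaque, 1, (unsigned)sizeof(alloc_hdr) + size);
    if (hdr == NULL)
        return NULL;
    hdr->zfree = zfree;   /* remembered at alloc time */
    hdr->opaque = opaque;
    return hdr + 1;
}

static void tracked_free(void *ptr) {
    alloc_hdr *hdr = (alloc_hdr *)ptr - 1;
    /* Uses the original free() even if the app swapped zfree after init. */
    hdr->zfree(hdr->opaque, hdr);
}
```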
Co-authored-by: Ilya Leoshkevich <iii@linux.ibm.com>
The Xeon Phi x200 family of processors (Knights Landing) supports
AVX512 (F, CD, ER, PF) but not AVX512 (VL, DQ, BW).
Because of processors like this, the Intel Software Developer's Manual
suggests that the AVX512 (DQ, BW, VL) bits also be tested in EBX,
together with AVX512F, before deciding to run AVX512 (DQ, BW, VL)
instructions.
This also adds a new x86 feature called avx512_common, indicating that
AVX512 (F, DQ, BW, VL) are all available, and starts using it for both
the adler32_avx512 and crc32_vpclmulqdq implementations, since both are
built with -mavx512dq -mavx512bw -mavx512vl.
This has been reported downstream as
https://bugzilla.redhat.com/show_bug.cgi?id=2280347.
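A sketch of the combined check, using GCC/Clang's cpuid.h (a complete
detector would also verify OSXSAVE/XGETBV state, omitted here):

```
#include <cpuid.h>

/* CPUID leaf 7, subleaf 0, EBX: bit 16 = AVX512F, 17 = AVX512DQ,
   30 = AVX512BW, 31 = AVX512VL. Test them together so Knights Landing
   (F/CD/ER/PF only) does not take the DQ/BW/VL code paths. */
static int has_avx512_common(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    const unsigned int mask = (1u << 16) | (1u << 17) | (1u << 30) | (1u << 31);
    return (ebx & mask) == mask;
}
```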
Using alloc_aligned from C++ requires the C++17 standard. The zutil_p.h
include was removed from test_crc32 since it was causing the same issue
and was not really needed.