79 Commits

Author SHA1 Message Date
Nathan Moinvaziri
dee0ff75f8 Remove NMake build projects
Reduce development burden by getting rid of NMake files that are manually
kept up to date. For continued NMake support please generate NMake project
files using CMake.
2025-04-14 23:18:18 +02:00
Cameron Cawley
721c488aff Rename most ACLE references to ARMv8 2025-02-12 13:54:30 +01:00
Hans Kristian Rosbach
509f6b5818 Since we long ago made unaligned reads safe (by using memcpy or intrinsics),
it is time to replace the UNALIGNED_OK checks that have since really only been
used to select the optimal comparison sizes for the arch instead.
2024-12-21 00:46:48 +01:00
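The memcpy approach mentioned in this commit message can be sketched roughly as follows. This is a minimal illustration, not code from the repository: the helper names `load_u32` and `equal4` are hypothetical, and the 4-byte comparison size is only an assumed example of a "comparison size".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (name is illustrative only): read 32 bits from an
 * address that may not be 4-byte aligned. Copying through memcpy is
 * well-defined for any alignment, and compilers typically lower it to a
 * single load on architectures that allow unaligned access. */
static inline uint32_t load_u32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical helper: compare the first four bytes of two buffers in one
 * step instead of byte by byte. Both pointers must have at least four
 * readable bytes. Returns 1 when the bytes are equal, 0 otherwise. */
static inline int equal4(const unsigned char *a, const unsigned char *b) {
    return load_u32(a) == load_u32(b);
}
```

With reads made safe this way, the remaining choice is only how many bytes to compare per step on a given architecture.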
Hans Kristian Rosbach
037ab0fd35 Revert "Since we long ago made unaligned reads safe (by using memcpy or intrinsics),"
This reverts commit 80fffd72f3.
It was mistakenly pushed to develop instead of going through a PR and the appropriate reviews.
2024-12-17 23:09:31 +01:00
Hans Kristian Rosbach
80fffd72f3 Since we long ago made unaligned reads safe (by using memcpy or intrinsics),
it is time to replace the UNALIGNED_OK checks that have since really only been
used to select the optimal comparison sizes for the arch instead.
2024-12-17 23:02:32 +01:00
Vladislav Shchapov
c694bcdaf6 Add option to disable runtime CPU detection
Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
2024-03-06 23:32:15 +01:00
Hans Kristian Rosbach
9953f12e21 Move update_hash(), insert_string() and quick_insert_string() out of functable
and remove SSE4.2 and ACLE optimizations. The functable overhead is higher
than the benefit from using optimized functions.
2024-02-23 13:34:10 +01:00
Mika Lindqvist
4abe8881d7 [README] configure no longer supports --native 2023-10-31 12:31:10 +01:00
Hans Kristian Rosbach
af303eab2e Fix status badges 2023-10-25 21:59:50 +02:00
Nathan Moinvaziri
57a2ed9e50 Added instructions for cpack to readme. 2023-09-19 17:32:25 +02:00
Cameron Cawley
16fe1f885e Add ARMv6 version of slide_hash 2023-09-16 11:11:18 +02:00
alexsifivetw
de1b640ffb Optimize compare256 with rvv 2023-06-13 12:25:48 +02:00
Hans Kristian Rosbach
3d713fc48c Update README.md 2023-05-13 20:15:16 +02:00
Alex Chiang
c3cdf434f3 Add supporting RISC-V cross compilation workflows
Add RISC-V cross-compilation test
Enable RVV support at compile time
2023-05-12 16:57:32 +02:00
Cameron Cawley
b09215f75a Enable use of _mm_shuffle_epi8 on machines without SSE4.1 2023-04-01 17:27:49 +02:00
Hans Kristian Rosbach
17d98072a9 Remove FORCE_TZCNT/X86_NOCHECK_TZCNT 2023-02-07 16:25:46 +01:00
Mika T. Lindqvist
d5db5aa985 Sync with zlib 1.2.13 and declare compatibility. 2023-02-03 15:49:02 +01:00
Cameron Cawley
43fd141840 Allow gtest_zlib to be manually disabled 2023-01-09 15:10:53 +01:00
Ilya Leoshkevich
e63f36b1cf Introduce ZLIBNG_ENABLE_TESTS
This patch adds the ability to run zlib-ng test suite against the
original zlib as follows:

    cmake -DZLIB_COMPAT=ON -DZLIBNG_ENABLE_TESTS=OFF .
    make
    LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu make test

The benefit of this is that modifications to the original zlib can be
tested with zlib-ng's more extensive test suite, and the assumptions
that the zlib-ng tests make can be validated against the original zlib.

In addition to a number of tests that exercise purely zlib-ng specific
API, there are a few that expect zlib-ng specific behavior from the
original zlib API:

- deflate() (obviously) emits different streams
- zlib-ng's deflatePrime() can take more than 16 bits
- zVersion() returns a different string

Adjust or disable the respective tests for ZLIBNG_ENABLE_TESTS=OFF.
2022-11-01 13:25:19 +01:00
Cameron Cawley
3934621295 Fix typo in README.md 2022-10-23 14:54:52 +02:00
FrankXie
ad444f4715 format Vcpkg 2022-09-05 11:27:25 +02:00
FrankXie
8b7a69b641 format and add vcpkg headings. 2022-09-05 11:27:25 +02:00
FrankXie
03c989a105 Add vcpkg installation instructions 2022-09-05 11:27:25 +02:00
Nathan Moinvaziri
8df6650059 Remove ZLIB_DUAL_LINK option to simplify dual link tests. 2022-09-05 11:25:25 +02:00
Nathan Moinvaziri
d43822b9a7 zlib 1.2.12 2022-06-13 15:58:03 +02:00
Matheus Castanho
02d10b252c Implement power9 version of compare256.
Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>
2022-05-07 14:06:42 +02:00
Nathan Moinvaziri
48f346e806 Implement neon version of compare256.
Co-authored-by: Adam Stylinski <kungfujesus06@gmail.com>
2022-05-06 12:19:35 +02:00
Nathan Moinvaziri
81d91d6f67 Remove sanitizer support from configure since it is better supported in cmake. Anybody who still needs it can use cmake or manually set CFLAGS and LDFLAGS. 2022-04-05 13:43:48 +02:00
Shlomi Fish
f98729de07 Grammar fixes 2022-04-03 16:29:10 +02:00
Nathan Moinvaziri
4284386103 Remove support for building fuzzers from configure. 2022-04-03 16:28:58 +02:00
Nathan Moinvaziri
c78c835a49 Make disabling unaligned access configurable via build scripts. 2022-03-17 11:03:26 +01:00
Nathan Moinvaziri
e38c493337 Move UNALIGNED_OK detection to compile time instead of configure time. 2022-03-17 11:03:26 +01:00
Mika Lindqvist
15e5df63a5 [README] Add missing FORCE_SSE2 for CMake. 2022-03-16 11:43:09 +01:00
Mika Lindqvist
db3feb4cf2 Allow bypassing runtime feature check of TZCNT instructions.
* This avoids conditional branch when it's known at build time that TZCNT instructions are always supported
2022-03-16 11:43:09 +01:00
Adam Stylinski
b3260fd0c8 Axe the SSE4 compare256 functions 2022-02-11 09:56:19 +01:00
Nathan Moinvaziri
30c89988e2 Fixed incorrect version of AVX specified for inflate chunk copying in feature list. 2022-01-17 09:13:18 +01:00
Nathan Moinvaziri
e820a76cc9 Merge feature list entries for crc32. 2022-01-17 09:13:18 +01:00
Nathan Moinvaziri
6f179fd301 Added adler32, compare256, crc32, and slide_hash benchmarks using Google Benchmark.
Co-authored-by: Adam Stylinski <kungfujesus06@gmail.com>
2022-01-17 09:10:02 +01:00
Nathan Moinvaziri
66506ace8d Convert compare258 to compare256 and move 2 byte check into deflate_quick. Prevents having multiple compare258 functions with 2 byte checks. 2022-01-16 17:30:15 +01:00
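The split described in this commit message might look roughly like the scalar sketch below. The function names `compare256_sketch` and `match_length_sketch` are hypothetical, and the way the caller combines the 2-byte check with the 256-byte comparison is an assumption made for illustration, not the actual deflate_quick code.

```c
#include <stdint.h>

/* Hypothetical scalar sketch: count how many leading bytes of src0 and
 * src1 are equal, stopping at 256. Both buffers must have at least 256
 * readable bytes. */
static uint32_t compare256_sketch(const uint8_t *src0, const uint8_t *src1) {
    uint32_t len = 0;
    while (len < 256 && src0[len] == src1[len])
        len++;
    return len;
}

/* Hypothetical caller: the 2-byte check lives here, once, so the comparison
 * routine itself only ever handles a 256-byte window. Returns 0 when the
 * first two bytes differ, otherwise a match length between 2 and 258.
 * Both buffers must have at least 258 readable bytes. */
static uint32_t match_length_sketch(const uint8_t *a, const uint8_t *b) {
    if (a[0] != b[0] || a[1] != b[1])
        return 0;
    return 2 + compare256_sketch(a + 2, b + 2);
}
```

Keeping the 2-byte check in the caller means each optimized comparison variant only has to implement the 256-byte loop.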
Nathan Moinvaziri
f20f9b610c VPCLMULQDQ implementation for Intel's CRC32 folding.
Based on PR https://github.com/jtkukunas/zlib/pull/28.

Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>
2022-01-09 21:27:42 +01:00
Adam Stylinski
49d1704a18 Added an SSE4 optimized adler32 checksum
This variant uses the psadbw instruction, which takes fewer cycles, in place
of pmaddubsw for the running sum that does not need multiplication.

This allows this sum to be done independently, partially overlapping the
running "sum2" half of the checksum. We have also moved the shift
outside of the loop, breaking a small data dependency chain. The code
also now does a vectorized horizontal sum without having to rebase to
the adler32 base, as NMAX is defined as the maximum number of scalar
sums that can be performed, so we're actually safe in doing this without
upgrading to higher precision. We can do a partial horizontal sum
because psadbw only ends up accumulating 16 bit words in 2 vector lanes;
the other two can safely be assumed to be 0.
2022-01-08 19:27:28 +01:00
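The role NMAX plays in this message is easier to see in a plain scalar version of the checksum. The sketch below is not the SSE code from this commit; it is a reference-style scalar loop, with the hypothetical name `adler32_sketch`, that uses the BASE (65521) and NMAX (5552) constants from zlib's Adler-32 to defer the modulo reduction.

```c
#include <stddef.h>
#include <stdint.h>

#define BASE 65521U /* largest prime smaller than 65536 */
#define NMAX 5552   /* most bytes that can be summed before sum2 could
                       overflow 32 bits without a modulo reduction */

/* Hypothetical scalar sketch of Adler-32. Pass 1 as the initial value. */
static uint32_t adler32_sketch(uint32_t adler, const uint8_t *buf, size_t len) {
    uint32_t sum1 = adler & 0xffff;         /* running sum of bytes */
    uint32_t sum2 = (adler >> 16) & 0xffff; /* running sum of sum1 values */

    while (len > 0) {
        /* Process at most NMAX bytes, then reduce both sums once. */
        size_t n = len < NMAX ? len : NMAX;
        len -= n;
        while (n--) {
            sum1 += *buf++;
            sum2 += sum1;
        }
        sum1 %= BASE;
        sum2 %= BASE;
    }
    return (sum2 << 16) | sum1;
}
```

Because no reduction is needed inside an NMAX-sized block, a vectorized variant has the same freedom to accumulate partial sums across a block and reduce afterwards.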
Adam Stylinski
46031f5cde Have functioning avx512{,_vnni} adler32
The new adler32 checksum uses the VNNI instructions with appreciable
gains when possible. Otherwise, a pure avx512f variant exists which
still gives appreciable gains.
2022-01-08 13:55:35 +01:00
Dženan Zukić
a7b773697b Fix minor formatting issues
From ITK PR: https://github.com/InsightSoftwareConsortium/ITK/pull/2803
CI check: https://github.com/InsightSoftwareConsortium/ITK/runs/3864083025

commit 5434d42 adds bad whitespace:
README.md:223: new blank line at EOF.

commit 5434d42 is not allowed; missing newline at the end of file in .gitattributes.
2021-10-13 15:43:53 +02:00
Mika Lindqvist
8d6816604a Add AltiVec (VMX) to supported intrinsics for adler32 and slide_hash. 2021-08-11 12:02:55 +02:00
Ilya Leoshkevich
0573840dd0 IBM Z: Add vectorized CRC32 implementation
While DFLTCC takes care of accelerating compression on level 1, other
levels can be sped up too by computing CRC32 using various vector
instructions.

Take the Linux kernel assembly code that does that - its original
author (Hendrik Brueckner) works for IBM at the time of writing and has
allowed reusing the code under the zlib license. Rewrite it in C for
better maintainability, but keep the original structure, variable names
and comments.

Update the documentation.

Add CI configurations.
2021-07-07 19:54:01 +02:00
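For context on what the vectorized code computes, a bit-at-a-time CRC32 can be sketched as below. This is not the IBM Z implementation from this commit, only a slow scalar reference under the hypothetical name `crc32_sketch`, using the reflected polynomial 0xEDB88320 that zlib-compatible CRC32 is based on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitwise sketch of the zlib-style CRC32. Pass 0 as the
 * initial value; the result of one call can be passed into the next to
 * checksum data in pieces. */
static uint32_t crc32_sketch(uint32_t crc, const uint8_t *buf, size_t len) {
    crc = ~crc; /* pre-inversion */
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++) {
            /* Shift out one bit; XOR in the polynomial if that bit was set. */
            uint32_t mask = 0U - (crc & 1U);
            crc = (crc >> 1) ^ (0xEDB88320U & mask);
        }
    }
    return ~crc; /* post-inversion */
}
```

A vector implementation replaces this per-bit loop with wide folding steps, but it has to produce the same values, which makes a reference like this useful as a test oracle.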
Mika Lindqvist
564d473c6d [Power8] Add chunk*_power8. 2021-06-25 20:38:14 +02:00
Hans Kristian Rosbach
40a12fe994 Change requested compiler standard to C11 2021-06-18 09:24:35 +02:00
Hans Kristian Rosbach
aeffa9bbf4 Remove misleading manpage.
README.md changes:
- Added a related projects section at the end.
- Added blank line after header where missing.
- Added extra blank line before header to make them easier to spot as plain-text.
- Changed line-length for Contributing section, to make it more readable as plain-text.
2021-03-09 16:44:07 +01:00
Hans Kristian Rosbach
f392871698 Fix incorrect --force-sse2 info in README.md
Describe DFLTCC options more similarly to the others.
2021-02-25 13:05:04 +01:00
Nathan Moinvaziri
ee295855b9 Separate sanitizers so they can be run independently. 2020-12-12 10:17:44 +01:00