zlib-ng

mirror of https://github.com/GerbilSoft/zlib-ng.git synced 2025-06-18 19:45:37 -04:00

Author	SHA1	Message	Date
yintong	10b51fa592	riscv: add crc32 optimization using zbc extension	2025-04-27 18:23:50 +02:00
Adam Stylinski	46fc33f39d	SSE4.1 optimized chorba This is ~25-30% faster than the SSE2 variant on a core2 quad. The main reason for this has to do with the fact that, while incurring far fewer shifts, an entirely separate stack buffer has to be managed that is the size of the L1 cache on most CPUs. This was one of the main reasons the 32k specialized function was slower for the scalar counterpart, despite auto vectorizing. The auto vectorized loop was setting up the stack buffer at unaligned offsets, which is detrimental to performance pre-nehalem. Additionally, we were losing a fair bit of time to the zero initialization, which we are now doing more selectively. There are a ton of loads and stores happening, and for sure we are bound on the fill buffer + store forwarding. An SSE2 version of this code is probably possible by simply replacing the shifts with unpacks with zero and the palignr's with shufpd's. I'm just not sure it'll be all that worth it, though. We are gating against SSE4.1 not because we are using specifically a 4.1 instruction but because that marks when Wolfdale came out and palignr became a lot faster.	2025-04-15 14:11:12 +02:00
Cameron Cawley	231c4b3a64	Use -Wa,-march with older ARM toolchains	2025-02-12 13:54:30 +01:00
Cameron Cawley	7ea78f12c8	Provide an inline asm fallback for the ARMv8 intrinsics	2025-02-12 13:54:30 +01:00
Cameron Cawley	721c488aff	Rename most ACLE references to ARMv8	2025-02-12 13:54:30 +01:00
Adam Stylinski	7020cb3f74	Enable AVX2 functions to be built with BMI2 instructions While these are technically different instructions, no such CPU exists that has AVX2 that doesn't have BMI2. Enabling BMI2 allows us to eliminate several flag stalls by having flagless versions of shifts, and allows us to not clobber and move around GPRs so much in scalar code. There's usually a sizeable benefit for enabling it. Since we're building with BMI2 for AVX2 functions, let's also just make sure the CPU claims to support it (just to cover our bases).	2024-12-07 22:32:29 +01:00
Adam Stylinski	0ed5ac8289	Make an AVX512 inflate fast with low cost masked writes This takes advantage of the fact that on AVX512 architectures, masked moves are incredibly cheap. There are many places where we have to fallback to the safe C implementation of chunkcopy_safe because of the assumed overwriting that occurs. We're to sidestep most of the branching needed here by simply controlling the bounds of our writes with a mask.	2024-11-20 22:14:44 +01:00
Alexander Smorkalov	4549279dbf	Fixed false positive HAVE_ARMV6_INTRIN value on old ARM platforms.	2024-09-11 12:40:39 +02:00
Ilya Leoshkevich	f858914696	IBM zSystems: Hardcode HWCAP_S390_VXRS Compiling zlib-ng with glibc 2.17 (minimum version still supported by crosstool-ng) fails due to the lack of HWCAP_S390_VX - it was introduced in glibc 2.23. Strictly speaking, this is a problem with the feature detection logic in cmake. However, it's not worth disabling the s390x vectorized CRC32 if the hwcap constant is missing and the compiler intrinsics are available. So fix by hardcoding the constant. It's a part of the kernel ABI, which does not change.	2024-08-16 11:52:11 +02:00
Un1q32	c5b4b35106	Improved ACLE check (#1727) Co-authored-by: Cameron Cawley <ccawley2011@gmail.com>	2024-06-13 13:23:29 +02:00
Mika Lindqvist	93b870fbef	Add test for checking if -march=native needs -mfpu=neon for 32-bit ARM.	2024-02-24 14:40:52 +01:00
Mika Lindqvist	ca0e4634e1	Fix PCLMULQDQ support for IntelLLVM.	2024-02-21 11:52:25 +01:00
Mika T. Lindqvist	9d945f0d71	Fix xsave intrinsic test for clang, and gcc 8.2 or later, and icc.	2024-02-18 10:10:45 +01:00
Vladislav Shchapov	00e06ab5e1	Allow overwrite NATIVEFLAG value by option NATIVE_ARCH_OVERRIDE. Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-02-18 10:08:45 +01:00
Mika Lindqvist	598128f5d1	Fix regression caused by `2fa631e029` * POWER8/9 feature checks were enabled even if the toolchain didn't support AT_HWCAP2 * Add detection if we need to include <linux/auxvec.h>	2024-01-30 20:49:32 +01:00
Vladislav Shchapov	1aa53f40fc	Improve x86 intrinsics dependencies. Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-01-25 10:21:49 +01:00
Vladislav Shchapov	44e6bfcc5b	Remove unused macro X86_MASK_INTRIN. Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-01-25 10:21:49 +01:00
Mika Lindqvist	b7fc54ef87	Make sure uqsub16 mnemonic doesn't get optimized away.	2023-12-25 20:44:58 +01:00
Hans Kristian Rosbach	0b080ede77	Always run CMake tests without LTO.	2023-12-24 16:01:42 +01:00
Yoshiki Matsuda	1003ae6b6a	Fix clang-cl warnings	2023-11-28 10:25:13 +01:00
Hajin Jang	f9228d8475	Support llvm-mingw toolchain zlib-ng requires some patches to make it compilable on LLVM-mingw. 1. Add -Wno-pedantic-ms-format only if a toolchain is MinGW GCC. - llvm-mingw does not support it, causing build to break. 2. Include arm_neon.h instead of arm64_neon.h (aarch64 only). - arm64_neon.h is MSVC only. - GCC, Clang does not have arm64_neon.h but arm_neon.h on aarch64. - Also applied to configure and detect-instrinsics.cmake	2023-09-28 00:15:12 +02:00
Nathan Moinvaziri	31497b545c	Don't run test intrinsic code with native flag in CMake. Native flag should already determine what code will run on the architecture. This appears to have just been an extra run check with limited benefits. Any compiler that compiles code not available on the native platform is buggy and not our problem.	2023-09-19 17:32:07 +02:00
Deniz Bahadir	3eb7cd2d8a	Match CMAKE_GENERATOR_TOOLSET variable case-insensitive The Visual Studio CMake generator allows to select different toolsets. One of these toolsets is Clang-Cl. However, the generator does accept the toolset name case-agnostic, so it could be "ClangCl", but also "Clangcl" or "clangcl" or ... This value will be stored verbatim in variable CMAKE_GENERATOR_TOOLSET by CMake. Therefore, this variable must be matched case-insensitive, which is what this commit does. fixes: #1576 Signed-off-by: Deniz Bahadir <deniz@code.bahadir.email>	2023-09-16 11:12:01 +02:00
Cameron Cawley	16fe1f885e	Add ARMv6 version of slide_hash	2023-09-16 11:11:18 +02:00
Cameron Cawley	1c1e728637	Use GCC cpuid intrinsics with MinGW	2023-09-16 11:08:25 +02:00
Nathan Moinvaziri	7ecbaa25fc	Use consistent NEON_AVAILABLE variable across CMake/configure.	2023-09-13 11:55:01 +02:00
Harmen Stoppels	ca2d4e5adc	cast _xgetbv to int to silence conversion warning	2023-09-13 11:54:42 +02:00
Harmen Stoppels	120fe069d3	Do the same for detect-intrinsics.cmake	2023-09-13 11:54:42 +02:00
Nathan Moinvaziri	ca7573297a	Clean up extra whitespaces at line endings in check_rvv_intrinsics.	2023-08-13 17:53:01 +02:00
Nathan Moinvaziri	c7d98c239a	Remove inert check for HAVE_ACLE_FLAG in check_acle_compiler_flag.	2023-08-13 17:53:01 +02:00
Hans Kristian Rosbach	4894be9c93	Move check_c_source_compile_or_run cmake macro to the only place it is used.	2023-08-06 10:17:24 +02:00
Hans Kristian Rosbach	2167377c46	Clean up SSE4.2 support, and no longer use asm fallback or gcc builtin. Defines changing meaning: X86_SSE42 used to mean the compiler supports crc asm fallback. X86_SSE42_CRC_INTRIN used to mean compiler supports SSE4.2 intrinsics. X86_SSE42 now means compiler supports SSE4.2 intrinsics. This therefore also fixes the adler32_sse42 checks, since those were depending on SSE4.2 intrinsics but was mistakenly checking the X86_SSE42 define. Now the X86_SSE42 define actually means what it appears to.	2023-08-06 10:17:24 +02:00
David Korth	8976caa3f0	Handle ARM64EC as ARM64. ARM64EC is a new ARM64 variant introduced in Windows 11 that uses an ABI similar to AMD64, which allows for better interoperability with emulated AMD64 applications. When enabled in MSVC, it defines _M_AMD64 and _M_ARM64EC, but not _M_ARM64, so we need to check for _M_ARM64EC.	2023-07-16 12:42:38 +02:00
Mika T. Lindqvist	7cda3bf660	Use endianess-specific built-in function for gcc < 12 on PowerPC64 * Add support for cross-compiling using clang 13 and later for PowerPC64 little-endian and big-endian * Fix detection for availability of Power9 intrinsics	2023-06-23 19:43:34 +02:00
Hans Kristian Rosbach	362945baec	Fix the same AVX512 error in CMake.	2023-05-13 22:57:47 +02:00
Hans Kristian Rosbach	f2da905287	Fix AVX512-VNNI compile flags.	2023-05-13 20:15:00 +02:00
Alex Chiang	c3cdf434f3	Add supporting RISC-V cross compilation workflows Add RISC-V cross-compilation test Enable RVV support at compile time	2023-05-12 16:57:32 +02:00
Cameron Cawley	b1aafe5c67	Clean up SSE4.2 detection	2023-04-15 15:22:36 +02:00
Cameron Cawley	b09215f75a	Enable use of _mm_shuffle_epi8 on machines without SSE4.1	2023-04-01 17:27:49 +02:00
Georgiy Manuilov	a4d9d697b3	Enable using AVX512 intrinsics with GCC <9 Replace missing '_mm512_set_epi8' with '_mm512_set_epi32' in test code for configuring; Add fallback for '-mtune=cascadelake' flag used when AVX512 is enabled.	2023-03-28 20:36:19 +02:00
Ilya Leoshkevich	b8c2114d51	IBM zSystems: Use HWCAP_S390_VXRS glibc defines HWCAP_S390_VX and, since v2.33, its alias HWCAP_S390_VXRS; musl has only HWCAP_S390_VXRS. Use the common HWCAP_S390_VXRS, define it as HWCAP_S390_VX if necessary.	2023-03-10 13:14:09 +01:00
Mika Lindqvist	b892331cf7	Fix MinGW build * Add detection of XSAVE intrinsics	2023-02-02 17:34:12 +01:00
Dimitri Papadopoulos	9119de005b	Fix typo found by codespell	2023-02-02 16:44:00 +01:00
Piotr Kubaj	0a59b4e745	Add FreeBSD/powerpc* support to cmake/detect-intrinsics.cmake	2023-01-13 20:23:15 +01:00
Vladislav Shchapov	b57e10d316	Fix AVX2 detect Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2022-10-11 21:25:02 +02:00
Hans Kristian Rosbach	6490b70c48	vpclmulqdq compilation fails without avx512f also enabled	2022-10-09 11:36:03 +02:00
Shawn Hoffman	ece74eec32	msvc/armv7: disable crc32_acle msvc compiler targeting 32bit arm supports only armv7 and lacks these intrinsics	2022-09-26 20:09:53 +02:00
Shawn Hoffman	8098fde200	fix ACLE detection on msvc/arm64	2022-09-05 11:26:37 +02:00
Mika Lindqvist	c62b35ffac	[ARM] We need to include NEON headers when testing for -mfpu=neon. * If -mfpu is already specified in C_FLAGS, it can disable NEON support.	2022-06-02 12:25:24 +02:00
Matheus Castanho	02d10b252c	Implement power9 version of compare256. Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>	2022-05-07 14:06:42 +02:00

67 Commits