zlib-ng

mirror of https://github.com/GerbilSoft/zlib-ng.git synced 2025-06-19 03:55:39 -04:00

Author	SHA1	Message	Date
Adam Stylinski	50e9ca06e2	Fold a copy into the adler32 function for UPDATEWINDOW for NEON. A lot of alterations had to be done to make this not worse and so far, it's not really better, either. I had to force inlining for the adler routine, I had to remove the x4 load instruction, otherwise pipelining stalled, and I had to use restrict pointers with a copy idiom for GCC to inline a copy routine for the tail. Still, we see a small benefit in benchmarks, particularly at sizes equal to our window or larger. There's also an added benefit that this will fix #1824.	2025-03-05 22:17:55 +01:00
Cameron Cawley	721c488aff	Rename most ACLE references to ARMv8	2025-02-12 13:54:30 +01:00
Adam Stylinski	785444de08	Fix native detection of CRC instruction. It's unclear whether the GCC shipped with Raspberry Pi OS fails to detect ACLE properly (/proc/cpuinfo claims to support AES), but in any case, the preprocessor macro for that flag is not defined with -march=native on a Raspberry Pi 5. Unfortunately that means when built "WITH_NATIVE", we do not get a fast CRC function. The CRC32 preprocessor macro _IS_ defined, and the auto detection when built without NATIVE support does properly get dispatched to. Since we only need the scalar CRC32 and not the polynomial stuff anyhow, let's make it be an || condition and not a && one.	2024-12-01 16:05:15 +01:00
Adam Stylinski	94aacd8bd6	Try to simplify the inflate loop by collapsing most cases to chunksets	2024-10-23 21:20:11 +02:00
Vladislav Shchapov	c694bcdaf6	Add option to disable runtime CPU detection. Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-03-06 23:32:15 +01:00
Hans Kristian Rosbach	9953f12e21	Move update_hash(), insert_string() and quick_insert_string() out of functable and remove SSE4.2 and ACLE optimizations. The functable overhead is higher than the benefit from using optimized functions.	2024-02-23 13:34:10 +01:00
Nathan Moinvaziri	a090529ece	Remove deflate_state parameter from update_hash functions.	2024-02-23 13:34:10 +01:00
Vladislav Shchapov	ac25a2ea6a	Split CPU features checks and CPU-specific function prototypes and reduce include-dependencies. Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-02-22 20:11:46 +01:00

8 Commits