zlib-ng

mirror of https://github.com/GerbilSoft/zlib-ng.git synced 2025-06-18 11:35:35 -04:00

Author	SHA1	Message	Date
Carlos Sánchez López	3da40c259e	Fixes build issues C4242, C4244 and C4334 caused by loss of data bugs due to data type mismatch in various files.	2024-08-16 11:52:50 +02:00
Pavel P	2c801bd43a	Cast result of zalloc to char * to avoid warnings + remove unnecessary cast when using `original_buf`	2024-08-09 13:34:43 +02:00
Hans Kristian Rosbach	130055e8d1	Rewrite deflate memory allocation. Deflate used to call allocate 5 times during init. - 5 calls to external alloc function now becomes 1 - Handling alignment of allocated buffers is simplified - Efforts to align the allocated buffer now needs to happen only once. - Individual buffers are ordered so that they have natural sequential alignment. - Due to reduced losses to alignment, we allocate less memory in total. - While doing alloc(), we now store pointer to corresponding free(), avoiding crashes with applications that incorrectly set alloc/free pointers after running init function. - Removed need for extra padding after window, chunked reads can now go beyond the window buffer without causing a segfault. Co-authored-by: Ilya Leoshkevich <iii@linux.ibm.com>	2024-05-28 16:35:13 +02:00
Ilya Leoshkevich	7a55ec9aca	Prepare DFLTCC changes for new malloc system	2024-05-28 16:35:13 +02:00
Ilya Leoshkevich	05ef29eda5	IBM zSystems DFLTCC: Inline DFLTCC states into zlib states Currently DFLTCC states are allocated using hook macros, complicating memory management. Inline them into zlib states and remove the hooks.	2024-05-15 11:28:10 +02:00
Vladislav Shchapov	af8169a724	Replace conditional call to functable.force_init with macro FUNCTABLE_INIT Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-03-06 23:32:15 +01:00
Vladislav Shchapov	c694bcdaf6	Add option to disable runtime CPU detection Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-03-06 23:32:15 +01:00
Vladislav Shchapov	fe0a6407da	Explicitly indicate functions are conditionally dispatched Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2024-03-06 23:32:15 +01:00
Hans Kristian Rosbach	9953f12e21	Move update_hash(), insert_string() and quick_insert_string() out of functable and remove SSE4.2 and ACLE optimizations. The functable overhead is higher than the benefit from using optimized functions.	2024-02-23 13:34:10 +01:00
Nathan Moinvaziri	a090529ece	Remove deflate_state parameter from update_hash functions.	2024-02-23 13:34:10 +01:00
Mark Adler	4fe59efbe0	zlib 1.3.1 madler/zlib#51b7f2abdade71cd9bb0e7a373ef2610ec6f9daf	2024-02-07 19:15:56 +01:00
Hans Wennborg	6345d05782	Fix the copy of pending_buf in deflateCopy() for the LIT_MEM case. madler/zlib#60c31985ecdc2b40873564867e1ad2aef0b88697	2024-02-07 19:15:56 +01:00
Mark Adler	a3fb271c6e	Add LIT_MEM define to use more memory for a small deflate speedup. A bug fix in zlib 1.2.12 resulted in a slight slowdown (1-2%) of deflate. This commit provides the option to #define LIT_MEM, which uses more memory to reverse most of that slowdown. The memory for the pending buffer and symbol buffers is increased by 25%, which increases the total memory usage with the default parameters by about 6%. madler/zlib#ac8f12c97d1afd9bafa9c710f827d40a407d3266	2024-02-07 19:15:56 +01:00
Vladislav Shchapov	0c32ad4237	Add force initialization functable, because deflate captures function pointers from functable Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2023-12-21 16:12:00 +01:00
Letu Ren	6e4b8b194b	Update copyright to sync with zlib 1.3 https://github.com/zlib-ng/zlib-ng/pull/1563 this patch forgets to update copyright string.	2023-12-20 22:00:02 +01:00
Nathan Moinvaziri	e9a48a2ecb	Simplify deflate stream/state check.	2023-08-06 10:20:43 +02:00
Nathan Moinvaziri	afd2a0c323	Minor code cleanup in deflate.c.	2023-06-13 12:27:39 +02:00
Vladislav Shchapov	20d8fa8af1	Replace global CPU feature flag variables with local variable in init_functable Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>	2023-03-06 13:26:09 +01:00
Nathan Moinvaziri	fa9bfeddcf	Use named defines instead of hard coded numbers.	2023-02-18 20:30:55 +01:00
Hans Kristian Rosbach	cf5bb01da9	Fix prefixing for internal functions calloc/cfree	2023-02-09 01:54:19 +01:00
Mika T. Lindqvist	d5db5aa985	Sync with zlib 1.2.13 and declare compatibility.	2023-02-03 15:49:02 +01:00
Mark Adler	4af454bfa7	Fix bug in deflateBound() for level 0 and memLevel 9. memLevel 9 would cause deflateBound() to assume the use of fixed blocks, even if the compression level was 0, which forces stored blocks. That could result in a bound less than the size of the compressed data. Now level 0 always uses the stored blocks bound.	2023-02-03 15:49:02 +01:00
Nathan Moinvaziri	b047c7247f	Prefix shared functions to prevent symbol conflict when linking native api against compat api.	2023-01-09 15:10:11 +01:00
Ilya Leoshkevich	527610a84e	Fix deflate() with Z_BEST_COMPRESSION ignoring the dictionary deflate_slow() uses s->quick_insert_string(), while deflateSetDictionary() uses functable.insert_string(). These functions use different hashing algorithms, which leads to deflate_slow() ignoring the dictionary. Fix by using s->insert_string() instead of functable.insert_string(), which is set by lm_set_level() and matches what deflate_*() uses (suggested by Mika Lindqvist).	2022-10-23 14:53:54 +02:00
Nathan Moinvaziri	1532af4d85	Don't use zlib fork identifier in copyright statement.	2022-08-15 16:42:58 +02:00
Tobias Stoeckmann	956ff05383	Handle invalid windowBits in init functions Negative windowBits arguments are eventually turned positive in deflateInit2_ and inflateInit2_ (more precisely in inflateReset2). Such values are used to indicate that raw deflate/inflate should be performed. If a user supplies INT32_MIN for windowBits, the code will perform -INT32_MIN which does not fit into int32_t. In fact, this is undefined behavior in C and should be avoided. Clearly this is a user error, but given the careful validation of input arguments a few lines later in deflateInit2_ I think this might be of interest. Proof of Concept: - Compile zlib-ng with gcc -ftrapv or -fsanitize=undefined - Compile and run this program:
```
#include <limits.h>
#include <stdio.h>
#include <zlib-ng.h>

int main(void) {
    zng_stream de_stream = { 0 }, in_stream = { 0 };
    int result;

    result = zng_deflateInit2(&de_stream, 0, Z_DEFLATED, INT32_MIN, MAX_MEM_LEVEL, Z_DEFAULT_STRATEGY);
    printf("zng_deflateInit2: %d\n", result);

    result = zng_inflateInit2(&in_stream, INT32_MIN);
    printf("zng_inflateInit2: %d\n", result);

    return 0;
}
```
	2022-06-16 14:08:55 +02:00
Tobias Stoeckmann	3f7b0b411d	Extend GZIP conditional If gzip support has been disabled during compilation then also consider gzip relevant states as invalid in deflateStateCheck. Also the gzip state definitions can be removed. This change leads to failure in test/example, and I am not sure what the GZIP conditional is trying to achieve. All gzip related functions are still defined in zlib.h Alternative approach is to remove the GZIP define.	2022-06-16 14:08:44 +02:00
Nathan Moinvaziri	d43822b9a7	zlib 1.2.12	2022-06-13 15:58:03 +02:00
Hans Kristian Rosbach	28b029c726	Simplify version and struct size checking, and ensure we do it the same way everywhere.	2022-06-03 10:21:01 +02:00
Hans Kristian Rosbach	2f4e2372a2	Simplify zlib-ng native API by removing version and struct size checks. This should be backwards compatible with applications compiled for 2.0.x.	2022-06-03 10:21:01 +02:00
Adam Stylinski	d79984b5bc	Adding avx512_vnni inline + copy elision Interesting revelation while benchmarking all of this is that our chunkmemset_avx seems to be slower in a lot of use cases than chunkmemset_sse. That will be an interesting function to attempt to optimize. Right now though, we're basically beating google for all PNG decode and encode benchmarks. There are some variations of flags that can basically have us trading blows, but we're about as much as 14% faster than chromium's zlib patches. While we're here, add a more direct benchmark of the folded copy method versus the explicit copy + checksum.	2022-05-23 16:13:39 +02:00
Adam Stylinski	b8269bb7d4	Added inlined AVX512 adler checksum + copy While we're here, also simplify the "fold" signature, as reducing the number of rebases and horizontal sums did not prove to be meaningfully faster (slower in many circumstances).	2022-05-23 16:13:39 +02:00
Adam Stylinski	21f461e238	Adding an SSE42 optimized copy + adler checksum implementation We are protecting its usage around a lot of preprocessor macros as the other methods are not yet implemented and calling this version bypasses the faster adler implementations implicitly. When more versions are written for faster vectorizations, the functable entries will be populated and preprocessor macros removed. This round, the copy + checksum is not employing as many tricks as one would hope with a "folded" checksum routine. The reason for this is the particularly tricky case of dealing with unaligned buffers. The implementations which don't have CPUs in the mix that have a huge penalty for unaligned loads will have a much faster implementation. Fancier methods that minimized rebasing, while having the potential to be faster, ended up being slower because the compiler structured the code in a way that ended up either spilling to the stack or trampolining out of a loop and back in it instead of just jumping over the first load and store. Revisiting this for AVX512, where more registers are abundant and more advanced loads exist, may be prudent.	2022-05-23 16:13:39 +02:00
Ilya Leoshkevich	c592b1b332	IBM Z DFLTCC: Split deflate and inflate states Currently deflate and inflate both use a common state struct. There are several variables in this struct that we don't need for inflate, and more may be coming in the future. Therefore split them in two separate structs. This in turn requires splitting ZALLOC_STATE and ZCOPY_STATE macros.	2022-04-28 12:01:57 +02:00
Ilya Leoshkevich	9be98893aa	Use PREFIX() for some of the Z_INTERNAL symbols https://github.com/powturbo/TurboBench links zlib and zlib-ng into the same binary, causing non-static symbol conflicts. Fix by using PREFIX() for flush_pending(), bi_reverse(), inflate_ensure_window() and all of the IBM Z symbols. Note: do not use an explicit zng_, since one of the long-term goals is to be able to link two versions of zlib-ng into the same binary for benchmarking [1]. [1] https://github.com/zlib-ng/zlib-ng/pull/1248#issuecomment-1096648932	2022-04-27 10:37:43 +02:00
Mika Lindqvist	4fadf3c49e	Add one extra byte to return value of compressBound and deflateBound for small lengths due to shift returning 0. * Treat 0 byte input as 1 byte input when calculating compressBound and deflateBound	2022-04-22 13:50:22 +02:00
Nathan Moinvaziri	a639a3d43f	Use cpu_check_features in inflate and deflate.	2022-01-23 16:39:48 +01:00
Nathan Moinvaziri	a5a0b40e17	Move cpu_feature includes out of zutil.h.	2022-01-23 16:39:48 +01:00
Hans Kristian Rosbach	70f608c79c	Fix deflateBound and compressBound returning very small size estimates. Remove workaround in switchlevels.c, so we do actual testing of this. Use named defines instead of magic numbers where we can.	2021-12-20 14:49:27 +01:00
Ilya Leoshkevich	b4ca25afab	DFLTCC update for window optimization from Jim & Nathan Stop relying on software and hardware inflate window formats being the same and act the way we already do for deflate: provide and implement window-related hooks. Another possibility would be to use an in-line history buffer (by not setting HBT_CIRCULAR), but this would require an extra memmove(). Also fix a couple corner cases in the software implementation of inflateGetDictionary() and inflateSetDictionary().	2021-12-02 09:26:32 +01:00
Nathan Moinvaziri	d802e8900f	Move crc32 folding functions into functable.	2021-08-13 15:05:34 +02:00
Nathan Moinvaziri	e52d08ea92	Separate slide_hash_c in the same way that insert_string_c is separated from deflate.c.	2021-07-08 09:33:41 +02:00
Nathan Moinvaziri	e4c622371d	Switch longest_match in deflate_slow based on whether or not rolling hash is being used. Co-authored-by: Hans Kristian Rosbach <hk-git@circlestorm.org>	2021-06-25 20:09:14 +02:00
Hans Kristian Rosbach	8916f295e0	Use UNLIKELY for branches related to rolling hash based on performance profiling. Co-authored-by: Nathan Moinvaziri <nathan@nathanm.com>	2021-06-25 20:09:14 +02:00
Nathan Moinvaziri	608b1c2020	Enable rolling hash function switching for fast-zlib.	2021-06-25 20:09:14 +02:00
Nathan Moinvaziri	1c766dbf67	Setup hash functions to be switched based on compression level.	2021-06-25 20:09:14 +02:00
Nathan Moinvaziri	6948789969	Added rolling hash functions for hash table.	2021-06-25 20:09:14 +02:00
Nathan Moinvaziri	3f5801f151	Use MIN and MAX macros.	2021-06-13 20:56:06 +02:00
Hans Kristian Rosbach	d7cac887af	Initialize s->prev_length to 0.	2021-06-13 20:55:01 +02:00
Hans Kristian Rosbach	94fd14ede7	Remove support for STD_MIN_MATCH != 3. It has always been broken and untested anyways.	2021-06-13 20:55:01 +02:00
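
Commit 956ff05383 above describes negating a user-supplied windowBits value before it has been range-checked, which is undefined behavior for INT32_MIN. The following is a minimal sketch of the general remedy (check the range first, negate second), not the actual zlib-ng patch. The helper name `sketch_normalize_window_bits` and the `SKETCH_*` constants are hypothetical, and the sketch ignores the larger windowBits values that the real API accepts for gzip wrapping and automatic header detection.

```
#include <stdint.h>

#define SKETCH_MIN_WBITS 8    /* assumed smallest window size exponent */
#define SKETCH_MAX_WBITS 15   /* assumed largest window size exponent */

/* Hypothetical helper for illustration only; not a zlib-ng function.
 * Returns 0 on success and -1 if window_bits is out of range.
 * On success, *bits receives the positive window size exponent and
 * *raw is 1 when a negative input requested a raw stream.
 * On failure, *bits and *raw are left untouched. */
static int sketch_normalize_window_bits(int32_t window_bits, int32_t *bits, int *raw) {
    int is_raw = 0;

    if (window_bits < 0) {
        /* Range-check BEFORE negating: -INT32_MIN does not fit in
         * int32_t, so negating first would be undefined behavior. */
        if (window_bits < -SKETCH_MAX_WBITS)
            return -1;
        is_raw = 1;
        window_bits = -window_bits;   /* safe: value is in [-15, -1] */
    }

    if (window_bits < SKETCH_MIN_WBITS || window_bits > SKETCH_MAX_WBITS)
        return -1;

    *bits = window_bits;
    *raw = is_raw;
    return 0;
}
```

With this ordering, passing INT32_MIN is rejected by the first comparison and the negation is never evaluated, so a build with -ftrapv or -fsanitize=undefined should have nothing to report for this path.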

322 Commits