Commit Graph

44 Commits

Author SHA1 Message Date
Carlos Sánchez López
3da40c259e Fixes build issues C4242, C4244 and C4334 caused by loss of data bugs due to data type mismatch in various files. 2024-08-16 11:52:50 +02:00
Mark Adler
4fe59efbe0 zlib 1.3.1
madler/zlib#51b7f2abdade71cd9bb0e7a373ef2610ec6f9daf
2024-02-07 19:15:56 +01:00
Mark Adler
a3fb271c6e Add LIT_MEM define to use more memory for a small deflate speedup.
A bug fix in zlib 1.2.12 resulted in a slight slowdown (1-2%) of
deflate. This commit provides the option to #define LIT_MEM, which
uses more memory to reverse most of that slowdown. The memory for
the pending buffer and symbol buffers is increased by 25%, which
increases the total memory usage with the default parameters by
about 6%.

madler/zlib#ac8f12c97d1afd9bafa9c710f827d40a407d3266
2024-02-07 19:15:56 +01:00
Nathan Moinvaziri
b047c7247f Prefix shared functions to prevent symbol conflict when linking native api against compat api. 2023-01-09 15:10:11 +01:00
Ilya Leoshkevich
9be98893aa Use PREFIX() for some of the Z_INTERNAL symbols
https://github.com/powturbo/TurboBench links zlib and zlib-ng into the
same binary, causing non-static symbol conflicts. Fix by using PREFIX()
for flush_pending(), bi_reverse(), inflate_ensure_window() and all of
the IBM Z symbols.

Note: do not use an explicit zng_, since one of the long-term goals is
to be able to link two versions of zlib-ng into the same binary for
benchmarking [1].

[1] https://github.com/zlib-ng/zlib-ng/pull/1248#issuecomment-1096648932
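
A minimal sketch of the prefixing idea described above, assuming a token-pasting macro. Only PREFIX() and bi_reverse() are named in the commit message; SKETCH_PREFIX, PASTE and PASTE2 are hypothetical helpers for illustration, and the real project chooses the prefix through its own build configuration.

```c
#include <assert.h>

/* Hypothetical stand-in for the build-time prefix choice. */
#ifndef SKETCH_PREFIX
#  define SKETCH_PREFIX zng_
#endif

/* Two-level expansion so SKETCH_PREFIX is expanded before pasting. */
#define PASTE2(a, b) a##b
#define PASTE(a, b)  PASTE2(a, b)
#define PREFIX(name) PASTE(SKETCH_PREFIX, name)

/* Written as PREFIX(bi_reverse); with the default above the linker
 * sees the symbol zng_bi_reverse, so it cannot clash with a plain
 * bi_reverse exported by another library in the same binary. */
unsigned PREFIX(bi_reverse)(unsigned code, int len) {
    unsigned res = 0;
    assert(len >= 1 && len <= 15);
    /* Reverse the low `len` bits of `code`. */
    do {
        res |= code & 1;
        code >>= 1;
        res <<= 1;
    } while (--len > 0);
    return res >> 1;
}
```

Because callers also spell the name as PREFIX(bi_reverse), changing the prefix in one place renames the symbol everywhere, which is what would allow two differently prefixed builds to coexist in one benchmark binary.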
2022-04-27 10:37:43 +02:00
Nathan Moinvaziri
e4c622371d Switch longest_match in deflate_slow based on whether or not rolling hash is being used.
Co-authored-by: Hans Kristian Rosbach <hk-git@circlestorm.org>
2021-06-25 20:09:14 +02:00
Hans Kristian Rosbach
cf9127a231 Separate MIN_MATCH into STD_MIN_MATCH and WANT_MIN_MATCH
Rename MAX_MATCH to STD_MAX_MATCH
2021-06-13 20:55:01 +02:00
Nathan Moinvaziri
156be5cf0f Separate huff, rle, and stored deflate strategies into their own source files. 2021-06-12 19:34:42 +02:00
Mika Lindqvist
4747276ffe Move MIN() macro to zbuild.h 2021-06-04 20:54:28 +02:00
Nathan Moinvaziri
a659d7f071 Fixed casting warnings when comparing MAX_DIST.
deflate_medium.c(127,76): warning C4244: '=': conversion from 'unsigned int' to 'Pos', possible loss of data
2020-11-02 17:01:58 +01:00
Nathan Moinvaziri
778d65f3b3 Fixed conversion warning when calling zng_tr_tally_dist. Signature for dist and len now match zng_emit_dist.
deflate.c(1575,67): warning C4244: 'function': conversion from 'uint32_t' to 'unsigned char', possible loss of data
deflate_fast.c(60,94): warning C4244: 'function': conversion from 'uint32_t' to 'unsigned char', possible loss of data
deflate_medium.c(39,102): warning C4244: 'function': conversion from 'int' to 'unsigned char', possible loss of data
deflate_slow.c(75,101): warning C4244: 'function': conversion from 'unsigned int' to 'unsigned char', possible loss of data
2020-11-02 17:01:58 +01:00
Nathan Moinvaziri
7cffba4dd6 Rename ZLIB_INTERNAL to Z_INTERNAL for consistency. 2020-08-31 12:33:16 +02:00
Hans Kristian Rosbach
6264b5a58d Fix more conversion warnings related to s->bi_valid, stored_len and misc. 2020-08-27 19:20:38 +02:00
Hans Kristian Rosbach
3a26093baf Fix some of the old and new conversion warnings in deflate* 2020-08-27 19:20:38 +02:00
Ilya Leoshkevich
5fad6c557a Revert "zng_tr_tally_lit: disable -Wtype-limits"
This makes MSVC unhappy:

https://github.com/zlib-ng/zlib-ng/pull/726#issuecomment-681128124

D:\a\zlib-ng\zlib-ng\deflate_p.h(36,9): warning C4068: unknown pragma 'GCC' [D:\a\zlib-ng\zlib-ng\zlib.vcxproj]

This reverts commit 24c442c606.
2020-08-26 23:56:15 +02:00
Ilya Leoshkevich
24c442c606 zng_tr_tally_lit: disable -Wtype-limits
Some gcc versions complain that parameter c is always less than
MAX_MATCH-MIN_MATCH, and therefore the assertion that checks for this
is useless, but in reality some day MIN_MATCH and MAX_MATCH can change.

So disable the warning around the assertion.
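
A rough sketch of what disabling the warning around a single assertion can look like. The function name tally_lit, the counter, and the constant values are hypothetical. The `#if defined(__GNUC__)` guard is an assumption added here so that compilers without GCC pragmas never see them; it is not taken from the commit.

```c
#include <assert.h>

/* Hypothetical values mirroring the names in the commit message. */
#define MIN_MATCH 3
#define MAX_MATCH 258

static unsigned lit_count = 0;  /* hypothetical tally counter */

static unsigned tally_lit(unsigned char c) {
    /* An unsigned char can never exceed 255, so some GCC versions flag
     * this comparison as always true. Suppress only that diagnostic,
     * and only around the assertion. */
#if defined(__GNUC__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wtype-limits"
#endif
    assert(c <= (MAX_MATCH - MIN_MATCH));
#if defined(__GNUC__)
#  pragma GCC diagnostic pop
#endif
    return ++lit_count;
}
```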
2020-08-23 10:07:07 +02:00
Nathan Moinvaziri
feaa6b7993 Move Tracev flush statement into flush_pending. 2020-06-08 21:16:31 +02:00
Nathan Moinvaziri
36e0e0bdf0 Move Tracevv statements when emitting literal to zng_tr_tally_lit. 2020-06-08 21:16:31 +02:00
Nathan Moinvaziri
a0fa24f92f Remove IPos typedef which also helps to reduce casting warnings. 2020-05-30 21:29:44 +02:00
Nathan Moinvaziri
d569bfe23a Fixed dist casting warnings in zng_tr_tally_dist.
deflate_p.h(42,37): warning C4244: '=': conversion from 'unsigned int' to 'unsigned char', possible loss of data
deflate_p.h(43,42): warning C4244: '=': conversion from 'unsigned int' to 'unsigned char', possible loss of data
2020-05-30 21:25:18 +02:00
Nathan Moinvaziri
69bbb0d823 Standardize insert_string functionality across architectures. Added unaligned conditionally compiled code for insert_string and quick_insert_string. Unify sse42 crc32 assembly between insert_string and quick_insert_string. Modified quick_insert_string to work across architectures. 2020-04-30 10:01:46 +02:00
Nathan Moinvaziri
c459b4f5e1 Clean up zng_tr_tally code. 2020-03-13 13:04:37 +01:00
Nathan Moinvaziri
e0a711cdde Fixed formatting, 4 spaces for code indent, 2 spaces for preprocessor indent, initial function brace on the same line as definition, removed extraneous spaces and new lines. 2020-02-07 10:44:20 +01:00
Nathan Moinvaziri
f06c71f981 Add zng_ prefix to internal functions to avoid linking conflicts with zlib. (#363) 2019-07-18 13:21:13 +02:00
Sebastian Pop
1cd1b4eb0e ARM: check cpu feature once at init time
This makes the checks for arm cpu features as inexpensive as on the x86 side
by calling the runtime feature detection once in deflate/inflate init and then
storing the result in a global variable.
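
The pattern described above can be sketched as follows, assuming a single feature flag. All names here (probe_crc32, cpu_check_features, arm_cpu_has_crc32, probe_calls) are hypothetical, and the probe is a stand-in: a real one would ask the operating system or the CPU.

```c
#include <assert.h>

static int probe_calls = 0;       /* counts how often detection really runs */
static int features_checked = 0;  /* set once detection has been done */
int arm_cpu_has_crc32 = 0;        /* global result read by the hot paths */

/* Stand-in for the expensive runtime query. */
static int probe_crc32(void) {
    probe_calls++;
    return 1;
}

/* Called from the deflate/inflate init path; later calls return at once. */
void cpu_check_features(void) {
    if (features_checked)
        return;
    arm_cpu_has_crc32 = probe_crc32();
    features_checked = 1;
}
```

After init, hot code only reads the global, so the cost per call is a single load and branch. This sketch is not thread-safe; concurrent first calls would need extra care.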
2019-03-01 11:40:23 +01:00
Sebastian Pop
13619fd2b6 return an index for hash map collisions in insert_string
The current version of insert_string_c and variations for sse2, arm, and aarch64
in zlib-ng has changed semantics from the original code of INSERT_STRING macro
in zlib:

 #define INSERT_STRING(s, str, match_head) \
   (UPDATE_HASH(s, s->ins_h, s->window[(str) + (MIN_MATCH-1)]), \
    match_head = s->prev[(str) & s->w_mask] = s->head[s->ins_h], \
    s->head[s->ins_h] = (Pos)(str))

The code of INSERT_STRING assigns match_head with the content of s->head[s->ins_h].

In zlib-ng, the assignment to match_head happens in the caller of insert_string().
zlib-ng's insert_string_*() functions return 0 instead of str+idx in case of
collision, i.e., when s->head[s->ins_h] == str+idx.

The effect of returning 0 instead of the content of s->head[s->ins_h] is that
the search for a longest_match through s->prev[] chains will be cut short when
arriving at 0. This leads to a shorter compression time at the expense of a
worse compression rate: returning 0 cuts out the search space.
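
A simplified sketch of the semantics this patch aims for, assuming a toy state with a caller-supplied hash value. sketch_state, insert_string_sketch and the table sizes are hypothetical and much smaller than the real ones. The function always returns the previous head of the hash chain, including in the collision case where the head already equals the inserted position.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t Pos;

#define HASH_SIZE 16u            /* hypothetical toy sizes */
#define W_SIZE    32u
#define W_MASK    (W_SIZE - 1u)

typedef struct {
    Pos head[HASH_SIZE];         /* most recent position per hash value */
    Pos prev[W_SIZE];            /* link to the previous position in the chain */
} sketch_state;

/* Insert position `str` under hash `h` and return the old chain head. */
static Pos insert_string_sketch(sketch_state *s, unsigned h, Pos str) {
    Pos head = s->head[h];
    if (head != str) {
        /* Normal case: link the old head behind the new position. */
        s->prev[str & W_MASK] = head;
        s->head[h] = str;
    }
    /* Return the head even when head == str, rather than 0, so the
     * caller can still walk the s->prev[] chain from there. */
    return head;
}
```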

With this patch:

 Performance counter stats for './minigzip -9 llvm.tar':

      13422.379017      task-clock (msec)         #    1.000 CPUs utilized
                20      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               130      page-faults               #    0.010 K/sec
    58,926,104,511      cycles                    #    4.390 GHz
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
    77,543,740,646      instructions              #    1.32  insns per cycle
    17,158,892,214      branches                  # 1278.379 M/sec
       198,433,680      branch-misses             #    1.16% of all branches

      13.423365095 seconds time elapsed

45408 -rw-rw-r-- 1 spop spop 46493896 Dec 11 11:47 llvm.tar.gz

Without this patch the compressed file is larger:

 Performance counter stats for './minigzip -9 llvm.tar':

      13459.342312      task-clock (msec)         #    1.000 CPUs utilized
                25      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               129      page-faults               #    0.010 K/sec
    59,088,391,808      cycles                    #    4.390 GHz
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
    77,600,766,958      instructions              #    1.31  insns per cycle
    17,486,130,785      branches                  # 1299.182 M/sec
       196,281,761      branch-misses             #    1.12% of all branches

      13.463512830 seconds time elapsed

45408 -rw-rw-r-- 1 spop spop 46493896 Dec 11 11:48 llvm.tar.gz
2018-12-13 09:03:25 +01:00
Mika Lindqvist
aff0fc6e3c Adapt code to support PREFIX macros and update build scripts 2018-01-31 10:45:29 +01:00
Mika Lindqvist
725f7bc04c Fix that s->prev is not used uninitialized in insert_string_* 2017-08-24 12:35:36 +02:00
Hans Kristian Rosbach
da51338488 Add a struct func_table and function functableInit.
The struct contains pointers to select functions to be used by the
rest of zlib, and the init function selects what functions will be
used depending on what optimizations have been compiled in and what
instruction-sets are available at runtime.

Tests done on a Haswell CPU running minigzip -6 compression of a
40M file show a 2.5% decrease in branches and a 25-30% reduction
in iTLB-loads. The reduction in iTLB-loads is likely mostly due to
the inability to inline functions. This also causes a slight
performance regression of around 1%, but it might still be worth it
to make it much easier to implement new optimized functions for
various architectures and instruction sets.

The performance penalty will get smaller for functions that get more
alternative implementations to choose from, since there is no need
to add more branches to every call of the function.
Today insert_string has 1 branch to choose insert_string_sse
or insert_string_c, but if we also add for example insert_string_sse4
then that would have needed another branch, and it would probably
at some point hinder effective inlining too.
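
A minimal sketch of the dispatch-table idea, assuming one entry and a single capability flag. func_table, functableInit, insert_string_sse and insert_string_c are named in the message above; the function signature, the placeholder bodies, and the cpu_has_sse42 parameter are assumptions for illustration.

```c
#include <assert.h>

typedef unsigned (*insert_string_fn)(unsigned str);

/* Placeholder bodies: they return different values only so the selected
 * variant is observable; they do not do any real hashing. */
static unsigned insert_string_c(unsigned str)   { return str + 1; }
static unsigned insert_string_sse(unsigned str) { return str + 2; }

struct func_table {
    insert_string_fn insert_string;
};

static struct func_table functable;

/* Choose implementations once; callers then go through the pointer
 * instead of branching on CPU features at every call. */
static void functableInit(int cpu_has_sse42) {
    functable.insert_string = cpu_has_sse42 ? insert_string_sse
                                            : insert_string_c;
}
```

Adding a further variant only changes the selection inside functableInit; call sites such as functable.insert_string(pos) stay the same, at the cost of an indirect call that the compiler cannot inline.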
2017-04-24 11:02:56 +02:00
Mika Lindqvist
02f5b0b9c6 Add support for ARM ACLE instructions. 2017-03-24 23:55:58 +02:00
Mika Lindqvist
bcab6edb16 Type cleanup...
* uInt -> unsigned int
* ulg -> unsigned long
2017-02-16 11:20:27 +01:00
Mark Adler
23b4bb3ec4 Avoid use of DEBUG macro -- change to ZLIB_DEBUG. 2017-02-06 14:33:22 +01:00
Mark Adler
40039a039c Speed up deflation for level 0 (storing).
The previous code slid the window and the hash table and copied
every input byte three times in order to just write the data as
stored blocks with no compression. This commit minimizes sliding
and copying, especially for large input and output buffers.

Level 0 compression is now more than 20 times faster than before
the commit.

Most of the speedup is due to deferring hash table slides until
deflateParams() is called to change the compression level away
from 0. More speedup is due to copying directly from next_in to
next_out when the amounts of available input data and output space
permit it, avoiding the intermediate pending buffer. Additionally,
only the last 32K of the used input data is copied back to the
sliding window when large input buffers are provided.
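
The direct-copy step can be sketched roughly as below. sketch_stream and copy_direct are hypothetical names; the field names follow the next_in/next_out wording above. The sketch leaves out stored-block headers, the pending buffer, and the sliding window, so it shows only the idea of bypassing an intermediate buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const unsigned char *next_in;   /* next input byte */
    size_t               avail_in;  /* bytes available at next_in */
    unsigned char       *next_out;  /* next output byte goes here */
    size_t               avail_out; /* free space at next_out */
} sketch_stream;

/* Copy as much as both sides permit straight from input to output and
 * advance the stream bookkeeping. Returns the number of bytes copied. */
static size_t copy_direct(sketch_stream *strm) {
    size_t n = strm->avail_in < strm->avail_out ? strm->avail_in
                                                : strm->avail_out;
    memcpy(strm->next_out, strm->next_in, n);
    strm->next_in   += n;
    strm->avail_in  -= n;
    strm->next_out  += n;
    strm->avail_out -= n;
    return n;
}
```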
2017-02-06 12:40:06 +01:00
Mika Lindqvist
602531cf3d Replace Z_NULL with NULL. Fix incorrect uses of NULL/Z_NULL. 2017-01-31 10:53:22 +01:00
Mika Lindqvist
f111d5cb42 Merge insert_string and bulk_insert_str.
** Partial merge of this commit, based on a8c94e9f5a3b9d3c62182bcf84e72304a3c1a6e5
Excludes changes to fill_window_sse.c, changes to fill_window_c() in deflate.c
and several unrelated changes in the commit.
2017-01-30 12:07:06 +01:00
Mika Lindqvist
631817cce8 local -> static
* local -> static
* Normalize and cleanup line-endings
* Fix warnings under Visual Studio.
* Whitespace cleanup

***
This patch has been edited to merge cleanly and to exclude type changes.
Based on 8d7a7c3b82c6e38734bd504dac800b148ab410d0 "Type Cleanup"
2017-01-30 10:35:05 +01:00
Mika Lindqvist
2b9bf9e672 Don't update prev if old head is same as new. 2016-07-04 16:24:35 +03:00
Hans Kristian Rosbach
76a02fa8c3 Merge pull request #56 from Dead2/hacknslash2
Improvements for MSVC and backport fixes from zlib upstream and intel's fork
2015-11-25 13:37:27 +01:00
Joergen Ibsen
8129b65a70 Fix uninitialized variable 2015-11-09 19:57:26 +01:00
Hans Kristian Rosbach
2c4ec8a5a3 Split insert_string_sse into separate file in arch folder. 2015-11-04 20:58:56 +01:00
Mat Berchtold
7e48233b59 Add support for MSVC crc32 intrinsic 2015-11-02 13:24:58 +01:00
Hans Kristian Rosbach
04a039de2b Whitespace cleanup 2015-06-26 15:43:09 +02:00
Hans Kristian Rosbach
43580a20b6 Make insert_string_sse more similar to insert_string_c. 2015-06-24 23:06:13 +02:00
Mika Lindqvist
fc1d0be4bb Split deflate.c
* Separate common inlines and macros to deflate_p.h
* Separate deflate_fast related code to deflate_fast.c
* Separate deflate_medium related code to deflate_medium.c
* Separate deflate_slow related code to deflate_slow.c
2015-06-24 20:34:55 +02:00