remove 16-byte alignment from deflate_state::crc0

We noticed recently on the Skia tree that if we build Chromium's zlib
with GCC, -O3, -m32, and -msse2, deflateInit2_() crashes.  Might also
need -fPIC... not sure.

I tracked this down to a `movaps` (16-byte aligned store) to an address
that was only 8-byte aligned.  This address was somewhere in the middle
of the deflate_state struct that deflateInit2_()'s job is to initialize.

That deflate_state struct `s` is allocated using ZALLOC, which calls any
user supplied zalloc if set, or the default if not.  Neither one of
these has any special alignment contract, so generally they'll tend to
be 2*sizeof(void*) aligned.  On 32-bit builds, that's 8-byte aligned.

But because we've annotated crc0 as zalign(16), the natural alignment of
the whole struct is 16-byte, and a compiler like GCC can feel free to
use 16-byte aligned stores to parts of the struct that are 16-byte
aligned, like the beginning, crc0, or any other part before or after
crc0 that happens to fall on a 16-byte boundary.  With -O3 and -msse2,
GCC does exactly that, writing a few of the fields with one 16-byte
store.

The fix is simply to remove zalign(16).  All the code that manipulates
this field was actually already using unaligned loads and stores.  You
can see it all right at the top of crc_folding.c, CRC_LOAD and CRC_SAVE.

This bug comes from the Intel performance patches we landed a few years
ago, and isn't present in upstream zlib, Android's zlib, or Google's
internal zlib.

It doesn't seem to be tickled by Clang, and won't happen on 64-bit GCC
builds: zalloc is likely 16-byte aligned there.  I _think_ it's possible
for it to trigger on non-x86 32-bit builds with GCC, but haven't tested
that.  I also have not tested MSVC.

Reviewed-on: https://chromium-review.googlesource.com/1236613
This commit is contained in:
Mike Klein 2018-09-20 20:34:42 +00:00 committed by Hans Kristian Rosbach
parent ccbb3fef5a
commit 796ad10efb

View File

@ -116,7 +116,7 @@ typedef struct internal_state {
int last_flush; /* value of flush param for previous deflate call */
#ifdef X86_PCLMULQDQ_CRC
unsigned ALIGNED_(16) crc0[4 * 5];
unsigned crc0[4 * 5];
#endif
/* used by deflate.c: */