The Deflate Compression Algorithm
In some ways compression is responsible for the very existence of the Portable
Network Graphics format (recall Chapter 1, "An Introduction to PNG"), and it is undoubtedly one of the
most important components of PNG. The PNG specification defines a single
compression method, the deflate algorithm, for all image types.
Part of the LZ77 class of compression algorithms, deflate was defined
by PKWARE in 1991 as part of the 1.93a beta version of their PKZIP
archiver. Independently implemented by Jean-loup Gailly
and Mark Adler,
first for Info-ZIP's Zip and UnZip utilities and shortly
thereafter for the GNU gzip utility, the deflate algorithm is
battle-tested and today is probably the most commonly used
file-compression algorithm on the Internet. Although it is not the
best-compressing algorithm known,[70]
deflate has a very desirable mix of characteristics: high
reliability, good compression, good encoding speed, excellent decoding speed,
minimal overhead on incompressible data, and modest, well-defined memory
footprints for both encoding and decoding.
As an LZ77-derived algorithm, deflate is fundamentally based on the concept
of a sliding window. One begins with the premise that many types of
interesting data, from binary computer instructions to source code to
ordinary text to images, are repetitious to varying degrees. The basic
idea of a sliding window is to imagine a window of some width immediately
preceding the current position in the data stream (and therefore
sliding along as the current position is updated), which one can use as
a kind of dictionary to encode subsequent data. For example, if the text
of this chapter is the data stream, then the current position at the very
instant you read this is here. Preceding that point is a little more
than 13,000 bytes of text, which includes, among other things, six copies of
the fragment ``or example'' (whoa, there's another one!). Instead of encoding
such strings as literal text, deflate replaces each with a pair of numbers
indicating its length (in this case, 10 bytes) and the distance back to one
of the previous instances (perhaps 950 bytes between the fifth and sixth).
The greater the length of the string, the greater the savings in encoding it
as a pointer into the window.
There are various ways to implement LZ77; the approach used by deflate is
a ``greedy'' algorithm originally devised by James Storer and Thomas
Szymanski--hence its name, LZSS. LZSS employs a look-ahead buffer and
finds the longest match for the buffer within the sliding window. If the
match exceeds a given threshold length, the string is encoded as a
length/distance pair and the buffer is advanced a corresponding amount. If the
longest match is not sufficiently long, the first character in the
look-ahead buffer is output as a literal value, and the buffer is advanced
by one. Either way, the algorithm continues by seeking the longest match
for the new contents of the buffer.
The deflate algorithm is actually a bit more clever than the preceding
description would suggest. Rather than simply storing the length/distance
pairs and literal bytes as is, it further compresses the data by
Huffman-encoding the LZ77 output. This approach is generically
referred to as LZH; deflate's uniqueness lies in its method of combining
literals and lengths into a single Huffman tree, its use of both fixed and
dynamic Huffman codes, and its division of the output stream into blocks
so that regions of incompressible data can be stored as is, rather than
expanding significantly, as can happen with the LZW algorithm.
The PNG specification further dictates that the deflate data stream
must conform to the zlib 1.0 format. In particular, the size of the
sliding window must be a power of 2 between 256 bytes and 32
kilobytes, inclusive, and a small zlib header and trailer are
required. The latter includes a 32-bit checksum on the
uncompressed data; recall that the compressed stream is already
covered by PNG's 32-bit CRC value in each IDAT chunk.
More detailed explanation of the deflate algorithm and the zlib data
format is beyond the scope of this book, but the full zlib and deflate
specifications are available from
http://www.zlib.org/zlib_docs.html .
In addition, a reference such as
The Data Compression Book, by Mark
Nelson and Jean-loup Gailly, is invaluable for understanding many
compression algorithms, including LZ77 and LZSS.
Practically speaking, independent implementation of the deflate algorithm is
both difficult and unnecessary. Almost every PNG implementation available
today makes use of the freely available zlib compression library, and
the examples in Part III, Programming with PNG, do so as well.[71]
For now I merely note that zlib supports ten compression levels
(including one with no compression at all), differing in the algorithms used
to find matching strings and in the thresholds for terminating the search
prematurely.
As an aside, note that the efficiency of the compression engine increases
as more data is processed, with peak efficiency being reached when there is
sufficient data to fill the sliding window. This occurs mainly because there
are fewer strings available in the ``dictionary,'' but also, initially, because
those strings that do exist are limited in length--obviously, they cannot be
any longer than the amount of data in the window. Thus, for example, when 25%
of the compressed data has been received, it may correspond to only 20% of the
pixels. But because of data buffering in network protocols and applications,
any large disparities due to the truly low-efficiency encoding at startup will
tend to be washed out at the 512-byte level (or higher). That is, even though
the first 50 bytes might represent only 1% compression, those bytes generally
will not be available until after the 512th byte has been received, by
which point the compression efficiency may have reached 10% or better. And
since this is generally true of most compression algorithms, including those
used by both GIF and PNG, it is reasonable to compare (as I did in Chapter 1, "An Introduction to PNG")
the appearance of the uncompressed pixels at an instant when equal
amounts of compressed data have been received.
A Final Word on Patents
As mentioned at the end of
Chapter 7, "History of the Portable Network Graphics Format",
Stac has reportedly claimed that the
deflate algorithm is covered by two of their patents. In fact, there
are a number of patents that can be infringed upon by a
compliant deflate implementation, including one held by PKWARE itself
that involves sorted hash tables. But the deflate specification
includes a section on implementing the algorithm without
infringing,[72]
and, of course, zlib itself follows that prescription. While these
things are never 100% certain unless and until they are tested in court,
developers and users can be reasonably confident that the use of zlib
and its implementation of the deflate algorithm is not subject to licensing
fees.
|