Overview of Image Properties
Before we dive right into some of PNG's more interesting features, it might
be helpful to introduce (or review) some essential image concepts and take a
quick look at a few older image formats. Those who are already familiar
with the most basic features of computer images can skip directly to the
next section.
There are two main formats for computer images: raster and vector. Raster
images are based on colored dots, which are almost always stored in a
rectangular array and are usually packed so close together that individual
dots are no longer distinguishable. Vector images are based on lines,
circles, and other "primitive" elements that typically cover a sizable area
and are easily distinguishable from one another. Many images can be
represented in either format; indeed, any vector-based image can be
approximated by a raster image (lots of dots), and one could easily (though
tediously) simulate a raster image in vector format by converting each dot
to a tiny box.
The whole point of having two classes of image formats--and, indeed,
of having numerous individual file formats--is implicit in the old saying,
"Use the best tool for the job." Vector formats are appropriate for simple
graphics and text, such as corporate logos, and their advantage is that they
can be extremely compact and yet maintain perfect sharpness regardless of
the size at which they are reproduced. But with the exception of pen-based
plotters and some ancient vector-based displays, the end result is almost
always a raster image.
For that reason, plus the fact that raster image formats are more common--and
because PNG is one of them--we'll take a closer look at raster features.
As I just noted, a raster image is composed of an array of dots, more
commonly referred to as pixels (short for picture elements).
One generally refers to a computer image's dimensions in terms of pixels;
this is also often (though slightly imprecisely) known as its
resolution. Some common image sizes are 640 × 480, 800 × 600,
and 1024 × 768 pixels, which also happen to be common dimensions for
computer displays.
In addition to horizontal and vertical dimensions, a raster image is
characterized by depth. The deeper the image, the more colors (or shades
of gray) it can have. Pixel depths are measured in bits, the tiniest
units of computer storage; a 1-bit image can represent two colors (often,
though not necessarily, black and white), a 2-bit image four colors, an
8-bit image 256 colors, and so on. To calculate the raw size of the
image data before any compression takes place, one needs only to know that
8 bits make a byte. Thus a 320 × 240, 24-bit image has 76,800 pixels,
each of which is 3 bytes deep, so its total uncompressed size is
230,400 bytes.
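The arithmetic above can be sketched in a few lines of Python. The function name raw_image_size is my own invention for illustration, and the sketch assumes the pixels are packed with no per-row padding, which real file formats may or may not add.

```python
def raw_image_size(width, height, bits_per_pixel):
    """Return the uncompressed size of the image data in bytes.

    Assumes pixels are packed back to back with no per-row padding.
    """
    total_bits = width * height * bits_per_pixel
    # 8 bits make a byte; round up so a partial byte still counts.
    return (total_bits + 7) // 8


# The 320 x 240, 24-bit example from the text.
print(raw_image_size(320, 240, 24))  # 230400
```

The same function shows how quickly depth matters: a 640 × 480 image needs 38,400 bytes at 1 bit per pixel but 307,200 bytes at 8 bits per pixel.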
I'll return to the topic of compression in just a moment; first, let's take a
closer look at the precise relationship between pixels and colors. Within the
broad class of raster formats, there are three main image types: indexed-color,
grayscale, and truecolor. The indexed-color method, also known as
pseudocolor, colormapped, or palette-based, stores a copy of each color value
needed for the image in a palette. The main image is then composed of index
values referring to different entries in the palette. For example, imagine an
image composed entirely of red, white, and blue pixels; the palette would have
three entries corresponding to these colors, and each pixel would be
represented by the value 0, 1, or 2. (The natural starting point for numbers
on a computer is 0, not 1.) Since an image 2 bits deep
can represent up to four colors, each pixel in this example would require
only 2 bits, even though the precise shades of red, white, and blue might
ordinarily require 24 bits each.
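As a rough sketch of the red, white, and blue example, the following Python models a palette and an indexed image with ordinary lists. The names palette, indexed_pixels, bits_needed, and expand_to_rgb are hypothetical, chosen only for illustration; no particular file format stores its data this way.

```python
# Hypothetical palette: each entry is a 24-bit (R, G, B) color.
palette = [
    (255, 0, 0),      # index 0: red
    (255, 255, 255),  # index 1: white
    (0, 0, 255),      # index 2: blue
]

# A tiny 4 x 2 image stored as palette indices, one list per row.
indexed_pixels = [
    [0, 1, 2, 1],
    [2, 1, 0, 1],
]


def bits_needed(num_colors):
    """Smallest pixel depth (in bits) able to index num_colors entries."""
    return max(1, (num_colors - 1).bit_length())


def expand_to_rgb(pixels, colors):
    """Replace every index with the palette color it refers to."""
    return [[colors[index] for index in row] for row in pixels]


depth = bits_needed(len(palette))
rgb = expand_to_rgb(indexed_pixels, palette)
print(depth)      # 2
print(rgb[0][2])  # (0, 0, 255)
```

Each pixel here costs 2 bits instead of 24, at the price of storing the three-entry palette once.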
Grayscale and truecolor images are simpler in concept; the bits used by
each pixel correspond directly to shades of gray or to colors. In a
grayscale image of a particular pixel depth, a 0 pixel usually
(though not always) means black, while the maximum value at that depth
corresponds to white. Intermediate pixel values are smoothly interpolated
to shades of gray, though this is often not as straightforward as it might
sound--gamma correction, a way of adjusting for differences in
computer display systems, comes in here. I'll give a brief overview of
gamma correction later in this chapter, and I'll discuss it at length in
Chapter 10, "Gamma Correction and Precision Color";
for now, I'll merely note that it is a Good Thing, and image formats that
provide support for it can be viewed on different platforms without appearing
too light on one and too dark on another.
A truecolor image uses three separate values for each pixel,
corresponding to shades of red, green, and blue. Such images are often
also referred to as RGB. In Chapter 8, "PNG Basics", I'll talk
about human vision and the reasons why mixtures of just three colors can
appear to reproduce all colors, or at least a sufficiently large percentage
of them that one need not quibble over the difference. I'll also mention
some common alternatives to the RGB color space. To be considered truly
truecolor instead of merely "high color," an image must contain at least
8 bits for each of the three colors in each pixel; thus, at a minimum, a
truecolor image has a depth of 24 bits.
Two other concepts--samples and channels--are handy when speaking of images,
and RGB images are a good way to illustrate these concepts. A sample is one
component of a single color value. For example, each pixel in a truecolor
image consists of three samples: red, green, and blue. If the image is
24 bits deep, then each sample is 8 bits deep. A 256-shade grayscale image
also has 8-bit samples, which means that one can speak of the "bits per
sample" for either image type to indicate the level of precision of each
shade or color. Note that I have been careful to distinguish between
sample depth and pixel depth. The two terms are directly related
in grayscale and truecolor images, but in indexed-color images they can be
independent of each other. This is because the sample depth refers to the
color values in the palette, while the pixel depth refers to the index values
of each pixel (which reference the palette colors). To put it more concretely,
the color values in the palette are usually 24-bit values (8 bits per
sample), but the pixel indices are usually 8 bits or less.
Our previous red, white, and blue example used only two bits per pixel.
A channel, on the other hand, refers to the collection of all
samples of a given type in an image--for example, the green
components of every RGB pixel. Thus a truecolor image has three
channels, while a grayscale image has only one. (Ordinarily one does
not speak of a palette-based image as having channels.) And when
discussing transparency, yet another channel type is often used: the
alpha channel. This is a special kind of channel in that it does
not provide actual color information but rather a level of
transparency for each pixel--or, more precisely, a level of
opacity, since it is most common for the maximum sample value to
indicate that the pixel is completely opaque and for zero to indicate
complete transparency. A truecolor image with an alpha channel is
often called an RGBA image; grayscale images with alpha channels are
rarer and don't have a special abbreviation (although I may refer to
them as "gray+alpha").
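To make the opacity interpretation concrete, here is a minimal sketch of drawing one RGBA pixel over an opaque background pixel. The function composite_over is a hypothetical helper, and the sketch assumes 8-bit samples and a simple linear blend that ignores gamma; it is one common way to use an alpha sample, not a description of what any particular viewer does.

```python
def composite_over(fg_rgba, bg_rgb, max_value=255):
    """Blend one RGBA pixel over an opaque RGB background pixel.

    Alpha is treated as opacity: max_value means fully opaque and
    0 means fully transparent. The blend is linear and ignores gamma.
    """
    r, g, b, alpha = fg_rgba
    return tuple(
        # Weighted average of foreground and background, rounded to nearest.
        (sample * alpha + back * (max_value - alpha) + max_value // 2) // max_value
        for sample, back in zip((r, g, b), bg_rgb)
    )


blue_background = (0, 0, 255)
print(composite_over((255, 0, 0, 255), blue_background))  # (255, 0, 0)
print(composite_over((255, 0, 0, 0), blue_background))    # (0, 0, 255)
```

A fully opaque red pixel hides the background, a fully transparent one lets the background show through unchanged, and intermediate alpha values produce a mixture of the two.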
Palette-based images almost never have a full alpha channel, but another
type of transparency is possible. Rather than associate alpha
information with every pixel, one can instead associate it with specific
palette entries. By far the most common approach is to specify that a single
palette entry represents complete transparency. Then when the image is
displayed against some sort of background, any pixel whose index refers to
this particular palette entry will be replaced by the background at the pixel's
location--or perhaps the pixel simply will not be drawn in the first place.
But there is no conceptual requirement that only one palette entry can have
transparency, nor that it must be fully transparent. As we'll see shortly,
PNG effectively allows any number of palette entries to have any level of
transparency.
While we're on the subject of colormapped images, two other concepts are worth
mentioning: quantization and dithering. Suppose one has a 24-bit truecolor
image, but it must be displayed on a 256-color, palette-based display.
Since truecolor images typically use anywhere from 10,000 to 100,000
colors, the conversion to a colormapped image will involve substituting many
of the color values with a much smaller range of colors. This process is
known as quantization. Because the resulting images have such a limited
palette of colors available to them, they often are unable to represent fine
color gradients such as the different shades of blue seen in the sky or the
range of facial tones in a softly lit portrait. One way around this is to
dither the image, which is a means of mixing pixels of the available
colors together to give the appearance of other colors (though generally at
the cost of some sharpness). For example, a checkerboard pattern of
alternating red and yellow pixels might appear orange. This effect is
perhaps best illustrated with an example.
Figure 1-1 shows a truecolor
photograph (here rendered in grayscale) together with two 256-color versions
of the same image--one simply quantized to 256 colors and the other both
quantized and dithered. The insets give a magnified view of one region,
showing the relative effects of the two procedures.
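The difference between the two procedures can also be sketched numerically. The Python below quantizes a single row of gray pixels to just two levels, first directly and then with a simple one-dimensional error diffusion, which is only one of many dithering techniques and not necessarily the one used to produce the figure. The function names are my own.

```python
def quantize(value, levels):
    """Return the available level closest to value."""
    return min(levels, key=lambda level: abs(level - value))


def quantize_row(row, levels):
    """Quantize every pixel independently."""
    return [quantize(value, levels) for value in row]


def dither_row(row, levels):
    """Quantize a row, carrying each pixel's rounding error to the next pixel."""
    out = []
    error = 0
    for value in row:
        target = value + error
        chosen = quantize(target, levels)
        out.append(chosen)
        error = target - chosen  # what this pixel could not represent
    return out


row = [100] * 8    # a flat, dark-gray strip
levels = [0, 255]  # only black and white are available

plain = quantize_row(row, levels)
dithered = dither_row(row, levels)
print(plain)     # [0, 0, 0, 0, 0, 0, 0, 0]
print(dithered)  # [0, 255, 0, 255, 0, 0, 255, 0]
```

Plain quantization turns the whole strip black, whereas the dithered strip mixes black and white pixels whose average brightness is close to the original gray, at the cost of a visible pattern.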
I'll round out our review of image properties and concepts with
a quick look at compression. There are really only two flavors: lossless
and lossy. Lossless compression preserves the exact image data down
to the last bit, so that what you get out after uncompressing is exactly the
same as what you started with. In contrast, lossy compression throws
away some of the data in return for much better compression ratios. For
photographic images, the best lossless methods may only manage a factor of
two or three in compression, whereas lossy methods typically achieve anywhere
from 8 to 25 times reduction with very little visible loss of quality.
I'll discuss the details of compression, particularly the lossless variety,
at greater length in Chapter 9, "Compression and Filtering".
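The defining property of lossless compression, that the data come back bit for bit, is easy to check with Python's standard zlib module. The image data below are synthetic (every row is the same gradient), so the compression achieved says nothing about what to expect from photographs; the point of the sketch is only the round trip.

```python
import zlib

# Synthetic stand-in for raw image data: a 320 x 240, 24-bit image
# in which every row is the same horizontal red-to-blue gradient.
width, height = 320, 240
row = bytes(
    sample
    for x in range(width)
    for sample in (x % 256, 0, 255 - x % 256)  # R, G, B samples
)
raw = row * height

compressed = zlib.compress(raw, 9)  # 9 = maximum compression effort
restored = zlib.decompress(compressed)

print(len(raw), len(compressed))
print(restored == raw)  # True: nothing was lost
```

A lossy method would fail the final comparison by design, since it deliberately discards some of the original data.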
Finally, in describing the advantages of PNG, I will necessarily compare
it with some older image formats. Although there are literally
hundreds of different formats, we will be most concerned with just three:
GIF, JPEG, and TIFF. GIF, short for the Graphics Interchange Format,
and JPEG, short for the Joint Photographic Experts Group (which defined
the format), are both very common image types often seen on the
Web. TIFF, on the other hand, short for Tagged Image File Format, is
almost never used on the Web but is quite popular as an output format from
scanners and as an intermediate "save format" while editing images. I'll
touch on the properties of each of these formats as we go.