The PNG Guide is an eBook based on Greg Roelofs' book, originally published by O'Reilly.
Transfer Functions and Gamma

To understand the solutions, one must first become acquainted with the problems. I won't attempt to cover the subject in detail; an entire book could be written on it--and, indeed, Charles Poynton has done just that. But I will give a brief overview of the main issues and explain how some of the features of the Portable Network Graphics format fit into the picture. I may even mention some physics and an equation or three, but you shouldn't need a technical degree to be able to understand the basic ideas.

The ultimate goal of the entire process is for the light that leaves
your monitor to produce the same perception as the light that
originally entered the camera would have if it had entered your
eyeballs instead. Alternatively, for images created with an
image-editing application, the goal is for your display to produce the
same perception (and basically the same light) as the artist's monitor
produced while he was creating the image. Clearly this involves both
the encoding process, performed by the editor or conversion program that writes the
image file, and the decoding process, performed by the viewer or browser that reads
and displays the image, as well as aspects of human physiology and
psychology. We'll refer to the combination of the encoding and
decoding processes as the end-to-end process. PNG's role
is to provide a way to store
not only the image samples (that is, the color components of each pixel)
but also the information needed to relate those samples to the desired output.
Storing the image samples themselves is easy. The tricky part is figuring out the two additional pieces of critical information: when encoding, how the original light is related to the samples, and when decoding, how image samples are related to the display's actual output (i.e., the reproduced light).

The fundamental problem is that working with and storing light is nearly impossible; instead, light is typically converted to electrical signals. Indeed, there are several more conversions along the way, each of which potentially modifies the data in some way.

As a concrete example, in an image captured via a video or electronic camera, light entering the camera is first converted to analog voltages, which are in turn converted to other voltages representing digital ones and zeros. These are stored in an image file as magnetic fields on a hard disk or as tiny pits on a CD-ROM. For display, the digital data in the file is optionally modified by the viewing application (this is where gamma correction and other tweaking are performed), then possibly converted again according to a lookup table (LUT), then generally converted by a graphics card ("frame buffer") back to an analog electrical signal.[77] This analog signal is then converted by the monitor's electronics into a directed beam of electrons that excites various phosphors at the front of the monitor and thereby is converted back into light. Clearly, there is a bit of complexity here (no pun intended).
But all is not lost! One can simplify this model in several ways. For example, conversions from analog to digital and from digital to analog are well behaved--they introduce minimal artifacts--so they can be ignored. Likewise, the detailed physics of the monitor's operation, from electrical signal to high-voltage electric fields to electrons to light, also can be ignored; instead, the monitor can be treated as a black box that converts an electrical signal to light in a well-defined way.

But the greatest simplification is yet to come. Each of the conversions that remain, in the camera, lookup table, and monitor, is represented mathematically by something called a transfer function. A transfer function is nothing more than a way to describe the relationship between what comes out of the conversion and what went into it, and it can be a fairly complex little beastie. The amazing thing is that each of the preceding conversions can almost always be approximated rather well by a very simple transfer function:

$$\text{output} = \text{input}^{\text{exponent}}$$

where the output and input values are scaled to the range between 0 and 1. The two scaling factors may be different, even if "input" and "output" both refer to light; for example, monitors are physically incapable of reproducing the brightness of actual daylight. Even better, since the output of one conversion is the input to the next, these transfer functions combine in a truly simple fashion:

$$\text{final output} = \left(\left(\text{input}^{\text{exponent}_1}\right)^{\text{exponent}_2}\right)^{\text{exponent}_3} = \text{input}^{\,\text{exponent}_1 \times \text{exponent}_2 \times \text{exponent}_3}$$

This example happens to use three transfer functions, but the relation holds for any number of them. And the best part of all is that our ultimate goal, to have the final, reproduced output light be perceived the same as the original input light, is equivalent to the following trivial equation:

$$\text{exponent}_1 \times \text{exponent}_2 \times \text{exponent}_3 = \text{constant}$$

Or in English: all of the exponents, when multiplied together, must equal a single, constant number.
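The way power-law transfer functions combine can be checked numerically. What follows is only a minimal sketch: the function names `transfer` and `chain` are made up for this illustration, and the three exponents are arbitrary values chosen to make the arithmetic easy, not measurements of any real camera, lookup table, or monitor.

```python
import math


def transfer(value, exponent):
    """Apply a simple power-law transfer function to a value scaled to 0..1."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be scaled to the range 0..1")
    return value ** exponent


def chain(value, exponents):
    """Feed the output of each conversion into the next one, in order."""
    for exponent in exponents:
        value = transfer(value, exponent)
    return value


# Hypothetical exponents for three successive conversions (assumed values).
exponents = [0.5, 1.0, 2.5]
sample = 0.25

# Applying the conversions one after another...
step_by_step = chain(sample, exponents)

# ...matches a single conversion whose exponent is the product of the three.
combined = sample ** math.prod(exponents)

assert math.isclose(step_by_step, combined)
```

Because only the product matters, the order in which the conversions are applied makes no difference to the final result in this simplified model.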
The value of the constant depends on the environments in which the image is captured and viewed, but for movies and slides projected in a dark room, it is usually around 1.5, and for video images shown in typical television or computer environments, it is usually about 1.14. Since the viewing application has the freedom to insert its own conversion with its own exponent, it could, in principle, ensure that the equation holds--if it knew what all the remaining exponents were. But in general, it lacks that knowledge. We'll come back to that in a moment.

In practice, images may be created with any number of tools: an
electronic camera; the combination of a classic film-based camera,
commercial developing process, and electronic scanner; an
image-editing application; or even a completely artificial source such
as a ray-tracing program, VRML browser, or fractal generator. To a
viewing application, a file is a file; there is rarely any obvious
We'll come back and deal with encoders in a little while. For a decoder there are only two cases: either the file contains the additional information about how the samples are related to the desired output, or it doesn't. In the latter case, the decoder is no worse off than it would have been when dealing with a GIF or JPEG image; it can only make a guess about the proper conversion, which in most cases means it does nothing special. But the case in which the file does contain conversion information is where
things finally get interesting. Many types of conversion information are
possible, but the simplest is a single number that is usually referred to as
gamma.
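
The decoder's two cases can be sketched in a few lines. This is an illustration under stated assumptions, not a description of how any particular viewer or the PNG format itself works: the function name `viewer_exponent` is hypothetical, the "other" exponents passed in are invented numbers, and the sketch assumes that every remaining conversion really is a simple power law as described earlier.

```python
import math
from typing import Optional, Sequence


def viewer_exponent(target: float,
                    other_exponents: Optional[Sequence[float]]) -> float:
    """Return the exponent a viewer would insert for its own conversion.

    target          -- the desired end-to-end constant (for example, about
                       1.14 for a typical computer viewing environment)
    other_exponents -- exponents of all the other conversions, or None if
                       the file carries no conversion information
    """
    if other_exponents is None:
        # No information in the file: do nothing special.
        return 1.0
    product = math.prod(other_exponents)
    if product <= 0:
        raise ValueError("exponents must be positive")
    # Choose the exponent so that the product of all exponents equals target.
    return target / product


# Assumed values: two other conversions with exponents 0.5 and 2.5.
known = viewer_exponent(1.14, [0.5, 2.5])
assert math.isclose(0.5 * 2.5 * known, 1.14)

# Without conversion information the viewer leaves the samples alone.
assert viewer_exponent(1.14, None) == 1.0
```

Returning 1.0 in the unknown case mirrors the "does nothing special" behavior described above, since raising a sample to the power 1 leaves it unchanged.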