Text Chunks
The other item worth looking at is the interactive text-entry code. Most
windowing systems will have more elegant ways to read in text than I use
here, but even they should ensure that the entered text conforms to
the recommended format for PNG text chunks. PNG text is required to use
the Latin-1 character set; strictly speaking, that does not restrict the
use of control characters (character code 127 and any code below 32 decimal),
but in practice only line feeds (code 10) are necessary. The use of
carriage-return characters (code 13) is explicitly discouraged by the spec
in favor of single line feeds; this has implications for DOS, OS/2, Windows,
and Macintosh systems. Horizontal tabs (code 9) are discouraged as well
since they don't display the same way on all systems, but there are
legitimate uses for tabs in text. The section of the spec dealing with
security considerations implicitly recommends against the use of the escape
character (code 27), which is commonly used to introduce ANSI escape
sequences. Since these can include potentially malicious macros, encoders
should restrict the use of the escape character for the sake of overly
simple-minded decoders. That leaves codes 9, 10, 32-126, and 160-255
as valid from a practical standpoint, with use of the first (tab) discouraged.
Note that codes 128-159 are not valid Latin-1 characters, at least
not in the printable sense. They are reserved for specialized control
characters.
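Because carriage returns are discouraged in favor of single line feeds, text that originates on DOS, OS/2, Windows, or Macintosh systems may need its line endings converted before it is stored. The following is a minimal sketch of one way to do that; the function name normalize_newlines() is hypothetical and is not part of wpng.

```c
#include <stddef.h>

/* Hypothetical helper (not part of wpng): convert CR-LF pairs and lone
 * CRs to single line feeds, in place.  Returns the new string length. */
static size_t normalize_newlines(char *s)
{
    char *src = s, *dst = s;

    while (*src) {
        if (*src == '\r') {
            *dst++ = '\n';          /* CR or CR-LF becomes one LF */
            if (src[1] == '\n')
                ++src;              /* skip the LF of a CR-LF pair */
        } else {
            *dst++ = *src;
        }
        ++src;
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

Since the output is never longer than the input, the conversion can safely be done in the same buffer.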
The specification also recommends that lines in each text block be
no more than 79 characters long; I've chosen to restrict mine to 72
characters each, plus provide for one or two newline characters and a
trailing NUL. The spec does not specifically address the issue of the
final newline, but does require omitting the trailing NUL; logically,
one might extend that to include trailing newlines, so I have.
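Dropping the trailing newlines from a multiline block such as Description might be sketched as follows. The helper name strip_trailing_newlines() is an assumption made for illustration; it is not taken from wpng.

```c
#include <string.h>

/* Hypothetical helper: remove any newlines at the end of a text block
 * and return the resulting length.  Interior newlines are kept. */
static size_t strip_trailing_newlines(char *text)
{
    size_t len = strlen(text);

    while (len > 0 && text[len-1] == '\n')
        text[--len] = '\0';
    return len;
}
```

The returned length, which excludes the terminating NUL, is what would be passed along as the size of the text.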
Finally, I have arbitrarily allowed only six predetermined keywords:
Title, Author, Description, Copyright
(all officially registered), and E-mail and URL
(unregistered). Description is limited to nine lines, mainly
so that the little line-counter prompts for each line are single
digits and therefore line up nicely; the others are limited to one
line each. Thus the code for reading the Title keyword, once
the text buffer (textbuf) has been allocated, looks like this:
    do {
        valid = TRUE;
        p = textbuf + TEXT_TITLE_OFFSET;
        fprintf(stderr, " Title: ");
        fflush(stderr);
        if (FGETS(p, 74, keybd) && (len = strlen(p)) > 1) {
            if (p[len-1] == '\n')
                p[--len] = '\0';    /* remove trailing newline */
            wpng_info.title = p;
            wpng_info.have_text |= TEXT_TITLE;
            if ((result = wpng_isvalid_latin1((uch *)p, len)) >= 0) {
                /* cast through uch so codes above 127 print correctly
                 * even where plain char is signed */
                fprintf(stderr, " " PROGNAME " warning: character"
                  " code %u is %sdiscouraged by the PNG\n"
                  " specification [first occurrence was at"
                  " character position #%d]\n", (unsigned)(uch)p[result],
                  (p[result] == 27)? "strongly " : "", result+1);
                fflush(stderr);
    #ifdef FORBID_LATIN1_CTRL
                wpng_info.have_text &= ~TEXT_TITLE;
                valid = FALSE;
    #else
                if (p[result] == 27) {    /* escape character */
                    wpng_info.have_text &= ~TEXT_TITLE;
                    valid = FALSE;
                }
    #endif
            }
        }
    } while (!valid);
Aside from some subtlety with the keybd stream that I won't cover
here (it has to do with reading from the keyboard even if standard input is
redirected), the only part of real interest is the test for nonrecommended
Latin-1 characters, which is accomplished in the wpng_isvalid_latin1()
function:
    static int wpng_isvalid_latin1(uch *p, int len)
    {
        int i, result = -1;

        for (i = 0; i < len; ++i) {
            /* line feed, printable ASCII, and codes 160-255 are fine */
            if (p[i] == 10 || (p[i] > 31 && p[i] < 127) || p[i] >= 160)
                continue;
            /* remember the first bad character, but prefer an escape */
            if (result < 0 || (p[result] != 27 && p[i] == 27))
                result = i;
        }
        return result;
    }
If the function finds a control character that is discouraged by the PNG
specification, it returns the offset of the first one found. The only
exception is if an escape character (code 27) is found later in the string;
in that case, its offset is what gets returned. The main code then tests
for a non-negative value and prints a warning message. What happens next
depends on how the program has been compiled. By default, the presence of
an escape character forces the user to re-enter the text, but all of the
other discouraged characters are allowed. If the FORBID_LATIN1_CTRL macro
is defined, however, the user must re-enter the text whenever any of the
"bad" control characters is found. The default behavior results in
output similar to the following:
    Enter text info (no more than 72 characters per line);
    to skip a field, hit the <Enter> key.
      Title: L'Arc de Triomphe
      Author: Greg Roelofs
      Description (up to 9 lines):
        [1] This line contains only normal characters.
        [2] This line contains a tab character here: ^I
        [3]
        wpng warning: character code 9 is discouraged by the PNG
        specification [first occurrence was at character position #85]
      Copyright: We attempt an escape character here: ^[
        wpng warning: character code 27 is strongly discouraged by the PNG
        specification [first occurrence was at character position #38]
      Copyright: Copyright 1981, 1999 Greg Roelofs
      E-mail: roelofs@pobox.com
      URL: http://www.libpng.org/pub/png/pngbook.html
Note that the Copyright keyword had to be entered twice since the
first attempt included an escape character. The Description
keyword also would have had to be reentered if the program had been
compiled with FORBID_LATIN1_CTRL defined.
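The return-value rules and the two compile-time behaviors described above can be checked with a small stand-alone sketch. The checking function is repeated under the name isvalid_latin1() so that the example compiles on its own, and it treats codes 160-255 as valid, matching the ranges given earlier. The helper must_reenter() and its forbid_ctrl parameter are hypothetical; they stand in for the FORBID_LATIN1_CTRL conditional in the main code.

```c
#include <string.h>

typedef unsigned char uch;

/* Same logic as the function in the text, repeated so that this sketch
 * is self-contained; codes 160-255 are treated as valid here. */
static int isvalid_latin1(const uch *p, int len)
{
    int i, result = -1;

    for (i = 0; i < len; ++i) {
        if (p[i] == 10 || (p[i] > 31 && p[i] < 127) || p[i] >= 160)
            continue;
        if (result < 0 || (p[result] != 27 && p[i] == 27))
            result = i;
    }
    return result;
}

/* Hypothetical helper: nonzero if the user would have to re-enter the
 * text.  forbid_ctrl plays the role of the FORBID_LATIN1_CTRL macro. */
static int must_reenter(const char *s, int forbid_ctrl)
{
    int pos = isvalid_latin1((const uch *)s, (int)strlen(s));

    if (pos < 0)
        return 0;                   /* nothing discouraged was found */
    return forbid_ctrl || s[pos] == 27;
}
```

With these definitions, a string containing only a tab is accepted by default but rejected when forbid_ctrl is nonzero, while a string containing an escape character is rejected either way.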
Returning to more mundane issues, wpng_info is the struct by which
the front end communicates with the PNG-writing back end. It is of type
mainprog_info, and it is defined as follows:
    typedef struct _mainprog_info {
        double gamma;
        long width;
        long height;
        time_t modtime;
        FILE *infile;
        FILE *outfile;
        void *png_ptr;
        void *info_ptr;
        uch *image_data;
        uch **row_pointers;
        char *title;
        char *author;
        char *desc;
        char *copyright;
        char *email;
        char *url;
        int filter;
        int pnmtype;
        int sample_depth;
        int interlaced;
        int have_bg;
        int have_time;
        int have_text;
        jmp_buf jmpbuf;
        uch bg_red;
        uch bg_green;
        uch bg_blue;
    } mainprog_info;
As in the previous programs, we use the abbreviated typedefs uch,
ush, and ulg in place of the more unwieldy
unsigned char, unsigned short, and unsigned long,
respectively. The title element is simply a pointer into the
text buffer, and the struct contains similar pointers for the other five
keywords. have_text is more than a simple Boolean (TRUE/FALSE)
value, however. Because the user may not want all six text chunks, the
program must keep track of which ones were provided with valid data.
Thus, have_text is a set of bit flags, one per keyword, and ORing
TEXT_TITLE into it sets the bit corresponding to the Title keyword--but
only if the length of the entered string is greater than one.
The user indicates that a field
should be skipped by hitting the Enter key, and the fgets() function
includes the newline character in the string it returns; thus a string of
length one contains nothing but the newline.
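The bit-flag bookkeeping can be illustrated with a short sketch. The numeric values and the names other than TEXT_TITLE are assumptions chosen for this example (any set of distinct single-bit masks would do), and count_text_chunks() is a hypothetical helper, not a function from wpng.

```c
/* Assumed mask values: one distinct bit per keyword.  Only TEXT_TITLE
 * appears in the text; the other names are illustrative. */
#define TEXT_TITLE   0x01
#define TEXT_AUTHOR  0x02
#define TEXT_DESC    0x04
#define TEXT_COPY    0x08
#define TEXT_EMAIL   0x10
#define TEXT_URL     0x20

/* Hypothetical helper: count how many keywords have valid text. */
static int count_text_chunks(unsigned have_text)
{
    int n = 0;

    while (have_text) {
        n += (int)(have_text & 1u);   /* add the lowest bit */
        have_text >>= 1;
    }
    return n;
}
```

Setting a flag uses |= with the mask, as in the Title code; clearing it after an invalid entry uses &= with the complemented mask; and testing it is a simple (have_text & TEXT_TITLE).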