StreamIO is a small resource providing some convenience procedures for file and stream I/O built on the existing facilities in LanguageLevel 2 and later PostScript.
The most significant is extfilter, which is analogous to PostScript's own filter, but allows new filters to be defined by ordinary PostScript code. The set of standard filters in PostScript can be enumerated (using resourceforall in the Filter category), but cannot be extended by any means in standard PostScript. Reportedly, some products do permit new filters to be added, but by product-specific means outside of PostScript proper. With extfilter, the same can be achieved in portable code.
StreamIO supplies a small set of predefined filters useful for doing slightly more elaborate plumbing in PostScript programs. One of the motivating needs for StreamIO was a portable, systematic way to handle the common task of peeking part way into an inline data stream and then rereading it from the start. The task arises, for example, in a procedure to grovel the dimensions out of an inline image before placing it.
The ReusableStreamDecode filter introduced by Adobe in LanguageLevel 3 would seem to be an obvious fit, but the fit is disappointing. Besides being available only in LanguageLevel 3, it defers so much behavior to the choice of implementation as to leave very little to rely on. It may (or may not) pre-read the entire inline data stream, even when AsyncRead is requested, and that is a large resource demand, with a risk of resource exhaustion, when typically only the header of an image needs to be groveled.
A simple solution in a few steps is possible with the filters provided in StreamIO:
currentglobal true setglobal 1 array exch setglobal
).
The filter will honor the queue's original allocation mode.
StreamIO is a ProcSet resource. To make it available to your own code, include in the setup section of your file:
/net.anastigmatix.StreamIO /ProcSet findresource
The findresource will succeed, leaving a dictionary on the operand stack, if you have made the StreamIO resource file available in any of these ways:
You include the resource file directly in your file's prolog, ahead of the findresource (which belongs in the setup section).
Your document manager software supports the %%DocumentNeededResources and %%IncludeResource DSC comments, you include these comments at the right position in your file to specify that it needs net.anastigmatix.StreamIO, your document manager software is configured to automatically insert needed resources in files being printed, and you have put the resource files where your document manager can find them.
StreamIO relies on a few other resources, and you will need those files also. If you use the first method, directly including resources in your file's prolog, the prolog has to contain all of the needed resources, in any order so long as no resource comes before one it depends on, and categories come before resources that belong in them. Any of the other methods should just work as long as all the files are where they need to be. These are the resources you will need:
Resource | Category | Description |
---|---|---|
net.anastigmatix.MetaPre | ProcSet | Staged-programming extensions for PostScript |
net.anastigmatix.filter | Category | Category to contain filter resources usable with StreamIO. |
additional filters | net.anastigmatix.filter | The net.anastigmatix.filter resource category is not limited to the predefined filters described here. New filter types can be defined in your own PostScript code or made available as additional resources of type net.anastigmatix.filter, and would have to be downloaded or placed in a document prolog before use. The filters described below, however, are included in the net.anastigmatix.filter category without any additional download. |
net.anastigmatix.StreamIO | ProcSet | The main attraction. |
The resource files are in a compact form. That is for efficiency, not to keep you from viewing them; there is a script for that on the resource packaging page.
The StreamIO dictionary may be placed on the lookup stack (with begin) for convenient access to the definitions in it, without the bother of get and exec. The dictionary is read-only, so before creating any definitions, you will want either userdict begin or your own dict begin so that you have a writable dictionary on top of the dictionary stack.
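A minimal sketch of that arrangement, assuming the resource is already available to findresource. The key name scratch and the dictionary size are illustrative only and are not part of StreamIO:

```postscript
% Put the read-only StreamIO dictionary on the dictionary stack ...
/net.anastigmatix.StreamIO /ProcSet findresource begin
% ... then a writable dictionary of your own on top of it.
10 dict begin
  /scratch 256 string def   % hypothetical definition; lands in the writable dict
  % code that uses StreamIO procedures by name goes here
end   % pop the writable dictionary
end   % pop the StreamIO dictionary
```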
This section describes the contents of the read-only dictionary that is returned by /net.anastigmatix.StreamIO /ProcSet findresource.
Reads all contents from the current position of src to the end, writing to tgt. If src or tgt is a string rather than a file object, it will be taken as a file name and opened for reading or writing, respectively. copyfile does not close either file on completion, which is reasonable when file objects are used (since the calling code can close them at the appropriate time) and also convenient with special named files, as in
f (%stdout) copyfile
to copy from f to standard output (which ought not to be closed). The file name form should not be used in other cases where the program needs control over when the file is closed.
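As a small illustration of the file-object form, the sketch below wraps a sample string in a SubFileDecode filter so that copyfile receives a file rather than a name. The name f and the sample text are placeholders chosen for the example:

```postscript
/net.anastigmatix.StreamIO /ProcSet findresource begin
userdict begin
% A string operand would be taken as a file name, so wrap the sample text in a
% SubFileDecode filter to obtain a file object first.
/f (hello from copyfile\n) 0 () /SubFileDecode filter def
f (%stdout) copyfile   % copies to standard output; neither file is closed
f closefile            % the caller closes the source when done with it
end end
```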
Creates and returns a filtered file, analogously to the native filter operator, except that only the dictionary form for supplying filter parameters (as used for some native filters in LanguageLevel 2, and usable for all of them in LanguageLevel 3) is accepted: the dictionary is mandatory, never optional. The type of src|tgt is typically a file, string, or procedure, as for the native filters, but the documentation for some special-purpose filters may specify other types for src or tgt.
The filter type may be specified by name, referring to a resource defined in the category net.anastigmatix.filter, just as native filter names are found in category Filter. Unlike Filter, however, which is an implicit category, net.anastigmatix.filter is an ordinary resource category and new filter types can be constructed with pure PostScript code and registered with defineresource, as described below.
As a convenience in development, the filter may also be given as a proc in the form appropriate to register in the net.anastigmatix.filter category, without registering it.
If the filter is given as a name and not found in the net.anastigmatix.filter category, it is assumed to be a standard filter known to filter. This permits extfilter to be used to set up any filter whether standard or user-defined, and allows for user-defined filters to supersede standard ones or provide them if the implementation does not. For example, /FlateDecode extfilter could work either on a Level 3 interpreter with native FlateDecode, or on an older interpreter with a PostScript implementation of FlateDecode downloaded in the net.anastigmatix.filter category.
Like filter, extfilter returns a globally-allocated file object if, and only if, the src|tgt and any composite objects retained from dict are globally allocated, without regard to the current allocation mode.
Decoding filters created with extfilter do not have a CloseSource parameter, as there is no way in pure PostScript to implement it.
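The fall-through to standard filters might be exercised as in the following sketch. It assumes that no resource named ASCIIHexDecode has been defined in net.anastigmatix.filter, and that an empty dictionary is acceptable for a standard filter that takes no parameters:

```postscript
/net.anastigmatix.StreamIO /ProcSet findresource begin
% Operand order is src|tgt, dict, name. ASCIIHexDecode is a standard filter, so
% with no resource of that name in net.anastigmatix.filter, extfilter passes the
% request on to the native filter operator.
(48656C6C6F0A>) << >> /ASCIIHexDecode extfilter   % hex for "Hello" and a newline
(%stdout) copyfile
end
```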
Efficiently skips and discards the next nbytes bytes from src, or to the end of src if fewer than nbytes bytes remain.
Efficiently skips and discards data from src through the end of the next occurrence of string, or to the end of src if string does not occur.
Executes proc with a writable file object on top of stack, capturing everything proc writes on the file in a queue of strings, each string no larger than the int argument. Returns the queue and the total byte count written.
As for hold but returns a readable file instead of a queue of strings.
As for hold but returns a string containing everything written by proc.
Produces a readable file with zero bytes available; as with the null device on some operating systems, there can be times it is convenient to supply such a file to an existing procedure.
Produces a writable file that accepts and discards all data written to it; as with the null device on some operating systems, there can be times it is convenient to supply such a file to an existing procedure.
The xxfile procedures are equivalent to the file operator with the corresponding access-mode string on the stack. They simplify writing code free of writable objects.
These procedures push the corresponding standard file objects. They cannot simply be the standard file objects because they are defined in the resource dictionary, which may be global, and the file objects can be local. You have to execute the procedures to get the file objects. Still, //stderr exec is cleaner than (%stderr) (w) file when you want to write code free of writable objects.
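A short sketch of the //stderr exec idiom in a procedure. The procedure name warn is hypothetical, and the StreamIO dictionary must be on the dictionary stack when the procedure body is scanned so that //stderr can be resolved:

```postscript
/net.anastigmatix.StreamIO /ProcSet findresource begin
userdict begin
% //stderr is replaced at scan time by the StreamIO procedure; exec runs it to
% push the file object each time warn is called.
/warn {                      % string warn -
  //stderr exec exch writestring
  //stderr exec flushfile
} bind def
(something looks wrong\n) warn
end end
```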
The following filters are predefined in the net.anastigmatix.filter resource category and always available to extfilter.
This filter requires not a single src but an array of sources, that is, an array whose elements are any combination of files, strings, or procedures acceptable as normal filter sources. As the resulting file is read, each source in the array supplies data until exhausted, and the resulting file reaches EOD when EOD is reached for the last source in the array.
The filter dictionary has a single parameter BufferSize which is the size in bytes of the buffer to be used when reading from any element in the source array that has file type. There is no default and the parameter must always be supplied, but the buffer will not be allocated until an element of file type is encountered.
The filter's progress through the array of sources will not be disturbed by save and restore, even if the array is allocated in local VM. The effect of save/restore on any individual source in the array, however, depends on that source.
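For illustration, a sketch that concatenates three string sources into one readable file and copies it to standard output. The sample strings and the buffer size of 256 are arbitrary choices:

```postscript
/net.anastigmatix.StreamIO /ProcSet findresource begin
[ (first, ) (second, ) (third\n) ]   % array of sources, read in order
<< /BufferSize 256 >>                % mandatory, even though no source here is a file
/SourceArrayDecode extfilter
(%stdout) copyfile
end
```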
This filter has the same behavior as SourceArrayDecode but, instead of an array of sources, it accepts a queue of them, in the form used with the enq and deq operators in net.anastigmatix.MetaPre.
As for SourceArrayDecode, there is one mandatory parameter BufferSize.
An empty queue is created simply by 1 array. A source s is added to an existing queue q in one of the following ways:
    s q 2 array enq astore pop
    s q 2 array //enq exec astore pop
The second form is recommended in resource code intended not to depend on the MetaPre resource dictionary being on the dictionary stack at run time.
The queue should be allocated in global VM if there is any possibility that save/restore will be used while the filter is being read. The filter's progress through a locally-allocated queue can be altered by restore and, as the effect cannot be synchronized with the interpreter's own buffering, the result will not be predictable or useful.
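The following sketch builds a small queue and reads it back through the filter. It assumes that the MetaPre dictionary obtained with findresource supplies enq, uses the first of the two enqueueing forms given earlier, and keeps the queue in local VM only because no save or restore occurs while the filter is read. The name q and the sample strings are placeholders:

```postscript
/net.anastigmatix.MetaPre /ProcSet findresource begin    % supplies enq
/net.anastigmatix.StreamIO /ProcSet findresource begin
userdict begin
/q 1 array def                       % empty queue (local VM; no save/restore here)
(alpha ) q 2 array enq astore pop    % add two string sources to the queue
(beta\n) q 2 array enq astore pop
q << /BufferSize 256 >> /SourceQueueDecode extfilter
(%stdout) copyfile
end end end
```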
This filter copies all data that it reads from src into the file given as the TapTarget parameter. Its parameter dictionary may contain the following parameters.
Key | Type | Semantics |
---|---|---|
BufferSize | integer | The size of buffer to be allocated and used if src is of file type. No buffer is allocated if src is a string or procedure, but the parameter must still be present. |
CloseTarget | boolean | If true, the file given as TapTarget will be closed as soon as EOD is reached on src. The filter does not flush or close the TapTarget file if this parameter is false, or if src is not read all the way to EOD. In those cases, the calling program must be sure to retain a copy of the file object given with TapTarget and explicitly flush or close it before assuming that all data read from src can be recovered from the resulting file. This parameter has a default of false. |
TapTarget | file | A file to which all data read from src is to be written. If CloseTarget is false or if the filter is not read all the way to EOD on src, the program must be sure to flush or close this file before assuming that all data read from src can be recovered. This parameter must be present. |
This filter requires a queue as its tgt operand. See SourceQueueDecode for details on constructing a queue. 1 array creates an empty queue.
Data written to the filtered file will be accumulated in string objects placed on the queue. As with any encoding filter, be sure to flush or close it before assuming that the queue contains all data written. The filter honors the allocation mode of the queue, and uses the same mode for allocating strings to place on it.
The parameter dictionary may have the following entries.
Key | Type | Semantics |
---|---|---|
BufferSize | integer | The maximum size of any individual string to be placed on the queue. There is no default and this parameter must be present. |
Count | array | If this parameter is present, it must have length at least 1 and its first element must be an int. The int will be incremented by the number of bytes written through the filter. Only after a flush or close can its value be relied on. |
This filter requires not a single tgt but an array whose elements may be any combination of files, strings, or procedures acceptable as data targets to ordinary filters. All data written to the filtered file will be replicated and written to all of the targets. Its parameter dictionary has a mandatory BufferSize limiting the maximum size of any individual write to the underlying targets, and the standard CloseTarget parameter which, if true, ensures that the underlying targets will all be closed when the filtered file is.
The following filters are not bundled in the net.anastigmatix.filter category resource, but are available as separate resource files.
This filter provides support for reading and processing PostScript input that conforms to TN 5001, the Document Structuring Convention. It maintains a nesting level that is incremented by %%BeginDocument: comments and decremented by %%EndDocument comments. The filter reaches EOD upon reading an %%EndDocument comment that decrements the nesting level to zero, or upon reading at nesting level 1 any comment whose keyword was supplied in the Keywords parameter array, or on reaching the end of the header comments if the HeaderOnly parameter is true.
%%BeginData: and %%EndData comments are honored, and the verbatim data passed without scanning for keywords. Lines longer than the DSC-specified maximum of 255 characters are passed without alteration but are not scanned for comments.
Once reading from the filter has begun, the position of the underlying src is indeterminate until the filter reaches EOD. At that point, if EOD was reached because of a comment line and not EOD on src, then the position of src depends on whether a Pushback entry was present in the filter dictionary:
Without Pushback: src is positioned just past a single carriage-return or line-feed byte that terminated the comment line. If the line was terminated with a carriage-return/line-feed sequence, the line-feed remains to be read from src, while in any other case the next character read from src is the first of the following line. As line-feed is considered whitespace, the difference is unimportant for many purposes, but should be checked if lines must be distinguished accurately.
With Pushback: the comment line has been consumed along with any directly following %%+ lines. The newline sequence that ends the last line consumed has also been completely consumed, whether a one-byte cr or lf or a two-byte crlf sequence, so when Pushback is used there is no special attention needed for accurate line counting. src is positioned up to three bytes beyond the last byte consumed, and the buffer supplied to Pushback contains those zero to three bytes and their count. If a new DSCDecode filter is opened to resume reading from src, correct behavior requires only that the same Pushback buffer be supplied in its filter dictionary without alteration. If other code will resume reading from src, the zero to three bytes in the Pushback buffer must be treated as preceding the next byte read from src. (This is the price of getting %%+ lines read automagically and not having to manipulate CountLF.)
It is recommended that src be a file object. The position in a string or procedure source cannot be accurately determined after this filter has reached EOD.
DSCDecode can be used as a simple filter for reading from an inlined document until the balancing %%EndDocument, or can be used to simplify scanning for particular comments. Simply creating a DSCDecode filter with certain keywords given in the Keywords parameter and flushing it with flushfile will cause the file to be scanned until the next matching comment is encountered at nesting level 1 (or another EOD condition is matched). Another filter can be created to resume reading from that point. The next filter should be created with NestLevel specifying the nesting level at which the previous filter terminated, and CountLF set to false if and only if the last character read from the previous filter was carriage-return when no Pushback buffer was used.
The parameter dictionary may have the following entries:
Key | Type | Semantics |
---|---|---|
CountLF | boolean | Whether the filter should consider a line-feed as the first character read to represent an actual newline. This should be set to false if and only if an immediately-prior read ended with a carriage-return and the Pushback entry is not present, as in that case an initial line-feed should be considered part of a single CRLF newline sequence. There is no use for this entry when Pushback is used. Default: true. |
HeaderOnly | boolean | Whether the filter should reach EOD at the end of the header comments, that is, either at an explicit %%EndComments line, or the first line that does not begin %X where X is a character value between 33 and 126 decimal. When using this feature to scan through a header, care should be taken to set CountLF correctly when Pushback is not used, as a line-feed misinterpreted as a newline would incorrectly be taken to end the header. Default: false. |
Keywords | array | An array of strings representing comment keywords. The filter will reach EOD as soon as it has read, at nesting level 1, any comment line whose keyword is included in this array. A keyword must be specified by all characters from the first % through the terminating : if any, which is part of the keyword per TN 5001. Default: empty. |
NestDepth | integer | Specifies the initial value for the nesting level. %%BeginDocument: comments increment the level, %%EndDocument comments decrement it, and an %%EndDocument comment that decrements the level to zero is an EOD condition. Default: 0 (but see Unwrap). |
Pushback | string | A four-byte string to be used as a pushback buffer. Should be supplied filled with zeros (the condition of a newly-allocated string) on the first call. When this entry is present, all forms of newline are handled transparently, there is no need to fuss with CountLF, and any %%+ lines directly following a matching comment are consumed along with it. The last byte contains the count cnt of pushed-back bytes. Those bytes immediately precede the count byte and should be read in increasing-index order. So the following code would convert the buffer to a string of the buffered bytes in the right order: dup length 1 sub 2 copy get exch 1 index sub exch getinterval |
Status | array | A four-element array for returning status to the caller. If the filter reaches EOD by reading a line that matches an EOD condition and this array is supplied, then the matching line is returned in this array and not passed through to the filter's reader. If the line is a DSC comment, the array's first element is the keyword and the second is the remainder of the line, less the terminating CR or LF. The third element is the entire matching line including the final CR or LF, and the fourth is the nesting level. When a Pushback entry is used, the third element contains the entire matching text including the complete newline sequence, whether it is a single-byte CR or LF or a two-byte CRLF sequence. |
Unwrap | boolean | Enables a feature useful for reading DSC-conformant input interchangeably from an external file or inlined between %%BeginDocument: and %%EndDocument in a larger file. If the first line read is a %%BeginDocument: comment, then the filter behaves normally except that this first line and its balancing %%EndDocument are not passed through to the filter's reader. If the first line read is anything else, the nesting level is incremented to 1 (just as if a %%BeginDocument: comment had been read), and the filter thereafter behaves normally. |
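As a hedged sketch of scanning for a single comment, the code below flushes a DSCDecode filter to find a %%BoundingBox: line and prints the rest of that line. It assumes the DSCDecode resource file has been made available, that the file name sample.eps stands in for a real input, and that the Status array is left untouched (all null) when src ends with no match. The names f and status are placeholders:

```postscript
/net.anastigmatix.StreamIO /ProcSet findresource begin
userdict begin
/status 4 array def                     % filled in if a matching line ends the scan
/f (sample.eps) (r) file def            % hypothetical input file
f << /Unwrap true                       % a bare document starts at nesting level 1
     /Keywords [ (%%BoundingBox:) ]     % the keyword includes the trailing colon
     /Status status >>
/DSCDecode extfilter flushfile          % read and discard up to the EOD condition
status 0 get null ne {                  % assumed still null if nothing matched
  status 1 get =                        % remainder of the %%BoundingBox: line
} if
f closefile
end end
```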
A new filter type is defined by writing a PostScript procedure and registering it in the net.anastigmatix.filter category with defineresource. When extfilter is invoked to set up an instance of the filter, it calls this procedure with the src|tgt and the parameter dictionary dict on the stack. The procedure must consume these two and return two items on the stack: an array and the name Encode or Decode to identify its filter direction.
The first element of the array must be a procedure that will be called to handle reading or writing of the filter. If the array length is greater than one, the remaining elements are nobody's business but the filter's.
For a decoding filter, the procedure in the array's first element will be called when data must be read. It is passed the entire array on the stack, which it may consult and modify, and must return a string. It can return a zero-length string to signal EOD.
For an encoding filter, the procedure is called when data must be written, and is passed three items on the stack: the array, a string, and a boolean. Except for receiving the array as an additional argument, the procedure must behave exactly as described in the PLRM filter section for a procedure as data target.
Let's implement a ROT13Decode filter. It will have just a single parameter, BufferSize, and will use a three-element array to keep its state. The first element will be the service procedure, as required. The second will be used to remember the src to read from, which for simplicity we will convert to a file (using a SubFileDecode filter) if it is a string or procedure. The array's third element will be used for a buffer that, again for simplicity, will be allocated unconditionally. The example is optimized for clarity rather than performance.
    currentglobal true setglobal
    /r13rd {
      dup 1 get exch 2 get readstring pop
      dup 0 1 2 index length 1 sub { %for  stack: buf buf i
        2 copy get                      % stack: buf buf i c
        dup dup 16#41 ge exch 16#5A le and { 16#41 sub 13 add 26 mod 16#41 add } if
        dup dup 16#61 ge exch 16#7A le and { 16#61 sub 13 add 26 mod 16#61 add } if
        put dup
      } for                             % stack: buf buf
      pop
    } bind def
    setglobal
    /ROT13Decode { % src dict *ROT13Decode* state-array /Decode
      //r13rd 3 1 roll /BufferSize get  % stack: proc src size
      1 index type /filetype ne { exch 0 () /SubFileDecode filter exch } if
      1 index gcheck setglobal string   % stack: proc src buf
      3 array astore /Decode            % stack: [proc src buf] /Decode
    }
    currentdict /r13rd undef
    bind /net.anastigmatix.filter defineresource pop
Two techniques give the filter its correct memory-allocation properties. First, the reading procedure is factored out of the filter setup procedure so that it can be made unconditionally global even if the setup procedure is not. Its allocation occurs at the time the code is scanned. Second, the setup procedure, at the time it is used, checks whether the src parameter (for ROT13, the only composite parameter that will be retained) is global, and sets the allocation mode to match before proceeding to allocate space. It does not need to remember and restore the prior allocation mode, because extfilter itself looks after that.
If the state array will be in global VM, the buffer must be also (or it can't be stored in the state array). On the other hand, if the state array is in local VM it's ok for the buffer to be too, because contents of strings aren't molested by save and restore. So it's enough to allocate the buffer in whatever arena the array is placed in.
Here is an example of the filter in use:
    /net.anastigmatix.StreamIO /ProcSet findresource begin
    (How can you tell an extrovert from an introvert at NSA? Va gur ryringbef, gur rkgebiregf ybbx ng gur BGURE thl'f fubrf.)
    <</BufferSize 80>> /ROT13Decode extfilter
    (%stdout) copyfile
A ROT13Encode filter would be nearly identical, except that the service procedure is called with three items on the stack rather than one, and the filter should accept and honor a CloseTarget parameter.
Anastigmatix-developed filters in the net.anastigmatix.filter resource category have simple names. To avoid naming conflicts, code from other sources that defines new filters should give them inverted-domain-style names (along the lines of com.example.ROT13Decode).
To simplify writing decode filters a little more efficient than the example, a procedure called .decodehelper is defined in (where else?) the category implementation dictionary for net.anastigmatix.filter (that is, the dictionary obtained with /net.anastigmatix.filter /Category findresource). The procedure takes the same two stack arguments passed to a filter procedure, namely the source and the parameter dictionary, and it takes care of whether the source is a file, string, or procedure, and buffer allocation if necessary according to the BufferSize parameter in the dictionary. Like any filter procedure, it returns a state array (but it does not return the name Decode), and the first element is a service procedure. The filter being developed can remember this array in its own state array, and obtain data at read time by placing this array on the stack and calling the service procedure. A string is returned, zero length if EOD has been reached.
To refer to the helper procedure, the category implementation dictionary should be temporarily pushed on the dictionary stack, and .decodehelper referenced with // and exec:
    /net.anastigmatix.filter /Category findresource begin
    /ROT13Decode {
      ...
      //.decodehelper exec
      ...
    }
    end
    bind /net.anastigmatix.filter defineresource pop
Some PostScript interpreters have been known to be buggy when it comes to closing an encoding filter. The result can be incomplete output if the filter never gets the signal to flush the data still in its buffers. Certain versions of Ghostscript were affected by the bug (it was their bug #688326). There may be other interpreters with similar bugs.
The bug is easy to test for, and StreamIO tests for it when loaded. The boolean filterCloseBug in the StreamIO dictionary will be true if the interpreter is found to have such a bug. There is no fully transparent workaround. A non-transparent workaround can be added to StreamIO if there are enough buggy interpreters left to justify it. Calling code would have to take extra steps to use it. For now, only the boolean flag is there, which at least you can check if something doesn't seem to be working right. If the flag is true, it could explain a problem. If nothing prevents upgrading to a working PostScript interpreter, that's the best solution.
The boolean filterFlushfileBug in the StreamIO dictionary will be true on an interpreter where flushfile incurs an error if used on a decode-filter file object backed by a procedure. This has been seen in level 2 and level 3 versions of Hewlett-Packard's knockoff interpreter. On those interpreters a sufficient workaround is to wrap any decode filter backed by a proc in an extra trivial SubFileDecode filter. extfilter applies this workaround automatically if this flag is true.
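Checking the two flags might look like the sketch below. The message texts are illustrative, and writing them to standard output with = is just one possible way to report them:

```postscript
/net.anastigmatix.StreamIO /ProcSet findresource begin
filterCloseBug {
  (This interpreter may not flush encoding filters when they are closed.) =
} if
filterFlushfileBug {
  (flushfile fails on procedure-backed decode filters here; extfilter compensates.) =
} if
end
```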