anastigmatix.net

  • Why direct PostScript?
  • What's it like to write PostScript?
  • Benefits: PostScript in the small
  • Benefits: PostScript in the large
  • Benefits: change control
  • Barriers: underappreciated capability
  • Barriers: reusability
  • Barriers: transparency
  • Direct PostScript resources
  • The TinyDict
  • Quikscript
  • Gonzo PostScript Utilities
  • Barcode Writer in pure PostScript
  • Mathematical Illustrations resources
  • Function graphing tutorial
  • Appendix: PostScript as a programming language

    Adobe's PostScript® language, and why “direct” PostScript makes sense

    The PostScript language, defined by Adobe Systems Incorporated, has in 20 years become almost the universal language for graphical and printed work, even if many people produce it more or less unawares by using a variety of WYSIWYG programs.

    These pages, and the tools on them, are meant to support and encourage what David Byram-Wigfield has called “direct PostScript”: work created by writing in PostScript from the start. PostScript is a powerful language that is not difficult to begin to learn, can express sophisticated graphical ideas in few words, and has the final say as to just where every mark appears on the output, giving unsurpassed visual control.

    Users of WYSIWYG design programs—which are fabulous tools for many jobs—may be aware that they tend to produce something called PostScript when the job is ready to print. But most programs of that sort produce PostScript code that is functional but awful; to any person at all curious to peek in the file, it looks like an explosion in a punctuation factory, not like any language a sane human would use. It is the same feeling one gets about learning HTML for web design, after one appalled look at the HTML that comes out of typical web-publisher programs.

    But the person who has written a few web pages directly in HTML knows the advantages in space, speed, reliability, and control, and the advantages of direct PostScript are still greater. PostScript is a full programming language—unlike HTML—tailored to graphics, so it is great for getting a computer/printer to do many precise, repetitive, calculated things all on its own. Design, illustration, and typesetting can all involve precise, repetitive, calculated things, and direct PostScript techniques are justly popular with people who would rather tell the computer to do the work than drag it by the mouse every step of the way.

    What's it like to program PostScript?

    Though it has some largely unsung similarities to sophisticated modern programming languages, PostScript may be more interesting in one way it differs from many familiar ones. To PostScript there is never really “a program” seen as a complete unit all at once: the language was designed so even printers with limited memory could print graphics too complicated to fit. A PostScript interpreter consumes a program as it comes in, start to finish, remembering only what's necessary from what it has already seen.

    What it will remember can include definitions that add new words and new capabilities to the language. If early in a file there are definitions for 3-D surface plotting, or staged programming, or hyphenation, or barcode production—and all those examples are available today—then the rest of the file can be written as if PostScript is a language with those capabilities built-in. With most PostScript printers, it is possible to send such procedure sets or other resources once every power-on, or once for all if the printer includes a hard disk or flash, effectively giving the printer new capabilities permanently; it will accept and print files that simply take the capabilities for granted. By the same token, a file that includes the resources it uses is self-contained and can be sent to any PostScript printer anywhere, independent of any accompanying software.
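
    As a minimal sketch of that idea, a prolog might define a couple of new words that the rest of the file then uses as if they were built in. The names inch and box are hypothetical, chosen only for illustration:

    ```postscript
    %!PS
    % Prolog: definitions the interpreter will remember.
    /inch { 72 mul } def              % number -> points
    /box {                            % x y width height box -
      4 2 roll moveto                 % start at the lower-left corner (x, y)
      1 index 0 rlineto               % bottom edge: right by width
      0 exch rlineto                  % right edge: up by height
      neg 0 rlineto                   % top edge: left by width
      closepath stroke                % closepath supplies the left edge
    } def

    % Script: written as if inch and box were part of the language.
    1 inch 1 inch 2 inch 3 inch box
    showpage
    ```

    Everything after the prolog could as well live in a separate file, sent to a printer that already holds the definitions.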

    Wherever “sent to a printer” is written, it is equally possible to run a file through Adobe's Distiller, producing a Portable Document Format (PDF) file of the sort found all over the web. Viewers for both formats are freely available, but for electronic distribution PDF viewers are somewhat more widely deployed than PostScript viewers, and they offer some navigation features (set up by pdfmarks in the original PostScript code) that are convenient for users. A PDF file is a static representation of the result of executing the PostScript file on a single occasion, and will be less compact if much of the original work was defined procedurally. It is also less easily edited, but may be attractive for that very reason at times when distributing work in source form is objectionable.
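
    For a rough idea of what pdfmarks look like in the source, here is a small sketch. The title, author, and bookmark text are placeholders, and the opening guard is the usual idiom for keeping the file printable on interpreters that do not define pdfmark:

    ```postscript
    %!PS
    % If the interpreter has no pdfmark operator (a printer, say), define a
    % harmless stand-in that discards everything back to the opening mark.
    /pdfmark where
      { pop }
      { userdict /pdfmark /cleartomark load put }
    ifelse

    % Document information shown by PDF viewers.
    [ /Title (An example title)
      /Author (An example author)
      /DOCINFO pdfmark

    % A bookmark (outline entry) that jumps to page 1, fitted to the window.
    [ /Title (Start here)
      /Page 1
      /View [ /Fit ]
      /OUT pdfmark
    ```

    On a printer the marks are simply thrown away; when the file is converted to PDF they become document metadata and a bookmark.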

    It could be said that pure, vanilla PostScript is hardly ever (except for simple jobs) the language you really write in. You write in the language you get by first sending a prolog of one or more resources that grow PostScript into a language suiting your job; the result may retain the basic syntax and flavor of PostScript, or have a whole new look. For the most part in this page, I will loosely use resource to mean any chunk of PostScript boilerplate that is meant to be sent in advance to add capabilities, whether or not it is really in the form PostScript's “resource” operators expect. I use the word loosely only because a lot of useful resources are available that are not in that form, but see the reusability section for why they would be even more useful if they were.

    While experienced programmers find PostScript quite capable, beginners also can do interesting things right away. Here is a 24-word variation on the customary first program in any language:

    [Figure: black ellipse with “Hello, world!” in white Helvetica]
    gsave
    1 0.5 scale
    70 100  48  0 360 arc
    fill
    grestore
    /Helvetica-Bold 14 selectfont
    1.0 setgray
    29 45 moveto
    (Hello, world!) show
    showpage
    

    It is worth comparing those 24 words to the size of the file saved after creating the same thing in a typical drawing program.

    Benefits: PostScript in the small

    Frequent, low-volume projects with highly individualized requirements can be natural jobs for direct PostScript. Often it is easy with a WYSIWYG program to get ninety percent of the desired result, and then the last ten percent becomes a battle with the program's built-in style or assumptions. Even if you can find the way to tell the program what you want, it is not guaranteed to be easier than to say it in PostScript—and to say it in PostScript can be the more promising investment, in learning coherent concepts of one established graphical language rather than workarounds and tricks for a hodge-podge of application programs.

    For those doing mathematical figures, the point could not be made better than by Bill Casselman in his book, Mathematical Illustrations:

    The truth is that the trade-off is unnecessary--once one has made a small initial investment of effort, by far the best thing to do in most situations is to write a program in the graphics programming language PostScript. ... The apparent complexity involved in producing simple figures by programming in PostScript, as I hope this book will demonstrate, is largely an illusion. And the amount of work involved in producing more complicated figures will usually be neither more nor less than what is necessary.

    For those who have taken to heart Edward Tufte's work on presenting information visually, there is no such thing any more as a generic chart or graph. Any information display worth making is worth making with close attention to how all the visual and typographic aspects combine to convey the information without distortion or distraction. A sampling of published papers in some fields might invite the conclusion that many charts and graphs were not worth making! The trouble is, they are typically made with software that tries to make charts and graphs generic, and then offers a cornucopia of ways to dress them up with gee-whiz chartjunk that adds nothing to the presentation.

    All the ornamentation and noise in that sort of a graph would be tedious to replicate in PostScript—but a clean, readable graph is not hard, and in PostScript your control over the position and typography of every element, legend, and callout is more direct than in any graphing program.
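
    As a sketch of how little is needed for a plain graph, the following plots one period of a sine curve with an axis and two hand-placed labels. The page position, scale, and label text are arbitrary choices for illustration:

    ```postscript
    %!PS
    gsave
      72 400 translate                 % origin of the graph on the page
      0.5 setlinewidth
      0 0 moveto 360 0 lineto stroke   % x axis, one point per degree
      1 setlinewidth
      newpath 0 0 moveto
      1 1 360 {                        % x = 1, 2, ... 360 (degrees)
        dup sin 100 mul lineto         % line to (x, 100 sin x)
      } for
      stroke
      /Helvetica 9 selectfont
      365 -3 moveto (x) show           % axis label, placed by hand
      95 104 moveto (y = sin x) show   % callout just above the first peak
    grestore
    showpage
    ```

    Every coordinate of every label is stated outright, so moving a callout is a matter of changing two numbers.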

    Don Lancaster's pages (see below) offer examples of many other publishing-in-the-small projects—stationery, cards and numbered tickets, fancy borders (exactly the kind of repetitive work the computer excels at) and so on—where direct PostScript can be the way to go.
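
    Numbered tickets make a handy illustration of the repetitive work a loop can take over. This is only a sketch with made-up dimensions and wording, assuming a Level 2 interpreter (for rectstroke) and a Letter-size page:

    ```postscript
    %!PS
    /Helvetica-Bold 18 selectfont
    /numstr 8 string def            % scratch buffer for number-to-text
    1 1 5 {                         % n = 1, 2, ... 5
      /n exch def
      /y 792 n 100 mul sub def      % each ticket sits 100 points lower
      72 y 288 72 rectstroke        % ticket outline: x y width height
      90 y 28 add moveto
      (Admit one   No. ) show
      n numstr cvs show             % the serial number
    } for
    showpage
    ```

    Changing the 5 to 500 (and adding page breaks) is all it takes to scale the job up.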

    Benefits: PostScript in the large

    At the other extreme, direct PostScript has significant value in the volume activities of document archival, book-on-demand printing, and database publishing. The three widely available procedure sets for direct PostScript typesetting are all veterans of such use. David Byram-Wigfield's TinyDict was developed for his Cappella Archive, an on-demand publishing business; Don Lancaster's Gonzo Utilities have supported his ventures into Book-on-Demand publishing as well as his copious output of meticulously typeset and illustrated magazine and eZine articles; Graham Freeman's Quikscript has supported a billing system, fed on raw data from a database.

    An obvious efficiency in database publishing is that only the raw values from the database need to be retrieved and sent directly to the printer, after sending first a resource that will do the formatting and produce pages as the values come in. In a case other than direct PostScript, some program on a host computer will retrieve the values and format the report, sending on to the printer volumes of data on where to place every line, box, and number on the page—a much larger file on the print queue and a greater demand for communication bandwidth.
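
    A toy version of that arrangement might look like the following. The word row, the column positions, and the sample records are all hypothetical, and a real resource would also handle page breaks:

    ```postscript
    %!PS
    % Resource sent first: a "row" word that sets one record per line.
    /Helvetica 10 selectfont
    /liney 720 def                          % baseline of the next row
    /row {                                  % (name) (amount) row -
      dup stringwidth pop                   % width of the amount string
      400 exch sub liney moveto show        % right-align the amount at x = 400
      72 liney moveto show                  % name at the left margin
      /liney liney 14 sub def               % advance to the next line
    } def

    % Raw values as they might arrive from the database:
    (Widget, large) (12.50) row
    (Widget, small) (7.25) row
    (Shipping) (4.00) row
    showpage
    ```

    Only the last few lines grow with the size of the report; the formatting logic is sent once.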

    As archival and print-on-demand entail significant storage demands, an advantage of a direct PostScript solution is that each document exists in one form: the very file or files to be sent to the printer. It is created, edited, archived, and printed in that form. The question does not arise whether the print file was regenerated from the master file recently enough, or whether prepress software has been used to modify the print file directly so that it does not correspond to the master, or whether the software originally used is available to edit the master or regenerate the print file. In The Tiny Guide for PostScript Markup, David Byram-Wigfield has it this way:

    There are other difficulties in using commercial typesetting programs. The generated PostScript scripts are impossible to edit without a return to the original software version, which makes archiving unreliable.

    The point is that even though the PostScript language itself is software and sees new releases over time—it has had a second major revision—it is bound tightly to a strict, published specification and is extended with a strict view toward continuing compatibility. The number of copies embedded in hardware devices serves as a check on reckless revision: if you have a PostScript file that printed correctly on an Apple LaserWriter in 1985, it will almost certainly print correctly on a Level 3 PostScript printer sold today. It can be viewed in today's PostScript viewers, and converted to PDF by today's Distiller. The situation is qualitatively different for the layout or design software you may have used 20 years ago—if you can find a copy, or if the current version will open the old file, or if a current version is available for the computer and operating system you are now using.

    Eric Lindsay (who uses Quikscript) has explained these benefits from another angle, and includes links to two other pages that press certain points even more strongly (proprietary data formats and inefficiency of editing for appearance rather than structure).

    Benefits: change control

    An issue sometimes overlooked that touches both small-scale and large-scale projects is revision tracking and version control. Work created in direct PostScript, maintained as a human-readable text file, shares with other traditional text-based markups like TeX the advantage that ordinary file difference utilities and version control software can readily be used to track changes in the file, and the changes identified will be intelligible. The files saved by design and layout programs typically cannot be version-controlled effectively using general-purpose tools, other than by retaining each version in its entirety and giving up on tracking the content of revisions. Even if the program uses a text-based file format such as XML, the mechanical regeneration of the file on every save will often cause even a very local revision to result in widespread differences reported by general-purpose differencing tools, confounding the effort to make sense of reported revisions.

    A direct PostScript file enjoys the further advantage that the file being version controlled is also the file to be printed; there is no separate step to generate a print file that may fall behind the version-controlled master.

    Barriers: underappreciated capability

    PostScript is an interpreted language and comes built into printers. Those facts lead some to underestimate it, and to overlook direct PostScript solutions on the assumption that some job or other would be too complicated or unacceptably slow in PostScript. Sometimes those assumptions need only a small dose of real data for calibration.

    For example, the wikibooks PostScript FAQ, at the time of this writing, had the following comment:

    It is strongly recommended to do text formatting in your application. Then, and only then, set the text - by now a simple question of dumping things at x, y coordinates. ... A good text formatter supports hyphenation, multi-column text, footnotes .... PostScript language has no such features.

    The comment was followed immediately by a link to David Byram-Wigfield's TinyDict, a procedure set for direct PostScript typesetting that does everything quoted above. Graham Freeman's Quikscript could as easily have been listed also, or Don Lancaster's utilities with his picojustification approach.

    Looking just at hyphenation for an example, the procedure used in TinyDict is the same one developed by Knuth and Liang for the widely-used TeX, ported to PostScript by Olavi Sakari. The benchmark is to mark all divisions in the 234,964 headwords of Webster's Second New International Dictionary, using the original TeX patterns. Running interpreted in Ghostscript, and compared to the native, compiled code of TeX itself on the same hardware, the port achieves a speed of TeX / 46 (that is, 1/46 the speed of TeX). That is hardly impractical considering today's computers are more than 46 times faster than those for which TeX was developed. And Sakari achieved that performance with a more-or-less direct, partial translation to PostScript of the original data structures Knuth used when writing TeX in Pascal.

    Another port of the same TeX algorithm to PostScript is net.anastigmatix.Hyphenate, which replaces Knuth's original data structures with a design that exploits PostScript's peculiar strengths. It is intended to be quite general: its pattern set is a selectable PostScript resource so the original TeX patterns, or many developed for other languages in the popular Knuth-Liang form, may be used. It is also updated to employ the international Unicode standard as a pivot between the pattern encoding and selectable encodings for the content of the input file, and to accommodate character encodings that use more than one byte. Even with these functional enhancements, and running in interpreted PostScript whereas TeX is compiled, net.anastigmatix.Hyphenate achieves a speed of TeX / 8 (one-eighth the speed of TeX) on the benchmark described above. The moral is that quite significant functionality can be implemented in PostScript and at quite acceptable performance levels. The feasibility of direct PostScript projects should not be evaluated on untested pessimistic assumptions. For students of programming languages, some features of PostScript as a language are highlighted in the appendix.

    Barriers: reusability

    The promise of direct PostScript rests on the availability of existing, tested PostScript resources for common tasks such as typographical layout, graphs and charts, barcodes, and other elements that might be wanted in a given document. To make a document that includes some typeset text and a 3-D illustration, what you would like is to find a good typesetting resource and a good 3-D resource that you can include together in the document's prolog, or download in advance to your printer so they need not be included in the document. You would like these two building blocks of PostScript code, which you may have obtained from different authors, to just work, the typesetting module doing its typesetting without throwing off the 3-D drawing, and so the other way around.

    The current state of affairs is not yet that simple. To write a program in PostScript for one's own immediate purpose is not difficult; to write one that will be well-behaved in someone else's environment and free from surprise interactions with other programs is not substantially more difficult, but requires its own discipline and habits of thought. The trick is to design so that the person ultimately using the resource needs to understand as little as possible about its insides to judge whether it and some other resource will peacefully coexist in the same document, a judgment that can be forbidding if the programmer needs to review almost everything in each resource in order to make the call. The resources reviewed here are impressive and capable, and have been heavily used by their own authors in production work, but in general are not quite packaged to offer worry-free reusability to others—though for some of them only very minor changes would be entailed.

    One common trouble arises when two resources that define a great many new names use some of the same ones for conflicting purposes. The discipline required is for each resource to confine its own definitions to a dictionary of its very own. Even better, as typically only a few variables and procedures are meant to be visible to the program using the resource, the rest being only its internal implementation, the internal definitions can be confined to a dictionary that is not exposed to the program using the resource at all. Bill Casselman put it well in his book Mathematical Illustrations: the key considerations are “locality, Locality, LOCALITY.”
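
    The first level of that discipline can be sketched as follows. ExampleLib and its contents are hypothetical; the point is only that a resource's names live in a dictionary of its own, so a client's unrelated use of the same name is not disturbed:

    ```postscript
    %!PS
    % All names for the resource live in one dictionary of its own.
    /ExampleLib 8 dict def
    ExampleLib begin
      /mm { 72 mul 25.4 div } bind def      % helper: millimetres -> points
      /tick {                               % x_mm y_mm tick -
        mm exch mm exch moveto              % convert both coordinates
        0 5 rlineto stroke                  % short vertical mark
      } bind def
    end

    % The client happens to use the name mm for something else entirely.
    /mm (my own unrelated value) def

    ExampleLib begin 20 30 tick end         % inside, mm is the resource's
    mm ==                                   % outside, still the client's string
    showpage
    ```

    This sketch still exposes the helper mm to anyone who opens ExampleLib; hiding internal definitions completely takes one more step, keeping them in a dictionary the client never sees.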

    Another source of trouble lies in building some settable behaviors into a resource so that the programmer using the resource in a document is expected to make a few changes in the resource itself—a practice that plays havoc with the very idea of downloading a single copy of the resource in advance to be used in many documents, or storing it in a file to be merged into documents by a document manager. The discipline is to provide for the resource's configurable behavior to be set by arguments to its procedures, variable settings, or parameter dictionaries.
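
    One way to follow that discipline is a parameter dictionary passed on the stack, sketched here with a hypothetical word rule and made-up key names (Level 2 dictionary syntax assumed):

    ```postscript
    %!PS
    /rule {                                  % dict rule -
      begin                                  % caller's dictionary supplies the names
        gsave
          Gray setgray
          Thickness setlinewidth
          72 Y moveto Length 0 rlineto stroke
        grestore
      end
    } def

    % Two differently styled rules, with no edits to the resource itself.
    << /Gray 0.5 /Thickness 2   /Length 300 /Y 700 >> rule
    << /Gray 0   /Thickness 0.5 /Length 150 /Y 680 >> rule
    showpage
    ```

    A production version would supply defaults for missing keys; in this sketch an omitted key raises an undefined error.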

    Adobe introduced, with PostScript Level 2 in 1990, the named resource features in the language, to support the development of libraries of reusable PostScript resources, but few PostScript programmers seem to be familiar with them or to use them routinely as they could be used. PostScript resources on this site are provided in this form, both for their own reusability and to provide example code to other developers of reusable PostScript code who could easily take this simple and valuable step. The Packaging PostScript Resources page suggests how.
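
    In outline, defining and later finding a named procedure set uses the Level 2 operators defineresource and findresource with the ProcSet category. The resource name example.Units and its contents are made up for this sketch:

    ```postscript
    %!PS
    % Register a procedure set under a name.
    /example.Units
    << /mm   { 72 mul 25.4 div }
       /inch { 72 mul } >>
    /ProcSet defineresource pop

    % Later, possibly in another file, fetch it by name and use it.
    /example.Units /ProcSet findresource begin
      /Helvetica 12 selectfont
      20 mm 250 mm moveto (Placed using mm from a named ProcSet) show
    end
    showpage
    ```

    The client needs to know only the resource's name and category, not where or how the definitions were loaded.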

    Barriers: transparency

    Much of the potential of direct PostScript is in having the final language of the page description always available. A preloaded resource for text setting and justification might be used for the bulk of a job, but if there is a bit on one page that just needs to be rotated five degrees or moved exactly three millimeters over, a line or two of simple PostScript can just be thrown in at that spot. That's good both for the end user, who is able to achieve any needed effect, and for the developer of a resource, who does not need to try to anticipate and provide for every effect the end user might need. The resource developer can focus on supporting a tight cluster of functions for general setting of text, say, knowing the end user will always have access to the expressive language underneath, or to another, special-purpose resource, for any out-of-the-ordinary effect. The approach favors an assortment of fairly lightweight resources addressing different common layout and design objectives, and available to include in a document singly or in combination, rather than a software monolith that hopes to include the kitchen sink.
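
    The two adjustments just mentioned come to a few lines of plain PostScript each. In this sketch the text and page positions are arbitrary:

    ```postscript
    %!PS
    /Helvetica 12 selectfont
    72 700 moveto (Ordinary line of text) show

    % Move one item exactly three millimetres to the right.
    gsave
      3 72 mul 25.4 div 0 translate        % 3 mm expressed in points
      72 680 moveto (Moved 3 mm over) show
    grestore

    % Rotate one item five degrees about its own starting point.
    gsave
      72 640 translate 5 rotate
      0 0 moveto (Rotated 5 degrees) show
    grestore
    showpage
    ```

    The gsave and grestore pairs keep each adjustment from leaking into whatever follows.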

    That potential is nearest to reality when the resources used for typesetting or other parts of the job are designed with an eye toward transparency. While they may be sophisticated and powerful, allowing the user to do concisely what would be longwinded and burdensome in plain PostScript, the user should nonetheless be able to think of them as elaborate new PostScript operations, picture their effect on the PostScript interpreter state, and, whenever reasonable, write other PostScript code intermingled and get sensible, predictable results. Unless another particular syntax is obviously suited to the job at hand, a PostScript-like syntax and evaluation order is desirable so that, for the user, to mingle use of the resource with use of plain PostScript does not feel like a clashing of gears. For the user who knows PostScript, transparency helps in understanding a new resource in already familiar terms; for the user whose first experience of PostScript is through using a resource for text setting or graphics, transparency invites learning what more is possible because the full language is available.

    How existing direct PostScript resources stack up on this point varies. Bill Casselman's ps3d is designed as a seamless extension of PostScript to three dimensions, with a current point, current path, graphics state stack, and all the expected features. Terry Burton's resource for bar codes provides a small number of operators with a very PostScript flavor.

    Of the resources available for direct PostScript text setting, Byram-Wigfield's TinyDict perhaps retains the most flavor of PostScript, because it is scanned by the PostScript scanner according to the language's lexical rules, and its many markup commands are simply one- or two-letter names defined as PostScript procedures in the usual way. The transparency so achieved has a cost neither Freeman nor Lancaster was willing to pay, namely that the text itself in a TinyDict document must all appear in the form of PostScript strings, a form less than natural for reading and editing the source. Both Quikscript and the Gonzo Utilities follow more traditional markup systems in expecting markup codes flagged by escape characters of some sort in otherwise unadorned text; they offer a more natural way to view and edit the source text, but at the cost of new, less PostScript-like scanning rules, and new mechanisms for defining markup commands, parallel to and different from the PostScript mechanisms for defining and naming new procedures. The resources also tend to supply their own mechanisms for such tasks as font selection, parallel to, and less general than, those built into PostScript, creating a distance from the underlying PostScript operations.

    A starting point for exploring a compromise design can be seen in net.anastigmatix.Markup, with a markup-in-plain-text approach for natural viewing and editing, but with the markup language deliberately kept close to familiar PostScript.

    Available resources for direct PostScript

    This section describes existing resources I am aware of that are intended to be used in direct PostScript content. These are resources that are readily available for download, though the terms for their use vary. I first describe three resources for general text setting, ordered alphabetically by author.

    The TinyDict (David Byram-Wigfield)

    The TinyDict was developed for David Byram-Wigfield's on-demand publishing business, Cappella Archive. It supports most of the text formatting involved in book publication, including left, center, right, and full justification, hyphenation (with an implementation of the general Knuth-Liang algorithm used in TeX, for which many language-specific hyphen patterns exist), multiple columns, tables, and so on. It is thoroughly documented in a manual written (naturally) using TinyDict markup, and available as a PDF. The manual includes a good introduction to the PDF metadata that can be included in a document, information that is useful in any direct PostScript setting, not specific to TinyDict. The documentation for tables makes a good demonstration of the appeal of direct PostScript techniques, with an example of a running calculation performed in PostScript as the table is being set.

    The TinyDict occupies about 127 kilobytes of virtual memory after garbage collection in an Adobe Level 3 PostScript interpreter. That breaks down into about 37 kB for everything save hyphenation, and 90 kB for its Knuth-Liang hyphenator loaded with a custom-developed, minimal pattern set called TinyDivi. That pattern set can be swapped for the popular TeX US English patterns, growing the 90 kB to 445 kB. Another Knuth-Liang implementation, net.anastigmatix.Hyphenate, can also be used with the TinyDict; it occupies 300 kB when loaded with the original TeX patterns. At 37 kB without hyphenation (for comparison to the other resources described here), the TinyDict is easily the tiniest.

    The TinyDict requires input text to obey the syntax for PostScript strings; its markup codes appear between strings of input text as would any PostScript command. The TinyDict can be used in combination with net.anastigmatix.Markup to accept a markup-codes-in-plain-text input style for comparison with the other text setting resources appearing here, adding about 8 kB. The TinyDict includes styles for several multi-page impositions common in book printing. I have not found it always easy to see how to adjust some typical “stylesheet” properties without editing the TinyDict itself.

    The TinyDict is available for download without obvious restriction, but its terms of use are nowhere clearly stated; only "all rights reserved" appears in the file itself. The Knuth-Liang hyphenator developed for the TinyDict by Olavi Sakari is provided under the GNU LGPL, and net.anastigmatix.Hyphenate has a permissive license; either hyphenator could be used in any PostScript text setting resource.

    Quikscript (Graham Freeman)

    Quikscript, developed by Graham Freeman, accepts text with embedded markup codes, and fills (with a concept of nonbreaking spaces), offering left, center, right, or full justification, dotted leaders, multiple columns, tables, tables of contents and indexing. It is thoroughly documented in a manual written (naturally) in Quikscript, and available as a PDF. The source for the manual is included in the distribution, so it is easy to see how every effect in the manual was achieved.

    Quikscript occupies about 77 kilobytes after garbage collection in an Adobe Level 3 PostScript interpreter. It does not include hyphenation, but could probably be adapted with little trouble to use either of the hyphenation resources described above for the TinyDict.

    Quikscript is available for download without restriction, but requires a license (available from the author) for commercial use.

    Gonzo Utilities (Don Lancaster)

    Don Lancaster's Gonzo PostScript Utilities set type from text input with embedded markup codes. They have been heavily used by their author for magazine articles rich with technical illustration, and many other forms of publication. Left, center, right, and full justification are supported, with a picojustification feature: not only is intercharacter and interword spacing adjusted to achieve the target line length, but actual characters are stretched by very slightly rescaling fonts, in a small number of discrete steps to minimize pressure on the font cache. Automatic hanging punctuation and last line stretch refine the appearance of the justified text. Multiple columns, headings, and “supertabs” are supported. Reflecting the author's interests, the utilities include many provisions for graphics, curve tracing, nonlinear transformations, electronic design and circuit-board layout.

    The Gonzo Utilities are documented in an unusual but very effective way, illustrated with an array of brilliant examples, and pointers to production resources, in the form of 134 pages of projects for a deceptively-named “PostScript beginner” course.

    The Gonzo Utilities occupy about 198 kilobytes after garbage collection in an Adobe Level 3 PostScript interpreter. They do not include hyphenation as distributed—Lancaster has dropped hints that he has an implementation he uses in his own work—but could probably be adapted with little trouble to use either of the hyphenation resources described above for the TinyDict.

    The Gonzo Utilities are available for download without obvious restriction, but carry the notice “all commercial rights fully reserved.”

    Barcode Writer in Pure PostScript (Terry Burton)

    [Figure: ISBN barcode 0-201-37922-8]

    Terry Burton's Barcode Writer in Pure PostScript is an example of a direct PostScript resource tailored to a very special purpose. It allows about three dozen styles of barcode to be rendered on the current page with no more code than the following:

    150 450 moveto (0-201-37922-8) (includetext) isbn

    Naturally, this resource can be included in the prolog of a PostScript document in combination with something like Quikscript or the TinyDict, and barcodes can be included (and precisely placed) within the document with a simple line of PostScript embedded in the text—that's the kind of thing direct PostScript can make possible.

    Terry Burton's Barcode Writer in Pure PostScript is available under the permissive MIT/X Consortium license.

    Bill Casselman's Mathematical Illustrations resources

    The web site for Bill Casselman's book Mathematical Illustrations—a whole book devoted to the benefits of direct PostScript for that kind of work—contains a number of downloadable PostScript resources that provide functionality described in the book, those of most general interest probably being ps3d, which adds to PostScript a full complement of 3-D operations styled after the 2-D ones, and bsp, for creating binary space partitions needed for painting 3-D scenes in the proper order.

    The web page, which contains the entire book as well as the code resources, sets out the following terms: Permission is granted for users of this resource to make one copy for their own personal use. Further reproduction is strictly prohibited without the express permission of the copyright holder. No contrary or additional terms are found in the code files themselves.

    Gernot Hoffmann's function-graphing tutorial

    Function Graphs and Other Applications for PostScript is another useful collection of techniques for mathematical illustration, with the following notice of terms: “Copyright Gernot Hoffmann. Code is free. Please mention the author.” It appears on a page with many other interesting links (www.fho-emden.de/~hoffmann/howww41a.html).

    Appendix: PostScript as a programming language

    PostScript as a programming language is kin to other sophisticated, modern languages in several features that are often unrecognized because the Adobe manuals describe them quietly by their effects, rather than using the terminology most familiar in the study of programming languages. This appendix targets readers with a background in that study, and is meant to draw some overdue attention to those features of the language.

    Some of the features described here were introduced with Language Level 2 in 1990, and my discussion applies to Level 2 and later PostScript. At the time of this writing, Level 3 is supported by the free Ghostscript interpreter, and built into printers widely and inexpensively available on the used market. Little reason remains to write new programs for pre-Level 2 PostScript, now fifteen years outdated, whose lack of these features made it much less suitable for serious use. Existing programs for pre-Level 2 PostScript can still be run on modern interpreters.

    Automatic memory management

    PostScript is a language with garbage collection, like Java and other modern languages: its composite data types are allocated from a heap of virtual memory, and the programmer does not need to manage returning them to the heap when they are no longer needed. As needed during a program's execution, the garbage collector can find the allocated memory that the program could possibly refer to again, and reclaim the rest. The collector runs automatically, but sophisticated programs can tune performance with the vmreclaim and setvmthreshold operators to disable collection, force a collection, or change the condition for automatic collection.
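
    A minimal sketch of those two tuning operators in use, assuming a Level 2 or later interpreter. The loop is only a stand-in for an allocation-heavy phase of a real program, and the threshold of 1000000 bytes is an arbitrary value chosen for illustration.

```postscript
-1 vmreclaim                          % disable automatic collection in local VM
1 1 1000 { pop 100 string pop } for   % stand-in workload: allocate and drop strings
0 vmreclaim                           % enable automatic collection again
1 vmreclaim                           % force an immediate collection of local VM
1000000 setvmthreshold                % trigger automatic collection after about
                                      % 1000000 bytes of new allocation
-1 setvmthreshold                     % return to the interpreter's default threshold
```

    The arguments -2 and 2 to vmreclaim do the same for local and global VM together.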

    The sophistication of PostScript's memory management can be seen in the treatment of arrays and strings. The getinterval operator can return a reference to an interior subarray or substring:

    65535 array              % Stack contains: whole-array
    dup 20000 42 getinterval % whole-array sub-array
    exch pop                 % sub-array

    After the third line of the example, the only live reference is to a 42-element subinterval near the middle of the original large array; an Adobe PostScript interpreter (I tested in version 3010.108) will in fact reclaim all of the array except those 42 elements, on the next collection.
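
    That a subinterval is a reference into the original array's storage, and not a copy, can be seen in a short sketch. The names whole and part are illustrative only.

```postscript
/whole 10 array def                % ten elements, all initially null
/part whole 3 4 getinterval def    % refers to elements 3 through 6 of whole
part 0 42 put                      % store through the subarray ...
whole 3 get                        % ... and read 42 back from the original
```

    The last line leaves 42 on the operand stack, because element 0 of part and element 3 of whole are the same storage.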

    Scoped memory

    PostScript augments its fully-automated garbage collection with a nested scoped memory mechanism. Like the scoping mechanism in real-time Java, PostScript's save and restore allow a program to declare a scope for memory allocations that follow, and later exit the scope, causing the memory allocated in it to be reclaimed at once, without the overhead of a garbage collection. Allocation at any time can be done from the current scope (“local”) or from an area of the heap outside the scope discipline (“global”). References to local objects cannot be stored into global objects. They can be stored into local objects allocated in an enclosing (longer-lived) scope, something real-time Java forbids in order to prevent a dangling reference in the enclosing scope after the inner one goes away. PostScript prevents that in another way:

    In PostScript, the restore that exits a scope not only reclaims the memory allocated within it, but backs out changes to data structures in all enclosing scopes (but not in the global heap) since the scope was entered. After a restore, data structures in scoped memory do not contain any references to the scope just exited, because any such references stored in them have been backed out.
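
    The backing out can be sketched as follows, assuming the default (local) allocation mode. The names settings and scope are hypothetical, chosen for this illustration.

```postscript
/settings 1 dict def            % local dictionary created before the scope is entered
settings /size 10 put
/scope save def                 % enter a scope
settings /size 99 put           % change an object that belongs to the enclosing scope
settings /label [ 1 2 3 ] put   % store a reference to an array allocated in the scope
scope restore                   % exit: the array is reclaimed, both changes backed out
settings /size get              % 10, the value from before the save
settings /label known           % false, so no dangling reference remains
```

    The definition of scope itself was made after the save, so it too is gone after the restore.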

    It is not only references to allocated objects that get backed out, but all contents of all objects allocated in scoped memory, except for contents of strings. That is, save and restore are not simply for scoping of memory, but for wholesale scoping of state. That is often useful, though it might be more useful still if the two effects could be separated. The exception for strings is quirky, and no doubt a compromise to avoid tracking changes at byte granularity. The only way to return a heap-allocated result from a scope is to allocate it in the global heap, or, if it is a string, copy it into an existing string preallocated in an enclosing scope.
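
    Both ways of getting a result out of a scope can be sketched together, assuming the program starts in local allocation mode. The names buffer and scope are hypothetical.

```postscript
/buffer (......) def             % string preallocated in the enclosing scope
/scope save def                  % enter a scope
buffer 0 (result) putinterval    % string contents are not backed out by restore
currentglobal true setglobal     % allocate globally; the old mode waits on the stack
[ 1 2 3 ]                        % this array is allocated in the global heap
exch setglobal                   % return to the previous allocation mode
scope restore                    % exit the scope; the global array stays on the stack
buffer                           % (result)
```

    After the last line the operand stack holds the global array and the string (result). Had the array been allocated locally inside the scope, leaving it on the stack would have made the restore fail with invalidrestore.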

    Tail-call elimination

    The language specification guarantees that a PostScript procedure, an executable array, is popped from the execution stack just before its last element is executed. That turns the last step of any procedure into a chain rather than a call; if that last step is a recursive call, the effect is a simple loop, without growth of the execution stack. Continuation-passing style can therefore be used to advantage in PostScript for tasks where it makes sense.
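
    A minimal sketch of recursion behaving as a loop, assuming an interpreter that honors the guarantee. The procedure name countdown and the depth of 100000 are illustrative; the depth is simply chosen to be far greater than a typical execution stack limit.

```postscript
/countdown {                  % int -> string
  dup 0 gt
  { 1 sub countdown }         % the recursive call is the last element of this branch
  { pop (done) }
  ifelse                      % and ifelse is the last element of countdown itself
} def
100000 countdown              % leaves (done) on the operand stack
```

    Each procedure involved has already left the execution stack by the time its last element runs, so the depth of the recursion does not accumulate there.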

    What if procedure A needs to call either B or C as its last step? A only has one last element, but that isn't a problem. In the code

    /A { ... {B} {C} ifelse } def

    the last element of A is ifelse. At that point A is off the execution stack, the two choices B and C are on the operand stack, and ifelse discards one and chains to the other. The method extends to an arbitrary number of choices and an arbitrarily complex choice: just get the chosen next step onto the operand stack, and let the last element of the procedure be exec. Finite automata are easily implemented this way.
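
    As a sketch of such an automaton, the following two-state machine accepts strings over the letters a and b that contain an even number of a's. The procedure names step, even and odd are hypothetical; each state expects the remaining input string on the operand stack and chains to the next state through exec.

```postscript
/step {                       % string -> rest char : split off the first character
  dup 0 get exch
  dup length 1 sub 1 exch getinterval
  exch
} def

/even {                       % string -> bool : an even number of a's seen so far
  dup length 0 eq
  { pop true }                             % input exhausted in the accepting state
  { step 97 eq { /odd } { /even } ifelse   % 97 is the character code for a
    load exec }                            % chain: exec is the last element
  ifelse
} def

/odd {                        % string -> bool : an odd number of a's seen so far
  dup length 0 eq
  { pop false }                            % input exhausted in the rejecting state
  { step 97 eq { /even } { /odd } ifelse
    load exec }
  ifelse
} def

(abab) even                   % true
(ab) even                     % false
```

    The machine always starts in the state even. Adding a state means adding one more procedure of the same shape; the transition logic only has to leave the name of the next state on the operand stack.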

    Staged programming

    Because PostScript procedures are simply arrays and can be manipulated as data, PostScript lends itself naturally to a programming style where procedures can be dynamically created and then executed. Such a staged approach can boost efficiency, as when a certain condition known at the start of a loop would otherwise have to be tested on each iteration. A staged approach tests the condition once, creates the loop body accordingly, and then loops it. Staging is also an elegant answer to many stack manipulation puzzles that arise in PostScript programming: an often-needed value that's in the way on the stack can be taken off by staging the procedure or loop body to simply put the value back when it is needed.
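
    Both uses can be sketched with the plain array operators. The procedure names makeScaler, double and sumWith are hypothetical, introduced only for this illustration.

```postscript
% A value that would be in the way on the stack is built into a new procedure.
/makeScaler {                 % number -> proc
  [ exch /mul cvx ] cvx       % builds the executable array { number mul }
} def
/double 2 makeScaler def
21 double                     % 42

% A condition is tested once, and the loop then runs with the body it selected.
/sumWith {                    % array bool -> number
  { { dup mul add } } { { add } } ifelse   % choose the loop body once
  0 3 1 roll                  % put the initial sum beneath the array and the body
  forall
} def
[ 1 2 3 ] true sumWith        % 14, the sum of squares
[ 1 2 3 ] false sumWith       % 6, the plain sum
```

    In sumWith the boolean is consumed before the loop starts, so the body that forall runs contains no test at all.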

    Staging in plain PostScript by using the array operators can be straightforward for simple tasks, but quickly gets hard to read and follow after that. Explicit staging in PostScript is possible with net.anastigmatix.MetaPre, which adds syntax for it to the language.

    Packing and speed/space tradeoffs

    PostScript's setpacking operator selects whether subsequently scanned procedures are built into packed executable arrays. When packing is off, procedures are built as ordinary arrays (using, typically, eight bytes per element). Packed arrays save space for procedures by using a variable-width representation where some operators and scalar values are narrower than eight bytes. Packing all the procedures in a resource can reduce its memory footprint substantially. On the other hand, packed procedures execute a bit more slowly.

    An obvious and effective compromise is to factor out any speed-critical, tight loops into procedures of their own, and define those after a false setpacking; then define the remaining, less speed-critical procedures after a true setpacking. In a file intended to be included in other code, save the value of currentpacking before any changes, and restore it at the end.
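
    The layout of such a file might look like the following sketch, assuming it is read and executed by the interpreter in the usual way, one token at a time. The names savedPacking, innerStep and report are hypothetical.

```postscript
/savedPacking currentpacking def   % remember the including program's setting

false setpacking                   % speed-critical code: ordinary arrays
/innerStep { dup mul add } def

true setpacking                    % everything else: packed arrays
/report { (sum of squares: ) print = } def

savedPacking setpacking            % leave the setting as it was found
```

    Applying the type operator to the two procedures shows the difference: innerStep is an arraytype object and report is a packedarraytype object.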

    $Id: direct.html,v 1.18 2009/11/14 02:18:43 chap Exp $