Adding Appendix 1

[mudman.git] / md / language.md
diff --git a/md/language.md b/md/language.md

index ec7fb04238faebc7b39e3f54ce3016a9ceafd013..01c68d890314f559bdc872eb5a6c723b5cd0e9e1 100644 (file)
--- a/md/language.md
+++ b/md/language.md
@@ -9180,3 +9180,684 @@ looks better than
  
  You can see the nature of the choices. Nesting is still and all better
  than `GO`.
+
+Appendix 1. A Look Inside
+=========================
+
+This appendix tells about the mapping between MDL objects and PDP-10
+storage -- in other words, the way things look "on the inside". None
+of this information is essential to knowing how to program in MDL, but
+it does give some reasons for capabilities and restrictions that
+otherwise you have to memorize. The notation and terminology get a
+little awkward in this discussion, because we are in a twilight zone
+between the worlds of MDL objects and of bit patterns. In general the
+words and phrases appearing in diagrams refer to bit patterns not MDL
+objects. A lower-case word (like "tuple") refers to the storage
+occupied by an object of the corresponding `PRIMTYPE` (like `TUPLE`).
+
+First some terminology needs discussion. The sine qua non of any MDL
+object is a **pair** of 36-bit computer words. In general, lists
+consist of pairs chained together by pointers (addresses), and vectors
+consist of contiguous blocks of pairs. `==?` essentially tests two
+pairs to see whether they contain the same bit patterns.
+
+The first (lower-addressed) word of a pair is called the **`TYPE`
+word**, because it contains a numeric **`TYPE` code** that represents
+the object's `TYPE`. The second (higher-addressed) word of a pair is
+called the **value word**, because it contains (part of or the
+beginning of) the "data part" of the object. The `TYPE` word (and
+sometimes the value word) is considered to be made of a left half and
+a right half. We will picture a pair like this:
+
+    ---------------------------------
+    |      TYPE     |               |
+    | - - - - - - - - - - - - - - - |
+    |             value             |
+    ---------------------------------
+
+where a vertical bar in the middle of a word means the word's halves
+are used independently. You can see that the `TYPE` code is confined
+to the left half of the `TYPE` word. (Half-)words are sometimes
+subdivided into **fields** appropriate for the context; fields are
+also pictured as separated by vertical bars. The right half of the
+`TYPE` word is used for different purposes depending on the `TYPE` of
+the object and actual location of the value.
+
+Actually the 18-bit `TYPE` field is further decoded. The high-order
+(leftmost) bit is the mark bit, used exclusively by the garbage
+collector when it runs. The next two bits are monitor bits, used to
+cause `"READ"` and `"WRITE"` interrupts on read and write references
+to the pair. The next bit is used to differentiate between list
+elements and vector dope words. The next bit is unused but could be
+used in the future for an "execute" monitor. The remaining 13 bits
+specify the actual `TYPE` code. What `CHTYPE` does is to copy the pair
+and put a new `TYPE` code into the new pair.
+
+Each data `TYPE` (predefined and `NEWTYPE`s) must belong to one of
+about 25 "storage allocation classes" (roughly corresponding to MDL
+`PRIMTYPE`s). These classes are characterized primarily by the manner
+in which the garbage collector treats them. Some of these classes will
+now be described.
+
+"One Word"
+
+This class includes all data that are not pointers to some kind of
+structure. All external (program-available) `TYPE`s in this class are
+of `PRIMTYPE` `WORD`. Example:
+
+    ---------------------------------
+    |       FIX     |       0       |
+    | - - - - - - - - - - - - - - - |
+    |              105              |
+    ---------------------------------
+
+"Two Word"
+
+The members of this class are all 18-bit pointers to list elements.
+All external `TYPE`s in this class are of `PRIMTYPE` `LIST`. Example:
+
+    ---------------------------------
+    |      LIST     |       0       |
+    | - - - - - - - - - - - - - - - |
+    |       0       |    pointer    |
+    ---------------------------------
+
+where `pointer` is a pointer to the first list element. If there are
+no elements, `pointer` is zero; thus empty objects of `PRIMTYPE`
+`LIST` are `==?` if their `TYPE`s are the same.
+
+"Two N Word"
+
+Members of this class are all "counting pointers" to blocks of
+two-word pairs. The right half of a counting pointer is an address,
+and the left half is the negative of the number of 36-bit words in the
+block. (This format is tailored to the PDP-10 `AOBJN` instruction.)
+The number of pairs in the block (`LENGTH`) is half that number, since
+each pair is two words. All external `TYPE`s in this class are of
+`PRIMTYPE` `VECTOR`. Example:
+
+    ---------------------------------
+    |     VECTOR    |       0       |
+    | - - - - - - - - - - - - - - - |
+    |   -2*length   |    pointer    |
+    ---------------------------------
+
+where `length` is the `LENGTH` of the `VECTOR` and `pointer` is the
+location of the start (the element selected by an `NTH` argument of 1)
+of the `VECTOR`.
+
+"N word"
+
+This class is the same as the previous one, except that the block
+contains objects all of the same `TYPE` without individual `TYPE`
+words. The `TYPE` code for all the elements is in vector dope words,
+which are at addresses just larger than the block itself. Thus, any
+object that carries information in its `TYPE` word cannot go into the
+block: `PRIMTYPE`s `STRING`, `BYTES`, `TUPLE` (and the corresponding
+locatives `LOCS`, `LOCB`, `LOCA`), `FRAME`, and `LOCD`. All external
+`TYPE`s in this class are of `PRIMTYPE` `UVECTOR`. Example:
+
+    ---------------------------------
+    |    UVECTOR    |       0       |
+    | - - - - - - - - - - - - - - - |
+    |    -length    |    pointer    |
+    ---------------------------------
+
+where `length` is the `LENGTH` of the `UVECTOR` and `pointer` points
+to the beginning of the `UVECTOR`.
+
+"Byte String" and "Character String"
+
+These two classes are almost identical. Byte strings are byte pointers
+to strings of arbitrary-size bytes. `PRIMTYPE` `BYTES` is the only
+member of this class. Character strings are byte pointers to strings
+of ASCII characters. `PRIMTYPE` `STRING` is the only member of this
+class. Both of these classes consist of a length and a PDP-10 byte
+pointer. In the case of character strings, the byte-size field in the
+byte pointer is always seven bits per byte (hence five bytes per
+word). Example:
+
+    ---------------------------------
+    |     STRING    |    length     |
+    | - - - - - - - - - - - - - - - |
+    |         byte-pointer          |
+    ---------------------------------
+
+where `length` is the `LENGTH` of the `STRING` (in bytes) and
+`byte-pointer` points to a byte just before the beginning of the
+string (an `ILDB` instruction is needed to get the first byte). A
+newly-created `STRING` always has `*010700*` in the left half of
+`byte-pointer`. Unless the string was created by `SPNAME`,
+`byte-pointer` points to a uvector, where the elements (characters) of
+the `STRING` are stored, packed together five to a word.
+
+"Frame"
+
+This class gives the user program a handle on its control and
+variable-reference structures. All external `TYPE`s in this class are
+of `PRIMTYPE` `FRAME`. Three numbers are needed to designate a frame:
+a unique 18-bit identifying number, a pointer to the frame's storage
+on a control stack, and a pointer to the `PROCESS` associated with the
+frame. Example:
+
+    ---------------------------------
+    |     FRAME     |PROCESS-pointer|
+    | - - - - - - - - - - - - - - - |
+    |   unique-id   | frame-pointer |
+    ---------------------------------
+
+where `PROCESS-pointer` points to the dope words of a `PROCESS`
+vector, and `unique-id` is used for validating (testing `LEGAL?`) the
+`frame-pointer`, which points to a frame for some Subroutine call on
+the control stack.
+
+"Tuple"
+
+A tuple pointer is a counting pointer to a vector on the control
+stack. It may be a pointer to the arguments to a Subroutine or a
+pointer generated by the `"TUPLE"` declaration in a `FUNCTION`. Like
+objects in the previous class, these objects contain a unique
+identifying number used for validation. `PRIMTYPE` `TUPLE` is the only
+member of this class. Example:
+
+    ---------------------------------
+    |     TUPLE     |   unique-id   |
+    | - - - - - - - - - - - - - - - |
+    |   -2*length   |    pointer    |
+    ---------------------------------
+
+Other Storage Classes
+
+The rest of the storage classes include strictly internal `TYPE`s and
+pointers to special kinds of lists and vectors like locatives, `ATOM`s
+and `ASOC`s. A pair for any `LOCATIVE` except a `LOCD` looks like a
+pair for the corresponding structure, except of course that the `TYPE`
+is different. A `LOCD` pair looks like a tuple pair and needs a word
+and a half for its value; the `unique-id` refers to a binding on the
+control stack or to the "global stack" if zero. Thus `LOCD`s are in a
+sense "stack objects" and are more restricted than other locatives.
+
+An `OFFSET` is stored with the `INDEX` in the right half of the value
+word and the Pattern in the left half. Since the Pattern can be either
+an `ATOM` or a `FORM`, the left half actually points to a pair, which
+points to the actual Pattern. The Patttern `ANY` is recognized as a
+special case: the left-half pointer is zero, and no pair is used.
+Thus, if you're making the production version of your program and want
+to save some storage, can do something like
+`<SETG FOO <PUT-DECL ,FOO ANY>>` for all `OFFSET`s.
+
+Basic Data Structures
+---------------------
+
+Lists
+
+List elements are pairs linked together by the right halves of their
+first words. The list is terminated by a zero in the right half of the
+last pair. For example the LIST (1 2 3) would look like this:
+
+    -------------
+    | LIST | 0  |
+    | - - - - - |   -----------     -----------     -----------
+    |  0   | ------>| FIX | ------->| FIX | ------->| FIX | 0 |
+    -------------   | - - - - |     | - - - - |     | - - - - |
+                    |    1    |     |    2    |     |    3    |
+                    -----------     -----------     -----------
+
+The use of pointers to tie together elements explains why new elements
+can be added easily to a list, how sharing and circularity work, etc.
+The links go in only one direction through the list, which is why a
+list cannot be `BACK`ed or `TOP`ped: there's no way to find the
+`REST`ed elements.
+
+Since some MDL values require a word and a half for the value in the
+pair, they do not fit directly into list elements. This problem is
+solved by having "deferred pointers". Instead of putting the datum
+directly into the list element, a pointer to another pair is used as
+the value with the special internal `TYPE` `DEFER`, and the real datum
+is put in the deferred pair. For example the `LIST` `(1 "hello" 3)`
+would look like this:
+
+    -------------
+    | LIST | 0  |
+    | - - - - - |   -----------     -----------     -----------
+    |  0   | ------>| FIX | ------->|DEFER| ------->| FIX | 0 |
+    -------------   | - - - - |     | - - - - |     | - - - - |
+                    |    1    |     |       -----   |    3    |
+                    -----------     ----------- |   -----------
+                                                |
+                                    ----------- |
+                                    |STRING| 5|<-
+                                    | - - - - |
+                                    |byte-pntr|
+                                    -----------
+
+Vectors
+
+A vector is a block of contiguous words. More than one pair can point
+to the block, possibly at different places in the block; this is how
+sharing occurs among vectors. Pointers that are different arise from
+`REST` or `GROW`/`BACK` operations. The block is followed by two "dope
+words", at addresses just larger than the largest address in the
+block. Dope words have the following format:
+
+    /                               /
+    |                               |
+    |                               |
+    ---------------------------------
+    |      type     |      grow     |
+    | - - - - - - - - - - - - - - - |
+    |     length    |       gc      |
+    ---------------------------------
+
+The various fields have the following meanings:
+
+`type` -- The fourth bit from the left (the "vector bit", `40000`
+octal) is always one, to distinguish these vector dope words from a
+`TYPE`/value pair.
+
+If the high-order bit is zero, then the vector is a `UVECTOR`, and the
+remaining bits specify the uniform `TYPE` of the elements. `CHUTYPE`
+just puts a new `TYPE` code in this field. Each element is limited to
+a one-word value: clearly `PRIMTYPE` `STRING`s and `BYTES`es and stack
+objects can't go in uniform vectors.
+
+If the high-order bit is one and the `TYPE` bits are zero, then this
+is a regular `VECTOR`.
+
+If the high-order bit is one and the `TYPE` bits are not all zero,
+then this is either an `ATOM`, a `PROCESS`, an `ASOC`, or a
+`TEMPLATE`. The special internal format of these objects will be
+described a little later in this appendix.
+
+`length` -- The high-order bit is the mark bit, used by the garbage
+collector. The rest of this field specifies the number of words in the
+block, including the dope words. This differs from the length given in
+pairs pointing to this vector, since such pairs may be the result of
+`REST` operations.
+
+`grow` -- This is actually two nine-bit fields, specifying either
+growth or shrinkage at both the high and low ends of the vector. The
+fields are usually set only when a stack must be grown or shrunk.
+
+`gc` -- This is used by the garbage collector to specify where this
+vector is moving during compaction.
+
+Examples (numbers in octal): the `VECTOR` `[1 "bye" 3]` looks like:
+
+    ---------------
+    | VECTOR |  0 |
+    | - - - - - - |         -----------------
+    |   -6   |  ----------->|  FIX  |       |
+    ---------------         | - - - - - - - |
+                            |       1       |
+                            -----------------
+                            | STRING |  3   |
+                            | - - - - - - - |
+                            |  byte pointer |
+                            -----------------
+                            |  FIX  |       |
+                            | - - - - - - - |
+                            |       3       |
+                            -----------------
+                            | 440000 |  0   |
+                            | - - - - - - - |
+                            |   10   |      |
+                            -----------------
+
+The `UVECTOR` `![-1 7 -4!]` looks like:
+
+    ---------------
+    | UVECTOR | 0 |
+    | - - - - - - |         -----------------
+    |   -3    | ----------->|       -1      |
+    ---------------         -----------------
+                            |        7      |
+                            -----------------
+                            |       -4      |
+                            -----------------
+                            | 40000+FIX | 0 |
+                            | - - - - - - - |
+                            |   5       |   |
+                            -----------------
+
+Atoms
+
+Internally, atoms are special vector-like objects. An atom contains a
+value cell (the first two words of the block, filled in whenever the
+global or local value of the `ATOM` is referenced and is not already
+there), an `OBLIST` pointer, and a print name (`PNAME`), in the
+following format:
+
+    ---------------------------------
+    |      type     |     bindid    |
+    ---------------------------------
+    |       pointer-to-value        |
+    ---------------------------------
+    |       pointer-to-oblist       |
+    ---------------------------------
+    |           print-name          |
+    /                               /
+    /                               /
+    |(ASCII with NUL padding on end)|
+    ---------------------------------
+    |      ATOM     |   valid-type  |
+    | - - - - - - - - - - - - - - - |
+    |     length    |       gc      |
+    ---------------------------------
+
+If the type field corresponds to `TYPE` `UNBOUND`, then the `ATOM` is
+locally and globally unbound. (This is different from a pair, where
+the same `TYPE` `UNBOUND` is used to mean unassigned.) If it
+corresponds to `TYPE` `LOCI` (an internal `TYPE`), then the value cell
+points either to the global stack, if `bindid` is zero, or to a local
+control stack, if `bindid` is non-zero. The `bindid` field is used to
+verify whether the local value pointed to by the value cell is valid
+in the current environment. The `pointer-to-OBLIST` is either a
+counting pointer to an oblist (uvector). a positive offset into the
+"transfer vector" (for pure `ATOM`s), or zero, meaning that this
+`ATOM` is not on an `OBLIST`. The `valid-type` field tells whether or
+not the `ATOM` represents a `TYPE` and if so the code for that `TYPE`:
+`grow` values are never needed for atoms.
+
+Associations
+
+Associations are also special vector-like objects. The first six words
+of the block contain `TYPE`/value pairs for the `ITEM`, `INDICATOR`
+and `AVALUE` of the `ASOC`. The next word contains forward and
+backward pointers in the chain for that bucket of the association hash
+table. The last word contains forward and backward pointers in the
+chain of all the associations.
+
+    ---------------------------------
+    |             ITEM              |
+    | - - - - - - - - - - - - - - - |
+    |             pair              |
+    ---------------------------------
+    |          INDICATOR            |
+    | - - - - - - - - - - - - - - - |
+    |             pair              |
+    ---------------------------------
+    |            AVALUE             |
+    | - - - - - - - - - - - - - - - |
+    |             pair              |
+    ---------------------------------
+    |     bucket-chain-pointers     |
+    ---------------------------------
+    |  association-chain-pointers   |
+    ---------------------------------
+    |      ASOC     |       0       |
+    | - - - - - - - - - - - - - - - |
+    |    12 octal   |       gc      |
+    ---------------------------------
+
+`PROCESS`es
+
+A `PROCESS` vector looks exactly like a vector of `TYPE`/value pairs.
+It is different only in that the garbage collector treats it
+differently from a normal vector, and it contains extremely volatile
+information when the `PROCESS` is `RUNNING`.
+
+Templates
+
+In a template, the number in the type field (left half or first dope
+word) identifies to which "storage allocation class" this `TEMPLATE`
+belongs, and it is used to find PDP-10 instructions in internal tables
+(frozen uvectors) for performing `LENGTH`, `NTH`, and `PUT` operations
+on any object of this `TYPE`. The programs to build these tables are
+not part of the interpreter, but the interpreter does know how to use
+them properly. The compiler can put these instructions directly in
+compiled programs if a `TEMPLATE` is never `REST`ed; otherwise it must
+let the interpreter discover the appropriate instruction. The value
+word of a template pair contains, not a counting pointer, but the
+number of elements that have been `REST`ed off in the left half and a
+pointer to the first dope word in the right half.
+
+The Control Stack
+-----------------
+
+Accumulators with symbolic names `AB`, `TB`, and `TP` are all pointers
+into the `RUNNING` `PROCESS`'s control stack. `AB` ("argument base")
+is a pointer to the arguments to the Subroutine now being run. It is
+set up by the Subroutine-call mediator, and its old value is always
+restored after a mediated Subroutine call returns. `TB` ("temporaries
+base") points to the frame for the running Subroutine and also serves
+as a stack base pointer. The `TB` pointer is really all that is
+necessary to return from a Subroutine -- given a value to return, for
+example by `ERRET` -- since the frame specifies the entire state of
+the calling routine. `TP` ("temporaries pointer") is the actual stack
+pointer and always points to the current top of the control stack.
+
+While we're on the subject of accumulators, we might as well be
+complete. Each accumulator contains the value word of a pair, the
+corresponding `TYPE` words residing in the `RUNNING` `PROCESS` vector.
+When a `PROCESS` is not `RUNNING` (or when the garbage collector is
+running), the accumulator contents are stored in the vector, so that
+the Objects they point to look like elements of the `PROCESS` and thus
+are not garbage-collectible.
+
+Accumulators `A`, `B`, `C`, `D`, `E` and `O` are used almost entirely
+as scratch accumulators, and they are not saved or restored across
+Subroutine calls. Of course the interrupt machinery always saves these
+and all other accumulators. `A` and `B` are used to return a pair as
+the value of a Subroutine call. Other than that special feature, they
+are just like the other scratch accumulators.
+
+`M` and `R` are used in running `RSUBR`s. `M` is always set up to
+point to the start of the `RSUBR`'s code, which is actually just a
+uniform vector of instructions. All jumps and other references to the
+code use `M` as an index register. This makes the code
+location-insensitive, which is necessary because the code uvector will
+move around. `R` is set up to point to the vector of objects needed by
+the `RSUBR`. This accumulator is necessary because objects in
+garbage-collected space can move around, but the pointers to them in
+the reference vector are always at the same place relative to its
+beginning.
+
+`FRM` is the internal frame pointer, used in compiled code to keep
+track of pending Subroutine calls when the control stack is heavily
+used. `P` is the internal-stack pointer, used primarily for internal
+calls in the interpreter.
+
+One of the nicest features of the MDL environment is the uniformity of
+the calling and returning sequence. All Subroutines -- both built-in
+F/SUBRs and compiled `RSUBR(-ENTRY)`s -- are called in exactly the
+same way and return the same way. Arguments are always passed on the
+control stack and results always end up in the same accumulators. For
+efficiency reasons, a lot of internal calls within the interpreter
+circumvent the calling sequence. However, all calls made by the
+interpreter when running user programs go through the standard calling
+sequence.
+
+A Subroutine call is initiated by one of three UUOs (PDP-10
+instructions executed by software rather than hardware). `MCALL` ("MDL
+call") is used when the number of arguments is known at assemble or
+compile time, and this number is less than 16. `QCALL` ("quick call")
+may be used if, in addition, an `RSUBR(-ENTRY)` is being called that
+can be called "quickly" by virtue of its having special information in
+its reference vector. `ACALL` ("accumulator call") is used otherwise.
+The general method of calling a Subroutine is to `PUSH` (a PDP-10
+instruction) pairs representing the arguments onto the control stack
+via `TP` and then either (1) `MCALL` or `QCALL` or (2) put the number
+of arguments into an accumulator and `ACALL`. Upon return the object
+returned by the Subroutine will be in accumulators `A` and `B`, and
+the arguments will have been `POP`ped off the control stack.
+
+The call mediator stores the contents of `P` and `TP` and the address
+of the calling instruction in the current frame (pointed to by `TB`).
+It also stores MDL's "binding pointer" to the topmost binding in the
+control stack. (The bindings are linked together through the control
+stack so that searching through them is more efficient than looking at
+every object on the stack.) This frame now specifies the entire state
+of the caller when the call occurred. The mediator then builds a new
+frame on the control stack and stores a pointer back to the caller's
+frame (the current contents of `TB`), a pointer to the Subroutine
+being called, and the new contents of `AB`, which is a counting
+pointer to the arguments and is computed from the information in the
+`MCALL` or `QCALL` instruction or the `ACALL` accumulator. `TB` is
+then set up to point to the new frame, and its left half is
+incremented by one, making a new `unique-id`. The mediator then
+transfers control to the Subroutine.
+
+A control stack frame has seven words as shown:
+
+    ---------------------------------
+    |     ENTRY     |  called-addr  |
+    ---------------------------------
+    |   unique-id   |  prev frame   |
+    ---------------------------------
+    |       argument pointer        |
+    ---------------------------------
+    |    saved binding pointer      |
+    ---------------------------------
+    |           saved P             |
+    ---------------------------------
+    |           saved TP            |
+    ---------------------------------
+    |    saved calling address      |
+    ---------------------------------
+
+The first three words are set up during the call to the Subroutine.
+The rest are filled in when this routine calls another Subroutine. The
+left half of `TB` is incremented every time a Subroutine call occurs
+and is used as the `unique-id` for the frame, stored in frame and
+tuple pairs as mentioned before. Obviously this `id` is not strictly
+unique, since each 256K calls it wraps around to zero. The right half
+of `TB` is always left pointing one word past the
+saved-calling-address word in the frame. `TP` is also left pointing at
+that word, since that is the top of the control stack at Subroutine
+entry. The arguments to the called Subroutine are below the frame on
+the control stack (at lower storage addresses), and the temporaries
+for the called Subroutine are above the frame (at higher storage
+addresses). These arguments and temporaries are just pairs stored on
+the control stack while needed: they are all that remain of
+`UNSPECIAL` values in compiled programs.
+
+The following figure shows what the control stack might look like
+after several Subroutine calls.
+
+    /               /
+    |               |
+    -----------------
+    |               |
+    |  args for S1  |
+    |               |
+    -----------------
+    | frame for S1  |
+    ----------------- <--
+    |               |   |
+    | temps for S1  |   |
+    |               |   |
+    -----------------   |
+    |               |   |
+    |  args for S2  |   |
+    |               |   |
+    -----------------   |
+    | frame for S2  | ---
+    ----------------- <------
+    |               |       |
+    | temps for S2  |       |
+    |               |       |
+    -----------------       |
+    |  args for S3  |       |
+    -----------------       |
+    | frame for S3  | -------
+    -----------------
+    |               |
+    | temps for S3  |
+    |               |
+    |               |
+    -----------------
+          (top)
+
+The above figure shows the frames all linked together through the
+control stack (the "execution path"), so that it is easy to return to
+the caller of a given Subroutine (`ERRET` or `RETRY`).
+
+Subroutine exit is accomplished simply by the call mediator, which
+loads the right half of `TB` from the previous frame pointer, restores
+the "binding pointer", `P`, and `TP`, and transfers control back to
+the instruction following the saved calling address.
+
+Variable Bindings
+-----------------
+
+All local `ATOM` values are kept on the control stack of the `PROCESS`
+to which they are local. As described before, the atom contains a word
+that points to the value on the control stack. The pointer is actually
+to a six-word "binding block" on the control stack. Binding blocks
+have the following format:
+
+    ---------------------------------
+    | BIND or UBIND |      prev     |
+    ---------------------------------
+    |        pointer to ATOM        |
+    ---------------------------------
+    |             value             |
+    | - - - - - - - - - - - - - - - |
+    |             pair              |
+    ---------------------------------
+    |     decl      |   unique-id   |
+    ---------------------------------
+    |       previous-binding        |
+    ---------------------------------
+
+where:
+
+-   `BIND` means this is a binding for a `SPECIAL` `ATOM` (the only
+    kind used by compiled programs), and `UBIND` means this is a
+    binding for an `UNSPECIAL` `ATOM` -- for `SPECIAL` checking by the
+    interpreter;
+-   `prev` points to the closest previous binding block for any `ATOM`
+    (the "access path" -- `UNWIND` objects are also linked in this
+    chain);
+-   `decl` points to a `DECL` associated with this value, for
+    `SET(LOC)` to check;
+-   `unique-id` is used for validation of this block; and
+-   `previous-binding` points to the closest previous binding for this
+    `ATOM` (used in unbinding).
+
+Bindings are generated by an internal subroutine called `SPECBIND`
+(name comes from `SPECIAL`). The caller to `SPECBIND` `PUSH`es
+consecutive six-word blocks onto the control stack via `TP` before
+calling `SPECBIND`. The first word of each block contains the `TYPE`
+code for `ATOM` in its left half and all ones in its right half.
+`SPECBIND` uses this bit pattern to identify the binding blocks.
+`SPECBIND`'s caller also fills in the next three words and leaves the
+last two words empty. `SPECBIND` fills in the rest and leaves the
+"binding pointer" pointing at the topmost binding on the control
+stack. `SPECBIND` also stores a pointer to the current binding in the
+value cell of the atom.
+
+Unbinding is accomplished during Subroutine return. When the previous
+frame is being restored, the call mediator checks to see if the saved
+"binding pointer" and the current one are different; if they are,
+`SPECSTORE` is called. `SPECSTORE` runs through the binding blocks,
+restoring old value pointers in atoms until the "binding pointer" is
+equal to the one saved in the frame.
+
+Obviously variable binding is more complicated than this, because
+`ATOM`s can have both local and global values and even different local
+values in different `PROCESS`es. The solution to all of these
+additional problems lies in the `bindid` field of the atom. Each
+`PROCESS` vector also contains a current `bindid`. Whenever an ATOM's
+local value is desired, the `RUNNING` `PROCESS`'s `bindid` is checked
+against that of the atom: if they are the same, the atom points to the
+current value; if not, the current `PROCESS`'s control stack must be
+searched to find a binding block for this `ATOM`. This binding scheme
+might be called "shallow binding". The searching is facilitated by
+having all binding blocks linked together. Accessing global variables
+is accomplished in a similar way, using a `VECTOR` that is referred to
+as the "global stack". The global stack has only an `ATOM` and a value
+slot for each variable, since global values never get rebound.
+
+`EVAL` with respect to a different environment causes some additional
+problems. Whenever this kind of `EVAL` is done, a brand new `bindid`
+is generated, forcing all current local value cells of atoms to appear
+invalid. Local values must now be obtained by searching the control
+stack, which is inefficient compared to just pulling them out of the
+atoms. (The greatest inefficiency occurs when an `ATOM`'s `LVAL` is
+never accessed twice in a row in the same environment.) A special
+block is built on the control stack and linked into the binding-block
+chain. This block is called a "skip block" or "environment splice",
+and it diverts the "access path" to the new environment, causing
+searches to become relative to this new environment.
+\ No newline at end of file