X-Git-Url: https://jxself.org/git/?a=blobdiff_plain;f=md%2Flanguage.md;h=01c68d890314f559bdc872eb5a6c723b5cd0e9e1;hb=40a6e235227eb90be38105289a8b3143bc77ee4d;hp=7f745b5089eb469dd787da038cca2b64e01bedb4;hpb=4310108f37019b602bcbf9e80acde4a184ee4796;p=mudman.git diff --git a/md/language.md b/md/language.md index 7f745b5..01c68d8 100644 --- a/md/language.md +++ b/md/language.md @@ -335,7 +335,7 @@ Chapter 2. Read, Evaluate, and Print Once you type `$` and all brackets are correctly paired and nested, the current contents of the input buffer go through processing by -three functions successively: first `READ`, whcih passes its output to +three functions successively: first `READ`, which passes its output to `EVAL` ("evaluate"), which passes its output to `PRINT`, whose output is typed on the terminal. @@ -413,7 +413,7 @@ Then `EVAL` noted that its input was of `TYPE` `FIX`. An object of undisturbed. Then `PRINT` saw that its input was of `TYPE` `FIX`, and printed on -the terminal the decimal characer representation of the corresponding +the terminal the decimal character representation of the corresponding integer. 2.4. Example (TYPE FLOAT) \[1\] @@ -583,9 +583,9 @@ brackets to be closed. In other cases, they will produce errors during met. Instead, the right-hand column will be used to state just what `READ` thought the input in the left-hand column really was. - ------------------------------------------------------------------------------------- + -------------------------------------------------------------------------------- Input Explanation - --------------------------- --------------------------------------------------------- + --------------------------- ---------------------------------------------------- `ABC$` an `ATOM` of `PNAME` `ABC` `abc$` an `ATOM` of `PNAME` `abc` @@ -605,12 +605,12 @@ met. Instead, the right-hand column will be used to state just what `ONE`. `ab(cd$` an `ATOM` of `PNAME` `ab`, followed by the start of - something else (The something else will contain an `ATOM` - of `PNAME` beginning `cd.`) + something else (The something else will contain an + `ATOM` of `PNAME` beginning `cd.`) - `12345A34$` an `ATOM` of `PNAME` `12345A35` (If the A had been an E, - the object would have been a `FLOAT`.) - ------------------------------------------------------------------------------------- + `12345A34$` an `ATOM` of `PNAME` `12345A35` (If the A had been + an E, the object would have been a `FLOAT`.) + -------------------------------------------------------------------------------- #### 2.6.3.3.  (Backslash) in ATOMs @@ -639,19 +639,19 @@ non-standard, this time not because anything is unfinished or in error, but because commenting is needed: `PRINT` doesn't do it full justice. - --------------------------------------------------------------------------------- + --------------------------------------------------------------------------- Input Explanation - ------------------------ -------------------------------------------------------- + ------------------------ -------------------------------------------------- `a\ one\ and\ a\ two$` one `ATOM`, whose `PNAME` has four spaces in it - `1234\56789$` an `ATOM` of `PNAME` `123456789`, which `PRINT`s as - `\1233456789` + `1234\56789$` an `ATOM` of `PNAME` `123456789`, which `PRINT`s + as `\1233456789` `123\ $` an `ATOM` of `PNAME` `123space`, which `PRINT`s as `\123\`, with a space on the end `\\$` an `ATOM` whose `PNAME` is a single backslash - --------------------------------------------------------------------------------- + --------------------------------------------------------------------------- Chapter 3. Built-in Functions ============================= @@ -1142,7 +1142,7 @@ really is changed while `EVAL`uating the body of a `FUNCTION`: 5 The first number after the application `FORM` was typed out by the -`PRINT`; the second is the value of the applcation. +`PRINT`; the second is the value of the application. Remembering that `LVAL`s of `ATOM`s **not** in argument `LIST`s are not changed, we can reference them within `FUNCTION`s, as in @@ -1891,7 +1891,7 @@ archaic kind of `UVECTOR` that is not garbage-collected.\] "characters" represents a `STRING` of ASCII text. A `STRING` containing the -chatacter `"` (double-quote) is represented by placing a `\` +character `"` (double-quote) is represented by placing a `\` (backslash) before the double-quote inside the `STRING`. A `\` in a `STRING` is represented by two consecutive backslashes. @@ -3200,7 +3200,7 @@ confusion inevitably results. The indicator `"ARGS"` can appear in an argument `LIST` with precisely the same syntax as `"TUPLE"`. However, `"ARGS"` causes the `ATOM` -following it to be bound to a `LIST` of the remaining **unevaluted** +following it to be bound to a `LIST` of the remaining **unevaluated** arguments. `"ARGS"` does not cause any copying to take place. It simply gives you @@ -4131,30 +4131,30 @@ used on the `CHANNEL`, and whether or not the *device* is a terminal. The following table tells which `SUBR`s can be used with which modes, where `OK` indicates an allowed use: - ------------------------------------------------------------------------------------------------------------------------------------------------------- - "READ" "PRINT" "READB" "PRINTB", "PRINTO" mode / SUBRs - -------------------- ---------------------- ---------------------- ---------------------------------------------- ------------------------------------- - OK OK `READ` `READCHR` `NEXTCHR` - `READSTRING` `FILECOPY` - `FILE-LENGTH LOAD` + ------------------------------------------------------------------------------------------------------------------------------------------- + "READ" "PRINT" "READB" "PRINTB", "PRINTO" mode / SUBRs + -------------------- -------------------- -------------------- ------------------------------------------ --------------------------------- + OK OK `READ` `READCHR` `NEXTCHR` + `READSTRING` `FILECOPY` + `FILE-LENGTH LOAD` - OK OK\* `PRINT` `PRIN1` `PRINC` `IMAGE` - `CRLF` `TERPRI` `FILECOPY` - `PRINTSTRING` `BUFOUT` `NETS` - `RENAME` + OK OK\* `PRINT` `PRIN1` `PRINC` `IMAGE` + `CRLF` `TERPRI` `FILECOPY` + `PRINTSTRING` `BUFOUT` `NETS` + `RENAME` - OK `READB` `GC-READ` + OK `READB` `GC-READ` - OK `PRINTB` `GC-DUMP` + OK `PRINTB` `GC-DUMP` - OK OK OK `ACCESS` + OK OK OK `ACCESS` - OK OK OK OK `RESET` + OK OK OK OK `RESET` - OK OK `ECHOPAIR` + OK OK `ECHOPAIR` - OK `TTYECHO` `TYI` - ------------------------------------------------------------------------------------------------------------------------------------------------------- + OK `TTYECHO` `TYI` + ------------------------------------------------------------------------------------------------------------------------------------------- `*` PRINTing (or `PRIN1`ing) an `RSUBR` (chapter 19) on a `"PRINTB"` or `"PRINTO"` `CHANNEL` has special effects. @@ -4166,7 +4166,7 @@ not used with terminals. `"READ"` is the mode used by default. The next one to four arguments to `OPEN` specify the file involved. If only one `STRING` is used, it can contain the entire specification, according to standard operating-system syntax. Otherwise, the -string(s) are intepreted as follows: +string(s) are interpreted as follows: *name1* is the first file name, that part to the left of the space (in the ITS version) or period (in the Tenex and Tops-20 versions). The @@ -4277,52 +4277,52 @@ each element, and an interpretation. The format used is the following: *element-number: type interpretation* - --------------------------------------------------------------------------------------------------------- - element-number type interpretation - --------------------------- ------------------ ---------------------------------------------------------- - -1 `LIST` transcript channel(s) (see below) + ------------------------------------------------------------------------------------------------ + element-number type interpretation + ------------------------- ------------------ --------------------------------------------------- + -1 `LIST` transcript channel(s) (see below) - \* 0 varies device-dependent information + \* 0 varies device-dependent information - \* 1 `FIX` channel number (ITS) or JFN (Tenex and Tops-20), `0` for - internal or closed + \* 1 `FIX` channel number (ITS) or JFN (Tenex and Tops-20), + `0` for internal or closed - \* 2 `STRING` mode + \* 2 `STRING` mode - \* 3 `STRING` first file name argument + \* 3 `STRING` first file name argument - \* 4 `STRING` second file name argument + \* 4 `STRING` second file name argument - \* 5 `STRING` device name argument + \* 5 `STRING` device name argument - \* 6 `STRING` directory name argument + \* 6 `STRING` directory name argument - \* 7 `STRING` real first file name + \* 7 `STRING` real first file name - \* 8 `STRING` real second file name + \* 8 `STRING` real second file name - \* 9 `STRING` real device name + \* 9 `STRING` real device name - \* 10 `STRING` real directory name + \* 10 `STRING` real directory name - \* 11 `FIX` various status bits + \* 11 `FIX` various status bits - \* 12 `FIX` PDP-10 instruction used to do one I/O operation + \* 12 `FIX` PDP-10 instruction used to do one I/O operation - 13 `FIX` number of characters per line of output + 13 `FIX` number of characters per line of output - 14 `FIX` current character position on a line + 14 `FIX` current character position on a line - 15 `FIX` number of lines per page + 15 `FIX` number of lines per page - 16 `FIX` current line number on a page + 16 `FIX` current line number on a page - 17 `FIX` access pointer for file-oriented devices + 17 `FIX` access pointer for file-oriented devices - 18 `FIX` radix for `FIX` conversion + 18 `FIX` radix for `FIX` conversion - 19 `FIX` sink for an internal `CHANNEL` - --------------------------------------------------------------------------------------------------------- + 19 `FIX` sink for an internal `CHANNEL` + ------------------------------------------------------------------------------------------------ N.B.: The elements of a `CHANNEL` below number 1 are usually invisible but are obtainable via ` fix>`, for some appropriate @@ -4573,8 +4573,8 @@ execution upon `RESTORE`ation. eventually returns `"DONE"`. First, however, it `READ`s and `EVAL`s every Muddle object in the file pointed to by *input*, and then -`CLOSE`s *input*. Any occurrences of rubout, ^@,\ ^D, \^L, etc., in -the file are given no special meaning; they are simply `ATOM` +`CLOSE`s *input*. Any occurrences of rubout, ^@, ^D, \^L, etc., in the +file are given no special meaning; they are simply `ATOM` constituents. *look-up* is optional, used to specify a `LIST` of `OBLIST`s for the @@ -4739,8 +4739,8 @@ available, that is, when `READCHR` would return `-1`. returns its first argument, after making the two `CHANNEL`s "know -about each other" so that rubout, ^@,\ ^D and \^L on *terminal-in* -will cause the appropriate output on *terminal-out*. +about each other" so that rubout, ^@, ^D and \^L on *terminal-in* will +cause the appropriate output on *terminal-out*. ### 11.8.2. TTYECHO @@ -4935,7 +4935,7 @@ optional, `1` by default. The exact `TYPE` of the locative returned depends on the `PRIMTYPE` of *structured*: `LOCL` for `LIST`, `LOCV` for `VECTOR`, `LOCU` for `UVECTOR`, `LOCS` for `STRING`, `LOCB` for `BYTES`, `LOCT` for `TEMPLATE`, and `LOCA` for `TUPLE`. If *N* is -greated than `` or less than `1`, or an `OFFSET` +greater than `` or less than `1`, or an `OFFSET` with a Pattern that doesn't match *structured*, an error occurs. The locative is unaffected by applications of `REST`, `BACK`, `TOP`, `GROW`, etc. to *structured*. @@ -5791,7 +5791,7 @@ returns an object of `TYPE` and `PRIMTYPE` `OFFSET`. An `OFFSET`, like a `FIX`, may be given as an argument to `NTH` or `PUT` and may be applied to arguments. The only difference is that the `STRUCTURED` argument must match the Pattern contained in the `OFFSET`, or an error -will resuly. Thus: +will result. Thus: >>$ %> @@ -7014,3 +7014,2850 @@ the left; if *amount* is negative, rotation is to the right. Examples: #WORD *000000001000* $ #WORD *100000000000* + +Chapter 19. Compiled Programs +============================= + +19.1. RSUBR (the TYPE) +---------------------- + +`RSUBR`s ("relocatable subroutines") are machine-language programs +written to run in the Muddle environment. They are usually produced by +the Muddle assembler (often from output produced by the compiler) +although this is not necessary. All `RSUBR`s have two components: the +"reference vector" and the "code vector". In some cases the code +vector is in pure storage. There is also a set of "fixups" associated +with every `RSUBR`, although it may not be available in the running +Muddle. + +19.2. The Reference Vector +-------------------------- + +An `RSUBR` is basically a `VECTOR` that has been `CHTYPE`d to `TYPE` +`RSUBR` via the `SUBR` `RSUBR` (see below). This ex-`VECTOR` is the +reference vector. The first three elements of the reference vector +have predefined meanings: + +- The first element is of `TYPE` `CODE` or `PCODE` and is the impure + or pure code vector respectively. +- The second element is an `ATOM` and specifies the name of the + `RSUBR`. +- The third element is of `TYPE` `DECL` and declares the + type/structure of the `RSUBR`'s arguments and result. + +The rest of the elements of the reference vector are objects in +garbage-collected storage that the `RSUBR` needs to reference and any +impure slots that the `RSUBR` needs to use. + +When the `RSUBR` is running, one of the PDP-10 accumulators (with +symbolic name `R`) is always pointing to the reference vector, to +permit rapid access to the various elements. + +19.3. RSUBR Linking +------------------- + +`RSUBR`s can call any `APPLICABLE` object, all in a uniform manner. In +general, a call to an F/SUBR is linked up at assembly/compile time so +that the calling instruction (UUO) points directly at the code in the +interpreter for the F/SUBR. However, the locations of most other +`APPLICABLE`s are not known at assembly/compile time. Therefore, the +calling UUO is set up to point at a slot in the reference vector (by +indexing off accumulator `R`). This slot initially contains the `ATOM` +whose G/LVAL is the called object. The calling mechanism (UUO handler) +causes control to be transferred to the called object and, depending +on the state of the `RSUBR`-link flag, the `ATOM` will be replaced by +its G/LVAL. (If the call is of the "quick" variety, the called `RSUBR` +or `RSUBR-ENTRY` will be `CHTYPE`d to a `QUICK-RSUBR` or +`QUICK-ENTRY`, respectively, before replacement.) Regardless of the +`RSUBR`-link flag's state, calls to `FUNCTION`s are never permanently +linked. A call to a non-Subroutine generates an extra `FRAME`, whose +`FUNCT` is the dummy `ATOM` `CALLER`. + +`RSUBR`s are linked together for faster execution, but linking may not +be desirable if the `RSUBR`s are being debugged, and various revisions +are being re-loaded. A linked call will forever after go to the same +code, regardless of the current G/LVAL of the called `ATOM`. Thus, +while testing `RSUBR`s, you may want to disable linking, by calling +the `RSUBR-LINK` `SUBR` with a `FALSE` argument. Calling it with a +non-`FALSE` argument enables linking thereafter. It returns the +previous state of the link flag, either `T` or `#FALSE ()`. Calling it +with no argument returns the current state. + +19.4. Pure and Impure Code +-------------------------- + +The first element of an `RSUBR` is the code vector, of `TYPE` `CODE` +or `PCODE`. `TYPE` `CODE` is of `PRIMTYPE` `UVECTOR`, and the `UTYPE` +should be of `PRIMTYPE` `WORD`. The code vector is simply a block of +words that are the instructions which comprise the `RSUBR`. Since the +code vector is stored just like a standard `UVECTOR`, it will be moved +around by the garbage collector. Therefore, all `RSUBR` code is +required to be location-insensitive. The compiler guarantees the +location-insensitivity of its output. The assembler helps to make the +code location-insensitive by defining all labels as offsets relative +to the beginning of the code vector and causing instructions that +refer to labels to index automatically off the PDP-10 accumulator +symbolically named `M`. `M`, like `R`, is set up by the UUO handler, +but it points to the code vector instead of the reference vector. The +code vector of an `RSUBR` can be frozen (using the `FREEZE` `SUBR`) to +prevent it from moving during debugging by DDT in the superior +operating-system process. + +If the first element of an `RSUBR` is of `TYPE` `PCODE` ("pure code"), +the code vector of the `RSUBR` is pure and sharable. `TYPE` `PCODE` is +of `PRIMTYPE` `WORD`. The left half of the word specifies an offset +into an internal table of pure `RSUBR`s, and the right half specifies +an offset into the block of code where this `RSUBR` starts. The +`PCODE` prints out as: + + % + +where *name* names the entry in the user's pure-`RSUBR` table, and +*offset* is the offset. (Obviously, `PCODE` is also the name of a +`SUBR`, which generates a pure code vector.) Pure `RSUBR`s may also +move around, but only by being included in Muddle's page map at +different places. Once again `M` can be used exactly as before to do +location-independent address referencing. Individual pure code vectors +can be "unmapped" (marked as being not in primary storage but in their +original pure-code disk files) if the space in storage allocated for +pure code is exhausted. An unmapped `RSUBR` is mapped in again +whenever needed. All pure `RSUBR`s are unmapped before a `SAVE` file +is written, so that the code is not duplicated on disk. A purified +`RSUBR` must use `RGLOC` ("relative GLOC") instead of `GLOC`. `RGLOC` +produces objects of `TYPE` `LOCR` instead of `LOCD`. + +19.5. TYPE-C and TYPE-W +======================= + +In order to handle user `NEWTYPE`s reasonably, the internal `TYPE` +codes for them have to be able to be different from one Muddle run to +another. Therefore, references to the `TYPE` codes must be in the +reference vector rather than the code vector. To help handle this +problem, two `TYPE`s exist, `TYPE-C` ("type code") and `TYPE-W` ("type +word"), both of `PRIMTYPE` `WORD`. They print as: + + % + % + +The `SUBR` `TYPE-C` produces an internal `TYPE` code for the *type*, +and `TYPE-W` produces a prototype "`TYPE` word" (appendix 1) for an +object of that `TYPE`. The *primtype* argument is optional, included +only as a check against the call to `NEWTYPE`. `TYPE-W` can also take +a third argument, of `PRIMTYPE` `WORD`, whose right half is included +in the generated "`TYPE` word". If *type* is not a valid `TYPE`, a +`NEWTYPE` is automatically done. + +To be complete, a similar `SUBR` and `TYPE` should be mentioned here. + + + +produces an internal "storage allocation code" (appendix 1) for the +*type*. The value is of `TYPE` `PRIMTYPE-C`, `PRIMTYPE` `WORD`. In +almost all cases the `SUBR` `TYPEPRIM` gives just as much information, +except in the case of `TEMPLATE`s: all `TYPE`s of `TEMPLATE`s have the +same `TYPEPRIM`, but they all have different `PRIMTYPE-C`s. + +19.6. RSUBR (the SUBR) +---------------------- + + + +`CHTYPE`s its argument to an `RSUBR`, after checking it for legality. +`RSUBR` is rarely called other than in the Muddle Assembler (Lebling, +1979). It can be used if changes must be made to an `RSUBR` that are +prohibited by Muddle's built-in safety mechanisms. For example, if the +`GVAL` of *name* is an `RSUBR`: + + >$ + [...] + + ...(changes to .FIXIT)... + + >$ + #RSUBR [...] + +19.7. RSUBR-ENTRY +----------------- + +`RSUBR`s can have multiple entry points. An `RSUBR-ENTRY` can be +applied to arguments exactly like an `RSUBR`. + + + +returns the `VECTOR` argument `CHTYPE`d to an `RSUBR-ENTRY` into the +*rsubr* at the specified *offset*. If the `RSUBR-ENTRY` is to have a +`DECL` (`RSUBR` style), it should come as shown. + + + +("entry location") returns the *offset* into the `RSUBR` of this +entry. + +19.8. RSUBRs in Files +--------------------- + +There are three kinds of files that can contain `RSUBR`s, identified +by second names `BINARY`, `NBIN` and `FBIN`. There is nothing magic +about these names, but they are used by convention. + +A `BINARY` file is a completely ASCII file containing complete impure +`RSUBR`s in character representation. Even a code vector appears as +`#CODE` followed by a `UVECTOR` of `PRIMTYPE` `WORD`s. `BINARY` files +are generally slow to load, because of all the parsing that must be +done. + +An `NBIN` file contains a mixture of ASCII characters and binary code. +The start of a binary portion is signalled to `READ` by the character +control-C, so naive readers of an `NBIN` file under ITS may +incorrectly assume that it ends before any binary code appears. An +`NBIN` file cannot be edited with a text editor. An `RSUBR` is written +in `NBIN` format by being `PRINT`ed on a `"PRINTB"` `CHANNEL`. The +`RSUBR`s in `NBIN` files are not purified either. + +An `FBIN` file is actually part of a triad of files. The `FBIN` +file(s) itself is the impure part of a collection of purified +`RSUBR`s. It is simply ASCII and can be edited at will. (Exception: in +the ITS and Tops-20 versions, the first object in the file should not +be removed or changed in any way, lest a "grim reaper" program for +`FBIN` files think that the other files in the triad are obsolete and +delete them.) The pure code itself resides (in the ITS and Tops-20 +versions) in a special large file that contains all currently-used +pure code, or (in the Tenex version) in a file in a special disk +directory with first name the same as the *name* argument to `PCODE` +for the `RSUBR`. The pure-code file is page-mapped directly into +Muddle storage in read-only mode. It can be unmapped when the pure +storage must be reclaimed, and it can be mapped at a different storage +address when pure storage must be compacted. There is also a "fixup" +file (see below) or portion of a file associated with the `FBIN` to +round out the triad. + +An initial Muddle can have pure `RSUBR`s in it that were "loaded" +during the initialization procedure. The files are not page-mapped in +until they are actually needed. The "loading" has other side effects, +such as the creation of `OBLIST`s (chapter 15). Exactly what is +pre-loaded is outside the scope of this document. + +19.9. Fixups +------------ + +The purpose of "fixups" is to correct references in the `RSUBR` to +parts of the interpreter that change from one release of Muddle to the +next. The reason the fixups contain a release number is so that they +can be completely ignored when an `RSUBR` is loaded into the same +release of Muddle as that from which it was last written out. + +There are three forms of fixups, corresponding to the three kinds of +`RSUBR` files. ASCII `RSUBR`s, found in `BINARY` files, have ASCII +fixups. The fixups are contained in a `LIST` that has the following +format: + + (Muddle-release:fix + name:atom value:fix (use:fix use:fix ...) + name:atom value:fix (use:fix use:fix ...) + ...) + +The fixups in `NBIN` files and the fixup files associated with `FBIN` +files are in a fast internal format that looks like a `UVECTOR` of +`PRIMTYPE` `WORD`s. + +Fixups are usually discarded after they are used during the loading +procedure. However, if, while reading a `BINARY` or `NBIN` file the +`ATOM` `KEEP-FIXUPS!-` has a non-`FALSE` `LVAL`, the fixups will be +kept, via an association between the `RSUBR` and the `ATOM` `RSUBR`. +It should be noted that, besides correcting the code, the fixups +themselves are corrected when `KEEP-FIXUPS` is bound and true. Also, +the assembler and compiler make the same association when they first +create an `RSUBR`, so that it can be written out with its fixups. + +In the case of pure `RSUBR`s (`FBIN` files), things are a little +different. If a pure-code file exists for this release of Muddle, it +is used immediately, and the fixups are completely ignored. If a +pure-code file for this release doesn't exist, the fixup file is used +to create a new copy of the file from an old one, and also a new +version of the fixup file is created to go with the new pure-code +file. This all goes on automatically behind the user's back. + +Chapter 20. Coroutines +====================== + +This chapter purports to explain the coroutine primitives of Muddle. +It does make some attempt to explain coroutines as such, but only as +required to specify the primitives. If you are unfamiliar with the +basic concepts, confusion will probably reign. + +A coroutine in Muddle is implemented by an object of `TYPE` `PROCESS`. +In this manual, this use of the word "process" is distinguished by a +capitalization from its normal use of denoting an operating-system +process (which various systems call a process, job, fork, task, etc.). + +Muddle's built-in coroutine primitives do not include a "time-sharing +system". Only one `PROCESS` is ever running at a time, and control is +passed back and forth between `PROCESS`es on a coroutine-like basis. +The primitives are sufficient, however, to allow the writing of a +"time-sharing system" **in Muddle**, with the additional use of the +Muddle interrupt primitives. This has, in fact, been done. + +20.1. PROCESS (the TYPE) +------------------------ + +A `PROCESS` is an object which contains the "current state" of a +computation. This includes the `LVAL`s of `ATOM`s ("bindings"), +"depth" of functional application, and "position" within the +application of each applied function. Some of the things which are +**not** part of any specific `PROCESS` are the `GVAL`s of `ATOM`s, +associations (`ASOC`s), and the contents of `OBLIST`s. `GVAL`s (with +`OBLIST`s) are a chief means of communication and sharing between +`PROCESS`es (all `PROCESS`es can refer to the `SUBR` which is the +`GVAL` of `+`, for instance.) Note that an `LVAL` in one `PROCESS` +cannot easily be directly referenced from another `PROCESS`. + +A `PROCESS` `PRINT`s as `#PROCESS` *p*, where *p* is a `FIX` which +uniquely identifies the `PROCESS`; *p* is the "`PROCESS` number" typed +out by `LISTEN`. A `PROCESS` cannot be read in by `READ`. + +The term "run a `PROCESS`" will be used below to mean "perform some +computation, using the `PROCESS` to record the intermediate state of +that computation". + +N.B.: A `PROCESS` is a rather large object; creating one will often +cause a garbage collection. + +20.2. STATE of a PROCESS +------------------------ + + + +returns an `ATOM` (in the `ROOT` `OBLIST`) which indicates the "state" +of the `PROCESS` *process*. The `ATOM`s which `STATE` can return, and +their meanings, are as follows: + +- `RUNABLE` (sic) -- *process* has never ever been run. +- `RUNNING` -- *process* is currently running, that is, it did the + application of `STATE`. +- `RESUMABLE` -- *process* has been run, is not currently running, + and can run again. +- `DEAD` -- *process* has been run, but it can **not** run again; it + has "terminated". + +In addition, an interrupt (chapter 21) can be enabled to detect the +time at which a `PROCESS` becomes "blocked" (waiting for terminal +input) or "unblocked" (terminal input arrived). (The `STATE` `BLOCKED` +has not been implemented.) + +20.3. PROCESS (the SUBR) +------------------------ + + + +creates and returns a new `PROCESS` but does **not** run it; the +`STATE` of the returned `PROCESS` is `RUNABLE` (sic). + +*starter* is something applicable to **one** argument, which must be +evaluated. *starter* is used both in starting and "terminating" a +`PROCESS`. In particular, if the *starter* of a `PROCESS` **ever** +returns a value, that `PROCESS` becomes `DEAD`. + +20.4. RESUME +------------ + +The `SUBR` `RESUME` is used to cause a computation to start or to +continue running in another `PROCESS`. An application of `RESUME` +looks like this: + + + +where *retval* is the "returned value" (see below) of the `PROCESS` +that does the `RESUME`, and *process* is the `PROCESS` to be started +or continued. + +The *process* argument to `RESUME` is optional, by default the last +`PROCESS`, if any, to `RESUME` the `PROCESS` in which this `RESUME` is +applied. If and when the current `PROCESS` is later `RESUME`d by +another `PROCESS`, that `RESUME`'s *retval* is returned as the value +of this `RESUME`. + +20.5. Switching PROCESSes +------------------------- + +### 20.5.1. Starting Up a New PROCESS + +Let us say that we are running in some `PROCESS`, and that this +original `PROCESS` is the `GVAL` of `P0`. Somewhere, we have evaluated + + > + +where `,STARTER` is some appropriate function. Now, **in `,P0`** we +evaluate + + + +and the following happens: + +1. **In `,P0`** the arguments of the `RESUME` are evaluated: that is, + we get that `LVAL` of `A` which is current in `,P0` and the `GVAL` + of `P1`. +2. The `STATE` of `,P0` is changed to `RESUMABLE` and `,P0` is + "frozen" right where it is, in the middle of the `RESUME`. +3. The `STATE` of `,P1` is changed to `RUNNING`, and `,STARTER` is + applied to `,P0`'s `LVAL` of `A` **in `,P1`**. `,P1` now continues + on its way, evaluating the body of `,STARTER.` + +The `.A` in the `RESUME` could have been anything, of course. The +important point is that, whatever it is, it is evaluated in `,P0`. + +What happens next depends, of course, on what `,STARTER` does. + +### 20.5.2. Top-level Return + +Let us initially assume that `,STARTER` does nothing relating to +`PROCESS`es, but instead simply returns a value -- say *starval*. What +happens when `,STARTER` returns is this: + +1. The `STATE` of `,P1` is changed to `DEAD`. `,P1` can never again + be `RESUME`d. +2. The last `PROCESS` to `RESUME` `,P1` is found, namely `,P0`, and + its `STATE` is changed to `RUNNING`. +3. *starval* is returned in `,P0` as the value of the original + `RESUME`, and `,P0` continues where it left off. + +All in all, this simple case looks just like an elaborate version of +applying `,STARTER` to `.A` in `,P0`. + +### 20.5.3. Symmetric RESUMEing + +Now suppose that while still in `,P1`, the following is evaluated, +either in `,STARTER` or in something called by `,STARTER`: + + + +This is what happens: + +1. The arguments of the `RESUME` are evaluated **in `,P1`**. +2. The `STATE` of `,P1` is changed to `RESUMABLE`, and `,P1` is + "frozen" right in the middle of the `RESUME`. +3. The `STATE` of `,P0` is changed to `RUNNING`, and `,P1`'s `LVAL` + of `BAR` is returned as the value of **`,P0'`s** original `RESUME` + `,P0` then continues right where it left off. + +This is **the** interesting case, because `,P0` can now do **another** +`RESUME` of `,P1`; this will "turn off" `,P0`, pass a value to `,P1` +and "turn on" `,P1`. `,P1` can now again `RESUME` `,P0`. which can +`RESUME` `,P1` back again, etc. **ad nauseam**, with everything done +in a perfectly symmetric manner. This can obviously also be done with +three or more `PROCESS`es in the same manner. + +Note how this differs from normal functional application: you cannot +"return" from a function without destroying the state that function is +in. The whole point of `PROCESS`es is that you can "return" +(`RESUME`), remembering your state, and later continue where you left +off. + +20.6. Example +------------- + + ;"Initially, we are in LISTEN in some PROCESS. + ) + ) + >> + >> + >>>$ + SUM3 + ;"SUM3, used as the startup function of another PROCESS, + gets RESUMEd with numbers. It returns the sum of the last + three numbers it was given every third RESUME." + >$ + ;"Now we start SUMUP and give SUM3 its three numbers." + $ + "GOT 1" + $ + "GOT 2" + $ + 8 + +Just as a note, by taking advantage of Muddle's order of evaluation, +SUM3 could be have been written as: + + ) + >>>>> + +20.7. Other Coroutining Features +-------------------------------- + +### 20.7.1. BREAK-SEQ + + + +("break evaluation sequence") returns *process*, which must be +`RESUMABLE`, after having modified it so that when it is next +`RESUME`d, it will **first** evaluate *any* and **then** do an +absolutely normal `RESUME`; the value returned by any is thrown away, +and the value given by the `RESUME` is used normally. + +If a `PROCESS` is `BREAK-SEQ`ed more than once between `RESUME`s, +**all** of the *any*s `BREAK-SEQ`ed onto it will be remembered and +evaluated when the `RESUME` is finally done. The *any*s will be +evaluated in "last-in first-out" order. The `FRAME` generated by +`EVAL`ing more than one *any* will have as its `FUNCT` the dummy +`ATOM` `BREAKER`. + +### 20.7.2. MAIN + +When you initially start up Muddle, the `PROCESS` in which you are +running is slightly "special" in these two ways: + +1. Any attempt to cause it become `DEAD` will be met with an error. +2. `
` always returns that `PROCESS`. + +The `PROCESS` number of `
` is always `1`. The initial `GVAL` of +`THIS-PROCESS` is what `MAIN` always returns, `#PROCESS 1`. + +### 20.7.3. ME + + + +returns the `PROCESS` in which it is evaluated. The `LVAL` of +`THIS-PROCESS` in a `RUNABLE` (new) `PROCESS` is what `ME` always +returns. + +### 20.7.4. RESUMER + + + +returns the `PROCESS` which last `RESUME`d *process*. If no `PROCESS` +has ever `RESUME`d process, it returns `#FALSE ()`. *process* is +optional, `` by default. Note that `
` does not ever have any +resumer. Example: + + )) ;"not effective in
" + #DECL ((R) ) + RESUMABLE> + >> + +### 20.7.5. SUICIDE + + + +acts just like `RESUME`, but clobbers the `PROCESS` (which cannot be +`
`) in which it is evaluated to the `STATE` `DEAD`. + +### 20.7.6. 1STEP + + <1STEP process> + +returns *process*, after putting it into "single-step mode". + +A `PROCESS` in single-step mode, whenever `RESUME`d, runs only until +an application of `EVAL` in it begins or finishes. At that point in +time, the `PROCESS` that did the `1STEP` is `RESUME`d, with a *retval* +which is a `TUPLE`. If an application of `EVAL` just began, the +`TUPLE` contains the `ATOM` `EVLIN` and the arguments to `EVAL`. If an +application of `EVAL` just finished, the `TUPLE` contains the `ATOM` +`EVLOUT` and the result of the evaluation. + +*process* will remain in single-step mode until `FREE-RUN` (below) is +applied to it. Until then, it will stop before and after each `EVAL` +in it. Exception: if it is `RESUME`d from an `EVLIN` break with a +*retval* of `TYPE` `DISMISS` (`PRIMTYPE` `ATOM`), it will leave +single-step mode only until the current call to EVAL is about to +return. Thus lower-level `EVAL`s are skipped over without leaving the +mode. The usefulness of this mode in debugging is obvious. + +### 20.7.7. FREE-RUN + + + +takes its argument out of single-step mode. Only the `PROCESS` that +put *process* into single-step mode can take it out of the mode; if +another `PROCESS` tries, `FREE-RUN` returns a `FALSE`. + +20.8. Sneakiness with PROCESSes +------------------------------- + +`FRAME`s, `ENVIRONMENT`s, `TAG`s, and `ACTIVATION`s are specific to +the `PROCESS` which created them, and each "knows its own father". +**Any** `SUBR` which takes these objects as arguments can take one +which was generated by **any** `PROCESS`, no matter where the `SUBR` +is really applied. This provides a rather sneaky means of crossing +between `PROCESS`es. The various cases are as follows: + +`GO`, `RETURN`, `AGAIN`, and `ERRET`, given arguments which lie in +another `PROCESS`, each effectively "restarts" the `PROCESS` of its +argument and acts as if it were evaluated over there. If the `PROCESS` +in which it was executed is later `RESUME`d, it **returns** a value +just like `RESUME`! + +`SET`, `UNASSIGN`, `BOUND?`, `ASSIGNED?`, `LVAL`, `VALUE`, and `LLOC`, +given optional `ENVIRONMENT` arguments which lie in another `PROCESS`, +will gleefully change, or return, the local values of `ATOM`s in the +other `PROCESS`. The optional argument can equally well be a +`PROCESS`, `FRAME`, or `ACTIVATION` in another `PROCESS`; in those +cases, each uses the `ENVIRONMENT` which is current in the place +specified. + +`FRAME`, `ARGS`, and `FUNCT` will be glad to return the `FRAME`s, +argument `TUPLE`s, and applied Subroutine names of another `PROCESS`. +If one is given a `PROCESS` (including ``) as an argument instead +of a `FRAME`, it returns all or the appropriate part of the topmost +`FRAME` on that `PROCESS`'s control stack. + +If `EVAL` is applied in `PROCESS` `P1` with an `ENVIRONMENT` argument +from a `PROCESS` `P2`, it will do the evaluation **in `P1`** but with +`P2`'s `ENVIRONMENT` (!). That is, the other `PROCESS`'s `LVAL`s, etc. +will be used, but (1) any **new** `FRAME`s needed in the course of the +evaluation will be created in `P1`; and (2) **`P1`** will be `RUNNING` +-- not `P2`. Note the following: if the `EVAL` in `P1` eventually +causes a `RESUME` of `P2`, `P2` could functionally return to below the +point where the `ENVIRONMENT` used in `P1` is defined; a `RESUME` of +`P1` at this point would cause an `ERROR` due to an invalid +`ENVIRONMENT`. (Once again, `LEGAL?` can be used to forestall this.) + +20.9. Final Notes +----------------- + +1. A `RESUMABLE` `PROCESS` can be used in place of an `ENVIRONMENT` + in any application. The "current" `ENVIRONMENT` of the `PROCESS` + is effectively used. +2. `FRAME`s and `ENVIRONMENT`s can be `CHTYPE`d arbitrarily to one + another, or an `ACTIVATION` can be `CHTYPE`d to either of them, + and the result "works". Historically, these different `TYPE`s were + first used with different `SUBR`s -- `FRAME` with `ERRET`, + `ENVIRONMENT` with `LVAL`, `ACTIVATION` with `RETURN` -- hence the + invention of different `TYPE`s with similar properties. +3. Bugs in multi-`PROCESS` programs usually exhibit a degree of + subtlety and nastiness otherwise unknown to the human mind. If + when attempting to work with multiple processes you begin to feel + that you are rapidly going insane, you are in good company. + +Chapter 21. Interrupts +====================== + +The Muddle interrupt handling facilities provide the ability to say +the following: whenever "this event" occurs, stop whatever is being +done at the time and perform "this action"; when "this action" is +finished, continue with whatever was originally being done. "This +event" can be things like the typing of a character at a terminal, a +time interval ending, a `PROCESS` becoming blocked, or a +program-defined and -generated "event". "This action" is the +application of a specified `APPLICABLE` object to arguments provided +by the Muddle interrupt system. The sets of events and actions can be +changed in extremely flexible ways, which accounts for both the +variety of `SUBR`s and arguments, and the rich interweaving of the +topics in this chapter. Interrupt handling is a kind of parallel +processing: a program can be divided into a "main-level" part and one +or more interrupt handlers that execute only when conditions are ripe. + +21.1. Definitions of Terms +-------------------------- + +An **interrupt** is not an object in Muddle, but rather a class of +events, for example, "ticks" of a clock, garbage collections, the +typing of a character at a terminal, etc. + +An interrupt is said to **occur** when one of the events in its class +takes place. + +An **external** interrupt is one whose occurrences are signaled to +Muddle by the operating system, for example, "ticks" of a clock. An +**internal** interrupt is one whose occurrences are detected by Muddle +itself, for example, garbage collections. Muddle can arrange for the +operating system to not signal occurrences of an external interrupt to +it; then, as far as Muddle is concerned, that interrupt does not +occur. + +Each interrupt has a **name** which is either a `STRING` (for example, +`"GC"`, `"CHAR"`, `"WRITE"`) or an `ATOM` with that `PNAME` in a +special `OBLIST`, named `INTERRUPTS!-`. (This `OBLIST` is returned by +``.) Certain names must always be further specified by a +`CHANNEL` or a `LOCATIVE` to tell **which** interrupt by that name is +meant. + +When an interrupt occurs, the interpreter looks for an association on +the interrupt's name. If there is an association, its `AVALUE` should +be an `IHEADER`, which heads a list of actions to be performed. In +each `IHEADER` is the name of the interrupt with which the `IHEADER` +is or was associated. + +In each `IHEADER` is an element telling whether it is disabled. If an +`IHEADER` is **disabled**, then none of its actions is performed. The +opposite of disabled is **enabled**. It is sometimes useful to disable +an `IHEADER` temporarily, but removing its association with the +interrupt's name is better than long-term disabling. There are `SUBR`s +for creating an `IHEADER`, associating it with an interrupt, and later +removing the association. + +In each `IHEADER` is a **priority**, a `FIX` greater than `0` which +specifies the interrupt's "importance". The processing of a +higher-priority (larger-numbered) interrupt will supersede the +processing of a lower-priority (smaller-numbered) interrupt until the +high-priority interrupt has been handled. + +In each `IHEADER` is a (possibly empty) list of `HANDLER`s. (This list +is not a Muddle `LIST`.) Each `HANDLER` corresponds to an action to +perform. There are `SUBR`s for creating a `HANDLER`, adding it to an +`IHEADER`'s list, and later removing it. + +In each `HANDLER` is a function that we will call a **handler** (in +lower case), despite possible confusion, because that is really the +best name for it. An **action** consists of applying a handler to +arguments supplied by the interrupt system. The number and meaning of +the arguments depend on the name of the interrupt. In each `HANDLER` +is an element telling in which `PROCESS` the action should be +performed. + +21.2. EVENT +----------- + + + +creates and returns an enabled `IHEADER` with no `HANDLER`s. The +*name* may be an `ATOM` in the `INTERRUPTS` `OBLIST` or a `STRING`; if +it is a `STRING`, `EVENT` does a `LOOKUP` or `INSERT` in +``. If there already is an `IHEADER` associated with +*name*, `EVENT` just returns it, ignoring the given *priority*. + +*which* must be given only for certain *name*s: + +- It must be a `CHANNEL` if and only if *name* is `"CHAR"` (or + `CHAR!-INTERRUPTS`). In this case it is the input `CHANNEL` from + the (pseudo-)terminal or Network socket whose received characters + will cause the interrupt to occur, or the output `CHANNEL` to the + pseudo-terminal or Network socket whose desired characters will + cause the interrupt to occur. (See below. Pseudo-terminals are not + available in the Tenex and Tops-20 versions.) +- The argument must be a `LOCATIVE` if and only if *name* is + `"READ"` (or `READ!-INTERRUPTS`) or `"WRITE"` (or + `WRITE!-INTERRUPTS`). In this case it specifies an object to be + "monitored" for usage by (interpreted) Muddle programs (section + 21.8.9). + +If the interrupt is external, Muddle arranges for the operating system +to signal its occurrences. + +21.3. HANDLER (the SUBR) +------------------------ + + + +creates a `HANDLER`, adds it to the front of *iheader*'s `HANDLER` +list (first action to be performed), and returns it as a value. +*applicable* may be any `APPLICABLE` object that takes the proper +number of arguments. (None of the arguments can be `QUOTE`d; they must +all be evaluated at call time.) *process* is the `PROCESS` in which +the handler will be applied, by default whatever `PROCESS` was running +when the interrupt occurred. + +The value returned by the handler is ignored, unless it is of `TYPE` +`DISMISS` (`PRIMTYPE` `ATOM`), in which case none of the remaining +actions in the list will be performed. + +The processing of an interrupt's actions can terminate prematurely if +a handler calls the `SUBR` `DISMISS` (see below.) + +21.4. OFF +--------- + + + +removes the association between *iheader* and the name of its +interrupt, and then disables *iheader* and returns it. (An error +occurs if there is no association.) If the interrupt is external, +Muddle arranges for the operating system not to signal its +occurrences. + + + +finds the `IHEADER` associated with *name* and proceeds as above, +returning the `IHEADER`. *which* must be given only for certain +*names*, as for `EVENT`. Caution: if you ``, +Muddle will become deaf. + + + +returns *handler* after removing it from its list of actions. There is +no effect on any other `HANDLER`s in the list. + +Now that you know how to remove `IHEADER`s and `HANDLER`s from their +normal places, you need to know how to put them back: + + + +If *iheader* was previously disabled or disassociated from its name, +`EVENT` will associate and enable it. + + + +If *handler* was previously removed from its list, `HANDLER` will add +it to the front of *iheader*'s list of actions. Note that *process* +cannot be specified. + +21.5. IHEADER and HANDLER (the TYPEs) +------------------------------------- + +Both these `TYPE`s are of `PRIMTYPE` `VECTOR`, but they do not `PRINT` +that way, since they are self-referencing. Instead they `PRINT` as + + #type most-interesting-component + +The contents of `IHEADER`s and `HANDLER`s can be changed by `PUT`, and +the new values will then determine the behavior of Muddle. + +Before describing the elements of these `TYPE`s in detail, here are a +picture and a Pattern, both purporting to show how they look: + + #IHEADER [name:atom or which + disabled? + *-----------> #HANDLER [*-----------> #HANDLER [#HANDLER [] + priority] <-------------* +------* + applicable | applicable + process] <-------+ process] + + + + APPLICABLE PROCESS> + FIX> + +### 21.5.1. IHEADER + +The elements of an `IHEADER` are as follows: + +1. name of interrupt (`ATOM`, or `CHANNEL` if the name is `"CHAR"`, + or `LOCATIVE` if the name is `"READ"` or `"WRITE"`) +2. non-zero if and only if disabled +3. first `HANDLER`, if any, else a zero-length `HANDLER` +4. priority + +If you lose track of an `IHEADER`, you can get it via the association: + +- For `"CHAR"` interrupts, `` returns the + `IHEADER` or `#FALSE ()` if there is no association; + `` returns the `IHEADER`, creating it if + there is no association. +- For `"READ"` interrupts, `` returns + the `IHEADER` or `#FALSE ()` if there is no association; + `` returns the `IHEADER`, creating it if + there is no association. +- For `"WRITE"` interrupts, `` + returns the `IHEADER` or `#FALSE ()` if there is no association: + `` returns the `IHEADER`, creating it if + there is no association. +- Otherwise, the `IHEADER` is `PUT` on the name `ATOM` with the + indicator `INTERRUPT`. Thus, for example, + `` returns the `IHEADER` for the + clock interrupt or `#FALSE ()` if there is no association; + `` returns the `IHEADER`, creating it if there is + no association. + +### 21.5.2. HANDLER + +A `HANDLER` specifies a **particular** action for a **particular** +interrupt. The elements of a `HANDLER` are as follows: + +1. next `HANDLER` if any, else a zero-length `HANDLER` +2. previous `HANDLER` or the `IHEADER` (Thus the `HANDLER`s of a + given interrupt form a "doubly-linked list" chaining between each + other and back to the `IHEADER`.) +3. handler to be applied (anything but `APPLICABLE` that evaluates + its arguments -- the application is done not by `APPLY` but by + `RUNINT`, which can take a `PROCESS` argument: see next line) +4. `PROCESS` in which the handler will be applied, or `#PROCESS 0`, + meaning whatever `PROCESS` was running when the interrupt occurred + (In the former case, `RUNINT` is applied to the handler and its + arguments in the currently running `PROCESS`, which causes an + `APPLY` in the `PROCESS` stored in the `HANDLER`, which `PROCESS` + must be `RESUMABLE`. The running `PROCESS` becomes `RESUMABLE`, + and the stored `PROCESS` becomes `RUNNING`, but no other `PROCESS` + variables (for example `RESUMER`) are changed.) + +21.6. Other SUBRs +----------------- + + + +is equivalent to + + + applicable process> + +`ON` is a combination of `EVENT` and `HANDLER`: it creates (or finds) +the `IHEADER`, associates and enables it, adds a `HANDLER` to the +front the list (first to be performed), and returns the `HANDLER`. + + + +is effectively ``. Actually the `TYPE` `LOSE` +is unimportant, but the `-1` signifies that *iheader* is disabled. + + + +is effectively ``. Actually the `TYPE` `LOSE` +is unimportant, but the `0` signfies that *iheader* is enabled. + +21.7. Priorities and Interrupt Levels +------------------------------------- + +At any given time there is a defined **interrupt level**. This is a +`FIX` which determines which interrupts can really "interrupt" -- that +is, cause the current processing to be suspended while their wants are +satisfied. Normal, non-interrupt programs operate at an interrupt +level of 0 (zero.) An interrupt is processed at an interrupt level +equal to the interrupt's priority. + +### 21.7.1. Interrupt Processing + +Interrupts "actually" only occur at well-defined points in time: +during a call to a Subroutine, or at critical places within +Subroutines (for example, during each iteration of `MAPF` on a `LIST`, +which may be circular), or while a `PROCESS` is `"BLOCKED"` (see +below). No interrupts can occur during garbage collection. + +What actually happens when an enabled interrupt occurs is that the +priority of the interrupt is compared with the current interrupt +level, and the following is done: + +If the priority is **greater than** the current interrupt level, the +current processing is "frozen in its tracks" and processing of the +action(s) specified for that interrupt begins. + +If the priority is less than or equal to the current interrupt level, +the interrupt occurrence is **queued** -- that is, the fact that it +occurred is saved away for processing when the interrupt level becomes +low enough. + +When the processing of an interrupt's actions is completed, Muddle +usually (1) "acts as if" the previously-existing interrupt level is +restored, and processing continues on what was left off (perhaps for +no time duration); and (2) "acts as if" any queued interrupt +occurrences actually occurred right then, in their original order of +occurrence. + +### 21.7.2. INT-LEVEL + +The `SUBR` `INT-LEVEL` is used to examine and change the current +interrupt level directly. + + + +simply returns the current interrupt level. + + + +changes the interrupt level to its argument and returns the +**previously**-existing interrupt level. + +If `INT-LEVEL` lowers the priority of the interrupt level, it does not +"really" return until all queued occurrences of interrupts of higher +priority than the target priority have been processed. + +Setting the `INT-LEVEL` extremely high (for example, +` FIX>>`) effectively disables all interrupts +(but occurrences of enabled interrupts will still be queued). + +If `LISTEN` or `ERROR` is called when the `INT-LEVEL` is not zero, +then the typeout will be + + LISTENING-AT-LEVEL I PROCESS p INT-LEVEL i + +### 21.7.3. DISMISS + +`DISMISS` permits a handler to return an arbitrary value for an +arbitrary `ACTIVATION` at an arbitrary interrupt level. The call is as +follows: + + + +where only the *value* is required. If *activation* is omitted, return +is to the place interrupted from, and *value* is ignored. If +*int-level* is omitted, the `INT-LEVEL` prior to the current interrupt +is restored. + +21.8. Specific Interrupts +------------------------- + +Descriptions of the characteristics of particular "built-in" Muddle +interrupts follow. Each is named by its `STRING` name. Expect this +list to be incomplete yesterday. + +`"CHAR"` is currently the most complex built-in interrupt, because it +serves duty in several ways. These different ways will be described in +several different sections. All ways are concerned with characters or +machine words that arrive or depart at unpredictable times, because +Muddle is communicating with a person or another processor. Each +`"CHAR"` `IHEADER` has a `CHANNEL` for the element that names the +interrupt, and the mode of the `CHANNEL` tells what kinds of `"CHAR"` +interrupts occur to be handled through that `IHEADER`. + +1. If the `CHANNEL` is for `INPUT`, "CHAR" occurs every time an + "interesting" character (see below) is received from the + `CHANNEL`'s real terminal, or any character is received from the + `CHANNEL`'s pseudo-terminal, or a character or word is received + from the `CHANNEL`'s Network socket, or indeed (in the ITS + version) the operating system generates an interrupt for any + reason. +2. If the `CHANNEL` is for output to a pseudo-terminal or Network + socket, `"CHAR"` occurs every time a character or word is wanted. +3. If the `CHANNEL` is for output to a terminal, `"CHAR"` occurs + every time a line-feed character is output or (in the ITS version) + the operating system generates a screen-full interrupt for the + terminal. + +### 21.8.1. "CHAR" received + +A handler for an input `"CHAR"` interrupt on a real terminal must take +two arguments: the `CHARACTER` which was typed, and the `CHANNEL` on +which it was typed. + +In the ITS version, the "interesting" characters are those "enabled +for interrupts" on a real terminal, namely \^@ through +\^G, \^K through \^\_, and +DEL (that is, ASCII codes 0-7, 13-37, and 177 octal.) + +In the Tenex and Tops-20 versions, the operating system can be told +which characters typed on a terminal should cause this interrupt to +occur, by calling the `SUBR` `ACTIVATE-CHARS` with a `STRING` argument +containing those characters (no more than six, all with ASCII codes +less than 33 octal). If called with no argument, `ACTIVATE-CHARS` +returns a `STRING` containing the characters that currently interrupt. +Initially, only \^G, \^S, and \^O +interrupt. + +An initial Muddle already has `"CHAR"` enabled on `,INCHAN` with a +priority 8 (eight), the `SUBR` `QUITTER` for a handler to run in +`#PROCESS 0` (the running `PROCESS`); this is how `^G` and +`^S` are processed. In addition, every time a new `CHANNEL` +is `OPEN`ed in `"READ"` mode to a terminal, a similar `IHEADER` and +`HANDLER` are associated with that new `CHANNEL` automatically. These +automatically-generated `IHEADER`s and `HANDLER`s use the standard +machinery, and they can be `DISABLE`d or `OFF`ed at will. **However**, +the `IHEADER` for `,INCHAN` should not be `OFF`ed: Muddle knows that +`$` is typed only by an interrupt! + +Example: the following causes the given message to be printed out +whenever a `^Y` is typed on `.INCHAN`: + + + #FUNCTION ((CHAR CHAN) + #DECL ((VALUE) ANY (CHAR) CHARACTER (CHAN) CHANNEL) + + >)>>$ + #HANDLER #FUNCTION **CHAR CHAN) ...) + <+ 2 ^Y [Some of my best friends are ^Ys.] 2>$ + 4 + $ + #HANDLER #FUNCTION (...) + +Note that occurrences of `"CHAR"` do **not** wait for the `$` to be +typed, and the interrupting character is omitted from the input +stream. + +A `"CHAR"` interrupt can also be associated with an input `CHANNEL` +open to a Network socket (`"NET"` device). A handler gets applied to a +`NETSTATE` array (which see) and the `CHANNEL`. + +In the ITS version, a `"CHAR"` interrupt can also be associated with +an input `CHANNEL` open to a pseudo-terminal ("STY" device and +friends). An interrupt occurs when a character is available for input. +These interrupts are set up in exactly the same way as real-terminal +interrupts, except that a handler gets applied to only **one** +argument, the `CHANNEL`. Pseudo-terminal are not available in the +Tenex and Tops-20 versions. + +For any other flavor of ITS channel interrupt, a handler gets applied +to only **one** argument, the `CHANNEL`. + +### 21.8.2. "CHAR" wanted + +A `"CHAR"` interrupt can be associated with an output `CHANNEL` open +to a Network socket (`"NET"` device). A handlers gets applied to a +`NETSTATE` array (which see) and the `CHANNEL`. + +In the ITS version, a `"CHAR"` interrupt can also be associated with +an output `CHANNEL` open to a pseudo-terminal (`"STY"` device and +friends). An interrupt occurs when the program at the other end needs +a character (and the operating-system buffer is empty). A handler gets +applied to one argument, the `CHANNEL`. Pseudo-terminals are not +available in the Tenex and Tops-20 versions. + +### 21.8.3. "CHAR" for new line + +A handler for an output `"CHAR"` interrupt on a real terminal must +take **one or two** arguments (using `"OPTIONAL"` or `"TUPLE"`): if +two arguments are supplied by the interrupt system, they are the line +number (`FIX`) and the `CHANNEL`, respectively, and the interrupt is +for a line-feed; if only one argument is supplied (only in the ITS +version), it is the `CHANNEL`, and the interrupt is for a full +terminal screen. Note: the supplied line number comes from the +`CHANNEL`, and it may not be accurate if the program alters it in +subtle ways, for example, via `IMAGE` calls or special control +characters. (The program can compensate by putting the proper line +number into the `CHANNEL`.) + +### 21.8.4. "GC" + +`"GC"` occurs just **after** every garbage collection. Enabling this +interrupt is the only way a program can know that a garbage collection +has occurred. A handler for `"GC"` takes three arguments. The first is +a FLOAT indicating the number of seconds the garbage collection took. +The second argument is a FIX indicating the cause of the garbage +collection, as follows (chapter 22): + +0. Program called GC. +1. Movable storage was exhausted. +2. Control stack overflowed. +3. Top-level LVALs overflowed. +4. GVAL vector overflowed. +5. TYPE vector overflowed. +6. Immovable garbage-collected storage was exhausted. +7. Internal stack overflowed. +8. Both control and internal stacks overflowed (rare). +9. Pure storage was exhausted. +10. Second, exhaustive garbage collection occurred. + +The third argument is an ATOM indicating what initiated the garbage +collection: `GC-READ`, `BLOAT`, `GROW`, `LIST`, `VECTOR`, `SET`, +`SETG`, `FREEZE`, `GC`, `NEWTYPE`, `PURIFY`, `PURE-PAGE-LOADER` (pure +storage was exhausted), or `INTERRUPT-HANDLER` (stack overflow, +unfortunately). + +### 21.8.5. "DIVERT-AGC" + +`"DIVERT-AGC"` ("Automatic Garbage Collection") occurs just **before** +a deferrable garbage collection that is needed because of exhausted +movable garbage-collected storage. Enabling this interrupt is the only +way a program can know that a garbage collection is about to occur. A +handler takes two arguments: A `FIX` telling the number of machine +words needed and an `ATOM` telling what initiated the garbage +collection (see above). If it wishes, a handler can try to prevent a +garbage collection by calling `BLOAT` with the `FIX` argument. If the +pending request for garbage-collected storage cannot then be +satisfied, a garbage collection occurs anyway. `AGC-FLAG` is `SET` to +`T` while the handler is running, so that new storage requests do not +try to cause a garbage collection. + +### 21.8.6. "CLOCK" + +`"CLOCK"`, when enabled, occurs every half second (the ITS +"slow-clock" tick.) It is not available in the Tenex or Tops-20 +versions. It wants handlers which take no arguments. Example: + + > 1> + +### 21.8.7. "BLOCKED" + +`"BLOCKED"` occurs whenever **any** `PROCESS` (not only the `PROCESS` +which may be in a `HANDLER`) starts waiting or terminal input: that +is, an occurrence indicates that somewhere, somebody did a `READ`, +`READCHR`, `NEXTCHR`, `TYI`, etc. to a console. The handler for a +`"BLOCKED"` interrupt should take one argument, namely the `PROCESS` +which started waiting (which will also be the `PROCESS` in which the +handler runs, if no specific one is in the `HANDLER`). + +Example: the following will cause Muddle to acquire a `*` prompting +character. + + ) 5> + +### 21.8.8. "UNBLOCKED" + +`"UNBLOCKED"` occurs whenever a `$` (`ESC`) is typed on a +terminal if a program was hanging and waiting for input, or when a TYI +call (which see) is satisfied. A handler takes one argument: the +`CHANNEL` via which the `$` or character is input. + +### 21.8.9. "READ" and "WRITE" + +`"READ"` and `"WRITE"` are associated with read or write references to +Muddle objects. These interrupts are often called "monitors", and +enabling the interrupt is often called "monitoring" the associated +object. A "read reference" to an `ATOM`'s local value includes +applying `BOUND?` or `ASSIGNED?` to the `ATOM`; similarly for a global +value and `GASSIGNED?`. If the `INT-LEVEL` is too high when `"READ"` +or `"WRITE"` occurs, an error occurs, because occurrences of these +interrupts cannot be queued. + +Monitors are set up with `EVENT` or `ON`, using a locative to the +object being monitored as the extra *which* argument, just as a +`CHANNEL` is given for `"CHAR"`. A handler for `"READ"` takes two +arguments: the locative and the `FRAME` of the function application +that make the reference. A handler for `"WRITE"` takes three +arguments: the locative, the new value, and the `FRAME`. For example: + + $ + (1 2 3) + >$ + #LOCL 2 + + + + + + + + > + 4 0 .B>$ + #HANDLER FUNCTION (...) + <1 .A 10>$ + (10 2 3) + <2 .A 20>$ + Program changed #LOCL 2 to 20 via #FRAME PUT + (10 20 3) + $ + #IHEADER #LOCL 20 + +### 21.8.10. "SYSDOWN" + +`"SYSDOWN"` occurs when a system-going-down or system-revived signal +is received from ITS. It is not available in the Tenex or Tops-20 +versions. If no `IHEADER` is associated and enabled, a warning message +is printed on the terminal. A handler takes one argument: a `FIX` +giving the number of thirtieths of a second until the shutdown (-1 for +a reprieve). + +### 21.8.11. "ERROR" + +In an effort to simplify error handling by programs, Muddle has a +facility allowing errors to be handled like interrupts. `SETG`ing +`ERROR` to a user function is a distasteful method, not safe if any +bugs are around. An `"ERROR"` interrupt wants a handler that takes any +number of arguments, via `"TUPLE"`. When an error occurs, handlers are +applied to the `FRAME` of the `ERROR` call and the `TUPLE` of `ERROR` +arguments. If a given handler "takes care of the error", it can +`ERRET` with a value from the `ERROR` `FRAME`, after having done +``. If no handler takes care of the error, it falls into +the normal `ERROR`. + +If an error occurs at an `INT-LEVEL` greater than or equal to that of +the `"ERROR"` interrupt, real `ERROR` will be called, because +`"ERROR"`interrupts cannot be queued. + +### 21.8.12. "IPC" + +`"IPC"` occurs when a message is received on the ITS IPC device +(chapter 23). It is not available in the Tenex and Tops-20 versions. + +### 21.8.13. "INFERIOR" + +`"INFERIOR"` occurs when an inferior ITS process interrupts the Muddle +process. It is not available in the Tenex and Tops-20 versions. A +handler takes one argument: A `FIX` between `0` and `7` inclusive, +telling which inferior process is interrupting. + +### 21.8.14. "RUNT and "REALT" + +These are not available in the Tenex and Tops-20 versions. + +`"RUNT"`, if enabled, occurs **once**, *N* seconds of Muddle running +time (CPU time) after calling ``, which +returns its argument. A handler takes no arguments. If `RUNTIMER` is +called with no argument, it returns a `FIX`, the number of run-time +seconds left until the interrupt occurs, or `#FALSE ()` if the +interrupt is not going to occur. + +`"REALT"`, if enabled, occurs **every** *N* seconds of real-world time +after calling ``, which returns its +argument. A handler takes no arguments. `` tells the +operating system not to generate real-time interrupts. If `REALTIMER` +is called with no argument, it returns a `FIX`, the number of +real-time seconds given in the most recent call to `REALTIMER` with an +argument, or `#FALSE ()` if `REALTIMER` has not been called. + +### 21.8.15. "Dangerous" Interrupts + +`"MPV"` ("memory protection violation") occurs if Muddle tries to +refer to a storage address not in its address space. `"PURE"` occurs +if Muddle tries to alter read-only storage. `"ILOPR"` occurs if Muddle +executes and illegal instruction ("operator"). `"PARITY"` occurs if +the CPU detects a parity error in Muddle's address space. All of these +require a handler that takes one argument: the address (`TYPE` `WORD`) +following the instruction that was being executed at the time. + +`"IOC"` occurs if Muddle tries to deal illegally with an I/O channel. +A handler must take two arguments: a three-element `FALSE` like one +that `OPEN` might return, and the `CHANNEL` that got the error. + +Ideally these interrupts should never occur. In fact, in the Tenex and +Tops-20 versions, these interrupts always go to the superior operating +system process instead of to Muddle. In the ITS version, if and when a +"dangerous" interrupt does occur: + +- If no `IHEADER` is associated with the interrupt, then the + interrupt goes to the superior operating system process. +- If an `IHEADER` is associated but disabled, the error + `DANGEROUS-INTERRUPT-NOT-HANDLED` occurs (`FILE-SYSTEM-ERROR` for + \`"IOC"). +- If an `IHEADER` is associated and enabled, but the `INT-LEVEL` is + too high, the error `ATTEMPT-TO-DEFER-UNDEFERABLE-INTERRUPT` + occurs. + +21.9. User-Defined Interrupts +----------------------------- + +If the interrupt name given to `EVENT` or `ON` is **not** one of the +standard predefined interrupts of Muddle, they will gleefully create +an `ATOM` in `` and an associated `IHEADER` anyway, making +the assumption that you are setting up a "program-defined" interrupt. + +Program-defined interrupts are made to occur by applying the `SUBR` +`INTERRUPT`, as in + + + +where *name* is a `STRING`, `ATOM` or `IHEADER`, and *arg1* through +*argN* are the arguments wanted by the handlers for the interrupt. + +If the interrupt specified by `INTERRUPT` is enabled, `INTERRUPT` +returns `T`; otherwise it returns `#FALSE ()`. All the usual priority +and queueing rules hold, so that even if `INTERRUPT` returns `T`, it +is possible that nothing "really happened" (yet). + +`INTERRUPT` can also be used to cause "artificial" occurrences of +standard predefined Muddle interrupts. + +Making a program-defined interrupt occur is similar to calling a +handler directly, but there are differences. The value returned by a +handler is ignored, so side effects must be used in order to +communicate information back to the caller, other than whether any +handler ran or will run. One good use for a program-defined interrupt +is to use the priority and queueing machinery of `INT-LEVEL` to +control the execution of functions that must not run concurrently. For +example, if a `"CHAR"` handler just deposits characters in a buffer, +then a function to process the buffered characters should probably run +at a higher priority level -- to prevent unpredictable changes to the +buffer during the processing -- and it is natural to invoke the +processing with `INTERRUPT`. + +In more exotic applications, `INTERRUPT` can signal a condition to be +handled by an unknown number of independent and "nameless" functions. +The functions are "nameless" because the caller doesn't know their +name, only the name of the interrupt. This programming style is +modular and event-driven, and it is one way of implementing +"heuristic" algorithms. In addition, each `HANDLER` has a `PROCESS` in +which to run its handler, and so the different handlers for a given +condition can do their thing in different environments quite easily, +with less explicit control than when using `RESUME`. + +21.10. Waiting for Interrupts +----------------------------- + +### 21.10.1. HANG + + + +hangs interruptibly, without consuming any CPU time, potentially +forever. `HANG` is nice for a program that cannot do anything until an +interrupt occurs. If the optional *pred* is given, it is evaluated +every time an interrupt occurs and is dismissed back into the `HANG`; +if the result of evaluation is not `FALSE`, `HANG` unhangs and returns +it as a value. If *pred* is not given, there had better be a named +`ACTIVATION` somewhere to which a handler can return. + +### 21.10.2. SLEEP + + + +suspends execution, interruptibly, without consuming any CPU time, for +*time* seconds, where *time* is non-negative, and then returns `T`. +*pred* is the same as for `HANG`. + +Chapter 22. Storage Management +============================== + +The reason this chapter comes so late in this document is that, except +for special cases, Muddle programs have their storage needs handled +automatically. There is usually no need even to consider storage +management, except as it affects efficiency (chapter 24). This chapter +gives some explanation of why this is so, and covers those special +means by which a program can assume control of storage management. + +The Muddle address space is divided into five parts, which are usually +called + +1. movable garbage-collected space, +2. immovable space (both garbage-collected and not), +3. user pure/page space, +4. pure-`RSUBR` mapping space, and +5. internal storage. + +Internal storage occupies both the highest and lowest addresses in the +address space, and its size never changes as Muddle executes. The +other spaces can vary in size according to the needs of the executing +program. Generally the interpreter allocates a contiguous set of +addresses for each space, and each space gradually fills up as new +objects are created and as disk files are mapped in. The action taken +when space becomes full varies, as discussed below. + +22.1. Movable Garbage-collected Storage +--------------------------------------- + +Most storage used explicitly by Muddle programs is obtained from a +pool of free storage managed by a "garbage collector". Storage is +obtained from this pool by the `SUBR`s which construct objects. When a +`SUBR` finds that the pool of available storage is exhausted, it +automatically calls the garbage collector. + +The garbage collector has two algorithms available to it: the +"copying" algorithm, which is used by default, and the "mark-sweep" +algorithm. Actually, one often speaks of two separate garbage +collectors, the "copying" one and the "mark-sweep" one, because each +is an independent module that is mapped in to the interpreter's +internal storage from disk only during garbage collection. For +simplicity, this document speaks of "the" garbage collector, which has +two algorithms. + +The garbage collector examines the storage pool and **marks** all the +objects there, separating them into two classes: those which cannot +possibly be referenced by a program, and those which can. The +"copying" algorithm then copies the latter into one compact section of +the pool, and the remainder of the pool is made available for newly +constructed objects. The "mark-sweep" algorithm, instead, puts all +objects in the former class (garbage) into "free lists", where the +object-construction `SUBR`s can find them and re-use their storage. + +If the request for more storage still cannot be satisfied from +reclaimed storage, the garbage collector will attempt to obtain more +total storage from the operating system under which Muddle runs. +(Also, if there is a gross superfluity of storage space, the garbage +collector will politely return some storage to the operating system.) +Only when the total system resources are exhausted will you finally +lose. + +Thus, if you just "forget about" an object, that is, lose all possible +means of referencing it, its storage is automatically reclaimed. +"Object" in this context includes that stack-structured storage space +used in `PROCESS`es for functional application. + +### 22.1.1. Stacks and Other Internal Vectors + +Control stacks are used in Muddle to control the changes in +environment caused by calling and binding. Each active `PROCESS` has +its own control stack. On this stack are stored `LVAL`s for `ATOM`s; +`PRIMTYPE` `TUPLE`s, which are otherwise like `VECTOR`s; `PRIMTYPE` +`FRAME`s, which are generated by calling Subroutines; and +`ACTIVATION`s, which are generated by calling `FUNCTION`s with named +`ACTIVATION`s, `PROG`, and `REPEAT`. `TAG` and `LLOC` can make `TAG`s +and `LOCD`s (respectively) that refer to a specific place on a +specific control stack. (`LEGAL?` returns `T` if and only if the +portion of the control stack in which its argument is found or to +which its argument refers is still active, or if its argument doesn't +care about the control stack. The garbage collector may change a +non-`LEGAL?` object to `TYPE` `ILLEGAL` before reclaiming it.) As the +word "stack" implies, things can be put on it and removed from it at +only one end, called the top. It has a maximum size (or depth), and +attempting to put too many things on it will cause overflow. A stack +is stored like a `VECTOR`, and it must be `GROW`n if and when it +overflows. + +A control stack is actually two stacks in one. One section is used for +"top-level" `LVAL`s -- those `SET` while the `ATOM` is not bound by +any active Function's argument `LIST` or Subroutine's `SPECIAL` +binding -- and the other section is used for everything else. Either +section can overflow, of course. The top-level-`LVAL` section is below +the other one, so that a top-level `LVAL` will be found only if the +`ATOM` is not currently bound elsewhere, namely in the other section. + +Muddle also has an internal stack, used for calling and temporary +storage within the interpreter and compiled programs. It too is stored +like a `VECTOR` and can overflow. There are other internal vectors +that can overflow: the "global vector" holds pairs ("slots") of +`ATOM`s and corresponding `GVAL`s ("globally bound" or `GBOUND?` means +that the `ATOM` in question is in this vector, whether or not it +currently has a global value), and the "`TYPE` vector" holds `TYPE` +names (predefined and `NEWTYPE`s) and how they are to be treated. + +22.2. Immovable Storage +----------------------- + +### 22.2.1. Garbage-collected: FREEZE + +In very special circumstances, such as debugging `RSUBR`s, you may +need to prevent an object from being moved by the garbage collector. +`FREEZE` takes one argument, of `PRIMTYPE` `VECTOR`, `UVECTOR`, +`STRING`, `BYTES` or `TUPLE`. It copies its argument into non-moving +garbage-collected space. `FREEZE` returns the copy `CHTYPE`d to its +`PRIMTYPE`, except in the case of a `TUPLE`, which is changed to a +`VECTOR`. + +### 22.2.2. Non-garbage-collected: STORAGE (the PRIMTYPE) + +An object of `PRIMTYPE` `STORAGE` is really a frozen `UVECTOR` whose +`UTYPE` is of `PRIMTYPE` `WORD`, but it is always pointed to by +something internal to Muddle and thus is never garbage-collectible. +The use of `FREEZE` is always preferable, except when for historical +reasons a `STORAGE` is necessary. + +22.3. Other Storage +------------------- + +User pure/page space serves two purposes. First, when a user program +`PURIFY`s (see below) Muddle objects, they are copied into this space. +Second, so-called hand-crafted `RSUBR`s (assembled but not compiled) +can call on the interpreter to map pages of disk files into this space +for arbitrary purposes. + +Pure-`RSUBR` mapping space is used by the interpreter to dynamically +map pages of pure compiled programs into and out of the Muddle address +space. Pure code can refer to impure storage through the "transfer +vector", another internal vector. This space is the most vulnerable to +being compressed in size by the long-term growth of other spaces. + +Internal storage has both pure and impure parts. The interpreter +program itself is pure and sharable, while impure storage is used for +internal pointers, counters, and flags, for example, pointers to the +boundaries of other spaces. In the pure part of this space are most of +the `ATOM`s in an initial Muddle, along with their `OBLIST` buckets +(`LIST`s) and `GVAL` slots (a pure extension of the global vector), +where possible. A `SET` or `SETG` of a pure `ATOM` automatically +impurifies the `ATOM` and as much of its `OBLIST` bucket as needs to +be impure. + +22.4. Garbage Collection: Details +--------------------------------- + +When either of the garbage-collected spaces (movable or immovable) +becomes full, Muddle goes through the following procedure: + +1. A `"DIVERT-AGC"` interrupt occurs if the garbage collection can be + deferred temporarily by shifting boundaries between storage spaces + slightly. The interrupt handler may postpone a garbage collection + by moving boundaries itself with a call to `BLOAT` (below). +2. The garbage collector begins execution. The "copying" algorithm + creates an inferior operating-system process (named `AGC` in the + ITS version) whose address space is used to hold the new copies of + non-garbage objects. Muddle accesses the inferior's address space + through two pages ("frontier" and "window") in its internal space + that are shared with the inferior. If the garbage collection + occurred because movable garbage-collected space was exhausted, + then the "mark-sweep" algorithm might be used instead (see below) + and no inferior process is created. +3. The garbage collector marks and moves all objects that can + possibly be referenced hereafter. It begins with the `
` + `PROCESS` and the currently running `PROCESS` ``, considered + as vectors containing the control stacks, object pointers in live + registers, etc. Every object in these "`PROCESS` vectors" is + marked "accessible", and every element of these objects (bindings, + etc.), and so on recursively. The "copying" algorithm moves + objects into the inferior process's address space as it marks + them. +4. If the garbage collection is "exhaustive" -- which is possible + only in the "copying" algorithm -- then both the chain of + associations and top-level local/global bindings are examined + thoroughly, which takes more time but is more likely to uncover + garbage therein. In a normal garbage collection these constructs + are not treated specially. +5. Finally, the "mark-sweep" algorithm sweeps through the storage + space, adding unmarked objects to the internal free lists for + later re-use. The "copying" algorithm maps the inferior process's + address space into Muddle's own, replacing old garbagey with the + new compact storage, and the inferior process is destroyed. + +22.5 GC +------- + + + +causes the garbage collector to run and returns the total number of +words of storage reclaimed. All of its arguments are optional: if they +are not supplied, a call to GC simply causes a "copying" garbage +collection. + +If *min* is explicitly supplied as an argument, a garbage-collection +parameter is changed permanently before the garbage collector runs. +*min* is the smallest number of words of "free" (unclaimed, available +for use) movable garbage-collected storage the garbage collector will +be satisfied with having after it is done. Initially it is 8192 words. +If the total amount of reclaimed storage is less than *min*, the +garbage collector will ask the operating system for enough storage (in +1024 word blocks) to make it up. N.B.: the system may be incivil +enough not to grant the request; in that case, the garbage collector +will be content with what it has, **unless** that is not enough to +satisfy a **pending** request for storage. Then it will inform you +that it is losing. A large *min* will result in fewer total garbage +collections, but they will take longer since the total quantity of +storage to be dealt with will generally be larger. Smaller *min*s +result in shorter, more frequent garbage collections. + +22.6. BLOAT +----------- + +`BLOAT` is used to cause a temporary expansion of the available +storage space with or without changing the garbage-collection +parameters. `BLOAT` is particularly useful for avoiding unnecessary +garbage collections when loading a large file. It will cause (at most) +one garbage collection, at the end of which the available storage will +be at least the amount specified in the call to `BLOAT`. (Unless, of +course, the operating system is cranky and will not provide the +storage. Then you will get an error. `` from this error will +cause the `BLOAT` to return `1`, which usually just causes you to lose +at a later time -- unless the operating system feels nicer when the +storage is absolutely necessary.) + +A call to BLOAT looks like this: + + + +where all arguments on the first line above are `FIX`, optional (`0` +by default), and indicate the following: + +- *fre*: number of words of free movable storage desired (for + `LIST`s, `VECTOR`s, `ATOM`s, etc.) +- *stk*: number of words of free control-stack space desired (for + functional applications and binding of `ATOM`s) +- *lcl*: number of new top-level `LVAL`s for which to leave space + (`SET`s of `ATOM`s which are not currently bound) +- *glb*: number of new `GVAL`s for which to leave space (in the + global vector) +- *typ*: number of new `TYPE` definitions for which to leave space + (in the `TYPE` vector) +- *sto*: number of words of immovable garbage-collected storage + desired +- *pstk*: number of words of free internal-stack space desired (for + `READ`ing large `STRING`s, and calling routines within the + interpreter and compiled programs) + +Arguments on the second line are also `FIX` and optional, but they set +garbage-collection parameters permanently, as follows: + +- *min*: as for `GC` +- *plcl*: number of slots for `LVAL`s added when the space for + top-level `LVAL`s is expanded (initially 64) +- *pglb*: number of slots for `GVAL`s added when the global vector + is grown (initially 64) +- *ptyp*: number of slots for `TYPE`s added when the `TYPE` vector + is grown (initially 32) +- *imp*: number of words of immovable garbage-collected storage + added when it is expanded (initially 1024) +- *pur*: number of words reserved for pure compiled programs, if + possible (initially 0) +- *dpstk*: most desirable size for the internal stack, to prevent + repeated shrinking and `GROW`ing (initially 512) +- *dstk*: most desirable size for the control stack (initially 4096) + +`BLOAT` returns the actual number of words of free movable +garbage-collected storage available when it is done. + +22.7. BLOAT-STAT +---------------- + +`BLOAT-STAT` can be used with `BLOAT` to "tune" the garbage collector +to particular program requirements. + + + +fills the *uvector* with information about the state of storage of +Muddle. The argument should be a `UVECTOR` of length 27 and `UTYPE` +`FIX`. If `BLOAT-STAT` does not get an argument, it will provide its +own `UVECTOR`. The information returned is as follows: the first 8 +elements indicate the number of garbage collections that are +attributable to certain causes, and the other 19 give information +about certain areas of storage. In detail: + +1. number of garbage collections caused by exhaustion of movable + garbage-collected storage +2. ditto by overflow of control stack(s) +3. ditto by overflow of top-level-`LVAL` section of control stack(s) +4. ditto by overflow of global vector +5. ditto by overflow of `TYPE` vector +6. ditto by exhaustion of immovable garbage-collected storage +7. ditto by overflow of internal stack +8. ditto by overflow of both stacks at the same time (rare) + +9. number of words of movable storage +10. number of words of movable storage used since last `BLOAT-STAT` +11. maximum number of words of movable storage ever existing +12. number of words of movable storage used since Muddle began running +13. maximum size of control stack +14. number of words on control stack in use +15. maximum size of control stack(s) ever reached +16. number of slots for top-level `LVAL`s +17. number of top-level `LVAL`s existing +18. number of slots for `GVAL`s in global vector +19. number of `GVAL`s existing +20. number of slots for `TYPE`s in `TYPE` vector +21. number of `TYPE`s existing +22. number of words of immovable garbage-collected storage +23. number of words of immovable storage unused +24. size of largest unused contiguous immovable-storage block +25. number of words on internal stack +26. number of words on internal stack in use +27. maximum size of internal stack ever reached + +22.8. GC-MON +------------ + + + +("garbage-collector monitor") determines whether or not the +interpreter will hereafter print information on the terminal when a +garbage collection starts and finishes, according to whether or not +its argument is true. It returns the previous state. Calling it with +no argument returns the current state. The initial state is false. + +When typing is enabled, the "copying" garbage collector prints, when +it starts: + + GIN reason subr-that-caused:atom + +and, when it finishes: + + GOUT seconds-needed + +The "mark-sweep" garbage collector prints `MSGIN` and `MSGOUT` instead +of `GIN` and `GOUT`. + +22.9. Related Subroutines +------------------------- + +Two `SUBR`s, described next, use only part of the garbage-collector +algorithm, in order to find all pointers to an object. `GC-DUMP` and +`GC-READ`, as their names imply, also use part in order to translate +between Muddle objects and binary representation thereof. + +### 22.9.1. SUBSTITUTE + + + +returns *old*, after causing a miniature garbage collection to occur, +during which **all** references to *old* are changed so as to refer to +*new*. Neither argument can be of `PRIMTYPE` `STRING` or `BYTES` or +`LOCD` or live on the control stack, unless both are of the same +`PRIMTYPE`. One `TYPE` name cannot be substituted for another. One of +the few legitimate uses for it is to substitute the "right" `ATOM` for +the "wrong" one, after `OBLIST`s have been in the wrong state. This is +more or less the way `ATOM`s are impurified. It is also useful for +unlinking `RSUBR`s. `SUBSTITUTE` returns *old* as a favor: unless you +hang onto *old* at that point, it will be garbage-collected. + +22.9.2 PURIFY +------------- + + + +returns its last argument, after causing a miniature garbage +collection that results in all the arguments becoming pure and +sharable, and ignored afterward by the garbage collector. No argument +can live on the control stack or be of `PRIMTYPE` `PROCESS` or `LOCD` +or `ASOC`. Sharing between operating-system processes actually occurs +after a `SAVE`, if and when the `SAVE` file is `RESTORE`d. + +Chapter 23. Muddle as a System Process +====================================== + +This chapter treats Muddle considered as executing in an +operating-system process, and interactions between Muddle and other +operating-system processes. See also section 21.8.13. + +23.1. TIME +---------- + +`TIME` takes any number of arguments, which are evaluated but ignored, +and returns a `FLOAT` giving the number of seconds of CPU time the +Muddle process has used so far. `TIME` is often used in machine-level +debugging to examine the values of its arguments, by having Muddle's +superior process (say, DDT) plant a breakpoint in the code for `TIME`. + +23.2. Names +----------- + + + +returns a `STRING` which is the "user name" of Muddle's process. This +is the "uname" process-control variable in the ITS version and the +logged-in directory in the Tenex and Tops-20 versions. + + + +returns a `STRING` which is the "intended user name" of Muddle's +process. This is the "xuname" process-control variable in the ITS +version and identical to `` in the Tenex and Tops-20 versions. + + + +returns a `STRING` which is the "job name" of Muddle's process. This +is the "jname" process-control variable in the ITS version and the +`SETNM` name in the Tenex and Tops-20 versions. The characters belong +to the "sixbit" or "printing" subset of ASCII, namely those between +`` and `` inclusive. + + + +returns a `STRING` which is the "intended job name" of Muddle's +process. This is the "xjname" process-control variable in the ITS +version and identical to `` in the Tenex and Tops-20 versions. + +23.3. Exits +----------- + + + +attempts to log out the process in which it is executed. It will +succeed only if the Muddle is the top-level process, that is, it is +running disowned or as a daemon. If it succeeds, it of course never +returns. If it does not, it returns `#FALSE ()`. + + + +causes Muddle to stop running, in an orderly manner. In the ITS +version, it is equivalent to a `.LOGOUT 1` instruction. In the Tenex +and Tops-20 versions, it is equivalent to a control-C signal, and +control passes to the superior process. + + + +("value return") seldom returns. It passes control back up the process +tree to the superior of Muddle, passing its argument as a message to +that superior. If it does return, the value is `#FALSE ()`. If the +argument is a `STRING`, it is passed to the superior as a command to +be executed, via `.VALUE` in the ITS version and `RSCAN` in the +Tops-20 version. If the argument is a `FIX`, it is passed to the +superior as the "effective address" of a `.BREAK 16`, instruction in +the ITS version and ignored in other versions. + +23.4. Inter-process Communication +--------------------------------- + +All of the `SUBR`s in this section are available only in the ITS +version. + +The IPC ("inter-process communication") device is treated as an I/O +device by ITS but not explicitly so by Muddle: that is, it is never +`OPEN`ed. It allows Muddle to communicate with other ITS processes by +means of sending and receiving messages. A process identifies itself +as sender or recipient of a message with an ordered pair of "sixbit" +`STRING`s, which are often but not always `` and ``. A +message has a "body" and a "type". + +### 23.4.1. SEND and SEND-WAIT + + + + + +both send an IPC message to any job that is listening for it as +*othern1* *othern2*. *body* must be either a `STRING`, or a `UVECTOR` +of objects of `PRIMTYPE` `WORD`. *type* is an optional `FIX`, `0` by +default, which is part of the information the other guy receives. The +last two arguments are from whom the message is to be sent. These are +optional, and `` and `` respectively are used by +default. `SEND` returns a `FALSE` if no one is listening, while +`SEND-WAIT` hangs until someone wants it. Both return `T` if someone +accepts the message. + +### 23.4.2. The "IPC" Interrupt + +When your Muddle process receives an IPC message, `"IPC"` occurs +(chapter 21). A handler is called with either four or six arguments +gleaned from the message. *body*, *type*, *othern1*, and *othern2* are +supplied only if they are not this process's `` and ``. + +There is a built-in `HANDLER` for the `"IPC"` interrupt, with a +handler named `IPC-HANDLER` and `0` in the `PROCESS` slot. The handler +prints out on the terminal the *body*, whom it is from, the *type* if +not `0`, and whom it is to if not `` ``. If the *type* +is `1` and the *body* is a `STRING`, then, after the message +information is printed out, the `STRING` is `PARSE`d and `EVAL`uated. + +### 23.4.3. IPC-OFF + +`` stops all listening on the IPC device. + +### 23.4.4. IPC-ON + + + +causes listening on the IPC device as *myname1* *myname2*. If no +arguments are provided, listening is on `` ``. When a +message arrives, `"IPC"` occurs. + +Muddle is initially listening as `` `` with the built-in +`HANDLER` set up on the `"IPC"` interrupt with a priority of `1`. + +### 23.4.5. DEMSIG + + + +signals to ITS (directly, not via the IPC device) that the daemon +named by its argument should run now. It returns `T` if the daemon +exists, `#FALSE ()` otherwise. + +Chapter 24. Efficiency and Tastefulness +======================================= + +24.1. Efficiency +---------------- + +Actually, you make Muddle programs efficient by thinking hard about +what they really make the interpreter **do**, and making them do less. +Some guidelines, in order of decreasing expense: + +1. Free storage is expensive. +2. Calling functions is expensive. +3. `PROG` and `REPEAT` are expensive, except when compiled. + +Explanation: + +1. Unnecessary use of free storage (creating needless `LIST`s, + `VECTOR`s, `UVECTOR`s, etc.) will cause the garbage collector to + run more often. This is **expensive!** A fairly large Muddle (for + example, 60,000 36-bit words) can take ten seconds of PDP-10 CPU + time for a garbage collection. Be especially wary of constructions + like `(0)`. Every time that is evaluated, it creates a new + one-element `LIST`; it is too easy to write such things when they + aren't really necessary. Unless you are doing `PUT`s or `PUTREST`s + on it, use `'(0)` instead. +2. Sad, but true. Also generally ignored. If you call a function only + once, or if it is short (less than one line), you are much better + off in speed if you substitute its body in by hand. On the other + hand, you may be much worse off in modularity. There are + techniques for combining several `FUNCTION`s into one `RSUBR` + (with `RSUBR-ENTRY`s), either during or after compilation, and for + changing `FUNCTION`s into `MACRO`s. +3. `PROG` is almost never necessary, given (a) `"AUX"` in + `FUNCTION`s; (b) the fact that `FUNCTION`s can contain any number + of `FORM`s; (c) the fact that `COND` clauses can contain any + number of `FORM`s; and (d) the fact that new variables can be + generated and initialized by `REPEAT`. However, `PROG` may be + useful when an error occurs, to establish bindings needed for + cleaning things up or interacting with a human. + +The use of `PROG` may be sensible when the normal flow of control can +be cut short by unusual conditions, so that the program wants to +`RETURN` before reaching the end of `PROG`. Of course, nested `COND`s +can accomplish the same end, but deep nesting may tend to make the +program unreadable. For example: + + > + > + + > + > + > + +could instead be written + + + + + )>)> + +By the way, `REPEAT` is faster than `GO` in a `PROG`. The `` +`FORM` has to be separately interpreted, right? In fact, if you +organize things properly you **very** seldom need a `GO`; using `GO` +is generally considered "bad style", but in some cases it's needed. +Very few. + +In many cases, a `REPEAT` can be replaced with a `MAPF` or `MAPR`, or +an `ILIST`, `IVECTOR`, etc. of the form + + > + +which generates an `N`-element `LIST` of successive numbers starting +at `X+1`. + +Whether a program is interpreted or compiled, the first two +considerations mentioned above hold: garbage collection and function +calling remain expensive. Garbage collection is, clearly, exactly the +same. Function calling is relatively more expensive. However, the +compiler careth not whether you use `REPEAT`, `GO`, `PROG`, `ILIST`, +`MAPF`, or whatnot: it all gets compiled into practically the same +thing. However, the `REPEAT` or `PROG` will be slower if it has an +`ACTIVATION` that is `SPECIAL` or used other than by `RETURN` or +`AGAIN`. + +### 24.1.1. Example + +There follows an example of a `FUNCTION` that does many things wrong. +It is accompanied by commentary, and two better versions of the same +thing. (This function actually occurred in practice. Needless to say, +names are withheld to protect the guilty.) + +Blunt comment: this is terrible. Its purpose is to output the +characters needed by a graphics terminal to draw lines connecting a +set of points. The points are specified by two input lists: `X` values +and `Y` values. The output channel is the third argument. The actual +characters for each line are returned in a `LIST` by the function +`TRANS`. + + > >> + )> + + <.N .Y>>)> + > .L>)> > + )) + > .CHN> + > .L1> + )> >> + +Comments: + +1. `LIST` is only temporarily necessary. It is just created and then + thrown away. +2. Worse, the construct `(!.LIST !)` **copies** the + previous elements of `LIST` every time it is executed! +3. Indexing down the elements of `LIST` as in `<.N .LIST>` takes a + long time, if the `LIST` is long. `<3 ...>` or `<4 ...>` is not + worth worrying about, but `<10 ...>` is, and `<100 ...>` takes + quite a while. Even if the indexing were not phased out, the + compiler would be happier with ``. +4. The variable `CHN` is unnecessary if `OUTCHAN` is bound to the + argument `CHANNEL`. +5. It is tasteful to call `ERROR` in the same way that F/SUBRs do. + This includes using an `ATOM` from the `ERRORS` `OBLIST` (if one + is appropriate) to tell what is wrong, and it includes identifying + yourself. + +So, do it this way: + + ) + >> + )> + > + )> + <1 .Y>>)) + >> + >> + )>> + > + >>> + +Of course, if you know how long is the `LIST` that `TRANS` returns, +you can avoid using the inner `REPEAT` loop and have explicit `PRINC`s +for each element. This can be done even better by using `MAPF`, as in +the next version, which does exactly the same thing as the previous +one, but uses `MAPF` to do the `REST`ing and the end conditional: + + ) + >> + )> + > + #FUNCTION ((XE YE) + #FUNCTION ((T) >) >) + .X + .Y> + "DONE"> + +24.2. Creating a LIST in Forward Order +-------------------------------------- + +If you must create the elements of a `LIST` in sequence from first to +last, you can avoid copying earlier ones when adding a later one to +the end. One way is to use `MAPF` or `MAPR` with a first argument of +`,LIST`: the elements are put on the control stack rather than in free +storage, until the final call to `LIST`. If you know how many elements +there will be, you can put them on the control stack yourself, in a +`TUPLE` built for that purpose. Another way is used when `REPEAT` is +necessary: + + >> + ... + >> + ...> + +Here, `.LAST` always points to the current last element of the `LIST`. +Because of the order of evaluation, the `` could also be +written `>`. + +24.3. Read-only Free Variables +------------------------------ + +If a Function uses the value of a free variable +(`` or ``) without changing +it, the compiled version may be more efficient if the value is +assigned to a dummy `UNSPECIAL` `ATOM` in the Function's `"AUX"` list. +This is true because an `UNSPECIAL` `ATOM` gets compiled into a slot +on the control stack, which is accessible very quickly. The tradeoff +is probably worthwhile if a *special* is referenced more than once, or +if an *unmanifest* is referenced more than twice. Example: + + >) + > .THINGS>> + +24.4. Global and Local Values +----------------------------- + +In the interpreter the sequence `,X .X ,X .X` is slower than +`,X ,X .X .X` because of interference between the `GVAL` and `LVAL` +mechanisms (appendix 1). Thus it is not good to use both the `GVAL` +and `LVAL` of the same `ATOM` frequently, unless references to the +`LVAL` will be compiled away (made into control stack references). + +24.5. Making Offsets for Arrays +------------------------------- + +It is often the case that you want to attach some meaning to each +element of an array and access it independently of other elements. +Firstly, it is a good idea to use names (`ATOM`s) rather than integers +(`FIX`es or even `OFFSET`s) for offsets into the array, to make future +changes easier. Secondly, it is a good idea to use the `GVAL`s of the +name `ATOM`s to remember the actual `FIX`es, so that the `ATOM`s can +be `MANIFEST` for the compiler's benefit. Thirdly, to establish the +`GVAL`s, both the interpreter and the compiler will be happier with +`` rather than +`>`. + +24.6. Tables +------------ + +There are several ways in Muddle to store a table, that is, a +collection of (names and) values that will be searched. +Unsurprisingly, choosing the best way is often dictated by the size of +the table and/or the nature of the (names and) values. + +For a small table, the names and values can be put in (separate) +structures -- the choice of `LIST` or array being determined by +volatility and limitability -- which are searched using `MEMQ` or +`MEMBER`. This method is very space-efficient. If the table gets +larger, and if the elements are completely orderable, a (uniform) +vector can be used, kept sorted, and searched with a binary search. + +For a large table, where reasonably efficient searches are required, a +hashing scheme is probably best. Two methods are available in Muddle: +associations and `OBLIST`s. + +In the first method, `PUTPROP` and `GETPROP` are used, which are very +fast. The number of hashing buckets is fixed. Duplicates are +eliminated by `==?` testing. If it is necessary to use `=?` testing, +or to find all the entries in the table, you can duplicate the table +in a `LIST` or array, to be used only for those purposes. + +In the second method, `INSERT` and `LOOKUP` on a specially-built +`OBLIST` are used. (If the names are not `STRING`s, they can be +converted to `STRING`s using `UNPARSE`, which takes a little time.) +The number of hashing buckets can be chosen for best efficiency. +Duplicates are eliminated by `=?` testing. MAPF/R can be used to find +all the entries in the table. + +24.7. Nesting +------------- + +The beauty of deeply-nested control structures in a single `FUNCTION` +is definitely in the eye of the beholder. (`PPRINT`, a preloaded +`RSUBR`, finds them trying. However, the compiler often produces +better code from them.) **If** you don't like excessive nesting, then +you will agree that + + + ...) ...> + +looks better than + + > ...) ...> + +and that + + )> + ... + ...> + +looks better than + + ) + (ELSE ...)> + ...> + +You can see the nature of the choices. Nesting is still and all better +than `GO`. + +Appendix 1. A Look Inside +========================= + +This appendix tells about the mapping between MDL objects and PDP-10 +storage -- in other words, the way things look "on the inside". None +of this information is essential to knowing how to program in MDL, but +it does give some reasons for capabilities and restrictions that +otherwise you have to memorize. The notation and terminology get a +little awkward in this discussion, because we are in a twilight zone +between the worlds of MDL objects and of bit patterns. In general the +words and phrases appearing in diagrams refer to bit patterns not MDL +objects. A lower-case word (like "tuple") refers to the storage +occupied by an object of the corresponding `PRIMTYPE` (like `TUPLE`). + +First some terminology needs discussion. The sine qua non of any MDL +object is a **pair** of 36-bit computer words. In general, lists +consist of pairs chained together by pointers (addresses), and vectors +consist of contiguous blocks of pairs. `==?` essentially tests two +pairs to see whether they contain the same bit patterns. + +The first (lower-addressed) word of a pair is called the **`TYPE` +word**, because it contains a numeric **`TYPE` code** that represents +the object's `TYPE`. The second (higher-addressed) word of a pair is +called the **value word**, because it contains (part of or the +beginning of) the "data part" of the object. The `TYPE` word (and +sometimes the value word) is considered to be made of a left half and +a right half. We will picture a pair like this: + + --------------------------------- + | TYPE | | + | - - - - - - - - - - - - - - - | + | value | + --------------------------------- + +where a vertical bar in the middle of a word means the word's halves +are used independently. You can see that the `TYPE` code is confined +to the left half of the `TYPE` word. (Half-)words are sometimes +subdivided into **fields** appropriate for the context; fields are +also pictured as separated by vertical bars. The right half of the +`TYPE` word is used for different purposes depending on the `TYPE` of +the object and actual location of the value. + +Actually the 18-bit `TYPE` field is further decoded. The high-order +(leftmost) bit is the mark bit, used exclusively by the garbage +collector when it runs. The next two bits are monitor bits, used to +cause `"READ"` and `"WRITE"` interrupts on read and write references +to the pair. The next bit is used to differentiate between list +elements and vector dope words. The next bit is unused but could be +used in the future for an "execute" monitor. The remaining 13 bits +specify the actual `TYPE` code. What `CHTYPE` does is to copy the pair +and put a new `TYPE` code into the new pair. + +Each data `TYPE` (predefined and `NEWTYPE`s) must belong to one of +about 25 "storage allocation classes" (roughly corresponding to MDL +`PRIMTYPE`s). These classes are characterized primarily by the manner +in which the garbage collector treats them. Some of these classes will +now be described. + +"One Word" + +This class includes all data that are not pointers to some kind of +structure. All external (program-available) `TYPE`s in this class are +of `PRIMTYPE` `WORD`. Example: + + --------------------------------- + | FIX | 0 | + | - - - - - - - - - - - - - - - | + | 105 | + --------------------------------- + +"Two Word" + +The members of this class are all 18-bit pointers to list elements. +All external `TYPE`s in this class are of `PRIMTYPE` `LIST`. Example: + + --------------------------------- + | LIST | 0 | + | - - - - - - - - - - - - - - - | + | 0 | pointer | + --------------------------------- + +where `pointer` is a pointer to the first list element. If there are +no elements, `pointer` is zero; thus empty objects of `PRIMTYPE` +`LIST` are `==?` if their `TYPE`s are the same. + +"Two N Word" + +Members of this class are all "counting pointers" to blocks of +two-word pairs. The right half of a counting pointer is an address, +and the left half is the negative of the number of 36-bit words in the +block. (This format is tailored to the PDP-10 `AOBJN` instruction.) +The number of pairs in the block (`LENGTH`) is half that number, since +each pair is two words. All external `TYPE`s in this class are of +`PRIMTYPE` `VECTOR`. Example: + + --------------------------------- + | VECTOR | 0 | + | - - - - - - - - - - - - - - - | + | -2*length | pointer | + --------------------------------- + +where `length` is the `LENGTH` of the `VECTOR` and `pointer` is the +location of the start (the element selected by an `NTH` argument of 1) +of the `VECTOR`. + +"N word" + +This class is the same as the previous one, except that the block +contains objects all of the same `TYPE` without individual `TYPE` +words. The `TYPE` code for all the elements is in vector dope words, +which are at addresses just larger than the block itself. Thus, any +object that carries information in its `TYPE` word cannot go into the +block: `PRIMTYPE`s `STRING`, `BYTES`, `TUPLE` (and the corresponding +locatives `LOCS`, `LOCB`, `LOCA`), `FRAME`, and `LOCD`. All external +`TYPE`s in this class are of `PRIMTYPE` `UVECTOR`. Example: + + --------------------------------- + | UVECTOR | 0 | + | - - - - - - - - - - - - - - - | + | -length | pointer | + --------------------------------- + +where `length` is the `LENGTH` of the `UVECTOR` and `pointer` points +to the beginning of the `UVECTOR`. + +"Byte String" and "Character String" + +These two classes are almost identical. Byte strings are byte pointers +to strings of arbitrary-size bytes. `PRIMTYPE` `BYTES` is the only +member of this class. Character strings are byte pointers to strings +of ASCII characters. `PRIMTYPE` `STRING` is the only member of this +class. Both of these classes consist of a length and a PDP-10 byte +pointer. In the case of character strings, the byte-size field in the +byte pointer is always seven bits per byte (hence five bytes per +word). Example: + + --------------------------------- + | STRING | length | + | - - - - - - - - - - - - - - - | + | byte-pointer | + --------------------------------- + +where `length` is the `LENGTH` of the `STRING` (in bytes) and +`byte-pointer` points to a byte just before the beginning of the +string (an `ILDB` instruction is needed to get the first byte). A +newly-created `STRING` always has `*010700*` in the left half of +`byte-pointer`. Unless the string was created by `SPNAME`, +`byte-pointer` points to a uvector, where the elements (characters) of +the `STRING` are stored, packed together five to a word. + +"Frame" + +This class gives the user program a handle on its control and +variable-reference structures. All external `TYPE`s in this class are +of `PRIMTYPE` `FRAME`. Three numbers are needed to designate a frame: +a unique 18-bit identifying number, a pointer to the frame's storage +on a control stack, and a pointer to the `PROCESS` associated with the +frame. Example: + + --------------------------------- + | FRAME |PROCESS-pointer| + | - - - - - - - - - - - - - - - | + | unique-id | frame-pointer | + --------------------------------- + +where `PROCESS-pointer` points to the dope words of a `PROCESS` +vector, and `unique-id` is used for validating (testing `LEGAL?`) the +`frame-pointer`, which points to a frame for some Subroutine call on +the control stack. + +"Tuple" + +A tuple pointer is a counting pointer to a vector on the control +stack. It may be a pointer to the arguments to a Subroutine or a +pointer generated by the `"TUPLE"` declaration in a `FUNCTION`. Like +objects in the previous class, these objects contain a unique +identifying number used for validation. `PRIMTYPE` `TUPLE` is the only +member of this class. Example: + + --------------------------------- + | TUPLE | unique-id | + | - - - - - - - - - - - - - - - | + | -2*length | pointer | + --------------------------------- + +Other Storage Classes + +The rest of the storage classes include strictly internal `TYPE`s and +pointers to special kinds of lists and vectors like locatives, `ATOM`s +and `ASOC`s. A pair for any `LOCATIVE` except a `LOCD` looks like a +pair for the corresponding structure, except of course that the `TYPE` +is different. A `LOCD` pair looks like a tuple pair and needs a word +and a half for its value; the `unique-id` refers to a binding on the +control stack or to the "global stack" if zero. Thus `LOCD`s are in a +sense "stack objects" and are more restricted than other locatives. + +An `OFFSET` is stored with the `INDEX` in the right half of the value +word and the Pattern in the left half. Since the Pattern can be either +an `ATOM` or a `FORM`, the left half actually points to a pair, which +points to the actual Pattern. The Patttern `ANY` is recognized as a +special case: the left-half pointer is zero, and no pair is used. +Thus, if you're making the production version of your program and want +to save some storage, can do something like +`>` for all `OFFSET`s. + +Basic Data Structures +--------------------- + +Lists + +List elements are pairs linked together by the right halves of their +first words. The list is terminated by a zero in the right half of the +last pair. For example the LIST (1 2 3) would look like this: + + ------------- + | LIST | 0 | + | - - - - - | ----------- ----------- ----------- + | 0 | ------>| FIX | ------->| FIX | ------->| FIX | 0 | + ------------- | - - - - | | - - - - | | - - - - | + | 1 | | 2 | | 3 | + ----------- ----------- ----------- + +The use of pointers to tie together elements explains why new elements +can be added easily to a list, how sharing and circularity work, etc. +The links go in only one direction through the list, which is why a +list cannot be `BACK`ed or `TOP`ped: there's no way to find the +`REST`ed elements. + +Since some MDL values require a word and a half for the value in the +pair, they do not fit directly into list elements. This problem is +solved by having "deferred pointers". Instead of putting the datum +directly into the list element, a pointer to another pair is used as +the value with the special internal `TYPE` `DEFER`, and the real datum +is put in the deferred pair. For example the `LIST` `(1 "hello" 3)` +would look like this: + + ------------- + | LIST | 0 | + | - - - - - | ----------- ----------- ----------- + | 0 | ------>| FIX | ------->|DEFER| ------->| FIX | 0 | + ------------- | - - - - | | - - - - | | - - - - | + | 1 | | ----- | 3 | + ----------- ----------- | ----------- + | + ----------- | + |STRING| 5|<- + | - - - - | + |byte-pntr| + ----------- + +Vectors + +A vector is a block of contiguous words. More than one pair can point +to the block, possibly at different places in the block; this is how +sharing occurs among vectors. Pointers that are different arise from +`REST` or `GROW`/`BACK` operations. The block is followed by two "dope +words", at addresses just larger than the largest address in the +block. Dope words have the following format: + + / / + | | + | | + --------------------------------- + | type | grow | + | - - - - - - - - - - - - - - - | + | length | gc | + --------------------------------- + +The various fields have the following meanings: + +`type` -- The fourth bit from the left (the "vector bit", `40000` +octal) is always one, to distinguish these vector dope words from a +`TYPE`/value pair. + +If the high-order bit is zero, then the vector is a `UVECTOR`, and the +remaining bits specify the uniform `TYPE` of the elements. `CHUTYPE` +just puts a new `TYPE` code in this field. Each element is limited to +a one-word value: clearly `PRIMTYPE` `STRING`s and `BYTES`es and stack +objects can't go in uniform vectors. + +If the high-order bit is one and the `TYPE` bits are zero, then this +is a regular `VECTOR`. + +If the high-order bit is one and the `TYPE` bits are not all zero, +then this is either an `ATOM`, a `PROCESS`, an `ASOC`, or a +`TEMPLATE`. The special internal format of these objects will be +described a little later in this appendix. + +`length` -- The high-order bit is the mark bit, used by the garbage +collector. The rest of this field specifies the number of words in the +block, including the dope words. This differs from the length given in +pairs pointing to this vector, since such pairs may be the result of +`REST` operations. + +`grow` -- This is actually two nine-bit fields, specifying either +growth or shrinkage at both the high and low ends of the vector. The +fields are usually set only when a stack must be grown or shrunk. + +`gc` -- This is used by the garbage collector to specify where this +vector is moving during compaction. + +Examples (numbers in octal): the `VECTOR` `[1 "bye" 3]` looks like: + + --------------- + | VECTOR | 0 | + | - - - - - - | ----------------- + | -6 | ----------->| FIX | | + --------------- | - - - - - - - | + | 1 | + ----------------- + | STRING | 3 | + | - - - - - - - | + | byte pointer | + ----------------- + | FIX | | + | - - - - - - - | + | 3 | + ----------------- + | 440000 | 0 | + | - - - - - - - | + | 10 | | + ----------------- + +The `UVECTOR` `![-1 7 -4!]` looks like: + + --------------- + | UVECTOR | 0 | + | - - - - - - | ----------------- + | -3 | ----------->| -1 | + --------------- ----------------- + | 7 | + ----------------- + | -4 | + ----------------- + | 40000+FIX | 0 | + | - - - - - - - | + | 5 | | + ----------------- + +Atoms + +Internally, atoms are special vector-like objects. An atom contains a +value cell (the first two words of the block, filled in whenever the +global or local value of the `ATOM` is referenced and is not already +there), an `OBLIST` pointer, and a print name (`PNAME`), in the +following format: + + --------------------------------- + | type | bindid | + --------------------------------- + | pointer-to-value | + --------------------------------- + | pointer-to-oblist | + --------------------------------- + | print-name | + / / + / / + |(ASCII with NUL padding on end)| + --------------------------------- + | ATOM | valid-type | + | - - - - - - - - - - - - - - - | + | length | gc | + --------------------------------- + +If the type field corresponds to `TYPE` `UNBOUND`, then the `ATOM` is +locally and globally unbound. (This is different from a pair, where +the same `TYPE` `UNBOUND` is used to mean unassigned.) If it +corresponds to `TYPE` `LOCI` (an internal `TYPE`), then the value cell +points either to the global stack, if `bindid` is zero, or to a local +control stack, if `bindid` is non-zero. The `bindid` field is used to +verify whether the local value pointed to by the value cell is valid +in the current environment. The `pointer-to-OBLIST` is either a +counting pointer to an oblist (uvector). a positive offset into the +"transfer vector" (for pure `ATOM`s), or zero, meaning that this +`ATOM` is not on an `OBLIST`. The `valid-type` field tells whether or +not the `ATOM` represents a `TYPE` and if so the code for that `TYPE`: +`grow` values are never needed for atoms. + +Associations + +Associations are also special vector-like objects. The first six words +of the block contain `TYPE`/value pairs for the `ITEM`, `INDICATOR` +and `AVALUE` of the `ASOC`. The next word contains forward and +backward pointers in the chain for that bucket of the association hash +table. The last word contains forward and backward pointers in the +chain of all the associations. + + --------------------------------- + | ITEM | + | - - - - - - - - - - - - - - - | + | pair | + --------------------------------- + | INDICATOR | + | - - - - - - - - - - - - - - - | + | pair | + --------------------------------- + | AVALUE | + | - - - - - - - - - - - - - - - | + | pair | + --------------------------------- + | bucket-chain-pointers | + --------------------------------- + | association-chain-pointers | + --------------------------------- + | ASOC | 0 | + | - - - - - - - - - - - - - - - | + | 12 octal | gc | + --------------------------------- + +`PROCESS`es + +A `PROCESS` vector looks exactly like a vector of `TYPE`/value pairs. +It is different only in that the garbage collector treats it +differently from a normal vector, and it contains extremely volatile +information when the `PROCESS` is `RUNNING`. + +Templates + +In a template, the number in the type field (left half or first dope +word) identifies to which "storage allocation class" this `TEMPLATE` +belongs, and it is used to find PDP-10 instructions in internal tables +(frozen uvectors) for performing `LENGTH`, `NTH`, and `PUT` operations +on any object of this `TYPE`. The programs to build these tables are +not part of the interpreter, but the interpreter does know how to use +them properly. The compiler can put these instructions directly in +compiled programs if a `TEMPLATE` is never `REST`ed; otherwise it must +let the interpreter discover the appropriate instruction. The value +word of a template pair contains, not a counting pointer, but the +number of elements that have been `REST`ed off in the left half and a +pointer to the first dope word in the right half. + +The Control Stack +----------------- + +Accumulators with symbolic names `AB`, `TB`, and `TP` are all pointers +into the `RUNNING` `PROCESS`'s control stack. `AB` ("argument base") +is a pointer to the arguments to the Subroutine now being run. It is +set up by the Subroutine-call mediator, and its old value is always +restored after a mediated Subroutine call returns. `TB` ("temporaries +base") points to the frame for the running Subroutine and also serves +as a stack base pointer. The `TB` pointer is really all that is +necessary to return from a Subroutine -- given a value to return, for +example by `ERRET` -- since the frame specifies the entire state of +the calling routine. `TP` ("temporaries pointer") is the actual stack +pointer and always points to the current top of the control stack. + +While we're on the subject of accumulators, we might as well be +complete. Each accumulator contains the value word of a pair, the +corresponding `TYPE` words residing in the `RUNNING` `PROCESS` vector. +When a `PROCESS` is not `RUNNING` (or when the garbage collector is +running), the accumulator contents are stored in the vector, so that +the Objects they point to look like elements of the `PROCESS` and thus +are not garbage-collectible. + +Accumulators `A`, `B`, `C`, `D`, `E` and `O` are used almost entirely +as scratch accumulators, and they are not saved or restored across +Subroutine calls. Of course the interrupt machinery always saves these +and all other accumulators. `A` and `B` are used to return a pair as +the value of a Subroutine call. Other than that special feature, they +are just like the other scratch accumulators. + +`M` and `R` are used in running `RSUBR`s. `M` is always set up to +point to the start of the `RSUBR`'s code, which is actually just a +uniform vector of instructions. All jumps and other references to the +code use `M` as an index register. This makes the code +location-insensitive, which is necessary because the code uvector will +move around. `R` is set up to point to the vector of objects needed by +the `RSUBR`. This accumulator is necessary because objects in +garbage-collected space can move around, but the pointers to them in +the reference vector are always at the same place relative to its +beginning. + +`FRM` is the internal frame pointer, used in compiled code to keep +track of pending Subroutine calls when the control stack is heavily +used. `P` is the internal-stack pointer, used primarily for internal +calls in the interpreter. + +One of the nicest features of the MDL environment is the uniformity of +the calling and returning sequence. All Subroutines -- both built-in +F/SUBRs and compiled `RSUBR(-ENTRY)`s -- are called in exactly the +same way and return the same way. Arguments are always passed on the +control stack and results always end up in the same accumulators. For +efficiency reasons, a lot of internal calls within the interpreter +circumvent the calling sequence. However, all calls made by the +interpreter when running user programs go through the standard calling +sequence. + +A Subroutine call is initiated by one of three UUOs (PDP-10 +instructions executed by software rather than hardware). `MCALL` ("MDL +call") is used when the number of arguments is known at assemble or +compile time, and this number is less than 16. `QCALL` ("quick call") +may be used if, in addition, an `RSUBR(-ENTRY)` is being called that +can be called "quickly" by virtue of its having special information in +its reference vector. `ACALL` ("accumulator call") is used otherwise. +The general method of calling a Subroutine is to `PUSH` (a PDP-10 +instruction) pairs representing the arguments onto the control stack +via `TP` and then either (1) `MCALL` or `QCALL` or (2) put the number +of arguments into an accumulator and `ACALL`. Upon return the object +returned by the Subroutine will be in accumulators `A` and `B`, and +the arguments will have been `POP`ped off the control stack. + +The call mediator stores the contents of `P` and `TP` and the address +of the calling instruction in the current frame (pointed to by `TB`). +It also stores MDL's "binding pointer" to the topmost binding in the +control stack. (The bindings are linked together through the control +stack so that searching through them is more efficient than looking at +every object on the stack.) This frame now specifies the entire state +of the caller when the call occurred. The mediator then builds a new +frame on the control stack and stores a pointer back to the caller's +frame (the current contents of `TB`), a pointer to the Subroutine +being called, and the new contents of `AB`, which is a counting +pointer to the arguments and is computed from the information in the +`MCALL` or `QCALL` instruction or the `ACALL` accumulator. `TB` is +then set up to point to the new frame, and its left half is +incremented by one, making a new `unique-id`. The mediator then +transfers control to the Subroutine. + +A control stack frame has seven words as shown: + + --------------------------------- + | ENTRY | called-addr | + --------------------------------- + | unique-id | prev frame | + --------------------------------- + | argument pointer | + --------------------------------- + | saved binding pointer | + --------------------------------- + | saved P | + --------------------------------- + | saved TP | + --------------------------------- + | saved calling address | + --------------------------------- + +The first three words are set up during the call to the Subroutine. +The rest are filled in when this routine calls another Subroutine. The +left half of `TB` is incremented every time a Subroutine call occurs +and is used as the `unique-id` for the frame, stored in frame and +tuple pairs as mentioned before. Obviously this `id` is not strictly +unique, since each 256K calls it wraps around to zero. The right half +of `TB` is always left pointing one word past the +saved-calling-address word in the frame. `TP` is also left pointing at +that word, since that is the top of the control stack at Subroutine +entry. The arguments to the called Subroutine are below the frame on +the control stack (at lower storage addresses), and the temporaries +for the called Subroutine are above the frame (at higher storage +addresses). These arguments and temporaries are just pairs stored on +the control stack while needed: they are all that remain of +`UNSPECIAL` values in compiled programs. + +The following figure shows what the control stack might look like +after several Subroutine calls. + + / / + | | + ----------------- + | | + | args for S1 | + | | + ----------------- + | frame for S1 | + ----------------- <-- + | | | + | temps for S1 | | + | | | + ----------------- | + | | | + | args for S2 | | + | | | + ----------------- | + | frame for S2 | --- + ----------------- <------ + | | | + | temps for S2 | | + | | | + ----------------- | + | args for S3 | | + ----------------- | + | frame for S3 | ------- + ----------------- + | | + | temps for S3 | + | | + | | + ----------------- + (top) + +The above figure shows the frames all linked together through the +control stack (the "execution path"), so that it is easy to return to +the caller of a given Subroutine (`ERRET` or `RETRY`). + +Subroutine exit is accomplished simply by the call mediator, which +loads the right half of `TB` from the previous frame pointer, restores +the "binding pointer", `P`, and `TP`, and transfers control back to +the instruction following the saved calling address. + +Variable Bindings +----------------- + +All local `ATOM` values are kept on the control stack of the `PROCESS` +to which they are local. As described before, the atom contains a word +that points to the value on the control stack. The pointer is actually +to a six-word "binding block" on the control stack. Binding blocks +have the following format: + + --------------------------------- + | BIND or UBIND | prev | + --------------------------------- + | pointer to ATOM | + --------------------------------- + | value | + | - - - - - - - - - - - - - - - | + | pair | + --------------------------------- + | decl | unique-id | + --------------------------------- + | previous-binding | + --------------------------------- + +where: + +- `BIND` means this is a binding for a `SPECIAL` `ATOM` (the only + kind used by compiled programs), and `UBIND` means this is a + binding for an `UNSPECIAL` `ATOM` -- for `SPECIAL` checking by the + interpreter; +- `prev` points to the closest previous binding block for any `ATOM` + (the "access path" -- `UNWIND` objects are also linked in this + chain); +- `decl` points to a `DECL` associated with this value, for + `SET(LOC)` to check; +- `unique-id` is used for validation of this block; and +- `previous-binding` points to the closest previous binding for this + `ATOM` (used in unbinding). + +Bindings are generated by an internal subroutine called `SPECBIND` +(name comes from `SPECIAL`). The caller to `SPECBIND` `PUSH`es +consecutive six-word blocks onto the control stack via `TP` before +calling `SPECBIND`. The first word of each block contains the `TYPE` +code for `ATOM` in its left half and all ones in its right half. +`SPECBIND` uses this bit pattern to identify the binding blocks. +`SPECBIND`'s caller also fills in the next three words and leaves the +last two words empty. `SPECBIND` fills in the rest and leaves the +"binding pointer" pointing at the topmost binding on the control +stack. `SPECBIND` also stores a pointer to the current binding in the +value cell of the atom. + +Unbinding is accomplished during Subroutine return. When the previous +frame is being restored, the call mediator checks to see if the saved +"binding pointer" and the current one are different; if they are, +`SPECSTORE` is called. `SPECSTORE` runs through the binding blocks, +restoring old value pointers in atoms until the "binding pointer" is +equal to the one saved in the frame. + +Obviously variable binding is more complicated than this, because +`ATOM`s can have both local and global values and even different local +values in different `PROCESS`es. The solution to all of these +additional problems lies in the `bindid` field of the atom. Each +`PROCESS` vector also contains a current `bindid`. Whenever an ATOM's +local value is desired, the `RUNNING` `PROCESS`'s `bindid` is checked +against that of the atom: if they are the same, the atom points to the +current value; if not, the current `PROCESS`'s control stack must be +searched to find a binding block for this `ATOM`. This binding scheme +might be called "shallow binding". The searching is facilitated by +having all binding blocks linked together. Accessing global variables +is accomplished in a similar way, using a `VECTOR` that is referred to +as the "global stack". The global stack has only an `ATOM` and a value +slot for each variable, since global values never get rebound. + +`EVAL` with respect to a different environment causes some additional +problems. Whenever this kind of `EVAL` is done, a brand new `bindid` +is generated, forcing all current local value cells of atoms to appear +invalid. Local values must now be obtained by searching the control +stack, which is inefficient compared to just pulling them out of the +atoms. (The greatest inefficiency occurs when an `ATOM`'s `LVAL` is +never accessed twice in a row in the same environment.) A special +block is built on the control stack and linked into the binding-block +chain. This block is called a "skip block" or "environment splice", +and it diverts the "access path" to the new environment, causing +searches to become relative to this new environment. \ No newline at end of file