From: Jason Self Date: Thu, 6 Jul 2017 00:40:29 +0000 (-0700) Subject: language.md: Add chapter 19 - 24 X-Git-Url: https://jxself.org/git/?a=commitdiff_plain;h=96d4de443ff63799b8ebfbe193887b213793ecc6;p=mudman.git language.md: Add chapter 19 - 24 --- diff --git a/md/language.md b/md/language.md index c638792..4494f87 100644 --- a/md/language.md +++ b/md/language.md @@ -7014,3 +7014,2167 @@ the left; if *amount* is negative, rotation is to the right. Examples: #WORD *000000001000* $ #WORD *100000000000* + +Chapter 19. Compiled Programs +============================= + +19.1. RSUBR (the TYPE) +---------------------- + +`RSUBR`s ("relocatable subroutines") are machine-language programs +written to run in the MDL environment. They are usually produced by +the MDL assembler (often from output produced by the compiler) +although this is not necessary. All `RSUBR`s have two components: the +"reference vector" and the "code vector". In some cases the code +vector is in pure storage. There is also a set of "fixups" associated +with every `RSUBR`, although it may not be available in the running +MDL. + +19.2. The Reference Vector +-------------------------- + +An `RSUBR` is basically a `VECTOR` that has been `CHTYPE`d to `TYPE` +`RSUBR` via the `SUBR` `RSUBR` (see below). This ex-`VECTOR` is the +reference vector. The first three elements of the reference vector +have predefined meanings: + +- The first element is of `TYPE` `CODE` or `PCODE` and is the impure + or pure code vector respectively. +- The second element is an `ATOM` and specifies the name of the + `RSUBR`. +- The third element is of `TYPE` `DECL` and declares the + type/structure of the `RSUBR`'s arguments and result. + +The rest of the elements of the reference vector are objects in +garbage-collected storage that the `RSUBR` needs to reference and any +impure slots that the `RSUBR` needs to use. + +When the `RSUBR` is running, one of the PDP-10 accumulators (with +symbolic name `R`) is always pointing to the reference vector, to +permit rapid access to the various elements. + +19.3. RSUBR Linking +------------------- + +`RSUBR`s can call any `APPLICABLE` object, all in a uniform manner. In +general, a call to an F/SUBR is linked up at assembly/compile time so +that the calling instruction (UUO) points directly at the code in the +interpreter for the F/SUBR. However, the locations of most other +`APPLICABLE`s are not known at assembly/compile time. Therefore, the +calling UUO is set up to point at a slot in the reference vector (by +indexing off accumulator `R`). This slot initially contains the `ATOM` +whose G/LVAL is the called object. The calling mechanism (UUO handler) +causes control to be transferred to the called object and, depending +on the state of the `RSUBR`-link flag, the `ATOM` will be replaced by +its G/LVAL. (If the call is of the "quick" variety, the called `RSUBR` +or `RSUBR-ENTRY` will be `CHTYPE`d to a `QUICK-RSUBR` or +`QUICK-ENTRY`, respectively, before replacement.) Regardless of the +`RSUBR`-link flag's state, calls to `FUNCTION`s are never permanently +linked. A call to a non-Subroutine generates an extra `FRAME`, whose +`FUNCT` is the dummy `ATOM` `CALLER`. + +`RSUBR`s are linked together for faster execution, but linking may not +be desirable if the `RSUBR`s are being debugged, and various revisions +are being re-loaded. A linked call will forever after go to the same +code, regardless of the current G/LVAL of the called `ATOM`. Thus, +while testing `RSUBR`s, you may want to disable linking, by calling +the `RSUBR-LINK` `SUBR` with a `FALSE` argument. Calling it with a +non-`FALSE` argument enables linking thereafter. It returns the +previous state of the link flag, either `T` or `#FALSE ()`. Calling it +with no argument returns the current state. + +19.4. Pure and Impure Code +-------------------------- + +The first element of an `RSUBR` is the code vector, of `TYPE` `CODE` +or `PCODE`. `TYPE` `CODE` is of `PRIMTYPE` `UVECTOR`, and the `UTYPE` +should be of `PRIMTYPE` `WORD`. The code vector is simply a block of +words that are the instructions which comprise the `RSUBR`. Since the +code vector is stored just like a standard `UVECTOR`, it will be moved +around by the garbage collector. Therefore, all `RSUBR` code is +required to be location-insensitive. The compiler guarantees the +location-insensitivity of its output. The assembler helps to make the +code location-insensitive by defining all labels as offsets relative +to the beginning of the code vector and causing instructions that +refer to labels to index automatically off the PDP-10 accumulator +symbolically named `M`. `M`, like `R`, is set up by the UUO handler, +but it points to the code vector instead of the reference vector. The +code vector of an `RSUBR` can be frozen (using the `FREEZE` `SUBR`) to +prevent it from moving during debugging by DDT in the superior +operating-system process. + +If the first element of an `RSUBR` is of `TYPE` `PCODE` ("pure code"), +the code vector of the `RSUBR` is pure and sharable. `TYPE` `PCODE` is +of `PRIMTYPE` `WORD`. The left half of the word specifies an offset +into an internal table of pure `RSUBR`s, and the right half specifies +an offset into the block of code where this `RSUBR` starts. The +`PCODE` prints out as: + + % + +where *name* names the entry in the user's pure-`RSUBR` table, and +*offset* is the offset. (Obviously, `PCODE` is also the name of a +`SUBR`, which generates a pure code vector.) Pure `RSUBR`s may also +move around, but only by being included in MDL's page map at different +places. Once again `M` can be used exactly as before to do +location-independent address referencing. Individual pure code vectors +can be "unmapped" (marked as being not in primary storage but in their +original pure-code disk files) if the space in storage allocated for +pure code is exhausted. An unmapped `RSUBR` is mapped in again +whenever needed. All pure `RSUBR`s are unmapped before a `SAVE` file +is written, so that the code is not duplicated on disk. A purified +`RSUBR` must use `RGLOC` ("relative GLOC") instead of `GLOC`. `RGLOC` +produces objects of `TYPE` `LOCR` instead of `LOCD`. + +19.5. TYPE-C and TYPE-W +======================= + +In order to handle user `NEWTYPE`s reasonably, the internal `TYPE` +codes for them have to be able to be different from one MDL run to +another. Therefore, references to the `TYPE` codes must be in the +reference vector rather than the code vector. To help handle this +problem, two `TYPE`s exist, `TYPE-C` ("type code") and `TYPE-W` ("type +word"), both of `PRIMTYPE` `WORD`. They print as: + + % + % + +The `SUBR` `TYPE-C` produces an internal `TYPE` code for the *type*, +and `TYPE-W` produces a prototype "`TYPE` word" (appendix 1) for an +object of that `TYPE`. The *primtype* argument is optional, included +only as a check against the call to `NEWTYPE`. `TYPE-W` can also take +a third argument, of `PRIMTYPE` `WORD`, whose right half is included +in the generated "`TYPE` word". If *type* is not a valid `TYPE`, a +`NEWTYPE` is automatically done. + +To be complete, a similar `SUBR` and `TYPE` should be mentioned here. + + + +produces an internal "storage allocation code" (appendix 1) for the +*type*. The value is of `TYPE` `PRIMTYPE-C`, `PRIMTYPE` `WORD`. In +almost all cases the `SUBR` `TYPEPRIM` gives just as much information, +except in the case of `TEMPLATE`s: all `TYPE`s of `TEMPLATE`s have the +same `TYPEPRIM`, but they all have different `PRIMTYPE-C`s. + +19.6. RSUBR (the SUBR) +---------------------- + + + +`CHTYPE`s its argument to an `RSUBR`, after checking it for legality. +`RSUBR` is rarely called other than in the MDL Assembler (Lebling, +1979). It can be used if changes must be made to an `RSUBR` that are +prohibited by MDL's built-in safety mechanisms. For example, if the +`GVAL` of *name* is an `RSUBR`: + + >$ + [...] + + ...(changes to .FIXIT)... + + >$ + #RSUBR [...] + +19.7. RSUBR-ENTRY +----------------- + +`RSUBR`s can have multiple entry points. An `RSUBR-ENTRY` can be +applied to arguments exactly like an `RSUBR`. + + + +returns the `VECTOR` argument `CHTYPE`d to an `RSUBR-ENTRY` into the +*rsubr* at the specified *offset*. If the `RSUBR-ENTRY` is to have a +`DECL` (`RSUBR` style), it should come as shown. + + + +("entry location") returns the *offset* into the `RSUBR` of this +entry. + +19.8. RSUBRs in Files +--------------------- + +There are three kinds of files that can contain `RSUBR`s, identified +by second names `BINARY`, `NBIN` and `FBIN`. There is nothing magic +about these names, but they are used by convention. + +A `BINARY` file is a completely ASCII file containing complete impure +`RSUBR`s in character representation. Even a code vector appears as +`#CODE` followed by a `UVECTOR` of `PRIMTYPE` `WORD`s. `BINARY` files +are generally slow to load, because of all the parsing that must be +done. + +An `NBIN` file contains a mixture of ASCII characters and binary code. +The start of a binary portion is signalled to `READ` by the character +control-C, so naive readers of an `NBIN` file under ITS may +incorrectly assume that it ends before any binary code appears. An +`NBIN` file cannot be edited with a text editor. An `RSUBR` is written +in `NBIN` format by being `PRINT`ed on a `"PRINTB"` `CHANNEL`. The +`RSUBR`s in `NBIN` files are not purified either. + +An `FBIN` file is actually part of a triad of files. The `FBIN` +file(s) itself is the impure part of a collection of purified +`RSUBR`s. It is simply ASCII and can be edited at will. (Exception: in +the ITS and Tops-20 versions, the first object in the file should not +be removed or changed in any way, lest a "grim reaper" program for +`FBIN` files think that the other files in the triad are obsolete and +delete them.) The pure code itself resides (in the ITS and Tops-20 +versions) in a special large file that contains all currently-used +pure code, or (in the Tenex version) in a file in a special disk +directory with first name the same as the *name* argument to `PCODE` +for the `RSUBR`. The pure-code file is page-mapped directly into MDL +storage in read-only mode. It can be unmapped when the pure storage +must be reclaimed, and it can be mapped at a different storage address +when pure storage must be compacted. There is also a "fixup" file (see +below) or portion of a file associated with the `FBIN` to round out +the triad. + +An initial MDL can have pure `RSUBR`s in it that were "loaded" during +the initialization procedure. The files are not page-mapped in until +they are actually needed. The "loading" has other side effects, such +as the creation of `OBLIST`s (chapter 15). Exactly what is pre-loaded +is outside the scope of this document. + +19.9. Fixups +------------ + +The purpose of "fixups" is to correct references in the `RSUBR` to +parts of the interpreter that change from one release of MDL to the +next. The reason the fixups contain a release number is so that they +can be completely ignored when an `RSUBR` is loaded into the same +release of MDL as that from which it was last written out. + +There are three forms of fixups, corresponding to the three kinds of +`RSUBR` files. ASCII `RSUBR`s, found in `BINARY` files, have ASCII +fixups. The fixups are contained in a `LIST` that has the following +format: + + (MDL-release:fix + name:atom value:fix (use:fix use:fix ...) + name:atom value:fix (use:fix use:fix ...) + ...) + +The fixups in `NBIN` files and the fixup files associated with `FBIN` +files are in a fast internal format that looks like a `UVECTOR` of +`PRIMTYPE` `WORD`s. + +Fixups are usually discarded after they are used during the loading +procedure. However, if, while reading a `BINARY` or `NBIN` file the +`ATOM` `KEEP-FIXUPS!-` has a non-`FALSE` `LVAL`, the fixups will be +kept, via an association between the `RSUBR` and the `ATOM` `RSUBR`. +It should be noted that, besides correcting the code, the fixups +themselves are corrected when `KEEP-FIXUPS` is bound and true. Also, +the assembler and compiler make the same association when they first +create an `RSUBR`, so that it can be written out with its fixups. + +In the case of pure `RSUBR`s (`FBIN` files), things are a little +different. If a pure-code file exists for this release of MDL, it is +used immediately, and the fixups are completely ignored. If a +pure-code file for this release doesn't exist, the fixup file is used +to create a new copy of the file from an old one, and also a new +version of the fixup file is created to go with the new pure-code +file. This all goes on automatically behind the user's back. + +Chapter 20. Coroutines +====================== + +This chapter purports to explain the coroutine primitives of MDL. It +does make some attempt to explain coroutines as such, but only as +required to specify the primitives. If you are unfamiliar with the +basic concepts, confusion will probably reign. + +A coroutine in MDL is implemented by an object of `TYPE` `PROCESS`. In +this manual, this use of the word "process" is distinguished by a +capitalization from its normal use of denoting an operating-system +process (which various systems call a process, job, fork, task, etc.). + +MDL's built-in coroutine primitives do not include a "time-sharing +system". Only one `PROCESS` is ever running at a time, and control is +passed back and forth between `PROCESS`es on a coroutine-like basis. +The primitives are sufficient, however, to allow the writing of a +"time-sharing system" **in MDL**, with the additional use of the MDL +interrupt primitives. This has, in fact, been done. + +20.1. PROCESS (the TYPE) +------------------------ + +A `PROCESS` is an object which contains the "current state" of a +computation. This includes the `LVAL`s of `ATOM`s ("bindings"), +"depth" of functional application, and "position" within the +application of each applied function. Some of the things which are +**not** part of any specific `PROCESS` are the `GVAL`s of `ATOM`s, +associations (`ASOC`s), and the contents of `OBLIST`s. `GVAL`s (with +`OBLIST`s) are a chief means of communication and sharing between +`PROCESS`es (all `PROCESS`es can refer to the `SUBR` which is the +`GVAL` of `+`, for instance.) Note that an `LVAL` in one `PROCESS` +cannot easily be directly referenced from another `PROCESS`. + +A `PROCESS` `PRINT`s as `#PROCESS` *p*, where *p* is a `FIX` which +uniquely identifies the `PROCESS`; *p* is the "`PROCESS` number" typed +out by `LISTEN`. A `PROCESS` cannot be read in by `READ`. + +The term "run a `PROCESS`" will be used below to mean "perform some +computation, using the `PROCESS` to record the intermediate state of +that computation". + +N.B.: A `PROCESS` is a rather large object; creating one will often +cause a garbage collection. + +20.2. STATE of a PROCESS +------------------------ + + + +returns an `ATOM` (in the `ROOT` `OBLIST`) which indicates the "state" +of the `PROCESS` *process*. The `ATOM`s which `STATE` can return, and +their meanings, are as follows: + +- `RUNABLE` (sic) -- *process* has never ever been run. +- `RUNNING` -- *process* is currently running, that is, it did the + application of `STATE`. +- `RESUMABLE` -- *process* has been run, is not currently running, + and can run again. +- `DEAD` -- *process* has been run, but it can **not** run again; it + has "terminated". + +In addition, an interrupt (chapter 21) can be enabled to detect the +time at which a `PROCESS` becomes "blocked" (waiting for terminal +input) or "unblocked" (terminal input arrived). (The `STATE` `BLOCKED` +has not been implemented.) + +20.3. PROCESS (the SUBR) +------------------------ + + + +creates and returns a new `PROCESS` but does **not** run it; the +`STATE` of the returned `PROCESS` is `RUNABLE` (sic). + +*starter* is something applicable to **one** argument, which must be +evaluated. *starter* is used both in starting and "terminating" a +`PROCESS`. In particular, if the *starter* of a `PROCESS` **ever** +returns a value, that `PROCESS` becomes `DEAD`. + +20.4. RESUME +------------ + +The `SUBR` `RESUME` is used to cause a computation to start or to +continue running in another `PROCESS`. An application of `RESUME` +looks like this: + + + +where *retval* is the "returned value" (see below) of the `PROCESS` +that does the `RESUME`, and *process* is the `PROCESS` to be started +or continued. + +The *process* argument to `RESUME` is optional, by default the last +`PROCESS`, if any, to `RESUME` the `PROCESS` in which this `RESUME` is +applied. If and when the current `PROCESS` is later `RESUME`d by +another `PROCESS`, that `RESUME`'s *retval* is returned as the value +of this `RESUME`. + +20.5. Switching PROCESSes +------------------------- + +### 20.5.1. Starting Up a New PROCESS + +Let us say that we are running in some `PROCESS`, and that this +original `PROCESS` is the `GVAL` of `P0`. Somewhere, we have evaluated + + > + +where `,STARTER` is some appropriate function. Now, **in `,P0`** we +evaluate + + + +and the following happens: + +1. **In `,P0`** the arguments of the `RESUME` are evaluated: that is, + we get that `LVAL` of `A` which is current in `,P0` and the `GVAL` + of `P1`. +2. The `STATE` of `,P0` is changed to `RESUMABLE` and `,P0` is + "frozen" right where it is, in the middle of the `RESUME`. +3. The `STATE` of `,P1` is changed to `RUNNING`, and `,STARTER` is + applied to `,P0`'s `LVAL` of `A` **in `,P1`**. `,P1` now continues + on its way, evaluating the body of `,STARTER.` + +The `.A` in the `RESUME` could have been anything, of course. The +important point is that, whatever it is, it is evaluated in `,P0`. + +What happens next depends, of course, on what `,STARTER` does. + +### 20.5.2. Top-level Return + +Let us initially assume that `,STARTER` does nothing relating to +`PROCESS`es, but instead simply returns a value -- say *starval*. What +happens when `,STARTER` returns is this: + +1. The `STATE` of `,P1` is changed to `DEAD`. `,P1` can never again + be `RESUME`d. +2. The last `PROCESS` to `RESUME` `,P1` is found, namely `,P0`, and + its `STATE` is changed to `RUNNING`. +3. *starval* is returned in `,P0` as the value of the original + `RESUME`, and `,P0` continues where it left off. + +All in all, this simple case looks just like an elaborate version of +applying `,STARTER` to `.A` in `,P0`. + +### 20.5.3. Symmetric RESUMEing + +Now suppose that while still in `,P1`, the following is evaluated, +either in `,STARTER` or in something called by `,STARTER`: + + + +This is what happens: + +1. The arguments of the `RESUME` are evaluated **in `,P1`**. +2. The `STATE` of `,P1` is changed to `RESUMABLE`, and `,P1` is + "frozen" right in the middle of the `RESUME`. +3. The `STATE` of `,P0` is changed to `RUNNING`, and `,P1`'s `LVAL` + of `BAR` is returned as the value of **`,P0'`s** original `RESUME` + `,P0` then continues right where it left off. + +This is **the** interesting case, because `,P0` can now do **another** +`RESUME` of `,P1`; this will "turn off" `,P0`, pass a value to `,P1` +and "turn on" `,P1`. `,P1` can now again `RESUME` `,P0`. which can +`RESUME` `,P1` back again, etc. **ad nauseam**, with everything done +in a perfectly symmetric manner. This can obviously also be done with +three or more `PROCESS`es in the same manner. + +Note how this differs from normal functional application: you cannot +"return" from a function without destroying the state that function is +in. The whole point of `PROCESS`es is that you can "return" +(`RESUME`), remembering your state, and later continue where you left +off. + +20.6. Example +------------- + + ;"Initially, we are in LISTEN in some PROCESS. + ) + ) + >> + >> + >>>$ + SUM3 + ;"SUM3, used as the startup function of another PROCESS, + gets RESUMEd with numbers. It returns the sum of the last + three numbers it was given every third RESUME." + >$ + ;"Now we start SUMUP and give SUM3 its three numbers." + $ + "GOT 1" + $ + "GOT 2" + $ + 8 + +Just as a note, by taking advantage of MDL's order of evaluation, SUM3 +could be have been written as: + + ) + >>>>> + +20.7. Other Coroutining Features +-------------------------------- + +### 20.7.1. BREAK-SEQ + + + +("break evaluation sequence") returns *process*, which must be +`RESUMABLE`, after having modified it so that when it is next +`RESUME`d, it will **first** evaluate *any* and **then** do an +absolutely normal `RESUME`; the value returned by any is thrown away, +and the value given by the `RESUME` is used normally. + +If a `PROCESS` is `BREAK-SEQ`ed more than once between `RESUME`s, +**all** of the *any*s `BREAK-SEQ`ed onto it will be remembered and +evaluated when the `RESUME` is finally done. The *any*s will be +evaluated in "last-in first-out" order. The `FRAME` generated by +`EVAL`ing more than one *any* will have as its `FUNCT` the dummy +`ATOM` `BREAKER`. + +### 20.7.2. MAIN + +When you initially start up MDL, the `PROCESS` in which you are +running is slightly "special" in these two ways: + +1. Any attempt to cause it become `DEAD` will be met with an error. +2. `
` always returns that `PROCESS`. + +The `PROCESS` number of `
` is always `1`. The initial `GVAL` of +`THIS-PROCESS` is what `MAIN` always returns, `#PROCESS 1`. + +### 20.7.3. ME + + + +returns the `PROCESS` in which it is evaluated. The `LVAL` of +`THIS-PROCESS` in a `RUNABLE` (new) `PROCESS` is what `ME` always +returns. + +### 20.7.4. RESUMER + + + +returns the `PROCESS` which last `RESUME`d *process*. If no `PROCESS` +has ever `RESUME`d process, it returns `#FALSE ()`. *process* is +optional, `` by default. Note that `
` does not ever have any +resumer. Example: + + )) ;"not effective in
" + #DECL ((R) ) + RESUMABLE> + >> + +### 20.7.5. SUICIDE + + + +acts just like `RESUME`, but clobbers the `PROCESS` (which cannot be +`
`) in which it is evaluated to the `STATE` `DEAD`. + +### 20.7.6. 1STEP + + <1STEP process> + +returns *process*, after putting it into "single-step mode". + +A `PROCESS` in single-step mode, whenever `RESUME`d, runs only until +an application of `EVAL` in it begins or finishes. At that point in +time, the `PROCESS` that did the `1STEP` is `RESUME`d, with a *retval* +which is a `TUPLE`. If an application of `EVAL` just began, the +`TUPLE` contains the `ATOM` `EVLIN` and the arguments to `EVAL`. If an +application of `EVAL` just finished, the `TUPLE` contains the `ATOM` +`EVLOUT` and the result of the evaluation. + +*process* will remain in single-step mode until `FREE-RUN` (below) is +applied to it. Until then, it will stop before and after each `EVAL` +in it. Exception: if it is `RESUME`d from an `EVLIN` break with a +*retval* of `TYPE` `DISMISS` (`PRIMTYPE` `ATOM`), it will leave +single-step mode only until the current call to EVAL is about to +return. Thus lower-level `EVAL`s are skipped over without leaving the +mode. The usefulness of this mode in debugging is obvious. + +### 20.7.7. FREE-RUN + + + +takes its argument out of single-step mode. Only the `PROCESS` that +put *process* into single-step mode can take it out of the mode; if +another `PROCESS` tries, `FREE-RUN` returns a `FALSE`. + +20.8. Sneakiness with PROCESSes +------------------------------- + +`FRAME`s, `ENVIRONMENT`s, `TAG`s, and `ACTIVATION`s are specific to +the `PROCESS` which created them, and each "knows its own father". +**Any** `SUBR` which takes these objects as arguments can take one +which was generated by **any** `PROCESS`, no matter where the `SUBR` +is really applied. This provides a rather sneaky means of crossing +between `PROCESS`es. The various cases are as follows: + +`GO`, `RETURN`, `AGAIN`, and `ERRET`, given arguments which lie in +another `PROCESS`, each effectively "restarts" the `PROCESS` of its +argument and acts as if it were evaluated over there. If the `PROCESS` +in which it was executed is later `RESUME`d, it **returns** a value +just like `RESUME`! + +`SET`, `UNASSIGN`, `BOUND?`, `ASSIGNED?`, `LVAL`, `VALUE`, and `LLOC`, +given optional `ENVIRONMENT` arguments which lie in another `PROCESS`, +will gleefully change, or return, the local values of `ATOM`s in the +other `PROCESS`. The optional argument can equally well be a +`PROCESS`, `FRAME`, or `ACTIVATION` in another `PROCESS`; in those +cases, each uses the `ENVIRONMENT` which is current in the place +specified. + +`FRAME`, `ARGS`, and `FUNCT` will be glad to return the `FRAME`s, +argument `TUPLE`s, and applied Subroutine names of another `PROCESS`. +If one is given a `PROCESS` (including ``) as an argument instead +of a `FRAME`, it returns all or the appropriate part of the topmost +`FRAME` on that `PROCESS`'s control stack. + +If `EVAL` is applied in `PROCESS` `P1` with an `ENVIRONMENT` argument +from a `PROCESS` `P2`, it will do the evaluation **in `P1`** but with +`P2`'s `ENVIRONMENT` (!). That is, the other `PROCESS`'s `LVAL`s, etc. +will be used, but (1) any **new** `FRAME`s needed in the course of the +evaluation will be created in `P1`; and (2) **`P1`** will be `RUNNING` +-- not `P2`. Note the following: if the `EVAL` in `P1` eventually +causes a `RESUME` of `P2`, `P2` could functionally return to below the +point where the `ENVIRONMENT` used in `P1` is defined; a `RESUME` of +`P1` at this point would cause an `ERROR` due to an invalid +`ENVIRONMENT`. (Once again, `LEGAL?` can be used to forestall this.) + +20.9. Final Notes +----------------- + +1. A `RESUMABLE` `PROCESS` can be used in place of an `ENVIRONMENT` + in any application. The "current" `ENVIRONMENT` of the `PROCESS` + is effectively used. +2. `FRAME`s and `ENVIRONMENT`s can be `CHTYPE`d arbitrarily to one + another, or an `ACTIVATION` can be `CHTYPE`d to either of them, + and the result "works". Historically, these different `TYPE`s were + first used with different `SUBR`s -- `FRAME` with `ERRET`, + `ENVIRONMENT` with `LVAL`, `ACTIVATION` with `RETURN` -- hence the + invention of different `TYPE`s with similar properties. +3. Bugs in multi-`PROCESS` programs usually exhibit a degree of + subtlety and nastiness otherwise unknown to the human mind. If + when attempting to work with multiple processes you begin to feel + that you are rapidly going insane, you are in good company. + +Chapter 21. Interrupts +====================== + +The MDL interrupt handling facilities provide the ability to say the +following: whenever "this event" occurs, stop whatever is being done +at the time and perform "this action"; when "this action" is finished, +continue with whatever was originally being done. "This event" can be +things like the typing of a character at a terminal, a time interval +ending, a `PROCESS` becoming blocked, or a program-defined and +-generated "event". "This action" is the application of a specified +`APPLICABLE` object to arguments provided by the MDL interrupt system. +The sets of events and actions can be changed in extremely flexible +ways, which accounts for both the variety of `SUBR`s and arguments, +and the rich interweaving of the topics in this chapter. Interrupt +handling is a kind of parallel processing: a program can be divided +into a "main-level" part and one or more interrupt handlers that +execute only when conditions are ripe. + +21.1. Definitions of Terms +-------------------------- + +An **interrupt** is not an object in MDL, but rather a class of +events, for example, "ticks" of a clock, garbage collections, the +typing of a character at a terminal, etc. + +An interrupt is said to **occur** when one of the events in its class +takes place. + +An **external** interrupt is one whose occurrences are signaled to MDL +by the operating system, for example, "ticks" of a clock. An +**internal** interrupt is one whose occurrences are detected by MDL +itself, for example, garbage collections. MDL can arrange for the +operating system to not signal occurrences of an external interrupt to +it; then, as far as MDL is concerned, that interrupt does not occur. + +Each interrupt has a **name** which is either a `STRING` (for example, +`"GC"`, `"CHAR"`, `"WRITE"`) or an `ATOM` with that `PNAME` in a +special `OBLIST`, named `INTERRUPTS!-`. (This `OBLIST` is returned by +``.) Certain names must always be further specified by a +`CHANNEL` or a `LOCATIVE` to tell **which** interrupt by that name is +meant. + +When an interrupt occurs, the interpreter looks for an association on +the interrupt's name. If there is an association, its `AVALUE` should +be an `IHEADER`, which heads a list of actions to be performed. In +each `IHEADER` is the name of the interrupt with which the `IHEADER` +is or was associated. + +In each `IHEADER` is an element telling whether it is disabled. If an +`IHEADER` is **disabled**, then none of its actions is performed. The +opposite of disabled is **enabled**. It is sometimes useful to disable +an `IHEADER` temporarily, but removing its association with the +interrupt's name is better than long-term disabling. There are `SUBR`s +for creating an `IHEADER`, associating it with an interrupt, and later +removing the association. + +In each `IHEADER` is a **priority**, a `FIX` greater than `0` which +specifies the interrupt's "importance". The processing of a +higher-priority (larger-numbered) interrupt will supersede the +processing of a lower-priority (smaller-numbered) interrupt until the +high-priority interrupt has been handled. + +In each `IHEADER` is a (possibly empty) list of `HANDLER`s. (This list +is not a MDL `LIST`.) Each `HANDLER` corresponds to an action to +perform. There are `SUBR`s for creating a `HANDLER`, adding it to an +`IHEADER`'s list, and later removing it. + +In each `HANDLER` is a function that we will call a **handler** (in +lower case), despite possible confusion, because that is really the +best name for it. An **action** consists of applying a handler to +arguments supplied by the interrupt system. The number and meaning of +the arguments depend on the name of the interrupt. In each `HANDLER` +is an element telling in which `PROCESS` the action should be +performed. + +21.2. EVENT +----------- + + + +creates and returns an enabled `IHEADER` with no `HANDLER`s. The +*name* may be an `ATOM` in the `INTERRUPTS` `OBLIST` or a `STRING`; if +it is a `STRING`, `EVENT` does a `LOOKUP` or `INSERT` in +``. If there already is an `IHEADER` associated with +*name*, `EVENT` just returns it, ignoring the given *priority*. + +*which* must be given only for certain *name*s: + +- It must be a `CHANNEL` if and only if *name* is `"CHAR"` (or + `CHAR!-INTERRUPTS`). In this case it is the input `CHANNEL` from + the (pseudo-)terminal or Network socket whose received characters + will cause the interrupt to occur, or the output `CHANNEL` to the + pseudo-terminal or Network socket whose desired characters will + cause the interrupt to occur. (See below. Pseudo-terminals are not + available in the Tenex and Tops-20 versions.) +- The argument must be a `LOCATIVE` if and only if *name* is + `"READ"` (or `READ!-INTERRUPTS`) or `"WRITE"` (or + `WRITE!-INTERRUPTS`). In this case it specifies an object to be + "monitored" for usage by (interpreted) MDL programs (section + 21.8.9). + +If the interrupt is external, MDL arranges for the operating system to +signal its occurrences. + +21.3. HANDLER (the SUBR) +------------------------ + + + +creates a `HANDLER`, adds it to the front of *iheader*'s `HANDLER` +list (first action to be performed), and returns it as a value. +*applicable* may be any `APPLICABLE` object that takes the proper +number of arguments. (None of the arguments can be `QUOTE`d; they must +all be evaluated at call time.) *process* is the `PROCESS` in which +the handler will be applied, by default whatever `PROCESS` was running +when the interrupt occurred. + +The value returned by the handler is ignored, unless it is of `TYPE` +`DISMISS` (`PRIMTYPE` `ATOM`), in which case none of the remaining +actions in the list will be performed. + +The processing of an interrupt's actions can terminate prematurely if +a handler calls the `SUBR` `DISMISS` (see below.) + +21.4. OFF +--------- + + + +removes the association between *iheader* and the name of its +interrupt, and then disables *iheader* and returns it. (An error +occurs if there is no association.) If the interrupt is external, MDL +arranges for the operating system not to signal its occurrences. + + + +finds the `IHEADER` associated with *name* and proceeds as above, +returning the `IHEADER`. *which* must be given only for certain +*names*, as for `EVENT`. Caution: if you ``, MDL +will become deaf. + + + +returns *handler* after removing it from its list of actions. There is +no effect on any other `HANDLER`s in the list. + +Now that you know how to remove `IHEADER`s and `HANDLER`s from their +normal places, you need to know how to put them back: + + + +If *iheader* was previously disabled or disassociated from its name, +`EVENT` will associate and enable it. + + + +If *handler* was previously removed from its list, `HANDLER` will add +it to the front of *iheader*'s list of actions. Note that *process* +cannot be specified. + +21.5. IHEADER and HANDLER (the TYPEs) +------------------------------------- + +Both these `TYPE`s are of `PRIMTYPE` `VECTOR`, but they do not `PRINT` +that way, since they are self-referencing. Instead they `PRINT` as + + #type most-interesting-component + +The contents of `IHEADER`s and `HANDLER`s can be changed by `PUT`, and +the new values will then determine the behavior of MDL. + +Before describing the elements of these `TYPE`s in detail, here are a +picture and a Pattern, both purporting to show how they look: + + #IHEADER [name:atom or which + disabled? + *-----------> #HANDLER [*-----------> #HANDLER [#HANDLER [] + priority] <-------------* +------* + applicable | applicable + process] <-------+ process] + + + + APPLICABLE PROCESS> + FIX> + +### 21.5.1. IHEADER + +The elements of an `IHEADER` are as follows: + +1. name of interrupt (`ATOM`, or `CHANNEL` if the name is `"CHAR"`, + or `LOCATIVE` if the name is `"READ"` or `"WRITE"`) +2. non-zero if and only if disabled +3. first `HANDLER`, if any, else a zero-length `HANDLER` +4. priority + +If you lose track of an `IHEADER`, you can get it via the association: + +- For `"CHAR"` interrupts, `` returns the + `IHEADER` or `#FALSE ()` if there is no association; + `` returns the `IHEADER`, creating it if + there is no association. +- For `"READ"` interrupts, `` returns + the `IHEADER` or `#FALSE ()` if there is no association; + `` returns the `IHEADER`, creating it if + there is no association. +- For `"WRITE"` interrupts, `` + returns the `IHEADER` or `#FALSE ()` if there is no association: + `` returns the `IHEADER`, creating it if + there is no association. +- Otherwise, the `IHEADER` is `PUT` on the name `ATOM` with the + indicator `INTERRUPT`. Thus, for example, + `` returns the `IHEADER` for the + clock interrupt or `#FALSE ()` if there is no association; + `` returns the `IHEADER`, creating it if there is + no association. + +### 21.5.2. HANDLER + +A `HANDLER` specifies a **particular** action for a **particular** +interrupt. The elements of a `HANDLER` are as follows: + +1. next `HANDLER` if any, else a zero-length `HANDLER` +2. previous `HANDLER` or the `IHEADER` (Thus the `HANDLER`s of a + given interrupt form a "doubly-linked list" chaining between each + other and back to the `IHEADER`.) +3. handler to be applied (anything but `APPLICABLE` that evaluates + its arguments -- the application is done not by `APPLY` but by + `RUNINT`, which can take a `PROCESS` argument: see next line) +4. `PROCESS` in which the handler will be applied, or `#PROCESS 0`, + meaning whatever `PROCESS` was running when the interrupt occurred + (In the former case, `RUNINT` is applied to the handler and its + arguments in the currently running `PROCESS`, which causes an + `APPLY` in the `PROCESS` stored in the `HANDLER`, which `PROCESS` + must be `RESUMABLE`. The running `PROCESS` becomes `RESUMABLE`, + and the stored `PROCESS` becomes `RUNNING`, but no other `PROCESS` + variables (for example `RESUMER`) are changed.) + +21.6. Other SUBRs +----------------- + + + +is equivalent to + + + applicable process> + +`ON` is a combination of `EVENT` and `HANDLER`: it creates (or finds) +the `IHEADER`, associates and enables it, adds a `HANDLER` to the +front the list (first to be performed), and returns the `HANDLER`. + + + +is effectively ``. Actually the `TYPE` `LOSE` +is unimportant, but the `-1` signifies that *iheader* is disabled. + + + +is effectively ``. Actually the `TYPE` `LOSE` +is unimportant, but the `0` signfies that *iheader* is enabled. + +21.7. Priorities and Interrupt Levels +------------------------------------- + +At any given time there is a defined **interrupt level**. This is a +`FIX` which determines which interrupts can really "interrupt" -- that +is, cause the current processing to be suspended while their wants are +satisfied. Normal, non-interrupt programs operate at an interrupt +level of 0 (zero.) An interrupt is processed at an interrupt level +equal to the interrupt's priority. + +### 21.7.1. Interrupt Processing + +Interrupts "actually" only occur at well-defined points in time: +during a call to a Subroutine, or at critical places within +Subroutines (for example, during each iteration of `MAPF` on a `LIST`, +which may be circular), or while a `PROCESS` is `"BLOCKED"` (see +below). No interrupts can occur during garbage collection. + +What actually happens when an enabled interrupt occurs is that the +priority of the interrupt is compared with the current interrupt +level, and the following is done: + +If the priority is **greater than** the current interrupt level, the +current processing is "frozen in its tracks" and processing of the +action(s) specified for that interrupt begins. + +If the priority is less than or equal to the current interrupt level, +the interrupt occurrence is **queued** -- that is, the fact that it +occurred is saved away for processing when the interrupt level becomes +low enough. + +When the processing of an interrupt's actions is completed, MDL +usually (1) "acts as if" the previously-existing interrupt level is +restored, and processing continues on what was left off (perhaps for +no time duration); and (2) "acts as if" any queued interrupt +occurrences actually occurred right then, in their original order of +occurrence. + +### 21.7.2. INT-LEVEL + +The `SUBR` `INT-LEVEL` is used to examine and change the current +interrupt level directly. + + + +simply returns the current interrupt level. + + + +changes the interrupt level to its argument and returns the +**previously**-existing interrupt level. + +If `INT-LEVEL` lowers the priority of the interrupt level, it does not +"really" return until all queued occurrences of interrupts of higher +priority than the target priority have been processed. + +Setting the `INT-LEVEL` extremely high (for example, +` FIX>>`) effectively disables all interrupts +(but occurrences of enabled interrupts will still be queued). + +If `LISTEN` or `ERROR` is called when the `INT-LEVEL` is not zero, +then the typeout will be + + LISTENING-AT-LEVEL I PROCESS p INT-LEVEL i + +### 21.7.3. DISMISS + +`DISMISS` permits a handler to return an arbitrary value for an +arbitrary `ACTIVATION` at an arbitrary interrupt level. The call is as +follows: + + + +where only the *value* is required. If *activation* is omitted, return +is to the place interrupted from, and *value* is ignored. If +*int-level* is omitted, the `INT-LEVEL` prior to the current interrupt +is restored. + +21.8. Specific Interrupts +------------------------- + +Descriptions of the characteristics of particular "built-in" MDL +interrupts follow. Each is named by its `STRING` name. Expect this +list to be incomplete yesterday. + +`"CHAR"` is currently the most complex built-in interrupt, because it +serves duty in several ways. These different ways will be described in +several different sections. All ways are concerned with characters or +machine words that arrive or depart at unpredictable times, because +MDL is communicating with a person or another processor. Each `"CHAR"` +`IHEADER` has a `CHANNEL` for the element that names the interrupt, +and the mode of the `CHANNEL` tells what kinds of `"CHAR"` interrupts +occur to be handled through that `IHEADER`. + +1. If the `CHANNEL` is for `INPUT`, "CHAR" occurs every time an + "interesting" character (see below) is received from the + `CHANNEL`'s real terminal, or any character is received from the + `CHANNEL`'s pseudo-terminal, or a character or word is received + from the `CHANNEL`'s Network socket, or indeed (in the ITS + version) the operating system generates an interrupt for any + reason. +2. If the `CHANNEL` is for output to a pseudo-terminal or Network + socket, `"CHAR"` occurs every time a character or word is wanted. +3. If the `CHANNEL` is for output to a terminal, `"CHAR"` occurs + every time a line-feed character is output or (in the ITS version) + the operating system generates a screen-full interrupt for the + terminal. + +### 21.8.1. "CHAR" received + +A handler for an input `"CHAR"` interrupt on a real terminal must take +two arguments: the `CHARACTER` which was typed, and the `CHANNEL` on +which it was typed. + +In the ITS version, the "interesting" characters are those "enabled +for interrupts" on a real terminal, namely \^@ through +\^G, \^K through \^\_, and +DEL (that is, ASCII codes 0-7, 13-37, and 177 octal.) + +In the Tenex and Tops-20 versions, the operating system can be told +which characters typed on a terminal should cause this interrupt to +occur, by calling the `SUBR` `ACTIVATE-CHARS` with a `STRING` argument +containing those characters (no more than six, all with ASCII codes +less than 33 octal). If called with no argument, `ACTIVATE-CHARS` +returns a `STRING` containing the characters that currently interrupt. +Initially, only \^G, \^S, and \^O +interrupt. + +An initial MDL already has `"CHAR"` enabled on `,INCHAN` with a +priority 8 (eight), the `SUBR` `QUITTER` for a handler to run in +`#PROCESS 0` (the running `PROCESS`); this is how `^G` and +`^S` are processed. In addition, every time a new `CHANNEL` +is `OPEN`ed in `"READ"` mode to a terminal, a similar `IHEADER` and +`HANDLER` are associated with that new `CHANNEL` automatically. These +automatically-generated `IHEADER`s and `HANDLER`s use the standard +machinery, and they can be `DISABLE`d or `OFF`ed at will. **However**, +the `IHEADER` for `,INCHAN` should not be `OFF`ed: MDL knows that `$` +is typed only by an interrupt! + +Example: the following causes the given message to be printed out +whenever a `^Y` is typed on `.INCHAN`: + + + #FUNCTION ((CHAR CHAN) + #DECL ((VALUE) ANY (CHAR) CHARACTER (CHAN) CHANNEL) + + >)>>$ + #HANDLER #FUNCTION **CHAR CHAN) ...) + <+ 2 ^Y [Some of my best friends are ^Ys.] 2>$ + 4 + $ + #HANDLER #FUNCTION (...) + +Note that occurrences of `"CHAR"` do **not** wait for the `$` to be +typed, and the interrupting character is omitted from the input +stream. + +A `"CHAR"` interrupt can also be associated with an input `CHANNEL` +open to a Network socket (`"NET"` device). A handler gets applied to a +`NETSTATE` array (which see) and the `CHANNEL`. + +In the ITS version, a `"CHAR"` interrupt can also be associated with +an input `CHANNEL` open to a pseudo-terminal ("STY" device and +friends). An interrupt occurs when a character is available for input. +These interrupts are set up in exactly the same way as real-terminal +interrupts, except that a handler gets applied to only **one** +argument, the `CHANNEL`. Pseudo-terminal are not available in the +Tenex and Tops-20 versions. + +For any other flavor of ITS channel interrupt, a handler gets applied +to only **one** argument, the `CHANNEL`. + +### 21.8.2. "CHAR" wanted + +A `"CHAR"` interrupt can be associated with an output `CHANNEL` open +to a Network socket (`"NET"` device). A handlers gets applied to a +`NETSTATE` array (which see) and the `CHANNEL`. + +In the ITS version, a `"CHAR"` interrupt can also be associated with +an output `CHANNEL` open to a pseudo-terminal (`"STY"` device and +friends). An interrupt occurs when the program at the other end needs +a character (and the operating-system buffer is empty). A handler gets +applied to one argument, the `CHANNEL`. Pseudo-terminals are not +available in the Tenex and Tops-20 versions. + +### 21.8.3. "CHAR" for new line + +A handler for an output `"CHAR"` interrupt on a real terminal must +take **one or two** arguments (using `"OPTIONAL"` or `"TUPLE"`): if +two arguments are supplied by the interrupt system, they are the line +number (`FIX`) and the `CHANNEL`, respectively, and the interrupt is +for a line-feed; if only one argument is supplied (only in the ITS +version), it is the `CHANNEL`, and the interrupt is for a full +terminal screen. Note: the supplied line number comes from the +`CHANNEL`, and it may not be accurate if the program alters it in +subtle ways, for example, via `IMAGE` calls or special control +characters. (The program can compensate by putting the proper line +number into the `CHANNEL`.) + +### 21.8.4. "GC" + +`"GC"` occurs just **after** every garbage collection. Enabling this +interrupt is the only way a program can know that a garbage collection +has occurred. A handler for `"GC"` takes three arguments. The first is +a FLOAT indicating the number of seconds the garbage collection took. +The second argument is a FIX indicating the cause of the garbage +collection, as follows (chapter 22): + +0. Program called GC. +1. Movable storage was exhausted. +2. Control stack overflowed. +3. Top-level LVALs overflowed. +4. GVAL vector overflowed. +5. TYPE vector overflowed. +6. Immovable garbage-collected storage was exhausted. +7. Internal stack overflowed. +8. Both control and internal stacks overflowed (rare). +9. Pure storage was exhausted. +10. Second, exhaustive garbage collection occurred. + +The third argument is an ATOM indicating what initiated the garbage +collection: `GC-READ`, `BLOAT`, `GROW`, `LIST`, `VECTOR`, `SET`, +`SETG`, `FREEZE`, `GC`, `NEWTYPE`, `PURIFY`, `PURE-PAGE-LOADER` (pure +storage was exhausted), or `INTERRUPT-HANDLER` (stack overflow, +unfortunately). + +### 21.8.5. "DIVERT-AGC" + +`"DIVERT-AGC"` ("Automatic Garbage Collection") occurs just **before** +a deferrable garbage collection that is needed because of exhausted +movable garbage-collected storage. Enabling this interrupt is the only +way a program can know that a garbage collection is about to occur. A +handler takes two arguments: A `FIX` telling the number of machine +words needed and an `ATOM` telling what initiated the garbage +collection (see above). If it wishes, a handler can try to prevent a +garbage collection by calling `BLOAT` with the `FIX` argument. If the +pending request for garbage-collected storage cannot then be +satisfied, a garbage collection occurs anyway. `AGC-FLAG` is `SET` to +`T` while the handler is running, so that new storage requests do not +try to cause a garbage collection. + +### 21.8.6. "CLOCK" + +`"CLOCK"`, when enabled, occurs every half second (the ITS +"slow-clock" tick.) It is not available in the Tenex or Tops-20 +versions. It wants handlers which take no arguments. Example: + + > 1> + +### 21.8.7. "BLOCKED" + +`"BLOCKED"` occurs whenever **any** `PROCESS` (not only the `PROCESS` +which may be in a `HANDLER`) starts waiting or terminal input: that +is, an occurrence indicates that somewhere, somebody did a `READ`, +`READCHR`, `NEXTCHR`, `TYI`, etc. to a console. The handler for a +`"BLOCKED"` interrupt should take one argument, namely the `PROCESS` +which started waiting (which will also be the `PROCESS` in which the +handler runs, if no specific one is in the `HANDLER`). + +Example: the following will cause MDL to acquire a `*` prompting +character. + + ) 5> + +### 21.8.8. "UNBLOCKED" + +`"UNBLOCKED"` occurs whenever a `$` (`ESC`) is typed on a +terminal if a program was hanging and waiting for input, or when a TYI +call (which see) is satisfied. A handler takes one argument: the +`CHANNEL` via which the `$` or character is input. + +### 21.8.9. "READ" and "WRITE" + +`"READ"` and `"WRITE"` are associated with read or write references to +MDL objects. These interrupts are often called "monitors", and +enabling the interrupt is often called "monitoring" the associated +object. A "read reference" to an `ATOM`'s local value includes +applying `BOUND?` or `ASSIGNED?` to the `ATOM`; similarly for a global +value and `GASSIGNED?`. If the `INT-LEVEL` is too high when `"READ"` +or `"WRITE"` occurs, an error occurs, because occurrences of these +interrupts cannot be queued. + +Monitors are set up with `EVENT` or `ON`, using a locative to the +object being monitored as the extra *which* argument, just as a +`CHANNEL` is given for `"CHAR"`. A handler for `"READ"` takes two +arguments: the locative and the `FRAME` of the function application +that make the reference. A handler for `"WRITE"` takes three +arguments: the locative, the new value, and the `FRAME`. For example: + + $ + (1 2 3) + >$ + #LOCL 2 + + + + + + + + > + 4 0 .B>$ + #HANDLER FUNCTION (...) + <1 .A 10>$ + (10 2 3) + <2 .A 20>$ + Program changed #LOCL 2 to 20 via #FRAME PUT + (10 20 3) + $ + #IHEADER #LOCL 20 + +### 21.8.10. "SYSDOWN" + +`"SYSDOWN"` occurs when a system-going-down or system-revived signal +is received from ITS. It is not available in the Tenex or Tops-20 +versions. If no `IHEADER` is associated and enabled, a warning message +is printed on the terminal. A handler takes one argument: a `FIX` +giving the number of thirtieths of a second until the shutdown (-1 for +a reprieve). + +### 21.8.11. "ERROR" + +In an effort to simplify error handling by programs, MDL has a +facility allowing errors to be handled like interrupts. `SETG`ing +`ERROR` to a user function is a distasteful method, not safe if any +bugs are around. An `"ERROR"` interrupt wants a handler that takes any +number of arguments, via `"TUPLE"`. When an error occurs, handlers are +applied to the `FRAME` of the `ERROR` call and the `TUPLE` of `ERROR` +arguments. If a given handler "takes care of the error", it can +`ERRET` with a value from the `ERROR` `FRAME`, after having done +``. If no handler takes care of the error, it falls into +the normal `ERROR`. + +If an error occurs at an `INT-LEVEL` greater than or equal to that of +the `"ERROR"` interrupt, real `ERROR` will be called, because +`"ERROR"`interrupts cannot be queued. + +### 21.8.12. "IPC" + +`"IPC"` occurs when a message is received on the ITS IPC device +(chapter 23). It is not available in the Tenex and Tops-20 versions. + +### 21.8.13. "INFERIOR" + +`"INFERIOR"` occurs when an inferior ITS process interrupts the MDL +process. It is not available in the Tenex and Tops-20 versions. A +handler takes one argument: A `FIX` between `0` and `7` inclusive, +telling which inferior process is interrupting. + +### 21.8.14. "RUNT and "REALT" + +These are not available in the Tenex and Tops-20 versions. + +`"RUNT"`, if enabled, occurs **once**, *N* seconds of MDL running time +(CPU time) after calling ``, which returns +its argument. A handler takes no arguments. If `RUNTIMER` is called +with no argument, it returns a `FIX`, the number of run-time seconds +left until the interrupt occurs, or `#FALSE ()` if the interrupt is +not going to occur. + +`"REALT"`, if enabled, occurs **every** *N* seconds of real-world time +after calling ``, which returns its +argument. A handler takes no arguments. `` tells the +operating system not to generate real-time interrupts. If `REALTIMER` +is called with no argument, it returns a `FIX`, the number of +real-time seconds given in the most recent call to `REALTIMER` with an +argument, or `#FALSE ()` if `REALTIMER` has not been called. + +### 21.8.15. "Dangerous" Interrupts + +`"MPV"` ("memory protection violation") occurs if MDL tries to refer +to a storage address not in its address space. `"PURE"` occurs if MDL +tries to alter read-only storage. `"ILOPR"` occurs if MDL executes and +illegal instruction ("operator"). `"PARITY"` occurs if the CPU detects +a parity error in MDL's address space. All of these require a handler +that takes one argument: the address (`TYPE` `WORD`) following the +instruction that was being executed at the time. + +`"IOC"` occurs if MDL tries to deal illegally with an I/O channel. A +handler must take two arguments: a three-element `FALSE` like one that +`OPEN` might return, and the `CHANNEL` that got the error. + +Ideally these interrupts should never occur. In fact, in the Tenex and +Tops-20 versions, these interrupts always go to the superior operating +system process instead of to MDL. In the ITS version, if and when a +"dangerous" interrupt does occur: + +- If no `IHEADER` is associated with the interrupt, then the + interrupt goes to the superior operating system process. +- If an `IHEADER` is associated but disabled, the error + `DANGEROUS-INTERRUPT-NOT-HANDLED` occurs (`FILE-SYSTEM-ERROR` for + \`"IOC"). +- If an `IHEADER` is associated and enabled, but the `INT-LEVEL` is + too high, the error `ATTEMPT-TO-DEFER-UNDEFERABLE-INTERRUPT` + occurs. + +21.9. User-Defined Interrupts +----------------------------- + +If the interrupt name given to `EVENT` or `ON` is **not** one of the +standard predefined interrupts of MDL, they will gleefully create an +`ATOM` in `` and an associated `IHEADER` anyway, making +the assumption that you are setting up a "program-defined" interrupt. + +Program-defined interrupts are made to occur by applying the `SUBR` +`INTERRUPT`, as in + + + +where *name* is a `STRING`, `ATOM` or `IHEADER`, and *arg1* through +*argN* are the arguments wanted by the handlers for the interrupt. + +If the interrupt specified by `INTERRUPT` is enabled, `INTERRUPT` +returns `T`; otherwise it returns `#FALSE ()`. All the usual priority +and queueing rules hold, so that even if `INTERRUPT` returns `T`, it +is possible that nothing "really happened" (yet). + +`INTERRUPT` can also be used to cause "artificial" occurrences of +standard predefined MDL interrupts. + +Making a program-defined interrupt occur is similar to calling a +handler directly, but there are differences. The value returned by a +handler is ignored, so side effects must be used in order to +communicate information back to the caller, other than whether any +handler ran or will run. One good use for a program-defined interrupt +is to use the priority and queueing machinery of `INT-LEVEL` to +control the execution of functions that must not run concurrently. For +example, if a `"CHAR"` handler just deposits characters in a buffer, +then a function to process the buffered characters should probably run +at a higher priority level -- to prevent unpredictable changes to the +buffer during the processing -- and it is natural to invoke the +processing with `INTERRUPT`. + +In more exotic applications, `INTERRUPT` can signal a condition to be +handled by an unknown number of independent and "nameless" functions. +The functions are "nameless" because the caller doesn't know their +name, only the name of the interrupt. This programming style is +modular and event-driven, and it is one way of implementing +"heuristic" algorithms. In addition, each `HANDLER` has a `PROCESS` in +which to run its handler, and so the different handlers for a given +condition can do their thing in different environments quite easily, +with less explicit control than when using `RESUME`. + +21.10. Waiting for Interrupts +----------------------------- + +### 21.10.1. HANG + + + +hangs interruptibly, without consuming any CPU time, potentially +forever. `HANG` is nice for a program that cannot do anything until an +interrupt occurs. If the optional *pred* is given, it is evaluated +every time an interrupt occurs and is dismissed back into the `HANG`; +if the result of evaluation is not `FALSE`, `HANG` unhangs and returns +it as a value. If *pred* is not given, there had better be a named +`ACTIVATION` somewhere to which a handler can return. + +### 21.10.2. SLEEP + + + +suspends execution, interruptibly, without consuming any CPU time, for +*time* seconds, where *time* is non-negative, and then returns `T`. +*pred* is the same as for `HANG`. + +Chapter 22. Storage Management +============================== + +The reason this chapter comes so late in this document is that, except +for special cases, MDL programs have their storage needs handled +automatically. There is usually no need even to consider storage +management, except as it affects efficiency (chapter 24). This chapter +gives some explanation of why this is so, and covers those special +means by which a program can assume control of storage management. + +The MDL address space is divided into five parts, which are usually +called + +1. movable garbage-collected space, +2. immovable space (both garbage-collected and not), +3. user pure/page space, +4. pure-`RSUBR` mapping space, and +5. internal storage. + +Internal storage occupies both the highest and lowest addresses in the +address space, and its size never changes as MDL executes. The other +spaces can vary in size according to the needs of the executing +program. Generally the interpreter allocates a contiguous set of +addresses for each space, and each space gradually fills up as new +objects are created and as disk files are mapped in. The action taken +when space becomes full varies, as discussed below. + +22.1. Movable Garbage-collected Storage +--------------------------------------- + +Most storage used explicitly by MDL programs is obtained from a pool +of free storage managed by a "garbage collector". Storage is obtained +from this pool by the `SUBR`s which construct objects. When a `SUBR` +finds that the pool of available storage is exhausted, it +automatically calls the garbage collector. + +The garbage collector has two algorithms available to it: the +"copying" algorithm, which is used by default, and the "mark-sweep" +algorithm. Actually, one often speaks of two separate garbage +collectors, the "copying" one and the "mark-sweep" one, because each +is an independent module that is mapped in to the interpreter's +internal storage from disk only during garbage collection. For +simplicity, this document speaks of "the" garbage collector, which has +two algorithms. + +The garbage collector examines the storage pool and **marks** all the +objects there, separating them into two classes: those which cannot +possibly be referenced by a program, and those which can. The +"copying" algorithm then copies the latter into one compact section of +the pool, and the remainder of the pool is made available for newly +constructed objects. The "mark-sweep" algorithm, instead, puts all +objects in the former class (garbage) into "free lists", where the +object-construction `SUBR`s can find them and re-use their storage. + +If the request for more storage still cannot be satisfied from +reclaimed storage, the garbage collector will attempt to obtain more +total storage from the operating system under which MDL runs. (Also, +if there is a gross superfluity of storage space, the garbage +collector will politely return some storage to the operating system.) +Only when the total system resources are exhausted will you finally +lose. + +Thus, if you just "forget about" an object, that is, lose all possible +means of referencing it, its storage is automatically reclaimed. +"Object" in this context includes that stack-structured storage space +used in `PROCESS`es for functional application. + +### 22.1.1. Stacks and Other Internal Vectors + +Control stacks are used in MDL to control the changes in environment +caused by calling and binding. Each active `PROCESS` has its own +control stack. On this stack are stored `LVAL`s for `ATOM`s; +`PRIMTYPE` `TUPLE`s, which are otherwise like `VECTOR`s; `PRIMTYPE` +`FRAME`s, which are generated by calling Subroutines; and +`ACTIVATION`s, which are generated by calling `FUNCTION`s with named +`ACTIVATION`s, `PROG`, and `REPEAT`. `TAG` and `LLOC` can make `TAG`s +and `LOCD`s (respectively) that refer to a specific place on a +specific control stack. (`LEGAL?` returns `T` if and only if the +portion of the control stack in which its argument is found or to +which its argument refers is still active, or if its argument doesn't +care about the control stack. The garbage collector may change a +non-`LEGAL?` object to `TYPE` `ILLEGAL` before reclaiming it.) As the +word "stack" implies, things can be put on it and removed from it at +only one end, called the top. It has a maximum size (or depth), and +attempting to put too many things on it will cause overflow. A stack +is stored like a `VECTOR`, and it must be `GROW`n if and when it +overflows. + +A control stack is actually two stacks in one. One section is used for +"top-level" `LVAL`s -- those `SET` while the `ATOM` is not bound by +any active Function's argument `LIST` or Subroutine's `SPECIAL` +binding -- and the other section is used for everything else. Either +section can overflow, of course. The top-level-`LVAL` section is below +the other one, so that a top-level `LVAL` will be found only if the +`ATOM` is not currently bound elsewhere, namely in the other section. + +MDL also has an internal stack, used for calling and temporary storage +within the interpreter and compiled programs. It too is stored like a +`VECTOR` and can overflow. There are other internal vectors that can +overflow: the "global vector" holds pairs ("slots") of `ATOM`s and +corresponding `GVAL`s ("globally bound" or `GBOUND?` means that the +`ATOM` in question is in this vector, whether or not it currently has +a global value), and the "`TYPE` vector" holds `TYPE` names +(predefined and `NEWTYPE`s) and how they are to be treated. + +22.2. Immovable Storage +----------------------- + +### 22.2.1. Garbage-collected: FREEZE + +In very special circumstances, such as debugging `RSUBR`s, you may +need to prevent an object from being moved by the garbage collector. +`FREEZE` takes one argument, of `PRIMTYPE` `VECTOR`, `UVECTOR`, +`STRING`, `BYTES` or `TUPLE`. It copies its argument into non-moving +garbage-collected space. `FREEZE` returns the copy `CHTYPE`d to its +`PRIMTYPE`, except in the case of a `TUPLE`, which is changed to a +`VECTOR`. + +### 22.2.2. Non-garbage-collected: STORAGE (the PRIMTYPE) + +An object of `PRIMTYPE` `STORAGE` is really a frozen `UVECTOR` whose +`UTYPE` is of `PRIMTYPE` `WORD`, but it is always pointed to by +something internal to MDL and thus is never garbage-collectible. The +use of `FREEZE` is always preferable, except when for historical +reasons a `STORAGE` is necessary. + +22.3. Other Storage +------------------- + +User pure/page space serves two purposes. First, when a user program +`PURIFY`s (see below) MDL objects, they are copied into this space. +Second, so-called hand-crafted `RSUBR`s (assembled but not compiled) +can call on the interpreter to map pages of disk files into this space +for arbitrary purposes. + +Pure-`RSUBR` mapping space is used by the interpreter to dynamically +map pages of pure compiled programs into and out of the MDL address +space. Pure code can refer to impure storage through the "transfer +vector", another internal vector. This space is the most vulnerable to +being compressed in size by the long-term growth of other spaces. + +Internal storage has both pure and impure parts. The interpreter +program itself is pure and sharable, while impure storage is used for +internal pointers, counters, and flags, for example, pointers to the +boundaries of other spaces. In the pure part of this space are most of +the `ATOM`s in an initial MDL, along with their `OBLIST` buckets +(`LIST`s) and `GVAL` slots (a pure extension of the global vector), +where possible. A `SET` or `SETG` of a pure `ATOM` automatically +impurifies the `ATOM` and as much of its `OBLIST` bucket as needs to +be impure. + +22.4. Garbage Collection: Details +--------------------------------- + +When either of the garbage-collected spaces (movable or immovable) +becomes full, MDL goes through the following procedure: + +1. A `"DIVERT-AGC"` interrupt occurs if the garbage collection can be + deferred temporarily by shifting boundaries between storage spaces + slightly. The interrupt handler may postpone a garbage collection + by moving boundaries itself with a call to `BLOAT` (below). +2. The garbage collector begins execution. The "copying" algorithm + creates an inferior operating-system process (named `AGC` in the + ITS version) whose address space is used to hold the new copies of + non-garbage objects. MDL accesses the inferior's address space + through two pages ("frontier" and "window") in its internal space + that are shared with the inferior. If the garbage collection + occurred because movable garbage-collected space was exhausted, + then the "mark-sweep" algorithm might be used instead (see below) + and no inferior process is created. +3. The garbage collector marks and moves all objects that can + possibly be referenced hereafter. It begins with the `
` + `PROCESS` and the currently running `PROCESS` ``, considered + as vectors containing the control stacks, object pointers in live + registers, etc. Every object in these "`PROCESS` vectors" is + marked "accessible", and every element of these objects (bindings, + etc.), and so on recursively. The "copying" algorithm moves + objects into the inferior process's address space as it marks + them. +4. If the garbage collection is "exhaustive" -- which is possible + only in the "copying" algorithm -- then both the chain of + associations and top-level local/global bindings are examined + thoroughly, which takes more time but is more likely to uncover + garbage therein. In a normal garbage collection these constructs + are not treated specially. +5. Finally, the "mark-sweep" algorithm sweeps through the storage + space, adding unmarked objects to the internal free lists for + later re-use. The "copying" algorithm maps the inferior process's + address space into MDL's own, replacing old garbagey with the new + compact storage, and the inferior process is destroyed. + +22.5 GC +------- + + + +causes the garbage collector to run and returns the total number of +words of storage reclaimed. All of its arguments are optional: if they +are not supplied, a call to GC simply causes a "copying" garbage +collection. + +If *min* is explicitly supplied as an argument, a garbage-collection +parameter is changed permanently before the garbage collector runs. +*min* is the smallest number of words of "free" (unclaimed, available +for use) movable garbage-collected storage the garbage collector will +be satisfied with having after it is done. Initially it is 8192 words. +If the total amount of reclaimed storage is less than *min*, the +garbage collector will ask the operating system for enough storage (in +1024 word blocks) to make it up. N.B.: the system may be incivil +enough not to grant the request; in that case, the garbage collector +will be content with what it has, **unless** that is not enough to +satisfy a **pending** request for storage. Then it will inform you +that it is losing. A large *min* will result in fewer total garbage +collections, but they will take longer since the total quantity of +storage to be dealt with will generally be larger. Smaller *min*s +result in shorter, more frequent garbage collections. + +22.6. BLOAT +----------- + +`BLOAT` is used to cause a temporary expansion of the available +storage space with or without changing the garbage-collection +parameters. `BLOAT` is particularly useful for avoiding unnecessary +garbage collections when loading a large file. It will cause (at most) +one garbage collection, at the end of which the available storage will +be at least the amount specified in the call to `BLOAT`. (Unless, of +course, the operating system is cranky and will not provide the +storage. Then you will get an error. `` from this error will +cause the `BLOAT` to return `1`, which usually just causes you to lose +at a later time -- unless the operating system feels nicer when the +storage is absolutely necessary.) + +A call to BLOAT looks like this: + + + +where all arguments on the first line above are `FIX`, optional (`0` +by default), and indicate the following: + +- *fre*: number of words of free movable storage desired (for + `LIST`s, `VECTOR`s, `ATOM`s, etc.) +- *stk*: number of words of free control-stack space desired (for + functional applications and binding of `ATOM`s) +- *lcl*: number of new top-level `LVAL`s for which to leave space + (`SET`s of `ATOM`s which are not currently bound) +- *glb*: number of new `GVAL`s for which to leave space (in the + global vector) +- *typ*: number of new `TYPE` definitions for which to leave space + (in the `TYPE` vector) +- *sto*: number of words of immovable garbage-collected storage + desired +- *pstk*: number of words of free internal-stack space desired (for + `READ`ing large `STRING`s, and calling routines within the + interpreter and compiled programs) + +Arguments on the second line are also `FIX` and optional, but they set +garbage-collection parameters permanently, as follows: + +- *min*: as for `GC` +- *plcl*: number of slots for `LVAL`s added when the space for + top-level `LVAL`s is expanded (initially 64) +- *pglb*: number of slots for `GVAL`s added when the global vector + is grown (initially 64) +- *ptyp*: number of slots for `TYPE`s added when the `TYPE` vector + is grown (initially 32) +- *imp*: number of words of immovable garbage-collected storage + added when it is expanded (initially 1024) +- *pur*: number of words reserved for pure compiled programs, if + possible (initially 0) +- *dpstk*: most desirable size for the internal stack, to prevent + repeated shrinking and `GROW`ing (initially 512) +- *dstk*: most desirable size for the control stack (initially 4096) + +`BLOAT` returns the actual number of words of free movable +garbage-collected storage available when it is done. + +22.7. BLOAT-STAT +---------------- + +`BLOAT-STAT` can be used with `BLOAT` to "tune" the garbage collector +to particular program requirements. + + + +fills the *uvector* with information about the state of storage of +MDL. The argument should be a `UVECTOR` of length 27 and `UTYPE` +`FIX`. If `BLOAT-STAT` does not get an argument, it will provide its +own `UVECTOR`. The information returned is as follows: the first 8 +elements indicate the number of garbage collections that are +attributable to certain causes, and the other 19 give information +about certain areas of storage. In detail: + +1. number of garbage collections caused by exhaustion of movable + garbage-collected storage +2. ditto by overflow of control stack(s) +3. ditto by overflow of top-level-`LVAL` section of control stack(s) +4. ditto by overflow of global vector +5. ditto by overflow of `TYPE` vector +6. ditto by exhaustion of immovable garbage-collected storage +7. ditto by overflow of internal stack +8. ditto by overflow of both stacks at the same time (rare) + +9. number of words of movable storage +10. number of words of movable storage used since last `BLOAT-STAT` +11. maximum number of words of movable storage ever existing +12. number of words of movable storage used since MDL began running +13. maximum size of control stack +14. number of words on control stack in use +15. maximum size of control stack(s) ever reached +16. number of slots for top-level `LVAL`s +17. number of top-level `LVAL`s existing +18. number of slots for `GVAL`s in global vector +19. number of `GVAL`s existing +20. number of slots for `TYPE`s in `TYPE` vector +21. number of `TYPE`s existing +22. number of words of immovable garbage-collected storage +23. number of words of immovable storage unused +24. size of largest unused contiguous immovable-storage block +25. number of words on internal stack +26. number of words on internal stack in use +27. maximum size of internal stack ever reached + +22.8. GC-MON +------------ + + + +("garbage-collector monitor") determines whether or not the +interpreter will hereafter print information on the terminal when a +garbage collection starts and finishes, according to whether or not +its argument is true. It returns the previous state. Calling it with +no argument returns the current state. The initial state is false. + +When typing is enabled, the "copying" garbage collector prints, when +it starts: + + GIN reason subr-that-caused:atom + +and, when it finishes: + + GOUT seconds-needed + +The "mark-sweep" garbage collector prints `MSGIN` and `MSGOUT` instead +of `GIN` and `GOUT`. + +22.9. Related Subroutines +------------------------- + +Two `SUBR`s, described next, use only part of the garbage-collector +algorithm, in order to find all pointers to an object. `GC-DUMP` and +`GC-READ`, as their names imply, also use part in order to translate +between MDL objects and binary representation thereof. + +### 22.9.1. SUBSTITUTE + + + +returns *old*, after causing a miniature garbage collection to occur, +during which **all** references to *old* are changed so as to refer to +*new*. Neither argument can be of `PRIMTYPE` `STRING` or `BYTES` or +`LOCD` or live on the control stack, unless both are of the same +`PRIMTYPE`. One `TYPE` name cannot be substituted for another. One of +the few legitimate uses for it is to substitute the "right" `ATOM` for +the "wrong" one, after `OBLIST`s have been in the wrong state. This is +more or less the way `ATOM`s are impurified. It is also useful for +unlinking `RSUBR`s. `SUBSTITUTE` returns *old* as a favor: unless you +hang onto *old* at that point, it will be garbage-collected. + +22.9.2 PURIFY +------------- + + + +returns its last argument, after causing a miniature garbage +collection that results in all the arguments becoming pure and +sharable, and ignored afterward by the garbage collector. No argument +can live on the control stack or be of `PRIMTYPE` `PROCESS` or `LOCD` +or `ASOC`. Sharing between operating-system processes actually occurs +after a `SAVE`, if and when the `SAVE` file is `RESTORE`d. + +Chapter 23. MDL as a System Process +=================================== + +This chapter treats MDL considered as executing in an operating-system +process, and interactions between MDL and other operating-system +processes. See also section 21.8.13. + +23.1. TIME +---------- + +`TIME` takes any number of arguments, which are evaluated but ignored, +and returns a `FLOAT` giving the number of seconds of CPU time the MDL +process has used so far. `TIME` is often used in machine-level +debugging to examine the values of its arguments, by having MDL's +superior process (say, DDT) plant a breakpoint in the code for `TIME`. + +23.2. Names +----------- + + + +returns a `STRING` which is the "user name" of MDL's process. This is +the "uname" process-control variable in the ITS version and the +logged-in directory in the Tenex and Tops-20 versions. + + + +returns a `STRING` which is the "intended user name" of MDL's process. +This is the "xuname" process-control variable in the ITS version and +identical to in the Tenex and Tops-20 versions. + + + +returns a `STRING` which is the "job name" of MDL's process. This is +the "jname" process-control variable in the ITS version and the +`SETNM` name in the Tenex and Tops-20 versions. The characters belong +to the "sixbit" or "printing" subset of ASCII, namely those between +`` and `` inclusive. + + + +returns a `STRING` which is the "intended job name" of MDL's process. +This is the "xjname" process-control variable in the ITS version and +identical to `` in the Tenex and Tops-20 versions. + +23.3. Exits +----------- + + + +attempts to log out the process in which it is executed. It will +succeed only if the MDL is the top-level process, that is, it is +running disowned or as a daemon. If it succeeds, it of course never +returns. If it does not, it returns `#FALSE ()`. + + + +causes MDL to stop running, in an orderly manner. In the ITS version, +it is equivalent to a `.LOGOUT 1` instruction. In the Tenex and +Tops-20 versions, it is equivalent to a control-C signal, and control +passes to the superior process. + + + +("value return") seldom returns. It passes control back up the process +tree to the superior of MDL, passing its argument as a message to that +superior. If it does return, the value is `#FALSE ()`. If the argument +is a `STRING`, it is passed to the superior as a command to be +executed, via `.VALUE` in the ITS version and `RSCAN` in the Tops-20 +version. If the argument is a `FIX`, it is passed to the superior as +the "effective address" of a `.BREAK 16`, instruction in the ITS +version and ignored in other versions. + +23.4. Inter-process Communication +--------------------------------- + +All of the `SUBR`s in this section are available only in the ITS +version. + +The IPC ("inter-process communication") device is treated as an I/O +device by ITS but not explicitly so by MDL: that is, it is never +`OPEN`ed. It allows MDL to communicate with other ITS processes by +means of sending and receiving messages. A process identifies itself +as sender or recipient of a message with an ordered pair of "sixbit" +`STRING`s, which are often but not always `` and ``. A +message has a "body" and a "type". + +### 23.4.1. SEND and SEND-WAIT + + + + + +both send an IPC message to any job that is listening for it as +*othern1* *othern2*. *body* must be either a `STRING`, or a `UVECTOR` +of objects of `PRIMTYPE` `WORD`. *type* is an optional `FIX`, `0` by +default, which is part of the information the other guy receives. The +last two arguments are from whom the message is to be sent. These are +optional, and `` and `` respectively are used by +default. `SEND` returns a `FALSE` if no one is listening, while +`SEND-WAIT` hangs until someone wants it. Both return `T` if someone +accepts the message. + +### 23.4.2. The "IPC" Interrupt + +When your MDL process receives an IPC message, `"IPC"` occurs (chapter +21). A handler is called with either four or six arguments gleaned +from the message. *body*, *type*, *othern1*, and *othern2* are +supplied only if they are not this process's `` and ``. + +There is a built-in `HANDLER` for the `"IPC"` interrupt, with a +handler named `IPC-HANDLER` and `0` in the `PROCESS` slot. The handler +prints out on the terminal the *body*, whom it is from, the *type* if +not `0`, and whom it is to if not `` ``. If the *type* +is `1` and the *body* is a `STRING`, then, after the message +information is printed out, the `STRING` is `PARSE`d and `EVAL`uated. + +### 23.4.3. IPC-OFF + +`` stops all listening on the IPC device. + +### 23.4.4. IPC-ON + + + +causes listening on the IPC device as *myname1* *myname2*. If no +arguments are provided, listening is on `` ``. When a +message arrives, `"IPC"` occurs. + +MDL is initially listening as `` `` with the built-in +`HANDLER` set up on the `"IPC"` interrupt with a priority of `1`. + +### 23.4.5. DEMSIG + + + +signals to ITS (directly, not via the IPC device) that the daemon +named by its argument should run now. It returns `T` if the daemon +exists, `#FALSE ()` otherwise. + +Chapter 24. Efficiency and Tastefulness +======================================= + +24.1. Efficiency +---------------- + +Actually, you make MDL programs efficient by thinking hard about what +they really make the interpreter **do**, and making them do less. Some +guidelines, in order of decreasing expense: + +1. Free storage is expensive. +2. Calling functions is expensive. +3. `PROG` and `REPEAT` are expensive, except when compiled. + +Explanation: + +1. Unnecessary use of free storage (creating needless `LIST`s, + `VECTOR`s, `UVECTOR`s, etc.) will cause the garbage collector to + run more often. This is **expensive!** A fairly large MDL (for + example, 60,000 36-bit words) can take ten seconds of PDP-10 CPU + time for a garbage collection. Be especially wary of constructions + like `(0)`. Every time that is evaluated, it creates a new + one-element `LIST`; it is too easy to write such things when they + aren't really necessary. Unless you are doing `PUT`s or `PUTREST`s + on it, use `'(0)` instead. +2. Sad, but true. Also generally ignored. If you call a function only + once, or if it is short (less than one line), you are much better + off in speed if you substitute its body in by hand. On the other + hand, you may be much worse off in modularity. There are + techniques for combining several `FUNCTION`s into one `RSUBR` + (with `RSUBR-ENTRY`s), either during or after compilation, and for + changing `FUNCTION`s into `MACRO`s. +3. `PROG` is almost never necessary, given (a) `"AUX"` in + `FUNCTION`s; (b) the fact that `FUNCTION`s can contain any number + of `FORM`s; (c) the fact that `COND` clauses can contain any + number of `FORM`s; and (d) the fact that new variables can be + generated and initialized by `REPEAT`. However, `PROG` may be + useful when an error occurs, to establish bindings needed for + cleaning things up or interacting with a human. + +The use of `PROG` may be sensible when the normal flow of control can +be cut short by unusual conditions, so that the program wants to +`RETURN` before reaching the end of `PROG`. Of course, nested `COND`s +can accomplish the same end, but deep nesting may tend to make the +program unreadable. For example: + + > + > + + > + > + > + +could instead be written + + + + + )>)> + +By the way, `REPEAT` is faster than `GO` in a `PROG`. The `` +`FORM` has to be separately interpreted, right? In fact, if you +organize things properly you **very** seldom need a `GO`; using `GO` +is generally considered "bad style", but in some cases it's needed. +Very few. + +In many cases, a `REPEAT` can be replaced with a `MAPF` or `MAPR`, or +an `ILIST`, `IVECTOR`, etc. of the form + + > + +which generates an `N`-element `LIST` of successive numbers starting +at `X+1`. + +Whether a program is interpreted or compiled, the first two +considerations mentioned above hold: garbage collection and function +calling remain expensive. Garbage collection is, clearly, exactly the +same. Function calling is relatively more expensive. However, the +compiler careth not whether you use `REPEAT`, `GO`, `PROG`, `ILIST`, +`MAPF`, or whatnot: it all gets compiled into practically the same +thing. However, the `REPEAT` or `PROG` will be slower if it has an +`ACTIVATION` that is `SPECIAL` or used other than by `RETURN` or +`AGAIN`. + +### 24.1.1. Example + +There follows an example of a `FUNCTION` that does many things wrong. +It is accompanied by commentary, and two better versions of the same +thing. (This function actually occurred in practice. Needless to say, +names are withheld to protect the guilty.) + +Blunt comment: this is terrible. Its purpose is to output the +characters needed by a graphics terminal to draw lines connecting a +set of points. The points are specified by two input lists: `X` values +and `Y` values. The output channel is the third argument. The actual +characters for each line are returned in a `LIST` by the function +`TRANS`. + + > >> + )> + + <.N .Y>>)> + > .L>)> > + )) + > .CHN> + > .L1> + )> >> + +Comments: + +1. `LIST` is only temporarily necessary. It is just created and then + thrown away. +2. Worse, the construct `(!.LIST !)` **copies** the + previous elements of `LIST` every time it is executed! +3. Indexing down the elements of `LIST` as in `<.N .LIST>` takes a + long time, if the `LIST` is long. `<3 ...>` or `<4 ...>` is not + worth worrying about, but `<10 ...>` is, and `<100 ...>` takes + quite a while. Even if the indexing were not phased out, the + compiler would be happier with ``. +4. The variable `CHN` is unnecessary if `OUTCHAN` is bound to the + argument `CHANNEL`. +5. It is tasteful to call `ERROR` in the same way that F/SUBRs do. + This includes using an `ATOM` from the `ERRORS` `OBLIST` (if one + is appropriate) to tell what is wrong, and it includes identifying + yourself. + +So, do it this way: + + ) + >> + )> + > + )> + <1 .Y>>)) + >> + >> + )>> + > + >>> + +Of course, if you know how long is the `LIST` that `TRANS` returns, +you can avoid using the inner `REPEAT` loop and have explicit `PRINC`s +for each element. This can be done even better by using `MAPF`, as in +the next version, which does exactly the same thing as the previous +one, but uses `MAPF` to do the `REST`ing and the end conditional: + + ) + >> + )> + > + #FUNCTION ((XE YE) + #FUNCTION ((T) >) >) + .X + .Y> + "DONE"> + +24.2. Creating a LIST in Forward Order +-------------------------------------- + +If you must create the elements of a `LIST` in sequence from first to +last, you can avoid copying earlier ones when adding a later one to +the end. One way is to use `MAPF` or `MAPR` with a first argument of +`,LIST`: the elements are put on the control stack rather than in free +storage, until the final call to `LIST`. If you know how many elements +there will be, you can put them on the control stack yourself, in a +`TUPLE` built for that purpose. Another way is used when `REPEAT` is +necessary: + + >> + ... + >> + ...> + +Here, `.LAST` always points to the current last element of the `LIST`. +Because of the order of evaluation, the `` could also be +written `>`. + +24.3. Read-only Free Variables +------------------------------ + +If a Function uses the value of a free variable +(`` or ``) without changing +it, the compiled version may be more efficient if the value is +assigned to a dummy `UNSPECIAL` `ATOM` in the Function's `"AUX"` list. +This is true because an `UNSPECIAL` `ATOM` gets compiled into a slot +on the control stack, which is accessible very quickly. The tradeoff +is probably worthwhile if a *special* is referenced more than once, or +if an *unmanifest* is referenced more than twice. Example: + + >) + > .THINGS>> + +24.4. Global and Local Values +----------------------------- + +In the interpreter the sequence `,X .X ,X .X` is slower than +`,X ,X .X .X` because of interference between the `GVAL` and `LVAL` +mechanisms (appendix 1). Thus it is not good to use both the `GVAL` +and `LVAL` of the same `ATOM` frequently, unless references to the +`LVAL` will be compiled away (made into control stack references). + +24.5. Making Offsets for Arrays +------------------------------- + +It is often the case that you want to attach some meaning to each +element of an array and access it independently of other elements. +Firstly, it is a good idea to use names (`ATOM`s) rather than integers +(`FIX`es or even `OFFSET`s) for offsets into the array, to make future +changes easier. Secondly, it is a good idea to use the `GVAL`s of the +name `ATOM`s to remember the actual `FIX`es, so that the `ATOM`s can +be `MANIFEST` for the compiler's benefit. Thirdly, to establish the +`GVAL`s, both the interpreter and the compiler will be happier with +`` rather than +`>`. + +24.6. Tables +------------ + +There are several ways in MDL to store a table, that is, a collection +of (names and) values that will be searched. Unsurprisingly, choosing +the best way is often dictated by the size of the table and/or the +nature of the (names and) values. + +For a small table, the names and values can be put in (separate) +structures -- the choice of `LIST` or array being determined by +volatility and limitability -- which are searched using `MEMQ` or +`MEMBER`. This method is very space-efficient. If the table gets +larger, and if the elements are completely orderable, a (uniform) +vector can be used, kept sorted, and searched with a binary search. + +For a large table, where reasonably efficient searches are required, a +hashing scheme is probably best. Two methods are available in MDL: +associations and `OBLIST`s. + +In the first method, `PUTPROP` and `GETPROP` are used, which are very +fast. The number of hashing buckets is fixed. Duplicates are +eliminated by `==?` testing. If it is necessary to use `=?` testing, +or to find all the entries in the table, you can duplicate the table +in a `LIST` or array, to be used only for those purposes. + +In the second method, `INSERT` and `LOOKUP` on a specially-built +`OBLIST` are used. (If the names are not `STRING`s, they can be +converted to `STRING`s using `UNPARSE`, which takes a little time.) +The number of hashing buckets can be chosen for best efficiency. +Duplicates are eliminated by `=?` testing. MAPF/R can be used to find +all the entries in the table. + +24.7. Nesting +------------- + +The beauty of deeply-nested control structures in a single `FUNCTION` +is definitely in the eye of the beholder. (`PPRINT`, a preloaded +`RSUBR`, finds them trying. However, the compiler often produces +better code from them.) **If** you don't like excessive nesting, then +you will agree that + + + ...) ...> + +looks better than + + > ...) ...> + +and that + + )> + ... + ...> + +looks better than + + ) + (ELSE ...)> + ...> + +You can see the nature of the choices. Nesting is still and all better +than `GO`.