-*- Text -*-
Copyright (c) 1999 Massachusetts Institute of Technology

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or (at
your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
------------------------------


	This file contains documentation for both the LOCK device,
	and the locked switch list and critical routine features.


THE LOCK DEVICE.	[ In ITS version 1630 and later ]

The LOCK device provides a simple and foolproof technique for interlocking
between jobs.  While not as efficient as using an AOSE-style switch in
shared memory, the LOCK device is much easier to use and is suitable for
use in those situations where a lock is only rarely seized, and is only
held for brief periods of time.

Locks are identified using a single SIXBIT word as a lock name.  (Actually,
any non-zero 36-bit word can be used.)  A job can lock the FOO lock by
opening "LOCK:FOO" for output.  The LOCK device ignores any second filename
or directory name you might supply.  The lock will be released when the
channel is closed.  As long as the job keeps the lock channel open, any
other attempts to lock the same lock result in a %ENAFL (FILE LOCKED)
error.

When opening the LOCK device, setting bit 1.4 (10 octal) in the open mode
indicates that the opener wishes to hang waiting for the lock, rather than
receiving a %ENAFL error.

It is also possible to receive a %EFLDV (DEVICE FULL) error when opening
the LOCK device if the table of currently held locks that ITS maintains
overflows.
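
As a minimal sketch of the above (not taken from any existing program), a
job might seize a lock, do a brief piece of work, and release it as shown
here.  The lock name FOO and the channel symbol CH are hypothetical; a real
program should use a registered lock name and a channel it knows to be free.

```midas
	.CALL [	SETZ ? SIXBIT /OPEN/
		MOVSI 10\.UAO		; 1.4 => hang until the lock is free
		MOVEI CH		; CH: a free channel (assumed)
		MOVE [SIXBIT /LOCK/]
		SETZ [SIXBIT /FOO/]]	; FOO: hypothetical lock name
	 .LOSE %LSFIL			; Open failed, for example with %EFLDV
	;; ... brief work done while holding the lock ...
	.CLOSE CH,			; Closing the channel releases the lock
```

Because bit 1.4 is set, this open waits for the lock instead of failing
with %ENAFL.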

Here is the registry of all lock names.  Any program that uses the LOCK
device should register the names of the locks it uses here.

	Name		Purpose
	------		------
	NUTMEG		This lock is used in an example later
			on in this file.

	MBXxxx		The set of locks whose names start with the
			characters "MBX" are used by COMSAT and GMSGS to
			coordinate access to mailboxes.  The algorithm for
			computing the lock for the mailbox named
			SNAME;NAME1 NAME2 is:

				MOVE A,NAME1
				ROT A,1
				ADD A,NAME2
				ROT A,1
				ADD A,SNAME
				IDIVI A,777773		; Largest prime < 1,,0
				HRLI B,(SIXBIT /MBX/)	; Remainder is in B (= AC A+1)


THE LOCKED SWITCH LIST AND CRITICAL ROUTINE FEATURES.

There is an obvious technique for interlocking between jobs on
the PDP-10 - namely, the use of AOSE - which has been unusable
under ITS simply because a job might be killed while it had a
switch locked, thus causing the switch to remain locked forever.
The features described here allow that problem, and the similar
problem of system crashes while a switch in a permanent data
base is locked, to be solved without any loss of efficiency.

The locked switch list feature allows a program to maintain a
list of switches which it has locked, so that ITS can unlock
them when the job is killed (or logs out, or is gunned)
or is reset (for example, $l'd by DDT). The critical routine
feature allows the program to cause ITS to perform certain
actions if the job is killed or reset with the PC in a specified range.

These features do not prevent bugs in the programs accessing
a switch from causing the switch not to be unlocked. They make
possible successful interlocking but do not guarantee it.


A. THE DEFINITIONS OF THE FEATURES.

These features are all activated by setting the %OPLOK bit in
the .OPTION USET-variable to 1 (this bit is 1000,,). If that is
not done, words 43 and 44 are not in any way special. Also, the
addresses used do not have to be 43 and 44; those are merely the
default values. The actual addresses used are relative to the
contents of the .40ADDR USET-variable, which initially contains
40. If .40ADDR were set to 1000, locations 1003 and 1004 would
be used. Of course, all system actions that normally use locations
40, 41 and 42 would then use 1000, 1001 and 1002.
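
A sketch of turning the features on, assuming the default .40ADDR of 40
and that accumulator A is free.  CRTAB and CRTLEN are hypothetical names
for a critical routine table and its length; that the length is counted in
words is an assumption, not something stated in this file.  Locations 43
and 44 are given valid contents before the bit is set.

```midas
	SETZM 43		; Locked switch list starts out empty
	MOVE A,[-CRTLEN,,CRTAB]	; AOBJN pointer to the (hypothetical) table
	MOVEM A,44
	.SUSET [.ROPTION,,A]	; Read the current .OPTION bits
	TLO A,%OPLOK		; Turn on %OPLOK (a left-half bit)
	.SUSET [.SOPTION,,A]	; Write them back, leaving other bits alone
```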

 1. THE LOCKED SWITCH LIST.

When the %OPLOK bit is 1, location 43 is taken to be the pointer
to the job's locked switch list. The list pointer
should either be 0, meaning the list is empty, or the address of
the first two-word switch block. The format of a switch block is
as follows:

 1st word:	the switch, or, if the indirect bit is set in
		 the second word, the address of the switch
		 (multiple indirection is not allowed).
 2nd word:	the RH holds the address of the next switch
		 block, or 0 (this is the CDR of the list).
		the LH holds the instruction to be executed to
		 unlock the switch. The index field is ignored.
		 The instruction must either be an AOS or SOS
		 with 0 in the AC field, a logical instruction
		 (such as SETAM, IORM, SETOM, ANDCAM), a halfword
		 instruction, a MOVEM, MOVNM, MOVSM, MOVMM,
		 ADDM or SUBM. The AC will never be modified
		 even if the instruction
		 says to do so. If the LH is 0, the instruction
		 SETOM is used, as the most common choice.
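
As an illustration of the format only, here is how a two-block list might
look in memory while both switches are held, with 43 containing BLK1.  The
names BLK1, BLK2 and COUNT are hypothetical.

```midas
BLK1:	0		; 1st word: the switch itself, currently locked
	SETOM BLK2	; 2nd word: LH = unlock insn, RH = next block
BLK2:	COUNT		; 1st word: address of the switch, since @ is set
	SOS @		; 2nd word: LH = SOS with indirect bit, RH = 0
```

A left half of 0 in the second word of BLK1 would have the same effect as
the explicit SETOM.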

When the job is killed or reset, if the locked switch list is
not null, the system looks at the first switch block, unlocks
the switch by executing the unlock instruction with the switch
word as its memory argument, and then copies the RH of the
second word of the switch block into 43 to remove the block
from the list (this makes sure that no switch is ever unlocked
twice due to PCLSR'ing).
This procedure is repeated until 43 contains 0,
thus unlocking all the switches in the list. Obviously since the
job's pages are about to be discarded this action will have no
consequence unless the switches are in pages shared with other
jobs.

If in the process of unlocking the switches
the system tries to read from a nonexistent page or write in a
pure page, it gives up entirely, ignoring the rest of the
locked switch list, and also the critical routine table. Also
if the end of the list is not reached after 512. switch blocks
have been unlocked, the system gives up.

 2. THE CRITICAL ROUTINE TABLE.

When the %OPLOK bit is 1, location 44 is considered to be an
AOBJN pointer to a table of critical sections of code, which are
involved in the manipulation of switches or the locked switch
list. The table should be a vector of two-word entries, one for
each critical section of code. The first word of each entry
should give the boundaries of the critical section: the left
half should have the address of the first instruction of the
critical section; the right half, the address of the first
instruction after the critical section. The second word of the
entry should have an unlock instruction, subject to the same
restrictions as for locked switch list unlock instructions,
the only difference being that the address field of the
instruction is taken from the RH of the word, as one would expect,
whereas in unlock instructions in switch blocks the address of
the switch block is used, and the RH of the word is the CDR.
Examples will make all this clear.

If the job is killed or reset while the PC is in the range
specified by a critical routine table entry, the switch
specified by the entry will be unlocked by executing the unlock
instruction. It is possible
for the ranges specified by two entries to overlap; to make sure
that no entry is processed more than once, the system updates 44
as it processes each entry.

As with the switch list unlocking, the system abandons the whole
thing if it needs to read or write in a page that won't allow it.

 3. FATAL INTERRUPTS IN TOP-LEVEL JOBS WITH LOCKED LOCKS.

When the %OPLKF bit in a job's .OPTION variable is set to 1, and if the job
is the top-level job of a non-disowned tree, then if that job ever receives
a fatal interrupt its locks will be unlocked by the system job as part of
the process of detaching it.  Thus fatal interrupts in network servers,
toplevel DDTs, system demons, etc. that happen while those jobs have shared
databases locked, will not keep other jobs blocked waiting for someone to
gun down the corpse.  

The reason you might not want to set %OPLKF is that after a job's locks are
unlocked, it will not in general work to proceed the job.  Such a detached
corpse will only be good for autopsy, not revival.


B. USING THE FEATURES FOR SWITCHES IN A SHARED PAGE.

In this section it is assumed that the page is not part of a
disk file, and will not survive through a system crash. That
means that it is not necessary to worry about unlocking switches
that are locked at the time of a system crash.

 1. LOCKING AN AOSE-STYLE SWITCH.

The proper routine to use for locking a switch follows:
The address of the two-word switch block is assumed to be in A.

LOCK:	AOSE (A)		;LOCK THE SWITCH, OR WAIT TILL WE CAN.
	 .HANG
LOCK1:	MOVE B,43		;PUT THE SWITCH ON THE
	HRLI B,(SETOM)
	MOVEM B,1(A)		;LOCKED SWITCH LIST
	MOVEM A,43
LOCK2:	POPJ P,

This routine will set up the switch as a switch block, and make
43 point to it. The contents of the switch block will be:

	0		;This word is the switch itself!
	SETOM <previous contents of 43>
			;The SETOM is the unlock instruction.
			;The RH has nothing to do with the SETOM;
			;it points to the next block of the list.

Note that the HRLI instruction is superfluous, because 0 in the
left half of the second word of the block is the same as (SETOM).

The four instructions starting at LOCK1 are critical because
the switch has been locked but is not on the locked switch list.
Therefore, an entry in the critical routine table of the form

	LOCK1,,LOCK2
	SETOM @A

is needed, in case the job is killed while executing there.


 2. UNLOCKING AN AOSE-STYLE SWITCH.

The correct way to unlock a switch follows:
(assuming that A points to the switch block and that
the switch block is the first item on the locked switch list).

UNLOCK:	HRRZ B,1(A)	;REMOVE THE SWITCH FROM THE
	MOVEM B,43	;LOCKED SWITCH LIST.
UNLOC1:	SETOM (A)	;THEN UNLOCK THE SWITCH.
UNLOC2:	POPJ P,

The instruction at UNLOC1 is critical because the switch is
locked but not on the locked switch list. Therefore, an entry
is needed in the critical routine table as follows:

	UNLOC1,,UNLOC2
	SETOM @A

Note that the switch must be removed from the list before
unlocking. That is because if the switch is locked but not on
the list, the critical routine table may be used to unlock it,
but if the switch is on the list but not locked, it will be set
to -1 if the job is killed, and that could cause problems if
some other job had locked the switch.


C. HANDLING SWITCHES IN DISK FILES.

The extra problem here is to make sure that if the system crashes, the next
time the data base is accessed after the system is reloaded all the
switches will be reinitialized.  This may be done by using the time and
date that the system was started to distinguish different incarnations of
it.

The technique uses a variable INITDN, stored in the database, and a LOCK
device lock named "NUTMEG".  Whenever a program accesses the database for
the first time, it must check INITDN.  If INITDN does not equal the time
the system was started, the database requires initialization.  If a job
detects that the database requires initialization, it seizes the NUTMEG
lock, checks to see that initialization really is required, performs the
initialization, updates INITDN, and releases the lock.

The skeleton of the necessary routine is as follows, assuming that the file
has already been opened and its pages mapped into core with a CORBLK system
call, and INITDN is the address of the variable and SWIT1, SWIT2, etc. are
the addresses of switches.

INIT:	.CALL [	SETZ ? SIXBIT /RQDATE/
		MOVEM A		; Ignore 1st value
		SETZM A]	; 2nd value is time of system startup.
	 .LOSE 1000
	JUMPL A,[MOVEI A,300.	; System doesn't know time,
		 .SLEEP A,	; Sleep 10. sec and hope it
		 JRST INIT]	; finds out the time.
	CAMN A,INITDN		; Init needed?
	 POPJ P,		; No => No need for the lock, just return.
	.CALL [	SETZ ? SIXBIT /OPEN/
		MOVSI 10\.UAO	; 1.4 => Hang waiting for initialization lock
		MOVEI CH
		MOVE [SIXBIT /LOCK/]
		SETZ [SIXBIT /NUTMEG/]]	; Registered, database specific lock
	 .LOSE
	CAMN A,INITDN		; Init needed?
	 JRST INIT1		; No => someone else did it, unlock and return.
	SETOM SWIT1		; Start setting switches to unlocked
	SETOM SWIT2		; state. These insns should address
	SETOM SWIT3		; locations in the mapped file pages.
	;; etc.
	SETOM SWIT9
	MOVEM A,INITDN		; Mark init complete.
INIT1:	.CLOSE CH,
	POPJ P,

Note that the first CAMN A,INITDN can be omitted, and the algorithm is
still correct, but the second CAMN A,INITDN can -not- be safely omitted.


D. REFERENCE COUNTS.

Sometimes it is desirable to keep a count of the number of jobs
looking at a data base. When the count is AOS'd, an entry must
be put on the locked switch list to cause it to be SOS'd if the
job is killed. For example, assuming that A points to the count
and B points to an available two word block of memory:

LOOK:	AOS (A)
LOOK1:	MOVEM A,(B)
	MOVSI C,(SOS @)
	HRR C,43
	MOVEM C,1(B)	;SET UP UNLOCK INSN & CDR POINTER.
	MOVEM B,43	;PUT THE BLOCK ON THE LIST.
LOOK2:	POPJ P,

The critical routine table entry needed is:

	LOOK1,,LOOK2
	SOS @A

When finished looking, the count must be SOS'd, and removed from
the list. The following routine will work, assuming only that
the block at the front of the list was put on by the LOOK routine
above:

UNLOOK:	MOVE B,43
	HRRZ A,(B)	;GET ADDRESS OF THE COUNT VARIABLE TO BE SOS'D.
	HRRZ B,1(B)	;GET CDR POINTER.
	MOVEM B,43	;REMOVE BLOCK FROM LIST.
UNLOO1:	SOS (A)		;DECREMENT THE COUNT.
UNLOO2:	POPJ P,

The critical routine table entry needed is:

	UNLOO1,,UNLOO2
	SOS @A
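
A program that uses all four routines from sections B and D could gather
their entries into a single table, sketched here.  CRTAB and CRTLEN are
hypothetical names, and measuring the table length in words for the AOBJN
pointer in 44 is an assumption.

```midas
CRTAB:	LOCK1,,LOCK2	; Entry for the LOCK routine
	SETOM @A
	UNLOC1,,UNLOC2	; Entry for the UNLOCK routine
	SETOM @A
	LOOK1,,LOOK2	; Entry for the LOOK routine
	SOS @A
	UNLOO1,,UNLOO2	; Entry for the UNLOOK routine
	SOS @A
CRTLEN==.-CRTAB		; Table length in words (assumed unit)

	;; In the program's initialization code:
	MOVE B,[-CRTLEN,,CRTAB]
	MOVEM B,44	; AOBJN pointer to the table
```

Each unlock instruction addresses its switch through accumulator A, so the
entries rely on A still pointing at the switch or count whenever the PC is
inside the corresponding range.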


E. THE .HANG INSTRUCTION.

The .HANG UUO exists to make it easy for programs to wait for various
conditions, such as locks becoming unlocked.  It should be used the way a
JRST .-1 would be used in a stand-alone program.  .HANG is documented
completely in .INFO.;ITS UUOS.


F. THE UNLOCK SYSTEM CALL.

The UNLOCK system call can be used to unlock the switches of a specified
job.  Usually this is used by a job to unlock its own switches, but it can
be used on the switches of any job that the executing job is allowed to
write.

Usual case:

	.CALL [	SETZ ? SIXBIT /UNLOCK/
		SETZI %JSELF ]	; Unlock all my switches
	 .LOSE %LSSYS

(Note that this has nothing to do with the LOCK device.  In particular it
will -not- close LOCK device channels.)