2 Copyright (c) 1999 Massachusetts Institute of Technology
4 This program is free software; you can redistribute it and/or modify
5 it under the terms of the GNU General Public License as published by
6 the Free Software Foundation; either version 2 of the License, or (at
7 your option) any later version.
9 This program is distributed in the hope that it will be useful, but
10 WITHOUT ANY WARRANTY; without even the implied warranty of
11 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
12 General Public License for more details.
14 You should have received a copy of the GNU General Public License
15 along with this program; if not, write to the Free Software
16 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
17 ------------------------------
21 This file contains documentation for both the LOCK device,
22 and the locked switch list and critical routine features.
26 THE LOCK DEVICE. [ In ITS version 1630 and later ]
28 The LOCK device provides a simple and foolproof technique for interlocking
29 between jobs. While not as efficient as using an AOSE-style switch in
30 shared memory, the LOCK device is much easier to use and is suitable for
31 use in those situations where a lock is only rarely seized, and is only
32 held for brief periods of time.
34 Locks are identified using a single SIXBIT word as a lock name. (Actually,
35 and non-zero 36-bit word can be used.) A job can lock the FOO lock by
36 opening "LOCK:FOO" for output. The LOCK device ignores any second filename
37 or directory name you might supply. The lock will be released when the
38 channel is closed. As long as the job keeps the lock channel open, any
39 other attempts to lock the same lock result in a %ENAFL (FILE LOCKED)
42 When opening the LOCK device, bit 1.4 indicates that the opener wishes to
43 hang waiting for the lock, rather than receiving a %ENAFL error.
45 It is also possible to receive a %EFLDV (DEVICE FULL) error when opening
46 the LOCK device if the table of currently held locks that ITS maintains
49 Here is the registry of all lock names. Any program that uses the LOCK
50 device should register the name of the locks it uses here.
54 NUTMEG This lock is used in an example later
57 MBXxxx The set of locks whose names start with the
58 characters "MBX" are used by COMSAT and GMSGS to
59 coordinate access to mailboxes. The algorithm for
60 computing the lock for the mailbox named
68 IDIVI A,777773 ; Largest prime < 1,,0
69 HRLI B,(SIXBIT /MBX/) ; Result in B = A + 1
74 THE LOCKED SWITCH LIST AND CRITICAL ROUTINE FEATURES.
76 There is an obvious technique for interlocking between jobs on
77 the PDP-10 - namely, the use of AOSE - which has been unusable
78 under ITS simply because a job might be killed while it had a
79 switch locked, thus causing the switch to remain locked forever.
80 These implemented features allow that problem, and the similar
81 problem of system crashes while a switch in a permanent data
82 base is locked, to be solved without any loss of efficiency.
84 The locked switch list feature allows a program to maintain a
85 list of switches which it has locked, so that ITS can unlock
86 them when the job is killed (or logs out, or is gunned)
87 or is reset (for example, $l'd by DDT). The critical routine
88 feature allows the program to cause ITS to perform certain
89 actions if the job is killed or reset with the PC in a specified range.
91 These features do not prevent bugs in the programs accessing
92 a switch from causing the switch not to be unlocked. They make
93 possible successful interlocking but do not guarantee it.
96 A. THE DEFINITIONS OF THE FEATURES.
98 These features are all activated by setting the %OPLOK bit in
99 the .OPTION USET-variable to 1 (This bit is 1000,,). If that is
100 not done, words 43 and 44 are not in any way special. Also, the
101 addresses used do not have to be 43 and 44; that is merely their
102 default values. The actual adresses used are relative to the
103 contents of the .40ADDR USET-variable, which initially contains
104 40 . If .40ADDR were set to 1000, locations 1003 and 1004 would
105 used. Of course, all system actions that normally use locations
106 40, 41 and 42 would use 1000, 1001 and 1002.
108 1. THE LOCKED SWITCH LIST.
110 When the %OPLOK bit is 1, location 43 is taken to be the pointer
111 to the job's locked switch list. The list pointer
112 should either be 0, meaning the list is empty, or the address of
113 the first two-word switch block. The format of a switch block is
116 1st word: the switch, or, if the indirect bit is set in
117 the second word, the address of the switch
118 (multiple indirection is not allowed).
119 2nd word: the RH holds the address of the next switch
120 block, or 0 (this is the CDR of the list).
121 the LH holds the instruction to be executed to
122 unlock the switch. The index field is ignored.
123 The instruction must either be an AOS or SOS
124 with 0 in the AC field, a logical instruction
125 (such as SETAM, IORM, SETOM, ANDCAM), a halfword
126 instruction, a MOVEM, MOVNM, MOVSM, MOVMM,
127 ADDM or SUBM. The AC will never be modified
128 even if the instruction
129 says to do so. If the LH is 0, the instruction
130 SETOM is used, as the most common choice.
132 When the job is killed or reset, if the locked switch list is
133 not null, the system looks at the first switch block, unlocks
134 the switch by executing the unlock instruction with the switch
135 word as its memory argument, and the copying the RH of the
136 second word of the switch block into 43 to remove the block
137 from the list (this makes sure that no switch is ever unlocked
138 twice due to PCLSR'ing).
139 This procedure is repeated until 43 contains 0,
140 thus unlocking all the switches in the list. Obviously since the
141 job's pages are about to be discarded this action will have no
142 consequence unless the switches are in pages shared with other
145 If in the process of unlocking the switches
146 the system tries to read from a nonexistent page or write in a
147 pure page, it gives up entirely, ignoring the rest of the
148 locked switch list, and also the critical routine table. Also
149 if the end of the list is not reached after 512. switch blocks
150 have been unlocked, the system gives up.
152 2. THE CRITICAL ROUTINE TABLE.
154 When the %OPLOK bit is 1, location 44 is considered to be an
155 AOBJN pointer to a table of critical sections of code, which are
156 involved in the manipulation of switches or the locked switch
157 list. The table should be a vector of two-word entries, one for
158 each critical section of code. The first word of each entry
159 should give the boundaries of the critical section: the left
160 half should have the the address of the first instruction of the
161 critical section; the right half, the address of the first
162 instruction after the critical section. The second word of the
163 entry should have an unlock instruction, subject to the same
164 restrictions as for locked switch list unlock instructions,
165 the only difference being that the address field of the
166 instruction is taken from the RH of the word, as one would expect,
167 whereas in unlock instructions in switch blocks the address of
168 the switch block is used, and the RH of the word is the CDR.
169 Examples will make all this clear.
171 If the job is killed or reset while the PC is in the range
172 specified by a critical routine table entry, the switch
173 specified by the entry will be unlocked by executing the unlock
174 instruction. It is possible
175 for the ranges specified by two entries to overlap; to make sure
176 that no entry is processed more than once, the system updates 44
177 as it processes each entry.
179 As with the switch list unlocking, the system abandons the whole
180 thing if it needs to read or write in a page that won't allow it.
182 3. FATAL INTERRUPTS IN TOP-LEVEL JOBS WITH LOCKED LOCKS.
184 When the %OPLKF bit in a job's .OPTION variable is set to 1, and if the job
185 is the top-level job of a non-disowned tree, then if that job ever receives
186 a fatal interrupt its locks will be unlocked by the system job as part of
187 the process of detaching it. Thus fatal interrupts in network servers,
188 toplevel DDTs, system demons, etc. that happen while those jobs have shared
189 databases locked, will not keep other jobs blocked waiting for someone to
192 The reason you might not want to set %OPLKF is that after a jobs locks are
193 unlocked, it will not in general work to proceed the job. Such a detached
194 corpse will only be good for autopsy, not revival.
197 B. USING THE FEATURES FOR SWITCHES IN A SHARED PAGE.
199 In this section it is assumed that the page is not part of a
200 disk file, and will not survive through a system crash. That
201 means that it is not necessary to worry about unlocking switches
202 that are locked at the time of a system crash.
204 1. LOCKING AN AOSE-STYLE SWITCH.
206 The proper routine to use for locking a switch follows:
207 The address of the two-word switch block is assumed to be in A.
209 LOCK: AOSE (A) ;LOCK THE SWITCH, OR WAIT TILL WE CAN.
211 LOCK1: MOVE B,43 ;PUT THE SWITCH ON THE
213 MOVEM B,1(A) ;LOCKED SWITCH LIST
217 This routine will set up the switch as a switch block, and make
218 43 point to it. The contents of the switch block will be:
220 0 ;This word is the switch itself!
221 SETOM <previous contents of 43>
222 ;The SETOM is the unlock instruction.
223 ;The RH has nothing to do with the SETOM;
224 ;it points to the next block of the list.
226 Note that the HRLI instruction is superfluous, because 0 in the
227 left half of the second word of the block is the same as (SETOM).
229 The three instructions starting at LOCK1 are critical because
230 the switch has been locked but is not on the locked switch list.
231 Therefore, an entry in the critical routine table of the form
236 is needed, in case the job is killed while executing there.
239 2. UNLOCKING AN AOSE-STYLE SWITCH.
241 The correct way to unlock a switch follows:
242 (assuming that A points to the switch block and that
243 the switch block is the first item on the locked switch list).
245 UNLOCK: HRRZ B,1(A) ;REMOVE THE SWITCH FROM THE
246 MOVEM B,43 ;LOCKED SWITCH LIST.
247 UNLOC1: SETOM (A) ;THEN UNLOCK THE SWITCH.
250 The instruction at UNLOC1 is critical because the switch is
251 locked but not on the locked switch list. Therefore, an entry
252 is needed in the critical routine table as follows:
257 Note that the switch must be removed from the list before
258 unlocking. That is because if the switch is locked but not on
259 the list, the critical routine table may be used to unlock it,
260 but if the switch is on the list but not locked, it will be set
261 to -1 if the job is killed, and that could cause problems if
262 some other job had locked the switch.
265 C. HANDLING SWITCHES IN DISK FILES.
267 The extra problem here is to make sure that if the system crashes, the next
268 time the data base is accessed after the system is reloaded all the
269 switches will be reinitialized. This may be done by using the time and
270 date that the system was started to distinguish different incarnations of
273 The technique uses a variable INITDN, stored in the database, and a LOCK
274 device lock named "NUTMEG". Whenever a program accesses the database for
275 the first time, it must check INITDN. If INITDN does not equal the time
276 the system was started, the database requires initialization. If a job
277 detects that the database requires initialization, it seizes the NUTMEG
278 lock, checks to see that initialization really is required, performs the
279 initialization, updates INITDN, and releases the lock.
281 The skeleton of the necessary routine is as follows, assuming that the file
282 has already been opened and its pages mapped into core with a CORBLK system
283 call, and INITDN is the address of the variable and SWIT1, SWIT2, etc. are
284 the addresses of switches.
286 INIT: .CALL [ SETZ ? SIXBIT /RQDATE/
287 MOVEM A ; Ignore 1st value
288 SETZM A] ; 2nd value is time of system startup.
290 JUMPL A,[MOVEI A,300. ; System doesn't know time,
291 .SLEEP A, ; Sleep 10. sec and hope it
292 JRST INIT] ; finds out the time.
293 CAMN A,INITDN ; Init needed?
294 POPJ P, ; No => No need for the lock, just return.
295 .CALL [ SETZ ? SIXBIT /OPEN/
296 MOVSI 10\.UAO ; 1.4 => Hang waiting for initialization lock
299 SETZ [SIXBIT /NUTMEG/]] ; Registered, database specific lock
301 CAMN A,INITDN ; Init needed?
302 JRST INIT1 ; No => someone else did it, unlock and return.
303 SETOM SWIT1 ; Start setting switches to unlocked
304 SETOM SWIT2 ; state. These insns should address
305 SETOM SWIT3 ; locations in the mapped file pages.
308 MOVEM A,INITDN ; Mark init complete.
312 Note that the first CAMN A,INITDN can be omitted, and the algorithm is
313 still correct, but the second CAMN A,INITDN can -not- be safely omitted.
318 Sometimes it is desirable to keep a count of the number of jobs
319 looking at a data base. When the count is AOS'd, an entry must
320 be put on the locked switch list to cause it to be SOS'd if the
321 job is killed. For example, assuming that A points to the count
322 and B points to an available two word block of memory:
328 MOVEM C,1(B) ;SET UP UNLOCK INSN & CDR POINTER.
329 MOVEM B,43 ;PUT THE BLOCK ON THE LIST.
332 The critical code table entry needed is:
337 When finished looking, the count must be SOS'd, and removed from
338 the list. The following routine will work, assuming only that
339 the block at the front of the list was put on by the LOOK routine
343 HRRZ A,(B) ;GET ADDRESS OF THE COUNT VARIABLE TO BE SOS'D .
344 HRRZ B,1(B) ;GET CDR POINTER.
345 UNLOO1: MOVEM B,43 ;REMOVE BLOCK FROM LIST.
346 UNLOO2: SOS (A) ;DECREMENT THE COUNT.
349 The critical code table entry needed is:
355 E. THE .HANG INSTRUCTION.
357 The .HANG UUO is to make it easy for programs to wait for various
358 conditions, such as locks becoming unlocked. It should be used the way a
359 JRST .-1 would be used in a stand-alone program. .HANG is documented
360 completely in .INFO.;ITS UUOS.
363 F. THE UNLOCK SYSTEM CALL.
365 The UNLOCK system call can be used to unlock the switches of a specified
366 job. Usually this is used by a job to unlock its own switches, but it can
367 be used on the switches of any job that the executing job is allowed to
372 .CALL [ SETZ ? SIXBIT /UNLOCK/
373 SETZI %JSELF ] ; Unlock all my switches
376 (Note that this has nothing to do with the LOCK device. In particular it
377 will -not- close LOCK device channels.)