1 ;;; Copyright (c) 1999 Massachusetts Institute of Technology
3 ;;; This program is free software; you can redistribute it and/or
4 ;;; modify it under the terms of the GNU General Public License as
5 ;;; published by the Free Software Foundation; either version 3 of the
6 ;;; License, or (at your option) any later version.
8 ;;; This program is distributed in the hope that it will be useful,
9 ;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
10 ;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
11 ;;; General Public License for more details.
13 ;;; You should have received a copy of the GNU General Public License
14 ;;; along with this program; if not, write to the Free Software
15 ;;; Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
19 Notes on TCP input and output buffers
21 Ideally the TCP segment queues should be logically separate from the IP
22 datagram queues, but for efficiency it should be possible to have them
25 Input buffers are expected to be lists of datagrams/segments.
26 This is reasonably simple, since pointers (in known places) can just be
27 set up to the data (whereever it is in the datagram/segment).
28 There is a problem with allocation/windows in this scheme, since it
29 is possible to run out of datagram buffers before running out of window,
30 if the remote site becomes cretinous about it and sends only a few
31 bytes per dgram very fast. But in that case re-transmission can just
32 force it to buffer up its output, so that future dgrams will be somewhat
33 more filled, so it is probably OK. If this does become a screw, can
34 add code to do compaction at clock level.
36 Output buffers are a little more uncertain. Could have simple
37 circular buffer, with appropriate pointers set up so that IMP output
38 message is read directly from the buffer (same as NCP). Alternatively
39 could have queue of segments/datagrams all ready to go. Guess I'd
40 like to try doing the latter, to keep things "simple" by minimizing
41 the number of kinds of things floating around.
43 Problem with putting output datagram together: can't always predict
44 ahead of time how big the leaders are going to be! Especially true
45 for IP level, which TCP is not supposed to know too much about. Thus
46 if not careful, it could happen that output is put into a segment too
47 close to the start of the buffer, so that there is not enough room for
48 the TCP and IP headers to fit in. Have to look at this; may need to
49 give up notion of keeping all the packet internals nicely contiguous.
50 Maximum IP header length is set at 60 bytes (15 words). The maximum
51 TCP header length is also 15 words.
52 Looking at the currently defined options, it seems unlikely that ITS
53 will use any of them, and if so, they can be predicted fairly easily on
54 a per-connection basis, so things should work out okay. Note that
55 the IP level can always simply fragment stuff if it really wants to be
58 It would be nice to be able to keep track of stuff which is on the
59 device (IMP) output queue but hasn't yet gone out, in order to add
60 last-minute bits (like ACK) or even some more data. Idea: output "queue"
61 is just a list of TCP connections that need attention, so can always
62 go in and mung stuff (even change mind about outputting) just by playing
63 with connection flags/lists. This is basically how NCP does it.
65 Re packet buffer design:
66 Somewhat more hypothetical is the notion of keeping a "usage
67 count" for each buffer, so that pointers from the packet table entries
68 can point to several different buffers and not just one. When a usage
69 count hits zero, put the buffer on the freelist. Actually this is not
70 really needed for the case of a table entry pointing to more than one
71 buffer, but it IS needed for the case of more than one table entry
72 pointing to the same buffer. This might happen, for example, if
73 an internet bypass were set up so that datagrams going to ourselves
74 were simply vectored directly to the input queues. But for the
75 time being, it probably isn't too outrageous to simply re-copy the
76 datagram in question. (Also better emulates a fake network device).
78 Suggest that lists point directly to themselves rather than
79 to start of entry; this allows low-level list routines to be used
80 on all lists. The higher-level routines of course have to know what
81 offsets to use for a specific list. Alternatively provide different routines
82 for each offset needed, and equate references to the appropriate routine
83 for each use. (This is what I'm trying at the moment, hence the IRPS)
85 May want to use format <head>,,<list ptr> where <head> is the addr
86 of the initial pointer (not initial node). This allows backtracking
87 to figure out what TCP connection or IP queue a packet belongs to.
88 Brute force approach is to simply have another word for the TCB index, etc.
90 If this stuff is general enough it could be used for CHAOS packets also.
95 Buffers are all 256 (400 octal) words long, and 4 of them fit on one
96 ITS page. This size was chosen because the maximum length of an IMP
97 message (not counting IMP-Host padding) is 255 words of 4 8-bit bytes
98 per word. This allows one extra word; not much. Taking away the
99 3 IMP leader words (which has to be done anyway for NCP to continue
100 working) gives us 4 spare words per buffer.
102 This still may not be enough. Rather than shoe-horn some clever stuff
103 into each buffer header, I am opting to maintain "Packet Entries" in a
104 "Packet Table" separate from the "Packet Buffers" themselves. All
105 messages/datagrams/segments/packets are identified by a pointer
106 into the table. (Could use index, as for sockets/TCBs, but this is
107 awkward for lists). The entry identified by the pointer will contain
108 the actual pointers into the buffer associated with that packet.
109 [NOTE: may want to have these pointers point into various places, not
110 necessarily all the same buffer. Must think about this.]
112 Free buffers are linked by a freelist pointer in their first word,
113 with an identifier in the second word to help GC. The only design
114 issue for the buffers themselves is how to set things up so that it is
115 easy to GC a large freelist, i.e. identify pages that have nothing
116 but free buffers in them and thus can be removed from the system's
117 address space. Currently I have simply adopted the strategem used for
118 CHAOS buffers (at CHCLN) to have the core job snuffle over the freelist.
120 Note that the low-level routines for manipulating lists are actually
121 referenced via macros which are given an offset as argument. This
122 offset refers to the offset between the node pointer and the location
123 of the "next" pointer in the node; the macro will assemble into a
124 call to the right routine for that offset. Currently only offsets
125 up to 2 are supported; any lists that the packet is put on must
126 be threaded through the first 3 words of the node, and the macros check
130 SUBTTL Packet Tracing Code
132 IFNDEF PKTTRC,PKTTRC==:0 ; Nonzero turns tracing code on
133 IF1,IFN PKTTRC,.ERR IP packet tracing code included.
137 This code can be used to keep a history of what happens to a packet.
138 The history is stored in the packet buffer as a series of indexes
139 into a table of named events.
141 To trace a particular event use the TRCPKT macro.
145 REG is the register which currently contains an index to the packet
146 buffer table. The code is more efficient if the register is R. String
147 is the description of the event.
149 To generate the event table, you must call the TRCCOD macro somewhere
150 after the last call to TRCPKT.
154 %NTRCE==:77 ; Allow this many trace events. Value must be a bit mask (all
155 ; ones) to work right. If you make it bigger than 77
156 ; you have to change the "TRC%" in the macros to something
161 ; TRCCOD generates TRCTBL, which holds <code address,,address of event string>
162 ; for each trace event.
165 TRC%0==:<0,,[ASCIZ /Null event/]>
166 TRCTBL: REPEAT %%%TRC,CONC TRC%,\.RPCNT
167 IF2, REPEAT %%%TRC,CONC EXPUNGE TRC%,\.RPCNT
171 DEFINE TRCPKT REG,&(EVENT)
173 IFG %%%TRC-%NTRCE,.ERR Too many packet trace events!
174 CONC TRC%,\%%%TRC,==:<.,,[ASCIZ EVENT]>
192 ; Store information in the packet history data buffer
193 ; "Information" is a 6-bit quantity which indexes into a table
196 ; R/ Pkt Buffer index
198 PKTPTS: CONO PI,PIOFF ; Freeze machine
199 DPB TT,PK.HSP(R) ; Store trace info
200 IBP PK.HSP(R) ; Bump BP
201 HRRZ TT,PK.HSP(R) ; Pick up history byte ref (address only)
202 CAIL TT,PK.HSP(R) ; If we've gotten to the bottom,
203 JRST [ MOVEI TT,PK.HST(R) ;Wrap around
204 HRLI TT,(<.BP %NTRCE_30.>) ; BP to left 6 bits
205 MOVEM TT,PK.HSP(R) ;Reset it to top
211 ;;; Packet Table Entries
214 IFNDEF PKBSIZ,PKBSIZ==400 ; 256. words per packet buffer
215 IFNDEF NPKPGS,NPKPGS==40. ; # pages OK to use for packet buffers
216 NPKB==:<NPKPGS*<2000/PKBSIZ>> ; # packet buffers available
217 NPKE==:NPKB ; # packet entry nodes.
218 IFN NPKB-NPKE,.ERR You must fix the UFLS at PKTGF.
221 ; General (device driver, etc)
222 PK.FLG:: 1,,PKETBL+PK.L ; General - <flags>,,<PE freelist or dev driver list>
223 %PKPIL==:<SETZ> ; Packet locked at PI level, being output
224 %PKODN==:<1000,,> ; Packet has been output (else not yet)
225 %PKNOF==:<2000,,> ; Packet should not be freed when output done.
226 %PKFLS==:<4000,,> ; Flush pkt if seen on output queue (ie dont output)
227 %PKRTR==:<10000,,> ; Packet is being re-transmitted by TCP
228 .SEE %PQFLX ; Low bits of LH used for on-list flags.
230 PK.IP:: 0 ; IP Datagram - <IP Header ptr>,,<IP Datagram list>
231 ; May be strung on Internet Queue, IP output queue
232 PK.TCP:: 0 ; TCP Segment - <TCP Header pointer>,,<TCP Segment list>
233 ; May be strung on TCB input Q, output retransmit Q
234 PK.TCI:: 0 ; TCP Segment - <# bytes data><# bytes offset><TCB index>
235 PK%TDL==:<777700,,0> ; # octets of data in TCP segment
236 PK%TDO==:< 77,,770000> ; # octets data is offset from TCP header
237 PK%TCB==:< ,,007777> ; TCB connection index
238 PK$TDL==:<.BP PK%TDL,PK.TCI>
239 PK$TDO==:<.BP PK%TDO,PK.TCI>
240 PK$TCB==:<.BP PK%TCB,PK.TCI>
241 PK.BUF:: 0 ; General - <# wds>,,<addr of buffer>
242 PK.TIM:: 0 ; General - Time sent or received, int level (Sys time)
243 PK.DST:: 0 ; Immediate destination address if on output queue
246 PK.HST:: BLOCK 3 ; Packet trace history buffer
247 PK.HSP:: 0 ; Packet trace history pointer
250 PK.L:: OFFSET 0 ; Length of a Packet-Entry (PE) node
252 IFN .RPCNT-<NPKE-2>, 1,,.+PK.L ; Build initial freelist
256 PKETBE==:.-PK.L ; Last legal PE pointer value
258 PKEQHF: PKETBE,,PKETBL ; Header for Packet-Entry node freelist
262 ; A "queue" is a list of nodes pointed to by a "queue header" word
263 ; of format <last node>,,<first node>. Each node pointer points to
264 ; the next node pointer (or zero if no more).
265 ; There is a set of flags in the LH of a certain word, at offset
266 ; PQ.FLG, that indicate which lists a node is currently on.
268 PQ.FLG==:PK.FLG ; Offset of word list-flags are in.
269 %PQFLX==0 ; Initial val
270 IFNDEF %%%QOF,%%%QOF==0 ; This gets set to highest offset supported
272 IRPS PKQGF,,[PKQGF0:PKQGF1:PKQGF2:]PKQPL,,[PKQPL0:PKQPL1:PKQPL2:]PKQPF,,[PKQPF0:PKQPF1:PKQPF2:]%PQFL,,[%PQFL0:%PQFL1:%PQFL2:]
273 IFG .IRPCNT-%%%QOF,%%%QOF==.IRPCNT
275 %PQFL==:<1_.IRPCNT,,> ; Def a flag in LH at offset PQ.FLG
276 %PQFLX==%PQFLX\%PQFL ; Mask of all list-flags def'd.
278 ; PKQGF - Get first node from queue
279 ; Q/ addr of queue header
280 ; A/ addr of node (zero if none)
282 PKQGF: CONO PI,PIOFF ; Work at all levels
283 HRRZ A,(Q) ; Get 1st from queue header word
284 JUMPE A,PIONJ ; None, so return zero.
285 MOVSI T,(%PQFL) ; Now clear appropriate flag for list
286 XORB T,PQ.FLG(A) ; to indicate it's not on it any more.
287 TLNE T,(%PQFL) ; Paranoia plus
288 BUG HALT,[PK: GF node wasnt on list]
289 HRRZ T,.IRPCNT(A) ; Get 2nd
290 HRRM T,(Q) ; Make it 1st
291 CAIN T, ; If all's well, done.
292 SETZM (Q) ; Else must clear whole header
297 ; PKQPL - Put node on queue as last thing.
298 ; Q/ addr of queue header
300 PKQPL: TRNN A,-1 ; More paranoia
301 BUG HALT,[PK: zero node ptr]
302 HLLZS .IRPCNT(A) ; Say this node is last one
303 CONO PI,PIOFF ; Work at all levels
304 MOVSI T,(%PQFL) ; Paranoia: Set appropriate flag for list
305 XORB T,PQ.FLG(A) ; to indicate it's on it now.
306 TLNN T,(%PQFL) ; plus check...
307 BUG HALT,[PK: node already on list]
308 HLRZ T,(Q) ; Get last node
309 HRLM A,(Q) ; Point to new last node
310 JUMPN T,[HRRM A,.IRPCNT(T) ; Make prev last node point to new last
311 JRST .+2] ; Skip over next instr!!
312 HRRM A,(Q) ; Queue was empty, make this the new first too
316 ; PKQPF - Put node on queue as first thing.
317 ; Q/ addr of queue header
319 PKQPF: TRNN A,-1 ; Yes more paranoia
320 BUG HALT,[PK: zero node ptr]
322 MOVSI T,(%PQFL) ; Paranoia: Set appropriate flag for list
323 XORB T,PQ.FLG(A) ; to indicate it's on it now.
324 TLNN T,(%PQFL) ; check...
325 BUG HALT,[PK: node already on list]
326 HRRZ T,(Q) ; Get first thing
327 CAIN A,(T) ; paranoia, avoid loops to self
329 HRRM T,.IRPCNT(A) ; Make it second thing
330 HRRM A,(Q) ; Make new first thing
332 HRLM A,(Q) ; Was empty, also make it last thing.
337 ; Define PKQGF, etc so that they actually reference PKQGF0, etc as
338 ; appropriate for the given offset.
339 IRP RTN,,[PKQGF,PKQPF,PKQPL]
342 IFG OFFST-%%%QOF,.ERR RTN used with bad offset
347 ; PKEGF - Get a free Packet-Entry node
348 ; Clears node contents.
350 ; Returns A/ PE ptr (0 if none)
352 PKEGF: MOVEI Q,PKEQHF
354 CALRET PKQGF ; Get a node
358 JUMPE A,CPOPJ ; No packet
362 MOVEI Q,PK.HST(A) ; Build byte ref to history trail
363 HRLI Q,(<.BP %NTRCE_30.>) ; (dpb ref)
364 MOVEM Q,PK.HSP(A) ; save ref
368 ; PKERT - Return a Packet-Entry node to freelist
372 PKERT: MOVEI Q,PKEQHF ; Use Packet-Entry freelist
373 CALRET PKQPF ; Put back on start of list.
375 ; Note that all MP calls to the routines below which allocate or free
376 ; entries/buffers must be sure not to block (page fault or UFLS)
377 ; while any "loose" entries/buffers exist (not pointed to by any list)
378 ; unless there something on the PCLSR locked-switch list which will return the
379 ; currently "loose" entry/buffer to its freelist -- otherwise
380 ; it is possible for "loose" stuff to slowly accumulate.
382 ; PKTGF - Get a free Packet-Entry node and Packet Buffer. Hangs until
383 ; it wins. Note that it depends on fact there is one PE node
384 ; for every packet buffer, and vice versa! If this becomes untrue
385 ; then the way it UFLSes should be fixed up.
386 ; PKTGFI - version that skips if wins, doesn't hang.
387 ; Returns A/ PE ptr Clobbers Q,T
389 PKTGF: SKIPN PKEQHF ; Fast check, see if any packet entries/buffers free
390 CALL UFLS ; Nope, hang until something turns up.
391 CALL PKTGFI ; Get a entry/buffer!
392 JRST PKTGF ; None? Sigh, go hang.
395 PKTGFI: CALL PKEGF ; Get a free node
397 PUSH P,A ; Save pointer to it
398 PKTGF1: CALL PKBGF ; Get a free buffer
399 JUMPN A,PKTGF8 ; Jump if found one right away!
400 CALL PKBAL ; None left on freelist, try to allocate more.
402 JRST PKTGF1 ; Won, go pluck a buffer from freelist.
404 ; Lost, can't get any more buffers.
405 POP P,A ; None available, take non-skip return
406 CALRET PKERT ; Put PE node back on its freelist.
408 ; Won, store buffer pointer in PE.
410 POP P,A ; Restore PE ptr
412 MOVE T,PQ.FLG(A) ; Paranoia dept, verify not on any lists.
414 BUG HALT,[PK: Freelist node not free!]
415 SETZM PK.FLG(A) ; Zap all other entries in packet node.
420 AOS (P) ; Win, skip on return!
423 ; PKTRT - Return both a Packet-Entry and its associated buffer to freelist
424 ; only if check shows that it doesn't belong to any lists.
425 ; PKTRTA - Always return to freelist. If check shows that it is still
426 ; on some list, bad error!
428 ; A/ PE ptr (must be off all lists)
430 PKTRTA: CAIL A,PKETBL ; Paranoia check for legal pointers
432 BUG HALT,[PK: Bad PE pointer]
433 MOVE T,PQ.FLG(A) ; Get list flags
434 TLNE T,(%PQFLX) ; Any still on?
435 BUG HALT,[PK: Freeing packet still in use!]
436 JRST PKTRTX ; Nope, can proceed to put on freelist.
438 PKTRT: CAIL A,PKETBL ; Paranoia check for legal pointers
440 BUG HALT,[PK: Bad PE pointer]
442 TLNE T,(%PQFLX) ; Any list flags on?
443 RET ; Yes, don't return to freelist yet.
444 PKTRTX: PUSH P,A ; Save PE ptr
445 SKIPE A,PK.BUF(A) ; Get buffer pointer associated with PE
446 CALL PKBRT ; Return the buffer
448 SETZM PK.BUF(A) ; Ensure buffer pointer zapped.
449 CALRET PKERT ; Then return the packet entry
451 ; PKTPCL - Return a packet entry/buffer while PCLSR'ing.
452 ; This is the standard LOSSET routine to use.
453 ; A must hold the PE ptr at time of the block (which we are backing
455 ; Must only clobber A and T!!
457 PKTPCL: MOVE A,AC0S+A(U) ; Get ac A at time of the block
458 PUSH P,Q ; Mustn't clobber Q
459 CALL PKTRT ; Return the entry/buffer (clobbers Q,T)
464 PKBNF: 0 ; # free Packet Buffers
465 PKBNT: 0 ; # total Packet Buffers
466 PKBCTM: 0 ; Time of last no-more-core complaint
467 PKBQHF: 0 ; Queue Header for buffer freelist
468 PKBQHC: 0 ; Queue Header for core job cleanup
471 ; PKBGF - Get a free Packet Buffer
473 ; Returns A/ PB ptr (0 if none)
475 PKBGF: MOVEI Q,PKBQHF ; Point to buffer freelist
476 CALL PKQGF ; Get first thing off it
477 JUMPE A,CPOPJ ; If got nothing, just return.
478 SETZM 1(A) ; Aha, got it! Flush free-buffer identifier.
479 SOS PKBNF ; Decrement # free packet buffers.
482 ; PKBRT - Return a Packet Buffer to freelist. Puts back at END of freelist,
483 ; as PKBCLN clean-up depends on this.
485 PKBRT: SETZM (A) ; Paranoia aid - clear "flags" in LH of 1st wd.
486 ; Otherwise PKQ routines complain.
487 MOVE T,[SIXBIT /BRUNCH/]
488 MOVEM T,1(A) ; Set up free-buffer identifier
489 AOS PKBNF ; Increment # free packet buffers.
490 MOVEI Q,PKBQHF ; Point to buffer freelist
491 CALRET PKQPL ; Put it back on, at end.
493 ; PKBRTL - Return a list of Packet Buffers to freelist
494 ; Q/ ptr to queue header of list
496 PKBRTL: CALL PKQGF ; Get first thing off list
499 CALL PKBRT ; Return it to buffer freelist
503 ; PKBAL - Allocate more Packet Buffers
505 ; Returns .+1 if lost
506 ; .+2 if won (must still call PKBGF to get a buffer from list)
509 CONI PI,Q ; Save PI channel-on status
511 CONO PI,UTCOFF ; Make the world safe for IOMQ
512 MOVE B,PKBNT ; Check total # of buffers so far
513 CAIL B,NPKB ; Make sure we're not already using max allowed
514 JRST PKBAL4 ; Ugh, already at max! Go complain.
515 PUSHJ P,IOMQ ; Get 1K of memory
516 JRST PKBAL3 ; Mem not available, fail
517 CONO PI,PICON(Q) ; Won, restore PI status
518 MOVEI B,MUPKT ; Set page type = packet
519 DPB B,[MUR,,MEMBLT(A)]
520 LSH A,10. ; Turn allocated page # into mem address
521 HRLI A,-<2000/PKBSIZ> ; Make AOBJN into page (# buffers per page)
522 PKBAL2: PUSHJ P,PKBRT ; Put them all on free list
525 MOVEI B,<2000/PKBSIZ> ; This many more buffers have been created
526 ADDM B,PKBNT ; Increase total (PKBNF bumped by PKBRT)
528 AOS (P) ; Take win return.
531 ; Here if packet stuff trying to use up too much core
532 PKBAL4: MOVE B,PKBCTM ; Don't complain too often
533 ADDI B,60.*30. ; Just once a minute
536 BUG CHECK,[PACKET NET ATTEMPTING TO USE TOO MUCH CORE]
539 PKBAL3: CONO PI,PICON(Q) ; Lost, restore PI status
541 POPJ P, ; and take error return.
544 ; PKBCLN - Called only by core job, to clean up packet buffers.
547 PKBCLN: SKIPE A,PKBNT ; See if 2/3 or more of buffers free
549 POPJ P, ; No buffers or none free, nothing to do
551 IDIV A,B ; Get ratio of total to used
552 CAIGE A,3 ; Note if B is zero, A is unchanged
553 POPJ P, ; and at least 32.
554 IFL TSYSM-256., MOVEI D,TSYSM-1 ; Scan memory for packet buffer pages
556 PKBCL0: LDB A,[MUR,,MEMBLT(D)]
558 PKBCL4: SOJGE D,PKBCL0
560 MOVE A,D ; Quickly determine if any non-free buffers
561 LSH A,10. ; on this page
562 HRLI A,-<2000/PKBSIZ>
563 MOVE T,[SIXBIT/BRUNCH/]
565 JRST PKBCL4 ; Not free, don't bother with slow stuff
568 SETZB C,PKBQHC ; Collect all free buffers on this page
569 MOVE E,PKBNF ; Loop about as many times as there are free buffers
570 PKBCL1: PUSHJ P,PKBGF ; Get next free buffer
574 JRST [ MOVEI Q,PKBQHC ; This one's on the page, save it
576 AOJA C,.+2 ] ; Count them
577 PUSHJ P,PKBRT ; Not on the page, put back. This depends on
578 ; the fact PKBRT puts back at END of list!
580 PKBCL2: CAIE C,<2000/PKBSIZ> ; Did we get the whole page?
581 JRST [ MOVEI Q,PKBQHC ; No, must punt this one, and
582 PUSHJ P,PKBRTL ; return all the buffers we saved up.
584 MOVNS C ; Yes, get rid of these buffers
585 ADDM C,PKBNT ; Decrement total # of buffers in use
587 PUSHJ P,MEMR ; Flush the page from addr space
588 JRST PKBCLN ; Back to flush more, until quota done.