1 ;;; Copyright (c) 1999 Massachusetts Institute of Technology
3 ;;; This program is free software; you can redistribute it and/or
4 ;;; modify it under the terms of the GNU General Public License as
5 ;;; published by the Free Software Foundation; either version 3 of the
6 ;;; License, or (at your option) any later version.
8 ;;; This program is distributed in the hope that it will be useful,
9 ;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
10 ;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
11 ;;; General Public License for more details.
13 ;;; You should have received a copy of the GNU General Public License
14 ;;; along with this program; if not, write to the Free Software
15 ;;; Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
19 Notes on TCP input and output buffers
21 Ideally the TCP segment queues should be logically separate from the IP
22 datagram queues, but for efficiency it should be possible to have them
25 Input buffers are expected to be lists of datagrams/segments.
26 This is reasonably simple, since pointers (in known places) can just be
27 set up to the data (whereever it is in the datagram/segment).
28 There is a problem with allocation/windows in this scheme, since it
29 is possible to run out of datagram buffers before running out of window,
30 if the remote site becomes cretinous about it and sends only a few
31 bytes per dgram very fast. But in that case re-transmission can just
32 force it to buffer up its output, so that future dgrams will be somewhat
33 more filled, so it is probably OK. If this does become a screw, can
34 add code to do compaction at clock level.
36 Output buffers are a little more uncertain. Could have simple
37 circular buffer, with appropriate pointers set up so that IMP output
38 message is read directly from the buffer (same as NCP). Alternatively
39 could have queue of segments/datagrams all ready to go. Guess I'd
40 like to try doing the latter, to keep things "simple" by minimizing
41 the number of kinds of things floating around.
43 Problem with putting output datagram together: can't always predict
44 ahead of time how big the leaders are going to be! Especially true
45 for IP level, which TCP is not supposed to know too much about. Thus
46 if not careful, it could happen that output is put into a segment too
47 close to the start of the buffer, so that there is not enough room for
48 the TCP and IP headers to fit in. Have to look at this; may need to
49 give up notion of keeping all the packet internals nicely contiguous.
50 Maximum IP header length is set at 60 bytes (15 words). The maximum
51 TCP header length is also 15 words.
52 Looking at the currently defined options, it seems unlikely that ITS
53 will use any of them, and if so, they can be predicted fairly easily on
54 a per-connection basis, so things should work out okay. Note that
55 the IP level can always simply fragment stuff if it really wants to be
58 It would be nice to be able to keep track of stuff which is on the
59 device (IMP) output queue but hasn't yet gone out, in order to add
60 last-minute bits (like ACK) or even some more data. Idea: output "queue"
61 is just a list of TCP connections that need attention, so can always
62 go in and mung stuff (even change mind about outputting) just by playing
63 with connection flags/lists. This is basically how NCP does it.
65 Re packet buffer design:
66 Somewhat more hypothetical is the notion of keeping a "usage
67 count" for each buffer, so that pointers from the packet table entries
68 can point to several different buffers and not just one. When a usage
69 count hits zero, put the buffer on the freelist. Actually this is not
70 really needed for the case of a table entry pointing to more than one
71 buffer, but it IS needed for the case of more than one table entry
72 pointing to the same buffer. This might happen, for example, if
73 an internet bypass were set up so that datagrams going to ourselves
74 were simply vectored directly to the input queues. But for the
75 time being, it probably isn't too outrageous to simply re-copy the
76 datagram in question. (Also better emulates a fake network device).
78 Suggest that lists point directly to themselves rather than
79 to start of entry; this allows low-level list routines to be used
80 on all lists. The higher-level routines of course have to know what
81 offsets to use for a specific list. Alternatively provide different routines
82 for each offset needed, and equate references to the appropriate routine
83 for each use. (This is what I'm trying at the moment, hence the IRPS)
85 May want to use format <head>,,<list ptr> where <head> is the addr
86 of the initial pointer (not initial node). This allows backtracking
87 to figure out what TCP connection or IP queue a packet belongs to.
88 Brute force approach is to simply have another word for the TCB index, etc.
90 If this stuff is general enough it could be used for CHAOS packets also.
95 -- Lots of this is wrong now... --
97 Buffers are all 256 (400 octal) words long, and 4 of them fit on one
98 ITS page. This size was chosen because the maximum length of an IMP
99 message (not counting IMP-Host padding) is 255 words of 4 8-bit bytes
100 per word. This allows one extra word; not much. Taking away the
101 3 IMP leader words (which has to be done anyway for NCP to continue
102 working) gives us 4 spare words per buffer.
104 This still may not be enough. Rather than shoe-horn some clever stuff
105 into each buffer header, I am opting to maintain "Packet Entries" in a
106 "Packet Table" separate from the "Packet Buffers" themselves. All
107 messages/datagrams/segments/packets are identified by a pointer
108 into the table. (Could use index, as for sockets/TCBs, but this is
109 awkward for lists). The entry identified by the pointer will contain
110 the actual pointers into the buffer associated with that packet.
111 [NOTE: may want to have these pointers point into various places, not
112 necessarily all the same buffer. Must think about this.]
114 Free buffers are linked by a freelist pointer in their first word,
115 with an identifier in the second word to help GC. The only design
116 issue for the buffers themselves is how to set things up so that it is
117 easy to GC a large freelist, i.e. identify pages that have nothing
118 but free buffers in them and thus can be removed from the system's
119 address space. Currently I have simply adopted the strategem used for
120 CHAOS buffers (at CHCLN) to have the core job snuffle over the freelist.
122 Note that the low-level routines for manipulating lists are actually
123 referenced via macros which are given an offset as argument. This
124 offset refers to the offset between the node pointer and the location
125 of the "next" pointer in the node; the macro will assemble into a
126 call to the right routine for that offset. Currently only offsets
127 up to 2 are supported; any lists that the packet is put on must
128 be threaded through the first 3 words of the node, and the macros check
132 SUBTTL Packet Tracing Code
134 IFNDEF PKTTRC,PKTTRC==:0 ; Nonzero turns tracing code on
135 IF1,IFN PKTTRC,.ERR IP packet tracing code included.
139 This code can be used to keep a history of what happens to a packet.
140 The history is stored in the packet buffer as a series of indexes
141 into a table of named events.
143 To trace a particular event use the TRCPKT macro.
147 REG is the register which currently contains an index to the packet
148 buffer table. The code is more efficient if the register is R. String
149 is the description of the event.
151 To generate the event table, you must call the TRCCOD macro somewhere
152 after the last call to TRCPKT.
156 %NTRCE==:77 ; Allow this many trace events. Value must be a bit mask (all
157 ; ones) to work right. If you make it bigger than 77
158 ; you have to change the "TRC%" in the macros to something
163 ; TRCCOD generates TRCTBL, which holds <code address,,address of event string>
164 ; for each trace event.
167 TRC%0==:<0,,[ASCIZ /Null event/]>
168 TRCTBL: REPEAT %%%TRC,CONC TRC%,\.RPCNT
169 IF2, REPEAT %%%TRC,CONC EXPUNGE TRC%,\.RPCNT
173 DEFINE TRCPKT REG,&(EVENT)
175 IFG %%%TRC-%NTRCE,.ERR Too many packet trace events!
176 CONC TRC%,\%%%TRC,==:<.,,[ASCIZ EVENT]>
194 ; Store information in the packet history data buffer
195 ; "Information" is a 6-bit quantity which indexes into a table
198 ; R/ Pkt Buffer index
200 PKTPTS: CONO PI,PIOFF ; Freeze machine
201 DPB TT,PK.HSP(R) ; Store trace info
202 IBP PK.HSP(R) ; Bump BP
203 HRRZ TT,PK.HSP(R) ; Pick up history byte ref (address only)
204 CAIL TT,PK.HSP(R) ; If we've gotten to the bottom,
205 JRST [ MOVEI TT,PK.HST(R) ;Wrap around
206 HRLI TT,(<.BP %NTRCE_30.>) ; BP to left 6 bits
207 MOVEM TT,PK.HSP(R) ;Reset it to top
213 ;;; Packet Table Entries
216 IFNDEF PKBSIZ,PKBSIZ==1000 ; 512. words per packet buffer (2048 byte MTU)
217 IFNDEF NPKPGS,NPKPGS==40. ; # pages OK to use for packet buffers
218 NPKB==:<NPKPGS*<2000/PKBSIZ>> ; # packet buffers available
219 NPKE==:NPKB ; # packet entry nodes.
220 IFN NPKB-NPKE,.ERR You must fix the UFLS at PKTGF.
223 ; General (device driver, etc)
224 PK.FLG:: 1,,PKETBL+PK.L ; General - <flags>,,<PE freelist or dev driver list>
225 %PKPIL==:<SETZ> ; Packet locked at PI level, being output
226 %PKODN==:<1000,,> ; Packet has been output (else not yet)
227 ;;; %PKNOF==:<2000,,> ; Packet should not be freed when output done.
228 %PKFLS==:<4000,,> ; Flush pkt if seen on output queue (ie dont output)
229 %PKRTR==:<10000,,> ; Packet is being re-transmitted by TCP
230 .SEE %PQFLX ; Low bits of LH used for on-list flags.
232 PK.IP:: 0 ; IP Datagram - <IP Header ptr>,,<IP Datagram list>
233 ; May be strung on Internet Queue, IP output queue
234 PK.TCP:: 0 ; TCP Segment - <TCP Header pointer>,,<TCP Segment list>
235 ; May be strung on TCB input Q, output retransmit Q
236 PK.TCI:: 0 ; TCP Segment - <# bytes data><# bytes offset><TCB index>
237 PK%TDL==:<777700,,0> ; # octets of data in TCP segment
238 PK%TDO==:< 77,,770000> ; # octets data is offset from TCP header
239 PK%TCB==:< ,,007777> ; TCB connection index
240 PK$TDL==:<.BP PK%TDL,PK.TCI>
241 PK$TDO==:<.BP PK%TDO,PK.TCI>
242 PK$TCB==:<.BP PK%TCB,PK.TCI>
243 PK.BUF:: 0 ; General - <# wds>,,<addr of buffer>
244 PK.TIM:: 0 ; General - Time sent or received, int level (Sys time)
245 PK.DST:: 0 ; Immediate destination address if on output queue
246 PK.HWI:: 0 ; Hardware Information - currently KS Unibus address
249 PK.HST:: BLOCK 3 ; Packet trace history buffer
250 PK.HSP:: 0 ; Packet trace history pointer
253 PK.L:: OFFSET 0 ; Length of a Packet-Entry (PE) node
255 IFN .RPCNT-<NPKE-2>, 1,,.+PK.L ; Build initial freelist
259 PKETBE==:.-PK.L ; Last legal PE pointer value
261 PKEQHF: PKETBE,,PKETBL ; Header for Packet-Entry node freelist
265 ; A "queue" is a list of nodes pointed to by a "queue header" word
266 ; of format <last node>,,<first node>. Each node pointer points to
267 ; the next node pointer (or zero if no more).
268 ; There is a set of flags in the LH of a certain word, at offset
269 ; PQ.FLG, that indicate which lists a node is currently on.
271 PQ.FLG==:PK.FLG ; Offset of word list-flags are in.
272 %PQFLX==0 ; Initial val
273 IFNDEF %%%QOF,%%%QOF==0 ; This gets set to highest offset supported
275 IRPS PKQGF,,[PKQGF0:PKQGF1:PKQGF2:]PKQPL,,[PKQPL0:PKQPL1:PKQPL2:]PKQPF,,[PKQPF0:PKQPF1:PKQPF2:]%PQFL,,[%PQFL0:%PQFL1:%PQFL2:]
276 IFG .IRPCNT-%%%QOF,%%%QOF==.IRPCNT
278 %PQFL==:<1_.IRPCNT,,> ; Def a flag in LH at offset PQ.FLG
279 %PQFLX==%PQFLX\%PQFL ; Mask of all list-flags def'd.
281 ; PKQGF - Get first node from queue
282 ; Q/ addr of queue header
283 ; A/ addr of node (zero if none)
285 PKQGF: CONO PI,PIOFF ; Work at all levels
286 HRRZ A,(Q) ; Get 1st from queue header word
287 JUMPE A,PIONJ ; None, so return zero.
288 MOVSI T,(%PQFL) ; Now clear appropriate flag for list
289 XORB T,PQ.FLG(A) ; to indicate it's not on it any more.
290 TLNE T,(%PQFL) ; Paranoia plus
291 BUG HALT,[PK: GF node wasnt on list]
292 HRRZ T,.IRPCNT(A) ; Get 2nd
293 HRRM T,(Q) ; Make it 1st
294 CAIN T, ; If all's well, done.
295 SETZM (Q) ; Else must clear whole header
300 ; PKQPL - Put node on queue as last thing.
301 ; Q/ addr of queue header
303 PKQPL: TRNN A,-1 ; More paranoia
304 BUG HALT,[PK: zero node ptr]
305 HLLZS .IRPCNT(A) ; Say this node is last one
306 CONO PI,PIOFF ; Work at all levels
307 MOVSI T,(%PQFL) ; Paranoia: Set appropriate flag for list
308 XORB T,PQ.FLG(A) ; to indicate it's on it now.
309 TLNN T,(%PQFL) ; plus check...
310 BUG HALT,[PK: node already on list]
311 HLRZ T,(Q) ; Get last node
312 HRLM A,(Q) ; Point to new last node
313 JUMPN T,[HRRM A,.IRPCNT(T) ; Make prev last node point to new last
314 JRST .+2] ; Skip over next instr!!
315 HRRM A,(Q) ; Queue was empty, make this the new first too
319 ; PKQPF - Put node on queue as first thing.
320 ; Q/ addr of queue header
322 PKQPF: TRNN A,-1 ; Yes more paranoia
323 BUG HALT,[PK: zero node ptr]
325 MOVSI T,(%PQFL) ; Paranoia: Set appropriate flag for list
326 XORB T,PQ.FLG(A) ; to indicate it's on it now.
327 TLNN T,(%PQFL) ; check...
328 BUG HALT,[PK: node already on list]
329 HRRZ T,(Q) ; Get first thing
330 CAIN A,(T) ; paranoia, avoid loops to self
332 HRRM T,.IRPCNT(A) ; Make it second thing
333 HRRM A,(Q) ; Make new first thing
335 HRLM A,(Q) ; Was empty, also make it last thing.
340 ; Define PKQGF, etc so that they actually reference PKQGF0, etc as
341 ; appropriate for the given offset.
342 IRP RTN,,[PKQGF,PKQPF,PKQPL]
345 IFG OFFST-%%%QOF,.ERR RTN used with bad offset
350 ; PKEGF - Get a free Packet-Entry node
351 ; Clears node contents.
353 ; Returns A/ PE ptr (0 if none)
355 PKEGF: MOVEI Q,PKEQHF
357 CALRET PKQGF ; Get a node
361 JUMPE A,CPOPJ ; No packet
365 MOVEI Q,PK.HST(A) ; Build byte ref to history trail
366 HRLI Q,(<.BP %NTRCE_30.>) ; (dpb ref)
367 MOVEM Q,PK.HSP(A) ; save ref
371 ; PKERT - Return a Packet-Entry node to freelist
375 PKERT: MOVEI Q,PKEQHF ; Use Packet-Entry freelist
376 CALRET PKQPF ; Put back on start of list.
378 ; Note that all MP calls to the routines below which allocate or free
379 ; entries/buffers must be sure not to block (page fault or UFLS)
380 ; while any "loose" entries/buffers exist (not pointed to by any list)
381 ; unless there something on the PCLSR locked-switch list which will return the
382 ; currently "loose" entry/buffer to its freelist -- otherwise
383 ; it is possible for "loose" stuff to slowly accumulate.
385 ; PKTGF - Get a free Packet-Entry node and Packet Buffer. Hangs until
386 ; it wins. Note that it depends on fact there is one PE node
387 ; for every packet buffer, and vice versa! If this becomes untrue
388 ; then the way it UFLSes should be fixed up.
389 ; PKTGFI - version that skips if wins, doesn't hang.
390 ; Returns A/ PE ptr Clobbers Q,T
392 PKTGF: SKIPN PKEQHF ; Fast check, see if any packet entries/buffers free
393 CALL UFLS ; Nope, hang until something turns up.
394 CALL PKTGFI ; Get a entry/buffer!
395 JRST PKTGF ; None? Sigh, go hang.
398 PKTGFI: CALL PKEGF ; Get a free node
400 PUSH P,A ; Save pointer to it
401 PKTGF1: CALL PKBGF ; Get a free buffer
402 JUMPN A,PKTGF8 ; Jump if found one right away!
403 CALL PKBAL ; None left on freelist, try to allocate more.
405 JRST PKTGF1 ; Won, go pluck a buffer from freelist.
407 ; Lost, can't get any more buffers.
408 POP P,A ; None available, take non-skip return
409 CALRET PKERT ; Put PE node back on its freelist.
411 ; Won, store buffer pointer in PE.
413 POP P,A ; Restore PE ptr
415 MOVE T,PQ.FLG(A) ; Paranoia dept, verify not on any lists.
417 BUG HALT,[PK: Freelist node not free!]
418 SETZM PK.FLG(A) ; Zap all other entries in packet node.
423 AOS (P) ; Win, skip on return!
426 ; PKTRT - Return both a Packet-Entry and its associated buffer to freelist
427 ; only if check shows that it doesn't belong to any lists.
428 ; PKTRTA - Always return to freelist. If check shows that it is still
429 ; on some list, bad error!
431 ; A/ PE ptr (must be off all lists)
433 PKTRTA: CAIL A,PKETBL ; Paranoia check for legal pointers
435 BUG HALT,[PK: Bad PE pointer]
437 TLNE T,(%PQFLX\%PKPIL) ; On list or locked at I/O PI level?
438 BUG HALT,[PK: Freeing packet still in use!]
439 JRST PKTRTX ; Nope, can proceed to put on freelist.
441 PKTRT: CAIL A,PKETBL ; Paranoia check for legal pointers
443 BUG HALT,[PK: Bad PE pointer]
445 TLNE T,(%PQFLX\%PKPIL) ; On a list or locked by interface PI level?
446 RET ; Yes, don't return to freelist yet.
447 PKTRTX: PUSH P,A ; Save PE ptr
448 SKIPE A,PK.BUF(A) ; Get buffer pointer associated with PE
449 CALL PKBRT ; Return the buffer
451 SETZM PK.BUF(A) ; Ensure buffer pointer zapped.
452 CALRET PKERT ; Then return the packet entry
454 ; PKTPCL - Return a packet entry/buffer while PCLSR'ing.
455 ; This is the standard LOSSET routine to use.
456 ; A must hold the PE ptr at time of the block (which we are backing
458 ; Must only clobber A and T!!
460 PKTPCL: MOVE A,AC0S+A(U) ; Get ac A at time of the block
461 PUSH P,Q ; Mustn't clobber Q
462 CALL PKTRT ; Return the entry/buffer (clobbers Q,T)
467 PKBNF: 0 ; # free Packet Buffers
468 PKBNT: 0 ; # total Packet Buffers
469 PKBCTM: 0 ; Time of last no-more-core complaint
470 PKBQHF: 0 ; Queue Header for buffer freelist
471 PKBQHC: 0 ; Queue Header for core job cleanup
474 ; PKBGF - Get a free Packet Buffer
476 ; Returns A/ PB ptr (0 if none)
478 PKBGF: MOVEI Q,PKBQHF ; Point to buffer freelist
479 CALL PKQGF ; Get first thing off it
480 JUMPE A,CPOPJ ; If got nothing, just return.
481 SETZM 1(A) ; Aha, got it! Flush free-buffer identifier.
482 SOS PKBNF ; Decrement # free packet buffers.
485 ; PKBRT - Return a Packet Buffer to freelist. Puts back at END of freelist,
486 ; as PKBCLN clean-up depends on this.
488 PKBRT: SETZM (A) ; Paranoia aid - clear "flags" in LH of 1st wd.
489 ; Otherwise PKQ routines complain.
490 MOVE T,[SIXBIT /BRUNCH/]
491 MOVEM T,1(A) ; Set up free-buffer identifier
492 AOS PKBNF ; Increment # free packet buffers.
493 MOVEI Q,PKBQHF ; Point to buffer freelist
494 CALRET PKQPL ; Put it back on, at end.
496 ; PKBRTL - Return a list of Packet Buffers to freelist
497 ; Q/ ptr to queue header of list
499 PKBRTL: CALL PKQGF ; Get first thing off list
502 CALL PKBRT ; Return it to buffer freelist
506 ; PKBAL - Allocate more Packet Buffers
508 ; Returns .+1 if lost
509 ; .+2 if won (must still call PKBGF to get a buffer from list)
512 CONI PI,Q ; Save PI channel-on status
514 CONO PI,UTCOFF ; Make the world safe for IOMQ
515 MOVE B,PKBNT ; Check total # of buffers so far
516 CAIL B,NPKB ; Make sure we're not already using max allowed
517 JRST PKBAL4 ; Ugh, already at max! Go complain.
518 PUSHJ P,IOMQ ; Get 1K of memory
519 JRST PKBAL3 ; Mem not available, fail
520 CONO PI,PICON(Q) ; Won, restore PI status
521 MOVEI B,MUPKT ; Set page type = packet
522 DPB B,[MUR,,MEMBLT(A)]
523 LSH A,10. ; Turn allocated page # into mem address
524 HRLI A,-<2000/PKBSIZ> ; Make AOBJN into page (# buffers per page)
525 PKBAL2: PUSHJ P,PKBRT ; Put them all on free list
528 MOVEI B,<2000/PKBSIZ> ; This many more buffers have been created
529 ADDM B,PKBNT ; Increase total (PKBNF bumped by PKBRT)
531 AOS (P) ; Take win return.
534 ; Here if packet stuff trying to use up too much core
535 PKBAL4: MOVE B,PKBCTM ; Don't complain too often
536 ADDI B,60.*30. ; Just once a minute
539 BUG CHECK,[PACKET NET ATTEMPTING TO USE TOO MUCH CORE]
542 PKBAL3: CONO PI,PICON(Q) ; Lost, restore PI status
544 POPJ P, ; and take error return.
547 ; PKBCLN - Called only by core job, to clean up packet buffers.
550 PKBCLN: SKIPE A,PKBNT ; See if 2/3 or more of buffers free
552 POPJ P, ; No buffers or none free, nothing to do
554 IDIV A,B ; Get ratio of total to used
555 CAIGE A,3 ; Note if B is zero, A is unchanged
556 POPJ P, ; and at least 32.
557 IFL TSYSM-256., MOVEI D,TSYSM-1 ; Scan memory for packet buffer pages
559 PKBCL0: LDB A,[MUR,,MEMBLT(D)]
561 PKBCL4: SOJGE D,PKBCL0
563 MOVE A,D ; Quickly determine if any non-free buffers
564 LSH A,10. ; on this page
565 HRLI A,-<2000/PKBSIZ>
566 MOVE T,[SIXBIT/BRUNCH/]
568 JRST PKBCL4 ; Not free, don't bother with slow stuff
571 SETZB C,PKBQHC ; Collect all free buffers on this page
572 MOVE E,PKBNF ; Loop about as many times as there are free buffers
573 PKBCL1: PUSHJ P,PKBGF ; Get next free buffer
577 JRST [ MOVEI Q,PKBQHC ; This one's on the page, save it
579 AOJA C,.+2 ] ; Count them
580 PUSHJ P,PKBRT ; Not on the page, put back. This depends on
581 ; the fact PKBRT puts back at END of list!
583 PKBCL2: CAIE C,<2000/PKBSIZ> ; Did we get the whole page?
584 JRST [ MOVEI Q,PKBQHC ; No, must punt this one, and
585 PUSHJ P,PKBRTL ; return all the buffers we saved up.
587 MOVNS C ; Yes, get rid of these buffers
588 ADDM C,PKBNT ; Decrement total # of buffers in use
590 PUSHJ P,MEMR ; Flush the page from addr space
591 JRST PKBCLN ; Back to flush more, until quota done.