[ih] Question on Flow Control

Jack Haverty jack at 3kitty.org
Mon Dec 29 16:05:32 PST 2025


A little more rathole history...

In 1977/78, I implemented TCPV2 for Unix on a PDP-11/40.  It was based 
on the TCPV2 code which Jim Mathis at SRI had already created for the 
LSI-11.   So most of the "state diagram", buffer management, and 
datagram handling were compatible with a PDP-11, with a lot of 
shoehorning to get it into the Unix environment (not easy on an 11/40).

Jim's code set the Retransmission timer at 3 seconds.  When I asked why, 
the answers revealed that no one really knew what it should be.  It also 
didn't matter much at the time, since the underlying ARPANET which 
carried most traffic delivered everything sent, in order, intact, and 
with no duplicates.  Gateways might drop datagrams, and did -- 
especially the ones interconnecting ARPANET to SATNET for 
intercontinental traffic.

SATNET involved a geosynchronous satellite, with delays of perhaps a 
good fraction of a second even under no load.  So 3 seconds seemed 
reasonable for RTO.   I left the RTO in my Unix implementation set to 3 
seconds.   We also closely monitored the "core gateways" to detect 
situations with high loss rates of datagrams; gateways had no choice but 
to discard packets when no buffers were available.  It happened a lot in 
the intercontinental path.

A lot of us TCPV2 implementers just picked 3 seconds, while waiting for 
further research to produce a better answer.  Subsequently Van Jacobson 
and others thought about it a lot and invented schemes for adjusting TCP 
behavior, documented in numerous RFCs.

...

More than a decade later, I was involved in operating a corporate 
network, using TCPV4 and 100+ Cisco routers.  We used SNMP to monitor 
the network behavior.  Since we were also responsible for many of the 
"host" computers, we monitored TCP behavior in the hosts as well, again 
using SNMP.  Not all TCP implementations implemented that capability, 
but for some we could watch retransmissions, duplicates, checksum 
errors, and collect such data from inside the hosts' TCPs.
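
As a rough illustration of the kind of host-side monitoring described 
above, the sketch below turns two polls of the MIB-II TCP counters into 
per-interval ratios.  The counter names (tcpOutSegs, tcpRetransSegs, 
tcpInSegs, tcpInErrs) are objects from the MIB-II TCP group (RFC 1213); 
the helper names and the dictionary layout are assumptions made for the 
example, not anything from the monitoring system described here.  MIB-II 
has no counter for duplicate segments, so duplicates are not modeled.

```python
# Sketch only: helper names and sample layout are hypothetical.
# Counter names are MIB-II TCP group objects (RFC 1213).

COUNTER32_MODULUS = 2 ** 32  # MIB-II Counter values wrap at 2^32


def counter_delta(old, new):
    """Difference between two Counter32 readings, allowing one wraparound."""
    return (new - old) % COUNTER32_MODULUS


def tcp_rates(prev, curr):
    """Return retransmission and input-error ratios for one polling interval.

    tcpOutSegs excludes segments that carry only retransmitted octets, so
    retrans_ratio compares retransmitted segments against other sent segments.
    """
    out_segs = counter_delta(prev["tcpOutSegs"], curr["tcpOutSegs"])
    retrans = counter_delta(prev["tcpRetransSegs"], curr["tcpRetransSegs"])
    in_segs = counter_delta(prev["tcpInSegs"], curr["tcpInSegs"])
    in_errs = counter_delta(prev["tcpInErrs"], curr["tcpInErrs"])
    return {
        "retrans_ratio": retrans / out_segs if out_segs else 0.0,
        "in_err_ratio": in_errs / in_segs if in_segs else 0.0,
    }
```

Feeding it two successive polls of the same host gives ratios that can 
be compared across different TCP implementations on the same network.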

It became obvious that there was a wide range of implementation 
decisions that the various TCP implementers had made.  At one point, 
before Microsoft embraced TCP, there were more than 30 separate TCP 
implementations available just for use in PCs.  All sorts of companies 
were also marketing workstations to attach to the proliferating Ethernets.

We had to test our own software with each of these.   They exhibited 
quite varied behavior.  Some were optimized for fastest network 
transfers -- including one that accomplished that by violating part of 
the Ethernet specifications for timing, effectively stealing service 
from others on the LAN.  Others were optimized for minimizing load on 
the PC, either CPU or memory resources or both. Some were optimized for 
simplicity -- I recall one which only accepted the "next" datagram for 
its current window, discarding anything else.  It was simple and took 
advantage of the fact that out-of-order datagrams it discarded would be 
retransmitted anyway.
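
The "next datagram only" receiver can be sketched in a few lines.  This 
is a minimal model under stated assumptions, not a reconstruction of 
that product: the class and method names are hypothetical, sequence 
numbers count bytes from zero, and window limits, sequence wraparound, 
and partially overlapping segments are all ignored.

```python
class InOrderOnlyReceiver:
    """Hypothetical receiver that keeps no out-of-order segments at all."""

    def __init__(self, initial_seq=0):
        self.next_seq = initial_seq   # next byte sequence number expected
        self.delivered = bytearray()  # data handed to the application so far
        self.discarded = 0            # segments dropped for arriving out of order

    def receive(self, seq, payload):
        """Accept a segment only if it starts at the next expected byte.

        Returns the cumulative acknowledgment number to send back.
        """
        if seq == self.next_seq:
            self.delivered += payload
            self.next_seq += len(payload)
        else:
            # Anything else is dropped; the sender is relied on to
            # retransmit it after its retransmission timer expires.
            self.discarded += 1
        return self.next_seq
```

The receiver needs no reassembly buffer, which is where the simplicity 
comes from; the cost is that every discarded segment has to cross the 
network again.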

All of these implementations "worked", in the sense that TCP traffic 
would flow.  We could observe their behavior by monitoring both the 
gateways (called routers by that time) and the TCPs in computers 
attached to our intranet.

Whether or not they were "legal" and conformed to the specifications and 
standards was unclear.   Marketing literature might say lots of things, 
but independent certification labs were scarce or non-existent.   Caveat 
emptor.

...

Fast forward to today.  My home LAN now has 50+ devices on it.   All of 
them presumably have implemented TCP.  I don't watch any of them.  I 
have no idea which algorithms, RFCs, standards, or optimizations each 
has chosen to implement.  Or if their implementation is correct.  Or 
"legal" in conforming to whatever the specifications are today.

Does anybody monitor the behavior of the Internet today at the host 
computers and their TCPs?   How does anyone know that the TCP in their 
device today is operating as expected and as the mathematical analyses 
promised?

/Jack Haverty


On 12/29/25 13:23, John Day via Internet-history wrote:
>
>> On Dec 29, 2025, at 12:57, Craig Partridge <craig at tereschau.net> wrote:
>>
>>
>>
>> On Mon, Dec 29, 2025 at 12:07 PM John Day <jeanjour at comcast.net> wrote:
>>> As for TCP initially using Selective-repeat or SACK, do you remember what the TCP retransmission time out was at that time? It makes a difference.  The nominal value in the textbooks is RTT + 4D, where D is the mean deviation. There is an RFC that says if the computed RTO is less than 1 sec, set it to 1 sec, which seems high, but that is what it says.
>>>
>>> Take care,
>>> John
>> Serious study of what the RTO should be didn't happen until the late 1980s.  Before that, it was rather ad hoc.
> I only brought up RTO because of the comment about SACK. For SACK to be useful, RTO can’t be too short. 3 seconds sounds like plenty of time.
>
>> RFC 793 says min(upper bound, max(lower bound, beta * SRTT)), where SRTT was an incremental moving average, SRTT = (alpha * SRTT) + (1-alpha)(measured RTT).  But this leaves open all sorts of questions such as: what should alpha and beta be (RFC 793 suggests alpha of .8 or so and beta of 1.3 to 2), and do you measure an RTT once per window (BSD's approach) or once per segment (I think TENEX's approach).  Not to mention the retransmission ambiguity problem, which Lixia Z. and Raj Jain discovered in 1985-6.
> Yes, this is pretty much what the textbooks say these days. Although RFC 6298 has an equation for calculating RTO, the RFC says that if the equation yields a value less than 1 sec, then set it to 1 sec. It also says that the previous value was 3 sec and there is no problem continuing to use that.  So it would seem RTO should be between 1 and 3 seconds.  This seems to be a long time.
>
>> (If you are wondering why we didn't use variance -- it required a square root which was strictly a no-no in kernels of that era;  Van J. solved part of this issue by finding a variance calculation that could be done without a square root).
> Yes, it was clear why variance wasn’t used. It required both squares and a square root. I tell students that in Operating Systems, multiplication is higher math. ;-)
>
>> This is an improvement on TCP v2 (which is silent on the topic) and IEN 15 (1976) which says use 2 * RTT estimate.
> For RTO? Yes, that would be something to start with.
>> Ethernet and ALOHA were more explicit about this process but both had far easier problems, with well bounded prop delay (and in ALOHA's case, a prop delay so long it swamped queueing times).
>>
>> Part of the reason TCP was slow to realize the issues, I think, was (1) the expectation that loss would be low (Dave Clark used to say that in the 1970s, the notion was loss was below 1%, which, in a time when windows were often 4, meant the RTO was used about 4% of the time); and (2) failure to realize congestion collapse was an issue (when loss rates soar to 80% or more and your RTO estimator really needs to be good or you make congestion worse).  It is not chance that RTO issues came to a head as the Internet was suffering congestion collapse.  I got pulled into the issues (and helped Phil Karn solve retransmission ambiguity) because I was playing with RDP, which had selective acks, and was seeing all sorts of strange holes in my windows (as out of order segments were being acked) and trying to figure out what to retransmit and when.
> It doesn’t help that the Internet adopted what is basically CUTE+AIMD.
>
> But back to the flow control issue. This is a digression on a rat hole. ;-)
>
> But also a useful discussion.  ;-)
>
> The question remains was dynamic window an enhancement of static window or were they independently developed?
>
> Take care,
> John
>> Craig
>>
>>
>> --
>> *****
>> Craig Partridge's email account for professional society activities and mailing lists.
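
The two RTO calculations discussed in the quoted thread can be put side 
by side.  This is a sketch under stated assumptions, not any particular 
stack's code: the function and class names are hypothetical, times are 
in seconds, the RFC 793 defaults (alpha 0.8, beta 2.0, bounds of 1 and 
60 seconds) are picked from the ranges and examples that RFC gives, and 
the 60-second ceiling on the RFC 6298 estimator is an assumed value. 
RTT sampling rules (such as skipping retransmitted segments) and timer 
back-off are left out.

```python
def rfc793_rto(srtt, rtt, alpha=0.8, beta=2.0, lbound=1.0, ubound=60.0):
    """One RFC 793 update: smooth the RTT, scale by beta, clamp to bounds.

    Returns the new (SRTT, RTO) pair.
    """
    srtt = alpha * srtt + (1.0 - alpha) * rtt
    rto = min(ubound, max(lbound, beta * srtt))
    return srtt, rto


class Rfc6298Estimator:
    """RTO from smoothed RTT plus four times the smoothed mean deviation.

    Uses absolute deviation rather than variance, so no square root is
    needed.
    """

    K = 4          # deviation multiplier
    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the smoothed deviation

    def __init__(self, granularity=0.0, min_rto=1.0, max_rto=60.0):
        self.srtt = None
        self.rttvar = None
        self.granularity = granularity  # clock granularity G
        self.min_rto = min_rto
        self.max_rto = max_rto          # assumed ceiling for this sketch
        self.rto = 1.0                  # initial RTO before any sample

    def update(self, rtt):
        """Fold in one RTT measurement and return the new RTO."""
        if self.srtt is None:
            # First measurement seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Deviation is updated first, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto
```

With both estimators, a path whose round-trip time is well under a 
second ends up pinned at the 1-second floor, which is the behavior 
being questioned in the thread.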

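The "about 4%" figure in the quoted text can be checked with a little 
arithmetic.  One reading, assumed here, is that the retransmission 
timer comes into play whenever at least one segment in a window is 
lost, and that losses are independent; the function name is 
hypothetical.

```python
def window_loss_probability(p_loss, window):
    """Probability that at least one of `window` segments is lost,
    assuming each segment is lost independently with probability p_loss."""
    return 1.0 - (1.0 - p_loss) ** window
```

With 1% loss and a window of 4 this gives roughly 3.9%, in line with 
the figure quoted; at 80% loss it is above 99%, so nearly every window 
depends on the timer.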