[ih] Question on Flow Control
Leonard Kleinrock
lk at cs.ucla.edu
Mon Dec 29 20:40:17 PST 2025
As this discussion group has been reaching back in time to the early RFCs, the early Host-Host protocol, the NCP, other early protocols and rathole history, and how they informed TCP and its many improvements, let’s not forget that Steve Crocker was a key contributor (e.g., RFC 1 and much more). I may have missed mention of Steve, but surely we should be including his name in our discussions about those early protocol and system developers.
Len
> On Dec 29, 2025, at 4:05 PM, Jack Haverty via Internet-history <internet-history at elists.isoc.org> wrote:
>
> A little more rathole history...
>
> In 1977/78, I implemented TCPV2 for Unix on a PDP-11/40. It was based on the TCPV2 code which Jim Mathis at SRI had already created for the LSI-11. So most of the "state diagram", buffer management, and datagram handling were compatible with a PDP-11, with a lot of shoehorning to get it into the Unix environment (not easy on an 11/40).
>
> Jim's code set the Retransmission timer at 3 seconds. When I asked why, the answers revealed that no one really knew what it should be. It also didn't matter much at the time, since the underlying ARPANET which carried most traffic delivered everything sent, in order, intact, and with no duplicates. Gateways might drop datagrams, and did -- especially the ones interconnecting ARPANET to SATNET for intercontinental traffic.
>
> SATNET involved a geosynchronous satellite, with delays of perhaps a good fraction of a second even under no load. So 3 seconds seemed reasonable for RTO. I left the RTO in my Unix implementation set to 3 seconds. We also closely monitored the "core gateways" to detect situations with high loss rates of datagrams; gateways had no choice but discarding packets when no buffers were available. It happened a lot in the intercontinental path.
>
> A lot of us TCPV2 implementers just picked 3 seconds, while waiting for further research to produce a better answer. Subsequently VanJ and others thought about it a lot and invented schemes for adjusting TCP behavior, documented in numerous RFCs.
>
> ...
>
> More than a decade later, I was involved in operating a corporate network, using TCPV4 and 100+ Cisco routers. We used SNMP to monitor the network behavior. Since we were also responsible for many of the "host" computers, we also monitored TCP behavior in the hosts, also by using SNMP. Not all TCP implementations implemented that capability, but for some we could watch retransmissions, duplicates, checksum errors, and collect such data from inside the hosts' TCPs.
>
> It became obvious that there was a wide range of implementation decisions that the various TCP implementers had made. At one point, before Microsoft embraced TCP, there were more than 30 separate TCP implementations available just for use in PCs. All sorts of companies were also marketing workstations to attach to the proliferating Ethernets.
>
> We had to test our own software with each of these. They exhibited quite varied behavior. Some were optimized for fastest network transfers -- including one that accomplished that by violating part of the Ethernet specifications for timing, effectively stealing service from others on the LAN. Others were optimized for minimizing load on the PC, either CPU or memory resources or both. Some were optimized for simplicity -- I recall one which only accepted the "next" datagram for its current window, discarding anything else. It was simple and took advantage of the fact that out-of-order datagrams it discarded would be retransmitted anyway.
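The "accept only the next segment" strategy described above can be sketched in a few lines. This is a hypothetical illustration, not code from any of the PC implementations mentioned; the names (rcv_nxt, on_segment) are invented for the sketch.

```python
class InOrderOnlyReceiver:
    """Sketch of a simple receiver that accepts only the in-order segment.

    Anything that is not exactly the next expected sequence number is
    silently dropped; the sender's retransmission timer fills the gaps.
    """

    def __init__(self, initial_seq):
        self.rcv_nxt = initial_seq    # next byte we are willing to accept
        self.delivered = bytearray()  # data handed to the application

    def on_segment(self, seq, data):
        """Accept a segment only if it starts exactly at rcv_nxt."""
        if seq == self.rcv_nxt:
            self.delivered.extend(data)
            self.rcv_nxt += len(data)
            return True   # accepted; the cumulative ACK advances
        return False      # out of order or duplicate: dropped

rx = InOrderOnlyReceiver(initial_seq=0)
rx.on_segment(0, b"abc")   # accepted
rx.on_segment(6, b"ghi")   # out of order: dropped, must be retransmitted
rx.on_segment(3, b"def")   # accepted
```

The trade-off is exactly the one described: maximum simplicity at the cost of forcing retransmission of every out-of-order segment.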
>
> All of these implementations "worked", in the sense that TCP traffic would flow. We could observe their behavior by monitoring both the gateways (called routers by that time) and the TCPs in computers attached to our intranet.
>
> Whether or not they were "legal" and conformed to the specifications and standards was unclear. Marketing literature might say lots of things, but independent certification labs were scarce or non-existent. Caveat emptor.
>
> ...
>
> Fast forward to today. My home LAN now has 50+ devices on it. All of them presumably have implemented TCP. I don't watch any of them. I have no idea which algorithms, RFCs, standards, or optimizations each has chosen to implement. Or if their implementation is correct. Or "legal" in conforming to whatever the specifications are today.
>
> Does anybody monitor the behavior of the Internet today at the host computers and their TCPs? How does anyone know that the TCP in their device today is operating as expected and as the mathematical analyses promised?
>
> /Jack Haverty
>
>
> On 12/29/25 13:23, John Day via Internet-history wrote:
>>
>>> On Dec 29, 2025, at 12:57, Craig Partridge <craig at tereschau.net> wrote:
>>>
>>>
>>>
>>> On Mon, Dec 29, 2025 at 12:07 PM John Day <jeanjour at comcast.net <mailto:jeanjour at comcast.net>> wrote:
>>>> As for TCP initially using Selective-repeat or SACK, do you remember what the TCP retransmission timeout was at that time? It makes a difference. The nominal value in the textbooks is RTT + 4D, where D is the mean deviation. There is an RFC (RFC 6298) that says if the computed RTO is less than 1 sec, set it to 1 sec, which seems high, but that is what it says.
>>>>
>>>> Take care,
>>>> John
>>> Serious study of what the RTO should be didn't happen until the late 1980s. Before that, it was rather ad hoc.
>> I only brought up RTO because of the comment about SACK. For SACK to be useful, RTO can’t be too short. 3 seconds sounds like plenty of time.
>>
>>> RFC 793 says RTO = min(upper bound, max(lower bound, beta * SRTT)), where SRTT is an exponentially weighted moving average, SRTT = (alpha * SRTT) + (1 - alpha) * (measured RTT). But this leaves open all sorts of questions, such as: what should alpha and beta be (RFC 793 suggests alpha of .8 or so and beta of 1.3 to 2), and do you measure an RTT once per window (BSD's approach) or once per segment (I think TENEX's approach)? Not to mention the retransmission ambiguity problem, which Lixia Z. and Raj Jain discovered in 1985-6.
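The RFC 793 computation just described can be sketched as follows. The gains and bounds here are the suggested values from RFC 793 (ALPHA .8 to .9, BETA 1.3 to 2.0); the helper names and the specific UBOUND/LBOUND values are illustrative, not from any historical implementation.

```python
ALPHA = 0.8    # smoothing gain (RFC 793 suggests .8 to .9)
BETA = 2.0     # delay variance factor (RFC 793 suggests 1.3 to 2.0)
UBOUND = 60.0  # upper bound on RTO, seconds (e.g., 1 minute)
LBOUND = 1.0   # lower bound on RTO, seconds

def update_srtt(srtt, measured_rtt):
    """Exponentially weighted moving average of the round-trip time."""
    return ALPHA * srtt + (1.0 - ALPHA) * measured_rtt

def rto(srtt):
    """RFC 793: RTO = min(UBOUND, max(LBOUND, BETA * SRTT))."""
    return min(UBOUND, max(LBOUND, BETA * srtt))

# Fold in a few RTT samples (once per window in BSD, per segment in TENEX).
srtt = 1.0  # arbitrary initial estimate, seconds
for sample in (0.5, 0.6, 2.5):
    srtt = update_srtt(srtt, sample)
```

Note that a single large sample (like the 2.5 s above) drags SRTT, and hence RTO, upward only slowly; with no variance term, the clamping bounds do most of the work of keeping RTO sane.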
>> Yes, this is pretty much what the textbooks say these days. Although RFC 6298 has an equation for calculating RTO, the RFC says that if the equation yields a value less than 1 sec, then set it to 1 sec. It also says that the previous value was 3 sec and there is no problem continuing to use that. So it would seem RTO should be between 1 and 3 seconds. This seems like a long time.
>>
>>> (If you are wondering why we didn't use variance -- it required a square root which was strictly a no-no in kernels of that era; Van J. solved part of this issue by finding a variance calculation that could be done without a square root).
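Van Jacobson's square-root-free estimator, later standardized in RFC 6298, tracks the smoothed mean deviation rather than the variance, so no square root (or even multiplication, with the right gains and shift arithmetic) is needed in the kernel. A minimal floating-point sketch with the RFC 6298 gains (1/8 and 1/4) and the 1-second floor mentioned above; the function names are illustrative:

```python
def update_rtt_estimators(srtt, rttvar, r):
    """Fold one RTT sample r in per RFC 6298.

    rttvar is the smoothed *mean deviation* |SRTT - R|, not a variance,
    so no square root is ever needed -- Van Jacobson's trick. RTTVAR is
    updated before SRTT, using the old SRTT, as the RFC specifies.
    """
    rttvar = 0.75 * rttvar + 0.25 * abs(srtt - r)
    srtt = 0.875 * srtt + 0.125 * r
    return srtt, rttvar

def rto(srtt, rttvar):
    """RFC 6298: RTO = SRTT + 4 * RTTVAR, floored at 1 second."""
    return max(1.0, srtt + 4.0 * rttvar)

# First sample R initializes SRTT = R and RTTVAR = R/2 (RFC 6298, 2.2);
# here the first sample was 0.5 s, then a 0.7 s sample arrives.
srtt, rttvar = 0.5, 0.25
srtt, rttvar = update_rtt_estimators(srtt, rttvar, 0.7)
```

The 4 * RTTVAR term is what makes the timer react to jitter without a variance calculation, and the gains of 1/8 and 1/4 reduce to shifts and adds in integer arithmetic, which is why it was acceptable in kernels of that era.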
>> Yes, it was clear why variance wasn’t used. It required both squares and a square root. I tell students that in Operating Systems, multiplication is higher math. ;-)
>>
>>> This is an improvement on TCP v2 (which is silent on the topic) and IEN 15 (1976) which says use 2 * RTT estimate.
>> For RTO? Yeah, that would be something to start with.
>>> Ethernet and ALOHA were more explicit about this process but both had far easier problems, with well bounded prop delay (and in ALOHA's case, a prop delay so long it swamped queueing times).
>>>
>>> Part of the reason TCP was slow to realize the issues, I think, was (1) the expectation that loss would be low (Dave Clark used to say that in the 1970s, the notion was that loss was below 1%, which, in a time when windows were often 4 segments, meant the RTO was used about 4% of the time); and (2) failure to realize congestion collapse was an issue (when loss rates soar to 80% or more, your RTO estimator really needs to be good or you make congestion worse). It is not chance that RTO issues came to a head as the Internet was suffering congestion collapse. I got pulled into the issues (and helped Phil Karn solve retransmission ambiguity) because I was playing with RDP, which had selective acks, and was seeing all sorts of strange holes in my windows (as out-of-order segments were being acked) and trying to figure out what to retransmit and when.
>> It doesn’t help that the Internet adopted what is basically CUTE+AIMD.
>>
>> But back to the flow control issue. This is a digression on a rat hole. ;-)
>>
>> But also a useful discussion. ;-)
>>
>> The question remains: was the dynamic window an enhancement of the static window, or were they independently developed?
>>
>> Take care,
>> John
>>> Craig
>>>
>>>
>>> --
>>> *****
>>> Craig Partridge's email account for professional society activities and mailing lists.
>
> --
> Internet-history mailing list
> Internet-history at elists.isoc.org
> https://elists.isoc.org/mailman/listinfo/internet-history