[ih] Question on Flow Control

Vint Cerf vint at google.com
Wed Dec 31 08:18:54 PST 2025


That's a very crisp summary, Steve. Thanks!

V

Please send any postal/overnight deliveries to:
Vint Cerf
Google, LLC
1900 Reston Metro Plaza, 16th Floor
Reston, VA 20190
+1 (571) 213 1346

until further notice

On Wed, Dec 31, 2025, 11:01 Steve Crocker via Internet-history <
internet-history at elists.isoc.org> wrote:

> Len,
>
> Thanks for mentioning me.  In the design of the Arpanet protocols flow
> control was indeed a major concern.  However, there were some key
> differences between designing flow control for the Arpanet and flow control
> for the Internet.
>
> The initial version of the Arpanet was designed, implemented and deployed
> with the conviction that no messages would ever be lost.  Hence there was
> no reason to include retransmission in the Host-Host protocol.  (For those
> not familiar with the original nomenclature, I used Host-Host protocol as
> the name of the abstract bitstream.  Telnet and FTP were built on top of
> it.   I used the term Network Control Program to refer to the software that
> had to be added to each time-shared computer's operating system to support
> interactions between user processes and the IMP.  Over time, the abbreviation
> "NCP" became repurposed to mean Network Control Protocol as the name of the
> Host-Host protocol.)
>
> Even though we didn't expect the Arpanet to drop messages, we anticipated
> there might be congestion in the receiving host, and thus we needed a way
> for the receiving host to have some control over the quantity or rate of
> data the sending host could transmit.  The resulting design, in which the
> receiving host granted allocations of both messages and bits, reflected a
> best guess.  We left it to the
> implementers, operators and future researchers to work out quantitative
> details.  (N.B. I said "bits."  Eight bit bytes were not yet the universal
> quantity of exchange.  This changed by the time the Internet protocols were
> being designed.)
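The receiver-granted allocation Steve describes can be sketched as simple bookkeeping on the sending side: the receiver grants counts of both messages and bits, and the sender spends them before transmitting. This is a hedged illustration of the idea only; the class and method names are invented here and are not the actual NCP allocate-message formats.

```python
# Sketch of receiver-driven flow control in the spirit of the early
# Host-Host protocol: the receiver grants allocations of both messages
# and bits; the sender may transmit only while both allocations hold.
# All names here are illustrative, not the real NCP wire format.

class Allocation:
    def __init__(self):
        self.messages = 0   # messages the sender may still send
        self.bits = 0       # payload bits those messages may carry

    def grant(self, messages, bits):
        """Receiver side: extend the sender's allocation."""
        self.messages += messages
        self.bits += bits

    def try_send(self, payload_bits):
        """Sender side: a message goes out only if both the message
        count and the bit count are covered by the allocation."""
        if self.messages >= 1 and self.bits >= payload_bits:
            self.messages -= 1
            self.bits -= payload_bits
            return True
        return False   # wait until the receiver grants more

alloc = Allocation()
assert not alloc.try_send(800)   # nothing granted yet: sender must wait
alloc.grant(messages=4, bits=8000)
assert alloc.try_send(800)       # within both limits
assert not alloc.try_send(7300)  # bit allocation runs out before messages do
```

Note that either allocation can be the binding one: a sender with messages remaining may still be blocked on bits, which is exactly the "best guess" tuning problem Steve mentions leaving to implementers and operators.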
>
> Thus, when the Internet protocols were being designed there were two
> significant differences.  First, it was clear there had to be a way to
> retransmit messages that had been lost.  Second, the community had gained
> some experience with the performance of the protocol.  And, of course, with
> the Arpanet in operation, it was possible to try out different designs.
> Retransmission strategies added a lot of complexity to the design problem.
> But even the "simple" problem of controlling congestion without considering
> lost messages was surprisingly complex.  In the early days, memory was very
> limited.  When memory became plentiful, allocating too much space brought
> forth the phenomenon of bufferbloat.
>
> Returning to the relationship between the early work on flow control in the
> Arpanet NCP and the later work on flow control and retransmission in the
> Internet, I'd say the main contribution from the Arpanet initial period was
> the identification of the need for flow control and an initial design that
> provided a basis for measurement and experimentation.
>
> Steve
>
>
>
>
> On Mon, Dec 29, 2025 at 11:40 PM Leonard Kleinrock via Internet-history <
> internet-history at elists.isoc.org> wrote:
>
> > As this discussion group has been reaching back in time to the early
> > RFC’s, the early Host-Host protocol, the NCP, other early protocols and
> > rathole history, and how they informed TCP and its many improvements,
> > let’s not forget that Steve Crocker was a key contributor (e.g., RFC 1
> > and much more).  I may have missed mention of Steve, but surely we
> > should be including his name in our discussions about those early
> > protocol and system developers.
> >
> > Len
> >
> >
> >
> > > On Dec 29, 2025, at 4:05 PM, Jack Haverty via Internet-history <
> > internet-history at elists.isoc.org> wrote:
> > >
> > > A little more rathole history...
> > >
> > > In 1977/78, I implemented TCPV2 for Unix on a PDP-11/40.  It was based
> > on the TCPV2 code which Jim Mathis at SRI had already created for the
> > LSI-11.   So most of the "state diagram", buffer management, and datagram
> > handling were compatible with a PDP-11, with a lot of shoehorning to get
> > it into the Unix environment (not easy on an 11/40).
> > >
> > > Jim's code set the Retransmission timer at 3 seconds.  When I asked
> > why, the answers revealed that no one really knew what it should be.  It
> > also didn't matter much at the time, since the underlying ARPANET which
> > carried most traffic delivered everything sent, in order, intact, and with
> > no duplicates.  Gateways might drop datagrams, and did -- especially the
> > ones interconnecting ARPANET to SATNET for intercontinental traffic.
> > >
> > > SATNET involved a geosynchronous satellite, with delays of perhaps a
> > good fraction of a second even under no load.  So 3 seconds seemed
> > reasonable for RTO.   I left the RTO in my Unix implementation set to 3
> > seconds.   We also closely monitored the "core gateways" to detect
> > situations with high loss rates of datagrams; gateways had no choice but
> > to discard packets when no buffers were available.  It happened a lot on
> > the intercontinental path.
> > >
> > > A lot of us TCPV2 implementers just picked 3 seconds, while waiting for
> > further research to produce a better answer.  Subsequently VanJ and
> > others thought about it a lot and invented schemes for adjusting TCP
> > behavior, documented in numerous RFCs.
> > >
> > > ...
> > >
> > > More than a decade later, I was involved in operating a corporate
> > network, using TCPV4 and 100+ Cisco routers.  We used SNMP to monitor the
> > network behavior.  Since we were also responsible for many of the "host"
> > computers, we also monitored TCP behavior in the hosts, also by using
> > SNMP.  Not all TCP implementations implemented that capability, but for
> > some we could watch retransmissions, duplicates, checksum errors, and
> > collect such data from inside the hosts' TCPs.
> > >
> > > It became obvious that there was a wide range of implementation
> > decisions that the various TCP implementers had made.  At one point,
> > before Microsoft embraced TCP, there were more than 30 separate TCP
> > implementations available just for use in PCs.  All sorts of companies
> > were also marketing workstations to attach to the proliferating Ethernets.
> > >
> > > We had to test our own software with each of these.   They exhibited
> > quite varied behavior.  Some were optimized for fastest network transfers
> > -- including one that accomplished that by violating part of the Ethernet
> > specifications for timing, effectively stealing service from others on
> > the LAN.  Others were optimized for minimizing load on the PC, either CPU or
> > memory resources or both. Some were optimized for simplicity -- I recall
> > one which only accepted the "next" datagram for its current window,
> > discarding anything else.  It was simple and took advantage of the fact
> > that out-of-order datagrams it discarded would be retransmitted anyway.
> > >
> > > All of these implementations "worked", in the sense that TCP traffic
> > would flow.  We could observe their behavior by monitoring both the
> > gateways (called routers by that time) and the TCPs in computers attached
> > to our intranet.
> > >
> > > Whether or not they were "legal" and conformed to the specifications
> > and standards was unclear.   Marketing literature might say lots of
> > things, but independent certification labs were scarce or non-existent.
> > Caveat emptor.
> > >
> > > ...
> > >
> > > Fast forward to today.  My home LAN now has 50+ devices on it.   All of
> > them presumably have implemented TCP.  I don't watch any of them.  I have
> > no idea which algorithms, RFCs, standards, or optimizations each has
> > chosen to implement.  Or if their implementation is correct.  Or "legal"
> > in conforming to whatever the specifications are today.
> > >
> > > Does anybody monitor the behavior of the Internet today at the host
> > computers and their TCPs?   How does anyone know that the TCP in their
> > device today is operating as expected and as the mathematical analyses
> > promised?
> > >
> > > /Jack Haverty
> > >
> > >
> > > On 12/29/25 13:23, John Day via Internet-history wrote:
> > >>
> > >>> On Dec 29, 2025, at 12:57, Craig Partridge <craig at tereschau.net>
> > wrote:
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Dec 29, 2025 at 12:07 PM John Day <jeanjour at comcast.net
> > <mailto:jeanjour at comcast.net>> wrote:
> > >>>> As for TCP initially using Selective-repeat or SACK, do you remember
> > what the TCP retransmission timeout was at that time? It makes a
> > difference.  The nominal value in the textbooks is RTT + 4D, where D is
> > the mean deviation.  There is an RFC that says if the result is < 1 sec,
> > set it to 1 sec, which seems high, but that is what it says.
> > >>>>
> > >>>> Take care,
> > >>>> John
> > >>> Serious study of what the RTO should be didn't happen until the late
> > 1980s.  Before that, it was rather ad hoc.
> > >> I only brought up RTO because of the comment about SACK. For SACK to
> > be useful, RTO can’t be too short. 3 seconds sounds like plenty of time.
> > >>
> > >>> RFC 793 says min(upper bound, max(lower bound, beta * SRTT)), where
> > SRTT was an incremental moving average, SRTT = (alpha * SRTT) +
> > (1 - alpha)(measured RTT).  But this leaves open all sorts of questions
> > such as: what should alpha and beta be (RFC 793 suggests alpha of .8 or
> > so and beta of 1.3 to 2), and do you measure an RTT once per window
> > (BSD's approach) or once per segment (I think TENEX's approach)?  Not to
> > mention the retransmission ambiguity problem, which Lixia Z. and Raj Jain
> > discovered in 1985-6.
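The RFC 793 computation Craig describes is compact enough to sketch directly. This is an illustration only; the constants below are the RFC's own suggested ranges and example bounds (alpha of .8 to .9, beta of 1.3 to 2.0, an upper bound of e.g. 1 minute and a lower bound of e.g. 1 second), and the function name is invented here.

```python
# Sketch of the RFC 793 retransmission timeout: an exponentially
# smoothed RTT estimate with a clamped multiple.  Constants follow
# the RFC's suggestions; this is illustrative, not kernel code.

ALPHA, BETA = 0.8, 1.5     # smoothing gain and safety multiplier
UBOUND, LBOUND = 60.0, 1.0  # RFC 793's example bounds, in seconds

def update_rto(srtt, measured_rtt):
    """One RFC 793 update step; returns (new SRTT, RTO)."""
    srtt = ALPHA * srtt + (1 - ALPHA) * measured_rtt
    rto = min(UBOUND, max(LBOUND, BETA * srtt))
    return srtt, rto

srtt = 2.0                        # some initial estimate, seconds
srtt, rto = update_rto(srtt, 0.5)
# srtt = 0.8*2.0 + 0.2*0.5 = 1.7; rto = clamp(1.5 * 1.7) = 2.55
```

The open questions in the text live entirely in this sketch: the values of ALPHA and BETA, and which RTT samples get fed into `update_rto` (once per window versus once per segment).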
> > >> Yes, this is pretty much what the textbooks say these days. Although
> > RFC 6298 has an equation for calculating RTO, the RFC says that if the
> > equation yields a value less than 1 sec, then set it to 1 sec. It also
> > says that the previous value was 3 sec and there is no problem continuing
> > to use that.  So it would seem RTO should be between 1 and 3 seconds.
> > This seems to be a long time.
> > >>
> > >>> (If you are wondering why we didn't use variance -- it required a
> > square root, which was strictly a no-no in kernels of that era;  Van J.
> > solved part of this issue by finding a variance calculation that could
> > be done without a square root).
> > >> Yes, it was clear why variance wasn’t used. It required both squares
> > and a square root. I tell students that in Operating Systems,
> > multiplication is higher math. ;-)
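The square-root-free calculation mentioned above tracks a smoothed mean deviation rather than a variance, which is what RFC 6298 standardizes as RTTVAR, with RTO = SRTT + 4 * RTTVAR and the 1-second floor discussed earlier. A minimal floating-point sketch under those assumptions (function names are invented here; real kernels use scaled integer arithmetic with shifts):

```python
# Sketch of the RFC 6298 estimator: a smoothed RTT (SRTT) plus a
# smoothed mean deviation (RTTVAR) -- no squares, no square roots.
# Floating point for clarity; this is illustrative, not kernel code.

def make_estimator():
    return {"srtt": None, "rttvar": None, "rto": 1.0}  # 1 sec before any sample

def on_rtt_sample(state, r, alpha=1/8, beta=1/4, min_rto=1.0, max_rto=60.0):
    """Fold a new round-trip sample r (seconds) into the estimate."""
    if state["srtt"] is None:            # first measurement
        state["srtt"] = r
        state["rttvar"] = r / 2
    else:                                # subsequent measurements
        err = abs(state["srtt"] - r)     # mean deviation, not variance
        state["rttvar"] = (1 - beta) * state["rttvar"] + beta * err
        state["srtt"] = (1 - alpha) * state["srtt"] + alpha * r
    rto = state["srtt"] + 4 * state["rttvar"]
    state["rto"] = min(max_rto, max(min_rto, rto))  # "if < 1 sec, use 1 sec"
    return state["rto"]

state = make_estimator()
on_rtt_sample(state, 0.100)  # SRTT = 0.100, RTTVAR = 0.050; RTO floors at 1.0
```

Note how the 1-second floor dominates on a fast path: with a 100 ms RTT the raw estimate is 0.3 sec, yet the returned RTO is still 1 second, which is the "seems like a long time" point made above.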
> > >>
> > >>> This is an improvement on TCP v2 (which is silent on the topic) and
> > IEN 15 (1976), which says to use 2 * the RTT estimate.
> > >> For RTO? Yeah, that would be something to start with.
> > >>> Ethernet and ALOHA were more explicit about this process but both had
> > far easier problems, with well bounded prop delay (and in ALOHA's case, a
> > prop delay so long it swamped queueing times).
> > >>>
> > >>> Part of the reason TCP was slow to realize the issues, I think, was
> > (1) the expectation that loss would be low (Dave Clark used to say that
> > in the 1970s, the notion was that loss was below 1%, which, in a time
> > when windows were often 4, meant the RTO was used about 4% of the time);
> > and (2) the failure to realize congestion collapse was an issue (when
> > loss rates soar to 80% or more, your RTO estimator really needs to be
> > good or you make congestion worse).  It is not chance that RTO issues
> > came to a head as the Internet was suffering congestion collapse.  I got
> > pulled into the issues (and helped Phil Karn solve retransmission
> > ambiguity) because I was playing with RDP, which had selective acks, and
> > was seeing all sorts of strange holes in my windows (as out-of-order
> > segments were being acked) and trying to figure out what to retransmit
> > and when.
> > >> It doesn’t help that the Internet adopted what is basically CUTE+AIMD.
> > >>
> > >> But back to the flow control issue. This is a digression on a rat
> > hole.  ;-)
> > >>
> > >> But also a useful discussion.  ;-)
> > >>
> > >> The question remains: was the dynamic window an enhancement of the
> > static window, or were they independently developed?
> > >>
> > >> Take care,
> > >> John
> > >>> Craig
> > >>>
> > >>>
> > >>> --
> > >>> *****
> > >>> Craig Partridge's email account for professional society activities
> > and mailing lists.
> > >
> > > --
> > > Internet-history mailing list
> > > Internet-history at elists.isoc.org
> > > https://elists.isoc.org/mailman/listinfo/internet-history
> > > -
> > > Unsubscribe:
> >
> https://app.smartsheet.com/b/form/9b6ef0621638436ab0a9b23cb0668b0b?The%20list%20to%20be%20unsubscribed%20from=Internet-history
> >
> >
>
>
>


More information about the Internet-history mailing list