[ih] TCP RTT Estimator

Leonard Kleinrock lk at cs.ucla.edu
Sat Mar 22 15:27:04 PDT 2025


Hi Jack,

Thanks for your additional data on the early networks and the ongoing discussion re such topics as “TCP RTT Estimator” and network congestion.

Regarding your comment below, “So my personal conclusion has been that scientific analysis is important and useful, but has to be viewed in the context of real-world conditions.  The Internet in particular is a real-world environment that seems, to me at least, to be mathematically intractable.”  I agree with most of that, but I want to push back on the “mathematically intractable” characterization.  I think you would agree that mathematical analysis, even with its simplifying assumptions, has a valuable role to play: mathematical models, analysis, optimization, simulation, measurement, experiments, and testing should all work together, iteratively, to provide results, understanding, principles, intuition, judgment, and guidelines that help us deal with the intricacies and behavior of a system as complex as the Internet.

For example, even in those very early Arpanet days, by taking a system viewpoint we were able to anticipate and/or measure the deadlocks and degradations caused by the ad-hoc flow control measures that had been introduced with incomplete understanding of their interactions.  Another example, from recent discussions on this list, is the emergence of Buffer Bloat, which could have been anticipated had there been a proper analysis and understanding of the source of network congestion (but that’s a whole other discussion).
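
To make that concrete with a deliberately tiny illustration (the link rate and load levels below are invented, and an M/M/1 queue is of course a gross simplification of any real router), even the simplest queueing model already tells you that mean delay grows without bound as offered load approaches capacity, and that adding buffer space at a saturated bottleneck converts loss into delay rather than adding throughput.  That is the kind of qualitative warning analysis can give before any measurement is made:

    # A back-of-the-envelope M/M/1 sketch: mean time in system T = 1/(mu - lambda).
    # The service rate and load levels are illustrative, not measured values.

    def mm1_mean_delay(arrival_rate, service_rate):
        """Mean time in system (seconds) for an M/M/1 queue; unbounded at saturation."""
        if arrival_rate >= service_rate:
            return float("inf")
        return 1.0 / (service_rate - arrival_rate)

    service_rate = 100.0  # packets per second a hypothetical bottleneck can serve
    for load in (0.5, 0.8, 0.9, 0.95, 0.99):
        delay_ms = 1000.0 * mm1_mean_delay(load * service_rate, service_rate)
        print(f"offered load {load:.2f} of capacity -> mean delay {delay_ms:6.1f} ms")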

Let me offer a quote due not to Yogi Berra, but rather to Einstein: “Make everything as simple as possible, but not simpler.”
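
Since the subject line of this thread is the TCP RTT estimator itself, that may be the neatest illustration of the maxim.  As a small sketch for readers following along (the constants are just the commonly cited ones, and this is an illustration rather than a reference implementation), here are the exponentially weighted average from RFC 793 and the later Jacobson/Karels refinement that also tracks the mean deviation:

    # Sketches of the two classic retransmission-timeout estimators.
    # Constants follow the usual published values (RFC 793; Jacobson 1988 / RFC 6298).

    def rfc793_rto(samples, alpha=0.875, beta=2.0, lbound=1.0, ubound=60.0):
        """SRTT = alpha*SRTT + (1-alpha)*sample; RTO = beta*SRTT clamped to the bounds."""
        srtt = samples[0]
        for rtt in samples[1:]:
            srtt = alpha * srtt + (1.0 - alpha) * rtt
        return min(ubound, max(lbound, beta * srtt))

    def jacobson_rto(samples, gain=0.125, dev_gain=0.25, k=4.0):
        """Also track the mean deviation; RTO = SRTT + k*RTTVAR."""
        srtt, rttvar = samples[0], samples[0] / 2.0
        for rtt in samples[1:]:
            err = rtt - srtt          # error against the old smoothed estimate
            srtt += gain * err        # update smoothed RTT
            rttvar += dev_gain * (abs(err) - rttvar)  # update mean deviation
        return srtt + k * rttvar

    rtt_samples = [0.6, 0.7, 2.5, 0.8, 0.9]  # made-up RTT samples (seconds) on a noisy path
    print(f"RFC 793 style RTO:         {rfc793_rto(rtt_samples):.2f} s")
    print(f"Jacobson/Karels style RTO: {jacobson_rto(rtt_samples):.2f} s")

The motivation for the refinement was precisely the kind of real-world observation you describe: a fixed multiple of the smoothed mean badly underestimates the timeout on paths with highly variable delay, which is why tracking the deviation matters.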

Best,
Len





> On Mar 18, 2025, at 4:16 PM, Jack Haverty <jack at 3kitty.org> wrote:
> 
> Hi Len,
> 
> Thanks for the pointers.  They fill in a bit more of the History.   In particular I've seen little written about the early days of SATNET, AlohaNet, and such.  Also, in those days  ( 1970s+- ) there was no Web, no Internet, no search engines, and no easy way to access such papers except by attending the conferences.
> 
> I wasn't involved with SATNET in its early days.   It came onto my radar when Vint put "make the core gateways a 24x7 operational service" onto an ARPA contract I was managing.  I think it was fall 1978.  By that time, SATNET was running CPODA and was in "operation" mode, monitored by the BBN NOC, which also managed the ARPANET.  The technology was pretty stable by then.  MATNET had also been deployed, as a clone of SATNET, with installations at Navy sites including the USS Carl Vinson.   It was the next step in the progression from research to operational "technology transfer" into the "real world" of DoD.
> 
> From the papers you highlighted, it seems that the experiments were carried out before the CPODA introduction.  I'm a bit confused about exactly what was involved.   There was SATNET with sites in West Virginia US and Goonhilly Downs UK.   There was also an ARPANET IMP (actually UCL-TIP IIRC) linked to IMPs in the US by satellite.  I always thought those were two separate networks, but maybe somehow the ARPANET IMP-IMP "circuit" used the SATNET satellite channel?  The paper references RFNMs on SATNET.  But I don't remember if those were part of the SATNET mechanisms (CPODA?) or somehow part of the ARPANET internal mechanisms.  I don't recall ever hearing anything about RFNMs being part of SATNET's mechanisms while I was responsible for it.
> 
> In any event, I studied quite a bit of queueing theory and other branches of mathematics (e.g., statistics, operations research, etc.) while a student at MIT.   It was all very enlightening to understand how things work, and to be able to use the techniques to compare possible internal algorithms.
> 
> But I also learned that there can be large differences between theory and practice.
> 
> One example comes from a student job I had programming a PDP-8 for data collection in a lab where inertial navigation equipment was developed, used in Apollo, Minuteman, and other such systems.  I had studied lots of mathematical techniques for engineering design, e.g., use of Karnaugh Maps to minimize logic circuit components. 
> 
> My desk happened to be next to one of the career engineers' desks (an actual "rocket scientist").   So I asked him what kinds of tools he had found most useful for his work.  His answer -- none of them.  By analyzing enormous amounts of data, they had discovered that almost all failures were caused by some kind of metal-to-metal connector problem.  So their engineering principle was to minimize the number of such connections in a design.   There were no tools for that.
> 
> Another example occurred at BBN, when the ARPANET was being transformed into the Defense Data Network, to become a DoD-wide operational infrastructure.  Someone (can't remember who) had produced a scientific paper proving that the ARPANET algorithms would "lock up" and the entire network would crash.  That understandably caused significant concern in the DoD.   The DDN couldn't be allowed to crash.
> 
> After BBN investigated, we discovered that the research was true.   But there were assumptions made in order for the analysis to be tractable.  In particular, the analysis assumed that every IMP in the network ran at exactly the same speed, and was started at exactly the same time, so that all the programs were running in perfect synchrony, with instructions being executed simultaneously in every IMP.  That assumption made the analysis mathematically feasible.
> 
> Without that assumption, the analysis was still accurate, but became irrelevant.  We advised the DoD not to worry, explaining that the probability of such an occurrence was infinitesimal.  If we had to make that behavior happen, we didn't know how to do so.  They agreed.  DDN continued to be deployed.
> 
> So my personal conclusion has been that scientific analysis is important and useful, but has to be viewed in the context of real-world conditions.  The Internet in particular is a real-world environment that seems, to me at least, to be mathematically intractable.  There are many components in use, even within a single TCP connection, where some of the mechanisms (retransmissions, error detection, queue management, timing, etc.) are in the switches, some are in the hosts' implementations of TCP, and some are in the particular operating systems involved.
> 
> There is a quote, attributed to Yogi Berra, which captures the situation:
> 
> "In theory, there is no difference between theory and practice.   In practice, there is."
> 
> While I was involved in designing internals of The Internet, generally between 1972 and 1997, I don't recall much if any "analysis" of the Internet as a whole communications system, including TCP, IP, UDP, as well as mechanisms in each of the underlying network technologies.  Mostly design decisions were driven by intuition and/or experience.   Perhaps there was some comprehensive analysis, but I missed it.
> 
> Perhaps The Internet as a whole is just too complex for the existing capabilities of mathematical tools?
> 
> Jack
> 
> 
> 
> 
> 
> On 3/17/25 21:46, Leonard Kleinrock wrote:
>> Hi Jack,
>> 
>> There were some queueing theory papers in those early days that did indeed shed some light on the phenomena and performance of the Arpanet and of Satnet.  Here are a couple of references where analysis and measurement were both of value in providing understanding:
>> 
>> https://www.lk.cs.ucla.edu/data/files/Naylor/On%20Measured%20Behavior%20of%20the%20ARPA%20Network.pdf
>> 
>> and 
>> 
>> https://www.lk.cs.ucla.edu/data/files/Kleinrock/packet_satellite_multiple_access.pdf
>> 
>> and this last paper even showed the “capture” effect with the SIMPs.  In particular, one phenomenon was that if site A at one end of the Satnet was sending traffic to site B at the other end, then each message traveling from A to B forced an RFNM reply from B to A, and this prevented B from sending its own messages to A since the RFNMs hogged the B-to-A channel.  Lots more was observed, and these are just some of the performance papers that used measurement and queueing models in those early days.
>> 
>> Len
>> 
>> 
>> 
>>> On Mar 11, 2025, at 1:42 PM, Jack Haverty via Internet-history <internet-history at elists.isoc.org> wrote:
>>> 
>>> On 3/11/25 07:05, David Finnigan via Internet-history wrote:
>>>> It looks like staff at RSRE (Royal Signals and Radar Establishment) took
>>>> the lead in experimenting with formulae and methods for dynamic
>>>> estimation of round trip times in TCP. Does anyone here have any further
>>>> insight or recollection into these experiments for estimating RTT, and
>>>> the development of the RTT formula?
>>>> 
>>> 
>>> IMHO the key factor was the state of the Internet at that time (1980ish).  The ARPANET was the primary "backbone" of The Internet in what I think of as the "fuzzy peach" stage of Internet evolution.   The ARPANET was the peach, and sites on the ARPANET were adding LANs of some type and connecting them with some kind of gateway to the ARPANET IMP.
>>> 
>>> The exception to that structure was Europe, especially Peter Kirstein's group at UCL and John Laws' group at RSRE.   They were interconnected somehow in the UK, but their access to the Internet was through a connection to a SATNET node (aka SIMP) at Goonhilly Downs.
>>> 
>>> SATNET was connected to the ARPANET through one of the "core gateways" that we at BBN were responsible to run as a 24x7 operational network.
>>> 
>>> The ARPANET was a packet network, but it presented a virtual circuit service to its users.  Everything that went in one end came out the other end, in order, with nothing missing, and nothing duplicated. TCPs at a US site talking to TCPs at another US site didn't have much work to do, since everything they sent would be received intact.   So RTT values could be set very high - I recall one common choice was 3 seconds.
>>> 
>>> For the UK users however, things were quite different.  The "core gateways" at the time were very limited by their hardware configurations.  They didn't have much buffering space.   So they did drop datagrams, which of course had to be retransmitted by the host at the end of the TCP connection.  IIRC, at one point the ARPANET/SATNET gateway had exactly one datagram of buffer space.
>>> 
>>> I don't recall anyone ever saying it, but I suspect that situation caused the UCL and RSRE crews to pay a lot of attention to TCP behavior, and try to figure out how best to deal with their skinny pipe across the Atlantic.
>>> 
>>> At one point, someone (from UCL or RSRE, can't remember) reported an unexpected measurement.  They did frequent file transfers, often trying to "time" their transfers to happen at a time of day when UK and US traffic flows would be lowest.   But they observed that their transfers during "busy times" went much faster than similar transfers during "quiet times".  That made little sense of course.
>>> 
>>> After digging around with XNET, SNMP, etc., we discovered the cause.  That ARPANET/SATNET gateway had very few buffers.  The LANs at users' sites and the ARPANET path could deliver datagrams to that gateway faster than SATNET could take them.  So the buffers filled up and datagrams were discarded -- just as expected.
>>> 
>>> During "quiet times", the TCP connection would deliver datagrams to the gateway in bursts (whatever the TCPs negotiated as a Window size).   Buffers in the gateway would overflow and some of those datagrams were lost.  The sending TCP would retransmit, but only after the RTT timer expired, which was often set to 3 seconds. Result - slow FTPs.
>>> 
>>> Conversely, during "busy times", the traffic through the ARPANET would be spread out in time.   With other users' traffic flows present, chances were better that someone else's datagram would be dropped instead.  Result - faster FTP transfers.
>>> 
>>> AFAIK, none of this behavior was ever analyzed mathematically.  The mathematical model of an Internet seemed beyond the capability of queuing theory et al.  Progress was very much driven by experimentation and "let's try this" activity.
>>> 
>>> The solution, or actually workaround, was to improve the gateway's hardware.  More memory meant more buffering was available.   That principle seems to have continued even today, but has caused other problems.  Google "buffer bloat" if you're curious.
>>> 
>>> As far as I remember, there weren't any such problems reported with the various Packet Radio networks.   They tended to be used only occasionally, for tests and demos, whereas the SATNET linkage was used almost daily.
>>> 
>>> The Laws and Kirstein groups in the UK were, IMHO, the first "real" users of TCP on The Internet, exploring paths not protected by ARPANET mechanisms.
>>> 
>>> Jack Haverty
>>> 
>> 
> 

