[ih] TCP RTT Estimator
Leonard Kleinrock
lk at cs.ucla.edu
Mon Mar 17 21:46:08 PDT 2025
Hi Jack,
There were some queueing theory papers in those early days that did indeed shed some light on the phenomena and performance of the Arpanet and of Satnet. Here are a couple of references where analysis and measurement were both of value in providing understanding:
https://www.lk.cs.ucla.edu/data/files/Naylor/On%20Measured%20Behavior%20of%20the%20ARPA%20Network.pdf
and
https://www.lk.cs.ucla.edu/data/files/Kleinrock/packet_satellite_multiple_access.pdf
and this last paper even showed the "capture" effect with the SIMPs. In particular, one phenomenon was that if site A at one end of the Satnet was sending traffic to site B at the other end, then each message traveling from A to B forced an RFNM (Ready For Next Message) reply from B to A, and these RFNMs hogged the B-to-A channel, preventing B from sending its own messages to A. Lots more was observed, and these are just some of the performance papers that used measurement and queueing models in those early days.
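
A toy model makes that capture effect concrete. This is only a sketch of the mechanism, not the SIMP's actual scheduling discipline: I assume a slotted B-to-A channel carrying one unit per slot, with RFNMs taking priority over B's own data.

/* Toy model of the SATNET "capture" effect described above.
 * Assumptions (mine, for illustration): a slotted B->A channel,
 * one unit per slot, RFNMs served before B's own messages. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const int slots = 100000;
    srand(1);
    for (double a_load = 0.0; a_load <= 1.01; a_load += 0.25) {
        int rfnm_queue = 0, b_data_sent = 0;
        for (int t = 0; t < slots; t++) {
            /* Each A->B message landing at B owes one RFNM to A. */
            if ((double)rand() / RAND_MAX < a_load)
                rfnm_queue++;
            /* B->A channel: one unit per slot, RFNMs go first. */
            if (rfnm_queue > 0)
                rfnm_queue--;
            else
                b_data_sent++;  /* B gets a slot for its own data */
        }
        printf("A->B load %.2f  =>  B's own throughput %.2f\n",
               a_load, (double)b_data_sent / slots);
    }
    return 0;
}

As A's offered load approaches the channel rate, B's own throughput toward A collapses toward zero, which matches the qualitative behavior described above.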
Len
> On Mar 11, 2025, at 1:42 PM, Jack Haverty via Internet-history <internet-history at elists.isoc.org> wrote:
>
> On 3/11/25 07:05, David Finnigan via Internet-history wrote:
>> It looks like staff at RSRE (Royal Signals and Radar Establishment) took
>> the lead in experimenting with formulae and methods for dynamic
>> estimation of round trip times in TCP. Does anyone here have any further
>> insight into or recollection of these experiments for estimating RTT, and
>> the development of the RTT formula?
>>
>
> IMHO the key factor was the state of the Internet at that time (1980ish). The ARPANET was the primary "backbone" of The Internet in what I think of as the "fuzzy peach" stage of Internet evolution. The ARPANET was the peach, and sites on the ARPANET were adding LANs of some type and connecting them with some kind of gateway to the ARPANET IMP.
>
> The exception to that structure was Europe, especially Peter Kirstein's group at UCL and John Laws' group at RSRE. They were interconnected somehow in the UK, but their access to the Internet was through a connection to a SATNET node (aka SIMP) at Goonhilly Downs.
>
> SATNET was connected to the ARPANET through one of the "core gateways" that we at BBN were responsible for running as a 24x7 operational network.
>
> The ARPANET was a packet network, but it presented a virtual circuit service to its users. Everything that went in one end came out the other end, in order, with nothing missing, and nothing duplicated. TCPs at a US site talking to TCPs at another US site didn't have much work to do, since everything they sent would be received intact. So RTT values could be set very high - I recall one common choice was 3 seconds.
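>
> For reference, the formula David is asking about descends from the exponentially weighted average that RFC 793 described as its example retransmission-timeout computation. A minimal sketch in C: ALPHA and BETA use the RFC's suggested ranges, while the bounds, the initial guess, and the sample values are my own illustration.
>
> /* Smoothed RTT and retransmission timeout per RFC 793's example.
>  * Constants ALPHA/BETA follow the RFC's suggested ranges; the
>  * bounds, initial 3 s guess, and samples are illustrative. */
> #include <stdio.h>
>
> #define ALPHA  0.85   /* smoothing gain, suggested 0.8-0.9   */
> #define BETA   1.5    /* variance factor, suggested 1.3-2.0  */
> #define LBOUND 1.0    /* lower bound on the timeout, seconds */
> #define UBOUND 60.0   /* upper bound on the timeout, seconds */
>
> static double srtt = 3.0;           /* start from the 3 s guess */
>
> static double rto_update(double rtt)
> {
>     srtt = ALPHA * srtt + (1.0 - ALPHA) * rtt;
>     double rto = BETA * srtt;
>     if (rto < LBOUND) rto = LBOUND;
>     if (rto > UBOUND) rto = UBOUND;
>     return rto;
> }
>
> int main(void)
> {
>     /* made-up samples from a ~2 s satellite path */
>     double samples[] = { 2.1, 1.9, 2.4, 2.0, 5.0, 2.2 };
>     for (int i = 0; i < 6; i++) {
>         double rto = rto_update(samples[i]);
>         printf("rtt=%.1f srtt=%.2f rto=%.2f\n", samples[i], srtt, rto);
>     }
>     return 0;
> }
>
> (Van Jacobson's 1988 refinement added a mean-deviation term, roughly rto = srtt + 4 * rttvar, precisely because a single fixed BETA copes poorly with paths whose delay variance swings widely.)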
>
> For the UK users however, things were quite different. The "core gateways" at the time were very limited by their hardware configurations. They didn't have much buffering space. So they did drop datagrams, which of course had to be retransmitted by the host at the end of the TCP connection. IIRC, at one point the ARPANET/SATNET gateway had exactly one datagram of buffer space.
>
> I don't recall anyone ever saying it, but I suspect that situation caused the UCL and RSRE crews to pay a lot of attention to TCP behavior, and try to figure out how best to deal with their skinny pipe across the Atlantic.
>
> At one point, someone (from UCL or RSRE, can't remember) reported an unexpected measurement. They did frequent file transfers, often trying to "time" their transfers to happen at a time of day when UK and US traffic flows would be lowest. But they observed that their transfers during "busy times" went much faster than similar transfers during "quiet times". That made little sense of course.
>
> After digging around with XNET, SNMP, etc., we discovered the cause. That ARPANET/SATNET gateway had very few buffers. The LANs at users' sites and the ARPANET path could deliver datagrams to that gateway faster than SATNET could take them. So the buffers filled up and datagrams were discarded -- just as expected.
>
> During "quiet times", the TCP connection would deliver datagrams to the gateway in bursts (whatever the TCPs negotiated as a Window size). Buffers in the gateway would overflow and some of those datagrams were lost. The sending TCP would retransmit, but only after the RTT timer expired, which was often set to 3 seconds. Result - slow FTPs.
>
> Conversely, during "busy times", the traffic through the ARPANET would be spread out in time. With other users' traffic flows present, chances were better that someone else's datagram would be dropped instead. Result - faster FTP transfers.
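>
> A back-of-envelope model shows why the paradox holds. The numbers are my assumptions for illustration: room for one datagram at the gateway, SATNET draining one datagram per tick, a TCP window of 8, and the 3-second RTO.
>
> /* Toy model of the quiet-time vs. busy-time FTP behavior above.
>  * Assumptions (mine): 1-datagram buffer, SATNET drains one
>  * datagram per tick, window of 8, RTO fixed at 3 seconds. */
> #include <stdio.h>
>
> int main(void)
> {
>     const int window = 8, buffer = 1, rto = 3;
>
>     /* Quiet time: the LAN and ARPANET deliver the whole window
>      * in one burst; one datagram fits, the rest are discarded,
>      * and each loss waits out an RTO before retransmission. */
>     int dropped = window - buffer;
>     printf("quiet: %d of %d dropped, up to ~%d s of RTO stalls\n",
>            dropped, window, dropped * rto);
>
>     /* Busy time: cross-traffic spaces the same window out to
>      * one datagram per tick, so each finds the buffer free, and
>      * the drops that do occur are spread over everyone's flows. */
>     printf("busy:  ~0 of %d of this flow's datagrams dropped\n",
>            window);
>     return 0;
> }
>
> Crude as it is, the model reproduces the observation: the same window suffers roughly 21 seconds of timeout stalls on an idle path and almost none on a shared one.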
>
> AFAIK, none of this behavior was ever analyzed mathematically. A mathematical model of the Internet seemed beyond the reach of queueing theory et al. Progress was very much driven by experimentation and "let's try this" activity.
>
> The solution - actually a workaround - was to improve the gateway's hardware. More memory meant more buffering was available. That principle seems to have persisted to this day, but it has caused other problems. Google "buffer bloat" if you're curious.
>
> As far as I remember, there weren't any such problems reported with the various Packet Radio networks. They tended to be used only occasionally, for tests and demos, whereas the SATNET linkage was used almost daily.
>
> The Laws and Kirstein groups in the UK were, IMHO, the first "real" users of TCP on The Internet, exploring paths not protected by ARPANET mechanisms.
>
> Jack Haverty
>
> --
> Internet-history mailing list
> Internet-history at elists.isoc.org
> https://elists.isoc.org/mailman/listinfo/internet-history