[ih] bufferbloat and modern congestion control (was 4004)

Steve Crocker steve at shinkuro.com
Thu Oct 3 10:45:09 PDT 2024


John,

The RFNMs restricted flow on a per "connection" basis.  As I accidentally demonstrated when I suggested to the folks at Tinker that they use eight parallel connections to maximize throughput, the early Arpanet could be brought down nearly instantly.

The reassembly problem was separate and at least partly related to focusing on eight packet messages instead of two packet messages.

Steve

Sent from my iPhone

> On Oct 3, 2024, at 1:38 PM, John Day via Internet-history <internet-history at elists.isoc.org> wrote:
> 
> This is all well and good and actually quite interesting; but it doesn’t address the systems problems.
> 
> As long as detection is implicit, it is predatory. The senders will react to conditions that did not occur in this layer and mechanisms in lower layers will have already reacted (having smaller scope), so there are conflicting reactions.
> 
> Delay is going to be a noisy signal, which will result in false positives.  Packets may have been discarded by a lower layer. Of course this falls prey to the ridiculous complaint we have seen that ’TCP doesn’t support wireless,’ but it isn’t supposed to; wireless is supposed to support TCP. However, if there is congestion in the lower layers (doing what is necessary to support TCP), you don’t want TCP reacting to it.
> 
> Congestion in TCP is difficult if not impossible to coordinate with QoS. QoS mechanisms are primarily in the layer below. As you describe there is some loose coordination but it is far from sufficient.
> 
> As I said, I have not been impressed by these implicit indirect signals of congestion.
> 
> Take care,
> John
> 
> 
>> On Oct 2, 2024, at 18:21, Dave Taht <dave.taht at gmail.com> wrote:
>> 
>> I wish I had had the time and resources to (help) write more papers. (For example there isn't much on "drop head queueing")
>> 
>> fq_codel is now a linux-wide default and has the following unique properties:
>> 
>> codel queue management, which measures the time a packet spends in a queue and gradually attempts to find an optimum point for queue length, which is 5ms by default. (it has been tested in software below 250us in the DC). There is another subsystem, called BQL, which attempts to limit bytes on the device txring to one interrupt's worth. (a pretty good explanation of modern layers here) [2]
>> 
>> It drops from the head, not the tail of the queue, with a small (BQL or HTB) FIFO in front of the lowest bits of the hardware to account
>> for interrupt latency.
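
A minimal sketch of the sojourn-time idea described above, not the Linux implementation: each packet is timestamped on enqueue, the delay is checked on dequeue, and drops come from the head of the queue. The 5 ms target is the default quoted above; the 100 ms interval, the interval/sqrt(count) drop spacing, and every name here (CodelSketch, dropped, and so on) are assumptions made for illustration.

```python
import math
from collections import deque

TARGET = 0.005    # 5 ms target sojourn time (the default quoted above)
INTERVAL = 0.100  # 100 ms observation window; assumed for this sketch


class CodelSketch:
    """Simplified CoDel-style queue: timestamps on enqueue, head drops on dequeue."""

    def __init__(self):
        self.queue = deque()     # entries are (enqueue_time, packet)
        self.first_above = None  # time after which persistent delay allows drops
        self.dropping = False
        self.drop_next = 0.0
        self.count = 0
        self.dropped = []        # kept only so the behaviour can be inspected

    def enqueue(self, packet, now):
        self.queue.append((now, packet))

    def _pop_head(self, now):
        """Pop the head packet; return (packet, ok_to_drop)."""
        if not self.queue:
            self.first_above = None
            return None, False
        enqueued_at, packet = self.queue.popleft()
        sojourn = now - enqueued_at
        if sojourn < TARGET:
            self.first_above = None      # delay is fine, reset the window
            return packet, False
        if self.first_above is None:
            self.first_above = now + INTERVAL  # start timing the bad period
            return packet, False
        return packet, now >= self.first_above

    def dequeue(self, now):
        packet, ok_to_drop = self._pop_head(now)
        if packet is None:
            self.dropping = False
            return None
        if self.dropping:
            if not ok_to_drop:
                self.dropping = False
            while self.dropping and now >= self.drop_next and packet is not None:
                self.dropped.append(packet)          # drop from the head
                self.count += 1
                packet, ok_to_drop = self._pop_head(now)
                if not ok_to_drop:
                    self.dropping = False
                else:
                    self.drop_next += INTERVAL / math.sqrt(self.count)
        elif ok_to_drop:
            self.dropped.append(packet)              # first head drop
            packet, _ = self._pop_head(now)
            self.dropping = True
            self.count = 1
            self.drop_next = now + INTERVAL / math.sqrt(self.count)
        return packet
```

With ten packets enqueued at time 0, a dequeue at 1 ms passes the head through, a dequeue at 10 ms starts the observation window, and a dequeue at 120 ms drops one packet from the head and delivers the next.
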
>> 
>> (I am kind of curious if a txring existed back in the day and how close an application sat to the hardware)
>> 
>> Anecdote: when van and kathy were working on what became codel (january 2012), she rang me up one day and asked me just how much overhead there was in getting a timestamp from the hardware nowadays. And I explained that it was only a few cycles and a pipeline bubble, and the cost of unsynced TSCs and so on and so forth, and she said thanks, and hung up. Getting a timestamp must have been mighty hard back in the day!
>> 
>> The "flow queueing" mechanism sends packets that have an arrival rate of less than the departure rate of all the other flows, out first.[1] This is an improvement over prior FQ mechanisms like SFQ and DRR, which always put a new flow at the tail of the flow list. It is pretty amazing how often this works on real traffic. Also it automatically puts flows that build a queue into a queue that is managed by codel.
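
A rough sketch of the flow-queueing behaviour described above, assuming the two-list structure (new flows served before old flows, with a byte deficit per flow) as I understand it from the published fq_codel description. Hashing of flows and the per-flow CoDel instance are left out, and the names and the quantum value are hypothetical.

```python
from collections import deque

QUANTUM = 1514  # bytes a flow may send per round; assumed value for this sketch


class FlowQueueSketch:
    """New flows are served ahead of flows that have already built a queue."""

    def __init__(self):
        self.queues = {}          # flow id -> deque of packet sizes
        self.deficit = {}         # flow id -> remaining byte credit
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, flow, size):
        self.queues.setdefault(flow, deque()).append(size)
        if flow not in self.new_flows and flow not in self.old_flows:
            # A flow with no backlog is treated as new and jumps the line.
            self.deficit[flow] = QUANTUM
            self.new_flows.append(flow)

    def dequeue(self):
        while self.new_flows or self.old_flows:
            from_new = bool(self.new_flows)
            active = self.new_flows if from_new else self.old_flows
            flow = active[0]
            if self.deficit[flow] <= 0:
                # Credit used up: refill and send the flow to the back of old_flows.
                self.deficit[flow] += QUANTUM
                active.popleft()
                self.old_flows.append(flow)
                continue
            queue = self.queues[flow]
            if not queue:
                # Empty flow: demote a new flow to old_flows, or forget an old one.
                active.popleft()
                if from_new:
                    self.old_flows.append(flow)
                continue
            size = queue.popleft()
            self.deficit[flow] -= size
            return flow, size
        return None
```

In this sketch a bulk flow that has already exhausted its credit sits in old_flows, so a single small packet from a newly arriving flow is dequeued ahead of the bulk flow's backlog.
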
>> 
>> One (eventual) benefit of these approaches, combined, is it makes delay based congestion control more feasible (indeed,
>> BBR spends most of its time in this mode), but the flow isolation makes for most interactive traffic never being queued at all.
>> 
>> IMHO the edges of the internet at least, would have been much better were some form of FQ always in it (which we kind of got from switched networks naturally) but the idea of FQ was roundly rejected in the first ietf meeting in 1989, and it's been uphill ever since.
>> 
>> Just to touch upon pacing a bit - pacing is the default for the linux stack no matter the overlying qdisc or congestion control algorithm.
>> I don't know if anyone has ever attempted to compare pacing w/cubic vs pacing w/bbr, and very few, until recently, have
>> attempted to also compare the cc-of-the-day vs fq_codel or cake. [3]
>> 
>> [1] https://ieeexplore.ieee.org/document/8469111
>> [2] https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9541151
>> [3]  https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0304609&type=printable
>> 
>> Varying the packet pacing to get a pre-congestion notification is something I'd like to pursue further:
>> https://www.usenix.org/system/files/atc24-han.pdf
>> (I so want to believe this paper)
>> 
>> A tiny bit more below....
>> 
>>> On Wed, Oct 2, 2024 at 2:31 PM John Day via Internet-history <internet-history at elists.isoc.org> wrote:
>>> The response to bufferbloat has always struck me as looking for your keys under a street light when that wasn’t where you dropped them but there is light there.
>>> 
>>> Initially, bufferbloat was not a problem because memory was expensive and when TCP ran out of buffers (or got low), the connection simply blocked the sending application until buffers were available. This was still true with the advent of NIC cards. Memory was still tight. However, as memory got cheap and NIC cards had oceans of memory, TCP never got low on buffers and no one told the application to slow down or wait, so there was local congestion collapse:  bufferbloat.
>>> 
>>> One part of the solution would be interface flow control between the sending application and TCP (you would have thought that would have occurred to implementers anyway, it is obvious) and/or simply restrict the amount of buffers TCP has available so that it runs out and blocks the sending application before things get bad and opens up when buffers are available.  But virtually all of the papers I see are on different drop-strategies, and oddly enough they never find their keys.
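
The buffer-limit idea in the paragraph above can be illustrated with a small sketch, assuming a fixed-size send buffer sitting between the application and the transport. The class and method names are hypothetical, and a real stack would block inside the socket write call rather than use a Python queue.

```python
import queue


class BoundedSendBuffer:
    """Fixed-size buffer: the sender is refused (or blocked) when it is full."""

    def __init__(self, max_packets):
        self._buf = queue.Queue(maxsize=max_packets)

    def try_send(self, packet):
        """Non-blocking send; returns False when the application must wait."""
        try:
            self._buf.put_nowait(packet)
            return True
        except queue.Full:
            return False

    def send(self, packet, timeout=None):
        """Blocking send; waits until the transport has drained a slot."""
        self._buf.put(packet, timeout=timeout)

    def transmit_one(self):
        """Called by the transport side; frees one slot, or returns None if empty."""
        try:
            return self._buf.get_nowait()
        except queue.Empty:
            return None
```

With a two-packet buffer, the third try_send is refused until transmit_one frees a slot, which is the backpressure on the sending application that the paragraph describes.
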
>> 
>> don't have a lot of time for papers!  The most modern stuff for tcp is using EDF (earliest deadline first) to manage the packet pacing.
>> There are virtual and actual physical devices nowadays that take a "time to be sent" and a packet. This paper was highly influential:
>> 
>> https://saeed.github.io/files/carousel-sigcomm17.pdf
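
A small sketch of the "time to be sent" model mentioned above, assuming the sender stamps each packet with a departure time derived from a pacing rate and a scheduler releases packets in timestamp order. The function and class names are hypothetical; this shows only the general shape of the idea, not the kernel's or the paper's data structures.

```python
import heapq


def stamp_paced(sizes, rate_bytes_per_s, start=0.0):
    """Assign each packet a departure time so the stream averages the given rate."""
    stamps = []
    t = start
    for size in sizes:
        stamps.append(t)
        t += size / rate_bytes_per_s  # the next packet leaves one serialization time later
    return stamps


class TimedReleaseQueue:
    """Holds (send_time, packet) pairs and releases them once their time arrives."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker keeps insertion order for equal timestamps

    def push(self, send_time, packet):
        heapq.heappush(self._heap, (send_time, self._seq, packet))
        self._seq += 1

    def pop_due(self, now):
        """Return every packet whose send time is at or before now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

Packets from several flows can be pushed into the same queue in any order; only the timestamps decide when each one leaves.
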
>> 
>> the latest commit to the linux kernel about it:
>> 
>> https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/
>> 
>> PS IMHO eric dumazet deserves a spot in the internet hall of fame for so many things...
>> 
>>> 
>>> Take care,
>>> John
>>> 
>>>> On Oct 2, 2024, at 01:48, Barbara Denny via Internet-history <internet-history at elists.isoc.org> wrote:
>>>> 
>>>> Just throwing some thoughts out here ......
>>>> I can see how this happens in a FIFO queuing world.   However a lot of work has gone into fair queuing starting in the late 80s.  Just wondering if anyone has done work utilizing fair queuing and source quench.  For example, I think I can see how to use fair queuing information to better select who to send a source quench to. At least I can see how to do it with Stochastic Fairness Queueing since I worked on it and I  remember a fair amount about how it was implemented. I wouldn't be able to provide a guarantee that the wrong host would never receive a source quench but the likelihood should be much lower.  Considering whether the use of NAT creates undesirable behavior is also important and I am sure there are probably other cases that need to be checked.
>>>> Hum,  it might also be interesting to speculate whether this could have any effect on bufferbloat but I fess up I need to learn more about the work done in the area of bufferbloat.  I was involved with other things when this started to appear on my radar screen as a hot topic.  I will admit I wish I had done more work on possible buffering effects from implementation choices at the time I did work on SFQ but there were contractual obligations that restricted how much time I could devote to the SFQ part of the project.
>>>> Just curious, ECN (Explicit Congestion Notification) is optional. Does anyone have any idea about its use in the Internet?
>>>> barbara
>>>> 
>>>> On Tuesday, October 1, 2024 at 07:10:25 PM PDT, Vint Cerf <vint at google.com> wrote:
>>>> 
>>>> One basic problem with blaming the "last packet that caused intermediate router congestion" is that it usually blamed the wrong source, among other problems. Van Jacobson was/is the guru of flow control (among others) who might remember more.
>>>> 
>>>> v
>>>> 
>>>> On Tue, Oct 1, 2024 at 8:50 PM Barbara Denny via Internet-history <internet-history at elists.isoc.org> wrote:
>>>> 
>>>> In a brief attempt to try to find some information about the early MIT work you mentioned, I ended up tripping on this Final Report from ISI in DTIC.  It does talk a fair amount about congestion control and source quench (plus other things that might interest people). The period of performance is 1987 to 1990 which is much later than I was considering in my earlier message.
>>>> 
>>>> https://apps.dtic.mil/sti/tr/pdf/ADA236542.pdf
>>>> 
>>>> Even though the report mentions testing on DARTnet, I don't remember anything about this during our DARTnet meetings.  I did join the project after the start so perhaps the work was done before I began to participate. I also couldn't easily find the journal they mention as a place for publishing their findings. I will have more time later to see if I can find something that covers this testing.
>>>> 
>>>> barbara
>>>> 
>>>> On Tuesday, October 1, 2024 at 04:37:47 PM PDT, Scott Bradner via Internet-history <internet-history at elists.isoc.org> wrote:
>>>> 
>>>> multicast is also an issue but I do not recall if that was one that Craig & I talked about
>>>> 
>>>> Scott
>>>> 
>>>>> On Oct 1, 2024, at 7:34 PM, Scott Bradner via Internet-history <internet-history at elists.isoc.org> wrote:
>>>>> 
>>>>> I remember talking with Craig Partridge (on a flight to somewhere) about source quench
>>>>> during the time when 1812 was being written - I do not recall
>>>>> the specific issues but I recall that there was more than one issue
>>>>> 
>>>>> (if DoS was not an issue at the time, it should have been)
>>>>> 
>>>>> Scott
>>>>> 
>>>>>> On Oct 1, 2024, at 6:22 PM, Brian E Carpenter via Internet-history <internet-history at elists.isoc.org> wrote:
>>>>>> 
>>>>>> On 02-Oct-24 10:19, Michael Greenwald via Internet-history wrote:
>>>>>>> On 10/1/24 1:11 PM, Greg Skinner via Internet-history wrote:
>>>>>>>> Forwarded for Barbara
>>>>>>>> 
>>>>>>>> ====
>>>>>>>> 
>>>>>>>> From: Barbara Denny <b_a_denny at yahoo.com>
>>>>>>>> Sent: Tuesday, October 1, 2024 at 10:26:16 AM PDT
>>>>>>>> I think congestion issues were discussed because I remember an ICMP message type called source quench (now deprecated). It was used for notifying a host to reduce the traffic load to a destination.  I don't remember hearing about any actual congestion experiments using this message type.
>>>>>>> Of only academic interest: I believe that, circa 1980 +/- 1-2 years, an
>>>>>>> advisee of either Dave Clark or Jerry Saltzer, wrote an undergraduate
>>>>>>> thesis about the use of Source Quench for congestion control. I believe
>>>>>>> it included some experiments (maybe all artificial, or only through
>>>>>>> simulation).
>>>>>>> I don't think it had much impact on the rest of the world.
>>>>>> 
>>>>>> Source quench is discussed in detail in John Nagle's RFC 896 (dated 1984).
>>>>>> A trail of breadcrumbs tells me that he has an MSCS from Stanford, so
>>>>>> I guess he probably wasn't an MIT undergrad.
>>>>>> 
>>>>>> Source quench was effectively deprecated by RFC 1812 (dated 1995). People
>>>>>> had played around with ideas (e.g. RFC 1016) but it seems that basically
>>>>>> it was no use.
>>>>>> 
>>>>>> A bit more Google found this, however:
>>>>>> 
>>>>>> "4.3. Internet Congestion Control
>>>>>> Lixia Zhang began a study of network resource allocation techniques suitable for
>>>>>> the DARPA Internet. The Internet currently has a simple technique for resource
>>>>>> allocation, called "Source Quench."
>>>>>> Simple simulations have shown that this technique is not effective, and this work
>>>>>> has produced an alternative which seems considerably more workable. Simulation
>>>>>> of this new technique is now being performed."
>>>>>> 
>>>>>> [MIT LCS Progress Report to DARPA, July 1983 - June 1984, AD-A158299,
>>>>>> https://apps.dtic.mil/sti/pdfs/ADA158299.pdf ]
>>>>>> 
>>>>>> Lixia was then a grad student under Dave Clark. Of course she's at UCLA now. If she isn't on this list, she should be!
>>>>>> 
>>>>>>  Brian Carpenter
>>>> 
>>>> 
>>>> --
>>>> Internet-history mailing list
>>>> Internet-history at elists.isoc.org
>>>> https://elists.isoc.org/mailman/listinfo/internet-history
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Please send any postal/overnight deliveries to:
>>>> Vint Cerf
>>>> Google, LLC
>>>> 1900 Reston Metro Plaza, 16th Floor
>>>> Reston, VA 20190
>>>> +1 (571) 213 1346
>>>> 
>>>> 
>>>> until further notice
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> --
>> Dave Täht CSO, LibreQos
> 


