From perry at piermont.com  Mon Dec 13 13:01:47 2004
From: perry at piermont.com (Perry E. Metzger)
Date: Mon, 13 Dec 2004 16:01:47 -0500
Subject: [ih] Global congestion collapse
In-Reply-To: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at> (Michael
	Welzl's message of "04 Oct 2004 10:27:41 +0200")
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
Message-ID: <87mzwhap50.fsf@snark.piermont.com>

Sorry for not replying for a long time...

Michael Welzl writes:
> Does anybody here have stories about the Internet's congestion
> collapse(s) during the 80's? Some details would be great!
[...]
> So, I wonder, what was it like? What are your experiences?
> When did folks first notice it?

I strongly remember a point in '88 or so (perhaps it was '87 -- it
probably wasn't '89) when it became impossible to move data back and
forth between Bellcore and NYNEX's research lab in White Plains over
the net because of congestion-related problems. I was working on some
collaboration with them and suddenly found myself forced to make use
of mag tapes as the only practical way to move even fairly small
files back and forth. A mailing list I ran off of one of my machines
also started having trouble moving bits through efficiently.

As I recall, the arrival of kernel patches implementing congestion
control rapidly began to reverse the situation.

The first time I saw such patches was when Phil Karn handed them to
me one day, and I swiftly added them to the kernels of my lab's
Sun-3s. The world was somewhat different back then... :)

Perry

From mills at udel.edu  Mon Dec 13 17:05:50 2004
From: mills at udel.edu (David L. Mills)
Date: Tue, 14 Dec 2004 01:05:50 +0000
Subject: [ih] Global congestion collapse
In-Reply-To: <87mzwhap50.fsf@snark.piermont.com>
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com>
Message-ID: <41BE3C6E.2090507@udel.edu>

Perry,

Well, if your incident was during 1986-1988 and involved transit of
the NSFnet Phase-I backbone, I'm the perp. The NSFnet routers ran my
code, which was horribly overrun by supercomputer traffic. I found
the best way to deal with the problem was to find the supercomputer
elephants and shoot them. More is in a 1988 SIGCOMM Symposium paper.
More recently, the USNO and NIST time servers are being overrun with
NTP traffic. See my recent PTTI paper at
www.eecis.udel.edu/~mills/papers.html.

The NSFnet meltdown occurred primarily because the fuzzball routers
used smart interfaces that retransmitted when either an error
occurred or the receiver ran dry of buffers. The entire network
locked up for a time because all the buffers in all six machines
filled up with retransmit traffic and nothing could get in or out. As
I recall, the ARPAnet also had a similar problem with reassembly
buffers.

Dave

Perry E. Metzger wrote:

> I strongly remember a point in '88 or so (perhaps it was '87 -- it
> probably wasn't '89) when it became impossible to move data back and
> forth between Bellcore and NYNEX's research lab in White Plains over
> the net because of congestion-related problems.
[...]
From perry at piermont.com  Mon Dec 13 18:32:02 2004
From: perry at piermont.com (Perry E. Metzger)
Date: Mon, 13 Dec 2004 21:32:02 -0500
Subject: [ih] Global congestion collapse
In-Reply-To: <41BE3C6E.2090507@udel.edu> (David L. Mills's message of
	"Tue, 14 Dec 2004 01:05:50 +0000")
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com> <41BE3C6E.2090507@udel.edu>
Message-ID: <87r7lt7gpp.fsf@snark.piermont.com>

"David L. Mills" writes:
> Well, if your incident was during 1986-1988 and involved transit of
> the NSFnet Phase-I backbone, I'm the perp. The NSFnet routers ran my
> code, which was horribly overrun by supercomputer traffic.
[...]
> The NSFnet meltdown occurred primarily because the fuzzball routers
> used smart interfaces that retransmitted when either an error
> occurred or the receiver ran dry of buffers.

Interesting. Bellcore switched from a 56k link to the IMP at Columbia
to NSFnet towards the end (latter half?) of that time, but I can't
remember if the horrible congestion was before or after our switch.

Either way, though, it was pretty shortly thereafter that I remember
getting my first replacement .o files with yummy new TCP congestion
control algorithms in them.

Perry

From mills at udel.edu  Mon Dec 13 20:37:30 2004
From: mills at udel.edu (David L. Mills)
Date: Tue, 14 Dec 2004 04:37:30 +0000
Subject: [ih] Global congestion collapse
In-Reply-To: <87r7lt7gpp.fsf@snark.piermont.com>
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com> <41BE3C6E.2090507@udel.edu>
	<87r7lt7gpp.fsf@snark.piermont.com>
Message-ID: <41BE6E0A.9030807@udel.edu>

Perry,

Not so fast. Steve Wolff of NSF and I had a nasty little secret we
did not tell the NSFnet maintenance crew, who could never keep a
secret. I built priority queueing and preemption into the fuzzball
routers. The former wiretapped the telnet port and made it just below
NTP on the priority scale. We put mail on the bottom, just below ftp.
A lot of telnet users stopped complaining because they thought we
"fixed" the network.

The other thing was to shoot the elephants. When a new packet arrived
and no buffer space was available, the output queues were scanned for
the biggest elephant (largest total byte count on all queues from the
same IP address) and its biggest packet was killed. Gunshots
continued until either the arriving packet got shot or there was
enough room to save it. It all worked gangbusters and the poor ftpers
never found out.

Dave
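Mills' preemption rule is concrete enough to sketch. The following is
a minimal Python illustration of the policy as he describes it above:
tally bytes per source, shoot the biggest packet of the biggest
elephant, and repeat until the arrival either fits or is itself shot.
All names (Packet, BUFFER_LIMIT, enqueue_with_preemption) are
hypothetical; the fuzzballs were certainly not written in Python.

    from collections import defaultdict

    BUFFER_LIMIT = 64 * 1024  # illustrative shared buffer pool, in bytes

    class Packet:
        def __init__(self, src_ip, size):
            self.src_ip = src_ip  # source IP address
            self.size = size      # byte count of this packet

    def buffered_bytes(queues):
        return sum(p.size for q in queues for p in q)

    def enqueue_with_preemption(queues, out_queue, arriving):
        """Admit 'arriving', preempting buffered packets as needed.
        Returns True if the arrival was saved, False if it got shot."""
        while buffered_bytes(queues) + arriving.size > BUFFER_LIMIT:
            # Find the elephant: the source with the largest total byte
            # count across all output queues, counting the arrival too.
            totals = defaultdict(int)
            for q in queues:
                for p in q:
                    totals[p.src_ip] += p.size
            totals[arriving.src_ip] += arriving.size
            elephant = max(totals, key=totals.get)
            # The elephant's biggest packet is the victim; if that is
            # the arriving packet itself, the arrival gets shot.
            queued = [p for q in queues for p in q if p.src_ip == elephant]
            if elephant == arriving.src_ip and (
                    not queued
                    or arriving.size >= max(p.size for p in queued)):
                return False
            victim = max(queued, key=lambda p: p.size)
            for q in queues:
                if victim in q:
                    q.remove(victim)
                    break
        out_queue.append(arriving)
        return True

The sketch covers only the drop rule; the companion priority queueing
(NTP above telnet, ftp above mail) would sit in the dequeue path and
is not shown.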
From michael.welzl at uibk.ac.at  Mon Dec 13 22:47:01 2004
From: michael.welzl at uibk.ac.at (Michael Welzl)
Date: 14 Dec 2004 07:47:01 +0100
Subject: [ih] Global congestion collapse
In-Reply-To: <41BE6E0A.9030807@udel.edu>
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com> <41BE3C6E.2090507@udel.edu>
	<87r7lt7gpp.fsf@snark.piermont.com> <41BE6E0A.9030807@udel.edu>
Message-ID: <1103006821.4796.3.camel@lap10-c703.uibk.ac.at>

Folks,

Thanks a lot for answering my original question; this discussion is
getting more and more exciting :)

Cheers,
Michael

On Tue, 2004-12-14 at 05:37, David L. Mills wrote:
> Not so fast. Steve Wolff of NSF and I had a nasty little secret we
> did not tell the NSFnet maintenance crew, who could never keep a
> secret. I built priority queueing and preemption into the fuzzball
> routers.
[...]
From sbrim at cisco.com  Tue Dec 14 04:31:09 2004
From: sbrim at cisco.com (Scott W Brim)
Date: Tue, 14 Dec 2004 07:31:09 -0500
Subject: [ih] Global congestion collapse
In-Reply-To: <41BE6E0A.9030807@udel.edu>
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com> <41BE3C6E.2090507@udel.edu>
	<87r7lt7gpp.fsf@snark.piermont.com> <41BE6E0A.9030807@udel.edu>
Message-ID: <20041214123108.GC1492@sbrim-w2k02>

On Tue, Dec 14, 2004 04:37:30AM +0000, David L. Mills allegedly wrote:
> A lot of telnet users stopped complaining because they thought we
> "fixed" the network.

The news leaked out pretty quickly iirc :-)

Another thing I noticed was that people adjusted their behavior. The
congestion spread in time when it couldn't spread any other way, and
filled most of the night.

From craig at aland.bbn.com  Tue Dec 14 06:09:36 2004
From: craig at aland.bbn.com (Craig Partridge)
Date: Tue, 14 Dec 2004 09:09:36 -0500
Subject: [ih] Global congestion collapse
In-Reply-To: Your message of "Mon, 13 Dec 2004 21:32:02 EST."
	<87r7lt7gpp.fsf@snark.piermont.com>
Message-ID: <20041214140936.582251AD@aland.bbn.com>

In message <87r7lt7gpp.fsf at snark.piermont.com>, "Perry E. Metzger"
writes:

>Interesting. Bellcore switched from a 56k link to the IMP at Columbia
>to NSFnet towards the end (latter half?) of that time, but I can't
>remember if the horrible congestion was before or after our switch.

ARPANET had trouble too. I remember much tuning.

>Either way, though, it was pretty shortly thereafter that I remember
>getting my first replacement .o files with yummy new TCP congestion
>control algorithms in them.

That would have been Van's TCP mods (described in the SIGCOMM '88
paper). It was astonishing how big a difference they made.

Craig
From perry at piermont.com  Tue Dec 14 08:18:59 2004
From: perry at piermont.com (Perry E. Metzger)
Date: Tue, 14 Dec 2004 11:18:59 -0500
Subject: [ih] Global congestion collapse
In-Reply-To: <20041214140936.582251AD@aland.bbn.com> (Craig
	Partridge's message of "Tue, 14 Dec 2004 09:09:36 -0500")
References: <20041214140936.582251AD@aland.bbn.com>
Message-ID: <87d5xcn98s.fsf@snark.piermont.com>

Craig Partridge writes:
> That would have been Van's TCP mods (described in the SIGCOMM '88
> paper).

Of course. :)

> It was astonishing how big a difference they made.

Yes, though apparently (according to David Mills in the last few
notes to this list) more was going on than I was aware of at the
time. (That's not surprising -- my research work around then was
debuggers for highly parallel systems, and I was not paying much
attention to the network except as a way of getting my work done...)

Perry

From touch at ISI.EDU  Wed Dec 15 07:09:03 2004
From: touch at ISI.EDU (Joe Touch)
Date: Wed, 15 Dec 2004 07:09:03 -0800
Subject: [ih] Global congestion collapse
In-Reply-To: <41BE6E0A.9030807@udel.edu>
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com> <41BE3C6E.2090507@udel.edu>
	<87r7lt7gpp.fsf@snark.piermont.com> <41BE6E0A.9030807@udel.edu>
Message-ID: <41C0538F.9080108@isi.edu>

David L. Mills wrote:
> The other thing was to shoot the elephants. When a new packet
> arrived and no buffer space was available, the output queues were
> scanned for the biggest elephant (largest total byte count on all
> queues from the same IP address) and its biggest packet was killed.
[...]

RED would benefit from two variants - per-packet (when per-packet ops
are the bottleneck) and per-byte weighting, though it doesn't seem to
be described that way much. This sounds a lot like per-byte (the more
common case now anyway), except that RED is statistical (everyone
gets slammed, proportional to their load) and this hits each in
series (largest user first, then next-largest when the largest backs
off, etc.). Was there ever any backlash (software oscillation or
people complaining) from that?

Joe
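For contrast with the serial, largest-first preemption sketched
earlier, here is a minimal rendering of the statistical drop Joe
refers to, loosely following the RED of Floyd and Jacobson (1993):
the drop probability rises with a moving average of the queue size,
and the per-byte weighting makes a large packet proportionally more
likely to be hit. The thresholds and gains are illustrative
assumptions, not values from any deployed router, and real RED's
"count since last drop" spreading is omitted.

    import random

    MIN_TH, MAX_TH = 5000, 15000  # avg-queue thresholds, in bytes
    MAX_P = 0.1                   # drop probability as avg nears MAX_TH
    WEIGHT = 0.002                # EWMA gain for the average queue
    MEAN_PKT = 1000               # nominal packet size for byte mode

    avg_queue = 0.0

    def red_drop(queue_bytes, pkt_size):
        """Return True if the arriving packet should be dropped.
        Statistical: every source faces the same lottery, so losses
        land on each in proportion to the bytes it is sending."""
        global avg_queue
        avg_queue = (1 - WEIGHT) * avg_queue + WEIGHT * queue_bytes
        if avg_queue < MIN_TH:
            return False          # uncongested: accept everything
        if avg_queue >= MAX_TH:
            return True           # persistently full: drop everything
        p = MAX_P * (avg_queue - MIN_TH) / (MAX_TH - MIN_TH)
        p *= pkt_size / MEAN_PKT  # per-byte weighting variant
        return random.random() < min(p, 1.0)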
From touch at ISI.EDU  Wed Dec 15 07:11:26 2004
From: touch at ISI.EDU (Joe Touch)
Date: Wed, 15 Dec 2004 07:11:26 -0800
Subject: [ih] Global congestion collapse
In-Reply-To: <20041214140936.582251AD@aland.bbn.com>
References: <20041214140936.582251AD@aland.bbn.com>
Message-ID: <41C0541E.701@isi.edu>

Craig Partridge wrote:
> That would have been Van's TCP mods (described in the SIGCOMM '88
> paper). It was astonishing how big a difference they made.

Not to downplay the utility of Van's variant, but it seems like _any_
congestion control would have (or may have - e.g. Dave's mods) made
an astonishing impact.

Joe

From mills at udel.edu  Wed Dec 15 08:48:36 2004
From: mills at udel.edu (David L. Mills)
Date: Wed, 15 Dec 2004 16:48:36 +0000
Subject: [ih] Global congestion collapse
In-Reply-To: <41C0538F.9080108@isi.edu>
References: <1096878461.4794.76.camel@lap10-c703.uibk.ac.at>
	<87mzwhap50.fsf@snark.piermont.com> <41BE3C6E.2090507@udel.edu>
	<87r7lt7gpp.fsf@snark.piermont.com> <41BE6E0A.9030807@udel.edu>
	<41C0538F.9080108@isi.edu>
Message-ID: <41C06AE4.6080903@udel.edu>

Joe,

RED has always been a problem with me. That's like shooting a load of
buckshot at the herd of elephants and tigers and hoping you hit an
elephant. My agenda was to find the elephants first and then target
them.

Dave

From mills at udel.edu  Wed Dec 15 08:51:35 2004
From: mills at udel.edu (David L. Mills)
Date: Wed, 15 Dec 2004 16:51:35 +0000
Subject: [ih] Global congestion collapse
In-Reply-To: <41C0541E.701@isi.edu>
References: <20041214140936.582251AD@aland.bbn.com>
	<41C0541E.701@isi.edu>
Message-ID: <41C06B97.1030208@udel.edu>

Joe,

That's my point. The elephants are a small percentage of the
population, but generate the vast majority of the congestion. My
recent PTTI paper (www.eecis.udel.edu/~mills/papers.html) shows that
78 percent of the congestion seen at busy NTP servers is due to 18
percent of the population.

Dave
From faber at ISI.EDU  Wed Dec 15 08:57:32 2004
From: faber at ISI.EDU (Ted Faber)
Date: Wed, 15 Dec 2004 08:57:32 -0800
Subject: [ih] Global congestion collapse
In-Reply-To: <41C0541E.701@isi.edu>
References: <20041214140936.582251AD@aland.bbn.com>
	<41C0541E.701@isi.edu>
Message-ID: <20041215165732.GA35624@pun.isi.edu>

On Wed, Dec 15, 2004 at 07:11:26AM -0800, Joe Touch wrote:
> Not to downplay the utility of Van's variant, but it seems like
> _any_ congestion control would have (or may have - e.g. Dave's mods)
> made an astonishing impact.

There's a fundamental difference between an e2e control like Van's
and a queueing system like Dave's. One reduces load and one
reallocates scarce resources to the more deserving. While
sophisticated queueing is undeniably helpful, the end-to-end control
is a necessity.

-- 
Ted Faber
http://www.isi.edu/~faber  PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG

From touch at ISI.EDU  Wed Dec 15 09:54:52 2004
From: touch at ISI.EDU (Joe Touch)
Date: Wed, 15 Dec 2004 09:54:52 -0800
Subject: [ih] Global congestion collapse
In-Reply-To: <20041215165732.GA35624@pun.isi.edu>
References: <20041214140936.582251AD@aland.bbn.com>
	<41C0541E.701@isi.edu> <20041215165732.GA35624@pun.isi.edu>
Message-ID: <41C07A6C.8050002@isi.edu>

Ted Faber wrote:
| There's a fundamental difference between an e2e control like Van's
| and a queueing system like Dave's. One reduces load and one
| reallocates scarce resources to the more deserving. While
| sophisticated queueing is undeniably helpful, the end-to-end control
| is a necessity.

Why? Granted it's useful, granted that it avoids needing to deploy
Dave's stuff throughout (which is otherwise required) - but if that
were done, why is E2E control a "necessity"?

Joe
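Ted's distinction can be made concrete with a toy discrete-time model
(the numbers are entirely hypothetical). With fixed-rate senders, no
drop or scheduling policy can shrink the overload; the queue
discipline only chooses whose packets die. Only when the senders
themselves back off end to end does the offered load fall toward what
the link can carry.

    LINK = 100  # bottleneck capacity per tick, arbitrary units

    def run(rates, responsive, ticks=50):
        """Simulate senders sharing one bottleneck. Responsive senders
        react AIMD-style: halve on overload, add one unit otherwise.
        Unresponsive senders never change. Returns the final offered
        load."""
        for _ in range(ticks):
            overloaded = sum(rates) > LINK
            if responsive:
                rates = [r / 2 if overloaded else r + 1 for r in rates]
        return sum(rates)

    fixed = [20] * 10  # ten senders offering 200 against capacity 100
    print(run(fixed, responsive=False))  # 200: the overload persists;
                                         # queueing only picks victims
    print(run(fixed, responsive=True))   # sawtooths around the link
                                         # capacity: the load itself fell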
From michael.welzl at uibk.ac.at  Sun Dec 26 12:17:34 2004
From: michael.welzl at uibk.ac.at (Michael Welzl)
Date: Sun, 26 Dec 2004 21:17:34 +0100
Subject: [ih] Re: Global congestion collapse
References: <200412152000.iBFK02Q19053@boreas.isi.edu>
Message-ID: <000a01c4eb87$f15db8a0$0200a8c0@fun>

Dear all,

Some of you mentioned a TCP patch by Jacobson in this thread - e.g.:

> Craig Partridge wrote:
> >>Either way, though, it was pretty shortly thereafter that I
> >>remember getting my first replacement .o files with yummy new TCP
> >>congestion control algorithms in them.

I'm interested in the history of Internet congestion control; so, I
wonder:

* were admins aware that this patch would reduce your own rate and
  might make things worse for you if you're the only one who installs
  it? e.g., think of 1000 * unresponsive UDP vs. 1 * TCP - across a
  single bottleneck - in this scenario, a single unresponsive flow
  would be better off than a single TCP flow.

* Van Jacobson's paper came out in August 1988. I think that the
  first RFC which says "you MUST implement congestion control" is
  RFC 1122 - which came out October 1989. What happened in between?
  Was it just a patch flying around and word of mouth ("c'mon,
  install it, we'll all be better off")?

It all looks a bit like an Internet community type of thing to me
that couldn't work like this nowadays. Am I right?

Cheers,
Michael

From craig at aland.bbn.com  Sun Dec 26 13:10:41 2004
From: craig at aland.bbn.com (Craig Partridge)
Date: Sun, 26 Dec 2004 16:10:41 -0500
Subject: [ih] Re: Global congestion collapse
In-Reply-To: Your message of "Sun, 26 Dec 2004 21:17:34 +0100."
	<000a01c4eb87$f15db8a0$0200a8c0@fun>
Message-ID: <20041226211041.9057F1A8@aland.bbn.com>

In message <000a01c4eb87$f15db8a0$0200a8c0 at fun>, "Michael Welzl"
writes:

>* were admins aware that this patch would reduce your own rate and
> might make things worse for you if you're the only one who installs
> it?
[...]

Actually the great thing about Van's patch was that the existing TCPs
were so bad that being the only one running Van's patch meant you got
*better* performance. Only later did people figure out how to create
unresponsive TCPs that were well-behaved enough that they'd win in
this fight.

[Reaching deep into my brain, my recollection is that Van's worked
better because it (a) did RTT estimation right and (b) slow start
allowed it to correctly probe available bandwidth, whereas existing
implementations just hammered at not enough bandwidth. Happy to be
corrected, this was long ago.]

>* Van Jacobson's paper came out in August 1988. I think that the
> first RFC which says "you MUST implement congestion control" is
> RFC 1122 - which came out October 1989. What happened in between?
[...]

The patch came out well before August of 1988. And yes, it was word
of mouth -- or perhaps, better said, notes on the TCP-IP list.
There's a note from Van on 11 Feb 88 discussing the work and a note
from Dan Lynch soon thereafter inviting people to a tutorial at
Interop about it.
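The two mechanisms Craig recalls are easy to sketch. The estimator
below follows the mean-plus-deviation retransmit timer from
Jacobson's SIGCOMM '88 paper (with the gains and the 4x deviation
factor as later standardized in RFC 6298); the window logic shows
slow start and congestion avoidance in outline. This is an
illustration of the ideas, not the BSD code that actually shipped in
the patches.

    class RttEstimator:
        """Jacobson/Karels retransmit timer: track a smoothed RTT and
        a smoothed mean deviation, and derive the timeout from both,
        so a loaded path raises the RTO instead of triggering the
        spurious retransmissions that "hammered" congested links."""

        def __init__(self, first_sample):
            self.srtt = first_sample
            self.rttvar = first_sample / 2

        def update(self, sample):
            err = sample - self.srtt
            self.srtt += err / 8                         # gain 1/8 on mean
            self.rttvar += (abs(err) - self.rttvar) / 4  # gain 1/4 on dev
            return self.rto()

        def rto(self):
            return self.srtt + 4 * self.rttvar

    # Slow start / congestion avoidance, in units of one segment (MSS):
    MSS = 1
    cwnd, ssthresh = 1 * MSS, 64 * MSS

    def on_ack():
        """Grow the window: exponentially below ssthresh (slow start
        probes for the available bandwidth), ~1 MSS per RTT above."""
        global cwnd
        if cwnd < ssthresh:
            cwnd += MSS
        else:
            cwnd += MSS * MSS / cwnd

    def on_timeout():
        """On loss, remember half the window and restart slowly."""
        global cwnd, ssthresh
        ssthresh = max(cwnd / 2, 2 * MSS)
        cwnd = 1 * MSS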
There's a Jan 87 note from Van saying he and Mike Karels are
experimenting with the mods. I tried to find the actual software
release, but all I could find was the official release on 6 Dec 88
(whereas the patch had been around for a while by then).

If you don't have the TCP-IP archives, they are well worth reading (I
grabbed what I could when I realized they might be endangered, and
appear to have much of the list from 82 to 91).

Craig

From craig at aland.bbn.com  Thu Dec 30 09:58:45 2004
From: craig at aland.bbn.com (Craig Partridge)
Date: Thu, 30 Dec 2004 12:58:45 -0500
Subject: [ih] Re: Global congestion collapse
Message-ID: <20041230175845.540991AB@aland.bbn.com>

Following up our discussions on this topic, I had cause today to
re-read the proceedings of the 6th IETF (April 1987), which are
on-line at www.ietf.org. They include minutes (p. 9) in which Van
Jacobson describes early thinking about slow start (and why it works
better than what was in the Internet at the time). Also included are
Van's slides!

The minutes also include a report from the ARPANET team on how they
dealt with an episode of congestion collapse (apparently around
January of 1987) with upgrades to IMP/PSN software -- and traffic
distribution matrices showing where the heavy congestion was
observed -- fun stuff (and it brings back wonderful memories...).

Craig