[ih] Global congestion collapse
David L. Mills
mills at udel.edu
Mon Dec 13 20:37:30 PST 2004
Perry,
Not so fast. Steve Wolff of NSF and I had a nasty little secret we did
not tell the NSFnet maintenance crew, who could never keep a secret. I
built priority queueing and preemption into the fuzzball routers. The
former wiretapped the telnet port and put telnet just below NTP on the
priority scale. We put mail on the bottom, just below ftp. A lot of
telnet users stopped complaining because they thought we "fixed" the
network.
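
A minimal sketch of the priority queueing described above, not the
fuzzball code. Only the relative order (NTP, then telnet, with ftp and
then mail at the bottom) comes from this message. The numeric levels,
the DEFAULT_PRIORITY slot for other traffic, the service-name lookup
(standing in for the port check) and the class name are assumptions
made for illustration. Preemption is not modeled.

```python
import heapq
from itertools import count

# Lower number = served first. Only the relative order of these four
# services is taken from the message; the numbers are assumptions.
PRIORITY = {"ntp": 0, "telnet": 1, "ftp": 3, "mail": 4}
DEFAULT_PRIORITY = 2  # assumption: everything else sits in the middle


class PriorityOutputQueue:
    """Hypothetical output queue that always sends the best class first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: FIFO order within one class

    def enqueue(self, service, packet):
        prio = PRIORITY.get(service, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (prio, next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to send, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

With a queue like this, a telnet keystroke that arrives behind a long
run of ftp and mail packets still goes out ahead of them, which is
presumably why the telnet users thought the network had been fixed.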
The other thing was to shoot the elephants. When a new packet arrived
and no buffer space was available, the output queues were scanned
looking for the biggest elephant (total byte count on all queues from
the same IP address) and its biggest packet was killed. Gunshots
continued until either the arriving packet got shot or there was enough
room to save it. It all worked gangbusters and the poor ftpers never
found out.
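
The elephant hunt above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the fuzzball implementation:
the Packet class, the admit() function, the byte-based capacity and the
out parameter are all hypothetical, and counting the arriving packet
toward its source's total is a guess at how the arriving packet could
end up being the one shot.

```python
from collections import defaultdict


class Packet:
    def __init__(self, src, size):
        self.src = src    # source IP address, as a string
        self.size = size  # length in bytes


def admit(queues, arriving, capacity, out=0):
    """Try to buffer `arriving` on queues[out]; return the packets shot.

    `queues` is a list of lists of Packet (the output queues) and
    `capacity` is the total buffer space in bytes (both hypothetical).
    """
    shot = []
    while True:
        used = sum(p.size for q in queues for p in q)
        if used + arriving.size <= capacity:
            queues[out].append(arriving)  # enough room: save it
            return shot
        # Total bytes per source across all queues. Assumption: the
        # arriving packet counts toward its own source's total.
        totals = defaultdict(int)
        totals[arriving.src] += arriving.size
        for q in queues:
            for p in q:
                totals[p.src] += p.size
        elephant = max(totals, key=totals.get)
        # Candidate victims: every packet from the elephant, queued or not.
        candidates = [(p, q) for q in queues for p in q if p.src == elephant]
        if arriving.src == elephant:
            candidates.append((arriving, None))
        victim, home = max(candidates, key=lambda c: c[0].size)
        shot.append(victim)
        if home is None:
            return shot  # the arriving packet itself got shot
        home.remove(victim)
```

The loop frees space one packet at a time, always at the expense of
whichever source currently holds the most buffer, so small interactive
flows are rarely the ones that lose packets.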
Dave
Perry E. Metzger wrote:
>"David L. Mills" <mills at udel.edu> writes:
>
>
>>Well, if your incident was during 1986-1988 and involved transit of
>>the NSFnet Phase-I backbone, I'm the perp. The NSFnet routers ran my
>>code, which was horribly overrun by supercomputer traffic. I found the
>>best way to deal with the problem was to find the supercomputer
>>elephants and shoot them. More is in a 1988 SIGCOMM Symposium
>>paper. More recently the USNO and NIST time servers are being overrun
>>with NTP traffic. See my recent PTTI paper at
>>www.eecis.udel.edu/~mills/papers.html.
>>
>>The NSFnet meltdown occurred primarily because the fuzzball routers
>>used smart interfaces that retransmitted when either an error occurred
>>or the receiver ran dry of buffers. The entire network locked up for a
>>time because all the buffers in all six machines filled up with
>>retransmit traffic and nothing could get in or out. As I recall, the
>>ARPAnet also had a similar problem with reassembly buffers.
>>
>>
>
>Interesting. Bellcore switched from a 56k link to the IMP at Columbia
>to NSFnet towards the end (latter half?) of that time, but I can't
>remember if the horrible congestion was before or after our switch.
>
>Either way, though, it was pretty shortly thereafter that I remember
>getting my first replacement .o files with yummy new TCP congestion
>control algorithms in them.
>
>Perry
>
>