[ih] Somebody probably asked before - Trying to remember early net routing collapse
Jack Haverty
jack at 3kitty.org
Mon Mar 20 17:58:07 PDT 2023
I don't remember the details, but in the early days there were frequent
battles between the "core" gateways and the "research" gateways. We
(BBN) were supposed to keep the core running 24x7. But lots of people
wanted to build gateways and try out ideas, and the gateway protocols
(basically at the time just exchange of routing tables) were sometimes
corrupted with nonsensical information.
Often incidents were simply caused by bugs in someone's experimental
code. But not always. One case I sort of remember was when someone
decided to put a new circuit between 2 university sites because they
wanted faster service for their frequent file transfers. It seemed (to
them) like an obvious thing to do. So they ordered a circuit from the
phone company and plugged it into the routers at each site. But because
of the topology of the overall network, that new circuit became the
"shortest path", simply because it was the fewest number of hops, for a
lot of network traffic. That could very easily have routed a lot of
traffic through a Fuzzball, and resulted in those two sites actually
experiencing much slower service.
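A toy sketch of that effect (made-up topology and names; the actual
gateways didn't run Python, of course): under fewest-hop routing, one
new edge link can become the preferred path for backbone transit.

    from collections import deque

    def fewest_hops(graph, src, dst):
        """Breadth-first search: returns a fewest-hop path."""
        queue, seen = deque([[src]]), {src}
        while queue:
            path = queue.popleft()
            if path[-1] == dst:
                return path
            for nxt in graph[path[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])

    # Core sites A-B-C-D-E in a line; site U1 hangs off A, U2 off E.
    graph = {"A": ["B", "U1"], "B": ["A", "C"], "C": ["B", "D"],
             "D": ["C", "E"], "E": ["D", "U2"], "U1": ["A"], "U2": ["E"]}
    print(fewest_hops(graph, "A", "E"))   # ['A', 'B', 'C', 'D', 'E']

    # The two university sites order their own circuit:
    graph["U1"].append("U2")
    graph["U2"].append("U1")
    print(fewest_hops(graph, "A", "E"))   # ['A', 'U1', 'U2', 'E']

Transit traffic between A and E now flows through the two edge
machines, which is exactly what those two sites didn't want.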
Network behavior is often counter-intuitive....
Such "incidents" were the motivation, circa 1982, for creating EGP and
the notion of Autonomous Systems (see RFC 827). EGP provided a means
for putting a sort of "firewall" between different parts of the Internet
-- assuming you could figure out exactly how to filter routing
information as it entered "your" Autonomous System to create your own
protective firewall. We needed such a mechanism in order to keep
the "core" running while all sorts of experiments occurred in other
parts of the Internet.
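A minimal sketch of that filtering idea (hypothetical AS names,
prefixes, and policy; real EGP per RFC 827 carried more state than
this): accept from a neighbor only the routes it is authorized to
announce, and drop everything else at the boundary.

    # Per-neighbor whitelist: which networks we will believe from whom.
    ACCEPT_FROM = {
        "AS-research-1": {"10.2.0.0/16"},
        "AS-research-2": {"10.3.0.0/16"},
    }

    def filter_update(neighbor, advertised, routing_table):
        """Install only routes this neighbor may legitimately announce."""
        allowed = ACCEPT_FROM.get(neighbor, set())
        for prefix, metric in advertised:
            if prefix in allowed:
                routing_table[prefix] = (neighbor, metric)
            # Anything else -- including a bogus claim to reach
            # everything -- dies at the AS boundary.

    table = {}
    filter_update("AS-research-1",
                  [("10.2.0.0/16", 3), ("0.0.0.0/0", 1)],  # bogus default
                  table)
    print(table)   # {'10.2.0.0/16': ('AS-research-1', 3)}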
With EGP in place, the various research efforts were then expected to
experiment and develop some kind of next-generation routing scheme
involving more appropriate metrics than just "hops" -- at least using
transit time as a metric as the ARPANET had been doing, and perhaps
introducing other metrics such as available bandwidth, or constraints
based on policies such as only carrying certain kinds of data through
particular networks. AFAIK, that never happened and hops are still the
basis for routing.
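For concreteness, a sketch of what a delay metric changes (the link
delays are invented): a shortest-path computation over transit times
can prefer a three-hop path of fast links to a two-hop path of slow
ones, which pure hop counting never will.

    import heapq

    def lowest_delay(graph, src, dst):
        """Dijkstra over weighted links: returns (total_delay, path)."""
        heap, seen = [(0, src, [src])], set()
        while heap:
            cost, node, path = heapq.heappop(heap)
            if node == dst:
                return cost, path
            if node in seen:
                continue
            seen.add(node)
            for nxt, w in graph[node].items():
                if nxt not in seen:
                    heapq.heappush(heap, (cost + w, nxt, path + [nxt]))

    # Delays in ms: A-X-D is two slow hops, A-B-C-D is three fast ones.
    delays = {"A": {"X": 400, "B": 20}, "X": {"A": 400, "D": 400},
              "B": {"A": 20, "C": 20}, "C": {"B": 20, "D": 20},
              "D": {"X": 400, "C": 20}}
    print(lowest_delay(delays, "A", "D"))   # (60, ['A', 'B', 'C', 'D'])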
I'm pretty sure I wrote about this at some point years ago in the
internet-history discussions. Perhaps ChatGPT or one of its friends can
find it.....
Jack Haverty
On 3/20/23 17:11, Karl Auerbach via Internet-history wrote:
> I am sure this has been discussed, but I can't seem to find it...
>
> I vaguely remember a story involving some of Dave Mills' machines and
> a memory error in IMPs or some other switching device that caused all
> of the net's traffic to be forwarded through one struggling Fuzzy* or
> PDP-11/03.
>
> Could someone give me a pointer?
>
> I once did something similar - back when we were using flood-and-prune
> routing for IP multicast, I was working at a site where our inbound
> link was a T-1. Our internal net had several Cisco routers [2500
> series] all chatting away with DVMRP [the flood-and-prune multicast
> routing protocol of that era]. Anyway, while I was setting up one of
> our internal 25xx routers I had not yet finished setting up the IP
> unicast routing. But that didn't stop my partially configured router
> from chatting away with IGMP and DVMRP; it merely meant that that
> router could not send the "prune, please stop sending me traffic!"
> message.
>
> So that router eventually ended up at the end of every IP multicast
> "flood" that was active on the MBone but without a way of saying
> "stop, please stop!". Our poor T-1 saturated. I learned to not
> enable IP multicast via DVMRP until my unicast routing was stable.
> (We eventually moved onto PIM for multicast routing.)
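>
> A minimal sketch of that failure mode (hypothetical router names, not
> our actual config): under flood-and-prune, a router stays on the
> distribution tree until it sends a prune, so a router that *can't*
> prune keeps receiving every group.
>
>     def still_flooded(routers, wants_group, can_send_prune):
>         """Who keeps getting traffic once prunes are processed?"""
>         stuck = set()
>         for r in routers:
>             if wants_group.get(r, False):
>                 stuck.add(r)        # has interested members: stays
>             elif not can_send_prune.get(r, True):
>                 stuck.add(r)        # can't say "prune": stays anyway
>         return stuck
>
>     routers = ["rtr-a", "rtr-b", "rtr-broken"]
>     wants = {"rtr-a": True, "rtr-b": False, "rtr-broken": False}
>     can_prune = {"rtr-a": True, "rtr-b": True, "rtr-broken": False}
>     print(still_flooded(routers, wants, can_prune))
>     # {'rtr-a', 'rtr-broken'} -- the half-configured box drinks the flood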
>
> --karl--
>
>