[ih] Interop as part of Internet History (was Re: Fwd: Fwd: List archives (Was: Exterior Gateway Protocol))
Karl Auerbach
karl at cavebear.com
Sat Sep 12 15:27:38 PDT 2020
For some reason my post never made it onto the internet history list
itself. Odd. I hope it shows up eventually.
Seventeen minutes for things to settle down after a route flap? You made
a good prediction! By today's standards that's a long time. But it's a
whole lot better than the hours it often took in earlier days.
(BTW, our internet architecture is lacking a layer, often called
"association" or "session", that would lie on top of the transport and
would allow fast application-level healing despite failures of transport
connections and establishment of replacement connections (such as would
be common in mobile situations). It turns out that such a layer would
be extremely lightweight and not add a meaningful amount of overhead.
But because that was an ISO/OSI idea (but done badly) the Internet
engineering community has not picked up on it.)
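To make the association/session idea a bit more concrete, here is a
minimal in-memory sketch in Python. It is only an illustration under
assumptions of mine, not any real protocol or OSI design: the names
Session, Receiver and FlakyTransport are hypothetical, and the
"transport" is a toy object that fails after a fixed number of sends.
The point is that the session id and sequence numbers outlive any one
transport connection, so a replacement connection can pick up where
the old one died.

```python
class Receiver:
    """Peer-side state, keyed by session id rather than by connection."""

    def __init__(self):
        self.next_seq = {}   # session id -> next expected sequence number
        self.received = {}   # session id -> payloads accepted, in order

    def deliver(self, session_id, seq, payload):
        expected = self.next_seq.get(session_id, 0)
        if seq == expected:
            self.received.setdefault(session_id, []).append(payload)
            self.next_seq[session_id] = expected + 1
        # Anything other than the next expected message is ignored.
        # The return value acts as a cumulative acknowledgement.
        return self.next_seq.get(session_id, 0)


class FlakyTransport:
    """Hypothetical stand-in for one transport connection that dies."""

    def __init__(self, receiver, fail_after):
        self.receiver = receiver
        self.remaining = fail_after

    def send(self, session_id, seq, payload):
        if self.remaining <= 0:
            raise ConnectionError("transport connection lost")
        self.remaining -= 1
        return self.receiver.deliver(session_id, seq, payload)


class Session:
    """Sender-side session that survives transport replacement."""

    def __init__(self, session_id, connect, max_reconnects=10):
        self.session_id = session_id
        self.connect = connect            # callable returning a new transport
        self.transport = connect()
        self.next_seq = 0
        self.unacked = []                 # (seq, payload) not yet acknowledged
        self.reconnects = 0
        self.max_reconnects = max_reconnects

    def send(self, payload):
        self.unacked.append((self.next_seq, payload))
        self.next_seq += 1
        self._flush()

    def _flush(self):
        while self.unacked:
            seq, payload = self.unacked[0]
            try:
                acked = self.transport.send(self.session_id, seq, payload)
            except ConnectionError:
                if self.reconnects >= self.max_reconnects:
                    raise
                # Heal: open a replacement connection, keep session state.
                self.transport = self.connect()
                self.reconnects += 1
                continue
            # Drop everything the peer has cumulatively acknowledged.
            self.unacked = [(s, p) for (s, p) in self.unacked if s >= acked]


def demo():
    receiver = Receiver()
    session = Session("s1", lambda: FlakyTransport(receiver, fail_after=2))
    for word in ["a", "b", "c", "d", "e"]:
        session.send(word)
    return receiver.received["s1"], session.reconnects
```

In this toy run each connection dies after two messages, yet the
application just calls send() and the data arrives in order. The extra
state is one id, one counter and a short retransmit queue.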
I worked at Wells Fargo for several years helping them to move into the
"modern" age of computers and networks (circa 1982). And, like your
experience, down time meant serious money. It was amazing how much
money sloshes through a large bank, especially around 3am when they have
to meet their reserves requirements - it was a mad time of buying and
selling huge blocks of money to cover the reserve requirements for
the next 24 hours. If they missed, or if they couldn't get things to
reconcile - then the regulations would shut the whole thing - the whole
bank - down almost immediately.
When I got into networking - around 1972 - I was working first on a
project that closely resembled the movie "War Games" (but with real
missile launches) and then I moved on to do secure network research for
the US Joint Chiefs and later, for an unmentionable three-letter-agency
just south of Baltimore. So we were not only concerned with failure, we
anticipated it in the most nuclear-blast-like of forms. Our reliability
and response time requirements were based, literally, on whether the US
could make and deploy a timely "launch/not-launch" decision. It was
scary stuff.
The Interop net was unique in many ways. The time pressure to install
was immense, the pressure to keep it alive was heavy, and something
nobody ever mentions - we had to tear it down and get it onto trucks
back home within a few hours. We had to invent a lot of things - and
Dan let us have the leeway to experiment, sometimes destructively or not
strictly within the limits of "the rules", to get the net up. (It also
helped that we put our often large bar tab on Dan's hotel room bill.)
Sometimes we got downright brutal. For instance, just before opening of
the first show in San Jose we had an utterly critical box that needed to
be cabled up - but the access hole was too small. So ten minutes before
the doors opened Alex Latzko and I pulled out hammers and proceeded to
pound the beejeebers out of the existing box's hole. We metal fatigued
the steel and got the cables in just as the doors opened. It was ugly,
and we destroyed the box, but it worked.
We initially had a lot of trouble with the house electricians. At first
they didn't mind us hanging our yellow hose Ethernet, but they felt that
if it got in their way they could simply cut it and splice it back
together. We later had troubles with telco people cutting and splicing
our carefully balanced long DSL runs across the main rooms of convention
centers. It came to a head when union people insisted that we could not
touch our own fiber optic plant. I can't remember the details of how we
resolved that in the short term - I had suggested arguing that fiber
optics are light "pipes" and that the proper union would be the plumbers
not the electrical workers. In the long term we trained a lot of them
about the right way to do things. In New York we had to supplement that
with thick wads of $20 and $50 bills.
I am very concerned that today's networks are very difficult to diagnose
and repair. My grandfather was a radio repair guy and my father had a
business repairing TVs that nobody else could fix. I kind of inherited
those genes. One could tell a good TV guy just by looking at his
toolkit. For example, fixing one of those vacuum tube TVs required a
lot of turning of coils and capacitors via screws on the back and
looking at how the picture changed. A good TV guy had a little mirror that
could be propped up in front of the TV so that the picture could be
viewed while turning the screws. The bad TV guy kept running between
the front and the back.
Our tools to detect problems and to make tests are primitive. Back in
the mid 1990s I built a tool called "Dr. Watson, The Network
Detective's Assistant" - it was the first internet butt-set designed to
get a repair person up and running within a few seconds. It worked well
but I was not a good manager of my company and it died (parts of the
tool were picked up by Fluke instruments). That tool was intended
ultimately to be part of a much larger pathology-analysis system based
on some of our medical systems.
At Cisco I worked on a DARPA backed project to do "smart networks" - I
was beginning to instrument routers so that they could detect when they
began to wobble beyond limits established by models. That detection
would feed back into the models - often revealing incorrect
configuration, a degradation, a failure, or, interestingly, a security
penetration. That work got short-circuited when I was elected to the
ICANN board and that absorbed all of my time. That work has never
continued but it needs to be resurrected.
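As a rough illustration of "wobble beyond limits established by
models", here is a small Python sketch. It is not the Cisco/DARPA
work, whose details are not given here. It assumes the simplest
possible model, a running mean and standard deviation of one metric,
and the names BaselineModel, tolerance and warmup are hypothetical.
Samples inside the limits are fed back to refine the model. Samples
outside are flagged and deliberately kept out of the baseline.

```python
import math


class BaselineModel:
    """Running mean/variance model of one metric (Welford's method)."""

    def __init__(self, tolerance=3.0, warmup=5):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # sum of squared deviations from the mean
        self.tolerance = tolerance  # allowed wobble, in standard deviations
        self.warmup = warmup        # samples needed before limits exist

    def limits(self):
        """Return (low, high), or None while the model is still warming up."""
        if self.n < self.warmup:
            return None
        std = math.sqrt(self.m2 / (self.n - 1))
        return (self.mean - self.tolerance * std,
                self.mean + self.tolerance * std)

    def observe(self, value):
        """Return True if value is outside the limits, else learn from it."""
        lim = self.limits()
        if lim is not None and not (lim[0] <= value <= lim[1]):
            return True             # anomaly: do not fold into the baseline
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        return False


def flag_anomalies(samples):
    """Return the indexes of samples that fall outside the model's limits."""
    model = BaselineModel()
    return [i for i, v in enumerate(samples) if model.observe(v)]
```

For example, flag_anomalies([10, 11, 9, 10, 12, 10, 11, 50, 10])
flags only the sample at index 7. A real system would need far richer
models than this, and would still have to decide whether a flag means
misconfiguration, degradation, failure or penetration.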
I am extremely interested in adopting methods from biology into
networking. Living systems are amazingly robust. The key is that they
have, over time and evolution, acquired layers of responses to
situations. By comparison our computers and networking systems are
extremely brittle and generally incorporate only one response to any
situation. I really want to change that but to do so will require a
massive change in our mental approach to networking as well as breaking
some honored traditions, such as strict separation between layers of
abstraction/protocols.
With my lawyer hat on I keep wondering when the ax of liability is going
to fall on network operators. I gave a talk at NANOG last year about
issues of carrier liability and how we ought to change our approach to
engineering of the internet to make it more robust (based, somewhat, on
lessons from biology. ;-)
Here's the video/transcript of that talk:
https://blog.iwl.com/blog/keynote-at-nanog-77-by-cto-karl-auerbach
--karl--