[ih] Failures of the early Internet
Jack Haverty
jack at 3kitty.org
Fri Jan 19 18:53:19 PST 2024
Hi Vint,
By the time that incident occurred, the Arpanet was more than a decade
old, and the software had undergone lots of changes as the net got
bigger and the internal algorithms were changed with experience. I'm
not sure it still behaved then exactly as you described.
I don't remember (and may never have heard) exactly what the IMP behavior
(or the OS "Internet Status" app behavior) was, apart from a comment from
someone in the NOC that the net was setting up and tearing down way more
connections than usual. Perhaps the app was doing a TCP "echo" as a probe?
Whatever it did, the behavior was traced to the workstations and then to
the new OS release.
Within BBN, I was "the Internet guy", and the Internet was just some
Arpa Experiment at that point. So I got the flak when it became clear
that "the Internet" was responsible for the Arpanet crisis. That
incident probably didn't help when the next request came in from someone
who wanted to use Arpanet "uncontrolled packets" for their Internet
experiments!
Fun times...
Jack
On 1/19/24 18:30, Vint Cerf wrote:
> Jack,
> as I recall, for single packet messages, there was no set up because
> the data for the packet went with the first packet.
> If multiple packets were needed the first packet carried the first
> packet's worth of data but a multipacket set-up occurred.
> The packets of a multipacket message did not go on a virtual or fixed
> real path. However, a multipacket message did have to assure that
> reassembly space was available before the remaining packets of a
> multipacket message were sent. I seem to remember an exchange of the
> form, "get a block" "got a block" if you needed multipacket reassembly
> space.
>
> so I am not sure that single packet floods would have caused a setup
> delay/congestion unless the ping messages were longer than a single
> 1008 byte packet?
> v
>
>
> On Fri, Jan 19, 2024 at 9:20 PM Jack Haverty via Internet-history
> <internet-history at elists.isoc.org> wrote:
>
> On 1/19/24 16:00, Karl Auerbach via Internet-history wrote:
> > (I've never felt that I have an adequate understanding of the early
> > routing failures and their effects.)
>
> OK, I'll jump in.....
>
> It was painfully easy for routing problems to occur. All one had to do
> was advertise to a neighboring router that you were the best route to
> everywhere. A simple bug could do the job. Word would quickly spread,
> and all traffic would head your way, which sometimes made it impossible
> to connect to the offending router to try to fix the problem. IIRC,
> something like that was what the Fuzzballs occasionally did.
>
> Another incident I recall was also a routing issue. I don't remember
> exactly where it happened, but two sites, universities IIRC, were
> collaborating on some research project and had a need to send data back
> and forth. Their pathway to each other through the Internet was
> somewhat long and often congested. So they decided to fix the problem
> by installing a circuit directly between their two campuses' routers.
>
> Money was of course an issue, but they found the funds to pay for a 9.6
> kb/s line. They were surprised to observe that the added line only made
> things worse. File transfers took even longer than before. Of course
> their change to the topology of the Internet had unexpectedly made their
> 9.6 kb/s line the best route for all sorts of Internet traffic unrelated
> to their project.
>
> Many of the incidents I remember were caused by the routing algorithms
> which were based on "hops" rather than on time (as had been the case in
> the Arpanet for a decade or more). This was a well-known problem which
> I think was part of the motivation for Dave Mills to create the NTP
> machinery. In addition to routing, there were other Internet mechanisms
> that depended on time, but had necessarily been implemented
> "temporarily" until good time mechanisms were available. For example,
> the TTL (Time To Live) and TOS (Type Of Service) values in IP were
> supposed to provide the routers with information to route IP datagrams
> over the most appropriate route, or quickly discard them if there was no
> expectation they could possibly get to their destination in time to
> still be useful.
>
> Dave worked hard to get Time as an inherent element of The Internet, and
> our expectation was that TCP and IP software throughout the Internet
> would be changed to make decisions based on Time rather than Hops. I'm
> not sure if that ever happened. The Internet now knows what time it
> is, but does networking software today ever look at its watch?
>
> Another incident I recall was not an Internet failure, but rather a
> situation where the Internet terrorized the Arpanet.
>
> The Arpanet was touted as a "packet network", but in reality it was a
> virtual circuit network, using packets internally. There were lots of
> mechanisms inside the Arpanet IMPs to make all user traffic travel to
> its destination intact and in the same order it was sent. The network
> was designed to match the typical usage patterns of the era - people
> connected to some computer somewhere on the Arpanet, did their work, and
> disconnected minutes or even hours later. Inside the Arpanet, the
> mechanisms to set up virtual circuits consumed resources and took time,
> but with sessions lasting minutes or hours the impact was tolerable.
>
> One day the Arpanet was having problems and response times were
> noticeably slower than usual. Investigation revealed that the Arpanet
> was flailing, constantly setting up and tearing down virtual circuits,
> each of which was only lasting for a second or two. The Arpanet NOC
> (down the hall from my office) was in crisis.
>
> Eventually the problem was traced down to a new release of OS software
> (BSD, IIRC) that had just been posted on the Arpanet, and was being
> installed in the large numbers of workstations (Sun, IIRC) that had
> started appearing on the Internet. The new OS release included a new
> tool to advise its users of the current status of the Internet. It
> accomplished that by "pinging" every router every few minutes to see if
> that router was up and responsive.
>
> Pinging involved sending a single datagram, and receiving a single
> datagram in response. But each such datagram required the Arpanet to set
> up a virtual circuit to carry that traffic. With lots of OSes and lots
> of routers now scattered around the Arpanet, the network was trying to
> do something it was never designed to do. As more workstations loaded
> the new OS release, the problem only got worse.
>
> Although this wasn't an "Internet failure", it was a system failure,
> caused by the Internet. Administrative action suppressed the problem
> and as the Arpanet was decommissioned the problem disappeared. Or
> perhaps moved somewhere else?
>
> Anybody else have recollections of early failures...?
>
> Jack Haverty