[ih] early networking: "the solution"

Jack Haverty jack at 3kitty.org
Sun Apr 21 15:14:00 PDT 2024


Probably not many people know the story behind the IP checksum.   I 
don't think anyone's ever written it down.  While I still remember...:

The checksum algorithm was selected not for its capabilities to catch 
errors, but rather for its simplicity for our overworked and inadequate 
computing power.  There was significant concern at the time, especially 
in the sites running the big host computers, about the use of scarce 
computing power as "overhead" involved in using the network.  See for 
example: https://www.rfc-editor.org/rfc/rfc425

Besides, at the time all TCP traffic was through the Arpanet, and the 
IMPs did their own checksums so any circuit problems would be caught 
there.  So as we were defining the details of the new TCP4 mechanisms, 
the checksum algorithm was kept intentionally simple, to be replaced in 
some future version of TCP when computers would be more capable and the 
error characteristics of pathways through the Internet were better 
understood by experience.   The checksum algorithm was a placeholder for 
a future improved version, like many other mechanisms of TCP/IP4.
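
[Editorial illustration] To show how little computation is involved, here 
is a minimal sketch of the Internet checksum in the form later documented 
in RFC 1071: the 16-bit ones' complement of the ones' complement sum of 
the data, taken 16 bits at a time.  The function name and the choice of 
Python are assumptions made for illustration; this is not a 
reconstruction of any 1979 implementation.

```python
def internet_checksum(data: bytes) -> int:
    """Sketch of the 16-bit ones' complement checksum (RFC 1071 style)."""
    # Pad odd-length input with a zero byte so it splits into 16-bit words.
    if len(data) % 2:
        data += b"\x00"

    total = 0
    for i in range(0, len(data), 2):
        # Treat each byte pair as one big-endian (network order) word.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above bit 15 back into the low 16 bits
    # (this is the "end-around carry" of ones' complement addition).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the ones' complement of the folded sum.
    return ~total & 0xFFFF
```

A receiver can run the same routine over the data with the checksum 
appended; a result of zero means no error was detected.  The whole thing 
is one add per 16-bit word plus a final fold, which is about as cheap as 
an error check gets.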

The actual details of the checksum computation were nailed down on 
January 27, 1979.  That was the date of the first TCP Bakeoff, organized 
by Jon Postel.   I think of it as possibly the first ever "Hackathon".

The group of TCP implementers assembled on a weekend at USC-ISI and 
commandeered a bunch of offices with terminals that we could use to 
connect to our computers back home.   At first, we could all talk to 
ourselves fine.   However, no one could talk to any other 
implementation.  Everybody was getting checksum errors.

Since we could all hear each other, a discussion quickly reached a 
consensus.   We turned off the checksum verification code in all of our 
implementations, so our TCPs would simply assume every incoming 
message/packet/datagram/segment (you pick your favorite term...) was 
error-free.

It seems strange now, but computing in the 1970s was a lot different 
from today.  In addition to the scarcity of CPU power and memory, there 
was little consensus about how bits were used inside of each computer, 
and how they were transferred onto wires by network interface hardware.  
Computers didn't agree on the number of bits in a byte, how bytes 
were ordered into computer words, how arithmetic calculations were 
performed, or how to take the bits in and out of your computer's memory 
and transfer them serially over an I/O interface.  If you think the 
confusion of today's USB connectors is bad, it was much worse 50 years ago!

Danny Cohen later published a great "plea for peace" that reveals some 
of the confusion - see https://www.rfc-editor.org/ien/ien137.txt

So it wasn't a surprise that each TCP implementer had somehow failed in 
translating the specification, simple as it was, into code.
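
[Editorial illustration] A small sketch of how the same bytes can yield 
different sums depending on how a machine groups them into words.  This 
is purely hypothetical: the helper names and the sample data are 
assumptions, and it is not a record of what actually went wrong in any 
implementation at the Bakeoff.

```python
def fold16(total: int) -> int:
    """Fold carries above bit 15 back in (ones' complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def sum_as_words(data: bytes, byteorder: str) -> int:
    """Sum the data as 16-bit words, read in the given byte order."""
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], byteorder)
    return fold16(total)


def sum_as_bytes(data: bytes) -> int:
    """Sum the data one 8-bit byte at a time (a misreading of the spec)."""
    return fold16(sum(data))


# Hypothetical even-length sample data, chosen only for illustration.
sample = bytes.fromhex("0001f203f4f5f6f7")

big_endian_sum = sum_as_words(sample, "big")
little_endian_sum = sum_as_words(sample, "little")
bytewise_sum = sum_as_bytes(sample)
```

The three readings give three different numbers for identical input.  The 
two word-based sums are byte-swapped versions of each other, so they can 
still agree on the wire if each is stored in its own native order; the 
byte-at-a-time sum is simply a different value.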

Disabling the checksums let us debug all this and slowly (it took 
two days IIRC) get implementations to talk to other implementations.  
Then we re-enabled checksumming and tried all the tests again.  TCP4 
worked!  Jon Postel took on the task of figuring out how the now-working 
checksums were actually doing the computations and revised the 
specifications accordingly.   Rough consensus and running code had 
failed; instead we had running code and then rough consensus.

My most memorable recollection of that weekend was late on Sunday. Jon 
had set up the Bakeoff with a "scoring scheme" which gave each 
participant a number of points for passing each test.   His score rules 
are here: 
https://drive.google.com/file/d/1NNc9tJTEQsVq-knCCWLeJ3zVrL2Xd25g/view?usp=sharing

We were all getting tired, and Bill Plummer (Tenex TCP) shouted down the 
hall to Dave Clark (Multics TCP) -- "Hey Dave, can you turn off your 
checksumming again?"  Dave replied "OK, it's off".  Bill hit a key on 
his terminal.  Dave yelled "Hey, Multics just crashed!"  Bill gloated 
"KO! Ten points for me!"

Such was how checksumming made it into TCP/IP4.

Jack Haverty



On 4/21/24 12:27, John Day via Internet-history wrote:
> So I wasn’t dreaming!  ;-)
>
> CRCs also have problems in HDLC if there are a lot of 1s in the data.  (The bit stuffing is not included in the checksum calculation.)
>
>> On Apr 21, 2024, at 15:22,touch at strayalpha.com  wrote:
>>
>> I think it was this one:
>> http://ccr.sigcomm.org/archive/1995/conf/partridge.pdf
>>
>> Joe
>>
>>>> Dr. Joe Touch, temporal epistemologist
>> www.strayalpha.com
>>
>>> On Apr 21, 2024, at 12:20 PM, Scott Bradner via Internet-history<internet-history at elists.isoc.org>  wrote:
>>>
>>> maybe in conjunction with the Pac Bell NAP
>>>
>>> https://www.cnet.com/tech/mobile/pac-bell-adds-network-access/
>>>
>>> https://mailman.nanog.org/pipermail/nanog/1998-March/127113.html
>>>
>>> Scott
>>>
>>>> On Apr 21, 2024, at 3:00 PM, John Day<jeanjour at comcast.net>  wrote:
>>>>
>>>> I have a vague recollection of a paper (possibly by Craig Partridge) that talked about ATM dropping cells (and possibly other different forms of errors) and how IP and other protocols were not built to detect such losses.
>>>>
>>>> Am I dreaming?
>>>>
>>>> John
>>>>
>>>>> On Apr 21, 2024, at 09:10, Scott Bradner via Internet-history<internet-history at elists.isoc.org>  wrote:
>>>>>
>>>>> yes but...
>>>>>
>>>>> the ATM Forum people felt that ATM should replace TCP and most of IP
>>>>> i.e. become the new IP and that new applications should assume they were
>>>>> running over ATM and directly make use of ATM features (e.g., ABR)
>>>>>
>>>>> ATM as yet another wire was just fine (though a bit choppy)
>>>>>
>>>>> Scott
>>>>>
>>>>>
>>>>>
>>>>>> On Apr 21, 2024, at 9:02 AM, Andrew G. Malis<agmalis at gmail.com>  wrote:
>>>>>>
>>>>>> Scott,
>>>>>>
>>>>>> ATM could carry any protocol that you could carry over Ethernet, see RFCs 2225, 2492, and 2684.
>>>>>>
>>>>>> Cheers,
>>>>>> Andy
>>>>>>
>>>>>>
>>>>>> On Sat, Apr 20, 2024 at 8:15 PM Scott Bradner via Internet-history<internet-history at elists.isoc.org>  wrote:
>>>>>>
>>>>>>
>>>>>>> On Apr 20, 2024, at 8:11 PM, John Gilmore via Internet-history<internet-history at elists.isoc.org>  wrote:
>>>>>>>
>>>>>>> John Day via Internet-history<internet-history at elists.isoc.org>  wrote:
>>>>>>>> In the early 70s, people were trying to figure out how to interwork multiple networks of different technologies. What was the solution that was arrived at that led to the current Internet?
>>>>>>>> I conjectured yesterday that the fundamental solution must have been in hand by the time Cerf and Kahn published their paper.
>>>>>>>> Are you conjecturing that the solution was gateways? and hence protocol translation at the gateways?
>>>>>>> Maybe it's too obvious in retrospect.  But the "solution" that I see was
>>>>>>> that everyone had to move to using a protocol that was independent of
>>>>>>> their physical medium.
>>>>>> and ATM was an example of the reverse - it was a protocol & a network - OK
>>>>>> as long as you did not build applications that knew they were running over ATM
>>>>>> (or if ATM had been the last networking protocol)
>>>>>>
>>>>>> Scott
>>>>>> -- 
>>>>>> Internet-history mailing list
>>>>>> Internet-history at elists.isoc.org
>>>>>> https://elists.isoc.org/mailman/listinfo/internet-history
