[ih] Checksums in Host-Host protocol
Jack Haverty
jack at 3kitty.org
Sun Apr 20 10:45:08 PDT 2025
Hi Steve,
IIRC there were concerns other than the Engineering ones.
Computers in 1970 were big, expensive to purchase, and expensive to
operate. Projects using computers were charged by the "CPU-second" for
the users' computing, but time consumed by the OS was overhead, and its
costs spread across all users, along with the costs for power, HVAC,
floor space, and such.
ARPA was into "resource sharing" but organizations with computers
(mostly universities and research sites IIRC) were sometimes reluctant
to "share" their hard-won computing power. ARPA was paying the bills
though, so it could pressure them to connect to the ARPANET.
Still, there was concern to minimize that overhead in the OS, e.g., RFC
425, titled "But my NCP costs $500 a day...", written by Bob Bressler,
in Frank Heart's ARPANET group at BBN. (In 1977 I migrated from MIT to
Frank's group at BBN, to implement TCP for Unix)
It's easy to forget the computing world of 50+ years ago, now that we
have computers (aka "devices") lying around everywhere, much more
powerful than any machine on the ARPANET in the 1970s, but sitting idle
most of the time.
Even if the Engineering analysis showed a bit of additional OS overhead
was minimal, it was still a thorn in the sides of IT managers fifty
years ago.
But yes, I agree that the PDP-10 had some great instructions. I spent a
lot of time in the 70s writing assembler code in Lick's group at MIT, as
we all tried feverishly to make our PDP-10 do more than it was capable
of doing. We did things like using JFCL as the best NO-OP instruction,
simply because it was the fastest such instruction to execute. The
"official" NOP instruction took a bit longer. That and other such
techniques were considered common practice of us "bit twiddlers", all to
save a few CPU-microseconds.
BTW, the ARPANET IMP code is the most amazing example of such "bit
twiddling" that I have ever encountered. I did a "deep dive" into the
old IMP code as a consultant in a patent battle, to figure out exactly
how the IMP did some of the things that I remembered it did. I was
astonished at the techniques used to get every bit of power out of the
Honeywell minicomputer that was the core of the IMP, often by violating
today's rules of good program design. It's just what you did in those
days of too little CPU, too little memory, and everything too expensive.
Frank Heart's focus was on Engineering - getting the system to work. I
can still hear his voice as he argued for that. You knew he was serious
when the pitch of his voice approached ultrasonic.
Jack
On 4/19/25 20:14, Steve Crocker via Internet-history wrote:
> Alex,
>
> Thanks for your note. I understand the point you're making, but I don't
> think the effect would have been very large. I did a mental exercise to
> estimate the time to compute the checksum on PDP-10s. The PDP-10 was a 36
> bit machine, so it definitely would have required the extraction and
> shifting you're referring to. However, it also had a very powerful
> instruction set. One of its instructions was a Load Byte instruction. See
> http://pdp10.nocrew.org/docs/instruction-set/Byte.html . It uses a byte
> pointer which contains the size and offset of the byte, so the instruction
> takes 3 cycles. Once the byte is loaded into one of the several registers,
> it can be added to another register in one instruction, and because it does
> not need to load data from memory, I believe that Add instruction would
> take only one cycle. Hence, a total of 4 cycles to extract a byte and add
> it to a register.
>
> The machine has enough registers to permit assigning one for each of the
> four offsets, thus deferring the shifting to the end.
>
> A typical word will have three bytes. so it will take 12 cycles per 36 bit
> word. But we can do slightly better. Four words is 144 bits, which is 9
> 16-bit bytes. In two of those four words, there will be 32 bits that have
> two complete 16-bit bytes. These can be treated as a single byte in the
> inner loop and subdivided at the end. Therefore, within each group of four
> words, there will be only 10 Load Byte and Add instructions, i.e. 40 cycles
> for every four words, or 40/9 cycles per 16 bit byte.
>
> An 8000 bit message has 500 16-bit bytes, so the inner loop to add all the
> 16 bit-bytes is roughly 500 * 40 / 9 cycles, approximately 2222 cycles.
> Round this up to 2500 cycles to accommodate the loop management and the
> assembling the pieces at the end.
>
> According to chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/
> https://archive.computerhistory.org/resources/text/DEC/pdp-10/dec.pdp-10.the_evolution_of_the_DECsystem_10.1978.102630382.pdf
> , the cycle time on the KA-10 was 5 microseconds, so the computation time
> for the checksum over a full 8000 bit message would have been about 12.5
> ms. Double this to account for creating the checksum on the sending side
> and checking it on the receiving side, so 25 ms overall.
>
> For short messages, e.g. typical interactive messages, this figure would be
> *much* smaller.
>
> The Arpanet spec was to deliver a message from end to end in under a half
> second, i.e. 500 ms. Thus the checksum would have added a little bit of
> time, but only about 5%.
>
> Apologies if there are errors in the above. Corrections welcome.
>
> Steve
>
> On Sat, Apr 19, 2025 at 5:00 PM Alexander McKenzie via Internet-history <
> internet-history at elists.isoc.org> wrote:
>
>> Steve,
>>
>> It sounds so simple, but the devil is in the details. Given the plethora
>> of word sizes of the computers connected (or scheduled to be connected) to
>> ARPAnet in 1971, I would bet that for ANY checksum length proposed over 50%
>> of the Hosts would have to engage in mask and shift operations on every
>> word in a message in order to calculate even a checksum like the one you
>> describe. This would indeed have somewhat slowed the effective network
>> speed. Enough to matter? Who knows, but at that time maximum effective
>> bandwidth was of real concern to prospective users (remember the motivation
>> for the design of the Tinker-McClellan experiment). Recall that at that
>> time a majority of the communications community viewed packet switching as
>> foolish, and ARPAnet as an experiment about to fail. Yes, a simple
>> end-to-end checksum would have sometimes been of diagnostic help, but both
>> Frank Heart and Larry Roberts had a real reason to worry about anything
>> that would negatively affect perceived performance of this brand new
>> technology. Maybe we should have done checksumming for debugging in TELNET,
>> where performance was irrelevant, and left File Transfer alone.
>>
>> Cheers,
>> Alex
>>
>> On Friday, April 18, 2025 at 03:12:39 PM EDT, Steve Crocker via
>> Internet-history<internet-history at elists.isoc.org> wrote:
>>
>>
>> We tried to include a lightweight checksum in the original host-host
>> protocol. (Later it was called the Network Control Protocol or NCP. Same
>> protocol.) The checksum was designed to be reasonably easy to compute. It
>> was a 16-bit ones complement sum with one bit of rotation every thousand or
>> so bits. (The rotation was intended to catch packets out of order, error
>> which we imagined might be possible but never occurred.) Frank Heart
>> argued vehemently against it, saying it would make his network look slow.
>> I tried to push back and asked about the Host-IMP interface. "As reliable
>> as your accumulator," he roared.
>>
>> We removed the checkum from our design, a mistake I've rued ever since.
>> And, of course, it turned out there were indeed a few cases where it would
>> have made a difference. As has been pointed out, there was a major memory
>> error in one of the IMPs that caused that IMP to look like it was zero
>> distance to every IMP. But even before that error, when Lincoln Lab first
>> connected its host to its IMP, their hardware interface had a problem.
>> There was some crosstalk between the interface and the disk (or drum)
>> controller. When the disk (or drum) was operating at the same time as the
>> Host-IMP interface, some bits got scrambled. It apparently took them some
>> time to track down. I think they would have found it faster if the
>> checksum had been part of the design.
>>
>> Steve
>> --
>> Internet-history mailing list
>> Internet-history at elists.isoc.org
>> https://elists.isoc.org/mailman/listinfo/internet-history
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 665 bytes
Desc: OpenPGP digital signature
URL: <http://elists.isoc.org/pipermail/internet-history/attachments/20250420/beb65e91/attachment-0001.asc>
More information about the Internet-history
mailing list