[ih] Checksums in Host-Host protocol

Sat Apr 19 20:14:54 PDT 2025

Alex,

Thanks for your note.  I understand the point you're making, but I don't
think the effect would have been very large.  I did a mental exercise to
estimate the time to compute the checksum on PDP-10s.  The PDP-10 was a 36
bit machine, so it definitely would have required the extraction and
shifting you're referring to.  However, it also had a very powerful
instruction set.  One of its instructions was a Load Byte instruction.  See
http://pdp10.nocrew.org/docs/instruction-set/Byte.html .  It uses a byte
pointer which contains the size and offset of the byte, so the instruction
takes 3 cycles.  Once the byte is loaded into one of the several registers,
it can be added to another register in one instruction, and because it does
not need to load data from memory, I believe that Add instruction would
take only one cycle.  Hence, a total of 4 cycles to extract a byte and add
it to a register.

The machine has enough registers to permit assigning one for each of the
four offsets, thus deferring the shifting to the end.
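
As a rough illustration of deferring the shifts, here is a small Python sketch
(a model of the idea, not PDP-10 code).  It assumes the 16-bit bytes are packed
contiguously, left to right, into 36-bit words, so a byte that straddles a word
boundary has a leading piece that must eventually be shifted left by 4, 8 or
12 bits.  The names pack_words, load_byte, fold and checksum_deferred are
made up for this example, and the periodic one-bit rotation is left out.

```python
WORD_BITS = 36
BYTE_BITS = 16

def pack_words(values):
    """Pack 16-bit values contiguously into 36-bit words, zero-padded at the end."""
    bits = 0
    for v in values:
        bits = (bits << BYTE_BITS) | (v & 0xFFFF)
    total = len(values) * BYTE_BITS
    pad = (-total) % WORD_BITS
    bits <<= pad
    n_words = (total + pad) // WORD_BITS
    mask = (1 << WORD_BITS) - 1
    return [(bits >> (WORD_BITS * (n_words - 1 - i))) & mask for i in range(n_words)]

def load_byte(word, pos, size):
    """Loose analogue of Load Byte: `size` bits starting `pos` bits from the left."""
    return (word >> (WORD_BITS - pos - size)) & ((1 << size) - 1)

def fold(total):
    """Ones-complement fold: add any carry above bit 15 back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum_deferred(words):
    # One accumulator ("register") per final left shift; nothing is shifted in the loop.
    acc = {0: 0, 4: 0, 8: 0, 12: 0}
    for i, word in enumerate(words):
        start, pos = i * WORD_BITS, 0
        while pos < WORD_BITS:
            offset_in_byte = (start + pos) % BYTE_BITS
            size = min(BYTE_BITS - offset_in_byte, WORD_BITS - pos)
            shift = BYTE_BITS - offset_in_byte - size  # 0 unless the byte continues in the next word
            acc[shift] += load_byte(word, pos, size)
            pos += size
    # The shifting happens once, at the end.
    return fold(sum(value << shift for shift, value in acc.items()))

# Compare against a direct sum of the 16-bit values (500 of them = 8000 bits).
values = [(i * 40503) & 0xFFFF for i in range(500)]
assert checksum_deferred(pack_words(values)) == fold(sum(values))
```

The only shifts that ever arise under this packing are 0, 4, 8 and 12, which is
why four accumulators are enough.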

A typical word will have three bytes, so it will take 12 cycles per 36 bit
word.  But we can do slightly better.  Four words is 144 bits, which is 9
16-bit bytes.  In two of those four words, there will be 32 bits that have
two complete 16-bit bytes.  These can be treated as a single byte in the
inner loop and subdivided at the end.  Therefore, within each group of four
words, there will be only 10 Load Byte and Add instructions, i.e. 40 cycles
for every four words, or 40/9 cycles per 16 bit byte.
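
The count of 10 can be checked mechanically.  The Python sketch below assumes
the same contiguous left-to-right packing of 16-bit bytes into 36-bit words,
cuts each word at the 16-bit boundaries, and then fuses two adjacent complete
bytes into one 32-bit field.  The function names are invented for this
illustration.

```python
WORD_BITS = 36
BYTE_BITS = 16
CYCLES_PER_LOAD_AND_ADD = 4  # 3 for Load Byte + 1 for Add, per the estimate above

def fields_per_word(n_words=4):
    """For each word, list the (pos, size) fields obtained by cutting at 16-bit boundaries."""
    layout = []
    for i in range(n_words):
        start, pos, fields = i * WORD_BITS, 0, []
        while pos < WORD_BITS:
            size = min(BYTE_BITS - (start + pos) % BYTE_BITS, WORD_BITS - pos)
            fields.append((pos, size))
            pos += size
        layout.append(fields)
    return layout

def merge_adjacent_full_bytes(fields):
    """Fuse two adjacent complete 16-bit fields into a single 32-bit field."""
    merged = []
    for pos, size in fields:
        if merged and size == BYTE_BITS and merged[-1][1] == BYTE_BITS:
            prev_pos, _ = merged.pop()
            merged.append((prev_pos, 2 * BYTE_BITS))
        else:
            merged.append((pos, size))
    return merged

layout = fields_per_word()
plain_loads = sum(len(fields) for fields in layout)
merged_loads = sum(len(merge_adjacent_full_bytes(fields)) for fields in layout)

for i, fields in enumerate(layout):
    print("word", i, fields, "->", merge_adjacent_full_bytes(fields))
print("loads per four words:", plain_loads, "plain,", merged_loads, "merged")
print("cycles per 16-bit byte:", merged_loads * CYCLES_PER_LOAD_AND_ADD / 9)
```

Under this packing the first and last word of each group are the two that
contain a 32-bit run of two complete bytes.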

An 8000 bit message has 500 16-bit bytes, so the inner loop to add all the
16-bit bytes is roughly 500 * 40 / 9 cycles, approximately 2222 cycles.
Round this up to 2500 cycles to accommodate the loop management and
assembling the pieces at the end.

According to
https://archive.computerhistory.org/resources/text/DEC/pdp-10/dec.pdp-10.the_evolution_of_the_DECsystem_10.1978.102630382.pdf
, the cycle time on the KA-10 was 5 microseconds, so the computation time
for the checksum over a full 8000 bit message would have been about 12.5
ms.  Double this to account for creating the checksum on the sending side
and checking it on the receiving side, so 25 ms overall.

For short messages, e.g. typical interactive messages, this figure would be
*much* smaller.

The Arpanet spec was to deliver a message from end to end in under a half
second, i.e. 500 ms.  Thus the checksum would have added a little bit of
time, but only about 5%.
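
For anyone who wants to redo the arithmetic with different inputs, the whole
estimate fits in a few lines of Python.  Every constant below is taken from
the figures in this note, including the 5 microsecond cycle time, so changing
one of them shows how sensitive the 5% figure is.  The variable names are
just labels chosen for this sketch.

```python
MESSAGE_BITS = 8000
BYTE_BITS = 16
CYCLES_PER_GROUP = 40     # 10 Load Byte + Add pairs at 4 cycles each
BYTES_PER_GROUP = 9       # four 36-bit words hold nine 16-bit bytes
BUDGETED_CYCLES = 2500    # inner loop rounded up for loop management and final assembly
CYCLE_TIME_US = 5         # KA-10 cycle time as quoted in this note
DELIVERY_TARGET_MS = 500  # end-to-end delivery target

n_bytes = MESSAGE_BITS // BYTE_BITS
inner_loop_cycles = n_bytes * CYCLES_PER_GROUP / BYTES_PER_GROUP
one_side_ms = BUDGETED_CYCLES * CYCLE_TIME_US / 1000
both_sides_ms = 2 * one_side_ms  # compute at the sender, verify at the receiver
overhead_fraction = both_sides_ms / DELIVERY_TARGET_MS

print("16-bit bytes per message:", n_bytes)
print("inner loop cycles:", round(inner_loop_cycles))
print("one side (ms):", one_side_ms)
print("both sides (ms):", both_sides_ms)
print("share of delivery target:", overhead_fraction)
```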

Apologies if there are errors in the above.  Corrections welcome.

Steve

On Sat, Apr 19, 2025 at 5:00 PM Alexander McKenzie via Internet-history <
internet-history at elists.isoc.org> wrote:

>  Steve,
>
> It sounds so simple, but the devil is in the details.  Given the plethora
> of word sizes of the computers connected (or scheduled to be connected) to
> ARPAnet in 1971, I would bet that for ANY checksum length proposed over 50%
> of the Hosts would have to engage in mask and shift operations on every
> word in a message in order to calculate even a checksum like the one you
> describe. This would indeed have somewhat slowed the effective network
> speed.  Enough to matter? Who knows, but at that time maximum effective
> bandwidth was of real concern to prospective users (remember the motivation
> for the design of the Tinker-McClellan experiment).  Recall that at that
> time a majority of the communications community viewed packet switching as
> foolish, and ARPAnet as an experiment about to fail. Yes, a simple
> end-to-end checksum would have sometimes been of diagnostic help, but both
> Frank Heart and Larry Roberts had a real reason to worry about anything
> that would negatively affect perceived performance of this brand new
> technology. Maybe we should have done checksumming for debugging in TELNET,
> where performance was irrelevant, and left File Transfer alone.
>
> Cheers,
> Alex
>
> On Friday, April 18, 2025 at 03:12:39 PM EDT, Steve Crocker via
> Internet-history <internet-history at elists.isoc.org> wrote:
>
>
> We tried to include a lightweight checksum in the original host-host
> protocol.  (Later it was called the Network Control Protocol or NCP.  Same
> protocol.)  The checksum was designed to be reasonably easy to compute.  It
> was a 16-bit ones complement sum with one bit of rotation every thousand or
> so bits.  (The rotation was intended to catch packets out of order, an error
> which we imagined might be possible but never occurred.)  Frank Heart
> argued vehemently against it, saying it would make his network look slow.
> I tried to push back and asked about the Host-IMP interface.  "As reliable
> as your accumulator," he roared.
>
> We removed the checksum from our design, a mistake I've rued ever since.
> And, of course, it turned out there were indeed a few cases where it would
> have made a difference.  As has been pointed out, there was a major memory
> error in one of the IMPs that caused that IMP to look like it was zero
> distance to every IMP.  But even before that error, when Lincoln Lab first
> connected its host to its IMP, their hardware interface had a problem.
> There was some crosstalk between the interface and the disk (or drum)
> controller.  When the disk (or drum) was operating at the same time as the
> Host-IMP interface, some bits got scrambled.  It apparently took them some
> time to track down.  I think they would have found it faster if the
> checksum had been part of the design.
>
> Steve
> --
> Internet-history mailing list
> Internet-history at elists.isoc.org
> https://elists.isoc.org/mailman/listinfo/internet-history
>
