[ih] Yet another subject change: Testing (Was Re: Gateway Issue: Certification (was Re: booting linux on a 4004))
Karl Auerbach
karl at iwl.com
Thu Oct 3 11:31:57 PDT 2024
My grandfather was a radio repair guy; my father repaired TVs that
other repair people could not fix. So I grew up with my hands inside
electronics learning how to figure out what was going wrong and what to
do about it. (I also learned a lot about keeping my fingers clear of
high voltages - some day ask me about how the phrase "beating the
bounds" [with regard to land titles] came about, and yes, there is an
analogy to high voltage shocks.)
I've carried that family history (of repairing, not shocking) into the
land of networks.
I am extremely concerned, and I mean *extremely* concerned, that our
race to lock and secure things is slowly making it increasingly
difficult for us to monitor, diagnose, and repair the Internet (and the
increasing number of other important infrastructures that have become
intermeshed with the net.)
I wrote a note about this issue:
Is The Internet At Risk From Too Much Security
https://www.cavebear.com/cavebear-blog/netsecurity/
My experience with designing, deploying, and running the Interop show
networks taught me that we have few decent tools. I looked in awe at
the collection of well designed tools that AT&T guys (they were
always guys in that era) had dangling from their tool belts. So I
designed and sold the first Internet buttset - a tool to get one up and
running within seconds to do testing and evaluation of an IP (and
Netware) network. (The tool was "Dr. Watson, The Network Detective's
Assistant" - https://www.cavebear.com/archive/dwtnda/ . However, I was
learning about how to run a company at that time and I didn't watch,
much less control, what my marketing group was spending - so we went
under. I then helped Fluke pick up some of the remnant ideas for their
products.)
Anyway, I have been bothered at how few test points we build into
network software. Even one of the most fundamental - remote loopback -
is barely present in network equipment (yes, we have ICMP Echo/ping,
but that's rather primitive). And I've long worked with SNMP and MIBs.
(I wrote and implemented an alternative to SNMP and Netconf that I
thought was much more useful than either: KNMP at
https://www.iwl.com/idocs/knmp-overview )
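To make the remote-loopback point concrete: about the best a generic
IP host offers today, beyond ping, is the old UDP Echo service of RFC
862 - and even that is almost always switched off. A minimal sketch of
such a probe in Python (illustrative only; the payload and timeout are
my own assumptions):

    import socket

    def udp_loopback_probe(host, port=7, payload=b"loopback-test",
                           timeout=2.0):
        """Crude remote loopback check using the RFC 862 UDP Echo
        service: send a payload and see if the far end reflects it."""
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(timeout)
        s.sendto(payload, (host, port))
        try:
            data, _ = s.recvfrom(2048)
            return data == payload
        except socket.timeout:
            return None   # no echo service answered - the usual case

That this is roughly the state of the art is exactly my complaint.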
My wife (Chris Wellens) and wrote up a paper in 1996 titled "Towards
Useful Management" in which we made several proposals to improve our
means to monitor and test networks.
https://www.cavebear.com/docs/simple-times-vol4-num3.pdf
In the meantime Marshall Rose and my wife spun a new company,
Interworking Labs, out from the Interop company. The initial purpose
was to develop test suites for network protocols. (These suites still
exist and often reveal mistakes in network code. One of my favorites
is to repackage Ethernet frames that have short IP packets inside them.
The IP packet is put into an Ethernet frame that is larger than it
needs to be to hold that IP packet. (Some vendors have used that space
to do things like announcing license identifiers in the unused space
in an Ethernet frame after an ARP packet.) Far too much code uses the
Ethernet frame length rather than properly using the IP length fields -
bad things can happen as a result. And there is still code out there
that uses signed integer math on unsigned integer packet fields - so a
lot of code still wobbles if one tickles packets with numbers just
below or just above the point where that high-order bit toggles.)
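The frame-length mistake is easy to show. Here is a rough Python
sketch (the names and checks are mine, for illustration) of the right
way to pull an IPv4 packet out of a padded Ethernet frame:

    import struct

    def ip_packet_from_frame(frame: bytes) -> bytes:
        """Extract the IPv4 packet, trusting the IP Total Length
        field rather than the (possibly padded) frame length."""
        if len(frame) < 14 + 20:
            raise ValueError("frame too short")
        ethertype = struct.unpack("!H", frame[12:14])[0]
        if ethertype != 0x0800:
            raise ValueError("not IPv4")
        ip = frame[14:]
        # Total Length: an unsigned 16-bit field at IP header offset 2.
        total_len = struct.unpack("!H", ip[2:4])[0]
        if total_len < 20 or total_len > len(ip):
            raise ValueError("bad IP Total Length")
        # The buggy code I describe would just return all of `ip`,
        # silently treating any Ethernet padding as IP payload.
        return ip[:total_len]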
Jon Postel came up with a testing idea for the bakeoff test events we
had at places like FTP Software and ISI - a router that does things
wrong in a controlled way. A few years later Steve Casner and I were
working to develop a portable RTP/RTCP engine for entertainment grade
audio/video (on IP multicast); we longed for a device such as Jon's
"flakeway" because of the need to evaluate all of the potential race
conditions that can happen when running several related media streams in
real time.
So a few years later at Interworking Labs we started to develop Jon's
flakeway into a real tool. We called the line "Maxwell" after James
Clerk Maxwell's thought experiment about a daemon that could select and
control the flow of hot and cold particles, seemingly violating the laws
of Thermodynamics. It is still rather surprising how much code out
there wobbles (or worse) when faced with simple network behaviour such
as packet order resequencing (such as can happen when there are
parallel, load-balanced, or bonded network paths), or when packets are
accumulated for a short while and then suddenly released (as if a dam,
holding back a lake of packets, suddenly burst).
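At its heart the idea is almost embarrassingly simple. A toy Python
model of that dam-burst-plus-reorder behaviour (a sketch of the
concept, not our Maxwell code; the parameters are arbitrary):

    import random
    from collections import deque

    class Flakeway:
        """Hold packets back, then release them in a burst,
        sometimes reordered - misbehaviour in a controlled way."""
        def __init__(self, hold_max=8, reorder_prob=0.2):
            self.queue = deque()
            self.hold_max = hold_max
            self.reorder_prob = reorder_prob

        def offer(self, pkt):
            """Accept one packet; return a (possibly empty) list
            of packets to forward right now."""
            self.queue.append(pkt)
            if len(self.queue) < self.hold_max:
                return []               # keep building the lake
            burst = list(self.queue)
            self.queue.clear()
            if random.random() < self.reorder_prob:
                random.shuffle(burst)   # resequence the burst
            return burst                # the dam bursts

What surprises me is not that one can build this, but how much
deployed code falls over when it meets it.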
I have seen many network test suites that check that a protocol
implementation complies with the mandatory or suggested parts of RFCs.
Those are nice. But my concern is on the other side of the RFCs: what
about the DO NOT cases and the undefined cases, and what happens when
those situations arise?
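One concrete habit from that other side: when a header field is an
unsigned 16-bit integer, probe right around the spot where the
high-order bit toggles - that is where the signed-arithmetic bugs I
mentioned above like to hide. A tiny Python helper (illustrative):

    def boundary_values(bits=16):
        """Values just below and above the high-order-bit toggle,
        plus the extremes - the classic hiding places for code that
        does signed math on unsigned packet fields."""
        top = 1 << (bits - 1)            # 0x8000 for 16-bit fields
        return [0, top - 1, top, top + 1, (1 << bits) - 1]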
For instance, I remember Dave Bridgham (FTP Software) one afternoon
saying "You know, if I received the last IP fragment first I would have
information that let me do better receive buffer allocation." So he
changed the FTP Software IP stack to send last fragment first. It
worked. That is it worked until an FTP Software based machine was added
to a network running competitor Netmanage TCP/IP code. That latter code
simply up and died when it got the last fragment first.
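Dave's trick works because an IPv4 fragment carries its own offset,
and the final fragment is marked by a clear More Fragments flag - so
that one fragment tells you the whole datagram's size. A sketch of the
receive-side arithmetic (my own illustrative code):

    import struct

    def datagram_size_from_last_fragment(ip_header: bytes,
                                         fragment_payload_len: int):
        """If the final fragment (MF clear, nonzero offset) arrives
        first, its offset plus its payload length is exactly the
        reassembly buffer size needed."""
        flags_frag = struct.unpack("!H", ip_header[6:8])[0]
        mf = flags_frag & 0x2000            # More Fragments flag
        offset = (flags_frag & 0x1FFF) * 8  # offset, 8-byte units
        if mf == 0 and offset > 0:          # the last fragment
            return offset + fragment_payload_len
        return None                         # can't tell yet

And, as the Netmanage stack demonstrated, code on the other end may
never have been tested against that perfectly legal arrival order.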
And at a TCP bakeoff I had a tool to test ARP, a protocol that has many
knobs and levers that are rarely used. I managed to generate a
broadcast ARP packet that used some of those knobs and levers. That ARP
hit the router between our test networks and the host company's main
network - that router crashed, but before it did, it (for some reason)
propagated that ARP further along, causing every other (I believe
Proteon) router in the company to also crash.
We found a lot of things like that on the Interop show network. (I
usually got blamed because I was usually near, if not operating, the
device that triggered the flaws.) One of the worst was a difference in
opinion between Cisco and Wellfleet routers about how to expand IP
multicast packets into Ethernet frames (in particular, what group MAC
addresses to use), resulting in an infinite IP multicast routing loop
across the show net - every load LED on every one of our
hundreds of routers and switches turned red. (And, of course, all
fingers pointed at me. ;-)
The Interop show net was a wonderful place to discover flaws in protocol
standards and implementations. One of our team members (who I believe
is on this list) found a flaw in the FDDI standard. I have a memory of
companies reworking their code and blasting new firmware overnight in
their hotel rooms.
The point of this long note is that the state of the art of testing
Internet protocol implementations is weak. It's not an exciting
field, QA people are not honored employees, and as more and more
people believe (often quite wrongly) that they can write code, we are
actually moving
backwards in some regards.
In addition, we do not adequately consider monitoring, testing, and
repair in our work defining protocols.
In 2003 I gave a long talk with a title that is now a bit misleading:
From Barnstorming to Boeing –
Transforming the Internet Into a Lifeline Utility.
(The slides are at
https://www.cavebear.com/archive/rw/Barnstorming-to-Boeing-slides.pdf
and the speaker notes at
https://www.cavebear.com/archive/rw/Barnstorming-to-Boeing.pdf )
(One of my suggestions was the imposition of legal, civil tort,
liability for network design, implementation, and operational errors -
using a negligence standard so that simple mistakes would not incur
liability. Wow, the groans from the audience were quite loud.)
I had other suggestions as well - such as design rules and operational
practices that must be followed unless the person looking to deviate
could express a compelling, cogent, argument why deviation is
appropriate. This is the norm in many engineering disciplines, but
not for software, where we are largely still in the anything-goes
wild west.
By-the-way, I have over the years been working on ideas to advance our
testing/repair capabilities.
One piece that we are missing is a database of network pathology. I am
thinking here of a database of symptoms that are tied to possible causes
and tests to distinguish among those causes. (Yes, I am taking a cue
from the practice of medicine.) Once we have such a database one could
build tools to do symptom-to-cause reasoning, including running
diagnostic tests to work through the branches of the possible
causation
tree. To do this right one needs trusted test agents disseminated
throughout the network - the word "trusted" is important because network
tests can be intrusive, sharp, and dangerous, like a surgeon's scalpel.
(Imagine a world where surgeons were required to use dull, but safe
plastic butter knives rather than sharp scalpels.)
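I imagine an entry in such a database looking roughly like this (a
hypothetical shape, in Python, not a real schema - the symptom and
tests are invented examples):

    # One pathology entry: a symptom, candidate causes, and the
    # tests that discriminate among those causes.
    PATHOLOGY_ENTRY = {
        "symptom": "TCP throughput collapses on long transfers",
        "causes": [
            {"cause": "path MTU black hole",
             "test": "send DF-flagged probes of decreasing size"},
            {"cause": "bufferbloat in an intermediate queue",
             "test": "compare idle vs. loaded round-trip times"},
        ],
    }

    def next_tests(entry):
        """Walk the causation tree: yield the next discriminating
        test for each still-plausible cause."""
        for candidate in entry["causes"]:
            yield candidate["test"]

A diagnostic engine would run those tests (via the trusted agents),
prune the causes the results rule out, and recurse down the tree.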
Baseline records are important - and we do gather some of that, but we
always want more detail. But the amount of data to be collected is
voluminous and is subject to concerns about how it could be used
competitively. (This is why in our Interworking Labs test contracts we
prohibit the publishing of results to the public - we want to encourage
correction for the benefit of us all rather than creation of competitive
cudgels.)
(One element that I've slowly been working on in my zero free time is a
precisely timed beacon and precisely timed listeners - all tightly
synchronized to GPS time. The idea is for beacons to take subscriptions
from listeners and then to emit highly predictable patterns of packets
of various sizes and timings. I've been meaning to corner some of my
astrophysicist friends to adopt some of their methods of using that kind
of predictable behaviour, observed at a distance, to evaluate what lies
between the beacon's hither and the listener's yon. [And yes, I did
pick up some ideas from Van J's pathchar and Bruce Mah's
re-implementation as pchar.])
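The transmit side is conceptually trivial - the hard work is in the
clock discipline and the listener's analysis. A Python sketch of the
beacon's schedule (assuming the system clock is already
GPS-disciplined; the sizes and period are arbitrary choices of mine):

    import socket
    import time

    def run_beacon(listeners, period=1.0, sizes=(64, 512, 1400)):
        """Emit a fixed, predictable pattern of UDP packets on
        whole-period boundaries of the (assumed GPS-disciplined)
        clock, so distant listeners can compare arrival times
        against the known schedule."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        seq = 0
        while True:
            # sleep until the next whole-period boundary
            time.sleep(period - (time.time() % period))
            for size in sizes:
                payload = seq.to_bytes(8, "big").ljust(size, b"\x00")
                for addr in listeners:
                    sock.sendto(payload, addr)
            seq += 1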
I am also thinking that we need some legal and accounting rule changes
so that vendors are more able to share improvements and tests without
running afoul of restraint-of-trade laws or damaging their balance
sheets and that ever-present, false fable of "shareholder value".
--karl--