[ih] A revolution in Internet point-of-view - Was Re: Internet analyses (Was Re: IPv8...)

Thu Apr 30 17:31:32 PDT 2026

(Beware I am about to wax long on some ideas here....  Joe T. may ding 
me for deviating from strictly writing about "history" and drifting in 
conjectures about the future.)

And I want to emphasize your point that instrumentation of things (the 
MIBs) is where a lot of the really hard work is.  (I found that 
atomicity - the ability to grab a time-synchronized set of values 
from all over a MIB - can be both really important and also really hard 
to implement.  And individual MIB items often lagged the reality 
underneath due to things like slow polling/fetching of values from a 
kernel or network stack.)

I kinda got hooked on network management issues at the TCP event that 
Dan Lynch arranged in Monterey in ... 1987?

I remember walking past you (Craig P.) and Dan Lynch having lunch and I 
overheard you chatting about issues of network control and management.  
I kinda innocently eavesdropped.  I am intrigued by the concepts of 
homeostasis - self healing - and resilient design, so your conversation 
on network management took root.

I liked the HEMS design - it was elegant.  And I liked CMIP for its 
ability to apply filters to select pieces of management data. I thought 
that SGMP/SNMP was kinda bad - inefficient (lots of packets, lots of 
work to maintain security and access control) and hard to implement 
(lexicographic ordering is a real pain - I say that from many years of 
experience).  SNMP was OK for monitoring but weak for control.
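
To make the lexicographic ordering point concrete, here is a minimal 
Python sketch. It is not taken from any real agent: the table contents, 
the values, and the names parse_oid and get_next are assumptions made up 
for illustration. OIDs compare component by component as integers, so 
...2.10 sorts after ...2.2 (a plain string sort gets this wrong), and a 
GETNEXT-style lookup returns the first instance strictly after the 
requested OID.

```python
from bisect import bisect_right


def parse_oid(text):
    """Turn a dotted OID string such as '1.3.6.1.2.1.1.3.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


# Hypothetical agent table: instance OID -> value (values are invented).
MIB = {
    parse_oid("1.3.6.1.2.1.1.1.0"): "example router",  # sysDescr.0
    parse_oid("1.3.6.1.2.1.1.3.0"): 123456,            # sysUpTime.0
    parse_oid("1.3.6.1.2.1.2.2.1.2.2"): "eth1",        # ifDescr.2
    parse_oid("1.3.6.1.2.1.2.2.1.2.10"): "eth9",       # ifDescr.10
}

# Python compares tuples element by element, and a shorter tuple that is a
# prefix of a longer one sorts first, which matches SNMP's OID ordering.
SORTED_OIDS = sorted(MIB)


def get_next(oid_text):
    """Return (oid, value) for the first instance strictly after oid_text, or None."""
    index = bisect_right(SORTED_OIDS, parse_oid(oid_text))
    if index == len(SORTED_OIDS):
        return None  # walked off the end of this agent's MIB view
    next_oid = SORTED_OIDS[index]
    return ".".join(str(part) for part in next_oid), MIB[next_oid]
```

A walk is then just repeated calls to get_next, feeding each returned 
OID back in until it returns None. A real agent has to produce this 
ordering across tables whose rows live in kernel structures that are not 
sorted this way, which is where much of the pain comes from.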

And I worked at times with Glenn Trewitt who, of course, nudged me in 
the HEMS direction.  (I also remember his car - one of those Dodge Dart 
things with the square steering wheel.  Where is he these days?)

John Romkey moved out from Boston to California and we worked together 
at my company, Epilogue Technology.  One day John sat down and wrote a 
router - took him a couple of days.  I thought "gee, I can sling code 
with the best of 'em and John is certainly among the best of the best - 
so I'll write an SNMP agent."  It took me about a week to get it 
working.  John added a tool to compile MIBs into tables to drive my 
code.  The whole thing, when compiled, could fit into as small as 
12Kbytes!!  We licensed it to a lot of companies - the license fees paid 
for a lot of goodies, including my house.  We also used that code in 
both John's and Simon Hackett's SNMP-driven Internet toasters (1989?)  
Yes, those toasters were real, and they worked.

After several years working with SNMP I came to realize that there is a 
difference, a significant difference, between "network management" and 
"network troubleshooting".

Much of the design of SNMP - particularly its UDP foundation - was based 
on a notion that SNMP would be a floor wax *and* a dessert topping - 
that it would be both a network management protocol and a 
troubleshooting protocol.  That latter goal, to my mind, seriously 
compromised SNMP by making it far less efficient and far more difficult 
to implement than it needed to be.

At the time it was claimed that UDP was necessary because SNMP needed to 
run when network conditions were too degraded to support a TCP 
connection.  To my way of thinking, when a network can't support TCP that 
means that we have exited the realm of "network management" and entered 
"troubleshooting".  Given that I grew up in a family of radio and TV 
repair guys I knew that this was a major divide and that different 
toolboxes were needed.

I continue to see this interfusion of the concepts of "network management" 
and "troubleshooting" - and it is an area that I think we need to 
revisit in a serious way, especially as security barriers are making 
both management and troubleshooting more difficult.

(My own re-design of SNMP - a thing I call KNMP - tries to preserve the 
core value of SNMP - the MIB instrumentation in devices everywhere - but 
carries the protocol over TLS/TCP [because it is a management tool, not 
a troubleshooting tool], and borrows ideas from both HEMS and CMIP about 
retrieving and operating (potentially atomically) on entire subtrees of 
management data (subject to filter expressions).  I tried to present 
KNMP to the IETF but a) I was too lazy to write an Internet Draft and b) 
Netconf had taken the bit and was absorbing all interest.  I did a 
rough, incomplete writeup and did implement much of it (in Python 2) - 
https://www.iwl.com/idocs/knmp-overview )

On the troubleshooting side I've been learning from my astrophysics and 
medical friends.

    - We need a database of network symptoms with links back to possible 
causations, along with tests to apply to distinguish between possible 
causes.  This would be a rather rough kind of database. And it would be 
constructed based on gathering of expertise and experience from humans 
who have been running networks.

    - We would need a means to take a problem, walk through that database 
(performing the suggested tests as we walk towards possible 
causations).  I used Prolog - which is hard - the new AI systems would 
probably be nicer.

    - Baselines, baselines, baselines - we need to measure "good" 
behavior so that we know when things wobble.  And this needs to be 
something that goes into things that users care about, such as freezing 
or bad video on Netflix, broken voice on VoIP, etc.

    - Gathering of that baseline data and looking for deviations brought 
me to wonder about what kinds of things we can derive from one-way 
observations.  Most of our network tools are two-way - and thus subject 
to asymmetry of routing or one-way problems. So I kinda wondered about 
pulsars and what astrophysicists do when they measure the perceived 
deviation of highly predictable events.  So I set out to build some 
network beacons - somewhat akin to what RIPE does in ATLAS.  I'm using 
Raspberry Pis with a GPS driven clock.  I borrowed ideas from Van 
Jacobson/Bruce Mah in the pathchar code - a spectrum of packets - and 
precise and predictable, but irregular, timing of those beacons.  I want 
to explore what I can derive from the arrival timing dispersion, as seen 
by receivers around the net, of those beacons.
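
As a rough illustration of the beacon idea in the last item above, the 
following Python sketch shows one way a sender and its receivers could 
share a precise, predictable, but irregular schedule, and how a receiver 
with a synchronized clock could turn arrival times into a one-way delay 
dispersion summary. Everything in it is an assumption for illustration 
and not the actual beacon code: the function names, the idea of deriving 
the schedule from a shared seed, the interval and jitter values, and the 
choice of summary statistics. Packet sizes (the pathchar-style 
"spectrum") are not modeled.

```python
import random
import statistics


def beacon_schedule(seed, count, base_interval=1.0, jitter=0.5):
    """Send times in seconds from start: irregular, but reproducible from the seed."""
    rng = random.Random(seed)  # same seed on sender and receiver -> same schedule
    send_time = 0.0
    schedule = []
    for _ in range(count):
        send_time += base_interval + rng.uniform(-jitter, jitter)
        schedule.append(send_time)
    return schedule


def one_way_delays(seed, arrivals, base_interval=1.0, jitter=0.5):
    """Arrival time minus scheduled send time per beacon; None marks a lost beacon.

    Assumes sender and receiver clocks are synchronized (for example by GPS)
    and that arrivals[i] is the arrival time of beacon number i.
    """
    schedule = beacon_schedule(seed, len(arrivals), base_interval, jitter)
    return [None if a is None else a - s for a, s in zip(arrivals, schedule)]


def dispersion(delays):
    """Summarize how far the received beacons stray from the fastest one seen."""
    seen = [d for d in delays if d is not None]
    floor = min(seen)
    return {
        "received": len(seen),
        "lost": len(delays) - len(seen),
        "min_delay": floor,
        "max_excess": max(seen) - floor,
        "stdev": statistics.pstdev(seen),
    }
```

Because the receiver can regenerate the schedule on its own, no reply 
traffic is needed, so the measurement stays one-way and is not confused 
by asymmetric routing on a return path.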

One lesson that I've drawn from all of this stuff is that we have for 
too long been building the Internet with the nearly singular goal of 
having something that works (and tuning that for efficiency.)  I think 
that, as a result, the net is more brittle than it needs to be.

To me merely working ought to no longer be our design goal.

I am now of the belief that we need to build new parts of the Internet 
(and redesign the existing parts) with the goal being survivability 
rather than efficiency or merely that "it works".  I'm thinking here in 
terms of how 
biological evolution works - survival into the next generations is more 
important than moment to moment efficiency.  Moreover, in biological 
evolution existing approaches and solutions (such as resistance to 
disease or altering internal operations [such as a transition from using 
sugars to using fats] in response to external conditions) do not vanish 
when evolution provides new mechanisms - those old mechanisms remain and 
still operate.  In our future Internet we ought to strive to not 
"deprecate" the old but rather to use those old means to create a kind 
of dynamic tension with new methods so that our overall system pulls 
towards "operational" and "survival" rather than "optimal".  (Old 
methods can also be constraints that limit how far astray new methods 
might go.)  Yes, this is very rough and vague thinking, but I did work 
with some of these ideas back in the 1970's when we were building 
trustworthy operating systems.

At a more basic, and more achievable level, we perhaps ought to be 
taking a cue from old Ma Bell and building lots of test points (and test 
tools) into network protocols and devices.  I can remember that when I 
first started getting involved in networking everything seemed to 
have remote loopback modes and explicit test points.  And yes, security 
is a big concern with that kind of thing.

         --karl--

On 4/30/26 4:50 AM, Craig Partridge via Internet-history wrote:
> On Wed, Apr 29, 2026 at 10:42 PM Dave Crocker via Internet-history <
> internet-history at elists.isoc.org> wrote:
>
>> On 4/29/2026 5:04 PM, John Day via Internet-history wrote:
>>
>>>    Choosing SNMP over HEMS.
>> I'm my usual version of fuzzy about the details, but it appears I was
>> the Network Management AD at the time, for whatever that might be
>> worth.  The only 'directed' choice I recall was to use ASN.1, much to
>> the IETF-constituency's chagrin.  But that was due to the persistent and
>> vigorous politics coming from the OSI side.
>>
>> My vague sense of the competition -- besides the solid
>> politicking-over-implementing that characterized the CMIP folk -- was
>> that HEMS was cleaner but lacked experience, whereas SNMP was an
>> increment over the deployed SGMP. Worse, alas, HEMS also did not develop
>> enough traction to counter advocacy by the other two communities.
>>
>> There is quite a bit of history of choosing experience over elegance,
>> especially given the benefits (and in spite of the detriments) of
>> installed base.
>>
>> By the time of this particular competition, participation in the IETF
>> was wide open and the participation in the IETF was extensive and
>> vigorous.  So the model of rough consensus even benefited from pretty good
>> market sampling.
>>
> Speaking as a co-author, with Glenn Trewitt, of HEMS, here's what I
> remember.
>
> HEMS sought to be a richly featured network management solution. Drawing on
> prior network management projects at BBN and elsewhere, it supported
> proxies, MIB extensions, and configurable traps, and it had some notion—I'm
> sure not good enough—of authentication and encryption.  Queries were
> ("safe") little programs that walked the MIB and retrieved information
> based on the current state of the router (e.g. you could send a query to a
> router to find out why it couldn't reach prefix P, and the query could
> discover the link to P was down and return a report on link status).  At
> the network management summit meeting intended to find a direction forward,
> HEMS had a prototype implementation on BSD UNIX. Many vendors, who had
> never implemented a serious network management protocol, were worried about
> fitting HEMS on their platforms.  [Side note: almost no one at the time
> understood just how hard it was to instrument and monitor Internet devices
> in a large operational network.  No vendor and no ISP had been in business
> for more than a few years and there was a lot of wishful thinking about how
> simple network management protocols and tools could be.  Glenn Trewitt and
> I sought out operators and forward looking researchers and aimed HEMS at
> their guess of what would be needed 5 years or so in the future.  Their
> guesses were right, which is why, on paper, HEMS sounds like something the
> IETF should have picked.  But I suspect Glenn and I got the concept right
> but the details wrong in many places.]
>
> SGMP was a stripped-down protocol that took core ideas from HEMS and CMIP
> and fit easily into existing router implementations. Within months it had
> been widely deployed and was in operational use at the time of the network
> management summit.
>
> CMIP was part of an OSI network management standards process that by this
> time was importing ideas from HEMS and SGMP. It had, I think, one toy
> implementation of a fragmentary spec, but had a number of powerful vendor
> CEOs calling anyone they could think of (including me) to say "this is the
> future."
>
> So when the IAB/IESG tried to figure out what to do, after a full day
> meeting, it was clear we were in a tough spot: CMIP had lost, but a number
> of people's ultimate boss (CEO) didn't want to hear that; SGMP was deployed
> and in use, but had serious gaps; and HEMS was perhaps the most complete
> solution but a prototype that was months (well over a year?) from
> full-scale deployment.  Standards gridlock was a real possibility and the
> growing Internet needed a network management solution "right now".
>
> To solve the problem, I offered to withdraw HEMS from consideration. That
> immediately made the path forward clear.  A slightly improved SGMP with a
> better MIB was the obvious choice for immediate standardization.  CMIP
> could be tagged as the full-service future, when ready (which it was pretty
> clear, it probably never would be).  Political note: the decision promptly
> created a headache for OSI advocates, as the CMIP proponents, realizing
> they'd lost by winning, then tried to impede progress on SNMP and MIBv1,
> and created the perception that the OSI community was opposed to the
> well-being of the Internet.
>
> Thanks!
>
> Craig
>
> PS: Regarding ASN.1 and network management.  Blame me.  HEMS, to support
> MIB extensions, allowed MIB components (e.g. a routing table entry) to
> include self-documenting vendor extensions.   HEMS, by default, retrieved
> data structures (rather than individual MIB variables).  So you could ask
> for the routing entry for prefix P, and would get all the variables in the
> routing table for P -- including the vendor extensions.  ASN.1's encoding
> (BER) made that easy. All data in ASN.1 BER is self describing. ASN.1
> supports private types (so a vendor could add extensions without worrying
> about typespace collisions, etc. and a network management tool could
> automatically display extensions that it had never seen before).  So when I
> sought an existing external data format for HEMS, ASN.1 was the obvious
> choice (vs., say, Courier or XDR).  CMIP was already planning to use
> ASN.1.  So SGMP picked up the idea while discarding the self-describing
> aspects.
>