[ih] UDP Length Field?

Sun Nov 29 13:55:56 PST 2020

I've often wondered myself why we have redundant length information at 
various layers.

I understand the desire to have layers be independent (an issue that I 
think needs some revisiting - I'll get back to that later).

One problem that I've come across in my development of test software is 
that lengths at different layers can be inconsistent.

For example, it is not uncommon (in fact for short IP packets it is 
necessary) for an IP packet to be shorter than the data space in an 
Ethernet frame.

This is also true of ARP packets - they don't fill an entire Ethernet 
frame.  (FTP Software used this difference:  They used the gap between 
the end of the ARP packet [or some other short broadcast IP packet] and 
the end of the enclosing Ethernet frame as a place to hold license 
information so that they could test for duplicated/copied licenses.)

One test that I often run is to re-encapsulate IP packets that are 
shorter than the maximum Ethernet payload into Ethernet frames that have 
extra space at the end.  That trips up implementations that lazily use 
the Ethernet length to imply the IP length.  A surprising number of 
implementations fail.
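
To make the failure mode concrete, here is a minimal sketch in Python of 
the check that the lazy implementations skip.  It assumes an untagged 
Ethernet II frame carrying IPv4; the function name and constant are made 
up for illustration and are not taken from any particular stack.

```python
import struct

ETH_HEADER_LEN = 14  # dst MAC (6) + src MAC (6) + EtherType (2), no VLAN tag
ETHERTYPE_IPV4 = 0x0800


def ipv4_packet_from_frame(frame: bytes) -> bytes:
    """Return the IPv4 packet in a frame, trimmed to the IP Total Length.

    Anything in the frame beyond the length the IP header declares is
    padding (or something else entirely) and is not part of the packet.
    """
    if len(frame) < ETH_HEADER_LEN + 20:
        raise ValueError("frame too short to hold an IPv4 header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("not an IPv4 frame")

    payload = frame[ETH_HEADER_LEN:]
    if payload[0] >> 4 != 4:
        raise ValueError("IP version is not 4")
    header_len = (payload[0] & 0x0F) * 4          # IHL is in 32-bit words
    (total_len,) = struct.unpack("!H", payload[2:4])

    # The IP length must be self-consistent and must fit in the frame,
    # but it is allowed to be shorter than the frame's data space.
    if header_len < 20 or total_len < header_len or total_len > len(payload):
        raise ValueError("IP length fields inconsistent with frame")
    return payload[:total_len]
```

The point of the design is that the Ethernet length is used only as an 
upper bound; the IP header's own Total Length decides where the packet 
ends.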

--

Regarding TCP length (or the lack of one):  TCP carries a stream, not 
demarcated messages.  A lot of implementations implicitly depend on the 
happenstance that quite often every send() call or implied Push by a TCP 
sender results in a single read() completion at the receiver.  
Code that does that is potentially brittle. And as the world evolves to 
have more application level proxies that chain multiple TCP connections 
together, especially as the pieces have different MTUs and segment sizes 
(or there is an intermediary path between proxies that isn't IP based at 
all), we may see more of this kind of code begin to wobble in unpleasant 
ways.
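
For contrast, here is a minimal sketch in Python of code that does not 
depend on that happenstance.  It assumes the application puts its own 
4-byte big-endian length in front of each message; that framing and the 
helper names are my illustration, not part of TCP or of any particular 
application protocol.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, however the stream happens to be chopped up."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message: 4-byte network-order length, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one message framed by send_message()."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

A receiver written this way gets the same messages whether the bytes 
arrive in one read, in many small ones, or with two messages glued 
together by a proxy along the way.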

--

Regarding the split of IP from the once monolithic TCP:  That idea seems 
to have surfaced in several places at different times.  For instance, 
when I was working on security protocols back in the 1970s at SDC, Dave 
Kaufman and I wanted to insert end-to-end cryptographic machinery into 
the then still academic TCP.  Our idea was somewhat like IPsec in that 
we wanted to do encryption on IP datagrams.  That implied cracking open 
TCP so that there would be an explicit boundary/API to a lower layer 
that moved datagrams bearing IP addresses.   This was circa 1974/75.  
Vint had a hand in this.  (For us that API layer was very real - it was 
actual hardware, very special hardware, between TCP and the underlying IP.)

But like much of what we did on that project (including my security 
kernel and capability architecture machine work), it was wrapped in 
layers of US military security and cold-war paranoia and only the 
thinnest glimmers of it ever made it into the public view.  (I'm 
particularly bothered that my work on debugging secure code, especially 
on a capability architecture system, has vanished.)

--

I mentioned that I think we may have gone too far in protocol layer 
isolation.   Layering of protocols and abstractions is good and lets us 
wrap our minds around  designs.

But I've long wondered about what we can learn from biology. Living 
things are surprisingly robust.

And here's where I'm going to get really fuzzy and wave my hands a lot....

One of the ways that living things manage themselves is through feedback 
loops that exist between seemingly independent pieces. For example, one 
of our reactions to infection is to increase our body temperature.  Many 
of these feedback loops are created through sequences of one chemical 
reaction triggering another triggering another.  That chain was built 
via evolutionary processes that did not particularly care that the 
indicator-chemical was from a system that was in another logical layer 
or abstraction.

When I was doing network video back in the mid-1990s we had an issue - 
sometimes network conditions could not sustain a viable high quality 
video stream.  There needed to be a feedback loop so that the sender 
could adjust.  Fortunately people like Steve Casner had recognized this 
need and built a feedback protocol into the media distribution protocol.

We have some similar feedback systems in the Internet - for example TCP 
congestion detection can result in TCP senders decreasing their rate of 
transmission.  And the Explicit Congestion Notification (ECN) machinery 
can cross protocol layers so that routers can provide information that 
IP and TCP can use to change their behavior.

We can go further down this path.

Those of us who are managing our intake of sugars know that there are 
a couple of measures of importance - instantaneous blood sugar level and 
a measurement known as A1C.  The latter is a longer-term measure.  It 
is a measure of chemistry that results from longer term and slower 
reactions based on instantaneous blood sugar.

The Internet has many of those kinds of instantaneous data generators - 
video rendering gear (such as Roku boxes) knows when network input is 
underrunning its video renderers and generating bad images for 
viewers.  Zoom can tell when network jitter has risen to the level that 
speech is breaking up.
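
As a hand-waving illustration of how an A1C-like long-term measure could 
be derived from such instantaneous readings, here is a minimal sketch in 
Python using an exponentially weighted moving average (the same general 
kind of smoothing TCP applies to its round-trip time estimate).  The 
class name and the choice of smoothing factor are hypothetical; nothing 
here describes what any existing device actually does.

```python
class SmoothedSignal:
    """Slow-moving summary of a stream of instantaneous samples."""

    def __init__(self, alpha: float):
        # alpha near 0 gives a long memory (A1C-like);
        # alpha near 1 tracks the instantaneous value closely.
        if not 0 < alpha <= 1:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None

    def update(self, sample: float) -> float:
        """Fold one instantaneous sample into the long-term value."""
        if self.value is None:
            self.value = float(sample)      # first sample seeds the average
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value
```

A player could, for example, feed each measured jitter value into 
update() and report only the smoothed figure, which says more about the 
health of the path than any single reading does.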

But what happens to that data?  Usually nothing.  And if anything does 
happen, the reaction is usually confined, because of the concept of 
protocol layering, to the protocol and devices directly involved. In a 
biological sense that is like saying "every cell is on its own; it must 
defend itself".  In living things that would probably lead to 
evolutionary extinction.

Here's where I'm going to get really vague.  (Brian Carpenter sent me 
some materials in which it appears that he and others have done work 
that helps remove some of this vagueness.)

I'm intrigued by the idea of network pheromones and other cross-boundary 
(protocol layer boundary, inter-device boundary) signals that can be 
used by feedback mechanisms (usually not even designed yet) to change  
the larger system behavior of the Internet or local pieces of the Internet.

We've seen a bit of that - Google has done some impressive work 
generating things like disease transmission maps based on derivations 
from large numbers of (at first glance) unrelated data points, such as 
web search queries and geo-IP data.

When I was thinking about this stuff during my time at Cisco I came to 
know that this was an idea fraught with dangers.  Feedback systems can 
destroy as much as they can heal:  A high fever that the body creates to 
fight infection could itself kill the entire body.    And floods of 
network status information can overwhelm the net - and create means for 
delving for information that ought not to be exposed, or create means to 
inject trouble into a network.

         --karl--