[ih] UDP Length Field?

Vint Cerf vint at google.com
Wed Dec 2 09:23:55 PST 2020


great summary David.
v


On Tue, Dec 1, 2020 at 9:02 PM David P. Reed <dpreed at deepplum.com> wrote:

> Hi all -
>
>
>
> I'm glad to be able to try to help. The actual process of designing UDP's
> packet format was very brief, and was done in the context of sketching out
> how to split TCP into IP, TCP, and UDP right after that decision.
>
>
>
> The group doing the design of TCP prior to its split into IP, TCP/IP and
> UDP/IP (ICMP/IP wasn't contemplated yet), was a combined group, and the
> efforts were combined. This decision was a quick decision to placate those
> of us who strongly urged a "first-class" datagram option be part of the
> Internetworking Experiment, rather than a pure virtual circuit focus as had
> been pursued up to that point.
>
>
>
> So UDP was not the focus of the group - in some sense it was a
> "placeholder" for a more refined datagram design. While the efforts on the
> IP protocol (including fragmentation and routing-related functions that
> belonged in the packet forwarding layer) continued, and the efforts on the
> TCP virtual circuit functionality continued, UDP was sketched, and kind of
> orphaned with no caretakers polishing it up. There was no "UDP group"
> formed, as did happen for the IP packet and protocol, separate from the TCP
> design efforts.
>
>
>
> This was in the late 1970's.
>
>
>
> So UDP remained as sketched, until it was finally implemented in various
> systems as an end-to-end protocol, where ports were used to demultiplex the
> datagrams. Things got a bit weird when OS's started creating system call
> interfaces for UDP, because OS's seemed to be stuck in the telephony
> mindset where circuits were "set up" and "torn down" during a "connection",
> but UDP defined no connection. That was *intentional* in its concept.
> Demultiplexing was supposed to be separate from any concept of "connection"
> to a foreign entity - the whole idea was that a process could own a socket
> that any foreign system could send to - so A could send a request to B,
> which would send a message to C, which would send a message to D, and D
> could confirm receipt and provide a response to A directly, though A might
> not even know that D existed. Such an idea didn't fit with pairwise
> connections, but was very important to those of us developing
> multi-computer decentralized systems (PARC and my group at MIT were
> examples of places where we were developing fault tolerant
> process-to-process coordination systems where the fault tolerance was not
> "in the OS" because we believed OS's would evolve to span loosely federated
> sets of machines (not mainframes, minis or superminis)).
>
>
>
> That color commentary helps, I think, in understanding the answer to your
> specific question, which I will suggest shortly. It's important to
> understand that UDP's implementation was distorted by folks like the
> Berkeley Unix team who invented "sockets", and it was also frozen in
> concrete by the OS folks who implemented the sketch. The result was "ok",
> but not polished. Also, because of the premature solidification, many of
> the desired goals of UDP were unworkable because "interoperation" had to
> cope with semantics imposed by relatively clueless OS network stack
> implementors, like Bill Joy's team.
>
> TCP was the star, UDP was a poor relation, though it was not supposed to
> be - those of us who pushed for it thought it was crucial. (Since we
> accepted that congestion control would be something that had to be
> addressed by high-level etiquettes that would stanch packet flows based on
> feedback about conditions of the network observed during packet forwarding,
> it was disappointing that the congestion control put into TCP was not
> properly split between IP and TCP, in my opinion.) That further made UDP a
> poor relation - with no coherent theory of congestion management, protocols
> that used UDP (like RTP) were left to struggle with how to be compatible
> with Internet-level congestion control.
>
>
>
> So, with all that context (and there's more), to answer the question:
>
>
>
> 1. UDP's design was minimal and frozen without enough attention being
> paid. IMO, there is room for an improved UDP, not based on adding features
> to the packets, but by creating a new protocol number for a "UDP2", which
> would not, for example, need a redundant length field.
>
>
>
> 2. The general view back in the day was that it was not the job of the
> protocol designer to focus on likely bugs in protocol stacks. The length
> field was redundant w.r.t. the IP header's "Total Length", yes, but
> redundancy can be checked. There's nothing "wrong" with having redundancy,
> in other words. It's the OS's responsibility to check. (The same reasoning
> applies to the redundancy between the IP header's total length and the
> length of the underlying Ethernet frame or ATM frame or carrier pigeon
> envelope. Yes, if the network on which IP was overlaid had a length field,
> too, that provided some redundancy, but redundancy is to be checked, and if
> there's no match, it is an error, not a security hole.)
>
>
>
> 3. It was thought that across all TCP implementations, headers would be
> multiples of 32-bit "words" (4 octets). The reasoning was that this was
> optimal for all kinds of computers that we could conceive.
>
> Octets were not "word boundaries" and aligning fields based on octets
> would have made things quite awkward on machines that did not have
> byte-addressed memory systems. 32 bits fit into the words of the DEC
> systems (PDP-6/10/20, which many research labs affiliated with ARPA used)
> and of the GE645 and Honeywell 6180 and 68/80 systems, all of which had
> 36-bit word sizes, and so forth. (The idea of 8-bit byte-addressable
> memories didn't become popular until 8-bit microprocessors, like the 8080,
> became important.) So clearly the UDP header would be two or more 32-bit
> words, even though there were only 3 16-bit quantities really needed.
>
>
>
> 4. The length field is likely to be needed at the endpoints as part of the
> datagram as delivered by the network stack. TCP didn't have a "length" - it
> didn't even have packet boundaries at this point. Messages within the
> infinite stream of bytes that were a virtual circuit in TCP would have to
> have length delimiters to separate messages, but they were not part of the
> function of the IP header Total Length field.
>
> That is to say, the TCP receiving port never needed to see (and would not
> see) any information derived from the IP header "Total Length". In fact,
> when TCP retransmitted a range of bytes, the packet boundaries might be
> quite different - that was crucially part of the design of TCP's semantics
> as a stream of octets in each direction. Whereas UDP sent "user datagrams"
> that had lengths and a checksum that the *user* was supposed to check (not
> the operating system, though the Unix sockets guys screwed that up too!)
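The contrast in point 4 can be made concrete with a minimal sketch, assuming a 2-byte big-endian length prefix as the delimiter. The names `frame` and `deframe` are hypothetical and belong to no particular protocol or library. Because a TCP-like stream has no message boundaries, the application has to carry its own lengths, and the way the bytes happened to be split into packets must not matter.

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 16-bit big-endian integer.

    The 2-byte prefix is an assumption made for this sketch; messages
    longer than 65535 bytes make struct.pack raise struct.error.
    """
    return struct.pack("!H", len(message)) + message

def deframe(stream: bytes):
    """Recover complete messages from a byte stream.

    Returns (messages, leftover), where leftover holds the bytes of a
    message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        end = offset + 2 + length
        if end > len(stream):
            break  # the rest of this message is still in flight
        messages.append(stream[offset + 2:end])
        offset = end
    return messages, stream[offset:]

# The same stream, cut at arbitrary points to mimic packet boundaries
# that differ between an original transmission and a retransmission.
stream = frame(b"first") + frame(b"second message")
chunks = [stream[:3], stream[3:10], stream[10:]]
messages, leftover = deframe(b"".join(chunks))
```

A datagram protocol needs none of this on the receiving side: each datagram arrives as one unit with its own length, which is the property the UDP length field exposes to the user.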
>
>
>
> That's the story, such as it is. As one of the advocates of the datagram
> internet as a primary goal, I think it's a bit sad that the benefit of
> datagrams as a mode of richer communications has been poorly developed.
> It's OK, but not great. It's also sad that multi-datagram protocols haven't
> been developed more maturely - DNS shouldn't require TCP to send longer
> one-off messages. That's like using a private jet plane instead of a car
> for a family trip because cars only have 2 seats. You should be able to use
> 2 cars.
>
>
>
> Now that we see some of the benefits (in the QUIC concept to replace the
> heavyweight HTTP/TCP mess) it would be nice to be able to go back and
> change history. But one cannot.
>
>
>
>
>
> On Sunday, November 29, 2020 7:33am, "Vint Cerf" <vint at google.com> said:
>
> the primary proponents of splitting off IP from TCP were Jon Postel, Danny
> Cohen and David Reed, I believe. Sadly, Jon and Danny are no longer with
> us. My recollection is primarily that UDP was to allow for real-time,
> non-retransmitted, non-sequenced delivery for voice, video, radar in which
> low latency was more important than sequenced and assured delivery. As to
> the length field, it may merely have been habit to include, even if the
> value could have been computed. Sometimes <length> was used to distinguish
> real data from padding to achieve preferred word boundaries.
> v
>
> On Sat, Nov 28, 2020 at 8:21 PM Brian E Carpenter via Internet-history <
> internet-history at elists.isoc.org> wrote:
>
>> Reverse designing it (a bit like reverse engineering), it seems useful
>> to be able to check that the intended payload length fits inside the
>> actual packet length. If it doesn't, you are exposed to what you might
>> call buffer underrun issues. Conversely, if you don't like covert
>> channels, you might want to detect any spare bits after the payload.
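The check described here can be sketched as follows, as one possible illustration rather than what any particular stack does. It assumes the RFC 768 header layout (four 16-bit fields: source port, destination port, length, checksum, with the length counting header plus data in octets). The function name `check_udp_length` is hypothetical, and `ip_payload` stands for the bytes that the IP layer says it delivered.

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields: src port, dst port, length, checksum

def check_udp_length(ip_payload: bytes):
    """Compare the UDP length field with the bytes IP actually delivered.

    Returns (udp_payload, trailing_bytes). Raises ValueError when the UDP
    length field claims more data than is present, or is smaller than the
    header itself.
    """
    if len(ip_payload) < UDP_HEADER_LEN:
        raise ValueError("too short to hold a UDP header")
    src_port, dst_port, udp_len, checksum = struct.unpack(
        "!HHHH", ip_payload[:UDP_HEADER_LEN]
    )
    if udp_len < UDP_HEADER_LEN:
        raise ValueError("UDP length is smaller than the UDP header")
    if udp_len > len(ip_payload):
        # The intended payload does not fit inside the actual packet.
        raise ValueError("UDP length exceeds the bytes IP delivered")
    # Anything past udp_len is spare space after the payload, which a
    # receiver worried about covert channels may want to flag.
    return ip_payload[UDP_HEADER_LEN:udp_len], ip_payload[udp_len:]
```

A receiver that treats a mismatch as an error would reject the datagram on `ValueError` and decide separately whether non-empty trailing bytes are acceptable.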
>>
>> Regards
>>    Brian Carpenter
>>
>> On 29-Nov-20 12:42, Timothy J. Salo via Internet-history wrote:
>> > Hi,
>> >
>> > Can anyone provide some [historical] insight into why the UDP header
>> > contains a length field?  TCP manages to ascertain the length of data in
>> > a packet just fine without a length field, so why couldn't UDP?
>> >
>> > Several people have noted that the UDP length field is redundant,
>> > including for example, the current Internet Draft "Transport Options for
>> > UDP",
>> > <https://www.ietf.org/archive/id/draft-ietf-tsvwg-udp-options-09.txt>.
>> >
>> > There are some other opinions, some of which sound to me like
>> > after-the-fact reasoning:
>> >
>> > - So that UDP can run over network protocols other than IP (although
>> >    presumably TCP could do this just fine without a length field).  But,
>> >    the UDP spec says that an IP-like pseudo header needs to be created,
>> >    in any case.
>> >
>> > - Layering and encapsulation reasons, (although, again, TCP seems like
>> >    a counter example).
>> >
>> > - Word alignment, (there were 16-bits left over, so why not use it for
>> >    the length?).  Personally, this sounds the most likely to me.
>> >
>> > Thanks,
>> >
>> > -tjs
>> >
>> --
>> Internet-history mailing list
>> Internet-history at elists.isoc.org
>> https://elists.isoc.org/mailman/listinfo/internet-history
>
>
> --
> Please send any postal/overnight deliveries to:
> Vint Cerf
> 1435 Woodhurst Blvd
> McLean, VA 22102
> 703-448-0965
> until further notice
>



