[ih] The History of Protocol R&D?

Jack Haverty jack at 3kitty.org
Mon May 26 10:50:02 PDT 2014


All the discussion of algorithms and protocols has me wondering about the
History of such research over the 30+ year experiment of The Internet.   I
apologize if this is common knowledge, but I haven't been tracking such
work for many years now, so I may easily have missed it.  But, since this
is "internet-history", it seems like a good place to ask.

Research is usually characterized by the use of the Scientific Method,
where someone has a new idea and then performs experiments to validate the
expected results.  That often involves a series of steps - e.g., maybe a
mathematical model, perhaps a simulation, and eventually a real-world live
test, instrumented so that the validity of the new idea can be observed in
action.

In the early TCP work of the 1980s, there was a lot of discussion about
algorithms and protocols for things like retransmission techniques.   We also
necessarily made a lot of assumptions about how the real world would or
should behave.  For example, one question was "What percentage of packets
should we expect to lose in transit?" - from discards, or checksum errors, or
whatever.   The consensus was 1% - although that number was pulled more or
less out of the air.   But it helped focus the discussions about
appropriate algorithms and expected results.

TCP and router implementations had various tools for looking at behavior in
live situations, e.g., counters of packets discarded in routers, or of
duplicate packets received by a TCP.   When we tried out some new idea, we
could observe its actual effects by looking at the data collected and judge
whether or not the new idea was actually a good one.  The Internet was of
course tiny in those early days, and thus much easier to observe as it
operated.

AFAIK, those counters evolved into a more formal and organized mechanism
for collecting data, e.g., with the definition of a MIB circa 1988 and a
series of additions and refinements in later RFCs.   These presumably made
it possible to collect such data in a cohesive way as The Internet has
grown and evolved over the last 25 years.   But I have no idea whether or
not anyone has been doing such work, or even whether such mechanisms as
MIBs have actually been deployed or used at any significant scale.
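
To make that concrete, here is a rough back-of-the-envelope sketch of the
kind of arithmetic involved - just deltas and ratios over counter snapshots.
The object names below are the familiar MIB-II ones from RFC 1213; the
numbers are purely invented for illustration, not measurements from anywhere:

    # Rough sketch: loss-related rates from two snapshots of MIB-II-style
    # counters.  Object names follow RFC 1213; the values are invented for
    # illustration, and real 32-bit counters can wrap between snapshots.

    earlier = {"ifInUcastPkts": 10_000_000, "ifInDiscards": 12_000,
               "tcpOutSegs":     4_000_000, "tcpRetransSegs": 30_000}
    later   = {"ifInUcastPkts": 18_000_000, "ifInDiscards": 95_000,
               "tcpOutSegs":     7_500_000, "tcpRetransSegs": 95_000}

    def delta(name):
        return later[name] - earlier[name]

    # Fraction of arriving packets the router threw away, and fraction of
    # TCP segments that had to be sent more than once.
    discard_rate = delta("ifInDiscards") / max(delta("ifInUcastPkts"), 1)
    retrans_rate = delta("tcpRetransSegs") / max(delta("tcpOutSegs"), 1)

    print(f"interface discard rate:  {discard_rate:.2%}")
    print(f"TCP retransmission rate: {retrans_rate:.2%}")
    if discard_rate > 0.01:  # the old 1% rule of thumb
        print("above the 1% guesstimate -- worth investigating")

In practice one would presumably poll those objects over SNMP rather than
hard-code snapshots, but the ratios are the point.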

So, my basic questions are:

   - "How do researchers now do protocol R&D, and validate a new idea by
   measurements in the live Internet?"
   - "What have been the results over the life of The Internet so far?"

I assume that the use of mathematical models, simulations, and anecdotal
experiments is common and published in papers, theses, and the like, but how
are ideas subsequently validated in the broad Internet world, and how are the
results of models, simulations, and lab tests verified in the large-scale
world of The Internet?

Looking back at the History of The Internet, there has been a stream of
new ideas, e.g., in the algorithms for congestion control, to name just one
research topic.   How was each idea validated in the live Internet?   What
metrics were observed to improve?

Taking a specific concrete case - was our guesstimate of 1% as a "normal"
packet loss rate valid?   We used to look at the counters, and if the rate
was much higher than that, we took it as an indication of a problem to be
investigated and addressed.

Has the packet-loss rate of The Internet been going up, or down, over the
last 30 years?   Has the duplicate-packet rate improved?  (Or whatever
other metrics might have surfaced as a measure of proper behavior.)

Did the metric change positively in response to deploying some new idea
(e.g., a new congestion control algorithm)?

Today, TCP is everywhere, and packets are presumably getting discarded,
retransmitted, mangled, and delivered, with the power of TCP still hiding
most of the carnage inside from the users outside.   Have our improved
algorithms been getting better at delivering the user data with less and
less carnage?

I vaguely recall that we invented a metric called something like
"byte-miles-per-user-byte", which would simply measure how many bytes were
transported over how many miles for each byte of data successfully delivered
to the user process by a TCP connection.  The theoretical ideal of course was
just the line-of-sight distance in miles between the two endpoints - i.e.,
the value for an actual error-free physical circuit limited only by the
physics of the speed of light.   But retransmissions, congestion discards,
routing decisions, and other such internal mechanisms of the Virtual Circuits
of The Internet would dictate how close reality came to that theoretical
limit.
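
As a hedged sketch of what that metric amounts to (every name and number
below is hypothetical, chosen only to show the arithmetic), it is just a
ratio of totals:

    # Hypothetical sketch of the "byte-miles-per-user-byte" metric.  All
    # names and numbers here are made up for illustration.

    def byte_miles_per_user_byte(links, user_bytes_delivered):
        """links: list of (bytes_carried, link_length_miles) pairs covering
        every copy of every packet on every link it traversed, including
        retransmissions and packets that were later discarded."""
        total_byte_miles = sum(nbytes * miles for nbytes, miles in links)
        return total_byte_miles / user_bytes_delivered

    # Suppose 1 MB of user data is delivered between endpoints 3,000
    # line-of-sight miles apart, but the traffic actually traversed 4,500
    # route-miles and about 3% of the bytes were sent twice.
    user_bytes = 1_000_000
    line_of_sight_miles = 3_000
    links = [(1_030_000, 4_500)]

    actual = byte_miles_per_user_byte(links, user_bytes)
    ideal = line_of_sight_miles  # each user byte must travel at least this far
    print(f"actual: {actual:,.0f} byte-miles per user byte")
    print(f"ideal:  {ideal:,} byte-miles per user byte")
    print(f"efficiency: {ideal / actual:.0%}")

The closer that efficiency comes to 100%, the less carnage is being hidden
inside.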

My TVs have TCP, and they can stream video from halfway around the world.
So can my phone(s).   So can the other millions of devices out there.   If
I could look at the Wastebins of the Internet after a typical day, how big
a pile of discarded packets would I find in the various hosts, routers,
etc. out there?   Over the History of The Internet, how has that daily
operational experience been changing?   How much observed effect have the
new algorithms had on getting closer to that theoretical ideal of one
transported byte-mile per user byte-mile?

Are the new algorithms even implemented in those devices?   Is anybody
watching the gauges and dials of The Internet?

/Jack Haverty