[ih] WiFi, DV, and transmission delay

Dave Taht dave at taht.net
Mon Feb 18 09:24:35 PST 2019


jnc at mercury.lcs.mit.edu (Noel Chiappa) writes:

>     > From: Dave Taht
>
>     > These days I periodically work on the babeld DV routing protocol. Among
>     > other things it now has an RTT based metric ... It's generally been my
>     > hope, that with the addition of fair queuing ... that the early
>     > experiences with DV being problematic were... early bufferbloat.
>
> I don't think so (and in any event IMPs had such a small amount of buffering
> that they couldn't do extreme buffer-bloat if they wanted to!) Newer updates
> being in the processing queue behind older updates (which is kind of the
> key thing that's going on in buffer-bloat, IIRC) were no doubt an issue, but
> the problems were more fundamental.

The queue may not have been building at the IMP, but in the app. Didn't
the ARPANET also have some form of flow control? I see that the size of
the listen queue was noted as a potential problem in one of the ARPANET
documents I've read this week, and timeouts in general seem to have been
a hard problem to conceptualize and formalize.

One big side effect of the bufferbloat effort is that we also switched
the Linux world to head-drop queuing rather than tail drop, with
suitable algorithms (RFC 8290 and RFC 8289, sch_fq) to resist
starvation. TCP RTOs due to tail drop vanished, and stale information,
such as old routing or VoIP packets, rarely gets sent anymore.

A lot of folk, myself included, were very nervous about the stability of
this, but so far, so good.
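
Roughly, the difference looks like this. A toy sketch in Python, not the
kernel code (fq_codel and sch_fq are far more involved), just the core
head-drop vs. tail-drop behavior on overflow:

    from collections import deque

    class DropQueue:
        """Bounded FIFO; on overflow, drop from the head (oldest)
        or refuse at the tail (newest)."""
        def __init__(self, limit, head_drop=True):
            self.q = deque()
            self.limit = limit
            self.head_drop = head_drop

        def enqueue(self, pkt):
            if len(self.q) >= self.limit:
                if self.head_drop:
                    self.q.popleft()   # shed the stalest packet
                else:
                    return             # tail drop: refuse the newest
            self.q.append(pkt)

    # Under overload, head drop keeps the freshest routing update;
    # tail drop keeps the stale ones and discards the new state.
    hd, td = DropQueue(3, True), DropQueue(3, False)
    for update in ["r1", "r2", "r3", "r4"]:
        hd.enqueue(update); td.enqueue(update)
    print(list(hd.q))  # ['r2', 'r3', 'r4'] -- newest survives
    print(list(td.q))  # ['r1', 'r2', 'r3'] -- newest was lost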

In terms of tackling many of the problems WiFi has - slow bus
arbitration, bursty transmits - I'm fiercely proud of what we did in
"Ending the Anomaly":

https://arxiv.org/pdf/1703.00064.pdf
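
The core scheduling idea there is airtime fairness: give each station an
equal share of transmit *time* rather than an equal share of packets. A
toy deficit-round-robin sketch (hypothetical names, nothing like the
actual mac80211 code):

    # Each station may transmit while its airtime deficit is positive;
    # a slow station exhausts its deficit after far fewer frames, so
    # fast and slow stations end up with roughly equal airtime.
    QUANTUM_US = 8000  # airtime replenished per station per round (us)

    def schedule_round(stations):
        """stations: dicts with 'name', 'deficit', 'pending' airtimes."""
        for st in stations:
            st["deficit"] += QUANTUM_US
            while st["pending"] and st["deficit"] > 0:
                tx_us = st["pending"].pop(0)  # airtime of next frame
                st["deficit"] -= tx_us
                yield st["name"], tx_us

    fast = {"name": "fast", "deficit": 0, "pending": [500] * 20}
    slow = {"name": "slow", "deficit": 0, "pending": [4000] * 20}
    for name, us in schedule_round([fast, slow]):
        print(name, us)   # 16 fast frames, 2 slow frames: equal airtime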


> The thing is that a DV routing architecture is fundamentally a distributed
> computation - i.e. node A does a tiny bit of the work, hands its intermediate
> data output to node B, repeat many times. This is fundamentally different
> from LS and its descendants, where everyone gets the basic data at about the
> same time, and then does all the computation _locally_, in parallel.
>
> Although both links and processors are _much_ faster than they used to be,
> the speed of light (IOW point-point transmission delays) hasn't changed.
> So in whatever equation one uses to describe the settling time (i.e. the
> amount of time needed for everyone's tables to fully update and settle down),
> a major term will not have improved.

Later in this thread there was some extrapolation from wired network
speeds, without enough terms to capture how weird wireless links are in
comparison.

> Although the two different approaches are probably not _that_ far off, now,
> since real-time delay (RTD) in flooding data is also significant. The
> difference will be if the DV calculation requires multiple intermediate
> computational steps (i.e. a node having to process more than one incoming
> routing table update for a given destination), in which case the RTD will
> repeat
> and accumulate, so it will probably always have a higher settling time than
> Map Distribution approaches such as LS.
>
> Anyway, if one is dealing with a network in which the rate of connectivity
> change is faster, in real time, than the settling time, hilarity
> ensues.

Agreed. Add head drop queuing though...

> The
> ARPANET's use of a poorly smoothed delay metric (well, load-sensitive routing
> _was_ a goal), which increased the dynamicity of the inputs, made things
> considerably worse.

Yes! In the FQ world, though, the delay is a function of the number of
flows, not the number of queued packets. Much, much smoother.
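
Back-of-the-envelope illustration (hypothetical numbers): with a single
FIFO, a new flow's packet waits behind every queued byte, while with
per-flow queues and round-robin service it waits roughly one quantum per
active flow:

    LINK_BPS  = 10_000_000   # 10 Mbit/s link (illustrative)
    QUEUE_B   = 256 * 1500   # 256 full-size packets backlogged
    FLOWS     = 8            # active flows
    QUANTUM_B = 1500         # one MTU served per flow per round

    fifo_delay = QUEUE_B * 8 / LINK_BPS            # ~307 ms
    fq_delay   = FLOWS * QUANTUM_B * 8 / LINK_BPS  # ~9.6 ms
    print(f"FIFO: {fifo_delay*1000:.0f} ms, FQ: {fq_delay*1000:.1f} ms")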

>
> Now, since then work has been done on DV algorithms, to improve
> things like count-to-infinity, etc, which I didn't pay a lot of attention to,
> since for other reasons (policy routing, Byzantine robustness, etc) I decided
> MD was the way to go. I just did not like the 'distributed computation'
> aspect (which is fundamental in DV); it just felt less robust.

In terms of a referent for DV vs. LS: for the last decade or so, meshy
wireless networks were failing due to excessive (oft infinite) queuing.

Now that that's fixed, figuring out whether olsrv2 (LS), babel (DV), or
something else is the way forward is on my mind, as is trying to tease
apart "gut historical knowledge" from what we think has been fixed in
the succeeding decades.

For example, packet loss is now a lousy routing metric: retries at the
MAC layer largely eliminate it. Being sensitive to RTT helps. Can we be
sensitive to RTT variance smaller than tens of ms? Don't know. Should we
respond to ECN? Don't know. Should we go control-plane and tie things to
out-of-band data? Don't know.
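
For the RTT case, the shape of babel's approach is to smooth RTT samples
and map the smoothed value linearly onto a cost between two thresholds.
A minimal sketch; the constants here are illustrative, not babeld's
defaults:

    ALPHA      = 0.84    # EWMA smoothing factor (illustrative)
    RTT_MIN_MS = 10.0    # below this, cost stays at the floor
    RTT_MAX_MS = 120.0   # above this, cost saturates
    COST_MIN   = 96
    COST_MAX   = 1024

    def smooth(srtt, sample):
        """Exponentially weighted moving average of RTT samples."""
        return ALPHA * srtt + (1 - ALPHA) * sample

    def rtt_cost(srtt_ms):
        """Map smoothed RTT linearly onto [COST_MIN, COST_MAX]."""
        if srtt_ms <= RTT_MIN_MS:
            return COST_MIN
        if srtt_ms >= RTT_MAX_MS:
            return COST_MAX
        frac = (srtt_ms - RTT_MIN_MS) / (RTT_MAX_MS - RTT_MIN_MS)
        return int(COST_MIN + frac * (COST_MAX - COST_MIN))

    srtt = 20.0
    for sample in [20, 25, 80, 140, 60]:
        srtt = smooth(srtt, sample)
        print(f"sample={sample:3d}ms srtt={srtt:5.1f}ms cost={rtt_cost(srtt)}")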


>
> Anyway, IIRC, that DV work reduced the number of multiple intermediate
> computational steps (above) in many (most? all?) cases, so that would be
> a big help on that influence on settling time.

My defense of (at least babel's implementation of) DV in this case is
that each individual computation pushes packets in more or less the
right direction incrementally, instead of with a global "kerchunk" as in
LS.
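
That incremental step is just the classic distributed Bellman-Ford
relaxation; each node does a tiny bit of work per advertisement and only
re-advertises what changed. A minimal sketch:

    # When neighbor `nbr` advertises its distances, adopt a route only
    # if going through nbr is strictly better than what we have.
    INF = float("inf")

    def dv_update(table, nbr, link_cost, advert):
        """table: {dst: (metric, next_hop)}; advert: {dst: metric}."""
        changed = []
        for dst, d in advert.items():
            cand = link_cost + d
            cur, _ = table.get(dst, (INF, None))
            if cand < cur:
                table[dst] = (cand, nbr)
                changed.append(dst)  # only these get re-advertised
        return changed

    table = {"A": (0, None)}
    print(dv_update(table, nbr="B", link_cost=5, advert={"C": 3, "D": 7}))
    print(table)  # {'A': (0, None), 'C': (8, 'B'), 'D': (12, 'B')}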

>
> Alas, I don't remember any key names/titles to look up on that work; maybe
> some early IGRP thing references it? One name that keeps coming up in my mind
> as associated with it is Zaw-Sing Su. There's someone else who I think did
> more, but I just can't remember who it was.
>
>
> But clearly BGP 'works', so maybe this is a case where it's 'good enough',
> and a less capable technology has won out.
>
> Anyway, you should read the 'problem analysis' section of BBN 3803;
> available here:
>
>   https://apps.dtic.mil/dtic/tr/fulltext/u2/a053450.pdf
>
> and see to what degree it still applies.

Thanks! I wish I'd joined this list decades ago!

> 	Noel


