[ih] DKIM history, was IETF relevance (was Memories of Flag Day?)

Jon Callas jon at callas.org
Wed Sep 13 15:43:10 PDT 2023


I am late to this DKIM history party, yet I'd like to add in some memories, because I have my own accounts that add to it.

I got involved with DKIM when the DomainKeys folks from Yahoo contacted me. I was CTO at PGP Corporation and they said they wanted to use digital signatures for a form of email handling. Even then, we tried not to say "spam fighting" as we knew it wasn't One Cool Trick (or silver bullet). I met with them, they told me what they wanted to do, and asked if I'd help with the digital signature parts. This was also when I first heard the problem statement that (e.g.) Bank of America wants to send you an email, it bounces through a .forward at your alumni association, and then arrives at Yahoo. Yahoo would like to know that it really came from BoA, and this affects how the message is handled.

There was SPF at the time, of course, and while SPF is valuable, the problem statement is precisely a case that SPF *cannot* handle.

About two weeks later, a couple people from Cisco called up and said that they were trying to use digital signatures to know the source of an email -- you know, the case that SPF can't handle.

Mike Thomas called it convergent evolution and I can attest that that is exactly what it was. DK and IIM were a case of it being Steam Engine Time for this type of authentication.

From that, the team spread outward, to include Sendmail, Exchange people from Microsoft, Verisign, trad email infrastructure (Dave Crocker) and others. I remember there being over a dozen of us in a few of the meetings. From the start, one of the things we wanted to do was get it into the IETF, and there were also many meta-discussions. There's a reason for the rule of thumb of "rough consensus and working code" along with the need for an actual community of interest (as opposed to a good idea wanting an RFC number). There were also discussions about pieces that were implied but not there, and that there isn't a complete solution, let alone some silver bullet. The initial Yahoo problem statement has implicit things in there that are semantic, not syntactic. The original sender, so-called Bank of America, is implicitly assumed to be some sort of tacitly "good" sender. That implies something a lot like a reputation system, left as an exercise for later. That's just to start.

We discussed a lot about whether this could or should be done with existing standards like S/MIME or OpenPGP. The decision, of course, was ultimately no. I think there are both syntactic and semantic reasons for this. I'd love to rant on this; this isn't the place. DKIM decided it was key-based, "administrative domain" based, and actively focused on the email envelope (like headers). S/MIME and OpenPGP are completely about the body of the email, and we'd have to warp them to get it to the headers; once that's done, we'd have to deal with the differences of author signing and infrastructure signing (again, both syntactic and semantic). 

A relevant thing to this discussion is that DKIM uses what I called in a paper "authoritative trust": you want to send to Alice over at Example.com, so you ask the domain, "hey, what's Alice's key?" The domain is an authority, and might have an uneasy relationship with Alice (this is the issue of the server lying about the key, and solutions to that include key transparency, safety numbers, short authentication strings, and so on). The knock-on issues don't really exist here because the conversation is the bank talking to the ISP with DNS as a transport. Thus, while there are DNS issues, there's also DNSSEC sitting over there with possible solutions. This is important, I think, because The Bank thinks it's talking to the alumni association (there's MTA to MTA communication) and it cannot know, a priori, that it's really talking to Yahoo. It's Yahoo who is in conversation with the bank. This recapitulates the phylogeny of Layer 3, where a router just passes the packets on, with no other involvement. Thus, I definitely disagree with the previous side discussions about how perhaps we could have used those protocols. No, we couldn't have and yes we thought about it and talked it out.

Aside: Since there's more discussion of how this might have worked otherwise, including Let's Encrypt or something like it, part of why DKIM is merely awful and not quite horrible is that it's a server-to-server -- excuse me, administrative domain -- conversation. If it were a user-to-domain conversation, that would require a certification system that included a per-user reputation system. It also has to include mechanisms where someone is improperly labeled a bad sender (it's going to be an amazing abuse vector), someone who reforms themselves (or the identifier is reused -- we don't want to have to retire "Alice" for all eternity because the last one was a spammer), and so on. We don't want to go there, because here be dragons as well as a privacy, abuse, safety, and trust nightmare that's the superset of all the other ones.

Back to constructing DKIM. There are many cases where an IETF standard not only did rough consensus and running code, but it took existing system(s) and created combinations and compromises. Directly relevant, TLS took Netscape's SSL and Microsoft's PCT and joined them together. DKIM took DK and IIM and put them together. Things got left on the cutting room floor, and the combined use case isn't the same. I know that there were IIM features that were cool and didn't get in there; they didn't get in there because there wasn't a rough consensus. We see this historically in other places -- there was once a huge philosophical fight that pitted the IPsec folks along with TLS and SECSH -- particularly over the use case that I'll call "telnet over SSL." Probably the standard case of SSH (and SECSH) is pretty much that. Then there's port forwarding, which is a mini-VPN; connections into file transfer territory and so on. All of these are a bit of a mess because the problem space is a bit messy and it's not really possible to apply mathematical rigor to a taxonomy.

I disagree with the word "vile" that Mike said when talking about this, but I get it. There were really good things in IIM that ended up on the cutting room floor, just like there were good things in PCT that didn't make it into TLS. I do agree with the conclusion, which is that the combo of Cisco, Yahoo, and Sendmail hammered the convergent evolution into running code that reflected a rough-but-not-complete consensus. The process of getting an IETF standard is one of inevitably including something one doesn't like and dropping something one thought should be included. (And this is its own long topic.)

Obviously, there are more retrospective thoughts I could give; like producing a document, though, at some point one needs to hit send.

	Jon