Security/identity · 2006-11-12

When my identifier is none of your business

There’s been an interesting discussion on the OpenID general list (the thread starts here under ‘concerns about each user having a unique “URL”’) about the need to allow a user’s identifier to be private. If you someday have a grand convergence of all your identities onto The One True URL Representing You, you may be sharing more information with OpenID-enabled web apps than is desirable.

Of course, secrecy of unique identifiers is an old idea, entwined with anonymity/pseudonymity goals. No one wants The One True URL to be of the form http://www.us.gov/SSNs/123-45-6789. And in systems that are cross-domain but don’t need to be Internet-wide, often great care is already taken not to uniquely identify users, such as in the Sun-BIPAC usage of Liberty Alliance-based federation and web services. So this definitely seems like an interesting confluence of use cases in the URL-based and SAMLish areas.

I last tackled this topic under the name Pseudonym picking, where I tried to do a compare/contrast between OpenID and SAML on this point based on what I was learning from Drummond. Unfortunately, I did an extraordinarily confused job of it. Reading it again, even I don’t understand it, which is pretty bad. But since I ended up giving it another try in the mail thread, I thought I’d share some of that here in case it’s useful. So consider this a sort of V2.0.

Here’s my understanding of the different motivations you might have for not supplying a particular URL that uniquely identifies you directly into the OpenID dialog box at a web app:

…there can be two reasons why you might want to tell the relying party what IdP to go to but not hand the relying party your identifier at that moment. The first is that you can do what I might call “late binding” of your choice of digital identity over at the IdP, which lets you pick the specific persona you want to use. The persona might or might not be associated openly with a natural person…

The second is that you never want the relying party to have a “real” identifier of yours, so that you can arrange with the IdP to pick “a one-time URL/XRI generated by the IdP just for this relationship”. This pretty much follows the textbook definition of a pseudonym…

Note that once you go into persona-land, there’s a continuum of possibilities about how private that URL really is. OpenID seems to leave the pattern of repeat usage of URLs entirely up to you.

And here’s my attempt to map these situations to SAML:

The typical SAML single sign-on flow doesn’t involve the user handing the relying party their identifier. Only when redirected from the relying party to the IdP do they authenticate as “someone”, and of course, if they have multiple digital identities managed by that IdP, they’re free to choose whichever they want at that point. This matches the first situation above: late binding of identity-choosing. It would be similar to users always supplying merely an OpenID Provider URL to relying parties. SAML doesn’t prevent your providing additional information to the relying party, like the desired identifier to authenticate you against, but it’s not needed in order to have the relying party ultimately “let you in”.

If instead the user agrees to the relying party and IdP setting up a special relationship with each other and her, explicitly for her benefit (called “federating” in SAML), the identifier that passes between those parties is typically a privacy-preserving pseudonym (one-session or long-term), and the user never sees or picks it. This sounds quite close to the one-time URL/XRI situation above, except for the user knowing/not knowing the pseudonym.
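
To make the pseudonym idea concrete, here is a minimal sketch of how an IdP might mint the two kinds of identifier mentioned above: a long-term one that is stable for a given user and relying party, and a one-session one that is never reused. This is an illustration only, not how SAML or any particular product does it. The function names, the HMAC-based derivation, and the example identifiers are all my own assumptions.

```python
import hashlib
import hmac
import secrets

# Hypothetical IdP-side secret; a real IdP would keep this in secure storage.
IDP_SECRET = secrets.token_bytes(32)


def persistent_pseudonym(user_id: str, relying_party: str,
                         secret: bytes = IDP_SECRET) -> str:
    """Long-term pseudonym: the same (user, relying party) pair always
    yields the same opaque value, and different relying parties see
    different values for the same user."""
    message = f"{user_id}\x00{relying_party}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def transient_pseudonym() -> str:
    """One-session pseudonym: a fresh random value on every call."""
    return secrets.token_hex(16)


# Usage: the relying party only ever sees the opaque value, never "eve".
shop_id = persistent_pseudonym("eve", "https://shop.example")
news_id = persistent_pseudonym("eve", "https://news.example")
session_id = transient_pseudonym()
```

Because the long-term value is keyed by a secret only the IdP holds, two relying parties comparing notes can't tell that `shop_id` and `news_id` belong to the same person, and the user never has to see or remember either one.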

(My original post had a couple of graphics showing the SAML use case for privacy-preserving federation and one of its technical options for achieving it.)

Now for a little probing into the OpenID functionality…

First question: If I already know what public persona I want to use with a web app, can I give them the persona URL directly? (I’m assuming that if I’ve got a bunch of them and can’t keep them all straight, it’s useful just to get over to my IdP and pick one from some kind of drop-down interface or something.) I have to say that I find the SAML flow more elegant: the web app initially doesn’t need to know anything more than where to take me to get me authenticated properly, and the resulting “yep, she’s authenticated” info that flows back to them in due course will include whatever identifier of mine is appropriate for them to see.

Second question: Is there a way to ensure that I, the user, can’t abuse/ruin a “private” identifier created for security purposes? It’s cool to let me be in control of the pattern of repeat usage of persona identifiers so that I can organize my online interactions on a per-persona basis, something I already do with email-based identifiers at a number of sites. (Perhaps it could also allow me to detect inappropriate “identity leakage”, similarly to subscribing to magazines with a phony last name — “Name on the subscription? Put me down as Eve O. Bscuresports-Quarterly” — and doing list salting.) But if I want to ensure that an identifier never gets reused, it seems best if I never know it, particularly if it’s in easy-to-remember URL/XRI form.
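
For what it’s worth, the list-salting trick is easy to sketch. The following is a toy illustration under my own assumptions (the `AliasBook` class, the `eve+...@example.org` address pattern, and the site names are all hypothetical): hand each site its own identifier, remember which site got which, and when one turns up somewhere unexpected you know who leaked it.

```python
import secrets


class AliasBook:
    """Tracks which per-site alias was handed to which site."""

    def __init__(self, mailbox: str = "eve", domain: str = "example.org"):
        self._mailbox = mailbox
        self._domain = domain
        self._issued = {}  # alias -> site it was given to

    def issue(self, site: str) -> str:
        """Create a fresh alias for one site and record the pairing."""
        tag = secrets.token_hex(4)
        alias = f"{self._mailbox}+{tag}@{self._domain}"
        self._issued[alias] = site
        return alias

    def trace(self, alias: str):
        """Return the site an alias was issued to, or None if unknown."""
        return self._issued.get(alias)


# Usage: mail arriving at an alias points back to the site that held it.
book = AliasBook()
magazine_alias = book.issue("obscuresports-quarterly.example")
leaked_by = book.trace(magazine_alias)
```

Note that this only works because I, the user, know the alias and control where it goes, which is exactly the property you don’t want for an identifier that must never be reused.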

This time I think I haven’t lost myself along the way, but if I’ve still lost you and you’re masochistic enough to want a V3.0 (or sadistic enough to want to take a cluebat to me), let me know!

MORE: Johannes discusses why you essentially have to be addressable on the web in order to exist there. Though it took some time for me to get my head around it, this now makes a ton of sense to me in the open-Internet context (and heck, that’s why we made the very first design goal of XML be “XML shall be straightforwardly usable over the Internet”). However, if you want to allow for an anonymous-authorization use case or for any level of privacy at all with something like OpenID, is it sufficient to make the identity provider addressable?