When my identifier is none of your business

There’s been an interesting discussion on the OpenID general list (the thread starts here under ‘concerns about each user having a unique “URL”’) about the need to allow a user’s identifier to be private. If you someday have a grand convergence of all your identities onto The One True URL Representing You, you may be sharing more information with OpenID-enabled web apps than is desirable.

Of course, secrecy of unique identifiers is an old idea, entwined with anonymity/pseudonymity goals. No one wants The One True URL to be of the form http://www.us.gov/SSNs/123-45-6789. And in systems that are cross-domain but don’t need to be Internet-wide, often great care is already taken not to uniquely identify users, such as in the Sun-BIPAC usage of Liberty Alliance-based federation and web services. So this definitely seems like an interesting confluence of use cases in the URL-based and SAMLish areas.

I last tackled this topic under the name Pseudonym picking, where I tried to do a compare/contrast between OpenID and SAML on this point based on what I was learning from Drummond. Unfortunately, I did an extraordinarily confused job of it. Reading it again, even I don’t understand it, which is pretty bad. But since I ended up giving it another try in the mail thread, I thought I’d share some of that here in case it’s useful. So consider this a sort of V2.0.

Here’s my understanding of the different motivations you might have for not supplying a particular URL that uniquely identifies you directly into the OpenID dialog box at a web app:

…there can be two reasons why you might want to tell the relying party what IdP to go to but not hand the relying party your identifier at that moment. The first is that you can do what I might call “late binding” of your choice of digital identity over at the IdP, which lets you pick the specific persona you want to use. The persona might or might not be associated openly with a natural person…

The second is that you never want the relying party to have a “real” identifier of yours, so that you can arrange with the IdP to pick “a one-time URL/XRI generated by the IdP just for this relationship”. This pretty much follows the textbook definition of a pseudonym…

Note that once you go into persona-land, there’s a continuum of possibilities about how private that URL really is. OpenID seems to leave the pattern of repeat usage of URLs entirely up to you.

And here’s my attempt to map these situations to SAML:

The typical SAML single sign-on flow doesn’t involve the user handing the relying party their identifier; they authenticate as “someone” only once they have been redirected from the relying party to the IdP. Of course, if they have multiple digital identities managed by that IdP, they’re free to choose whichever they want at that point. This matches the first situation above: late binding of identity-choosing. It would be similar to users always supplying merely an OpenID Provider URL to relying parties. SAML doesn’t prevent your providing additional information to the relying party, like the desired identifier to authenticate you against, but it’s not needed in order to have the relying party ultimately “let you in”.

If instead the user agrees to the relying party and IdP setting up a special relationship with each other and her, explicitly for her benefit (called “federating” in SAML), the identifier that passes between those parties is typically a privacy-preserving pseudonym (one-session or long-term), and the user never sees or picks it. This sounds quite close to the one-time URL/XRI situation above, except for the user knowing/not knowing the pseudonym.
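
To make the pseudonym idea a bit more concrete, here is a minimal Python sketch of how an IdP might mint the two flavors of identifier. This is only an illustration under my own assumptions, not what SAML or OpenID prescribes: the secret, the function names, and the HMAC-based derivation are all hypothetical, and a real IdP might just as well generate random values and store them in a table. The point is simply that the long-term pseudonym is stable for one user/relying-party pair but different for every other relying party, while the one-session pseudonym is fresh each time.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret known only to the identity provider (an assumption
# for this sketch; a real deployment would manage keys far more carefully).
IDP_SECRET = b"example-idp-secret-do-not-reuse"


def persistent_pseudonym(user_id: str, relying_party: str) -> str:
    """Long-term pseudonym: stable for one (user, relying party) pair.

    Keyed hashing means a relying party cannot reverse the value to the
    user's account name, and two relying parties see different values.
    """
    message = f"{user_id}|{relying_party}".encode("utf-8")
    return hmac.new(IDP_SECRET, message, hashlib.sha256).hexdigest()


def transient_pseudonym() -> str:
    """One-session pseudonym: random, never derived from the user at all."""
    return secrets.token_hex(16)


# The same user looks like two unrelated strangers to two relying parties.
alice_at_shop = persistent_pseudonym("alice", "https://shop.example")
alice_at_blog = persistent_pseudonym("alice", "https://blog.example")
```

In this sketch the user never has to see or remember either value, which is the property that distinguishes the SAML-style pseudonym from a persona URL the user picks and types in herself.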

(My original post had a couple of graphics showing the SAML use case for privacy-preserving federation and one of its technical options for achieving it.)

Now for a little probing into the OpenID functionality…

First question: If I already know what public persona I want to use with a web app, can I give them the persona URL directly? (I’m assuming that if I’ve got a bunch of them and can’t keep them all straight, it’s useful just to get over to my IdP and pick one from some kind of drop-down interface or something.) I have to say that I find the SAML flow more elegant: the web app initially doesn’t need to know anything more than where to take me to get me authenticated properly, and the resulting “yep, she’s authenticated” info that flows back to them in due course will include whatever identifier of mine is appropriate for them to see.

Second question: Is there a way to ensure that I, the user, can’t abuse/ruin a “private” identifier created for security purposes? It’s cool to let me be in control of the pattern of repeat usage of persona identifiers so that I can organize my online interactions on a per-persona basis, something I already do with email-based identifiers at a number of sites. (Perhaps it could also allow me to detect inappropriate “identity leakage”, similarly to subscribing to magazines with a phony last name — “Name on the subscription? Put me down as Eve O. Bscuresports-Quarterly” — and doing list salting.) But if I want to ensure that an identifier never gets reused, it seems best if I never know it, particularly if it’s in easy-to-remember URL/XRI form.

This time I think I haven’t lost myself along the way, but if I’ve still lost you and you’re masochistic enough to want a V3.0 (or sadistic enough to want to take a cluebat to me), let me know!

MORE: Johannes discusses how essential it is to be addressable on the web in order to exist there. Though it took some time for me to get my head around it, this now makes a ton of sense to me in the open-Internet context (and heck, that’s why we made the very first design goal of XML be “XML shall be straightforwardly usable over the Internet”). However, if you want to allow for an anonymous-authorization use case or for any level of privacy at all with something like OpenID, is it sufficient to make the identity provider addressable?


4 Comments to “When my identifier is none of your business”

  1. David Kearns 13 November 2006 at 8:36 am #

    If this URI is a) only used once and b) not related to your primary/unique identity then it can hardly be called an “identifier” can it? If all you need is an attestation that you are, say, “human” (or over 21, or a citizen of British Columbia, etc.) then assert that. If you need to tie the assertion to a session token then do that. But, please, don’t muddy the already murky waters still further by calling it an “identifier”!

    -dave

  2. Eve M. 13 November 2006 at 10:27 am #

    Interesting point and good food for thought! But on reflection I’d prefer to make the opposite case. For example, if you choose a label as the “key” for representing a particular slice/persona/entry point for your identity, what’s not identifier-like about that?

    Also, I’d say that pseudonyms are indeed a kind of identifier, just ones with “hiding” properties. And a URI *is* an identifier in its essence (that’s what the “I” stands for), whether or not the resource it represents is ephemeral. (See innumerable discussions by TimBL and Norm Walsh…) Even in SAML, where identifiers aren’t necessarily expected to be URLs, pseudonyms are certainly treated as a class of identifier.

    Another question to ask: “What’s one-time about a one-time identifier?” It’s sort of a misnomer usually. If the identifier lasts for only a single session (like SAML’s transient pseudonym), the whole point is to have it available for multiple operations, such as later doing a single logout after a SSO. If the identifier lasts for the entire length of an IdP-RP-user triple’s relationship (like SAML’s persistent pseudonym), it will get used way more than once.

    Certainly for “attribute-based authorization”, there’s not strictly a conceptual need to provide an identifier of any sort, since you could just pass along a package of attributes, as you point out. However, the systems being used are often “identifier-based” and tend to work best by creating a temporary account and (throwaway?) identifier for that one-time usage. But I suppose in that case it’s almost a transaction identifier, so *something* is getting uniquely identified.

  3. […] Johannes thoughtfully addresses the questions I posed on OpenID identifier matters. Here are a few more thoughts in response. […]

  4. […] So, anonymity first: Many people have written at length about the value of keeping your identity secret, even while going about your (necessarily public) business of living. It’s one of the reasons that people have been nervous about any kind of Single Identity Provider in the Sky that “knows” all of us. It’s why Sun has a Chief Privacy Officer (hi, Michelle!) who serves as a steward of information about Sun’s employees, customers, and partners — in many cases to ensure legal compliance. It’s why Phil Zimmermann invented PGP. It’s even been discussed as a use case on an OpenID list. Since I can already tell this post is gonna be long (and I’m just getting warmed up!), I’ll just assume we can agree there are sometimes good reasons to protect one’s identity from being exposed. […]