Security/identity · 2006-11-22

A universe of identifiers

Johannes thoughtfully addresses the questions I posed on OpenID identifier matters. Here are a few more thoughts in response.

I suspect the world doesn’t have enough aggregate experience yet to know how people will use their (one or more) identifiers; the promise implied in the notion of having many URL-based identifiers governed by oneself hasn’t been thoroughly explored yet. So the NetMesh answer to question 1 is probably reasonable for now, but might break down if people become enamored of inventing their own pseudonyms for various purposes.

I generally agree with Johannes’s comments about the difficulty of keeping users safe from themselves. I was poking at this issue because in the SAML universe, the user would typically never touch a pseudonym, so we have this as an existence proof of good privacy management. That’s where my second question was crucial: If OpenID is happy to allow users to provide only their identity provider’s URL to the relying party, then a bunch of good privacy things can flow. Thus, I think I disagree with his point about privacy levels in the pseudonym case vs. the well-known URL case.

To explain this, I need to delve into the universe of URL-based identifiers further. Here’s the taxonomy that’s currently floating around in my head:

  • A digital subject is (to make a simplifying assumption) a natural person, such as…me, Eve Maler.

  • A digital identity is a set of identifying info about me represented by a single identifier (OpenID makes this a resolvable URL). It presumes a set of “profile management” (attribute/claim provisioning) tasks that I undertake with the provider of the identity verification (authentication) services through that identifier, whether that IdP is on my own server or someone else’s. I can choose to share this identifier with relying parties in order to get verified (and, with technologies such as SAML today, also get my identifying attributes shared across). One of my digital identities might be keyed off of a URL like, say, https://www.xmlgrrl.com/eve.

  • I, a digital subject, may have multiple digital identities, keyed off of different identifiers, which may be verified/managed by the same IdP or by different ones, and which only I (a human) can correlate — any one IdP cannot 100% safely calculate or assume “they’re all Eve”.

  • A persona (I’m clearly insane getting into the persona swamp again) is a “view” onto a particular digital identity, managed by the IdP for that identity, keyed off of an alternate identifier supplied by me and associated with various bits of policy about what info about me can be shared and with whom. The value of these would presumably be that humans will know about their own persona URLs and supply them to RPs, and so these would be well-known, not secret. I don’t think we have much experience to speak of regarding persona URLs, but people do this with email addresses all the time (and most of us have multiple ones) — e.g., I declare on my site that if you send mail to eve-at-xmlgrrl.com, I will deem it to be bloggable unless you indicate otherwise. I can imagine using https://www.xmlgrrl.com/xmlgrrl (vs. …/eve) to correspond to the sharing of a relatively “safe” set of attributes with a relatively broad set of RPs.

  • An IdP always has a well-known identifier to facilitate my ability to avoid sharing an identifier of my own directly with an RP; as Johannes points out, this is a feature of the emerging OpenID 2.0. (If we were talking heavyweight web services, the IdP identifier would serve as something like an “endpoint address”.) Hmm, extending the notion of personae, I can imagine IdPs having “personae” (alternate URL identifiers) that would correspond to different sets of per-IdP policy that I want to activate (e.g., logging or RP authentication) when supplying that URL to an RP, but now I’m getting into generalizing-from-no-examples territory…

  • A pseudonym is an identifier that is designed to be secret and to be the key for a particular triple of an IdP, an RP, and a digital identity for some short or long period of time. The IdP knows which digital identity (or persona) is being invoked, but the RP only knows the pseudonym, which allows it to talk about “the same” identity with the IdP (and no other web app) — it’s anonymous correlation of a limited sort.
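
To make the taxonomy above a little more concrete, here is a minimal Python sketch of how the pieces might relate. Everything in it is an assumption made for illustration: the class names, the attribute values, the idp.example.com and shop/forum URLs, and the choice of a random token for the pseudonym. It is not how OpenID or SAML actually models any of this.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Persona:
    """A public "view" onto one digital identity: its own URL plus a sharing policy."""
    url: str
    shareable: Set[str]


@dataclass
class DigitalIdentity:
    """Identifying info about a subject, keyed off a single identifier."""
    url: str
    attributes: Dict[str, str]
    personas: Dict[str, Persona] = field(default_factory=dict)

    def add_persona(self, url: str, shareable: Set[str]) -> None:
        self.personas[url] = Persona(url, set(shareable))

    def view(self, persona_url: str) -> Dict[str, str]:
        """Return only the attributes that the persona's policy allows to be shared."""
        allowed = self.personas[persona_url].shareable
        return {k: v for k, v in self.attributes.items() if k in allowed}


@dataclass
class IdentityProvider:
    """An IdP with a well-known URL; it alone maps pseudonyms back to identities."""
    url: str
    identities: Dict[str, DigitalIdentity] = field(default_factory=dict)
    pseudonyms: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def pseudonym_for(self, identity_url: str, rp_url: str) -> str:
        """One secret, opaque value per (this IdP, RP, digital identity) triple."""
        key = (identity_url, rp_url)
        if key not in self.pseudonyms:
            self.pseudonyms[key] = secrets.token_urlsafe(16)
        return self.pseudonyms[key]


# Hypothetical data: one identity with one relatively "safe" persona.
idp = IdentityProvider("https://idp.example.com/")
eve = DigitalIdentity(
    "https://www.xmlgrrl.com/eve",
    {"name": "Eve Maler", "nickname": "xmlgrrl", "email": "eve@example.org"},
)
eve.add_persona("https://www.xmlgrrl.com/xmlgrrl", {"nickname"})
idp.identities[eve.url] = eve

print(eve.view("https://www.xmlgrrl.com/xmlgrrl"))
print(idp.pseudonym_for(eve.url, "https://shop.example.com/"))
```

The persona URL is something I would hand out freely, while the pseudonym never passes through my hands at all: the IdP mints it and only that one RP ever sees it.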

Johannes says:

…the relying party necessarily needs to have some kind of “handle”, relative to the identity provider, that uniquely identifies my account there, as opposed to somebody else’s. (Otherwise I would get their shipments, or they my invoices.) By concatenation of the identity provider URL and the local handle, we arrive at the same privacy (or lack thereof) as in case of using the identity URL in the first place.

If the “handle” has to be a URL (e.g., it’s constructed by doing the simple concatenation described above), then yeah, this is inherently not particularly private, though I can imagine constraints that the protocol could place on the RP’s sharing of the URL, rules about doing mutual auth when resolving it, etc. But I see no structural reason for it to be a URL; the RP can get it from the IdP as part of the exchange of identity verification information. (This is, in fact, how SAML does it when a pseudonym is requested.) Since the purposes of a pseudonym (defined to be secret) and a URL (the currency of the open Web) are at odds, there might be a bit of relief in relaxing the notion that a pseudonym has to be a URL, while assuming that persona URLs can always be made public.
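
The difference between the two kinds of handle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the IdP URL, the RP URLs, the demo secret, the dictionary-shaped "response", and the use of an HMAC to derive the opaque value are all hypothetical choices of mine, not the SAML wire format or anything OpenID specifies.

```python
import hashlib
import hmac

IDP_URL = "https://idp.example.com/"   # hypothetical well-known IdP identifier
IDP_SECRET = b"demo-only-secret"       # hypothetical; a real IdP would keep this private


def concatenated_handle(local_handle: str) -> str:
    """IdP URL + local handle: every RP sees the same, resolvable value."""
    return IDP_URL + local_handle


def opaque_pseudonym(local_handle: str, rp_url: str) -> str:
    """A non-URL value that differs per RP and reveals nothing about the local handle."""
    message = f"{local_handle}\n{rp_url}".encode("utf-8")
    return hmac.new(IDP_SECRET, message, hashlib.sha256).hexdigest()


def build_response(local_handle: str, rp_url: str) -> dict:
    """The RP learns the pseudonym from the IdP as part of the verification exchange."""
    return {"issuer": IDP_URL, "subject": opaque_pseudonym(local_handle, rp_url)}


shop = build_response("eve", "https://shop.example.com/")
forum = build_response("eve", "https://forum.example.org/")

# Two RPs comparing notes can match on the concatenated URL...
print(concatenated_handle("eve"))
# ...but not on the opaque subjects they were each given.
print(shop["subject"] == forum["subject"])
```

Each RP still gets a stable handle for "the same" account on repeat visits, which covers the shipments-and-invoices concern, without that handle being something two RPs could use to correlate me.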

So, what do folks think about this idea? In the meantime, it’s back to yarn-hunting for me. :-)