Archive for Security/identity

The Economist and “ecto gammat”

Remember in The Fifth Element when Leeloo threatens to shoot Korben Dallas for stealing a kiss, saying “ecto gammat”? Turns out it means “never without my permission”. A good rallying cry for personal data sharing in today’s world!

The Economist has a thoughtful article called The Data Deluge on the benefits, and the privacy risks, of making better use of the torrent of data (it mostly focuses on, but doesn’t ever say, “personal” data) being generated in all kinds of business and marketplace endeavors. My favorite part, ’cause I share this assumption with the author:

The best way to deal with these drawbacks of the data deluge is, paradoxically, to make more data available in the right way, by requiring greater transparency in several areas. First, users should be given greater access to and control over the information held about them, including whom it is shared with.

This article makes a great companion to this meaty blog post by Iain Henderson laying out a serious vision for the notion of a personal datastore as a personal data warehouse. Iain knows whereof he speaks; he’s been in the CRM business a long time, and runs the Kantara InfoSharing work group (along with Joe Andrieu, another thoughtful guy who’s passionate about this stuff). I’m lucky to have both of them on my entirely complementary User-Managed Access group, UMA serving as a technological match for InfoSharing use cases.

I tried to add a comment to the Economist article about an aspect it didn’t cover: the quality of the personal data that’s floating around. Either this commenting effort completely failed, or in the fullness of time three copies of the same comment will appear — sigh. In the spirit of using this blog as my pensieve, here’s the main bit:


Volatile data goes stale. Excessive data collected directly from people is often larded with, to put it bluntly, lies. (To acquire a comment account on this site, I was required to provide my given name, surname, email address, country of residence, gender, and year of birth. If everyone were totally honest when signing up, that’s a powerful set of facts with which to locate and track them pretty precisely. You can tell which fields are excessive by looking at which ones people lie to…) And data collected silently through our behavior is, at best, second-hand and can never know our true intent.

Privacy is not secrecy (says digital identity analyst Bob Blakley). It is context, control, choice, and respect. Ideal levels of personal data sharing may actually be higher in total than now — but more selective. And they won’t be interesting to people without offering convenience at the same time.


Wouldn’t it be great to get out of the defensive crouch of “never without my permission” and turn it into “with my permission, sure, why not, it’ll help me just as much as it will help you”?

(Any bets on whether I told the truth and nothing but the truth when I registered at the Economist site?)

Comments (1)

Digital shadow cruft

Robin Wilton’s post on Google Buzz hits the nail(s) right on the head(s). The benefits of social networking center on human-to-human connectedness and collaboration, but the entire “social networking” construct obscures the fact that it’s really human-to-application-to-human. In revealing information that its users never authorized nor expected to be revealed, Google has created digital shadow cruft.

Comments

Experiences not to miss

Experiences not to miss:

(I will not say “Join the conversation”, I will not say “Join the conversation”, I will not say “Join the conversation”…)

Comments (2)

How to rest assured

Everybody’s talking about identity assurance these days, meaning, generically, the confidence a relying party needs to have in the identity information it’s getting about someone so that it can manage its risk exposure.

A lot of the conversation to date has revolved around NIST Special Publication 800-63 (newer draft version here) and its global cousins, which boil down assurance into four levels — hence all the loose talk of LOA (for “level of assurance” or sometimes AL for “assurance level”), even when people aren’t focusing on specific levels or even systems of assurance numbering. NIST 800-63 is intended to answer the use cases defined in OMB Memo 04-04, which deals with making sure users of the U.S. Federal government’s online systems are who they purport to be. Here’s an example given in OMB M-04-04 for one particular need for level 3 assurance:

A First Responder accesses a disaster management reporting website to report an incident, share operational information, and coordinate response activities.

And here’s how NIST 800-63 defines assurance (I’m quoting the Dec 2008 draft here; strangely, the official Apr 2006 version doesn’t include a formal definition):

In the context of OMB M-04-04 and this document, assurance is defined as 1) the degree of confidence in the vetting process used to establish the identity of an individual to whom the credential was issued, and 2) the degree of confidence that the individual who uses the credential is the individual to whom the credential was issued.

So there’s an identity proofing component at registration time that nails down the precise real-world human being being referred to, and there’s a security/protocol soundness/authentication component at run time that establishes that the credential is being waved around legitimately. These get added up into four levels defined roughly like this (leaving aside the security and protocol soundness factors):

nist-matrix

(Here, “same unique user” means that the same user can be correlated by the RP across sessions. And “verified name provided” means that the user’s real-world name is exposed to the RP, versus some sort of pseudonym; level 1, where no proofing is done, is implicitly pseudonymous, while level 2 offers a choice.)

I don’t mean at all to criticize this rolled-up four-level approach. It seems to have met the needs set out in M-04-04, and it predated both the “user-centric” movement (Dale Olds has a nice rundown of its use cases here) and truly modern notions of online privacy.

But I think we need more clarity about assurance use cases and terminology, for two reasons: One is to help ensure that identity providers can give RPs what they need, rather than what might just be a poor approximation based on NIST 800-63’s fame. The other is to help ensure that IdPs give RPs only what they need, since more assurance is likely to involve more personal information exposure.


To that end, let me explain some assurance use case buckets I’m seeing in the wild, and their relationship to the NIST requirements and each other. First, here are some use case buckets hiding in plain sight in the NIST levels:

buried-use-cases

Simple cross-session correlation: While NIST 800-63 doesn’t formally include “same unique user” as a goal, it’s in there:

Level 1 – Although there is no identity proofing requirement at this level, the authentication mechanism provides some assurance that the same claimant is accessing the protected transaction or data.

Funnily enough, cross-session correlation (without the baggage of proofing) is a key requirement of many enterprise and Web federated identity interactions. Lots of sites don’t need or want to know you’re a dog; they just need to know you’re the same dog as last time. This way, they can authorize various kinds of ongoing access and give you something of a personalized experience across sessions. Though NIST treats this as an also-ran and couples it with weak authentication in level 1, other use cases may have reason to match up “mere correlation” with higher authentication.

Identity proofability: If an RP can trust that it’s dealing with a human being who has some level of serious representation in civil society, it’s a powerful kind of assurance for lots of purposes. More about this below.

Real-world identity mapping: When level 3 or 4, or verified-name level 2, is used, this means a user’s real name is used to build up the unique identifier that the RP sees, and this verified name leaks PII like crazy, even if it’s not itself unique. (As far as I know, I’m the only Eve Maler out there…) This is strong stuff, and in a modern federated identity environment, it is to be hoped that most RPs simply don’t need this information. (John Bradley — that is, the John Bradley who works with the U.S. government on its ICAM Open Identity Solutions program — tells me he believes pseudonyms should be an acceptable choice all up and down the four levels, indicating that this use case bucket is fairly rare.)


Now things get really interesting, because there are other use case buckets that you can sort of see in this matrix if you squint, but really they’re just different:

addl-use-cases

Anonymous authorization/personalization: This is the flip side of cross-session correlation. OMB M-04-04 talks about “attribute authentication” and the potential for user attributes to serve as “anonymous credentials” (where an RP simply can’t know if this is the same unique user coming back but can still base its authorization decisions and personalization actions on the veracity of the attributes it’s getting). The attributes in question can range from “this user is over 18″ to “this user is a student at University ABC” to “this user is of nationality XYZ”.

Ultimately M-04-04 puts the whole area of attribute authentication firmly out of scope, but lots of folks have been picking at the general problem of attribute assurance in the last several months — like Internet2 in its Tao of Attributes workshop, and the Concordia group in a forthcoming survey (stay tuned for more on that).

This bucket often requires being able to check who issued some assertion or claim, and considering whether they’re properly authoritative for that kind of info. The way I think about this is: Who has the least incentive to lie? That’s why you can be said to be truly authoritative for self-asserted preferences such as “aisle vs. window”. Any other way lies madness (“What is your favorite color?” “Blue. No yel– Auuuuuuuugh!”).

Of course, there are cases where an RP really does need attribute assurance along with other kinds, like correlation or identity mapping. And don’t forget that it takes precious little in the way of personal information for an RP to figure out “who you really are” anyway. (Check out this cool Tao of Attributes diagram, which touches on all these points.)

Financial engagement: Sometimes an RP just just wants some assurance they’re dealing with someone who has sufficient ties to the world’s legitimate financial systems not to screw them over entirely. It turns out that identity proofability can often be a serviceable proxy for this kind of confidence. (Financial account numbers are one kind of proofing documentation in NIST 800-63.) And the reverse is also true: financial engagement can sometimes give a modicum of confidence in identity proofability.

Interestingly, this bucket can be useful even without any of the other kinds, partly because the parties can lean on a mature parallel financial system instead of just lobbing identifiers and attributes all over the place. For example, users often “self-assert” credit card numbers (which RPs then validate out of band with the card issuer), or use third-party payment services like PayPal (where the service provider does a lot of the risk-calculation heavy lifting).


No doubt there are other assurance use cases. Understanding them more deeply can, I think, help us get better at sharing the truth and nothing but the truth about people online — without having to expose the whole truth.

(Thanks to John Bradley, Jeff Hodges, and Andrew Nash for comments on early drafts of this post. And check out Paul Madsen’s many excellent commentaries on assurance matters.)

Comments (7)

Both a data borrower and a data lender be

Christian Scholz and his Data Portability Project pals have roped me into their Data Without Borders podcasts. On Friday, Christian and Trent Adams and Steve Greenberg and I had some fun relaunching the series by talking about the DPP Terms of Service and End-User License Agreement (TOS/EULA) task force.

Steve was passionate in describing this work. I think he’s right when he says that you first have to ensure that people are aware of a site’s terms of service; disclosing them in a form human beings can grok (à la Creative Commons or the nutrition label approach I wrote about here) can begin to empower humans to change things if they so desire, using a variety of means.

At one point we talked about the Archive Team project run by Jason Scott, which I think of as “data portability of last resort”. These folks are like digital historian ninjas who swoop in to save data that might otherwise be lost forever — like everything on GeoCities.

The thing is, website-sanctioned bulk import and export of data isn’t all that huge an improvement on this kind of rescue operation. True data portability wants granularity and timeliness. For example, if you choose to host (so to speak) your current location info at FireEagle, you might still want to reuse it in other places for other purposes, and luckily OAuth lets FireEagle, Dopplr etc. give you a nimble and safe way to “port” this data back and forth.

This is a kind of data statelessness, in that when you tell various sites they can set, read, and republish your location, they’re letting go of any pretense of exclusive hosting control so that they can offer you a different kind of value.

Now, in the IdM and VRM worlds, some of us have been talking about identity statelessness for a while, which is similar but looks more like straight data-sharing (reading) rather than arbitrary service access (setting). For some reason this is a tougher sell — even though CRM systems and user accounts are shot through with pale copies of stale data (and, in the enterprise case, even though syncing directories and replicating databases is brittle and no fun).

Even when one party — say, you yourself — is authoritative for some piece of personal data (like your home address), all the sites insist on making you provision a copy of this data into their profile pages by hand and by value, and insist on thinking they own something truly valuable even after you move and forget to tell them.

In short: To the extent data is volatile, copies of it leak value. If the chain of evidence between its authoritative source and a recipient of data is broken, it quickly becomes value-free. And if the chain of authorization breaks, you’ve got digital shadow cruft. Why oh why can’t we get to a place where, as Scott Cantor put it to me once, identity-aware apps think in terms of data caching rather than data replication?

The Data Portability TOS/EULA work is helping us raise our standards for what true data portability should look like: Open Arms – Ever Fresh – Graceful Exit. OAuth already helps us get a bit beyond disclosure of site terms, closer to a world where users have an active say in what sites do with our stuff. I’m hoping UMA (recent deep-dive Technometria podcast here) can help us go even further because of its notion of user-dictated terms that recipients must meet in order to have the privilege of fresh access.

We’re likely to discuss this topic in the DWB podcast sometime soon, so I hope you’ll give a listen.

Comments (5)

A Venn of identity in web services, now with OAuth

In the past week, several people approached me with the idea of incorporating OAuth somehow into the Venn view of identity. Feels like more of that “destiny” Ashish invoked a couple of weeks ago — especially since I had already developed just such a Venn for my XML Summer School talk last week.

My very first Venn of Identity blog post also included a second diagram, covering something like “identity in web services”. It was little-noticed, I think, because the deployment of the more esoteric pieces of WS-* and ID-WSF was pretty low. I’ve been itching to add OAuth to it, given its wildfire-esque spread. Last week gave me my excuse, and with further feedback (thanks Paul and Dom!), I’ve continued to revise it. So here’s a new version for your perusal (click to enlarge).

VennOfBCID-Oct2009

As with the original version, the relative heights and sizes are significant: they indicate roughly how voluminous, vertically applicable, and far away from “plumbing” each solution gets. (Unlike the original, however, this one seems to give off a Jetsons vibe.)

Some thoughts from space-age 2009:

OAuth is helping many app developers meet their security and access goals with minimal fuss (80/20 point, anyone?), and by providing for user mediation of service permissions, it is easily as “user-centric” as any other technology claiming the title. It’s these lovable qualities that led the ProtectServe/User-Managed Access effort to use OAuth as a substrate.

ID-WSF still provides identity services functionality that nothing else does, and some folks I’ve been talking to lately still chafe at the lack of more widespread support for these features. But obviously it’s still a “rich” solution vs. a “reach” one.

WS-*, ah yes, what to say?… It uniquely solves certain issues, but do all of them really need solving? My Summer School trackmate Paul Downey had some choice words about this, and his WS-TopTrumps class exercise proved that the star in WS-* really does match everything possible — that’s too much. And trackmate Marc Hadley pointed out lots of benefits you get “for free” with a REST approach, which it was hard not to notice when we all chose to design REST interfaces for his class exercise despite having a SOAP option.

To be fair, Paul and Marc and also trackmate Rich Salz — who has an uncanny ability to explain complex security concepts simply — stressed the value of the core pieces for message security if you’re using SOAP. It would be interesting indeed if OAuth, or extensions to it with the same pure-HTTP design center, were to “grow leftward” to accommodate the use cases covered by the WS-*/ID-WSF intersection.

(Anyone think the new REST-* effort will win in this space anytime soon? I’m a bit dubious, myself. Its name sure didn’t inspire any love in our lecture room.)

Comments (5)

The Zen of Venn

“You will never be done with the Venn. That’s your destiny. Accept it.”

So said my colleague Ashish recently, as I agonized over some tweaks to the Venn of Identity diagram. The editing started out as a quick fix to the figure that appears in the IEEE Security and Privacy article of the same name; the diagram text was exactly what Drummond and I had specified — but the graphic emerged from the publication process visually “broken”, with no intersection lines.

But of course technologies and understandings and use cases evolve, and it began to seem like a good time to update the text too. What with the new U.S. federal government effort around Open Identity Solutions for Open Government (and PayPal’s involvement in same), I thought maybe I could do a better job of capturing the main strengths OpenID, InfoCard, and SAML bring to today’s table.

In that Zen-like and Concordic spirit, I hereby present a new — date-stamped — version of the Venn (click for the full-size .png):

VennOfIdentity-Sep2009

I hope this new version can continue to support productive discussions that help solve real-world identity problems.

If you’re wondering whether it’s okay to pick up and reuse the diagram — go for it! Just please note the Creative Commons license below. I’ll keep VennOfIdentity.org pointed to the new Venn category on my blog so that people who see propagated copies can keep up with updates if they like.

Creative Commons License The Venn of Identity – September 2009 by Eve Maler is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

p.s. Thanks to “W.” of the Tech and Law blog for our great email exchange this week on Venn-shaped matters, which sparked even more edits…

Comments (4)

Privacy nutrition labels

The recent Symposium on Usable Privacy and Security, SOUPS 2009, seemed to cover a whole lot of interesting topics. One of these days I hope to attend for real — but failing that, I’m just working my way through the proceedings slowly. One paper, A “Nutrition Label” for Privacy, is especially cool.

The researchers have gotten pretty far down the path of rationalizing website privacy policies into a graphical/tabular form that’s actually enjoyable to use (their word! and they have numbers to back it up!). Whereas such policies in natural-language form are usually wordy, complex, inconsistent, and stubbornly irrelevant to a user’s actual preferences, their proposed label format provably borrows the benefits of real U.S. FDA nutrition labels, such as making a policy more amenable to at-a-glance interpretation, allowing you to compare two policies, and providing visual boundaries for the regulated/trustable portion of what you’re seeing.

The data categories in the label are a very high-level, “cooked” version of what’s in the Platform for Privacy Preferences (P3P) policy system. It’s worthwhile asking if the labels, and even the original sophisticated descriptions of data collection and use that they’re based on, are measuring the right thing. (After all, I have very little confidence that actual FDA Nutrition Facts labels are measuring the right thing.) But the categories they list seem like a pretty good start; “your activity on this site”, for example, turns out to be one of the biggest loopholes in many of today’s prolix-but-slippery privacy policies:

  • contact information
  • cookies
  • demographic information
  • financial information
  • health information
  • preferences
  • purchasing information
  • social security number & govt ID
  • your activity on this site
  • your location

Now I’m consumed by the thought of letting a person use this matrix-based approach to configure her ProtectServe-enabled relationship manager, such that any would-be recipient has to meet her privacy terms if they want to get the goods…

Comments (2)

Making change

So last week I made a big transition, joining Andrew Nash’s identity services team at PayPal. (And I kind of told Twitter about it before I told y’all. Sorry about that; it’s the nature of the communications beast.) Working with Andrew, Ashish, and other great folks at PayPal is going to be a blast. And it’s an especially interesting time to shift from a technology-stack-providing world to a consumer-facing one.

Being with Sun Microsystems for ten years was an honor and a pleasure; I got to work closely with some of the most talented and interesting folks in the business. And during that time my experiences helped me layer new personae onto “old SGMLer”: “XMLgrrl”, “the SAML lady”, and even, ahem, “the queen of Venn”.

You’ll still find me involved in some familiar activities — for example, I remain involved in ProtectServe and User-Managed Access efforts, and I hope to keep up my fledgling Tek-Tips video-blogging series on identity and the cloud (#1 on the relevance of federated identity to cloud computing, #2 on the challenges of passwords for authenticating to cloud services).

Thanks for continuing to witness my pushing of string over here. I plan to continue blogging my thoughts on matters of identity, security, privacy, and trust (and occasionally nutrition, music, and knitting…), and look forward to your feedback. You can find fresh contact and bio information on my welcome page; drop me a note anytime.

Comments (1)

Concordia workshop: the secret word is authz

Dave Kearns asks, I deliver — in two parts (so far)…

Concordia workshop report

Monday’s Concordia workshop at Catalyst was a surprise and a delight. We tried to make it a more interactive and intimate experience than the mega-carnivals we do at RSA: check. We set up a theme — identity in Enterprise 2.0 — and hoped for a bunch of interesting use-case submissions to tee up the subject: check. We worried that the diverse agenda would hang together: we needn’t have. A leitmotif emerged pretty quickly: authorization.

A crack team of volunteer tweeters, led by Brett McDowell (in English; I helped!) and Tatsuo Kudo (in Japanese), helped keep the outside world connected to our discussions (searching #catalyst09 concordia gives you an accessible sampling, but looking for tweets on July 27 for just #catalyst09 will give a more complete listing).

All presentations and original sources are now linked from the workshop agenda, and I strongly encourage you to check out this rich material. Attendees were enthusiastic about the new XACML profile work and our Burton speakers’ thoughts on the complexity of social networking in enterprise settings (thanks again to Mike Gotta and Alice Wang for presenting some exciting/scary scenarios, and to Burton as a whole for continuing to support our Concordic efforts). And people had lots of useful feedback on the Levels of Assurance survey idea we’ve been hammering on for a couple of months now — basically, we think we’re going to start with in-depth interviews instead, since all our questions are open-ended and lead to more questions.

If you want to help us figure all this out going forward — including possibly contributing multi-technology authorization use cases for future interop experimentation — don’t forget to join Concordia in its new guise as a Kantara discussion group! Here are simple instructions.

ProtectServe and UMA deeper dive

At the workshop I had a great opportunity, given that my User-Managed Access group is just spinning up, to do a quick overview of the ProtectServe work that has inspired UMA and to review some alternative “use-case topologies” that could satisfy a single generic scenario in different ways. Srijith Nair et al. of BT submitted an interesting ProtectServe use-case document, and in my workshop presentation I walked through some of the implications.

The scenario I highlighted is about an employer and an employee, and the fact that both might want to impose their own constraints on the sharing of the same piece of information. Examples of pieces of information your employer holds that you might need to share (the Liberty ID-SIS employee profile spec might suggest more):

  • Employment status (often needed when you apply for a loan)
  • U.S. Internal Revenue Service W-4 (tax withholding) form details (handy for sharing with accountants and investment planners)

I (sort of ab)used the Scrum concept by formulating the following “user stories” that capture what’s special about the need:

  • As an employee, I want to audit and control the further dissemination of information my employer must know about me as a condition of employment.
  • As an employer, I want to adhere to laws and best practices regulating my sharing of information about my employee.

Three obvious ProtectServe entity topologies present themselves, each with a different sweet spot:

employer-1
#1: Employer as authorization manager and service provider

This topology preserves an explicit place for the employer to apply its own sharing policies — the authorization manager (and enclosing relationship management app) that it hosts itself. However, I think this is probably a “legacy” solution because it forces the employee to seek out other relationship managers in the outside world where they’re just an individual rather than an employee, and I can’t think of very good reasons for the employer to host this AM/RM other than corporate inertia (admittedly, a force to be reckoned with). Maybe I’m wrong, though, and a good reason will emerge.

employer-2
#2: Employer as service provider

For information for which the employer is authoritative (”Is this person employed here?”), it should host a service provider willing to attest to this on request (in accordance with the instructions issued by the employee’s personal AM). If the employer doesn’t want to release the data even though the employee is cool with the sharing, it could use existing access control mechanisms that are out of band with respect to ProtectServe, perhaps only surfacing a response code that reflects its refusal. (Ah, there’s a potential requirement for the UMA work if this use case is accepted by the group.)

employer-3
#3: Employer as consumer

For information that the employee already self-asserts to the employer (”What is the employee’s home address of record?”), why can’t the employer consume this data in the same way some other “vendor” (online service) on the open Internet could? If the employee moves, a number of workflow actions have to unroll on the employer’s side as they would have anyway (in the U.S., moving to a different state might involve withholding a different amount of state income tax), but this is already handled in existing systems when the employee provisions the new information into employee portal apps by hand. An on-board “personal datastore” service provider is shown here as being hosted out of the same relationship manager app as the user’s chosen AM, but the SP could just as easily have been hosted remotely somewhere.

If you have thoughts on this, either about the problem space or the solution space, please consider joining the UMA group and helping out!

Comments

« Previous entries