Tag Archives: VRM

Both a data borrower and a data lender be

Christian Scholz and his Data Portability Project pals have roped me into their Data Without Borders podcasts. On Friday, Christian and Trent Adams and Steve Greenberg and I had some fun relaunching the series by talking about the DPP Terms of Service and End-User License Agreement (TOS/EULA) task force.

Steve was passionate in describing this work. I think he’s right when he says that you first have to ensure that people are aware of a site’s terms of service; disclosing them in a form human beings can grok (à la Creative Commons or the nutrition label approach I wrote about here) can begin to empower humans to change things if they so desire, using a variety of means.

At one point we talked about the Archive Team project run by Jason Scott, which I think of as “data portability of last resort”. These folks are like digital historian ninjas who swoop in to save data that might otherwise be lost forever — like everything on GeoCities.

The thing is, website-sanctioned bulk import and export of data isn’t all that huge an improvement on this kind of rescue operation. True data portability wants granularity and timeliness. For example, if you choose to host (so to speak) your current location info at FireEagle, you might still want to reuse it in other places for other purposes, and luckily OAuth lets FireEagle, Dopplr etc. give you a nimble and safe way to “port” this data back and forth.

This is a kind of data statelessness, in that when you tell various sites they can set, read, and republish your location, they’re letting go of any pretense of exclusive hosting control so that they can offer you a different kind of value.

Now, in the IdM and VRM worlds, some of us have been talking about identity statelessness for a while, which is similar but looks more like straight data-sharing (reading) rather than arbitrary service access (setting). For some reason this is a tougher sell — even though CRM systems and user accounts are shot through with pale copies of stale data (and, in the enterprise case, even though syncing directories and replicating databases is brittle and no fun).

Even when one party — say, you yourself — is authoritative for some piece of personal data (like your home address), all the sites insist on making you provision a copy of this data into their profile pages by hand and by value, and insist on thinking they own something truly valuable even after you move and forget to tell them.

In short: To the extent data is volatile, copies of it leak value. If the chain of evidence between its authoritative source and a recipient of data is broken, it quickly becomes value-free. And if the chain of authorization breaks, you’ve got digital shadow cruft. Why oh why can’t we get to a place where, as Scott Cantor put it to me once, identity-aware apps think in terms of data caching rather than data replication?
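To make the caching-versus-replication distinction concrete, here is a minimal sketch. Everything in it is an illustrative assumption rather than part of any real system: the `AddressSource` stand-in for an authoritative source, the `CachingRecipient` class, and the `max_age_seconds` freshness window. The recipient keeps a value only for a short time, goes back to the source when the entry ages out, and discards its copy as soon as authorization is withdrawn.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class AuthorizationRevoked(Exception):
    """Raised when the source no longer lets this recipient read the data."""


class AddressSource:
    """Hypothetical authoritative source for one piece of personal data."""

    def __init__(self, address: str):
        self.address = address
        self.authorized = True

    def read(self) -> str:
        if not self.authorized:
            raise AuthorizationRevoked("access withdrawn by the individual")
        return self.address


@dataclass
class CachedValue:
    value: str
    fetched_at: float


class CachingRecipient:
    """Holds a short-lived cache entry instead of a permanent copy."""

    def __init__(self, fetch: Callable[[], str], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._entry: Optional[CachedValue] = None

    def get(self) -> str:
        now = self._clock()
        entry = self._entry
        if entry is not None and now - entry.fetched_at < self._max_age:
            return entry.value  # still fresh enough to reuse
        try:
            value = self._fetch()  # go back to the authoritative source
        except AuthorizationRevoked:
            self._entry = None  # chain of authorization broke: drop the copy
            raise
        self._entry = CachedValue(value, now)
        return value
```

A replicated copy would keep returning the old address forever; here the recipient picks up a move after at most `max_age_seconds`, and holds nothing once access is revoked.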

The Data Portability TOS/EULA work is helping us raise our standards for what true data portability should look like: Open Arms – Ever Fresh – Graceful Exit. OAuth already helps us get a bit beyond disclosure of site terms, closer to a world where users have an active say in what sites do with our stuff. I’m hoping UMA (recent deep-dive Technometria podcast here) can help us go even further because of its notion of user-dictated terms that recipients must meet in order to have the privilege of fresh access.

We’re likely to discuss this topic in the DWB podcast sometime soon, so I hope you’ll give a listen.

ProtectServe news: User-Managed Access group

After a few weeks’ worth of charter wrangling, I’m delighted to announce the launch of a new Kantara Initiative work group called User-Managed Access (UMA). Quoting some text from the charter that may sound familiar if you’ve been following the ProtectServe story:

The purpose of this Work Group is to develop a set of draft specifications that enable an individual to control the authorization of data sharing and service access made between online services on the individual’s behalf, and to facilitate the development of interoperable implementations of these specifications by others.

Quite a few folks have expressed strong interest in using this work to solve their use cases and in implementing the protocol (speaking of which, sincere thanks to the dozen-plus people who joined with me in proposing the group). With a basic design pattern that is as generative as ProtectServe seems to be, and with the variety of communities we’ll need to engage, it could be tricky to stay focused on a core set of scenarios and solutions, but I intend to work hard to do just that. Better to boil a small pond than…well, you know. Stay tuned for more thoughts on how I think we can accomplish this.

If you’d like to contribute to the continuing adventures of ProtectServe, please check out the User-Managed Access WG charter and join up! Here’s where to go to subscribe to the wg-uma list, which is read-only by default, and to become an official participant in the group, which gains you list posting privileges. (In case you’re wondering, there is no fee whatsoever for Kantara group participation.)

By the end of this week we’ll start using the list to figure out a first-telecon time slot, and I’ll provide updates on various group milestones here. If you’ve got any questions at all, feel free to drop me a line.

ProtectServe: getting down to (use) cases

The need for permissioned data-sharing and relationship management doesn’t discriminate in favor of, or against, any type of entity; the stick figures below, representing a data-discloser and a data-consumer, could be a big company and one of its suppliers (B2B), or a company and one of its customers (B2C), or a customer and one of the vendors in her life (C2B), a citizen and one of the government agencies he deals with (C2G), etc.

[Image: peer-to-peer-generic]

The process wants to become more “peer-to-peer” (P2P!) than it is today. Data-disclosers, often disadvantaged today because they’re pressured to over-disclose and under-enforce, need to be empowered in a more balanced way. But while we’d like our ProtectServe and Relationship Manager architecture to be suggestive for the general case, we had to get specific, and so our initial use cases involve a data-disclosing human being and a data-consuming web app; you can think of them as playing the roles of “customer” and “vendor” in VRM scenarios such as change-of-address.

Here were our major functional requirements:

  • Allow individuals to establish policies for each data-sharing relationship they have, as an interface mode separate from the login process
  • Allow individuals to conduct long-term relationship management, including modifying the conditions of sharing or terminating the sharing relationship entirely
  • Allow data recipients to retrieve data directly from authoritative sources, guided by policy, even while an individual is offline, reserving approval loops for extraordinary circumstances
  • Allow data recipients to retrieve individuals’ data from multiple online sources, on a one-time or repeated basis
  • Do this simply enough to attract adoption and energy
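As a rough illustration of how the first four requirements fit together, here is a small sketch. It is not the ProtectServe protocol or any Relationship Manager implementation; the class names, the field-level policy shape, and the sample data are all assumptions made up for the example. Policies are kept per recipient, can be changed or terminated at any time, and are checked on every retrieval, so the individual does not need to be online when a recipient asks.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class SharingPolicy:
    """Hypothetical per-relationship policy: which fields a recipient may read."""
    allowed_fields: Set[str]
    active: bool = True


@dataclass
class RelationshipManager:
    # source name -> field name -> value; stands in for several online sources
    sources: Dict[str, Dict[str, str]]
    policies: Dict[str, SharingPolicy] = field(default_factory=dict)

    def set_policy(self, recipient: str, allowed_fields: Iterable[str]) -> None:
        """Establish or modify the conditions of sharing for one recipient."""
        self.policies[recipient] = SharingPolicy(set(allowed_fields))

    def terminate(self, recipient: str) -> None:
        """End the sharing relationship; later retrievals are refused."""
        if recipient in self.policies:
            self.policies[recipient].active = False

    def retrieve(self, recipient: str, source: str, field_name: str) -> str:
        """Policy-guided retrieval that needs no live approval from the user."""
        policy = self.policies.get(recipient)
        if policy is None or not policy.active:
            raise PermissionError(f"no active relationship with {recipient}")
        if field_name not in policy.allowed_fields:
            raise PermissionError(f"{recipient} may not read {field_name}")
        return self.sources[source][field_name]
```

A real design would also need the approval loop for extraordinary circumstances, which this sketch leaves out.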

Obviously these requirements drive a lot of decision-making all by themselves, but soon enough even more specificity is needed. And that’s where Bob Blakley comes in. Bob kindly provided a lot of detailed feedback to me recently on our ProtectServe user experience mockups. In the course of our chat, he nicknamed two distinct use case categories we were hoping to solve for, which clarified my thinking in a big way.

One set of use cases involves a User explicitly provisioning a Consumer app with a way to get some set of data. For example:

  • Registering for a new online account (or even buying something on a “one-night stand” basis, with no ongoing account) and providing stock info like a shipping address and credit card data (likely packaged into a set somehow)

  • Providing calendar data to businesses to solicit event invitations (cable customer service, dentist’s appointment), or — in the case of travel calendars — to control mail/package/newspaper delivery or solicit travel-related offers of products and services (like country-specific prepaid calling cards)

  • Making home inventory data available to insurers, or to estate-sale catalogue assemblers

  • Making an album’s worth of photos from the latest vacation available to some group of friends and family, but reserving a few in the same album for a more select group
  • (Warning: meta-example!) Serving out some cooked form of Relationship Manager audit-log data to a company that builds reputation scores for Consumer apps

Noting that the user is fully in charge, and no Consumer even learns about the data’s availability without the User’s personal and active involvement, Bob gave this set of use cases the Delta Tau Chi name of Data Dominatrix.

We also worked with a secondary bucket of use cases, though it has presented us with interesting protocol and user experience difficulties: widely publishing the existence of data, then deciding whether to release it on request (where the requests were not individually solicited). For example:

  • Putting links to your calendars, vCards, etc. on your blog, and then fielding requests from every party that wants it

  • Offering a package of demographic data about yourself to any survey service willing to pay your price

In his inimitable way, Bob named this one Hey, Sailor. (Hmm, I’m sensing a theme here. What sort of girl does he think customers are? Then again, it doesn’t help that sometimes we want “one-night stands” in our online relationships!)

These use cases affected our choices around things like:

  • The dynamic nature of the introduction process between Consumers and other parties
  • The granularity of contract terms as they apply to data resources
  • Where users need to be involved real-time vs. where they at least want the option of real-time consent vs. where they don’t want to be bothered

By the way, we also discussed a third use-case bucket that has not been on my team’s radar, and which I don’t believe got a nickname: The User puts together a prospectus of data he’s willing to assemble, if the right offer is made by a potential Consumer. While this sounds very interesting, there are already enough business and technical question marks around the rest of the proposition to make me want to hold off. But hey, if anyone’s inspired to defend it (or name it!), let me know.

To protect and to serve

In the last year, I’ve done a lot of thinking about the permissioned data sharing theme that runs through everything online, and have developed requirements around making the “everyday identity” experience more responsive to what people want: rebalancing the power relationships in online interactions, making those interactions more convenient, and giving people more reason to trust those with whom they decide to share information.

In the meantime, I’ve been fortunate to learn the perspectives of lots of folks like Bob Blakley, Project VRM and VPI participants, e-government experts, various people doing OAuth, and more.

Together with some very talented Sun colleagues (special shout-out to team members Paul Bryan, Marc Hadley, and Domenico Catalano), I started to get a picture of what a solution could look like. And then we started to wonder why it couldn’t apply to pretty much any act of selective data-sharing, no matter who — or what — the participants are.

So today I’m asking you to assess a proposal of ours, which tries to meet these goals in a way that is:

  • simple
  • secure
  • efficient
  • RESTful
  • powerful
  • OAuth-based
  • identity system agnostic

We call the web protocol portion ProtectServe (yep, you got it). ProtectServe dictates interactions among four parties: a User/User Agent, an Authorization Manager (AM), a Service Provider (SP), and a Consumer. The protocol assumes there’s a Relationship Manager (RM) application sitting above, acting on behalf of the User — sometimes silently. At a minimum, it performs the job of authorization management.

We’re looking for your input in order to figure out if there are good ideas here and what should be done with them. (The proposal is entirely exploratory; my employer has no plans around it at the moment, though our work has been informed by OpenSSO — particularly its ongoing entitlement management enhancements.)

Read on for more, and please respond in this thread or drop me a note if you’re interested in following or contributing to this work. If there’s interest, we’re keen to join up with like-minded folks in a public forum.

[...]

Mydex demo: lovely identity harmonics

Asa Hardcastle, OpenLiberty rock star, has posted some details on an exciting demo he’s put together on behalf of Mydex.

The demo is a pretty sophisticated combination of identity-related technologies: information cards for authentication and transfer of service-bootstrapping info; XRI for keying into the Mydex personal datastore and some user-driven services; the Identity Web Services Framework (ID-WSF) for pointing off to other loosely coupled services; and SAML as the (ahem) “universal-solvent” assertion format. The use case being addressed here illustrates what, to me, is an important point: we are going to need both front-channel (through the user/user agent) and back-channel (service-to-service) data sharing in the real world, and our identity-enabled architectures need to empower individuals as fully as possible even in the latter case.

Iain Henderson of Mydex is plumbing an interesting issue in Vendor Relationship Management; he calls it Volunteered Personal Information or VPI. Iain’s VPI Special Interest Group is currently working on encapsulating an individual’s contract terms for data-sharing, and I believe this work will ultimately apply to the entire VRM problem space and, indeed, to all cases of “free-agent” identity on the ‘net. Check it out!

Where should data live? (part two)

Yesterday I said “you might have reasons for choosing different hosts for information that has different levels of sensitivity [or] needs for high-availability access”. Today I happened to run across a company that makes a business out of this:

The DocuBank Emergency Card provides immediate access to your healthcare directives, any time, anywhere they are needed.

DocuBank provides access to the following critical documents: Living Will, Health Care Power of Attorney, HIPAA release, organ donation form, hospital visitation forms, burial instructions and more. DocuBank makes your healthcare directives work.

They give you a card for your wallet that acts as the “discovery service” to get to the documents, and you need to have authorization to see them: either they’re about you, or you’re a healthcare provider who has specially registered to get access to this type of information.

Poking around online, I also just learned about the Washington State Living Will Registry, which seems to function much the same except that it’s run by the state.

I’m glad there’s a choice of providers for healthcare directives in break-glass scenarios — and I’m also glad I don’t have to host such information myself on the computer under my desk. After all, I could never offer myself a service-level agreement that I’d find acceptable…

Where should data live?

George Fletcher provides interesting commentary on a good social-web discussion by Om Malik. The issue: whether aggregation and federation of data are opposed or complementary.

George says:

[F]or aggregation to work in the “open web”, it must be able to access my data whereever I’ve chosen to place it.

I agree. If we’re looking to empower people, it’s not realistic to insist that all their information live in a single place. Just as inventing a new identifier type isn’t sufficient to eliminate all the various identifiers we already have in our lives — there are good reasons, not just legacy reasons, to have more than one — solving the problem of storing everything in one place isn’t sufficient to eliminate all the places information about us is stored. Here are two non-legacy reasons.

First, you should be able to choose (as George says) where to store information you created, and you might have reasons for choosing different hosts for information that has different levels of sensitivity, needs for high-availability access, needs for fine-grained access control specific to certain data types, etc. There needs to be an option not just to import/export everything en masse from one competing hosting environment to another, but also to tolerate multiple sources of data at once. It almost feels anti-web to prefer an architecture that requires everything to live together on one server.

Second, it doesn’t make sense to throw information about you for which you’re not authoritative (like your credit score, or really any reputation data) into one aggregation pile; you can’t control the value, but it’s still “yours” in lots of other senses. You might have the right to track who sees it, but a live copy shouldn’t reside in your one big database where you have write access. (I liked Gerry Gebel’s insight around this: Try to think of any application that relies on data from elsewhere as being stateless with respect to that data. Another way to think about it is that you want to achieve a sort of “first normal form”, where information properly lives wherever its authoritative source chooses for it to live.)

George notes that if you can authorize a relying party to get the data from whatever your preferred source is, you can get the best of both worlds. It’s aggregating some parts of data provisioning, usage, and auditing, but not the actual residence of the data.

I’ve become convinced that multi-sourced data access is a requirement for the core permissioned data sharing issue that’s common to identity, VRM, and social networking use cases.

I happened to do a webcast yesterday that describes the VRM proposition — you can watch the recording if you register for a free account — and I went into a bit of detail about the technical requirements I see, along with reviewing some of the architectures at our disposal for achieving them. (The good news is, there are already several…) [UPDATE: Slides are now available here.] I think I need to start adding this requirement to my list.

Close encounters of the third kind

There are three obvious patterns for how humans and identity-enabled apps can interact.

The first pattern is human-initiated. This is when you reach out to an app — say, visiting a browser bookmark for Dopplr — in order to use it in real-time. These days, often you log in somewhere else (at some identity “provider”) so that the app you want to use can “consume” information about you to do the job you want. Think single sign-on and login-time transfer of data about you.

The second pattern is app-to-app. This is when an app, having been previously introduced to other apps that are sources of info about you, talks to them about you on your behalf, even if you’re not around — like when FireEagle and Dopplr share your location info. But it’s done in a way that’s privacy-enhanced and sensitive to your preferences. OAuth has been great for demonstrating why this is valuable, and of course it’s the central point of Liberty Identity Web Services too. (Check out Paul Madsen’s helpful series comparing OAuth and ID-WSF.)

I’m thinking it’s time for the pattern of the third kind to get more attention: app-initiated. This is when an app needs your attention and reaches out to you to get consent, or data, or an acknowledgment of receipt. Today in the wild, we see lots of notices sent through email and SMS (package tracking, flight cancellations), but don’t have a good way to set up our preferences for the way apps request action on our part. The Liberty Interaction Service could be a part of the solution.

This third pattern seems absolutely key for managing privacy robustly, assuming you’re properly auditing the app-to-human contact and its results. Here are some of the scenarios that have come up recently.

Emergency contacts: When you travel internationally or sign up to get treatment from a doctor or surgeon, you usually have to provide an emergency contact. It would be better to do this by telling apps how to look up how to contact the person in question, rather than giving phone numbers or email addresses that can go stale or be inappropriate (too synchronous or asynchronous, or too unlikely to elicit a response) for a particular purpose. This would also be useful for a variety of delegation-type tasks, like indicating who’s willing to sign for packages while you’re away — especially in conjunction with the Liberty People Service.
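The lookup-instead-of-copy idea can be sketched in a few lines. This is only an illustration under assumed names (`ContactDirectory`, `ContactMethod`, and the `synchronous` flag are invented here, and none of it reflects the Liberty Interaction Service or People Service interfaces): the form you fill in stores a reference to the person, and the app resolves a suitable contact method only when it actually needs one.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ContactMethod:
    channel: str        # e.g. "phone" or "email"
    address: str
    synchronous: bool   # True if it reaches the person in real time


class ContactDirectory:
    """Hypothetical lookup service kept current by the contact person."""

    def __init__(self) -> None:
        self._methods: Dict[str, List[ContactMethod]] = {}

    def update(self, person_id: str, methods: Iterable[ContactMethod]) -> None:
        """Replace the person's contact methods, in their preferred order."""
        self._methods[person_id] = list(methods)

    def resolve(self, person_id: str,
                need_synchronous: bool) -> Optional[ContactMethod]:
        """Return the first method matching the purpose, or None."""
        for method in self._methods.get(person_id, []):
            if method.synchronous == need_synchronous:
                return method
        return None
```

Because the app stores only the `person_id`, a changed phone number is picked up on the next `resolve` call instead of going stale in a form somewhere.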

Integrating identity selectors: This was suggested by Pamela Dingle (in response to my critique of “classic” identity selector behavior at Burton Catalyst). Provision your Interaction Service to know how to fire up your identity selector when you’re online, so apps could use it to initiate contact with you to gather consent and get new claims. Cool idea, and maybe worth exploring a profile someday; I ended up mentioning it at a meeting of the Web Services Harmonization SIG so we wouldn’t lose it.

Health research: Gather consent when new uses of previously collected data arise (aggregating study data in a privacy-sensitive way), and gather more data over time for longitudinal studies. This idea came up at the Project VRM workshop in Boston, and it’s useful for not just health research but pretty much all VRM-enabled data-sharing scenarios — it can increase an app’s ability to gather less data on initial contact (the fewer required fields, the better!), and thus a human’s comfort level with choosing this vendor.

I get the idea that a lot of my Liberty colleagues haven’t gotten excited about the potential of the Interaction Service the way I have. Am I nuts? Am I missing other juicy use cases? What would it take to get something like this working in a standard way with things like OAuth?

Venn and the art of data-sharing

I come to the VRM world from a tradition (if that’s the right word) of digital identity management. With so many organizational efforts swirling around trying to create identity layers, data portability, metasystems, and suchlike, I kept noticing that there was a common set of bedrock features involving human beings and the networked apps they use. And, yes…I saw it as a Venn diagram.

I’ve been trying this out on folks for a while now, and used it in a couple of recent talks, particularly my Gnomedex 8.0 one. Here’s my thinking behind it. (This is more than a straight Venn because of the metaphorical shadow thingie. Couldn’t resist! My web services Venn “cheated” too.)

Digital identity management is, at base, about identification so app usage can be correlated and audited, authorization to provide secure controlled access, and personalization, all counterbalanced by privacy. It has a strong individual (single-human-to-app) bent, though sometimes it involves Shibboleth-style scenarios where you mostly track anonymous group members rather than unique people.

Social networking is about building feelings of connectedness and offering the benefits of collaboration, such as crowdsourcing. Social apps focus on human-to-human relationships, but to provide infrastructure for this, they have to do plenty of the human-to-app variety. Social networking today stresses revelation of personal details (the OpenSocial best practices doc is one example) much more than it stresses privacy, though the latter is an increasing concern.

VRM partly involves what could be called restriction of data flow — undoing vendors’ grip on users’ info in a way that’s familiar to proponents of privacy-enhanced and user-controlled IdM. But other VRM scenarios involve enhancement of individuals’ opportunities to share personal information, for example by issuing a personal RFP to potential vendors. As Doc Searls has said, VRM is “personal first and social second”, so it seems to have a closer kinship with digital identity but could provide new social opportunities as well.

Each area has its unique features. But all share a common trait — differentiated app behavior depending on special aspects of you (whether this comes from attributes, claims, and transactional details in IdM; social graph data and user-generated content in social apps; or proactive requests and other personal data offered up in VRM). And to deliver on this promise they all share a common requirement — knowing more about you, with permission.

By contrast, where apps know about you through improper data gathering or aggregation, you get digital shadow effects — like direct marketing that is distinctly not permissioned or welcomed. Today, permissioning is still something of an art rather than a science, hence the title of this post.

We have a number of infrastructural options that more or less satisfy the requirements of the intersection, and later I hope to provide further thoughts on that. For now, I hope you’ll let me know what you think of this new instance of John Venn’s invention.