Tag Archives: data portability

People and online services: leaving value on the table

The recent Google-Facebook flap demonstrates that the hottest battleground for users’ control of the data they pump into these online services is the sites’ Terms of Service. Why? Because when you’re not a paying customer, you’re not in a hugely strong bargaining position. As I put it to ReadWriteWeb in their piece on data portability implications of the debate: Facebook’s end-users are not its customers; they’re the product. (Or as my Data Without Borders pal Steve Greenberg sometimes puts it, users are crops…getting harvested. Oh dear.)

For all “free” online services, it’s worthwhile to ask: What am I paying instead? If it’s not money, is it attention to ads? …behavioral cues about myself and my preferences? …personally identifiable data? …beta-testing time? …what, exactly? Payment for services rendered isn’t a bad thing. But it’s always something, and you might as well not be a chump.

That’s why I like Frank Catalano’s new TechFlash post viewing personal data sharing through an economic lens and discussing how to barter your data more equitably. Regarding his second point, “hide”: I’d actually be thrilled if more online services that were marketed to individuals offered a premium for-pay option; it would keep out the riff-raff and give people more meaningful control over their relationships with the companies offering the services.

It’s not just individuals who are leaving something on the table, though. I think there’s a big untapped market in selective sharing, which is like “privacy” (poor abused word), without the assumption that minimal disclosure is the be-all and end-all. What would you start sharing with a selective set of people and businesses, if you could have confidence that your expectations around context, control, choice, and respect would be met?

That’s why I think Dave McClure has it right with his notion of intimacy as a market opportunity Facebook currently has no idea how to address. (“maybe I only want to tell a few close buddies about that episode with the VERY BAD bean burrito” — yeah, thanks for keeping this sharing episode VERY selective. :-)

And that’s why I think Esther Dyson doesn’t quite have it right in saying privacy is a marketing problem. Her exhortation to “Know your customer, and talk to that person as an individual, not as someone in a bucket” has a natural barrier: Facebook and others are serving their actual customers very well indeed by, uh, making more product.

And that’s why I think User-Managed Access could help: Becoming paying customers of services that need our data is good. But becoming, in addition, producers of data products as peers in a selective data-sharing network, and dictating our own Terms of Access for getting to them, is even better.

Data portability and wagon-circling

One of the breakout tracks at EIC last week was Cloud Platforms and Data Portability. Dave Kearns had asked me to speak for a few minutes on the subject of social data portability before joining Drummond and Christian for a panel discussion.

I brainstormed a bit and suggested that I could comment on the notion of data statelessness, and the continuum of individuals’ data portability on the web. That somehow turned into a boldface uppercase talk called Data Statelessness and the Continuum of Individuals’ Data Portability on the Web. :-) (Hmm, maybe in German that boils down to a single long word…) I thought I’d share those thoughts here.

The Web is a teenager already

People have been pouring content onto it since Web 1.0. That’s enough time for major failures of data portability to have happened.

For example, GeoCities started in 1994 (with an offer of 2 whole MB free!), and ended its life in 2009 with about 23 million individual pages, all of which were at risk of being abandoned.


Archive Team is one of the groups that performed “data portability of last resort”; they’ve managed to resurrect more than a terabyte of all that content…at Geociti.es.


DataPortability.org was formed in 2007, and it advocates being able to “take your data with you” to new services.

The Web 2.0 cocktail is even more potent

It’s a mix of some application’s features plus our own data contributions. The more “social” the application — that is, giving us human-to-human connection benefits — the more we drink.

But there’s always an application in the middle. It knows everything we share — and increasingly, selling access to that information is its business model.

Just a reminder…

Take a look at EFF’s compilation of Facebook privacy policies from 2005 to now.

Recall that a newspaper’s readers traditionally were not its real customers; that would, of course, be the advertisers.

Facebook’s end-users are not its customers.

They’re the product.

[Not that I'm picking on Facebook specifically. Though this news about a Facebook all-hands meeting tomorrow afternoon to "circle the wagons" is interesting...]

Solving the password anti-pattern began a new era of data portability

Was it accidental?

In 2008, Robert Scoble famously discovered that Facebook’s terms of service didn’t allow him to bulk-extract his own contact information; Facebook cut him off (at which point he got involved in the Data Portability effort!).

In the meantime, Facebook and Yahoo! and AOL and Google and many others have discovered how valuable it is to let third-party apps get access to fresh feeds of your data without your having to reveal your username and password.

They couldn’t exactly let these connections happen without your go-ahead, and so user delegation of authorized access was born — or at least standardized.


BBAuth, OpenAuth, and other proprietary solutions led to OAuth (and its proprietary competitor Facebook Connect) — and now the draft OAuth 2.0, which Facebook already supports.

Third-party services getting access to your data with your okay is tantamount to you getting access through an “agent” — and not just one-time export when you leave, either, but regular fresh access for a variety of purposes. This has turned out to be a Good Thing overall for individuals’ chances at data portability.
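To make the “agent” idea a little more concrete, here is a toy sketch in Python. It is not the OAuth wire protocol or any real library’s API; the class name `HostService`, its methods, and the scope names are all made up for illustration. It only shows the shape of the arrangement: the app holds a revocable, scoped token instead of your password, and each read returns whatever is current.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class HostService:
    """Hypothetical service holding a user's data (not a real OAuth server)."""
    data: dict                                   # e.g. {"contacts": [...]}
    grants: dict = field(default_factory=dict)   # token -> set of allowed scopes

    def authorize(self, scopes):
        """Called with the user's go-ahead; the app never sees a password."""
        token = secrets.token_hex(16)
        self.grants[token] = set(scopes)
        return token

    def revoke(self, token):
        """The user withdraws the app's access."""
        self.grants.pop(token, None)

    def read(self, token, scope):
        """Return the current data if the token covers the requested scope."""
        if scope not in self.grants.get(token, set()):
            raise PermissionError(f"no grant for {scope!r}")
        return self.data[scope]


service = HostService(data={"contacts": ["alice", "bob"], "location": "Munich"})
token = service.authorize(["contacts"])          # user approves contacts only

print(service.read(token, "contacts"))           # the "agent" gets current data
service.data["contacts"].append("carol")
print(service.read(token, "contacts"))           # a later read sees the update
service.revoke(token)                            # access ends when the user says so
```

The point of the sketch is the last three lines: access is ongoing rather than a one-time export, and it stops when the user revokes the grant.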

What is data statelessness?

It’s the ability of a third-party service to think in terms of caching rather than replicating your data, because they can get it whenever they need it.

It’s the ability of a third-party service to add value without having to “own” your data.

It’s the ability for a single source of truth to arise — and for you to choose what it is.

Even weirder, it’s the ability for automatic syncing among a variety of sources of truth to arise — and for you to choose where to inject the first copy. (This is the effect when, say, you tell a bunch of your OAuth-enabled location services that they can all read from and write to each other.)
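The caching-versus-replicating distinction can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Replica` and `Cache` classes, the `fetch_address` function, and the in-memory `source` dictionary standing in for whichever source of truth you chose. A real service would fetch over an authorized API call instead.

```python
import time


class Replica:
    """Copies the value once and keeps it forever (goes stale silently)."""
    def __init__(self, fetch):
        self.value = fetch()

    def get(self):
        return self.value


class Cache:
    """Keeps a short-lived copy and returns to the source when it expires."""
    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch, self.ttl, self.clock = fetch, ttl_seconds, clock
        self.value, self.fetched_at = None, None

    def get(self):
        now = self.clock()
        if self.fetched_at is None or now - self.fetched_at >= self.ttl:
            self.value, self.fetched_at = self.fetch(), now
        return self.value


# Hypothetical authoritative source chosen by the user.
source = {"home_address": "1 Old Street"}
fetch_address = lambda: source["home_address"]

replica = Replica(fetch_address)
cache = Cache(fetch_address, ttl_seconds=0)   # ttl of 0: always re-check the source

source["home_address"] = "2 New Avenue"       # the user moves
print(replica.get())   # still the old address
print(cache.get())     # the new address
```

The replica “owns” a copy that quietly goes wrong; the cache only ever borrows one, which is the stateless posture described above.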


Federated identity management in the enterprise has been striving for just-in-time delivery of user attributes from authoritative sources for a long time; it’s perhaps ironic that consumer-driven web companies seem to be getting there first.

Enter Data Portability Policy

Along with privacy policies, terms of service, and end-user license agreements, sites should have a (good) data portability policy — and the DataPortability.org folks are working on it.

The project is spearheaded by Steve Greenberg (of stevenwonders.com! that’s stevenwonders.com — that’s S, T, E, … sorry, inside joke among our little Data Without Borders podcast crew).

It addresses issues like:

  • Are your APIs and data formats documented?
  • Do people need to create a new identity for this site, or can they use an existing one?
  • Must people import things into this product, or can the product refer to things stored someplace else?
  • Does this product provide an open, DRM-free way for people to retrieve, or access via a third party, all of the things they’ve created or provided?
  • Will this site delete an account and all associated data upon a user’s request?
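
One way to picture a machine-checkable version of these questions is as a small record with one yes/no answer per question. This is only a sketch: the field names below are invented for illustration and are not taken from the DataPortability.org template, and the example site is hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PortabilityPolicy:
    """Hypothetical field names; one boolean per question in the list above."""
    apis_and_formats_documented: bool
    accepts_existing_identity: bool
    can_reference_external_data: bool
    open_drm_free_retrieval: bool
    deletes_account_on_request: bool


def unmet(policy):
    """Return the names of the questions a site answers 'no' to."""
    return [f.name for f in fields(policy) if not getattr(policy, f.name)]


# A made-up site that falls short on two of the five questions.
example_site = PortabilityPolicy(
    apis_and_formats_documented=True,
    accepts_existing_identity=True,
    can_reference_external_data=False,
    open_drm_free_retrieval=True,
    deletes_account_on_request=False,
)
print(unmet(example_site))
```

With answers in a structured form like this, comparing sites (or flagging the ones that fall short) would not require anyone to read the legalese.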

Having standard templates for policy of this sort is immensely valuable. (And I can’t resist a mention of how UMA may be able to help us demand the kinds of policies we want our services to follow, in an automated fashion vs. ever having to read legalese.)

End of rant

Exit questions:

Is Facebook’s new Open Graph Protocol, openly published and based on semantic web standards, a good thing for data portability? What relationship does that have to privacy?

And do individuals get more empowered, or less, when lots of newer, smaller social apps flood the market looking for user-delegated authorization to connect with your data?

Munich fuel

To get through the intense European Identity Conference last week in Munich (thanks, Kuppinger Cole folks!), I had to make sure to drink lots of fluids. I’m referring, of course, to coffee, beer, and one extraordinary whisky (thanks, Ping Identity folks!).

[Photo: Bavarian coffee cup, a gift from a local friend]

The 2010 edition of the conference was lively and valuable. Here are just a couple of stories about encounters I had there, with more thoughts and info to come.

I had the good fortune to meet Christian Scholz in person for the first time; we participate in the Data Without Borders podcast series together, but in the way of the modern world, had never occupied the same room. Christian was serving as a credentialed event blogger. We hung out together during many EIC sessions, and I learned a lot by seeing the enterprise IdM world through his eyes; we seem to share a strong interest in the idea of radically simplifying IT. (I also learned how he came by the moniker Mr. Topf…) Don’t miss his conference musings.

And I had the great pleasure of meeting UMA’s own Graphics/UX Editor, the talented Domenico Catalano — though I already felt I knew him well! Domenico’s graphical and intellectual work graces a lot of the UMA material (and if you’re going to IIW next week, you’ll see even more of it). What a delight to cement friendships by meeting IRL.

The erudite and prolific author Vittorio Bertocci kindly gave me a copy of his new book, A Guide to Claims-Based Identity and Access Control — and I couldn’t resist asking for an autograph. (Though I was forced to sleep off the week’s excesses on the plane rather than read, this tome is next on my list.)

Finally, I had the opportunity to participate in three panels (data portability, privacy-enhancing technologies, and trust frameworks), and really appreciated the skillz and charm of moderators Dave Kearns and John Hermans.

Thanks and congratulations again to the KC+P gang; it was a heck of a show, and they were ever the gracious hosts. Stay tuned here for more about the week’s events from my perspective.

Both a data borrower and a data lender be

Christian Scholz and his Data Portability Project pals have roped me into their Data Without Borders podcasts. On Friday, Christian and Trent Adams and Steve Greenberg and I had some fun relaunching the series by talking about the DPP Terms of Service and End-User License Agreement (TOS/EULA) task force.

Steve was passionate in describing this work. I think he’s right when he says that you first have to ensure that people are aware of a site’s terms of service; disclosing them in a form human beings can grok (à la Creative Commons or the nutrition label approach I wrote about here) can begin to empower humans to change things if they so desire, using a variety of means.

At one point we talked about the Archive Team project run by Jason Scott, which I think of as “data portability of last resort”. These folks are like digital historian ninjas who swoop in to save data that might otherwise be lost forever — like everything on GeoCities.

The thing is, website-sanctioned bulk import and export of data isn’t all that huge an improvement on this kind of rescue operation. True data portability wants granularity and timeliness. For example, if you choose to host (so to speak) your current location info at FireEagle, you might still want to reuse it in other places for other purposes, and luckily OAuth lets FireEagle, Dopplr etc. give you a nimble and safe way to “port” this data back and forth.

This is a kind of data statelessness, in that when you tell various sites they can set, read, and republish your location, they’re letting go of any pretense of exclusive hosting control so that they can offer you a different kind of value.

Now, in the IdM and VRM worlds, some of us have been talking about identity statelessness for a while, which is similar but looks more like straight data-sharing (reading) rather than arbitrary service access (setting). For some reason this is a tougher sell, even though CRM systems and user accounts are shot through with pale copies of stale data (and, in the enterprise case, even though syncing directories and replicating databases are brittle and no fun).

Even when one party — say, you yourself — is authoritative for some piece of personal data (like your home address), all the sites insist on making you provision a copy of this data into their profile pages by hand and by value, and insist on thinking they own something truly valuable even after you move and forget to tell them.

In short: To the extent data is volatile, copies of it leak value. If the chain of evidence between its authoritative source and a recipient of data is broken, it quickly becomes value-free. And if the chain of authorization breaks, you’ve got digital shadow cruft. Why oh why can’t we get to a place where, as Scott Cantor put it to me once, identity-aware apps think in terms of data caching rather than data replication?

The Data Portability TOS/EULA work is helping us raise our standards for what true data portability should look like: Open Arms – Ever Fresh – Graceful Exit. OAuth already helps us get a bit beyond disclosure of site terms, closer to a world where users have an active say in what sites do with our stuff. I’m hoping UMA (recent deep-dive Technometria podcast here) can help us go even further because of its notion of user-dictated terms that recipients must meet in order to have the privilege of fresh access.

We’re likely to discuss this topic in the DWB podcast sometime soon, so I hope you’ll give a listen.