Archive for 'XML'

Plugging away, festively

Harold Carr and team have continued their track record of excellence on Project Metro (which you might remember under the informal name Tango), a web services stack with superior interop in mixed Java/.NET environments.

Harold has shared the results of the latest plugfest, which took place on the Microsoft campus last week. He observes:

Note: our shipping product, Metro 1.0 (built into GlassFish V2 UR1 and runs in other web containers—e.g., Tomcat), interoperates with .NET 3.0 based on mostly non-standard specifications. What we tested at this plugfest was our current development codebase that will interoperate with .NET 3.5 based on standard specifications.

As you’d expect, lots of painstaking work has to go into this sort of effort — congrats to the team on their great progress.

By the way, there’s gold in them thar hills if you’re working with Metro; check out Harold’s mention of the cool $175,000 (USD) award you can win.

S.A.B.L.E.

Lauren gives her take on our fiberrific outing (or would that be “fibriffic” spelled her way?). I guess I needn’t have been so coy about the identities of “my very experienced and talented knitting friends”, and as it happens, she and Yvonne are also my very knowledgeable and talented colleagues. Lauren has a great crafting blog; I hope Yvonne considers blogging her crafting adventures as well.

Lauren notes that the tech quotient of the actual event was low, but we suspected there were plenty of techie-types in attendance. As we went around the room doing introductions in my Charting class, I mentioned that I had designed some XML-related cross-stitch charts; one young woman piped up: “You mean like web services?” Yowza.

One more language note: I learned a great acronym from the Creative Crochet Lace book. It’s common to yield to temptation repeatedly and buy lots of yarn for what is called one’s “stash”. Eventually you run the risk of a terrible condition called SABLE: Stash Acquisition Beyond Life Expectancy. This is an addiction, folks — clearly we should be taking it much more seriously. Time to start a .org!

XML at X; film at XI

The original XML Recommendation is 10 years old today. Happy XML Day!

These anniversaries feel a little artificial to me; my first clear memory of the XML work was a teleconference Jon Bosak had arranged among the “SGML on the Web Editorial Review Board” members in June (?) 1996, so for me XML is eleven and a half years old. Of course, that just makes me feel ancient, but having just received my very first solicitation from AARP a few days ago (may I just say: eek) I’m getting used to the feeling.

On that early call, I remember insisting that we write down design principles before doing anything else; this was a core part of the methodology I used for SGML DTD development, and I felt the effort would end in tears without it. (I’m pretty sure I was right.)

Right now I find myself sitting on a mountaintop in rural Ontario with my old friend Murray Maloney, who was also there at the beginning — in fact, with Yuri Rubinsky he had already been advocating for SGML on the Web by the time Jon began putting together his nefarious plan. I’ve been lucky to make so many lifelong friends through my work on SGML and XML; for some of us, as Tim demonstrates today, the people are a big part of the story.

As something of a birthday present, today I’m publishing something SGML-flavored that I hope may still be of use, or at least morbid interest, to modern XML practitioners. You see, I cowrote a book in the just-prior-to-XML era with another of my lifelong friends, Jeanne El Andaloussi, about SGML, in SGML. In DocBook, as a matter of fact. That methodology I mentioned above, with design principles and stuff? That came from this book. Now that the book is out of print, she and I discussed the matter, and we agreed to publish it here. For the occasion Jeanne penned this note:

Now that XML has become a commodity and most National XML User Groups have stopped their activities, it is time for our ELM methodology to be freely accessible on the Internet. I just hope our readers will have as much pleasure in reading it as we had writing it over a decade ago.

You’ll have to be the judge of how well the content has stood the test of time, but I can tell you the markup did beautifully. With a huge dollop of help from Norm Walsh (both his DocBook stylesheets and his mad skillz), the SGML-to-XML-to-HTML processing pipeline was downright trivial.

Voilà! We present to you Developing SGML DTDs: From Text to Model to Markup.

Developing SGML DTDs front cover (small)

p.s. It turns out the old joke is true. XML is good for reuse. It lets you reuse all your old SGML presentations. (rimshot)

XPointer, when I wasn’t looking

XPointer seems to have made its way into the heart of the Service Modeling Language! The SML spec defines an smlxpath1() scheme and everything, and seems to do it in the spirit originally intended. Wow.

Kids: just say no to XML

Oh no:

The iPhone recently fell victim to its first Trojan attack….

Security blog F-Secure warns users to be wary, however; speaking to the modding community, blogger Jarno writes “Hopefully this serves as a warning for those who have opened their iPhones using a security hole in the system and then installing unverified software without a second thought to what they are doing.”

He continues, “This time it was an 11-year-old kid playing with XML files who created the trojan. Next time it might be someone else with more skills and with specific target.” [emphasis mine]

People and cool URIs

W3C has a working draft out on cool URIs for the semantic web that looks at how to choose URIs for RDF descriptions of both information resources (web documents) and non-information resources (such as people). The paper suggests that URIs should unambiguously refer to only one resource at a time — for example, Alice’s web page and Alice herself are distinct and should be referred to with different URIs.

This lofty philosophical topic (the whole resource-identifier/resource/representation-of-a-resource distinction still gives me a headache) gets boiled down to a fairly ordinary recommendation to use tricks like http://www.example.com/about#alice — a URI reference with a fragment identifier of “alice” — to refer uniquely to Alice the person, and to use content negotiation to retrieve an RDF description of what that particular referent actually is.
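To make the mechanics concrete, here’s a minimal sketch (in Python, against a hypothetical server; the example.com URIs come from the paper) of how a client would dereference the hash URI: the fragment never goes over the wire, and the Accept header does the description-switching.

```python
# Minimal sketch of dereferencing a "hash URI" for a non-information
# resource. The server behavior is assumed; example.com is from the paper.
import urllib.request

PERSON_URI = "http://www.example.com/about#alice"
document_uri = PERSON_URI.split("#", 1)[0]  # the fragment never goes on the wire

def fetch(accept: str) -> bytes:
    """Request the document, letting content negotiation pick the format."""
    req = urllib.request.Request(document_uri, headers={"Accept": accept})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

rdf = fetch("application/rdf+xml")  # machine-readable description of #alice
html = fetch("text/html")           # human-readable page about Alice
```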

What’s more interesting to me is how “cool URIs for non-information resources” intersects with identity. I’ve noticed that the RDF world talks about people on the web, but the reality is that people do web stuff with a layer of several digital identities in between. For instance, it’s likely that http://www.example.com/about#alice, in the context of the scenario introduced by the paper, refers to Alice in her role as an example.com employee, and that she has a whole other digital life beyond the confines of example.com.

Of course, the notion of a URI referring to a digital identity maps nicely to OpenIDs… An OpenID really has three potential “audiences”: (1) it’s a machine-resolvable endpoint that OpenID consumer sites can use to get the OpenID’s owner authenticated; (2) it’s often “human-resolvable”, allowing arbitrary people to follow the link and see a “home page” for that identity (anyone can go see what’s at http://openid.sun.com/xmlgrrl, for example); and (3) the string itself becomes a well-known name for its owner and provides the ability to correlate her activities across the web.
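For the curious, audience #1 boils down to something like the following sketch of OpenID 1.x HTML-based discovery (a simplification: a real consumer would use a proper HTML parser and handle delegation; the identifier URL is just the one from this post):

```python
# Sketch of OpenID 1.x HTML-based discovery: a consumer fetches the
# identifier URL and looks for the provider endpoint the page declares.
import re
import urllib.request

identifier = "http://openid.sun.com/xmlgrrl"

with urllib.request.urlopen(identifier) as resp:
    page = resp.read().decode("utf-8", errors="replace")

# OpenID 1.x advertises the provider via a <link rel="openid.server"> tag.
# (Simplified: attribute order can vary in real pages.)
match = re.search(
    r'<link[^>]*rel=["\']openid\.server["\'][^>]*href=["\']([^"\']+)["\']',
    page, re.IGNORECASE)
if match:
    print("OpenID server endpoint:", match.group(1))
```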

If we accept that a URI will tend to encompass only a part (a persona or whatever) of a whole person, Cases #1-2 are copacetic with the mechanism of resolving a URI to get a (possibly RDF-encoded) representation of a resource and doing content negotiation at one level or another to accomplish description-switching. Case #3 isn’t covered by the paper, in the sense that the special circumstance of URIs-themselves-as-resources isn’t touched on — though I dimly recall that RDF does deal with this as a use case. I have a nagging feeling that a complete “cool semantic web URIs” solution does need to account for this.

As I was doing all this pondering, I came across Scott Kveton’s post today on URL’s are people too … and service end-points. Aha! Indeed they are (allowing for the rhetorical conflation of the resource identifier with the resource itself, which doesn’t give me a headache). That’s really what I meant by talking about the “user as web resource” metaphor back in August. Depending on the sophistication of the metadata and the various services we choose to provide at that endpoint, a URI can, over time, do a better and better job of representing a flesh-and-blood person online.

The I-Files: The Truth Is Out There

Great news about Higgins and its first steps on the road to SAML V2.0 support! Some of the input from use-case contributors at the Project Concordia table included concerns about how identity selectors might tie into their current use of SAML and Liberty Web Services, and I’m sure Higgins will prove to be a useful test bed for scenarios these deployers have in mind.

This news bears on a topic I’ve been mulling over for some weeks now. In Concordia we’ve been discussing the different tradeoffs involved in making different entities “smarter” to handle multiple protocols. Scott Cantor has spoken eloquently in favor of the smart-relying-party method of improving interop in heterogeneous environments, by making what amounts to a do-everything toolkit available for service providers to install and use. I think it constitutes one good use case, but SURFnet chooses the smart-identity-provider approach using a translation hub/proxy instead, and George Fletcher has proposed some ways RPs and clients could get smarter together — these are valid too.

Scott’s comments highlighted the tension between APIs and protocols in providing consistency, repeatability, and interoperability in deployments. (In this post’s title, “I” stands for interoperability rather than identity.) Interop is something you measure between actual implementations (such as products, open-source projects, and other technology stacks), but protocols are a key part of managing and encouraging interop. I told a little story in the OASIS workshop to illustrate this tension…

A solution for cohesive treatment of pretty much any technical problem often starts as a proprietary API (or two or three).

Proprietary API

In order to promote cross-platform and cross-language support, a message-oriented protocol gets created, either by fiat or by some open standardization process, to match it. (Sometimes a protocol is created without a nod to any existing API, of course.) As the saying goes, “The truth is on the wire.”

Protocol

But in a highly distributed environment (inevitably the case where federated identity is needed), the cost of getting all your partners to build and deploy support for handling these messages can be prohibitive. So some sort of common or standardized API, often associated with an open-source project, gets built that makes deployment easier. This is usually complicated by the need to create an API for each language or development environment. (By the way, I don’t mean to knock standardized APIs — such as those defined through the Java Community Process! — for their interop value. I’m just working through what happens when protocols appear on the scene.)

Standard API

But then some competing protocols show up on the scene, which offer a different mix of capabilities or conceptually slice up the world in a different way.

Multiple protocols

So next we see an abstract API — enter Higgins! — created to unify or blur the distinctions between them. As a natural consequence, “glue code” is added, which is a huge convenience but which also adds distance between the deployer and the contract they enter into by using any particular message protocol.

Unified API
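If the pattern sounds abstract, here’s a toy sketch of it in Python (every name in it is invented for illustration), showing how the glue hides the deployer’s protocol decision behind a single interface:

```python
# Hypothetical sketch of the "unified API" pattern: one abstract interface,
# with glue that hides which wire protocol carries the authentication request.
from abc import ABC, abstractmethod

class AuthnBinding(ABC):
    """One protocol-specific way of asking 'who is this user?'"""
    @abstractmethod
    def build_request(self, return_url: str) -> str: ...

class SamlBinding(AuthnBinding):
    def build_request(self, return_url: str) -> str:
        return f"<samlp:AuthnRequest AssertionConsumerServiceURL='{return_url}'/>"

class OpenIdBinding(AuthnBinding):
    def build_request(self, return_url: str) -> str:
        return f"openid.mode=checkid_setup&openid.return_to={return_url}"

class UnifiedIdentityClient:
    """The glue: deployers code to this, not to either protocol."""
    def __init__(self, binding: AuthnBinding):
        self._binding = binding

    def login_request(self, return_url: str) -> str:
        return self._binding.build_request(return_url)

# The deployer's one-line protocol decision -- and the contract they
# silently enter into by making it -- is buried right here:
client = UnifiedIdentityClient(SamlBinding())
print(client.login_request("https://sp.example.com/acs"))
```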

Now multiple abstract APIs come onto the scene that purport to cover the same ground (I guess these would be the Lone Gunmen :-) ). Do they choose all the same protocol options in exactly the same way? Are translations back and forth happening similarly in each stack? What about the handling of impedance mismatches, where, say, one protocol has more features than another? Are some good ideas for consistent treatment buried in different pots of glue rather than being exposed on the wire?

Multiple unified APIs

Really, when a deployer chooses any one implementation stack, it amounts to a product decision — and the answers to these questions are important factors in figuring out exposure to lock-in. This is why documenting glue behavior is important, whether in profiles, extensions, gateways, translations, or even just descriptions of best practices.

The truth

It’s my hope that deployer-driven Concordia use cases can highlight specific needs for additional protocol guidance…which is useful grounding when architecture astronautics beckon.

XML: the knit apparel analogy

Given the odd tech-and-stitching grooves I get into here, however did I miss this 1998 article??

XML is a simplified dialect of SGML (Standard Generalized Markup Language). For those of you unfamiliar with SGML, it is an international standard (ISO-8879) for defining descriptions of the structure and content of documents in an electronic form. XML simplifies SGML by capturing about 80 percent of SGML’s functionality with only 20 percent of the complexity.

HTML, which is a description of the structure and content of a single type of document called a “Web page,” is just one instance of what can be created with SGML. In other words, if HTML is a single knit sweater, SGML and XML are how-to books on knitting. By learning XML, you can create sweaters, socks, leg warmers, or any kind of knitted apparel you want!

Not bad, though it misses the opportunity to capture the generative power of a single XML vocabulary. If instance:web page:sweater and model:HTML:sweater knitting pattern, then metamodel:XML:this. (Ooh, and maybe model building tool:XML schema IDE:this.)
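To make the analogy concrete, here’s a toy sketch (it needs the third-party lxml package, and the <sweater> vocabulary is pure invention): the DTD is the knitting pattern (the model), the document is one sweater (an instance), and XML itself is the metamodel that let us write the pattern in the first place.

```python
# Toy rendering of the instance:model:metamodel analogy with lxml.
from io import StringIO
from lxml import etree

# The "knitting pattern": a made-up DTD defining a sweater vocabulary.
pattern = etree.DTD(StringIO("""
<!ELEMENT sweater (yarn, size)>
<!ELEMENT yarn (#PCDATA)>
<!ELEMENT size (#PCDATA)>
"""))

# One "sweater": an instance conforming to the pattern.
sweater = etree.fromstring("<sweater><yarn>merino</yarn><size>M</size></sweater>")
print(pattern.validate(sweater))  # True: this instance follows the pattern
```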

Summer School droplets

Barred from punting?!

Despite parts of Oxford turning into a big blue wobbly thing and punting getting canceled, the XML Summer School this year provided a great experience for speakers and track chairs and, I hope, delegates as well. Others have written about their experience. I thought I’d share some of the more interesting moments from the Web Services and Identity speakers here, with more to come as I slowly complete my reverse timezone shift.

Marc Hadley (apologies for lack of speaking photo!): The first of two speakers to strongly recommend the O’Reilly RESTful Web Services book. Make services “part of the web” rather than just working “over the web”.

Paul Downey: In addition to thinking about the services part, we should think about and exploit the web part. “Aristotle, the canonical information architect, says you must have command of your metaphor.” The contract approach is a problem and taxonomies bias things. “I don’t think there is a WS-* caching spec, interestingly enough. [pause] Please don’t write one!”

John Kemp: The same web services concepts apply to networked services that aren’t on the web per se. John generated an amazingly prescient horoscope from a Python-based web service running on a virtual phone on virtual Windows on MacOS, using an HTTP-like protocol over SOAP over BEEP (whew): “You will make a presentation about web services.”

Jeff Barr: Shared the Amazon Web Services story, which demonstrates the power of web services for fun and profit. Demoed sales rank messaging; surprise surprise, Deathly Hallows was #1. :-) Showed cool sites liveplasma, blingee, and The Sheep Market. Developers don’t ask about “SOAP vs. REST” anymore; they tend to use purpose-built AWS toolkits.

Rich Salz: “You are your key” — that is, your cryptographic key is a very close analogue to your digital identity, particularly in app-to-app interactions. In any XML digital signature system, canonicalization is the most expensive part of the processing (see the quick sketch after this list).

Paul Madsen: In response to a question about whether he truly understands XRIs, admitted that “I don’t have any magical powers.” :-) Rudely used his speaking opportunity to look for investors in his new Web 2.0 app, Bladder.

John Chelsom: An argument for holding health information in a national electronic record as opposed to paper copies all over the place is that if a breach happens, at least you know about it!
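And since Rich’s canonicalization point may be mysterious to non-XML-security folks, here’s a quick illustration (using the Python 3.8+ standard library; it shows what canonicalization does, not why it’s costly): two byte-wise different serializations of the same content reduce to one canonical form, which is what actually gets hashed and signed.

```python
# Why XML signatures need canonicalization: these two serializations
# differ byte-for-byte but mean the same thing, so both must be reduced
# to a single canonical form before hashing and signing.
from xml.etree.ElementTree import canonicalize  # Python 3.8+

doc_a = '<greeting lang="en" id="g1">hi</greeting>'
doc_b = "<greeting  id='g1' lang='en' >hi</greeting>"

assert canonicalize(doc_a) == canonicalize(doc_b)
print(canonicalize(doc_a))  # <greeting id="g1" lang="en">hi</greeting>
```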

One more moment I have to share: Bob DuCharme and I have been talking for a couple of years about setting up the perfect geek photo. See, one of his daughters is named Alice, and we saw an opportunity to illustrate an important data security principle…

Eve eavesdropping on Alice and Bob's conversation

(More pix of the event by various people here and here, and flood photos here.)

IBM makes a pledge

Belatedly noting the good news that IBM has issued what it’s calling an Interoperability Specifications Pledge, which amounts to a non-assertion covenant on a list of covered standards. If you haven’t already checked out Bob Sutor’s post on the subject, go forth and read: he discusses this action and its implications with commenters, including the very knowledgeable Simon Phipps.

My take is that the pledge is a pretty darned good one. Like so many others (but not Sun’s), it has the “necessary claims” flaw, discussed by Simon in the comment thread, but despite that it puts in place some relatively strong protection around developers’ ability to get on with developing. I agree that the covered standards list is handily precise for its many links out to the relevant specs. (The list seems a bit, well, padded — listing the individual specs that make up SAML V1.1 and SAML V2.0, for instance, which makes it seem as though two standards are twelve. But that’s okay.)

I haven’t done an exhaustive comparison of covered vs. non-covered specs, but Johannes Ernst has noticed that OpenID is not listed. It would be interesting to know why, especially given IBM’s participation in OSIS and Higgins.