In my time in Integration, I’ve written a lot of XML schemas. But which ones have been "good quality"? Thanks to Priscilla Walmsley of "XML Schema" fame, I’ve written reasonably sophisticated XSDs. But have my sophisticated schemas actually been good? Or have they just been over-engineered?
The higher levels
At the highest level of schema design is entity modelling. How do you model the business entities? Do you use the tried and true techniques of data modelling? Or do you use some fancy technique such as "Asset Oriented Modelling", which claims to be more specific to XML-style work? I haven’t had the opportunity to (re)model an entire category of documents since I found out about it, but it looks interesting.
Another interesting method of XML design is put forward by the gentlemen who run the DocOrDie blog. Of course, they have a book too.
The lower levels
A couple of years ago, a colleague said my XSD style was the Venetian Blind style. What? XSDs have styles? What exactly is the Venetian Blind style, and should I be using some other style because it’s better?
Evolving a schema is hard work. Most web service implementations are brittle as buggery. When you run wsdl2java, what do you end up with other than highly coupled RPC mechanisms? I’ve talked about versioning schemas before. This discussion goes further, into what makes a good schema.
In past projects I’ve tried to get some quality into my schemas. I’ve used IBM’s Schema Quality Checker as a first pass. This is a good tool, but it doesn’t really help with higher-level heuristics. I’ve also used MindReef’s SoapScope to make sure my WSDL (and by extension my XSDs) are WS-I compliant. Surely WS-I will help in some way? In many ways, what I want at the lower levels is some kind of code inspection for my XML, much in the way IntelliJ inspects your Java code as you write it. At the higher levels, I want something that will speak to me about design trade-offs etc.
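To make the "code inspection for XML" wish a bit more concrete, here is a minimal sketch of what such an inspection could look like, using only Python's standard library. The three rules, the function name inspect_schema and the sample schema are all made up for illustration; they are not taken from the IBM or WS-I tools, and a real checker would need many more rules.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def inspect_schema(xsd_text):
    """Return warnings for a few hypothetical house rules (assumptions, not a standard)."""
    root = ET.fromstring(xsd_text)
    warnings = []

    # Assumed rule 1: global complex types start with a capital and end in "Type".
    for ctype in root.findall(XS + "complexType"):
        name = ctype.get("name", "")
        if not (name[:1].isupper() and name.endswith("Type")):
            warnings.append(f"complexType '{name}': expected a capitalised name ending in 'Type'")

    # Assumed rule 2: no anonymous complex types nested directly in an element.
    for element in root.iter(XS + "element"):
        if element.find(XS + "complexType") is not None:
            warnings.append(
                f"element '{element.get('name')}': anonymous complexType; consider a named global type"
            )

    # Assumed rule 3: flag xs:any wildcards, since they weaken validation.
    for _ in root.iter(XS + "any"):
        warnings.append("xs:any wildcard found; content here is only loosely validated")

    return warnings

# Hypothetical schema that breaks all three rules once each.
SAMPLE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Id" type="xs:string"/>
        <xs:any/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="address">
    <xs:sequence/>
  </xs:complexType>
</xs:schema>
"""

for warning in inspect_schema(SAMPLE):
    print(warning)
```

Checks like these are cheap to run on every save or build, which is roughly what an IDE inspection does. They say nothing about the higher-level design trade-offs.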
The first thing to deal with is this "Venetian Blind" thing. What is it, and is there anything better? In short, it declares a single global root element and defines everything else as named global types with local elements. I found an article on the web here and basically, no: the Venetian Blind model is the best way to structure your XSD. It turns out that the Open Applications Group uses this style as well. And since they are the creators of the Universal Canonical Model of All Things, who am I to argue?
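For anyone who, like me, had never heard the name: the sketch below embeds a tiny Venetian Blind schema (one global root element, named global types, local elements inside them) and applies a rough rule of thumb to tell it apart from the other commonly named styles (Russian Doll, Salami Slice, Garden of Eden). The Order schema, its namespace and the classify_style function are invented for this example, and the classification is a crude heuristic based only on counting top-level declarations.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema written in the Venetian Blind style.
VENETIAN_BLIND_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="urn:example:order"
           targetNamespace="urn:example:order"
           elementFormDefault="qualified">
  <!-- the single global element: the document root -->
  <xs:element name="Order" type="OrderType"/>

  <!-- every type is named and global, so it can be reused -->
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="Customer" type="PartyType"/>
      <xs:element name="Supplier" type="PartyType"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="PartyType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

def classify_style(xsd_text):
    """Guess the design style from top-level declarations only (rough heuristic)."""
    root = ET.fromstring(xsd_text)
    global_elements = root.findall(XS + "element")
    global_types = root.findall(XS + "complexType") + root.findall(XS + "simpleType")
    if global_types:
        # Named global types: one root element at most, or many global elements?
        return "Venetian Blind" if len(global_elements) <= 1 else "Garden of Eden"
    # No global types: everything is nested, or everything is a global element.
    return "Russian Doll" if len(global_elements) <= 1 else "Salami Slice"

print(classify_style(VENETIAN_BLIND_XSD))  # prints: Venetian Blind
```

Notice that PartyType is written once and used for both Customer and Supplier. That reuse of types, while keeping the element names local, is the main attraction of the style.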
In searching for quality heuristics, I found a gentleman by the name of Boonserm Kulvatunyou who’s done some deep thinking about this matter. The best of his articles is called XML Schema Design Quality Test Requirement, which talks about how various groups (OAG, UBL, DoD, etc.) go about getting XML schema quality. This is a good foundation document for your own schema quality initiatives.
On the matter of schema evolution, he doesn’t have much to say. On the same conference papers site where I found Boonserm, I also found Best Practices for XML Schema Evolution in Application Development. This is a paper, never submitted, by a BEA employee who works on XMLBeans. It looked really promising.
What it probably would not have covered is how you version schemas. Do you version the whole schema? Do you version the types individually? I still don’t know. I’m warming to the idea of versioning individual types and then composing them into a versioned schema. Anyone else think like this, or think this is ridiculous?
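To show what I mean by versioning types and composing them, here is one possible reading of the idea as a small sketch. Everything in it is an assumption made for illustration: the VersionedType class, the compose_schema function and the convention of putting the version in the type name (PartyTypeV1, PartyTypeV2). The only standard piece is the version attribute on xs:schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VersionedType:
    name: str      # base type name, e.g. "PartyType"
    version: int   # version of this one type, independent of any schema
    content: str   # XSD content model for the type

    def render(self):
        # Assumed naming convention: the type version becomes part of the type name.
        return (
            f'  <xs:complexType name="{self.name}V{self.version}">\n'
            f"    {self.content}\n"
            f"  </xs:complexType>"
        )

def compose_schema(schema_version, types):
    """Build a schema document whose own version is separate from its types' versions."""
    lines = [
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        f'version="{schema_version}">'
    ]
    lines += [t.render() for t in types]
    lines.append("</xs:schema>")
    return "\n".join(lines)

party_v1 = VersionedType(
    "PartyType", 1,
    '<xs:sequence><xs:element name="Name" type="xs:string"/></xs:sequence>',
)
party_v2 = VersionedType(
    "PartyType", 2,
    '<xs:sequence><xs:element name="Name" type="xs:string"/>'
    '<xs:element name="Email" type="xs:string" minOccurs="0"/></xs:sequence>',
)

# Schema 1.0 ships only the first PartyType; schema 1.1 keeps it and adds the second.
schema_1_0 = compose_schema("1.0", [party_v1])
schema_1_1 = compose_schema("1.1", [party_v1, party_v2])
print(schema_1_1)
```

The appeal is that an old type never changes once published, so existing consumers keep working while the schema version moves on. The cost is a growing pile of near-duplicate types, which is part of why I’m still undecided.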
In sum, Boonserm’s work allows an architecture group to automate some aspects of schema quality. In fact, Boonserm is developing a tool in support of these aims.
XML quality is a fertile area. I suspect that over time there will be much throwing away of XSDs as we muddle through to what is good in schema quality.