Is there an alternative to XML schema with support for generic types?

Greetings, I am running into a lot of issues while implementing a model-driven architecture. There is a specification for an information model, which makes use of generic types and inheritance. It is meant to be implemented in various languages across multiple platforms (MS, *nix, OS X, ...).

The problem is that XML schema is seen as the first tool to represent this information model. The assumption is that everything is connected to XML. However, XML schema does not support generic types, which correspond to generics in Java, C#, etc. The type erasure in Java's generics implementation is another big issue, but with a modelling formalism that supports generics, I can find a workaround for this.
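To make the erasure issue concrete, here is a minimal Java sketch. The `Interval<T>` class and its `elementType` field are hypothetical stand-ins, not part of any specification mentioned here. It shows that two differently parameterized lists share one runtime class, and that one common workaround is to carry an explicit type token.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Hypothetical generic type, standing in for a generic from the information model.
    static class Interval<T> {
        final Class<T> elementType; // explicit type token: the only runtime record of T
        final T lower;
        final T upper;

        Interval(Class<T> elementType, T lower, T upper) {
            this.elementType = elementType;
            this.lower = lower;
            this.upper = upper;
        }
    }

    // Both lists share one runtime class, because the type argument is erased.
    static boolean listsShareRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(listsShareRuntimeClass()); // true
        Interval<Integer> range = new Interval<>(Integer.class, 1, 10);
        System.out.println(range.elementType.getSimpleName()); // Integer
    }
}
```

The type token only helps inside Java; a schema or other interchange format would still need its own way to record the parameter.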

So I need a computable standard which lets me express this information model using generic types and inheritance. With XML schema, I am not able to express generic types, so there is a loss of information when going from [Information model specification] -- ~~~ --> [XML Schema], which causes a lot of issues.

Protocol Buffers is attractive in many ways, since it seems to allow fast cross-platform, cross-language communication, but I have not had a chance to look at its modelling capabilities.

I feel trapped with all these constraints. At the moment, I am forced to use some other form of representation on top of XML schema to keep track of generic types, and this is not a good solution.

Any suggestions would be much appreciated

Regards Seref


I feel like you may be asking too much of your model layer.

XML is often used for integration tasks because you can represent and serialize structured data in a semi-standardized way. However, for each subsystem you will still have to address the issues that arise when you are (un)marshalling your live data; there is nearly always a little impedance mismatch.

So I think you should accept that your canonical data model (represented as XML types and events) does not match one-to-one with the OO or relational model of every subsystem, and does not cover all their details, such as the use of generics. The connectors (SOAP web services, file parsers, ESBs or whatever infrastructure you will be using) should translate to and from your canonical model. The canonical data model should be 'leading': detailed enough to meet all business requirements and general enough to leave some leeway for subsystems with different internal representations.
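As a rough illustration of that split, here is a minimal Java sketch of a connector. All names (`CanonicalRange`, `TypedRange`, `RangeConnector`, and the `"integer"` type label) are hypothetical and chosen only for this example. The canonical form carries the element type as plain data, while the subsystem keeps its own generic class.

```java
import java.util.function.Function;

// Canonical (wire-level) form: no generics, the element type travels as plain data.
final class CanonicalRange {
    final String valueType; // e.g. "integer"; a hypothetical naming convention
    final String lower;
    final String upper;

    CanonicalRange(String valueType, String lower, String upper) {
        this.valueType = valueType;
        this.lower = lower;
        this.upper = upper;
    }
}

// Subsystem-internal form: a generic class, as a Java subsystem might model it.
final class TypedRange<T extends Comparable<T>> {
    final T lower;
    final T upper;

    TypedRange(T lower, T upper) {
        this.lower = lower;
        this.upper = upper;
    }

    // Inclusive bounds check using the element type's natural ordering.
    boolean contains(T value) {
        return lower.compareTo(value) <= 0 && value.compareTo(upper) <= 0;
    }
}

// Connector: translates between the canonical and the internal representation.
final class RangeConnector {
    static <T extends Comparable<T>> TypedRange<T> fromCanonical(
            CanonicalRange c, String expectedType, Function<String, T> parse) {
        if (!c.valueType.equals(expectedType)) {
            throw new IllegalArgumentException("unexpected value type: " + c.valueType);
        }
        return new TypedRange<>(parse.apply(c.lower), parse.apply(c.upper));
    }

    static <T extends Comparable<T>> CanonicalRange toCanonical(
            TypedRange<T> r, String valueType) {
        return new CanonicalRange(valueType, r.lower.toString(), r.upper.toString());
    }
}
```

A caller might write `RangeConnector.fromCanonical(c, "integer", Integer::valueOf)` to obtain a `TypedRange<Integer>`. A .NET subsystem would have its own connector against the same canonical form.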

Hope this gives new insight and helps you find the right solution.


Why don't you want to stick to a relational table model as a basis if you're using a model-driven approach? It sounds quite reasonable in this case. If you had your RDBMS-agnostic model (in DDL, like CREATE TABLE...), you could leverage any language's tools to deal with it (generate source code and so forth). I agree that inheritance in DDL is quite cumbersome, but it is still possible (e.g., see this article).


I think you have a category error in describing XML schema as 'not supporting generics'. XML schema does not, naturally, map to many constructs of Java, C++, or C#. It has a type model which is very different. The lack of something that looks like a generic, or a parameterized type, is a fairly minor point.

The common libraries that impose a mapping from XML schema to languages (JAX-B, XMLBeans, Microsoft's .NET mappings) work by refusing to deal with some of the more document-oriented pieces of XML schema and then fairly arbitrarily mapping the rest. The lack of generics isn't a characteristic of XML schema; it's a characteristic of these mapping layers.

Using JAX-B, for example, nothing stops you from writing plugins to implement some convention for generics. That won't help you with .NET.

The underlying difficulty here is that XML Schema was never intended as a cross-programming-language type system for arbitrary data structures. It was intended to describe and constrain the contents of XML documents.


Seref, I agree that XML Schema isn't really suitable as a basis for information modelling unless XML documents are your only or primary way to represent information, and even then it depends.

I also agree with @denisk that a relational model, as used for databases, seems a better idea.

I don't see how protocol buffers help; they are similarly bound to a particular serialization technique.

You are considering object metamodels (.NET or Java); they are more expressive, but they are designed for in-memory data representation, so they would only be suitable if that is your primary way to represent information, from which everything else is derived. Also, as you mention, language-specific features (such as generics) will tie you to that language or create translation issues. If you want to go that way, UML may be a better basis.

The best approach depends on your requirements. What problems can you resolve with generics? What assumptions and strong wishes can you state regarding how your information is stored, serialized, and accessed from application code? E.g. will storage be distributed or in a central database? May different applications, even in different languages, access the same information, even at the same time? Etc.
