It all started when we decided to replace Moxie with Devissa*. Moxie was a decent system and it had aged well, but its years had started to show. The rigid data schema, the inflexible order representation, the monolithic C++ server... Don't get me wrong, it was and still is working great, but with time we realized that we needed something more. Something that would let us define the way we do business instead of having us change the business to fit its model.
* All names have been changed to protect the innocent
The global rollout of Devissa looked like a good opportunity to bring in a more capable trading system. Devissa itself was a huge beast, composed of hundreds of instances of several native processes running with a variety of configurations, held together by TCL code, cron jobs and a templated meta-configuration.
The Moxie communication protocol was simple - fixed-length records sent in one direction, a 32-bit status code in the other, over a TCP socket (actually 2 sockets - uplink and downlink). Devissa was much more complex - the messages were framed using an XML-like self-describing hierarchical format (logically it was the standard map of strings to arrays of maps... ending up with primitive values at the leaf nodes). The session-level protocol was simple and luckily there was a Java library for it (I'll bitch about it some other time). On top of the sessions sits a bunch of application-level protocols, each with a different QoS and MEP. There is also a registry, an authentication service and a fcache/replicator/database/event processor thingie that sits in the center, but I am digressing.
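To give a feel for how little there is to a fixed-length-record protocol like the Moxie one, here is a minimal sketch. The field layout (order id, symbol, quantity), the space padding and the big-endian byte order are all assumptions made up for illustration, not the real Moxie format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FixedRecordSketch {
    // Hypothetical layout: 12-byte order id, 8-byte symbol, 4-byte quantity.
    static final int ORDER_ID_LEN = 12;
    static final int SYMBOL_LEN = 8;
    static final int RECORD_LEN = ORDER_ID_LEN + SYMBOL_LEN + Integer.BYTES;

    // Left-justify an ASCII value in a space-padded field, truncating if too long.
    static byte[] pad(String value, int length) {
        byte[] field = new byte[length];
        Arrays.fill(field, (byte) ' ');
        byte[] raw = value.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(raw, 0, field, 0, Math.min(raw.length, length));
        return field;
    }

    // Build one record; every record has the same length, so no framing is needed.
    static byte[] encode(String orderId, String symbol, int quantity) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_LEN); // big-endian by default (assumed)
        buf.put(pad(orderId, ORDER_ID_LEN));
        buf.put(pad(symbol, SYMBOL_LEN));
        buf.putInt(quantity);
        return buf.array();
    }

    // Decode the 32-bit status code that comes back on the other socket.
    static int decodeStatus(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).getInt();
    }

    public static void main(String[] args) {
        byte[] record = encode("ORD-1", "VOD.L", 500);
        System.out.println("record length: " + record.length);
        System.out.println("status: " + decodeStatus(new byte[] {0, 0, 0, 0}));
    }
}
```

The receiver simply reads `RECORD_LEN` bytes at a time, which is exactly what makes this kind of protocol both trivial to implement and rigid to evolve.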
I actually started this article to share some interesting stuff I learned while we migrated the order flow from Moxie to Devissa. Phase zero was to make a point-to-point integration from Devissa to Moxie using the FIX gateways of the respective products, routing orders entered into Devissa to Moxie, so the traders could work them in the familiar Moxie interface. It allowed us to receive flow from other offices which were already on the Devissa bandwagon and it was great because we didn't have to code data transformations and behaviour orchestration logic - it all 'just worked'.
The next task was to make sure that we could trade on Devissa and still be able to produce our end-of-day reports from a single point. At that time all reporting was done from Moxie, so what seemed to make most sense was to capture the reportable events from Devissa and feed them back to Moxie. I'll spare you the BA minutiae for now.
As we were looking for a suitable base for creating a platform on which to build various applications around Devissa, I shortlisted a couple of ESB solutions (although it's an interesting topic, I won't talk about "what's an ESB and do I need one"). I looked at Artix, Tibco, Aqualogic, ServiceMix and Mule. I found that Artix ESB was great and Artix DS looked like a good match for our data mapping needs; the only thing I was concerned about was the cost. Before getting in contact with the vendor, I asked my managers about our budget - they replied, almost with surprise, that they didn't know - if it was good and worth the money we might try to pitch it to the global architecture group - in other words, a commercial product was not really an option. This ruled out pretty much everything, leaving ServiceMix and Mule (if I were starting now I would also consider Spring Integration). I read a bit about JBI. I tried to like it, I really did... still I couldn't swallow the idea of normalizing your data on each endpoint and being forced to handle all these chunks of XML flying around. At that time Mule looked like the obvious answer for an open-source ESB.
The first thing I had to do was to build custom transports for Moxie and Devissa. That took about 2-3 days. They didn't have any fancy features (actually they barely worked), but I was able to receive a message from one and stuff a message in the other. During the following year both transports evolved a lot, ending up with a full rewrite last month, porting them to Mule2 and adding goodies like container-managed dispatcher threading, half-sync support, support for all Devissa application protocols and others.
The second phase was to build a neutral domain model as described in Eric Evans's "Domain Driven Design", which I had read recently. Then I wrote two transformers - Devissa2Domain and Domain2Moxie, implemented a simple POJO with about 15 lines of real code and voila - all our Devissa orders and executions appeared in Moxie. Forking the flow to a database was really easy, since I could use the Mule JDBC connector and it took only 10 lines of config. Storing the messages in XML was also easy with the Mule XStream transformer and the Mule File connector. The world was great.
Not really. It turned out that the DB storage and the file-based audit were not real requirements, so we cut them really quickly (or perhaps they made the first release). Soon, during UAT, it turned out that even though the BAs had created quite detailed requirements, they didn't match what the business wanted. Even worse - the business itself wasn't sure what it wanted. We were going through a few iterations a day, discovering more data that needed to be mapped and formats that needed to be converted. Vital pieces of information were present in one model and not in the other, so they had to be either looked up from a static table or calculated from a couple of different fields, and sometimes they ended up stuck in a field that had a different purpose and that we were not using at the time.
During all this time, the domain model was growing. Each new piece of information was captured clearly and unambiguously in a Java bean with strongly typed properties, validation and stuff. We went live on December 14th. On the next day the system broke. We kept tweaking the business logic for quite some time and for each tweak there were always three places to change - the domain model, the inbound transformer and the outbound transformer.
One day I decided to see what would happen if we dropped the domain model altogether, replaced the inbound transformer with an isomorphic conversion from the Devissa data classes to standard Java collections, and then used a rule engine to build the outgoing Moxie message. Enter Drools. The experiment was a success - in a couple of days, I was able to ditch my domain model (which had grown to be so specific to the application that it wasn't really neutral any more). Drools was working fine, though I had the feeling that something was wrong... I never asserted or retracted any facts in my consequences - I was abusing the Rete engine. Actually, all I was doing was a glorified switch statement.
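To show what I mean by "a glorified switch statement", here is a minimal sketch of the same kind of mapping written directly against plain Java collections. The keys (`ORDER_ID`, `SIDE`) and the one-letter side codes are hypothetical and only stand in for the real Devissa and Moxie fields.

```java
import java.util.HashMap;
import java.util.Map;

class SwitchMappingSketch {
    // Build an outgoing message (as a plain map) from an incoming one.
    // Each "rule" is a condition on the input and a simple assignment on the
    // output; nothing is ever asserted into or retracted from a working memory.
    static Map<String, Object> toMoxie(Map<String, Object> devissa) {
        Map<String, Object> moxie = new HashMap<>();
        moxie.put("orderId", devissa.get("ORDER_ID"));

        String side = String.valueOf(devissa.get("SIDE"));
        switch (side) {
            case "BUY":
                moxie.put("side", "B");
                break;
            case "SELL":
                moxie.put("side", "S");
                break;
            default:
                throw new IllegalArgumentException("Unknown side: " + side);
        }
        return moxie;
    }

    public static void main(String[] args) {
        Map<String, Object> in = new HashMap<>();
        in.put("ORDER_ID", "ORD-1");
        in.put("SIDE", "BUY");
        System.out.println(toMoxie(in));
    }
}
```

When every rule looks like one of these `case` branches, the rule engine's pattern matching is not buying you much.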
While I was at it, I decided to ditch Drools as well and use MVEL - one of the consequence dialects of Drools, which turned out to be a nice, compact and easy-to-embed language. MVEL is designed mainly as an expression language, though it has control-flow statements and other stuff. With MVEL, all my transformation fitted on one screen and had the familiar imperative look and feel, but without the cruft. I was able to plug in some Java functions using the context object, which allowed me to hide some ugly processing; and the custom resolvers allowed me to resolve MVEL variables directly from the Devissa message and assign them directly to the properties of the Moxie message beans.
Some time after that, for a different project building on the same foundation, I decided to see if I could infer an XML schema from the XML serialization of the Devissa messages. After some massaging I used that schema to generate the domain model using JAXB and tried to see how it felt. It was a disaster. A typical Devissa message has more than 50 properties (often more than 100). Usually you need 10-20 of them. Also, the generated property names were ugly. Even after conversion from CONSTANT_CASE to camelCase, they were still ugly. The automatically generated beans were practically unusable, the XML looked not-human-editable, and the XSD was not adding any real value since it lacked any semantic restrictions, so the whole thing felt like jumping through hoops. In the end I dropped the whole JAXB idea and went with MVEL again.
Third time lucky: at the beginning of this March, I started a new project. This time I again decided to try a new approach - in the inbound transformer, I wrap the raw Devissa message in an adapter, exposing the fields I need as bean properties, but carrying the full dataset of the original message. It works well. One particular benefit is that you can always look at the source data and see if there is anything there that might be useful.
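The adapter idea is easy to sketch. In the version below the raw message is assumed to have already been converted to a plain `Map`; the class name `OrderAdapter` and the keys `ORDER_ID` and `ORDER_QTY` are made up for illustration and are not actual Devissa field names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class OrderAdapter {
    // The complete original dataset travels with the adapter.
    private final Map<String, Object> raw;

    OrderAdapter(Map<String, Object> raw) {
        // Defensive copy, so later changes to the caller's map do not leak in.
        this.raw = Collections.unmodifiableMap(new HashMap<>(raw));
    }

    // Only the handful of fields the application needs get typed bean getters.
    public String getOrderId() {
        return (String) raw.get("ORDER_ID");
    }

    public long getQuantity() {
        return ((Number) raw.get("ORDER_QTY")).longValue();
    }

    // Escape hatch: everything else stays reachable for inspection or new mappings.
    public Map<String, Object> getRaw() {
        return raw;
    }

    public static void main(String[] args) {
        Map<String, Object> message = new HashMap<>();
        message.put("ORDER_ID", "ORD-1");
        message.put("ORDER_QTY", 500);
        message.put("SOME_OTHER_FIELD", "not mapped yet");

        OrderAdapter order = new OrderAdapter(message);
        System.out.println(order.getOrderId() + " x " + order.getQuantity());
        System.out.println(order.getRaw().keySet());
    }
}
```

Adding a newly required field is then a one-getter change in one place, instead of touching a domain class and two transformers.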
In conclusion I'll try to summarize:
- A neutral model plus double translation can yield benefits when the domain is well known, especially if it is externally defined (i.e. a standard). On the other hand it's a pain in the ass to maintain, especially if the domain objects change frequently.
- Rule engines are good when you have... ahem, rules. Think complex conditions and simple consequences. Actually, in the original Rete paper, the consequences are only meant to assert and retract facts. Changing an object in the working memory or doing anything else with side effects behind the engine's back is considered bad practice at best or (usually) plain wrong. Even when using fact invalidation (truth maintenance), it has a big performance impact.
- Direct mapping using an expression language works well, especially for big and complex messages. The scripts are compact and deterministic, which makes them maintainable. You might need to write your own variable resolvers and extend the language with custom functions. Also, debugging can be a nuisance, but if you keep your control flow to a minimum and use plugged-in Java functions, it's quite OK.
- Adapters are a middle ground between double translation and direct mapping. They tend to work well as an internal representation for the application, and you can also stuff some intelligence in them without worrying that somebody might regenerate them. With a bean mapping framework like Dozer you can even automate the transformation to the output datatype, though for many cases that would be overkill (sometimes 200 lines of straight Java code are more maintainable than 50 lines of XML or 10 lines of LISP).
- XML works well if your output format is XML, or if you need to apply transformations with XSLT or render it using XSL-FO. As we know, you can run XPath on bean and collection graphs using JXPath; also, any expression language can provide similar capabilities.
Next time, I'll write about component decomposition, content-based routing vs coarse-grained components and how to decide whether to do the transformation in a component or in a transformer.