Wednesday, July 23, 2008

Transformers vs Components

When I started using Mule, I wondered for some time what the big difference is between a transformer and a component. Why don't we just stick with the plain pipes-and-filters model instead of complicating it with the 'component' concept?

In general, transformers have a conceptually simpler interface - they are supposed to get a payload and return a payload, possibly of a different type. You should prefer transformers for reusable transformation tasks. On the other hand, transformers cannot:

  • Swallow messages - to achieve this, use a filter or a router.
  • Return multiple messages - use a router.
  • Pause temporarily
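
The limitations above follow from the shape of the contract. Here is a minimal sketch in plain Java, not Mule's real interfaces: the class name, PARSE_QUANTITY, NOT_BLANK and process are all hypothetical names made up for illustration. The transformer is modelled as a function that must return exactly one payload, and the filter as a predicate that only decides whether the message continues.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration; these are not Mule's interfaces.
final class TransformerVsFilterSketch {

    // A transformer: exactly one payload in, exactly one payload out,
    // possibly of a different type. It has no way to say "drop this".
    static final Function<String, Integer> PARSE_QUANTITY =
            payload -> Integer.parseInt(payload.trim());

    // A filter: votes on whether the message continues; it does not modify it.
    static final Predicate<String> NOT_BLANK =
            payload -> payload != null && !payload.trim().isEmpty();

    // Returns the transformed payload, or null when the filter swallowed the message.
    static Integer process(String payload) {
        if (!NOT_BLANK.test(payload)) {
            return null; // dropped by the filter, the transformer never sees it
        }
        return PARSE_QUANTITY.apply(payload);
    }

    public static void main(String[] args) {
        System.out.println(process(" 42 ")); // prints 42
        System.out.println(process("   "));  // prints null
    }
}
```

Keeping the "should this message continue?" decision out of the transformer is what makes the transformer reusable in other chains.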

In addition, despite their more complex configuration, components provide some neat features:

  • Message queuing - if your message goes through multiple stages of enrichment, each taking variable time, you might be better off implementing the stages as components and separating them with VM queues.
  • Statistics - Mule automatically measures the processing time and volume through each inbound or outbound endpoint.
  • Lifecycle control - you can start, stop or pause a component multiple times (while a component is not running, the messages will queue up). Transformers are only initialized and disposed once.
  • Chaining - a component allows you to tack on multiple filters, transformers, routers, etc. A transformer is just one part of this chain.
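
To make the queuing and lifecycle points concrete, here is a rough single-threaded model in plain Java. It is only a sketch of the idea and assumes nothing about Mule's internals: QueuedComponentSketch and all of its methods are hypothetical names. Messages are always accepted, but they are only processed while the component is running, and a counter stands in for the volume statistics.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

// Hypothetical model of a queued, pausable component; not Mule's API.
final class QueuedComponentSketch {
    private final Queue<String> inbound = new ArrayDeque<>();
    private final List<String> outbound = new ArrayList<>();
    private final Function<String, String> service;
    private boolean running = false;
    private int processedCount = 0; // a very simple volume statistic

    QueuedComponentSketch(Function<String, String> service) {
        this.service = service;
    }

    // Messages are always accepted; they wait in the queue while not running.
    void dispatch(String payload) {
        inbound.add(payload);
        drain();
    }

    // Unlike a transformer, the component can be started and paused repeatedly.
    void start() {
        running = true;
        drain();
    }

    void pause() {
        running = false;
    }

    // Processes queued messages in arrival order for as long as we are running.
    private void drain() {
        while (running && !inbound.isEmpty()) {
            outbound.add(service.apply(inbound.remove()));
            processedCount++;
        }
    }

    int queued() { return inbound.size(); }
    int processed() { return processedCount; }
    List<String> output() { return outbound; }
}
```

A message dispatched while the component is paused simply stays in the queue until the next start() call, which is the behaviour you cannot get from a transformer.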

To put it together, here is an overview of Mule's component roles in a simple message routing sequence (I'm trying to explain the roles, not the exact sequence.)

  • The message enters through an inbound endpoint, where the transport extracts the payload and the message headers and bundles them in a message. (The message and some other objects are bundled in EventContext, but it is still a mystery to me why we need it.)
  • After the message is created, it passes through a filter chain, where the filters are applied one after another and each of them votes true or false. Full consensus is required; otherwise the message is rejected. Depending on the configuration, rejected messages can be routed somewhere else. The filters are not supposed to modify the message (in practice they can, but it's a very bad idea).
  • If a message passes all filters it goes through a transformer chain, where each transformer is applied to the output of the preceding one. A transformer is not able to gracefully reject a message, but it can throw an exception.
  • We might have an interceptor wrapped around the component, which can log stuff before or after invocation or measure latency, etc.
  • Based on the payload and on the component type, Mule resolves the method it should call and, if necessary, transforms the payload to something that fits. The return value of the method is used as the payload of a single outgoing message.
  • The outgoing message is passed to an outbound router, which can split it into multiple messages or just drop it. The router then usually sends the message(s) to an arbitrary number of outgoing endpoints. The outbound router uses the filter chains of the outgoing endpoints to decide which ones to use. The outgoing endpoints can be Mule internal endpoints or periphery endpoints connecting the application to external systems.
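
The roles above can be sketched in a few lines of plain Java. This is a simplified model under my own assumptions, not Mule code: RoutingSequenceSketch, its fields and the "endpoint <- payload" delivery format are hypothetical, payloads are plain strings, and interceptors and method resolution are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical model of the roles described above; names are illustrative only.
final class RoutingSequenceSketch {
    final List<Predicate<String>> inboundFilters = new ArrayList<>();
    final List<Function<String, String>> transformers = new ArrayList<>();
    Function<String, String> component = payload -> payload;
    // Outbound endpoint name -> filter deciding whether that endpoint gets the message.
    final Map<String, Predicate<String>> outboundEndpoints = new LinkedHashMap<>();
    final List<String> rejected = new ArrayList<>();

    // Returns one "endpoint <- payload" entry per endpoint the result was sent to.
    List<String> receive(String payload) {
        // 1. Filter chain: every filter must accept (full consensus).
        for (Predicate<String> filter : inboundFilters) {
            if (!filter.test(payload)) {
                rejected.add(payload); // could be routed elsewhere instead
                return new ArrayList<>();
            }
        }
        // 2. Transformer chain: each one is applied to the output of the previous one.
        String current = payload;
        for (Function<String, String> transformer : transformers) {
            current = transformer.apply(current);
        }
        // 3. Component: its return value becomes the single outgoing payload.
        String result = component.apply(current);
        // 4. Outbound router: use the endpoint filters to pick the destinations.
        List<String> deliveries = new ArrayList<>();
        for (Map.Entry<String, Predicate<String>> endpoint : outboundEndpoints.entrySet()) {
            if (endpoint.getValue().test(result)) {
                deliveries.add(endpoint.getKey() + " <- " + result);
            }
        }
        return deliveries;
    }
}
```

Note that only the filters and the router can make a message disappear or multiply; the transformer chain and the component always map one message to one message.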

In general, I would recommend that you start with a transformer and refactor it to a component if you realize that you want to add any of the aspects mentioned above. You can even easily use transformers as POJO components.

Monday, July 21, 2008

Testing Mule applications

Antoine Borg has posted an interesting article on the Ricston Blog about the feasibility of using mock objects for testing Mule applications. The bottom line is that in an integration scenario it is usually difficult to capture the application specification in mocks, and it is often easier to write stub applications simulating the external systems (outside of your application process).

I've found that often I don't really test the whole application, but instead I test single components with a few attached transformers, filters and routers. I start by writing a simple Mule configuration - a single component using VM endpoints with queuing. Then I use MuleClient to feed in canned input data and assert the output from the outbound endpoint(s). As the application takes shape, I add transformers and routers as needed to approximate the real usage.
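
The shape of such a test can be shown without Mule on the classpath. The sketch below is only an imitation of the style: ComponentTestHarnessSketch, deploy, dispatch and request are hypothetical names of my own, the in-memory queues stand in for VM endpoints, and the "vm://in" style strings in the usage note are used as plain map keys rather than real endpoint URIs. In a real test, MuleClient and a small Mule configuration would play these roles.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical in-memory harness; it imitates the test style, not Mule or MuleClient.
final class ComponentTestHarnessSketch {
    private final Map<String, BlockingQueue<String>> queues = new HashMap<>();

    // Looks up a named queue, creating it on first use (call from the test thread only).
    private BlockingQueue<String> queue(String name) {
        return queues.computeIfAbsent(name, n -> new LinkedBlockingQueue<>());
    }

    // Wires a component between two named queues and runs it on a worker thread.
    Thread deploy(String in, String out, Function<String, String> component) {
        BlockingQueue<String> source = queue(in);
        BlockingQueue<String> sink = queue(out);
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    sink.put(component.apply(source.take()));
                }
            } catch (InterruptedException stop) {
                Thread.currentThread().interrupt(); // interrupted: shut down quietly
            }
        });
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    // Feeds canned input to the named queue without waiting for a reply.
    void dispatch(String endpoint, String payload) {
        queue(endpoint).offer(payload); // the queue is unbounded, so offer always succeeds
    }

    // Waits up to timeoutMillis for a message; returns null on timeout or interruption.
    String request(String endpoint, long timeoutMillis) {
        try {
            return queue(endpoint).poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

A test then deploys the component between "vm://in" and "vm://out", dispatches a canned payload to "vm://in", and asserts on what request("vm://out", 2000) returns. The timeout matters because the component runs asynchronously, just as it does behind a queued endpoint.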

I could imagine that the next step would be to reuse my production configuration: extract the perimeter endpoint definitions into a new file (separating them from the model, connectors and internal endpoints) and pass the two configuration files to the config builder. This would allow me to create an alternative perimeter endpoints file using VM endpoints, so I can instantiate it in a test case and use MuleClient for testing.

The benefit of this approach is that you are testing your component and routing logic without being exposed to the peculiarities of the external system. Ideally we still want to have full integration tests, including ones covering crash-failure, failover, connectivity loss and overload scenarios. We can achieve parts of this by stubbing the external system, but so far I have usually found it difficult to reproduce the behaviour faithfully enough (especially when we have limited understanding of the external system).

No bulletpoints this time.

Creative Commons License This work is licensed under a Creative Commons Attribution 3.0 Unported License.