Wednesday, July 23, 2008

Transformers vs Components

When I started using Mule, I wondered for some time what the big difference between a transformer and a component is. Why not just stick with the plain pipes-and-filters model instead of complicating it with the 'component' concept?

In general, transformers have a conceptually simpler interface: they are supposed to take a payload and return a payload, possibly of a different type. You should prefer transformers for reusable transformation tasks. On the other hand, transformers cannot:

  • Swallow messages - to achieve this, use a filter or a router.
  • Return multiple messages - use a router.
  • Pause temporarily
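
To make the contrast concrete, here is a minimal sketch of the three roles in plain Java. It deliberately does not use Mule's API: the class RolesSketch and the names PARSE_QUANTITY, POSITIVE_ONLY and splitIntoUnits are hypothetical, chosen only to illustrate that a transformer maps one payload to one payload, while dropping and splitting belong to filters and routers.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustration only: plain Java stand-ins, not Mule classes.
public class RolesSketch {
    // Transformer role: exactly one payload in, exactly one payload out (the type may change).
    public static final Function<String, Integer> PARSE_QUANTITY = s -> Integer.parseInt(s.trim());

    // Filter role: votes on whether the message continues; it does not modify the payload.
    public static final Predicate<Integer> POSITIVE_ONLY = n -> n > 0;

    // Router role: may emit zero, one or many messages for a single input.
    public static List<Integer> splitIntoUnits(int quantity) {
        return Collections.nCopies(quantity, 1);
    }

    // Wires the three roles together for one incoming payload.
    public static List<Integer> process(String rawPayload) {
        Integer quantity = PARSE_QUANTITY.apply(rawPayload);
        if (!POSITIVE_ONLY.test(quantity)) {
            return Collections.emptyList(); // swallowed by the filter, not by the transformer
        }
        return splitIntoUnits(quantity);
    }

    public static void main(String[] args) {
        System.out.println(process(" 3 "));
        System.out.println(process("-2"));
    }
}
```

Note that the transformer itself never decides to drop or multiply a message; those decisions live in the filter and the router.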

In addition, despite their more complex configuration, components provide some neat features:

  • Message queuing - if your message goes through multiple stages of enrichment, each taking variable time, you might be better off implementing the stages as components and separating them with VM queues.
  • Statistics - Mule automatically measures the processing time and message volume through each inbound or outbound endpoint.
  • Lifecycle control - you can start, stop or pause a component multiple times (while a component is not running, the messages will queue up). Transformers are only initialized and disposed once.
  • Chaining - a component allows you to tack on multiple filters, transformers, routers, etc. A transformer is just one part of this chain.
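
The message queuing point can be sketched in plain Java, with in-memory queues standing in for VM queues. This is an assumption-laden illustration rather than Mule configuration: the class QueuedStagesSketch, the STOP marker and the two enrichment steps ("+address", "+credit") are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.UnaryOperator;

// Illustration only: each stage plays the role of a component, each queue the role of a VM queue.
public class QueuedStagesSketch {
    private static final String STOP = "__STOP__"; // hypothetical end-of-stream marker

    // One stage: take from the inbound queue, enrich, hand off to the next queue.
    static Thread startStage(BlockingQueue<String> in, BlockingQueue<String> out,
                             UnaryOperator<String> enrich) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = in.take();
                    if (STOP.equals(message)) {
                        out.put(STOP); // propagate shutdown to the next stage
                        return;
                    }
                    out.put(enrich.apply(message));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        return worker;
    }

    // Pushes the payloads through two decoupled stages and collects the results in order.
    public static List<String> run(List<String> payloads) {
        BlockingQueue<String> first = new LinkedBlockingQueue<>();
        BlockingQueue<String> second = new LinkedBlockingQueue<>();
        BlockingQueue<String> done = new LinkedBlockingQueue<>();
        startStage(first, second, m -> m + "+address");
        startStage(second, done, m -> m + "+credit");
        List<String> results = new ArrayList<>();
        try {
            for (String payload : payloads) {
                first.put(payload);
            }
            first.put(STOP);
            for (String m = done.take(); !STOP.equals(m); m = done.take()) {
                results.add(m);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for stages", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(java.util.Arrays.asList("order-1", "order-2")));
    }
}
```

Because each stage only talks to its queues, a slow stage lets messages pile up in front of it without blocking the stage before it, which is the decoupling the queues are meant to provide.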

To put it together, here is an overview of Mule's component roles in a simple message routing sequence (I'm trying to explain the roles, not the exact sequence.)

  • The message enters through an inbound endpoint, where the transport extracts the payload and the message headers and bundles them in a message. (The message and some other objects are bundled in EventContext, but it is still a mystery to me why we need it.)
  • After the message is created, it passes through a filter chain, where the filters are applied one after another and each of them votes true or false. A full consensus is required; otherwise the message is rejected. Depending on the configuration, rejected messages can be routed somewhere else. Filters are not supposed to modify the message (in practice they can, but it's a very bad idea).
  • If a message passes all filters it goes through a transformer chain, where each transformer is applied to the output of the preceding one. A transformer is not able to gracefully reject a message, but it can throw an exception.
  • We might have an interceptor wrapped around the component, which can log stuff before or after invocation or measure latency, etc.
  • Based on the payload and on the component type, Mule resolves the method it should call and, if necessary, transforms the payload to something that fits. The return value of the method is used as the payload of a single outgoing message.
  • The outgoing message is passed to an outbound router, which can split it into multiple messages or just drop it. The router then usually sends the message(s) to an arbitrary number of outgoing endpoints. The outbound router uses the filter chains of the outgoing endpoints to decide which ones to use. The outgoing endpoints can be Mule-internal endpoints or periphery endpoints connecting the application to an external system.
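
The roles above can be mimicked in a few lines of plain Java. This is only a sketch of the roles, not of Mule's internals or API: the class FlowSketch, the endpoint names "orders" and "audit", and the string-prefix "component" are hypothetical, and the interceptor and method resolution steps are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustration only: plain Java stand-ins for the roles, not Mule classes.
public class FlowSketch {
    // Filter chain: every filter must vote true, otherwise the message is rejected.
    public static boolean accept(List<Predicate<String>> filters, String payload) {
        for (Predicate<String> filter : filters) {
            if (!filter.test(payload)) {
                return false;
            }
        }
        return true;
    }

    // Transformer chain: each transformer is applied to the output of the preceding one.
    public static String transform(List<Function<String, String>> transformers, String payload) {
        String current = payload;
        for (Function<String, String> transformer : transformers) {
            current = transformer.apply(current);
        }
        return current;
    }

    // Outbound routing: deliver to every endpoint whose own filter accepts the message.
    public static List<String> route(Map<String, Predicate<String>> endpoints, String payload) {
        List<String> deliveries = new ArrayList<>();
        for (Map.Entry<String, Predicate<String>> endpoint : endpoints.entrySet()) {
            if (endpoint.getValue().test(payload)) {
                deliveries.add(endpoint.getKey() + " <- " + payload);
            }
        }
        return deliveries;
    }

    // One pass through the whole sequence for a single inbound payload.
    public static List<String> handle(String payload) {
        List<Predicate<String>> filters =
                Arrays.<Predicate<String>>asList(s -> !s.trim().isEmpty());
        if (!accept(filters, payload)) {
            return Collections.emptyList(); // rejected before reaching the component
        }
        List<Function<String, String>> transformers =
                Arrays.<Function<String, String>>asList(s -> s.trim(), s -> s.toUpperCase());
        String transformed = transform(transformers, payload);

        String result = "processed:" + transformed; // stands in for the component's method

        Map<String, Predicate<String>> endpoints = new LinkedHashMap<>();
        endpoints.put("orders", s -> s.contains("ORDER"));
        endpoints.put("audit", s -> true);
        return route(endpoints, result);
    }

    public static void main(String[] args) {
        System.out.println(handle(" order 42 "));
        System.out.println(handle("ping"));
        System.out.println(handle("   "));
    }
}
```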

In general, I would recommend that you start with a transformer and refactor it into a component if you realize that you want to add any of the aspects mentioned above. You can even easily use transformers as POJO components.


antoine said...

Hi ...

Excellent post for a very common question. A discussion that I had with MuleSource people ended up becoming the "Best practices" page for transformations: and that discussion is still pretty valid for transformations.

However, you touch upon a different aspect to the whole thing - when is a component not a component.

When with clients, I use the following rules:
01 - Are we trying to represent business logic? That's a component
02 - Are we converting data formats because our apps/components don't understand the inbound format? That's a transformation (Same applies for outbound)
03 - Are we converting data because the transport needs us to? That's a transformation.

The key point is "business logic" and now I'm moving into SOA and the concept of a service which is beyond the point of your blog post.

Entertaining stuff - thanks!


Dimitar said...

I agree, but there are still cases when the boundary between business logic and infrastructure is not that clear-cut. Consider an integration that maps one realtime process to another: whether a complex application-specific transformation is business logic or an infrastructure job is a matter of perception.

From my point of view, there is a clear functional difference between transformers and components.

I usually prefer transformers for operations that take the payload, modify it and return one and only one result (even if the transformation is very application-specific). I use components when the transformation requires features available to components only (e.g. a callout to a nested router).

Ross Mason said...

Great post, you have exactly the right balance here, and I agree that Antoine's questions help users to clarify the common use cases. One case that is sometimes hard to grasp is when you need to perform a transformation but need to read reference data from an external data source. This is a payload transformation and there is only one input and one output, but as you are enriching the data content this should be done in a component. Some users would class this as business logic anyway, hence a component would be used.

Chad said...

I'm still confused about when I should use a component and when I should use a transformer. Can someone clarify this for me some more? Thanks.

Dimitar said...

Hi Chad, can you elaborate on what your confusion is? If you post a particular scenario, I can give you my opinion.

You might also find this follow-up article interesting.

Chad said...

I have a transformer that contains an instance of an object, and the transformer uses this object to transform the data that comes in. Also, I have heard that a transformer is created for each message it receives. Is this true? What is the lifecycle of a transformer?

This work is licensed under a Creative Commons Attribution 3.0 Unported License.