Dimitar's Blog: On Maven2

Maven is a tool with an interesting history dating back to 2001. In its first years it got deservedly bad reputation for being unstable, poorly documented and more or less experimental piece of work. The release of version 2.0 in 2005 fixed many of the early quirks and set right many of the short-sighted design decisions. After having some bad experience with Maven 1, I was weary to get on the M2 bandwagon, but when I moved to a new job in 2006 I decided to give it a go. So far there have been ups and downs, but I'm fairly happy with it. I still haven't abandoned all my Ant and shell scripts, but I find that I'm using Maven as a primary building tool for most of my projects.

The core proposition of Maven is that one should be able to declare what they are building in some sort of manifest file and the build tool should be able to figure how to build it. The manifest should contain only the information that is specific for the project and all the build procedures should be implemented as plugins. Each build should be related to exactly one artifact of certain type. The artifacts are stored in repositories (more about this later.)

In Maven parlance, the manifest file is called POM (that stands for Project Object Model). If a project adheres to a predefined filesystem layout, the actual XML one has to write can be very small. Here is a minimal example:

<project schemalocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd ">
  <modelversion>4.0.0</modelversion>
  <groupid>com.acme.foo</groupid>
  <artifactid>foobar</artifactid>
  <version>3.14-alpha</version>
</project>

This configuration is enough to enable Maven to build, package, install, deploy and clean your project. It would even generate the repots site if you run mvn site. That's the way I usually start my projects. Later, you can tack more elements as needed.

The top-level element defines the default schema for the POM. The schema has a version which has to match the modelversion element below. While the schema is not strictly necessary, it makes the POM editing much easier if you are using XML aware editor. It is well constrained and there are few extensibility points, so the autocompletion works very well. It also features annotations for each element, which means that if you use IntelliJ IDEA you can just press Ctrl+Q on any element and get instant documentation. The schema design is a bit annoying, but it is very regular and easy to understand:

No attributes - everything is an element.
No mini-languages - everything is an element and any custom textual notations are avoided as much as possible (though in many places they use URLs). This allows for simple parsing and processing.
If an element can be repeated more than once it is enclosed in container element that can appear only once. This ensures that all elements of one type are textually next to each other. Each container can contain only elements of the same type.

Overall, the POM bears a resemblance to an IDE configuration file and they serve the same functionality. Both Maven and IDEs runs a predefined build process, parameterized by the information in the project file (or POM). One major difference is that Maven is designed to be ran from the command line and also encapsulate all environment-specific factors into the POM and the settings.xml files. You can use Maven to generate project files for IDEA, Eclipse and Netbeans based on the information in the POM.

The machine-specific and user-specific configuration is specified in the settings.xml files. There are two of them, the machine-specific settings are stored under the $M2_HOME/conf directory and apply to all the users on the machine. Usually the contents is standardized within the team (internal repositories, proxy settings, etc.) In our company, this file is posted on the wiki where everybody can download it. Alternatively we could have built our internal Maven distribution with the file pre-included. The second settings.xml file resides under ~/.m2 and contains user-specific settings overriding the machine-specific. One can use the user-specific settings to keep login credentials, private keys, etc. On Unix machine, this file should be readable only by the user.

Though the POM is very flexible and can be tweaked to accommodate a number of different scenarios, it is very recommended to refrain from overriding the defaults and use the Maven conventions as much as possible. This way, new developers on the project can get up to speed faster and (important) it's much less likely that you get bitten by untested plugin 'feature'.

One of the thing that new users tend to dislike most is the standard directory layout. In brief, you have pom.xml in the root, and your files go under a directory called src. By default, all files (artifacts) generated during the build go under a directory called target which helps for an easy cleanup. Note that there is no 'lib' as all the libraries reside in the local repository (more about this in another post.)

So far so good, but then under source we usually have main and possibly test, integration and site directoryes and then under them we have java, resources and only there we put the actual source files. This means that we have at least 3 directory levels used for classification above our sources and if you jump between them using Windows Explorer or bash it makes for a lot of clicking/typing. On the other hand, this is the price one pays for the Maven's magic - each directory level means something to the plugins that build your project. E.g. the unit test goal knows that it should runt he tests under test and not the ones under integration. All the files in the resources directory are copied in the final JAR, while the ones under classes are not and so on and so forth.

This post became rather long, so I'll finish it here. Next week I'm going to cover Maven's dependencies management and repository organization. Again I'll try to talk more about "why's" and less about the "what's" that are already covered pretty well in the following tutorials:

Dimitar's Blog

Saturday, October 27, 2007

On Maven2

No comments:

Blog Archive