Per-file compilation time in a Scala/Maven project

Where is Moore when he is needed?

Scala is a fabulous language and the tooling is now quite good, but it still has one big drawback, one that makes you wonder whether XKCD 303 wasn’t drawn with Scala in mind: Scalac takes forever to compile anything.

In a non-trivial project, that forever means (tens of) minutes for a cold build, with a continuous integration server burning CPU like hell. Developers are a bit better off now that their tooling has incremental compilation, but even for them a full build is sometimes needed, and when it happens, you just have to wait…

If the project also uses Maven, then you wait forever both for compilation and for Maven to download the internet three times. So, can something be done to help?

For the Maven part, the internet is full of resources explaining how to make it better (most of the time, the best answer is “switch away from Maven”).

For Scalac, we know that the beast is smart and does a great deal of things for us, so it’s understandable that compilation takes some time. But sometimes Scalac tries to be far too smart: for example, the inferencer can work for a long time to build a really complex type when all you wanted was a simple, known type that you could have specified yourself.

Knowing where Scalac spends its time…

In a big project, you generally know that compilation takes too long, but you have no idea where the time is spent. So you can’t even start to look for oddities, like one file taking 10 times as long to compile as the others.

Scala 2.10 could have had a really great feature for that with the “-Dscala.timings=true” option, but it was removed in the final release.

Fortunately, Grzegorz Kossakowski implemented an AspectJ equivalent of that timing option, available here: https://github.com/gkossakowski/scalac-aspects. In fact, several aspects are available, each collecting a different metric: the time to compile a file, or the time to compile a type.

Following the readme, it is quite easy to get the timing for the demonstration file. But now, how do you scale that up to a whole project built with Maven?

… in a project built with Maven?

The good news is that David Bernard did an awesome job with scala-maven-plugin (of course, you are using it for your Scala project), and with the recent 3.1.3-SNAPSHOT, the integration is a matter of adding a few configuration options to your pom.xml.

But before explaining them, let’s set up some environment and context.

Setting up AspectJ and scalac-aspects

First, you have to install AspectJ: download it, then install it with “java -jar aspectj-X.X.X.jar” (as explained at the bottom of the previous link).
In this blog post, we refer to the installation path as “/path/to/aspectj” (so that the “aspectj” directory contains the bin, doc and lib sub-directories).

% export ASPECTJ_HOME=/path/to/aspectj
% export PATH=${ASPECTJ_HOME}/bin:${PATH}
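
Before going further, it can be worth checking that the directory really has the expected layout. Here is a minimal sketch: check_aspectj_home is a name made up for this post (it is not part of AspectJ), and the three files it looks for are simply the ones used by the commands in the rest of this article.

```shell
# Hypothetical helper: checks that an AspectJ install directory contains
# the files this post relies on (ajc, the runtime jar and the weaver jar).
# Takes the directory as first argument, or falls back to $ASPECTJ_HOME.
check_aspectj_home() {
  home="${1:-${ASPECTJ_HOME:-}}"
  if [ -z "$home" ]; then
    echo "ASPECTJ_HOME is not set" >&2
    return 1
  fi
  status=0
  for f in bin/ajc lib/aspectjrt.jar lib/aspectjweaver.jar; do
    if [ ! -e "$home/$f" ]; then
      echo "missing: $home/$f" >&2
      status=1
    fi
  done
  return "$status"
}
```

Calling check_aspectj_home with no argument after the two exports prints nothing and returns 0 when everything is in place, and lists the missing files otherwise.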

Next, clone the scalac-aspects project from GitHub, and build the “PerUnitTiming” aspect with the example file:

% git clone https://github.com/gkossakowski/scalac-aspects.git
% cd scalac-aspects
% ./scalac-aspects PerUnitTiming.aj Foo.scala
/path/to/aspectj/bin/ajc -1.5 -cp /path/to/aspectj/lib/aspectjrt.jar:/path/to/scala/lib/akka-actors.jar
:/path/to/scala/lib/jline.jar:/path/to/scala/lib/scala-actors.jar:/path/to/scala/lib/scala-actors-migration.jar
:/path/to/scala/lib/scala-compiler.jar:/path/to/scala/lib/scala-library.jar:/path/to/scala/lib/scala-partest.jar
:/path/to/scala/lib/scalap.jar:/path/to/scala/lib/scala-reflect.jar:/path/to/scala/lib/scala-swing.jar
:/path/to/scala/lib/typesafe-config.jar -outxml -outjar /tmp/scalac-aspects.adTGFYsZsb/aspects.jar PerUnitTiming.aj

/path/to/workspaces/rudder-project/scalac-aspects/PerUnitTiming.aj:23 [warning] advice defined in PerUnitTiming has not been applied [Xlint:adviceDidNotMatch]
/path/to/workspaces/rudder-project/scalac-aspects/PerUnitTiming.aj:9 [warning] advice defined in PerUnitTiming has not been applied [Xlint:adviceDidNotMatch]
2 warnings

/path/to/scala/bin/scalac -nobootcp 
  -J-javaagent:/path/to/aspectj/lib/aspectjweaver.jar  
 -toolcp /tmp/scalac-aspects.adTGFYsZsb/aspects.jar 
  -d /tmp/scalac-aspects.adTGFYsZsb Foo.scala

Per-file timings
Foo.scala 2376,454 ms

Installing scalac-aspects in the local Maven repository

With the previous command, scalac-aspects created an aspects.jar file in a randomly named temporary directory (/tmp/scalac-aspects.adTGFYsZsb in the output above). We are going to make that jar available to Maven in your local repository with the install:install-file goal:

mvn install:install-file \
   -Dfile=/tmp/scalac-aspects.adTGFYsZsb/aspects.jar \
   -DgroupId=scalac-aspects \
   -DartifactId=per-unit-timing \
   -Dversion=1.0-SNAPSHOT \
   -Dpackaging=jar \
   -DgeneratePom=true

The groupId, artifactId and version are arbitrary; you can use whatever you want.

Of course, if you want to use other Scalac Aspects, just repeat the last steps with them.
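
If you do repeat these steps, copying the temporary path by hand gets tedious. As a rough sketch, and assuming the scalac-aspects script keeps printing the ajc command line the way it does in the output shown earlier, the jar path can be picked out of that output. extract_outjar is a hypothetical helper name, not something shipped with scalac-aspects.

```shell
# Hypothetical helper: reads the scalac-aspects output on stdin and prints
# the first path that follows ajc's -outjar option.
extract_outjar() {
  # Put every whitespace-separated token on its own line, then print the
  # token that comes right after "-outjar" (first occurrence only).
  tr -s ' \n' '\n\n' |
    awk 'prev == "-outjar" && !done { print; done = 1 } { prev = $0 }'
}

# Usage (not run here; 2>&1 is there because it is only an assumption that
# the command line is printed on standard output):
#   jar=$(./scalac-aspects PerUnitTiming.aj Foo.scala 2>&1 | extract_outjar)
#   mvn install:install-file -Dfile="$jar" -DgroupId=scalac-aspects \
#     -DartifactId=per-unit-timing -Dversion=1.0-SNAPSHOT \
#     -Dpackaging=jar -DgeneratePom=true
```

For another aspect, only the .aj file name and the artifactId would change.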

And now, everything is set up!

pom.xml configuration for timing

So, the last bit is to tell scala-maven-plugin to use AspectJ and the PerUnitTiming aspect.

You have to specify three things in the configuration section of scala-maven-plugin to make it use your aspect:

  • The JVM that runs scalac must load the AspectJ weaver as a Java agent. This is configured with the parameter jvmArg:
-javaagent:/path/to/aspectj/lib/aspectjweaver.jar
  • scalac must not use a specific boot classpath. This is configured with the parameter arg:
-nobootcp
  • The actual aspect to use is specified with a dependency tag pointing to the previously installed aspect:
<dependency>
  <groupId>scalac-aspects</groupId>
  <artifactId>per-unit-timing</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>

Putting it all together, here is an example pom.xml that uses the per-file timing aspect:

  ....
  <build>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.1.3-SNAPSHOT</version>
        <configuration>
          <dependencies>
            <dependency>
              <groupId>scalac-aspects</groupId>
              <artifactId>per-unit-timing</artifactId>
              <version>1.0-SNAPSHOT</version>
            </dependency>
          </dependencies>
          <args>
            <arg>-nobootcp</arg>
          </args>
          <jvmArgs>
            <jvmArg>-javaagent:/path/to/aspectj/lib/aspectjweaver.jar</jvmArg>
          </jvmArgs>
        </configuration>
        .....
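
If the timings do not show up, the first thing to rule out is a missing setting. The following is only a quick sanity check, not a real validation: check_timing_pom is a made-up helper that does a plain text search (it does not parse the XML), and it assumes the per-unit-timing artifactId chosen earlier and settings written one per line as in the example.

```shell
# Hypothetical helper: greps a pom.xml for the three settings described in
# this post (javaagent jvmArg, -nobootcp arg, dependency on the aspect jar).
check_timing_pom() {
  pom="${1:-pom.xml}"
  status=0
  for needle in \
    '<jvmArg>-javaagent:' \
    '<arg>-nobootcp</arg>' \
    '<artifactId>per-unit-timing</artifactId>'
  do
    # -F: fixed string, -e: the pattern is given explicitly
    if ! grep -q -F -e "$needle" "$pom"; then
      echo "not found in $pom: $needle" >&2
      status=1
    fi
  done
  return "$status"
}
```

Run from the module directory, check_timing_pom returns 0 when the three strings are present and names the missing ones otherwise.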

Building the project with that pom.xml will lead to output like:

% mvn clean compile
[INFO] Scanning for projects...
[INFO]
....
[INFO] Per-file timings
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/LDAPConnection.scala 6642,821 ms
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/GeneralizedTime.scala 275,197 ms
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/Tree.scala 326,971 ms
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/schema/LDAPSchema.scala 627,807 ms
....

That Maven output can be trivially post-processed in the shell to keep only the timing information, sorted so that the slowest file comes last:

% mvn clean compile | grep ".* ms"  | sort --field-separator=" " -k3 -n
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/GeneralizedTime.scala 275,197 ms
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/Tree.scala 326,971 ms
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/schema/LDAPSchema.scala 627,807 ms
[INFO] .../rudder-project/scala-ldap/src/main/scala/com/normation/ldap/sdk/LDAPConnection.scala 6642,821 ms
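
How sort -n reads the decimal comma in these timings depends on your locale. As a slightly more defensive sketch, the comma can be turned into a dot before sorting. sort_timings is a made-up helper name, and it assumes that file paths contain no spaces and that timing lines end with “ ms” as in the output above.

```shell
# Hypothetical helper: reads Maven output on stdin, keeps the per-file
# timing lines and prints "<milliseconds> <file>", slowest file last.
sort_timings() {
  LC_ALL=C awk '$NF == "ms" && NF >= 3 {
      t = $(NF - 1)
      sub(/,/, ".", t)                 # 6642,821 -> 6642.821
      if (t ~ /^[0-9]+(\.[0-9]+)?$/)   # skip lines that only happen to end in "ms"
        print t, $(NF - 2)
    }' |
    LC_ALL=C sort -n -k1,1
}

# Usage (not run here):
#   mvn clean compile | sort_timings
```

Putting the time in the first column also makes it easy to pipe the result into tail to keep only the few slowest files.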

And now, you can start trying to understand why some of your files take SEVERAL seconds to compile!
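
To pick out such files automatically, one option is to compare each file with the median compile time. This is just a sketch under a few assumptions: slow_files is a hypothetical helper, the default factor of 10 is an arbitrary threshold, and file paths are assumed to contain no spaces.

```shell
# Hypothetical helper: reads Maven output on stdin and prints the files whose
# compile time is more than FACTOR times the median (FACTOR defaults to 10).
slow_files() {
  factor="${1:-10}"
  # 1. Keep timing lines only, as "<milliseconds> <file>" with a decimal dot.
  LC_ALL=C awk '$NF == "ms" && NF >= 3 {
      t = $(NF - 1)
      sub(/,/, ".", t)
      if (t ~ /^[0-9]+(\.[0-9]+)?$/) print t, $(NF - 2)
    }' |
    # 2. Sort by time so that the median is simply the middle line.
    LC_ALL=C sort -n -k1,1 |
    # 3. Print every line whose time exceeds factor * median.
    LC_ALL=C awk -v factor="$factor" '
      { time[NR] = $1; line[NR] = $0 }
      END {
        if (NR == 0) exit
        median = time[int((NR + 1) / 2)]   # lower middle for an even count
        for (i = 1; i <= NR; i++)
          if (time[i] + 0 > factor * median) print line[i]
      }'
}

# Usage (not run here):
#   mvn clean compile | slow_files 10
```

On the four files shown above, the median is the 326,971 ms of Tree.scala, so with a factor of 10 only LDAPConnection.scala is reported.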
