Tuesday, April 9, 2013

Compiling JavaScript code to detect errors early

Even though JavaScript is the most platform-independent language we have today, and despite its flexibility, one problem has never really been resolved: as an interpreted language, it hides errors until runtime.

As we are moving away from rich and compiled web client platforms like Java Applets, Flash, and Silverlight, we are filling the gap using complex JavaScript code and HTML5/CSS3 animations.
That is all fine until the amount of JavaScript code in the project gets so big that it becomes scary. Suddenly no one really wants to touch the codebase, as even renaming a variable can be painful and error-prone. Development slows down and quality declines.

When we talk about compiling the source to detect errors, we are not really referring to the whole compilation process, which would include lexical analysis, preprocessing, parsing, semantic analysis, code generation, and code optimization. Most likely we are only interested in the first four steps; the actual code generation and optimization are not essential for catching errors early on.
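To illustrate running only the front of that pipeline, here is a minimal sketch that parses a snippet without executing it. The helper name findSyntaxError is hypothetical. It relies on the standard Function constructor, which parses its body eagerly. It only reports syntax errors, so it covers far less than a real compiler's semantic analysis.

```javascript
// Hypothetical helper: parse a snippet without running it.
// The Function constructor parses its body immediately and throws a
// SyntaxError for malformed code. The body would only run if the
// resulting function were called, which never happens here.
function findSyntaxError(source) {
  try {
    new Function(source);
    return null; // parsed cleanly
  } catch (e) {
    if (e instanceof SyntaxError) {
      return e.message;
    }
    throw e; // anything else is unexpected, so pass it on
  }
}

console.log(findSyntaxError('var greeting = "hello";')); // prints null
console.log(findSyntaxError('var greeting = ;') !== null); // prints true
```

A check like this would not notice the missing-parameter or null problems discussed in this post, which is why type-aware tooling is needed on top of parsing.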

Compiling JavaScript

The term can be a bit misleading, as the result of this process is not a binary that looks and feels different from the source, but a checked and optimized version of the original source.

Statically checking the source is quite easy in some cases: we just have to partially emulate the JavaScript engine and check whether the basic references in the code are okay.

The following is quite easy to catch:

function hello(name){
 window.console.log("hello "+ name);
}
hello(); // missing parameter

However, the following is a bit harder:

hello(null); // will print “hello null”

At first glance it’s okay, as hello() requires a parameter and null is a parameter. However, running the code gives an undesired result.

As the JavaScript language itself does not solve issues like this, compilers usually accept hints in the form of comments:

/**
 * @param {string} name
 */
function hello(name){
 window.console.log("hello "+ name);
}
hello(null); // the compiler would catch this

In this case the compiler will use the hint and give a warning:

WARNING - actual parameter 1 of hello does not match formal parameter.


This gives the developer more feedback than a Java or C# compiler would, because the annotated type does not accept null by default: if we expect a string, the parameter has to have a value!
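When null really is a legitimate argument, the annotation can say so. The following is a minimal sketch, assuming Closure-style type syntax in which a leading question mark marks a type as nullable. The function name greeting and the fallback text are invented for illustration, and the function returns its result instead of logging it so it is easy to check.

```javascript
/**
 * @param {?string} name A string, or null when the name is unknown.
 * @return {string}
 */
function greeting(name) {
  // Handle the null case explicitly instead of producing "hello null".
  if (name === null) {
    return 'hello stranger';
  }
  return 'hello ' + name;
}

console.log(greeting('Ada')); // prints "hello Ada"
console.log(greeting(null)); // prints "hello stranger"
```

With the nullable annotation the call with null is an intentional part of the contract, so the body has to deal with it.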

Google Closure Compiler

One of the best JavaScript compilers is Google Closure Compiler, which can be very strict with our code and can check and enforce Java-like rules. A nice side effect is that the compiled code is smaller and, in theory, faster than the original. Of course the real value comes from parsing and statically checking our code, but having harder-to-read production JavaScript can be handy sometimes.

Renaming, extern and export

One of the most important things to know about Closure Compiler is that it renames variables and functions, and updates their references too, to make the code more compact. Variable names help us read the code, but for the machine they are just strings.
Compiling the above code with:

java -jar compiler.jar --js jsdemo.js --compilation_level=ADVANCED_OPTIMIZATIONS --warning_level VERBOSE

would yield:

function a(b){window.console.log("hello "+b)}a(null);

However, this can be dangerous, as someone may want to call the hello() function from a different, uncompiled codebase. To tell the compiler not to rename something, we have a couple of options.

Store it explicitly in the global namespace:

…
window['hello'] = hello;

This compiles to:

function a(b){window.console.log("hello "+b)}window.hello=a;a(null);

Even though the hello() function is renamed, a reference to it is kept under the name “hello”. This way the code can still be shorter while any external dependencies stay intact.

The second option, for keeping a property or method name intact within an object, is to expose it:
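The export pattern can be sketched end to end as follows. The variable root is an assumption added here so the snippet also runs outside a browser. In a page it is simply window. The function returns its text rather than logging it, which differs from the original hello().

```javascript
// Assumption: `root` stands in for the browser's `window` so the
// sketch also runs in environments without a window object.
var root = typeof window !== 'undefined' ? window : globalThis;

/**
 * @param {string} name
 * @return {string}
 */
function hello(name) {
  return 'hello ' + name;
}

// The key is a string literal, and string literals are not renamed,
// so external code can keep using the name "hello".
root['hello'] = hello;

// What an external, uncompiled script would do:
console.log(root['hello']('world')); // prints "hello world"
```

The internal name is free to shrink while the quoted key remains the stable public entry point.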

/** @expose */
myClass.prototype.myProperty = '';

This can be useful if we need to match the object to a remote system, for example when mapping JSON values from the server onto it, since the server knows nothing about the compilation.

The third option is to use externs. When we are using external libraries (like jQuery), the compiler will complain about every reference to them:
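A different technique with the same goal, not the annotation shown above, is to read server fields through quoted keys, since string literals are left alone by renaming. This is a minimal sketch. The payload, the User type, and the field names are all hypothetical.

```javascript
// Hypothetical payload; the field names are illustrative only.
var payload = '{"userName": "ada", "loginCount": 3}';

/** @constructor */
function User() {
  /** @type {string} */
  this.name = '';
  /** @type {number} */
  this.logins = 0;
}

/**
 * @param {!Object} json Parsed server response.
 * @return {!User}
 */
function userFromJson(json) {
  var user = new User();
  // Quoted keys match the wire format exactly. Dotted access such as
  // json.userName could be renamed by the compiler and stop matching.
  user.name = json['userName'];
  user.logins = json['loginCount'];
  return user;
}

var user = userFromJson(JSON.parse(payload));
console.log(user.name + ' ' + user.logins); // prints "ada 3"
```

The internal properties name and logins can still be renamed freely, because only the boundary with the server uses fixed strings.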

$('#div').hide();

ERROR - variable $ is undeclared

To solve this issue, we simply need to download the jQuery extern file (link below) and use it as an extern. Note: in principle the actual library can be used as an extern, but in jQuery’s case the annotations aren’t good enough, so the compiler throws a lot of warnings. Use the dedicated jQuery extern file instead:

java -jar closure-compiler-latest/compiler.jar --js jsdemo.js --externs externs.js --compilation_level=ADVANCED_OPTIMIZATIONS --warning_level VERBOSE

The code now compiles without warnings.
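When no ready-made extern file exists for a library, a small hand-written one can fill the gap. The following is a minimal sketch, assuming a hypothetical third-party library that exposes a global analytics object with a trackEvent method. Both names are invented for illustration. The file contains declarations only and would be passed with the --externs flag used above.

```javascript
// Hypothetical externs file: declarations only, never shipped to the browser.
// Names declared here are treated as external, so references to them
// are neither reported as undeclared nor renamed.

/** @const */
var analytics = {};

/**
 * @param {string} eventName
 * @param {Object=} opt_details Optional extra data for the event.
 */
analytics.trackEvent = function(eventName, opt_details) {};
```

The function bodies stay empty on purpose: only the names and the type annotations matter to the compiler.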

Tree shaking, dead code removal

One of the really interesting things about Closure Compiler is that, like any reasonable compiler, it simply drops code that we never call. This process is called tree shaking or dead code removal. The basic idea is that the compiler builds a graph of the code dependencies, and anything that is not part of this graph is considered dead and is removed, or rather is never added to the output.

The following code:

/**
 * @constructor
 */
var Hello = function(){}

/**
 * @param {string} name
 */
Hello.prototype.hello = function(name){
 window.console.log("hello " + name);
}

var h = new Hello();

compiles to an empty string. And that is a good thing: the compiler recognized that hello is never called and that the constructor is empty, so it’s safe to remove the whole block. With this approach, the code we compile for each individual web page can be significantly smaller than the whole codebase compiled together. Maybe on our home page we don’t use alerts or our AJAX library, so those can go too.
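For contrast, here is a sketch of the same class where the method is actually reachable. It is a variation on the code above and not the original example: the method returns its text so the result can be observed at the call site. The exact compiled output depends on the compiler version and is not shown.

```javascript
/** @constructor */
var Hello = function() {};

/**
 * @param {string} name
 * @return {string}
 */
Hello.prototype.hello = function(name) {
  return 'hello ' + name;
};

var h = new Hello();

// The result is printed, which is an observable effect, so the logic
// behind this call is part of the dependency graph and cannot be
// discarded as dead code.
console.log(h.hello('world')); // prints "hello world"
```

Removing the last line makes everything unreachable again, which is the situation described above.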

Namespaces

Of course Closure Compiler is not without its faults either, and it can be a bit annoying sometimes, complaining about things that are not really problems.
Let’s say we are working in the “m” namespace, so every file starts with the namespace declaration:

var m = m || {};
m.myFunc = function(){…}

This code is perfectly valid and will not destroy m: if m already holds a value, it is simply assigned back to itself, and only an undefined (or otherwise falsy) m is replaced with a new object. However, Closure Compiler thinks this would wipe the m variable, so it gives the namespace warning:

namespace {name of object} should not be redefined if having

Unfortunately this warning cannot be suppressed using an annotation, so one way to work around it is to make the declaration look like a normal variable, not a namespace:

var m = m || new Object();

As the var x = x || {} form is a hard-wired namespace declaration for Closure, we can cheat and define the same thing using slightly different syntax. However, this gives us a duplicate declaration warning:

ERROR - Variable m first declared in …

Now, that’s an easy one: let’s just force the compiler to ignore it:

/** @suppress{duplicate} */
var m = m || new Object();

All good: we define the namespace if it was not defined, or just reuse it if it was already there, without a compiler warning.
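The runtime behaviour of this pattern can be sketched by simulating two files in one script. The member names first and second are hypothetical, and the file boundaries are only marked by comments.

```javascript
// --- file one (hypothetical) ---
/** @suppress{duplicate} */
var m = m || new Object();
m.first = function() { return 'first'; };

// --- file two (hypothetical) ---
// m already exists here, so the right-hand side evaluates to the
// existing object and nothing defined by file one is lost.
/** @suppress{duplicate} */
var m = m || new Object();
m.second = function() { return 'second'; };

console.log(m.first() + ' ' + m.second()); // prints "first second"
```

Because each file carries the same guarded declaration, the files can be loaded in any order without one wiping out the other's members.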

Get started

Writing JavaScript code is not just adding an alert box any more. Almost all web pages have a fairly large codebase, yet we neither take JavaScript unit testing seriously nor compile the code to make sure it looks reasonable at all.
Turning on ADVANCED_OPTIMIZATIONS in Closure Compiler might not be for everyone as a first step, but running even the basic checks can yield surprising results. Give it a go!

Download Closure Compiler then check How to annotate JavaScript code for the compiler. When using external libraries, make sure you download the Closure Externs for them.
