How and why hoisting works in JavaScript


Most of the decent front-end developers I have met know the basic principles of hoisting: how variable declarations are handled by JavaScript engines, and why you may run into what looks like a quirk to a non-JS developer. But fewer know why it works that way and what happens under the hood. So I’ll try to give a basic explanation of these elements.

What is hoisting

Just in case you’re not familiar with the principle, here are a few code examples to show it in action. Let’s play with the variable foo and log it in the console. You can type it in your own console to get a better grasp on the concept.

console.log( foo );

That one is easy: foo is not declared anywhere, so we get a reference error: ReferenceError: foo is not defined.

var foo;
console.log( foo );
var bar = 1;
console.log( bar );

That one is easy too: foo is declared but no value has been assigned, so its value is undefined. bar is declared and a value is assigned, so its value is 1.

console.log( foo );
console.log( bar );
var bar = 1;
var foo;

Now, that one is tricky. You might expect a pair of reference errors, because I try to log the variables before declaring them. Except that does not happen. It logs undefined for both foo and bar.

var foo = 1;
(function() {
    console.log( foo );
    var foo = 10;
})();

This one is the classic hoisting example. Again, it won’t log 1 or 10, but undefined.

So what happened in the last two examples? The JavaScript engine behaves exactly as if it had rewritten your code a little bit. The last two chunks of code are equivalent to these two:

var foo, bar;
console.log( foo );
console.log( bar );
bar = 1;

/* ***** */

var foo;
foo = 1;
(function() {
    var foo;
    console.log( foo );
    foo = 10;
})();

This is hoisting. When compilation occurs, the JavaScript engine seems to hoist all variable declarations to the top of the execution context, and then it executes the code.

Wait, did you just read compilation? Yes you did. We’ll see that later.

A quirk? Nope, it’s the spec.

JavaScript is an implementation of the ECMAScript® specification, currently in its fifth (dot 1) version (the glorious sixth is coming!). And the beginning of section 10.5, Declaration Binding Instantiation, is clear:

Every execution context has an associated VariableEnvironment. Variables and functions declared in ECMAScript code evaluated in an execution context are added as bindings in that VariableEnvironment’s Environment Record. For function code, parameters are also added as bindings to that Environment Record.

Which Environment Record is used to bind a declaration and its kind depends upon the type of ECMAScript code executed by the execution context, but the remainder of the behaviour is generic. On entering an execution context, bindings are created in the VariableEnvironment as follows using the caller provided code and, if it is function code, argument List args:

So, when the JS engine enters an execution context (a function, for example), it first creates the bindings and then executes the instructions. This can mean a few things for the variables it encounters in this context:

If the engine finds a variable declaration, it creates a new binding in the current context.

var foo; // creates a binding

If the engine cannot find a variable in the bindings of the current context, it looks for it in the parent context (and the parent of the parent, recursively) until it finds it. When it finds a binding, it uses that one: the name in the current context refers to the binding of the parent execution context.

var foo;
(function(){
    foo = 1;    // refers to the foo binding declared in the parent context
})();
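
The lookup described above can be observed from the outside. This is only a small sketch, and scopeChainDemo is a made-up name for the demo: the inner function declares nothing, so its assignment lands on the foo of the enclosing function.

```javascript
// scopeChainDemo is a hypothetical name used only for this illustration.
function scopeChainDemo() {
    var foo;                 // binding created in scopeChainDemo's context
    (function () {
        foo = 1;             // no local declaration: resolves to the outer foo
    })();
    return foo;              // the outer binding now holds the assigned value
}

console.log(scopeChainDemo()); // 1
```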

For this first step (binding instantiation), the engine may adopt multiple strategies. It could first scan the code for variable declarations (var foo = ...), then re-scan with this first set of bindings to resolve the variables that are not declared in the current scope. Or it could check the binding for each variable in the context as it goes and force the declaration each time (that would mean only one pass, but it would probably kill existing bindings sometimes).

It’s hard to tell for sure with closed source engines. So the best practice is to always declare all your variables at the top of your execution context.

var foo;
(function(){
    foo = 1;    // does nothing for this step
    var foo;    // creates a local binding that shadows the parent foo, set to undefined
    foo = 2;    // does nothing for this step
})();
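
To see what that snippet does at run time, here is a hedged variation that records the values instead of discarding them. The names shadowingDemo and seenInside are invented for the example. Because the inner var foo is hoisted, both assignments hit the local binding and the outer foo is never touched.

```javascript
// shadowingDemo and seenInside are hypothetical names for this illustration.
function shadowingDemo() {
    var foo;                     // outer binding, never assigned
    var seenInside = [];
    (function () {
        seenInside.push(foo);    // local foo already exists, still undefined
        foo = 1;                 // assigns the local foo, not the outer one
        seenInside.push(foo);
        var foo;                 // hoisted declaration: no effect when reached
        seenInside.push(foo);    // value is kept, the declaration does not reset it
        foo = 2;
        seenInside.push(foo);
    })();
    return { outer: foo, inner: seenInside };
}

var shadowResult = shadowingDemo();
console.log(shadowResult.outer); // undefined
console.log(shadowResult.inner); // [ undefined, 1, 1, 2 ]
```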

The current execution context holds a list of all the bindings, the VariableEnvironment.
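
The spec excerpt quoted earlier says that variables, functions and parameters are all bound on entering the context. A quick way to check this, sketched here with made-up names (entryBindings, param, helper, local), is to inspect them on the very first line of a function, before any declaration has been reached.

```javascript
// entryBindings, param, helper and local are hypothetical names for this demo.
function entryBindings(param) {
    // Taken before any declaration below has been reached in source order.
    var snapshot = {
        param: typeof param,     // parameters are bound on entry
        helper: typeof helper,   // function declarations are bound with their value
        local: typeof local      // var declarations are bound to undefined
    };
    var local = "assigned later";
    function helper() {}
    return snapshot;
}

console.log(entryBindings(42));
// { param: 'number', helper: 'function', local: 'undefined' }
```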

Technically, what we call the scope is the LexicalEnvironment, which is one of the three components of the execution context. But this is just the theory you will find in the spec. The scope is where the variables live, but it is often confused with the list of bindings or with the whole execution context.
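
To make the "create the bindings, then execute" model more concrete, here is a toy sketch of it. It is not how any real engine works: ToyEnvironment, runToyContext and the statement objects are all invented for the illustration, and the model only knows three kinds of statements.

```javascript
// Toy model only: ToyEnvironment and runToyContext are hypothetical names,
// and real engines do not work on statement lists like this.
function ToyEnvironment(parent) {
    this.bindings = {};          // name -> value, stands in for the Environment Record
    this.parent = parent || null;
}

// Walk the chain of environments until one of them owns the name.
ToyEnvironment.prototype.resolve = function (name) {
    var env = this;
    while (env !== null) {
        if (Object.prototype.hasOwnProperty.call(env.bindings, name)) {
            return env;
        }
        env = env.parent;
    }
    throw new ReferenceError(name + " is not defined");
};

function runToyContext(statements, parent) {
    var env = new ToyEnvironment(parent);
    var output = [];

    // Step 1: binding instantiation. Every declared name gets a binding
    // set to undefined before any statement runs.
    statements.forEach(function (s) {
        if (s.type === "var") {
            env.bindings[s.name] = undefined;
        }
    });

    // Step 2: execution, in source order. Declarations are skipped here.
    statements.forEach(function (s) {
        if (s.type === "assign") {
            env.resolve(s.name).bindings[s.name] = s.value;
        } else if (s.type === "log") {
            output.push(env.resolve(s.name).bindings[s.name]);
        }
    });
    return output;
}

// Toy equivalent of: console.log(foo); console.log(bar); var bar = 1; var foo;
var logged = runToyContext([
    { type: "log", name: "foo" },
    { type: "log", name: "bar" },
    { type: "var", name: "bar" },
    { type: "assign", name: "bar", value: 1 },
    { type: "var", name: "foo" }
]);
console.log(logged); // [ undefined, undefined ]
```

Splitting var bar = 1 into a declaration and an assignment is the same rewrite the article applies by hand in the "equivalent" snippets above.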

The JavaScript compiler

Back to the compiler mention. Yes, JavaScript is a “scripting language” (and a dynamic programming language). Yes, you send high-level code to your clients and not a compiled binary that can execute by itself. But still, do you think that var foo = 1; means anything to a CPU? Because it does not.

For what follows, I’ll talk about the way it works in Chromium (and Chrome) and its JS engine, V8, because both are open source and we can read and check the code. Other browsers and engines do not work exactly the same way internally, but they produce pretty much the same result, modulo speed.

If you explore the brilliant source code of V8, you can see that there is a version for each platform (arm, x64, ia32…), because you need a specific language and instruction set to address each platform. The instruction “allocate 128 bits of memory for my variable” won’t be written the same way for an arm64 CPU and an old x86 Pentium.

So yes, JavaScript has to be compiled. It just happens right before the execution on the client’s machine.
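
You cannot watch the compiler from inside a script, but you can at least see that the source is processed as a whole before any of it runs. In this small sketch (parseBeforeRun is a made-up name), the function body starts with a valid statement followed by a syntax error. The error is raised when the function is created, and the valid statement never executes. It says nothing about what V8 does internally, only that something reads the whole code first.

```javascript
// parseBeforeRun is a hypothetical helper name used only for this demo.
function parseBeforeRun() {
    var log = [];
    var error = null;
    try {
        // The first statement of the body is valid, the second is a syntax error.
        var broken = new Function("log", "log.push('ran'); var = 1;");
        broken(log);             // never reached: creation already failed
    } catch (e) {
        error = e;
    }
    return { log: log, error: error };
}

var parseResult = parseBeforeRun();
console.log(parseResult.log.length);                   // 0
console.log(parseResult.error instanceof SyntaxError); // true
```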

Compilation steps to hoist

When V8 enters the execution context of bar in the following code, a few steps occur.

var foo;
var bar = function(){
    foo = 1;
    var foo;
};
bar();

There are actually many other steps to optimize compilation. In V8, the compilation infrastructure is named Crankshaft. After the scope analysis, it generates a control flow graph in Hydrogen, which is the high-level, cross-platform representation of the AST for the V8 compiler. Then a dozen optimizations are performed on this code: things like inlining, static type inference, range analysis, canonicalization, dead code elimination… I won’t go too deep, because each optimization would deserve an entire article and they are way beyond what we really need to know as JavaScript developers. After this step, the code is translated into another, lower-level language that is closer to the processor and platform specific: Lithium. The final compiled machine code is generated from this Lithium version.

Conclusion

Hoisting is not a quirk; it is the way the language is supposed to work. But if you respect some basic best practices, like declaring all your variables at the beginning of your execution context, you should not run into any problems with it.