Reimplementing To Understand How JavaScript Works

By Marc-Andre Cournoyer

Nobody truly understands how the world works. It is too complex. Instead, we build models that we believe are close to how the real thing works. We then use those models to discover new things and expand our understanding of how the world works.

Following the same line of reasoning, I want to show you a model of how JavaScript works inside. It will not be the real thing, because the real thing is too complex to learn in a short article. But it will help you create a model in your mind of how it works. It will help you understand how objects, variable scopes and functions work, so that you no longer program by blindly applying rules, but by understanding how things work inside.

Every language is separated into at least three parts. The parser transforms your program into something the language can understand. The runtime is the living world in which your program executes. Finally, the interpreter puts the two together, modifying the runtime by interpreting the output of the parser.

A parser defines how the language looks (its syntax). The runtime defines how it behaves (how we will create objects, for example). Since I want to help you create a mental model of how JavaScript behaves with scopes and such, I will be talking about the runtime. If you're interested in learning about the full picture of how a language works, take a look at my book and my online class. If you enjoy this article, I'm sure you'll enjoy them.

The Runtime

The runtime is the environment in which the language executes. Think of it as a box with which we'll interact using a specific API. In this section, we're defining this API to build and interact with the runtime.

We need to build representations for everything we'll have access to inside the language. The first one being: objects!

Object

Objects have properties. One missing piece of our implementation is the prototype. To keep things simple, we will not have any form of inheritance. All our objects will be able to do is get and set the values stored in their properties.

Objects will also be able to wrap a real JavaScript value, like a string or a number. We'll use this to represent strings and numbers inside our runtime. Every object living inside our program will be an instance of JsObject.

function JsObject(value) {
  this.properties = {};
  this.value = value; // A JavaScript string or number.
}
runtime.JsObject = JsObject;

JsObject.prototype.get = function(name) {
  return this.properties[name];
}
JsObject.prototype.set = function(name, value) {
  return this.properties[name] = value;
}

So now, to create a number in our runtime, we write new JsObject(4); for a string, new JsObject("hi"); and for a plain object, new JsObject(). We set a property like so: object.set('name', new JsObject('Marc')). The value is wrapped too, so that only runtime objects live inside the runtime.
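
Here is a small sketch of those calls in action. It repeats the JsObject definition so it can run on its own. The variable names (four, person) are mine, and wrapping the property value in a JsObject is an assumption that follows the rule, stated later in the article, of never injecting raw JavaScript values into the runtime.

```javascript
// Copy of JsObject from above, repeated so this snippet is self-contained.
function JsObject(value) {
  this.properties = {};
  this.value = value;
}
JsObject.prototype.get = function(name) { return this.properties[name]; };
JsObject.prototype.set = function(name, value) { return this.properties[name] = value; };

var four = new JsObject(4);   // a number living in our runtime
var person = new JsObject();  // a plain object (no wrapped value)

// Property values are runtime objects too.
person.set('name', new JsObject('Marc'));

console.log(four.value);               // 4
console.log(person.get('name').value); // Marc
console.log(person.get('age'));        // undefined (a raw JavaScript undefined)
```

Note that reading a missing property gives back the host's undefined. The interpreter, not JsObject, is responsible for turning that into a runtime value.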

Scopes

The most confusing part of JavaScript is the way it handles the scope of variables. When is a variable defined as a global? What's the value of this? All of this can be implemented in a very simple and straightforward fashion.

A scope encapsulates the context of execution, the local variables and the value of this, inside a function or at the root of your program.

Scopes also have a parent scope. The chain of parents will go down to the root scope, where you define your global variables.

function JsScope(_this, parent) {
  this.locals = {};     // local variables
  this.this = _this;    // value of `this`
  this.parent = parent; // parent scope
  this.root = !parent;  // is it the root/global scope?
}
runtime.JsScope = JsScope;

JsScope.prototype.hasLocal = function(name) {
  return this.locals.hasOwnProperty(name);
}

Getting the value of a variable is done by looking first in the current scope, then recursively in each parent until we reach the root scope. This is how you get access to variables defined in parent functions. It is also why declaring a variable in a function shadows variables of the same name in parent functions and in the global scope.

JsScope.prototype.get = function(name) {
  if (this.hasLocal(name)) return this.locals[name]; // Look in current scope
  if (this.parent) return this.parent.get(name); // Look in parent scope
}
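
To see the lookup walk the chain, here is a minimal sketch with three nested scopes. It repeats JsScope so it runs standalone. The scope names (rootScope, outer, inner) and the plain string values are only for illustration; in the real runtime the values would be JsObject instances.

```javascript
// Copy of JsScope, hasLocal and get from above.
function JsScope(_this, parent) {
  this.locals = {};
  this.this = _this;
  this.parent = parent;
  this.root = !parent;
}
JsScope.prototype.hasLocal = function(name) {
  return this.locals.hasOwnProperty(name);
};
JsScope.prototype.get = function(name) {
  if (this.hasLocal(name)) return this.locals[name]; // Look in current scope
  if (this.parent) return this.parent.get(name);     // Look in parent scope
};

var rootScope = new JsScope();            // no parent: this is the root scope
var outer = new JsScope(null, rootScope); // scope of an outer function
var inner = new JsScope(null, outer);     // scope of a function nested in it

rootScope.locals.a = 'a from root';
outer.locals.a = 'a from outer'; // shadows the root `a`
outer.locals.b = 'b from outer';

console.log(inner.get('a'));     // a from outer
console.log(inner.get('b'));     // b from outer
console.log(rootScope.get('a')); // a from root
console.log(inner.get('nope'));  // undefined
```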

Setting the value of a variable follows the same logic as getting its value. We search for the scope where the variable is defined and change its value there. If the variable was not defined in any parent scope, we'll end up in the root scope, which has the effect of declaring it as a global variable.

This is why, in JavaScript, if you assign a value to a variable without declaring it first (using var), it will search in parent scopes until it reaches the root scope and declare it there, thus declaring it as a global variable.

JsScope.prototype.set = function(name, value) {
  if (this.root || this.hasLocal(name)) return this.locals[name] = value;
  return this.parent.set(name, value);
}
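
The following sketch exercises both branches of set: updating a declared local, and falling through to the root scope for a name that was never declared. It repeats the scope code so it runs on its own. The names fnScope, declared and leaked are hypothetical, and plain numbers stand in for runtime objects.

```javascript
// Copy of JsScope, hasLocal and set from above.
function JsScope(_this, parent) {
  this.locals = {};
  this.this = _this;
  this.parent = parent;
  this.root = !parent;
}
JsScope.prototype.hasLocal = function(name) {
  return this.locals.hasOwnProperty(name);
};
JsScope.prototype.set = function(name, value) {
  if (this.root || this.hasLocal(name)) return this.locals[name] = value;
  return this.parent.set(name, value);
};

var rootScope = new JsScope();
var fnScope = new JsScope(null, rootScope); // scope of some function

fnScope.locals.declared = 1; // what `var declared = 1;` does inside the function
fnScope.set('declared', 2);  // `declared = 2;` updates the local in place
fnScope.set('leaked', 3);    // `leaked = 3;` was never declared anywhere

console.log(fnScope.locals.declared);     // 2
console.log(fnScope.hasLocal('leaked'));  // false
console.log(rootScope.hasLocal('leaked')); // true: it became a global
```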

Scopes will be created each time we enter a function.

Functions

Functions encapsulate a body (block of code) that we can execute (eval), and also parameters: function (parameters) { body }.

function JsFunction(parameters, body) {
  JsObject.call(this);
  this.body = body;
  this.parameters = parameters;
}
util.inherits(JsFunction, JsObject); // Functions are objects.
runtime.JsFunction = JsFunction;

When the function is called, a new scope is created so that the function will have its own set of local variables and its own value for this.

The function's body is a tree of nodes. We'll see in the next section what a node and a tree of nodes are.

To execute the function, we eval its body.

JsFunction.prototype.call = function(object, scope, args) {
  var functionScope = new JsScope(object, scope); // this = object, parent scope = scope

  // We assign arguments to local variables.
  // That's how you get access to them.
  for (var i = 0; i < this.parameters.length; i++) {
    functionScope.locals[this.parameters[i]] = args[i];
  }

  this.body.eval(functionScope);
}
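
To check what call sets up, here is a sketch that swaps the body for a stub. The stub (stubBody) is hypothetical: it is just an object with an eval method that remembers the scope it receives. I also replace util.inherits with Object.create so the snippet does not depend on Node's util module; that substitution is mine, not the article's.

```javascript
function JsObject(value) {
  this.properties = {};
  this.value = value;
}

function JsScope(_this, parent) {
  this.locals = {};
  this.this = _this;
  this.parent = parent;
  this.root = !parent;
}

function JsFunction(parameters, body) {
  JsObject.call(this);
  this.body = body;
  this.parameters = parameters;
}
// Stand-in for util.inherits(JsFunction, JsObject).
JsFunction.prototype = Object.create(JsObject.prototype);
JsFunction.prototype.constructor = JsFunction;

JsFunction.prototype.call = function(object, scope, args) {
  var functionScope = new JsScope(object, scope); // this = object, parent scope = scope
  for (var i = 0; i < this.parameters.length; i++) {
    functionScope.locals[this.parameters[i]] = args[i];
  }
  this.body.eval(functionScope);
};

// Hypothetical body node: records the scope it is evaluated in.
var seen = null;
var stubBody = { eval: function(scope) { seen = scope; } };

var rootScope = new JsScope();
var receiver = new JsObject();
var fn = new JsFunction(['x', 'y'], stubBody);

fn.call(receiver, rootScope, [new JsObject(1), new JsObject(2)]);

console.log(seen.this === receiver);    // true
console.log(seen.parent === rootScope); // true
console.log(seen.locals.x.value);       // 1
console.log(seen.locals.y.value);       // 2
```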

Primitives

We map the primitives to their JavaScript counterparts (in their value property). Note that true and false are objects, but null and undefined are not.

runtime.true = new JsObject(true);
runtime.false = new JsObject(false);
runtime.null = { value: null };
runtime.undefined = { value: undefined };

The root object

The only missing piece of the runtime at this point is the root (global) object.

It is the scope of execution at the root of your program and also the root or window object that you get access to.

Thus, we create it as a scope that also acts as an object (has properties).

var root = runtime.root = new JsScope();
root.this = root; // this == root when inside the root scope.

Properties of the root/global scope are also the local variables. That's why when you use var a = 1; in the root scope, it will also assign the value to root.a.

root.properties = root.locals;
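
Because properties and locals are the very same object, a write through one is visible through the other. A minimal sketch of that aliasing, with JsScope repeated so it runs alone and plain { value: ... } objects standing in for runtime values:

```javascript
function JsScope(_this, parent) {
  this.locals = {};
  this.this = _this;
  this.parent = parent;
  this.root = !parent;
}
JsScope.prototype.hasLocal = function(name) {
  return this.locals.hasOwnProperty(name);
};

var root = new JsScope();
root.this = root;
root.properties = root.locals; // one object, two names

// What `var a = 1;` does at the root of a program:
root.locals.a = { value: 1 };
console.log(root.properties.a.value); // 1

// What `root.b = 2;` would do, going through properties instead:
root.properties.b = { value: 2 };
console.log(root.hasLocal('b')); // true
```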

Here we'd normally define all the fancy things, like global functions and objects, that you have access to inside your JavaScript programs. But we're keeping it simple and only defining root and the console.log function.

root.locals['root'] = root;
root.locals['console'] = new JsObject();

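We can store real JavaScript functions in our runtime objects, since they have a call method too (Function.prototype.call), just as our functions have JsFunction.prototype.call. Calling theFunction.call(object, scope, args) on a real function invokes it with this set to object and with scope and args as its two arguments. So real functions will behave much like a JsFunction, but without the need to create a scope.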

root.locals['console'].properties['log'] = function(scope, args) {
  console.log.apply(console, args.map(function(arg) { return arg.value }));
}
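
To make that calling convention concrete, here is a sketch using only the standard Function.prototype.call. The function join, the receiver and the fakeScope object are all hypothetical stand-ins. The point is only that the same three-argument call used for a JsFunction also works on a native function.

```javascript
// A native function written against the (scope, args) convention used above.
var join = function(scope, args) {
  return {
    self: this, // whatever was passed as the first argument to .call
    scope: scope,
    text: args.map(function(arg) { return arg.value; }).join(' ')
  };
};

var receiver = { name: 'receiver' };
var fakeScope = { locals: {} };
var wrappedArgs = [{ value: 'hi' }, { value: 42 }]; // stand-ins for JsObject instances

// Same shape as theFunction.call(object, scope, args) in the interpreter.
var result = join.call(receiver, fakeScope, wrappedArgs);

console.log(result.self === receiver);   // true
console.log(result.scope === fakeScope); // true
console.log(result.text);                // hi 42
```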

Now that's fewer than 60 lines of code and we have enough of a runtime to execute a lot of JavaScript!

The Nodes

Before we put the runtime into action, we must talk briefly about the internal representation of our programs: how our language understands the code that we feed it.

A program will be represented as a tree of nodes. One node at the top with many children. The parser will output this tree of nodes after reading your program.

The nodes we're using are simple objects that hold values, for example:

nodes.BlockNode = function(nodes) { this.nodes = nodes; }

nodes.StringNode = function(value) { this.value = value; }

nodes.DeclareVariableNode = function(name, valueNode) {
  this.name = name;
  this.valueNode = valueNode;
}

nodes.CallNode = function(objectNode, name, argumentNodes) {
  this.objectNode = objectNode;
  this.name = name;
  this.argumentNodes = argumentNodes;
}

// [...]

The following is the tree of nodes for var s = "hi!"; console.log(s);. It is the equivalent of what the parser would produce.

var treeOfNodes = new nodes.BlockNode([

  // `var s = "hi!";`
  new nodes.DeclareVariableNode(
    "s", // variable name
    new nodes.StringNode("hi!") // value
  ),

  // `console.log(s);`
  new nodes.CallNode(
    new nodes.GetVariableNode("console"), // receiving object
    "log", // function name
    [ // arguments
      new nodes.GetVariableNode("s")
    ]
  )

]);

This tree is the internal representation of our program. It will not execute our program. To execute it, we must evaluate it. That's the job of the interpreter.

The Interpreter

The interpreter part of our language is where we'll evaluate the nodes to execute the program. Thus the name eval for the function we'll be defining here.

We'll add an eval function to each node produced by the parser. Each node will know how to evaluate itself. For example, a StringNode will know how to turn itself into a real string inside our runtime.

One thing our interpreter does not support is variable declaration hoisting. In JavaScript, variable declarations are moved (hoisted) invisibly to the top of their scope. Supporting this would require two passes over the node tree by the interpreter: a first one for declaring variables and a second one for evaluating.
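
For reference, this is what hoisting looks like in real JavaScript (not in our little implementation). The function name hoistingDemo is just for the example.

```javascript
function hoistingDemo() {
  // The declaration of `hoisted` has already been moved to the top of the
  // function, so the variable exists here, but it has no value yet.
  var before = hoisted;

  var hoisted = 'set';

  return [before, hoisted];
}

var result = hoistingDemo();
console.log(result[0]); // undefined
console.log(result[1]); // set
```

In our interpreter, the equivalent program would throw "hoisted is not defined" on the first read, because DeclareVariableNode only creates the variable when it is evaluated.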

The top node of a tree will always be of type BlockNode. Its job is to spread the call to eval to each of its children.

nodes.BlockNode.prototype.eval = function(scope) {
  this.nodes.forEach(function(node) {
    node.eval(scope);
  });
}

Literals are pretty easy to eval. Simply return the runtime value.

nodes.ThisNode.prototype.eval      = function(scope) { return scope.this; }
nodes.TrueNode.prototype.eval      = function(scope) { return runtime.true; }
nodes.FalseNode.prototype.eval     = function(scope) { return runtime.false; }
nodes.NullNode.prototype.eval      = function(scope) { return runtime.null; }
nodes.UndefinedNode.prototype.eval = function(scope) { return runtime.undefined; }

Creating various objects is done by instantiating JsObject.

nodes.ObjectNode.prototype.eval = function(scope) { return new runtime.JsObject(); }
nodes.StringNode.prototype.eval = function(scope) { return new runtime.JsObject(this.value); }
nodes.NumberNode.prototype.eval = function(scope) { return new runtime.JsObject(this.value); }

Variables are stored in the current scope. All we need to do to interpret the variable nodes is get and set values in the scope.

nodes.DeclareVariableNode.prototype.eval = function(scope) {
  return scope.locals[this.name] = this.valueNode ? this.valueNode.eval(scope) : runtime.undefined;
}

nodes.GetVariableNode.prototype.eval = function(scope) {
  var value = scope.get(this.name);
  if (typeof value === "undefined") throw this.name + " is not defined";
  return value;
}

nodes.SetVariableNode.prototype.eval = function(scope) {
  return scope.set(this.name, this.valueNode.eval(scope));
}

Getting and setting properties is handled by the two following nodes. One gotcha to note here. We want to make sure to not inject real JavaScript values, such as a string, number, true, null or undefined inside the runtime. Instead, we want to always return values created for our runtime. For example here, we make sure to return runtime.undefined and not undefined if the property is not set.

nodes.GetPropertyNode.prototype.eval = function(scope) {
  return this.objectNode.eval(scope).get(this.name) || runtime.undefined;
}

nodes.SetPropertyNode.prototype.eval = function(scope) {
  return this.objectNode.eval(scope).set(this.name, this.valueNode.eval(scope));
}

Creating a function is just a matter of instantiating JsFunction.

nodes.FunctionNode.prototype.eval = function(scope) {
  return new runtime.JsFunction(this.parameters, this.bodyNode);
}

Calling a function can take two forms:

  1. On an object: object.name(...). this will be set to object.
  2. On a variable: name(...). this will be set to the root object.

Here's the final touch on our interpreter.

nodes.CallNode.prototype.eval = function(scope) {
  if (this.objectNode) { // object.name(...)
    var object = this.objectNode.eval(scope);
    var theFunction = object.get(this.name);
  } else { // name()
    var object = runtime.root;
    var theFunction = scope.get(this.name);
  }

  var args = this.argumentNodes.map(function(arg) { return arg.eval(scope) });

  if (!theFunction || !theFunction.call) throw this.name + " is not a function";

  return theFunction.call(object, scope, args) || runtime.undefined;
}

It's Alive!

To put it all together now, we pass some code to the parser, which will return a tree of nodes, like we've seen before.

var node = parser.parse(code);

Finally, we start the evaluation of our program on the top of the tree, passing the root (global) object as the scope in which to start its execution.

node.eval(runtime.root);
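
Since the parser is not shown in this article, here is a self-contained sketch that skips it and evaluates the hand-built tree from the Nodes section instead. It condenses only the pieces of the runtime and interpreter that this one program needs. A few things are my own assumptions: the constructor for GetVariableNode (the article elides it), the printed array that records output so it can be inspected, and throwing Error objects rather than strings.

```javascript
// --- Runtime (condensed) ---
var runtime = {};
function JsObject(value) { this.properties = {}; this.value = value; }
JsObject.prototype.get = function(name) { return this.properties[name]; };
runtime.undefined = { value: undefined };

function JsScope(_this, parent) {
  this.locals = {};
  this.this = _this;
  this.parent = parent;
  this.root = !parent;
}
JsScope.prototype.hasLocal = function(name) { return this.locals.hasOwnProperty(name); };
JsScope.prototype.get = function(name) {
  if (this.hasLocal(name)) return this.locals[name];
  if (this.parent) return this.parent.get(name);
};

var root = runtime.root = new JsScope();
root.this = root;
root.properties = root.locals;
root.locals['console'] = new JsObject();

var printed = []; // Not in the article: keeps a copy of everything logged.
root.locals['console'].properties['log'] = function(scope, args) {
  var values = args.map(function(arg) { return arg.value; });
  printed.push(values);
  console.log.apply(console, values);
};

// --- Nodes ---
var nodes = {};
nodes.BlockNode = function(children) { this.nodes = children; };
nodes.StringNode = function(value) { this.value = value; };
nodes.GetVariableNode = function(name) { this.name = name; }; // assumed shape
nodes.DeclareVariableNode = function(name, valueNode) {
  this.name = name;
  this.valueNode = valueNode;
};
nodes.CallNode = function(objectNode, name, argumentNodes) {
  this.objectNode = objectNode;
  this.name = name;
  this.argumentNodes = argumentNodes;
};

// --- Interpreter ---
nodes.BlockNode.prototype.eval = function(scope) {
  this.nodes.forEach(function(node) { node.eval(scope); });
};
nodes.StringNode.prototype.eval = function(scope) {
  return new JsObject(this.value);
};
nodes.DeclareVariableNode.prototype.eval = function(scope) {
  return scope.locals[this.name] = this.valueNode ? this.valueNode.eval(scope) : runtime.undefined;
};
nodes.GetVariableNode.prototype.eval = function(scope) {
  var value = scope.get(this.name);
  if (typeof value === "undefined") throw new Error(this.name + " is not defined");
  return value;
};
nodes.CallNode.prototype.eval = function(scope) {
  var object = this.objectNode ? this.objectNode.eval(scope) : runtime.root;
  var theFunction = this.objectNode ? object.get(this.name) : scope.get(this.name);
  var args = this.argumentNodes.map(function(arg) { return arg.eval(scope); });
  if (!theFunction || !theFunction.call) throw new Error(this.name + " is not a function");
  return theFunction.call(object, scope, args) || runtime.undefined;
};

// --- The program: var s = "hi!"; console.log(s); ---
var tree = new nodes.BlockNode([
  new nodes.DeclareVariableNode("s", new nodes.StringNode("hi!")),
  new nodes.CallNode(new nodes.GetVariableNode("console"), "log", [
    new nodes.GetVariableNode("s")
  ])
]);

tree.eval(runtime.root); // prints: hi!
```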

That's it!

Our own little tiny implementation of JavaScript is now able to execute simple programs such as:

// Create an object
var a = {};
a.x = "object";
var x = "global";

// Create a function referencing property `x`.
var f = function() { console.log(this.x) }
a.f = f;

a.f(); // => "object"
f(); // => "global"

And more complex ones involving nested functions like this:

var top = "top";

var f1 = function() {
  var f1Var = "f1 var";
  var f2 = function() {
    top = "top overridden from nested function";
    global = "global defined from function";

    f1Var = "f1 var modified from f2";
  }
  f2();
}

f1();

All because of our simple 60-line runtime. How awesome is that?

Thanks!

I hope you enjoyed this! The goal, once again, was to help you create a mental model of how the JavaScript runtime works. Once you do, you won't ever be confused again about this, variable scoping and functions.

If you enjoyed my article, you'll also enjoy my book Create Your Own Programming Language and my class The Programming Language Masterclass. Both have helped thousands of developers create their first programming language, one of the first ones being CoffeeScript.

In the bonus video of the class (Live Plus package only), you'll get the full source code used in this article and I'll show you how the parser is implemented and complete the runtime by adding support for new, return, operators, inheritance, variable declaration hoisting and more.

Thanks again for reading and happy coding!

- Marc
