It’s JavaScript, surprisingly.
You’re an experienced application developer and you’re looking at JavaScript for the first time. You grok C, or one of its relatives like C# or Java, and it looks kind of the same. There’s all those doubled-up operators, code blocks in curly braces, array subscripts in square brackets, even the signature ternary operator, all present and correct. It’s even got Java in the name. This is going to be a breeze, right?
Well, maybe not. If you start out thinking that you are just bringing your Java or C# or C(++) skills to a different sweatshop then you will soon find that almost everything you need to write robust and maintainable code appears to be missing. How to create great software with such a toy? Well, in this case appearances are deceiving. There is a surprisingly serious programming language hiding in there. It’s just not the one that it looks like.
So, here are some of the bigger surprises packaged up in a convenient list.
Depending on which half of the JavaScript moniker you were paying most
attention to, you might be surprised to discover that its variables are not
statically typed. That is to say that you don’t fix a type to a variable when
you declare it, and you are free to assign it values of different types at
different points in your code’s execution. Nothing like Java, in fact, but
very script-like. This makes perfect sense, if you think about it. What
characterises a scripting language is that the source code is executed
directly. There are no separate compilation and runtime phases, so the very
idea of compile-time errors has no meaning. This means that the kind of blooper
that static typing is supposed to prevent can’t be picked up until run time,
when it’s already too late. So we might as well save ourselves the bother of
having to state, and stick to, the type of variables when we
declare them. Instead, all JavaScript variables are declared using the
keyword var rather than the name of a type.
The fact that variables are untyped certainly does not mean that JavaScript has no types. Variables are symbolic names that are associated with values, and JavaScript values always have a type. Here is an exhaustive list of JavaScript types (as of ECMAScript 5):
undefined
null
boolean
number
string
object
It’s not a very long list, and the first two look more like values than
types, though they really aren’t. You can slice this list up in other ways
too. The first five can be considered as simple types, even string, because
they are all immutable. The last one, object, is almost embarrassingly mutable
unless you prohibit it by, for example, bringing down the wrath of
Object.freeze().
Object values can be thought of as reference types, if that helps. Should you
wish to return something from a function via a reference parameter then using
an object is your only choice. In practice, however, this is rarely necessary.
So that you can discover the type of a value at run time, JavaScript provides
a typeof
operator. Unfortunately, it sometimes lies. In
particular, it will tell you that a null
value is an object,
which kind of makes sense, but is wrong, and that NaN
(Not A
Number) is a number, which is absurd, but technically true.
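As a rough illustration of the points above, the sketch below reassigns one variable across types and then asks typeof about a sample value of each type. The variable names are invented for the example.

```javascript
// One variable, several types over its lifetime.
var flexible = 42;
var firstType = typeof flexible;    // "number"
flexible = "forty-two";
var secondType = typeof flexible;   // "string"

// typeof applied to a sample value of each type, including the two oddities.
var typeSamples = [undefined, null, true, 42, "hello", {}, NaN];
var typeNames = [];
for (var k = 0; k < typeSamples.length; ++k) {
    typeNames.push(typeof typeSamples[k]);
}
console.log(typeNames.join(", "));
// undefined, object, boolean, number, string, object, number
```

Note the second entry (null reported as an object) and the last (NaN reported as a number).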
The type object
might seem a bit generic, and it is. JavaScript
also intrinsically provides some specialised refinements of the plain object,
adapted to particular roles. These are:
Date
Array
Function
RegExp
Each has its own superpower, but in other respects they all act exactly the
same as any other kind of object, and the typeof
operator
identifies all bar Function
as such. Even there the mask slips
if you interrogate it with instanceof
.
(function(){}) instanceof Object // true
With the exception of Date
, each has a unique literal syntax
that you can use to create values without having to call a constructor. More
of that later.
JavaScript has ==
and !=
operators that you can use
to compare two values.
They don’t do what you think. The correct operators for testing equality and
inequality are ===
and !==
.
The shorter versions quite often give the same result as the longer ones, but
sometimes they don’t. When comparing two things that have both the same type
and the same value, the result is the same. If the two things have different
types, then ==
and !=
will perform type coercion to
try to find out if they are equivalent.
This is bad for several reasons. For a start, usually you actually want to
know about equality, not equivalence. Then there is the performance overhead
of that type coercion. And lastly, the coercion rules are difficult to
remember and can produce unexpected results. If you expected the string
"\t\n"
to be equivalent to the number 0
then clearly
not much surprises you.
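A few comparisons, sketched side by side, show where the short and long operators part company. The variable names are only there to hold the results.

```javascript
// Same operands, loose versus strict comparison.
var looseBlank = ("\t\n" == 0);         // true: the string is coerced to the number 0
var strictBlank = ("\t\n" === 0);       // false: different types, no coercion

var looseDigits = ("42" == 42);         // true: the string is coerced to a number
var strictDigits = ("42" === 42);       // false

var looseNull = (null == undefined);    // true: a special case in the coercion rules
var strictNull = (null === undefined);  // false: two distinct types

console.log(looseBlank, strictBlank, looseDigits, strictDigits, looseNull, strictNull);
// true false true false true false
```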
Getting ahead of ourselves, the boolean
type can be thought of
as a very short enumeration comprising the two values false
and
true
. That’s pretty simple, but not quite as austere as
undefined
and null
, each of which is a set
containing just a single eponymous member.
When you declare a variable without initialising it, its value is
undefined
. Some less helpful languages will give you garbage
values for uninitialised variables, but JavaScript variables always have a
deterministic state. The language does allow some ambiguity, however.
Accessing a non-existent property of an object gives the value undefined,
which is literally true, but so does accessing an existent property that
happens to have been assigned the value undefined. To make matters worse,
undefined is not a keyword, but a global variable that hasn’t been
initialised. Your mischievous, malevolent or intellectually challenged
co-worker is entirely free to assign a value to it. (ECMAScript 5 made the
global read-only, but a local variable or parameter of the same name can still
shadow it.) If you want to be absolutely sure that this has not happened, you
can use the void operator instead. That void is an operator, not a type, may
surprise you. That it simply takes an operand (of any type) and ignores it is
frankly baffling. Nevertheless, the expression void undefined is guaranteed to
be definitely undefined, as is void "hi there" or, more succinctly, void 0.
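One way the name undefined can end up meaning something else is for a parameter or local variable to shadow it. The sketch below uses a made-up function, shadowed(), to show that void 0 is unaffected.

```javascript
// A parameter named undefined shadows the global of the same name.
function shadowed(undefined) {
    return {
        looksUndefined: undefined,   // whatever the caller passed in
        reallyUndefined: void 0      // always the genuine undefined value
    };
}

var shadowResult = shadowed("surprise");
console.log(shadowResult.looksUndefined);           // surprise
console.log(typeof shadowResult.reallyUndefined);   // undefined
```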
Assign the value null
to a variable when you want to signify that
it does not contain meaningful data rather than that you have forgotten to
initialise it. Because JavaScript variables are not statically typed, all
types are effectively nullable.
Apart from the two minimalist types undefined
and
null
, values of any type can indulge in
object-like behaviour, even when they are not really objects. String values
have a length property and inherit a number of useful methods such as
split
and slice
. More surprisingly, numbers and
boolean values have methods too. The expression true.toString()
is entirely valid, as is 65228.062255859375.toString(16)
.
The usual way to initialise a variable to a simple value is with the appropriate literal syntax.
var b = true;
var n = 42;
var s = "hello, world";
But it’s not the only way. You could use constructors instead.
var b = new Boolean(true);
var n = new Number(42);
var s = new String("hello, world");
Normally you wouldn’t do this, because the results actually are objects rather than simple values that happen to behave like objects. The fact is that JavaScript will silently convert simple values to their object equivalents as required, so you don’t need to bother.
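To see why the constructor forms are best avoided, compare a simple value with its object wrapper. This is only a sketch, and the variable names are illustrative.

```javascript
var primitiveNumber = 42;
var wrappedNumber = new Number(42);

var primitiveType = typeof primitiveNumber;           // "number"
var wrappedType = typeof wrappedNumber;               // "object"
var strictlyEqual = (primitiveNumber === wrappedNumber);  // false: different types

// A wrapper around false is still an object, and no object counts as false
// when it is used as a condition.
var wrappedFalse = new Boolean(false);
var conditionTaken = wrappedFalse ? "taken" : "skipped";  // "taken"

console.log(primitiveType, wrappedType, strictlyEqual, conditionTaken);
// number object false taken
```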
While there is a boolean type, logical expressions can be built from values
of any type at all. The following are considered equivalent to
false
, and are often given the epithet falsy:
undefined
, null
, strings of zero length and the
numbers 0
and NaN
. Everything else is
truthy, that is, logically equivalent to true
.
If that isn’t surprising enough in itself, there’s more. The result of a
logical expression need not be boolean. The elements of the expression are
not coerced to false
or true
, but used as is.
Concretely, the value of a && b
depends on whether or not
a
is falsy. If it is, then the expression evaluates to the value
of a
; if not it is that of b
. Conversely, for
a || b
the result is the value of b
if
a
is falsy, or a
otherwise.
"" && 42 // ""
"JavaScript" && 42 // 42
"" || 42 // 42
"JavaScript" || 42 // "JavaScript"
You might be surprised to discover that JavaScript only has one type of
number, a signed 64-bit floating-point real that is known to some other
languages as double
. During bitwise operations the operands are
temporarily transformed into 32-bit integers and the result converted back to a
double. This is not terribly efficient, so it’s not something you are likely to
do often.
Numeric operations, along with conversions from other types to numbers, never
fail. If a numeric expression does not resolve to a number, then its value is
NaN
. No error is thrown.
One very surprising property of NaN
is that it is not equal (or
equivalent) to anything else, or even to itself.
NaN == NaN // false
You can use this peculiarity to test for NaN
.
var x = GetSomeNumber();
if (x === x)
{
    // x is not NaN
}
The language also boasts an intrinsic isNaN() function. This can
trip you up if you are expecting it to return true if, and only if, its
argument is NaN. In fact the devious blighter first converts its argument to a
number, so it returns true for anything that fails that conversion, not just
specifically NaN.
isNaN(NaN) // true, obviously
isNaN(42) // false
isNaN("forty two") // true, surprisingly
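If you want a test that is true for NaN and nothing else, the self-inequality trick can be wrapped in a small helper. The name isReallyNaN is made up for this sketch.

```javascript
// NaN is the only value that is not equal to itself.
function isReallyNaN(value) {
    return value !== value;
}

console.log(isReallyNaN(NaN));           // true
console.log(isReallyNaN(0 / 0));         // true: 0 / 0 evaluates to NaN
console.log(isReallyNaN("forty two"));   // false: a string, not NaN
console.log(isReallyNaN(42));            // false

// The built-in converts its argument first, so numeric-looking strings pass.
console.log(isNaN("42"));                // false: "42" converts to 42
console.log(isNaN(""));                  // false: "" converts to 0
```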
In the age of cultural imperialism, also called the mid-to-late 20th century, a string was just an array of bytes containing ASCII values. JavaScript was born into a slightly more enlightened time and its strings are Unicode, as it was then understood. We would now (slightly wrongly) call it UTF-16, which is to say that a JavaScript string is a sequence of 16-bit integers, each encoding one character, or half of one for characters outside Unicode’s original 16-bit range.
There is no separate character type, so you are at liberty to delimit string literals with either single or double quotes. If you need to embed an example of your chosen delimiter inside the string, escape it with a backslash.
Most implementations allow you to access the individual characters of a
string by subscripting it like an array, a feature that was formally adopted
in ECMAScript 5. (If you prefer not to rely on this, use the
.charAt()
method.)
var s = "JavaScript";
var c = s[2]; // "v"
However, since strings are immutable, this access is read-only. You can assign a value, if you like, but the attempt will be silently ignored.
var s = "JavaScript";
s[2] = "w"; // no error, but s is still "JavaScript"
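The "sequence of 16-bit integers" model has a visible consequence: a character outside the original 16-bit range occupies two elements of the string. The sketch below uses the musical G clef (U+1D11E) as an arbitrary example of such a character.

```javascript
var plain = "JavaScript";
var clef = "\uD834\uDD1E";   // U+1D11E written as its two 16-bit halves

console.log(plain.length);                     // 10
console.log(clef.length);                      // 2, although it is one character
console.log(clef.charCodeAt(0).toString(16));  // d834
console.log(clef.charCodeAt(1).toString(16));  // dd1e
```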
Since they are so central to JavaScript, we had better make sure that we understand what object means here. In many object-oriented languages an object is a block of heap-allocated memory containing a bunch of fields that hold instance data and a pointer to a thing called a vtable stuffed with references to the shared class methods. JavaScript’s objects are nothing like that. Fortunately that doesn't really matter.
In JavaScript, an object is an unordered collection of key-value pairs, called properties, with a secret link in its pocket to a prototype object, about which more later. It is very probably implemented as a hash table, should that kind of detail interest you. The keys are ordinary strings, and the values can be of any type, including other objects.
This is very like what some other languages call an associative array, and indeed an object’s values can be accessed by subscripting it with the property names. Imagine that object o represents a philosophical problem. It has properties named question and answer.
var q = o["question"]; // "life, the universe and everything"
var a = o["answer"]; // 42
You can use a variable for the subscript, if you prefer.
var s = "answer";
var a = o[s]; // 42
It is more usual to specify property names literally. Provided that the name obeys the rules for legal identifiers (starting with a letter, an underscore or dollar symbol, and containing nothing other than those characters plus the digits 0 to 9) and is not a reserved word, then a more natural syntax is allowed using a full stop.
var q = o.question; // "life, the universe and everything"
JavaScript’s objects are, unless you expressly forbid it with
Object.defineProperties()
or one of its chums, exceedingly
mutable. Adding a new property to an object is trivial.
o.author = "Douglas Adams";
If the named property already exists, it is overwritten. If it did not exist, it is created.
You can access a property that doesn’t exist without error. Its value is
undefined
.
var x = o.solution; // undefined
Testing that a property is undefined
is not, however, a
reliable proof of its non-existence. The following is legal:
o.ineffable = undefined;
var y = o.ineffable; // undefined
There is a reliable test, however, using the in
operator.
var b = "solution" in o; // false
var c = "ineffable" in o; // true
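One caveat worth sketching: the in operator also answers true for properties that an object inherits, so if you only care about the object's own properties, hasOwnProperty() is the stricter question. The object below is a stand-in for the o of the running example.

```javascript
var problem = { question: "life, the universe and everything", answer: 42 };
problem.ineffable = undefined;

var hasIneffable = "ineffable" in problem;             // true: exists, value undefined
var hasSolution = "solution" in problem;               // false: never assigned
var hasToString = "toString" in problem;               // true: inherited, not own
var ownToString = problem.hasOwnProperty("toString");  // false
var ownIneffable = problem.hasOwnProperty("ineffable"); // true

console.log(hasIneffable, hasSolution, hasToString, ownToString, ownIneffable);
// true false true false true
```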
You can also use the keyword in
to iterate the names of an
object’s properties, a kind of reflection.
for (var p in o)
{
    console.log(p); // question/answer/author/ineffable
    console.log(o[p]); // life.../42/Douglas Adams/undefined
}
The order in which the property names are iterated is implementation-defined, but is usually the same order that they were added to the object.
Every object that you create has a hidden link to another object which acts as its prototype. You might think that a strange thing to call it, given that it’s actually a means to inherit properties. Perhaps executor would be a better name.
When attempting to read the value of a property from an object, JavaScript
doesn't just give up if no property exists with that name. Instead it moves
on up to the object’s prototype and searches that too. Should the
prototype have a property with the right name, its value is read and the
quest ends. Note that this is only true when reading properties.
Writing a property augments the original object, never its prototype.
This might mean duplicating a property that exists in the prototype.
Subsequent reads will get the object’s own version, effectively hiding that
in the prototype. Should you then change your mind and delete
the property, the one from the prototype will spring back into action, which
might be a surprise.
Since the prototype is itself an object, it too can have a prototype. If so,
searching for a property’s name continues up the chain until either it is
found or the end of the chain is reached. Where the name isn't found anywhere
in the chain, its value is undefined
.
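The read, write and delete behaviour just described can be sketched in a few lines. The example assumes ECMAScript 5's Object.create() to set up the prototype link, and the object names are invented.

```javascript
var base = { legs: 4, voice: "meeow" };
var derived = Object.create(base);   // derived's prototype is base

var inherited = derived.legs;        // 4: found on the prototype

derived.legs = 3;                    // writing creates an own property on derived
var own = derived.legs;              // 3: the own property hides the inherited one
var untouched = base.legs;           // 4: the prototype was not modified

delete derived.legs;                 // removes only the own property
var restored = derived.legs;         // 4: the prototype's value shows through again

var missing = derived.tail;          // undefined: not found anywhere in the chain

console.log(inherited, own, untouched, restored, missing);
// 4 3 4 4 undefined
```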
An object only has a single prototype, so best not to think too much about multiple inheritance.
The link to an object’s prototype is not merely hidden, but immutable.
While the prototype object can be mutated, the link to it is set
for good when calling a constructor or Object.create()
.
There are other ways of creating objects, but they don’t give you any
choice at all as to what the prototype should be.
In the natural order of things all objects inherit directly or indirectly
from Object.prototype. The only way to subvert that linkage is with
Object.create() and a null argument.
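As a small sketch of that last point, compare an ordinary empty object with one created from a null prototype. The variable names are illustrative.

```javascript
var ordinary = {};
var bare = Object.create(null);   // no prototype at all

var ordinaryHasToString = "toString" in ordinary;   // true: inherited
var bareHasToString = "toString" in bare;           // false: nothing to inherit from
var bareProto = Object.getPrototypeOf(bare);        // null
var ordinaryIsNormal =
    Object.getPrototypeOf(ordinary) === Object.prototype;   // true

console.log(ordinaryHasToString, bareHasToString, bareProto, ordinaryIsNormal);
// true false null true
```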
While JavaScript is certainly object-oriented, it is completely class-free. To create an object, simply magic it up out of nothing using the literal notation, or call a function that returns an object.
Some functions are intended to be used as constructors. When called as a
constructor, a function always returns an object. Any function can be
dressed up as a constructor simply by prepending it with the new
operator, but unless it was designed for the clothes it might not
return anything useful.
JavaScript offers some built in constructors, one of which is
Object()
. You can use it to create an empty object like this:
var o = new Object();
And then you can add arbitrary properties to it in the usual way.
o.name = "Felix";
o.voice = "meeow";
o.legs = 4;
If you expect to make a lot of objects having the same morphology, you might want to get formal with a constructor function.
function Cat(name)
{
    this.name = name;
    this.voice = "meeow";
    this.legs = 4;
}
var o = new Cat("Felix");
When an object is created by calling a constructor function, its hidden
prototype link is set to point to a property of the constructor
itself. As we shall see, all functions are actually objects, and constructors
are no different. Functions always have a property named
prototype
, and when called as a constructor the resultant
object’s prototype is this object. Apart from this inheritance, there’s
nothing special about a function’s prototype property. You can mutate it,
even replace it in its entirety with something else. The crucial point is
that all objects created with the constructor inherit from the same prototype
object.
Assuming that all cats have 4 legs and go "meeow", we can use this behaviour
to reduce the size of the objects created by the Cat
constructor.
function Cat(name)
{
    this.name = name;
}
Cat.prototype.voice = "meeow";
Cat.prototype.legs = 4;
var o = new Cat("Felix");
var n = o.legs; // 4
In fact, if you need a cat with a different number of legs you can still use
this constructor and then modify the property after construction. Doing so
will add a new legs
property directly to the object, which hides
the (unmodified) property of the same name inherited from the prototype.
var o = new Cat("Lucky");
o.legs = 3;
var n = o.legs; // 3
var m = Cat.prototype.legs; // 4
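Because every cat shares one prototype object, anything added to it later is immediately visible to cats that already exist. The sketch below extends the Cat example. The speak method is invented for illustration.

```javascript
function Cat(name) {
    this.name = name;
}
Cat.prototype.legs = 4;

var felix = new Cat("Felix");
var lucky = new Cat("Lucky");
lucky.legs = 3;   // an own property that hides the inherited one

// Added after both cats were constructed, yet both can use it.
Cat.prototype.speak = function () {
    return this.name + " says meeow";
};

console.log(felix.speak());                                   // Felix says meeow
console.log(lucky.speak());                                   // Lucky says meeow
console.log(felix.hasOwnProperty("legs"));                    // false: inherited
console.log(lucky.hasOwnProperty("legs"));                    // true: own copy
console.log(Object.getPrototypeOf(felix) === Cat.prototype);  // true
```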
The Object()
constructor provides a familiar-looking mechanism
for creating objects, but it is a bit verbose, and it doesn’t allow us to
initialise the object’s properties at creation time. It is often more
convenient to create an object complete with some or all of its properties
in one single operation, using a literal expression.
Object literals are surrounded by curly braces. The following two lines of code are equivalent.
var o = new Object();
var o = {};
The equivalence even extends to the fact that in both cases the resulting
object inherits from Object.prototype.
Adding properties to the declaration is simple. Simply put a comma-separated list of name: value pairs between the braces.
var point = { x: 6, y: 9 };
var left = point.x; // 6
var top = point.y; // 9
Property names that are reserved words or not legal identifiers have to be quoted.
var scores = { "try": 5, conversion: 2, "drop goal": 3 };
The least impressive superobject is date, which will give you the time of day.
And the current date. There is only one way to create a date object, and that
is using the Date() constructor. There is no literal syntax for dates. When
you call the Date() constructor without arguments, its return value is an
object representing the current system date and time. Apart from the
Date.now() shortcut added in ECMAScript 5, this is the only way to consult the
system clock.
At one level the array object is unimpressive too. Aside from a bunch of
useful array manipulation methods inherited from its prototype, the only
thing special about it is its length property. This has magical powers.
JavaScript doesn’t actually have anything like what most programmers would
understand as an array type, but it fakes it well enough to be useful. Recall
that you can access an object’s properties using square bracket notation, and
that this works even for names that would not otherwise be legal. That
applies for numeric strings, and, since the notation expects a string
indexer, numbers are silently converted to their string representation. This
is to say that a[42]
is the same thing as a["42"]
.
Now, if a is an instance of an array object, and you are
writing a property value, and the property name is a non-negative integer,
then JavaScript notices. It compares the property name to the current value
of the length property and, if the name is equal or greater, sets length to
one more than it.
var a = new Array();
var n = a.length; // 0
a[42] = "Marvin";
var m = a.length; // 43
Conversely, setting the length
property to something smaller
causes truncation, i.e. the loss of any properties having names that are
integers equal to the new length or larger.
a.length = 9;
var p = a[42]; // undefined
It is never necessary to call the Array()
constructor, because
JavaScript offers a literal syntax for defining an array object. Simply list
the array values sequentially between square brackets.
var a = [ 6, 9, 13, "joke", new Date() ];
var n = a.length; // 5
Being nothing but ordinary objects with a fancy length property, the elements
of arrays do not need to be of the same type. Neither do they need to be
contiguous. The length
property does not actually tell you for
sure how many elements are in the array, just what the name of the last one
is, plus one. Since you are free to add properties that are not non-negative
integers, and these have no effect whatsoever on the length, even that is not
reliable.
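A short sketch of how far length can drift from the number of things actually stored. It assumes ECMAScript 5's Object.keys() to count the properties that really exist, and the names are made up.

```javascript
var sparse = [];
sparse[5] = "Marvin";                 // only one element is ever stored

var reportedLength = sparse.length;             // 6: highest index plus one
var storedCount = Object.keys(sparse).length;   // 1: just the property "5"
var hole = sparse[2];                           // undefined: nothing was stored there

sparse.nickname = "paranoid android";           // not a non-negative integer name
var lengthAfter = sparse.length;                // still 6
var keysAfter = Object.keys(sparse).join(",");  // "5,nickname"

console.log(reportedLength, storedCount, hole, lengthAfter, keysAfter);
// 6 1 undefined 6 5,nickname
```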
You probably think you know what a function is. You’re probably wrong. In
JavaScript a function is just an otherwise quite ordinary object that happens
to have the ability to execute code. If you find this difficult to believe
then consider the fact that there is a Function()
constructor
for you to abuse. You shouldn’t, though, because the function literal
syntax is so much more natural, and so much less of a security risk.
The function literal syntax comprises the keyword function
followed by a (possibly empty) parameter list between parentheses, and then a
sequence of JavaScript statements enclosed inside curly braces. There can be
a name between the word function
and the parameter
list. When not part of a larger expression, the name is required and the
whole contraption defines a function object of that name in the current
scope. So far, so normal. But a function literal can be part of a larger
expression, with or without a name. It can be an argument to, or the value
returned from, another function. You can assign it to a variable. This last
is similar to, but subtly different from declaring the function in the
familiar way. The following two statements both create a function object
named square in the current scope.
function square(n) { return n * n; }
var square = function(n) { return n * n; };
The difference between them is that the declared version is defined
everywhere in the current scope. You can call it before the declaration with
no error. The version that is assigned to a variable is not defined before
the var
statement. (Strictly speaking, because of a phenomenon
known as variable hoisting, square exists everywhere in the current
scope, but has the value undefined
before the var
statement.)
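The difference can be observed directly. In this sketch, hoistingDemo, declared and assigned are invented names, and both kinds of definition are used before the line on which they appear.

```javascript
function hoistingDemo() {
    var early = declared(3);              // works: the declaration is already in force
    var typeBefore = typeof assigned;     // "undefined": the variable exists, but is empty

    function declared(n) { return n * n; }
    var assigned = function (n) { return n * n; };

    var typeAfter = typeof assigned;      // "function"
    return [early, typeBefore, typeAfter];
}

console.log(hoistingDemo().join(", "));   // 9, undefined, function
```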
Whichever way it is defined, you call square the same way: invoke it by name followed by zero or more arguments within parentheses.
var x = square(42); // 1764
Being an object, a function has a prototype from which it inherits
various methods. For example, the .toString()
method just
returns the source code of the function. Most objects have a
.toString()
method, but only functions come equipped with
.call()
and .apply()
methods that serve as
alternative ways to invoke them. More about those later.
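As a brief foretaste, here is a sketch of both methods. In each case the first argument becomes the value of this inside the function; call() takes the remaining arguments one by one, whereas apply() takes them as an array. The function and object are invented for the example.

```javascript
function greet(greeting, punctuation) {
    return greeting + ", " + this.name + punctuation;
}

var felixTheCat = { name: "Felix" };

var viaCall = greet.call(felixTheCat, "Hello", "!");
var viaApply = greet.apply(felixTheCat, ["Goodbye", "."]);

console.log(viaCall);    // Hello, Felix!
console.log(viaApply);   // Goodbye, Felix.
```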
Our last superobject also has the ability to be executed. Unlike functions,
though, the code that is executed by a RegExp
is not JavaScript,
but a regular expression. Whole books have been written about
regular expressions, which tells you a lot about just how mind-botheringly
impenetrable they are. So there is not enough space here to describe that
language beyond the usual guff about performing powerful pattern matching
on strings.
You can create a regular expression object using the RegExp()
constructor, which can be useful if you need to build the expression on the
fly. Otherwise it is conventional to use the literal form, which is delimited
by forward slashes. Since regular expression syntax makes use of the
asterisk, this means that you can’t safely use /*
and
*/
to comment out code.
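The two ways of creating a regular expression can be sketched as follows. The pattern itself (an a, any number of b characters, then a c) is just an arbitrary example, and it happens to contain the troublesome asterisk-slash sequence when written as a literal.

```javascript
var literalPattern = /ab*c/;                  // literal form, delimited by slashes
var builtPattern = new RegExp("ab" + "*c");   // built on the fly from strings

console.log(literalPattern.test("abbbc"));    // true
console.log(builtPattern.test("ac"));         // true: zero b characters is allowed
console.log(literalPattern.test("abd"));      // false
console.log(literalPattern.source === builtPattern.source);   // true: both are ab*c
```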
Declared inside a function, a variable is local to it. Code outside of the function cannot see or interact with it. You can also declare variables outside of any function. They are globally visible and accessible to all JavaScript code embedded within or linked to the same web page. That goes for third party libraries too, so if you put sensitive information in global variables you deserve everything you get.
Of course, being a seasoned programmer you avoid global variables like a cliché. But if you assign to a variable that is declared neither globally nor locally then a global variable is implicitly created, with no declaration. This is probably not what you want, so avoid doing it deliberately.
Assuming that your JavaScript code is in one or more files separate from your HTML
document, you might imagine that there is a module scope, local to each source
code file. I have to tell you that there is not. If a variable isn't declared
locally then it is global to the entire web page and visible to code from any linked
file. You don’t need me to tell you that global variables are evil, and will
be wondering how you can write sensible code without some notion of module. If
only we could create and execute a global function without chucking its name
into the global namespace... but of course, we can. In the previous section we
saw that a function literal forming part of an expression need have no name.
We also saw that a function object can be invoked, either by appending
parentheses to it (with arguments inside) or by calling its .call()
or .apply()
method. There are many ways that you can turn a
function declaration into an expression. One commonly seen is to surround the
definition with parentheses.
(function () { /* put module code here */ });
OK, you have created a function object without polluting any namespace. Unfortunately there is now no way of executing it, so the utility is limited. For the purpose of creating module scope we only want to execute the function once, so we can invoke it immediately at the point of definition.
(function () { /* put module code here */ })();
Notice the final pair of parentheses? Personally I find this kind of
punctuation pile-up ugly and confusing. That’s why I prefer to invoke a module
function using the .call()
method.
(function () { /* put module code here */ }).call();
Also, the parentheses surrounding the function definition are only there to
make it an expression rather than a declaration. There are prettier ways of doing
that too. You could, for example, prefix it with a unary operator, such as
+
or -
. I find that looks odd, so I like this
better:
void function () { /* put module code here */ }.call();
Having ensured that the function is part of an expression, it doesn’t matter whether or not it has a name. We can use the optional name simply as documentation.
void function module() { /* put module code here */ }.call();
Fine, but what code shall we put inside the module? Well, any variables declared in the module are local to it, but shared by any functions that are also declared inside. Yes, in JavaScript you can nest functions inside other functions. Doing so creates a new scope, completely local to the nested function, but its code can also see and manipulate variables in the outer scope of the containing function.
I just asserted that the only kinds of scope available to you are global and function-local. So how come this is allowed?
function factorial(x)
{
    var fact = 1;
    for (var i = 2; i <= x; ++i)
    {
        fact *= i;
    }
    return fact;
}
Surely the variable i
is local to the for
loop?
You might think so, but you would be wrong. No matter where you put your
variable declaration, it is as though it were at the top of the function and
it exists everywhere inside the function body. Note, though, that if the
declaration also initialises the variable, that initialisation does not take
place until the declaration is encountered. So this code:
function foo()
{
    console.log(x); // undefined
    // ...
    var x = 100;
    console.log(x); // 100
}
is equivalent to this:
function foo()
{
    var x;
    console.log(x); // undefined
    // ...
    x = 100;
    console.log(x); // 100
}
This is called hoisting. Since it happens anyway, you might as well explicitly declare all your variables at the top of the function.
Incidentally, it is perfectly legal to declare a variable more than once in the same scope. If the declarations also initialise the variable, the second and subsequent declarations are treated as simple assignments.
If you want to simulate block scope, one way is to define a nested function
and immediately execute it. Another is to make use of the much-maligned
with
statement. This inserts an ordinary object at the head of
the scope chain, so that you don’t have to qualify its properties with the
variable name. Conventional wisdom states that you should avoid doing this,
because the behaviour can be unpredictable, and ECMAScript 5 strict mode
forbids it outright.
function danger(foo, bar)
{
    with (foo)
    {
        console.log(bar); // depends on whether foo.bar exists
    }
}
But if you use an object literal, then it is obvious what properties exist.
function safe()
{
    var foo = "lolly", bar = "stick";
    with ({ foo: 99, bar: "flake" })
    {
        console.log(foo); // 99
        console.log(bar); // flake
    }
}
The object literal becomes the local scope of the block defined by the
with
statement. Which can be handy.
Like most control statements, the effect of with
is restricted
to the single following statement or group of statements enclosed in curly
braces. One very common use of block scope in languages that support it is
to define the index variable of a for
statement. We can fairly
succinctly emulate that thusly:
var a = [];
a.unshift("everything");
a.unshift("the universe");
a.unshift("life");
with ({ i: 0 }) for (; i < a.length; ++i)
{
    console.log(a[i]); // life/the universe/everything
}
console.log(i); // error: undefined variable
To create a module we defined a function expression and then immediately executed it. Surely once that execution finishes, the module’s variables go out of scope and disappear? And if that happens, what will the nested functions inside the module have to work with?
In many programming languages, local variables are just names for memory locations in the system stack. When a function exits, the stack pointer is moved past those locations and they are available for reuse elsewhere. JavaScript is nothing like that. Not only is a JavaScript function an object, but so is its invocation. Calling a function results in the creation of a context object which is inserted at the head of the scope chain. Calling a nested function just inserts its context in front of the container’s. And at the end of the chain is the global object.
It’s quite a lot like the prototype chains attached to objects. To find a variable the runtime first looks for its name in the context of the currently executing function. If it is not found there, it moves up the chain and tries again. For an assignment, failure to find the name in any object along the way will result in the creation of a global variable, which is to say a property of the global context.
None of these context objects are named, and with the exception of the global
object, the only way to mutate them is by declaring and assigning values to
variables. This is a syntactic sleight of hand, as we can see in the case of
the global object. While it doesn’t have a name, it does have a property,
named window
, which refers to itself. This means that you can
directly access other global variables as properties of window
.
Since you don’t really want to have loose code floating outside of any module
scope, and since assigning values to variables that have not been declared looks
like an error, the best way of creating global variables is as properties of
window
.
window.myglobal = 666;
var local = myglobal; // 666
Objects remain in memory all the time that there are references to them. If a nested function refers to a variable in some containing scope, then the container object stays alive for as long as the nested function is itself accessible. Functions’ contexts remain accessible while there is any reference to them, directly or indirectly, in the global context. This includes event handlers attached to DOM elements.
When you create an external reference to a nested function, and so prolong the life of the container function’s context object beyond that of its execution, it is known (for reasons that are best described as obscure) as a closure. Some coders fear closures and suspect them of being little more than memory leaks. In fact, since you pretty much require closures to avoid dumping everything into global scope, it’s a good idea to get used to them.
It is quite common to want to share event handling code between several HTML elements, while associating some unique data with each. Let us imagine that we want to log the index number of one of a group of buttons when it is clicked.
function AddButtonClicks()
{
    var container = document.getElementById("button-container");
    var buttons = container.getElementsByTagName("button");
    var i;
    for (i = 0; i < buttons.length; ++i)
    {
        // onclick is the handler property; click is the method that simulates a click
        buttons[i].onclick = function()
        {
            console.log("button " + i + " clicked");
        };
    }
}
There are four buttons. We are disappointed to discover that no matter which one we click on, the logged message is always button 4 clicked.
Why is this? Well, we created a closure all right, but only one. All four buttons’ click event handlers hold a reference to a variable named i in the same execution context. And by the time that the click handler is invoked, the for loop has finished, and so the value of i is 4. We correct this by making a handler factory function, and so create a different closure for each button.
function AddButtonClicks()
{
    var container = document.getElementById("button-container");
    var buttons = container.getElementsByTagName("button");
    // each call to factory creates a new context with its own index
    var factory = function(index)
    {
        return function()
        {
            console.log("button " + index + " clicked");
        };
    };
    var i;
    for (i = 0; i < buttons.length; ++i)
    {
        buttons[i].onclick = factory(i);
    }
}
A closure exists when a reference to a function execution context lives on after that function has finished executing. In the example just given, such references are embedded inside the handler functions attached to click events on some DOM nodes, which are effectively global variables.
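To try both variants without a web page, the following sketch swaps the DOM buttons for four plain objects. That substitution, and the property names shared and separate, are assumptions made purely so that the code runs anywhere.

```javascript
function makeButtons()
{
    // plain objects stand in for the DOM buttons
    var buttons = [{}, {}, {}, {}];
    var i;

    // one shared closure: every handler sees the same i
    for (i = 0; i < buttons.length; ++i)
    {
        buttons[i].shared = function() { return "button " + i + " clicked"; };
    }

    // one closure per button: each handler gets its own index
    var factory = function(index)
    {
        return function() { return "button " + index + " clicked"; };
    };
    for (i = 0; i < buttons.length; ++i)
    {
        buttons[i].separate = factory(i);
    }

    return buttons;
}

var buttons = makeButtons();
console.log(buttons[0].shared());   // button 4 clicked
console.log(buttons[0].separate()); // button 0 clicked
console.log(buttons[3].separate()); // button 3 clicked
```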
Earlier we saw how to create a code module using an immediately executed function. Assuming that we want other code outside of the module to be able to call some functions inside it, we are going to have to somehow link our public interface to the global scope. It might be enough that the module attaches some event handlers to the DOM — after all, everything ultimately gets triggered by some kind of event — but this is not always convenient, and in the case of library code intended to be consumed by callers unknown to the author, not really practical. A solution is to create a single global variable and attach all of our exported functionality to it. Properties of that global variable that are functions defined inside the module continue to have access to variables private to it, and this implicitly creates a closure.
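Here is one way that might look. It is only a sketch: MyLib, greet and callCount are hypothetical names, and the fallback to globalThis is there just so that the example also runs outside a browser, where there is no window.

```javascript
// in a browser this is window; elsewhere fall back to globalThis
var root = typeof window !== "undefined" ? window : globalThis;

(function()
{
    var calls = 0; // private to the module

    function greet(name)
    {
        calls = calls + 1;
        return "Hello, " + name;
    }

    function callCount()
    {
        return calls;
    }

    // the single global variable that carries the public interface
    root.MyLib = { greet: greet, callCount: callCount };
})();

console.log(root.MyLib.greet("world")); // Hello, world
console.log(root.MyLib.callCount());    // 1
console.log(typeof calls);              // undefined
```

The exported functions can still read and update calls, but nothing outside the module can reach it directly.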
When you write a function you decide what parameters the caller should provide, and you give names to them within the parentheses following the function keyword. Inside the body of the function the parameters are conveniently referred to using these names, which are in all other respects just ordinary local variables.
What if the caller omits to provide parameter values?
It’s not an error. If your function defines two parameters and is called with only one argument, the value of the second parameter is undefined. You can test for that and either substitute a valid default value (assuming that undefined is invalid) or throw an error.
Since undefined is logically false, one way of ensuring a valid value is using boolean operators. For example, if we had a function to create a string of fixed width n, allowing the caller to specify the character to repeat n times and defaulting to spaces, we could write this.
function repeat(length, character)
{
    return new Array(length + 1).join(character || ' ');
}
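One thing to keep in mind is that the boolean trick replaces every logically false value, not only undefined. The sketch below puts repeat next to a hypothetical variant, repeatExplicit, that tests for undefined directly.

```javascript
function repeat(length, character)
{
    return new Array(length + 1).join(character || ' ');
}

// hypothetical variant: only defaults when the argument is really missing
function repeatExplicit(length, character)
{
    if (character === undefined)
    {
        character = ' ';
    }
    return new Array(length + 1).join(character);
}

console.log("[" + repeat(3) + "]");             // [   ]
console.log(repeat(3, "ab"));                   // ababab
console.log("[" + repeat(3, 0) + "]");          // [   ]  (0 is logically false)
console.log(repeatExplicit(3, 0));              // 000
console.log("[" + repeatExplicit(3, "") + "]"); // []
```

Which behaviour you want depends on whether an empty string or a zero is a sensible thing for the caller to pass.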
What if the caller provides too many arguments?
This too is not an error. The parameters will be assigned values in the expected order, and the additional arguments are silently ignored. They are still accessible, however, because all functions get passed a bonus pseudo-variable having the not terribly original name arguments. This is an array-like object, meaning that its members are indexed numerically and it has a (non-magical) length property, but it does not inherit from Array. That is unfortunate since some of the array methods would definitely be useful. Oh well. That does not prevent you from iterating over all the argument values, whether or not they line up with any named parameter, and so you can write functions that take variable-length parameter lists.
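A short sketch of both points: sum iterates over however many arguments it is given, and tail borrows an array method to copy part of the arguments object into a real array. All three function names are invented for the example.

```javascript
function sum()
{
    var total = 0;
    var i;
    for (i = 0; i < arguments.length; ++i)
    {
        total += arguments[i];
    }
    return total;
}

function tail(first)
{
    // borrow slice from Array to copy everything after the first argument
    return Array.prototype.slice.call(arguments, 1);
}

function isRealArray()
{
    return Array.isArray(arguments);
}

console.log(sum());                // 0
console.log(sum(1, 2, 3, 4));      // 10
console.log(tail("a", "b", "c"));  // [ 'b', 'c' ]
console.log(isRealArray(1, 2, 3)); // false
```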
So, inside a function you can tell how many arguments were actually passed. On the other side of the invocation, the calling code can know how many arguments the function expected by inspection of its own (also non-magical) length property. See, I told you that functions are objects. They have properties too.
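The two counts can be compared side by side. The function pair here is just a placeholder.

```javascript
function pair(a, b)
{
    // how many arguments this particular call supplied
    return arguments.length;
}

console.log(pair.length);   // 2, the number of declared parameters
console.log(pair(1));       // 1
console.log(pair(1, 2, 3)); // 3
```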
What Does this Mean?
How a function behaves in respect of the intrinsic this pseudo-variable depends not on how or where the function was defined, but on how it is invoked. So this can be something different between two invocations of the same function.
In the trivial case, where the function is invoked by naming it and appending a parameter list between parentheses, the value of this is the global object (at least in non-strict code; strict mode leaves it undefined), which is not especially useful. But then again, for such an archetypally procedural pattern, we should have no real expectation of this being useful at all.
Simply prepending the invocation with the keyword new changes things radically. In this situation, this is an entirely new and virgin object instance which the function may initialise as it sees fit. When the function ends, the value of this is returned to the caller. Well, usually. The point of this behaviour is that it allows a function to act as a constructor. Now, if the function explicitly returns a value, and if that value is an object, not a simple type, then this is ignored, and the explicit value is returned instead. Note that doing it like that means that it doesn’t matter whether or not the caller used the new operator, the same object is returned either way. With new the function acts like a constructor, without it the behaviour is that of a factory. Also, without the new decoration, the object does not inherit from the prototype property of the factory function.
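The following sketch contrasts the three cases. Point, PointFactory and Odd are made-up names. Note that the object literal returned by PointFactory has no link to PointFactory.prototype, whether or not new is used.

```javascript
function Point(x, y)
{
    this.x = x;
    this.y = y;
    // no return statement: new hands back this
}
Point.prototype.describe = function() { return this.x + "," + this.y; };

function PointFactory(x, y)
{
    // an explicitly returned object replaces this
    return { x: x, y: y };
}

function Odd()
{
    this.tag = "kept";
    return 42; // a simple type, so it is ignored when called with new
}

var a = new Point(1, 2);
var b = new PointFactory(1, 2);
var c = PointFactory(1, 2);

console.log(a instanceof Point);        // true
console.log(a.describe());              // 1,2
console.log(b instanceof PointFactory); // false
console.log(c.x + "," + c.y);           // 1,2
console.log(new Odd().tag);             // kept
```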
Always remember that functions are objects, and objects can be properties of other objects. When a function is invoked as a property of another object it is considered to be a method of the object and the value of this is the object itself. You can define methods directly on an object (or in a constructor), but you don’t have to. You can borrow a function defined elsewhere and stick it on to any old object to create a method.
var foo = { name: "foo" };
var bar = { name: "bar" };
var func;
// create a method directly on object foo
foo.method = function() { console.log(this.name); };
// borrow the same method for object bar
func = foo.method;
bar.who = func;
// invoke methods
foo.method(); // foo
bar.who(); // bar
// invoke as a simple function
func(); // (probably an empty string)
That last procedural invocation of func() has the global object as its value for this. The global object does actually have a name property, but its value is likely to be an empty string unless the containing web page was opened by some external JavaScript code.
If we add a method to an object inside its constructor, that method is a nested function and can access the local variables of the constructor. We can use that fact to create a closure with which to hide private data but still allow users of the object to manipulate them programmatically. This is in stark contrast to adding methods to the prototype of a constructor. They can only access the public properties of this, not the variables private to the constructor.
function Sequencer(value)
{
    // value is private, and can't be changed
    // by the caller after construction
    this.next = function() { return value++; };
}
var o = new Sequencer(42);
console.log(o.next()); // 42
console.log(o.next()); // 43
// etc.
Finally, we can choose for ourselves what the value of this should be at the moment of invocation. Do this by using the .call() or .apply() methods that are inherited by all function objects. Whatever you supply as the first parameter to .call() or .apply() will be used as the value of this inside the function. Any subsequent parameters to .call() are passed on verbatim to the target function, whereas .apply() expects to receive an array as its second argument, the elements of which are separated out and fed to the function as individual arguments. If you supply fewer than two parameters, then .call() and .apply() are equivalent.
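By way of illustration, here is the same function invoked both ways. The function describe and the two objects are invented for the example; Math.max is the standard built-in.

```javascript
function describe(greeting, punctuation)
{
    return greeting + ", " + this.name + punctuation;
}

var foo = { name: "foo" };
var bar = { name: "bar" };

// call: arguments listed one by one after the value for this
console.log(describe.call(foo, "Hello", "!"));      // Hello, foo!

// apply: arguments packed into an array
console.log(describe.apply(bar, ["Goodbye", "."])); // Goodbye, bar.

// apply is handy when the arguments are already in an array
console.log(Math.max.apply(null, [3, 9, 4]));       // 9
```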
For the most part the built-in objects, such as the predefined constructor functions, are no different from any other object. You are free to add new properties to them and change or even delete existing ones, though this might make you unpopular with your co-workers.
Extending the functionality of the built-in objects is most naturally achieved by adding methods to the constructor’s prototype. A couple of simplistic examples follow. That's all they are, examples. Don’t even think about suing me over their defects.
We can enumerate the properties of an object using the for ... in ... syntax, but we can’t control the order in which the properties are visited, which is implementation-defined. So let’s add a means of getting all the properties as name-value pairs in alphabetical order.
Object.prototype.ordered = function()
{
    var obj = this; // the object that we are ordering
    var sorted = Object.keys(obj).sort();
    var mapper = function(key)
    {
        return { name: key, value: obj[key] };
    };
    return sorted.map(mapper);
};
Prototypical inheritance can be tricky to understand and no easier to implement using the tools at hand. Let’s make it a bit simpler for one constructor to inherit functionality from another.
Function.prototype.inherit = function(from)
{
    var proto = Object.create(from.prototype);
    // inherit from an object that inherits from the base prototype
    this.prototype = proto;
    // borrow the base constructor’s instance methods and data
    proto._base = function() { from.apply(this, arguments); };
};
To use it, write a constructor function. In it, call this._base() with whatever parameters are necessary to create an appropriate instance of the base class from which you are inheriting. Then call the .inherit() method of your new constructor, passing to it a reference to the constructor function you are inheriting from.
function Animal(legs, voice)
{
    this.legs = legs;
    this.voice = voice;
}
function Cat(name)
{
    this._base(4, "meeow");
    this.name = name;
}
Cat.inherit(Animal);
var o = new Cat("Tom");
console.log(o.voice); // meeow
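The example so far only shows inherited data. The sketch below repeats the helper so that it runs on its own, and adds two hypothetical methods, speak and introduce, to show that methods on the base prototype are inherited too. It also shows one of the defects of such a simple helper: the constructor property still points at the base.

```javascript
Function.prototype.inherit = function(from)
{
    var proto = Object.create(from.prototype);
    this.prototype = proto;
    proto._base = function() { from.apply(this, arguments); };
};

function Animal(legs, voice)
{
    this.legs = legs;
    this.voice = voice;
}
Animal.prototype.speak = function() { return this.voice + "!"; };

function Cat(name)
{
    this._base(4, "meeow");
    this.name = name;
}
Cat.inherit(Animal);
// inherit() replaces Cat.prototype, so add Cat's own methods afterwards
Cat.prototype.introduce = function() { return this.name + " says " + this.speak(); };

var tom = new Cat("Tom");
console.log(tom.introduce());            // Tom says meeow!
console.log(tom instanceof Cat);         // true
console.log(tom instanceof Animal);      // true
console.log(tom.constructor === Animal); // true, not Cat
```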
Sometimes it’s possible to try too hard. We've already seen how JavaScript “helpfully” creates global variables implicitly when we forget to declare them. I suppose the idea was to forgive the errors of noob coders hacking unnecessary visual effects into garish web pages. Another mistake that JavaScript indulges is forgetting to terminate statements with a semicolon. If the interpreter discovers a newline where a statement could legally end, it sometimes inserts a virtual semicolon for you.
Quite a lot of the time, this accords with what you intended anyway. Sometimes it doesn’t. JavaScript can't know one way or the other, so occasionally automatic semicolon insertion breaks your code, usually in subtle and/or inconsistent ways.
In particular, automatic semicolon insertion means that you can't separate a return statement from the value you are returning. Suppose your function returns an object representing a point in Cartesian space.
function PolarToCartesian(dist, angle)
{
    return { x: dist * Math.cos(angle), y: dist * Math.sin(angle) };
}
You might decide that it would be clearer to spread that object literal over separate lines, like this:
function PolarToCartesian(dist, angle)
{
    return
    {
        x: dist * Math.cos(angle),
        y: dist * Math.sin(angle)
    };
}
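Seems reasonable. Unfortunately, this version no longer works. The reason is that JavaScript inserts a semicolon immediately after the return keyword, so the function would return undefined. What follows over the next four lines is then read as a block of statements rather than as an object literal. With only one property between the braces, that block would be valid code that does nothing useful, and the function would silently return undefined no matter what arguments you supply. With two properties, as here, the comma and the second colon make no sense inside a block, and the script fails with a syntax error.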
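A cut-down version with a single property makes the effect easy to observe, because in that case the stranded braces parse as a block containing a labelled statement. The function names are invented for the comparison.

```javascript
function brokenPoint(dist)
{
    return
    {
        x: dist
    }
}

function workingPoint(dist)
{
    return {
        x: dist
    };
}

console.log(brokenPoint(2));    // undefined
console.log(workingPoint(2).x); // 2
```

The only difference between the two is whether the opening brace sits on the same line as return.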
Some JavaScript coders consider this unfortunate behaviour sufficient justification for adopting wholesale the monstrous K&R style of brace placement.
function PolarToCartesian(dist, angle) {
    return {
        x: dist * Math.cos(angle),
        y: dist * Math.sin(angle)
    };
}
Those of us who prefer to find our braces in predictable places will find some other workaround. Perhaps the simplest is to assign an intermediate variable.
function PolarToCartesian(dist, angle)
{
    var point =
    {
        x: dist * Math.cos(angle),
        y: dist * Math.sin(angle)
    };
    return point;
}
Either way, it's a shame to have to jump through such hoops. There is a school of thought that embraces automatic semicolon insertion to the extent of never writing any at all. This requires knowledge and understanding of the rather odd automatic insertion rules and a willingness to abandon the habits learned from other C-like languages.
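To give a flavour of what those rules demand, here is a sketch of one well-known trap. A line that begins with a square bracket is treated as a continuation of the line before, because the result is still legal code. The leading semicolon shown in the second half is a habit used in some semicolon-free code, not something the language requires.

```javascript
var values = [10, 20, 30]
var copy = values
[2]

console.log(copy) // 30, because the two lines were read as values[2]

// a defensive leading semicolon keeps the next line separate
var total = 0
var list = values
;[1, 2].forEach(function(n) { total += n })

console.log(list.length) // 3
console.log(total)       // 3
```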
Not all surprises are nice.