Identifier algorithms

Introduction

This document discusses the internal representation of lexical environments and the exposed algorithms for identifier handling, taking into account the internal representation.

The goal of the compiler is to convert almost all identifier related accesses to register accesses and to avoid the explicit creation of environment record objects whenever possible. For full compliance, a "slow path" is needed, and is provided by the environment record objects and the identifier algorithms described here. The slow path is optimized for a compact code footprint, not speed.

Identifier handling encompasses a large part of the E5 specification and the implementation:

Variable and function declarations in global code, eval code, and function code (compile time)
Function argument list declarations (compile time)
Establishing execution contexts (calls to global code, eval code, and function code; run time)
Establishing other lexical environments (catch blocks, with blocks; run time)
Reading identifier values and typeof (run time)
Assigning values to identifiers (run time)
Deleting identifier bindings using delete (run time)
Special argument name bindings for the arguments object (run time)

These have many complications from an implementation point of view:

Many things are processed during compile time (and are, in fact, required to be processed compile time)
Identifiers may be bound to compiled function registers (preferred) or to explicit environment record objects (fallback)
Register-bound identifiers need to be moved to an explicit environment record object if they may be referred to later (closure)
Establishing execution contexts is rather tricky and varies between call types

Specification notes

Key parts

Key parts of ECMAScript E5 discussion identifier handling include Section 10 (Executable Code and Execution Contexts):

Sections 10.2.2.1 and 10.3.1 cover identifier lookup (GetIdentifierReference)
Sections 10.4 and 10.5 cover execution context initialization for global code, eval code, and function calls
Section 10.6 covers the arguments object initialization (for function calls)

Other relevant sections include:

Section 11.4.1: the delete operator, which is used for variable deletion (and property deletion)
Section 11.4.3: the typeof operator
Section 12.10: the with statement
Section 12.14: the try statement, special handling of catch
Section 13: function declarations and function expressions
Section 15.1.2.1: eval (x), defines the concept of a direct call to eval

Key concepts

Lexical environment

: Consists of an environment record and a reference to an outer lexical environment.

The current implementation makes no distinction between a lexical
environment and the environment record contained in it: they are
represented by the same object. Hence, the term \"environment
record\" is used loosely below to refer to the lexical environment
and/or its environment record.

The *outer lexical environment* is also referred to as the
environment record\'s *parent* below.

Environment record

: An environment record is either a declarative environment record or an object environment record.

Declarative environment records hold variable bindings in an
internal map not accessible from user code.

Object environment records hold variable bindings in an ECMAScript
object. They are used for handling global program execution for
`with` statements.

Execution context

: Consists of (note the confusing names and terminology here):

1.  `LexicalEnvironment`, a lexical environment (an environment
    record) for processing variable lookups,
2.  `VariableEnvironment`, a lexical environment (an environment
    record) for processing variable and function declarations, and
3.  a `this` binding

The two lexical environments are usually the same, except for:

-   `with`, which establishes an object environment record for
    variable lookups, but keeps the previous one for declarations
-   `catch`, which establishes a single-variable declarative
    environment record for the variable containing the error

Note the confusing terminology, especially the use of \"lexical
environment\" as a general term and `LexicalEnvironment` as a
component of an execution context. Below, we refer to these loosely
as \"lexical environment\" (for `LexicalEnvironment`) and \"variable
environment\" (for `VariableEnvironment`).

Immutable binding

: A binding whose value cannot be changed after initialization. An immutable binding cannot be deleted.

Mutable binding

: A binding whose value can be changed. A mutable binding may be deletable or non-deletable.

Implicit this value

: A binding may be associated with an implicit this value. If the value is not provided, undefined is used in its place.

Declarative environment records never provide an implicit
`this value`. Object environment records for which the `provideThis`
internal flag is set provide an implicit `this` value.

Unsorted notes

Almost all bindings in the E5 specification are non-deletable mutable bindings, except:[^1]

declaration in eval: function and variable declarations in eval code are deletable mutable bindings
"arguments" binding: the arguments binding for strict mode function is an immutable binding
function name in a function expression: in a function expression function f(args) { body }, the identifier f is an immutable binding for the function body; this is not true of function declarations

There are no practical cases where an immutable binding would be accessed in an uninitialized state:[^2]

Immutable bindings are created with CreateImmutableBinding and initialized with InitializeImmutableBinding
For arguments, creation and initialization are consecutive, there is little chance of failure (except "out of memory" like conditions); even in failure cases, the intermediate uninitialized binding is not accessible to program code
For function name in a function expression, creation and initialization are not consecutive but the same applies as for arguments: an uninitialized binding is never exposed to program code (the function expression is not "returned" until its initialization is complete)
As a result, the implementation does not need to handle lookups for uninitialized immutable bindings, implied for declarative environment records in E5 Section 10.2.1.1.4, step 3

Implicit this value is relevant only for:[^3]

Object environment records initialized in a with statement.
Function call using an identifier reference: a possible implicit this value is used for the this binding (similarly to how a function call using a property reference binds this to the property base object).

New declarative environment records are created for:[^4]

Entering strict eval code (with a direct eval call)
Entering function code
Entering catch clause
Evaluating a named function expression

New object environment records are created for:[^5]

Entering with statement body
Implicitly for the global object environment record

Establishing an execution context (= calling)

Different cases for calling:

Function [[Scope]] initialization is described in E5 Section 13:

For function declarations, the [[Scope]] internal property is initialized to the variable environment of the execution context containing the declaration. This is the same as initializing [[Scope]] to the lexical environment of the containing context, because function declarations are only allowed at the top level. If function declarations are allowed in other places (than the top level), it makes more sense to initialize [[Scope]] to the current lexical environment.
For anonymous function expressions, the [[Scope]] internal property is initialized to the lexical environment of the execution context containing the expression.
For named function expressions, a new declarative environment is created whose parent is the lexical environment of the execution context containing the expression. The new environment will be initialized with an immutable binding for the name of the function, and will then be used as the [[Scope]] of the function object.

Environment record implementation

Specification

Environment records are described in E5 Section 10.2.1. They define how identifiers are bound to values in an ECMAScript execution context. Bindings can be established in several ways:

variable and function declarations in the global scope
variable and function declarations in eval code
variable and function declarations in functions
catch clauses
with statements

An environment record maintains bindings to a certain set of identifiers, and may have (at most one) parent record (called outer environment reference) with further bindings.

There are two types of environment records:

declarative environment record: binds identifiers to values directly
object environment record: refers to an external ECMAScript object which contains the variable bindings

Declarative environment records are the ordinary way of binding identifiers. The underlying map of identifiers to values is only visible to the program through variable accesses. For object environment records, the identifiers are bound through an ordinary ECMAScript object (referred to by the object environment record); object environment records are used e.g. in the global propgram scope and in the with statement.

An ECMAScript E5 execution environment maintains two environment record references (E5 Section 10.3):

A variable environment refers to an environment record which is used for function and variable declarations
A lexical environment refers to an environment record which is used for variable lookups (including deletes)

Very often these two refer to the same environment record. A new, nested lexical environment can be created e.g. through a with, a catch, a nested function, etc.

Internal representation

The current implementation makes no distinction between a lexical environment and the environment record it refers to. Instead, a combination of a lexical environment, its environment record, and a possible outer lexical environment reference is implemented as a single internal object which is not directly visible to program code.

An outer lexical environment reference (the "parent record") is identified using the internal prototype field of the object. This does not mean that ordinary property lookups (with automatic prototype walking) are used for variable lookups: the internal prototype is simply a convenient place to maintain the parent relationship, having an easily accessible slot, being already visible to garbage collection etc.

Declarative environment record

A declarative environment record object maintains identifier mappings in two ways:

directly as properties of the environment record
by identifying an active activation record ("open scope") and a set of identifiers bound to the registers of that activation record

In the latter case, the environment record identifies the thread, the function, and the "register base" of the activation record in the thread's value stack. These allow an identifier to be mapped to a value stack entry as follows:

The function's variable map (if it exists) is consulted to find the identifiers register number, relative to the bottom of the activation. (Of course, the identifier might not be mapped.)
The "register base" of the environment record identifies the absolute value stack index of the activation record's frame bottom, and is added to the relative register number to get an absolute value stack index.

Note that the actual activation record does not need to be identified, we just need the register base to resolve absolute value stack index of a variable. However, this approach does not allow validation of the resulting value stack index (e.g. to verify that it is indeed inside the activation record's frame).

When a declarative environment is "closed", identifiers bound to activation registers are copied to the internal environment record object as plain properties (with the help of the callee's _Varmap). The internal control fields are updated to indicate that the environment is closed, so that later lookups can skip open scope lookups which would now reference stale value stack indices.

The variables mapped as properties have their attributes set as follows:

[[Enumerable]]: does not matter as the properties are not visible to the program (currently set to true)
[[Writable]]: set to true for mutable bindings, false for immutable bindings
[[Configurable]]: set to true for mutable bindings which are also deletable, false for non-deletable mutable bindings and immutable bindings

Register-bound identifiers are assumed to be non-deletable mutable bindings: register bindings cannot be easily deleted (the bytecode may refer to them directly), and protecting register bound identifiers from being modified would require some control information we don't have. As a result, other types of bindings cannot be mapped to registers (i.e. declarations inside eval, arguments binding, and the name binding of a named function expression) and require an explicit environment record object.

Object environment record

An object environment record refers to an external ECMAScript object (visible to the program) "backing" the identifier bindings. The target object is identified with an internal control property.

The internal object is initialized with:

Object class set to DUK_HOBJECT_CLASS_OBJENV
Internal prototype referring to outer environment record
Internal control property: _Target

Identifier lookups proceed to the _Target object while the parent environment record is identified by the prototype of the environment record (not of _Target).

Example internal environment record objects

Let's consider an environment with:

the global object (outermost)
a declarative environment (of a function, scope is open)
an object environment (e.g. of a with clause)

This could happen in this example:

// global scope: a variable lookup here would use 'record1'

var with_object = { "xyz": "zy" };

function func() {
  var foo = 'my foo';
  var bar = 'my bar';

  // in the example below, we assume this variable was not mapped
  // to a register for some reason (wouldn't ordinarily happen)
  var quux = "a non-register binding";

  // variable lookup here would use 'record2'

  with (with_object) {
    // variable lookup here would use 'record3'
  }
}

The objects could be roughly as follows; leading underscore indicates an internal value not visible to the program, double leading underscore indicates internal properties stored directly in internal structures outside the property table, e.g. __prototype denotes internal prototype:

global_object = {
  "NaN": NaN,
  "Infinity": Infinity,
  "undefined": undefined,
  "eval": [function],
  "parseInt": [function],
  ...
}

func = {
  // "foo" maps to reg 0, "bar" to reg 1, "quux" not register mapped
  "_Varmap": { "foo": 0, "bar": 1 },
  ...
}

with_object = {
  "xyz": "zy"
}

record1 = {
  // Flag DUK_HOBJECT_CLASS_OBJENV set
  "__prototype": null,
  "__target": global_object,    // identifies binding target
  "__has_this": false
}

record2 = {
  // Flag DUK_HOBJECT_CLASS_DECENV set
  "__prototype": record1,
  "__varmap": varmap,  // _Varmap of target function (name-to-reg)
  "__thread": thread,  // identifies valstack
  "__regbase": 100,   // identifies valstack base for regs
  "quux": "a non-register binding"

  // var "foo" resides in value stack absolute index 100 + 0 = 100,
  // var "bar" in absolute index 100 + 1 = 101
}

record3 = {
  // Flag DUK_HOBJECT_CLASS_OBJENV set
  "__prototype": record2,
  "__target": with_object,
  "__has_this": true
}

Once again, the compiler strives to avoid creating explicit environment records whenever possible. In the example above the compiler won't be successful: the object environment required by with also causes the declarative environment (record2) to be created.

Functions and environment records

A function object (duk_hcompfunc) records the following conceptual variable access related information:

A variable map, which maps an identifier to a register number. Ideally all function variables / functions are register mapped, but this is not always possible.
Control data for initializing the lexical and variable environments of new invocations.

There are separate function template objects (which are essentially functions without an outer environment) and concrete function instance objects. The detailed properties vary a bit between the two.

More concretely:

The DUK_HOBJECT_FLAG_NEWENV object level flag, and the lex_env and var_env fields of duk_hcompfunc control activation record lexical and variable environment initialization as described below.
The internal property _Varmap contains a mapping from an identifier name to a register number relative to the activation record frame bottom.
The internal property _Formals contains a list of formal argument names.
Template function objects, used for creating concrete function instances, use DUK_HOBJECT_FLAG_NAMEBINDING flag to indicate that the template represents a named function expression. For such functions, the function name (stored in name) needs to be bound in an environment record just outside a function activation's environment record.
The compiler attempts to drop any fields not required from compiled objects. In many common cases (even when dynamic variables accesses cannot be ruled out) no control fields are required.

The detailed handling of these is documented as part of the source for:

Creating function templates in the compiler
Instantiating function instances from templates (closures)

Notes:

Environment record initialization is done only when (if) it is actually needed (e.g. for a function declaration). It is not created unnecessarily when a function is called.
The DUK_HOBJECT_FLAG_NEWENV is set for ordinary functions, which always get a new environment record for variable declaration and lookup. The flag is cleared for global code and eval code; or rather functions compiled from global code and eval code.
Unlike Duktape, ECMAScript does not separate the compilation and execution of global code and eval code. Hence the handling is seemingly a bit different although the outcome should be the same.

Preliminary work

GETIDREF

The GetIdentifierReference specification algorithm walks the chain of environment records looking for a matching identifier binding. It returns a "reference" value which can act as both a left-hand-side and a right-hand-side value. The reference identifies (1) an environment record, (2) an identifier name, and (3) a strictness flag.

When using the reference as a left-hand-side value, the environment record is written to. For declarative records, this is a write to internal object (an internal duk_hobject in Duktape) or a value stack entry (a duk_tval for register bound identifiers). For object environment records, this is a property write (a [[Put]] call) to a user visible object, possibly invoking a setter call.

When using the reference as a right-hand-side value, the environment record is read from. For declarative records, this is a read from an internal object or a value stack entry (for register bound identifiers). For object environment records, this is a property read (a [[Get]] call), possibly invoking a getter call.

Note that the storage location of any identifier value is conceptually an external or an internal object. However, the concrete implementation is a bit more complicated: an identifier value may be either stored in an object or in an activation's register file, i.e., a particular entry in the value stack of some thread. This needs to be taken into account when representing a "reference" type internally.

A close analog of GetIdentifierReference is currently implemented as a helper function (referred to as GETIDREF in this document). GETIDREF returns multiple values which the caller can use to implement the actual operation (GETVAR, PUTVAR, DELVAR, HASVAR). The values returned include:

result: true if binding found, false otherwise. If false, other values are undefined.
holder: a pointer to the "holder" object, the internal or external object storing the binding value. For register-bound identifiers, this is NULL.
value: an duk_tval pointer to the current value for register bindings, points to a value stored in a value stack. For declarative environment records, value points to the duk_tval property entry of the internal object. For object environment records, this is NULL.
attrs: property attributes of value (if value is NULL, this field is not needed). Attributes are needed in PUTVAR: before updating a value in-place using a direct duk_tval write, we need to know that the value is writable. Register bound variables are always writable (mutable), denoted "W" in the table below.
this_binding: an duk_tval pointer to the this binding related to the reference, points to a value stored in an object.
env: a pointer to the lexical environment record (an duk_hobject) where the binding was found. For register-bound identifiers, this is NULL.

The following table clarifies the return values in different cases:

Case result holder value attrs this_binding env

Delayed declarative true NULL points to W NULL NULL environment, bound valstack
in register of
current activation

Declarative true NULL points to W NULL env environment, bound valstack
in register of open
environment record

Declarative true env points to from NULL env environment, bound prop prop
in (open or closed) storage
environment record
object

Object environment, true target NULL n/a NULL env bound in target
object, no "this
binding"

Object environment, true target NULL n/a target env bound in target
object, has "this
binding"

Not found false n/a n/a n/a n/a n/a

The object environment records created by the with statement provide a "this binding" (provideThis is true, see E5 Section 12.10); other object environment records do not. The "this binding" only affects function calls made through bound identifiers, e.g. as in:

var foo = {
  bar: function() { print("" + this); },
  toString: function() { print("i'm foo"); }
}

with (foo) {
  // prints "i'm foo", similar to being called
  // like: foo.bar()
  bar();
}

Original algorithm

The original GetIdentifierReference is described in E5 Section 10.2.2.1. The inputs are: lexical environment lex, identifier string name, and a strict flag:

If lex is the value null, then a. Return a value of type Reference whose base value is undefined, whose referenced name is name, and whose strict mode flag is strict.
Let envRec be lex's environment record.
Let exists be the result of calling the HasBinding(N) concrete method of envRec passing name as the argument N.
If exists is true, then a. Return a value of type Reference whose base value is envRec, whose referenced name is name, and whose strict mode flag is strict.
Else a. Let outer be the value of lex's outer environment reference. b. Return the result of calling GetIdentifierReference passing outer, name, and strict as arguments.

Notes:

The algorithm supports the case where the starting lexical environment is null, although step 1 is more likely intended to just be the recursion terminator.
The recursion walks the outer reference chain, which our implementation handles through internal prototypes of the environment records.

Eliminating recursion

NEXT: If lex is the value null, then: a. Return a value of type Reference whose base value is undefined, whose referenced name is name, and whose strict mode flag is strict.
Let envRec be lex's environment record.
Let exists be the result of calling the HasBinding(N) concrete method of envRec passing name as the argument N.
If exists is true, then a. Return a value of type Reference whose base value is envRec, whose referenced name is name, and whose strict mode flag is strict.
Let lex be the value of lex's outer environment reference.
Goto NEXT.

Concrete implementation

A few notes first:

The implementation delays the creation of an explicit declarative environment record when possible. In this case the initial lex value is NULL and should be treated like an empty declarative environment record with a certain outer reference, and possibly a set of identifiers bound to registers. To do this, we need a reference to the current activation (act below).
Some callers require a variant which does not follow the outer environment reference chain. The algorithm incorporates a flag parents controlling this (if true, parent chain is followed).

First draft:

If lex is null and act is defined then (delayed declarative environment record): a. Check whether name is bound to a register of act. To do this, the function object needs to be looked up based on act, and the function metadata be consulted; in particular, the _Varmap internal property (which maps names to register numbers) is used. b. If name is mapped, return the following: - Result: true - Value pointer: point to register storage - Attributes: writable - This pointer: NULL - Environment pointer: NULL - Holder pointer: NULL c. If parents is false, goto NOTFOUND. d. Else, let lex be the outer environment record that a declarative environment record created for act would have. This is concretetely looked up from the _Lexenv internal property of the function related to act.
NEXT: If lex is the value null, then goto NOTFOUND.
If lex is a declarative environment record, then: a. If lex is open (activation registers are still in use): 1. Check whether name is mapped to a register of the activation related to the environment record. These are concretely looked up using internal properties of lex. (Note that the related activation may be any function, and even that of another thread.) 2. If so, return the following values (value pointer can always be given, and the caller is always allowed to modify the value in-place, because all register bindings are mutable): - Result: true - Value pointer: point to register storage - This pointer: NULL - Environment pointer: lex - Holder pointer: NULL b. If lex has a property named name, return the following values: - Result: true - Value pointer: point to storage location of property in lex - Attributes: from lex property (non-writable for immutable bindings, writable for others) - This pointer: NULL - Environment pointer: lex - Holder pointer: lex
Else lex is an object environment record: a. Let target be the binding object for lex. (Note: this is always defined, and an object.) b. If the result of calling [[HasProperty]] for target with the property name name is true: 1. If lex has the internal property _This, set thisBinding to its value. Else set thisBinding to NULL. 2. Return the following values: - Result: true - Value pointer: NULL - Attributes: arbitrary (use zero) - This pointer: thisBinding - Environment pointer: lex - Holder pointer: target
If parents is false, goto NOTFOUND.
Let lex be the internal prototype of lex.
Goto NEXT.
NOTFOUND: Return the following:
- Result: false
- Value pointer: NULL
- Attributes: arbitrary (use zero)
- This pointer: NULL
- Environment pointer: NULL
- Holder pointer: NULL

HASVAR: check existence of identifier

Unlike e.g. GETVAR, HASVAR does not traverse the environment record outer reference chain. HASVAR is also not really an exposed primitive; ECMAScript code cannot access it directly. It is used internally for function call handling, and can also be used from the C API.

Using GETIDREF:

Let res be the result of calling GETIDREF with the arguments env, name, and parents set to false.
Return the "result" component of res.

GETVAR: read identifier value

Conceptual steps:

Identifier resolution (E5 Section 10.3.1) is used to resolve identifier references.
Identifier resolution calls GetIdentifierReference (E5 Section 10.2.2.1), which returns a Reference type.
A right-hand-side expression "coerces" the Reference to a value using GetValue (E5 Section 8.7.1).

In the optimal case, all of these can be resolved at compile time, converting the identifier read into a register lookup. No explicit run-time processing happens in this case.

In other cases the compiler emits a GETVAR instruction which performs the necessary (slow) steps at run time. The identifier name (a string) is always known at compile time as there is no indirect variable lookup; an eval call might look like one, but any identifier reference has a string name when compiling the eval argument string, e.g. as in:

function f() {
  return eval("return foo;);
}

The run time part begins with, essentially, GetIdentifierReference which is given a lexical environment env, an identifier name name, and a strict flag which depends on the function containing the expression.

The GETVAR primitive differs from a plain identifier lookup in that it also returns the "this binding" related to the identifier, if defined.

Let's look at GetValue first.

GetValue

GetValue simplifies to (here, V is the Reference):

Let base be the result of calling GetBase(V) (which must be an environment record).
If IsUnresolvableReference(V), throw a ReferenceError exception.
Return the result of calling the GetBindingValue (see 10.2.1) concrete method of base passing GetReferencedName(V) and IsStrictReference(V) as arguments.

Inlining the GetBindingValue calls (E5 Sections 10.2.1.1.4 and 10.2.1.2.4):

Let base be the result of calling GetBase(V) (which must be an environment record).
If IsUnresolvableReference(V), throw a ReferenceError exception. (Note: this is unconditional.)
If base is a declarative environment record, then: a. If the binding for name is an uninitialized immutable binding, then: 1. If strict is true, then throw a ReferenceError exception. 2. Else, return undefined. b. Return the value currently bound to name in base. (Note: the value must exist, because IsUnresolvableReference() checks that it does.)
Else base must be an object environment record and: a. Let bindings be the bindings object for base. b. Let value be the result of calling the [[HasProperty]] internal method of bindings, passing name as the property name. c. If value is false, then: 1. If strict is true, then throw a ReferenceError exception. 2. Else, return undefined. d. Return the result of calling the [[Get]] internal method of bindings, passing name for the argument. (Note: this may invoke an accessor.)

Reworking a bit to eliminate duplication of ReferenceError throwing, and cleaning up:

Let base be the result of calling GetBase(V) (which must be an environment record).
If IsUnresolvableReference(V), throw a ReferenceError exception. (Note: this is unconditional.)
If base is a declarative environment record, then: a. If the binding for name is an uninitialized immutable binding, then goto NOTFOUND. b. Return the value currently bound to name in base. (Note: the value must exist, because IsUnresolvableReference() checks that it does.)
Else base must be an object environment record and: a. Let bindings be the bindings object for base. b. If the result of calling the [[HasProperty]] internal method of bindings, passing name as the property name, is false, then goto NOTFOUND. c. Return the result of calling the [[Get]] internal method of bindings, passing name for the argument. (Note: this may invoke an accessor.)
NOTFOUND: a. If strict is true, then throw a ReferenceError exception. b. Else, return undefined.

Notes:

Step 3.a: uninitialized immutable bindings don't occur when running user code, they only exist temporarily in the specification algorithms.
Step 4.c: it is important to note that getting a value from an object environment record accesses a user visible property, and may lead to an accessor call. The accessor can have arbitrary side effects, such as:
- Modifying arbitrary objects, even the binding object itself.
- Causing a garbage collection, and resizing and reallocation of any object's property allocation or any valstack. This may invalidate any existing duk_tval pointers to such structures (but not any "plain" heap object pointers, such as pointers to strings and objects).

Using GETIDREF

Arguments are environment record env, and identifier name name. The return value is a pair (value, this_binding).

Let res be the result of calling GETIDREF with the arguments env, name, and parents set to true.
If res.result is false, throw a ReferenceError. (Note: this is unconditional.)
If res.value is not NULL (identifier bound to a register in a declarative environment record) then: a. Return res.value and undefined.
Else res.holder must not be NULL (identifier bound to a declarative environment record or an object environment record target object): a. Let this be ref.this_binding. b. Let val be the result of calling [[Get]] on res.holder with the property name name. c. Return val and this.

Notes:

In step 4, note that the [[Get]] call may invoke a getter and may thus have an arbitrary number of side effects, including resizing of the property allocation of any object and any valstack. Any existing duk_tval pointers may be invalidated. This is why step 4.a should conceptually happen first.

Handling of `typeof` for an unresolvable identifier

The typeof operator needs slightly different behavior to the above algorithm for unresolvable references. Instead of throwing a ReferenceError typeof returns undefined for an unresolvable reference.

Another alternative would be to use HASVAR first and then (depending on the result) use GETVAR.

PUTVAR: write identifier value

GetIdentifierReference
PutValue

Note: the E5 specification prohibits a binding or assignment to an identifier named eval or arguments in strict mode. This is actually prevented during compilation, and causes a compile time SyntaxError:

E5 Section 11.13: single or compound assignment
E5 Section 12.2.1: variable or function declaration in a function body
E5 Section 13.1: function argument name
E5 Section 12.14.1: catch clause variable name

As a result, there is no need to check for this at run time when assigning values to variables (either in actual program code, or in bytecode prologue initializing function bindings). The implementation does assert for this condition though.

Let's look at PutValue first.

PutValue

PutValue simplifies to (here, V is the Reference and W is the value):

Let base be the result of calling GetBase(V) (which must be an environment record).
If IsUnresolvableReference(V), then: a. If IsStrictReference(V) is true then throw a ReferenceError exception. b. Call the [[Put]] internal method of the global object, passing GetReferencedName(V) for the property name, W for the value, and false for the Throw flag.
Call the SetMutableBinding (10.2.1) concrete method of base, passing GetReferencedName(V), W, and IsStrictReference(V) as arguments.
Return.

Inlining the SetMutableBinding calls (E5 Sections 10.2.1.1.3 and 10.2.1.2.3):

Let base be the result of calling GetBase(V) (which must be an environment record).
If IsUnresolvableReference(V), then: a. If IsStrictReference(V) is true then throw a ReferenceError exception. b. Call the [[Put]] internal method of the global object, passing GetReferencedName(V) for the property name, W for the value, and false for the Throw flag.
If base is a declarative environment record, then: a. If the binding for GetReferencedName(V) in base is a mutable binding, change its bound value to W. b. Else this must be an attempt to change the value of an immutable binding so throw a TypeError exception.
Else base must be an object environment record and: a. Let bindings be the binding object for base. b. Call the [[Put]] internal method of bindings with arguments GetReferencedName(V), W, and IsStrictReference(V). (Note: this may invoke an accessor.)
Return.

Notes:

Step 4.c may have a wide variety of side effects including resizing any object property allocation or valstack.

Using GETIDREF

Arguments are environment record env, identifier name name, new identifier value val, and a strict flag indicating whether the code executing a PUTVAR is strict.

Let res be the result of calling GETIDREF with the arguments env, name, and parents set to true.
If res.result is false: a. If strict is true, throw a ReferenceError. b. Call the [[Put]] internal method of the global object, passing name for the property name, val for the value, and false for the Throw flag.
If res.value is not NULL (identifier bound to a register, or to a property in a declarative environment record) and res.attrs indicates value is writable, then: a. Write val to the target of the pointer res.value. (Identifier bound to a register in a declarative environment record.) b. Return.
Else res.holder must not be NULL. Identifier is bound to a declarative environment record (an immutable binding) or an object environment record target object: a. Call the [[Put]] internal method of res.holder with arguments name, val, and strict. (Note: this may invoke an accessor.) b. Return.

Notes:

In step 4, note that the [[Put]] call may invoke a setter and may thus have an arbitrary number of side effects, including resizing of the property allocation of any object and any valstack. Any existing duk_tval pointers may be invalidated.

DELVAR: delete identifier

GetIdentifierReference
delete operator applied to an identifier (not a property)
DeleteBinding

The deletion process locates the nearest declaration and deletes that (if possible). There may be on outer declaration which is still in effect. For instance (Rhino):

js> var a = 10;
js> function f() {
...     eval("var a = 20; print(a); " +
...          "print(delete a); print(a); " +
...          "print(delete a)"); };
js> f()
20
true
10
false

The innermost binding is an established by eval into an empty declarative environment of the function. The declaration succeeds and creates a deletable, mutable binding, which is then printed and successfully deleted. The global binding is still visible, but it is a non-deletable, mutable binding, so the delete fails. Multiple levels of deletable bindings for the same identifier are thus possible, and delete will always try to delete the one that is currently visible.

The delete operator

The delete operator is defined in E5 Section 11.4.1. A few notes:

In non-strict mode, deletion of an unresolvable identifier succeeds silently (step 3.b), e.g.:
```
function f() { return delete foo; }
print(f());  // prints true
```

In non-strict mode, deletion of a resolvable but undeletable binding succeeds with false:

// 'x' is a non-deletable mutable binding
function f() {
    var x = 1;
    print(delete x);
}

// -> false
f();

In non-strict mode, deletion of a resolvable and deletable binding succeeds with true:

foo = 1;  // establishes 'foo' into global object

// -> {"value":1,"writable":true,
//     "enumerable":true,"configurable":true}
print(JSON.stringify(this, 'foo'));

// -> true
print(delete foo);

// -> empty
print(JSON.stringify(this, 'foo'));

In strict mode, any attempt to delete an identifier (resolvable or not) is always a compile time SyntaxError, see steps 3.a and 5.a.

Example 1:

// SyntaxError (compile time)
function f() {
    'use strict';
    delete foo;  // unresolvable
}

Example 2:

// SyntaxError (compile time)
foo = 1;
function f() {
    'use strict';
    delete foo;  // resolves, still a SyntaxError
}

Example 3 (applies even to object bindings):

foo = { bar: 1};

with (foo) {
    var f = function() {
        'use strict';
        delete bar;  // resolves, still a SyntaxError
    }
}

Note that all SyntaxError exceptions must be thrown at compile time (E5 Section 16). So, any run time attempts to delete identifiers with a DELVAR operation must happen from non-strict code.

With this in mind, the run time part of delete operator (i.e., the DELVAR primitive) only executes in non-strict code, and for a reference ref becomes:

If IsUnresolvableReference(ref) then return true.
Else ref is a reference to an environment record binding; let bindings be GetBase(ref).
Return the result of calling the DeleteBinding concrete method of bindings, providing GetReferencedName(ref) as the argument.

Inlining the concrete DeleteBinding algorithms (E5 Sections 10.2.1.1.5 and 10.2.1.2.5), and renaming ref to V and bindings to base to match the GETVAR and PUTVAR algorithms:

If IsUnresolvableReference(V) then return true.
Else V is a reference to an environment record binding; let base be GetBase(V).
If base is a declarative environment record, then: a. If base does not have a binding for the name GetReferencedName(V), return true. b. If the binding for GetReferencedName(V) in base cannot be deleted, return false. c. Remove the binding for GetReferencedName(V) from base. d. Return true.
Else base must be an object environment record and: a. Let bindings be the binding (target) object for base. b. Return the result of calling [[Delete]] internal method of bindings, passing GetReferencedName(V) and false arguments.

Notes:

In step 4.b: [[Delete]] returns true if the own property either does not exist, or the property exists and is deletable. false is only returned if a non-configurable own property exists. This matches the behavior for identifiers in declarative environment records.

Using GETIDREF

Arguments are environment record env, identifier name name, new identifier value val, and a strict flag indicating whether the code executing a DELVAR is strict (the strict flag is always false, though).

Let res be the result of calling GETIDREF with the arguments env, name, and parents set to true.
If res.result is false: a. Return true.
If res.value is not NULL (identifier bound to a register, or a property in a declarative environment record) and res.attrs indicates value is non-configurable, then: a. Return false. (Note: register-bound identifiers are not deletable.)
Else res.holder must not be NULL (identifier bound to a declarative environment record or an object environment record target object): a. Call the [[Delete]] internal method of res.holder with arguments name and false. b. Return.

Notes:

The compiler should never emit a DELVAR for strict code, and the bytecode executor should refuse to execute such an instruction for strict code. There is no explicit check in the algorithm.
Step 4.a covers two cases:
1. An identifier bound to a declarative environment record. In this case res.holder is the internal environment record, and the property delete follows normal [[Delete]] behavior. In particular, [[Delete]] only returns false if the property exists and is not configurable. Otherwise [[Delete]] returns true. This matches the desired behavior for declarative environment records (see the abstract, inlined version of the algorithm).
2. An identifier bound to an object environment record. In this case res.holder is the target object, and the [[Delete]] call is the desired behavior.

DECLVAR: declare identifier

Background

Bindings are created with CreateMutableBinding and CreateImmutableBinding in E5 Section 10.5 (Declaration Binding Instantiation) and Section 12.14 (catch clause).

CreateMutableBinding and CreateImmutableBinding both assume that they are never called if the binding has already been declared. The algorithms establishing new bindings carefully use HasBinding to avoid duplicate declaration attempts (see "declaration binding instantiation" for instance).

Declarations always go to a specified environment record; its outer environment (parent) is not checked or affected. Thus, a variable name can be re-declared if it exists in an outer context, e.g.:

var a = 10;
function f() {
  // new declarative environment
  var a = 20;
}

More specifically, new bindings are created in the following places (the method of preventing a duplicate declaration is in parentheses):

Section 10.5 step 4.d.iii - 4.d.iv: argument binding (checks for existing binding)
Section 10.5 step 5.c - 5.d: function declaration (checks for existing binding; special handling for re-declarations of global functions added in E5.1)
Section 10.5 step 6, 7.b.i, and 7.c.i: arguments binding (checks for existing binding)
Section 10.5 step 8.b - 8.c: variable declaration (checks for existing binding)
Section 12.14 catch step 4: catch variable binding (new environment, no need to check)

The DECLVAR algorithm can ignore attempts to re-declare a variable, except that the re-declaration of global functions has special handling in E5.1.

Note that unlike GETVAR, PUTVAR, and DELVAR, DECLVAR has no "baseline" algorithm in the E5 specification. Rather, it is a primitive which is useful internally, and needs to match the scattered variable declaration needs identified above.

Also note that all non-register-bound identifiers are stored as object properties (either of an internal or an external object). Hence, DECLVAR ultimately adds or updates a property of some holder object.

Algorithm

Inputs:

environment record env
variable name name
initial value val
property attributes attrs, which allow the caller to control whether the binding is deletable ([[Configurable]] attribute) and mutable ([[Writable]] attribute)
flag is_func_decl, which indicates whether the binding being declared is a function declaration; this has no other effect than to control the special global function re-declaration behavior of E5.1

Outputs:

none

Algorithm:

Let res be the result of calling GETIDREF with the arguments env, name, and parents set to false.
If res.result is true (already declared): a. If is_func_decl is false or env is not the global object environment record, return (ignore re-declaration). Else is_func_decl is true and env is the global object environment record, and E5.1 special behavior is needed. b. Let holder be ref.holder; this must be the global object, which must hold an own property called name. This is the case because the global object has a null internal prototype. c. Let X be the property descriptor for name in holder. d. If X.[[Configurable]] is false: 1. If X is an accessor property, throw a TypeError. 2. If X.[[Writable]] is false or X.[[Enumerable]] is false, throw a TypeError. 3. Set attrs to the current property attributes of X. (Note: in effect, don't update X attributes; we know it is writable, enumerable, and non-configurable.) e. Update the property name of holder to be a data property with the value val, and attributes set to attrs. f. Return.
Let holder be the property holder object of env (this is env itself for a declarative environment, and the target (bindings) object for an object environment).
Define a new property name to holder with property attributes attrs. Note that this may fail if holder is not extensible; this can only happen for object environment records, as declarative environment records are never non-extensible.
Return.

Notes:

The concrete implementation has to deal with the fact that env creation for an activation may be delayed. So, the environment needs to be created on-the-fly if it doesn't exist yet.
Step 2 inlines yet another version of [[DefineOwnProperty]].
If a function is redeclared, it must have its binding value updated.

Fast path

As a general rule, the compiler uses static analysis in an attempt to allocate variables to registers and convert variable accesses to direct register read/write operations. Only a minority of variable accesses thus happen using the slow identifier lookup described above.

If the compiler can determine that all variable accesses can be handled this way (and no dynamic accesses are possible e.g. through a direct eval call), the control information does not need to be recorded in the compiled function at all. However, if the compiler cannot rule out the possibility of arbitrary dynamic variable lookups, the control information needs to be stored.

Future work

A declarative environment record now records _Callee to get access to its _Varmap property. Instead, the record could store a _Varmap reference directly, which would drop one step from lookup of a register mapped variable. Also, if the function itself is freed, only the varmap needs to survive in the heap. The downside would be that there would be no access to function metadata, should that be useful (e.g. for debugging).

[^1]: See the following E5 sections:

-   Section 10.5 steps 2, 5.d, 8.c (eval code)
-   Section 10.5 step 7.b (`arguments` binding)
-   Section 13 (function expression)

[^2]: See the following E5 sections:

-   Section 10.5, steps 7.b.i and 7.b.ii (`arguments` for a strict
    function)
-   Section 13, named function expression algorithm, steps 3-5
    (named function expressions)

[^3]: See the following E5 sections:

-   Sections 10.2.1.1.6, 10.2.1.2.6: when implicit `this` is
    provided
-   Section 11.2.3 step 6.b: function call handling
-   Sections 10.2.1.2, 12.10: object environment records which have
    `provideThis` set (`with` statement only)

[^4]: See the following E5 sections:

-   Section 10.4.2, step 3: strict eval code
-   Section 10.4.3: function call
-   Section 12.14: `catch` clause, step 3
-   Section 13.1: function expression

[^5]: See the following E5 sections:

-   Section 10.2.3: global environment record
-   Section 12.10: `with` statement, step 4

Identifier algorithms ​

Introduction ​

Specification notes ​

Key parts ​

Key concepts ​

Unsorted notes ​

Establishing an execution context (= calling) ​

Environment record implementation ​

Specification ​

Internal representation ​

Declarative environment record ​

Object environment record ​

Example internal environment record objects ​

Functions and environment records ​

Preliminary work ​

GETIDREF ​

Not found false n/a n/a n/a n/a n/a ​

Original algorithm ​

Eliminating recursion ​

Concrete implementation ​

HASVAR: check existence of identifier ​

GETVAR: read identifier value ​

GetValue ​

Using GETIDREF ​

Handling of typeof for an unresolvable identifier ​

PUTVAR: write identifier value ​

PutValue ​

Using GETIDREF ​

DELVAR: delete identifier ​

The delete operator ​

Using GETIDREF ​

DECLVAR: declare identifier ​

Background ​

Algorithm ​

Fast path ​

Future work ​

Identifier algorithms

Introduction

Specification notes

Key parts

Key concepts

Unsorted notes

Establishing an execution context (= calling)

Environment record implementation

Specification

Internal representation

Declarative environment record

Object environment record

Example internal environment record objects

Functions and environment records

Preliminary work

GETIDREF

Not found false n/a n/a n/a n/a n/a

Original algorithm

Eliminating recursion

Concrete implementation

HASVAR: check existence of identifier

GETVAR: read identifier value

GetValue

Using GETIDREF

Handling of `typeof` for an unresolvable identifier

PUTVAR: write identifier value

PutValue

Using GETIDREF

DELVAR: delete identifier

The delete operator

Using GETIDREF

DECLVAR: declare identifier

Background

Algorithm

Fast path

Future work