Identifier algorithms
Introduction
This document discusses the internal representation of lexical environments and the exposed algorithms for identifier handling, taking into account the internal representation.
The goal of the compiler is to convert almost all identifier related accesses to register accesses and to avoid the explicit creation of environment record objects whenever possible. For full compliance, a "slow path" is needed, and is provided by the environment record objects and the identifier algorithms described here. The slow path is optimized for a compact code footprint, not speed.
Identifier handling encompasses a large part of the E5 specification and the implementation:
- Variable and function declarations in global code, eval code, and function code (compile time)
- Function argument list declarations (compile time)
- Establishing execution contexts (calls to global code, eval code, and function code; run time)
- Establishing other lexical environments (
catch
blocks,with
blocks; run time) - Reading identifier values and
typeof
(run time) - Assigning values to identifiers (run time)
- Deleting identifier bindings using
delete
(run time) - Special argument name bindings for the
arguments
object (run time)
These have many complications from an implementation point of view:
- Many things are processed during compile time (and are, in fact, required to be processed compile time)
- Identifiers may be bound to compiled function registers (preferred) or to explicit environment record objects (fallback)
- Register-bound identifiers need to be moved to an explicit environment record object if they may be referred to later (closure)
- Establishing execution contexts is rather tricky and varies between call types
Specification notes
Key parts
Key parts of ECMAScript E5 discussion identifier handling include Section 10 (Executable Code and Execution Contexts):
- Sections 10.2.2.1 and 10.3.1 cover identifier lookup (
GetIdentifierReference
) - Sections 10.4 and 10.5 cover execution context initialization for global code, eval code, and function calls
- Section 10.6 covers the
arguments
object initialization (for function calls)
Other relevant sections include:
- Section 11.4.1: the
delete
operator, which is used for variable deletion (and property deletion) - Section 11.4.3: the
typeof
operator - Section 12.10: the
with
statement - Section 12.14: the
try
statement, special handling ofcatch
- Section 13: function declarations and function expressions
- Section 15.1.2.1:
eval (x)
, defines the concept of a direct call to eval
Key concepts
Lexical environment
: Consists of an environment record and a reference to an outer lexical environment.
The current implementation makes no distinction between a lexical
environment and the environment record contained in it: they are
represented by the same object. Hence, the term \"environment
record\" is used loosely below to refer to the lexical environment
and/or its environment record.
The *outer lexical environment* is also referred to as the
environment record\'s *parent* below.
Environment record
: An environment record is either a declarative environment record or an object environment record.
Declarative environment records hold variable bindings in an
internal map not accessible from user code.
Object environment records hold variable bindings in an ECMAScript
object. They are used for handling global program execution for
`with` statements.
Execution context
: Consists of (note the confusing names and terminology here):
1. `LexicalEnvironment`, a lexical environment (an environment
record) for processing variable lookups,
2. `VariableEnvironment`, a lexical environment (an environment
record) for processing variable and function declarations, and
3. a `this` binding
The two lexical environments are usually the same, except for:
- `with`, which establishes an object environment record for
variable lookups, but keeps the previous one for declarations
- `catch`, which establishes a single-variable declarative
environment record for the variable containing the error
Note the confusing terminology, especially the use of \"lexical
environment\" as a general term and `LexicalEnvironment` as a
component of an execution context. Below, we refer to these loosely
as \"lexical environment\" (for `LexicalEnvironment`) and \"variable
environment\" (for `VariableEnvironment`).
Immutable binding
: A binding whose value cannot be changed after initialization. An immutable binding cannot be deleted.
Mutable binding
: A binding whose value can be changed. A mutable binding may be deletable or non-deletable.
Implicit this
value
: A binding may be associated with an implicit this
value. If the value is not provided, undefined
is used in its place.
Declarative environment records never provide an implicit
`this value`. Object environment records for which the `provideThis`
internal flag is set provide an implicit `this` value.
Unsorted notes
Almost all bindings in the E5 specification are non-deletable mutable bindings, except:[^1]
- declaration in eval: function and variable declarations in eval code are deletable mutable bindings
- "arguments" binding: the
arguments
binding for strict mode function is an immutable binding - function name in a function expression: in a function expression
function f(args) { body }
, the identifierf
is an immutable binding for the function body; this is not true of function declarations
There are no practical cases where an immutable binding would be accessed in an uninitialized state:[^2]
- Immutable bindings are created with
CreateImmutableBinding
and initialized withInitializeImmutableBinding
- For
arguments
, creation and initialization are consecutive, there is little chance of failure (except "out of memory" like conditions); even in failure cases, the intermediate uninitialized binding is not accessible to program code - For function name in a function expression, creation and initialization are not consecutive but the same applies as for
arguments
: an uninitialized binding is never exposed to program code (the function expression is not "returned" until its initialization is complete) - As a result, the implementation does not need to handle lookups for uninitialized immutable bindings, implied for declarative environment records in E5 Section 10.2.1.1.4, step 3
Implicit this
value is relevant only for:[^3]
- Object environment records initialized in a
with
statement. - Function call using an identifier reference: a possible implicit
this
value is used for thethis
binding (similarly to how a function call using a property reference bindsthis
to the property base object).
New declarative environment records are created for:[^4]
- Entering strict eval code (with a direct eval call)
- Entering function code
- Entering
catch
clause - Evaluating a named function expression
New object environment records are created for:[^5]
- Entering
with
statement body - Implicitly for the global object environment record
Establishing an execution context (= calling)
Different cases for calling:
+-------------+-----------+-----------+------------+---------------+ | Call type | Lexical | Variable | This | Notes | | | en | en | binding | | | | vironment | vironment | | | +=============+===========+===========+============+===============+ | Executing | Global | Global | Global | | | strict | en | en | object | | | global code | vironment | vironment | | | | | (object | (object | | | | | env | env | | | | | ironment) | ironment) | | | +-------------+-----------+-----------+------------+---------------+ | Executing | Global | Global | Global | | | non-strict | en | en | object | | | global code | vironment | vironment | | | | | (object | (object | | | | | env | env | | | | | ironment) | ironment) | | | +-------------+-----------+-----------+------------+---------------+ | Executing | Global | Global | Global | Non-direct | | strict | en | en | object | eval code, or | | non-direct | vironment | vironment | | eval code | | eval code | (object | (object | | without a | | | env | env | | calling | | | ironment) | ironment) | | context (when | | | | | | in practice?) | +-------------+-----------+-----------+------------+---------------+ | Executing | Global | Global | Global | Non-direct | | non-strict | en | en | object | eval code, or | | non-direct | vironment | vironment | | eval code | | eval code | (object | (object | | without a | | | env | env | | calling | | | ironment) | ironment) | | context (when | | | | | | in practice?) | +-------------+-----------+-----------+------------+---------------+ | Executing | New | Same as | Use | The parent of | | strict | de | lexical | calling | the new | | direct eval | clarative | en | context's | declarative | | code | en | vironment | this
| environment | | | vironment | | binding | record is the | | | | | | calling | | | | | | context's | | | | | | lexical | | | | | | environment. | +-------------+-----------+-----------+------------+---------------+ | Executing | Use | Use | Use | | | non-strict | calling | calling | calling | | | direct eval | c | c | context's | | | code | ontext's | ontext's | this
| | | | lexical | variable | binding | | | | en | en | | | | | vironment | vironment | | | +-------------+-----------+-----------+------------+---------------+ | Executing | New | Same as | Coerced | The parent of | | non-strict | de | lexical | thisArg
: | the new | | function | clarative | en | | declarative | | code | en | vironment | If null
| environment | | | vironment | | or | record is the | | | | | u | called | | | | | ndefined
, | function's | | | | | use global | [[Scope]]
| | | | | object. | internal | | | | | | property (see | | | | | If | below). | | | | | n | | | | | | on-Object, | thisArg
is | | | | | use | supplied by | | | | | ToObject
| caller in the | | | | | ( | function call | | | | | thisArg)
. | algorithm. | | | | | | | | | | | If Object, | | | | | | use | | | | | | thisArg
| | | | | | directly. | | +-------------+-----------+-----------+------------+---------------+ | Executing | New | Same as | thisArg
| The parent of | | strict | de | lexical | without | the new | | function | clarative | en | any | declarative | | code | en | vironment | coercion | environment | | | vironment | | | record is the | | | | | | called | | | | | | function's | | | | | | [[Scope]]
| | | | | | internal | | | | | | property (see | | | | | | below). | | | | | | | | | | | | thisArg
is | | | | | | supplied by | | | | | | caller in the | | | | | | function call | | | | | | algorithm. | +-------------+-----------+-----------+------------+---------------+
Function [[Scope]]
initialization is described in E5 Section 13:
- For function declarations, the
[[Scope]]
internal property is initialized to the variable environment of the execution context containing the declaration. This is the same as initializing[[Scope]]
to the lexical environment of the containing context, because function declarations are only allowed at the top level. If function declarations are allowed in other places (than the top level), it makes more sense to initialize[[Scope]]
to the current lexical environment. - For anonymous function expressions, the
[[Scope]]
internal property is initialized to the lexical environment of the execution context containing the expression. - For named function expressions, a new declarative environment is created whose parent is the lexical environment of the execution context containing the expression. The new environment will be initialized with an immutable binding for the name of the function, and will then be used as the
[[Scope]]
of the function object.
Environment record implementation
Specification
Environment records are described in E5 Section 10.2.1. They define how identifiers are bound to values in an ECMAScript execution context. Bindings can be established in several ways:
- variable and function declarations in the global scope
- variable and function declarations in eval code
- variable and function declarations in functions
catch
clauseswith
statements
An environment record maintains bindings to a certain set of identifiers, and may have (at most one) parent record (called outer environment reference) with further bindings.
There are two types of environment records:
- declarative environment record: binds identifiers to values directly
- object environment record: refers to an external ECMAScript object which contains the variable bindings
Declarative environment records are the ordinary way of binding identifiers. The underlying map of identifiers to values is only visible to the program through variable accesses. For object environment records, the identifiers are bound through an ordinary ECMAScript object (referred to by the object environment record); object environment records are used e.g. in the global propgram scope and in the with
statement.
An ECMAScript E5 execution environment maintains two environment record references (E5 Section 10.3):
- A variable environment refers to an environment record which is used for function and variable declarations
- A lexical environment refers to an environment record which is used for variable lookups (including deletes)
Very often these two refer to the same environment record. A new, nested lexical environment can be created e.g. through a with
, a catch
, a nested function, etc.
Internal representation
The current implementation makes no distinction between a lexical environment and the environment record it refers to. Instead, a combination of a lexical environment, its environment record, and a possible outer lexical environment reference is implemented as a single internal object which is not directly visible to program code.
An outer lexical environment reference (the "parent record") is identified using the internal prototype field of the object. This does not mean that ordinary property lookups (with automatic prototype walking) are used for variable lookups: the internal prototype is simply a convenient place to maintain the parent relationship, having an easily accessible slot, being already visible to garbage collection etc.
Declarative environment record
A declarative environment record object maintains identifier mappings in two ways:
- directly as properties of the environment record
- by identifying an active activation record ("open scope") and a set of identifiers bound to the registers of that activation record
In the latter case, the environment record identifies the thread, the function, and the "register base" of the activation record in the thread's value stack. These allow an identifier to be mapped to a value stack entry as follows:
- The function's variable map (if it exists) is consulted to find the identifiers register number, relative to the bottom of the activation. (Of course, the identifier might not be mapped.)
- The "register base" of the environment record identifies the absolute value stack index of the activation record's frame bottom, and is added to the relative register number to get an absolute value stack index.
Note that the actual activation record does not need to be identified, we just need the register base to resolve absolute value stack index of a variable. However, this approach does not allow validation of the resulting value stack index (e.g. to verify that it is indeed inside the activation record's frame).
When a declarative environment is "closed", identifiers bound to activation registers are copied to the internal environment record object as plain properties (with the help of the callee's _Varmap
). The internal control fields are updated to indicate that the environment is closed, so that later lookups can skip open scope lookups which would now reference stale value stack indices.
The variables mapped as properties have their attributes set as follows:
[[Enumerable]]
: does not matter as the properties are not visible to the program (currently set totrue
)[[Writable]]
: set totrue
for mutable bindings,false
for immutable bindings[[Configurable]]
: set totrue
for mutable bindings which are also deletable,false
for non-deletable mutable bindings and immutable bindings
Register-bound identifiers are assumed to be non-deletable mutable bindings: register bindings cannot be easily deleted (the bytecode may refer to them directly), and protecting register bound identifiers from being modified would require some control information we don't have. As a result, other types of bindings cannot be mapped to registers (i.e. declarations inside eval, arguments
binding, and the name binding of a named function expression) and require an explicit environment record object.
Object environment record
An object environment record refers to an external ECMAScript object (visible to the program) "backing" the identifier bindings. The target object is identified with an internal control property.
The internal object is initialized with:
- Object class set to
DUK_HOBJECT_CLASS_OBJENV
- Internal prototype referring to outer environment record
- Internal control property:
_Target
Identifier lookups proceed to the _Target
object while the parent environment record is identified by the prototype of the environment record (not of _Target
).
Example internal environment record objects
Let's consider an environment with:
- the global object (outermost)
- a declarative environment (of a function, scope is open)
- an object environment (e.g. of a
with
clause)
This could happen in this example:
// global scope: a variable lookup here would use 'record1'
var with_object = { "xyz": "zy" };
function func() {
var foo = 'my foo';
var bar = 'my bar';
// in the example below, we assume this variable was not mapped
// to a register for some reason (wouldn't ordinarily happen)
var quux = "a non-register binding";
// variable lookup here would use 'record2'
with (with_object) {
// variable lookup here would use 'record3'
}
}
The objects could be roughly as follows; leading underscore indicates an internal value not visible to the program, double leading underscore indicates internal properties stored directly in internal structures outside the property table, e.g. __prototype
denotes internal prototype:
global_object = {
"NaN": NaN,
"Infinity": Infinity,
"undefined": undefined,
"eval": [function],
"parseInt": [function],
...
}
func = {
// "foo" maps to reg 0, "bar" to reg 1, "quux" not register mapped
"_Varmap": { "foo": 0, "bar": 1 },
...
}
with_object = {
"xyz": "zy"
}
record1 = {
// Flag DUK_HOBJECT_CLASS_OBJENV set
"__prototype": null,
"__target": global_object, // identifies binding target
"__has_this": false
}
record2 = {
// Flag DUK_HOBJECT_CLASS_DECENV set
"__prototype": record1,
"__varmap": varmap, // _Varmap of target function (name-to-reg)
"__thread": thread, // identifies valstack
"__regbase": 100, // identifies valstack base for regs
"quux": "a non-register binding"
// var "foo" resides in value stack absolute index 100 + 0 = 100,
// var "bar" in absolute index 100 + 1 = 101
}
record3 = {
// Flag DUK_HOBJECT_CLASS_OBJENV set
"__prototype": record2,
"__target": with_object,
"__has_this": true
}
Once again, the compiler strives to avoid creating explicit environment records whenever possible. In the example above the compiler won't be successful: the object environment required by with
also causes the declarative environment (record2) to be created.
Functions and environment records
A function object (duk_hcompfunc
) records the following conceptual variable access related information:
- A variable map, which maps an identifier to a register number. Ideally all function variables / functions are register mapped, but this is not always possible.
- Control data for initializing the lexical and variable environments of new invocations.
There are separate function template objects (which are essentially functions without an outer environment) and concrete function instance objects. The detailed properties vary a bit between the two.
More concretely:
- The
DUK_HOBJECT_FLAG_NEWENV
object level flag, and thelex_env
andvar_env
fields ofduk_hcompfunc
control activation record lexical and variable environment initialization as described below. - The internal property
_Varmap
contains a mapping from an identifier name to a register number relative to the activation record frame bottom. - The internal property
_Formals
contains a list of formal argument names. - Template function objects, used for creating concrete function instances, use
DUK_HOBJECT_FLAG_NAMEBINDING
flag to indicate that the template represents a named function expression. For such functions, the function name (stored inname
) needs to be bound in an environment record just outside a function activation's environment record. - The compiler attempts to drop any fields not required from compiled objects. In many common cases (even when dynamic variables accesses cannot be ruled out) no control fields are required.
The detailed handling of these is documented as part of the source for:
- Creating function templates in the compiler
- Instantiating function instances from templates (closures)
Notes:
- Environment record initialization is done only when (if) it is actually needed (e.g. for a function declaration). It is not created unnecessarily when a function is called.
- The
DUK_HOBJECT_FLAG_NEWENV
is set for ordinary functions, which always get a new environment record for variable declaration and lookup. The flag is cleared for global code and eval code; or rather functions compiled from global code and eval code. - Unlike Duktape, ECMAScript does not separate the compilation and execution of global code and eval code. Hence the handling is seemingly a bit different although the outcome should be the same.
Preliminary work
GETIDREF
The GetIdentifierReference
specification algorithm walks the chain of environment records looking for a matching identifier binding. It returns a "reference" value which can act as both a left-hand-side and a right-hand-side value. The reference identifies (1) an environment record, (2) an identifier name, and (3) a strictness flag.
When using the reference as a left-hand-side value, the environment record is written to. For declarative records, this is a write to internal object (an internal duk_hobject
in Duktape) or a value stack entry (a duk_tval
for register bound identifiers). For object environment records, this is a property write (a [[Put]]
call) to a user visible object, possibly invoking a setter call.
When using the reference as a right-hand-side value, the environment record is read from. For declarative records, this is a read from an internal object or a value stack entry (for register bound identifiers). For object environment records, this is a property read (a [[Get]]
call), possibly invoking a getter call.
Note that the storage location of any identifier value is conceptually an external or an internal object. However, the concrete implementation is a bit more complicated: an identifier value may be either stored in an object or in an activation's register file, i.e., a particular entry in the value stack of some thread. This needs to be taken into account when representing a "reference" type internally.
A close analog of GetIdentifierReference
is currently implemented as a helper function (referred to as GETIDREF in this document). GETIDREF returns multiple values which the caller can use to implement the actual operation (GETVAR
, PUTVAR
, DELVAR
, HASVAR
). The values returned include:
result
:true
if binding found,false
otherwise. Iffalse
, other values are undefined.holder
: a pointer to the "holder" object, the internal or external object storing the binding value. For register-bound identifiers, this is NULL.value
: anduk_tval
pointer to the current value for register bindings, points to a value stored in a value stack. For declarative environment records,value
points to theduk_tval
property entry of the internal object. For object environment records, this is NULL.attrs
: property attributes ofvalue
(ifvalue
is NULL, this field is not needed). Attributes are needed in PUTVAR: before updating a value in-place using a directduk_tval
write, we need to know that the value is writable. Register bound variables are always writable (mutable), denoted "W" in the table below.this_binding
: anduk_tval
pointer to thethis
binding related to the reference, points to a value stored in an object.env
: a pointer to the lexical environment record (anduk_hobject
) where the binding was found. For register-bound identifiers, this is NULL.
The following table clarifies the return values in different cases:
Case result holder value attrs this_binding env
Delayed declarative true NULL points to W NULL NULL environment, bound valstack
in register of
current activation
Declarative true NULL points to W NULL env environment, bound valstack
in register of open
environment record
Declarative true env points to from NULL env environment, bound prop prop
in (open or closed) storage
environment record
object
Object environment, true target NULL n/a NULL env bound in target
object, no "this
binding"
Object environment, true target NULL n/a target env bound in target
object, has "this
binding"
Not found false n/a n/a n/a n/a n/a
The object environment records created by the with
statement provide a "this binding" (provideThis
is true, see E5 Section 12.10); other object environment records do not. The "this binding" only affects function calls made through bound identifiers, e.g. as in:
var foo = {
bar: function() { print("" + this); },
toString: function() { print("i'm foo"); }
}
with (foo) {
// prints "i'm foo", similar to being called
// like: foo.bar()
bar();
}
Original algorithm
The original GetIdentifierReference
is described in E5 Section 10.2.2.1. The inputs are: lexical environment lex
, identifier string name
, and a strict
flag:
- If
lex
is the valuenull
, then a. Return a value of type Reference whose base value isundefined
, whose referenced name isname
, and whose strict mode flag isstrict
. - Let
envRec
belex
's environment record. - Let
exists
be the result of calling theHasBinding(N)
concrete method ofenvRec
passingname
as the argumentN
. - If
exists
istrue
, then a. Return a value of type Reference whose base value isenvRec
, whose referenced name isname
, and whose strict mode flag isstrict
. - Else a. Let
outer
be the value oflex
's outer environment reference. b. Return the result of callingGetIdentifierReference
passingouter
,name
, andstrict
as arguments.
Notes:
- The algorithm supports the case where the starting lexical environment is
null
, although step 1 is more likely intended to just be the recursion terminator. - The recursion walks the
outer
reference chain, which our implementation handles through internal prototypes of the environment records.
Eliminating recursion
- NEXT: If
lex
is the valuenull
, then: a. Return a value of type Reference whose base value isundefined
, whose referenced name isname
, and whose strict mode flag isstrict
. - Let
envRec
belex
's environment record. - Let
exists
be the result of calling theHasBinding(N)
concrete method ofenvRec
passingname
as the argumentN
. - If
exists
istrue
, then a. Return a value of type Reference whose base value isenvRec
, whose referenced name isname
, and whose strict mode flag isstrict
. - Let
lex
be the value oflex
's outer environment reference. - Goto NEXT.
Concrete implementation
A few notes first:
- The implementation delays the creation of an explicit declarative environment record when possible. In this case the initial
lex
value isNULL
and should be treated like an empty declarative environment record with a certain outer reference, and possibly a set of identifiers bound to registers. To do this, we need a reference to the current activation (act
below). - Some callers require a variant which does not follow the outer environment reference chain. The algorithm incorporates a flag
parents
controlling this (if true, parent chain is followed).
First draft:
- If
lex
isnull
andact
is defined then (delayed declarative environment record): a. Check whethername
is bound to a register ofact
. To do this, the function object needs to be looked up based onact
, and the function metadata be consulted; in particular, the_Varmap
internal property (which maps names to register numbers) is used. b. Ifname
is mapped, return the following: - Result:true
- Value pointer: point to register storage - Attributes: writable - This pointer: NULL - Environment pointer: NULL - Holder pointer: NULL c. Ifparents
isfalse
, goto NOTFOUND. d. Else, letlex
be the outer environment record that a declarative environment record created foract
would have. This is concretetely looked up from the_Lexenv
internal property of the function related toact
. - NEXT: If
lex
is the valuenull
, then goto NOTFOUND. - If
lex
is a declarative environment record, then: a. Iflex
is open (activation registers are still in use): 1. Check whethername
is mapped to a register of the activation related to the environment record. These are concretely looked up using internal properties oflex
. (Note that the related activation may be any function, and even that of another thread.) 2. If so, return the following values (value pointer can always be given, and the caller is always allowed to modify the value in-place, because all register bindings are mutable): - Result:true
- Value pointer: point to register storage - This pointer: NULL - Environment pointer:lex
- Holder pointer: NULL b. Iflex
has a property namedname
, return the following values: - Result:true
- Value pointer: point to storage location of property inlex
- Attributes: fromlex
property (non-writable for immutable bindings, writable for others) - This pointer: NULL - Environment pointer:lex
- Holder pointer:lex
- Else
lex
is an object environment record: a. Lettarget
be the binding object forlex
. (Note: this is always defined, and an object.) b. If the result of calling[[HasProperty]]
fortarget
with the property namename
istrue
: 1. Iflex
has the internal property_This
, setthisBinding
to its value. Else setthisBinding
toNULL
. 2. Return the following values: - Result:true
- Value pointer: NULL - Attributes: arbitrary (use zero) - This pointer:thisBinding
- Environment pointer:lex
- Holder pointer:target
- If
parents
isfalse
, goto NOTFOUND. - Let
lex
be the internal prototype oflex
. - Goto NEXT.
- NOTFOUND: Return the following:
- Result:
false
- Value pointer: NULL
- Attributes: arbitrary (use zero)
- This pointer: NULL
- Environment pointer: NULL
- Holder pointer: NULL
- Result:
HASVAR: check existence of identifier
Unlike e.g. GETVAR, HASVAR does not traverse the environment record outer reference chain. HASVAR is also not really an exposed primitive; ECMAScript code cannot access it directly. It is used internally for function call handling, and can also be used from the C API.
Using GETIDREF:
- Let
res
be the result of callingGETIDREF
with the argumentsenv
,name
, andparents
set tofalse
. - Return the "result" component of
res
.
GETVAR: read identifier value
Conceptual steps:
- Identifier resolution (E5 Section 10.3.1) is used to resolve identifier references.
- Identifier resolution calls
GetIdentifierReference
(E5 Section 10.2.2.1), which returns a Reference type. - A right-hand-side expression "coerces" the Reference to a value using
GetValue
(E5 Section 8.7.1).
In the optimal case, all of these can be resolved at compile time, converting the identifier read into a register lookup. No explicit run-time processing happens in this case.
In other cases the compiler emits a GETVAR
instruction which performs the necessary (slow) steps at run time. The identifier name (a string) is always known at compile time as there is no indirect variable lookup; an eval
call might look like one, but any identifier reference has a string name when compiling the eval
argument string, e.g. as in:
function f() {
return eval("return foo;);
}
The run time part begins with, essentially, GetIdentifierReference
which is given a lexical environment env
, an identifier name name
, and a strict
flag which depends on the function containing the expression.
The GETVAR primitive differs from a plain identifier lookup in that it also returns the "this binding" related to the identifier, if defined.
Let's look at GetValue
first.
GetValue
GetValue simplifies to (here, V
is the Reference):
- Let
base
be the result of callingGetBase(V)
(which must be an environment record). - If
IsUnresolvableReference(V)
, throw aReferenceError
exception. - Return the result of calling the
GetBindingValue
(see 10.2.1) concrete method ofbase
passingGetReferencedName(V)
andIsStrictReference(V)
as arguments.
Inlining the GetBindingValue
calls (E5 Sections 10.2.1.1.4 and 10.2.1.2.4):
- Let
base
be the result of callingGetBase(V)
(which must be an environment record). - If
IsUnresolvableReference(V)
, throw aReferenceError
exception. (Note: this is unconditional.) - If
base
is a declarative environment record, then: a. If the binding forname
is an uninitialized immutable binding, then: 1. Ifstrict
istrue
, then throw aReferenceError
exception. 2. Else, returnundefined
. b. Return the value currently bound toname
inbase
. (Note: the value must exist, becauseIsUnresolvableReference()
checks that it does.) - Else
base
must be an object environment record and: a. Letbindings
be the bindings object forbase
. b. Letvalue
be the result of calling the[[HasProperty]]
internal method ofbindings
, passingname
as the property name. c. Ifvalue
isfalse
, then: 1. Ifstrict
istrue
, then throw aReferenceError
exception. 2. Else, returnundefined
. d. Return the result of calling the[[Get]]
internal method ofbindings
, passingname
for the argument. (Note: this may invoke an accessor.)
Reworking a bit to eliminate duplication of ReferenceError
throwing, and cleaning up:
- Let
base
be the result of callingGetBase(V)
(which must be an environment record). - If
IsUnresolvableReference(V)
, throw aReferenceError
exception. (Note: this is unconditional.) - If
base
is a declarative environment record, then: a. If the binding forname
is an uninitialized immutable binding, then goto NOTFOUND. b. Return the value currently bound toname
inbase
. (Note: the value must exist, becauseIsUnresolvableReference()
checks that it does.) - Else
base
must be an object environment record and: a. Letbindings
be the bindings object forbase
. b. If the result of calling the[[HasProperty]]
internal method ofbindings
, passingname
as the property name, isfalse
, then goto NOTFOUND. c. Return the result of calling the[[Get]]
internal method ofbindings
, passingname
for the argument. (Note: this may invoke an accessor.) - NOTFOUND: a. If
strict
istrue
, then throw aReferenceError
exception. b. Else, returnundefined
.
Notes:
- Step 3.a: uninitialized immutable bindings don't occur when running user code, they only exist temporarily in the specification algorithms.
- Step 4.c: it is important to note that getting a value from an object environment record accesses a user visible property, and may lead to an accessor call. The accessor can have arbitrary side effects, such as:
- Modifying arbitrary objects, even the binding object itself.
- Causing a garbage collection, and resizing and reallocation of any object's property allocation or any valstack. This may invalidate any existing
duk_tval
pointers to such structures (but not any "plain" heap object pointers, such as pointers to strings and objects).
Using GETIDREF
Arguments are environment record env
, and identifier name name
. The return value is a pair (value, this_binding).
- Let
res
be the result of callingGETIDREF
with the argumentsenv
,name
, andparents
set totrue
. - If
res.result
isfalse
, throw aReferenceError
. (Note: this is unconditional.) - If
res.value
is not NULL (identifier bound to a register in a declarative environment record) then: a. Returnres.value
andundefined
. - Else
res.holder
must not be NULL (identifier bound to a declarative environment record or an object environment record target object): a. Letthis
beref.this_binding
. b. Letval
be the result of calling[[Get]]
onres.holder
with the property namename
. c. Returnval
andthis
.
Notes:
- In step 4, note that the
[[Get]]
call may invoke a getter and may thus have an arbitrary number of side effects, including resizing of the property allocation of any object and any valstack. Any existingduk_tval
pointers may be invalidated. This is why step 4.a should conceptually happen first.
Handling of typeof
for an unresolvable identifier
The typeof
operator needs slightly different behavior to the above algorithm for unresolvable references. Instead of throwing a ReferenceError
typeof
returns undefined
for an unresolvable reference.
Another alternative would be to use HASVAR first and then (depending on the result) use GETVAR.
PUTVAR: write identifier value
GetIdentifierReference
PutValue
Note: the E5 specification prohibits a binding or assignment to an identifier named eval
or arguments
in strict mode. This is actually prevented during compilation, and causes a compile time SyntaxError
:
- E5 Section 11.13: single or compound assignment
- E5 Section 12.2.1: variable or function declaration in a function body
- E5 Section 13.1: function argument name
- E5 Section 12.14.1:
catch
clause variable name
As a result, there is no need to check for this at run time when assigning values to variables (either in actual program code, or in bytecode prologue initializing function bindings). The implementation does assert for this condition though.
Let's look at PutValue
first.
PutValue
PutValue simplifies to (here, V
is the Reference and W
is the value):
- Let
base
be the result of callingGetBase(V)
(which must be an environment record). - If
IsUnresolvableReference(V)
, then: a. IfIsStrictReference(V)
istrue
then throw aReferenceError
exception. b. Call the[[Put]]
internal method of the global object, passingGetReferencedName(V)
for the property name,W
for the value, andfalse
for theThrow
flag. - Call the
SetMutableBinding
(10.2.1) concrete method ofbase
, passingGetReferencedName(V)
,W
, andIsStrictReference(V)
as arguments. - Return.
Inlining the SetMutableBinding
calls (E5 Sections 10.2.1.1.3 and 10.2.1.2.3):
- Let
base
be the result of callingGetBase(V)
(which must be an environment record). - If
IsUnresolvableReference(V)
, then: a. IfIsStrictReference(V)
istrue
then throw aReferenceError
exception. b. Call the[[Put]]
internal method of the global object, passingGetReferencedName(V)
for the property name,W
for the value, andfalse
for theThrow
flag. - If
base
is a declarative environment record, then: a. If the binding forGetReferencedName(V)
inbase
is a mutable binding, change its bound value toW
. b. Else this must be an attempt to change the value of an immutable binding so throw aTypeError
exception. - Else
base
must be an object environment record and: a. Letbindings
be the binding object forbase
. b. Call the[[Put]]
internal method ofbindings
with argumentsGetReferencedName(V)
,W
, andIsStrictReference(V)
. (Note: this may invoke an accessor.) - Return.
Notes:
- Step 4.c may have a wide variety of side effects including resizing any object property allocation or valstack.
Using GETIDREF
Arguments are environment record env
, identifier name name
, new identifier value val
, and a strict
flag indicating whether the code executing a PUTVAR is strict.
- Let
res
be the result of callingGETIDREF
with the argumentsenv
,name
, andparents
set totrue
. - If
res.result
isfalse
: a. Ifstrict
istrue
, throw aReferenceError
. b. Call the[[Put]]
internal method of the global object, passingname
for the property name,val
for the value, andfalse
for theThrow
flag. - If
res.value
is not NULL (identifier bound to a register, or to a property in a declarative environment record) andres.attrs
indicates value is writable, then: a. Writeval
to the target of the pointerres.value
. (Identifier bound to a register in a declarative environment record.) b. Return. - Else
res.holder
must not be NULL. Identifier is bound to a declarative environment record (an immutable binding) or an object environment record target object: a. Call the[[Put]]
internal method ofres.holder
with argumentsname
,val
, andstrict
. (Note: this may invoke an accessor.) b. Return.
Notes:
- In step 4, note that the
[[Put]]
call may invoke a setter and may thus have an arbitrary number of side effects, including resizing of the property allocation of any object and any valstack. Any existingduk_tval
pointers may be invalidated.
DELVAR: delete identifier
GetIdentifierReference
delete
operator applied to an identifier (not a property)DeleteBinding
The deletion process locates the nearest declaration and deletes that (if possible). There may be on outer declaration which is still in effect. For instance (Rhino):
js> var a = 10;
js> function f() {
... eval("var a = 20; print(a); " +
... "print(delete a); print(a); " +
... "print(delete a)"); };
js> f()
20
true
10
false
The innermost binding is an established by eval into an empty declarative environment of the function. The declaration succeeds and creates a deletable, mutable binding, which is then printed and successfully deleted. The global binding is still visible, but it is a non-deletable, mutable binding, so the delete fails. Multiple levels of deletable bindings for the same identifier are thus possible, and delete
will always try to delete the one that is currently visible.
The delete operator
The delete
operator is defined in E5 Section 11.4.1. A few notes:
In non-strict mode, deletion of an unresolvable identifier succeeds silently (step 3.b), e.g.:
function f() { return delete foo; } print(f()); // prints true
In non-strict mode, deletion of a resolvable but undeletable binding succeeds with
false
:// 'x' is a non-deletable mutable binding function f() { var x = 1; print(delete x); } // -> false f();
In non-strict mode, deletion of a resolvable and deletable binding succeeds with
true
:foo = 1; // establishes 'foo' into global object // -> {"value":1,"writable":true, // "enumerable":true,"configurable":true} print(JSON.stringify(this, 'foo')); // -> true print(delete foo); // -> empty print(JSON.stringify(this, 'foo'));
In strict mode, any attempt to delete an identifier (resolvable or not) is always a compile time
SyntaxError
, see steps 3.a and 5.a.Example 1:
// SyntaxError (compile time) function f() { 'use strict'; delete foo; // unresolvable }
Example 2:
// SyntaxError (compile time) foo = 1; function f() { 'use strict'; delete foo; // resolves, still a SyntaxError }
Example 3 (applies even to object bindings):
foo = { bar: 1}; with (foo) { var f = function() { 'use strict'; delete bar; // resolves, still a SyntaxError } }
Note that all SyntaxError
exceptions must be thrown at compile time (E5 Section 16). So, any run time attempts to delete identifiers with a DELVAR operation must happen from non-strict code.
With this in mind, the run time part of delete
operator (i.e., the DELVAR primitive) only executes in non-strict code, and for a reference ref
becomes:
- If
IsUnresolvableReference(ref)
then returntrue
. - Else
ref
is a reference to an environment record binding; letbindings
beGetBase(ref)
. - Return the result of calling the
DeleteBinding
concrete method ofbindings
, providingGetReferencedName(ref)
as the argument.
Inlining the concrete DeleteBinding
algorithms (E5 Sections 10.2.1.1.5 and 10.2.1.2.5), and renaming ref
to V
and bindings
to base
to match the GETVAR and PUTVAR algorithms:
- If
IsUnresolvableReference(V)
then returntrue
. - Else
V
is a reference to an environment record binding; letbase
beGetBase(V)
. - If
base
is a declarative environment record, then: a. Ifbase
does not have a binding for the nameGetReferencedName(V)
, returntrue
. b. If the binding forGetReferencedName(V)
inbase
cannot be deleted, returnfalse
. c. Remove the binding forGetReferencedName(V)
frombase
. d. Returntrue
. - Else
base
must be an object environment record and: a. Letbindings
be the binding (target) object forbase
. b. Return the result of calling[[Delete]]
internal method ofbindings
, passingGetReferencedName(V)
andfalse
arguments.
Notes:
- In step 4.b:
[[Delete]]
returnstrue
if the own property either does not exist, or the property exists and is deletable.false
is only returned if a non-configurable own property exists. This matches the behavior for identifiers in declarative environment records.
Using GETIDREF
Arguments are environment record env
, identifier name name
, new identifier value val
, and a strict
flag indicating whether the code executing a DELVAR is strict (the strict
flag is always false
, though).
- Let
res
be the result of callingGETIDREF
with the argumentsenv
,name
, andparents
set totrue
. - If
res.result
isfalse
: a. Returntrue
. - If
res.value
is not NULL (identifier bound to a register, or a property in a declarative environment record) andres.attrs
indicates value is non-configurable, then: a. Returnfalse
. (Note: register-bound identifiers are not deletable.) - Else
res.holder
must not be NULL (identifier bound to a declarative environment record or an object environment record target object): a. Call the[[Delete]]
internal method ofres.holder
with argumentsname
andfalse
. b. Return.
Notes:
- The compiler should never emit a DELVAR for strict code, and the bytecode executor should refuse to execute such an instruction for strict code. There is no explicit check in the algorithm.
- Step 4.a covers two cases:
- An identifier bound to a declarative environment record. In this case
res.holder
is the internal environment record, and the property delete follows normal[[Delete]]
behavior. In particular,[[Delete]]
only returnsfalse
if the property exists and is not configurable. Otherwise[[Delete]]
returnstrue
. This matches the desired behavior for declarative environment records (see the abstract, inlined version of the algorithm). - An identifier bound to an object environment record. In this case
res.holder
is the target object, and the[[Delete]]
call is the desired behavior.
- An identifier bound to a declarative environment record. In this case
DECLVAR: declare identifier
Background
Bindings are created with CreateMutableBinding
and CreateImmutableBinding
in E5 Section 10.5 (Declaration Binding Instantiation) and Section 12.14 (catch
clause).
CreateMutableBinding
and CreateImmutableBinding
both assume that they are never called if the binding has already been declared. The algorithms establishing new bindings carefully use HasBinding
to avoid duplicate declaration attempts (see "declaration binding instantiation" for instance).
Declarations always go to a specified environment record; its outer environment (parent) is not checked or affected. Thus, a variable name can be re-declared if it exists in an outer context, e.g.:
var a = 10;
function f() {
// new declarative environment
var a = 20;
}
More specifically, new bindings are created in the following places (the method of preventing a duplicate declaration is in parentheses):
- Section 10.5 step 4.d.iii - 4.d.iv: argument binding (checks for existing binding)
- Section 10.5 step 5.c - 5.d: function declaration (checks for existing binding; special handling for re-declarations of global functions added in E5.1)
- Section 10.5 step 6, 7.b.i, and 7.c.i:
arguments
binding (checks for existing binding) - Section 10.5 step 8.b - 8.c: variable declaration (checks for existing binding)
- Section 12.14
catch
step 4:catch
variable binding (new environment, no need to check)
The DECLVAR algorithm can ignore attempts to re-declare a variable, except that the re-declaration of global functions has special handling in E5.1.
Note that unlike GETVAR, PUTVAR, and DELVAR, DECLVAR has no "baseline" algorithm in the E5 specification. Rather, it is a primitive which is useful internally, and needs to match the scattered variable declaration needs identified above.
Also note that all non-register-bound identifiers are stored as object properties (either of an internal or an external object). Hence, DECLVAR ultimately adds or updates a property of some holder object.
Algorithm
Inputs:
- environment record
env
- variable name
name
- initial value
val
- property attributes
attrs
, which allow the caller to control whether the binding is deletable ([[Configurable]]
attribute) and mutable ([[Writable]]
attribute) - flag
is_func_decl
, which indicates whether the binding being declared is a function declaration; this has no other effect than to control the special global function re-declaration behavior of E5.1
Outputs:
- none
Algorithm:
- Let
res
be the result of callingGETIDREF
with the argumentsenv
,name
, andparents
set tofalse
. - If
res.result
istrue
(already declared): a. Ifis_func_decl
isfalse
orenv
is not the global object environment record, return (ignore re-declaration). Elseis_func_decl
istrue
andenv
is the global object environment record, and E5.1 special behavior is needed. b. Letholder
beref.holder
; this must be the global object, which must hold an own property calledname
. This is the case because the global object has anull
internal prototype. c. LetX
be the property descriptor forname
inholder
. d. IfX.[[Configurable]]
isfalse
: 1. IfX
is an accessor property, throw aTypeError
. 2. IfX.[[Writable]]
isfalse
orX.[[Enumerable]]
isfalse
, throw aTypeError
. 3. Setattrs
to the current property attributes ofX
. (Note: in effect, don't updateX
attributes; we know it is writable, enumerable, and non-configurable.) e. Update the propertyname
ofholder
to be a data property with the valueval
, and attributes set toattrs
. f. Return. - Let
holder
be the property holder object ofenv
(this isenv
itself for a declarative environment, and the target (bindings) object for an object environment). - Define a new property
name
toholder
with property attributesattrs
. Note that this may fail ifholder
is not extensible; this can only happen for object environment records, as declarative environment records are never non-extensible. - Return.
Notes:
- The concrete implementation has to deal with the fact that
env
creation for an activation may be delayed. So, the environment needs to be created on-the-fly if it doesn't exist yet. - Step 2 inlines yet another version of
[[DefineOwnProperty]]
. - If a function is redeclared, it must have its binding value updated.
Fast path
As a general rule, the compiler uses static analysis in an attempt to allocate variables to registers and convert variable accesses to direct register read/write operations. Only a minority of variable accesses thus happen using the slow identifier lookup described above.
If the compiler can determine that all variable accesses can be handled this way (and no dynamic accesses are possible e.g. through a direct eval
call), the control information does not need to be recorded in the compiled function at all. However, if the compiler cannot rule out the possibility of arbitrary dynamic variable lookups, the control information needs to be stored.
Future work
- A declarative environment record now records
_Callee
to get access to its_Varmap
property. Instead, the record could store a_Varmap
reference directly, which would drop one step from lookup of a register mapped variable. Also, if the function itself is freed, only the varmap needs to survive in the heap. The downside would be that there would be no access to function metadata, should that be useful (e.g. for debugging).
[^1]: See the following E5 sections:
- Section 10.5 steps 2, 5.d, 8.c (eval code)
- Section 10.5 step 7.b (`arguments` binding)
- Section 13 (function expression)
[^2]: See the following E5 sections:
- Section 10.5, steps 7.b.i and 7.b.ii (`arguments` for a strict
function)
- Section 13, named function expression algorithm, steps 3-5
(named function expressions)
[^3]: See the following E5 sections:
- Sections 10.2.1.1.6, 10.2.1.2.6: when implicit `this` is
provided
- Section 11.2.3 step 6.b: function call handling
- Sections 10.2.1.2, 12.10: object environment records which have
`provideThis` set (`with` statement only)
[^4]: See the following E5 sections:
- Section 10.4.2, step 3: strict eval code
- Section 10.4.3: function call
- Section 12.14: `catch` clause, step 3
- Section 13.1: function expression
[^5]: See the following E5 sections:
- Section 10.2.3: global environment record
- Section 12.10: `with` statement, step 4