Better support for the tagref64 type

Created by: wks

In dynamic languages, the tagref64 type (or other future tagged reference type variants) will be used pervasively in the language runtime.

This issue summarises potential improvements to the support for such types.

Tagged reference constant

The Mu IR currently does not have a constant for tagref64, mainly because a tagref64 may hold a reference and non-NULL references cannot be constants. However, one possible use of the tagref64 type is to store a NULL reference together with an int<6> tag. In this case, the tag determines the concrete thing the value represents (undefined, nil, null, false, true or other frequently used singleton objects). So it should be possible in Mu to create such a tagref64 as a constant.

Proposed new syntax:

.const @name <@tagref64> = TR64FP @double_constant
.const @name <@tagref64> = TR64INT @int52_constant
.const @name <@tagref64> = TR64NULLTAG @int6_constant // The ref is NULL, the tag is @int6_constant

.const @double_constant <@double> = 3.14d
.const @int52_constant <@i52> = 0x123456789abcd
.const @int6_constant <@i6> = 30
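
To make the three constant forms concrete, here is a minimal Python sketch of how each could be turned into a 64-bit pattern at load time. The bit layout (a marker in the low bits inside the NaN space of double, and the `CANONICAL_NAN` pattern) is an assumption for illustration only and may not match Mu's actual tagref64 encoding. The function names `tr64_fp`, `tr64_int` and `tr64_nulltag` are hypothetical and simply mirror the proposed keywords.

```python
import struct

# ASSUMED layout, for illustration only (not necessarily Mu's real encoding):
#   double : raw IEEE 754 bits, with every NaN mapped to CANONICAL_NAN
#   int<52>: exponent all ones, bit 0 = 1, payload in bits 1..51 and bit 63
#   ref+tag: exponent all ones, bits 1..0 = 10, tag in bits 47..51 and bit 2
EXP_ALL_ONES = 0x7FF0000000000000
CANONICAL_NAN = 0x7FF0000000000008  # assumed canonical NaN pattern


def tr64_fp(value: float) -> int:
    """Bit pattern for TR64FP: the double itself, NaNs canonicalised."""
    if value != value:  # only NaN is unequal to itself
        return CANONICAL_NAN
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def tr64_int(value: int) -> int:
    """Bit pattern for TR64INT: a 52-bit integer stored in the NaN space."""
    assert 0 <= value < (1 << 52)
    return (EXP_ALL_ONES | 0x1
            | ((value & 0x7FFFFFFFFFFFF) << 1)    # low 51 bits -> bits 1..51
            | ((value & 0x8000000000000) << 12))  # bit 51 -> sign bit 63


def tr64_nulltag(tag: int) -> int:
    """Bit pattern for TR64NULLTAG: NULL reference plus a 6-bit tag."""
    assert 0 <= tag < (1 << 6)
    return (EXP_ALL_ONES | 0x2
            | ((tag & 0x3E) << 46)  # tag bits 1..5 -> bits 47..51
            | ((tag & 0x1) << 2))   # tag bit 0 -> bit 2


fp_const = tr64_fp(3.14)
int_const = tr64_int(0x123456789ABCD)
null_const = tr64_nulltag(30)
```

Because the NULL reference contributes no address bits, all three patterns are fully known before any object is allocated, which is what makes them usable as constants.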

Tagged reference equality

Comparing floating point numbers bit by bit is not equivalent to IEEE 754's definition of "equality". However, when two tagref64 values both hold integers or references+tags, the result is deterministic.

In dynamic languages, such comparisons can quickly determine whether two tagged references have the same type (identified by the tag part) and refer to the same object.

Proposed semantics of the EQ comparison between tagref64 values:

The result of the EQ comparison instruction between v1 and v2 is 1 (true) if and only if any of the following is true:

- Both hold double values, and
  - neither is NaN and both have the same bit-wise representation, or
  - both are NaN and they happen to have the same bit-wise representation after being converted to tagref64.
- Both hold int<52> values and they are bit-wise equal.
- Both hold references, and
  - their references refer to the same object or both are NULL, and
  - their int<6> tags are bit-wise equal.

The NE instruction returns the opposite result of EQ.
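
The rules above can be sketched over a decoded model of tagref64 values. This is a minimal Python illustration, not Mu's implementation: the classes `Fp`, `Int52` and `Ref` and the functions `fp`, `tr64_eq` and `tr64_ne` are hypothetical names, and `Fp.bits` stands for the 64-bit pattern of the double after conversion to tagref64.

```python
import struct
from dataclasses import dataclass
from typing import Optional

MASK52 = (1 << 52) - 1
MASK6 = (1 << 6) - 1


@dataclass(frozen=True)
class Fp:
    bits: int  # 64-bit pattern of the double as held in the tagref64


@dataclass(frozen=True)
class Int52:
    value: int


@dataclass(frozen=True)
class Ref:
    obj: Optional[object]  # None models the NULL reference
    tag: int


def fp(x: float) -> Fp:
    """Helper: wrap a Python float as a double-holding tagref64."""
    return Fp(struct.unpack("<Q", struct.pack("<d", x))[0])


def tr64_eq(v1, v2) -> bool:
    """Proposed EQ: same kind, then a bit-wise or identity comparison."""
    if isinstance(v1, Fp) and isinstance(v2, Fp):
        # Covers both the non-NaN case and the "NaNs that happen to match" case.
        return v1.bits == v2.bits
    if isinstance(v1, Int52) and isinstance(v2, Int52):
        return (v1.value & MASK52) == (v2.value & MASK52)
    if isinstance(v1, Ref) and isinstance(v2, Ref):
        same_object = v1.obj is v2.obj  # also true when both are NULL
        return same_object and (v1.tag & MASK6) == (v2.tag & MASK6)
    return False  # values of different kinds never compare equal


def tr64_ne(v1, v2) -> bool:
    """NE is the negation of EQ."""
    return not tr64_eq(v1, v2)
```

One consequence of the bit-wise rule for doubles is that `tr64_eq(fp(0.0), fp(-0.0))` is false, whereas IEEE 754 equality treats +0.0 and -0.0 as equal.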

NOTE: tagref64 uses the NaN space of double. Real NaN double values may lose their precise bit-wise representation when converted to tagref64, so comparing two tagref64 values that both hold NaNs has an unspecified result.

Alternative possibility: Require Mu to canonicalise all NaNs to one unique bit-wise representation. In this way, all NaNs compare equal when comparing tagref64 values bit by bit.
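
A minimal Python sketch of this canonicalisation alternative follows. The chosen `CANONICAL_NAN` pattern and the function name `canonical_tr64_bits` are assumptions for illustration; the proposal only requires that some single pattern be used.

```python
import math
import struct

CANONICAL_NAN = 0x7FF0000000000008  # assumed pattern; any fixed NaN would do


def canonical_tr64_bits(x: float) -> int:
    """Bits of a double-holding tagref64, with every NaN mapped to one pattern."""
    if math.isnan(x):
        return CANONICAL_NAN
    return struct.unpack("<Q", struct.pack("<d", x))[0]
```

With this rule, two tagref64 values holding NaNs always have identical bits, so a plain 64-bit comparison gives a specified result.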

Default values of tagref64 types

Currently, the default value of tagref64 holds +0.0 as a double value. The default value is all zero bits, because all newly allocated memory (heap, stack, global) holds all zero bits. In this representation, all tagref64 values which hold double contents are bit-wise equal to their real double representation. So converting a tagref64 to double is trivial: just do a bitcast.

However, languages usually define the values of uninitialised variables/fields as null-like values: undefined in JS, nil in Lua, null in Java. There should be an option to make 00000000..00 represent such a null-like value.

There could be a flag to determine the zero value of a tagref64 type. The proposed syntax is:

.typedef @tr64_with_fp_default = tagref64 <DEF_FP(3.14d)> // All 0s represents double value 3.14d
.typedef @tr64_with_ref_default = tagref64 <DEF_REF(0x1a)> // All 0s represents NULL ref with 0x1a as tag (the tag must fit in int<6>).
.typedef @tr64_with_int_default = tagref64 <DEF_INT(0x55aa55aa55aa5)> // All 0s represents integer 0x55aa55aa55aa5.

.typedef @tr64_as_current = tagref64 <DEF_FP(0.0d)> // All 0s represents double value 0.0d, which is the same as the current `tagref64`.

The kind of default is static metadata and the garbage collector can identify it.

This can be implemented by applying an XOR mask to the value after encoding it as a tagref64 and again before decoding an existing tagref64.
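
The XOR approach can be sketched as follows, assuming the mask is simply the ordinary encoding of the chosen default value. The class `MaskedTagRef64`, the helper names and the NULL-ref bit layout are hypothetical and for illustration only.

```python
import struct


def double_bits(x: float) -> int:
    """Ordinary 64-bit pattern of a double."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def null_ref_bits(tag: int) -> int:
    """ASSUMED encoding of a NULL ref with a 6-bit tag (illustrative layout)."""
    return 0x7FF0000000000002 | ((tag & 0x3E) << 46) | ((tag & 0x1) << 2)


class MaskedTagRef64:
    """A tagref64 variant whose all-zero memory image means `default_bits`."""

    def __init__(self, default_bits: int):
        # The mask is the normal encoding of the desired default value.
        self.mask = default_bits

    def to_memory(self, encoded: int) -> int:
        """Apply the mask after encoding, just before storing."""
        return encoded ^ self.mask

    def from_memory(self, stored: int) -> int:
        """Remove the mask after loading, just before decoding."""
        return stored ^ self.mask


# A type whose zero value is a NULL reference with tag 30.
nil_default = MaskedTagRef64(null_ref_bits(30))
fresh_memory = 0  # newly allocated memory is all zero bits
decoded_default = nil_default.from_memory(fresh_memory)
```

Since XOR is its own inverse, storing then loading returns the original encoding, and a mask of all zeros (the encoding of +0.0) leaves the current tagref64 behaviour unchanged. The cost is that a double-holding value is no longer a plain bitcast away unless the default is +0.0.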
