Comparing ref<T> against ref<U>
Currently the cmpOp instructions take only one type parameter, and both operands must have the same type.
    %result1 = EQ <int<32>> %a1 %b1
    %result2 = EQ <int<64>> %a2 %b2
    %result3 = FEQ <float> %a3 %b3
    %result4 = EQ <ref<T>> %a4 %b4
    %result5 = EQ <iref<T>> %a5 %b5
    %result6 = EQ <funcref<sig>> %a6 %b6
    %result7 = EQ <uptr<T>> %a7 %b7
    %result8 = EQ <ufuncptr<sig>> %a8 %b8
In object-oriented programming, we sometimes want to compare a ref<T> against a ref<U>, where T is a superclass of U. Mu knows nothing about OOP, but it has the prefix rule: a value of type ref<T> can actually refer to an object of type U as long as T is a prefix of U. This allows OOP to be implemented in the Mu type system.
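To make the prefix rule concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model in which a struct type is just a flat tuple of field type names; the names `is_prefix`, `T` and `U` are made up for illustration, and the actual rule in the Mu spec may cover more cases than this.

```python
# Toy model (an assumption, not the Mu spec): a struct type is a flat
# tuple of field type names.
def is_prefix(t_fields, u_fields):
    """Return True if t_fields is a leading slice of u_fields."""
    n = len(t_fields)
    return n <= len(u_fields) and tuple(u_fields[:n]) == tuple(t_fields)

# T plays the role of a superclass layout, U of a subclass layout
# that appends one more field.
T = ("int<64>", "ref<void>")
U = ("int<64>", "ref<void>", "double")

t_prefix_of_u = is_prefix(T, U)   # a ref<T> may refer to a U object
u_prefix_of_t = is_prefix(U, T)   # but not the other way round
```

In this model a subclass only ever appends fields, so every superclass layout is a prefix of its subclasses' layouts.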
For this reason, comparing ref<T> against ref<U> for equality is meaningful: the result is true iff both references refer to the same object, or both are NULL. The semantics of EQ are actually defined this way in the spec, but the spec requires both operands to have the same type:
    // Assume %a is ref<T> and %b is ref<U>
    %result = EQ <ref<T>> %a %b
    // Disallowed in the spec because %b is a ref<U>. But the refimpl does not check
    // the type parameter, so it works for now.
To work around this problem, the client can use REFCAST to cast both operands to the same type (such as ref<void>) before comparing:
    // Assume %a is ref<T> and %b is ref<U>
    %aa = REFCAST <ref<T> ref<void>> %a
    %bb = REFCAST <ref<U> ref<void>> %b
    %result = EQ <ref<void>> %aa %bb
This would potentially make the Mu instruction stream very verbose.
The simplest workaround is to let EQ ignore the type parameter when comparing two refs. The code will look like:
    // Assume %a is ref<T> and %b is ref<U>
    %result = EQ <ref<Blah>> %a %b   // The micro VM ignores the Blah
This will elide the two REFCAST instructions. Similarly, when comparing iref<T> or uptr<T> values, T is ignored; the comparison also disregards the sig signature in funcref<sig> and ufuncptr<sig>. If this behaviour is standardised, the client can rely on it and emit fewer instructions.
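The difference between the current rule and the proposed one can be sketched with a toy model in Python. Everything here is an assumption made for illustration: `Ref`, `refcast`, `eq_strict` and `eq_relaxed` are hypothetical names, a reference is modelled as a static type tag plus a Python object, and `None` stands in for NULL. It is not how the refimpl is written.

```python
class Ref:
    """Toy reference: a static type tag plus the referent (None models NULL)."""
    def __init__(self, static_type, target):
        self.static_type = static_type
        self.target = target

def refcast(ref, to_type):
    """Change only the static type tag; the referent is untouched."""
    return Ref(to_type, ref.target)

def eq_strict(type_param, a, b):
    """Current spec rule: both operands must have exactly the annotated type."""
    if a.static_type != type_param or b.static_type != type_param:
        raise TypeError("operand type does not match the type parameter")
    return a.target is b.target

def eq_relaxed(type_param, a, b):
    """Proposed rule: the type parameter is ignored; only identity matters."""
    return a.target is b.target

obj = object()
a = Ref("ref<T>", obj)
b = Ref("ref<U>", obj)

# Workaround under the current rule: two casts, then one comparison.
via_casts = eq_strict("ref<void>", refcast(a, "ref<void>"), refcast(b, "ref<void>"))

# Proposal: one comparison, and "Blah" is never looked at.
direct = eq_relaxed("ref<Blah>", a, b)

# Two NULL references also compare equal.
both_null = eq_relaxed("ref<Blah>", Ref("ref<T>", None), Ref("ref<U>", None))
```

Both routes give the same answer; the relaxed version simply needs two fewer instructions per comparison.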
What the micro VM sees
When the micro VM sees the instruction EQ <ref<Blah>> %a %b, the compiler knows that both %a and %b are refs of something, but does not know what %a or %b refers to (the "Blah" can just be a lie). As long as refs are always represented in the same way (for example, as pointers to the beginning of the object, which the GC may update when it moves the object), the compiler can still generate code without knowing the object type. The compiler cares about the storage type, not the high-level parameterised type.
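As a rough illustration of "storage type, not parameterised type", the sketch below picks a comparison routine from the outermost type constructor only. The `LOWERING` table and its strings are hypothetical placeholders, not real code-generator output, and the table only lists reference-like kinds.

```python
def storage_kind(type_name):
    """Return the outermost type constructor, e.g. 'ref' for 'ref<Blah>'."""
    return type_name.split("<", 1)[0]

# Hypothetical lowering table: one comparison routine per storage kind.
LOWERING = {
    "ref": "compare reference words",
    "iref": "compare internal-reference values",
    "funcref": "compare function-reference words",
}

def lower_eq(type_param):
    """Choose the comparison from the storage kind; the part inside <...> is never read."""
    return LOWERING[storage_kind(type_param)]

# The same code is selected whatever the client wrote as the parameter.
same_lowering = lower_eq("ref<T>") == lower_eq("ref<Blah>")
```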
Potential side effects (unlikely)
This will require all ref<_> types to have the same representation (same size, and the same choice of pointer or handle) regardless of the type parameter. It rules out the possibility that ref<T> and ref<U> have different sizes. But I don't think implementing refs of different sizes would be useful.
A more aggressive design
We can push it further by removing all type and signature parameters in ref<T>, iref<T>, funcref<sig>, uptr<T> and ufuncptr<sig>, so they become simply ref, iref, funcref, uptr and ufuncptr.
To compensate for the lack of knowledge about the referent type, instructions must be annotated with the referent types. But the micro VM only needs to know the referent type when doing pointer arithmetic (GETFIELDIREF, ...) and memory access (LOAD, STORE, ...). For example:
    // Assume %a is an iref to T, and T is a struct
    %b = GETFIELDIREF <T 3> %a
    // Assume %c is an iref to int<64>
    %v = LOAD <int<64>> %c
Actually this is the same as the current Mu IR. The type annotations on the instructions are intended to ease the job of the Mu-to-machine compiler inside the micro VM.
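The idea that an untyped reference plus a per-instruction annotation is enough can be sketched in Python. This is a toy model under stated assumptions: memory is a flat `bytearray`, an iref is just a byte offset, structs are packed without padding, and `FORMATS`, `get_field_iref` and `load` are hypothetical helpers, not Mu APIs.

```python
import struct

# Hypothetical mapping from Mu-style type names to struct format codes.
FORMATS = {"int<32>": "<i", "int<64>": "<q", "double": "<d"}

# A 16-byte toy memory holding one struct { int<64>; double }.
memory = bytearray(16)
struct.pack_into("<q", memory, 0, 42)
struct.pack_into("<d", memory, 8, 1.5)

def get_field_iref(field_types, index, iref):
    """Offset of field `index`, assuming a packed layout with no padding."""
    return iref + sum(struct.calcsize(FORMATS[t]) for t in field_types[:index])

def load(type_name, iref):
    """The iref carries no type; the annotation on the instruction says how to read."""
    return struct.unpack_from(FORMATS[type_name], memory, iref)[0]

T = ("int<64>", "double")
base = 0                               # untyped iref to the struct
field1 = get_field_iref(T, 1, base)    # like GETFIELDIREF <T 1>
first = load("int<64>", base)          # like LOAD <int<64>>
second = load("double", field1)        # like LOAD <double>
```

Only the two annotated operations needed to know a type; the iref values themselves are plain offsets.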
With the type parameters discarded, REFCAST becomes unnecessary, and PTRCAST only casts between pointers and integers, not between pointer types.
But the Mu IR programs themselves will carry less information about what refs/uptrs point to, which may make the behaviour of a program harder to reason about. Then again, since the client can already perform REFCAST at any time, it can always choose to cast all refs to ref<void> and still write correct programs.
It is unlikely that we will adopt this aggressive design soon, but it may be considered if we redesign the IR.
Comparing ref against ptr
A related topic is whether it should be allowed to compare ref against ptr. The obvious answer is "no". ref and ptr (as well as funcref) do not have the same storage type.
ref can be represented as the address of the beginning of the object, and may be modified by the GC when the object is moved. ref can also be represented as a handle, or as a pair of <addr, type>. On the other hand, ptr must be treated as a raw address. Even if ref is represented as an address, consider an extreme case where we have a micro VM that performs GC between every pair of instructions, and moves every object each time. It is a valid micro VM implementation, but the address of any ref is totally non-deterministic.
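The extreme case can be simulated with a toy moving heap in Python. This is only a sketch under assumptions: `ToyHeap` is a made-up class, a ref is modelled as a stable object id, and a ptr is modelled as a snapshot of the object's address at one moment.

```python
class ToyHeap:
    """Toy moving heap: every collection relocates every live object."""
    def __init__(self):
        self.addr_of = {}        # object id -> current address
        self.next_addr = 0x1000

    def alloc(self, obj_id):
        self.addr_of[obj_id] = self.next_addr
        self.next_addr += 16
        return obj_id            # the "ref" is the stable identity, not the address

    def collect(self):
        # Move every object to a fresh address (keys are unchanged, so
        # updating values while iterating is safe).
        for obj_id in self.addr_of:
            self.addr_of[obj_id] = self.next_addr
            self.next_addr += 16

heap = ToyHeap()
r = heap.alloc("obj")
raw_ptr = heap.addr_of[r]        # snapshot of the address, like a ptr
heap.collect()                   # GC runs between two instructions
stale = raw_ptr != heap.addr_of[r]
```

After a single collection the saved address no longer matches the object's location, so comparing it with the ref would give an answer that depends on GC timing.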
In some VMs (such as JikesRVM), there are VMMagics that allow getting the address from an object reference, or converting an address into an object reference. In this way, the GC can be implemented in the same language as the language it is serving. However, the obj->addr conversions alone are not enough. Such VMs must also have mechanisms to specify uninterruptible regions in which GC must not happen. If the GC is concurrent, there must be other mechanisms to handle this gracefully. But all these "magics" are closely tied to the concrete (micro) VM implementation, and should be kept private.