Extra types for the Native Interface
Created by: wks
Philosophy: There should be a subset of Mu types and instructions that can do what C can do. It should be possible to implement the C programming language in this subset of Mu while still being able to access memory in the way specified by the platform's ABI (that is, remaining compatible with "good" native programs).
Types
Pointer types
- `ptr<T>`: A memory pointer to type `T`. (Is there a better name? A pointer always points to somewhere in the memory. Maybe "data pointer" or "value pointer"? In C, it is "object pointer", but "object" has a different meaning in Mu.)
- `funcptr<sig>`: A function pointer to a function with signature `sig`.
Pointers are addresses. They can be cast to and from integer values by interpreting the integer as the address. Mu does not check the validity of this cast.
`ptr<T>` can be used by the memory addressing instructions: `GETFIELDIREF`, `GETELEMIREF`, ... will work on `ptr<T>` as if they were `iref` types. Memory access instructions can work with `ptr<T>` when given the `PTR` flag:
// %p is ptr<int<64>>
%result1 = LOAD PTR ACQUIRE <@i64> %p
STORE PTR RELEASE <@i64> %p @const1
%result2 = CMPXCHG PTR SEQ_CST SEQ_CST <@i64> %p @const1 @const2
%result3 = ATOMICRMW PTR SEQ_CST ADD <@i64> %p @const3
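As a rough point of reference, the four instructions above correspond to the following C11 `<stdatomic.h>` operations on a raw pointer. This is a sketch in C under the assumption that `%result2` and `%result3` denote the value observed before the operation; the wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Analogue of: LOAD PTR ACQUIRE <@i64> %p */
int64_t load_acquire(_Atomic int64_t *p) {
    return atomic_load_explicit(p, memory_order_acquire);
}

/* Analogue of: STORE PTR RELEASE <@i64> %p @const1 */
void store_release(_Atomic int64_t *p, int64_t v) {
    atomic_store_explicit(p, v, memory_order_release);
}

/* Analogue of: CMPXCHG PTR SEQ_CST SEQ_CST <@i64> %p @const1 @const2
   Returns the value that was in *p before the operation. */
int64_t cmpxchg_seq_cst(_Atomic int64_t *p, int64_t expected, int64_t desired) {
    /* On failure, `expected` is overwritten with the current value;
       on success it already equals the old value. */
    atomic_compare_exchange_strong_explicit(
        p, &expected, desired, memory_order_seq_cst, memory_order_seq_cst);
    return expected;
}

/* Analogue of: ATOMICRMW PTR SEQ_CST ADD <@i64> %p @const3
   Returns the value that was in *p before the addition. */
int64_t fetch_add_seq_cst(_Atomic int64_t *p, int64_t v) {
    return atomic_fetch_add_explicit(p, v, memory_order_seq_cst);
}
```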
`funcptr<sig>` can be called with the `CCALL` instruction:
// assume @write is funcptr<@size_t (@i32 @voidptr @size_t)>
%result = CCALL C <@sig> @write (%fd %buf %sz) // C means the "C" calling convention
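On the C side, the callee of such a `CCALL` is an ordinary function reached through a function pointer. The sketch below shows the same shape in C; `write_fn`, `sink_write` and `call_through` are hypothetical names, and `sink_write` is an in-memory stand-in for a native function such as `write`, not the real system call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plays the role of the signature @sig: size_t (int, pointer, size_t). */
typedef size_t (*write_fn)(int fd, const void *buf, size_t sz);

/* In-memory sink used by the stand-in function below. */
char sink[64];
size_t sink_len = 0;

/* Hypothetical stand-in for a native function: appends up to `sz` bytes
   to `sink` and returns how many bytes were actually stored. */
size_t sink_write(int fd, const void *buf, size_t sz) {
    (void)fd;                                   /* unused in this sketch */
    size_t room = sizeof sink - sink_len;
    size_t n = sz < room ? sz : room;
    memcpy(sink + sink_len, buf, n);
    sink_len += n;
    return n;
}

/* Call through the function pointer, as CCALL does with a funcptr. */
size_t call_through(write_fn f, int fd, const void *buf, size_t sz) {
    assert(f != NULL);
    return f(fd, buf, sz);
}
```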
Union type
I think there is a way to introduce the union type from C without compromising the safety of Mu's reference types.
Define the union type as `union<T1 T2 T3 ...>`, where `T1`, `T2`, `T3`, ... are its members. The members of a union type cannot contain `ref`, `iref`, `weakref`, `func`, `thread`, `stack` or `tagref64` types, as they are either object references or opaque references. However, `ptr` and `funcptr` are allowed.
A `union` must reside in memory. It cannot be the type of an SSA variable, because that would not make sense: a union is, well, a "union" of several types (no pun intended), but an SSA variable holds exactly one type.
One may argue: "I want to LOAD a union and STORE it to another location without looking into it, so I need union to be the type of an SSA variable." However, for data transfer, there could be a `memcpy`-like instruction that can copy large structures efficiently, so this is unnecessary.
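In C, this kind of "copy without looking inside" is exactly what `memcpy` provides for unions. The following sketch illustrates the idea; the type `value_u` and the function `copy_value` are hypothetical, and the member list is only an example of members that would be permitted (integers, floating point numbers and raw pointers).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* An example union with an integer, a floating point and a raw pointer member. */
typedef union {
    int64_t i;
    double  d;
    void   *p;
} value_u;

/* Copy the whole union as bytes. The copy does not need to know which
   member was stored last, so no union-typed temporary value is involved. */
void copy_value(value_u *dst, const value_u *src) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```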
When a union is allocated in the Mu memory, its initial value is all zeros: if any member is loaded before another value is stored into it, the result is always the "zero value" of that type (int 0, fp +0.0, ref NULL).
A union only holds the latest stored member:
- if a load is not atomic, and there is only one visible store to a member of the union, then
  - if the store accesses the same member as the load, the load gets the value of that store;
  - if the store accesses a different member, the load instruction has undefined behaviour.
- Union members cannot be accessed atomically.
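The rules above can be modelled in C by pairing the union with a tag that records which member the latest store targeted. This is only a single-threaded sketch of the proposed semantics, not how Mu would implement it: `checked_union` and the `cu_*` functions are hypothetical, and a load that the rules leave undefined is reported by returning `false` instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef enum { M_NONE = 0, M_INT, M_DOUBLE } member_t;

typedef struct {
    member_t last;                      /* member targeted by the latest store */
    union { int64_t i; double d; } u;
} checked_union;

/* Freshly allocated: all zeros, no member stored yet. */
void cu_init(checked_union *c) {
    memset(c, 0, sizeof *c);
    c->last = M_NONE;
}

void cu_store_int(checked_union *c, int64_t v)   { c->u.i = v; c->last = M_INT; }
void cu_store_double(checked_union *c, double v) { c->u.d = v; c->last = M_DOUBLE; }

/* Returns true and writes *out when the load is defined under the rules;
   returns false when a different member was stored last. */
bool cu_load_int(const checked_union *c, int64_t *out) {
    assert(out != NULL);
    if (c->last == M_NONE) { *out = 0; return true; }   /* zero value */
    if (c->last != M_INT) return false;                 /* different member */
    *out = c->u.i;
    return true;
}

bool cu_load_double(const checked_union *c, double *out) {
    assert(out != NULL);
    if (c->last == M_NONE) { *out = 0.0; return true; } /* zero value */
    if (c->last != M_DOUBLE) return false;              /* different member */
    *out = c->u.d;
    return true;
}
```

A real implementation would carry no tag at all; the tag exists here only to make the "latest stored member" rule observable.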
I am still uncertain how the C memory model plays together with unions. C11 defines a union as "an overlapping set of member objects" and "When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values." This implies that storing into one member of a union has the side effect of modifying other members.
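The side effect mentioned in the C11 wording can be observed by inspecting the object representation through `unsigned char`, which C permits. In the sketch below, `overlap_u` and `bytes_changed_by_small_store` are hypothetical names. The bytes belonging to `small` change for certain, while the remaining bytes of `big` take unspecified values, so only a lower bound on the count is guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two overlapping members of different sizes. */
typedef union {
    uint32_t small;
    uint64_t big;
} overlap_u;

/* Store to `small`, then count how many bytes of the storage shared
   with `big` differ from their values before the store. */
size_t bytes_changed_by_small_store(void) {
    overlap_u u;
    unsigned char before[sizeof u.big];
    unsigned char after[sizeof u.big];

    u.big = 0;                                  /* every byte of big is 0 */
    memcpy(before, &u, sizeof before);

    u.small = 0xFFFFFFFFu;                      /* store to the other member */
    memcpy(after, &u, sizeof after);

    size_t changed = 0;
    for (size_t k = 0; k < sizeof after; k++) {
        if (before[k] != after[k]) changed++;
    }
    assert(changed <= sizeof u.big);
    return changed;
}
```

Since every byte of `small` goes from 0 to a nonzero value, at least `sizeof(uint32_t)` bytes of the storage backing `big` are modified by a store that never names `big`.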