general-issue-tracker issueshttps://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues2018-06-28T23:11:31+10:00https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/71Benchmarking framework for Mu projects2018-06-28T23:11:31+10:00Zixian CaiBenchmarking framework for Mu projectsWe'd want a benchmarking framework to run various tests across different Mu projects (implementations, clients).
This will help us discover bugs or performance problems introduced in commits. It can also facilitate development by know...We'd want a benchmarking framework to run various tests across different Mu projects (implementations, clients).
This will help us discover bugs or performance problems introduced in commits. It can also facilitate development by knowing, for example, what the impact of decisions are.
Currently, we have a benchmark suite developed by @u5157779 under https://gitlab.anu.edu.au/mu/mu-impl-fast/tree/master/tests/test_jit
It compares the performance of
- RPython with Mu backend (which can be run on different Mu implementations)
- RPython with C backend
- Hand-written C
- Hand-written Mu (which can be run on different Mu implementations)
Some of these are specific to a certain Mu client (in this case, Mu backend of RPython).
For a more general framework, following client-neutral aspects can be abstracted out
- Collecting, storing and processing metrics
- Visualizing
Then, each project can have its own agent for the framework to invoke for executing tests and collecting results.
cc @U1817699 @u3789498 @u4776528 @u5157779Zixian CaiZixian Caihttps://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/59API: Automatic creattion of MuInstResNode2018-06-28T23:11:31+10:00Kunshan WangAPI: Automatic creattion of MuInstResNode# Problem
Currently instruction results are *added* to the instruction after the instruction is created.
Alternatively, results can be created *when* the instruction is created. When taking this approach, the client will never add th...# Problem
Currently instruction results are *added* to the instruction after the instruction is created.
Alternatively, results can be created *when* the instruction is created. When taking this approach, the client will never add the wrong number of results to any instructions.
In theory, the client knows everything the micro VM knows about the IR bundle, so it is always capable of adding the right number of results. Changing the API in the following way will let the micro VM provide slightly more help to the client, but should not affect the performance: the micro VM always has the required information.
Minimalism will be affected if more query API (finding the opcode/mnemonic of instructions, the instruction parameters, etc.) is added. The client builds the IR, and always has full information of the structure of the CFG. So the client is always capable of pretty-printing what it is going to build without using any of the Mu API functions. For convenience, however, query API could be provided at a higher level by client-side libraries, while the C-based API provides an **efficient** (rather than rich) tool to transfer the information from the client to the micro VM. For debug purpose, the reference implementation can now print the loaded bundle as text.
# Proposed change
Replace the `new_inst_res` API function:
```c
MuInstResNode (*new_inst_res )(MuCtx *ctx, MuInstNode inst);
```
with these :
```c
MuInstResNode (*get_inst_res)(MuCtx *ctx, MuInstNode inst, int index);
int (*get_num_res)(MuCtx *ctx, MuInstNode inst);
```
`get_inst_res` will get the *index-th* result from an instruction. `get_num_res` gets the number of results.
Other API functions do not need to be changed.
# About the number of results
**Can the number of results be determined when constructing the instruction?**
Checklist ( *instruction*: *numOfResults* ) :
- binary ops: 1
- comparing: 1
- conversion: 1
- SELECT: 1
- BRANCH: 0
- BRANCH2: 0
- SWITCH: 0
- CALL: the number of results in the signature
- TAILCALL: 0
- RET: 0
- THROW: 0
- EXTRACTVALUE: 1
- INSERTVALUE: 1
- EXTRACTELEMENT: 1
- INSERTELEMENT: 1
- SHUFFLEVECTOR: 1
- NEW: 1
- NEWHYBRID: 1
- ALLOCA: 1
- ALLOCAHYBRID: 1
- GETIREF: 1
- GETFIELDIREF: 1
- GETELEMIREF: 1
- SHIFTIREF: 1
- GETVARPARTIREF 1
- LOAD: 1
- STORE: 0
- CMPXCHG: 2 (succ, oldVal)
- ATOMICRMW: 1
- FENCE: 0
- TRAP: as many as the length of the type parameters.
- WATCHPOINT: as many as the length of the type parameters.
- WPBRANCH: 0
- CCALL: the number of results in the signature. Same as CALL
- NEWTHREAD: 1
- SWAPSTACK:
- 0 if `KILL_OLD`
- n if `RET_WITH<T1 T2 ... Tn>`
- COMMINST: currently all comminsts have fixed number of return values.
Kunshan WangKunshan Wanghttps://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/46Initialized aggregate values / objects2018-06-28T23:11:31+10:00John ZhangInitialized aggregate values / objects*Created by: eliotmoss*
For the C client, and more particularly for the Java client, we would like to be able to describe an initialized aggregate value. The obvious case is string literals for C. In Java we would like to be able to d...*Created by: eliotmoss*
For the C client, and more particularly for the Java client, we would like to be able to describe an initialized aggregate value. The obvious case is string literals for C. In Java we would like to be able to describe a Java char array for these. A Java array will be a HYBRID object. Not having these means unpleasant work-arounds involving element by element initialization by code that we have to arrange to have invoked before the objects could possibly be used.https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/45Multiple return values2018-06-28T23:11:32+10:00John ZhangMultiple return values*Created by: wks*
*As for 2016, this change is already merged in both the spec and the refimpl.*
This issue discusses the design choices of allowing functions and Mu instructions to return more than one value (or a tuple of values)...*Created by: wks*
*As for 2016, this change is already merged in both the spec and the refimpl.*
This issue discusses the design choices of allowing functions and Mu instructions to return more than one value (or a tuple of values).
# Status quo
The one-return-value assumption has **a chain of causes and effects** to the design of the IR.
First of all, Mu [**functions**](https://github.com/microvm/mu-spec-goto/blob/master/uvm-ir.rest#function-signature-definition) have multiple parameters and exactly one return value. i.e. `f : (P1,P2,P3,...,Pm) -> R` This is the tradition of C-like languages as well as LLVM. In case it does not return useful values, it returns `void`
Following this tradition, all Mu **instructions** return exactly one value, too. e.g. `%rv = CALL <...> @some_func (%a1 %a2 ... %am)`. In case more than one values need to be returned, a function shall return a `struct<T1 T2 ... Tm>`. In case no useful value is returned, it returns `void`. For example:
```
%result = CMPXCHG WEAK SEQ_CST SEQ_CST <T> %loc %expected %desired
// %result is a struct<T @i1>. The second result indicates if it is successful.
%oldval = EXTRACTFIELD <@cmpxchg_result_T 0> %result
%is_successful = EXTRACTFIELD <@cmpxchg_result_T 1> %result
```
Since all Mu instructions have a single return value, the [SWAPSTACK](https://github.com/microvm/mu-spec-goto/blob/master/instruction-set.rest#swapstack-instruction) instruction shall have exactly one return value, too. That return value is the value received from another stack (or injected by the client). Since one stack also passes value using the `SWAPSTACK` instruction, **SWAPSTACK takes one parameter**, too, since *the peer only receives exactly one return value*. e.g. in the sender: `SWAPSTACK %receiver ... PASS_VALUE <@T> %val`; in the receiver: `%retval = SWAPSTACK %sender RET_WITH <@T> ...`
Since a swapped-away stack always stops at so called "**resumption point**" (resumable by swapstack) instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL), all of which receive exactly one return value, then the **state of a stack** is `READY<T>` when swapped away, where T is a single type -- the return type of the *resumption point*. This makes one important special case: **a newly-created frame** (by [NEWSTACK](https://github.com/microvm/mu-spec-goto/blob/master/instruction-set.rest#newstack-instruction) or [push_frame](https://github.com/microvm/mu-spec-goto/blob/master/uvm-client-interface.rest#on-stack-replacement) (OSR)). Currently since a function takes multiple parameters but SWAPSTACK passes only one argument, there is no way to supply all arguments during SWAPSTACK. As a compromise, NEWSTACK initialises all arguments and the subsequent SWAPSTACK passes `void`.
As a side effect, **the return value uniquely identifies the instruction**. This can bring some profits: during OSR, the variable that holds the return value can identify the OSR-point (instructions that may be the "current instruction" in a `READY<T>` stack) instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL, all happen to be *resumption points*. This is actually intentionally designed as so.). This is especially useful for **the TRAP instruction**, in which case the trap handler always first identify which TRAP is triggered. Because of the one-to-one mapping between instructions and return values, the term "SSA variable of an instruction" and "the instruction" are used interchangeably.
# Proposed change
The change starts with adopting "multiple return values".
First, **functions now return 0 or more values**. i.e. `f : (P1,P2,P3,..,Pm) -> (R1,R2,...,Rn)`. For compatibility with C programs (in the unsafe native interface) , a 0-tuple is returned when the C function returns `void`. i.e. either `c_function: (P1,P2,...,Pm) -> (R1)` or `c_function: (P1,P2,...,Pm) -> ()`.
Then **Mu instructions may return 0 or more values**. When no useful values are returned, the instruction returns `()`. The `CALL` instruction has exactly as many return values as the callee's return values. For example:
* `(%oldval %succ) = CMPXCHG WEAK SEQ_CST SEQ_CST %loc %expected %desired`:
* `(%rv1 %rv2 %rv3) = CALL <..> @func (%a1 %a2 ...)`
* `RET %rv1 %rv2 %rv3`
* `() = BRANCH %blah(....)` or just `BRANCH %blah(...)`
Then **SWAPSTACK can pass more than one values to the swappee**. The swappee expects to receive a list of values (statically decided at the particular SWAPSTACK instruction). If the swapper does not match the value required by swappee, it has undefined behaviour. For example:
```
// receiver:
(%rv1 %rv2 %rv3) = SWAPSTACK %sender_stack RET_WITH <@T1 @T2 @T3> PASS_VALUE <> ()
// sender:
() = SWAPSTACK %receiver_stack RET_WITH <> PASS_VALUES <@T1 @T2 @T3> (%v1 %v2 %v3)
// bad sender:
() = SWAPSTACK %receiver_stack RET_WITH <> PASS_VALUES <@T1 @T2> (%v1 %v2) // ERROR: too few values
```
Since all *resumption point* instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL) may receive multiple values (in fact, all instructions can), **the state of a "swapped-away" stack is `READY<T1 T2 ... Tn>`**. When binding a thread to a stack, exactly that many values need to be bound.
Then **the state of a newly created frame is `READY<T1 T2 ... Tn>`** where T1 T2 ... Tn are the parameter types of the function (also the entry block). **NEWSTACK no longer supply arguments**, but **the first SWAPSTACK supplies the parameters to the new frame**. For example:
```
.funcsig @sig = (@T1 @T2 ... @Tm) -> (@R1 @R2 ... @Rn)
.funcdef @f VERSION %v1 <@sig> {
%entry(<@T1> %p1 <@T2> %p2 .. <@Tm> %pm):
...
}
// in another function:
%new_stack = NEWSTACK <@sig> @a
// later
SWAPSTACK %new_stack RET_WITH <...> (...) PASS_VALUES <@T1 @T2 ... @Tm> (%v1 %v2 ... %vm)
```
Now **not all instructions are identifiable**. If an instruction does not return value, e.g. a trap that does not expect values from the client (which is the most common case): `TRAP <> KEEPALIVE(...)`, then there is no identifiers for these instructions. As a compromise, explicit names can be given to instructions using a slightly different syntax:
```
() = [%the_important_trap] TRAP <> KEEPALIVE (%local %vars)
// or simply
[%the_important_trap] TRAP <> KEEPALIVE (%local %vars)
// it also works if there are return values:
(%rv1 %rv2) = [%the_important_call_site] CALL <@sig> @func (%arg1 %arg2)
```
And **SSA variables** and **instructions** completely divorce. Now an instruction has **a list of results**, each of which is an SSA variable. The new SSA variable hierarchy is:
- SSA variable
- global variable
- constant, global cell, function, exposed function
- local variable
- parameter
- one result of an instruction (was just "instruction")
This change also undoes https://github.com/microvm/microvm-meta/issues/43, a previous proposal to make void more like "unit". Now we have the real "unit" return value: `()`, and the `void` type has only two use cases:
* `hybrid<void T>` for hybrids without a fixed part, and
* `ref<void>`, `iref<void>`, `weakref<void>` and `ptr<void>`, for referencess/pointers to "anything".
# Coroutine Abstractions in High-level Languages
## Similar to the old Mu model
Python's generator is similar to the old model. All arguments are supplied when the generator is created. The first `g.send()` must send `None`.
```python
def gen(a,b,c):
print("Hello!", a, b, c)
d = (yield)
print("d = ", d)
return
g = gen(1,2,3) # initialises a, b and c
g.send(None) # must be None, otherwise TypeError: can't send non-None value to a just-started generator
# prints Hello 1 2 3
try:
g.send(4) # received by d. The generator throws StopIteration to the main coroutine.
# prints d = 4
except StopIteration as e:
print("bye")
```
If the new model is desired, the program can be rewritten as:
```python
def gen():
def actual_gen():
a,b,c = (yield)
print("Hello!", a, b, c)
d = (yield)
print("d = ", d)
return
ag = actual_gen()
ag.send(None)
# now ag stops at the first yield
return ag
g = gen() # just creates the actual_gen
g.send((1,2,3)) # received by a,b,c at the first yield. a,b,c are not really parameters.
# prints Hello 1 2 3
try:
g.send(4) # received by d. The generator throws StopIteration to the main coroutine.
# prints d = 4
except StopIteration as e:
print("bye")
```
## Similar to the new model
Lua's coroutine model is similar to the new model. Arguments are supplied at the first `coroutine.resume`.
```lua
function gen(a,b,c)
print("Hello", a, b, c)
local d = coroutine.yield()
print("d = ", d)
return
end
co = coroutine.create(gen)
coroutine.resume(co, 1,2,3) -- initialises a, b, c. If less arguments are supplied, other parameters receive nil
-- prints Hello 1 2 3
coroutine.resume(co, 4) -- received by d
-- prints d = 4
```
If the old model is desired, the code can be rewritten as:
```lua
function make_gen(a,b,c)
local function actual_gen(a, b, c)
coroutine.yield()
print("Hello", a, b, c)
local d = coroutine.yield()
print("d = ", d)
return
end
local coro = coroutine.create(actual_gen)
coroutine.resume(coro, a,b,c) -- immediately initialises a,b,c. They are not parameters of actual_gen, but "up-values" of it.
return coro
end
co = make_gen(1,2,3)
coroutine.resume(co) -- It is stopping on its first empty yield() point
-- prints Hello 1 2 3
coroutine.resume(co, 4) -- received by d
-- prints d = 4
```
In conclusion, both models of swapstack are equivalent. It just takes a few initial swapstack operations to mimic each other.
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/42Redesign the undefined function handler API2018-06-28T23:11:32+10:00John ZhangRedesign the undefined function handler API*Created by: wks*
# Current status
Currently there are two handlers: *trap handler* and *undefined function handler*. They are different.
In trap handlers:
* Thread is temporarily unbound from the stack, and is rebound after tr...*Created by: wks*
# Current status
Currently there are two handlers: *trap handler* and *undefined function handler*. They are different.
In trap handlers:
* Thread is temporarily unbound from the stack, and is rebound after trap handling.
* Stack is unbound and is ready for introspection.
* OSR (pop frame, push frame) is possible.
In undefined function handlers:
* Thread is, conceptually, still executing the CALL instruction that calls the undefined function.
* The stack is in the RUNNING state, not available for introspection or OSR
* The client must define the function. Then CALL will be tried again. i.e. If the client does not define the function, it will loop forever.
This asymmetry is complicating the interface.
1. The undefined function handling API **leaves no room for errors in loading the bundle**. If, in Java, the client puts a stub for an unloaded method, and there is an error when loading the class, then the client has no other things to do than "defining" the method as "doing nothing but throwing an exception". This isn't very nice.
2. It leaves the stack and the thread in a strange state: they cannot continue, but still "RUNNING".
3. Introspection and OSR are impossible.
# Proposed change
I propose unifying the two interfaces.
Let undefined functions behave like:
```
.funcdef @undefined_function VERSION noversion <sig> (a1 a2 a3 ...) {
%entry:
TRAP <void> KEEPALIVE (a1 a2 a3...)
undefined behaviour after TRAP
}
```
That is, an undefined function behaves like executing a trap with all parameters as keep-alive variables.
Add an extra API call:
```
MuID (*cur_func)(MuCtx *ctx, MuStackRefValue stack, int frame)
```
It returns the ID of the function of the current frame. It works even if the function is not defined. Return 0 if the frame is native.
Modify the `cur_func_ver` API call: It returns 0 when the frame is a frame without version (undefined function).
So a client can identify an undefined function by "cur_func != 0 && cur_func_ver == 0". FYI:
* ``cur_func != 0 && cur_func_ver != 0``: a defined Mu function
* ``cur_func != 0 && cur_func_ver == 0``: an undefiend Mu function
* ``cur_func == 0 && cur_func_ver == 0``: native frame
* ``cur_func == 0 && cur_func_ver != 0``: impossible
Modify the `cur_inst` API call: It returns 0 if the frame is just created (before the first instruction), is native, or *if the function is not defined*.
Modify the `dump_keepalives` API call: It dumps keep-alive variables for OSR point instructions, or *the arguments to an undefined function*. Cannot be used on native frames or a just-created frame.
The `pop_frame` and the `push_frame` API behave like before. Specifically to undefined functions, popping an undefined function frame reveals its caller to the top of the stack. After loading a bundle that defines the function, a `push_frame` can re-creates the callee frame, but this time the callee is defined. The client can also choose not to rebind to the same stack, but swap away. Or just throw an exception (probably `ClassNotFoundException`) to the caller without defining the callee.
# About the NEWSTACK instruction
Creation of the stack will be successful, but it will trap when executing the undefined function after binding the stack to a thread.https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/41Checklist for the first final version of the specification2018-06-28T23:11:32+10:00John ZhangChecklist for the first final version of the specification*Created by: wks*
- [X] HAIL #29
- [X] Unsafe native interface. #24
- [X] API in C (header)
- [X] API in C (documented)
- [X] API as Mu instructions (common instructions). #36
- [X] Adjust the undefined function API
*Created by: wks*
- [X] HAIL #29
- [X] Unsafe native interface. #24
- [X] API in C (header)
- [X] API in C (documented)
- [X] API as Mu instructions (common instructions). #36
- [X] Adjust the undefined function API
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/36Metacircular Client Interface2018-06-28T23:11:32+10:00John ZhangMetacircular Client Interface*Created by: wks*
It has been proposed long ago that "everything the API can do should also be possible in the Mu IR".
This issue maintains a checklist of features not in the Mu IR. These functions should be gradually added to the IR...*Created by: wks*
It has been proposed long ago that "everything the API can do should also be possible in the Mu IR".
This issue maintains a checklist of features not in the Mu IR. These functions should be gradually added to the IR.
**Features in the API but not the IR**:
* bundle loading
* stack introspection: `current_func_ver`, `current_instruction`, `dump_keepalives`
* OSR: `pop_frame`, `push_frame`
* Handle traps and undefined functions in Mu IR. It depends on how the Mu VM itself is implemented.
* Handle watchpoint in Mu IR: an instruction which is a no-op when disabled, but a (maybe limited kind of) Mu function call when enabled.
**Things that can be done dynamically via handles in the API, but can only be done statically in the IR**. I am not sure how dynamic Mu should be, or need to be, because some of the items below can be worked around with some Mu IR code, such as maintaining a hash table implemented in Mu IR, or writing wrapper functions.
* Opaque handle type: a handle that holds **any** Mu value. `ref<T>` can be a candidate for this purpose.
* Getting a Mu constant value (including constants, globals and functions) by its ID. This is a kind of introspection.
* This can be done with a type argument: `%c = GET_CONST <@T> %id`, `%c` is a `@T`
* Or return a handle: `%ch = GET_CONST_H %id`, `%ch` is a `ref<void>`
* Creating Mu heap objects by a type ID.
* Calling a Mu function (or constructing a Mu frame) with both the callee and the arguments as handles. This allows calling a Mu function with a run-time-determined arity and arg types.
* Dump keepalive variables.
* This can be done with a type argument: `%sixth_ka = GET_KEEPALIVE <@T> %frame 6`
* or return handle: `%sixth_ka_handle = GET_KEEPALIVE_H %frame 6`
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/34Extra types for the Native Interface2018-06-28T23:11:32+10:00John ZhangExtra types for the Native Interface*Created by: wks*
Philosophy: There should be a subset of Mu types and instructions that can do what C can do. It should be possible to implement the C programming language in this subset of Mu while still be able to access the memory i...*Created by: wks*
Philosophy: There should be a subset of Mu types and instructions that can do what C can do. It should be possible to implement the C programming language in this subset of Mu while still be able to access the memory in a way specified by the platform's ABI (be compatible with "good" native programs).
# Types
## Pointer types
* `ptr<T>`: A memory pointer to type `T`. (Is there a better name? A pointer always points to somewhere in the memory. Maybe "data pointer" or "value pointer"? In C, it is object pointer, but "object" has a different meaning in Mu.)
* `funcptr<sig>`: A function pointer to a function with signature `sig`
Pointers are addresses. They can be cast to and from integer values by interpreting the integer as the address. Mu does not check the validity of this cast.
`ptr<T>` can be used by the memory addressing instructions: `GETFIELDIREF`, `GETELEMIREF`, ... will work as they are `iref` types. Memory access instructions can work with `ptr<T>` with a `PTR` flag:
```
// %p is ptr<int<64>>
%result1 = LOAD PTR ACQUIRE <@i64> %p
STORE PTR RELEASE <@i64> %p @const1
%result2 = CMPXCHG PTR SEQ_CST SEQ_CST <@i64> %p @const1 @const2
%result3 = ATOMICRMW PTR SEQ_CST ADD <@i64> %p @const3
```
`funcptr<sig>` can be called with the `CCALL` instruction:
```
// assume @write is funcptr<@size_t (@i32 @voidptr @size_t)>
%result = CCALL C <@sig> @write (%fd %buf %sz) // C means the "C" calling convention
```
## Union type
I think there is a way to introduce the union type from C without compromising the safety of Mu's reference types.
Define the union type as: `union<T1 T2 T3 ...>`
`T1`, `T2`, `T3`, ... are its members. The members of a union type cannot contain `ref`, `iref`, `weakref`, `func`, `thread`, `stack` or `tagref64` types as they are either object references or opaque references. However, `ptr` and `funcptr` are allowed.
`union` must be in the memory. It cannot be the type of an SSA variable. It does not make sense: union is a, err..., "union" of several types (no puns intended), but an SSA variable holds exactly one type.
> One may argue that "I want to LOAD a union and STORE to another location without looking into it, so I need union to be an SSA variable". However, for data transfer, there could be a `memcpy`-like instruction that can copy large structures efficiently. So it is unnecessary.
When allocated in the Mu memory, its initial value is all zeros: If any member is loaded before another value is stored into it, the result is always the "zero value" of that type (int 0, fp +0.0, ref NULL).
A union only holds the latest stored member:
* if a load is **not atomic*, and there is only one visible store to a member of the union, then
* if the store accesses the same member as the load, the load gets the value of that store;
* if the store accesses a different member, the load instruction has undefined behaviour.
* Union members cannot be accessed atomically.
> I am still uncertain how the C memory model plays together with unions. C11 defines a union as "an overlapping set of member objects" and "When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values." This implies that storing into one member of a union has the side effect of modifying other members.
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/30Mu Loadable Format (MuLF)2018-06-28T23:11:32+10:00John ZhangMu Loadable Format (MuLF)*Created by: wks*
This proposal describes an extended code delivery unit of the Mu VM.
# Rationale
A "standalone Mu IR" (if there is such thing) needs more than a code bundle to run. They include:
* A way to allocate and initiali...*Created by: wks*
This proposal describes an extended code delivery unit of the Mu VM.
# Rationale
A "standalone Mu IR" (if there is such thing) needs more than a code bundle to run. They include:
* A way to allocate and initialise heap objects at load time (addressed by the HAIL format. See https://github.com/microvm/microvm-meta/issues/29 )
* Embedded binary native programs (for example, native libraries, a native client, or even the Mu VM itself)
* Static dependencies to other units of loading (MuLF files).
* (optionally) An entry point to start execution.
Existing mechanisms can perform all of the above because the client has total control over the Mu VM. This proposal only gives a "standard" format to do so.
# Proposal
This new unit of loading is called **Mu Loadable Format (MuLF)**.
## The MuLF file sample
This proposal uses XML as the human-readable format. It is obviously not ideal.
```xml
<mulf>
<dependency kind="mulf" name="uvm.std.io" />
<dependency kind="native" name="libc.so" />
<muir-bundle> <!-- the code section --> <![CDATA[
.typedef @i64 = int<64>
.typedef @i8 = int<8>
.typedef @string = hybrid<@i64 @i8>
.typedef @ref_string = ref<@string>
.typedef @array_ref_string = hybrid<@i64 @ref_string>
.typedef @ref_array_ref_string = ref<@array_ref_string>
.const @I64_0 <@i64> = 0 // to be initialised in the next section
.global @helloWorld <@ref_string>
.funcsig @main_sig = @i64 (@ref_array_ref_string)
.funcdef @main <@main_sig> (%args) {
%entry:
%hw = LOAD <@ref_string> @helloWorld
CALL <@println_sig> @println (%hw)
RET <@i64> @I64_0
}
]]>
</muir-bundle>
<heap-initialise format="hail"> <!-- the "heap section" --> <![CDATA[
.newhybrid $hw_obj <@string>
.init $hw_obj = {12, {'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!'}}
.init @helloWorld = $hw_obj // Assign this object to the global cell @helloWorld
]]>
</heap-initialise>
<binary kind="native-code">
...
</binary>
<binary kind="native-data">
...
</binary>
<initialiser function="@main" synchronous="true">
<param kind="cmdline-args-as-standard-string-ref-array" />
</initialiser>
</mulf>
```
## MuLF Clients
As other programming languages, this format is handled by a **MuLF client**. This is to keep the core Mu VM minimal.
* The Mu micro VM provides the Mu Client API, which can load [**Mu IR bundles**](https://github.com/microvm/microvm-spec/wiki/uvm-ir) and [**HAIL files**](https://github.com/microvm/microvm-meta/issues/29), as well as mechanisms to create Mu stacks and Mu threads.
* The MuLF client handles the MuLF format. The client loads and parses the MuLF file, invokes the API calls to load the included bundle and the included HAIL file, creates stacks and threads to execute the initialisation functions and perform necessary synchronisation to make sure the initialisation functions finish before "other parts" (the meaning depends on the concrete high-level language) can run.
## MuLF sections
* **dependencies**: references to other MuLF files or native libraries
* **uir-bundle**: a Mu IR bundle
* **heap-initialise**: a HAIL file which initialises the heap
* **binary**: embedded binary data
* **initialiser**: a function to be executed after loading the MuLF bundle
## The loading process
The MuLF client shall process dependencies before loading the current MuLF file.
TODO: The Mu IR is designed not to allow circular dependencies (just put multiple Mu IR bundles into one big bundle so circular types and function calls can be resolved). But whether MuLF allows circular dependencies is an open question.
Then the MuLF client loads the binary section, then the uir-bundle section, then the heap-initialiser section.
Finally the MuLF executes the initialisers in the order they are declared. Each initialiser is executed in a new Mu stack and a new Mu thread. If an initialiser is marked as "synchronous", the loading process pauses and waits for the initialiser function to return. But this does not prevent the initialisers to trigger traps to the client and result in other Mu IR bundles or MuLF files to be loaded.
Then the loading process finishes. The Mu VM will continue executing until the last Mu thread is killed.
## The relation between the MuLF client and the higher-level language client
The MuLF client can be considered as a "client-level library" which helps a higher-level client which implements a language.
Conversely the higher-level client can be considered as a library in the MuLF "framework": implementing language-specific things reactively as "call-backs" from the MuLF client.
# ELF compatibility
Using the standard ELF format will bring many profits, including making use of existing system facilities. It is possible to make a MuLF file a self-contained executable file.
# Open questions
**Do we standardise this MuLF as part of the Mu VM specification?**
Probably yes. It should provide a standard way to load "something more than a code bundle". However, this MuLF format is more oriented to the traditional ahead-of-time "linker-loader" model rather than the JIT compiling model. Complex languages (like Java) may wish to precisely control its loading process (e.g. loading more than one circularly-related classes and submit them in one huge Mu IR bundle and initialise heap objects together). In this case, MuLF is not as useful as ahead-of-time compiled programs.
We may make MuLF an "optional component" of the Mu VM. Some very tiny Mu implementation may not have it, but anyone who claims to implement MuLF shall do it in the standard-compliant way.
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/26Tutorial and Spec versioning2018-06-28T23:11:32+10:00John ZhangTutorial and Spec versioning*Created by: wks*
I am evaluating http://readthedocs.org/ . A µVM tutorial is being written.
Jekyll seems to bias strongly towards blogging and the structures (including a table of content and a per-page sidebar of the outline) are n...*Created by: wks*
I am evaluating http://readthedocs.org/ . A µVM tutorial is being written.
Jekyll seems to bias strongly towards blogging and the structures (including a table of content and a per-page sidebar of the outline) are not automatically handled.
There should be a common version scheme for the spec, the refimpl and the tutorial. I propose using the major version "2" for the current version. The tutorial will target the "main" version (currently "2").
GitHub Wiki sucks (very primitive structure support, no table of content). Considering migrating the specification (https://github.com/microvm/microvm-spec/wiki) back to Sphinx/reStructuredText if readthedocs.org is proved useful.
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/22Trap and OSR2018-06-28T23:11:32+10:00John ZhangTrap and OSR*Created by: wks*
This ticket tracks the design of trap handling, stack introspection and OSR API.
# Overview
There are three flavours of stack usage.
The first case is using TRAP to compute some value (or change some µVM state...*Created by: wks*
This ticket tracks the design of trap handling, stack introspection and OSR API.
# Overview
There are three flavours of stack usage.
The first case is using TRAP to compute some value (or change some µVM states)
1. Enter the `handle_trap` call-back from a TRAP instruction, leaving the stack in the `READY<T>` state. The current thread is unbound and suspended.
2. The Client compute some value (of type `T`) and change some µVM states.
3. Return from the trap and continue normally. That is, re-bind the thread to the stack, passing the value of type T.
The second case is using TRAP for OSR.
1. Enter the `handle_trap` call-back from a TRAP instruction, leaving the stack in the `READY<T>` state. The current thread is unbound and suspended.
2. The Client queries the current version of function, the current instruction, and the current KEEPALIVE variables of any frame in the current stack.
3. The Client pops frames. Now the stack is in some inconsistent state.
4. The Client pushes frames. For each frame, supply the current version of function, the current instruction and the value of any live variables.
5. The Client re-bind the thread to the stack and return from the TRAP. Just before rebinding, the stack should be in some `READY<U>` state where U may not be T.
The third case is to manipulate some arbitrary stack in a `READY<T>` state.
1. The Client do whatever it wants to the stack.
2. The stack is in `READY<U>` state where U may not be T.
# Open questions
## Is an UNDER_CONSTRUCTION flag needed?
Observed from the previous cases, there are generally two categories:
1. Do not perform OSR and simply return with some value (or throw exceptions to the stack).
2. Perform OSR.
The second case may leave the stack temporarily in an inconsistent state. Any attempt to swap-stack to such a stack is meaningless. The UNDER_CONSTRUCTION flag indicates such a state.
This requirement can be interpreted in two ways:
1. This flag is a physical flag. The Client takes an action to set the flag. After OSR, it clears the flag. This flag can be probed and can be tested during SWAP-STACK. A µVM implementation may implement a mutual-exclusive lock for swap-stack (but may be inefficient).
2. This flag is only conceptual, that is, it does not physically exist. The Client simply does OSR. There is no way to see whether a stack is "under construction". Swapping to such a stack gives undefined behaviour.
I prefer the second approach. Swapping to an "under construction" stack is never meaningful and always requires extra synchronisation in the program. We may trust the Client to generate correctly synchronised code.
## What state is a stack in when some frames are popped?
All frames other than the top frame must be executing the CALL instruction. After popping any frame, the "caller frame" is exposed as the top frame and it may continue with a value or receive an exception just like the TRAP instruction. So it is natural to define that after popping, the stack is in the `READY<R>` state where R is the return type of the current function.
However, from the implementation point of view, SWAP-STACK must have a different calling convention from ordinary calls (mainly because SWAP-STACK cannot have any callee-saved registers because the callee may not swap back). There must be a "ghost frame" above the current frame with CALL to adapt to the SWAP-STACK calling convention. The value passed by SWAP-STACK will be returned from the "ghost frame" to the CALLer.
We may assume adding the "ghost frame" is cheap. Maybe not.
# Hypothetical Client code in Java
This code lets the Client perform some computation.
```Java
/* Assume the following µVM IR code:
%bb:
@current_time_millis_1234567 = TRAP <@i64>
CALL <...> @print (@current_time_millis_1234567)
....
*/
class Client extends MicroVMClient {
@Override
public TrapReturnValue handleTrap(ClientAgent ca, int threadHandle, int stackHandle) {
long time = System.currentTimeMillis();
ca.putLong("@i64", time);
return new RebindThreadPassValue(stackHandle, time);
}
}
```
This code replaces the top frame:
```Java
class Client extends MicroVMClient {
@Override
public TrapReturnValue handleTrap(ClientAgent ca, int threadHandle, int stackHandle) {
// Introspect the frames
int curInstID = ca.getCurrentInstruction(stackHandle, 0); // 0 = top frame
int[] keepAlives = ca.dumpKeepAlives(stackHandle, 0); // 0 = top frame
// Re-compile the function. newFunc also tells the Client where to continue.
HighLevelFunction newFunc = compileNewFunction(...);
// Pop a frame
ca.popFrame();
// What µVM function is the new high-level function?
int funcID = newFunc.getUvmFuncID();
// Where to continue?
int contInstID = newFunc.getContinuationPoint();
// What are the values of local variables?
Map<Integer, Integer> variableToValue = new HashMap<Integer, Integer>();
for (LocalVariable lv: newFunc.localVariables()) {
int valHandle = ca.putXxxx(lv.getValue())
int varID = lv.getUvmVarID();
variableToValue.put(varID, valHandle)
}
// Push the frame
ca.pushFrame(funcID, contInstID, variableToValue);
// Return from TRAP, tell the µVM to re-bind the thread with the stack. The trap does not receive values.
return new RebindThreadPassVoid(stackHandle);
}
}
```
This example emulates the JVMTI function `ForceEarlyReturnInt` (force a function (of `int` return value) to return early with a specific value, not executing any finalisers).
```java
/*
Assume the following µVM function:
.funcsig @foo_sig = @i32 ()
.funcdef @foo VERSION @foo_v1 () {
%entry:
@my_trap_xxxxxx = TRAP <@void>
THROW @NULLREF
}
*/
class Client extends MicroVMClient {
@Override
public TrapReturnValue handleTrap(ClientAgent ca, int threadHandle, int stackHandle) {
ca.popFrame(stackHandle); // Pop the top frame and expose its caller to the top
int returnValue = 42;
int rvHandle = ca.putInt("@i32", returnValue);
return new RebindThreadPassValue(stackHandle, rvHandle);
}
}
```
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/21Proposed Lua-like µVM-Client Interface2018-06-28T23:11:32+10:00John ZhangProposed Lua-like µVM-Client Interface*Created by: wks*
In Lua, the C program exchanges values with a "Lua state" using a stack.
1. All Lua values are kept on the stack and can converted to and from C values on demand.
2. All Lua references must be kept on the stack. Al...*Created by: wks*
In Lua, the C program exchanges values with a "Lua state" using a stack.
1. All Lua values are kept on the stack and can converted to and from C values on demand.
2. All Lua references must be kept on the stack. All operations involving tables require the table operand to be on the stack.
Why?
1. The type systems of Lua and C are different. This stack segregates all Lua types from C types.
2. Lua uses garbage collection. (The official Lua uses mark-sweep.) This stack simplifies GC by preventing the C program from keeping a reference to the Lua world.
Reference: The stack in the Lua C API http://www.lua.org/pil/24.2.html
# Overview
The Client interacts with the µVM via a µVM Client Agent. The Agent keeps a stack of µVM values, a thread-local allocator and so on. It is the counterpart of a µVM thread, albeit working for the Client. Each Agent is only accessible from one Client thread (not thread safe), but a µVM Client may have arbitrary number of Agents.
There are several principles:
* The stack holds any µVM value that can be held in a µVM SSA variable. A cell in the stack is like a µVM SSA variable. Unlike the µVM memory, it does not have memory location and cannot be referred to.
* The Client provide implementation-specific way to add Client values into the Agent stack and extracting µVM values to Client values.
* All µVM API messages that take µVM values as parameters shall use existing µVM values on the stack. All µVM API messages that return µVM values shall push new values on the stack.
Example (Java as Client):
```java
MicroVM mvm = …;
ClientAgent ca = mvm.newClientAgent();
// push some values
ca.pushInt("@i32", 0x12345678);
ca.pushLong("@i64", 0x123456789abcdef0L);
ca.pushFloat(3.14F);
ca.pushDouble(6.28);
// The Java integer type and the µVM type does not need to match.
// The following pushes convert Java integer type to µVM types of different lengths
ca.pushInt("@i64", -0x22334455);
ca.pushLong("@i32", 0x123456789abcdef0L); // truncated to 0x9abcdef0
ca.pushInt("@i1", 1); // integer of 1 bit (boolean type)
ca.pushBool("@i1", true); // same as above.
// Retrieving values from the stack
int v1 = ca.toInt(-4); Get the 4th top element from the stack and convert to Java int. So it is -0x22334455.
long v2 = ca.toLong(-4); Same as above, but convert to Java long (zero extended). So it is 0xddccbbabL.
// Popping
ca.pop(7); Pop 7 elements from the stack.
// Memory access
ca.pushGlobal("@global_var"); // Get the internal reference of a global cell and push on the stack
ca.load(MEMORD_ACQUIRE); // Load. Assume the top element is an internal reference.
ca.pushGlobal("@global_var");
ca.pushInt("@i64", 42);
ca.store(MEMORD_RELEASE); // Store. The top element is the new value and the second is the internal reference.
// Calling a µVM function
// This is fairly complicated because this involves creating both a µVM stack and a µVM thread.
ca.pushFunc("@some_func"); // Push a function reference
ca.pushInt("@i32", 42); // Push argument1
ca.pushDouble(3.14); // Push argument2
try {
ca.newStack(2); // Create a new stack, using a function with 2 arguments on the stack.
// So the top 2 elements are arguments and the third element is the function itself.
// Those elements are popped and a new stack value is pushed.
} catch (MicroVMStackOverflowException e) {
...
}
ca.newThread(); // Create a new thread. Assume the top element on the stack is a µVM stack value.
```
# API functions
The principles are
+ Pushing operations add new values to the top.
+ The popping operation removes values from the top.
+ Queries (the operations that convert values back to Java values) are non-destructive. They keep the values.
+ Operations on the top of the stack are destructive: they pop the operands, like the JVM, then push new values on the top.
## Getting the µVM Client Agent
* message: `new_client_agent`
* parameters: none
* returns: a handle of new client agent
Create a new Client Agent.
Example Java signature: `public ClientAgent MicroVM#newClientAgent()`
* message: `close_client_agent`
* parameters:
1. `ca`: the handle to the client agent
* returns: none
Close the Client Agent.
Example Java signature: `public void ClientAgent#close()`
## Pushing new values to the stack
* message: `push_value`
* parameter:
1. `uvm_type`: the µVM type of the value
2. `val`: the value in the Client's representation
* returns: none
* stack top:
+ before: ...
+ after: ..., `val_in_uvm_type`
Convert the Client value `val` to the µVM type `uvm_type` and push it to the stack. This message only work for non-reference values, including integers and floating point numbers.
The µVM may implement this as multiple functions/methods that best suits the Client programming language.
Example Java signatures:
* `public void ClientAgent#pushInt(int uvmTypeID, int val)`
* `public void ClientAgent#pushLong(int uvmTypeID, long val)`
* `public void ClientAgent#pushBigInteger(int uvmTypeID, BigInteger val)`
* `public void ClientAgent#pushFloat(int uvmTypeID, float val)`: will truncate/extend to the µVM type
* `public void ClientAgent#pushDouble(int uvmTypeID, double val)`: will truncate/extend to the µVM type
* `public void ClientAgent#pushFloatNoType(float val)`: always convert to the µVM `float` type
* `public void ClientAgent#pushDoubleNoType(double val)`: always convert to the µVM `float` type.
* message: `push_global`
* parameters:
1. `global`: the ID/name of a µVM global cell
* returns: none
* stack top:
+ before: ...
+ after: ..., `global_val`
Push an internal reference of a global cell to the stack.
Example Java signature: `public void ClientAgent#pushGlobal(int uvmGlobalID)`
* message: `push_func`
* parameters:
1. `func`: the ID/name of a µVM function
* returns: none
* stack top:
+ before: ...
+ after: ..., `func_val`
Push a function reference (µVM's `func` type) of a µVM function to the stack
Example Java signature: `public void ClientAgent#pushFunc(int uvmFuncID)`
## Converting to Client types
* message: `to_client_value`
* parameters:
1. `pos`: the position in the stack
* returns: the µVM value in the client type
* stack top: not changed
Convert a value in the stack to the client type. This applies for non-reference types including integers and floating point numbers.
The µVM may implement this as multiple functions/methods that best suits the Client language.
Example Java signatures:
* `public int ClientAgent#toInt(int pos)`
* `public long ClientAgent#toLong(int pos)`
* `public BigInteger ClientAgent#toBigInteger(int pos)`
* `public float ClientAgent#toFloat(int pos)`
* `public double ClientAgent#toDouble(int pos)`
## Popping
* message: `pop`
* parameters:
1. `num`: the number of values to pop
* returns: none
* stack top:
+ before: ..., `elem_1`, `elem_2`, ..., `elem_num`
+ after: ...
Pop `num` elements from the stack.
Example Java signature: `public void ClientAgent#pop(int num)`
## Memory Allocation
* message: `new`
* parameters:
1. `type`: the ID/name of the µVM type of the object
* returns: none
* stack top:
+ before: ...
+ after: ..., `ref`
Allocate an object of type `type` on the µVM heap and push the object reference on the stack.
Example Java signature: `public void ClientAgent#newObj(int typeID)`
* message: `new_hybrid`
* parameters:
1. `type`: the ID/name of the µVM type of the object
* returns: none
* stack top:
+ before: ..., `len`
+ after: ..., `ref`
Allocate an object of type `type`, which must be a `hybrid` type, on the µVM heap and push the object reference on the stack. The length of the variable part is `len`, which is any µVM integer types zero_extended to the machine word length.
Example Java signature: `public void ClientAgent#newHybridObj(int typeID)`
## Memory Access
* message: `load`
* parameters:
1. `memord`: the memory ordering
* returns: none
* stack top:
+ before: ..., `iref`
+ after: ..., `val`
Load from an internal reference `iref` on the stack and push the loaded value to the stack, using the `memord` memory ordering.
Example Java signature: `public void ClientAgent#load(MemoryOrder memOrd)`
* message: `store`
* parameters:
1. `memord`: the memory ordering
* returns: none
* stack top:
+ before: ..., `iref`, `new_val`
+ after: ...
Store `new_val` on the stack into an internal reference `iref` on the stack, using the `memord` memory ordering.
Example Java signature: `public void ClientAgent#load(MemoryOrder memOrd)`
## Stack and Thread operations
* message: `new_stack`
* parameters:
1. `nparams`: the number of parameters to the stack-bottom function
* returns: none
* stack top:
+ before: ..., `func`, `arg_1`, `arg_2`, ..., `arg_nparams`
+ after: ..., `stack`
Create a new stack using `func` as the stack-bottom function and `arg_x` as its arguments. Push the newly created `stack` value to the stack.
Example Java signature: `public void ClientAgent#newStack(int nParams)`
* message: `new_thread`
* parameters: none
* returns: none
* stack top:
+ before: ..., `stack`
+ after: ..., `thread`
Create a new thread which is initially bound to a stack `stack`. Push the thread value on the Client Agent stack. The new thread `thread` starts execution immediately.
Example Java signature: `public void ClientAgent#new_thread()`
## Other API functions
TODO: define them later
* `is_int`, `is_float`, `is_ref`, `is_iref`, ... `is_stack`, `is_thread`, `is_tagref64`
* `tagref64_is_int`, ... `tagref64_get_ref`, ..., `tagref64_set_fp`... : manipulate the `tagref64` type.
* `copy_value`, `remove_value`: manipulate the Client Agent stack.
* `extract_field`, `insert_field`: manipulate `struct` types
* `extract_element`, `insert_element`: manipulate `vector` types
* `get_iref`, `get_field_iref`, `get_elem_iref`, `shift_iref`, `get_fixed_part_iref`, `get_var_part_iref`: manipulating reference types.
* `get_current_stack`, `kill_stack`, `bind_thread_to_stack`: advanced thread/stack operations
* `get_active_func_version_id`, `get_current_instruction_id`, `dump_keepalive_variables`, `pop_frame`, `push_frame`: for OSR
# Known Issues
This API assumes a stack which can contain ANY µVM types that were applicable for SSA variables. This makes it a dynamically types. This is good for Lua because Lua is dynamic and has a small set of types (only nil, boolean, number, string, table, function, userdata, ...), all of which have similar sizes.
The main problem with the µVM is when there are µVM `struct` type values (especially large structs which, themselves, are bad to be represented as value rather than reference to heap object). Some corner cases include the "complex number" type which can be represented as a struct of two doubles. In any cases, extra type information must be kept for the stack to know the type of all of its elements.
I have to trust the µVM implementation to handle the dynamic typing efficiently.
spec-2https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/18How to represent merging of variables / Phi functions2018-06-28T23:11:32+10:00John ZhangHow to represent merging of variables / Phi functions*Created by: eliotmoss*
This is an update of a proposal from a year ago.
The current Mu definition of SSA-form has labels, branches to labels, and Phi functions just after labels.
I propose an alternative, which we might call "got...*Created by: eliotmoss*
This is an update of a proposal from a year ago.
The current Mu definition of SSA-form has labels, branches to labels, and Phi functions just after labels.
I propose an alternative, which we might call "goto with values". It is similar to continuation passing style except that all the continuations are statically defined (one for each label). This alternative would work as follows:
* Each label would have zero or more (SSA-variable : type) pairs. These SSA-variables are local to the block that starts at the label.
* Each branch label (both labels in a conditional branch, the single labels in an unconditional branch, etc.) would list locally visible values being "sent" to the branch target.
Example:
x = 3;
y = 25;
goto l(x, y);
...
l: (x2:int<32>,y2:int<8>)
This form makes it clear that the association ("assignment") of values to phi-variables has to happen as part of the control transfer. Phi-functions are simply one way of representing that, but they're not not a way that seems immediately helpful for code generation.
I further observe that the current definition apparently allows any value to be passed in to a Phi, but I think it should be restricted to global/constant SSA variables or SSA variables defined in the sending block. The new form perhaps makes that clear, not least because with it you can explicitly disallow referring to local SSA variables defined in other blocks.
Putting these another way, non-global SSA-variables have a scope only from where they are defined to the end of their block. Some may be defined at the label that start the block and others may be defined later, but all live values must be mentioned at branches and explicitly passed on.
Compared with traditional SSA form this may appear verbose. However it has at least a couple of advantages:
* A code generator or optimizer need not perform a live variable analysis to know what is live at a given point in the code.
* Longer live ranges are broken up into smaller ones, which may allow better register allocation, and which is very suited to a coalescing register allocator.
* This form appears simpler to deal with in a formal specification of Mu.https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/16Undefined vs Implementation-defined2018-06-28T23:11:32+10:00John ZhangUndefined vs Implementation-defined*Created by: wks*
NOTE: a higher-level discussion is in https://github.com/microvm/microvm-meta/issues/17
During the meeting in 23 September 2014, we talked about the difference between "undefined behaviour" and "implementation-defin...*Created by: wks*
NOTE: a higher-level discussion is in https://github.com/microvm/microvm-meta/issues/17
During the meeting in 23 September 2014, we talked about the difference between "undefined behaviour" and "implementation-defined behaviour".
# Background
Some operations in C as well as LLVM are undefined behaviours.
* Division by zero.
* Example: 42 / 0
* Overflow in signed division.
* Example: int a = -0x80000000; int b = a / -1;
* Shifting an integer by a number of bits greater than the length of the left-hand-side
* Example: int a = 42; int b = a << 32; int c = a << -1; assume int is 32-bit.
However, the machine instruction counterparts have defined behaviours in each and every architecture.
* Division by zero
* x86: IDIV, DIV: Divide-by-zero raises "divide error".
* ARMv7: SDIV, UDIV:
* ARMv7-A: Divide-by-zero always gets 0.
* ARMv7-R: Controlled by SCTLR.DZ, an "Undefined Instruction" exception may or may not be raised.
* A64: Divide-by-zero always gets 0
* division overflow
* x86: IDIV, DIV: If the result is not representable (positive too large, negative too small) by the corresponding type (signed or unsigned), then raises "divide error".
* ARMv7-A: SDIV and UDIV are optional. They may be implemented by software.
* ARMv7, A64: SDIV, UDIV: result is truncated to the number of bits of the corresponding type. No error is raised. So -0x80000000 / -1 == -0x80000000 when it is 32-bit.
* shifting:
* x86: SAL,SAR,SHL,SHR: the count operand is masked to 5 bits (32-bit integer) or 6 bits (64-bit integer)
* ARMv7:
* LSL, LSR, ASR (immediate): It can only encode 5 bits of shift amount.
* LSL, LSR, ASR (register): The shift mount register is masked to 8 bits. After shifting, the last 32 bits are the result.
* A64:
* ASR, LSL, LSR (immediate): It can only encode 6 bits of shift amount.
* ASRV, LSLV, LSRV (register controlled): The shift amount is the second register modulo the register size (i.e. masked).
# Undefined behaviour vs implementation-defined behaviour
In C11:
+ **undefined behavior**: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
+ **implementation-defined behavior**: unspecified behavior where each implementation documents how the choice is made
Implementation-defined behaviour has an additional requirement that the implementation should document the behaviour. As long as the behaviour is documented, it is still considered "defined", but at a different layer.
If a behaviour is never defined, the higher level (e.g. the Client) has no chance to depend on it even if the lower level (e.g. the CPU) has a precisely defined behaviour. On the contrary, if the behaviour is implementation-defined, there are still ways for the higher level to use the low-level detail. The ways include (assume the higher level is the client, the middle level is the µVM and the lower level is the CPU):
+ The Client programmer read the µVM manual for the particular CPU.
+ The µVM provides compile-time-checkable flags and the Client is conditionally compiled. (`configure --with-uvm=x86-64`)
+ The µVM provides run-time-callable functions and the Client generates code conditionally (`if (uvm.wordSize() == 64) { emitStore64(reg,mem); }`).
# How should an abstraction layer be made over differences?
Undefined behaviours usually occur in the cases (other than errors) where different platforms behave differently. Division and shifting are two examples.
When creating an abstraction layer over such differences, there are basically three choices.
1. Define the different part as *undefined behavior* and the high-level user must avoid using them.
* C and LLVM takes this approach.
* The advantage is to make the specification very simple.
* The disadvantage is making it very difficult for the higher level to make efficient use of these cases since they must try everything to avoid those undefined behaviours.
2. Define the different part in one particular way and provide an implementation on every platform to behave like that.
* Java takes this approach.
* The advantage is the maximum portability of high-level code, since all programs work the same everywhere.
* The disadvantage is to make implementation and optimisation very difficult because the are too much invariants to maintain.
3. Define the different part as *platform-specific* and require the high-level to handle the difference.
* This will be the approach taken by the µVM.
* The advantage is to let the client make full use of the capabilities provided by the platform, resulting in efficient code. The µVM spec is also kept simple by pushing the differences to the implementation and the client.
* The disadvantage is adding more burden to the clients. The client must be aware of the difference between the µVM implementation on different platforms. The good thing is, the µVM still abstracts over the hardware, so the knowledge required by the client is limited to the µVM layer, not the hardware.
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/15Summary of changes in the Type System and the Instruction Set2018-06-28T23:11:32+10:00John ZhangSummary of changes in the Type System and the Instruction Set*Created by: wks*
# Top-level IR
* Support new OSR mechanism #5
- Function version clause #5
# Type system
* Add vector type. #9
* (REMOVED) Add lock type #11
# Instruction set
* Add "exception clause" #6 #7 #10
...*Created by: wks*
# Top-level IR
* Support new OSR mechanism #5
- Function version clause #5
# Type system
* Add vector type. #9
* (REMOVED) Add lock type #11
# Instruction set
* Add "exception clause" #6 #7 #10
- Also merge instructions: `CALL`/`INVOKE`, `ICALL`/`IINVOKE`
* Basic operations allow vector types and new instructions for vectors #9
- BinOp, Cmp, Conv, `SELECT`, `PHI`, `CALL`, `RET`, `LOAD` (gather), `STORE` (scatter)
- `EXTRACTELEMENT`, `INSERTELEMENT`, `SHUFFLEVECTOR`
* Match the new stack states (`READY<T>` and friends) #7
- Revised `TRAP`/`WATCHPOINT` instruction #7
- Revised `SWAPSTACK` instruction #7
* Revised `ICALL` instruction #9 spec-2https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/14Document memory layout requirements2018-06-28T23:11:32+10:00John ZhangDocument memory layout requirements*Created by: wks*
Because of the difference between architectures, it is better to leave the object layout to be implementation-specific. An implementation of a µVM can choose its optimal strategy to make memory accesses as fast as poss...*Created by: wks*
Because of the difference between architectures, it is better to leave the object layout to be implementation-specific. An implementation of a µVM can choose its optimal strategy to make memory accesses as fast as possible.
However, although the memory layout, is part of the implementation, some guarantees must be made to enable:
1. implementation of object-oriented languages: the superclass-subclass relationship can be most conveniently implemented as the superclass being a prefix of a subclass structure.
2. interoperation between µVM programs and C programs: during a foreign function call, data structures are shared so that data can be passed both ways.
3. some basic data structures must support atomic accesses. One of such type is references.
# Prefix rules
Some type **is a prefix of** another type. If T1 is a prefix of T2, then there are **shared components** between T1 and T2. A component can be the whole value or some part of it, including fields in structs, elements in arrays and vectors and both the fixed part and the variable part in hybrids.
Specifically:
* Any type is a prefix of itself.
+ All corresponding components are shared.
* `void` is trivially a prefix of any type.
+ No component is shared.
* `T1 = T` is a prefix of `T2 = struct <SEQ>` for any T where SEQ is a sequence of types beginning with T.
+ The whole T1 itself is a shared component with the first field in the struct T2.
* `T1 = T` is a prefix of `T2 = hybrid<T U>` for any T.
+ The whole T1 is a shared component with the fixed part in the hybrid T2.
* `T1 = T` is a prefix of `T2 = array<T n>` for any T if n >= 1.
+ The whole T1 is a shared component with the first element in array T2.
* For all types T1, T2 and T3, if T1 is a prefix of T3 and T3 is a prefix of T2, then T1 is a prefix of T2.
+ The shared component between T1 and T2 are their mutual shared components with T3.
Examples:
* `float` is a prefix of `struct<float double>`.
+ The first field is shared.
* `struct<@TIB_REF @LOCK @LENGTH_TYPE>` is a prefix of `hybrid<struct<@TIB_REF @LOCK @LENGTH_TYPE> int<8>>`.
+ The fixed part of the latter type is shared with the former type.
* `int<8>` is a prefix of `array <int<8> 100>`.
+ The first byte element of the latter is shared with the former.
* `@TIB_REF` is a prefix of `hybrid<struct<@TIB_REF @LOCK @LENGTH_TYPE> int<8>>`
+ There is an intermediate type `struct<@TIB_REF @LOCK @LENGTH_TYPE>` that bridges the "is a prefix of" relation.
If:
* There is a memory location M which represents data of type T2, and,
* T1 is a type and T1 is a prefix of T2, and,
* r1 is an `iref<T1>` and refers to memory location M1, and,
* r2 is an `iref<T2>` and refers to memory location M, and,
* the beginning of M and M1 are the same (i.e. the have the same address), and,
* rc1 is r1 or an internal reference derived from r1, and,
* rc2 is r2 or an internal reference derived from r2, and,
* rc1 and rc2 refer to a shared component between T1 and T2,
then rc1 and rc2 refer to the same memory location. This means the shared components can be accessed as if it is a field of a prefix. This allows treating a subclass as an instance of a superclass.
Related standard: C11
* 6.3.2-3: (**array --> pointer to first element**) ... an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. ...
* 6.7.2.1-15:(**pointer to struct <--> pointer to first field**) ... A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. ...
TODO: C does not explicitly allow the prefixing between the following structs where their sequences of elements has a "prefix" relation:
```
struct Foo { short a; int b; long c; };
struct Bar { short a; int b; long c; float d; double e; };
```
There may be a reason behind it.
# Object layout and C foreign function interface (FFI)
The object layout *should* follow the application binary interface (ABI) as much as possible because:
* The ABI is carefully designed by system programmers for performance. The µVM implementer needs a good reason why not to follow it.
* When a foreign function call to external C programs is needed, the µVM data structure should already be in the desired layout expected by the C programs.
The µVM only needs to guarantee some compatibility between µVM types and C types in the FFI. It is already documented in the [Instruction Set](https://github.com/microvm/microvm-spec/wiki/instruction-set), but needs to be double-checked.
# Basic data structures that needs atomic accesses
To guarantee memory safety, the µVM must not allow out-of-thin-air reference values or opaque values. Affected types are:
* ref
* iref
* weakref (loaded into SSA variables as ref)
* func
* thread
* stack
* tagref64 (may contain ref)
* futex (one word integer, loaded into SSA variables as plain `int<WORD_SIZE>`)
Storing internal references (`iref`) in the memory (heap or stack or global) is discouraged because of space inefficiency (no better way than encoding them as fat pointers). But if an implementation does allow putting `iref` in the memory, the accesses to them should be atomic. Other types than `iref` are too important not to be implemented as lock-free atomic types (all atomic accesses in µVM are lock-free).
An alternative to requiring `iref` to be atomic is to:
- document that accesses to `iref` in the memory is not atomic, and,
- require the client to compile all memory accesses to `iref` with locks, and,
- have significant performance penalty.
Since a fat pointer consists of two words: an object reference plus an in-object offset, some (I assume there are only very few of them) architectures may not provide atomic access to such a length.
spec-2https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/13Change the term "SSA Value" to "SSA Variable"2018-06-28T23:11:32+10:00John ZhangChange the term "SSA Value" to "SSA Variable"*Created by: wks*
The term "SSA Value" currently used in the µVM is confusing. People tend to think "value" means data value, that is, instances of types and are representable in binary. But what it actually means in the µVM is the **th...*Created by: wks*
The term "SSA Value" currently used in the µVM is confusing. People tend to think "value" means data value, that is, instances of types and are representable in binary. But what it actually means in the µVM is the **things** that can **generate** data value.
# The current conceptual model
The current hierarchy is:
* **SSA Value** (the union of all such things)
- **Global SSA Value** (defined globally, actually constants, written as `@a`, `@b`, `@c`, ...)
+ **Declared constants** (defined literally as constants)
+ **Global memory references** (internal references to the global memory, which never moves. Hence the reference is constant)
+ **Functions** (There is a constant "handle" for each function, which does not change even when a function is redefined. So it is effectively a constant.)
- **Local SSA Values** (defined locally, determined during program execution, written as `%a`, `%b`, `%c`, ...)
+ **Parameters** (passed to functions, determined at each call site)
+ **Instructions** (computations that generate data values, or void, the "unit" value. The resulting data value is determined every time the instruction is executed)
The other confusion is that currently "an instruction **is an** SSA Value". When an instruction uses another instruction as its parameter, like:
```
.const @a <int<64>> = 0x123456789abcdef0
%x = ADD <int<64>> @a @a
%y = SUB <int<64>> %x @a
```
In the above example, `@a`, `%x` and `%y` are all names of SSA Values. In the definition of `%y`, it means "`%y` **is** a SUB instruction and it takes the **computing result of** instruction `%x` and **the data value of** declared constant `@a` as arguments".
When the Client runs a trap handler, the names identify the TRAP instructions.
```
%trap0 = TRAP KEEPALIVE (%a %b %c %d ...)
```
Then the client will see "the `%trap0` instruction caused the trap".
The above model, though internally consistent, confused many people.
# The use in LLVM
* Value
- Argument
- BasicBlock
- InlineASM
- MDNode
- MDString
- User
+ Constant
* BlockAddress
* ConstantInt
* ConstantFP
* ConstantPointerNull
* ConstantStruct
* ConstantVector
* ConstantExpr
- BinaryConstantExpr
- CompareConstantExpr
- ...
* GlobalValue
- GlobalAlias
- GlobalObject
+ Function
+ GlobalVariable
+ Instruction
- BinaryOperator
- CmpInst
- PHINode
- CallInst
- UnaryOperation
+ LoadInst
+ AllocaInst
+ ...
- StoreInst
- TerminatorInst
+ BranchInst
+ ReturnInst
+ InvokeInst
+ ...
- ...
+ Operator
- ...
LLVM defines Value as (http://llvm.org/doxygen/classllvm_1_1Value.html):
> LLVM Value Representation.
>
> **This is a very important LLVM class. It is the base class of all values computed by a program that may be used as operands to other values.** Value is the super class of other important classes such as Instruction and Function. All Values have a Type. Type is not a subclass of Value. Some values can have a name and they belong to some Module. Setting the name on the Value automatically updates the module's symbol table.
>
> Every value has a "use list" that keeps track of which other Values are using this Value. A Value can also have an arbitrary number of ValueHandle objects that watch it and listen to RAUW and Destroy events. See llvm/IR/ValueHandle.h for details.
The µVM's hierarchy is a subtree of the LLVM's Value hierarchy. Similar with LLVM, instructions that does not return values (like ReturnInst, BranchInst, ...) are also a kind of "Value".
# Alternative notation
An SSA Variable is a thing that holds a data value. A data value is an instance of a type.
An SSA Variable has exactly one definition and the definition corresponds to exactly one SSA Variable. That definition can be a declaration of constant, global memory, function, parameter or instruction.
The revised hierarchy is:
* **SSA Variable**
- **Globally defined SSA Variable** (written as `@a`, `@b`, `@c`, ...)
+ defined by **Declared constants**
+ defined by **Global memory references**
+ defined by **Functions**
- **Locally defined SSA Variable** (written as `%a`, `%b`, `%c`, ...)
+ defined by **Parameters**
+ defined by **Instructions**
This does not change the hierarchy. In an implementation, it is practical to let class `Instruction` extend class `SSAValue`. In the documentation, we say, if `%a = ADD <int<64>> %b %c`, then:
* `%a` is an SSA Variable
* `%a` is defined by the `ADD <int<64>> %b %c`
* That instruction is bound to the SSA Variable `%a`
In this example:
```
.const @a <int<64>> = 0x123456789abcdef0
%x = ADD <int<64>> @a @a
%y = SUB <int<64>> %x @a
```
We say, `@a`, `%x` and `%y` are all SSA Variables. `%y` is **defined by** a SUB instruction. It takes the **value held by** the SSA Variables `%x` and `@a` as arguments".
In a trap handler in the Client for the following TRAP:
```
%trap0 = TRAP KEEPALIVE (%a %b %c %d ...)
```
The client will see "the TRAP instruction that **defines** `%trap0` caused the trap".
spec-2https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/12Document trampoline instruction2018-06-28T23:11:32+10:00John ZhangDocument trampoline instruction*Created by: mn200*
At meeting on 2 September 2014, we agreed that an LLVM-style trampoline primitive would probably be a good idea. Because
* the abstraction will allow generate of good machine code for varying architectures
* it ...*Created by: mn200*
At meeting on 2 September 2014, we agreed that an LLVM-style trampoline primitive would probably be a good idea. Because
* the abstraction will allow generate of good machine code for varying architectures
* it may be necessary to turn `fnptr * environment` into a single `fnptr` for the sake of FFI calls https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/10More instructions with abnormal control flow2018-06-28T23:11:32+10:00John ZhangMore instructions with abnormal control flow*Created by: wks*
There are a few instructions that may not always be executed successfully.
* TRAP, WATCHPOINT, SWAPSTACK, CALL/INVOKE, ICALL/IINVOKE: Some instructions expect exceptions. We already have multiple versions of instruc...*Created by: wks*
There are a few instructions that may not always be executed successfully.
* TRAP, WATCHPOINT, SWAPSTACK, CALL/INVOKE, ICALL/IINVOKE: Some instructions expect exceptions. We already have multiple versions of instructions for those with or without exceptions.
* BinOp: division-by-zero and floating-point exceptions are generated by the hardware, but many programming languages have defined behaviour on those cases (e.g. Java throws language-level exceptions). Some (e.g. FP exceptions) cannot be prevented easily by checking before doing operation
* NEW/NEWHYBRID/ALLOCA/ALLOCAHYBRID: out-of-memory error
* Function calls and the NEWSTACK instruction may call declared but undefined functions.
* Function calls may result in out-of-memory errors.
* LOAD/STORE/CMPXCHG/ATOMICRMW: they may access invalid memory locations, including the `NULL` reference and invalid addresses introduced by array element addressing. The hardware cannot cover all illegal cases, but it may be possible to let the hardware handle common errors (like `NULL` reference dereferencing). See Issue #6
* All intrinsic functions that may have exceptions (covered by the IINVOKE instruction).
# The proposal
Change the grammar so that all such instructions take an optional "exception clause". This generalises the handling of many instructions. Existing instruction pairs that distinguish the awareness of exceptions (CALL/INVOKE, ICALL/IINVOKE) can be merged. The identification of "basic block terminator" instructions now depends on the presence of such clause, not just by the opcode.
Example:
Do not handle exception locally:
```
%entry:
%rv = CALL <@sig> @func (%arg) // continue with the next instruction, or unwind the current frame
%b = ADD <@i64> %rv 1
...
```
Handle exception locally:
```
%entry:
%rv = CALL <@sig> @func (%arg) EXC(%nor %exc) // terminates the current basic block.
%nor:
... // normal destination
%exc:
%e = LANDINGPAD // exceptional destination
...
```
Even simple divisions may go wrong:
```
%entry:
%result = SDIV <@i64> %a %b EXC(%nor %exc)
%nor:
... // continue normally
%exc:
THROW @DivisionByZeroException
```
NOTE: The µVM will not create new exceptions for the client. If the exceptional case is not because of an exception being thrown from a function or another stack (by swap-stack), then it is an error. Errors are undefined behaviours. The following code is fatal when `%b` is zero:
```
%entry:
%result = SDIV <@i64> %a %b // undefined behaviour if %b is zero.
```
# About factored CFG
Factored CFG is used by the JikesRVM. It allows a basic block to have multiple exceptional side-exits, breaking the single-entry single-exit property. Although that will reduce the number of exceptional branches, the traditional CFG is better-understood and is still capable to express all control flows with additional verbosity.
spec-2https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/8Next milestone2018-06-28T23:11:32+10:00John ZhangNext milestone*Created by: wks*
As we already found many problems in the current µVM design, many changes can be done to improve it. Here is a list of issues to be addressed in the next µVM specification.
- [X] #6 #10 : More instructions can resu...*Created by: wks*
As we already found many problems in the current µVM design, many changes can be done to improve it. Here is a list of issues to be addressed in the next µVM specification.
- [X] #6 #10 : More instructions can result in abnormal control flows. This should be reflected in the instruction set.
- Stick to the standard control flow graph, not factored control flow graph.
- [X] #4 : Use C++11 memory model.
- [X] #7 : Update the swap-stack API.
- [X] #5 : Update the OSR API.
- [X] #9 : Support vector instructions.
- [X] Provide mechanisms (futex) to support blocking locks and other thread synchronisation primitives.
In case there are someone who wants to play with the reference implementation, the current implementation will be branched and the current specification will be forked into another repository.
spec-2