general-issue-tracker issues: https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues

**Issue 31: GC: Does "liveness" of objects really need to be defined?**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/31 (2015-04-15, John Zhang)

*Created by: wks*
Currently the Mu spec defines "live object" as being reachable from roots.
An alternative definition is to define heap objects as *always live*, but implementations can "*cheat*". We define:
A memory location has a **lifetime**. An internal reference (iref) to a memory location is **valid** as long as the memory location's lifetime has not expired. Specifically,
* The lifetime of a memory location in the *heap* begins when the `NEW` or `NEWHYBRID` that allocates the heap object is executed. It **never expires**.
* The lifetime of a memory location in the *global memory* begins when the bundle that defines it is loaded. It never expires.
* The lifetime of a memory location in the *stack* begins when the `ALLOCA` or `ALLOCAHYBRID` that allocates the stack cell is executed. It expires when the function activation in which the stack cell is allocated is destroyed, whether by returning, by throwing an exception, or by killing the stack.
If the Mu spec no longer defines liveness by the "root set" and transitive reachability, a Mu implementation must infer those reachability rules from other parts of the spec. I believe a carefully written spec implies the same rules as explicitly stated reachability rules.
## Examples and corner cases
**When an object is unreachable from the roots** (previously defined as "dead"): Mu can reclaim the object. Since it cannot be reached, the client and Mu IR programs will never find out that Mu is cheating about the lifetime of heap objects, which was defined as "forever". (You can kill an immortal if nobody can ever see him/her again.)
**Object pinning**: Since an address is exposed during the period of pinning, the GC must not collect the object; otherwise the native code would find out that Mu is cheating. In other words, pinning keeps an object alive.
## Weak references and finalisers
**Weak reference**: We must change the meaning of "weak references" because we no longer define "reachable". We can define it as:
* At any time, Mu **may** atomically set the values of some weak references to NULL if "after doing so, no one can prove that their referred objects can otherwise be reached". (This is not formal at all. Maybe weak references are really meaningless.)
In this way, a Mu implementation that never clears weak references is a valid implementation, but an implementation that does clear them is legal, too.
**Finaliser**: Finalisers were previously left undefined because a finaliser is not guaranteed to run. But in order to allow the Mu implementation to cheat, I define them as follows:
* Any object may have a "prevent-one-death" flag (set when a finalisable object is created).
* There is a queue maintained by Mu. (finalising queue. The client implements a finalising thread watching the queue.)
* At any time, Mu **may** atomically remove the "prevent-one-death" flag of an object and put it in the queue mentioned above, provided that "the only way to get a reference to that object is via the queue". (This does not sound very formal, either.)
In this way, a Mu implementation that never calls any finaliser is a valid implementation, but an implementation that does call finalisers is legal, too.
**Issue 32: Better support for the tagref64 type**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/32 (2015-05-21, John Zhang)

*Created by: wks*
In dynamic languages, the `tagref64` type (or other future tagged reference type variants) will be used pervasively in the language runtime.
This issue summarises potential improvements on the support for such types.
# Tagged reference constant
The Mu IR currently does not have constants of type `tagref64`, mainly because a `tagref64` may hold a reference, and non-NULL references cannot be constants. However, one possible use of the `tagref64` type is to store a NULL reference together with an `int<6>` tag. In this case, the tag determines the concrete thing the value represents (undefined, nil, null, false, true or other frequently used singleton objects), so it should be possible in Mu to create such a `tagref64` as a constant.
Proposed new syntax:
```
.const @name <@tagref64> = TR64FP @double_constant
.const @name <@tagref64> = TR64INT @int52_constant
.const @name <@tagref64> = TR64NULLTAG @int6_constant // The ref is NULL, the tag is @int6_constant
.const @double_constant <@double> = 3.14d
.const @int52_constant <@i52> = 0x123456789abcd
.const @int6_constant <@i6> = 30
```
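To make the constant forms above concrete, here is a toy NaN-boxing sketch in C. The bit layout (a quiet-NaN prefix with the tag in the low bits) and all function names are invented for this illustration; this is *not* the Mu spec's actual `tagref64` encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy layout (an assumption, for illustration only): patterns whose sign,
 * exponent and quiet bit are all ones are reserved for "NULL ref + tag". */
#define REF_PREFIX 0xFFF8000000000000ULL

static uint64_t tr64_fp(double d) {   /* TR64FP: non-NaN doubles box as their own bits */
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

static uint64_t tr64_nulltag(unsigned tag6) { /* TR64NULLTAG: NULL ref + int<6> tag */
    return REF_PREFIX | (uint64_t)(tag6 & 0x3F);
}

static int tr64_is_ref(uint64_t b) {  /* does the pattern fall in the reserved space? */
    return (b & REF_PREFIX) == REF_PREFIX;
}

static unsigned tr64_tag(uint64_t b) { /* extract the 6-bit tag */
    return (unsigned)(b & 0x3F);
}

static double tr64_to_fp(uint64_t b) { /* inverse of TR64FP: a plain bitcast */
    double d;
    memcpy(&d, &b, sizeof d);
    return d;
}
```

A real encoding must also decide what to do with genuine NaN doubles, whose bit patterns would collide with the reserved space; that is exactly why NaN canonicalisation comes up in the discussion of `EQ` below.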
# Tagged reference equality
Comparing floating-point numbers bit by bit is not equivalent to IEEE 754's definition of "equality". However, when two `tagref64` values both hold integers or references+tags, the result is deterministic.
In dynamic languages, such comparisons can quickly determine whether two tagged references have the same type (identified by the tag part) and refer to the same object.
Proposed semantics of the `EQ` comparison between `tagref64` values:
The result of the `EQ` comparison instruction between `v1` and `v2` is 1 (true) if and only if one of the following holds:
* Both hold `double` values, and
  * neither is a NaN and both have the same bit-wise representation, or
  * both are NaNs and they happen to have the same bit-wise representation after being converted to `tagref64`.
* Both hold `int<52>` values and they are bit-wise equal.
* Both hold references, and
  * their references refer to the same object or both are NULL, and
  * their `int<6>` tags are bit-wise equal.
The `NE` instruction returns the opposite result of `EQ`.
> NOTE: `tagref64` uses the NaN space of `double`. Real NaN `double` values may lose their precise bit-wise representation when converted to `tagref64`. So comparing two `tagref64` values that both hold NaNs has an unspecified result.
*Alternative possibility*: Require Mu to canonicalise all NaNs to one unique bit-wise representation. In this way, all NaNs compare equal when comparing `tagref64` values bit by bit.
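The canonicalisation alternative can be sketched in a few lines of C. This is an illustration only: the helper names are invented, and the sketch collapses every NaN to the platform's `NAN` bit pattern before boxing, so that a raw 64-bit comparison then implements `EQ` for `double`-holding values.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

static uint64_t double_bits(double d) {
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

/* Canonicalise on the way in: every NaN payload becomes the same pattern. */
static uint64_t tr64_box_fp(double d) {
    if (isnan(d)) d = NAN;   /* collapse all NaN payloads into one */
    return double_bits(d);
}

/* With canonical NaNs, plain bit-wise comparison suffices. */
static int tr64_eq(uint64_t a, uint64_t b) {
    return a == b;
}
```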
# Default values of `tagref64` types.
Currently the default value of `tagref64` (all zero bits; all newly allocated memory, whether heap, stack or global, holds all zero bits) is +0.0 as a `double` value. In this representation, every `tagref64` value that holds a `double` is bit-wise equal to its real `double` representation, so converting a `tagref64` to `double` is trivial: just do a bitcast.
However, languages usually define the values of uninitialised variables/fields as null-like values: `undefined` in JS, `nil` in Lua, `null` in Java. There should be an option to make the all-zero bit pattern 00000000..00 represent these null-like types.
There could be a flag to determine the zero value of a `tagref64` type. The proposed syntax is:
```
.typedef @tr64_with_fp_default = tagref64 <DEF_FP(3.14d)> // All 0s represents double value 3.14d
.typedef @tr64_with_ref_default = tagref64 <DEF_REF(0x5a)> // All 0s represents NULL ref with 0x5a as tag.
.typedef @tr64_with_int_default = tagref64 <DEF_INT(0x55aa55aa55aa5)> // All 0s represents integer 0x55aa55aa55aa5.
.typedef @tr64_as_current = tagref64 <DEF_FP(0.0d)> // All 0s represents double value 0.0d, which is the same as the current `tagref64`.
```
The kind of default is static metadata, so the garbage collector can identify it.
This can be implemented by applying an XOR mask on the value after encoding to `tagref64` and before decoding an existing `tagref64`.
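As a sketch of the XOR-mask idea (toy code; the tag-boxing itself is elided, so the example works directly on the raw 64-bit pattern of a `double`, as in the `DEF_FP(3.14d)` case): the mask is chosen as the encoding of the declared default, so all-zero memory decodes to that default, and every store/load pair still round-trips.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t bits_of(double d)     { uint64_t b; memcpy(&b, &d, sizeof b); return b; }
static double   double_of(uint64_t b) { double d;   memcpy(&d, &b, sizeof d); return d; }

/* The mask is the encoding of the declared default, e.g. bits_of(3.14)
 * for DEF_FP(3.14d).  Encoding XORs after boxing; decoding XORs before
 * unboxing.  Zeroed memory therefore decodes to the default. */
static uint64_t tr64_store_fp(double v, uint64_t mask) {
    return bits_of(v) ^ mask;
}
static double tr64_load_fp(uint64_t stored, uint64_t mask) {
    return double_of(stored ^ mask);
}
```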
**Issue 33: Impossible states in full-state frame making (OSR)**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/33 (2015-06-18, John Zhang)

*Created by: wks*
There was an earlier proposal about OSR: https://github.com/microvm/microvm-meta/issues/5
In our recent discussions, we decided that:
1. the state of a frame is the PC and a set of local variable values.
2. It should only be possible to continue from designated "OSR points" or from the beginning of a basic block, rather than from an arbitrary point in the code. This reduces the compiler's burden in generating stack maps.
3. The client supplies an arbitrary subset of local variables and their values, and
1. if a variable is supplied but is never used, it has no effect on the execution and is simply ignored.
2. if a variable is not supplied but is used later, it gives undefined behaviour.
But this leads to a problem: the supplied state may never be reproducible from normal execution. For example:
```
.funcsig @foo_sig = @i64 (@i64 @i64)
.funcdef @foo VERSION @foo_v1 <@foo_sig> (%a %b) {
%entry:
%x = MUL <@i64> %a %b
%y = ADD <@i64> %x @i64_0 // The rhs is the constant 0.
%trap = TRAP <@void>
// OSR continues here
CALL @print (%x)
CALL @print (%y)
RET <@i64> %y
}
```
Assume we perform an OSR and construct a frame which continues *after* the `%trap` instruction with local variable values: `%a = 6; %b = 9; %x = 42; %y = 54`. Then the value `%x = 42` is impossible.
But the code generator may consider "adding zero" a no-op and thus generate machine code that aliases the registers of `%x` and `%y`. For example:
```
foo:
push rbx ; save callee-saved register
mov rbx, rdi ; rbx = %a
imul rbx, rsi ; rbx = %a * %b; this register now holds both %x and %y
mov rdi, rbx ; prepare to call @print (%x)
call print
mov rdi, rbx ; prepare to call @print (%y)
call print
pop rbx
ret
```
Then it is impossible to create the state mentioned above. This implies that either
1. we require that such state construction must be possible and **require the code generator not to generate the code like above**, or
2. we further **restrict our API** on frame state construction.
**Issue 35: Alternative LISP-like Mu IR format**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/35 (2015-06-22, John Zhang)

*Created by: wks*
Problem: Mu IR needs a parser, but constructing a parser is tedious, and parser generators pull in additional dependencies.
Solution: Use a simplistic syntax based on LISP.
Example:
```scheme
(typedef @i32 int 32)
(typedef @float float)
(typedef @void void)
(typedef @refvoid ref @void)
(typedef @foo struct @i32 @i64 @float @double @refvoid)
(funcsig @f_sig @i32 (@i32 @i32))
(const @FORTY_TWO @i32 42)
(const @DOUBLE_FORTY_TWO @double 42.0d)
(const @SOME_STRUCT_CONST @some_struct @const1 @const2 @const3)
(const @NULLREF @refvoid NULL)
(global @errno @i32)
(funcdecl @write @write_sig)
(funcdef @write @write_v1 @write_sig (%p0 %p1 %p2)
(basic-block %entry
(inst %a (ADD @i32 %p0 %p1))
(inst %b (CALL @sig @callee (%arg1 %arg2 %arg3) (exc %nor %exc) (keepalive %v1 %v2 %v3)))
)
(basic-block %nor
(inst _ (SUB @i32 %p0 %p2)) ; unnamed instruction
(inst _ (BRANCH %exit))
)
(basic-block %exc
(inst _ (TRAP @void))
)
(basic-block %exit
(inst _ (@uvm.thread_exit)) ; COMMINST is no longer necessary because the syntax is already dynamic
)
)
```
**How would this benefit the Mu implementer?** The parser can be written by hand in very few lines of code. This is convenient for implementation languages with fewer capabilities (such as C, which does not handle complex type hierarchies easily).
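To back the "very few lines of code" claim, here is a deliberately tiny hand-written S-expression reader in C. All names are invented for this sketch, and it ignores details a real parser would need (comments starting with `;`, string literals, error reporting):

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct Sexp {
    char *atom;          /* non-NULL for atoms, NULL for lists */
    struct Sexp **kids;  /* children, for lists */
    int nkids;
} Sexp;

static const char *skip_ws(const char *s) {
    while (isspace((unsigned char)*s)) s++;
    return s;
}

/* Parses one expression starting at *sp; advances *sp past it. */
static Sexp *parse(const char **sp) {
    const char *s = skip_ws(*sp);
    Sexp *e = calloc(1, sizeof *e);
    if (*s == '(') {                      /* list: parse children until ')' */
        s++;
        while (*(s = skip_ws(s)) != '\0' && *s != ')') {
            e->kids = realloc(e->kids, (e->nkids + 1) * sizeof *e->kids);
            *sp = s;
            e->kids[e->nkids++] = parse(sp);
            s = *sp;
        }
        if (*s == ')') s++;
    } else {                              /* atom: up to whitespace or paren */
        const char *start = s;
        while (*s && !isspace((unsigned char)*s) && *s != '(' && *s != ')') s++;
        size_t n = (size_t)(s - start);
        e->atom = malloc(n + 1);
        memcpy(e->atom, start, n);
        e->atom[n] = '\0';
    }
    *sp = s;
    return e;
}
```

For example, `parse` applied to `"(typedef @i32 int 32)"` yields a list of four atoms; a Mu IR loader would then dispatch on the first atom.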
**How would this benefit client implementers?** The code generator can be more typed (using structured nested lists), rather than constructing arbitrary strings (using string formatting).
**Binary format?** There can be a simple and direct mapping between the text format and the binary format. For example, atoms can be encoded as hash codes, and a list can be encoded as a type, a length and a list of values. The Mu spec then no longer needs to define a text format and a binary format separately.
Problems?
* Does not look like assembly.
* May be less readable than the current text format without aggressive pretty-printing.
* Extra validation must be performed by the parser. (Really? The Mu micro VM is not required to detect any errors; any error is allowed to cause undefined behaviour.)
**Issue 40: Mu Client Interface as C Binding**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/40 (2015-08-21, John Zhang)

*Created by: wks*
The current API is expressed in a language-neutral form, and it is the implementation that decides how to implement such an interface. Programmers still need to resort to implementation-specific interfaces to actually use a particular Mu implementation.
Since C is so widely used as a system programming language, the Mu client interface (a.k.a. the API) should be expressed as data types and function calls in the C programming language. Even if the client is not written in C, its language usually still has a C FFI.
# The API in C
Resources in Mu are exposed as opaque types which have reference semantics: they can be copied and still refer to the same resource.
* `mu_micro_vm_t`: a reference to a Mu micro VM instance.
* `mu_client_agent_t`: a reference to a client agent.
* `mu_handle_t`: a handle to a value in the Mu type system exposed to the client.
Messages are C functions. Like in JNI, they are contained in a struct: `typedef struct mu_api_msgs {...} mu_api_msgs_t;`. In this way, a client written in C does not need to link against any library at compile time. The reason is that, for a Mu micro VM implemented in a higher-level language (like the reference implementation in Scala), the binding of the callable C functions is generated very late, even later than load time, and has no access to the native loader.
For example, assume there is a `mu_api_msgs_t* msgs` defined:
```c
mu_client_agent_t ca = ...
char buf[999999];
int sz;
// load file into buf
msgs->load_bundle(ca, buf, sz); // Load a bundle
// Putting C values into Mu
mu_handle_t h1 = msgs->put_schar(ca, 127);
mu_handle_t h2 = msgs->put_sshort(ca, 32767);
mu_handle_t h3 = msgs->put_sint(ca, 42);
mu_handle_t h4 = msgs->put_slong(ca, 42);
mu_handle_t h5 = msgs->put_slonglong(ca, 999999999999999);
// Converting Mu values to C
int v3 = msgs->to_sint(ca, h3);
unsigned long v4 = msgs->to_ulong(ca, h4); // just treat the int as unsigned
```
Mu-level flags are C preprocessor macros. They have type int.
```c
msgs->store(SEQ_CST, hLoc, hNewVal); // SEQ_CST is a macro
```
Callbacks, including the trap handler and undefined function handler, have defined signatures:
```c
typedef mu_trap_return_status_t (*mu_trap_handler_t)(
    mu_client_agent_t ca,
    mu_handle_t stack,
    mu_handle_t thread,
    int watchpoint_id,
    mu_handle_t *new_stack,
    mu_handle_t *data_passed,
    mu_handle_t *new_exception,
    mu_api_msgs_t *msgs,
    void *user_data);
typedef void (*mu_undefined_function_handler_t)(
    mu_micro_vm_t micro_vm,
    int function_id,
    mu_api_msgs_t *msgs,
    void *user_data);
```
These functions are registered via the `msgs->register_trap_handler` and `msgs->register_undefined_function_handler` API messages. In their parameters, the `user_data` is an arbitrary pointer provided by the client in an implementation-specific manner (see below).
# Implementation-defined behaviours
Some aspects of the C binding are implementation-specified. They include:
* How to create a Mu micro VM? Options are:
1. The C executable creates the Mu instance.
2. Mu loads the C dynamic library.
3. Mu starts separately and C connects to the existing instance in the same process.
4. C connects to a Mu instance in a different process, or a different machine.
* Options in creating Mu instances. Options are:
1. Heap size. Specifying a heap size means the client determines the heap size rather than Mu automatically deciding its own storage.
2. Global data space size. Setting this value means the global data may have its own storage; an actual implementation could use the heap space, too.
3. Stack size. Similarly, this, too, is implementation-specific.
* What happens during initialisation?
1. Mu calls a C function to initialise the client, and the client provides a `void*` to Mu for the client's own context. (note: in this case, it is Mu loading C rather than C creating Mu.)
2. C creates a Mu instance, and sets its `void*` user data in a proprietary API message.
# Open questions
* Should we allow each Mu implementation to have its own "namespace"? The opaque types (`mu_micro_vm_t` and so on) are opaque, but different implementations may have different representations. The current C binding design forbids one C program from working with more than one Mu implementation (though it is okay to work with more than one *instance* of the same implementation).
* JNI does not solve this problem, either.

**Issue 48: Mu IR rewriting library**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/48 (2015-10-29, John Zhang)

*Created by: wks*
Mu aims to be minimal, but such minimalism has made the construction of Mu clients hard. A client-level library can ease the client's job by accepting a slightly higher-level variant of the Mu IR and translating that higher-level IR to actual Mu IR code (and/or HAIL scripts and/or subsequent API calls).
This issue tracks tasks that should be done at this layer.
* Pre-SSA to SSA converter. #44
- Writing Mu IR in the SSA form is hard, and the goto-with-values form is even harder. The library should automatically convert ordinary CFGs into the goto-with-values form using well-known algorithms.
* Platform-dependent constant values. #47
- Some ahead-of-time clients (notably C and other "traditional" languages) expose platform details to the programmer as compile-time constants, but binding those values too early makes the object code non-portable. The rewriter should help the client determine these values so that the client compiler can be strictly ahead-of-time.
* Merge Mu IR and HAIL: #29 #46
- Unlike Mu itself, this library need not be minimal. Integrating both languages will make the client's job easier.
* Annotations
- This will allow clients to attach arbitrary information to the Mu IR code, which can help the client introspect the program at run time.
- Note that if we need to use system debuggers (such as GDB), then these annotations need to go through the micro VM itself because it is the micro VM's responsibility to generate object codes (including the DWARF debug info).
**Issue 49: Stack frame iterator**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/49 (2015-11-13, John Zhang)

*Created by: wks*
# Problem
The current stack introspection/OSR API is inefficient.
It selects a frame by a number. For example:
```
ctx->cur_func(ctx, stack, 1);
ctx->cur_func_ver(ctx, stack, 2);
ctx->cur_inst(ctx, stack, 3);
ctx->dump_keepalives(ctx, stack, 4, &values);
```
A real-world implementation may need to unwind the stack from the top, one frame at a time, until it reaches the requested frame. If the client needs to traverse stacks with many frames, the O(n^2) complexity may become a performance bottleneck. One application is to use the client API (or the equivalent Mu IR (common) instructions) to generate the stack trace for exception handling.
This link shows a real-world Java stack trace: https://ptrthomas.wordpress.com/2006/06/06/java-call-stack-from-http-upto-jdbc-as-a-picture/ Or click here: [jtrac-callstack.pdf](https://github.com/microvm/microvm-meta/files/24885/jtrac-callstack.pdf)
EDIT: well, this is a call graph, not a stack trace. But imagine something goes wrong in JDBC...
# Desired API
The API should provide a "frame cursor" data type, which refers to a frame in a stack. A cursor can be created for a stack and then iterated through the stack's frames from top to bottom.
The introspection API `cur_func`, `cur_func_ver`, `cur_inst` and `dump_keepalives` will work on this "cursor" instead of a stack and a number.
The OSR API `pop_frame`, can, instead of popping one frame at a time, pop all frames above a particular "cursor". [Preliminary experiments](https://github.com/microvm/liblushan/blob/master/src/test_remote_stack_chop.c) show that this is possible with C programs and [libunwind](http://www.nongnu.org/libunwind/).
## The "frame cursor" type
The "frame cursor" type shall be an opaque reference to ~~a Mu frame~~ a cursor. The cursor holds the context of a "current frame", and can move down to the parent frame.
It must be platform-independent.
It could potentially be large (given the number of registers in a CPU). Therefore it is desirable to be **mutable** – making a fresh copy for each frame would be costly (Haskell programmers may disagree).
There are some subtle interactions between the cursor and the GC. The GC may modify references on the stack, but the API must hide this detail from the client. So the API should not expose raw CPU registers or stack addresses to the client.
The cursor may be allocated on the Mu heap, but also may not.
The cursor is only valid while the stack remains unbound. As soon as the stack is bound again, the stack may change in arbitrary ways, and the cursor is invalidated.
So I can think of some possible solutions:
1. ~~Create a new type `frameref`, like our existing `threadref` and `stackref`.~~ Create a new type `framecursor`. It has reference semantics: it refers to a mutable structure internally held by the Mu VM.
* pro: A dedicated opaque type, the cleanest model.
* con: A new primitive type, pointing to a large structure, just for introspection? Well... maybe not that bad.
* choices: Is it managed by the GC? GC is the easiest way, but we may not be able to print stack trace for OutOfMemoryException. (really? I am not sure) Alternatively it may be required to be closed explicitly.
2. Use `ref<void>` for the "cursor" type. Its content is allocated on the heap, opaque to the client, and may be platform-specific. When invalidated, the object remains live, but the content becomes invalid.
* pro: No new types introduced
* con: This implies the data content is on the Mu heap.
* con: The GC must have special knowledge of such a heap object, which is not a regular Mu object.
3. Use `ptr<void>`. Similar to `ref<void>`, but implies it is not GC-ed.
* pro, con: same as `ref<void>`
## Example Mu API
This example prints a stack trace on Mu.
```c
// This trap handler prints the stack trace.
void stack_printing_trap_handler(
MuCtx *ctx, // Equivalent to JNIEnv
MuThreadRefValue thread, // The current thread
MuStackRefValue stack, // The current stack
int wpid, // Watchpoint ID. 0 for ordinary traps.
    MuTrapHandlerResult *result, // How should the Mu thread resume?
MuStackRefValue *new_stack, // Which stack shall the Mu thread bind to? Usually the old stack.
MuValue *values, int *nvalues, // What values shall be passed on the stack?
MuRefValue *exception, // What exception shall be thrown on that stack?
MuCPtr userdata) { // Client-specific data
ClientCompiler *clientCompiler = (ClientCompiler*) userdata; // The client-specific compiler.
// Get a cursor to the top of the stack.
    MuFrameCursorValue cursor = ctx->get_stack_cursor(ctx, stack);
    // Iterate through the frames. (Assumption: next_frame is a message that
    // moves the cursor down to the parent frame; the name is invented here.)
    int func_id;
    while ((func_id = ctx->cur_func(ctx, cursor)) != ID_OF_MY_STACK_BOTTOM_FUNC) {
        if (func_id == 0) { // func_id == 0 means the frame is a native frame.
            printf("This frame is native\n");
        } else { // It is a Mu frame.
            // Get the ID of the current Mu instruction.
            int inst_id = ctx->cur_inst(ctx, cursor);
            // The client looks up the source-level information.
            SourcePosition sp = clientCompiler->getSourcePosition(inst_id);
            printf("File: %s, Function: %s, Line: %d, Column: %d\n",
                sp.file, sp.func, sp.line, sp.column);
        }
        ctx->next_frame(ctx, cursor); // Move the cursor down one frame.
    }
printf("End of stack trace\n");
// Close the cursor. (Alternatively let the GC close the cursor.)
ctx->close_cursor(ctx, cursor);
// We want to return to the old stack and continue normally,
*new_stack = stack;
// but do not pass any values.
*nvalues = 0;
    // Continue normally (not throwing an exception), passing 0 values.
    *result = MU_REBIND_PASS_VALUES;
}
```
# Existing approaches
[libunwind](http://www.nongnu.org/libunwind/) is a portable way to walk stack frames in the C language. There are different implementations on different platforms (OSX has its own implementation), but the API is the same.
`unw_getcontext` captures the current machine context into an `unw_context_t` structure. `unw_init_local` creates a `unw_cursor_t` from the context. The user can then call `unw_step` on the cursor to step through stack frames, and `unw_get_reg` to get the value of a machine register from the cursor. The cursor keeps the state of the registers (usually only the callee-saved registers can be recovered) at the resumption points (return addresses) of the frames.
Example:
```c
#define UNW_LOCAL_ONLY
#include <libunwind.h>
void show_backtrace (void) {
unw_cursor_t cursor; unw_context_t uc;
unw_word_t ip, sp;
unw_getcontext(&uc);
unw_init_local(&cursor, &uc);
while (unw_step(&cursor) > 0) {
unw_get_reg(&cursor, UNW_REG_IP, &ip);
unw_get_reg(&cursor, UNW_REG_SP, &sp);
printf ("ip = %lx, sp = %lx\n", (long) ip, (long) sp);
}
}
```
**Issue 51: WebKit's B3 JIT compiler**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/51 (2016-02-19, John Zhang)

*Created by: wks*
The B3 JIT compiler has received much attention recently.
I started a Wiki page: https://github.com/microvm/microvm-meta/wiki/B3-JIT-%28WebKit%29
Let's summarise B3 and its influence on Mu in the Wiki.
**Issue 1: Must specify protocol for client-uvm sharing of signal handlers**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/1 (2016-06-17, John Zhang)

*Created by: mn200*
As per discussion in meeting on 1 August 2014, we think that the uVM should be the only entity making the `signal` system call. Other entities (*i.e.*, clients in practice) can register signal-handler-like objects with the uVM, which promises to pass on signals generated by their code.
This needs to be documented in the spec-wiki, perhaps in a section about the client-uVM API.

**Issue 2: Multiple versions of the same function**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/2 (2016-06-17, John Zhang)

*Created by: wks*
This issue addresses the representation of multiple versions of a function due to function redefinition.
Affects: the MicroVM reference implementation, the MicroVM-Client interface.
Does not affect: the MicroVM IR language
## Background
In the current microvm-refimpl, there are several classes whose relations are as follows:
1. Function: represents a callable function. One per function ID
2. CFG: a concrete function definition. Has many basic blocks, each of which has many instructions.
3. A Function has zero or one CFG: if zero, the function is declared but not defined; if one, it refers to the most recent version of the function definition.
4. Stack: represents the contexts of nested function activations.
5. Frame: the context of a function activation
6. A Stack has many Frames
7. A Frame has one Function: the function this frame is created for.
## Problem
After function re-definition, a new CFG is created and `Function.cfg` is set to the new CFG. The old CFG is discarded. This is problematic because:
a. Function redefinition only affects future invocations, but there are existing activations deep in the stack.
b. A frame corresponds to a concrete CFG, not an abstract callable Function. When a Function is redefined, the CFG of an existing Frame remains the same (it is not redefined).
c. When a trap in an old version of a function is triggered, the client will introspect the frame, which requires the metadata of the old version of the function, i.e. the CFG. So the old CFG cannot be disposed of.
## Example
Assume we have a naive Fibonacci number function:
```
int executeCount = 0;
int fib(int n) { // version 1
if (executeCount++ == 1024) { trap(keepalive=[n]); }
if (n<=1) return n; else return fib(n-1) + fib(n-2);
}
```
In the MicroVM, a dictionary is kept so that
```
functions = { "fib" : <version 1 of fib> }
```
When this has been executed too many times, the trap is triggered and the client decides to redefine `fib` as follows:
```
int fib(int n) { // version 2
int a=0, b=1;
while(n--) { int tmp=a+b; a=b; b=tmp; }
return a;
}
```
And in MicroVM:
```
functions = { "fib" : <version 2 of fib> }
```
However, when the trap is triggered again (which is possible in this scenario), control goes to the client, and the client looks up `functions["fib"]` to get the metadata. The frame is still for version 1, but the client gets `<version 2 of fib>`. This will cause errors.
## Solution
Change the object relations so that:
* In 3, a Function not only has the most recent CFG, but also maintains a list of historical CFGs.
* In 7, a Frame no longer has a Function, but has a CFG, instead.
* When introspecting a frame, during trap or by other means, the client gets the concrete version of a function rather than just a function ID.
* Specify that instructions in a newer version cannot reuse the IDs of instructions in an older version (as is already the case in the refimpl), so that all instructions (especially TRAP instructions) have unique IDs through time. Each TRAP instruction can then uniquely identify the CFG it is defined in.
All function definitions, i.e. CFGs, are kept alive until their last activation has returned.
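The proposed relations can be sketched as plain data structures. The names below are invented for illustration; the point is only that a Function owns a growing list of CFG versions while each Frame pins the concrete CFG it was created from:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct CFG { int version; } CFG;

typedef struct Function {
    CFG **versions;  /* all historical CFGs, oldest first     */
    int nversions;   /* versions[nversions - 1] is the newest */
} Function;

typedef struct Frame { CFG *cfg; } Frame;  /* pins its own version */

static CFG *redefine(Function *f) {        /* install a new version */
    CFG *cfg = malloc(sizeof *cfg);
    cfg->version = f->nversions + 1;
    f->versions = realloc(f->versions, (f->nversions + 1) * sizeof *f->versions);
    f->versions[f->nversions++] = cfg;
    return cfg;
}

static CFG *newest(Function *f) {          /* used for new invocations */
    return f->versions[f->nversions - 1];
}
```

A frame created before a redefinition keeps pointing at its original CFG, so trap introspection on that frame resolves to the right version.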
## Open questions
How to identify a particular version of a function? Does the MicroVM really need an interface for getting a specific version of a CFG? The client certainly has more information than the MicroVM about the program.
**Issue 3: The client should see opaque references rather than transparent addresses**
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/3 (2016-06-17, John Zhang)

*Created by: wks*
(Discussed in 5 Aug 2014 meeting) The client may hold references to the µVM heap, but the client should see opaque references rather than raw addresses. More specifically,
1. The client may hold actual raw addresses, but it should treat them as opaque. It should access the µVM memory using the µVM-provided API and should not depend on the fact that they are addresses.
2. The client holds indices into a table maintained by the µVM (or keys into a hash table maintained by the µVM) and has to access the µVM memory through the API.
And the µVM should keep track of all references held externally.
In either case, this behaviour should be documented and implemented accordingly.
Labels: spec-2

---
**Issue #4: Should fully stick to C++11 memory model** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/4
*Created by: wks*
Currently the memory ordering primitives are copied from LLVM. As stated by LLVM langref, their memory model is not precisely defined.
> These semantics (ordering) are borrowed from Java and C++0x, but are somewhat more colloquial. If these descriptions aren’t precise enough, check those specs (see spec references in the atomics guide).
But the MicroVM should have a precise memory model. The C++11 memory model is the result of a lot of effort and we should stick to it.
Things that should be done in the MicroVM:
* Remove the "UNORDERED" memory order. It was designed by LLVM for Java, but the MicroVM will treat data races as undefined behaviour and leave the security constraints of Java to the client.
  - If a JVM can be implemented in C/C++, its memory model should also be implementable on a MicroVM with a C++11-like memory model.
* Add the "CONSUME" memory order. Define the "carries-a-dependency-to" relation in the MicroVM. Some hardware is dependency-aware.
* Define the program order (which is a total order per MicroVM thread because MicroVM does not have unspecified parameter evaluation order), the synchronisation order, the synchronises-with and the happens-before relations.
---
**Issue #5: A heavy-weighted "Frame State Construction" mechanism** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/5
*Created by: wks*
During on-stack replacement (OSR), it is often desired to save the state of a partially executed function, compile a new optimised version of the function and restore the state to the state before. How to map the old state to the new state is the job of the Client, but the µVM can provide a new mechanism to make this easier.
Frame State Construction creates a new stack frame and populates it with the state of a partially executed function. The client specifies the values of all SSA Values (at least all live values) in the function and the next µVM instruction to execute. Then the stack can be resumed.
This is a very powerful mechanism that allows the program to continue at any point in the code, but it may be difficult to implement.
p.s. I do not want to use the word "restore" because the µVM does not care about the "restore" semantics.
Example (in the C language):
```c
void inc_all(int ar[], long sz) {
int i;
for(i=0;i<sz;i++) {
cont:
ar[i] += 1;
}
}
```
But instead of starting from the beginning, I want to continue from the label "cont" with i = 100. The µVM should let me express something like:
```c
construct_frame(func=inc_all,
                next_inst="cont",
local_vars={
ar: SOME_OLD_ARRAY,
sz: SOME_OLD_VALUE,
i: 100
})
```
Similarly in µVM IR, the code should be like:
```uir
.typedef @i32 = int<32>
.typedef @i64 = int<64>
.funcdef @inc_all <void (iref<@i32> @i64)> (%ar %sz) {
%entry:
BRANCH %head
%head:
%i = PHI @i64 { %entry: 0; %body: %i2; }
%cond = SLT @i64 %i %sz
BRANCH2 %cond %body %exit
%body:
%addr = SHIFTIREF <@i32> %ar %i
%old_val = LOAD <@i32> %addr
%new_val = ADD <@i32> %old_val 1
STORE <@i32> %addr %new_val
%i2 = ADD <@i64> %i 1
BRANCH %head
%exit:
RETVOID
}
```
I should be able to let it continue with:
```c
stack.create_new_frame(func = "@inc_all",
next_inst = "%addr", // %addr is actually the instruction's name,
// i.e. the instruction that calculates %addr.
local_vals = {
"%ar": SOME_VALUE,
"%sz": SOME_VALUE2,
"%i": 100,
"%cond": 1, // true
"%addr": WHATEVER, // This will be calculated immediately
"%old_val": WHATEVER, // This will be calculated immediately
"%new_val": WHATEVER, // This will be calculated immediately
"%i2": WHATEVER, // This will be calculated immediately
})
```
**Potential challenges**
1. This needs close collaboration with the code generator, especially the register allocator. This may require a stack map (mapping which machine register or memory location each local SSA Value is stored in) at **every instruction** that can potentially be continued from.
  * solution 1: Add a dedicated "continue point" instruction at which the code generator generates a stack map. The "continue point" itself is a no-op.
  * solution 2: Upon request, re-compute the stack map for the instruction to continue from. This must match the actual function code.
2. Cannot continue before a PHI node or a LANDINGPAD. PHI depends on incoming control flow and LANDINGPAD depends on the exception.
* solution: continue **after** those instructions, instead.
**possibilities**
1. Theoretically all possible states of stack can be constructed, not just "continuing from an instruction", but also a frame that "is calling some function but has not returned", or a frame that "is trapped to the client", or a dead stack.
2. Can we preserve the state of a full stack, quit the program, re-run and re-construct the whole stack again? (persistent program state)
# Alternative solutions for OSR state preserving
## Save states in global variables.
This (problematic) approach is taken by [Lameed et al.](http://dl.acm.org/citation.cfm?id=2451541). The saved states are loaded at the beginning of a newly-compiled function.
Problems:
1. It uses global variables, which is bad for concurrency.
2. It needs to generate code for loading those global variables. Lameed et al. compile the function twice, with those loads removed in the second compilation.
## Create a partially-evaluated function
This is a functional approach, similar to the concept of "continuation" in SCHEME. The new function takes no parameter (or arbitrary parameters) and behaves like the "bottom half" of the old function.
Advantage:
1. This "continuation" is just an ordinary µVM function and does not require special mechanisms other than OSR and function definition (not even **re**-definition).
2. At least as fast as Lameed et al.'s approach. Both compile two versions of the new function: one for continuing and the other for fresh new calls.
Problems:
1. Requires compiling a one-shot function just for one continuation. The "Frame State Construction" mechanism may look heavy, but it is still lighter than compiling a new function.
2. For imperative programming languages, "continuation" may be difficult to create and may require complex control-flow analysis.
Labels: spec-2

---
**Issue #6: Document safe/unsafe memory access instructions** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/6
*Created by: mn200*
As per discussion on 22 August, we want to have two forms of every memory accessing instruction. One is "unsafe" and the machine is not required to present a corresponding abstract state to the client. One is "safe", and if a memory exception occurs, the machine commits to giving the client an appropriate picture of the abstract state at that point.
All this discussion needs documenting.

Labels: spec-2

---
**Issue #7: Choose between swap-stack with and without parameters.** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/7
*Created by: wks*
Coroutines may communicate with each other by yielding and, at the same time, passing data between stacks. There are two ways this can be done.
1. Allocate a memory region on one stack using ALLOCA and let the other stack write data into that region.
2. Let the swap-stack operation take parameters.
The first approach is currently favoured by @wks and @hosking, while [Dolan et al.](http://dl.acm.org/citation.cfm?id=2400695) proposed the second approach but did not discuss the difference between the two.
We need to choose the appropriate one.

Labels: spec-2

---
**Issue #9: Vector operations** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/9
*Created by: wks*
Some modern processors provide SIMD instructions. Using them properly can greatly increase the performance of some computations. The µVM should expose them to the user.
LLVM's approach:
* Vector types are first-class types. Most instructions which accept scalar values also accept vector values.
* Binary operations are done element-wise.
* Comparison operations are done element-wise, resulting in a vector of one-bit integers.
* Conversions can be done between integer vectors and from integer vectors to FP vectors, but not between different FP types (no fptrunc or fpext for vectors).
* The select instruction works with a vector condition and vector values, and also with a scalar condition and vector values.
* extractelement and insertelement build vectors and extract elements from them.
* shufflevector: permutes the elements of two vectors.
* Vectors also work in all other instructions that accept first-class types, including phi, ret, function calls, and as the contents of load/store.
* LLVM does not address scatter/gather.
Things to be done in the µVM
* Alignment requirements.
* Provide extra intrinsic functions to support machine-provided operations not covered by the `BinOp` µVM instruction. Example: reciprocal, exp, log, abs, ...

Labels: spec-2

---
**Issue #17: Portability vs exploiting architecture potentials.** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/17
*Created by: wks*
> For should the enemy strengthen his van, he will weaken his rear; should he strengthen his rear, he will weaken his van; should he strengthen his left, he will weaken his right; should he strengthen his right, he will weaken his left. If he sends reinforcements everywhere, he will everywhere be weak. -- Art of War, by Sun Zi
The µVM is designed to abstract over the hardware, but only acts as a thin layer of abstraction.
There are many differences between underlying hardware platforms. Scroll down to the Appendix or read https://github.com/microvm/microvm-meta/issues/16
# Solutions to differences
As mentioned in https://github.com/microvm/microvm-meta/issues/16, there are three solutions, from the weakest to the strongest.
1. Define the differences as **undefined behaviour**. The client must avoid touching those areas at all costs.
2. Define the differences as **implementation-defined behaviour**. The µVM implementation defines the behaviour and provides documentation and compile-time/run-time checkable mechanisms.
3. **Define the behaviour** in the µVM. The µVM implementation bridges the differences.
The 1st approach is the weakest. It makes it impossible to use any platform-specific features and introduces excessive checks to avoid undefined behaviour. So the µVM should lie somewhere between approaches 2 and 3.
# The µVM design goal
The µVM is having several conflicting design goals.
1. µVM is low-level. It is a thin layer over the hardware.
2. µVM is minimal. Anything that can be done efficiently by the Client should be done by the Client.
3. µVM is **portable**.
4. µVM should support high-performance VMs
5. The µVM potentially runs on resource-constrained devices.
Depending on how to interpret **portable**, there are two different implications:
1. The µVM provides compile-time or run-time flags so that the Client can generate platform-dependent µVM IR code.
2. The µVM IR code is cross-platform. The same µVM IR code runs on all platforms.
The second interpretation is more portable than the first. In general, the more portable the µVM is, the thicker the abstraction layer becomes.
Goal 4 can be interpreted in different ways:
1. Highest theoretical possible performance.
2. As high as C for computation-intensive problems.
3. As high as Java for computation-intensive problems.
4. Much higher than Python/PHP/R/<*insert your favourite scripting language here*>
5. As high as Python/PHP/R/...
6. As long as the program eventually terminates.
Depending on the concrete implementation, an average desktop/server µVM should reach between 2 and 3.
Goal 5 requires the µVM and the Client to be simple. In this case, the Client and the µVM cannot perform too much reasoning about the program. This results in sub-optimal performance of emitted code.
# Choices in the µVM
## Behaviour of UDIV/SDIV
1. UDIV/SDIV are implementation-defined, or
2. They always behave like Java (div by 0 is an exception, -0x80000000/-1 = -0x80000000)
## Supported vector sizes
Conflicts:
1. The current µVM IR is very expressive. It can express any vector types, given type T and size n: `vector <T n>`
2. Not all are supported by the architecture
So the µVM should
1. Make it **implementation-defined**, or
2. Only support **some selected vector sizes**, or
3. Support some selected vector sizes **and platform-specific vector sizes**, or
4. Support **all** `vector<T n>`, using either scalar operations or vector instructions.
Rationale:
1. The Client can **probe** the supported vector sizes and can generate code accordingly. So it is the Client's responsibility to choose vectors. This is also a kind of **specialisation** which is usually done by the high-level optimiser.
2. Choosing the appropriate vector size is a kind of **instruction selection** and should be done by the µVM in a platform-specific fashion.
For rationale 1, [DeVito et al.](http://terralang.org/pldi071-devito.pdf) used an auto-tuning technique to determine the optimal vector size for a vectorised matrix multiplication problem.
For rationale 2, [Jibaja et al.](https://01.org/node/1495) proposed adding SIMD to JavaScript, but only exposing 128-bit registers to the programmer. However, the following code is difficult for the µVM to convert to 256-bit vectors:
```
__m128 elems[N];
for (int i=0; i<N; i++) {
elems[i] = vector_add_floatX4(elems[i], constant_vector(1,2,3,4));
}
```
The reason is:
1. It requires extensive control-flow analysis to merge two adjacent 128-bit additions into one 256-bit addition.
2. 256-bit vectors have a different alignment requirement, so an array of __m128 cannot be treated as an array of __m256.
The following code, however, is easier to convert:
```
__align_to(256) __m128 elems[N];
for (int i=0; i<N; i+=2) {
elems[i] = vector_add_floatX4(elems[i], constant_vector(1,2,3,4));
elems[i+1] = vector_add_floatX4(elems[i+1], constant_vector(1,2,3,4));
}
```
However, the client, when generating such code, **is already aware of** the presence of 256-bit vectors.
## Array indexing
Conflict:
1. Array indexing seems to be platform-independent. It is "begin+index" (index can be negative).
2. It is implemented using address calculation and memory accessing, where "address" is word-sized.
Solutions
1. Use **any integer type** as the index, with sign extension (used by LLVM; see http://llvm.org/docs/LangRef.html#getelementptr-instruction), or
2. The index must be **word-sized**, and the Client must appropriately extend/truncate the index to word size (the current µVM design).
Rationale:
1. It only involves a sign extension or truncation, which the µVM can do cheaply. However,
2. Since the Client can probe the word size and generate TRUNC/SEXT instructions, it should be done by the Client.
# Portable µVM IR as a subset of the µVM
If we can define **a subset of the µVM** with **defined behaviours** and **reasonable** performance (perhaps calling it "µVM Mobile Edition"), then some simplistic µVM Clients (I assume there will soon be many such Clients) can generate portable µVM IR code.
For implementations that seek ultimate performance, the µVM implementation can implement many **machine-dependent instructions** as a superset of such a "portable" IR. (Call it "µVM Enterprise Edition"?)
Alternatively, we can define that "subset" as "The µVM IR ®" and treat the implementation-dependent extensions as "extended µVM".
# Appendix: Examples of differences
There are differences between architectures. For example:
The **word length** is different. Some operations (including array indexing) depend on the word length.
The **integer division** instruction behaves differently among architectures, especially in "division by zero" and "signed integer overflow".
Not all operations on all types perform equally well. 64-bit integer operations perform significantly worse (if they are possible at all) than their 32-bit counterparts on 32-bit machines.
The supported **vector length** of vector instructions is different. This may vary from 64 bits (ARM) up to 512 bits (x86 with AVX) with 128 bits widely supported.
Some instructions are "optional" and the functionality must be software-implemented on those architectures. For example, **UDIV** and **SDIV** in ARMv7.
Although most architectures provide all the operations (binary arithmetic and logical, comparison and conversion) mentioned in LLVM, the concrete instructions do not work on all data sizes. For example, **conversion from floating point to integer** can only convert between certain data types (float to 32-bit int or 64-bit int, but not other int types). Not all **vector operations** work for all vector types.
The address size (and indexes) of memory (and array) operations is different among architectures.
The data types that can be **atomically loaded/stored** differ. This affects how the Client is going to implement mutex locks.
Alignment requirements for **vector load/store** operations differ.
In the native interface/C interface/foreign function interface(FFI)
* The **pointer size** depends on the architecture. This also affects **function pointers** for external C functions.
* The **available system calls** depend on the operating system, the ABI and the processor. The Client must handle the differences between operating systems.
Other issues mentioned in https://github.com/microvm/microvm-meta/issues/16
# Current extension mechanism
An **intrinsic function** (or IFUNC for short) is something that has a form similar to a function call, but is treated by the µVM as a regular instruction. The simplest form takes only the name (or ID) of the IFUNC:
```
%result = ICALL @uvm.thread_exit
```
The most complete form takes type arguments, value arguments and an exceptional destination:
```
%result = ICALL @uvm.do_something_weird <@T1 @T2 @T3> (%val1 %val2 %val3) EXC %normal %exceptional
```
Theoretically all binary operations can be defined as IFUNCs. For example:
```
%result = ICALL @uvm.math.sdiv <int<64>> (%lhs %rhs) EXC %normal %div_by_zero_handler
```
---
**Issue #19: Code generator prototyping strategies** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/19
*Created by: eliotmoss*
As we start to ramp up in Amherst, we're pondering how best to get things rolling (i.e., what a good starting point is) and where we want to end up.
**Initial prototype thought**
For our initial prototype we are thinking about using QEMU (locally pronounced KEE-mew), the Quick Emulator. QEMU is a whole-machine emulator (but can also be configured to run as user-mode only) that supports emulating one hardware ISA on another one (and same to same, of course). Being a whole-machine emulator, it deals with devices, interrupts, memory mapping, etc., but the part of greatest interest to us is that it takes guest ISA code blocks and dynamically translates them to host machine code, maintaining the translations in a code cache. Thus parts of its design will be great for dynamic and adaptive code generation. QEMU works by translating guest code into QEMU IR, which is then passed to a target code generator. That code generator does a useful level of local register allocation and local optimizations (hence the Quick in Quick Emulator).
For our case, μVM IR would be the guest ISA and our target would be the host ISA that QEMU targets. A part of how QEMU does things is that the guest register file is represented as a memory data structure and (to a first approximation) the host registers are used for local register allocation. To a finer approximation, I **believe** QEMU can bind some guest registers to host registers, useful where the host register set is larger. For μVM IR the "registers" are the SSA values, and we can probably get some mileage from the SSA form in telling the QEMU register allocator helpful things about when things "die", etc.
Downsides of this prototype approach include:
* Dependence on QEMU
* Code quality may be limited, particularly in that it will tend to push all variables to memory as it crosses from block to block
* It is not clear what the situation is with QEMU and multithreading, though it **may** support concurrent execution
Upsides:
* Fast path to getting native code going
* Useful optimizations, especially for a dynamic environment
* Back-end architecture design that we might want to use as a guide in our own work
* Already implements major platforms of interest
**Eventual system design thought**
Down the line we may want something like QEMU's code generation design, with more provision for (optional) higher levels of optimization and register allocation across larger scopes (that is, for whole functions, not just blocks/traces). But their internal IR, code generators, etc., might be a very useful base, as well as having a notion of the code cache design and so forth.

---
**Issue #20: Client languages in play** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/20
*Created by: eliotmoss*
Maybe this is also being discussed elsewhere -- if so, we can close this and move the discussion there. Anyway, on the US side we are wondering about the specific suite of client languages being worked on. We have at least one student interested in doing a mapping from Java bytecode to μVM IR, and we also wonder about some of the dynamic languages (Javascript, maybe? I should perhaps come back and edit this after this week's meeting with the students). What about C or C--? For C we could look at lcc, a little C compiler originally designed for a compiler class, I think, but it accepts some standard kind of C, as I recall. gcc is more of a bear, but ought to be on the agenda at some point, no?

---
**Issue #23: Micro VM interface changes precipitated by having C as a client** (2016-06-17, John Zhang)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/23
*Created by: eliotmoss*
While there will be a bunch of carefully worked out details that need to follow, I thought I would get the ball rolling with a quick summary / highlights of the Modula-3 approach. (See https://www.cs.purdue.edu/homes/hosking/m3/reference/m3.html for a Modula-3 reference.)
First, M3 has ordinary *traced* references and also **untraced** ones. REF T is the type of a traced reference to a T, while UNTRACED REF T is the type of an untraced one. Traced vs. untraced is a concept distinct from safe vs. unsafe. Safe use of traced and untraced references keeps them segregated. Untraced objects come from an explicitly managed storage area. You can use NEW on an untraced reference type to allocate in such an area.
Certain regions of code (interfaces or modules -- probably a bundle is the corresponding thing for the Micro VM) can be marked UNSAFE. Unsafe code can do additional things. These are ones that I remember:
* Cast an expression to an arbitrary type (of the same size in bits) - this clearly allows interconverting between traced and untraced pointers, but also other things; this is called LOOPHOLE(e, T).
* Address arithmetic (Digression: In M3 REFANY is the type of a traced reference to any object, which in safe code requires a dynamically type-checked downcast to turn into a REF T and do more interesting things. The corresponding untraced type is ADDRESS, and it has arithmetic operators.)
* Up/down cast without checking
* Free an untraced reference's referent (the function is called DISPOSE)
Safe code must use traced/untraced references only in specific ways, described in the language reference.
We may need to add to this sort of design to allow short unsafe code regions to manipulate ordinary (traced) objects safely in the presence of (say) concurrent GC. Also, an untraced version of iref would make sense for the Micro VM.