general-issue-tracker issues — https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues (exported 2018-06-28)

Issue 73: License — John Zhang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/73

We have decided to use the Apache 2.0 license for all our code.
The following two commits put forward a draft for the license file.

- verbatim: mu/mu-perf-benchmarks@47045501
- added ANU copyright: mu/mu-perf-benchmarks@76b1b7ca

If there is no problem, I will merge the branch into master, and you can all put a copy of the LICENSE file into your projects.

Issue 72: Alternative serialisable format (such as JSON/YAML/XML/...) — Kunshan Wang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/72

I am glad to see the [mu-tool-compiler](https://gitlab.anu.edu.au/mu/mu-tool-compiler) project existing.
I have been considering an alternative serialisable and human-readable format for the current text-based IR. In fact, the text-based Mu IR is something I am unhappy with. It has various problems.
- It requires a dedicated parser, which has to be implemented by hand.
- When new features are added, the grammar changes, and the parser needs to be modified.
- The text-based IR is confined by aesthetic considerations, and has many inconsistencies. For example:
  - The reason `.funcdef ... <@sig>` has a signature is that it also works as syntactic sugar: a human writer only needs to write a `.funcdef` to create both a function and its first version.
  - By convention, types and signatures in Mu instructions are in angular brackets, such as `ADD <@i32> %x %y`. But instructions may take more than types and signatures. One example is `GETFIELDIREF`, which has an integer literal argument. The current form `GETFIELDIREF <@type 3> %ref` is ugly: the number `3` looks out of place.
I suggest there should be a Mu IR format based on a well-known structured data format, such as JSON, YAML or XML.
Related work:
- LLVM yaml2obj: http://llvm.org/docs/yaml2obj.html
Potential advantages:
- There are mature open-source parsers available.
- Easy to extend.
- Easy to specify (in mu-spec).
For example, if we want to add an externally-usable symbol to an exposed function, we only need to add a property rather than redesign the grammar:
```yaml
name: foo
func: func
callconv: DEFAULT
cookie: cookie
symbol: externally_visible_symbol # This is an added property
```
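To illustrate the "mature parsers" point, the same record can be consumed with nothing but the standard library once rendered in an equivalent format such as JSON. A minimal Python sketch (the field names are the hypothetical ones from the snippet above, not an agreed schema):

```python
import json

# The exposed-function record above, rendered as JSON.
record = json.loads("""
{
  "name": "foo",
  "func": "func",
  "callconv": "DEFAULT",
  "cookie": "cookie",
  "symbol": "externally_visible_symbol"
}
""")

# The added property is just another dictionary key; no grammar change,
# no parser change.
print(record["symbol"])    # externally_visible_symbol
print("symbol" in record)  # True
```

An older consumer that does not know about `symbol` simply ignores the extra key, which is exactly the extensibility being argued for.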
It is easy to specify because we can define the IR as an (abstract) object tree with properties, similar to [how the HTML5 DOM is defined](https://html.spec.whatwg.org/multipage/dom.html#elements-in-the-dom).
There are also potential disadvantages:
- More verbose
- Less human-readable than the current text form, but human readability should not be the primary concern.
XML example:
```xml
<bundle>
<type id="i8" ctor="int" length="8" /> <!-- note: XML ID is actually a name -->
<type id="i32" ctor="int" length="32" />
<type id="i64" ctor="int" length="64" />
<type id="pi8" ctor="uptr" type="i8" />
<type id="ppi8" ctor="uptr" type="pi8" />
  <type id="refi32" ctor="ref" type="i32" />
  <funcsig id="mainsig">
<paramty type="i32" />
<paramty type="ppi8" />
<retty type="i32" />
</funcsig>
<const id="I32_42" type="i32" value="42" />
<const id="I64_0" type="i64" value="0" />
<global id="errno" type="i32" />
<funcdecl id="main" sig="mainsig" />
<funcdef func="main" />
<bb lname="entry"> <!-- lname = local name -->
<param type="i32" lname="argc" />
<param type="ppi8" lname="argv" />
<inst opcode="ADD" flags="V" type="i32" opnd1="%argc" opnd2="@I32_42">
<result lname="res" />
<result lname="ovf" />
</inst>
<inst opcode="CALL" sig="some_sig" callee="some_callee">
<arg val="argc" />
<result lname="r1" />
<nor-dest name="bb2">
<pass-value val="r1" />
</nor-dest>
<exc-dest name="bb3" />
</inst>
<inst opcode="SWAPSTACK" swappee="%some_hypothetic_stack">
<return-with>
<result type="i32" lname="ss_res1" />
<result type="i32" lname="ss_res2" />
</return-with>
<pass-values>
<pass-value type="i32" val="%res" />
<pass-value type="i32" val="%r1" />
      </pass-values>
</inst>
<!-- more instructions here -->
</bb>
<bb lname="bb2">
<param type="i32" lname="r1" />
<!-- more instructions here -->
</bb>
<bb lname="bb3">
<exc-param lname="exc" />
<!-- more instructions here -->
</bb>
</funcdef>
<expose id="exposed_main" symbol="c_callable_symbol_of_exposed_main"
func="main" callconv="DEFAULT" cookie="@I64_0" />
</bundle>
```
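A client-side tool could walk such a bundle with an off-the-shelf XML parser. A sketch with Python's standard library, over a cut-down version of the bundle above (the tag and attribute names are the hypothetical ones from this proposal, not an agreed schema):

```python
import xml.etree.ElementTree as ET

# A cut-down version of the example bundle.
bundle_xml = """
<bundle>
  <type id="i8" ctor="int" length="8" />
  <type id="i32" ctor="int" length="32" />
  <type id="pi8" ctor="uptr" type="i8" />
  <funcsig id="mainsig">
    <paramty type="i32" />
    <paramty type="ppi8" />
    <retty type="i32" />
  </funcsig>
</bundle>
"""

root = ET.fromstring(bundle_xml)

# Index the top-level definitions by their XML id; the parser comes for free.
types = {t.get("id"): t.attrib for t in root.iter("type")}
sigs = {s.get("id"): ([p.get("type") for p in s.iter("paramty")],
                      [r.get("type") for r in s.iter("retty")])
        for s in root.iter("funcsig")}

print(types["i32"]["length"])  # 32
print(sigs["mainsig"])         # (['i32', 'ppi8'], ['i32'])
```

No hand-written grammar is involved; adding a new attribute or child element requires no parser change at all.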
A YAML example:
```yaml
types:
- name: i8
ctor: int
length: 8
- {name: "i32", ctor: "int", length: 32}
- {name: "i64", ctor: "int", length: 64}
- {name: "double", ctor: "double"}
function_signatures:
- name: "mainsig"
paramtys: ["i32", "ppi8"]
rettys: ["i32"]
constants:
- {name: "I32_42", type: "i32", value: 42}
- {name: "I64_0", type: "i64", value: 0}
- {name: "D_0", type: "double", value: 0.0}
- name: "D_NAN"
type: "double"
value_from_int: 0x7ff0000000000001
globals:
- {name: "errno", type: "i32"}
functions:
- name: "main"
sig: "mainsig"
initial_version:
- bbname: "entry"
params:
- {type: "i32", lname: "argc"}
- {type: "ppi8", lname: "argv"}
insts:
- {opcode: "ADD", flags: "V", type: "i32", opnd1: "%argc", opnd2: "@I32_42",
results: ["res", "ovf"]}
- opcode: "CALL"
sig: "some_sig"
callee: "some_callee"
args: ["%argc"]
results: ["r1"]
nor_dest:
bb: "bb2"
pass_values: ["%r1"]
exc_dest:
bb: "bb3"
- opcode: "SWAPSTACK"
swappee: "%some_hypothetic_stack"
ret_with:
- {type: "i32", lname: "ss_res1"}
- {type: "i32", lname: "ss_res2"}
pass_values:
- {type: "i32", val: "%res"}
- {type: "i32", val: "%r1"}
# more instructions here
- bbname: "bb2"
params:
- {type: "i32", lname: "r1"}
insts:
# more instructions here
- bbname: "bb3"
excparam: "exc"
insts:
# more instructions here
exposed_functions:
- name: "exposed_main"
symbol: "c_callable_symbol_of_exposed_main"
func: "main"
callconv: "DEFAULT"
cookie: "@I64_0"
```
A Lisp example:
```lisp
(type i8 int 8)
(type i32 int 32)
(type i64 int 64)
(type pi8 ptr i8)
(type ppi8 ptr pi8)
(funcsig mainsig (i32 ppi8) (i32))
(const I32_42 i32 42)
(const I64_0 i64 0)
(global errno i32)
(funcdecl main mainsig)
(funcdef main.v1 main
(bb entry ((i32 argc) (ppi8 argv))
(ADD i32 %argc @I32_42 res
((C carry)
(V ovf)
))
(CALL some_sig some_callee (%argc) (r1)
((bb2 (%r1)) (bb3)))
(SWAPSTACK %some_hypothetic_stack
(ret-with ((i32 ss_res1)
(i64 ss_res2)))
    (pass-values ((i32 %res)
                  (i32 %r1)))))
  (bb bb2 ((i32 r1))
    ;; More instructions here
  )
  (bb bb3 (exc)
    ;; More instructions here
  ))
(expose exposed_main main DEFAULT @I64_0
((symbol "c_callable_symbol_of_exposed_main")))
```

Issue 70: Debugging facilities for Mu clients — Zixian Cai (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/70

Issue 69: Comparing ref<T> against ref<U> — Kunshan Wang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/69

# The problem
Currently the cmpOp instructions take only one type parameter, and both operands must have the same type.
```
%result1 = EQ <int<32>> %a1 %b1
%result2 = EQ <int<64>> %a2 %b2
%result3 = FEQ <float> %a3 %b3
%result4 = EQ <ref<T>> %a4 %b4
%result5 = EQ <iref<T>> %a5 %b5
%result6 = EQ <funcref<sig>> %a6 %b6
%result7 = EQ <uptr<T>> %a7 %b7
%result8 = EQ <ufuncptr<sig>> %a8 %b8
```
In object-oriented programming, we sometimes want to compare a `ref<T>` against `ref<U>`, where T is a superclass of U. Mu does not know OOP, but Mu has the [prefix rule](https://gitlab.anu.edu.au/mu/mu-spec/blob/master/memory.rst#prefix-rule) so that a value of type `ref<T>` can actually refer to an object of type `U` as long as `T` is a prefix of `U`. This allows OOP to be implemented in the Mu type system.
For this reason, comparing `ref<T>` against `ref<U>` for equality is meaningful: The result is true iff both references refer to the same object, or both NULL. The semantics of `EQ` is actually [defined as so in the spec](https://gitlab.anu.edu.au/mu/mu-spec/blob/master/instruction-set.rst#comparison), but requires both operands to have the `ref<T>` type.
```
// Assume %a is ref<T> and %b is ref<U>
%result = EQ <ref<T>> %a %b // Disallowed in the spec because %b is a ref<U>. But the refimpl does not check the type parameter, so it works for now.
```
To work-around this problem, the client can use `REFCAST` to cast both operands to the same type (such as `ref<void>`) before comparing:
```
// Assume %a is ref<T> and %b is ref<U>
%aa = REFCAST <ref<T> ref<void>> %a
%bb = REFCAST <ref<U> ref<void>> %b
%result = EQ <ref<void>> %aa %bb
```
This would potentially make the Mu instruction stream very verbose.
# Simplest solution
The simplest work-around is to let `EQ` ignore the type parameter when comparing two `ref`, `iref`, `funcref`, `uptr` or `ufuncptr` values.
The code will look like:
```
// Assume %a is ref<T> and %b is ref<U>
%result = EQ <ref<Blah>> %a %b // The micro VM ignores the Blah
```
This will elide the two `REFCAST` instructions. Similarly, when comparing `iref<T>` and `uptr<T>`, the `T` is ignored, as is the `sig` signature in `funcref<sig>` and `ufuncptr<sig>`. If this behaviour is *standardised*, the client can rely on it and emit fewer instructions.
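A sketch of this relaxed type-checking rule (a hypothetical checker written for illustration, not the refimpl's actual code): general-purpose types still require exact equality, but for the five reference-like constructors only the constructor itself must match.

```python
# Types are modelled as (constructor, parameter) pairs, e.g. ("ref", "T").
REFERENCE_LIKE = {"ref", "iref", "funcref", "uptr", "ufuncptr"}

def eq_operands_ok(lhs, rhs):
    """Check whether EQ may compare values of these two types."""
    ctor_l, _ = lhs
    ctor_r, _ = rhs
    if ctor_l in REFERENCE_LIKE and ctor_l == ctor_r:
        # Ignore the type/signature parameter: ref<T> vs ref<U> is fine.
        return True
    # Otherwise both operands must have exactly the same type.
    return lhs == rhs

print(eq_operands_ok(("ref", "T"), ("ref", "U")))   # True: parameter ignored
print(eq_operands_ok(("ref", "T"), ("uptr", "T")))  # False: different storage types
print(eq_operands_ok(("int", 32), ("int", 32)))     # True: exact match
```

Note that `ref` vs `uptr` is still rejected, which matches the "comparing ref against ptr" discussion below.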
## What the micro VM sees
When the micro VM sees the instruction `EQ <ref<Blah>> %a %b`, the compiler knows that both `%a` and `%b` are `ref`s of something, but does not know what `%a` and `%b` refer to (the "Blah" can just be a lie). As long as refs are always represented in the same way (for example, as pointers to the beginning of the object, possibly moved by the GC), the compiler can still generate code without knowing the object type. **The compiler cares about the storage type**, not the high-level parameterised type.
## Potential side effects (unlikely)
This will require all `ref<_>` types to have the same representation (size; as pointer or as handle) regardless of the type parameter. It rules out implementations where `ref<T>` and `ref<U>` have different sizes. But I don't think implementing different refs in different sizes would be useful.
# A more aggressive design
We can push it further by removing all type and signature parameters in `ref<T>`, `iref<T>`, `funcref<sig>`, `uptr<T>` and `ufuncptr<sig>`, so they become simply `ref`, `iref`, `funcref`, `uptr` and `ufuncptr`.
To compensate for the lack of knowledge about the referent type, instructions must be annotated with the referent types. But the micro VM only needs to know the referent type when doing pointer arithmetic (GETFIELDIREF, ...) and memory access (LOAD, STORE, ...). For example:
```
// Assume %a is an iref to T, and T is a struct
%b = GETFIELDIREF <T 3> %a
// Assume %c is an iref to int<64>
%v = LOAD <int<64>> %c
```
Actually this is *the same as the current Mu IR*. The type annotations on the instructions are intended to ease the job of the Mu-to-machine compiler inside the micro VM.
By discarding the type parameters, REFCAST becomes unnecessary, and PTRCAST only casts between pointers and integers, not between pointers.
But the Mu IR programs themselves will carry less information about the destination of refs/uptrs. It may make the behaviour of the program harder to reason about. But since the client can perform REFCAST at any time, it can always choose to cast all refs to `ref<void>`, and still write correct programs.
It is unlikely that we will adopt this aggressive design soon, but may be considered if we redesign the IR.
# Comparing ref against ptr
A related topic is whether it should be allowed to compare `ref` against `ptr`.
The obvious answer is "no". `ref` and `ptr` (as well as `funcref`) do not have the same storage type. `ref` can be represented as the address of the beginning of the object, and may be modified by the GC when the object is moved. `ref` can also be represented as a handle, or as a pair of <addr, type>. On the other hand, `ptr` must be treated as a raw address. Even if `ref` is represented as an address, consider an extreme case: a micro VM that performs GC between every pair of instructions and moves every object each time. It is a valid micro VM implementation, but the address behind any `ref` is totally non-deterministic.
In some VMs (such as JikesRVM), there are VM magics that allow getting the address from an object reference, or converting an address into an object reference. In this way, the GC can be implemented in the same language as the language it serves. However, the `addr->objref` and `objref->addr` conversions alone are not enough. Such VMs must also have mechanisms to specify **uninterruptible** regions in which GC must not happen. If the GC is concurrent, there must be other mechanisms to handle this gracefully. But all such "magics" are closely tied to the concrete (micro) VM implementation, and should be kept private.
Issue 68: Duration of Object Pinning — Kunshan Wang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/68

I am just being pedantic about the semantics of object pinning. The current "object pinning" is vague w.r.t. the duration of pinning.
Now that we define global cells as always pinned [to support relocations for raw pointers](https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/60), there are two different scopes of pinning. An object is pinned iff
1. it is a global cell, or
2. it is in the "pin set" of a thread.
The purpose of the so-called "pin set" is to make pinning operations locally scoped: it is like a per-thread reference count rather than a global reference count. If copying GC never happens, threads only need to modify local thread states rather than global states.
Consequently, the thread state of one thread may not be visible (or consistent w.r.t. concurrency) to other threads, hence the phrase "is in the pin set" is vague: whether a memory location is pinned depends on the observer.
But there is one guarantee the micro VM must make:
**While a memory location is pinned, its address must not change**, at least as observed by the Mu thread that executes C functions while holding the pin. So naturally we can define that **the thread that pins the location must not observe the address changing between its own pinning and unpinning**. If two threads independently pin and then unpin the same Mu object concurrently without any synchronisation, they may observe different addresses, because their durations may not overlap, and GC may happen in between. If the two threads do not communicate and their C functions do not save the pointers, everything still works even if the object is moved while **not** pinned.
But more interesting questions may arise if we consider inter-thread communication: If
1. one thread T1 pins an object O1, then
2. sends the address of O1 to another thread T2, then
3. T2 pins the same object O1, then sends a message back to T1, then
4. T1 unpins O1, and
5. T2 independently unpins O1.

Then **should O1 have a constant address from the moment T1 pins it until both T1 and T2 have unpinned it**? If we interpret "sending a message" as "forming a happens-before relation", then the whole process looks pretty sequential.
We can use the "happens-before" relation to define the duration of pinning, so that the pinning/unpinning operations from different threads can chain up. Then some Mu objects may have a very long duration of pinning, during which it has constant address. This is not a problem, at least not more problematic than one single thread pinning an object for a really long time. We just need to precisely define the duration of pinning so that the client can depend on it.
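A toy model of this chained-duration reading (plain Python with invented names; nothing here is µVM code): give each object a total pin count across all threads, and let the GC move an object only while its count is zero. T1's and T2's overlapping pins then chain into one interval of constant address, matching the five-step scenario above.

```python
class Obj:
    def __init__(self, addr):
        self.addr = addr
        self.pins = 0        # total pins held by all threads

    def pin(self):
        self.pins += 1
        return self.addr     # the address observed while pinned

    def unpin(self):
        self.pins -= 1

def gc_move(obj, new_addr):
    # The GC may only move objects that nobody has pinned.
    if obj.pins == 0:
        obj.addr = new_addr

o1 = Obj(addr=0x1000)
a1 = o1.pin()            # 1. T1 pins O1
a2 = o1.pin()            # 3. T2 pins O1 (address received from T1)
gc_move(o1, 0x2000)      # a GC here must not move O1 ...
o1.unpin()               # 4. T1 unpins O1
gc_move(o1, 0x2000)      # ... nor here: T2 still holds a pin
assert a1 == a2 == o1.addr == 0x1000
o1.unpin()               # 5. T2 unpins O1; the chained duration ends
gc_move(o1, 0x2000)      # only now may the object move
print(hex(o1.addr))      # 0x2000
```

The open question in the text is precisely whether this global-count picture (which is sequentially consistent) is required, or whether a weaker, happens-before-based duration suffices.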
I still don't know how to precisely express it. This is definitely trickier than the visibility rules for LOAD/STORE operations because this time it is about duration rather than just a value. The easiest model, of course, is to make pinning/unpinning sequentially consistent, but I wonder if it would require excessive fencing to prevent weird behaviours in something like:
1. T1 pins O1 and sends a message to T2, then
2. T2 pins O2 and sends a message to T3, then
3. the programmer thinks T3 should see O1 being pinned, but observed otherwise.
Issue 67: C thread becoming Mu thread (exposed functions, a.k.a. ".expfunc") — Kunshan Wang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/67

This issue is about calling Mu functions from C functions. It is not a problem if Mu initiated the call to a native program and that program then calls back. But when a fresh native thread (such as one created by `pthread_create`) directly calls a Mu function, thread-local states (such as GC states) must have been initialised, or the Mu program will not work properly.
Related spec: https://gitlab.anu.edu.au/mu/mu-spec/blob/master/native-interface.rst#native-functions-calling-mu-functions
Previous issue: https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/39
# The problem
When a Mu thread is executing, there are thread-local states that need to exist to support the execution of Mu IR programs.
For example, if the Mu IR program uses a bump-pointer GC, the "current pointer" is a per-thread state, and it should always point to the next available memory. Mu instructions (such as `NEW` and `NEWHYBRID`) assume such thread-local pointers are set up when they are executed.
Such states are usually set up when a Mu thread is created. When a thread is created using the `NEWTHREAD` instruction or its equivalent API, the micro VM will initialise the states properly.
But the problem arises when the thread is created natively (for example, by `pthread_create`). Such **POSIX functions are not designed with Mu in mind** and will not initialise Mu-specific states. So a PThread cannot directly call a Mu function unless some preparation is done.
# Current design
Related spec: https://gitlab.anu.edu.au/mu/mu-spec/blob/master/native-interface.rst#native-functions-calling-mu-functions
The current Mu spec requires **implementation-defined** functions to be called before native threads not created by Mu (such as POSIX threads) can call any exposed Mu functions.
A Mu bundle can define `.expfunc` top-level definitions to directly expose pointers to C programs. For example:
```
.funcdef @fac ... {...}
.expfunc @fac_native = @fac #DEFAULT @I64_0 // expose @fac, default calling convention, use 0 as "cookie".
```
`@fac_native` is a raw function pointer which can be **called back** when Mu calls C and C then calls back into Mu. But when a PThread wants to call `@fac_native`, implementation-defined set-up is needed.
## Possible implementations
* The concrete micro VM can forbid such calls, and enforce that only Mu threads can execute Mu functions.
* The concrete micro VM can extend the API with a function to attach or detach PThreads, or threads using other APIs.
* The concrete micro VM can create Mu-specific thread-local states lazily when entering Mu from native code. Since the only way to enter Mu is via "exposed functions", stubs can be created at those "expfuncs" to lazily check for such states, or SIGSEGV can be used to trap when such pointers are zero.
Each has its own strengths and weaknesses. This is why this interface is still implementation-defined for now. Real-world experience will tell which method is better.
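The lazy-set-up option can be sketched with per-thread state. This is an illustrative toy in Python, not a real micro VM; `ToyMicroVM`, `attach_current_thread` and `_ensure_attached` are invented names:

```python
import threading

_tls = threading.local()

def _ensure_attached(mu_vm):
    # What an expfunc stub would do on every entry from native code:
    # attach this thread to the micro VM on first use, then reuse the state.
    if getattr(_tls, "mu_state", None) is None:
        _tls.mu_state = mu_vm.attach_current_thread()
    return _tls.mu_state

class ToyMicroVM:
    def __init__(self):
        self.attached = 0

    def attach_current_thread(self):
        self.attached += 1
        return {"alloc_cursor": 0}   # e.g. bump-pointer GC state

mu = ToyMicroVM()

def exposed_fac(n):
    _ensure_attached(mu)             # stub logic guarding the exposed function
    return 1 if n <= 1 else n * exposed_fac(n - 1)

t = threading.Thread(target=exposed_fac, args=(5,))  # a "fresh native" thread
t.start(); t.join()
exposed_fac(5)                       # the main thread attaches separately
print(mu.attached)                   # 2: one attach per thread, not per call
```

The cost is one check on every entry from native code, which is why a real implementation might prefer the SIGSEGV-trap variant mentioned above.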
## Multiple micro VMs in the same process?
It is rare for one process to run two micro VMs, but it is definitely possible. For example:
* A C host program provides both Python and Lua as extension languages (real-world applications exist), but both language implementations use the Mu micro VM.
* The client has some kind of sandbox mechanism and forces some part of the program to run in a separate micro VM.
# Related work
## JNI Invocation API
Related document: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html#attaching_to_the_vm
The JVM invocation API provides the `AttachCurrentThread` function to attach a PThread to a JVM, with the limitation that a native thread cannot be attached to two different JVMs. JNI also requires that the PThread stack "should have enough stack space to perform a reasonable amount of work", and notes that "The allocation of stack space per thread is operating system-specific. For example, using pthreads, the stack size can be specified in the pthread_attr_t argument to pthread_create."
From Mu's point of view, the `MuCtx` structure holds Mu states for the client, so calling API functions in `MuCtx` does not need any attaching. However, calling "exposed Mu functions" will need special set-up like `AttachCurrentThread`.
## JikesRVM
JikesRVM's GC is designed in such a way that it works even if the related thread-local data structure is all zeroes (as initialised by the system). This gracefully avoids the GC-related part of the problem, but it may not be the most general solution.
## .NET framework
Related documents: https://msdn.microsoft.com/en-us/library/74169f59(v=vs.110).aspx
VM-related thread-local states are created lazily when an unmanaged thread enters the managed runtime.
Issue 66: Are bundles the unit of compiling or the unit of loading? — Kunshan Wang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/66

# Two different views of "bundle"
From the compiler's point of view, compiling is a process where:
1. There are many modules (such as .class files), all of which need to be compiled (to Mu IR, for example). Modules may have inter-dependencies, and there could even be mutual recursion (A imports B, B imports C, and C imports A).
2. Compilers should compile each module separately, with no knowledge of other modules. This implies each module compiles to a stand-alone bundle that has all the necessary things (types, functions, ...) used inside the bundle. Since types use "structural equivalence", it does not matter if two structurally isomorphic types are defined twice in two bundles. In particular, functions can be declared multiple times in different bundles.
3. When they are linked, bundles are merged. Different bundles should have no intersections, with one exception: functions of the same name are resolved to be the same, so calling a declared but not defined function may end up calling a function defined in another bundle. (Global cells should have similar properties, too.) We also allow calling functions that are declared but not defined in any bundle, which triggers lazy loading.
In the current Mu design, the micro VM's view of bundles is:
1. There is a global bundle, which includes everything that is ever loaded at a point of time.
2. Bundle is the unit of loading. Bundles are loaded sequentially. (At least it is perceived to be so through the API. The Mu impl can load them in parallel while ensuring sequential consistency.)
3. Each bundle can refer to things (types, functions, ...) defined in the current bundle or the global bundle.
4. Every time a bundle is loaded, its contents (types, functions, ...) are merged into the global bundle. That is, *there is one single global bundle which gets gradually augmented as bundles are loaded*. Conflicts are not allowed. If a new function version (FuncVer) is defined on an existing function, this FuncVer becomes the "most recent version" of the function.
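The micro VM's loading rules above can be sketched as a toy global bundle (a Python illustration with invented structure names; types, constants and globals are omitted, keeping only functions):

```python
class GlobalBundle:
    def __init__(self):
        # name -> most recent FuncVer, or None if only declared so far
        self.funcs = {}

    def load(self, bundle):
        """Merge one loaded bundle into the single global bundle."""
        for name in bundle.get("funcdecls", []):
            # Re-declaring an existing function resolves to the same function.
            self.funcs.setdefault(name, None)
        for name, funcver in bundle.get("funcdefs", {}).items():
            # A new FuncVer becomes the most recent version, even if the
            # function already existed (function re-definition).
            self.funcs[name] = funcver

gb = GlobalBundle()
gb.load({"funcdecls": ["@Bar.run"], "funcdefs": {"@Foo.run": "v1"}})
gb.load({"funcdefs": {"@Bar.run": "v1"}})   # defines the declared function
gb.load({"funcdefs": {"@Foo.run": "v2"}})   # re-definition: v2 supersedes v1
print(gb.funcs)   # {'@Bar.run': 'v1', '@Foo.run': 'v2'}
```

The sequential `load` calls are exactly the "temporal process" described below: the order of loading determines which FuncVer is most recent.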
The main difference between the two views is whether we consider bundle loading to be a static, separated and parallel process, or a dynamic and sequentially inter-dependent one.
## Why is Mu designed like this?
The current Mu design is based on the facts that (1) Mu is a run-time JIT compiler, and (2) Mu supports function re-definition. Because Mu is a run-time entity, it is one single thing that lives through the life of the application. It observes all the bundles the client ever delivers to it, and this is a temporal process. The micro VM always starts with no knowledge, and the client "teaches" it more and more knowledge by loading bundles. So the "global bundle" represents the "current knowledge" the micro VM has about the world (i.e. the types, functions, ... of the client's language). Since the growth of knowledge is a sequential process, it is natural to assume bundles are loaded in a sequence. In this way, if a later bundle refers to things the micro VM already knows (for example, types defined in previously loaded bundles), it does not need to define/declare them again, because Mu already knows them; the bundle can just refer to them by name/ID. The sequential nature also makes it easy to support function re-definition: since bundles form a sequence, a FuncVer in a newer bundle replaces the current "most recent" version in the global bundle.
The separate-compiling approach is the traditional, well-known way C compilers work. But it does not address function re-definition: re-definition is still an "action" rather than a declaration, and the order of "which FuncVer invalidates which older FuncVer" does matter.
## What the client may want
But compiler (traditional C compiler or Mu client) writers may want a certain degree of flexibility for parallel compilation, and there is some aesthetic appeal in "**separate modules should be compiled to separate Mu bundles**". For example, a JVM client may find it more intuitive to generate one Mu IR bundle for each .class file, with each .class file compiled separately, while still allowing lazy loading. For example:
```java
//// Foo.class
public class Foo {
public static void run() { Bar.run(); }
}
//// Bar.class
public class Bar {
public static void run() { Foo.run(); }
}
```
The separate-compiling model will deliver two Mu bundles:
```
//// Bundle1:
.typedef @Foo = ....
.funcdef @Foo.run VERSION %v1 ... {
...
CALL @Bar.run()
}
.funcdecl @Bar.run ... // Declare @Bar.run in Bundle1
//// Bundle2
.typedef @Bar = ....
.funcdef @Bar.run VERSION %v1 ... {
...
CALL @Foo.run()
}
.funcdecl @Foo.run ... // Declare @Foo.run in Bundle2
```
That is, `@Bar.run` is declared in Bundle1 and `@Foo.run` is declared in Bundle2. They declare functions in each other because neither has knowledge of the other.
However, in the current Mu model, the two bundles will look like:
```
//// Bundle1:
.typedef @Foo = ....
.funcdef @Foo.run VERSION %v1 ... {
...
CALL @Bar.run()
}
.funcdecl @Bar.run ... // Declare @Bar.run in Bundle1
//// Bundle2
.typedef @Bar = ....
.funcdef @Bar.run VERSION %v1 ... {
...
CALL @Foo.run()
}
```
The difference is subtle: Bundle2 does not declare `@Foo.run`, because it knows Bundle1 is loaded before it, and `@Foo.run` is already defined.
It is arguable that this requires the two bundles to be built sequentially. But it can be worked around by "lifting" both declarations into a third bundle:
```
//// Bundle0:
.funcdecl @Bar.run ... // lifted out of Bundle1
.funcdecl @Foo.run ... // lifted out of Bundle2
//// Bundle1:
.typedef @Foo = ....
.funcdef @Foo.run VERSION %v1 ... {
...
CALL @Bar.run()
}
//// Bundle2
.typedef @Bar = ....
.funcdef @Bar.run VERSION %v1 ... {
...
CALL @Foo.run()
}
```
Declaring functions is faster than defining them. After Bundle0 is loaded, Bundle1 and Bundle2 can be built and loaded in parallel.
It is also arguable that "lifting both declarations into a separate bundle" is a redundant step. But in practice, this step cannot be avoided. Take Java as an example again. If one Java ClassLoader visits both Foo.class and Bar.class, it already knows both classes and can simply build both into a single Mu bundle rather than splitting them in two. If two Java ClassLoaders attempt to load Foo and Bar in parallel, discover the inter-dependency, and also find each other working on the two respective .class files simultaneously, then the ClassLoaders need some synchronisation mechanism so that classes are not loaded twice. This is necessary even in existing non-Mu production JVMs. So *if there is a need to compile two inter-dependent Java classes in parallel, the client has to factor out the common parts, which naturally leads to the "Bundle0"*.
An orthogonal issue is about the type system. Assume we have the two Java classes:
```java
class Foo { Bar bar; }
class Bar { Foo foo; }
```
Naturally `@Foo` should be `struct<@JavaHeader ref<@Bar>>`. However, without looking at Bar.class, we cannot define the type `@Bar`, which is supposed to match the structure of the Java class fields in Bar. So if we enforce lazy loading, Foo.bar has to be represented as `ref<void>` rather than `ref<@Bar>`. This has been [discussed in a separate issue before](https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/38). **The separate-compiling model does not solve this problem**, because the crux is that the **knowledge** of Bar can only be obtained by looking at Bar.class. Unlike declared-but-not-defined functions, having **types** that are not yet known (the C language calls them "incomplete types") will cause many problems. These types are unusable: if traps should be triggered when a type is used, it is hard to define what "a type is used" means. If we define it as accessing an object of that type, or simply performing a BinOp on such a type, then almost any instruction can trigger a trap.
## Conclusion
In the end, we still believe the current Mu design is reasonable for its purpose as a JIT compiler.
The current "single global bundle" design is also easier for the boot image writer because there is only one bundle to consider.
But we may consider the needs of programming language implementers that "modular languages should be compiled to modular object code". The implication of adopting this model is still not clear. Alternatively, this model could also be implemented in a layer above Mu.
Issue 65: The "surface" IR as a layer above the formal IR — Kunshan Wang (2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/65

# Abstract
After the recent discussions about several alternative Mu IR forms, I realised that the needs of formal verification, of the concrete micro VM implementation, and of the application/client may all be different. The IR, as a language, is the means for the client to transfer programs to the micro VM, and this calls for *compactness*, *efficiency* and a degree of *expressiveness* that does not hinder efficient implementation. Formal verification, on the other hand, benefits from an IR with the *abstraction*, *consistency*, and *simplicity* (with respect to an operational model) of those designed for *functional* languages.
This issue proposes a two-level model: a "surface" form primarily for client-µVM communication, and a "formal" form for formalisation. The "surface" form is strictly syntactic sugar over the "formal" form, and can be transformed statically and automatically into the latter when needed.
With the surface form "detached" but mappable to the formal form, the surface form can be designed more aggressively for code compactness, such as introducing side-exits in the middle of basic blocks. But I am not advocating making such aggressive design changes now.
# Different concerns
## Terminating Instructions and Code Bloating
Currently, in the "surface" form as described by the (informal) [Mu Spec](https://gitlab.anu.edu.au/mu/mu-spec/blob/master/instruction-set.rst#ssa-variables), some Mu IR instructions can be either a normal instruction in the middle of a basic block, or a terminating instruction that implies several destinations. The `CALL` instruction is a well-known example: a basic block may contain many `CALL` instructions in a sequence, and when an exception is thrown, the default behaviour is "rethrow". In the "formal" form as described in [uvm-formal-hol](https://gitlab.anu.edu.au/mu/mu-formal-hol/blob/master/uvmIRScript.sml#L370), by contrast, `CALL` is always a terminating instruction, and both the normal and the exceptional destinations must be explicitly defined.
The "surface" form is designed with compactness and the machine behaviour in mind. Real-world programs contain many function calls in a sequence. When no exceptions are thrown, each call (both machine-level and language-level) simply falls through to the next instruction, with the return values available to subsequent instructions and function calls. The default "rethrow" behaviour is designed so that, when the current call site cannot catch the exception, the stack unwinder simply skips the current function. It is *the fact that the call site does not handle the exception in most cases* that makes the code efficient.
A related topic is to add "side exits" in the middle of basic blocks, as is done by JikesRVM. This will remove the need to split basic blocks even when exceptions and "uncommon branch targets" are present and thus makes the IR more compact, but it will also make the control flow less explicit.
On the other hand, the "formal" form makes it easy to reason about all branches and all possible execution paths. If the IR is augmented with "NO_THROW" annotations, it can further make exceptions undefined behaviour, and relieve the burden of verifying some Mu functions.
Real-world programming languages, however, usually choose one way or the other consistently, with "rethrow" the more common. Java, C#, Python and RPython, for example, always "rethrow", and do not provide a "nothrow" option. C++ allows the ["noexcept"](http://en.cppreference.com/w/cpp/language/noexcept) annotation on some C++ functions, and C++ exceptions can silently pass through any C functions on the stack, as long as the C functions are compiled with compatible compilers (such as g++/gcc) and the object files contain the appropriate unwinding information (.eh_frame in ELF).
But if future languages or existing languages with extensions (vmmagic) can make use of this feature, it has the potential to provide positive effects for the verifiability of carefully-written Mu functions.
## Declarative vs Operational
The current Mu IR is inspired by the LLVM IR. That IR describes an AST of functions and top-level declarations, such as constants (which, in LLVM's terminology, include literals, global variables and functions). The SSA form is naturally a "dependency-description language": an instruction has many "uses" of other [Value](http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html)s, and each Value can be anything we can get a value from, that is, either a constant or a local variable. The compiler sees the dependencies between instructions, such as "this `add` instruction depends on an instruction result and a ConstantInt". With a type hierarchy, the compiler needs to pattern-match against the kinds of Values, as any static code transformer has to.
On the other hand, [mu-formal-hol](https://gitlab.anu.edu.au/mu/mu-formal-hol) describes the execution of a thread as a sequence of state transitions (a kind of functional-style interpretation). A thread has many registers, each holding the value of a local variable. The registers are modified as a side effect of execution:
1. When [entering a basic block](https://gitlab.anu.edu.au/mu/mu-formal-hol/blob/master/uvmThreadSemanticsScript.sml#L383), the basic block arguments are assigned to the registers of the parameters.
2. When [executing an instruction](https://gitlab.anu.edu.au/mu/mu-formal-hol/blob/master/uvmThreadSemanticsScript.sml#L270), the instruction will affect the values of some registers, and the register values are updated.
With a "Value" being either a constant or a register, the value needs to be pattern-matched against the two cases, namely constant or register, every time an instruction argument is evaluated.
A solution to this complexity is to introduce instructions that load global variables into local variables, for example GETCONST, GETGLOBALCELLIREF, GETFUNCREF and GETEXPFUNC. (I intentionally avoid the word "load" to emphasise that they are not memory operations, but merely aliasing.) This will make the IR semantically clearer and probably make proofs easier, at the cost of making basic blocks more verbose. But fundamentally the two forms are equivalent.
# Where the Two Forms Reconcile
I propose splitting the IR into two layers, with the "surface" form catering to compactness, and the "formal" form designed to be more consistent. There will be *a mapping from every "surface" IR bundle to a "formal" IR bundle*; this is where the two forms reconcile. The point is that, for every "surface" bundle, there is an equivalent "formal" bundle, so introducing "unfriendly" syntax will not compromise verifiability.
The mapping will "desugar" the "surface" form. There will be a conversion rule for every "surface" syntax that does not exist in the "formal" form. For example:
1. The "fall-through" call
```
%cur_bb(%v1 %v2 ... %vn):
%lv1 = ...
%lv2 = ...
...
(%rv1 %rv2) = CALL <@sig> @callee (...)
%lv3 = next instruction...
...
```
will be desugared into:
```
%cur_bb(%v1 %v2 ... %vn):
%lv1 = ...
%lv2 = ...
CALL <@sig> @callee (...) EXC(
%generated_continue_block(%v1 %v2 ... %vn %lv1 %lv2 ... $0 $1)
%generated_rethrow_block())
%generated_continue_block(%v1 %v2 ... %vn %lv1 %lv2 ... %rv1 %rv2):
%lv3 = next instruction...
...
%generated_rethrow_block() [%the_exception]:
THROW %the_exception
```
2. References to global variables:
```
%a = ADD <@i32> %a @ONE
```
will be desugared into:
```
%_tmp = GETCONST @ONE
%a = ADD <@i32> %a %_tmp
```
## Other Implications
With the introduction of an additional form, the "surface" form can be more aggressive in design.
The "surface" form may go beyond the current "single-exit" form by introducing **side-exits**, such as:
- DIV by zero, CALL/TRAP/WATCHPOINT/SWAPSTACK with exceptions, NEW/ALLOCA failure, LOAD/STORE with NULL pointers, may take side exits rather than forcing the basic block to be split.
- "guard" instructions (as usually demanded by tracing JIT compilers) may be implemented as side-exiting conditional branches. This also implies that the "side-exit" is the slow path while the "fall-through" case is the common fast path.
Currently I fear that breaking the "single-exit" property may result in the micro VM still having to split basic blocks internally. LLVM, with its basic blocks not taking parameters, and its optimisers having lots of transforms to do, would probably keep the single-exit SSA form. But since Mu already adopted the "goto-with-values" form, whether "side-exits" should be introduced to the IR should depend on experience with the high-performance Mu implementation.
Related work:
- [B3](https://webkit.org/docs/b3/intermediate-representation.html): Apple's B3 still requires Jump/Branch/Switch to be at the end of basic blocks. The reason could be that B3 still uses the text-book SSA form, so non-merging control flow branches do not need PHI nodes, hence it is cheap to add basic blocks.
- RPython: Its transformers (including GC transformers and exception transformers) will split basic blocks for function calls with exceptions. But since RPython is static, code compactness may not be a concern.
- JikesRVM: JikesRVM uses Factored CFG (FCFG), where a Potential Excepting Instruction (PEI) does not necessarily end a basic block. As described [here](http://www.jikesrvm.org/JavaDoc/org/jikesrvm/compilers/opt/ir/ControlFlowGraph.html), FCFG will significantly reduce the number of basic blocks, but will complicate flow-sensitive global analysis. But given that Mu pushes most optimisations out of the micro VM, it is arguable that the micro VM back end may favour a simpler form.
---
**Issue #64** — New Symbolic IR building API (Kunshan Wang, 2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/64
# Introduction
Recent experience from Yi shows that it is difficult to implement the current bundle-building API in some languages, such as Rust, with respect to *mutability* and *cyclic references* between IR nodes. We are proposing an alternative IR building API. In the new API, **IR nodes will refer to other nodes by symbols (IDs and, optionally, names)** rather than referring to the nodes directly. The advantages will be:
1. The client can build the IR nodes in any order.
2. The implementation language of *the micro VM itself* can keep symbolic references between IR nodes, which can be helpful if the language (such as Rust or Haskell) is sensitive to mutability and cycles.
The potential disadvantage is that the micro VM must resolve symbolic references between IR nodes during bundle building or after bundle loading. But this practice has been commonly used throughout the history of compiler construction (according to Tony). We expect the cost of micro-VM-side symbol resolution to be acceptable, though the actual performance implication is not yet known.
TL;DR: Scroll down to "The Proposed New API"
# Background
The [high-performance Mu implementation](https://gitlab.anu.edu.au/mu/mu-impl-fast) is written in Rust.
**Rust does not like cycles.** Every Rust l-value is owned by exactly one other l-value. This forbids cyclic references between nodes: the moment the programmer attempts to form one, the compiler complains about ownership problems.
But the Mu IR often contains cycles, such as:
```
.typedef @linkedlist = struct < @i64 @linkedlist_ref >
.typedef @i64 = int<64>
.typedef @linkedlist_ref = ref < @linkedlist >
```
Note the recursion on the `ref` type. The text form can express this recursion because each type node has a name: `@linkedlist`, `@i64` and `@linkedlist_ref`.
The current IR building API constructs nodes directly, so API functions cannot refer to nodes that will be created in the future. To support cycles, the API mutates IR nodes after a node is created:
```c
MuCtx *ctx = ...;
MuBundleNode b = ...;
// Create a type: ref<?>
MuTypeNode linkedlist_ref = ctx->new_type_ref(ctx, b);
MuTypeNode i64 = ctx->new_type_int(ctx, b, 64);
MuTypeNode members[] = {i64, linkedlist_ref};
MuTypeNode linkedlist = ctx->new_type_struct(ctx, b, members, 2);
// Set the ref<?> to ref<@linkedlist>
ctx->set_type_ref(ctx, b, linkedlist_ref, linkedlist);
```
Note that the IR nodes must be created in a particular order, with reference targets populated at the end. Such an order always exists for every Mu IR bundle, because the Mu IR can only be recursive at certain spots. Constructing the IR nodes in the following order will always work: ref/uptr types -> other types and signatures -> populate ref/uptr types -> constants/globalCells/funcs/expfuncs -> funcvers -> basic blocks -> bb params -> insts/results -> branch destinations/exception clauses/keepalives/subclauses of NEWTHREAD and SWAPSTACK.
But **Rust does not like mutation, either**. Rust is designed for memory safety, especially avoiding data races in multi-threaded programs. So it is a well-known rule in Rust that *if anything is shared, it cannot be modified*. The sharing mechanisms (`&` (borrowed reference), `Rc` (ref-counted box), `Arc` (atomic ref-counted box)) share in a read-only mode, unless synchronisation mechanisms (such as `Mutex`) are also applied.
This does not match the current Mu bundle building and loading design: currently, a bundle is mutable while being built, but becomes immutable once loaded. To implement such mutation, Rust needs to use `Cell<T>` -- "interior mutability" in otherwise immutable objects. And when an object is constructed but a field is not yet supplied (such as the target of a `ref`), Rust needs `Option<T>` to leave space for the `None` case. We would have `struct TypeRef { target: Cell<Option<Type>> }`, whereas once loaded, the `target` field should be neither optional nor mutable.
## Other languages
No mainstream programming language can directly express the idea of "mutable while being constructed, immutable once used".
In Java, JavaBeans is a famous "mutable construction" pattern: a "bean" class provides a parameterless constructor and many setters, so properties can be set after construction. It is useful for introspection-based object initialisation:
```java
class Foo {
public Foo() { }
private Bar bar;
public void setBar(Bar bar) { this.bar = bar; }
public Bar getBar() { return bar; }
}
class Bar {
public Bar() { }
private Foo foo;
public void setFoo(Foo foo) { this.foo = foo; }
public Foo getFoo() { return foo; }
}
```
This pattern allows multiple objects to be created first, with their dependencies injected later, which is what dependency-injection containers do. The Spring Framework is one such container:
```xml
<beans>
<bean id="theFoo" class="Foo">
<property name="bar" ref="theBar" />
</bean>
<bean id="theBar" class="Bar">
<property name="foo" ref="theFoo" />
</bean>
</beans>
```
Many Spring Framework objects claim to be "thread-safe *after* being configured", implying that they may not be ready to use *during* configuration. Such objects depend on programmers knowing this protocol in order to work properly.
But the "JavaBeans" pattern is also criticised for leaving object fields non-final, even though none of the properties are supposed to change after being set. Critics claim that the compiler cannot perform certain optimisations while the fields remain mutable and the public setters remain accessible.
Despite the criticism, it is well known that constructors with immutable fields cannot form object graphs with cycles. Scala is another programming language that promotes immutable objects, and it also suffers from the inability to form cycles of immutable references directly. [One workaround](http://stackoverflow.com/questions/8374010/scala-circular-references-in-immutable-data-types) is to use lazy evaluation to resolve some reference edges later, but the underlying implementation still depends on JVM-level mutable fields.
```scala
class Element [T] (val value: T, p : => Element[T], n : => Element [T]) {
lazy val prev = p
lazy val next = n
}
```
Unfortunately, such workarounds do not exist in Rust.
## rustc: The Rust Compiler
rustc is the compiler of the Rust language. Internally, it has an LLVM-like CFG representation, and there are inter-references between instructions and basic blocks.
rustc separates the *references* to things from the *contents* of things. Take basic blocks as an example. A CFG contains many `BasicBlock`s:
```rust
// For information only. May not match the actual source code.
struct CFG {
    bbs: Map<BasicBlock, BasicBlockData>,
}
struct BasicBlock(u32);
struct BasicBlockData {
    instructions: ...,
    // other fields elided
}
struct BranchInstruction {
    destination: BasicBlock,
}
```
In the snippet above, `BasicBlockData` actually holds the information about a basic block, while `BasicBlock` is just a thin wrapper around a `u32`. Instructions, such as `BranchInstruction`, refer to their destinations by `BasicBlock`, which is a symbolic reference and does not own the basic block. The actual owner of each `BasicBlockData` is the `bbs` field in `CFG`. Using this approach, the whole structure is still a tree from Rust's point of view. When the program needs information about a basic block, it can borrow a reference to the `BasicBlockData` from `CFG.bbs`, using the `BasicBlock` as the key.
It is true that this approach requires an indirection whenever accessing the information about a basic block. But if the `Map` is implemented as an array and `BasicBlock` holds the index, the lookup is just an extra ADD and a LOAD. This is probably the best we can get in Rust without resorting to its unsafe features.
```rust
let bb: BasicBlock = ...;
{
let bbi: &BasicBlockData = &cfg.bbs[bb]; // borrow the BasicBlockData
// use bbi
} // the borrow of BasicBlockData ends here
```
Similarly, the high-performance micro VM can use this pattern to handle cyclic references between entities. All Mu types can be owned by the bundle, and one type can refer to another via indices into the array of "MuTypeData".
The implication of this is that the API should also refer to other IR nodes via symbols rather than actual constructed nodes.
# The Proposed New API
The new API will introduce another C-level struct: `struct MuIRBuilder`. Similar to `MuCtx`, which is used by only one client thread, the client must use a `MuIRBuilder` from only one thread; otherwise the micro VM would need to synchronise every method on it. Whether bundle building deserves its own struct is actually orthogonal to this topic: observing the current bundle-building API, the bundle-building functions have no overlap with the other functions in `MuCtx`, whose purpose is to mutate the running Mu state.
`struct MuIRBuilder` is also a function-pointer table, like the current `MuCtx` design. Whether we should instead adopt the traditional C approach -- having all methods as top-level C functions -- is an open topic, but it is a separate issue.
```c
typedef struct MuIRBuilder MuIRBuilder;
struct MuIRBuilder {
void *header; // implementation-specific private field
MuID (*gen_sym)(MuIRBuilder *b, MuCString name);
void (*load)(MuIRBuilder *b);
void (*abort)(MuIRBuilder *b);
void (*new_type_int)(MuIRBuilder *b, MuID id, int length);
void (*new_type_ref)(MuIRBuilder *b, MuID id, MuID target);
void (*new_type_struct)(MuIRBuilder *b, MuID id, MuID fields[], MuArraySize nfields);
...
};
```
## gen_sym: use ID for everything
The `gen_sym` method creates a "Mu symbol", or just "sym" for short. A sym identifies a node in the bundle; it has a numerical ID, generated by the micro VM when `gen_sym` is called, and an optional string name (`name` can be `NULL`).
All other functions take a `MuID` parameter identifying the node being created, and use `MuID`s to refer to other nodes. An ID can be used as soon as it is returned by `gen_sym`, and *may even be used before the node it names is created*. For example, to create the linked-list type, the C client code will look like:
```c
MuIRBuilder *b = ...;
MuID linkedlist = b->gen_sym(b, "@linkedlist");
MuID i64 = b->gen_sym(b, NULL /* I don't care about its name. */);
MuID linkedlist_ref = b->gen_sym(b, "@linkedlist_ref");
MuID members[] = {i64, linkedlist_ref}; // Use i64 and linkedlist_ref before these types are defined.
b->new_type_struct(b, linkedlist, members, 2);
b->new_type_int(b, i64, 64);
b->new_type_ref(b, linkedlist_ref, linkedlist);
```
Note that `i64` is used by `new_type_struct` before `new_type_int` is called.
Also note that the `new_xxxx` functions return `void` rather than handles. Since nodes refer to each other by "sym", handles are no longer necessary.
## complex sub-structures
More things will become IR nodes, such as
- branching destinations (basic block + arguments)
- exception clauses
- keep-alive clauses
- sub-clauses in the NEWTHREAD and SWAPSTACK instructions (these two instructions are too complex to be created by one function)
For example:
```uir
(@x @y) = CALL <@sig> @callee (@v1 @v2 @v3)
EXC(
@bb1(@v4 @x @v5 @y @v6)
@bb2(@blah @blah @blah))
KEEPALIVES(@v1 @v3 @v5)
```
```c
MuID call_inst = b->gen_sym(b, "@the_OSR_introspectable_call_site");
MuID x = b->gen_sym(b, "@x");
MuID y = b->gen_sym(b, "@y");
MuID exc_clause = b->gen_sym(b, NULL /* I prefer not to give too many names */);
MuID nor_dest = b->gen_sym(b, NULL /* I prefer not to give too many names */);
MuID exc_dest = b->gen_sym(b, NULL /* I prefer not to give too many names */);
MuID keepalive_clause = b->gen_sym(b, NULL /* I prefer not to give too many names */);
MuID call_args[] = {v1, v2, v3};
MuID call_results[] = {x, y}; // Sorry Eliot
b->new_call_inst(b, call_inst,
call_results, 2, // SSA variables for return values
sig, callee,
call_args, 3, // arguments to the function
exc_clause,
keepalive_clause);
b->new_exc_clause(b, exc_clause, nor_dest, exc_dest);
MuID nor_args[] = {v4, x, v5, y, v6};
b->new_dest(b, nor_dest, bb1, nor_args, 5);
MuID exc_args[] = {blah, blah, blah};
b->new_dest(b, exc_dest, bb2, exc_args, 3);
MuID keepalive_vars[] = {v1, v3, v5};
b->new_keepalive_clause(b, keepalive_clause, keepalive_vars, 3);
```
It's quite some code, but it should be okay if the client doesn't construct bundles via this API by hand.
## Instruction results are passed in, too
Note that the "syms" of the return values are passed in, too. They are SSA variables as well, and the instructions need to refer to their results. Most instructions will thus be created like "three-address instructions".
But most instructions have a known number of return values, such as:
```uir
%c = EQ <@i64> %a %b
```
```c
MuID i64, a, b = ...;
MuID cmp_inst = b->gen_sym(b, "@the_name_of_the_instruction_itself_for_tracing_and_debugging");
MuID c = b->gen_sym(b, "@func.ver.entry.c");
b->new_cmp(b, cmp_inst,
c, // the only return
MU_CMP_EQ, i64, a, b);
```
Some instructions (namely binOp (with zero/neg/ovf/carry flags), CALL, TRAP, WATCHPOINT, SWAPSTACK and COMMINST) may have a different number of results depending on their arguments. But since all instructions are "three-address", all results have to be passed in anyway.
## Putting it together
To prevent mutability, all instructions (themselves) are created in one step, while complex sub-components (such as exc-clause, keepalive-clause, ...) are created separately.
A basic block is created in one step, too. When creating basic blocks, instructions are passed in as parameters.
```uir
%bb1(<@T1> %p1 <@T2> %p2) [%exc_param]:
%x = [%inst1] ADD ...
%y = [%inst2] SUB ...
[%inst3] TRAP ...
```
```c
MuID bb1, p1, p2, exc_param, inst1, inst2, inst3 = ... // each obtained from gen_sym
b->new_binop(b, inst1, ...);
b->new_binop(b, inst2, ...);
b->new_trap(b, inst3, ...);
MuID bb1_param_tys[] = {T1, T2};
MuID bb1_params[] = {p1, p2};
MuID bb1_insts[] = {inst1, inst2, inst3};
b->new_bb(b, bb1,
bb1_param_tys, bb1_params, 2, // Two parameters
exc_param, // This block may be used to catch exceptions.
bb1_insts, 3 // Three instructions
);
```
The top level is implicit. When the `MuIRBuilder->load` method is called, all top-level definitions (types, signatures, constants, global cells, functions, exposed functions) ever created become part of the bundle. It should be reasonable to keep mutability at this very top level only.
# Performance impact
It still needs to be observed from experiments.
---
**Issue #63** — muapi.h: destinations (Eliot Moss, 2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/63
We revised the muapi.h interface recently around results of most instructions, removing add_result, and adding get_result and num_results. It seems that the situation around destinations for instructions such as BRANCH2 (just an example) is similar. However, in this case I think it appropriate that the destination be built by the user of the API **before** making the call to create the BRANCH2, and the two handles on the destinations be passed in to the API function that creates the BRANCH2. I think this works for most branching things.
CALLs may be more problematic in that the destination may want to mention CALL results to pass on. However, a CALL node could support add_normal_dest (or set_normal_dest) and similarly for the exceptional destination.
But when we know the number of destinations, and the instruction itself is not producing more results that the destination can refer to, it seems cleaner to build the dests first and pass them in.
---
**Issue #62** — Possible statistics trivially implementable in the refimpl (Kunshan Wang, 2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/62
These will provide more information about the characteristics of real-world Mu programs, and help implementers.
Dynamic recording:
* counting instructions executed (how many ADD, LOAD, BRANCH, ... executed)
* number of instructions between branching (median)
* function call count
Static analysis:
* register pressure
  - simultaneous live variables
---
**Issue #61** — Unary operators (Eliot Moss, 2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/61
As we work on the PyPy JIT to Mu, we have found, to our surprise, that Mu lacks certain unary operators that most languages have:
Integer negation: x -> -x
Bitwise complement/inversion: x -> ~x
Floating point negation: x -> -x
Is this intentional or an oversight? I will ask the students to code around it using a binary operation with a constant, but it feels yucky...
---
**Issue #60** — External linkage of uptr fields in the boot image (Kunshan Wang, 2018-06-28)
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/60
This issue addresses the need that, in addition to "external constants" (`.const @blah <@T> = EXTERN "blah"`), some **uptr fields in the Mu memory** need to be initialised to the addresses of external symbols, too. This pattern exists in regular C programs as well as in PyPy, which compiles to C.
However, there is a pure client-side solution which can relocate uptr fields without extending either the Mu IR or the API.
Another solution that adds only one API call can let the Mu boot image builder use the system linker/loader.
# Problem
In C, global variables can be initialised to constant values. The values can be literals, and can also be pointers to other global variables. In the latter case, the pointers are expressed in the form of `&symbol` where `symbol` is the name of the other global variable.
Consider the following C program:
```c
// Foo.c
struct Foo {
int v;
struct Bar *bar;
};
struct Bar {
double w;
};
struct Baz {
FILE *fp; /* from <stdio.h> */
};
struct Bar bar1 = { 3.14 };
struct Foo foo1 = {
42,
&bar1 // This field is initialised to the pointer to bar1 at link time.
};
struct Baz baz1 = {
&stdin // This field is initialised to the pointer to stdin at link time.
};
```
Both the `foo1.bar` and the `baz1.fp` fields are initialised to pointers. The former points to an object in the same compilation unit, while the latter points to a variable in the standard library.
However, the address of neither destination can be determined at compile time or link time.
1. Obviously, the address of `stdin` is determined only after `libc` is loaded.
2. The address of `bar1`, though only referenced within a .c file, is also indeterminate, because the current module (executable or .so) may be loaded at a different memory address on each run. Not until the program is loaded can the run-time linker figure out the absolute addresses and patch the `foo1.bar` field.
## Use in PyPy
PyPy uses some external libraries. One example is `libffi`.
`libffi` defines some global variables which the `libffi` users are supposed to use.
```c
// Excerpt from ffi.h
// libffi is an FFI implementation
// This struct describes a C data type, including both primitive types and structs.
// For structs, the *elements member points to an array of field types.
typedef struct _ffi_type
{
size_t size;
unsigned short alignment;
unsigned short type;
struct _ffi_type **elements;
} ffi_type;
// These are the descriptors of primitive C types.
FFI_EXTERN ffi_type ffi_type_void;
FFI_EXTERN ffi_type ffi_type_uint8;
FFI_EXTERN ffi_type ffi_type_sint8;
FFI_EXTERN ffi_type ffi_type_uint16;
FFI_EXTERN ffi_type ffi_type_sint16;
FFI_EXTERN ffi_type ffi_type_uint32;
FFI_EXTERN ffi_type ffi_type_sint32;
FFI_EXTERN ffi_type ffi_type_uint64;
FFI_EXTERN ffi_type ffi_type_sint64;
FFI_EXTERN ffi_type ffi_type_float;
FFI_EXTERN ffi_type ffi_type_double;
FFI_EXTERN ffi_type ffi_type_pointer;
```
`libffi` describes C data types using the `ffi_type` struct. Primitive types are pre-defined global variables. If the user wants to describe a custom C struct, they create an instance of `ffi_type` and fill in the fields.
```c
// Suppose we want to describe this struct:
struct Foo { int a; char b; void* c; };
// We define an ffi_type instance:
/// First make an array of field types
ffi_type *field_types[4] = { &ffi_type_sint32, &ffi_type_sint8, &ffi_type_pointer, NULL };
/// Then describe struct Foo itself.
ffi_type foo_type = {
0, // will be initialised by libffi
0, // will be initialised by libffi
FFI_TYPE_STRUCT, // it means "Foo is a struct"
  field_types // "Foo has these fields"
};
```
Keep in mind that these data structures are **raw C data structures**.
PyPy, as a high-level language implementation, will store the pointers to such structs into PyPy-level **heap objects** and use the pointers later. In RPython, heap objects in the "boot image" (the `pypy` executable) are global C variables. It looks like:
```c
struct pypy_path_to_module_SomePyPyObjectType object = {
GC_HEADER,
HASHCODE,
BLAHBLAH,
&foo_type // Untraced pointer to C global variable
};
```
At compile time (from RPython source code to C source code), the RPython toolchain describes objects in the boot image symbolically: structs are described field by field, and may contain pointers to other struct values. The toolchain also makes use of the fact that an RPython program eventually compiles to C. All such struct values become global variables in C, *no matter whether they are GC-ed heap objects or not* (this also means they are immortal). **This approach avoids the dynamic linking problem**, because C source code can still refer to other global variables symbolically, whether they are traced or not, and it off-loads the task of address resolution to the linker and the loader.
## Problem to handle this in Mu
Mu strictly distinguishes between traced references (`ref<T>`) and untraced pointers (`uptr<T>`). Mu treats `uptr<T>` values as raw integers and does not care about their destinations. This means that, in Mu, **untraced pointers are literally untraced**.
But the reason why the boot image builder works for references is that *it can use the GC to trace all references in all heap objects and global cells (which are still scanned) and find the transitive closure*. The GC can find all reference fields and record the references between objects. From this object-reference graph, the builder can generate *relocation entries* that allow the loader to fix the references between heap objects.
So the boot image builder has no power to "trace" untraced pointers and find out which memory location contains a raw pointer to which untraced memory region. In other words, the boot image builder cannot express the following C structure:
```c
struct Foo {
int v;
struct Bar *bar;
};
struct Bar {
double w;
};
struct Bar bar1 = { 3.14 };
struct Foo foo1 = {
42,
&bar1 // Cannot express this, because it is UNTRACED pointer.
};
```
The reason is that the boot image builder takes the **values** held inside objects as its input, not their symbolic initialisers. The boot image builder sees the **current** address of `bar1`, but the boot image is relocatable, and that address will no longer be valid after loading.
# Solution
## Solution 1: Redesign the PyPy-level library, or the translation process.
The reason why PyPy needs such C structs is that it needs to call C functions that take them. Currently these C structs are expressed as "constant struct values", which are initialised at compile time. If the PyPy-side library were written with the fact that "raw pointers are not preserved across boot image building" in mind, such structs would be created at run time rather than compile time, and there would be no need to preserve pointers from one struct to another.
Alternatively, all untraced structs can be translated to C source code, compiled by a conventional C compiler (such as GCC), and linked against the Mu program (PyPy in this case) dynamically. Given that the purpose of these structs is to interact with native C programs, having extra C code is not completely wrong, though not very elegant.
## Solution 2: Let the client reinvent the linker
This approach requires the client to record a list of (iref, symbol) pairs. Each pair means: "Before running the `main` function, please fill the pointer field at this iref with the address of this symbol." This is exactly what the system linker does. With the existing `.const @blah <@T> = EXTERN "blah"` external constant, the client only needs to generate an initialiser function containing a list of STORE instructions that update each iref. The list of irefs can be saved in a heap object which is held by a global cell and built into the boot image. As soon as this list has been consumed, it can be GC-ed (just nullify the only global cell that holds a reference to it).
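A minimal sketch of this "reinvented linker" on the client side, in plain C (all names here are hypothetical illustrations; real client code would patch irefs in the Mu memory via the API, and plain C pointers stand in for them):

```c
#include <dlfcn.h>
#include <stddef.h>

/* One entry per pointer field that must be re-initialised after loading. */
struct ptr_reloc { void **field; const char *symbol; };

/* Two untraced pointer fields living in "boot image" globals. */
static void *write_fn;
static void *malloc_fn;

static struct ptr_reloc relocs[] = {
    { &write_fn,  "write"  },
    { &malloc_fn, "malloc" },
};

/* The initialiser function: resolve each symbol and store its address,
 * playing the role of the STORE instructions described above. */
static int apply_ptr_relocs(void) {
    void *self = dlopen(NULL, RTLD_NOW);   /* global symbol scope of the process */
    if (self == NULL) return -1;
    for (size_t i = 0; i < sizeof relocs / sizeof relocs[0]; i++) {
        void *addr = dlsym(self, relocs[i].symbol);
        if (addr == NULL) return -1;
        *relocs[i].field = addr;           /* fill the pointer field */
    }
    return 0;
}
```

The client would run `apply_ptr_relocs` (or its Mu IR equivalent) once, before `main`, after which the reloc table can be discarded.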
## Solution 3: Let Mu support such relocation
Only one API function needs to be added:
```c
void (*add_ptr_reloc)(MuCtx *ctx, MuIRefValue field, const char *symbol);
```
`field` is an iref to a memory location (in a heap object or global cell) of `uptr<T>` or `ufuncptr<sig>` type. This function, when called, adds a relocation entry to the running micro VM. It has no effect on the currently running program. But when the client later orders the micro VM to build a boot image, the boot image will contain relocation entries that re-initialise the given field to the address of the given symbol.
Unlike Solution 2, this solution can make use of the system linker/loader, but it adds more burden to the micro VM. Given that the micro VM's boot image builder already has to handle relocation entries, this requirement looks reasonable.
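As a toy model of the proposed semantics (all types here are stand-ins, not the real muapi.h definitions; a real micro VM would turn each recorded entry into a native relocation in the boot image rather than keep it in a table):

```c
#include <stddef.h>

typedef struct MuCtx MuCtx;        /* opaque stand-in for the real MuCtx  */
typedef void *MuIRefValue;         /* stand-in for the real iref handle   */

struct ptr_reloc_entry { MuIRefValue field; const char *symbol; };

static struct ptr_reloc_entry entries[64];
static size_t n_entries;

/* The proposed API function, as a recording stub: calling it merely
 * registers the (field, symbol) pair for the boot image builder. */
static void add_ptr_reloc(MuCtx *ctx, MuIRefValue field, const char *symbol) {
    (void)ctx;                     /* unused in this sketch */
    entries[n_entries].field  = field;
    entries[n_entries].symbol = symbol;
    n_entries++;
}
```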
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/58 Signal handling for clients or user programs (2018-06-28, Kunshan Wang)
TL;DR: The concept of a **signal** is unix-specific and varies *a lot* among operating systems. However, many programs, especially unix command-line programs, depend on signals for basic user interaction, such as CTRL-C. This issue talks about what a Mu implementation can do for the client and the application programmers. The *simplest* thing a Mu implementation can do is *do nothing*. If the implementation wants to be responsible, there is much room for it to design its implementation-specific interface.
# Problem statement
Many UNIX command line programs use signals to interact with the user or other programs **in normal use cases**, such as
- Asynchronous signals:
- SIGINT: received when the user presses CTRL-C
- SIGWINCH: received when the size of the terminal window is changed
- Synchronous signals:
- SIGPIPE: received when attempting to write to a broken pipe. For example, in `cat foo.txt | sort | uniq`, if `sort` exits early, `cat` receives SIGPIPE the next time it tries to write to its stdout.
## Mu status quo
Mu is not designed for Unix only, and does not contain "signal" in its IR or API.
For most errors, Mu IR programs behave in a signal-agnostic way. For example:
- division by zero: In the [UDIV/UREM/SDIV/SREM](https://gitlab.anu.edu.au/mu/mu-spec/blob/master/instruction-set.rst#binary-operations) instructions, an "exceptional destination" must be specified. It is a basic block in the same function as the `*DIV`/`*REM` instruction, and it behaves like a jump. Implementations may use signals to implement such a jump, but may also generate a `CMP reg, 0`, `JE abnormal_dest`, `DIV` sequence.
- NULL pointer: Just like division by zero, the [LOAD/STORE](https://gitlab.anu.edu.au/mu/mu-spec/blob/master/instruction-set.rst#load-instruction) instructions also take an "exceptional destination", which is jumped to when NULL pointer error occurs.
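The `CMP`/`JE`/`DIV` lowering mentioned above can be sketched in C (a hypothetical lowering for illustration, not what any particular Mu implementation emits):

```c
/* A division routine that branches to an "exceptional destination"
 * instead of relying on a hardware trap (SIGFPE). A full lowering of
 * signed division would also guard the INT_MIN / -1 overflow case. */
int checked_sdiv(int a, int b, int *ok) {
    if (b == 0) {    /* CMP reg, 0; JE abnormal_dest */
        *ok = 0;     /* control transfers to the exceptional block */
        return 0;
    }
    *ok = 1;         /* DIV executes only on the normal path */
    return a / b;
}
```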
But Mu does not define what happens when CTRL-C is pressed.
## Signals, VMs and operating systems
[This article from IBM](http://public.dhe.ibm.com/software/dw/java/i-signalhandling-pdf.pdf) describes what happens when the user presses CTRL-C on different operating systems. Obviously the behaviours vary greatly.
Quote:
- On z/OS and AIX: A single thread, chosen by the operating system, receives the signal.
- Linux: All threads receive the signal, and the signal handler is invoked on each thread. Linux threads are just separate processes that share the same address space, so it is also possible for another application to raise a signal on a specific thread.
- Windows: A new thread is created for executing the signal handler. This thread dies once the signal handler is complete.
The JVM does not provide any official mechanism to handle signals, probably because the JVM is not UNIX-specific either. In fact, [Jython cannot handle CTRL-C like CPython does](http://bugs.jython.org/issue1270): Jython immediately terminates the Python program rather than raising a catchable Python exception.
The closest public Java API is `Runtime.addShutdownHook`. The hook will be called when the VM is shutting down, including when CTRL-C is pressed. But this mechanism cannot prevent the shutdown sequence from happening.
There is a proprietary interface: [sun.misc.Signal](http://www.docjar.com/docs/api/sun/misc/Signal.html). The documentation says the signal is handled in a new Java Thread running at MAX_PRIORITY. But `sun.misc.*` is private to the JVM implementation, and thus cannot be depended on.
# What can Mu implementations do?
For synchronous signals, such as `SIGPIPE`, the client should provide wrappers so that `write` does not raise `SIGPIPE`, but instead returns a special error code or throws a Mu exception. In this way, the client bypasses the potential signal. It looks like simply masking this signal, or configuring the file descriptor not to raise SIGPIPE, will do the job.
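A minimal sketch of such a wrapper (hypothetical client-side code, assuming POSIX): once `SIGPIPE` is ignored, `write` to a broken pipe fails with `errno == EPIPE` instead of delivering a process-killing signal, and the client can translate that into a Mu exception.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* A hypothetical client-side wrapper: ignore SIGPIPE so that a write()
 * to a broken pipe returns -1 with errno == EPIPE. The client can then
 * turn EPIPE into a special return value or a Mu exception. */
ssize_t write_nosigpipe(int fd, const void *buf, size_t len) {
    signal(SIGPIPE, SIG_IGN);  /* idempotent; a real client does this once at startup */
    return write(fd, buf, len);
}
```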
For asynchronous signals, a Mu implementation can:
1. Do nothing. SIGINT, ... will simply kill the process. Or,
2. Provide platform-specific interfaces to the client.
The first option is the easiest if we just want a running micro VM.
The second option is where the Mu implementation writers (such as [mu-impl-fast](https://gitlab.anu.edu.au/mu/mu-impl-fast)) can demonstrate their creativity. It will be like doing a mental exercise of "How will you design the Java API so that it is easy to write command line tools (such as `cat` and `grep`) in Java?"
One method I can think of is to provide a global event queue. The client should provide a thread that polls this queue in the background and takes appropriate actions (such as sending messages to other Mu threads, or interrupting them by setting shared variables, using watchpoints or futexes).
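A sketch of such a global event queue (all names are hypothetical; note that a real implementation must enqueue from a signal handler through an async-signal-safe mechanism such as the self-pipe trick, not pthread primitives):

```c
#include <pthread.h>

enum mu_event { MU_EVENT_NONE, MU_EVENT_SIGINT, MU_EVENT_SHUTDOWN };

#define QUEUE_CAP 16
static enum mu_event queue[QUEUE_CAP];
static int head, tail;   /* ring-buffer indices; overflow ignored in this sketch */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* The Mu implementation posts platform events here. */
static void post_event(enum mu_event ev) {
    pthread_mutex_lock(&lock);
    queue[tail++ % QUEUE_CAP] = ev;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

/* Blocking poll, used by the client's background thread. */
static enum mu_event wait_event(void) {
    pthread_mutex_lock(&lock);
    while (head == tail)
        pthread_cond_wait(&nonempty, &lock);
    enum mu_event ev = queue[head++ % QUEUE_CAP];
    pthread_mutex_unlock(&lock);
    return ev;
}

static volatile int interrupted;   /* a shared flag Mu threads could check */

/* The client's polling thread: translate events into Mu-level actions. */
static void *poller(void *arg) {
    (void)arg;
    for (;;) {
        enum mu_event ev = wait_event();
        if (ev == MU_EVENT_SIGINT)   interrupted = 1;
        if (ev == MU_EVENT_SHUTDOWN) return NULL;
    }
}
```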
If we find one interface particularly favourable, we may consider *standardising* the extensions for *all Mu micro VMs on UNIX*.
# Higher-level view
"Signal" is a 1980s-1990s UNIX idea. Before 1995, POSIX did not have "threads", so there was one thread per process. Signals were sent to processes, and then handled by **the only thread** in the process. At that time, signals were probably a very straightforward message-passing mechanism. The **stack layout is exposed** to the C programmer via the parameters of the signal handler, probably because at that time, those who handled signals were system experts.
But things changed when the **multi-threading** model came into being and **VMs made stacks opaque**. This makes us ask: **what really are the endpoints of signal communication?** It looks like the old signal model is no longer perfectly suited to the multi-threaded world, but the *command-line interface was designed in those old days* and has not fully adapted to the new world yet. I am looking forward to seeing how future programming languages and VMs change the way people write command-line programs.
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/57 Ahead-of-time Compiling & Boot-image (2018-06-28, Kunshan Wang)
This issue is about supporting ahead-of-time compiling and building boot images.
Mu (including both the IR and the API) is designed for JIT compiling. Very little is specified about the ahead-of-time compiling scenario. However, real-world language VMs (such as the `pypy.exe`, `python.exe` or `java.exe` executable images) are executable images and should be in the system-specific native image format (such as ELF). The image should contain the micro VM and the client. Preferably it should also contain **AoT-compiled core libraries** (such as built-in object types, `java.lang.Object`) and, in some cases (such as PyPy), the **AoT-compiled interpreter and metacircular client**.
This issue will discuss the following topics:
- Dynamic linking and loading (linking at start-up time by the system linker)
- Symbol resolution (determine the addresses of symbols (such as `write`))
- This will revive an old idea: "load-time constants" (https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/47)
- Proposed [load-time constants](https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/57#note_343): `.const @Xxxx <@T> = EXTERN "write"`
- Library dependencies (which `.so` should be loaded?)
- Each ELF or Mach-O file can specify its library dependencies. But this part is extremely platform-specific.
- Could [add a new top-level](https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/57#note_344), but my hypothetical scenarios ([1](https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/57#note_345), [2](https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/57#note_346)) suggest external linkage should be specified in a separate linking step, like: `ld impl_supplied_entry_point.o bootimage.o -l external-lib -o executable`.
- Possible extensions to the API to address boot-image building
- What should be in the boot image?
- This is very client-specific. It's determined by how the client is implemented, metacircular or not.
- How to determine what is in a boot image?
- Probably using a whitelist. The client can always record all necessary things.
I will consider the following scenarios:
1. VM with non-metacircular client (No active project. My obsolete [js-mu](https://gitlab.anu.edu.au/mu/obsolete-js-mu) was an example).
2. AoT-compiling a Mu IR program into the boot image (e.g. the RPySOM interpreter as an RPython program).
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/56 ASM-style IR builder (2018-06-28, John Zhang)
*Created by: wks*
This issue discusses a higher-level abstraction over the IR builder API. It will allow the client to construct a Mu IR CFG in a stateful style. The stateful builder will hold a pointer to the "current basic block" at any time. New instructions are implicitly appended to the end of the current basic block. Such an interface can also emulate fall-through-style ASM instructions, such as JL, JE, JNE, etc.
It is a layer above the API. The muapi.h should still be kept minimal.
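The stateful style described above can be sketched in C (all names here are hypothetical; a real LibMu layer would wrap the muapi.h IR-building calls rather than store instruction text):

```c
#include <stdlib.h>

struct inst { const char *text; struct inst *next; };
struct basic_block { struct inst *head; struct inst **tail; };

/* The builder holds a pointer to the "current basic block". */
typedef struct { struct basic_block *cur; } Builder;

static struct basic_block *new_block(void) {
    struct basic_block *bb = malloc(sizeof *bb);  /* allocation unchecked: sketch */
    bb->head = NULL;
    bb->tail = &bb->head;
    return bb;
}

/* Point the builder at a basic block; later emits go there implicitly. */
static void position_at_end(Builder *b, struct basic_block *bb) { b->cur = bb; }

/* Append an instruction to the end of the current basic block, ASM-style. */
static void emit(Builder *b, const char *text) {
    struct inst *i = malloc(sizeof *i);
    i->text = text;
    i->next = NULL;
    *b->cur->tail = i;
    b->cur->tail = &i->next;
}
```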
There is a problem in implementation. Such a builder is easy to build for the SSA form, but since we have switched to the "goto-with-values" form, more book-keeping needs to be done in the client. Probably we still need a soup of objects in the client, do liveness analysis, and convert SSA to goto-with-values.

https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/55 Mu, LibMu and LibMuXxx: Layered API for the client (2018-06-28, John Zhang)
*Created by: wks*
As we discussed today, the structure of Mu and its client-facing interfaces should be like this picture:
![mu-libmu](https://cloud.githubusercontent.com/assets/370317/15414816/6b55849c-1e81-11e6-839a-c6cac254845f.png)
(Black text represents the component, and red text represents the programming language they are implemented in.)
In the inner circle is the micro VM. It can be implemented in any language, but it provides a C API, and *both the micro VM and the C API (i.e. the inner black circle boundary) need to be verified*. Outside the outer circle is the client. The ring in between is a library which we call "LibMu". In theory, the client, the LibMu and the Mu micro VM can be implemented in different languages.
When LibMu (or some language-specific LibMu wrappers, such as LibMu-Z for some hypothetical language Z, as shown in the picture as the client-facing semi-circle) talks with the client, **it should present a nice client-friendly interface for the client to construct Mu IR bundles**. Such interface should provide appropriate data structures, data types and constructors or even high-level transformers for the convenience of the client. *This layer does not need to be minimal*.
When LibMu talks with the Mu micro VM, **the interface must be minimal and verifiable**. We agreed (#50) that it is difficult to verify a parser, which rules out "sending text or binary blobs into the micro VM (across the black circle)". The C API of the micro VM provides a function-call-style API (also discussed in #50, but it needs to be revised) so that LibMu constructs a bundle in Mu by making a sequence of function calls, where each call constructs a Mu IR node (such as an instruction, basic block, type or constant).
Some programming languages (such as Python, Haskell, ...) may have relatively high overhead when calling C foreign functions, compared to direct C-to-C calls. If the client is written in such a language (language Y in the picture), it will be slow to construct the Mu-side AST by frequently calling through the C interface. We consider this a problem of the implementation of language Y. In that case, the client should have some part of itself written in C (the inner micro VM-facing semi-circle in LibMu) so that language Y can encode the Mu IR bundle and send it to the C component (this interface does not need to be verified), and the C component constructs the Mu IR in Mu via the C API (this interface is verified).
https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/54 Flags in arithmetic/logical operations (2018-06-28, John Zhang)
*Created by: wks*
This issue will give the client access to the flags set by arithmetic or logical operations, such as overflow, carry, zero, negative, ... This issue should only affect the BinOp instructions (ADD, SUB, MUL, ...).
The design should consider:
- [ ] scalar integral types
- [ ] scalar floating point types
- [ ] vector types

https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/53 Access to native thread-local memory (2018-06-28, John Zhang)
*Created by: wks*
This issue is about accessing thread-local memory/variables defined in native programs (C). One important application is the `errno` variable in C/C++.
This is only slightly related to #52, which introduces thread-local storage to Mu itself. There is no intention to force Mu's thread-local storage to use the same mechanism as native programs.
Thread-local storage in native programs is highly machine/OS/ABI-dependent. The register used to point to thread-local buffers varies, and not all platforms may have such a register.
One possible workaround is to depend on helper functions written in C or assembly.
But if we want Mu to integrate more deeply with native programs (i.e. do things more efficiently), we can define more instructions (probably "common instructions") to give Mu more capabilities, such as getting/setting the value of the FS register. But any such instructions would likely be platform-dependent and probably optional on unsuitable platforms.

https://gitlab.anu.edu.au/mu/general-issue-tracker/-/issues/52 Thread-local storage (2018-06-28, John Zhang)
*Created by: wks*
Add **thread-local** memory to Mu, in addition to the existing *heap*, *stack* and *global* memory.
[Proposal 1](https://github.com/microvm/microvm-meta/issues/52#issuecomment-213364592): the C-like approach, has known problems
[Proposal 2](https://github.com/microvm/microvm-meta/issues/52#issuecomment-213375674) (preferred): a more aggressive design