mu-impl-fast issueshttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues2018-09-10T06:31:05+10:00https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/51Memory order in store/load API calls2018-09-10T06:31:05+10:00Yi LinMemory order in store/load API callsFor API calls `load()`/`store()`, they take memory order as an argument. However I am uncertain whether we need to do anything special for different memory orders. Current implementation just does a plain load/store in `vm.handle_load()`...For API calls `load()`/`store()`, they take memory order as an argument. However I am uncertain whether we need to do anything special for different memory orders. Current implementation just does a plain load/store in `vm.handle_load()` and `vm.handle_store()`.https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/86Issue with rodal on macOS when dynamically linking with boot image2017-09-22T11:10:22+10:00Yi LinIssue with rodal on macOS when dynamically linking with boot imageI am not sure if I fully understood the problem. Isaac @igariano01 may explain more on this.
Generally the issue is that rodal needs to redefine `malloc()` and `free()` that Rust uses to manage heap memory. Those are weak symbols, so r...I am not sure if I fully understood the problem. Isaac @igariano01 may explain more on this.
Generally the issue is that rodal needs to redefine `malloc()` and `free()` that Rust uses to manage heap memory. Those are weak symbols, so rodal redefines them. Thus if an object is dumped by rodal, it cannot be freed by the default `free()`, instead it is freed by rodal.
However on macOS, Rust uses a different allocator when generating dynamic libraries instead of using `malloc()`/`free()`, and those cannot be redefined. Thus when we link to mu as a dynamic library, rodal cannot redefine the `free()` function. However, there seems no issue when linking to mu a as static library.
One solution that seems very hacky is to disable dynamic link with mu in boot image generation. Branch https://gitlab.anu.edu.au/mu/mu-impl-fast/tree/static-link-for-macos has the fix.
The other solution is to use Rust's nightly feature to provide a custom allocator (rodal will implement a custom allocator along with deallocation). Branch https://gitlab.anu.edu.au/mu/mu-impl-fast/tree/global_allocator has a fix. This is a more elegant approach. But my only concern is that this requires us to use nightly Rust. I am concerned the language implementation itself is more buggy in a nightly version, and we will meet problems when debugging (e.g. whether it is our bug, or their bug). And if we switch to nightly Rust, it is very likely that we will start using more nightly features, and we will not be able to go back to stable Rust in the future.https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/19Register allocation validator2017-06-23T18:15:34+10:00Yi LinRegister allocation validatorThis issue describes the algorithm for the register allocation validation in Zebu.
Eliot proposed the approach. I implemented it in Zebu, and put the pseudo code here. I am not exactly confident if I have captured all the ideas from ...This issue describes the algorithm for the register allocation validation in Zebu.
Eliot proposed the approach. I implemented it in Zebu, and put the pseudo code here. I am not exactly confident if I have captured all the ideas from Eliot, and implemented it correctly. This issue serves as a record of how the register allocation validation works. Any discussion is welcome, and it may help future work.
<br /><br />
Input: register assignment `ASSIGNED(reg) -> mreg`, and code. <br/>
Output: `PASS` if `ASSIGNED` is correct; or `ERROR` if it is not.
<br /><br />
Assumptions:
* we have correct liveness information.
* spill code (`SPILL_LOAD`/`SPILL_STORE`) is already inserted, and spilled virtual registers are replaced with short-lived virtual registers, and we know the mapping.
<br /><br />
We are trying to iterate through the code focusing on registers by maintaining an `alive set`, each entry of which is a 3-element tuple `ENTRY(reg, machine_reg, mem)`, meaning `reg` is alive (contains a valid value) at the moment in `machine_reg` and spilled location `mem`.
* `(-, machine_reg, -)` means a machine register is alive, but no virtual register and no spill location is associated.
* `(*, machine_reg, *)` means matching entries that have `machine_reg`
* `(reg?, machine_reg, *)` means matching entries that have `machine_reg`, and naming the first element as `reg` (it may exist or not)
<br /><br />
We have a few primitives about alive set
* `x <- ALIVE[block]` loads alive set for `block` and uses alias `x` for it.
* `x.ADD_ALIVE(r, m, mem)`adds `(r, m, mem)` to alive set `x`.
* `ALIVE[block] <- x` saves alive set `x` with `block`.
* `x.ASSERT_ALIVE(r, m, mem)` tries to match pattern `(r, m, mem)`. If no entry is found, ends with `ERROR`.
* `x.KILL_ALIVE(r, m, mem)` tries to match pattern `(r, m, mem)`. If any entry is found, delete the entries.
* `x.EXIST_ALIVE(r, m, mem)` tries to match pattern `(r, m, mem)`. If any entry is found, it is true.
* `x.TRIM(list)`: for any `r`, `m` in `list`, preserve them in `x`, delete other entries in `x`
* `x.INTERSECT(y)`: if `(r, m, *)` or `(-, m, *)` appear in both `x` and `y`, preserve it in `x`; otherwise delete it from `x`.
* `y <- x.COPY`: copy `x` into `y`
<br /><br />
Pseudo code:
```
// set the initial alive set for a function start
init.ADD_ALIVE(-, stack pointer, -)
init.ADD_ALIVE(-, frame pointer, -)
init.ADD_ALIVE(-, program counter, -)
init.ADD_ALIVE(-, callee saved registers, -)
init.ADD_ALIVE(-, used argument registers, -)
push entry block to work queue
ALIVE[entry block] <- init
while work queue is not empty {
pop work queue -> block
alive <- ALIVE[block]
for each inst in block {
// (1) check spill
// we only care about the variable before spilling
if inst is SPILL_LOAD(mem) -> t for var v {
alive.ASSERT_ALIVE(v, *, mem)
} else if inst is SPILL_STORE(t) -> mem for var v {
alive.ADD_ALIVE(v, -, mem)
}
// (2) check use
// when we use a register, it needs to contain a valid value
for reg in inst.virtual_register_uses() {
mreg = ASSIGNED(reg)
alive.ASSERT_ALIVE(reg, mreg, *)
}
for mreg in inst.real_register_uses() {
alive.ASSERT_ALIVE(*, mreg, *)
}
// (3) kill died regs
// when a register dies, we remove its entry from alive set
for reg in inst.virtual_register_dies() {
alive.KILL_ALIVE(reg, *, *)
}
for mreg in inst.real_register_dies() {
alive.KILL_ALIVE(*, mreg, *)
}
// (4) check and add defines
for reg in inst.virtual_register_defines() {
if reg NOT in inst.liveout() {
// when a register is defined, but doesnt live out of this instruction
// we kill its previous values from alive set (by deleting the entries)
alive.KILL_ALIVE(reg, *, *)
} else {
mreg = ASSIGNED(reg)
if NOT alive.EXIST_ALIVE(*, mreg, *) {
// mreg doesnt hold any value at the moment, we simply add an entry
alive,ADD_ALIVE(reg, mreg, -)
} else {
// we need to ensure assigning mreg will not destroy useful values
for (x?, mreg, *) in ALIVE {
if x NOT exist {
x <- reg
} else {
// we have x in mreg, and we want reg in mreg as well
if x == reg {
// overwrite value (safe)
} else {
if inst is MOVE {
// possible coalescing (safe)
} else {
// we are destroying the value of x
// and x is alive at the moment (otherwise it is killed already in step 3)
ERROR
}
}
}
alive.ADD_ALIVE(reg, mreg, -)
}
}
}
}
for mreg in inst.real_register_defines() {
if mreg NOT in inst.liveout() {
// when a register is defined, but doesnt live out of this instruction
// we kill its previous values from alive set (by deleting the entries)
alive.KILL_ALIVE(*, mreg, *)
} else {
if NOT alive.EXIST_ALIVE(*, mreg, *) {
// mreg doesnt hold any value at the moment, we simply add an entry
alive.ADD_ALIVE(-, mreg, -)
} else {
for (reg?, mreg, -) in ALIVE {
if reg NOT exist {
// we have value in mreg, but it doesnt hold value of any variables
// overwrite the value is safe
} else {
// we are holding reg in mreg, defining mreg will destroy the value of reg
ERROR
}
}
}
}
}
// finishing the block, we only preserve what are alive at the end of the block
alive.TRIM(block.liveout())
if block is NOT visited {
ALIVE[block] <- alive
push_successors = true
} else {
alive_before <- ALIVE[block]
alive.INTERSECT(alive_before)
if alive <> alive_before {
push_successors = true
}
ALIVE[block] <- alive
}
mark block as visited
if push_successors {
if block has 1 successor {
push successor to work queue
ALIVE[successor] <- alive
} else block has 2 successors {
push successor1 to work queue
alive1 <- alive.COPY
ALIVE[successor1] <- alive1.TRIM(successor1.livein())
push successor2 to work queue
alive2 <- alive.COPY
ALIVE[successor2] <- alive2.TRIM(successor2,livein())
}
}
}
}
```https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/13In-place code generation2017-03-01T17:40:17+11:00Yi LinIn-place code generationThe idea is to generate machine code directly without going through some representation of the machine code. The motivation is to save allocation/spaces for the machine code representation. Current x86_64 assembly backend tests the idea ...The idea is to generate machine code directly without going through some representation of the machine code. The motivation is to save allocation/spaces for the machine code representation. Current x86_64 assembly backend tests the idea of in-place code generation (though it is completely not necessary for an assembly backend to do this).
Issues:
* Too many eliminated moves (that are turned into `nop`), which results in bad performance. One solution is to do compaction on the code in the end (slide/copy the code to remove 'holes').
* In-place code generation introduces extra memory overhead, as we need to save locations of registers.
*A note for x86_64 asm backend:
Metadata for instruction can be more optimised. For example, currently I am using `LinkedHashMap<MuID, Vec<ASMLocation>>` to store information on used and defined registers. Both `LinkedHashMap` and `Vec` is expensive. We can use fixed length array, or simply `use1`, `use2`, `use3`...*
* It makes machine code level optimisations and transformations hard to implement.
Questions:
* Is it possible that we can emit machine code (binary) before we know the final code? For example, for `jmp`, it may turn into different machine code based on how big the offset is. However any assembler may need to deal with this question.
We need to discuss this more about this before starting implementing JIT.