mu-impl-fast issues
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues
2017-06-07T00:22:48+10:00

Calling function defined in another bundle does not work
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/32
2017-06-07T00:22:48+10:00, Isaac Gariano (isaac@ecs.vuw.ac.nz)

I am trying to define a function in one bundle and call it from another, and I'm getting an error.
Here's an example that will reproduce the problem:
File argc_exit.uir:
```
.funcsig @exit_sig = (int<32>) -> ()
.funcdef @argc.exit <@exit_sig>
{
%entry(<int<32>>%arg):
CCALL #DEFAULT <ufuncptr<@exit_sig> @exit_sig> <ufuncptr<@exit_sig>>EXTERN "exit"(%arg)
RET
}
```
file argc_inline.uir:
```
.typedef %char = int<8>
.funcdef @my_main <(int<32> uptr<uptr<%char>>)->(int<32>)> VERSION @my_main_v1
{
%entry(<int<32>>%argc <uptr<uptr<%char>>>%argv):
CALL <(int<32>)->()> @argc.exit (%argc)
RET <int<32>>1
}
```
Then, using my mu-tool-compiler:
`./muc -r -f my_main argc_exit.uir argc_inline.uir emit/argc`
(use -c if you want to see the API calls it uses), I get:
`thread '<unnamed>' panicked at 'Operand 1013' is neither a local var or a global var', src/vm/api/api_impl/muirbuilder.rs:1290 stack backtrace:`
(the symbol with ID 1013 is @argc.exit).
I tracked the error down, and it appears to be coming from the function `get_treenode` in `src/vm/api/api_impl/muirbuilder.rs`.
My guess is the API implementation only looks for things defined in the current bundle and not other bundles.
However from my understanding of the Mu-spec you should be able to refer to entities declared in previously loaded bundles.
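
To make the expected behaviour concrete, here is a minimal sketch of a two-level lookup: current bundle first, then anything loaded earlier. It is not Zebu's actual code. `MuID`, `Resolved`, `resolve` and the two maps are hypothetical names invented for this illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a Mu entity ID (the panic above mentions ID 1013).
type MuID = usize;

/// Hypothetical result of a lookup: where the entity was found.
#[derive(Debug, Clone, PartialEq)]
enum Resolved {
    /// Defined in the bundle that is currently being built.
    CurrentBundle(String),
    /// Defined in a bundle that was loaded earlier.
    PreviouslyLoaded(String),
}

/// Look in the current bundle first, then fall back to earlier bundles.
fn resolve(
    id: MuID,
    current_bundle: &HashMap<MuID, String>,
    loaded: &HashMap<MuID, String>,
) -> Option<Resolved> {
    current_bundle
        .get(&id)
        .map(|name| Resolved::CurrentBundle(name.clone()))
        .or_else(|| {
            loaded
                .get(&id)
                .map(|name| Resolved::PreviouslyLoaded(name.clone()))
        })
}
```

With this lookup order, a call to `@argc.exit` from the second bundle would resolve through the fallback instead of failing.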
A workaround is to combine both files into the same bundle, for example with `./muc -r -f my_main <(cat argc_exit.uir && cat argc_inline.uir) emit/argc`.
Assignee: Kunshan Wang

[x86_64] floating point/int128 conversion
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/31
2017-06-13T13:38:11+10:00, Yi Lin

Unimplemented for now.
Assignee: Yi Lin

Single exit block for Mu function
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/30
2017-07-20T15:39:12+10:00, Yi Lin

We should have a pass at the IR level that rewrites the code to allow only one exit block, so that the epilogue appears only once per function.
Currently the compiler generates an epilogue for each RET instruction.

Implement SWITCH with switch table
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/29
2017-06-06T15:10:22+10:00, Yi Lin

Currently the compiler generates cascading conditional branches for the SWITCH instruction. We should consider using a switch table if there are many case arms.

Name mangling in Zebu, unnamed MuEntity, global/local names
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/28
2017-07-19T15:05:23+10:00, Yi Lin

Currently Zebu makes assumptions about names, and mangles names in a way that is inconsistent and confusing. It may also be inconsistent with the spec.
These are some notes (I need to rethink these):
* Zebu assumes local names and mangles them (if needed) in its own way. The spec requires that all names used via the API be global names (no mangling is needed).
* Zebu checks and transforms each name so that it does not include special characters and can be safely used in assembly.
* Zebu assumes that some entities, such as `Block` and `MuFunction`, have a name. These assumptions may not be consistent with the spec (the spec requires top-level entities to have names). This needs further checking.
* A name that starts with a number is valid for a Mu entity; however, such a name may not be valid for direct use in assembly.
Assignee: Yi Lin

Conditional branch adjustment after trace generation
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/27
2017-07-20T15:41:22+10:00, Yi Lin

Currently we are doing this adjustment during instruction selection (target-dependent code). It is possible to do this in a cleaner and target-independent way.
We need another pass to adjust conditional branches after trace generation. Ideally, before instruction selection, a conditional branch should always be followed by its false label. The adjustment should follow these rules:
* any conditional branch followed by its false label stays unchanged
* for a conditional branch followed by its true label, we swap the true and false labels and negate the condition
* for a conditional branch followed by neither label, we invent a new false label and rewrite the conditional branch so that the new conditional branch is followed by the new false label.

Flags for int<128>
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/25
2017-07-12T10:49:35+10:00, Isaac Gariano (isaac@ecs.vuw.ac.nz)

I have worked out which instructions should be emitted to compute flags for binary operations on AArch64; a similar method should hopefully work on x86-64. I have included my notes here.
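
As a quick sanity check of the ADD rules in the notes that follow, here is a small Rust model of the zero, negative, carry and overflow flags, computed from the 64-bit halves only. It is a sketch written for this illustration, not code from Zebu. The names `Flags` and `add128_flags` are made up, and Rust's `u128` stands in for the register pair.

```rust
/// Flag values produced by a 128-bit ADD (illustrative type, not from Zebu).
#[derive(Debug, PartialEq)]
struct Flags {
    z: bool,
    n: bool,
    c: bool,
    v: bool,
}

/// Model of `D, Z, N, C, V = ADD S1, S2` using only the 64-bit halves.
fn add128_flags(s1: u128, s2: u128) -> (u128, Flags) {
    // The add itself; `carry` models the carry flag after the add-with-carry.
    let (d, carry) = s1.overflowing_add(s2);
    let (d_h, d_l) = ((d >> 64) as u64, d as u64);
    let s1_h = (s1 >> 64) as u64;
    let s2_h = (s2 >> 64) as u64;

    let z = (d_h | d_l) == 0; // ORR, CMP #0, CSET EQ
    let n = (d_h >> 63) == 1; // LSR N, D.h, 63
    let v_tmp = d_h ^ s1_h; // EOR: bit 63 set iff D and S1 differ in sign
    let t1 = s1_h ^ s2_h; // EOR: bit 63 set iff S1 and S2 differ in sign
    let v = ((v_tmp & !t1) >> 63) == 1; // BIC, TST bit 63, CSET NE

    (d, Flags { z, n, c: carry, v })
}
```

The overflow bit computed this way agrees with `i128::overflowing_add` on the cases tried, including `i128::MAX + 1` and `i128::MIN + (-1)`.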
```
Notes on notation:
The notation <exp> indicates exp is a 128-bit value
[exp] indicates exp is a 64-bit value
<[exp]> indicates exp is a 192-bit value
<<exp>> indicates exp is a 256-bit value
exp.h indicates the higher 64 bits of the expression and exp.l the lower 64 bits
(each should occupy its own register)
Xi = 2^(64*i) (i.e. it is i*64-bits worth of zeros with a one at the front)
Ti is a temporary register (64-bits)
Note:
Some optimisations may be able to be performed if an argument to the instruction is an immediate
Zero and Negative Flags:
D, Z, N = BINOP S1, S2
ORR Z <- D.h, D.l // Z = D.h | D.l
CMP Z, #0 // Z <=> 0
CSET Z, EQ // Z = (Z == 0) ? 1 : 0
LSR N, D.h, 63 // N = (D.h >> 63) (so that N[0] = D.h[63])
Overflow and Carry for Add/Sub:
D, C, V = ADD/SUB S1, S2
// Compute the addition/subtraction normally (but ensure the add-with-carry/subtract-with-carry sets the carry flag)
CSET C, CS // Set to 1 if the carry flag is set
// V[63] = 1 IFF D and S1 have different signs
EOR V <- D.h, S1.h // V = D.h ^ S1.h
For ADD:
// T1[63] = 1 IFF S1 and S2 have different signs
EOR T1 <- S1.h, S2.h // T1 = S1.h ^ S2.h
For SUB:
// T1[63] = 1 IFF S1 and -S2 have different signs
EON T1 <- S1.h, S2.h // T1 = S1.h ^ (~S2.h)
// V[63] = 1 iff D and S1 have different signs
// and S1 and S2 (or -S2) have the same sign
BIC V <- V, T1 // V = V & ~T1
// Check tmp_status[n-1]
TST V, 1 << 63 // test bit V[63]
CSET V, NE // V = (V[63] != 0) ? 1 : 0
Overflow for Subtraction: (Note: this is essentially the same method I used for arithmetic less than 32 bits)
D, V = SUB S1, S2
// Compute the subtraction normally
// V[63] = 1 IFF D and S1 have different signs
EOR V <- D.h, S1.h // V = D.h ^ S1.h
// V[63] = 1 iff D and S1 have different signs
// and S1 and -S2 have the same sign
BIC V <- V, T // V = V & ~T
// Check tmp_status[n-1]
TST V, 1 << 63 // test bit V[63]
CSET V, NE // V = (V[63] != 0) ? 1 : 0
------------
Overflow and carry for Multiply:
D, C, V = MUL S1, S2
---------------------- (this is just my working) ----------
<S1.h*X1+S1.l> * <S2.h*X1+S2.l> =
<<S1.h*S2.h*X2>> + <[S1.l*S2.h*X1]> + <[S1.h*X1*S2.l]> + <S1.l*S2.l>
Discard everything that occupies the lower 128 bits:
<<S1.h*S2.h*X2>> + <[S1.l*S2.h*X1]> + <[S1.h*X1*S2.l]>
-----------------------------
<S1.h*S2.h>*X2 +
<S1.l*S2.h>*X1 +
<S1.h*S2.l>*X1
--------------------------------
<[S1.h*S2.h].h*X1+[S1.h*S2.h].l*X1>*X2 +
<[S1.l*S2.h].h*X1 + [S1.l*S2.h].l>*X1
<[S1.h*S2.l].h*X1 + [S1.h*S2.l].l>*X1
--------------------------------------------------
[S1.h*S2.h].h*X3 + [S1.h*S2.h].l*X3 +
[S1.l*S2.h].h*X2 + [S1.l*S2.h].l*X1 +
[S1.h*S2.l].h*X2 + [S1.h*S2.l].l*X1
----------------------------------------------------
Discard all factors of X1 (as they will only contribute to the lower 128 bits of the result)
[[S1.h*S2.h].h+ [S1.h*S2.h].l]*X3 +
[[S1.l*S2.h].h + [S1.h*S2.l].h]*X2
So to get the overflow flag let:
D.h = [[S1.h*S2.h].h+ [S1.h*S2.h].l]
D.l = [[S1.l*S2.h].h + [S1.h*S2.l].h]
Then set it to '1' iff (D.h != 0) || (D.l != 0)
------------------------------------------
SO EMIT THE FOLLOWING CODE:
UMULH D.l <- S1.l, S2.h // D.l = [S1.l*S2.h].h
UMULH D.h <- S1.h, S2.l // D.h = [S1.h*S2.l].h
ADD D.l <- D.h, D.l // D.l += D.h
UMULH D.h <- S1.h, S2.h // D.h = [S1.h*S2.h].h
MADD D.h <- S1.h, S2.h, D.h // D.h += [S1.h*S2.h].l
CMP D.l, #0 // D.l <=> 0
CSET C <- NE // C = (D.l != 0) ? 1 : 0
CMP D.h, #0 // D.h <=> 0
CSINC C <- C, XZR, EQ // C = (D.h == 0) ? C : (0+1)
MOV V <- C // V = C (they should be the same)
// Now get the lower 128-bits of the product (and store it in D.h, D.l)
```

Mu IR Type checking
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/24
2017-11-21T13:54:18+11:00, Isaac Gariano (isaac@ecs.vuw.ac.nz)

The Mu IR compiler currently compiles some invalid Mu code. Specifically, I noticed that the following invalid code compiled successfully (it was used in some of the tests):
* a SHL/LSHR/ASHR instruction where the type of the second argument is not the same as that of the first (in the test, the first argument was an int<64> and the second was an int<8>) (this code was generated in test_shl and test_lshr).
* passing an int<64> as an argument to a C function expecting an int<32> (this was generated by test_pass_1arg_by_stack and test_pass_2arg_by_stack).
In addition, the compiler doesn't seem to check whether an SSA variable has been assigned before it is used.

[x86_64] shift operations may return wrong result if 2nd operand is larger than int8
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/23
2017-07-12T10:49:36+10:00, Yi Lin

Shift instructions in Mu require two operands of the same size, e.g. `Shl <int64> a b`, in which `a` and `b` are both `int64`.
However, `shl`, `shr` and `sar` on x86_64 take their second operand either in the `CL` register (8 bits) or as an 8-bit immediate. Currently the instruction selector simply moves the lower 8 bits of `b` into `CL`, which may produce an incorrect result.

[x86_64] Status flags undefined for mul/div/idiv
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/22
2017-06-29T12:53:30+10:00, Yi Lin

The following table summarizes how Mu integer binops are mapped to x86_64 instructions, and how those instructions affect the status flags.
| Mu IR | X86_64 Inst | #N (signed) | #Z (zero) | #C (carry) | #V (overflow) |
|:-----: |:-----------: |:-----------: |:---------: |:----------: |:-------------: |
| ADD | add | ✓ | ✓ | ✓ | ✓ |
| SUB | sub | ✓ | ✓ | ✓ | ✓ |
| AND | and | ✓ | ✓ | - | - |
| OR | or | ✓ | ✓ | - | - |
| XOR | xor | ✓ | ✓ | - | - |
| MUL | mul | ✗ | ✗ | ✓ | ✓ |
| UDIV | div | ✗ | ✗ | - | - |
| SDIV | idiv | ✗ | ✗ | - | - |
| UREM | div | ✗ | ✗ | - | - |
| SREM | idiv | ✗ | ✗ | - | - |
| SHL | shl | ✓ | ✓ | - | - |
| LSHR | shr | ✓ | ✓ | - | - |
| ASHR | sar | ✓ | ✓ | - | - |
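
For the rows marked ✗ in the #N and #Z columns, those two flags can be recomputed from the result value itself. The sketch below models that in Rust for a 64-bit `MUL`; `nz_flags` and `mul_nz` are names made up for this illustration, and it models flag values rather than emitting x86_64 code.

```rust
/// Derive #N (sign bit) and #Z (result is zero) from a 64-bit result.
fn nz_flags(result: u64) -> (bool, bool) {
    ((result >> 63) == 1, result == 0)
}

/// 64-bit MUL (low half of the product) with explicitly computed #N and #Z.
fn mul_nz(a: u64, b: u64) -> (u64, bool, bool) {
    let d = a.wrapping_mul(b);
    let (n, z) = nz_flags(d);
    (d, n, z)
}
```

On x86_64 one way to get the same effect is a `test` of the result register against itself, which sets SF and ZF from the value. Since `test` also clears CF and OF, the carry and overflow produced by `mul` would have to be captured before it.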
`mul`, `div` and `idiv` leave the sign flag (#N) and the zero flag (#Z) undefined. We will need to generate extra code to check and set those flags.

GC related C functions return inaccurate result
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/21
2017-11-21T13:46:42+11:00, Yi Lin

`gc/src/heap/gc/clib_x64.c` contains C functions for the GC, such as `get_registers()`, which uses inline assembly to save the values of all general-purpose registers into an array. However, C compilers may generate code that changes the registers before they are saved.
We may want to rewrite the function in assembly instead of C. I also believe it is reasonable to eliminate all C functions from the code base and replace them with assembly (all the C functions are pretty simple). Ideally we want only Rust code and assembly in the code base.

[x86_64] int<1> arithmetics return wrong result
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/20
2017-05-01T15:36:35+10:00, Yi Lin

Currently Zebu treats int<1> the same as int<8>. This is fine if the client only uses int<1> as a boolean. If the client uses int<1> arithmetic operations, Zebu returns wrong results.
We should either explicitly forbid int<1> arithmetic or implement it.

Register allocation validator
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/19
2017-06-23T18:15:34+10:00, Yi Lin

This issue describes the algorithm for register allocation validation in Zebu.
Eliot proposed the approach. I implemented it in Zebu and put the pseudo code here. I am not entirely confident that I have captured all of Eliot's ideas and implemented them correctly. This issue serves as a record of how the register allocation validation works. Any discussion is welcome, and it may help future work.
<br /><br />
Input: register assignment `ASSIGNED(reg) -> mreg`, and code. <br/>
Output: `PASS` if `ASSIGNED` is correct; or `ERROR` if it is not.
<br /><br />
Assumptions:
* we have correct liveness information.
* spill code (`SPILL_LOAD`/`SPILL_STORE`) is already inserted, and spilled virtual registers are replaced with short-lived virtual registers, and we know the mapping.
<br /><br />
We iterate through the code, focusing on registers, and maintain an `alive set`. Each entry is a 3-element tuple `ENTRY(reg, machine_reg, mem)`, meaning that `reg` is currently alive (contains a valid value) in `machine_reg` and in spill location `mem`.
* `(-, machine_reg, -)` means a machine register is alive, but no virtual register and no spill location is associated with it.
* `(*, machine_reg, *)` matches entries that have `machine_reg`.
* `(reg?, machine_reg, *)` matches entries that have `machine_reg` and names the first element `reg` (it may or may not exist).
<br /><br />
We have a few primitives on alive sets:
* `x <- ALIVE[block]` loads the alive set for `block` and uses the alias `x` for it.
* `x.ADD_ALIVE(r, m, mem)` adds `(r, m, mem)` to alive set `x`.
* `ALIVE[block] <- x` saves alive set `x` for `block`.
* `x.ASSERT_ALIVE(r, m, mem)` tries to match pattern `(r, m, mem)`. If no entry is found, validation ends with `ERROR`.
* `x.KILL_ALIVE(r, m, mem)` tries to match pattern `(r, m, mem)`. Any matching entries are deleted.
* `x.EXIST_ALIVE(r, m, mem)` tries to match pattern `(r, m, mem)`. It is true if any entry is found.
* `x.TRIM(list)`: for any `r` or `m` in `list`, preserve its entries in `x`; delete all other entries in `x`.
* `x.INTERSECT(y)`: if `(r, m, *)` or `(-, m, *)` appears in both `x` and `y`, preserve it in `x`; otherwise delete it from `x`.
* `y <- x.COPY`: copy `x` into `y`
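
The primitives above can be modelled compactly. The following Rust sketch is only an illustration of the matching semantics and is not taken from Zebu: registers and spill slots are plain `u32` placeholders, `None` plays the role of `-`, and the `Pat` enum (a name invented here) encodes `*`, `-` or a specific value. `TRIM` and the per-block `ALIVE` map are left out.

```rust
use std::collections::HashSet;

/// Pattern for one tuple field: `*` (Any), `-` (Absent) or a specific value.
#[derive(Debug, Clone, Copy)]
enum Pat {
    Any,
    Absent,
    Is(u32),
}

impl Pat {
    fn matches(self, field: Option<u32>) -> bool {
        match self {
            Pat::Any => true,
            Pat::Absent => field.is_none(),
            Pat::Is(v) => field == Some(v),
        }
    }
}

/// `(reg, machine_reg, mem)`; `None` stands for `-`.
type Entry = (Option<u32>, Option<u32>, Option<u32>);

#[derive(Debug, Clone, Default, PartialEq)]
struct AliveSet {
    entries: HashSet<Entry>,
}

impl AliveSet {
    fn add_alive(&mut self, r: Option<u32>, m: Option<u32>, mem: Option<u32>) {
        self.entries.insert((r, m, mem));
    }

    fn exist_alive(&self, r: Pat, m: Pat, mem: Pat) -> bool {
        self.entries
            .iter()
            .any(|e| r.matches(e.0) && m.matches(e.1) && mem.matches(e.2))
    }

    /// Returns `Err` where the pseudo code would end with `ERROR`.
    fn assert_alive(&self, r: Pat, m: Pat, mem: Pat) -> Result<(), String> {
        if self.exist_alive(r, m, mem) {
            Ok(())
        } else {
            Err("ERROR: no matching alive entry".to_string())
        }
    }

    fn kill_alive(&mut self, r: Pat, m: Pat, mem: Pat) {
        self.entries
            .retain(|e| !(r.matches(e.0) && m.matches(e.1) && mem.matches(e.2)));
    }

    /// Keep only entries whose (reg, machine_reg) pair also appears in `other`.
    fn intersect(&mut self, other: &AliveSet) {
        self.entries
            .retain(|e| other.entries.iter().any(|o| o.0 == e.0 && o.1 == e.1));
    }
}
```

Using a set of tuples keeps `COPY` trivial (`clone()`) and makes the `alive <> alive_before` comparison in the pseudo code a plain equality test.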
<br /><br />
Pseudo code:
```
// set the initial alive set for a function start
init.ADD_ALIVE(-, stack pointer, -)
init.ADD_ALIVE(-, frame pointer, -)
init.ADD_ALIVE(-, program counter, -)
init.ADD_ALIVE(-, callee saved registers, -)
init.ADD_ALIVE(-, used argument registers, -)
push entry block to work queue
ALIVE[entry block] <- init
while work queue is not empty {
    pop work queue -> block
    alive <- ALIVE[block]
    for each inst in block {
        // (1) check spill
        // we only care about the variable before spilling
        if inst is SPILL_LOAD(mem) -> t for var v {
            alive.ASSERT_ALIVE(v, *, mem)
        } else if inst is SPILL_STORE(t) -> mem for var v {
            alive.ADD_ALIVE(v, -, mem)
        }
        // (2) check use
        // when we use a register, it needs to contain a valid value
        for reg in inst.virtual_register_uses() {
            mreg = ASSIGNED(reg)
            alive.ASSERT_ALIVE(reg, mreg, *)
        }
        for mreg in inst.real_register_uses() {
            alive.ASSERT_ALIVE(*, mreg, *)
        }
        // (3) kill died regs
        // when a register dies, we remove its entry from alive set
        for reg in inst.virtual_register_dies() {
            alive.KILL_ALIVE(reg, *, *)
        }
        for mreg in inst.real_register_dies() {
            alive.KILL_ALIVE(*, mreg, *)
        }
        // (4) check and add defines
        for reg in inst.virtual_register_defines() {
            if reg NOT in inst.liveout() {
                // when a register is defined, but doesn't live out of this instruction
                // we kill its previous values from alive set (by deleting the entries)
                alive.KILL_ALIVE(reg, *, *)
            } else {
                mreg = ASSIGNED(reg)
                if NOT alive.EXIST_ALIVE(*, mreg, *) {
                    // mreg doesn't hold any value at the moment, we simply add an entry
                    alive.ADD_ALIVE(reg, mreg, -)
                } else {
                    // we need to ensure assigning mreg will not destroy useful values
                    for (x?, mreg, *) in alive {
                        if x NOT exist {
                            x <- reg
                        } else {
                            // we have x in mreg, and we want reg in mreg as well
                            if x == reg {
                                // overwrite value (safe)
                            } else {
                                if inst is MOVE {
                                    // possible coalescing (safe)
                                } else {
                                    // we are destroying the value of x
                                    // and x is alive at the moment (otherwise it is killed already in step 3)
                                    ERROR
                                }
                            }
                        }
                        alive.ADD_ALIVE(reg, mreg, -)
                    }
                }
            }
        }
        for mreg in inst.real_register_defines() {
            if mreg NOT in inst.liveout() {
                // when a register is defined, but doesn't live out of this instruction
                // we kill its previous values from alive set (by deleting the entries)
                alive.KILL_ALIVE(*, mreg, *)
            } else {
                if NOT alive.EXIST_ALIVE(*, mreg, *) {
                    // mreg doesn't hold any value at the moment, we simply add an entry
                    alive.ADD_ALIVE(-, mreg, -)
                } else {
                    for (reg?, mreg, -) in alive {
                        if reg NOT exist {
                            // we have a value in mreg, but it doesn't hold the value of any variable
                            // overwriting the value is safe
                        } else {
                            // we are holding reg in mreg, defining mreg will destroy the value of reg
                            ERROR
                        }
                    }
                }
            }
        }
    }
    // finishing the block, we only preserve what is alive at the end of the block
    alive.TRIM(block.liveout())
    push_successors = false
    if block is NOT visited {
        ALIVE[block] <- alive
        push_successors = true
    } else {
        alive_before <- ALIVE[block]
        alive.INTERSECT(alive_before)
        if alive <> alive_before {
            push_successors = true
        }
        ALIVE[block] <- alive
    }
    mark block as visited
    if push_successors {
        if block has 1 successor {
            push successor to work queue
            ALIVE[successor] <- alive
        } else if block has 2 successors {
            push successor1 to work queue
            alive1 <- alive.COPY
            ALIVE[successor1] <- alive1.TRIM(successor1.livein())
            push successor2 to work queue
            alive2 <- alive.COPY
            ALIVE[successor2] <- alive2.TRIM(successor2.livein())
        }
    }
}
```

Separating compilation information from IR data structures
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/18
2017-06-10T10:28:39+10:00, Yi Lin

It is poor design that the IR data structures also contain information that is gradually generated during compilation, such as:
* `use_count` and `expr` in `SSAVarEntry`
* `block_trace` in `MuFunctionVersion`
* `exception_blocks` in `FunctionContent`
* `control_flow` in `Block`
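
One possible shape for this separation is a side table keyed by entity ID that lives only for the duration of a compilation. This is a sketch under assumptions, not Zebu's data structures. `CompilationInfo`, `trace_pass` and `record_use` are hypothetical names, and `Block` here is a stripped-down stand-in for the real IR type.

```rust
use std::collections::HashMap;

/// Placeholder ID type for this sketch.
type MuID = usize;

/// IR node as supplied by the client; the passes below never mutate it.
struct Block {
    id: MuID,
}

/// Hypothetical per-compilation side table, dropped once compilation finishes.
#[derive(Default)]
struct CompilationInfo {
    use_count: HashMap<MuID, usize>,
    block_trace: Vec<MuID>,
}

/// Example pass: records a trace in the side table without touching the IR.
fn trace_pass(blocks: &[Block], info: &mut CompilationInfo) {
    for b in blocks {
        info.block_trace.push(b.id);
    }
}

/// Example pass helper: counts one more use of an SSA variable.
fn record_use(info: &mut CompilationInfo, var: MuID) {
    *info.use_count.entry(var).or_insert(0) += 1;
}
```

Because the passes only take `&[Block]`, the IR can stay immutable, and dropping the `CompilationInfo` value discards everything generated during compilation.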
They are not available initially and are generated during compilation. They can be safely destroyed after compilation.
This compilation information should be stored separately from the IR.

Implementing ALLOCA/ALLOCA_HYBRID
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/17
2017-07-19T15:07:32+10:00, Yi Lin

Currently the compiler assumes that the frame size is constant at compile time.
For *x86_64*, the stack pointer needs to be 16-byte aligned before a function call. The compiler ensures this as follows:
* `rbp` is always 16-byte aligned.
* the frame size is a multiple of 16 bytes (it is aligned up to 16 bytes if it is not, see `frame.rs`).
* if any call argument is passed on the stack, a padding value is pushed when necessary so that `rsp` is still 16-byte aligned after the call arguments are pushed.
* restoring from an exception sets `rsp` from `rbp` and the constant frame size.
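
The arithmetic behind the alignment rules above is small enough to show directly. This is an illustrative sketch, not the code in `frame.rs`. `STACK_ALIGN`, `align_up` and `arg_padding` are names chosen here.

```rust
/// Required stack alignment in bytes on x86_64 before a call.
const STACK_ALIGN: usize = 16;

/// Round `size` up to the next multiple of `align` (a power of two).
fn align_up(size: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (size + align - 1) & !(align - 1)
}

/// Bytes of padding to push before stack-passed arguments so that `rsp`
/// is still 16-byte aligned once all of them have been pushed.
fn arg_padding(stack_arg_bytes: usize) -> usize {
    align_up(stack_arg_bytes, STACK_ALIGN) - stack_arg_bytes
}
```

For example, a 40-byte frame is rounded up to 48 bytes, and three 8-byte stack arguments (24 bytes) need 8 bytes of padding. The same `align_up` would apply to a dynamically sized allocation if its size is kept a multiple of 16 bytes.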
We can implement `ALLOCA` by computing the allocation size at compile time, so that the frame size is still a compile-time constant. However, the implementation of `ALLOCA_HYBRID` will break this assumption. A straightforward solution is to make the alloca'd size always a multiple of 16 bytes (for the alignment requirement) and to record a *current frame size* somewhere (for restoring from an exception); this would keep most of the above unchanged. This issue tracks related discussion.
Assignee: Yi Lin

IR Building: Confusion in `FuncRef` ID
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/16
2017-03-23T03:50:11+11:00, Yi Lin

Currently, when building a bundle through the IR building API, for a function with Mu ID *x*, a `MuFunction` with ID *x* is created. A constant `FuncRef` pointing to the function is also created, and the constant has *x* as its own ID. In short, *x* is used for both the function and the constant. I am not sure if this is a typo or a deliberate decision. I assume that everything in Mu should have a unique ID, and using one ID for both a function and a funcref constant seems confusing. @u5211824

Removing unnecessary use of lock/`Option` for IR data structures
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/15
2017-05-19T14:32:29+10:00, Yi Lin

This issue tracks unnecessary uses of locks in Zebu. The old Mu spec requires mutation of the IR, for example, creating an IR node and then adding a name to it.
Thus either locks or `Option<T>` were introduced to allow mutation. The new spec makes the IR almost (if not completely) immutable, so it is possible to remove most of the locks.
Some rewriting compilation passes try to mutate IR nodes as well. However, we can always copy and update instead of mutating.
Lock:
* [x] `name` field in `MuEntityHeader`
* [x] `ops` field in `Instruction`
`Option<T>`:
* [ ] Most of the `Option` uses in `ir.rs`

Tune instruction selection for mu code from RPython
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/14
2017-07-12T10:49:36+10:00, Yi Lin

This issue tracks common instruction patterns that the RPython compiler generates.
We will need to tune the instruction selector to generate efficient code for these patterns.
* [x] Conditional branch
`cmpres = CMP_OP a b`
`v1 = ZEXT <int1 int8> cmpres`
`v2 = CMP_EQ v1 1`
`BRANCH2 v2 ... ...`
Assignee: Yi Lin

In-place code generation
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/13
2017-03-01T17:40:17+11:00, Yi Lin

The idea is to generate machine code directly, without going through an intermediate representation of the machine code. The motivation is to save the allocation and space needed for that representation. The current x86_64 assembly backend tests the idea of in-place code generation (though it is not at all necessary for an assembly backend to do this).
Issues:
* Too many eliminated moves (turned into `nop`s), which results in bad performance. One solution is to compact the code at the end (slide/copy the code to remove 'holes').
* In-place code generation introduces extra memory overhead, as we need to save the locations of registers.
*A note for the x86_64 asm backend:
The metadata for instructions can be optimised further. For example, currently I am using `LinkedHashMap<MuID, Vec<ASMLocation>>` to store information on used and defined registers. Both `LinkedHashMap` and `Vec` are expensive. We can use a fixed-length array, or simply `use1`, `use2`, `use3`...*
* It makes machine-code-level optimisations and transformations hard to implement.
Questions:
* Is it possible to emit machine code (binary) before we know the final code? For example, a `jmp` may turn into different machine code depending on how big the offset is. However, any assembler has to deal with this question.
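
To illustrate the `jmp` question: x86_64 has a 2-byte short form (`EB` followed by an 8-bit displacement) and a 5-byte near form (`E9` followed by a 32-bit displacement). The sketch below picks between them for a displacement that is already known. `encode_jmp` is a name made up for this example and is not part of Zebu.

```rust
/// Encode an x86_64 `jmp` given a displacement relative to the end of the
/// instruction. Uses the short form (EB rel8) when the displacement fits in
/// a signed byte, otherwise the near form (E9 rel32, little-endian).
fn encode_jmp(rel: i32) -> Vec<u8> {
    if (-128..=127).contains(&rel) {
        vec![0xEB, rel as i8 as u8]
    } else {
        let mut bytes = vec![0xE9];
        bytes.extend_from_slice(&rel.to_le_bytes());
        bytes
    }
}
```

The difficulty for in-place generation is that the displacement is measured from the end of the instruction, so it depends on which form is chosen, and the target of a forward jump may not have been emitted yet. A simple, conservative option is to always reserve the 5-byte form for unresolved targets and patch the displacement later.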
We need to discuss this more before starting to implement the JIT.

Use sidemap for GC object metadata
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/12
2017-11-21T13:48:32+11:00, Yi Lin

Zebu intends to use side maps to separate objects from their metadata.
However, as a compromise, I am currently using a 64-bit object header along with `GCType` for the metadata. We will go back to using side maps.
This issue describes the design of the side map scheme.
* We will assume a minimal object size `MIN_SIZE` and a minimal alignment `MIN_ALIGN`. The larger the minimal size/alignment is, the less memory is required for metadata (see below); however, it wastes memory in the heap. A `MIN_SIZE` of 16/24/32 bytes and a `MIN_ALIGN` of 128 bits are reasonable.
* Object metadata includes:
  * 1 bit per `MIN_ALIGN`: object start (and end, so we can determine the size; the size is required for copying/dumping an object)
  * 1 bit per 64 bits: reference locations
  * 8 bits per `MIN_SIZE`: GC state (mark bit, reference count, etc.)
* Small objects have less space to encode metadata, but large objects have plenty. We can use different schemes to encode small and large objects.
* Side maps should be stored in the metadata part of a page/memory chunk.
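
To get a feel for the metadata cost, the ratios above can be turned into a small calculation. This is a sketch under assumptions: it picks `MIN_SIZE` = 16 bytes from the options listed, uses a 64 KiB chunk in the example purely for illustration, and `sidemap_bytes` and `start_bit_index` are hypothetical helper names.

```rust
/// Minimal alignment in bytes (128 bits), as suggested in the issue.
const MIN_ALIGN: usize = 16;
/// Minimal object size in bytes; 16 is one of the options mentioned.
const MIN_SIZE: usize = 16;
/// Size of a 64-bit word in bytes.
const WORD: usize = 8;

/// Bytes of side-map metadata for a chunk of `chunk_bytes`, returned as
/// (object-start map, reference map, GC-state map).
fn sidemap_bytes(chunk_bytes: usize) -> (usize, usize, usize) {
    let start_bits = chunk_bytes / MIN_ALIGN; // 1 bit per MIN_ALIGN
    let ref_bits = chunk_bytes / WORD; // 1 bit per 64-bit word
    let gc_bytes = chunk_bytes / MIN_SIZE; // 8 bits per MIN_SIZE
    (start_bits / 8, ref_bits / 8, gc_bytes)
}

/// Index of the object-start bit for `addr` inside the chunk at `chunk_base`.
fn start_bit_index(chunk_base: usize, addr: usize) -> usize {
    (addr - chunk_base) / MIN_ALIGN
}
```

For a 64 KiB chunk this gives 512 bytes for object starts, 1024 bytes for reference locations and 4096 bytes for GC state, about 8.6% of the chunk in total. Doubling `MIN_SIZE` to 32 bytes would halve the GC-state part.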
A more concrete design will be added here once we have discussed it further.
Assignee: Yi Lin