mu-impl-fast issues
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues
2017-07-12T10:49:35+10:00

persisting VM natively
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/41
2017-07-12T10:49:35+10:00
Yi Lin

We now use Rust's `rustc_serialize` to persist the VM as a JSON string in the boot image. This clearly imposes a large overhead in both boot image size and loading time. We should persist the VM in a native and relocatable way.

Flags for int<128>
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/25
2017-07-12T10:49:35+10:00
Isaac Gariano (isaac@ecs.vuw.ac.nz)

I have worked out which instructions should be emitted to compute flags for binary operations on AArch64; a similar implementation should hopefully work on x86-64. I have included my notes here.
```
Notes on notation:
The notation <exp> indicates exp is a 128-bit value
[exp] indicates exp is a 64-bit value
<[exp]> indicates exp is a 192-bit value
<<exp>> indicates exp is a 256-bit value
exp.h indicates the higher 64 bits of the expression and exp.l the lower 64 bits
(each should occupy its own register)
Xi = 2^(64*i) (i.e. a one followed by i*64 bits of zeros)
Ti is a temporary register (64 bits)
Note:
Some optimisations may be possible if an argument to the instruction is an immediate
Zero and Negative Flags:
D, Z, N = BINOP S1, S2
ORR Z <- D.h, D.l // Z = D.h | D.l
CMP Z, #0 // Z <=> 0
CSET Z, EQ // Z = (Z == 0) ? 1 : 0
LSR N, D.h, 63 // N = (D.h >> 63) (so that N[0] = D.h[63])
Overflow and Carry for Add/Sub:
D, C, V = ADD/SUB S1, S2
// Compute the add/subtract normally (but make sure the add-with-carry/subtract-with-carry on the high halves sets the carry flag, e.g. ADCS/SBCS)
CSET C, CS // Set to 1 if the carry flag is set
// V[63] = 1 iff D and S1 have different signs
EOR V <- D.h, S1.h // V = D.h ^ S1.h
For ADD:
// T1[63] = 1 iff S1 and S2 have different signs
EOR T1 <- S1.h, S2.h // T1 = S1.h ^ S2.h
For SUB:
// T1[63] = 1 iff S1 and -S2 have different signs
EON T1 <- S1.h, S2.h // T1 = S1.h ^ (~S2.h)
// V[63] = 1 iff D and S1 have different signs
// and S1 and S2 (or -S2) have the same sign
BIC V <- V, T1 // V = V & ~T1
// Check tmp_status[n-1]
TST V, 1 << 63 // test bit V[63]
CSET V, NE // V = (V[63] != 0) ? 1 : 0
Overflow for Subtraction: (Note: this is essentially the same method I used for arithmetic on integers narrower than 32 bits)
D, V = SUB S1, S2
// Compute the subtraction normally
// V[63] = 1 iff D and S1 have different signs
EOR V <- D.h, S1.h // V = D.h ^ S1.h
// T1[63] = 1 iff S1 and -S2 have different signs (as in the SUB case above)
EON T1 <- S1.h, S2.h // T1 = S1.h ^ (~S2.h)
// V[63] = 1 iff D and S1 have different signs
// and S1 and -S2 have the same sign
BIC V <- V, T1 // V = V & ~T1
// Check tmp_status[n-1]
TST V, 1 << 63 // test bit V[63]
CSET V, NE // V = (V[63] != 0) ? 1 : 0
------------
Overflow and carry for Multiply:
D, C, V = MUL S1, S2
---------------------- (this is just my working) ----------
<S1.h*X1+S1.l> * <S2.h*X1+S2.l> =
<<S1.h*S2.h*X2>> + <[S1.l*S2.h*X1]> + <[S1.h*X1*S2.l]> + <S1.l*S2.l>
Discard everything that occupies the lower 128 bits:
<<S1.h*S2.h*X2>> + <[S1.l*S2.h*X1]> + <[S1.h*X1*S2.l]>
-----------------------------
<S1.h*S2.h>*X2 +
<S1.l*S2.h>*X1 +
<S1.h*S2.l>*X1
--------------------------------
<[S1.h*S2.h].h*X1+[S1.h*S2.h].l*X1>*X2 +
<[S1.l*S2.h].h*X1 + [S1.l*S2.h].l>*X1 +
<[S1.h*S2.l].h*X1 + [S1.h*S2.l].l>*X1
--------------------------------------------------
[S1.h*S2.h].h*X3 + [S1.h*S2.h].l*X3 +
[S1.l*S2.h].h*X2 + [S1.l*S2.h].l*X1 +
[S1.h*S2.l].h*X2 + [S1.h*S2.l].l*X1
----------------------------------------------------
Discard all factors of X1 (as they will only contribute to the lower 128 bits of the result)
[[S1.h*S2.h].h+ [S1.h*S2.h].l]*X3 +
[[S1.l*S2.h].h + [S1.h*S2.l].h]*X2
So to get the overflow flag let:
D.h = [[S1.h*S2.h].h+ [S1.h*S2.h].l]
D.l = [[S1.l*S2.h].h + [S1.h*S2.l].h]
Then set it to '1' iff (D.h != 0) || (D.l != 0)
------------------------------------------
SO EMIT THE FOLLOWING CODE:
UMULH D.l <- S1.l, S2.h // D.l = [S1.l*S2.h].h
UMULH D.h <- S1.h, S2.l // D.h = [S1.h*S2.l].h
ADD D.l <- D.h, D.l // D.l += D.h
UMULH D.h <- S1.h, S2.h // D.h = [S1.h*S2.h].h
MADD D.h <- S1.h, S2.h, D.h // D.h += [S1.h*S2.h].l
CMP D.l, #0 // D.l <=> 0
CSET C <- NE // C = (D.l != 0) ? 1 : 0
CMP D.h, #0 // D.h <=> 0
CSINC C <- C, XZR, EQ // C = (D.h == 0) ? C : (0+1)
MOV V <- C // V = C (they should be the same)
// Now get the lower 128-bits of the product (and store it in D.h, D.l)
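// A possible sequence for that step (a sketch only, assuming T1 is free
// and that S1/S2 do not share registers with D):
MUL T1 <- S1.l, S2.h // T1 = [S1.l*S2.h].l
MADD T1 <- S1.h, S2.l, T1 // T1 += [S1.h*S2.l].l
UMULH D.h <- S1.l, S2.l // D.h = [S1.l*S2.l].h
ADD D.h <- D.h, T1 // D.h += T1
MUL D.l <- S1.l, S2.l // D.l = [S1.l*S2.l].l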
```

Replace TreeNode.clone_value() with TreeNode.as_value()
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/36
2017-07-13T11:16:47+10:00
Yi Lin

We should avoid always cloning the `Value`.

Assignee: Yi Lin

Single exit block for Mu function
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/30
2017-07-20T15:39:12+10:00
Yi Lin

We should have a pass at the IR level that rewrites the code to allow only one exit block, so that the epilogue appears only once per function.
Currently the compiler generates an epilogue for each RET instruction.

Conditional branch adjustment after trace generation
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/27
2017-07-20T15:41:22+10:00
Yi Lin

Currently we do this adjustment during instruction selection (target-dependent code). It is possible to do this in a cleaner, target-independent way.
We need another pass to adjust conditional branches after trace generation. Ideally, before instruction selection, a conditional branch should always be followed by its false label. The adjustment should follow these rules:
* any conditional branch followed by its false label stays unchanged
* for conditional branch followed by its true label, we switch the true and false label, and negate the condition
* for a conditional branch followed by neither label, we invent a new false label and rewrite the conditional branch so that the new conditional branch is followed by the new false label.

Test suite with pytest writes to /tmp directory, which may cause issue about writing privilege.
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/59
2017-08-24T23:40:47+10:00
Yi Lin

The pytest tests create files (executables, boot images, text files) in the /tmp folder. This causes a problem when two users run the tests on the same machine: the first user creates executables in /tmp, the second user does not have permission to overwrite them, and the tests fail.
A test should either create its own temp directory (and pass it to Zebu as the code-emitting directory with `--aot-emit-dir=<dir>`), or use the default *emit* directory under the current directory, and put all generated files there. Once the test is done, we can simply delete that folder, and the next test run starts fresh. A test should not generate files anywhere other than the specified directory: doing so not only causes write-permission problems between users, but also makes it quite likely that we accidentally run executables from previous test runs.

Assignee: John Zhang

IR validation pass
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/8
2017-08-25T09:36:31+10:00
Yi Lin

Currently, if the input IR is incorrect, one of the following may happen:
1. some assertion in the compiler may fail
1. Rust safety finds it and panics (such as index out of bounds)
1. the compiler generates correct or incorrect code even if input IR is incorrect
We want an IR validation pass to check the input IR. It includes:
* check that the types and numbers of operands and results of each instruction match
* check that function signatures match parameters and return values
* check that branch arguments match parameters, and that the branch destination is valid
* check that the last instruction of each block is a terminating instruction (`BRANCH`, `CALL`, `RET`, etc.)
...
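
As a rough illustration of two of the checks listed above (block terminators and branch destinations), here is a minimal sketch in Rust. The `Inst` and `Block` types and the `validate` function are hypothetical and heavily simplified; they are not Zebu's actual IR data structures, and in this toy model only `Branch` and `Ret` count as terminators.

```rust
/// Hypothetical, heavily simplified IR (not Zebu's real data structures).
#[derive(Debug, Clone, PartialEq)]
pub enum Inst {
    /// Stands in for any non-terminating instruction.
    Add,
    /// Unconditional branch to the block with the given index.
    Branch { target: usize },
    /// Return from the function.
    Ret,
}

impl Inst {
    /// In this toy model only `Branch` and `Ret` end a block.
    pub fn is_terminator(&self) -> bool {
        matches!(self, Inst::Branch { .. } | Inst::Ret)
    }
}

#[derive(Debug, Clone)]
pub struct Block {
    pub insts: Vec<Inst>,
}

/// Checks that every block ends with exactly one terminator (and has none
/// earlier), and that every branch targets an existing block.
pub fn validate(blocks: &[Block]) -> Result<(), String> {
    for (i, block) in blocks.iter().enumerate() {
        // Index of the last instruction; an empty block is rejected.
        let last = block
            .insts
            .len()
            .checked_sub(1)
            .ok_or_else(|| format!("block {}: empty block", i))?;
        for (j, inst) in block.insts.iter().enumerate() {
            // A terminator must appear at the last position and nowhere else.
            if inst.is_terminator() != (j == last) {
                return Err(format!(
                    "block {}: instruction {} breaks the terminator rule",
                    i, j
                ));
            }
            if let Inst::Branch { target } = inst {
                if *target >= blocks.len() {
                    return Err(format!(
                        "block {}: branch to unknown block {}",
                        i, target
                    ));
                }
            }
        }
    }
    Ok(())
}
```

Returning a `Result` instead of asserting lets the caller report invalid IR as an ordinary error rather than as a compiler panic.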
(this list will grow when I think up more)

Assignee: Yi Lin

Collect performance data when doing CI
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/78
2017-08-27T22:42:17+10:00
Zixian Cai

1. We would like to get rid of the Docker container when running CI
2. We would like to add an extra stage before `rustfmt` that executes `mubench local <path_to_yml> --dump <path>` and archives the data.

Assignee: Zixian Cai

Make IR debug output from Zebu compatible with muc
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/87
2017-10-09T17:38:55+11:00
Yi Lin

Currently, with `trace`-level logging, Zebu outputs IR in its own format. We should make this compatible with the text form that `muc` uses (https://gitlab.anu.edu.au/mu/mu-tool-compiler/blob/master/UIR.g4).

Allow GC heap growth
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/56
2017-11-21T13:35:56+11:00
Yi Lin

In the current GC implementation, we allocate memory for the given heap size, and allocate (and initialize) metadata for the whole heap all at once at startup. This makes heap initialization extremely slow (about 70% of the startup time, measured by @igariano01).
This should be fixed when we rewrite the GC to allow heap growth, so that we only need to mmap and initialize a small heap at startup. The rewrite is scheduled along with Issue #12.
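
To make the idea concrete, here is a minimal sketch of a heap that initializes metadata one chunk at a time instead of for the whole heap up front. Everything in it is an assumption made for illustration: the names `GrowableHeap` and `ChunkMeta`, the chunk-based layout, and the one-metadata-byte-per-64-heap-bytes ratio are hypothetical, and a plain `Vec` stands in for memory that a real implementation would obtain with mmap.

```rust
/// Hypothetical per-chunk metadata; stands in for whatever the GC tracks.
#[derive(Debug, Clone)]
pub struct ChunkMeta {
    /// Assumed ratio: one metadata byte per 64 bytes of heap.
    pub mark_bytes: Vec<u8>,
}

/// A heap that commits (and initializes metadata for) one chunk at a time.
pub struct GrowableHeap {
    chunk_bytes: usize,
    max_chunks: usize,
    chunks: Vec<ChunkMeta>,
}

impl GrowableHeap {
    /// Only `initial_chunks` chunks are initialized at startup,
    /// never more than `max_chunks`.
    pub fn new(chunk_bytes: usize, max_chunks: usize, initial_chunks: usize) -> Self {
        let mut heap = GrowableHeap {
            chunk_bytes,
            max_chunks,
            chunks: Vec::new(),
        };
        for _ in 0..initial_chunks.min(max_chunks) {
            heap.grow();
        }
        heap
    }

    /// Commits one more chunk; returns false once the limit is reached.
    pub fn grow(&mut self) -> bool {
        if self.chunks.len() >= self.max_chunks {
            return false;
        }
        // The initialization cost is paid here, per chunk, not at startup.
        self.chunks.push(ChunkMeta {
            mark_bytes: vec![0u8; self.chunk_bytes / 64],
        });
        true
    }

    /// Number of heap bytes whose metadata has been initialized so far.
    pub fn committed_bytes(&self) -> usize {
        self.chunks.len() * self.chunk_bytes
    }
}
```

With this shape, startup cost scales with the initial chunk count rather than with the maximum heap size; an allocator would call `grow()` when the committed chunks are exhausted.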
|operation|time (μs)|
|----------|---|
|after rodal_init_deallocate|90.726|
|before mu_main|6.838|
|before gc_init|73.149|
|after gc_init|15,065.736|
|after init_runtime|35.896|
|after restore gc types|416.741|
|after build table|6,249.104|
|after loaded args|75.905|
|before swap_to_mu_stack|235.152|
|**Total**|22,249.247|

Mu IR Type checking
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/24
2017-11-21T13:54:18+11:00
Isaac Gariano (isaac@ecs.vuw.ac.nz)

The Mu IR compiler currently compiles some invalid Mu code. Specifically, I noticed that the following invalid code, which is used in some of the tests, compiles successfully:
* a SHL/LSHR/ASHR instruction whose second argument does not have the same type as the first (in the test, the first argument was an int<64> and the second an int<8>); this code was generated in tes_shl and test_lshr.
* passing an int<64> as an argument to a C function expecting an int<32> (this was generated by test_pass_1arg_by_stack and test_pass_2arg_by_stack).
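
The two cases above could be rejected by checks along the following lines. This is only a sketch: `Ty`, `check_shift`, and `check_call_args` are hypothetical names, and the single-variant type enum is far simpler than the real Mu type system.

```rust
/// Hypothetical, minimal type representation: only `int<N>`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Int(u32),
}

/// SHL/LSHR/ASHR: both operands must have the same type.
pub fn check_shift(value: Ty, amount: Ty) -> Result<(), String> {
    if value == amount {
        Ok(())
    } else {
        Err(format!(
            "shift operand types differ: {:?} vs {:?}",
            value, amount
        ))
    }
}

/// Call: each argument type must equal the corresponding parameter type.
pub fn check_call_args(params: &[Ty], args: &[Ty]) -> Result<(), String> {
    if params.len() != args.len() {
        return Err(format!(
            "expected {} arguments, got {}",
            params.len(),
            args.len()
        ));
    }
    for (i, (p, a)) in params.iter().zip(args).enumerate() {
        if p != a {
            return Err(format!(
                "argument {}: expected {:?}, got {:?}",
                i, p, a
            ));
        }
    }
    Ok(())
}
```

For example, `check_shift(Ty::Int(64), Ty::Int(8))` and `check_call_args(&[Ty::Int(32)], &[Ty::Int(64)])` both return an error, matching the two invalid cases in the list.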
In addition, the compiler does not seem to check whether an SSA variable has been assigned before it is used.

Support for sel4-rumprun
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/91
2019-03-25T13:08:19+11:00
Yi Lin

https://gitlab.anu.edu.au/mu/mu-impl-fast/merge_requests/20 and https://gitlab.anu.edu.au/mu/mu-impl-fast/merge_requests/21 added some code to support Zebu running on sel4-rumprun. However, it does not run on sel4-rumprun yet, because `rodal` does not support sel4-rumprun.
The changes mainly address these issues:
* removed usage of dynamic libraries (`dlopen`, `dlsym`, etc) as sel4-rumprun does not support dynamic linking.
* rewrote some testcases to avoid using dynamic libraries.
* added feature guard `sel4-rumprun` for OS dependent code. Feature guard is used instead of OS guard as Rust does not correctly recognise sel4-rumprun.
* added feature guard `sel4-rumprun-target-side` for two-stage cross compilation.
Problems with the changes:
* it doesn't actually run on rumprun (`rodal` uses `dlsym`).
* the changes currently have massive code duplication instead of reusing OS/Target dependent code for Linux/x86_64. And duplicated code is not maintained when the original code changes.
* there is some hard-coded path for running Zebu on sel4-rumprun.
* using the feature guard `#[cfg(feature = "sel4-rumprun")]` instead of a proper OS guard `#[cfg(target_os = "sel4-rumprun")]` makes OS-dependent code quite unreadable. For example, for Linux code, we have to write
```
#[cfg(not(feature = "sel4-rumprun"))]
#[cfg(target_os = "linux")]
... // linux dependent code
```
* there is no documentation on how to set up the environment and run Zebu on sel4-rumprun.
These should be addressed if we want to properly support Zebu on sel4-rumprun.

Assignee: Javad Ebrahimian Amiri (javad.amiri@anu.edu.au)
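
As long as the feature guard stays, the stacked attributes in the Linux example above can at least be folded into a single predicate with `all(...)` and `not(...)`. The sketch below shows that form; the function `platform_name` and the "other" fallback are made up for illustration, and the feature name is the one used in this issue.

```rust
/// Linux-only path, excluded when the `sel4-rumprun` feature is enabled.
#[cfg(all(target_os = "linux", not(feature = "sel4-rumprun")))]
pub fn platform_name() -> &'static str {
    "linux"
}

/// Path selected by the `sel4-rumprun` feature, whatever the host OS is.
#[cfg(feature = "sel4-rumprun")]
pub fn platform_name() -> &'static str {
    "sel4-rumprun"
}

/// Fallback so that exactly one definition exists in every configuration.
#[cfg(all(not(target_os = "linux"), not(feature = "sel4-rumprun")))]
pub fn platform_name() -> &'static str {
    "other"
}
```

The three predicates are mutually exclusive and cover every configuration, so the item is defined exactly once. This does not replace a proper `target_os` guard; it only keeps each guarded item down to one attribute.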