mu-impl-fast issues
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues
2017-08-09T17:54:01+10:00

Bug: emitting instruction address fails at unknown struct layout
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/75
2017-08-09T17:54:01+10:00, John Zhang

## Problem description
Zebu fails when compiling PyPy with the following error:
```
thread '<unnamed>' panicked at 'a struct type does not have a layout yet: BackendType { size: 0, alignment: 1, struct_layout: None, elem_size: None, gc_type: GCType { id: 2490, alignment: 1, fix_size: 0, fix_refs: None, var_refs: None, var_size: None } }', src/compiler/backend/arch/x86_64/inst_sel.rs:5804
stack backtrace:
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at /checkout/src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at /checkout/src/libstd/sys_common/backtrace.rs:60
at /checkout/src/libstd/panicking.rs:355
3: std::panicking::default_hook
at /checkout/src/libstd/panicking.rs:371
4: std::panicking::rust_panic_with_hook
at /checkout/src/libstd/panicking.rs:549
5: std::panicking::begin_panic
at /checkout/src/libstd/panicking.rs:511
6: std::panicking::begin_panic_fmt
at /checkout/src/libstd/panicking.rs:495
7: mu::compiler::backend::x86_64::inst_sel::InstructionSelection::emit_inst_addr_to_value_inner
8: mu::compiler::backend::x86_64::inst_sel::InstructionSelection::emit_inst_addr_to_value
9: mu::compiler::backend::x86_64::inst_sel::InstructionSelection::instruction_select
10: <mu::compiler::backend::x86_64::inst_sel::InstructionSelection as mu::compiler::passes::CompilerPass>::visit_function
11: mu::compiler::passes::CompilerPass::execute
12: mu::compiler::Compiler::compile
13: mu::vm::vm::VM::make_boot_image_internal
14: mu::vm::api::api_bridge::_forwarder__MuCtx__make_boot_image
15: fnc_1063
16: main
17: __libc_start_main
18: _start
fatal runtime error: failed to initiate panic, error 5
```

Ref map generation is wrong
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/74
2017-08-09T09:49:27+10:00, Isaac Gariano <isaac@ecs.vuw.ac.nz>

The GC appears to be generating incorrect ref maps for types, which causes test_pypy to fail.
Specifically when dumping an instance of 'stt294' whose structure is:
```
2509@stt294 = Struct:
1213@stt55 = Struct:
1214@stt56 = Struct:
1060@stt10 = Struct:
1037@stt3 = Struct:
1012@i64 = Int(64) +0
1032@ptrstt2 = UPtr(1033@stt2 = Struct("@stt2") ) +8
1059@refstt10 = Ref(1060@stt10 = Struct("@stt10") ) *+16
1059@refstt10 = Ref(1060@stt10 = Struct("@stt10") ) *+24
1003@i8 = Int(8) *+32
1036@refstt3 = Ref(1037@stt3 = Struct("@stt3")) +40
1126@refhyb11 = Ref(1127@hyb11 = Hybrid("@hyb11") ) *+48
1059@refstt10 = Ref(1060@stt10 = Struct("@stt10") ) *+56
1003@i8 = Int(8) +64
1010@refhyb1 = Ref(1011@hyb1 = Hybrid("@hyb1")) *+72
1012@i64 = Int(64) +80
1010@refhyb1 = Ref(1011@hyb1 = Hybrid("@hyb1")) *+88
1012@i64 = Int(64) +96
1059@refstt10 = Ref(1060@stt10 = Struct("@stt10")) *+104
1003@i8 = Int(8) *+112
1003@i8 = Int(8) +113
1036@refstt3 = Ref(1037@stt3 = Struct("@stt3")) *+120
1036@refstt3 = Ref(1037@stt3 = Struct("@stt3")) +128
```
(The last column is the field offset, and a `*` indicates that Zebu treats the value at that offset as a reference when dumping.) The reported ref map for this type is '1110101011011100', which is clearly wrong: it marks the `i8` fields at offsets 32 and 112 as references and misses the references at offsets 40 and 128.
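To make the mismatch concrete, here is a minimal Rust sketch. It assumes the ref map holds one bit per 8-byte word, with the lowest offset in the rightmost character; that encoding is inferred from the starred offsets in the dump, not taken from Zebu's source. `ref_words` and `ref_map_from_offsets` are hypothetical helper names.

```rust
/// Word indices (offset / 8) whose bit is set in a ref map written as a
/// binary string, where the rightmost character is word 0 (assumed encoding).
fn ref_words(map: &str) -> Vec<usize> {
    map.chars()
        .rev()
        .enumerate()
        .filter(|&(_, c)| c == '1')
        .map(|(i, _)| i)
        .collect()
}

/// Build the same kind of string from the byte offsets of reference fields.
fn ref_map_from_offsets(ref_offsets: &[usize], size_in_bytes: usize) -> String {
    let words = (size_in_bytes + 7) / 8;
    let mut bits = vec!['0'; words];
    for &off in ref_offsets {
        bits[off / 8] = '1';
    }
    // Reverse so that word 0 ends up as the rightmost character.
    bits.iter().rev().collect()
}

fn main() {
    // Reported map: words 4 and 14 (offsets 32 and 112, both i8) are set.
    let reported = ref_words("1110101011011100");
    println!("reported ref words: {:?}", reported);

    // Offsets of the Ref fields in the stt294 dump; the last field is assumed
    // to end at 128 + 8 = 136 bytes.
    let expected = ref_map_from_offsets(&[16, 24, 40, 48, 56, 72, 88, 104, 120, 128], 136);
    println!("expected ref map:   {}", expected);
}
```

Under these assumptions the reported map decodes to exactly the starred offsets in the dump, which supports the reading that the map itself is wrong rather than the dumping code.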
This bug can be reproduced with the following C code (executing it causes a segfault as it tries to dump an object at location '0x101', due to misreading a field as a reference):
```
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include "muapi.h"
#include "mu-fastimpl.h"
#define G(id, ...) global_ ## id
#define L(id, ...) local_ ## id
MuVM* mvm;
MuCtx* ctx;
MuID G(0, int64_const);
MuID G(1);
MuID G(2, uptr_stt2_const);
MuID G(3, stt2);
MuID G(4);
MuID G(5, int8_const);
MuID G(6);
MuID G(7);
MuID G(8, stt2_cell);
MuID G(9, hyb11);
MuID G(10, hyb11_cell);
MuID G(11, hyb1);
MuID G(12, hyb1_cell);
MuID G(13, stt3);
MuID G(14, stt3_cell);
MuID G(15, stt10);
MuID G(16, stt10_cell);
MuID G(17, stt56);
MuID G(18);
MuID G(19);
MuID G(20, stt55);
MuID G(21);
MuID G(22, stt294);
MuID G(23);
MuID G(24, stt294_cell);
void build_bundle() {
MuIRBuilder* irbuilder = ctx->new_ir_builder(ctx);
G(0, int64_const) = irbuilder->gen_sym(irbuilder, "@int64_const");
G(1) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_int(irbuilder, G(1), 64);
irbuilder->new_const_int(irbuilder, G(0, int64_const), G(1), 1);
G(2, uptr_stt2_const) = irbuilder->gen_sym(irbuilder, "@uptr_stt2_const");
G(3, stt2) = irbuilder->gen_sym(irbuilder, "@stt2");
G(4) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_uptr(irbuilder, G(4), G(3, stt2));
irbuilder->new_const_null(irbuilder, G(2, uptr_stt2_const), G(4));
G(5, int8_const) = irbuilder->gen_sym(irbuilder, "@int8_const");
G(6) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_int(irbuilder, G(6), 8);
irbuilder->new_const_int(irbuilder, G(5, int8_const), G(6), 1);
G(7) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_void(irbuilder, G(7));
irbuilder->new_type_struct(irbuilder, G(3, stt2), (MuTypeNode[]){G(7)}, 1);
G(8, stt2_cell) = irbuilder->gen_sym(irbuilder, "@stt2_cell");
irbuilder->new_global_cell(irbuilder, G(8, stt2_cell), G(3, stt2));
G(9, hyb11) = irbuilder->gen_sym(irbuilder, "@hyb11");
irbuilder->new_type_struct(irbuilder, G(9, hyb11), (MuTypeNode[]){G(7)}, 1);
G(10, hyb11_cell) = irbuilder->gen_sym(irbuilder, "@hyb11_cell");
irbuilder->new_global_cell(irbuilder, G(10, hyb11_cell), G(9, hyb11));
G(11, hyb1) = irbuilder->gen_sym(irbuilder, "@hyb1");
irbuilder->new_type_struct(irbuilder, G(11, hyb1), (MuTypeNode[]){G(7)}, 1);
G(12, hyb1_cell) = irbuilder->gen_sym(irbuilder, "@hyb1_cell");
irbuilder->new_global_cell(irbuilder, G(12, hyb1_cell), G(11, hyb1));
G(13, stt3) = irbuilder->gen_sym(irbuilder, "@stt3");
irbuilder->new_type_struct(irbuilder, G(13, stt3), (MuTypeNode[]){G(1), G(4)}, 2);
G(14, stt3_cell) = irbuilder->gen_sym(irbuilder, "@stt3_cell");
irbuilder->new_global_cell(irbuilder, G(14, stt3_cell), G(13, stt3));
G(15, stt10) = irbuilder->gen_sym(irbuilder, "@stt10");
irbuilder->new_type_struct(irbuilder, G(15, stt10), (MuTypeNode[]){G(13, stt3)}, 1);
G(16, stt10_cell) = irbuilder->gen_sym(irbuilder, "@stt10_cell");
irbuilder->new_global_cell(irbuilder, G(16, stt10_cell), G(15, stt10));
G(17, stt56) = irbuilder->gen_sym(irbuilder, "@stt56");
G(18) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_ref(irbuilder, G(18), G(15, stt10));
G(19) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_ref(irbuilder, G(19), G(13, stt3));
irbuilder->new_type_struct(irbuilder, G(17, stt56), (MuTypeNode[]){G(15, stt10), G(18), G(18), G(6), G(19)}, 5);
G(20, stt55) = irbuilder->gen_sym(irbuilder, "@stt55");
G(21) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_ref(irbuilder, G(21), G(9, hyb11));
irbuilder->new_type_struct(irbuilder, G(20, stt55), (MuTypeNode[]){G(17, stt56), G(21), G(18), G(6)}, 4);
G(22, stt294) = irbuilder->gen_sym(irbuilder, "@stt294");
G(23) = irbuilder->gen_sym(irbuilder, NULL);
irbuilder->new_type_ref(irbuilder, G(23), G(11, hyb1));
irbuilder->new_type_struct(irbuilder, G(22, stt294), (MuTypeNode[]){G(20, stt55), G(23), G(1), G(23), G(1), G(18), G(6), G(6), G(19), G(19)}, 10);
G(24, stt294_cell) = irbuilder->gen_sym(irbuilder, "@stt294_cell");
irbuilder->new_global_cell(irbuilder, G(24, stt294_cell), G(22, stt294));
irbuilder->load(irbuilder);
}
MuValue get_global(MuID id) { return ctx->handle_from_global(ctx, id); }
MuValue get_const(MuID id) { return ctx->handle_from_const(ctx, id); }
void store_field(MuIRefValue object, int field, MuValue value) {
MuIRefValue field_ref = ctx->get_field_iref(ctx, object, field);
ctx->store(ctx, MU_ORD_NOT_ATOMIC, field_ref, value);
}
MuIRefValue get_field_iref(MuIRefValue object, int field) {
return ctx->get_field_iref(ctx, object, field);
}
int main() {
mvm = mu_fastimpl_new_with_opts("init_mu --aot-emit-dir=emit");
ctx = mvm->new_context(mvm);
build_bundle();
MuIRefValue stt294_cell = get_global(G(24, stt294_cell));
MuIRefValue hyb1_cell = get_global(G(12, hyb1_cell));
MuIRefValue stt10_cell = get_global(G(16, stt10_cell));
MuIRefValue stt3_cell = get_global(G(14, stt3_cell));
MuIRefValue hyb11_cell = get_global(G(10, hyb11_cell));
MuIRefValue int64_const = get_const(G(0, int64_const));
MuIRefValue int8_const = get_const(G(5, int8_const));
MuIRefValue uptr_stt2_const = get_const(G(2, uptr_stt2_const));
store_field(stt294_cell, 1, hyb1_cell);
store_field(stt294_cell, 2, int64_const);
store_field(stt294_cell, 3, hyb1_cell);
store_field(stt294_cell, 4, int64_const);
store_field(stt294_cell, 5, stt10_cell);
store_field(stt294_cell, 6, int8_const);
store_field(stt294_cell, 7, int8_const);
store_field(stt294_cell, 8, stt3_cell);
store_field(stt294_cell, 9, stt3_cell);
MuIRefValue stt55_part = get_field_iref(stt294_cell, 0);
//.const stt55_const<stt55> = {stt56_const hyb11_cell stt10_cell int8_const}
store_field(stt55_part, 1, hyb11_cell);
store_field(stt55_part, 2, stt10_cell);
store_field(stt55_part, 3, int8_const);
MuIRefValue stt56_part = get_field_iref(stt55_part, 0);
store_field(stt56_part, 1, stt10_cell);
store_field(stt56_part, 2, stt10_cell);
store_field(stt56_part, 3, int8_const);
store_field(stt56_part, 3, stt3_cell);
//.const stt55_const<stt55> = {stt56_const hyb11_cell stt10_cell int8_const}
MuIRefValue stt10_part = get_field_iref(stt56_part, 0);
MuIRefValue stt3_part = get_field_iref(stt10_part, 0);
store_field(stt3_part, 0, int64_const);
store_field(stt3_part, 1, uptr_stt2_const);
ctx->make_boot_image(ctx,
(MuID[]){G(0, int64_const), G(2, uptr_stt2_const), G(3, stt2), G(5, int8_const), G(8, stt2_cell), G(9, hyb11), G(10, hyb11_cell), G(11, hyb1), G(12, hyb1_cell), G(13, stt3), G(14, stt3_cell), G(15, stt10), G(16, stt10_cell), G(17, stt56), G(20, stt55), G(22, stt294), G(24, stt294_cell)}, 17,
NULL, NULL, NULL, NULL, NULL, 0, NULL, NULL, 0,
"test_ref_map");
}
```

Error on undefined reference to asm when linking with Zebu's shared library
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/71
2017-08-04T17:26:57+10:00, Yi Lin

It has been reported multiple times that linking with Zebu's shared library (`libmu.so`) on Linux fails with the error message 'undefined reference to asm'. I am not sure what causes this problem, but a workaround is to set `CARGO_HOME` to a local directory such as `.` or `.cargo`, instead of using the default cargo directory, when building Zebu and when linking with `libmu.so`. Linking against Zebu's static library does not have this problem.

[x86_64] Unimplemented conversions
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/70
2017-08-01T23:27:03+10:00, Isaac Gariano <isaac@ecs.vuw.ac.nz>

When trying to run test_pypy.py on develop, for some reason I am no longer running out of memory; instead I get the following error:
```
thread '<unnamed>' panicked at 'not yet implemented', src/compiler/backend/arch/x86_64/inst_sel.rs:1352
```
It seems that BITCAST, FPTRUNC and FPEXT are the only unimplemented conversion ops on x86-64.
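For reference, the value-level behaviour of these three conversions can be sketched in plain Rust. This only illustrates the semantics an implementation has to preserve; it is not Zebu code, and the function names are hypothetical. On x86-64, SSE2 instructions such as `cvtsd2ss`, `cvtss2sd` and `movq` are the usual candidates, though whether they fit Zebu's instruction selector is an assumption.

```rust
/// FPTRUNC from double to float: narrows the value, rounding to nearest.
fn fptrunc(x: f64) -> f32 {
    x as f32
}

/// FPEXT from float to double: widens the value exactly.
fn fpext(x: f32) -> f64 {
    x as f64
}

/// BITCAST from double to a 64-bit integer: reinterprets the bits unchanged.
fn bitcast_f64_to_u64(x: f64) -> u64 {
    x.to_bits()
}

/// BITCAST from a 64-bit integer to double: the inverse reinterpretation.
fn bitcast_u64_to_f64(bits: u64) -> f64 {
    f64::from_bits(bits)
}

fn main() {
    println!("{}", fpext(fptrunc(1.5)));
    println!("{:#018x}", bitcast_f64_to_u64(1.0));
    println!("{}", bitcast_u64_to_f64(0x4000_0000_0000_0000));
}
```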
These conversions should be implemented (the aarch64 backend implements them by emitting single assembly instructions; hopefully there are equivalent instructions for x86-64).
Yi Lin

SELECT on comparison of doubles panicked at 'not yet implemented'
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/68
2017-07-21T15:44:42+10:00, Javad Ebrahimian Amiri <javad.amiri@anu.edu.au>

The following IR code:
```rust
ssa! (($vm, $tester_name) <int1> cmp_res);
inst! (($vm, $tester_name) blk_entry_cmp:
cmp_res = CMPOP (CmpOp::EQ) result f64_2_local
);
ssa! (($vm, $tester_name) <int64t> blk_entry_ret);
inst! (($vm, $tester_name) blk_entry_inst_select:
blk_entry_ret = SELECT cmp_res int64_pass_local int64_fail_local
);
inst! (($vm, $tester_name) blk_entry_inst_ret:
SET_RETVAL blk_entry_ret
);
```
panics with this output:
```
TRACE - instsel on node#1032 (SETRETVAL (int<64>(%blk_entry_ret #1030) = SELECT if (int<1>(%cmp_res #1028) = EQ double(%result #1026) double(2)) then int<64>(0) else int<64>(1)))
TRACE - instsel on SETRETVAL
TRACE - instsel on node#1031 (int<64>(%blk_entry_ret #1030) = SELECT if (int<1>(%cmp_res #1028) = EQ double(%result #1026) double(2)) then int<64>(0) else int<64>(1))
TRACE - instsel on SELECT
FAILED
failures:
---- test_compiler::test_floatingpoint::test_double_add stdout ----
thread 'test_compiler::test_floatingpoint::test_double_add' panicked at 'not yet implemented', src/compiler/backend/arch/x86_64/inst_sel.rs:3790:32
```
Both `result` and `f64_2_local` are doubles.
I'm using a commit from 20/06/2017 at 5:19 PM.
In the newest commit, this `unimplemented!()` panic is located at line 4433 of `x86_64/inst_sel.rs`.
Yi Lin

Conditional branch adjustment after trace generation
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/27
2017-07-20T15:41:22+10:00, Yi Lin

Currently we are doing this adjustment during instruction selection (target-dependent code). It is possible to do this in a cleaner and target-independent way.
We need another pass to adjust conditional branches after trace generation. Ideally, before instruction selection, a conditional branch should always be followed by its false label. The adjustment should follow these rules:
* any conditional branch followed by its false label stays unchanged
* for a conditional branch followed by its true label, we swap the true and false labels, and negate the condition
* for a conditional branch followed by neither label, we invent a new false label, and rewrite the conditional branch so that the new conditional branch is followed by the new false label.

Single exit block for Mu function
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/30
2017-07-20T15:39:12+10:00, Yi Lin

We should have a pass at IR level to rewrite the code to allow only one exit block, so that the epilogue appears only once for each function.
Currently the compiler generates an epilogue for each RET instruction.

Name mangling in Zebu, unnamed MuEntity, global/local names
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/28
2017-07-19T15:05:23+10:00, Yi Lin

Currently Zebu makes assumptions about names, and mangles names in a way that is inconsistent and confusing. It may also be inconsistent with the spec.
These are some notes (I need to rethink these):
* Zebu assumes local names, and mangles them (if needed) in its own way. The spec requires that all names used via the API are global names (no mangling is needed).
* Zebu checks and transforms each name so that it does not include special characters and can be safely used in assembly.
* Zebu assumes some entities such as `Block` and `MuFunction` have a name. These assumptions may not be consistent with the spec (the spec requires top-level entities to have names). This needs further checking.
* A name that starts with a number is valid as a name for a Mu entity; however, such a name may not be valid when used directly in assembly.

Yi Lin

[aarch64] Register allocator allocates multiple destination arguments to the same register
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/63
2017-07-14T14:20:35+10:00, Isaac Gariano <isaac@ecs.vuw.ac.nz>

When trying to compile the following MUIR function (using `muc -r`):
```
.funcdef foo <(int<128> int<128> int<128> int<128> int<128> int<128>)->(int<128>)>
{
entry(<int<128>>a0 <int<128>>a1 <int<128>>a2 <int<128>>a3 <int<128>>a4 <int<128>>a5):
RET a5
}
```
I get an error
```
INFO - emity/foo.S:44:10: error: unpredictable LDP instruction, Rt2==Rt
LDP X0 ,X0 ,[X29,#16]
```
After peephole optimization, the code contained the following:
```
TRACE - #40 LDP X0 ,X0 ,[X29,#16] define: [1103, 1104] uses: [58] pred: [39] succ: [41]
TRACE - #41 LDP X0 ,X1 ,[X29,#32] define: [1107, 1108] uses: [58] pred: [40] succ: [43]
```
The register allocator has allocated #1103 and #1104 both to X0, but hasn't broken the last load (X0 and X1 are the return registers, so the last LDP instruction loads the last stack argument and writes the return value).
Yi Lin

Replace TreeNode.clone_value() with TreeNode.as_value()
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/36
2017-07-13T11:16:47+10:00, Yi Lin

We should avoid always cloning the `Value`.
Yi Lin

API Implementation
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/3
2017-07-12T10:49:36+10:00, John Zhang

Below is the list of API calls to implement, ordered by priority derived from requirements from test cases.
Please tick them off as you go, and watch for updates.
## `MuIRBuilder`
- [x] `load`
- [x] `gen_sym`
- [x] `new_bb`
- [x] `new_binop`
- [x] `new_branch`
- [x] `new_branch2`
- [x] `new_call`
- [x] `new_ccall`
- [x] `new_cmp`
- [x] `new_comminst`
- [x] `new_const_double`
- [x] `new_const_extern`
- [x] `new_const_int`
- [x] `new_const_int_ex`
- [x] `new_const_null`
- [x] `new_conv`
- [x] `new_dest_clause`
- [x] `new_exc_clause`
- [x] `new_func`
- [x] `new_func_ver`
- [x] `new_funcsig`
- [x] `new_getfieldiref`
- [x] `new_getiref`
- [x] `new_getvarpartiref`
- [x] `new_global_cell`
- [x] `new_load`
- [x] `new_new`
- [x] `new_newhybrid`
- [x] `new_ret`
- [x] `new_select`
- [x] `new_shiftiref`
- [x] `new_store`
- [x] `new_switch`
- [x] `new_throw`
- [x] `new_type_double`
- [x] `new_type_float`
- [x] `new_type_funcref`
- [x] `new_type_hybrid`
- [x] `new_type_int`
- [x] `new_type_iref`
- [x] `new_type_ref`
- [x] `new_type_struct`
- [x] `new_type_ufuncptr`
- [x] `new_type_uptr`
- [x] `new_type_void`
## `MuCtx`
- [x] `store`
- [x] `get_field_iref`
- [x] `get_iref`
- [x] `get_var_part_iref`
- [x] `handle_from_const`
- [x] `handle_from_func`
- [x] `handle_from_global`
- [x] `handle_from_sint64`
- [x] `handle_from_uint64`
- [x] `handle_from_uint8`
- [x] `id_of`
- [x] `new_fixed`
- [x] `new_hybrid`
- [x] `new_ir_builder`
- [x] `refcast`
- [x] `shift_iref`

Kunshan Wang

IR Instruction Implementation
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/4
2017-07-12T10:49:36+10:00, John Zhang

A list of IR instructions needed progressively.
Tick them off as you go, and watch for updates.
## IR Instructions
- [x] `ADD`
- [x] `SUB`
- [x] `MUL`
- [x] `SDIV`
- [x] `UDIV`
- [x] `SREM`
- [x] `UREM`
- [x] `SHL`
- [x] `ASHR`
- [x] `LSHR`
- [x] `FADD`
- [x] `FSUB`
- [x] `FDIV`
- [x] `FMUL`
- [x] `AND`
- [x] `OR`
- [x] `XOR`
- [x] `EQ`
- [x] `NE`
- [x] `SGE`
- [x] `SGT`
- [x] `SLE`
- [x] `SLT`
- [x] `ULE`
- [x] `ULT`
- [x] `FOEQ`
- [x] `FOGT`
- [x] `FOLE`
- [x] `FOLT`
- [x] `FONE`
- [x] `TRUNC`
- [x] `ZEXT`
- [x] `SEXT`
- [x] `REFCAST`
- [x] `PTRCAST`
- [x] `FPTOSI`
- [x] `SITOFP`
- [x] `UITOFP`
- [x] `SELECT`
- [x] `BRANCH`
- [x] `BRANCH2`
- [x] `SWITCH`
- [x] `CALL`
- [x] `RET`
- [x] `NEW`
- [x] `NEWHYBRID`
- [x] `GETIREF`
- [x] `GETFIELDIREF`
- [x] `GETELEMIREF`
- [x] `SHIFTIREF`
- [x] `GETVARPARTIREF`
- [x] `LOAD`
- [x] `STORE`
- [x] `CCALL`
- [x] `NEWTHREAD`
- [x] `COMMINST @uvm.thread_exit`
- [x] `THROW`
- [x] `COMMINST @uvm.native.pin`
- [x] `COMMINST @uvm.native.unpin`
- [ ] `COMMINST @uvm.get_threadlocal`
- [x] `COMMINST @uvm.set_threadlocal`

Yi Lin

JIT Tests
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/5
2017-07-12T10:49:36+10:00, John Zhang

A list of tests to do for the JIT.
## Milestone Tests
- [x] constant function
- [x] fibonacci
- [x] two functions (compiling multiple functions)
- [x] RPython SHA1 test
- [ ] RPython GC benchmark
- [x] RPython Richards benchmark
- [x] RPython NBody benchmark
- [x] RPython SOM interpreter
- [ ] PyPy interpreter with minimum modules
- [ ] PyPy interpreter with compilable modules
## Binary operation tests
- [x] `ADD`
- [x] `SUB`
- [x] `MUL`
- [x] `SDIV`
- [x] `UREM`
- [x] `SHL`
- [x] `LSHR`
- [x] `AND`
- [x] `XOR`
## Compare operation tests
- [x] `EQ`
- [x] `NE`
- [x] `SGE`
- [x] `SGT`
- [x] `SLE`
- [x] `SLT`
## Conversion operation tests
- [x] `TRUNC`
- [x] `ZEXT`
- [x] `SEXT`
- [x] `REFCAST`
- [x] `PTRCAST`
## Control flow operation tests
- [x] `BRANCH`
- [x] `BRANCH2`
- [x] `CALL`
- [x] `RET`
- [x] `SWITCH`
- [x] `CCALL`
- [x] `THROW`
## Memory operation tests
- [x] `NEW`
- [x] `NEWHYBRID`
- [x] `GETIREF`
- [x] `GETFIELDIREF`
- [x] `SHIFTIREF`
- [x] `GETVARPARTIREF`
- [x] `LOAD`
- [x] `STORE`
## `COMMINST` tests
- [x] `COMMINST @uvm.thread_exit`
- [x] `COMMINST @uvm.native.pin`
- [x] `COMMINST @uvm.native.unpin`
- [x] `COMMINST @uvm.get_threadlocal`
- [x] `COMMINST @uvm.set_threadlocal`

John Zhang

Insert intermediate blocks for removing phi-node values
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/7
2017-07-12T10:49:36+10:00, Yi Lin

The current implementation inserts moves before branching. Instead, we should insert intermediate blocks between source and destination blocks, and put the moves there.
RPython benchmarks
Yi Lin

Bug: hybrid layout
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/9
2017-07-12T10:49:36+10:00, John Zhang

A hybrid may contain an empty fixed part, in which case `backend::layout_struct` fails because `struct_align = 0`.
Yi Lin

VMOptions
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/11
2017-07-12T10:49:36+10:00, Yi Lin

We need to allow Zebu to take options during initialisation.
* The initialisation function starts from `mu_fastimpl_new()` in `vm/api/api_impl/muvm.rs`.
* Arguments as a formatted string would suffice.
* Consider using Rust crates to manage command line options, such as [clap](https://crates.io/crates/clap).

RPython benchmarks
Yi Lin

Tune instruction selection for mu code from RPython
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/14
2017-07-12T10:49:36+10:00, Yi Lin

This issue tracks common instruction patterns that the RPython compiler generates.
We will need to tune the instruction selector to generate efficient code for these patterns.
* [x] Conditional branch
`cmpres = CMP_OP a b`
`v1 = ZEXT <int1 int8> cmpres`
`v2 = CMP_EQ v1 1`
`BRANCH2 v2 ... ...`

Yi Lin

[x86_64] shift operations may return wrong result if 2nd operand is larger than int8
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/23
2017-07-12T10:49:36+10:00, Yi Lin

Shifting instructions in Mu require two operands of the same size, e.g. `Shl <int64> a b`, in which `a` and `b` are both `int64`.
However, `shl`, `shr` and `sar` on x86_64 take their second operand either in the `CL` register (8 bits) or as an 8-bit immediate. Currently the instruction selector simply moves the lower 8 bits of `b` into `CL`, which may give an incorrect result.

Flags for int<128>
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/25
2017-07-12T10:49:35+10:00, Isaac Gariano <isaac@ecs.vuw.ac.nz>

I have worked out which instructions should be emitted to compute flags for binary operations on AArch64; a similar method should hopefully work on x86-64, so I have included my notes here.
```
Notes on notation:
The notation <exp> indicates exp is a 128-bit value
[exp] indicates exp is a 64-bits value
<[exp]> indicates exp is a 192-bits value
<<exp>> indicates exp is a 256-bits value
exp.h indicates the higher 64-bits of the expression and exp.l is the lower 64-bits of the expression
(each should occupy their own register)
Xi = 2^(64*i) (i.e. it is i*64-bits worth of zeros with a one at the front)
Ti is a temporary register (64-bits)
Note:
Some optimisations may be able to be performed if an argument to the instruction is an immediate
Zero and Negative Flags:
D, Z, N = BINOP S1, S2
ORR Z <- D.h, D.l // Z = D.h | D.l
CMP Z, #0 // Z <=> 0
CSET Z, EQ // Z = (Z == 0) ? 1 : 0
LSR N, D.h, 63 // N = (D.h >> 63) (so that N[0] = D.h[63])
Overflow and Carry for Add/Sub:
D, C, V = ADD/SUB S1, S2
// Compute the addition/subtraction normally (except ensure the add with carry/subtract with carry sets the carry flag)
CSET C, CS // Set to 1 if the carry flag is set
// V[63] = 1 IFF D and S1 have different signs
EOR V <- D.h, S1.h // V = D.h ^ S1.h
For ADD:
// T1[63] = 1 IFF S1 and S2 have different signs
EOR T1 <- S1.h, S2.h // T1 = S1.h ^ S2.h
For SUB:
// T1[63] = 1 IFF S1 and -S2 have different signs
EON T1 <- S1.h, S2.h // T1 = S1.h ^ (~S2.h)
// V[63] = 1 iff D and S1 have different signs
// and S1 and S2 (or -S2) have the same sign
BIC V <- V, T1 // V = V & ~T1
// Check tmp_status[n-1]
TST V, 1 << 63 // sets the flags from V & (1 << 63)
CSET V, NE // V = (V[63] != 0) ? 1 : 0
Overflow for Subtraction: (Note: this is essentially the same method I used for arithmetic less than 32 bits)
D, V = SUB S1, S2
// Compute the subtraction normally
// V[63] = 1 IFF D and S1 have different signs
EOR V <- D.h, S1.h // V = D.h ^ S1.h
// T1[63] = 1 IFF S1 and -S2 have different signs (as in the SUB case above)
EON T1 <- S1.h, S2.h // T1 = S1.h ^ (~S2.h)
// V[63] = 1 iff D and S1 have different signs
// and S1 and -S2 have the same sign
BIC V <- V, T1 // V = V & ~T1
// Check tmp_status[n-1]
TST V, 1 << 63 // sets the flags from V & (1 << 63)
CSET V, NE // V = (V[63] != 0) ? 1 : 0
------------
Overflow and carry for Multiply:
D, C, V = MUL S1, S2
---------------------- (this is just my working) ----------
<S1.h*X1+S1.l> * <S2.h*X1+S2.l> =
<<S1.h*S2.h*X2>> + <[S1.l*S2.h*X1]> + <[S1.h*X1*S2.l]> + <S1.l*S2.l>
Discard everything that occupies the lower 128 bits:
<<S1.h*S2.h*X2>> + <[S1.l*S2.h*X1]> + <[S1.h*X1*S2.l]>
-----------------------------
<S1.h*S2.h>*X2 +
<S1.l*S2.h>*X1 +
<S1.h*S2.l>*X1
--------------------------------
<[S1.h*S2.h].h*X1+[S1.h*S2.h].l*X1>*X2 +
<[S1.l*S2.h].h*X1 + [S1.l*S2.h].l>*X1
<[S1.h*S2.l].h*X1 + [S1.h*S2.l].l>*X1
--------------------------------------------------
[S1.h*S2.h].h*X3 + [S1.h*S2.h].l*X3 +
[S1.l*S2.h].h*X2 + [S1.l*S2.h].l*X1 +
[S1.h*S2.l].h*X2 + [S1.h*S2.l].l*X1
----------------------------------------------------
Discard all factors of X1 (as they will only contribute to the lower 128 bits of the result)
[[S1.h*S2.h].h+ [S1.h*S2.h].l]*X3 +
[[S1.l*S2.h].h + [S1.h*S2.l].h]*X2
So to get the overflow flag let:
D.h = [[S1.h*S2.h].h+ [S1.h*S2.h].l]
D.l = [[S1.l*S2.h].h + [S1.h*S2.l].h]
Then set it to '1' iff (D.h != 0) || (D.l != 0)
------------------------------------------
SO EMIT THE FOLLOWING CODE:
UMULH D.l <- S1.l, S2.h // D.l = [S1.l*S2.h].h
UMULH D.h <- S1.h, S2.l // D.h = [S1.h*S2.l].h
ADD D.l <- D.h, D.l // D.l += D.h
UMULH D.h <- S1.h, S2.h // D.h = [S1.h*S2.h].h
MADD D.h <- S1.h, S2.h, D.h // D.h += [S1.h*S2.h].l
CMP D.l, #0 // D.l <=> 0
CSET C <- NE // C = (D.l != 0) ? 1 : 0
CMP D.h, #0 // D.h <=> 0
CSINC C <- C, XZR, EQ // C = (D.h == 0) ? C : (0+1)
MOV V <- C // V = C (they should be the same)
// Now get the lower 128-bits of the product (and store it in D.h, D.l)
```

Emitting a 128-bit int constant to a P<Value>
https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/34
2017-07-12T10:49:35+10:00, Isaac Gariano <isaac@ecs.vuw.ac.nz>

Currently the functions `emit_ireg_value` and `emit_reg_value` in `aarch64/mod.rs` and `emit_reg` in `x86_64/inst_sel.rs` are unimplemented when the source value/tree_node is a `Constant::IntEx`.
This will cause a problem if you try to do a 128-bit UDIV/UREM/SREM/SDIV (it will call `unimplemented!()`).
E.g. compiling the following code will panic with 'not yet implemented':
```
.funcdef foo<()->()>
{
entry():
%v = UREM <int<128>> <int<128>>1 <int<128>>1
RET
}
```
(This fails only when using `muc` with `-c`; it works when using `muc` with `-r`. This is probably a bug in `muc`, and I will look into it.)
I believe the simplest solution would be to have some way of combining two `P<Value>`s into one
(basically, a function that is the inverse of `split_int128`).
That way we could use the same code for `emit_ireg_ex`, and instead of returning a pair of `P<Value>`s we would return a single `P<Value>` constructed by combining the two.
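At the level of plain integers, the inverse of splitting a 128-bit value into two 64-bit halves looks like the sketch below. `split_u128` and `combine_u128` are hypothetical stand-ins that operate on `u64`/`u128` rather than on Zebu's `P<Value>`; they only illustrate the round-trip property that such a combining function would need to satisfy.

```rust
/// Split a 128-bit value into (low, high) 64-bit halves.
fn split_u128(v: u128) -> (u64, u64) {
    (v as u64, (v >> 64) as u64)
}

/// Inverse of `split_u128`: rebuild the 128-bit value from its halves.
fn combine_u128(low: u64, high: u64) -> u128 {
    ((high as u128) << 64) | (low as u128)
}

fn main() {
    let v: u128 = (7u128 << 64) | 42;
    let (low, high) = split_u128(v);
    // Splitting and then combining must give back the original value.
    assert_eq!(combine_u128(low, high), v);
    println!("low = {}, high = {}", low, high);
}
```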
I realise the reason for this bug is that I have modified the runtime entry points for UDIV/UREM/SREM/SDIV to take `int<128>` arguments, so we need to pass a single `P<Value>` to the call to `emit_runtime_entry`, whereas before, when using `int<64>` arguments, we were passing two `P<Value>`s for each 128-bit integer.
Yi Lin