mu-impl-fast issueshttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues2017-07-12T10:49:36+10:00https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/11VMOptions2017-07-12T10:49:36+10:00Yi LinVMOptionsWe need to allow Zebu to take options during initialisation.
* The initialisation function starts from `mu_fastimpl_new()` in `vm/api/api_impl/muvm.rs`.
* Arguments as a formatted string would suffice.
* Consider using Rust crates ...We need to allow Zebu to take options during initialisation.
* The initialisation function starts from `mu_fastimpl_new()` in `vm/api/api_impl/muvm.rs`.
* Arguments as a formatted string would suffice.
* Consider using Rust crates to manage command line options, such as [clap](https://crates.io/crates/clap).RPython benchmarksYi LinYi Linhttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/10Use higher byte of x86 16bits registers2017-06-14T00:06:06+10:00Yi LinUse higher byte of x86 16bits registersCurrently the register allocator only uses lower bytes of the aliased registers, AL/BL/CL/DL. The high byte (AH/BH/CH/DH) is wasted. Currently the register allocator only uses lower bytes of the aliased registers, AL/BL/CL/DL. The high byte (AH/BH/CH/DH) is wasted. https://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/9Bug: hybrid layout2017-07-12T10:49:36+10:00John ZhangBug: hybrid layoutHybrid may contain empty fixed part, in which case `backend::layout_struct` fails because `struct_align = 0`.Hybrid may contain empty fixed part, in which case `backend::layout_struct` fails because `struct_align = 0`.Yi LinYi Linhttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/8IR validation pass2017-08-25T09:36:31+10:00Yi LinIR validation passCurrently if the input IR is incorrect, one of the following may happen:
1. some assertion in the compiler may fail
1. Rust safety finds it and panics (such as index out of bounds)
1. the compiler generates correct or incorrect code e...Currently if the input IR is incorrect, one of the following may happen:
1. some assertion in the compiler may fail
1. Rust safety finds it and panics (such as index out of bounds)
1. the compiler generates correct or incorrect code even if input IR is incorrect
We will want a IR validation pass to check the input IR. It includes:
* check if types and numbers of operands and results of each instructions match
* check if function signatures matches parameters and return values
* check if branch arguments matches parameters, and if branch destination is valid
* check if the last instruction for each block is terminating instruction (`BRNACH`, `CALL`, `RET`, etc)
...
(this list will grow when I think up more)Yi LinYi Linhttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/7Insert intermediate blocks for removing phi-node values2017-07-12T10:49:36+10:00Yi LinInsert intermediate blocks for removing phi-node valuesThe current implementation insert moves before branching. Instead, we should insert intermediate blocks between source and destination blocks, and put moves there. The current implementation insert moves before branching. Instead, we should insert intermediate blocks between source and destination blocks, and put moves there. RPython benchmarksYi LinYi Linhttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/6IR rewrite pass before instruction selection2017-11-21T13:50:53+11:00Yi LinIR rewrite pass before instruction selectionSome instructions such as `NEW` will be expanded into a sequence of code (may involve new blocks), and some instructions such as `THREADEXIT` will be expanded into a CCall into runtime service functions. Currently this is done at instruc...Some instructions such as `NEW` will be expanded into a sequence of code (may involve new blocks), and some instructions such as `THREADEXIT` will be expanded into a CCall into runtime service functions. Currently this is done at instruction selection pass, by directly expanding such instructions into machine code. Alternatively, a better choice is to rewrite/expand such instructions into Mu IR before instruction selection. Yi LinYi Linhttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/5JIT Tests2017-07-12T10:49:36+10:00John ZhangJIT TestsA list of tests to do for the JIT.
## Milestone Tests
- [x] constant function
- [x] fibonacci
- [x] two functions (compiling multiple functions)
- [x] RPython SHA1 test
- [ ] RPython GC benchmark
- [x] RPython Richards benchmark
- [x] ...A list of tests to do for the JIT.
## Milestone Tests
- [x] constant function
- [x] fibonacci
- [x] two functions (compiling multiple functions)
- [x] RPython SHA1 test
- [ ] RPython GC benchmark
- [x] RPython Richards benchmark
- [x] RPython NBody benchmark
- [x] RPython SOM interpreter
- [ ] PyPy interpreter with minimum modules
- [ ] PyPy interpreter with compilable modules
## Binary operation tests
- [x] `ADD`
- [x] `SUB`
- [x] `MUL`
- [x] `SDIV`
- [x] `UREM`
- [x] `SHL`
- [x] `LSHR`
- [x] `AND`
- [x] `XOR`
## Compare operation tests
- [x] `EQ`
- [x] `NE`
- [x] `SGE`
- [x] `SGT`
- [x] `SLE`
- [x] `SLT`
## Conversion operation tests
- [x] `TRUNC`
- [x] `ZEXT`
- [x] `SEXT`
- [x] `REFCAST`
- [x] `PTRCAST`
## Control flow operation tests
- [x] `BRANCH`
- [x] `BRANCH2`
- [x] `CALL`
- [x] `RET`
- [x] `SWITCH`
- [x] `CCALL`
- [x] `THROW`
## Memory operation tests
- [x] `NEW`
- [x] `NEWHYBRID`
- [x] `GETIREF`
- [x] `GETFIELDIREF`
- [x] `SHIFTIREF`
- [x] `GETVARPARTIREF`
- [x] `LOAD`
- [x] `STORE`
## `COMMINST` tests
- [x] `COMMINST @uvm.thread_exit`
- [x] `COMMINST @uvm.native.pin`
- [x] `COMMINST @uvm.native.unpin`
- [x] `COMMINST @uvm.get_threadlocal`
- [x] `COMMINST @uvm.set_threadlocal`John ZhangJohn Zhanghttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/4IR Instruction Implementation2017-07-12T10:49:36+10:00John ZhangIR Instruction ImplementationA list of IR Instruction needed progressively.
Tick them off as you go, and watch for updates.
## IR Instructions
- [x] `ADD`
- [x] `SUB`
- [x] `MUL`
- [x] `SDIV`
- [x] `UDIV`
- [x] `SREM`
- [x] `UREM`
- [x] `SHL`
- [x] `ASHR`
- [x] `L...A list of IR Instruction needed progressively.
Tick them off as you go, and watch for updates.
## IR Instructions
- [x] `ADD`
- [x] `SUB`
- [x] `MUL`
- [x] `SDIV`
- [x] `UDIV`
- [x] `SREM`
- [x] `UREM`
- [x] `SHL`
- [x] `ASHR`
- [x] `LSHR`
- [x] `FADD`
- [x] `FSUB`
- [x] `FDIV`
- [x] `FMUL`
- [x] `AND`
- [x] `OR`
- [x] `XOR`
- [x] `EQ`
- [x] `NE`
- [x] `SGE`
- [x] `SGT`
- [x] `SLE`
- [x] `SLT`
- [x] `ULE`
- [x] `ULT`
- [x] `FOEQ`
- [x] `FOGT`
- [x] `FOLE`
- [x] `FOLT`
- [x] `FONE`
- [x] `TRUNC`
- [x] `ZEXT`
- [x] `SEXT`
- [x] `REFCAST`
- [x] `PTRCAST`
- [x] `FPTOSI`
- [x] `SITOFP`
- [x] `UITOFP`
- [x] `SELECT`
- [x] `BRANCH`
- [x] `BRANCH2`
- [x] `SWITCH`
- [x] `CALL`
- [x] `RET`
- [x] `NEW`
- [x] `NEWHYBRID`
- [x] `GETIREF`
- [x] `GETFIELDIREF`
- [x] `GETELEMIREF`
- [x] `SHIFTIREF`
- [x] `GETVARPARTIREF`
- [x] `LOAD`
- [x] `STORE`
- [x] `CCALL`
- [x] `NEWTHREAD`
- [x] `COMMINST @uvm.thread_exit`
- [x] `THROW`
- [x] `COMMINST @uvm.native.pin`
- [x] `COMMINST @uvm.native.unpin`
- [ ] `COMMINST @uvm.get_threadlocal`
- [x] `COMMINST @uvm.set_threadlocal`
Yi LinYi Linhttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/3API Implementation2017-07-12T10:49:36+10:00John ZhangAPI ImplementationBelow is the list of API calls to implement, ordered by priority derived from requirements from test cases.
Please tick them off as you go, and watch for updates.
## `MuIRBuilder`
- [x] `load`
- [x] `gen_sym`
- [x] `new_bb`
- [x] `new_...Below is the list of API calls to implement, ordered by priority derived from requirements from test cases.
Please tick them off as you go, and watch for updates.
## `MuIRBuilder`
- [x] `load`
- [x] `gen_sym`
- [x] `new_bb`
- [x] `new_binop`
- [x] `new_branch`
- [x] `new_branch2`
- [x] `new_call`
- [x] `new_ccall`
- [x] `new_cmp`
- [x] `new_comminst`
- [x] `new_const_double`
- [x] `new_const_extern`
- [x] `new_const_int`
- [x] `new_const_int_ex`
- [x] `new_const_null`
- [x] `new_conv`
- [x] `new_dest_clause`
- [x] `new_exc_clause`
- [x] `new_func`
- [x] `new_func_ver`
- [x] `new_funcsig`
- [x] `new_getfieldiref`
- [x] `new_getiref`
- [x] `new_getvarpartiref`
- [x] `new_global_cell`
- [x] `new_load`
- [x] `new_new`
- [x] `new_newhybrid`
- [x] `new_ret`
- [x] `new_select`
- [x] `new_shiftiref`
- [x] `new_store`
- [x] `new_switch`
- [x] `new_throw`
- [x] `new_type_double`
- [x] `new_type_float`
- [x] `new_type_funcref`
- [x] `new_type_hybrid`
- [x] `new_type_int`
- [x] `new_type_iref`
- [x] `new_type_ref`
- [x] `new_type_struct`
- [x] `new_type_ufuncptr`
- [x] `new_type_uptr`
- [x] `new_type_void`
## MuCtx
- [x] `store`
- [x] `get_field_iref`
- [x] `get_iref`
- [x] `get_var_part_iref`
- [x] `handle_from_const`
- [x] `handle_from_func`
- [x] `handle_from_global`
- [x] `handle_from_sint64`
- [x] `handle_from_uint64`
- [x] `handle_from_uint8`
- [x] `id_of`
- [x] `new_fixed`
- [x] `new_hybrid`
- [x] `new_ir_builder`
- [x] `refcast`
- [x] `shift_iref`Kunshan WangKunshan Wanghttps://gitlab.anu.edu.au/mu/mu-impl-fast/-/issues/2Granularity of RwLocks2017-06-14T00:11:57+10:00Kunshan WangGranularity of RwLocksDesigned for concurrency, Rust enforces the use of proper synchronisation when working with shareable memory locations. Atomic integers, Mutexes and RwLocks are a few examples. These types can be used even if the memory locations are "im...Designed for concurrency, Rust enforces the use of proper synchronisation when working with shareable memory locations. Atomic integers, Mutexes and RwLocks are a few examples. These types can be used even if the memory locations are "immutably shared": the user can gain read-write access by obtaining the read-write lock, or using atomic operations.
In the centre of the mu-impl-fast is the [VM](https://gitlab.anu.edu.au/mu/mu-impl-fast/blob/master/src/vm/vm.rs#L25) object. It contains the global IR information, including all types, constants and functions ever loaded into the micro VM. Accesses to these data structures must be properly synchronised.
Currently the way of synchronisation is fine-grained locking. Every HashMap is protected by a RwLock. This will enable data-race-free access to the shared data structures, but it does have its disadvantages.
- **Potential deadlocks.** Sometimes one operation needs to have read-write access to multiple objects. It is very easy to run into the simplest [ABBA deadlock](http://cmdlinelinux.blogspot.com.au/2014/01/linux-kernel-deadlocks-and-how-to-avoid.html) if the locks are obtained in different orders by different threads.
- workaround: If "bundle loading" is the only operation that needs RW access, it can simply obtain all RW locks before performing any operations. During the operations, it is free to obtain any extra RO locks if necessary.
- **The locking is too fine-grained.** Putting locks on too many objects will increase the space overhead, and will require the user to perform more locking operations which will increase the time overhead.
- workaround: For any operation that needs RO access, holding the RO locks during the entire operation rather than frequently acquiring/releasing the locks can reduce the time overhead, but not the space overhead.
- **Using locks instead of lock-free data structures.** Bundle loading is extremely rare comparing to RO accesses. In the current design, Mu-level exception handling needs to obtain RO access to the VM stack-unwinding metadata. Although exception handling is slow-path from the user program's perspective, it is still more common than client-to-MuVM API calls. User-level exception handling should be lock-free.
Ideally, the shared data structures should be implemented with **transactional lock-free data structures that strongly biases towards fast read-only accesses**. The ideal data structure should be the [RCU](https://en.wikipedia.org/wiki/Read-copy-update)-like [multi-version concurrency control](https://en.wikipedia.org/wiki/Multiversion_concurrency_control) algorithm. Actually all Mu IR nodes are immutable by design (NOTE: Function "redefinition" is actually, by design, "adding a version to a function", so existing versions do not change. There is also no API that asks "what versions does a function have".) , so it doesn't matter if the client sees a slightly older version of the data structure: it will not see some new-coming nodes concurrently being inserted, but whatever it sees, it is the correct version that it should see. (For example, it is OK if a newly-started function runs its older version rather than the "newest version", because it is allowed, unless the client uses proper fences and synchronisations by itself.)
But since our current goal is to have a working VM, we can postpone these optimisations and just use the status quo for now.