mu-spec issues

TAILCALL is not always possible

2017-07-23T11:50:43+10:00

Currently the mu-spec requires the TAILCALL instruction to not create a new 'frame' (and it is supported whenever the callee and caller have the same return type). Unfortunately, on aarch64 at least, I can't guarantee this when the callee's signature requires more stack space for arguments than the caller's: the callee's arguments can't simply be placed in the caller's argument space, since there isn't enough room.

Currently, the best idea I could come up with for implementing this is roughly as follows:

```
caller:
    SUB SP, SP, #E              // Reserve extra space
    // The usual prologue:
    // Create a new frame record
    STP FP, LR, [SP, #-16]!     // Push (FP, LR)
    MOV FP, SP
    // Save callee saved registers and allocate stack space (as normal)...
    ...
    // The code emitted for a call to callee
    // Compute arguments as usual, and place them in registers
    // or the argument stack space starting at FP+16 (i.e. the result of the SUB instruction emitted above)
    // Restore all callee saved registers
    // Restore the frame record
    MOV SP, FP
    LDP FP, LR, [SP], #16       // Pop (FP, LR)
    // We have to save the LR somewhere (we can't use the stack as the callee may modify it,
    // but we just restored all callee saved registers so we can't use them either)
    // TODO: append LR (somehow?) to a thread-local linked list on the heap...
    BL callee
callsite: // Record the exceptional destination as 'exception'
    ADD SP, SP, #E
    // TODO: Restore LR from the heap
    RET LR
    ...
epilogue: // Before caller returns
    // Do the normal epilogue (restore callee saved register, pop the frame record)...
    ...
    ADD SP, SP, #E
exception[exc]: // I could probably do this inside muentry_throw_exception itself...
    ADD SP, SP, #E
    // TODO: Restore LR from the heap
    MOV X0, exc
    // Rethrow 'exc'
    B muentry_throw_exception
```

Where 'E' is the maximum of (B - A), where B is the stack argument size of each signature that is the target of a `TAILCALL`, and A is the stack argument size of the caller.
Unfortunately, this doesn't actually do a tailcall (it uses a call instruction, `BL`, instead of a branch, `B`), and it requires the horrible storing of the link register on the heap somewhere (I haven't actually worked out how to do this). By doing this weird link register marshalling, however, I may be breaking the no new 'frame' rule. I could simply do a normal call, which may be more efficient (as it won't require storing the LR on the heap), but that would definitely break the no new frame rule.

I checked with clang/llvm (as @U1817699 suggested). It has a `musttail` flag for its call instruction, which guarantees a tailcall but can only be used if the callee and caller have the same signature/prototype. It also has a `tail` flag, which is merely a hint and does not actually do a tailcall or affect the generated code in the case mentioned above.

Is my implementation conformant with the spec? If not, should we change the spec, or does anyone have a better idea of how to implement it?
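To make the definition of 'E' above concrete, here is a minimal sketch in C of how a backend might compute it. The function name and parameters are hypothetical and do not come from Zebu or the spec. It also makes two assumptions that the description above does not state: the result is clamped at zero when no callee needs more space than the caller, and it is rounded up to a multiple of 16 so that SP stays 16-byte aligned on aarch64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: extra stack bytes ('E') the caller reserves in its
 * prologue. caller_arg_bytes is A; callee_arg_bytes[i] is B for the i-th
 * signature that the caller TAILCALLs. */
size_t extra_tailcall_space(size_t caller_arg_bytes,
                            const size_t *callee_arg_bytes,
                            size_t ncallees)
{
    size_t extra = 0;  /* assumption: clamp at zero if every B <= A */
    for (size_t i = 0; i < ncallees; i++) {
        if (callee_arg_bytes[i] > caller_arg_bytes) {
            size_t diff = callee_arg_bytes[i] - caller_arg_bytes;
            if (diff > extra)
                extra = diff;
        }
    }
    /* Assumption: round up to 16 bytes to keep SP 16-byte aligned. */
    return (extra + 15) & ~(size_t)15;
}
```

For a caller with 16 bytes of stack arguments that tail-calls signatures needing 0, 24 and 48 bytes, this yields 32.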

Swapstack exc_clause

2017-08-08T01:22:09+10:00

Currently the swap stack instruction requires an exception clause. This makes no sense if the current stack clause is 'KILL_OLD'. I suggest making a minor change:

* If the current stack clause is 'KILL_OLD', require that the exception clause is absent (and have control flow analysis treat this instruction in the same way as a thread_exit, i.e. it has no successor).
* Otherwise, require an exception clause.

Also, am I correct in assuming it's undefined behaviour to do a swap_stack/new_thread where the new stack clause is THROW_EXC but the last thing executed on the swappee was not a swap stack instruction with an exception clause?

Branch to entry block

2017-09-18T09:06:53+10:00

> The entry block must not be branched to from any basic blocks.

Why is this restriction here?

Add a Union Type and simple TaggedRef type

2018-01-26T04:47:41+11:00

I suggest adding a `union` type. It would be like a C `union { T1 t1; ... }`: you would use it like any other type (you can have SSA variables of it, you can new it and alloca it), but the instruction set would also be extended:

* `GETFIELDIREF [PTR] opnd` will get an iref/uptr to the index'th variant of the union opnd (of type T). This would also communicate to the GC that `opnd` should be treated as having a value of type Ti (where Ti is the index'th type in the union T).
* `EXTRACTVALUE` and `INSERTVALUE` should be extended to work on unions, in the same way as above.

The main purpose of having union types is to allocate a structure of known size on the heap whose actual type may change throughout its lifetime. This would allow, for example, a union of an integer type and a reference type, whose value may be an integer or a reference, and would let you inform the GC of which it is whenever you use it (even when the actual type is not known at allocation or compile time).

I also suggest adding a simple tagged reference type that would store a reference and a tag (I think x86-64 allows up to 17 of the highest bits to be used as a tag and aarch64 allows up to 8). It would be like tagref64 but more efficient, as it would not store floating point numbers or integers (so you can do a simple 'and' instruction to get the tag and/or pointer).

These changes would be particularly useful to @pzakopaylo01, who is trying to implement a Haskell closure whose layout is not known at the time of creating it (because Mu is garbage collected, he can't simply cast random bits to/from references and other types).
His current solution is utilising tagref64, but that has a few problems:

* It is inefficient. For example, it takes 10 instructions on aarch64 to make a tagged pointer tagref; it would take 1 with a simple tagged reference type.
* The range of values you can store in it is limited (specifically, it is limited to storing a 52-bit integer, which limits the precision of Haskell datatypes).
* It is cumbersome to use, and not general enough to support the way it is used.

He may be able to describe his problems better than me, but either way I think these two changes would be simple and straightforward to implement in Zebu (though the garbage collector will need a bit of work), and would help extend Mu to be more easily targetable by a wider range of languages (such as ones with dynamic types).
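As a rough illustration of why the simple tagged reference type suggested above could be cheap, here is a sketch in C that keeps the tag in the high bits of a 64-bit word. Everything in it is hypothetical: the type and function names are invented for this example, the tag width of 8 bits is just the aarch64 figure quoted above, and a real Mu implementation would of course also have to tell the GC about the layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the top TAG_BITS bits of a 64-bit word are free for a tag. */
#define TAG_BITS   8
#define TAG_SHIFT  (64 - TAG_BITS)
#define ADDR_MASK  ((UINT64_C(1) << TAG_SHIFT) - 1)

typedef uint64_t simple_tagref;  /* hypothetical representation */

/* Combine an address and a tag: one shift and one OR. */
simple_tagref tagref_make(uint64_t addr, uint8_t tag)
{
    assert((addr & ~ADDR_MASK) == 0);  /* address must fit below the tag */
    return ((uint64_t)tag << TAG_SHIFT) | addr;
}

/* Recover the address with a single AND. */
uint64_t tagref_addr(simple_tagref r)
{
    return r & ADDR_MASK;
}

/* Recover the tag with a single shift. */
uint8_t tagref_tag(simple_tagref r)
{
    return (uint8_t)(r >> TAG_SHIFT);
}
```

Unlike tagref64, this representation carries no floating point or integer payload, which is what keeps each operation down to one or two machine instructions.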

MOVED TO mu-impl-fast

2018-01-26T04:47:41+11:00

MOVED TO mu-impl-fast

Returning exit status from a Mu program

2018-01-26T04:47:41+11:00

Currently there does not appear to be a standard way in Mu to return an exit status from a program, or to pass arguments to it. For example, you could do this in C: `int main(int argc, char* argv[]) { return argc; }`

To implement this in Mu, you have to return an exit status by CCALLing the C 'exit' function (which is bad, as CCALL is implementation specific).

```
.typedef @int32 = int<32>
.typedef @char = int<8>
.typedef @string = uptr<@char>
.typedef @string_array = uptr<@string>
.funcsig @main_sig = (@int32 @string_array)->()
.funcdecl @my_main <@main_sig>
.funcsig @exit_sig = (@int32) -> ()
.typedef @ufp_exit = ufuncptr<@exit_sig>
.const @const_exit <@ufp_exit> = EXTERN "exit"
.funcdef @my_main VERSION @my_main_v1 <@main_sig> {
    %entry(<@int32>%argc <@string_array>%argv):
        CCALL #DEFAULT <@ufp_exit @exit_sig> @const_exit (%argc)
        RET
}
```

I suggest altering the spec to allow the primordial function to have the signature `(int<32>, uptr<uptr<int<8>>>)->(int<32>)` and have it behave like the C main function. If there are multiple threads, each with their own 'root' function, I suggest having the first one to return an int `i` be equivalent to calling `exit(i)`. If you don't want the program to die, then you'd use `THREADEXIT` instead. So the above could be rewritten as:

```
.funcdef @my_main VERSION @my_main_v1 <(@int32 @string_array)->(@int32)> {
    %entry(<@int32>%argc <@string_array>%argv):
        RET %argc
}
```

Signature list missing from summary in `common-insts.rst`

2018-01-26T04:47:41+11:00

The top of `common-insts.rst` currently looks like the code below. However, this list does not mention the signature list, which is both a required parameter in the IR-building API (`muapi.h`) and is discussed in later sections of the same document (`common-insts.rst`).

```markdown
**Common Instructions** (sometimes abbreviated as "comminst") are instructions that have a common format and are used with the ``COMMINST`` super instruction. They have:

1. An ID and a name. (This means, they are *identified*. See ``__.)
2. A flag list.
3. A type parameter list.
4. A value parameter list.
5. An optional exception clause.
6. A possibly empty (which means optional) keep-alive clause.
```

Meta ID/Name conversion name string `\0`-terminated

2018-01-26T04:47:41+11:00

Is `%name` in `[0x250]@uvm.meta.id_of (%name: @uvm.meta.bytes.r) -> int<32>` a `\0`-terminated string? It seems that the reference implementation makes this assumption. Maybe it would be good to make it clear in the spec?

Indirect branch

2018-01-26T04:47:41+11:00

I notice that the branch instruction is direct (dest must be a block label). Why don't we support indirect branches? Do we have a way of emulating them with TAILCALL? I know that the only code addresses that we can talk about in our type system are function references, so we don't even have a way to compute code addresses otherwise. I note that LLVM has an indirect branch.

Casts in muapi.h, please

2018-01-26T04:47:41+11:00

*Created by: eliotmoss*

Things like MU_BINOP_ADD should have the cast to MuBinOptr, e.g.:

`#define MU_BINOP_ADD ((MuBinOptr)0x01)`

(This is the consensus of the UMass team + Tony)
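One way to see what the cast buys is to compare the static type of the constant with and without it. The sketch below is only an illustration: the `typedef` of `MuFlag` as `uint32_t` is an assumption about `muapi.h`, and `MU_BINOP_ADD_UNCAST` and `IS_BINOPTR` are hypothetical names introduced here. It uses C11 `_Generic` to report whether an expression has type `MuBinOptr`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MuFlag;     /* assumption: underlying flag type */
typedef MuFlag   MuBinOptr;

/* Without the cast the constant is a plain int. */
#define MU_BINOP_ADD_UNCAST 0x01
/* With the cast the constant carries the API type. */
#define MU_BINOP_ADD        ((MuBinOptr)0x01)

/* Hypothetical check: 1 if x has type MuBinOptr, 0 otherwise. */
#define IS_BINOPTR(x) _Generic((x), MuBinOptr: 1, default: 0)
```

With these definitions `IS_BINOPTR(MU_BINOP_ADD)` evaluates to 1 and `IS_BINOPTR(MU_BINOP_ADD_UNCAST)` to 0, so code that depends on the constant's type (overloads in C++ clients, `sizeof`, `_Generic` dispatch) behaves consistently only with the cast.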

Should MuCString be const?

2018-01-26T04:47:41+11:00

In muapi.h the type `MuCString` is defined as `typedef char *MuCString`. It is used by some functions such as `gen_sym`, which is often called with a string literal argument. Should we make `MuCString` a typedef for `const char*`? I don't believe `gen_sym` (or any other function that takes a `MuCString`) has a reason to modify the passed string. I am currently using muapi.h with C++, so I have to use a `const_cast` (which is unsafe; the other alternative is to create a temporary buffer for the string, copy the string into it, and free it after the call, which would be perfectly safe but is probably unnecessary).
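A minimal sketch of the proposed typedef, in C, is shown below. `name_length` is a hypothetical stand-in for an API function that only reads its string argument; it is not part of `muapi.h`. With the `const` typedef, callers can pass string literals and `const char *` values directly, which is what removes the need for a `const_cast` on the C++ side.

```c
#include <assert.h>
#include <string.h>

/* Proposed definition; muapi.h currently has `typedef char *MuCString`. */
typedef const char *MuCString;

/* Hypothetical read-only consumer of a MuCString. */
size_t name_length(MuCString name)
{
    assert(name != NULL);
    return strlen(name);
}
```

Existing callers that pass a mutable `char *` keep compiling, since `char *` converts implicitly to `const char *`.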

Void type allocation

2018-01-26T04:47:41+11:00

Just wondering, does Mu permit `NEW <@void>`?

Calling an exposed Mu function in Mu

2018-01-26T04:47:41+11:00

This problem has been raised and discussed before, but has not been properly documented. How existing implementations (especially the reference implementation) handle it also needs to be documented. This thread shall serve these purposes.

### Problem Description

RPython allows casting a function pointer to and from an address. A situation can occur where the code loads a function address, casts it back to a function pointer and calls it. This poses a problem for Mu because the function is of type `funcref`, and currently `Address` is translated as `uptr`. `ufuncptr` in Mu is specifically for the native interface. A Mu function can be 'exposed' to get a `ufuncptr`, but it is expected that it will be called from C rather than from within Mu itself. What should Mu do in such a case? Should the instruction be `CALL` or `CCALL`?

I do acknowledge that the deeper issue is the mismatch of type system assumptions built into the RPython compiler. Handling it, though, can be quite tricky.

In previous discussions, Yi did point out that in the Zebu implementation this is not a problem. But I believe that this is a problem for the current reference implementation.

Separation of immovability and immortality in `@uvm.native.pin` instruction

2018-01-26T04:47:41+11:00

## Problem Description

Currently the `@uvm.native.pin` instruction bestows both immovability (not able to move) and immortality (kept alive) on the pinned object. However, sometimes we might want an object to be immovable but still collectable by the GC.

This need was raised in mu/mu-client-pypy#10. When an object `p` in the PyPy heap (Mu heap) is bound to an object `o` in C (`CCALL malloc`), `p` is pinned and `o` keeps the address of `p` as an integer, and the reference count on `o` is set to a special value to mark this binding. PyPy's GC, however, can still collect `p`, and this has different implications for `o` depending on the reference count. This behaviour is currently not supported in Mu. The current semantics of `@uvm.native.pin` can result in a memory leak, as `p` is never collected.

Currently the spec doesn't seem to have a very clear specification on this issue. In mu/general-issue-tracker#28 it is mentioned that the pointer **can be used** until the object is unpinned. This does imply both immovability and immortality until a corresponding `@uvm.native.unpin` instruction. This should be clarified further in the spec.

This does raise the problem of pointer validity. If an object is pinned with the option of being mortal, the pointer becomes invalid when the object is collected. Do we then just push the responsibility to the client and say that the client must know what they are doing and must not access an invalid pointer in this way?

## Proposed change

Add an optional flag to the `@uvm.native.pin` instruction to allow mortal pinning:

```
[0x240]@uvm.native.pin [#MORTAL] (%opnd: T) -> uptr
```

- The presence of the flag `#MORTAL` allows the pinned object to be collected by the GC when it becomes unreachable. When the object is collected, all derived `uptr`s become invalid.

Determine generality of Mu's weak reference + finaliser support

2018-01-26T04:47:42+11:00

Mu already has weak refs and there is an existing finalisation proposal. The question this issue seeks to answer is: which language semantics of weak references can these constructs implement? In particular, Java has four levels of strength of pointers, each with specific semantics. Can Mu model that? Likewise, various flavors of Lisp and Scheme offer different weak reference semantics. Which of those can Mu model?

A guess going into this is that Mu can model some, but not all, of the flavors of weak references out in the wild. So another question is: how much do we care? If we want to cover more cases, what additional semantics would do that, in a language-neutral way? Would we instead support some kind of side interface to the Mu collector that allows clients to directly bolt stuff on? What would that look like? (Maybe that's way beyond reasonable.)

COMMINSTs for getting the values of top-level definitions by ID

2018-01-26T04:47:42+11:00

This issue is motivated by https://gitlab.anu.edu.au/mu/mu-client-pypy/issues/8

Currently, the set of common instructions is insufficient. After a bundle is built and loaded at run time, there is no way to get the value of *any* top-level definition by its ID using the COMMINST-based client API. The C-based client API has the `handle_from_xxx` functions that create handles for top-level definitions, but their counterparts are absent from the COMMINSTs.

The straightforward solution is to add the following COMMINSTs so that top-level definitions can be dynamically looked up by ID.

```
[0x268]@uvm.meta.constant_by_id (%id: int<32>) -> T
[0x269]@uvm.meta.global_by_id (%id: int<32>) -> T
[0x26a]@uvm.meta.func_by_id (%id: int<32>) -> T
[0x26b]@uvm.meta.expfunc_by_id (%id: int<32>) -> T
```

## Alternative solutions

### Adding one single COMMINST

The following COMMINST is sufficient to get any top-level definition by ID.

```
[0x268]@uvm.meta.top_level_by_id (%id: int<32>) -> T
```

But it gives the Mu backend compiler less static information with which to translate that instruction.

### Having a per-bundle initialiser function

It would behave like Java's class initialisation method. But it is no different from allowing the client to look up a particular Mu function by ID and then execute it.

Add semantics about GC finaliser to spec

2018-01-26T04:47:42+11:00

The proposal is from https://gitlab.anu.edu.au/mu/general-issue-tracker/issues/27 by Kunshan Wang. The proposal suggests two common instructions, `@uvm.gc.prevent_death_once` and `@uvm.gc.next_object_to_finalise`.

* `@uvm.gc.prevent_death_once` marks an object as finalisable. When the GC finds the object dead, it puts the object into a finalisable object queue.
* `@uvm.gc.next_object_to_finalise` pops an object reference from the queue.

These instructions allow the client to create a finaliser thread (running Mu code). e5848b32 captures the idea and puts it into the spec.

Low-bit ref tagging

2018-01-26T04:47:42+11:00

CakeML, like many ML implementations I suspect, uses sophisticated low-bit tagging in its pointers. While tagref64 in Mu is a start, and is a kind of clever "hack", it does not really cover the CakeML case and also prevents one from using any / all parts of the 64-bit address space. So after some reflection I would like to start discussion of a possible general technique for supporting low-bit-tagged references. (One can imagine high-bit tagging as well using BiBoP - big bag of pages, where each "page" has a given type and thus the high bits tell the types apart, but I don't think that necessarily meets the same need.)

So I start with the idea of lowtaggedref, where n gives the number of tag bits desired and t is the type of object the ref refers to. Note that choosing a particular value for n requires that objects referred to by the lowtaggedref must be aligned on a 2^n boundary. This may be larger than the minimum granule size that the allocator would naturally use, and may lead to fragmentation, but the user is buying into that when making the choice. Actual allocators probably use minimum granule sizes of 4, 8, or 16 bytes anyway, which allows for 2, 3, or 4 tag bits without extra fragmentation.

This information is not enough, however. The reason is that some tag values may indicate that the remainder of the ref is a pointer while other values indicate that it is a non-pointer value. Again, on reflection, and taking into account the emerging spec I am working on with Michael and Tony for an Immix collector that might be used with CakeML *and* with Mu, I propose to cover that case with a different type.

Therefore, lowtaggedref gives n low bits of tag value that can be fetched, etc., separately from a ref value in the remaining bits, and it is implemented by forcing 2^n alignment. It supports a null pointer as well, which also has the same n bits of tag (so an == 0 comparison is not quite enough to detect a null pointer - you have to mask off the tag).
Now, to support dynamic mixing of pointer and non-pointer data, a bit more like tagref64 and what happens in a number of dynamic languages (and CakeML, which uses a "dynamic" representation in the heap because of parameterized types), I add dyntaggedref. Here t is the type of thing that this refers to *if* it is a ref, n is the number of low bits used for tagging, and pattern is a 2^n-bit number that tells the system which tag values indicate that this value is a ref. In particular, if bit i is a 1, then tag value i indicates that the w-n ref bits are a pointer (w is the word size). So, a 2-bit tag requires a 4-bit pattern, a 3-bit tag an 8-bit pattern, etc.

Since these values are statically part of the type, the GC can still pick things apart. It is also possible, using classical Karnaugh map reduction techniques, etc., to derive, automatically, fast masking checks, though the general case would just shift the pattern right and check the low bit (or equivalent).
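The general-case check described above (shift the pattern right by the tag and test the low bit) can be sketched in C as follows. The function names are hypothetical, and the sketch assumes a 64-bit word and at most 5 tag bits so that the 2^n-bit pattern fits in a `uint32_t`; neither limit is part of the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Does `word` hold a ref? n is the number of low tag bits; bit i of
 * `pattern` is 1 when tag value i marks the remaining bits as a pointer. */
bool dyntag_is_ref(uint64_t word, unsigned n, uint32_t pattern)
{
    assert(n >= 1 && n <= 5);  /* assumption: 2^n pattern bits fit in 32 */
    unsigned tag = (unsigned)(word & ((UINT64_C(1) << n) - 1));
    return (pattern >> tag) & 1u;  /* shift pattern right, check low bit */
}

/* Mask off the tag to recover the ref bits (needed for null checks too). */
uint64_t dyntag_ref_bits(uint64_t word, unsigned n)
{
    return word & ~((UINT64_C(1) << n) - 1);
}
```

For example, with n = 2 and pattern 0x5 (tag values 0 and 2 denote refs), the word 0x1002 is a ref to address 0x1000, while 0x1001 is a non-pointer value. A null ref with a non-zero tag, such as 0x2, is only recognised as null after masking.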

push_frame should take frame cursor as parameter

2018-02-19T20:22:27+11:00

This was a mistake made when refactoring the API to use frame cursors: https://gitlab.anu.edu.au/mu/mu-spec/commit/9853cbddf3f34eef88e8f884e162a8789ccb41cc

All functions related to stack manipulation should take a frame cursor as a parameter. Therefore,

```c
void (*push_frame )(MuCtx *ctx, MuStackRefValue stack, MuFuncRefValue func);
```

should be

```c
void (*push_frame )(MuCtx *ctx, MuFCRefValue cursor, MuFuncRefValue func);
```

Change common instructions to intrinsics

2019-07-24T02:44:33+10:00

Common instructions should be changed to intrinsics in the spec, because "common" instructions are extremely rare.

Change global memory to static memory

2019-07-24T02:43:20+10:00

This should be done because the heap is also global.

Redefine the API in XML (or another high-level language)

2019-07-24T02:44:34+10:00

As @U60333591 Javad suggested fixing the terminology (https://gitlab.anu.edu.au/mu/mu-spec/issues/20, https://gitlab.anu.edu.au/mu/mu-spec/issues/21), I suggest fixing the scripts that manage the API headers and the intrinsics documentation, too.

There is a directory `https://gitlab.anu.edu.au/mu/mu-spec/tree/master/scripts` that contains some scripts. In the past, every time I modified intrinsics (still called CommInsts at that moment), I called the `synchronise_everything.sh` script to make things consistent.

- `synchronise_everything.sh`: This executes the following scripts.
    - `muapi-irbuilder-to-comminsts.py`: This parses the `muapi.h` header file, finds the IR builder API part, and generates the corresponding intrinsics (CommInsts) in the `common-insts.rst` document. The purpose is that *all API functions should be available as intrinsics (CommInsts)*, with differences if appropriate.
    - `comminsts-to-muapi.py`: This parses the `common-insts.rst` file, greps the comminst definitions, and generates the corresponding definitions in the `muapi.h` header.

You see, it is quite dirty. I have to parse the embedded code in the `.rst` documents as well as the `.h` header, and inject contents into each other.

In fact, this is my least favourite part of the spec. I have long been thinking that it should be done differently.

- **The API should be defined in a more machine-readable format**, such as XML, and
- **the API should be defined in a higher-level type system**, not C, but one that can be easily mapped to C, Java, or other high-level languages to generate bindings accordingly.
For example, here is an API function that creates a new thread on an unbound stack, and resumes the stack normally:

```c
MuThreadRefValue (*new_thread_nor)(MuCtx *ctx, MuStackRefValue stack,
        MuRefValue threadlocal,
        MuValue *vals, MuArraySize nvals); /// MUAPIPARSER threadlocal:optional;vals:array:nvals
```

The `MUAPIPARSER` magic is followed by several other magics that tell the parser that the `threadlocal` parameter is optional, and that `vals` is an array whose length is `nvals`. What it really means, in Java, is:

```java
MuThreadRefValue new_thread_nor(
        MuStackRefValue stack,
        Optional<MuRefValue> threadlocal,
        List<MuValue> vals);
```

Yes. I have been careful to limit which types the API can use, hoping to make it less susceptible to the complexity of the API (such as struct layout). As a result,

- The possible types of parameters and the return value only include scalar types (including pointers) and lists (as C arrays), but not structs, and
- If a parameter has pointer type, it may or may not be optional, but it is always documented, and
- If a parameter has list type, it is always required to supply its length in another parameter.

But we cannot express the ideas of "optional" and "array with length" in plain C, and those are what the `MUAPIPARSER` magic is for. Using Java is okay since its type system can express those ideas, but it is still difficult to parse. XML should be a better choice.

```xml
```

It should be easy to convert such an XML snippet into a C function declaration (perhaps not the other way around).

I am not sure if the documentation should be included as part of the XML, because I still think reStructuredText is more appropriate for documentation.

I don't know how intrinsics should be described. Maybe inline code snippets are still appropriate, because they should be in Mu IR, and should be precisely specified.
I think most Mu-related projects are not as active as a few years ago, so it should be safe to fix the implementations, too, without worrying about getting in the way of other contributors.