Proposed Lua-like µVM-Client Interface

Created by: wks

In Lua, the C program exchanges values with a "Lua state" using a stack.

All Lua values are kept on the stack and can be converted to and from C values on demand.
All Lua references must be kept on the stack. All operations involving tables require the table operand to be on the stack.

Why?

The type systems of Lua and C are different. This stack segregates all Lua types from C types.
Lua uses garbage collection. (The official Lua uses mark-sweep.) This stack simplifies GC by preventing the C program from keeping a reference to the Lua world.

Reference: The stack in the Lua C API http://www.lua.org/pil/24.2.html

Overview

The Client interacts with the µVM via a µVM Client Agent. The Agent keeps a stack of µVM values, a thread-local allocator and so on. It is the counterpart of a µVM thread, albeit working for the Client. Each Agent is accessible from only one Client thread (it is not thread safe), but a µVM Client may have an arbitrary number of Agents.

There are several principles:

The stack holds any µVM value that can be held in a µVM SSA variable. A cell in the stack is like a µVM SSA variable. Unlike the µVM memory, it does not have a memory location and cannot be referred to.
The Client provides an implementation-specific way to add Client values to the Agent stack and to extract µVM values as Client values.
All µVM API messages that take µVM values as parameters shall use existing µVM values on the stack. All µVM API messages that return µVM values shall push new values on the stack.

Example (Java as Client):

MicroVM mvm = …;

ClientAgent ca = mvm.newClientAgent();

// push some values
ca.pushInt("@i32", 0x12345678);
ca.pushLong("@i64", 0x123456789abcdef0L);
ca.pushFloat(3.14F);
ca.pushDouble(6.28);

// The Java integer type and the µVM type do not need to match.
// The following pushes convert Java integer types to µVM types of different lengths
ca.pushInt("@i64", -0x22334455);
ca.pushLong("@i32", 0x123456789abcdef0L); // truncated to 0x9abcdef0
ca.pushInt("@i1", 1); // integer of 1 bit (boolean type)
ca.pushBool("@i1", true); // same as above.

// Retrieving values from the stack
int v1 = ca.toInt(-4); // Get the 4th element from the top of the stack and convert it to a Java int. So it is -0x22334455.
long v2 = ca.toLong(-4); // Same as above, but convert to a Java long (zero extended). So it is 0xddccbbabL.

// Popping
ca.pop(7); // Pop 7 elements from the stack.

// Memory access
ca.pushGlobal("@global_var"); // Get the internal reference of a global cell and push on the stack
ca.load(MEMORD_ACQUIRE); // Load. Assume the top element is an internal reference.

ca.pushGlobal("@global_var");
ca.pushInt("@i64", 42);
ca.store(MEMORD_RELEASE); // Store. The top element is the new value and the second is the internal reference.

// Calling a µVM function
// This is fairly complicated because this involves creating both a µVM stack and a µVM thread.

ca.pushFunc("@some_func"); // Push a function reference
ca.pushInt("@i32", 42); // Push argument1
ca.pushDouble(3.14); // Push argument2

try {
    ca.newStack(2); // Create a new stack, using a function with 2 arguments on the stack.
    // So the top 2 elements are arguments and the third element is the function itself.
    // Those elements are popped and a new stack value is pushed.
} catch (MicroVMStackOverflowException e) {
    ...
}

ca.newThread(); // Create a new thread. Assume the top element on the stack is a µVM stack value.
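
The two integer conversions mentioned in the comments of the listing above can be checked with plain Java arithmetic. The sketch below does not use any µVM API. WidthDemo and its helper names are made up for illustration, and it assumes that narrowing keeps the low-order bits and that the widening on retrieval is a zero extension, as the comments state.

```java
// Plain-Java check of the conversions described in the example above.
// WidthDemo and its helpers are hypothetical names, not part of any µVM API.
final class WidthDemo {
    // Keep only the low 32 bits of a 64-bit value ("truncated" in the listing).
    static int truncateTo32(long v) {
        return (int) v;
    }

    // Widen a 32-bit pattern to 64 bits without copying the sign bit.
    static long zeroExtend32(int v) {
        return v & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // prints 9abcdef0
        System.out.println(Integer.toHexString(truncateTo32(0x123456789abcdef0L)));
        // prints ddccbbab
        System.out.println(Long.toHexString(zeroExtend32(-0x22334455)));
    }
}
```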

API functions

The principles are

Pushing operations add new values to the top.
The popping operation removes values from the top.
Queries (the operations that convert values back to Java values) are non-destructive. They keep the values.
Operations on the top of the stack are destructive: they pop the operands, like the JVM, then push new values on the top.
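
These four principles can be modelled with a toy stack. This is only a sketch under assumptions: SketchStack is a hypothetical class, cells hold plain Java longs instead of typed µVM values, only negative positions (counted from the top) are modelled, and the add operation is a stand-in for "some destructive operation" because the proposal does not define an arithmetic message.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the Client Agent stack; SketchStack is a hypothetical name.
final class SketchStack {
    private final List<Long> cells = new ArrayList<>();

    // Pushing adds a new value on top.
    void push(long v) {
        cells.add(v);
    }

    // Popping removes num values from the top.
    void pop(int num) {
        if (num < 0 || num > cells.size()) {
            throw new IllegalArgumentException("cannot pop " + num);
        }
        for (int i = 0; i < num; i++) {
            cells.remove(cells.size() - 1);
        }
    }

    // Queries are non-destructive. Position -1 is the top, -2 the one below, etc.
    long toLong(int pos) {
        if (pos >= 0 || -pos > cells.size()) {
            throw new IndexOutOfBoundsException("bad position " + pos);
        }
        return cells.get(cells.size() + pos);
    }

    // A destructive operation: pops two operands, then pushes one result.
    void add() {
        long b = toLong(-1);
        long a = toLong(-2);
        pop(2);
        push(a + b);
    }

    int size() {
        return cells.size();
    }
}
```

Reading with toLong leaves the stack size unchanged, while add shrinks it by one, which is the same pattern the memory and stack messages follow.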

Getting the µVM Client Agent

message: new_client_agent
parameters: none
returns: a handle to the new Client Agent

Create a new Client Agent.

Example Java signature: public ClientAgent MicroVM#newClientAgent()

message: close_client_agent
parameters:
1. ca: the handle to the client agent
returns: none

Close the Client Agent.

Example Java signature: public void ClientAgent#close()

Pushing new values to the stack

message: push_value
parameters:
1. uvm_type: the µVM type of the value
2. val: the value in the Client's representation
returns: none
stack top:
- before: ...
- after: ..., val_in_uvm_type

Convert the Client value val to the µVM type uvm_type and push it to the stack. This message only works for non-reference values, including integers and floating point numbers.

The µVM may implement this as multiple functions/methods that best suit the Client programming language.

Example Java signatures:

public void ClientAgent#pushInt(int uvmTypeID, int val)
public void ClientAgent#pushLong(int uvmTypeID, long val)
public void ClientAgent#pushBigInteger(int uvmTypeID, BigInteger val)
public void ClientAgent#pushFloat(int uvmTypeID, float val): will truncate/extend to the µVM type
public void ClientAgent#pushDouble(int uvmTypeID, double val): will truncate/extend to the µVM type
public void ClientAgent#pushFloatNoType(float val): always convert to the µVM float type
public void ClientAgent#pushDoubleNoType(double val): always convert to the µVM double type.
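
For the floating point variants, "truncate/extend" can be read as the usual conversion between single and double precision. A minimal sketch, assuming the µVM float and double types are IEEE 754 binary32 and binary64 like Java's float and double. FloatPushDemo and its methods are hypothetical names, not part of the proposed API.

```java
// Hypothetical helper showing what "truncate/extend" could mean for floating point pushes.
final class FloatPushDemo {
    // A Java float pushed as a µVM double is widened exactly.
    static double extend(float val) {
        return (double) val;
    }

    // A Java double pushed as a µVM float is rounded to the nearest float.
    static float narrow(double val) {
        return (float) val;
    }

    public static void main(String[] args) {
        // Widening is exact, but 3.14F was already rounded to float precision,
        // so the widened value differs from the double literal 3.14.
        System.out.println(extend(3.14F) == 3.14);            // prints false
        // Widening and then narrowing gives back the original float.
        System.out.println(narrow(extend(3.14F)) == 3.14F);   // prints true
    }
}
```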

message: push_global
parameters:
1. global: the ID/name of a µVM global cell
returns: none
stack top:
- before: ...
- after: ..., global_val

Push an internal reference of a global cell to the stack.

Example Java signature: public void ClientAgent#pushGlobal(int uvmGlobalID)

message: push_func
parameters:
1. func: the ID/name of a µVM function
returns: none
stack top:
- before: ...
- after: ..., func_val

Push a function reference (µVM's func type) of a µVM function to the stack.

Example Java signature: public void ClientAgent#pushFunc(int uvmFuncID)

Converting to Client types

message: to_client_value
parameters:
1. pos: the position in the stack
returns: the µVM value in the client type
stack top: not changed

Convert a value in the stack to the client type. This applies to non-reference types, including integers and floating point numbers.

The µVM may implement this as multiple functions/methods that best suit the Client language.

Example Java signatures:

public int ClientAgent#toInt(int pos)
public long ClientAgent#toLong(int pos)
public BigInteger ClientAgent#toBigInteger(int pos)
public float ClientAgent#toFloat(int pos)
public double ClientAgent#toDouble(int pos)

Popping

message: pop
parameters:
1. num: the number of values to pop
returns: none
stack top:
- before: ..., elem_1, elem_2, ..., elem_num
- after: ...

Pop num elements from the stack.

Example Java signature: public void ClientAgent#pop(int num)

Memory Allocation

message: new
parameters:
1. type: the ID/name of the µVM type of the object
returns: none
stack top:
- before: ...
- after: ..., ref

Allocate an object of type type on the µVM heap and push the object reference on the stack.

Example Java signature: public void ClientAgent#newObj(int typeID)

message: new_hybrid
parameters:
1. type: the ID/name of the µVM type of the object
returns: none
stack top:
- before: ..., len
- after: ..., ref

Allocate an object of type type, which must be a hybrid type, on the µVM heap and push the object reference on the stack. The length of the variable part is len, which may be of any µVM integer type and is zero-extended to the machine word length.

Example Java signature: public void ClientAgent#newHybridObj(int typeID)

Memory Access

message: load
parameters:
1. memord: the memory ordering
returns: none
stack top:
- before: ..., iref
- after: ..., val

Load from an internal reference iref on the stack and push the loaded value to the stack, using the memord memory ordering.

Example Java signature: public void ClientAgent#load(MemoryOrder memOrd)

message: store
parameters:
1. memord: the memory ordering
returns: none
stack top:
- before: ..., iref, new_val
- after: ...

Store new_val on the stack into an internal reference iref on the stack, using the memord memory ordering.

Example Java signature: public void ClientAgent#store(MemoryOrder memOrd)
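
The stack effects of load and store can be traced with a toy model. This is a sketch under assumptions, not the proposed implementation: MemorySketch is a hypothetical class, an internal reference is modelled as the name of the global cell, memory orderings are ignored, and cells that were never written read as 0.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of the load/store stack effects. All names are hypothetical.
final class MemorySketch {
    private final Deque<Object> stack = new ArrayDeque<>();
    private final Map<String, Long> globals = new HashMap<>();

    // Push an "internal reference", modelled here as the global's name.
    void pushGlobal(String name) {
        stack.push(name);
    }

    void pushLong(long v) {
        stack.push(v);
    }

    // before: ..., iref    after: ..., val
    void load() {
        String iref = (String) stack.pop();
        stack.push(globals.getOrDefault(iref, 0L));
    }

    // before: ..., iref, new_val    after: ...
    void store() {
        long newVal = (Long) stack.pop();   // top element is the new value
        String iref = (String) stack.pop(); // second element is the reference
        globals.put(iref, newVal);
    }

    long topAsLong() {
        return (Long) stack.peek();
    }

    int size() {
        return stack.size();
    }
}
```

With this model, pushing "@global_var" and 42 and then calling store leaves the stack empty, and a following pushGlobal plus load leaves exactly one cell holding 42.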

Stack and Thread operations

message: new_stack
parameters:
1. nparams: the number of parameters to the stack-bottom function
returns: none
stack top:
- before: ..., func, arg_1, arg_2, ..., arg_nparams
- after: ..., stack

Create a new stack using func as the stack-bottom function and arg_x as its arguments. Push the newly created stack value to the stack.

Example Java signature: public void ClientAgent#newStack(int nParams)

message: new_thread
parameters: none
returns: none
stack top:
- before: ..., stack
- after: ..., thread

Create a new thread which is initially bound to a stack stack. Push the thread value on the Client Agent stack. The new thread thread starts execution immediately.

Example Java signature: public void ClientAgent#newThread()
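
The stack effects of new_stack and new_thread can be sketched the same way. Everything here is hypothetical: StackThreadSketch, StackValue and ThreadValue are placeholder names, function references and arguments are arbitrary Java objects, and no µVM thread is actually started.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the new_stack and new_thread stack effects. All names are hypothetical.
final class StackThreadSketch {
    // Placeholder for a µVM stack value: remembers the bottom function and its arguments.
    static final class StackValue {
        final Object func;
        final List<Object> args;
        StackValue(Object func, List<Object> args) { this.func = func; this.args = args; }
    }

    // Placeholder for a µVM thread value: remembers the stack it is bound to.
    static final class ThreadValue {
        final StackValue boundStack;
        ThreadValue(StackValue boundStack) { this.boundStack = boundStack; }
    }

    private final List<Object> cells = new ArrayList<>();

    void push(Object v) { cells.add(v); }
    Object top() { return cells.get(cells.size() - 1); }
    int size() { return cells.size(); }
    private Object popOne() { return cells.remove(cells.size() - 1); }

    // before: ..., func, arg_1, ..., arg_nparams    after: ..., stack
    void newStack(int nParams) {
        if (nParams < 0 || cells.size() < nParams + 1) {
            throw new IllegalStateException("need func plus " + nParams + " arguments");
        }
        List<Object> args = new ArrayList<>();
        for (int i = 0; i < nParams; i++) {
            args.add(0, popOne()); // popped last-to-first, so insert at the front
        }
        Object func = popOne();
        push(new StackValue(func, args));
    }

    // before: ..., stack    after: ..., thread
    void newThread() {
        if (cells.isEmpty() || !(top() instanceof StackValue)) {
            throw new IllegalStateException("top element must be a stack value");
        }
        push(new ThreadValue((StackValue) popOne()));
    }
}
```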

Other API functions

TODO: define them later

is_int, is_float, is_ref, is_iref, ... is_stack, is_thread, is_tagref64
tagref64_is_int, ... tagref64_get_ref, ..., tagref64_set_fp... : manipulate the tagref64 type.
copy_value, remove_value: manipulate the Client Agent stack.
extract_field, insert_field: manipulate struct types
extract_element, insert_element: manipulate vector types
get_iref, get_field_iref, get_elem_iref, shift_iref, get_fixed_part_iref, get_var_part_iref: manipulate reference types.
get_current_stack, kill_stack, bind_thread_to_stack: advanced thread/stack operations
get_active_func_version_id, get_current_instruction_id, dump_keepalive_variables, pop_frame, push_frame: for OSR

Known Issues

This API assumes a stack that can contain ANY µVM type applicable to SSA variables. This makes it dynamically typed. This is good for Lua because Lua is dynamic and has a small set of types (only nil, boolean, number, string, table, function, userdata, ...), all of which have similar sizes.

The main problem for the µVM arises with values of µVM struct types, especially large structs, which are themselves better represented as references to heap objects than as values. Corner cases include a "complex number" type, which can be represented as a struct of two doubles. In any case, extra type information must be kept so that the stack knows the type of each of its elements.
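
One way to picture the "extra type information" is a cell that carries a tag next to its payload. This is only an illustrative sketch: TaggedCell, its Kind enum and its factory methods are hypothetical, and a real µVM implementation may represent cells very differently.

```java
// Hypothetical tagged cell: each stack element carries its µVM type next to its payload.
final class TaggedCell {
    enum Kind { INT, DOUBLE, STRUCT }

    final Kind kind;
    final int bitWidth;     // meaningful for INT only
    final Object payload;   // Long for INT, Double for DOUBLE, TaggedCell[] for STRUCT

    private TaggedCell(Kind kind, int bitWidth, Object payload) {
        this.kind = kind;
        this.bitWidth = bitWidth;
        this.payload = payload;
    }

    static TaggedCell ofInt(int bitWidth, long bits) {
        return new TaggedCell(Kind.INT, bitWidth, bits);
    }

    static TaggedCell ofDouble(double v) {
        return new TaggedCell(Kind.DOUBLE, 0, v);
    }

    // A struct value is stored inline as a copy of its field cells.
    static TaggedCell ofStruct(TaggedCell... fields) {
        return new TaggedCell(Kind.STRUCT, 0, fields.clone());
    }

    // Every query has to check the tag before converting.
    double toDouble() {
        if (kind != Kind.DOUBLE) {
            throw new IllegalStateException("cell holds " + kind + ", not DOUBLE");
        }
        return (Double) payload;
    }

    TaggedCell field(int index) {
        if (kind != Kind.STRUCT) {
            throw new IllegalStateException("cell holds " + kind + ", not STRUCT");
        }
        return ((TaggedCell[]) payload)[index];
    }
}
```

A complex number would then be `TaggedCell.ofStruct(TaggedCell.ofDouble(re), TaggedCell.ofDouble(im))`, which shows the cost: a single stack cell now owns a variable amount of data, and each access pays for a tag check.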

I have to trust the µVM implementation to handle the dynamic typing efficiently.