Commit e14f9334 authored by Kunshan Wang

MicroVM-client interface.

parent 7b67fc85
......@@ -14,5 +14,7 @@ Contents:
- `Instruction Set <instruction-set>`__
- `Intrinsic Functions <intrinsic-funcs>`__
- `Memory Model <memory-model>`__
- `µVM-Client Interface <uvm-client-interface>`__
- `Implementation <implementation>`__
.. vim: tw=80
There are two aspects of the MicroVM, namely the specification and the
Some details of the MicroVM are not specified by the specification and are left
to the implementation. They include:
- The underlying platform, i.e., the hardware, the word length (16-bit, 32-bit,
64-bit, ...), the endianness (big-endian, little-endian, ...), the operating
system, etc.
- The programming language and the runtime environment the µVM itself is
implemented in and run on. It is not always practical to run the µVM upon
another VM, but the specification does not require a native implementation.
- The way µVM interfaces with the client. They can be the same program, in the
same process or in different processes. They can be written in the same
language or different languages.
- The way µVM IR code is represented and transmitted. The µVM specification
defines the µVM IR, including the type system, the instruction set, etc.,
their semantics and two representations (text and binary) for interchanging.
However, for practical reasons, the client may represent the µVM IR in
equivalent formats.
- The way µVM IR code is executed. The µVM is designed for the convenience of
just-in-time compiling, but does not forbid implementing it as an interpreter.
- The way µVM threads are implemented. The behaviour of µVM threads is
specified and the µVM is designed with simultaneous multi-threading in mind,
but the implementation may or may not implement threading using native
- The concrete garbage-collection algorithm used. The reference semantics are
defined and the implementation can choose any garbage collector as deems
Implementations should advertise their specific details to their users.
.. vim: tw=80
µVM-Client Interface
This page defines the abstract interface between the µVM and the Client,
including messages, call-backs, signals and so on. Because how a µVM is
implemented is not defined by this specification, the concrete interface (i.e.
the concrete API in high-level languages) of particular µVM implementations may
How to start a µVM and/or a client is implementation-specific.
In the beginning, there are no µVM stacks and no µVM threads in the µVM.
µVM IR Code Loading
µVM IR code is provided by the client. It is delivered in the unit of a bundle.
A bundle consists of many type definitions, function signature definitions,
constant definitions, global data definitions, function declarations and
function definitions, which are collectively called top-level definitions. See
`uVM IR <uvm-ir>`__ for more details.
How a bundle is delivered from the client to the µVM is implementation-specific.
Multiple bundles can be sequentially delivered to the µVM. If the µVM implements
parallel bundle delivery, the result must be equivalent to delivering them in
some specific sequence.
In a bundle, if a type, function signature, constant, global data definition or
function declaration has the same ID or name as an existing top-level definition
from a previous bundle, it is an error.
If a function definition has the same ID as a previous function definition or
function declaration, it must also have the same function signature and the new
function definition **redefines** the previous function definition or
declaration. If the signatures are different, it is an error.
When a function definition redefines another function definition or declaration,
all existing call sites to the previously defined or undefined function now
call the newly defined function.
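
The loading rules above can be sketched as follows. This is only an illustrative
model, not part of the specification: the ``Registry`` class, the tuple layout
of a bundle and the all-or-nothing commit are assumptions made for the example,
and checks within a single bundle are omitted.

```python
class BundleError(Exception):
    """Raised when a bundle violates the loading rules."""


class Registry:
    """Hypothetical model of the top-level definitions known to a µVM."""

    def __init__(self):
        self.definitions = {}  # ID -> kind, for non-function definitions
        self.functions = {}    # ID -> {"sig": signature, "body": body or None}

    def load_bundle(self, bundle):
        """Check a bundle against previous bundles, then commit it.

        `bundle` is a list of (kind, ident, sig, body) tuples (assumed layout).
        Nothing is committed if any entry is in error (an assumption).
        """
        new_defs, new_funcs = {}, {}
        for kind, ident, sig, body in bundle:
            if kind == "funcdef":
                # Assumption: clashing with a non-function definition is an
                # error, following the identifier-uniqueness rule.
                if ident in self.definitions:
                    raise BundleError(f"{ident} is already defined")
                old = self.functions.get(ident)
                if old is not None and old["sig"] != sig:
                    raise BundleError(f"{ident} redefined with another signature")
                new_funcs[ident] = {"sig": sig, "body": body}
            else:
                # Types, signatures, constants, global data and declarations
                # must not reuse an ID from a previous bundle.
                if ident in self.definitions or ident in self.functions:
                    raise BundleError(f"{ident} is already defined")
                if kind == "funcdecl":
                    new_funcs[ident] = {"sig": sig, "body": None}
                else:
                    new_defs[ident] = kind
        self.definitions.update(new_defs)
        self.functions.update(new_funcs)  # redefinition replaces the old body
```

Because call sites in this model would look a function up by ID, replacing the
entry in ``functions`` is enough to make every later call reach the new body.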
TODO: when a µVM thread runs simultaneously with the client loading a bundle, is
the old version of a function still visible to any µVM thread? If so, is the old
version of the function still visible to the client?
Stack and Thread Creation
Trap and Undefined Function Handling
Client-held GC Roots
Direct Memory Access for the Client
Signal Handling
The µVM needs to handle (to be defined) some hardware traps including
divide-by-zero errors and floating point exceptions. These should be implemented
by signal handling in UNIX-like operating systems. Meanwhile the client may also
need to handle such erroneous cases, for example, when implementing an
interpreter. However, the operating system allows a process to register only one
handler for a given signal at a time.
In an environment where the µVM is present, the client should not register the
signal handler. The µVM should register the signal handler. When signals arrive,
e.g. SIGFPE for divide-by-zero error, the µVM should check if the error occurs
in any µVM IR code. If so, it should be handled within the µVM (to be defined)
by taking the exceptional branching (to be defined). If it does not occur in any
µVM IR code, it should let the client handle it by calling back or sending
messages to the client depending on the implementation. Errors like
divide-by-zero within the µVM runtime (e.g. the garbage collector) are fatal and
will not be handled. Any signal handler previously registered by external
libraries will be preserved by the µVM so that, if the error did not occur
within the client either, the signal can be daisy-chained to those libraries.
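
The dispatch order described above can be sketched as a pure decision function.
This is a minimal model under stated assumptions: classifying a fault by
comparing its program counter against address ranges, and the names ``Action``,
``in_any`` and ``dispatch``, are hypothetical and not defined by this
specification. A real implementation would make this decision inside a native
signal handler.

```python
from enum import Enum


class Action(Enum):
    EXCEPTIONAL_BRANCH = "handle inside the µVM"
    FATAL = "error inside the µVM runtime, not handled"
    CLIENT = "call back or send a message to the client"
    CHAIN = "forward to the previously registered handler"


def in_any(ranges, pc):
    """Return True if pc lies in one of the half-open (start, end) ranges."""
    return any(start <= pc < end for start, end in ranges)


def dispatch(pc, ir_code, runtime_code, client_code):
    """Decide who handles a fault raised at address `pc` (assumed model)."""
    if in_any(ir_code, pc):
        return Action.EXCEPTIONAL_BRANCH  # error occurred in µVM IR code
    if in_any(runtime_code, pc):
        return Action.FATAL               # e.g. inside the garbage collector
    if in_any(client_code, pc):
        return Action.CLIENT              # let the client handle it
    return Action.CHAIN                   # daisy-chain to external libraries
```

For example, with µVM IR code compiled into ``[(0x1000, 0x2000)]``, a fault at
``0x1800`` would be handled inside the µVM, while a fault outside every known
range would be passed on to the preserved handler.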
.. vim: tw=80
......@@ -92,19 +92,21 @@ Example::
RET <int<32>> %rv
All identifiers are unique within their scopes, i.e. a global identifier must be
unique among all top-level definitions, of the same kind or not. A local
identifier must be unique in a function definition.
Type Definition
µVM provides a simple but expressive type system.
A type is constructed by a finite but potentially recursive combination of type
constructors, including ``int``, ``float``, ``double``, ``ref``, ``iref``,
``weakref``, ``struct``, ``array``, ``hybrid``, ``void``, ``func``, ``thread``,
``stack`` and ``tagref64``. They are documented in `<type-system>`__.
In the text form, wherever a type is expected, it can be written in line using
the above constructors, or the type can be given a name and referenced by name.
A type definition gives a name to a type. It looks like::
......@@ -280,35 +282,17 @@ A declared function has no body and can be defined later.
Note that the definitions do not have an order. It is allowed to define two
functions that call each other without having to declare the second
A function can be re-defined provided that the signature is not changed. The new
function will replace the old one and all existing call sites to the old
function will automatically call the new version.
The identifier of a function defined by ``.funcdef`` or ``.funcdecl`` represents
a constant SSA Value of type ``func``. It can be used by the
``CALL``, ``INVOKE``, ``TAILCALL`` and
``NEWSTACK`` instructions.
It is an error to have both a function declaration and a function definition of
the same identifier in a single bundle, even if they have the same signature.
Function Identifier
A function can be re-defined in another bundle delivered to the µVM provided
that the signature is not changed. The new function will replace the old one and
all existing call sites to the old function will automatically call the new
version. See `uVM-Client Interface <uvm-client-interface>`__ for details.
Each function, declared or defined, has a unique function identifier, which is
**not** the identifier in the text form or the binary form of the µVM IR. It is
the value of the ``func`` type, which is opaque in the sense that the
underlying binary runtime representation is an implementation detail of the µVM.
It may be implemented as the address of the compiled code, but does not have to
When a function is declared, such a unique ID is reserved for the function. When
defining a function, the function ID is bound to the definition. When
re-defining a function, the newly defined function body replaces the older
version, but the function ID does not change. All existing values of the
``func`` type remain valid, but refer to the newer version of the
function instead. All existing activations of the older version of the function
remain valid. At the implementation's discretion, the garbage collector may
reclaim the space of compiled function code once there are no active frames of
the older version on any stack.
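
The relationship between a function ID and its body can be illustrated with a
small sketch. The ``Func`` class below is hypothetical and only models the
behaviour described above: the handle stays the same across redefinition, while
a frame that has already started keeps running the code it started with.

```python
class Func:
    """Hypothetical stand-in for an opaque value of the ``func`` type."""

    def __init__(self, name):
        self.name = name
        self.body = None  # reserved by a declaration, not yet defined

    def define(self, body):
        """Bind (or rebind) the function ID to a body; the ID is unchanged."""
        self.body = body

    def call(self, *args):
        # What a µVM does when an undefined function is called is outside
        # this sketch; here it is simply reported as an error.
        if self.body is None:
            raise RuntimeError(f"{self.name} is declared but not defined")
        return self.body(*args)


f = Func("@f")                 # declaration reserves the ID
f.define(lambda x: x + 1)      # first definition
handle = f                     # an existing value of the ``func`` type
active_frame_code = f.body     # code held by an already active frame
f.define(lambda x: x * 10)     # redefinition replaces the body

new_result = handle.call(2)    # existing handle reaches the new version
old_result = active_frame_code(2)  # the active frame still runs old code
```

In this model the old body becomes unreachable once ``active_frame_code`` is
dropped, which mirrors the point at which a collector could reclaim it.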
The identifier of a function defined by ``.funcdef`` or ``.funcdecl`` represents
a constant SSA Value of type ``func``. It can be used by the ``CALL``,
``INVOKE``, ``TAILCALL`` and ``NEWSTACK`` instructions.
Function Body