Commit 87369ef1 authored by Adam R. Nelson

Merge branch 'master' of https://github.com/microvm/microvm-refimpl2 into client-uir-writer

parents c161413e e723c4be
Mu Reference Implementation version 2
# Mu Reference Implementation version 2
Version 2.1.0
This project is the current reference implementation of Mu, the micro virtual
machine designed by [The Micro Virtual Machine Project](http://microvm.org).
@@ -13,11 +14,12 @@ This project is based on the previous works of
[microvm-refimpl](https://github.com/microvm-project/microvm-refimpl) is the
previous reference implementation.
How to compile
## How to compile
**For the impatient**:
* Install JDK 8. If you use Mac, download from
* If you use Mac, install [Homebrew](http://brew.sh/).
* Install [Scala](http://scala-lang.org/) 2.11. If you use Mac and Homebrew,
`brew install scala`.
@@ -42,30 +44,43 @@ sbt update genSrc eclipse
**Detailed guide**:
You need [Scala](http://scala-lang.org/) 2.11 and
The reference implementation is developed and tested with Java VM 8. You need a
JRE to build the Scala/Java part, and a JDK to build the C binding.
You also need [Scala](http://scala-lang.org/) 2.11 and
[sbt](http://www.scala-sbt.org/) 0.13. It is recommended to install them using
the package manager of your operating system or distribution (including apt-get,
yum, pacman, etc. for GNU/Linux distributions and Homebrew for Mac OS X).
the package manager of your operating system or distribution (such as apt-get,
yum, pacman, etc. for GNU/Linux distributions and Homebrew for Mac OS X) if such
packages are available.
For Ubuntu users: Ubuntu 15.10 does not provide sbt in its repository. Please
[download sbt](http://www.scala-sbt.org/download.html) from the official sbt web
site, or follow the [official sbt installing guide for
Linux](http://www.scala-sbt.org/0.13/tutorial/Installing-sbt-on-Linux.html). If
you experience any "certificate" problems, [this
page](https://github.com/sbt/sbt/issues/2295) provides a solution.
Then after cloning this repository, you can simply invoke `sbt compile` to
compile this project. Or you can do it step by step:
* To download all dependencies from the Maven central repository, invoke `sbt
To download all dependencies from the Maven central repository, invoke `sbt
* To generate the Mu IR parser from the Antlr grammar, invoke `sbt genSrc`. The
generated sources will be in the `target/scala-2.11/src_managed` directory.
To generate the Mu IR parser from the Antlr grammar, invoke `sbt genSrc`. The
generated sources will be in the `target/scala-2.11/src_managed` directory.
* To compile, invoke `sbt compile`. This will also generate the Mu IR parser
using Antlr.
To generate an Eclipse project, install the [sbt-eclipse
plugin](https://github.com/typesafehub/sbteclipse) and invoke `sbt eclipse`.
Make sure you generate the parser before creating the Eclipse project, so that
the generated sources will be on the Eclipse build path.
To compile, invoke `sbt compile`. This will also generate the Mu IR parser using
Make sure you generate the parser (`sbt genSrc`) before creating the Eclipse
project, so that the generated sources will be on the Eclipse build path.
IntelliJ IDEA has plugins for Scala and SBT. Make sure you don't commit `.idea`
or generated project files into the repository.
How to run
## How to run
There is a sample factorial program (generously provided by @johnjiabinzhang) in
the `src/test` directory. To run the program with all dependencies on the
@@ -86,16 +101,120 @@ sbt 'set fork:=true' 'test:runMain junks.FactorialFromRPython'
`fork := true` tells sbt to run the program in a different process than the one
running sbt itself.
Author and Copyright
## Implementation details
This reference implementation aims to be easy to work with, but does not have
high performance. It may be used by client writers to evaluate the Mu micro VM,
and may also be used by Mu micro VM implementers as a reference to compare with.
The micro VM is implemented as an interpreter written in Scala. The main class
is `uvm.refimpl.MicroVM`, which implements the `MuVM` struct specified by the
but is more Scala-like. The client interacts with the micro VM via
`uvm.refimpl.MuCtx` instances created by the `MicroVM` instance, which
corresponds to the `MuCtx` struct in the spec. `uvm.refimpl.MuValue` and its
subclasses implement the `MuValue` handles, but have a real Scala type hierarchy
and do extra type checking when converting, which is not required by the spec.
The client can also be written in C or other languages that can interface with
C. (TODO: this part is under construction)
### Threading model
It uses green threads to execute multiple Mu threads and uses a round-robin
scheduler: the interpreter iterates over all active threads, executes one
instruction for each active thread, then repeats this process. However, the whole
Scala-based program itself is **not thread safe**. Do not run multiple JVM or
native threads. This means you can still experiment with concurrent Mu
programs, but there are some corner cases that do not work in this refimpl. For
- Waiting for other Mu threads in the trap handler. The trap handler is executed
by the same thread executing the Mu IR. While the trap handler runs, no Mu
program is executed. So if you want to use
to wait for a certain Mu thread to reach a certain rendezvous point (a common
optimisation trick), you should either wait within Mu IR (not in trap
handlers) or try the high-performance Mu implementation which is still being
- Synchronising with concurrent native programs via pointers, atomic memory
access and futex. This is the realistic way for Mu to synchronise with
native programs or foreign languages, but this refimpl implements atomic
memory access as non-atomic (since it uses green threads) and implements futex
in Scala (since it has its own scheduler).
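The round-robin scheduling described above can be illustrated with a minimal C sketch. It is not the refimpl's Scala code: `ToyThread`, `toy_step` and `toy_run_all` are hypothetical names, and a "thread" here is only a counter of remaining instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Mu thread: it only counts instructions. */
typedef struct {
    int remaining;  /* instructions left before the thread finishes */
    int executed;   /* instructions executed so far */
} ToyThread;

/* Execute exactly one "instruction" of an active thread. */
static void toy_step(ToyThread *t) {
    assert(t->remaining > 0);
    t->remaining--;
    t->executed++;
}

/* Round-robin: each pass executes one instruction for every active thread.
   Stops when a pass finds no active thread; returns the number of passes
   that executed at least one instruction. */
static int toy_run_all(ToyThread *threads, size_t n) {
    int passes = 0;
    for (;;) {
        int any_active = 0;
        for (size_t i = 0; i < n; i++) {
            if (threads[i].remaining > 0) {
                toy_step(&threads[i]);
                any_active = 1;
            }
        }
        if (!any_active) {
            break;
        }
        passes++;
    }
    return passes;
}
```

Because a single loop drives every thread, nothing else runs while one step (or, in the refimpl, one trap handler) is in progress, which is why waiting inside a trap handler for another Mu thread cannot make progress.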
The MicroVM instance will not start executing unless its `execute()` method is
called. This method is specific to this implementation, and is not defined in
the specification. This also means the *client cannot run concurrently with the
MicroVM*, i.e. once started, the client can only intervene in the execution in
**trap handlers**. So a common use pattern is:
val microVM = new MicroVM()
val uir = myCompiler.compile(sourceCode)
val ctx = microVM.newContext()
microVM.setTrapHandler(theTrapHandler) // Set the trap handler so the client
// can talk with the VM when trapped.
val stack = ctx.newStack(theMainFunction)
val thread = ctx.newThread(stack, Seq(params, to, the, main, function))
microVM.execute() // The current JVM thread will run on behalf of the MicroVM.
// This blocks until all Mu threads stop.
// However, MicroVM will call theTrapHandler.
Only the text-based IR and HAIL are implemented. The binary-based IR and HAIL
script do not have high priority at this point, because our current focus is to
implement a correct Mu VM and the text-based IR is easier for debugging. IR
parsing is also not yet known to be a bottleneck.
### Garbage collection
This reference implementation has an exact tracing garbage collector with a
mark-region small object space and a mark-sweep large object space.
### IR implementation-specific details
- Many undefined behaviours in the specification will raise
`UvmRuntimeException`, such as division by zero, going below the last frame of
a stack, accessing a NULL reference, etc. But this behaviour is not
- `int<n>` for n = 1 to 64 are implemented. `vec<T n>` is implemented for all T
that are int, float or double, and all n >= 1. However, only 8, 16, 32, 64-bit
integers, float, double, `vec<int<32> 4>`, `vec<float 4>` and `vec<double 2>`
can be loaded or stored from the memory.
- The tagged reference type `tagref64` is fully implemented.
- Out-of-memory errors will terminate the VM rather than letting the Mu IR
handle such failures via the exception clauses.
### Native interface
This reference implementation assumes it is running on x86-64 on either Linux or
OSX. It implements the [AMD64 Unix Native
of the specification. It can call native functions from Mu IR and let native
programs call back to Mu IR.
It does not support throwing Mu exceptions into native programs, or handling
C++ exceptions.
## Author and Copyright
This project is created by Kunshan Wang, Yi Lin, Steve Blackburn, Antony
Hosking, Michael Norrish.
This project is released under the CC-BY-SA license. See `LICENSE`.
## Contact
Kunshan Wang <kunshan.wang@anu.edu.au>
@@ -16,7 +16,8 @@ licenses := Seq("CC BY-SA 4.0" -> url("https://creativecommons.org/licenses/by-s
scalaVersion := "2.11.7"
libraryDependencies ++= Seq(
"org.antlr" % "antlr4" % "4.5.1",
"org.scala-lang" % "scala-reflect" % "2.11.7",
"org.antlr" % "antlr4" % "4.5.1-1",
"com.typesafe.scala-logging" %% "scala-logging" % "3.1.0",
"ch.qos.logback" % "logback-classic" % "1.1.3",
"com.github.jnr" % "jnr-ffi" % "2.0.7",
@@ -36,3 +37,14 @@ antlr4GenListener in Antlr4 := false
antlr4GenVisitor in Antlr4 := false
lazy val makeClasspathFile = taskKey[Unit]("write the run-time classpath to target/jars.txt as colon-separated list")
makeClasspathFile := {
val cp = (fullClasspath in Runtime).value.files
println("fullClasspath: \n" + cp.mkString("\n"))
val cpStr = cp.mkString(":")
IO.write(new java.io.File("cbinding/classpath.txt"), cpStr)
CFLAGS += -std=gnu11
ifndef JAVA_HOME
$(error JAVA_HOME is required. Invoke with 'make JAVA_HOME=/path/to/java/home')
CFLAGS += -I $(JAVA_HOME)/include
ifndef OS
uname := $(shell uname)
ifeq ($(uname),Darwin)
ifeq ($(uname),Linux)
$(error Unrecognized operating system $(uname). I currently only worked on OSX and Linux.)
ifeq ($(OS),OSX)
CFLAGS += -I $(JAVA_HOME)/include/darwin
LDFLAGS += -L $(JAVA_HOME)/jre/lib/server -l jvm -rpath $(JAVA_HOME)/jre/lib/server -install_name '@rpath/libmurefimpl2start.so'
ifeq ($(OS),LINUX)
CFLAGS += -I $(JAVA_HOME)/include/linux
LDFLAGS += -Wl,--no-as-needed -L $(JAVA_HOME)/jre/lib/amd64/server -l jvm -Wl,-rpath,$(JAVA_HOME)/jre/lib/amd64/server
.PHONY: all
all: libs tests
.PHONY: libs
libs: libmurefimpl2start.so
libmurefimpl2start.so: refimpl2-start.c classpath.h
$(CC) -fPIC -shared $(CFLAGS) -o $@ $< $(LDFLAGS)
classpath.txt: ../build.sbt
cd .. ; sbt makeClasspathFile
classpath.h: classpath.txt
xxd -i classpath.txt > classpath.h
.PHONY: tests
tests: test_client
test_client: test_client.c libmurefimpl2start.so
$(CC) `./refimpl2-config --istart --cflags --libs` -o $@ $<
.PHONY: clean veryclean
rm *.so test_client
rm *.so test_client classpath.txt classpath.h
# The C binding of the Mu reference implementation
This directory contains the C binding so you can write the Mu client in C. If
you write the client in a JVM language, you don't need this binding since the
reference implementation is already implemented on the JVM.
## Building
Make sure you build the micro VM. Go to the parent directory and type `sbt
compile`. Then come here and type `make JAVA_HOME=/path/to/the/java/home`.
This will produce the `libmurefimpl2start.so` dynamic library that contains code
that starts the JVM and creates the MuVM instance for your client written in C.
This library will **hard code the classpath and the JVM library path** into the
shared object, so it only works for this particular `microvm-refimpl2`
repository you cloned. This is because this reference implementation is a
research project and is unlikely to be installed into any well-known place such
as `/usr`. Hard-coding the path of the JVM's `libjvm.so` using "rpath"
eliminates the need to add the JVM library path to `LD_LIBRARY_PATH`, since the
JVM is seldom installed into `/usr/lib`.
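As a rough illustration of how an embedded classpath can become a JVM option: `xxd -i classpath.txt` emits an `unsigned char classpath_txt[]` array (not NUL-terminated) and an `unsigned int classpath_txt_len`. The sketch below is an assumption-laden stand-in rather than the code in `refimpl2-start.c`; the array contents and the helper name `make_classpath_option` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for what `xxd -i classpath.txt` would generate.
   The contents are hypothetical; note there is no trailing NUL. */
static unsigned char toy_classpath_txt[] = {
    'a', '.', 'j', 'a', 'r', ':', 'b', '.', 'j', 'a', 'r'
};
static unsigned int toy_classpath_txt_len = 11;

/* Build "-Djava.class.path=<classpath>" as a NUL-terminated string.
   Returns NULL on allocation failure; the caller frees the result. */
static char *make_classpath_option(const unsigned char *cp, unsigned int len) {
    static const char prefix[] = "-Djava.class.path=";
    size_t plen = sizeof(prefix) - 1;       /* length without the NUL */
    char *opt = malloc(plen + len + 1);
    if (opt == NULL) {
        return NULL;
    }
    memcpy(opt, prefix, plen);
    memcpy(opt + plen, cp, len);            /* copy exactly len bytes */
    opt[plen + len] = '\0';
    return opt;
}
```

Sizing the buffer from the prefix length and the embedded length avoids relying on the generated array being NUL-terminated.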
## Usage
The `refimpl2-config` script generates the necessary compiler and linker flags
for you.
But you should think first: does my client start the Mu VM, or does some other
program start the VM for me?
### Your program starts Mu
You write the client in C, and it starts the JVM and creates a micro VM
The `libmurefimpl2start.so` library contains functions (defined in
`refimpl2-start.h`) that start the JVM and create the Mu instance for you.
Your client invokes the `mu_refimpl2_new()` function, which returns a `MuVM*`.
After use, call `mu_refimpl2_close` to close it. The `mu_refimpl2_new_ex`
function provides more options.
Use the `refimpl2-config` script with the `--istart` flag to indicate your
program will create the Mu reference implementation instance. Such clients need
to link against `libmurefimpl2start.so`, and the script will **hard-code its
location using rpath** for the same reason the library hard-codes the classpath
and JVM locations; otherwise your executable file would require
`libmurefimpl2start.so` to be on the `LD_LIBRARY_PATH` to execute.
For example:
cc `/path/to/refimpl2-config --istart --cflags --libs` -o my_client my_client.c
### Other program starts Mu
You write the client in C, but some other program starts the Mu micro VM and
gives your client a pointer to the `MuVM` struct. In this case, you don't know
how the micro VM is created. You only depend on the implementation-neutral API.
In this case, all your program needs is the `muapi.h` header. Using
`refimpl2-config` without the `--istart` flag will only generate the inclusion
As another example, you wrote a Scala program which you call a "Mu loader". The
program creates a `MicroVM` instance, then dynamically loads your client from a
".so" file, calls one of its functions and passes the `MuVM*` pointer to it. You
can write your client like this:
void my_entry(MuVM* mvm) {
And compile it to a dynamic library by:
cc `/path/to/refimpl2-config --cflags` -fPIC -shared -o libmyclient.so my_client.c
## Implementation details
The `MuVM` struct has an extra non-standard function `execute()`. See
[../README.md](../README.md) for more details.
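Since `MuVM` is a struct of function pointers, calling `execute` from C means passing the struct itself as the first argument. The sketch below shows only that calling shape with a hypothetical stand-in type `ToyVM`; it does not use the real `muapi.h`, and `toy_execute`, `toy_vm_new` and `client_run` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct MuVM; the real one lives in muapi.h. */
typedef struct ToyVM ToyVM;
struct ToyVM {
    int runs;                    /* toy state, not part of the real API */
    void (*execute)(ToyVM *vm);  /* same shape as the execute member */
};

/* Toy implementation: just records that it was called. */
static void toy_execute(ToyVM *vm) {
    vm->runs++;
}

/* Create a toy VM whose execute pointer is wired up. */
static ToyVM toy_vm_new(void) {
    ToyVM vm = { 0, toy_execute };
    return vm;
}

/* The client calls through the pointer and passes the struct itself. */
static void client_run(ToyVM *vm) {
    assert(vm != NULL && vm->execute != NULL);
    vm->execute(vm);
}
```

With the real binding the equivalent call would presumably be `mvm->execute(mvm)` on the `MuVM*` obtained from `mu_refimpl2_new()` or handed over by a loader.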
vim: tw=80
@@ -43,13 +43,19 @@ typedef void (*MuCFP)();
// Result of a trap handler
typedef int MuTrapHandlerResult;
// Used by new_thread
typedef int MuHowToResume;
// Values or MuTrapHandlerResult
#define MU_THREAD_EXIT 0x00
// Values or MuTrapHandlerResult and muHowToResume
#define MU_REBIND_THROW_EXC 0x02
// Used by MuTrapHandler
typedef void (*MuValuesFreer)(MuValue *values, MuCPtr freerdata);
// Declare the types here because they are used in the following signatures.
typedef struct MuVM MuVM;
typedef struct MuCtx MuCtx;
@@ -57,13 +63,14 @@ typedef struct MuCtx MuCtx;
// Signature of the trap handler
typedef void (*MuTrapHandler)(MuCtx *ctx, MuThreadRefValue thread,
MuStackRefValue stack, int wpid, MuTrapHandlerResult *result,
MuStackRefValue *new_stack, MuValue *values, int *nvalues,
MuRefValue *exception,
MuStackRefValue *new_stack, MuValue **values, int *nvalues,
MuValuesFreer *freer, MuCPtr *freerdata, MuRefValue *exception,
MuCPtr userdata);
// Memory orders
typedef int MuMemOrd;
// Values of MuMemOrd
#define MU_NOT_ATOMIC 0x00
#define MU_RELAXED 0x01
#define MU_CONSUME 0x02
@@ -75,6 +82,7 @@ typedef int MuMemOrd;
// Operations for the atomicrmw API function
typedef int MuAtomicRMWOp;
// Values of MuAtomicRMWOp
#define MU_XCHG 0x00
#define MU_ADD 0x01
#define MU_SUB 0x02
@@ -90,7 +98,7 @@ typedef int MuAtomicRMWOp;
// Calling conventions.
typedef int MuCallConv;
#define MU_DEFUALT 0x00
#define MU_DEFAULT 0x00
// Concrete Mu implementations may define more calling conventions.
// NOTE: MuVM and MuCtx are structures with many function pointers. This
@@ -112,6 +120,9 @@ struct MuVM {
// Set handlers
void (*set_trap_handler )(MuVM *mvm, MuTrapHandler trap_handler, MuCPtr userdata);
// Proprietary API: let the micro VM execute
void (*execute)(MuVM *mvm);
// A local context. It can only be used by one thread at a time. It holds many
@@ -134,13 +145,13 @@ struct MuCtx {
void (*load_hail )(MuCtx *ctx, char *buf, int sz);
// Convert from C values to Mu values
MuIntValue (*handle_from_int8 )(MuCtx *ctx, int8_t num, int len);
MuIntValue (*handle_from_sint8 )(MuCtx *ctx, int8_t num, int len);
MuIntValue (*handle_from_uint8 )(MuCtx *ctx, uint8_t num, int len);
MuIntValue (*handle_from_int16 )(MuCtx *ctx, int16_t num, int len);
MuIntValue (*handle_from_sint16)(MuCtx *ctx, int16_t num, int len);
MuIntValue (*handle_from_uint16)(MuCtx *ctx, uint16_t num, int len);
MuIntValue (*handle_from_int32 )(MuCtx *ctx, int32_t num, int len);
MuIntValue (*handle_from_sint32)(MuCtx *ctx, int32_t num, int len);
MuIntValue (*handle_from_uint32)(MuCtx *ctx, uint32_t num, int len);
MuIntValue (*handle_from_int64 )(MuCtx *ctx, int64_t num, int len);
MuIntValue (*handle_from_sint64)(MuCtx *ctx, int64_t num, int len);
MuIntValue (*handle_from_uint64)(MuCtx *ctx, uint64_t num, int len);
MuFloatValue (*handle_from_float )(MuCtx *ctx, float num);
MuDoubleValue (*handle_from_double)(MuCtx *ctx, double num);
@@ -213,7 +224,7 @@ struct MuCtx {
// Thread and stack creation and stack destruction
MuStackRefValue (*new_stack )(MuCtx *ctx, MuFuncRefValue func);
MuThreadRefValue (*new_thread)(MuCtx *ctx, MuStackRefValue stack,
MuHowToResume *htr, MuValue *vals, int nvals, MuRefValue *exc);
MuHowToResume htr, MuValue *vals, int nvals, MuRefValue exc);
void (*kill_stack)(MuCtx *ctx, MuStackRefValue stack);
// Frame cursor operations
#!/usr/bin/env python
from __future__ import print_function # compatible with python2
import sys
import os
import os.path
import platform
plat_sys = platform.system()
whereami = os.path.dirname(os.path.realpath(__file__))
args = sys.argv[1:]
if len(args) == 0 or "--help" in args or "-h" in args:
cc `refimpl2-config --istart --cflags --libs` -o the_output your-c-program-that-starts-mu.c
cc `refimpl2-config --cflags` -fPIC -shared -o theclient.so your-c-program-loaded-by-the-jvm.c
--istart Your C program will start the JVM and create the Mu instance.
--cflags If present, this script will print compiler flags.
--libs If present, this script will print linker flags.
if '--istart' in args:
if '--cflags' in args:
print("-I {} ".format(whereami), end="")
if '--libs' in args:
if plat_sys == "Linux":
print("-Wl,--no-as-needed ", end="")
print("-L {} -l murefimpl2start -Wl,-rpath,{} ".format(
whereami, whereami), end="")
if '--cflags' in args:
print("-I {} ".format(whereami), end="")
if '--libs' in args:
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <jni.h>
#include "refimpl2-start.h"
#include "classpath.h"
static JavaVM *jvm;
static JNIEnv *env;
static jclass cinitiater_cls;
static jmethodID new_mid;
static jmethodID new_ex_mid;
static jmethodID close_mid;
static char *cinitiater_class = "uvm/refimpl/nat/CInitiater";
static char *new_method = "mu_refimpl2_new";
static char *new_ex_method = "mu_refimpl2_new_ex";
static char *close_method = "mu_refimpl2_close";
static int refimpl2_start_debug;
static void init_jvm() {
char *debug_env = getenv("REFIMPL2_START_DEBUG");
if (debug_env != NULL) {
refimpl2_start_debug = 1;
JavaVMInitArgs vm_args;
JavaVMOption options[1];
char *cpoptionstr = (char*)calloc(classpath_txt_len + 100, 1);
strcat(cpoptionstr, "-Djava.class.path=");
strncat(cpoptionstr, (const char*)classpath_txt, classpath_txt_len);
options[0].optionString = cpoptionstr;
if (refimpl2_start_debug) {
printf("Classpath option: '%s'\n", cpoptionstr);
vm_args.version = JNI_VERSION_1_8;
vm_args.nOptions = 1;
vm_args.options = options;
vm_args.ignoreUnrecognized = JNI_FALSE;
int rv = JNI_CreateJavaVM(&jvm, (void**)&env, &vm_args);
if (rv != JNI_OK) {
printf("ERROR: Failed to create JVM: %d\n", rv);
jclass cinitiater_cls = (*env)->FindClass(env, cinitiater_class);
if (cinitiater_cls == NULL) {
printf("ERROR: class %s cannot be found.\n", cinitiater_class);
new_mid = (*env)->GetStaticMethodID(env, cinitiater_cls, new_method, "()J");
if (new_mid == NULL) {
printf("ERROR: method %s cannot be found.\n", new_method);
new_ex_mid = (*env)->GetStaticMethodID(env, cinitiater_cls, new_ex_method, "(JJJ)J");
if (new_ex_mid == NULL) {
printf("ERROR: method %s cannot be found.\n", new_ex_method);
close_mid = (*env)->GetStaticMethodID(env, cinitiater_cls, close_method, "(J)V");
if (close_mid == NULL) {
printf("ERROR: method %s cannot be found.\n", close_method);
MuVM *mu_refimpl2_new() {
if (jvm == NULL) {
uintptr_t rv = (*env)->CallStaticLongMethod(env, cinitiater_cls, new_mid);
return (MuVM*)rv;
MuVM *mu_refimpl2_new_ex(int64_t heap_size, int64_t global_size, int64_t stack_size) {
if (jvm == NULL) {
uintptr_t rv = (*env)->CallStaticLongMethod(env, cinitiater_cls, new_ex_mid,
heap_size, global_size, stack_size);
return (MuVM*)rv;
void mu_refimpl2_close(MuVM *mvm) {
if (jvm == NULL) {