mu-impl-ref2

Mu reference implementation (the 2nd version).

Mu Reference Implementation version 2

Version 2.2.0

This project is the current reference implementation of Mu, the micro virtual machine designed by The Micro Virtual Machine Project.

Version 2.2.0 implements the current Mu Specification.

How to compile

For the impatient:

  • Install JDK 8. If you use a Mac, download it from Oracle.
  • If you use a Mac, install Homebrew.
  • Install Scala 2.12. If you use a Mac with Homebrew, run brew install scala.
  • Install sbt 0.13. If you use a Mac with Homebrew, run brew install sbt.
  • Install Scala IDE 4.6 or later (Eclipse with pre-installed plugins for Scala).
  • Clone this repository:
git clone git@gitlab.anu.edu.au:mu/mu-impl-ref2.git

If you do not have SSH access to the ANU GitLab repositories, use the HTTPS URL:

git clone https://gitlab.anu.edu.au/mu/mu-impl-ref2.git
  • In the directory mu-impl-ref2, do the following:
sbt update genSrc eclipse
  • Open Scala IDE and import the generated project as "existing project into workspace".

Detailed guide:

The reference implementation is developed and tested with Java VM 8. You need a JRE to build the Scala/Java part, and a JDK to build the C binding.

You also need Scala 2.12 and sbt 0.13. It is recommended to install them using the package manager of your operating system or distribution (such as apt-get, yum, pacman, etc. for GNU/Linux distributions and Homebrew for Mac OS X) if such packages are available.

For Ubuntu users: Ubuntu 15.10 does not provide sbt in its repository. Please download sbt from the official sbt web site, or follow the official sbt installing guide for Linux. If you experience any "certificate" problems, this page provides a solution.

Then after cloning this repository, you can simply invoke sbt compile to compile this project. Or you can do it step by step:

  • To download all dependencies from the Maven central repository, invoke sbt update.

  • To generate the Mu IR parser from the Antlr grammar, invoke sbt genSrc. The generated sources will be in the target/scala-2.12/src_managed directory.

  • To compile, invoke sbt compile. This will also generate the Mu IR parser using Antlr.

To generate an Eclipse project, install the sbt-eclipse plugin and invoke sbt eclipse. Make sure you generate the parser (sbt genSrc) before creating the Eclipse project, so that the generated sources will be on the Eclipse build path.

IntelliJ IDEA has plugins for Scala and sbt. Make sure you do not commit .idea or generated project files into the repository.

C binding and Python binding

The C binding is in the cbinding directory. Just run make inside cbinding.

The Python binding is in the pythonbinding directory. It depends on the C binding, so build the C binding first. The Python binding itself does not need to be built.

See cbinding/README.md and pythonbinding/README.md for more details.

How to test

For the impatient: run the test.sh script.

Detailed steps:

  1. Compile native programs necessary for testing the native interface:
pushd tests/c-snippets
make
popd
  2. Set the TRAVIS environment variable to true:
export TRAVIS=true

This tells the test cases in src/test/scala not to print excessive logs. Those logs are helpful for identifying problems in individual test cases, but are too verbose when running the whole test suite.

  3. Run sbt test.

How to run

For the impatient: Execute the following command and see Mu running a factorial example.

sbt 'set fork:=true' 'test:runMain junks.FactorialFromRPython'

C API

The reference implementation implements the Mu Client API, which allows C programs to control the micro VM and to construct and deliver bundles for the micro VM to execute.

See cbinding/README.md for more details.

Scala API

The micro VM itself is implemented in Scala.

  • uvm.refimpl.MicroVM is the counterpart of the MuVM struct in the Mu Client API. It can be instantiated with VMConf options explained below.
  • uvm.refimpl.MuCtx is the counterpart of the MuCtx struct in C.
  • uvm.refimpl.MuValue and its subclasses implement the MuValue handles.

As an implementation detail, the micro VM will not start execution until MicroVM.execute() is called. See implementation details below.

The Scala interface follows idiomatic Scala style more closely than the C interface does. For example, the MuCtx.dumpKeepalives() method returns a Seq[MuValue] rather than writing the results into a given array. It also does more static type checking than the C interface.

There is a sample factorial program (generously provided by @johnjiabinzhang) in the src/test directory. To run the program with all dependencies on the classpath, you need to run it with sbt. Invoke sbt to enter the interactive shell. Then type:

set fork := true
test:runMain junks.FactorialFromRPython

or directly from the command line:

sbt 'set fork:=true' 'test:runMain junks.FactorialFromRPython'

fork := true tells sbt to run the program in a different process than the one running sbt itself.

Boot Image

The reference implementation can create boot images. A boot image is a package that contains a Mu IR bundle and a serialised Mu memory, including the global memory and the heap.

Boot images can be created using the standard make_boot_image method on the MuVM object. In this reference implementation, the boot image is a zip file. By convention, boot images have the file-name extension .muref.
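Because the boot image is described above as a zip file, it can be inspected with ordinary zip tooling. The following is a minimal sketch using Python's standard zipfile module. The function name is hypothetical, and the names of the entries inside a real boot image are not documented here, so the code only lists whatever it finds.

```python
import zipfile


def list_boot_image_entries(path):
    """Return the entry names of a boot image, assuming it is a zip archive."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} does not look like a zip archive")
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

For example, list_boot_image_entries("hello.muref") would return the list of entry names stored in a (hypothetical) boot image named hello.muref.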

Before a boot image can be executed, an entry point needs to be specified. Use the tools/mar.py script to set the entry point by ID or name. The entry point is a Mu function that takes an int<32> and a uptr<uptr<int<8>>> as parameters, the same as the main function in C.
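To make the entry-point signature concrete: the int<32> parameter plays the role of argc, and the uptr<uptr<int<8>>> parameter plays the role of argv (a char ** in C). The sketch below uses Python's ctypes only to illustrate that memory layout. make_main_args is a hypothetical helper, not part of this project, and it does not call into the micro VM.

```python
import ctypes


def make_main_args(args):
    """Build C-style (argc, argv) values from a list of Python strings."""
    encoded = [a.encode("utf-8") for a in args]
    # argv is an array of char* followed by a NULL entry, as C's main expects.
    argv_type = ctypes.c_char_p * (len(encoded) + 1)
    argv = argv_type(*encoded, None)
    argc = ctypes.c_int32(len(encoded))  # corresponds to int<32>
    return argc, argv
```

Calling make_main_args(["prog", "10"]) yields an argc of 2 and an argv array whose first two entries are the encoded strings and whose last entry is NULL.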

The tools/mar.py script can also specify extra libraries to be loaded when the micro VM loads the boot image. EXTERN constants will be resolved from these libraries in the order in which the libraries are listed.

The tools/runmu.sh script runs the micro VM with the given boot image. Additional arguments are passed to the entry point.

Micro VM Configuration

Several parameters control the behaviour of the reference implementation.

When using the C API, the refimpl-specific cbinding/refimpl2-start.h header provides the mu_refimpl_new_ex function, which accepts a C-style string. The options are encoded as key=value pairs, one option per line, with no spaces around the equals sign.

When using the tools/runmu.sh script, the options are specified as command-line options in the form --key=value before the boot image file name.
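The two encodings described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, rendering booleans as lowercase true/false is an assumption based on how the defaults are written in the option list, and ending every line (including the last) with a newline is also an assumption.

```python
def render_value(value):
    """Render an option value as text (assumption: booleans are true/false)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def to_option_string(options):
    """Encode options as key=value lines, with no spaces around '='."""
    # Assumption: every line, including the last, ends with a newline.
    return "".join(f"{key}={render_value(value)}\n" for key, value in options.items())


def to_runmu_args(options, boot_image):
    """Encode options as --key=value arguments placed before the boot image."""
    flags = [f"--{key}={render_value(value)}" for key, value in options.items()]
    return flags + [boot_image]
```

The string from to_option_string is the kind of text one would pass to mu_refimpl_new_ex, while the list from to_runmu_args corresponds to the arguments of tools/runmu.sh. Any further arguments for the entry point would follow the boot image name.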

Options:

Sizes may have suffixes K, M, G or T. 1K = 1024 bytes. sosSize, losSize and globalSize must be a multiple of 32768 bytes (32K).

  • sosSize: The size of the small object space in bytes. default: 2M
  • losSize: The size of the large object space in bytes. default: 2M
  • globalSize: The size of the global memory space in bytes. default: 1M
  • stackSize: The size of each stack in bytes. default: 60K
  • dumpBundle: Print out the bundle as text every time a bundle is loaded. default: false
  • staticCheck: Run static checker after each bundle is loaded. default: true
  • sourceInfo: Provide line/column info in Mu IR when errors occur. This may be useful for debugging small Mu IR bundles, but it significantly slows down parsing, so enable it only if the bundle is small. default: false
  • automagicReloc: Allow "automagic" relocation. If true, uptr and ufuncptr fields will also be traced during boot image building. If a uptr field points to a global cell field, it will still point to the same field after boot image loading; if a ufuncptr points to a native function, it will point to the same function after boot image loading. default: false
  • extraLibs: Extra libraries to load when starting the micro VM. This is a colon-separated list of libraries. Each entry uses the same syntax as the path argument of the dlopen system function. By default, no extra libraries are loaded.
  • bootImg: The path to the boot image. Only useful in the C API. By default, it does not load any boot image.
  • uPtrHack: When true, it will allow memory locations of general reference types to be accessed by uptr<T>. By default, such fields can only be accessed by iref<T>, but this hack is necessary for the current mu-client-pypy project to work. default: false
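The size rules stated before the option list (suffixes K, M, G or T, and the 32K alignment of sosSize, losSize and globalSize) can be sketched as below. The function names are hypothetical and this is not the parser used by the reference implementation. It assumes that M, G and T are successive powers of 1024, which the text only states explicitly for K.

```python
# Assumption: each suffix is 1024 times the previous one (only K is stated).
SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
SPACE_ALIGNMENT = 32768  # sosSize, losSize and globalSize must be multiples


def parse_size(text):
    """Convert a size such as '2M' or '61440' to a number of bytes."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def check_space_size(name, text):
    """Parse a space size and reject values that are not multiples of 32K."""
    size = parse_size(text)
    if size % SPACE_ALIGNMENT != 0:
        raise ValueError(f"{name}={text} is not a multiple of {SPACE_ALIGNMENT} bytes")
    return size
```

With these definitions, the default sosSize of 2M passes the alignment check, whereas a value such as 60K (the default stackSize, which is not subject to the rule) would be rejected if used for one of the three spaces.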

Log levels can be ALL, TRACE, DEBUG, INFO, WARN, ERROR or OFF, and are case-insensitive. Setting the level to WARN suppresses most log output, leaving only warnings and errors. The default log level is DEBUG.

  • vmLog: The log level of the micro VM (the "uvm" package)
  • gcLog: The log level of the garbage collector (the "uvm.refimpl.mem" package). If vmLog is set but gcLog is not, it will use the log level of vmLog.
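The fallback between the two log options can be sketched as follows. The function names are hypothetical, and the code only models the rule described above: gcLog falls back to vmLog, and both fall back to DEBUG.

```python
LOG_LEVELS = ("ALL", "TRACE", "DEBUG", "INFO", "WARN", "ERROR", "OFF")
DEFAULT_LEVEL = "DEBUG"


def normalise_level(level):
    """Upper-case a log level and check that it is one of the known levels."""
    upper = level.upper()
    if upper not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level}")
    return upper


def resolve_log_levels(options):
    """Return (vmLog, gcLog); gcLog falls back to vmLog when it is not set."""
    vm_level = normalise_level(options.get("vmLog", DEFAULT_LEVEL))
    gc_level = normalise_level(options.get("gcLog", vm_level))
    return vm_level, gc_level
```

For instance, setting only vmLog to warn resolves both levels to WARN, while setting neither leaves both at DEBUG.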

Implementation details

This reference implementation aims to be easy to work with, but does not have high performance. It may be used by client writers to evaluate the Mu micro VM, and may also be used by Mu micro VM implementers as a reference to compare with.

The micro VM is implemented as an interpreter written in Scala. The main class is uvm.refimpl.MicroVM, which implements the MuVM struct specified by the client API, but is more Scala-like. The client interacts with the micro VM via uvm.refimpl.MuCtx instances created by the MicroVM instance; these correspond to the MuCtx struct in the spec. uvm.refimpl.MuValue and its subclasses implement the MuValue handles, but form a real Scala type hierarchy and do extra type checking when converting, which is not required by the spec.

The client can also be written in C, Python or other languages that can interface with C.

Threading model

The interpreter uses green threads to execute multiple Mu threads and uses a round-robin scheduler: it iterates over all active threads, executes one instruction for each active thread, then repeats this process. However, the whole Scala-based program itself is not thread-safe. Do not run multiple JVM or native threads. This means you can still experiment with concurrent Mu programs, but there are some corner cases that do not work in this refimpl. For example:

  • Waiting for other Mu threads in the trap handler. The trap handler is executed by the same thread that executes the Mu IR, so no Mu program runs while a trap handler is running. If you want to use watchpoints to wait for a certain Mu thread to reach a certain rendezvous point (a common optimisation trick), you should either wait within Mu IR (not in trap handlers) or try the high-performance Mu implementation, which is still being written.

  • Synchronising with concurrent native programs via pointers, atomic memory access and futex. This is the realistic way for Mu to synchronise with native programs or foreign languages, but this refimpl implements atomic memory accesses non-atomically (since it uses green threads) and implements futex in Scala (since it has its own scheduler).
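The round-robin scheme described above can be illustrated with a toy scheduler. This is a sketch of the general technique, not the reference implementation's code: each Python generator stands in for a Mu thread, each next() call stands in for executing one instruction, and all names are hypothetical.

```python
from collections import deque


def run_round_robin(threads):
    """Run generator-based green threads, one step per thread per round.

    Returns the (thread name, step value) pairs in execution order.
    """
    active = deque(threads.items())
    trace = []
    while active:
        name, thread = active.popleft()
        try:
            step = next(thread)  # execute one "instruction" of this thread
        except StopIteration:
            continue  # the thread has finished, so it is dropped
        trace.append((name, step))
        active.append((name, thread))  # back of the queue for the next round
    return trace


def count_to(n):
    """A stand-in thread body that performs n steps."""
    for i in range(n):
        yield i
```

Only one step runs at any moment, and the loop cannot advance any thread while control is elsewhere. That is the same reason a trap handler that waits for another Mu thread would wait forever in this refimpl.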

The MicroVM instance will not start executing unless its execute() method is called. This method is specific to this implementation, and is not defined in the specification. This also means the client cannot run concurrently with the MicroVM, i.e. once started, the client can only intervene in the execution in trap handlers. So a common use pattern is:

val microVM = new MicroVM()

val uir = myCompiler.compile(sourceCode)
val ctx = microVM.newContext()
ctx.loadBundle(uir)

microVM.setTrapHandler(theTrapHandler)  // Set the trap handler so the client
                                        // can talk with the VM when trapped.

val stack = ctx.newStack(theMainFunction)
val thread = ctx.newThread(stack, Seq(params, to, the, main, function))

microVM.execute() // The current JVM thread will run on behalf of the MicroVM.
                  // This blocks until all Mu threads stop.
                  // However, MicroVM will call theTrapHandler.

The refimpl implements the text-based IR and HAIL as well as the IR-builder API to construct Mu IR ASTs programmatically.

Garbage collection

This reference implementation has an exact tracing garbage collector with a mark-region small object space and a mark-sweep large object space.

IR implementation-specific details

  • Many undefined behaviours in the specification will raise UvmRuntimeException, such as division by zero, going below the last frame of a stack, accessing a NULL reference, etc. But this behaviour is not guaranteed.

  • int<n> is implemented for n = 1 to 128. vec<T n> is implemented for all T that are int, float or double, and all n >= 1. However, only 8-, 16-, 32-, 64- and 128-bit integers, float, double, vec<int<32> 4>, vec<float 4> and vec<double 2> can be loaded from or stored to memory.

  • The tagged reference type tagref64 is fully implemented.

  • Out-of-memory errors will terminate the VM rather than letting the Mu IR handle such failures via the exception clauses.
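The integer-width rule in the list above distinguishes widths that are implemented from widths that can be loaded from or stored to memory. A minimal sketch of that distinction follows; the function names are hypothetical, and the code covers only integers, not the float, double and vector cases.

```python
# Integer widths that can be loaded from or stored to memory.
MEMORY_ACCESSIBLE_INT_WIDTHS = frozenset({8, 16, 32, 64, 128})


def is_implemented_int(width):
    """int<n> is implemented for n = 1 to 128."""
    return 1 <= width <= 128


def is_memory_accessible_int(width):
    """Only some integer widths can be loaded from or stored to memory."""
    return width in MEMORY_ACCESSIBLE_INT_WIDTHS
```

So a type such as int<52> can be used in computations, but a value of that type cannot be loaded from or stored to memory.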

Native interface

This reference implementation assumes it is running on x86-64 on either Linux or OS X. It implements the AMD64 Unix Native Interface of the specification. It can call native functions from Mu IR and let native programs call back to Mu IR.

It does not support throwing Mu exceptions into native programs, or handling C++-based exceptions.

Author and Copyright

This project was created by Kunshan Wang, Yi Lin, Steve Blackburn, Antony Hosking and Michael Norrish.

This project is released under the CC-BY-SA license. See LICENSE.

Contact

Kunshan Wang kunshan.wang@anu.edu.au