Commit f9f5b52c authored by Kunshan Wang

Merge boot image building into other parts.

parent c4b60612
......@@ -8,7 +8,7 @@ machine, including its architecture, instruction set and type system.
Main specification:
- `Overview <overview.rst>`__
- `Intermediate Representation <ir.rst>`__
- `Intermediate Representation (IR) <ir.rst>`__
- `Intermediate Representation Binary Form (deprecated) <ir-binary.rst>`__
- `Type System <type-system.rst>`__
- `Instruction Set <instruction-set.rst>`__
......@@ -20,13 +20,13 @@ Main specification:
- `Memory Model <memory-model.rst>`__
- `(Unsafe) Native Interface <native-interface.rst>`__
- `Heap Allocation and Initialisation Language (HAIL) <hail.rst>`__
- `Boot Image Building <bootimage.rst>`__
- `Portability and Implementation Advices <portability.rst>`__
Platform-specific parts: These extend the main specification. The main
specification considers these parts as implementation-specific.
- `AMD64 Unix Native Interface <native-interface-x64-unix.rst>`__
- `Extensions for Boot-image Building <bootimage.rst>`__
Frequently asked questions:
......
......@@ -45,6 +45,10 @@ The Mu Micro VM and Client Contexts
Mu IDs and names are represented as::
// C-style '\0'-terminated string
typedef char *MuCString;
// Identifiers and names of Mu
typedef uint32_t MuID;
typedef char *MuName;
......@@ -59,8 +63,12 @@ A Mu instance is represented as a pointer to the struct ``MuVM``::
MuID (*id_of )(MuVM *mvm, MuName name);
MuName (*name_of )(MuVM *mvm, MuID id);
void (*set_trap_handler)(MuVM *mvm, MuTrapHandler trap_handler, MuCPtr userdata);
void (*make_boot_image)(MuVM *mvm, MuID* whitelist, MuArraySize whitelist_sz, MuCString output_file);
};
.. _client-context:
The client interacts with Mu for almost all tasks through **client contexts**,
or simply **context** when unambiguous.
......@@ -250,6 +258,41 @@ the *Trap Handling* section below for more information.
``userdata`` will be passed to the handlers as the last element when they are
called.
.. _make-boot-image:
::
void (*make_boot_image)(MuVM *mvm, MuID* whitelist, MuArraySize whitelist_sz, MuCString output_file);
The ``make_boot_image`` function creates a boot image which contains all
top-level definitions specified by ``whitelist``, which is an array of IDs. The
length of the array is ``whitelist_sz``. All heap objects reachable from any
global cells in the white-list are also in the boot image. The contents of the
global cells and reachable heap objects are preserved. It is an error if any
threads, stacks, frame cursors or IR Nodes are reachable from the global cells
in the white-list. The process of creating a boot image is not atomic. Both
concurrent modifications of the memory reachable from the white-listed global
cells, and concurrent bundle loading, have undefined behaviours.
The IDs and the names of the entities are preserved in the boot image.
The boot image is written to the file specified by ``output_file``, a
``'\0'``-terminated C string. The format of the boot image is
implementation-defined.
Micro VM implementations may only allow the ``make_boot_image`` function to be
used in certain modes, enabled in implementation-specified manners.
NOTE: When building boot images, the micro VM implementation may need to
keep more information about the IR than usual. Normally, the micro VM may
freely discard information (such as type information that does not help
GC) for space efficiency, but that information must be preserved when
scanning the heap for values other than references.
For example, an implementation may only enable this function if the VM is
started with the ``--enable-boot-image-building`` flag. In this case, it
will record more type information in the object layout.
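A minimal sketch of how a client might call ``make_boot_image`` through the function-pointer table. The typedefs and the one-member ``MuVM`` below are cut-down stand-ins for the real header, ``MuArraySize`` being ``uintptr_t`` is an assumption, and ``stub_make_boot_image`` and ``build_image`` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for the real declarations. Only the member used here
 * is declared; MuArraySize being uintptr_t is an assumption. */
typedef char *MuCString;
typedef uint32_t MuID;
typedef uintptr_t MuArraySize;
typedef struct MuVM MuVM;
struct MuVM {
    void *header; /* hypothetical placeholder for implementation-private data */
    void (*make_boot_image)(MuVM *mvm, MuID *whitelist,
                            MuArraySize whitelist_sz, MuCString output_file);
};

/* Hypothetical stub: records its arguments instead of writing a file. */
static MuArraySize last_whitelist_sz;
static MuCString last_output_file;

static void stub_make_boot_image(MuVM *mvm, MuID *whitelist,
                                 MuArraySize whitelist_sz,
                                 MuCString output_file) {
    (void)mvm;
    (void)whitelist;
    last_whitelist_sz = whitelist_sz;
    last_output_file = output_file;
}

/* Hypothetical client helper: passes the white-list array together with its
 * length, and the output path as a '\0'-terminated C string. */
static void build_image(MuVM *mvm, MuID *ids, MuArraySize n, MuCString path) {
    assert(mvm != NULL && mvm->make_boot_image != NULL);
    mvm->make_boot_image(mvm, ids, n, path);
}
```

A real client would obtain the ``MuVM`` pointer from the implementation and pass the IDs of the top-level definitions it wants to keep. It would also have to make sure that no other thread modifies the reachable memory or loads bundles during the call.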
MuCtx Functions
===============
......@@ -297,16 +340,14 @@ The ``load_bundle`` function loads a Mu IR bundle, and the ``load_hail``
function loads a HAIL script. The content is held in the memory pointed to by
``buf``, and ``sz`` is the length of the content in bytes.
These two functions exist for legacy reasons, and are optional. Implementations
may choose not to implement these two functions, document this behaviour, and
advise the client programmers to use the `IR builder API <irbuilder.rst>`__
instead.
Concurrency: The content of the bundle or the effect of the HAIL script is fully
visible to other evaluations in the client that *happen after* this call.
..
TODO: These two functions should be made optional or lifted to the
higher-level API which should be beyond this spec. The text-form bundle
needs a parser, and the HAIL script is also not the most efficient way to
load data into Mu at run time.
..
For Lua users: This is similar to ``lua_load``, but a Mu bundle itself is
......@@ -691,6 +732,8 @@ The semantics of these instructions are defined by the Mu memory model.
Stack and thread operations
===========================
.. _new-stack:
::
// Thread and stack creation and stack destruction
......
==================================
Extensions for Boot-image Building
==================================
===================
Boot-image Building
===================
Purpose
=======
Mu provides an interface to build "boot images". A boot image is a file that
contains a Mu IR bundle, a serialised Mu memory (only global cells and heap
objects), and implementation-specific data (such as the micro VM itself, the Mu
function as the entry point, external linkages, and statically-linked native
libraries).
This extension allows creating a "boot-image"—a partially initialised micro VM
instance which includes pre-loaded bundles, pre-allocated heap objects,
pre-initialised memory contents, and external linkages.
Motivation
==========
The purpose of this mechanism is to allow fast VM initialisation.
Typical language runtimes (such as the ``java`` executable for JVM and the
``pypy`` executable for PyPy) need to have many things initialised when they
start:
``pypy`` executable for PyPy), when implemented on Mu, need to have many things
initialised when they start:
- **pre-loaded bundles**: These include the built-in data types and the
essential parts of the standard library of the language. For dynamic
......@@ -38,111 +42,31 @@ start:
files into different regions of the address space of the process, and fixes
relocation entries.
Extension
=========
Several extensions to the Mu IR and the API are made to support boot image
building.
Mu IR Extension
---------------
In addition to existing *constant constructors*:
- An **external constant constructor** creates a pointer constant. It is written
as:
+ the keyword ``EXTERN``, followed by
+ a string literal, which is a sequence of ASCII characters surrounded by
double quotation marks (code is 34). The code of each character shall be
within 33–126 but not 34. There are no escape sequences. This string
represents a symbolic name.
The values of such constants are implementation-defined. Usually the
implementation will resolve the symbolic names to the address of C functions.
..
Example::
.typedef @int = int<32>
.typedef @void = void
.typedef @voidp = uptr<@void>
.typedef @size_t = int<64>
.typedef @ssize_t = int<64>
.funcsig @write.sig = (@int @voidp @size_t) -> (@ssize_t)
.typedef @write.fp = ufuncptr<@write.sig>
.const @write = EXTERN "write"
A good language runtime must start up fast. It is therefore not practical to
JIT-compile all of the standard library functions at start-up time. Instead,
the boot image builder should AoT-compile the initial Mu IR bundle into machine
code, which can then simply be memory-mapped into the memory. The same is true
for run-time heap objects: the heap should be serialised and bulk-copied into
the heap. It should also make use of system utilities (such as the dynamic
linker/loader) to load external libraries and resolve external symbols, so the
necessary metadata should be provided.
.funcdef @main ... <...> {
%...(...):
...
%rv = CCALL #DEFAULT <@write.fp @write.sig> @write (%fd %buf %sz)
...
}
Mu IR and API
=============
Mu Client API Extension
-----------------------
The ``MuCtx`` struct now has an extra method::
struct MuCtx {
// ... other methods...
MuConstNode (*new_const_extern )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, char *symbol);
};
The ``new_const_extern`` function creates an external constant. ``ty`` is the
type of the constant which must be a pointer type; ``symbol`` is the symbolic
name.
The ``MuVM`` struct now has an extra method::
struct MuVM {
// ... other methods...
void (*make_boot_image)(MuVM *mvm,
MuID* whitelist, MuArraySize whitelist_sz,
char* output_file); /// MUAPIPARSER whitelist:array:whitelist_sz
};
The ``make_boot_image`` function creates a boot image which contains all
top-level definitions specified by ``whitelist``, which is an array of IDs. The
length of the array is ``whitelist_sz``. All heap objects reachable from any
global cells in the white-list are also in the boot image. The contents of the
global cells and reachable heap objects are preserved. It is an error if any
threads, stacks, frame cursors or IR Nodes are reachable from the global cells
in the white-list. The process of creating a boot image is not atomic. Both
concurrent modifications of the memory reachable from the white-listed global
cells, and concurrent bundle loading, have undefined behaviours.
The IDs and the names of the entities are preserved in the boot image.
The boot image is written to the file specified by ``output_file``, a
``'\0'``-terminated C string. The format of the boot image is
implementation-defined.
Several extensions have been added to the Mu IR and the API to support boot image
building.
Micro VM implementations may only allow the ``make_boot_image`` function to be
used in certain modes, enabled in implementation-specified manners.
The IR can define `external constants <ir.rst#external-constructor>`__. They are
resolved in an implementation-specific manner when the bundle is loaded.
NOTE: When building boot images, the micro VM implementation may need to
keep more information about the IR than usual. Normally, the micro VM may
freely discard information (such as type information that does not help
GC) for space efficiency, but that information must be preserved when
scanning the heap for values other than references.
For example, an implementation may only enable this function if the VM is
started with the ``--enable-boot-image-building`` flag. In this case, it
will record more type information in the object layout.
Two API functions are added:
Common Instructions Extension
-----------------------------
- `new_const_extern <irbuilder.rst#new-const-extern>`__ creates the external
constants programmatically.
TODO: Counterpart of ``make_boot_image``
- `make_boot_image <api.rst#make-boot-image>`__ creates the boot image. It takes
a white-list of top-level definitions and writes its transitive closure (over
IR nodes, heap objects and global cells) into a file.
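The "transitive closure" mentioned above can be illustrated with a generic worklist traversal. This is only a sketch of the graph idea, not how any micro VM actually builds its image: entities are modelled as small integers, references as an adjacency matrix, and ``MAX_NODES`` and ``transitive_closure`` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8 /* hypothetical upper bound on the number of entities */

/* edges[i][j] is true when entity i refers to entity j. Starting from the
 * white-listed roots, mark everything reachable in included[] and return
 * the number of marked entities. */
static size_t transitive_closure(bool edges[MAX_NODES][MAX_NODES],
                                 const size_t *roots, size_t nroots,
                                 bool included[MAX_NODES]) {
    size_t worklist[MAX_NODES]; /* each entity is pushed at most once */
    size_t top = 0, count = 0;

    for (size_t i = 0; i < MAX_NODES; ++i) {
        included[i] = false;
    }
    for (size_t i = 0; i < nroots; ++i) {
        assert(roots[i] < MAX_NODES);
        if (!included[roots[i]]) {
            included[roots[i]] = true;
            worklist[top++] = roots[i];
            ++count;
        }
    }
    while (top > 0) {
        size_t n = worklist[--top];
        for (size_t j = 0; j < MAX_NODES; ++j) {
            if (edges[n][j] && !included[j]) {
                included[j] = true;
                worklist[top++] = j;
                ++count;
            }
        }
    }
    return count;
}
```

With edges 0 → 1 → 2 and 3 → 4 and a white-list containing only entity 0, the closure contains entities 0, 1 and 2, while 3 and 4 are left out.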
.. vim: tw=80
......@@ -88,6 +88,8 @@ descriptions when unambiguous.
Thread and Stack operations
===========================
.. _uvm-new-stack:
::
[0x201]@uvm.new_stack <[sig]> (%func: funcref<sig>) -> stackref
......
......@@ -2470,8 +2470,8 @@ NOTE: New stacks can be created by the ``@uvm.new_stack`` `common instruction
<common-insts.rst>`__ or the ``new_stack`` `API <api.rst>`__
function.
Common Structures
-----------------
Common Structures in NEWTHREAD and SWAPSTACK
--------------------------------------------
+ *newStackClause* ::= ``PASS_VALUES`` ``<`` *Ts* ``>`` ``(`` *vals* ``)``
+ *newStackClause* ::= ``THROW_EXC`` *exc*
......
......@@ -143,6 +143,8 @@ And the **name** can be set by the ``set_name`` function::
All nodes can just work without names, but names may provide extra debug
information if the implementation supports them.
.. _loading-bundle:
When finished creating a bundle, use the ``load_bundle_from_node`` function to
**load** it::
......@@ -182,6 +184,8 @@ is created in **two steps**::
The first step creates the type and the second step sets the referent type. In
this way, the client can create recursive types, such as linked lists.
.. _define-a-function:
To *define* a **function**, two nodes need to be created: A function and a
function version::
......@@ -290,6 +294,8 @@ Basic Functions
``new_bundle`` creates a new Mu IR bundle.
.. _load-bundle-from-node:
::
void (*load_bundle_from_node )(MuCtx *ctx, MuBundleNode b);
......@@ -386,6 +392,8 @@ types, and their lengths are ``nparamtys`` and ``nrettys``, respectively.
Creating Constant Nodes
-----------------------
.. _new-const-extern:
::
MuConstNode (*new_const_int )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, uint64_t value);
......@@ -394,6 +402,7 @@ Creating Constant Nodes
MuConstNode (*new_const_double )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, double value);
MuConstNode (*new_const_null )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty);
MuConstNode (*new_const_seq )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, MuConstNode *elems, MuArraySize nelems);
MuConstNode (*new_const_extern )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, MuCString symbol);
These functions create constant nodes and add them to the bundle ``b``.
......@@ -423,6 +432,10 @@ These functions create constant nodes and add them to the bundle ``b``.
constants which are the fields or elements. ``nelems`` is the length of the
array, and must match the actual number of fields or elements of the type.
- ``new_const_extern`` can create constants of ``uptr`` or ``ufuncptr`` types.
``ty`` is the type. ``symbol`` is the symbolic name, which can include ASCII
characters 33-126 but not 34.
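The character rule for ``symbol`` can be checked on the client side before calling ``new_const_extern``. The following is a minimal sketch; ``is_valid_extern_symbol`` is a hypothetical helper, not part of the API, and accepting the empty string is an assumption because the text above does not say whether it is allowed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every character of the '\0'-terminated string has an
 * ASCII code within 33-126 and is not 34 (the double quotation mark). */
static bool is_valid_extern_symbol(const char *symbol) {
    assert(symbol != NULL);
    for (const unsigned char *p = (const unsigned char *)symbol;
         *p != '\0'; ++p) {
        if (*p < 33 || *p > 126 || *p == 34) {
            return false;
        }
    }
    return true;
}
```

For example, ``"write"`` passes the check, while a name containing a space (code 32) or a double quotation mark is rejected.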
All constants are created in one step, because constants cannot be recursive.
Creating Other Top-level Nodes
......
......@@ -418,6 +418,8 @@ the following is true:
TODO: The "carries a dependency to" relation is not well-defined for the
client since it may be written in a different language.
.. _happens-before:
The Happens Before Relation
---------------------------
......@@ -553,6 +555,8 @@ The load operations performed by the ``@uvm.futex.wait``,
``@uvm.futex.wait_timeout`` and ``@uvm.futex.cmp_requeue`` on the memory
location given by its argument are atomic.
.. _special-funcdef:
Special Rules for Functions and Function Redefinition
=====================================================
......
......@@ -103,9 +103,12 @@ typedef MuChildNode MuFuncVerNode; // Function version
typedef MuChildNode MuBBNode; // Basic block
typedef MuChildNode MuInstNode; // Instruction (itself, not result)
// C-style '\0'-terminated string
typedef char *MuCString;
// Identifiers and names of Mu
typedef uint32_t MuID;
typedef char *MuName;
typedef MuCString MuName;
// Convenient types for the void* type and the void(*)() type in C
typedef void *MuCPtr;
......@@ -291,7 +294,7 @@ struct MuVM {
// Build boot image
void (*make_boot_image)(MuVM *mvm,
MuID* whitelist, MuArraySize whitelist_sz,
char* output_file); /// MUAPIPARSER whitelist:array:whitelist_sz
MuCString output_file); /// MUAPIPARSER whitelist:array:whitelist_sz
};
// A local context. It can only be used by one thread at a time. It holds many
......@@ -518,7 +521,7 @@ struct MuCtx {
MuConstNode (*new_const_null )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty);
// new_const_seq works for structs, arrays and vectors. Constants are non-recursive, so there is no set_const_seq.
MuConstNode (*new_const_seq )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, MuConstNode *elems, MuArraySize nelems); /// MUAPIPARSER elems:array:nelems
MuConstNode (*new_const_extern )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, char *name);
MuConstNode (*new_const_extern )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty, MuCString symbol);
// Create global cell
MuGlobalNode (*new_global_cell )(MuCtx *ctx, MuBundleNode b, MuTypeNode ty);
......
......@@ -200,6 +200,8 @@ A Mu function can be **exposed** as a native function pointer in three ways:
3. Dynamically, the ``expose`` and ``unexpose`` API functions do the same thing
as the above instructions.
.. _cookie:
A "cookie", which is a 64-bit integer value, can be attached to each exposed
value. When a Mu function is called via one of its exposed values, the attached
cookie can be retrieved by the ``@uvm.native.get_cookie`` common instruction in
......