memory.rst 21.2 KB
Newer Older
1 2 3
=================
Mu and the Memory
=================
Kunshan Wang's avatar
Kunshan Wang committed
4

Kunshan Wang's avatar
Kunshan Wang committed
5 6 7
Overview
========

Kunshan Wang's avatar
Kunshan Wang committed
8
Mu supports automatic memory management via garbage collection. There is a
Kunshan Wang's avatar
Kunshan Wang committed
9 10 11
**heap** which contains garbage-collected **objects** as well as many **stacks**
and a **global memory** which contain data that are not garbage-collected.

Kunshan Wang's avatar
Kunshan Wang committed
12 13 14 15
The heap memory is managed by the garbage collector.

Unlike C or C++, local SSA variables are not bound to memory locations. Stack
memory must be allocated by the ``ALLOCA`` or ``ALLOCAHYBRID`` instructions or
Kunshan Wang's avatar
Kunshan Wang committed
16
using the Mu client interface.
Kunshan Wang's avatar
Kunshan Wang committed
17 18 19

This specification does not mandate any object layout, but it is recommended to
layout common data types as in the platform application binary interface (ABI)
20
so that the native interface is easier to implement.
Kunshan Wang's avatar
Kunshan Wang committed
21 22 23 24

Basic Concepts
==============

25 26 27
Mu Memory
---------

28 29
The **Mu memory** consists of three parts: the **heap memory**, the **stack
memory** and the **global memory**.
Kunshan Wang's avatar
Kunshan Wang committed
30 31 32 33 34

Memory is allocated in their respective **allocation units**. Every allocation
unit has a **lifetime** which begins when the allocation unit is created and
ends when it is destroyed.

35 36
A **memory location** is a region of data storage in the memory which can hold
data values. A memory location has a type and its value can only be of that
37 38 39 40 41 42 43 44 45
type. An **internal reference** (the `iref type <type-system.rst#iref>`__)
refers to a memory location in *any* part of the Mu memory.

    NOTE about **addresses**: The Mu memory is defined without mentioning
    "address". There is no "size", "alignment" or "offset" of a memory location.
    The relation between a memory location and an address is only established
    when pinning (discussed later). Even when pinned, the address may or may not
    be the canonical address where the location is allocated in Mu. For example,
    Mu objects can be replicated.
46 47 48 49 50 51 52 53 54
    
    When allocating Mu memory locations (in the heap, stack or global memory),
    Mu guarantee the location can hold Mu values of a particular type and, as
    long as the type allows atomic access, the location can be accessed
    atomically. The implementation must ensure all memory locations are
    allocated in such a way. For example, it should not allocate an integer
    across page boundary, but it may choose to use locks for atomicity, which,
    in practice, is usually a bad idea.

55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77
..

    NOTE about **iref**, i.e. *internal references*: ``iref<T>`` is a traced
    reference type. The destination of ``iref<T>`` must be within the Mu memory.
    We can consider both heap objects, stacks and the global memory to be
    *"conceptually boxed"* in the sense that Mu knows what type those things
    contain (at least finding out the reference fields for GC). We can
    straightforwardly think about heap objects as boxes that contain the values.
    Although stack cells are, in practice, usually allocated by pushing or
    popping, we can consider each stack as a huge box that contains many stack
    cells.  Similarly, "the entire global memory" can be considered a huge box
    that contains all global cells. ``iref`` can only be created or derived from
    the allocation site of the respective Mu allocation units, such as the
    ``NEW``, ``ALLOCA`` instructions and the definitions of ``.global``. So
    ``iref`` can never come out of thin air.

    On the contrary, untraced pointers (the `uptr type <type-system.rst#uptr>`__
    and the `ufuncptr type <type-system.rst#ufuncptr>`__) are simply addresses.
    They may point outside the Mu memory (for instance, to memory allocated by
    ``malloc``). The destinations are *"conceptually unboxed"*, in the sense
    that Mu assume no knowledge about the destination, and the GC will not trace
    them.

78 79 80
..

    For C programmers: The word "object" in the C language is the counterpart of
81 82 83
    "memory location" in Mu. Mu does not have bit fields, and a Mu memory
    location is always an "object" in C's sense. In Mu's terminology, the word
    "object" is a synonym of "heap object" or "garbage-collected object".
Kunshan Wang's avatar
Kunshan Wang committed
84

85
    In C, the word "memory location" must have scalar types, but Mu uses the
86
    term "memory location" for composite types, too.
87 88 89 90 91
    
For a memory location L that represents type T, if c is a member (if applicable)
or a component of T, it also has a memory location which is a **member** or a
**component** of the memory location L, respectively.  Memory location L1
**contains** a memory location L2 if L2 is a component of L1.
92 93 94 95

The **lifetime** of a memory location is the same as the allocation unit that
contains it.

96 97 98 99 100
As implementation details, when an allocation unit is destroyed and another
allocation unit occupied the same or overlapping space as the former, they are
different allocation units.  Different allocation units contain no common memory
locations. When a heap object is moved by the garbage collector, it is still the
same object. Any memory locations within the same object remain the same.
Kunshan Wang's avatar
Kunshan Wang committed
101

Kunshan Wang's avatar
Kunshan Wang committed
102 103
    NOTE: This means the memory of Mu is an abstraction over the memory space of
    the process. 
Kunshan Wang's avatar
Kunshan Wang committed
104

105 106 107 108 109 110 111 112 113 114 115 116 117 118
Native Memory
-------------

The **native memory** is not Mu memory. The native memory is an address space of
a sequence of bytes, each can be addressed by an integral address. The size of
the address is implementation-defined.

A region of bytes in the native memory can be interpreted as Mu values in an
implementation-dependent way. The bytes that represents a Mu value is the
**bytes representation** of that Mu value.

    For C programmers: it is similar to the "object representation", but in Mu,
    unless a memory location is pinned, it may not be represented in such a way.

119 120
.. _pinning:

121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
A Mu memory location can be **pinned**. In this state, it is mapped to a
(contiguous) region of bytes in the native memory which contains the bytes
representation of the value the memory location holds. The beginning of the
memory location is mapped to the lowest address of the region. Different
components of a memory location which do not contain each other do not map to
overlapping regions in the address space.

    For C programmers:
    
    * Mu assumes 8-bit bytes. 

    * Mu does not have the bit-field type, but a client can implement bit-fields
      using integer types and bit operations. 

    * Mu does not have union types. However, like C, directly casting an address
      to a pointer has implementation-defined behaviours. If a Mu program
      interfaces with native programs, it has to also depend on the platform.

    * Unlike C, Mu operations work on SSA variables rather than memory locations
      (the counterpart of objects in C).

    * Mu forces the 2's complement representation, though the byte order and
      alignment requirement are implementation-defined.

145 146 147 148
Global cells are always pinned. Other Mu memory locations can be pinned by
adding to the pinning set of any Mu thread (see `Native Interface
<native-interface.rst>`__ for details about the pinning and unpinning operations
which modifies the pinning set).
149

Kunshan Wang's avatar
Kunshan Wang committed
150 151 152 153 154
Memory Allocation and Deallocation
==================================

An allocation unit in the heap memory is called a **heap object**, or **object**
when unambiguous. It is created when executing the ``NEW`` or ``NEWHYBRID``
155 156
instructions or the ``new_fixed`` or ``new_hybrid`` API function. It is
destroyed when the object is collected by the garbage collector.
Kunshan Wang's avatar
Kunshan Wang committed
157

158 159 160
An allocation unit in the stack memory is called an **alloca cell** or **stack
cell**. It is created when executing the ``ALLOCA`` or ``ALLOCAHYBRID``
instruction. It is destroyed when the stack frame containing it is destroyed.
Kunshan Wang's avatar
Kunshan Wang committed
161 162

An allocation unit in the global memory is called a **global cell**. One global
163
cell is created for every ``.global`` declaration in a bundle submitted to Mu.
Kunshan Wang's avatar
Kunshan Wang committed
164
Global cells are never destroyed.
Kunshan Wang's avatar
Kunshan Wang committed
165 166 167 168 169 170 171

Initial Values
--------------

The initial value of any memory location is defined as the following, according
the type of data value the memory location represents:

Kunshan Wang's avatar
Kunshan Wang committed
172 173
* The initial value of ``int`` and pointer types is 0 (numerical value or
  address).
Kunshan Wang's avatar
Kunshan Wang committed
174
* The initial value of floating point types is positive zero.
Kunshan Wang's avatar
Kunshan Wang committed
175 176
* The initial value of ``ref``, ``iref``, ``weakref``, ``funcref``, ``stackref``
  and ``threadref`` is ``NULL``.
Kunshan Wang's avatar
Kunshan Wang committed
177 178
* The initial value of ``tagref64`` is a floating point number which is
  positive zero.
Kunshan Wang's avatar
Kunshan Wang committed
179
* The initial values of all fields or elements in ``struct``, ``array``,
180 181
  ``vector`` and the fixed and variable part of ``hybrid`` are the initial
  values according to their respective types.
Kunshan Wang's avatar
Kunshan Wang committed
182 183 184 185

Garbage Collection
------------------

186 187 188
A **root** is an object reference or internal reference in:

* any global cell, or
Kunshan Wang's avatar
Kunshan Wang committed
189
* any bound Mu stacks, or
Kunshan Wang's avatar
Kunshan Wang committed
190
* the thread-local object reference in any threads, or
191
* any values held by any client contexts in the API.
192 193 194 195

A live stack contains references in its alloca cells and live local SSA
variables. A dead stack contains no references. A thread can strongly reach its
bound stack unless it is temporarily unbound because of trapping.
Kunshan Wang's avatar
Kunshan Wang committed
196 197

An object is **strongly reachable** if it can be reached by traversing only
198 199 200 201
strong, stack and thread references from any root. An object is **weakly
reachable** if it is not strongly reachable, but can be reached by traversing
strong stack, thread and weak references from any root. Otherwise an object is
**unreachable**.
Kunshan Wang's avatar
Kunshan Wang committed
202 203 204 205 206 207 208 209 210 211 212

The garbage collector can collect unreachable objects. It may also modify weak
references which refers to a weakly reachable object to ``NULL``.

    NOTE: Doing the latter may make weakly reachable objects become unreachable.

The garbage collector may move objects.

Memory Accessing
================

213 214 215
Memory accessing operations include **load** and **store** operations. To
**access** means to load or store. **Atomic read-modify-write** operations may
have both a load and a store operation, but may have special atomic properties.
Kunshan Wang's avatar
Kunshan Wang committed
216

217 218
    NOTE: Instructions are named in capital letters: LOAD and STORE. The
    abstract operations are in lower case: load, store and access.
Kunshan Wang's avatar
Kunshan Wang committed
219

Kunshan Wang's avatar
Kunshan Wang committed
220
Memory access operations can be performed by some Mu instructions (see
221 222
`Instruction Set <instruction-set.rst>`__, API functions (see `Client Interface
<api.rst>`__), native programs which accesses the pinned Mu memory,
223
or in other implementation-specific ways.
Kunshan Wang's avatar
Kunshan Wang committed
224 225 226 227

Two memory accesses **conflict** if one stores to a memory location and the
other loads from or stores to the same memory location.

228 229 230 231 232
Parameters and Semantics of Memory Operations
---------------------------------------------

Generally speaking, load operations copy values from the memory and store
operations copy values into the memory. The exact results are determined by the
233
memory model. See `Memory Model <memory-model.rst>`__.
234 235 236 237 238 239 240

A **load** operation has parameters ``(ord, T, loc)``. *ord* is the memory order
of the operation.  *T* is the type of *loc*, a memory location. The result is a
value of the strong variant of type *T*.

A **store** operation has parameters ``(ord, T, loc, newVal)``. *ord is the
memory order of the operation. *T* is the type of *loc*, a memory location.
Kunshan Wang's avatar
Kunshan Wang committed
241 242
*newVal* is a value whose type is the strong variant of *T*. This operation does
not produce values as result.
243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312

A **compare exchange** operation is an atomic read-modify-write operation. Its
parameters are ``(isWeak, ordSucc, ordFail, T, loc, expected, desired)``.
*isWeak* is a Boolean parameter which indicates whether the compare exchange
operation is weak or string. *ordSucc* and *ordFail* are the memory orders of
the operation when the comparing is successful or failed. *T* is the type of the
memory location *loc*. *expected* and *desired* are values whose type is the
strong variant of *T*. The result is a pair ``(v, s)``, where *v* has the type
of the strong variant of *T*, and *s* is a Boolean value.

A compare exchange operation performs a load operation on *loc* and compares its
result with *expected*. If the comparison is successful, it performs a store
operation to location *loc* with *desired* as *newVal*.

If the operation is strong, The comparison succeeds **if and only if** the
result of load equals *expected*. If it is weak, the comparison succeeds **only
if** the result of load equals the *expected* value and it may spuriously fail,
that is, it may fail even if the loaded value equals the *expected* value.

The result *v* is the result of the initial load operation and *s* is whether
the comparison is successful or not.

An **atomic-x** operation is an atomic read-modify-write operation, where *x*
can be one of (XCHG, ADD, SUB, AND, NAND, OR, XOR, MAX, MIN, UMAX, UMIN). Its
parameters are ``(ord, T, loc, opnd)``. *ord* is the memory order of the
operation. *T* is the type of the memory location *loc*. *opnd* is a value whose
type is the strong variant of *T*. The result also has the type of the strong
variant of *T*.

An atomic-x operation performs a load operation on location *loc*. Then
according to *x*, it performs one of the binary operation below, with the result
of the load operation as the left-hand-side operand and the value *opnd* as the
right-hand-side operand. The result is:

XCHG
    The value of *opnd*.
ADD
    The sum of the two operands.
SUB
    The difference of the two operands.
AND
    The bitwise AND of the two operands.
NAND
    The bitwise NOT of the bitwise AND of the two operands.
OR
    The bitwise inclusive OR of the two operands.
XOR
    The bitwise exclusive OR of the two operands.
MAX
    The maximum value of the two operands, considering both operand as signed.
MIN
    The minimum value of the two operands, considering both operand as signed.
UMAX
    The maximum value of the two operands, considering both operand as unsigned.
UMIN
    The minimum value of the two operands, considering both operand as unsigned.

..

    NOTE: In the C syntax, the semantic of NAND is ``~(op1 & op2)``.

Then it performs a store operation to location *loc* with the result of the
binary operation as *newVal*.

The result of the atomic-x operation is the result of the initial load
operation.

All operators other than ``XCHG`` are only applicable for integer types.
``XCHG`` is allowed for any type. However, a Mu implementation may only
implement some combinations of operators and operand types according to the
313
requirements specified in `Portability <portability.rst>`__
314 315 316 317 318 319 320

Memory Operations on Pointers
-----------------------------

Load, store, compare exchange and atomic-x operations can work with native
memory in addition to Mu memory locations. In this case, the *loc* parameter of
the above operations become a region of bytes in the native memory (usually
Kunshan Wang's avatar
Kunshan Wang committed
321
represented as ``uptr<T>``) rather than memory locations (usually represented as
322 323 324 325 326
``iref<T>``).

Only *native-safe* types can be accessed via pointers.

When accessing the memory via pointers, if the bytes are mapped to a Mu memory
327
location via pinning (see `Native Interface <native-interface.rst>`__), then if the
328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387
referent type of the pointer is the same as the Mu memory location, it has the
same effect as accessing the corresponding Mu memory location.

When non-atomically loading from or storing to a region *R* of bytes which is

1. not mapped to (i.e. not perfectly overlapping with) a particular Mu memory
   location, and
2. each byte in the region is part of any mapped byte region of any pinned Mu
   memory location,
   
then such an operation loads or stores on a byte-by-byte basis. Specifically:

* Such a load operation *L*:

  1. for each address *A* of byte in the region *R*, performs a load operation
       on the (only) Mu memory location of scalar types (not composite types)
       whose mapped byte region *R2* contains address *A*, then extract the byte
       value *b* at address *A*, then

  2. combine all results *b* from the previous step into a sequence of byte
       values, then interprets it as the bytes representation of a Mu value.
       This Mu value is the result of the load operation *L*.

* Such a store operation *S*:

  1. interprets its *newVal* argument as its bytes representation *B*, then

  2. for each address *A* of byte in the region *R*, performs a load operation
       on the (only) Mu memory location of scalar types (not composite types)
       whose mapped byte region *R2* contains address *A*, then update the
       result by replacing the byte at address *A* with the byte in *B*, then
       perform a store operation on the same Mu memory location with the updated
       value as *newVal*.

..

    NOTE: This allows Mu to allocate a byte array and access (by itself or by
    native programs) it via pointers as if it is a struct or a union, and then
    interpret the written values as bytes. The requirement of each byte being
    mapped gives implementation-defined behaviours to accesses beyond the border
    of any Mu objects (such as array out-of-bound errors), or accessing padding
    bytes in Mu structs.

Accessing native memory regions not mapped to Mu memory locations has
implementation-defined behaviours.

    NOTE: Accessing the native memory may have all kinds of results: getting a
    previously-stored value, storing to one address and affect another address
    when two addresses are mapped to the same physical memory region/file,
    segmentation fault, bus error (especially on OSX), turning on/off the light
    by doing memory-mapped IO, launching nuclear missiles, summoning nasal
    demons, etc. Mu cannot make much guarantee.

Native programs can access pinned Mu memory locations in implementation-defined
ways.

    NOTE: This means it requires the efforts from the implementations of both Mu
    and the native programs to obtain any defined semantics in mixed Mu-native
    programs. For C, it will involve the C language, the platform ABI and the Mu
    ABI of that platform.
388

389 390
Memory Layout
=============
Kunshan Wang's avatar
Kunshan Wang committed
391

392 393 394
Whether or how Mu data of any type are represented in the native memory is
implementation-defined. When an object is pinned, the layout is viewed from the
native memory in a platform-dependent way.
395 396

For Mu implementers, it is recommended to use the layout defined by the
397 398
application binary interface of the platform in order to ease the data exchange
via the native interface implementation.
Kunshan Wang's avatar
Kunshan Wang committed
399

400
Mu has some rules about Mu memory locations which must always preserved.
Kunshan Wang's avatar
Kunshan Wang committed
401

402 403
Rules of Memory Locations
=========================
Kunshan Wang's avatar
Kunshan Wang committed
404 405 406 407 408 409

Every memory location has an associated type bound when the memory location is
created and cannot be changed. The memory location can only hold values of that
type.

    NOTE: The association between memory location and type is conceptual. This
Kunshan Wang's avatar
Kunshan Wang committed
410
    does not mean the Mu implementation has to keep a metadata of the type of
Kunshan Wang's avatar
Kunshan Wang committed
411 412 413 414
    all memory locations at runtime. The implementation only needs to keep
    enough metadata to implement its garbage collector.

A memory location has a **beginning** and an **end**. The value it holds is
415 416 417 418 419 420 421
represented in that region.  A non-NULL internal reference of type *T* refers to
the memory location of type *T* at a specific beginning.

    NOTE: There can only be one such memory location.
    
Specifically, there is a memory location of type ``void`` at the beginning of
any other memory location.
Kunshan Wang's avatar
Kunshan Wang committed
422

423 424
    NOTE: This makes it legal to cast any ``iref<T>`` to ``iref<void>`` and
    back.
Kunshan Wang's avatar
Kunshan Wang committed
425 426 427 428 429 430 431 432

Prefix Rule
-----------

    NOTE: The prefix rule is design to support having common language-specific
    object headers in objects. It also supports inheritance in object-oriented
    programming where a superclass is a prefix of a subclass.

433
A component C is a **prefix** of a type T if any of the following is true.
Kunshan Wang's avatar
Kunshan Wang committed
434

435 436
+ *C* is *T* itself.
+ *T* is a ``struct`` and *C* is its first field.
437 438
+ *T* is a ``hybrid`` and *C* is its first field of the fixed part, or the fixed
  part of *T* has no fields and *C* is the first element of the variable part.
439 440 441
+ *T* is an ``array<T n>`` or ``vector<T n>`` and n >= 1, and *C* is its first
  element.
+ *C* is a prefix of another prefix of *T*.
Kunshan Wang's avatar
Kunshan Wang committed
442

443 444
A prefix of memory location *M* is the memory location that represents a prefix
of the type of *M*.
Kunshan Wang's avatar
Kunshan Wang committed
445

446
All prefixes of a memory location have the same beginning.
Kunshan Wang's avatar
Kunshan Wang committed
447

448
The ``REFCAST`` instruction or the ``refcast`` API function preserves the
449 450
beginning of the operand. If it casts ``iref<T1>`` to ``iref<T2>``, the result
is an internal reference to the memory location of type ``T2`` at the same
451
beginning. (see `Instruction Set <instruction-set.rst>`__)
Kunshan Wang's avatar
Kunshan Wang committed
452 453 454 455 456 457 458 459 460 461 462 463

Array Rule
----------

A **memory array** is defined as a contiguous memory location of components of
the same type. The ``array`` type, the ``vector`` type as well as the variable
part of a ``hybrid`` are all represented in the memory as memory arrays.

Nested ``array``, ``vector`` and variable part of ``hybrid`` can be considered
as a single memory array with the innermost element type of the nested type as
the element type.

464
    Example: The variable part of ``hybrid<T array<array<vector<float 4> 10
Kunshan Wang's avatar
Kunshan Wang committed
465
    20>>`` can be treated as:
Kunshan Wang's avatar
Kunshan Wang committed
466

Kunshan Wang's avatar
Kunshan Wang committed
467 468 469 470
    + a memory array of ``float``, or,
    + a memory array of ``vector<float 4>``, or
    + a memory array of ``array<vector<float 4> 10>``, or
    + a memory array of ``array<array<vector<float 4> 10> 20>``, or
Kunshan Wang's avatar
Kunshan Wang committed
471

Kunshan Wang's avatar
Kunshan Wang committed
472 473
Internal references to an element of a memory array can be shifted to other
elements in the same memory array using the ``SHIFTIREF`` instruction.
Kunshan Wang's avatar
Kunshan Wang committed
474

Kunshan Wang's avatar
Kunshan Wang committed
475
    NOTE: ``SHIFTREF`` may cross the boundary of Mu types, but still remain in
Kunshan Wang's avatar
Kunshan Wang committed
476 477 478 479 480
    the memory array. For example, an internal reference to the first ``float``
    in the ``array<array<float 10> 10>`` array, which is a 10x10 matrix of
    float, can be shifted to other rows using the ``SHIFTIREF`` instruction and
    cross the 10-element boundary. Shifting by 12 elements from element (0,0)
    will reach the element at (1,2).
Kunshan Wang's avatar
Kunshan Wang committed
481 482

.. vim: tw=80