===================
Common Instructions
===================

This document specifies Common Instructions.

**Common Instructions** are instructions that have a common format and are
used with the ``COMMINST`` super instruction. They have:

1. An ID and a name. (This means, they are *identified*. See `<uvm-ir.rest>`__.)
2. A flag list.
3. A type parameter list.
4. A value parameter list.
5. An optional exception clause.
6. A possibly empty (which means optional) keep-alive clause.

*Common instructions* are a mechanism to extend the Mu IR without adding new
instructions or changing the grammar.

    NOTE: *Common instructions* were named "intrinsic function" in previous
    versions of this document. The name was borrowed from LLVM. However, the
    common instructions in Mu are quite different from the usual concept of
    intrinsic functions.

    Intrinsic functions usually mean a kind of function that is understood
    directly by the compiler. The C function ``memcpy`` is considered an
    intrinsic function by some compilers. In JikesRVM, methods of the ``Magic``
    class are a kind of intrinsic function. They appear like ordinary functions
    in the language and bypass all front-end tools including the C parser and
    javac, but they are understood by the backend. Their purpose is to perform
    tasks that cannot be expressed by the high-level programming language,
    including direct raw memory access in Java.

    Common instructions only differ from ordinary Mu instructions in that they
    have a common format and are called by the ``COMMINST`` super instruction.
    The purpose is to add more instructions to the Mu IR without having to
    modify the parser.

    Common instructions are not Mu functions and cannot be called by the
    ``CALL`` instruction, nor can they be directly used from the high-level
    language that the client implements. The Mu client must understand common
    instructions because it is the only source of IR code of Mu. That is to say,
    *there is no way any higher-level program can express anything which Mu
    knows but the client does not*. For special high-level language functions
    that cannot be directly implemented in the high-level programming language,
    like the methods in the ``java.lang.Thread`` class, the client must
    implement those special high-level language functions in "ordinary" Mu IR
    code, which may or may not involve common instructions. For example,
    creating a thread is "magic" in Java, but it is no more special than
    executing an instruction (``NEWTHREAD``) in Mu. Some Java libraries require
    Mu to make a ``CCALL`` to some C functions which are provided by the JVM,
    and they slip under the level of Mu. But Mu and the client always know the
    fact that "it calls a C function" and it is not magic.

This document uses the following notation::

    [id]@name [F1 F2 ...] < T1 T2 ... > ( p1:t1, p2:t2, ... ) excClause KEEPALIVE -> RTs

``id`` is the ID and ``@name`` is the name. ``F1 F2 ...`` is a list of flags.
``T1 T2 ...`` is a list of types. It is the type parameter list. ``p1:t1,
p2:t2, ...`` is a list of symbolic name and type pairs. It is the value
parameter list with the type of each parameter. If ``excClause`` or
``KEEPALIVE`` is present, the common instruction accepts an exception clause or
a keep-alive clause, respectively. ``RTs`` are the return types. If the type
parameter list or the value parameter list is omitted, the common instruction
takes no type parameters or value parameters, respectively. If the return type
is omitted, it returns no results (equivalent to ``-> ()``).
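
As an informal illustration of how the parts of this notation fit together, the
following sketch renders a descriptor into the notation. It is written in
Python, which is not part of Mu. The ``CommInst`` class and its field names are
hypothetical and exist only for this example.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List, Tuple


    @dataclass
    class CommInst:
        """Hypothetical descriptor of one common instruction (illustration)."""
        id: int
        name: str
        flags: List[str] = field(default_factory=list)
        type_params: List[str] = field(default_factory=list)
        value_params: List[Tuple[str, str]] = field(default_factory=list)
        exc_clause: bool = False
        keepalive: bool = False
        ret_types: List[str] = field(default_factory=list)

        def render(self) -> str:
            # Omitted parts are simply left out, as the notation allows.
            parts = ["[0x%x]%s" % (self.id, self.name)]
            if self.flags:
                parts.append("[" + " ".join(self.flags) + "]")
            if self.type_params:
                parts.append("<" + " ".join(self.type_params) + ">")
            if self.value_params:
                args = ", ".join("%s: %s" % (n, t) for n, t in self.value_params)
                parts.append("(" + args + ")")
            if self.exc_clause:
                parts.append("excClause")
            if self.keepalive:
                parts.append("KEEPALIVE")
            if self.ret_types:
                parts.append("-> " + " ".join(self.ret_types))
            return " ".join(parts)


    wait = CommInst(0x220, "@uvm.futex.wait", type_params=["T"],
                    value_params=[("%loc", "iref<T>"), ("%val", "T")],
                    ret_types=["int<32>"])
    rendered = wait.render()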

The names of many common instructions are grouped by prefixes, such as
``@uvm.tr64.``. In this document, their common prefixes may be omitted in their
descriptions when unambiguous.

Thread and Stack operations
===========================

::

    [0x201]@uvm.new_stack <[sig]> (%func: funcref<sig>) -> stackref

Create a new stack with ``%func`` as the stack-bottom function. ``%func`` must
have signature ``sig``. Returns the stack reference to the new stack.

The stack-bottom frame is in the state **READY<Ts>**, where *Ts* are the
parameter types of ``%func``.

This instruction continues exceptionally if Mu fails to create the stack. The
exception parameter receives NULL.

::

    [0x202]@uvm.kill_stack (%s: stackref)

Destroy the given stack ``%s``. The stack ``%s`` must be in the **READY** state
and will enter the **DEAD** state.

::

    [0x203]@uvm.thread_exit

Stop the current thread and kill the current stack. The current stack will enter
the **DEAD** state. The current thread stops running.

::

    [0x204]@uvm.current_stack -> stackref

Return the current stack.

64-bit Tagged Reference
=======================

::

    [0x211]@uvm.tr64.is_fp  (%tr: tagref64) -> int<1>
    [0x212]@uvm.tr64.is_int (%tr: tagref64) -> int<1>
    [0x213]@uvm.tr64.is_ref (%tr: tagref64) -> int<1>

- ``is_fp`` checks if ``%tr`` holds an FP number.
- ``is_int`` checks if ``%tr`` holds an integer.
- ``is_ref`` checks if ``%tr`` holds a reference.

Return 1 or 0 for true or false.

::

    [0x214]@uvm.tr64.from_fp  (%val: double) -> tagref64
    [0x215]@uvm.tr64.from_int (%val: int<52>) -> tagref64
    [0x216]@uvm.tr64.from_ref (%ref: ref<void>, %tag: int<6>) -> tagref64

- ``from_fp``  creates a ``tagref64`` value from an FP number ``%val``.
- ``from_int`` creates a ``tagref64`` value from an integer ``%val``.
- ``from_ref`` creates a ``tagref64`` value from a reference ``%ref`` and the
  integer tag ``%tag``.

Return the created ``tagref64`` value.

::

    [0x217]@uvm.tr64.to_fp  (%tr: tagref64) -> double
    [0x218]@uvm.tr64.to_int (%tr: tagref64) -> int<52>
    [0x219]@uvm.tr64.to_ref (%tr: tagref64) -> ref<void>
    [0x21a]@uvm.tr64.to_tag (%tr: tagref64) -> int<6>

- ``to_fp``  returns the FP number held by ``%tr``.
- ``to_int`` returns the integer held by ``%tr``.
- ``to_ref`` returns the reference held by ``%tr``.
- ``to_tag`` returns the integer tag held by ``%tr`` that accompanies the
  reference.

They have undefined behaviour if ``%tr`` does not hold a value of the
expected type.
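
The observable behaviour of these instructions can be modelled as a three-way
tagged union. The Python sketch below is only such a model: it says nothing
about how Mu encodes a ``tagref64`` in 64 bits. Truncating integers to 52 bits
and tags to 6 bits is an assumption of the model, and where Mu has undefined
behaviour the model raises an error instead.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Any

    INT52_MASK = (1 << 52) - 1   # width of the int<52> payload
    TAG6_MASK = (1 << 6) - 1     # width of the int<6> tag


    @dataclass(frozen=True)
    class TagRef64:
        kind: str        # "fp", "int" or "ref"
        payload: Any     # float, int or an arbitrary Python object
        tag: int = 0     # only meaningful when kind == "ref"


    def from_fp(val):
        return TagRef64("fp", float(val))


    def from_int(val):
        return TagRef64("int", val & INT52_MASK)


    def from_ref(ref, tag):
        return TagRef64("ref", ref, tag & TAG6_MASK)


    def is_fp(tr):
        return 1 if tr.kind == "fp" else 0


    def is_int(tr):
        return 1 if tr.kind == "int" else 0


    def is_ref(tr):
        return 1 if tr.kind == "ref" else 0


    def _expect(tr, kind):
        # Undefined behaviour in Mu; the model raises so that mistakes show up.
        if tr.kind != kind:
            raise TypeError("tagref64 holds %s, not %s" % (tr.kind, kind))
        return tr


    def to_fp(tr):
        return _expect(tr, "fp").payload


    def to_int(tr):
        return _expect(tr, "int").payload


    def to_ref(tr):
        return _expect(tr, "ref").payload


    def to_tag(tr):
        return _expect(tr, "ref").tag

A client would typically test with ``is_fp``, ``is_int`` or ``is_ref`` first and
only then call the matching ``to_*`` instruction.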

Math Instructions
=================

    TODO: Should provide enough math functions to support:

    1. Ordinary arithmetic and logical operations that throw exceptions on
       overflow. Example: C# in checked mode, ``java.lang.Math.addExact`` added
       in Java 1.8.
    2. Floating point math functions. Example: trigonometric functions, testing
       NaN, fused multiply-add, ...

    It requires some work to decide a complete list of such functions. To work
    around the limitations for now, please call native functions in libc or
    libm using ``CCALL``.

Futex Instructions
==================

See `<threads-stacks.rest>`__ for high-level descriptions about Futex.

Wait
----

::

    [0x220]@uvm.futex.wait <T> (%loc: iref<T>, %val: T) -> int<32>
    [0x221]@uvm.futex.wait_timeout <T> (%loc: iref<T>, %val: T, %timeout: int<64>) -> int<32>

``T`` must be an integer type.

``wait`` and ``wait_timeout`` verify that the memory location ``%loc`` still
contains the value ``%val`` and then put the current thread into the waiting
queue of memory location ``%loc``. If ``%loc`` does not contain ``%val``, they
return immediately. These instructions are atomic.

- ``wait`` waits indefinitely.

- ``wait_timeout`` has an extra ``%timeout`` parameter which is a 64-bit
  unsigned integer that represents a time in nanoseconds. It specifies the
  duration of the wait.

Both instructions are allowed to spuriously wake up.

They return a signed integer which indicates the result of this call:

* 0: the current thread is woken.
* -1: the memory location ``%loc`` does not contain the value ``%val``.
* -2: spurious wakeup.
* -3: timeout during waiting (``wait_timeout`` only).
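
Because spurious wakeups are allowed, a caller normally re-checks the memory
location after every return instead of trusting the return code alone. The
Python sketch below shows one such retry loop. It is an assumption about how a
client might use the return codes, not something this document mandates, and
``wait_for_change``, ``load`` and ``futex_wait`` are hypothetical names used
only for this example.

.. code-block:: python

    WOKEN, MISMATCH, SPURIOUS, TIMEOUT = 0, -1, -2, -3


    def wait_for_change(load, futex_wait, expected):
        """Return True once load() != expected, False if the wait timed out.

        load() reads the memory location. futex_wait(expected) stands in for
        @uvm.futex.wait or @uvm.futex.wait_timeout and returns one of the
        codes listed above.
        """
        while load() == expected:
            rc = futex_wait(expected)
            if rc == TIMEOUT:
                return False
            # WOKEN, MISMATCH and SPURIOUS all fall through to the re-check:
            # only the value in memory decides whether the wait is over.
        return True


    # Scripted stand-in for the instruction: first a spurious wakeup, then a
    # real wake after "another thread" has changed the location.
    cell = [0]
    calls = []


    def fake_wait(expected):
        calls.append(expected)
        if len(calls) == 1:
            return SPURIOUS
        cell[0] = 1
        return WOKEN


    changed = wait_for_change(lambda: cell[0], fake_wait, 0)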

Wake
----

::

    [0x222]@uvm.futex.wake <T> (%loc: iref<T>, %nthread: int<32>) -> int<32>

``T`` must be an integer type.

``wake`` wakes *N* threads in the waiting queue of the memory location ``%loc``.
This instruction is atomic.

*N* is the minimum value of ``%nthread`` and the actual number of threads in the
waiting queue of ``%loc``. ``%nthread`` is signed. Negative ``%nthread`` has
undefined behaviour.

It returns the number of threads woken up.
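
The rule for *N* can be sketched with a simple queue model in Python. The model
keeps one waiting queue per memory location in a dictionary. Waking threads in
FIFO order is an assumption of the sketch, since this document does not say
which waiters are chosen, and the function name ``wake`` merely mirrors the
instruction.

.. code-block:: python

    from collections import deque


    def wake(queues, loc, nthread):
        """Model of @uvm.futex.wake: wake N = min(nthread, waiters) threads."""
        if nthread < 0:
            # Undefined behaviour in Mu; the model refuses instead.
            raise ValueError("negative nthread")
        queue = queues.setdefault(loc, deque())
        n = min(nthread, len(queue))
        woken = [queue.popleft() for _ in range(n)]
        return n, woken


    queues = {"loc_a": deque(["t1", "t2", "t3"])}
    first = wake(queues, "loc_a", 2)    # two of the three waiters are woken
    second = wake(queues, "loc_a", 5)   # only one waiter is left to wake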

Requeue
-------

::

    [0x223]@uvm.futex.cmp_requeue <T> (%loc_src: iref<T>, %loc_dst: iref<T>, %expected: T, %nthread: int<32>) -> int<32>

``T`` must be an integer type.

``cmp_requeue`` verifies that the memory location ``%loc_src`` still contains
the value ``%expected``, then wakes up *N* threads from the waiting queue of
``%loc_src`` and moves all other threads in the waiting queue of ``%loc_src``
to the waiting queue of ``%loc_dst``. If ``%loc_src`` does not contain the
value ``%expected``, it returns immediately. This instruction is atomic.

*N* is the minimum value of ``%nthread`` and the actual number of threads in the
waiting queue of ``%loc_src``. ``%nthread`` is signed. Negative ``%nthread`` has
undefined behaviour.

It returns a signed integer. When ``%loc_src`` contains the value
``%expected``, it returns the number of threads woken up; otherwise it
returns -1.
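
The same kind of queue model can illustrate ``cmp_requeue``. The Python sketch
below is a single-threaded model, so atomicity is not represented. FIFO order
and the dictionary-based ``memory`` and ``queues`` arguments are assumptions
made only for this example.

.. code-block:: python

    from collections import deque


    def cmp_requeue(memory, queues, loc_src, loc_dst, expected, nthread):
        """Model of @uvm.futex.cmp_requeue on plain Python containers."""
        if memory[loc_src] != expected:
            return -1                   # value changed: nothing is touched
        if nthread < 0:
            # Undefined behaviour in Mu; the model refuses instead.
            raise ValueError("negative nthread")
        src = queues.setdefault(loc_src, deque())
        dst = queues.setdefault(loc_dst, deque())
        n = min(nthread, len(src))
        for _ in range(n):
            src.popleft()               # these threads are woken
        if dst is not src:
            dst.extend(src)             # the remaining waiters move over
            src.clear()
        return n


    memory = {"src": 1, "dst": 0}
    queues = {"src": deque(["t1", "t2", "t3"]), "dst": deque(["t0"])}
    result = cmp_requeue(memory, queues, "src", "dst", 1, 1)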

Miscellaneous Instructions
==========================

::

    [0x230]@uvm.kill_dependency <T> (%val: T) -> T

Return the same value as ``%val``, but ``%val`` does not carry a dependency to
the return value.

    NOTE: This is supposed to free the compiler from keeping dependencies in
    some performance-critical cases.

Native Interface
================

Object pinning
--------------

::

    [0x240]@uvm.native.pin   <T> (%opnd: T) -> uptr<U>
    [0x241]@uvm.native.unpin <T> (%opnd: T)

*T* must be ``ref<U>`` or ``iref<U>`` for some U.

- ``pin`` adds one instance of the reference ``%opnd`` to the pinning multiset
  of the current thread.  Returns the mapped pointer to the bytes for the memory
  location.  If *T* is ``ref<U>``, it is equivalent to pinning the memory
  location of the whole object (as returned by the ``GETIREF`` instruction). If
  ``%opnd`` is ``NULL``, the result is a null pointer whose address is 0.

- ``unpin`` removes one instance of the reference ``%opnd`` from the pinning
  multiset of the current thread. It has undefined behaviour if no such
  instance exists.
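
The multiset semantics mean that an object stays pinned until every ``pin`` has
been matched by an ``unpin``. A minimal Python model of one thread's pinning
multiset is sketched below. The ``PinningSet`` class is hypothetical, Python's
``id()`` only stands in for the mapped pointer, and counting a ``NULL`` pin
under address 0 is an assumption of the model.

.. code-block:: python

    from collections import Counter


    class PinningSet:
        """Model of the pinning multiset of a single thread."""

        def __init__(self):
            self._pins = Counter()

        @staticmethod
        def _key(ref):
            # NULL maps to address 0; other objects use id() as a stand-in.
            return 0 if ref is None else id(ref)

        def pin(self, ref):
            key = self._key(ref)
            self._pins[key] += 1        # add one instance to the multiset
            return key                  # stand-in for the returned pointer

        def unpin(self, ref):
            key = self._key(ref)
            if self._pins[key] == 0:
                # Undefined behaviour in Mu; the model raises instead.
                raise KeyError("unpin without a matching pin")
            self._pins[key] -= 1        # remove exactly one instance
            if self._pins[key] == 0:
                del self._pins[key]

        def is_pinned(self, ref):
            return self._pins[self._key(ref)] > 0


    pins = PinningSet()
    obj = object()
    pins.pin(obj)
    pins.pin(obj)
    pins.unpin(obj)
    still_pinned = pins.is_pinned(obj)   # one instance remains
    pins.unpin(obj)
    finally_pinned = pins.is_pinned(obj)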

Mu function exposing
--------------------

::

    [0x242]@uvm.native.expose [callconv] <[sig]> (%func: funcref<sig>, %cookie: int<64>) -> U

*callconv* is a platform-specific calling convention flag. *U* is determined by
the calling convention and *sig*.

``expose`` exposes the Mu function ``%func`` as a value according to the
calling convention *callconv*, with the cookie ``%cookie`` attached.

    Example::

        .funcdef @foo VERSION ... <@foo_sig> (...) { ... }

        %ev = COMMINST @uvm.native.expose [#DEFAULT] <[@foo_sig]>

::

    [0x243]@uvm.native.unexpose [callconv] (%value: U)

*callconv* is a platform-specific calling convention flag. *U* is determined by
the calling convention.

``unexpose`` removes the exposed value.

::

    [0x244]@uvm.native.get_cookie () -> int<64>

If a Mu function is called via its exposed value, this instruction returns the
attached cookie. Otherwise it returns an arbitrary value.

Metacircular Client Interface
=============================

These are additional instructions that enable Mu IR programs to behave like a
client.

Some types and signatures are pre-defined. They are always available. Note that
the following are not strict text IR syntax because some types are defined
inline::

    .typedef @uvm.meta.bytes   = hybrid <int<64> int<8>>    // ID: 0x260
    .typedef @uvm.meta.bytes.r = ref<@uvm.meta.bytes>       // ID: 0x261
    .typedef @uvm.meta.refs    = hybrid <int<64> ref<void>> // ID: 0x262
    .typedef @uvm.meta.refs.r  = ref<@uvm.meta.refs>        // ID: 0x263

    .funcsig @uvm.meta.trap_handler.sig       = (stackref int<32> ref<void>) -> ()   // ID: 0x264

In ``bytes`` and ``refs``, the fixed part is the length of the variable part.
``bytes`` represents a byte array. ASCII strings are also represented this way.

ID/name conversion
------------------

::

    [0x250]@uvm.meta.id_of (%name: @uvm.meta.bytes.r) -> int<32>
    [0x251]@uvm.meta.name_of (%id: int<32>) -> @uvm.meta.bytes.r

- ``id_of`` converts a textual Mu name ``%name`` to the numerical ID. The name
  must be a global name.

- ``name_of`` converts the ID ``%id`` to its corresponding name. If the name
  does not exist (if defined in binary only, or it is an instruction without
  a name), it returns ``NULL``. The returned object must not be modified.

They have undefined behaviour if the name or the ID in the argument does not
exist, or ``%name`` is ``NULL``.

Bundle/HAIL loading
-------------------

::

    [0x252]@uvm.meta.load_bundle (%buf: @uvm.meta.bytes.r)
    [0x253]@uvm.meta.load_hail   (%buf: @uvm.meta.bytes.r)

``load_bundle`` and ``load_hail`` load Mu IR bundles and HAIL scripts,
respectively. ``%buf`` is the content. The first 4 characters in ``%buf``
determine whether it is binary or text.

Stack introspection
-------------------

::

    [0x254]@uvm.meta.new_cursor         (%stack: stackref) -> framecursorref
    [0x255]@uvm.meta.next_frame         (%cursor: framecursorref)
    [0x256]@uvm.meta.copy_cursor        (%cursor: framecursorref) -> framecursorref
    [0x257]@uvm.meta.close_cursor       (%cursor: framecursorref)

In all cases, ``%cursor`` and ``%stack`` cannot be ``NULL``.

- ``new_cursor`` allocates a frame cursor, referring to the top frame of
  ``%stack``. Returns the frame cursor reference.

- ``next_frame`` moves the frame cursor so that it refers to the frame below its
  current frame.

- ``copy_cursor`` allocates a frame cursor which refers to the same frame as
  ``%cursor``. Returns the frame cursor reference.

- ``close_cursor`` deallocates the cursor.

::

    [0x258]@uvm.meta.cur_func           (%cursor: framecursorref) -> int<32>
    [0x259]@uvm.meta.cur_func_ver       (%cursor: framecursorref) -> int<32>
    [0x25a]@uvm.meta.cur_inst           (%cursor: framecursorref) -> int<32>
    [0x25b]@uvm.meta.dump_keepalives    (%cursor: framecursorref) -> @uvm.meta.refs.r

These instructions operate on the frame referred to by ``%cursor``. In all
cases, ``%cursor`` cannot be ``NULL``.

- ``cur_func`` returns the ID of the function of the frame. Returns 0 if the
  frame is native.

- ``cur_func_ver`` returns the ID of the current function version of the frame.
  Returns 0 if the frame is native, or the function of the frame is undefined.

- ``cur_inst`` returns the ID of the current instruction of the frame. Returns 0
  if the frame is just created, its function is undefined, or the frame is
  native.

- ``dump_keepalives`` dumps the values of the keep-alive variables of the
  current instruction in the frame. If the function is undefined, the arguments
  are the keep-alive variables. Cannot be used on native frames. The return
  value is a list of object references, each of which refers to an object which
  has type *T* and contains value *v*, where *T* and *v* are the type and the
  value of the corresponding keep-alive variable, respectively.
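
The cursor instructions can be pictured with a small Python model in which a
stack is a list of frames, top frame first. The dictionary-based frame records
and the ``backtrace`` helper are hypothetical. In particular, the model knows
the stack depth in advance, which is a shortcut of the sketch and not a
statement about how a Mu program detects the bottom frame.

.. code-block:: python

    class FrameCursor:
        """Model of a frame cursor: a stack plus a position within it."""

        def __init__(self, frames, index):
            self.frames = frames   # list of frames, top of the stack first
            self.index = index     # 0 refers to the top frame


    def new_cursor(stack):
        return FrameCursor(stack, 0)


    def next_frame(cursor):
        cursor.index += 1          # now refers to the frame below


    def copy_cursor(cursor):
        return FrameCursor(cursor.frames, cursor.index)


    def close_cursor(cursor):
        cursor.frames = None       # the cursor must not be used afterwards


    def cur_func(cursor):
        frame = cursor.frames[cursor.index]
        return 0 if frame["native"] else frame["func_id"]


    def backtrace(stack):
        """Collect the function ID of every frame, from top to bottom."""
        cursor = new_cursor(stack)
        ids = []
        for depth in range(len(stack)):
            if depth > 0:
                next_frame(cursor)
            ids.append(cur_func(cursor))
        close_cursor(cursor)
        return ids


    stack = [
        {"func_id": 0x10003, "native": False},   # top frame
        {"func_id": None, "native": True},       # a native frame
        {"func_id": 0x10001, "native": False},   # bottom frame
    ]
    trace = backtrace(stack)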

On-stack replacement
--------------------

::

    [0x25c]@uvm.meta.pop_frames_to (%cursor: framecursorref)
    [0x25d]@uvm.meta.push_frame <[sig]> (%stack: stackref, %func: funcref<sig>)

``%cursor``, ``%stack`` and ``%func`` must not be ``NULL``.

- ``pop_frames_to`` pops all frames above ``%cursor``.

- ``push_frame`` creates a new frame on top of the stack ``%stack`` for the
  current version of the Mu function ``%func``. ``%func`` must have the
  signature ``sig``.
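
Taken together, the two instructions replace the upper part of a stack. The
Python sketch below models a stack as a list with the top frame first and a
cursor as an index into that list. Both representations are assumptions of the
model, and frames are reduced to a function ID.

.. code-block:: python

    def pop_frames_to(stack, cursor_index):
        """Pop every frame above the frame the cursor refers to."""
        del stack[:cursor_index]


    def push_frame(stack, func_id):
        """Create a new frame for func_id on top of the stack."""
        stack.insert(0, {"func_id": func_id})


    # Replace the top frame (function 3) with a frame for function 7.
    stack = [{"func_id": 3}, {"func_id": 2}, {"func_id": 1}]
    pop_frames_to(stack, 1)      # the cursor refers to the frame of function 2
    push_frame(stack, 7)
    ids = [frame["func_id"] for frame in stack]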

Watchpoint operations
---------------------

::

    [0x25e]@uvm.meta.enable_watchpoint  (%wpid: int<32>)
    [0x25f]@uvm.meta.disable_watchpoint (%wpid: int<32>)

- ``enable_watchpoint``  enables  all watchpoints of watchpoint ID ``%wpid``.
- ``disable_watchpoint`` disables all watchpoints of watchpoint ID ``%wpid``.

Trap handling
-------------

::

    [0x260]@uvm.meta.set_trap_handler (%handler: funcref<@uvm.meta.trap_handler.sig>, %userdata: ref<void>)

This instruction registers a trap handler. ``%handler`` is the function to be
called and ``%userdata`` will be its last argument when called.

This instruction overrides the trap handler registered via the C-based client
API.

A trap handler takes three parameters:

1. The stack where the trap takes place.
2. The watchpoint ID, or 0 if triggered by the ``TRAP`` instruction.
3. The user data, which is provided when registering.

A trap handler is run by the same Mu thread that caused the trap and is executed
on a new stack.

A trap handler *usually* terminates by either executing the ``@uvm.thread_exit``
instruction (probably also killing the old stack before exiting), or using
``SWAPSTACK`` to switch back to another stack while killing the stack the trap
handler was running on.

Notes about dynamism
--------------------

These additional instructions are not dynamic. Unlike the C-based API, these
instructions do not use handles. Arguments, such as the additional arguments of
``push_frame``, are also statically typed. If the client needs dynamically typed
handles, it can always make its own. For example, ``push_frame`` can be wrapped
by a Mu function which takes a dynamic argument list, checks the argument types,
and executes a static ``@uvm.meta.push_frame`` instruction on the unboxed
values.

Some dynamic lookups, such as looking up constants by ID, are not available,
either. It can be worked around by maintaining a ``HashMap<id,value>`` (in the
form of Mu IR programs) which is updated with each bundle loading. In other
words, if the client does not maintain such a map, Mu will have to maintain it
for the client.
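
The workaround amounts to a client-side table that is refreshed on every bundle
load. It is sketched below in Python for brevity, although the text above has a
map written in Mu IR in mind. The ``ConstantTable`` class and its method names
are hypothetical.

.. code-block:: python

    class ConstantTable:
        """Client-maintained map from constant ID to value."""

        def __init__(self):
            self._by_id = {}

        def on_bundle_loaded(self, constants):
            # constants: mapping from ID to value for the bundle just loaded.
            self._by_id.update(constants)

        def lookup(self, const_id):
            # Returns None when the ID is unknown to the client.
            return self._by_id.get(const_id)


    table = ConstantTable()
    table.on_bundle_loaded({0x10010: 42})
    table.on_bundle_loaded({0x10011: 3.14})
    answer = table.lookup(0x10010)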

.. vim: tw=80