Issue created Aug 15, 2016 by Kunshan Wang (@u5211824, Owner)

C thread becoming Mu thread (exposed functions, a.k.a. ".expfunc")

This issue is about calling Mu functions from C functions. It is not a problem if Mu initiated the call to a native program, which then calls back into Mu. But when a fresh native thread (such as one created by pthread_create) directly calls a Mu function, thread-local state (such as GC state) must already have been initialised, or the Mu program will not work properly.

Related spec: https://gitlab.anu.edu.au/mu/mu-spec/blob/master/native-interface.rst#native-functions-calling-mu-functions

Previous issue: #39

The problem

When a Mu thread is executing, there is thread-local state that needs to exist to support the execution of Mu IR programs.

For example, if the Mu IR program uses bump-pointer GC, the "current pointer" is a per-thread state, and it should point to the next available memory at all times. Mu instructions (such as NEW and NEWHYBRID) assume such thread-local pointers are set up when they are executed.

Such states are usually set up when a Mu thread is created. When a thread is created using the NEWTHREAD instruction or its equivalent API, the micro VM will initialise the states properly.

But the problem arises when the thread is created natively (for example, by pthread_create). Such POSIX functions are not designed with Mu in mind and will not initialise Mu-specific state. So a PThread cannot directly call a Mu function unless some preparation is done.
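The dependency described above can be illustrated with a minimal C sketch. All names here (bump_cursor, bump_limit, mu_thread_setup, mu_new) are hypothetical and not part of the Mu spec or any Mu implementation. The sketch only models a per-thread bump pointer that stays NULL until some set-up code has run on the current thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread bump-pointer state. Thread-local objects start
 * out zeroed, so both pointers are NULL on any thread that skipped set-up. */
static _Thread_local char *bump_cursor; /* next free byte */
static _Thread_local char *bump_limit;  /* one past the end of the buffer */

/* Stand-in for the set-up a micro VM would perform for a thread created by
 * NEWTHREAD. A thread created by pthread_create never runs this. */
void mu_thread_setup(char *buf, size_t len) {
    assert(buf != NULL);
    bump_cursor = buf;
    bump_limit = buf + len;
}

/* Stand-in for the allocation path of NEW. The NULL check exists only to
 * make the failure observable in this sketch; a fast path that assumes the
 * state is valid would use the pointers unconditionally. */
void *mu_new(size_t size) {
    if (bump_cursor == NULL) {
        return NULL; /* thread-local state was never set up */
    }
    if (size > (size_t)(bump_limit - bump_cursor)) {
        return NULL; /* buffer exhausted; a real VM would enter the GC */
    }
    void *obj = bump_cursor;
    bump_cursor += size;
    return obj;
}
```

Calling mu_new on a thread before mu_thread_setup has run returns NULL in this sketch. A fresh native thread is in exactly that position, because its thread-local variables are zero-initialised and nothing has filled them in.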

Current design

Related spec: https://gitlab.anu.edu.au/mu/mu-spec/blob/master/native-interface.rst#native-functions-calling-mu-functions

The current Mu spec requires implementation-defined functions to be called before native threads not created by Mu (such as POSIX threads) can call any exposed Mu functions.

A Mu bundle can define .expfunc top-level definitions to directly expose pointers to C programs. For example:

.funcdef @fac ... {...}

.expfunc @fac_native = @fac #DEFAULT @I64_0    // expose @fac, default calling convention, use 0 as "cookie".

@fac_native is a raw function pointer. It can be called directly when Mu calls C and C then calls back into Mu. But when a PThread wants to call @fac_native, it needs implementation-defined set-up first.

Possible implementations

  • The concrete micro VM can forbid such calls, and enforce that only Mu threads can execute Mu functions.
  • The concrete micro VM can extend the API with a function to attach or detach PThreads, or threads using other APIs.
  • The concrete micro VM can create Mu-specific thread-local states lazily when entering Mu from native code. Since the only way to enter Mu is via "exposed functions", stubs can be created at those "expfuncs" to lazily check for such states; alternatively, the micro VM can use SIGSEGV to trap when such pointers are zero.
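The second and third options can be sketched together in C. This is an illustration under assumptions, not the API of any existing micro VM: mu_attach_current_thread, mu_detach_current_thread, mu_thread_state and fac_body are hypothetical names, and the int64_t signature of the exposed function is assumed because the original example elides the signature of @fac. The cookie is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread Mu state; a real micro VM would keep GC and
 * stack-related data here. */
typedef struct mu_thread_state {
    int64_t calls_served; /* illustrative counter only */
} mu_thread_state;

static _Thread_local mu_thread_state *current_state; /* NULL if unattached */
static _Thread_local int attach_count;               /* for illustration */

/* Option 2: an explicit attach function the native thread may call. */
int mu_attach_current_thread(void) {
    if (current_state != NULL) {
        return 0; /* already attached */
    }
    current_state = calloc(1, sizeof *current_state);
    if (current_state == NULL) {
        return -1;
    }
    attach_count++;
    return 0;
}

/* Matching detach function; safe to call on an unattached thread. */
void mu_detach_current_thread(void) {
    free(current_state);
    current_state = NULL;
}

/* Stand-in for the compiled body of @fac. Like Mu code, it assumes the
 * thread-local state already exists. */
static int64_t fac_body(int64_t n) {
    assert(current_state != NULL);
    int64_t result = 1;
    for (int64_t i = 2; i <= n; i++) {
        result *= i;
    }
    return result;
}

/* Option 3: the stub placed behind the exposed pointer. It checks the
 * thread-local state on entry and attaches lazily if needed. */
int64_t fac_native(int64_t n) {
    if (current_state == NULL && mu_attach_current_thread() != 0) {
        abort(); /* could not create the per-thread state */
    }
    current_state->calls_served++;
    return fac_body(n);
}
```

In this sketch the lazy check costs one thread-local load and one branch on every entry from native code, while the explicit attach function moves that cost to the caller. The SIGSEGV variant would drop the branch and rely on a fault handler instead; it is not shown here.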

Each has its own strengths and weaknesses. This is why this interface is still implementation-defined for now. Real-world experience will tell which method is better.

Multiple micro VMs in the same process?

It is rare for one process to run two micro VMs, but it is definitely possible. For example:

  • A C host program provides both Python and Lua as extension languages (real-world applications exist), but both language implementations use the Mu micro VM.
  • The client has some kind of sandbox mechanism and forces some part of the program to run in a separate micro VM.

Related works

JNI Invocation API

Related document: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html#attaching_to_the_vm

The JNI Invocation API provides the AttachCurrentThread function to attach a PThread to a JVM, with the limitation that a native thread cannot be attached to two different JVMs at the same time. JNI also requires that the PThread "should have enough stack space to perform a reasonable amount of work", and notes: "The allocation of stack space per thread is operating system-specific. For example, using pthreads, the stack size can be specified in the pthread_attr_t argument to pthread_create."

From Mu's point of view, the MuCtx structure holds Mu states for the client, so calling API functions in MuCtx does not need any attaching. However, calling "exposed Mu functions" will need special set-up like AttachCurrentThread.

JikesRVM

JikesRVM's GC is designed in such a way that it will work even if the related thread-local data structure is all zero (as initialised by the system). This gracefully avoids the GC-related problem, but it may not be the most general solution.
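The idea of an allocator that tolerates all-zero thread-local state can be sketched in C as follows. This is not JikesRVM's code; it is a minimal illustration with hypothetical names (cursor, limit, alloc, alloc_slow, CHUNK_SIZE), and malloc stands in for whatever the GC would use to hand out chunks. The point is that when cursor and limit are both zero, the ordinary "does it fit" test fails, so the first allocation falls into the slow path without any dedicated "is this thread initialised" check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096u /* hypothetical chunk size */

/* Hypothetical per-thread allocator state. With no initialiser these are
 * zero on every new thread, including threads made by pthread_create. */
static _Thread_local uintptr_t cursor;
static _Thread_local uintptr_t limit;
static _Thread_local int slow_path_calls; /* for illustration only */

/* Slow path: obtain a fresh chunk and allocate from its start. Old chunks
 * are simply abandoned in this sketch; a real GC would track them. */
void *alloc_slow(size_t size) {
    slow_path_calls++;
    if (size > CHUNK_SIZE) {
        return NULL; /* large objects are out of scope for this sketch */
    }
    char *chunk = malloc(CHUNK_SIZE);
    if (chunk == NULL) {
        return NULL;
    }
    cursor = (uintptr_t)chunk + size;
    limit = (uintptr_t)chunk + CHUNK_SIZE;
    return chunk;
}

/* Fast path: the only test is whether the request fits in the remaining
 * space. With cursor == limit == 0 the remaining space is 0, so any
 * non-empty request takes the slow path. */
void *alloc(size_t size) {
    assert(size > 0);
    if (size > limit - cursor) {
        return alloc_slow(size);
    }
    void *obj = (void *)cursor;
    cursor += size;
    return obj;
}
```

The first call to alloc on any thread enters alloc_slow once, and later calls bump the cursor within the same chunk. As the issue notes, this handles the GC-related state, but other per-thread state may not have a natural all-zero form.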

.NET framework

Related documents: https://msdn.microsoft.com/en-us/library/74169f59(v=vs.110).aspx

VM-related thread-local states are created lazily when an unmanaged thread enters the managed runtime.
