Process isolation

General discussions and questions about development of code with MicroPython that is not hardware specific.
Target audience: MicroPython Users.
jockm
Posts: 13
Joined: Tue Aug 21, 2018 9:46 pm

Re: Process isolation

Post by jockm » Sun Aug 26, 2018 9:24 pm

jickster wrote:
Sat Aug 25, 2018 11:24 pm
Why isn’t it just a small refactor?

Everywhere the macros MP_STATE_* are used is what you’d have to modify.
Not so much. Take a look at how memory is allocated and you will see it assumes a single memory pool which isn't desirable for multiple instances in an embedded application.

Imagine you have a system instance and an applet instance, you wouldn't want the applet instance to be able to consume all memory and starve the system instance. You would also want the supervising task to be able to kill an instance and not leave memory heavily fragmented.

Even if you have a single instance in an embedded application you may want a memory pool of a fixed size to give to uPy. So this means all of the malloc/free/etc calls have to be potentially overridden and wrapped in MP_STATE_*.

I haven't done a full scope of work, but I know that wasn't the only issue I saw. Not a simple refactor as I say.
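
To make the per-instance pool idea concrete, here is a minimal sketch in Python. It is a model of the concept, not MicroPython's actual allocator. Each instance owns a fixed-size pool, so an applet that runs out of memory fails on its own without starving the system instance, and a supervisor can release the whole pool in one step when it kills the instance. `InstancePool`, `alloc` and `reset` are hypothetical names made up for this example.

```python
class InstancePool:
    """Toy fixed-size pool owned by one interpreter instance (hypothetical)."""

    def __init__(self, size):
        self.buf = bytearray(size)   # backing store reserved up front
        self.top = 0                 # bump pointer: next free offset

    def alloc(self, n):
        """Return a writable view of n bytes, or raise if the pool is exhausted."""
        if n < 0 or self.top + n > len(self.buf):
            raise MemoryError("instance pool exhausted")
        view = memoryview(self.buf)[self.top:self.top + n]
        self.top += n
        return view

    def reset(self):
        """Release everything at once, e.g. when a supervisor kills the instance."""
        self.top = 0


system_pool = InstancePool(1024)
applet_pool = InstancePool(256)

system_block = system_pool.alloc(512)
applet_pool.alloc(200)
try:
    applet_pool.alloc(100)          # exceeds the applet's own 256-byte budget
    applet_starved = False
except MemoryError:
    applet_starved = True

applet_pool.reset()                 # the whole applet pool is reusable again
```

The applet's failed allocation never touches `system_pool`, and resetting the applet pool leaves no fragments behind because nothing in it is shared.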
@pythoncoder mentioned another issue: scripts that allow direct access to physical resources - peripherals, memoryview of RAM - are going to have to be rewritten to use synchronization mechanisms to multiplex. Sounds very messy.

Even if code is refactored to pass in CONTEXT object, I guarantee you peripheral modules will NOT be rewritten (anytime soon) to support multiplexing. An entire new abstraction layer would have to be written and peripheral code updated to support multiplexing access to peripherals.
I am going to argue this doesn't need to be done, or can be handled in a minimal way, in the same way most RTOSs don't worry about this.

At most you have a xx_in_use mutex and fail if SPI, I2C, etc are in use at the time of the call. However even this may not be needed for many/most of the use cases of multiple interpreters.
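
A rough sketch of the "in use" flag approach, written in Python for brevity. It assumes a hypothetical `Peripheral` wrapper and `BusInUse` exception (neither exists in MicroPython) and uses a non-blocking lock so that a call fails immediately when the bus is already claimed, instead of waiting.

```python
import threading


class BusInUse(Exception):
    """Raised when a peripheral is already claimed by another instance."""


class Peripheral:
    """Hypothetical wrapper guarding one bus (SPI, I2C, ...) with an in-use lock."""

    def __init__(self, name):
        self.name = name
        self._in_use = threading.Lock()

    def transfer(self, data):
        # Non-blocking acquire: fail immediately rather than wait for the bus.
        if not self._in_use.acquire(blocking=False):
            raise BusInUse(self.name + " is in use")
        try:
            return bytes(data)       # placeholder for the real bus transaction
        finally:
            self._in_use.release()


spi = Peripheral("SPI1")
first = spi.transfer(b"\x01\x02")

# Simulate another instance holding the bus while a second caller tries to use it.
spi._in_use.acquire()
try:
    spi.transfer(b"\x03")
    rejected = False
except BusInUse:
    rejected = True
finally:
    spi._in_use.release()
```

This only prevents two callers from overlapping on one bus. It does not multiplex access, which matches the "fail if in use" behaviour described above.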

In my own case the applet interpreters won't have access to any hardware IO. They are applets that interact with the hardware/outside world through a provided API. I am using uPy as an extension language, one that is lean enough to be used on a microcontroller.
Can you run useful scripts without hw resource synchronization? Doubtful. Even networking requires synchronization: the ip stack cannot run in both of the apps because only one thing can control the physical interface at once.

Without synchronization, one app must be the master - and running the networking - and the others the slaves. The master downloads the `.py` files and kicks off new isolated instances then ???somehow??? detect when the slaves are done and copy the results.
This still doesn’t synchronize other hw resources like SPI: if two or more scripts use same bus simultaneously you’re still screwed.

I think you are thinking too much about how uPy is being used right now, and not how it could be used if you could run multiple interpreters.

Also I don't anticipate one instance spawning another instance, but that the master C/C++ application would be doing that.

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Process isolation

Post by jickster » Mon Aug 27, 2018 4:41 am

If you want memory isolation, you need separate heaps without question, case closed on that topic.


jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: Process isolation

Post by jickster » Mon Aug 27, 2018 4:58 am

jockm wrote:
jickster wrote:
Sat Aug 25, 2018 11:24 pm
Why isn’t it just a small refactor?

Everywhere the macros MP_STATE_* are used is what you’d have to modify.
Not so much. Take a look at how memory is allocated and you will see it assumes a single memory pool which isn't desirable for multiple instances in an embedded application.

Imagine you have a system instance and an applet instance, you wouldn't want the applet instance to be able to consume all memory and starve the system instance. You would also want the supervising task to be able to kill an instance and not leave memory heavily fragmented.

Even if you have a single instance in an embedded application you may want a memory pool of a fixed size to give to uPy. So this means all of the malloc/free/etc calls have to be potentially overridden and wrapped in MP_STATE_*.

I haven't done a full scope of work, but I know that wasn't the only issue I saw. Not a simple refactor as I say.

Obviously the memory allocation has to be modified to take into account the CONTEXT object.

Besides that and everywhere MP_STATE* macros are used, you also have to add code to vm.c to switch the execution between the applets.

Things would get complicated if you have critical sections within a .py to guard against other applets using shared resources. I don’t know how you’d handle that.




jockm
Posts: 13
Joined: Tue Aug 21, 2018 9:46 pm

Re: Process isolation

Post by jockm » Mon Aug 27, 2018 5:10 am

jickster wrote:
Mon Aug 27, 2018 4:41 am
If you want memory isolation, you need separate heaps without question, case closed on that topic.
Well heap is a specific term. If you don't have heavyweight processes you only have one heap. What is common in embedded systems is to have memory pools that are allocated within the heap and then provide functions similar to malloc/free/realloc.

uPy's GC code is dangerously close to being this

jockm
Posts: 13
Joined: Tue Aug 21, 2018 9:46 pm

Re: Process isolation

Post by jockm » Mon Aug 27, 2018 5:14 am

jickster wrote:
Mon Aug 27, 2018 4:58 am

Besides that and everywhere MP_STATE* macros are used, you also have to add code to vm.c to switch the execution between the applets.
I see exactly no need for this. Please explain why you think it would be required.

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: Process isolation

Post by jickster » Mon Aug 27, 2018 5:23 am

jockm wrote:
jickster wrote:
Mon Aug 27, 2018 4:41 am
If you want memory isolation, you need separate heaps without question, case closed on that topic.
Well heap is a specific term. If you don't have heavyweight processes you only have one heap. What is common in embedded systems is to have memory pools that are allocated within the heap and then provide functions similar to malloc/free/realloc.

uPy's GC code is dangerously close to being this
Each VM has a pool of memory associated which is commonly called the heap.
Objects allocated within a .py are allocated from this heap. If you write C-based py modules and use malloc/free/realloc functions - and it’s impossible to NOT use those functions at some point - the same heap is used.

If you do not want one applet to hog all memory, you have to create a pool aka heap for each VM.

What do you mean dangerously?



jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: Process isolation

Post by jickster » Mon Aug 27, 2018 5:26 am

jockm wrote:
jickster wrote:
Mon Aug 27, 2018 4:58 am

Besides that and everywhere MP_STATE* macros are used, you also have to add code to vm.c to switch the execution between the applets.
I see exactly no need for this. Please explain why you think it would be required.

If you have multiple VMs, you want them to execute simultaneously from the perspective of each .py.

If you don’t have an OS to do that, it must be done in the uPy code somehow: vm.c must switch between applets otherwise one would hog 100% CPU until (or if) it ever finishes.
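
The switching described here can be sketched with Python generators. This is only a model: the generators give up control voluntarily at each `yield`, whereas a change inside vm.c would presumably force a switch after some amount of work. `applet` and `run_round_robin` are hypothetical names for this example.

```python
from collections import deque


def applet(name, steps, log):
    """Hypothetical applet: each yield marks the end of one time slice."""
    for i in range(steps):
        log.append((name, i))
        yield


def run_round_robin(applets):
    """Give each applet one slice in turn until all have finished."""
    ready = deque(applets)
    while ready:
        current = ready.popleft()
        try:
            next(current)            # run one slice
        except StopIteration:
            continue                 # applet finished, drop it
        ready.append(current)        # not finished, back of the queue


log = []
run_round_robin([applet("a", 2, log), applet("b", 3, log)])
```

Neither applet can hold the CPU for more than one slice, so `log` ends up interleaved between "a" and "b" until the shorter one finishes.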



pythoncoder
Posts: 5956
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Process isolation

Post by pythoncoder » Mon Aug 27, 2018 5:41 am

I wonder about the practicality of implementing this. I should stress I'm just a user of MicroPython and can't answer for the maintainers. So my views on their possible response are pure guesswork.

The principal purpose of MicroPython is to run as a single instance on bare metal. The number of users wanting to run multiple instances is small (but > 1). A solution is likely to be partial because of the issue of conflict over physical resources. Special coding is needed to achieve process isolation in terms of heap access.

It seems unlikely that the maintainers will find time to implement this. Nor are they likely to accept changes to MicroPython internals to provide this rather specialist functionality if it has a significant impact on code size or performance on bare-metal targets. My guess is that the way forward is for users who need this to collaborate to agree a set of requirements. Then submit an actual, tested PR with figures for any impact on bare-metal performance.

If this is rejected you may need to consider a fork.
Peter Hinch
Index to my micropython libraries.

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Process isolation

Post by jickster » Mon Aug 27, 2018 5:47 am

Changing key functions that reference the VM state so they take a CONTEXT* is not a big deal: it’ll increase code size by a small amount and runtime by the time it takes to look up the CONTEXT*.
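
As an illustration of the context-passing idea, here is a small Python sketch. The real change would be in C, replacing state reached through the MP_STATE_* macros with a pointer argument. The `Context` class and the `ctx_alloc` and `ctx_store_global` functions are hypothetical and exist only to show per-instance state being passed in explicitly.

```python
class Context:
    """Hypothetical per-instance state that would otherwise live in globals."""

    def __init__(self, name, heap_size):
        self.name = name
        self.heap_size = heap_size
        self.heap_used = 0
        self.globals = {}


def ctx_alloc(ctx, n):
    """Account for n bytes against the heap of the given context only."""
    if ctx.heap_used + n > ctx.heap_size:
        raise MemoryError(ctx.name + ": heap exhausted")
    ctx.heap_used += n


def ctx_store_global(ctx, key, value):
    """Store a global in this instance's namespace, not a shared one."""
    ctx.globals[key] = value


system = Context("system", 4096)
applet = Context("applet", 512)

ctx_alloc(system, 1000)
ctx_store_global(applet, "x", 1)
```

Every function that touches interpreter state takes the context as its first argument, so two instances never see each other's heap accounting or globals.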

The real problem comes with multitasking.
If there’s no OS and you want the applets to run in parallel, you’re gonna have to switch CONTEXT* every so often round-robin style.

Are these changes going to have a big impact?
To me it appears no.


The bigger reason why this’ll get rejected is the still unjustified requirement for spawning to occur at the C-level. This means more infrastructure will need to be added to manage the applets with one C-API.

“BUT HOW CAN YOU SPAWN .py within a .py AND have separate heap”

Scoped-allocation. @pythoncoder you’ll remember the discussion we had with Damien on github. This here is another good use-case.

Scoped-allocation will give you isolation but done with .py language. The solution to be able to return stuff across heaps is obvious: if you want to return an object it must have a __deep_copy__ method implemented similar to how if you want to finalize an object it must have __del__ method implemented. Deep copy for data structures can be rolled out incrementally.
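
A minimal sketch of that opt-in copy rule, assuming a hypothetical `__deep_copy__` method as named above (this is not the standard library's `__deepcopy__` hook) and a made-up helper `return_across_heaps`. In plain Python there is only one heap, so the sketch shows only the protocol: objects without the method are refused, and objects with it are copied so nothing is shared.

```python
class Reading:
    """Example result type that opts in to crossing a heap boundary."""

    def __init__(self, samples):
        self.samples = samples

    def __deep_copy__(self):
        # Copy the list so the caller shares nothing with the applet's heap.
        return Reading(list(self.samples))


def return_across_heaps(obj):
    """Copy obj out of an applet; refuse objects that do not support it."""
    copier = getattr(type(obj), "__deep_copy__", None)
    if copier is None:
        raise TypeError(type(obj).__name__ + " has no __deep_copy__")
    return copier(obj)


inner = Reading([1, 2, 3])
outer = return_across_heaps(inner)
inner.samples.append(4)             # later changes inside the applet stay there

try:
    return_across_heaps(object())
    refused = False
except TypeError:
    refused = True
```

Because support is per type, it can be added one data structure at a time, which is the incremental rollout suggested above.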


pythoncoder
Posts: 5956
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Process isolation

Post by pythoncoder » Mon Aug 27, 2018 6:23 am

Damien's excellent plan for scoped allocations was to allocate on the C stack. Would that be compatible with context switching?
