compiling C modules into loadable shared objects

v923z · Post by **v923z** » Mon Sep 02, 2019 5:40 pm

Hi all,

I would like to raise a quite general question, most probably for the maintainers of the micropython code base.

On linux, python C modules can be compiled into shared objects, resulting in a single .so file that can then be imported. What would it take to compile micropython C modules into shared objects, and equip the interpreter with the option of loading such modules from anywhere in the file system?

My rationale is the following.

1. With such a facility, there could be a community-maintained, community-curated list of all kinds of modules, which could be employed without the hassle of the extra compilation and upload step. I believe this would be a boost to the general accessibility of the whole project. (I, for one, would definitely try out more modules if I didn't have to compile, and this would also make micropython field-serviceable. I understand that I don't have to compile python modules either, but most of them consist of multiple python files, scattered according to the liking of the maintainer. .so files, on the other hand, are single objects.)

2. having the modules detached from the firmware would have the additional benefit that one would no longer be limited by the size of the flash: the shared objects could be stored on and loaded from the SD card.

On linux (or any other OS for that matter), during the compilation of a C module, a lot of effort goes into making sure that all the required libraries are aligned and dependencies are resolved, and this is why you usually have to compile the code for your particular computer. micropython would not suffer from this issue, because there is only one version, so I believe the maintenance costs are moderate. (One would still have to compile for the various ports, but that is OK, and unavoidable, no matter what.)

Is this a good idea at all? If it isn't, what are the tripping points? If it is, how should one set out to implement it? Are the hurdles insurmountable?

Cheers,

Zoltán

stijn · Post by **stijn** » Mon Sep 02, 2019 5:55 pm

There's https://github.com/micropython/micropython/pull/1627 which I think is similar to what you're asking about, and there's my take on it, https://github.com/stinos/micropython/tree/windows-pyd, which is roughly like what CPython does.

which could be employed without the hassle of the extra compilation and upload step

I'm not 100% sure that a .so built on one machine with one specific version of gcc and glibc and whatever dependencies it has, automatically works on another machine?

.so files, on the other hand, are single objects

It's not uncommon to have a .so file doing lower level stuff and then some additional .py files for the higher level machinery; in fact that can be pretty handy to not have to do things in C or C++ when Python is better suited for it. But that's just to illustrate that the 'single .so file' principle has its limits as well.

OutoftheBOTS_ · Post by **OutoftheBOTS_** » Mon Sep 02, 2019 8:44 pm

I do believe they are working on exactly this atm. They are working on compiling C code into machine instructions that MP can then load and execute.

A problem to consider with this is that, unlike on your PC, where everyone is running the same architecture with the same machine instructions, with MP you wouldn't be able to compile a C module using the ARM compiler and then call it from the ESP32 port of MP.

See here where Damien demos the progress so far https://www.youtube.com/watch?v=GDXsK1b ... XfHxj5gmLE

jimmo · Post by **jimmo** » Mon Sep 02, 2019 10:36 pm

Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.

It's working for STM32 and Unix. ESP32 is not too far off but there are some additional complexities with the Xtensa arch. (There's also currently no native emitter for ESP32)

It's quite a complicated feature. As you've described it's basically the same functionality as a .so on Linux or a .dll on Windows, so all the same things with a dynamic loader -- code relocation etc. Added to that the MicroPython-specific complexity like QSTRs and so forth.
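As a rough illustration of what "code relocation" means here, the toy sketch below patches 32-bit address slots in a code image once the load address is known. The function name `relocate`, the relocation-list format and the little-endian 32-bit slots are all made up for this example; this is not MicroPython's actual loader.

```python
import struct


def relocate(code, reloc_offsets, load_address):
    """Return a copy of `code` with each listed 32-bit slot rebased.

    Hypothetical model: every offset in `reloc_offsets` points at a
    little-endian 32-bit value stored relative to address 0, which
    must be shifted by the address the code was actually loaded at.
    """
    image = bytearray(code)
    for offset in reloc_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        rebased = (value + load_address) & 0xFFFFFFFF
        struct.pack_into("<I", image, offset, rebased)
    return bytes(image)


# Two words: the first is an address to rebase, the second is plain data.
blob = struct.pack("<II", 0x10, 0xDEADBEEF)
patched = relocate(blob, [0], 0x20000000)
```

Only the slot listed in `reloc_offsets` changes; everything else in the image is left alone, which is why the loader needs that list alongside the machine code.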

Fundamentally though the way this works is the same as the existing .mpy cross compiler (which takes Python code and spits out a "compiled" .mpy file). Because the Python code can already contain @native decorated functions, the .mpy format already had support for native code. The main thing this feature adds is a way of taking compiler-generated native code and packaging it up. Also because it compiles independently of the main firmware build, it needs a way to find symbols from the main firmware.
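For anyone who hasn't seen it, this is roughly what a `@micropython.native` function looks like in Python source. The shim class at the top is only an assumption added so the same file also runs under CPython, where the `micropython` module doesn't exist; on a MicroPython build with a native emitter the decorator is handled by the compiler itself.

```python
try:
    import micropython  # present on MicroPython builds
except ImportError:
    # Hypothetical stand-in for CPython: the decorator does nothing.
    class micropython:
        @staticmethod
        def native(func):
            return func


@micropython.native
def sum_squares(n):
    # Plain Python; on MicroPython this body is emitted as machine code.
    total = 0
    for i in range(n):
        total += i * i
    return total


result = sum_squares(4)
```

When a file like this goes through mpy-cross, the machine code for `sum_squares` ends up inside the .mpy file, which is the part of the format the new feature builds on.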

OutoftheBOTS_ wrote: ↑
Mon Sep 02, 2019 8:44 pm
A problem to consider with this is that, unlike on your PC, where everyone is running the same architecture with the same machine instructions, with MP you wouldn't be able to compile a C module using the ARM compiler and then call it from the ESP32 port of MP.

This is already a problem for mpy-cross when you use @native. So .mpy files are already sometimes arch specific.
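A small sketch of how you could check which architecture a given .mpy file targets. The header layout assumed here (magic byte `M`, a version byte, then a byte whose upper six bits hold the native architecture index) and the architecture table are my reading of the .mpy format description; both differ between .mpy versions, so treat them as assumptions and check them against the firmware you actually run.

```python
# Assumed architecture table; the index is the value stored in the header.
ARCH_NAMES = [
    "none", "x86", "x64", "armv6", "armv6m", "armv7m",
    "armv7em", "armv7emsp", "armv7emdp", "xtensa", "xtensawin",
]


def mpy_native_arch(header):
    """Return (version, arch_name) from the first bytes of an .mpy file."""
    if len(header) < 4 or header[0:1] != b"M":
        raise ValueError("not an .mpy header")
    version = header[1]
    arch_index = header[2] >> 2  # lower two bits carry other flags
    if arch_index < len(ARCH_NAMES):
        return version, ARCH_NAMES[arch_index]
    return version, "unknown"


# Hand-built example header: version 6, architecture index 5.
example = bytes([0x4D, 6, 5 << 2, 31])
info = mpy_native_arch(example)
```

A file containing only bytecode would report `"none"` and load on any port with a matching .mpy version; anything else has to match the target's architecture.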

v923z wrote: ↑
Mon Sep 02, 2019 5:40 pm
2. having the modules detached from the firmware would have the additional benefit that one would no longer be limited by the size of the flash: the shared objects could be stored on and loaded from the SD card.

Unfortunately this isn't quite as good as you'd hope. .mpy files have to execute from RAM, because you can't memory map an SD card (on most microcontrollers at least). So you're limited by available heap.

mattyt · Post by **mattyt** » Tue Sep 03, 2019 12:44 am

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.

I was working on the videos from the meetup - including Damien's - today; expect them up in the next couple of days. I'll post links when they're available!

There's also some additional information about native modules toward the end of my PyCon AU talk on Extending MicroPython.

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
This is already a problem for mpy-cross when you use @native. So .mpy files are already sometimes arch specific.

I think the more interesting problem to solve is of distribution; how to ensure the right native module, built for the correct architecture, is made available...
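One possible shape for this, purely as a sketch: pick the file to fetch from the platform the code is running on. The platform-to-architecture table and the `name-arch.mpy` naming scheme are hypothetical; nothing like this is defined by MicroPython, and a real mapping would need to distinguish boards more carefully.

```python
import sys

# Hypothetical mapping from sys.platform to a native architecture name.
ARCH_BY_PLATFORM = {
    "pyboard": "armv7emsp",
    "esp32": "xtensawin",
    "esp8266": "xtensa",
    "linux": "x64",
}


def native_module_filename(module, platform=None):
    """Build the (made-up) file name of the right native build of `module`."""
    platform = platform or sys.platform
    arch = ARCH_BY_PLATFORM.get(platform)
    if arch is None:
        raise ValueError("no native build known for platform " + repr(platform))
    return "{}-{}.mpy".format(module, arch)


filename = native_module_filename("mymodule", "esp32")
```

An installer could use a name like this to request the matching build from a package index and fall back to a pure-Python version when the lookup fails.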

v923z · Post by **v923z** » Wed Sep 04, 2019 6:43 pm

First of all, thanks to everyone who has chipped in. Really enlightening discussion and comments.

A couple of concrete issues:

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.

It's working for STM32 and Unix. ESP32 is not too far off but there are some additional complexities with the Xtensa arch. (There's also currently no native emitter for ESP32)

I will check out the video you mentioned. Thanks!

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
It's quite a complicated feature. As you've described it's basically the same functionality as a .so on Linux or a .dll on Windows, so all the same things with a dynamic loader -- code relocation etc. Added to that the MicroPython-specific complexity like QSTRs and so forth.

Fundamentally though the way this works is the same as the existing .mpy cross compiler (which takes Python code and spits out a "compiled" .mpy file). Because the Python code can already contain @native decorated functions, the .mpy format already had support for native code. The main thing this feature adds is a way of taking compiler-generated native code and packaging it up. Also because it compiles independently of the main firmware build, it needs a way to find symbols from the main firmware.

Well, I believe this is the most exhaustive answer to my question. I wasn't aware of these technical difficulties, and they definitely put the issue at hand in a different light.

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm

OutoftheBOTS_ wrote: ↑
Mon Sep 02, 2019 8:44 pm
A problem to consider with this is that, unlike on your PC, where everyone is running the same architecture with the same machine instructions, with MP you wouldn't be able to compile a C module using the ARM compiler and then call it from the ESP32 port of MP.
This is already a problem for mpy-cross when you use @native. So .mpy files are already sometimes arch specific.

Yes, this was clear from the beginning. If I am not mistaken, I even pointed this out in the OP.

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm

v923z wrote: ↑
Mon Sep 02, 2019 5:40 pm
2. having the modules detached from the firmware would have the additional benefit that one would no longer be limited by the size of the flash: the shared objects could be stored on and loaded from the SD card.
Unfortunately this isn't quite as good as you'd hope. .mpy files have to execute from RAM, because you can't memory map an SD card (on most microcontrollers at least). So you're limited by available heap.

I think there is a slight misunderstanding here. Probably my fault. Obviously, one can't overcome the limitations of the heap. What I meant is that, if you have a hundred packages, occupying 2 MB in total, then you can still load single modules. After all, the firmware itself is already bigger than the RAM (350 kB versus 190 kB for the pyboard). What I really wanted to say is that you wouldn't have to select your modules at compile time: you could have everything on the card, and then load only the one that you need at run time.
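Loading only the module that is needed, and only while it is needed, can be sketched as below. The helper name `use_module_once` is made up, and the example imports an ordinary standard-library module on the desktop; on a board the same pattern would apply to a module sitting on the SD card, with the heap still being the limit while it is loaded.

```python
import gc
import sys


def use_module_once(name, action):
    """Import `name`, run `action(module)`, then drop the module again."""
    module = __import__(name)
    try:
        return action(module)
    finally:
        # Forget the module so its memory can be reclaimed.
        sys.modules.pop(name, None)
        del module
        gc.collect()


# Desktop demonstration with a small standard-library module.
position = use_module_once("bisect", lambda m: m.bisect_left([1, 3, 5], 3))
```

Whether the memory really comes back depends on nothing else holding references to the module's objects, so this is a pattern to measure rather than a guarantee.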

Also, if you update your code, then you can simply distribute your shared file, and nobody has to re-flash the firmware.

Best,

Zoltán

mattyt · Post by **mattyt** » Wed Sep 04, 2019 11:32 pm

mattyt wrote: ↑
Tue Sep 03, 2019 12:44 am

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.
I was working on the videos from the meetup - including Damien's - today; expect them up in the next couple of days. I'll post links when they're available!

Damien's talk on Native Modules in MicroPython (part 2) is now online. The video quality isn't great - the screen in particular is difficult to read - but hopefully it's good enough to be useful.

sebi · Post by **sebi** » Thu Sep 05, 2019 9:29 am

OutoftheBOTS_ wrote: ↑
Mon Sep 02, 2019 8:44 pm
See here where Damien demos the progress so far https://www.youtube.com/watch?v=GDXsK1b ... XfHxj5gmLE

mattyt wrote: ↑
Wed Sep 04, 2019 11:32 pm
Damien's talk on Native Modules in MicroPython (part 2) is now online.

Such interesting videos! The potential of this new tool is huge.

sebi · Post by **sebi** » Thu Sep 05, 2019 3:51 pm

As a newbie in computer science, there are plenty of terms that I don't understand well. I have tried to google them, but I still lack a deeper understanding. Maybe you experts can give me some hints.

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
MicroPython-specific complexity like QSTRs

What exactly are QSTRs? Are they similar to interned strings? Why are they specific to MicroPython? I thought CPython used the same approach.

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
it needs a way to find symbols from the main firmware.

What are symbols from the main firmware? Don't they have an identifier?

jimmo wrote: ↑
Mon Sep 02, 2019 10:36 pm
.mpy files have to execute from RAM, because you can't memory map an SD card

What does "memory map" mean? Can't we use the flash memory instead? I thought the flash memory was memory mapped.

If I understand correctly, freezing a module allows that module to use less RAM at runtime.
Isn't freezing similar to generating an .mpy file and copying that .mpy file onto the flash memory?

Thx

v923z · Post by **v923z** » Thu Sep 05, 2019 5:30 pm

mattyt wrote: ↑
Wed Sep 04, 2019 11:32 pm
Damien's talk on Native Modules in MicroPython (part 2) is now online. The video quality isn't great - the screen in particular is difficult to read - but hopefully it's good enough to be useful.

Thanks for your efforts!

Zoltán

MicroPython Forum (Archive)
