compiling C modules into loadable shared objects
Hi all,
I would like to raise a quite general question, most probably for the maintainers of the micropython code base.
On linux, python C modules can be compiled into shared objects, resulting in a single .so file that can then be imported. What would it take to compile micropython C modules into shared objects, and equip the interpreter with the option of loading such modules from anywhere in the file system?
My rationale is the following.
1. with such a facility, there could be a community-maintained, community-curated list of all kinds of modules, which could be employed without the hassle of the extra compilation and upload step. I believe this would be a boost in the general accessibility of the whole project. (I, for one, would definitely try out more modules if I didn't have to compile, and this would also make micropython field-serviceable. I understand that I don't have to compile python modules either, but most of them consist of multiple python files, scattered according to the liking of the maintainer. .so files, on the other hand, are single objects.)
2. having the modules detached from the firmware would have the additional benefit that one would no longer be limited by the size of the flash: the shared objects could be stored on and loaded from the SD card.
On linux (or any other OS for that matter), during the compilation of a C module, a lot of effort goes into making sure that all the required libraries are aligned and dependencies are resolved, and this is why you usually have to compile the code for your particular computer.
micropython would not suffer from this issue, because there is only one version, so I believe the maintenance costs are moderate. (One would still have to compile for the various ports, but that is OK, and unavoidable no matter what.)
Is this a good idea at all? If it isn't, what are the tripping points? If it is, how should one set out to implement it? Are the hurdles insurmountable?
Cheers,
Zoltán
Re: compiling C modules into loadable shared objects
There's https://github.com/micropython/micropython/pull/1627 which I think is similar to what you're asking about, and there's my take on it, https://github.com/stinos/micropython/tree/windows-pyd, which is roughly like what CPython does.

> which could be employed without the hassle of the extra compilation and upload step

I'm not 100% sure that a .so built on one machine, with one specific version of gcc and glibc and whatever dependencies it has, automatically works on another machine.

> .so files, on the other hand, are single objects

It's not uncommon to have a .so file doing lower level stuff and then some additional .py files for the higher level machinery; in fact that can be pretty handy to not have to do things in C or C++ when Python is better suited for it. But that's just to illustrate that the 'single .so file' principle has its limits as well.
Re: compiling C modules into loadable shared objects
I believe they are working on exactly this at the moment: being able to compile C code into machine instructions that MP can then load and execute.
A problem to consider is that, unlike on PCs, where everyone is running the same architecture with the same machine instructions, with MP you wouldn't be able to compile a C module using the ARM compiler and then call it from the ESP32 port of MP.
See here where Damien demos the progress so far: https://www.youtube.com/watch?v=GDXsK1b ... XfHxj5gmLE
Re: compiling C modules into loadable shared objects
Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.
It's working for STM32 and Unix. ESP32 is not too far off but there are some additional complexities with the Xtensa arch. (There's also currently no native emitter for ESP32)
It's quite a complicated feature. As you've described it's basically the same functionality as a .so on Linux or a .dll on Windows, so all the same things with a dynamic loader -- code relocation etc. Added to that the MicroPython-specific complexity like QSTRs and so forth.
Fundamentally though the way this works is the same as the existing .mpy cross compiler (which takes Python code and spits out a "compiled" .mpy file). Because the Python code can already contain @native decorated functions, the .mpy format already had support for native code. The main thing this feature adds is a way of taking compiler-generated native code and packaging it up. Also because it compiles independently of the main firmware build, it needs a way to find symbols from the main firmware.
OutoftheBOTS_ wrote: ↑Mon Sep 02, 2019 8:44 pm
> A problem to consider with this is that unlike your PC where everyone is running the same architecture with the same machine instructions. With MP you wouldn't be able to compile a C module using the ARM compiler then call it from the ESP32 port of MP.

This is already a problem for mpy-cross when you use @native. So .mpy files are already sometimes arch specific.
Unfortunately this isn't quite as good as you'd hope. .mpy files have to execute from RAM, because you can't memory map an SD card (on most microcontrollers at least). So you're limited by available heap.
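To make the @native point above concrete, here is a minimal sketch of a Python function marked for native code generation. The `micropython.native` decorator is the one referred to in the post; the fallback decorator and the function `sum_of_squares` are assumptions added only so the same file also runs on CPython, where no native emitter exists.

```python
try:
    import micropython  # available when running under MicroPython
    native = micropython.native
except ImportError:
    # Assumption for illustration: on CPython there is no native emitter,
    # so fall back to a decorator that returns the function unchanged.
    def native(func):
        return func


@native
def sum_of_squares(n):
    # Plain integer arithmetic; under MicroPython the native emitter
    # compiles this body to machine code for the target architecture.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))
```

Once a file like this is cross-compiled for a given target, the resulting .mpy contains machine code for that target only, which is why such files are architecture specific.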
Re: compiling C modules into loadable shared objects
jimmo wrote: ↑Mon Sep 02, 2019 10:36 pm
> Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.

I was working on the videos from the meetup - including Damien's - today; expect them up in the next couple of days. I'll post links when they're available!
There's also some additional information about native modules toward the end of my PyCon AU talk on Extending MicroPython.
I think the more interesting problem to solve is of distribution; how to ensure the right native module, built for the correct architecture, is made available...
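One way to picture the distribution problem raised above is a lookup that maps a module name and a target architecture to a prebuilt file. This is only a sketch: the naming scheme `<module>-<arch>.mpy`, the function `pick_native_build`, and the file names in `index` are all made up for illustration and do not describe any existing package index.

```python
def pick_native_build(module_name, arch, available_files):
    """Return the file built for `arch`, or None if no such build exists.

    Assumes a hypothetical naming scheme "<module>-<arch>.mpy".
    """
    wanted = "{}-{}.mpy".format(module_name, arch)
    return wanted if wanted in available_files else None


# Hypothetical index listing; these file names are invented.
index = ["fastmath-armv7m.mpy", "fastmath-x64.mpy"]

print(pick_native_build("fastmath", "armv7m", index))
print(pick_native_build("fastmath", "xtensa", index))
```

Returning `None` for a missing build keeps the failure explicit, so an installer could report that no native build exists for the board instead of loading code for the wrong architecture.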
Re: compiling C modules into loadable shared objects
First of all, thanks to everyone who has chipped in. Really enlightening discussion and comments.
A couple of concrete issues:
jimmo wrote: ↑Mon Sep 02, 2019 10:36 pm
> Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.
> It's working for STM32 and Unix. ESP32 is not too far off but there are some additional complexities with the Xtensa arch. (There's also currently no native emitter for ESP32)

I will check out the video you mentioned. Thanks!

jimmo wrote: ↑Mon Sep 02, 2019 10:36 pm
> It's quite a complicated feature. As you've described it's basically the same functionality as a .so on Linux or a .dll on Windows, so all the same things with a dynamic loader -- code relocation etc. Added to that the MicroPython-specific complexity like QSTRs and so forth.
> Fundamentally though the way this works is the same as the existing .mpy cross compiler (which takes Python code and spits out a "compiled" .mpy file). Because the Python code can already contain @native decorated functions, the .mpy format already had support for native code. The main thing this feature adds is a way of taking compiler-generated native code and packaging it up. Also because it compiles independently of the main firmware build, it needs a way to find symbols from the main firmware.

Well, I believe this is the most exhaustive answer to my question. I wasn't aware of these technical difficulties, and they definitely put the issue at hand in a different light.

jimmo wrote: ↑Mon Sep 02, 2019 10:36 pm
> OutoftheBOTS_ wrote: ↑Mon Sep 02, 2019 8:44 pm
> > A problem to consider with this is that unlike your PC where everyone is running the same architecture with the same machine instructions. With MP you wouldn't be able to compile a C module using the ARM compiler then call it from the ESP32 port of MP.
> This is already a problem for mpy-cross when you use @native. So .mpy files are already sometimes arch specific.

Yes, this was clear from the beginning. If I am not mistaken, I even pointed this out in the OP.

OutoftheBOTS_ wrote: ↑Mon Sep 02, 2019 8:44 pm
> Unfortunately this isn't quite as good as you'd hope. .mpy files have to execute from RAM, because you can't memory map an SD card (on most microcontrollers at least). So you're limited by available heap.

I think there is a slight misunderstanding here. Probably my fault. Obviously, one can't overcome the limitations of the heap. What I meant is that, if you have a hundred packages occupying 2 MB in total, then you can still load single modules. After all, the firmware itself is already bigger than the RAM (350 kB versus 190 kB for the pyboard). What I really wanted to say is that you wouldn't have to select your modules at compile time: you could have everything on the file system, and then load only the one that you need at run time.
Also, if you update your code, then you can simply distribute your shared file, and one doesn't have to burn the firmware.
Best,
Zoltán
Re: compiling C modules into loadable shared objects
mattyt wrote: ↑Tue Sep 03, 2019 12:44 am
> jimmo wrote: ↑Mon Sep 02, 2019 10:36 pm
> > Yes - this feature is being actively worked on. Damien actually gave a demo of this last week at the Melbourne MicroPython Meetup (an updated version of the video linked earlier in this thread). I don't think the video is online yet though. It's much nicer to use now than those earlier demos.
> I was working on the videos from the meetup - including Damien's - today; expect them up in the next couple of days. I'll post links when they're available!

Damien's talk on Native Modules in MicroPython (part 2) is now online. The video quality isn't great - the screen in particular is difficult to read - but hopefully it's good enough to be useful.
Re: compiling C modules into loadable shared objects
OutoftheBOTS_ wrote: ↑Mon Sep 02, 2019 8:44 pm
> See here where Damien is demo the progress so far https://www.youtube.com/watch?v=GDXsK1b ... XfHxj5gmLE

mattyt wrote: ↑Wed Sep 04, 2019 11:32 pm
> Damien's talk on Native Modules in MicroPython (part 2) is now online.

Such interesting videos! The potential of this new tool is huge.
Re: compiling C modules into loadable shared objects
As a newbie in computer science, there are plenty of terms that I don't understand well. I have tried to google them, but I still lack a deeper understanding. Maybe you experts can give me some hints.
What exactly are Qstrings? Are they similar to interned strings? Why are they specific to MicroPython? I thought CPython used the same approach.
What are symbols from the main firmware? Don't they have an identifier?
What is a memory map? Can't we use the flash memory instead? I thought the flash memory was memory mapped.
If I understand well, freezing a module allows that module to use less RAM at runtime.
Isn't freezing similar to generating a mpy file and copying that mpy file onto the flash memory?
Thx
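As background to the Qstring question above: QSTRs are MicroPython's interned strings, so the general idea can be shown with a toy intern table. The class `InternPool` below is a hypothetical illustration of interning in general (each distinct string stored once and referred to by a small integer); it is not MicroPython's actual implementation.

```python
class InternPool:
    """Toy intern table: one stored copy and one integer id per distinct string."""

    def __init__(self):
        self._ids = {}       # string -> id
        self._strings = []   # id -> string

    def intern(self, text):
        # Reuse the existing id if the string has been seen before.
        if text not in self._ids:
            self._ids[text] = len(self._strings)
            self._strings.append(text)
        return self._ids[text]

    def lookup(self, ident):
        return self._strings[ident]


pool = InternPool()
first = pool.intern("append")
second = pool.intern("append")   # same string, so the same id comes back
other = pool.intern("extend")
print(first, second, other, pool.lookup(other))
```

Comparing two interned names then reduces to comparing two integers, and code can refer to a name by its id instead of carrying the full string around.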
Re: compiling C modules into loadable shared objects
mattyt wrote: ↑Wed Sep 04, 2019 11:32 pm
> Damien's talk on Native Modules in MicroPython (part 2) is now online. The video quality isn't great - the screen in particular is difficult to read - but hopefully it's good enough to be useful.

Thanks for your efforts!
Zoltán