compiling native modules (ulab)

v923z · Post by **v923z** » Thu Jul 30, 2020 4:27 pm

Hi all,

I have recently seen that it is possible to turn .c code into .mpy files, so that the output can be loaded as a standard module without having to re-compile the firmware. This sounds fantastic, except that there are limitations, and even worse, I am not sure I could decipher all the limitations properly.

In particular, I would consider turning ulab into a native module, but I am not sure that is even possible. E.g., does the sentence

"So, if your C code has writable data, make sure the data is defined globally, without an initialiser, and only written to within functions." from the micropython manual mean that user-defined structures containing writable data, like an ndarray, would not qualify? (An ndarray, since it supports the buffer protocol, could be written to by a number of things, not only by the user.)

Another question concerns performance: does the native code run at, well, native speed, or should one expect slow-down, or RAM penalties?

Can one load sub-modules of a native module, or will everything be loaded as a single chunk? ulab is split into 8 sub-modules, which can be imported separately. Would something like this be possible?

If compiling a complex library, like ulab, into a native module is feasible at all, I would definitely like to further explore the question.

Thanks,

Zoltán

jimmo · Post by **jimmo** » Fri Jul 31, 2020 12:56 am

v923z wrote: ↑
Thu Jul 30, 2020 4:27 pm
and even worse, I am not sure I could decipher all the limitations properly.

Yes, unfortunately this is quite a complicated feature and getting to the bottom of it requires quite a lot of understanding of how linking and loading and stuff works... On regular operating systems you almost never think about this, but that's just because there's an awful lot more metadata available in a .elf file and the linker/loader can afford to be a lot more sophisticated (and not very "micro").

v923z wrote: ↑
Thu Jul 30, 2020 4:27 pm
"So, if your C code has writable data, make sure the data is defined globally, without an initialiser, and only written to within functions." from the micropython manual mean that user-defined structures containing writable data, like an ndarray, would not qualify? (An ndarray, since it supports the buffer protocol, could be written to by a number of things, not only by the user.)

This paragraph is just providing a rewording of the technical explanation above -- there is no support for .data, only .bss.

There are three main sections in a program's binary -- .text (the machine code), .data (the global variables that have compile-time values), .bss (the global variables that have no initialiser).

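int a = 1; // has an initialiser, so this is in .data
int b;     // no initialiser, so this is in .bss

int main(void) {
  b = 2;   // .bss variables get their value at run time, inside a function
  return 0;
}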

You can think of .data as exactly the bytes that need to be put into RAM during loading (i.e. all the pre-initialised values). There'll be a `1` exactly at the address where "a" lives. In contrast, .bss is just a region that isn't initialised, so you are responsible for setting these values yourself.

So what that paragraph is trying to say is: any global variables, even if you explicitly set a value at their definition, will not be initialised. (In other words, all .data is effectively .bss). You are responsible for setting them explicitly in your code.

This isn't generally a huge limitation, because ... you can always set these values in your initialisation function (or any other function) that runs before these values are accessed. It might be a bit tricky if you're using a library written by someone else though.
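As a rough sketch of that pattern in plain C (not built against dynruntime.h; `module_init`, `scale` and `scale_table` are hypothetical names made up for illustration): the table is declared without an initialiser and is only filled in by a function that runs before anything reads it.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_LEN 4

/* No initialiser here, so the table ends up in .bss rather than .data. */
static int32_t scale_table[TABLE_LEN];

/* Hypothetical init hook: must run once before scale() is called.
   In a real native module this work would go into the module's entry point. */
void module_init(void) {
    for (int i = 0; i < TABLE_LEN; i++) {
        scale_table[i] = (int32_t)1 << i;  /* 1, 2, 4, 8 */
    }
}

/* Reads the table; only meaningful after module_init() has run. */
int32_t scale(int32_t x, int idx) {
    assert(idx >= 0 && idx < TABLE_LEN);
    return x * scale_table[idx];
}
```

Writing `static int32_t scale_table[] = {1, 2, 4, 8};` instead would be the usual C idiom, but that is exactly the kind of initialised writable data the paragraph above says you can't rely on.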

v923z wrote: ↑
Thu Jul 30, 2020 4:27 pm
Another question concerns performance: does the native code run at, well, native speed, or should one expect slow-down, or RAM penalties?

I'm not aware of any slow-down. The main problem is just that because it's dynamically loaded, it's using up RAM. (Same argument as loading a regular .mpy from the filesystem vs frozen modules).

In the future, the plan is to make the .mpy loader in MicroPython able to load into a "scratch" flash space, which will avoid this issue (and make all .mpy files a lot more useful, and reduce the need for frozen modules).

v923z wrote: ↑
Thu Jul 30, 2020 4:27 pm
Can one load sub-modules of a native module, or will everything be loaded as a single chunk? ulab is split into 8 sub-modules, which can be imported separately. Would something like this be possible?

They work just like regular .mpy files (and therefore like .py files), so they behave a lot more "normally" (unlike built-in modules which work by different rules).

However, it starts to get a bit complicated if you want the modules to share state, as they don't know about each other: you can't call a function in another module directly (you have to go via the Python runtime). This is for the same reason that dynamic modules can only access the functions from the main firmware that are in dynruntime.h (the linker just doesn't have the functionality required to link up the symbols).
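To make the "no symbols, only a fixed set of entry points" idea a bit more concrete, here is a simplified sketch in plain C. It is not the actual MicroPython structure; `host_fun_table_t`, `host_get_table`, `module_load` and `module_dot2` are all hypothetical names. The module never refers to a host function by symbol, it only calls through a table pointer that is handed to it at load time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of functions the host is willing to expose. */
typedef struct {
    int (*add)(int a, int b);
    int (*mul)(int a, int b);
} host_fun_table_t;

/* Host ("firmware") side: the real implementations and the table. */
static int host_add(int a, int b) { return a + b; }
static int host_mul(int a, int b) { return a * b; }

static const host_fun_table_t host_table = { host_add, host_mul };

const host_fun_table_t *host_get_table(void) { return &host_table; }

/* Module side: no initialiser, set inside a function at load time. */
static const host_fun_table_t *fun_table;

void module_load(const host_fun_table_t *table) {
    assert(table != NULL);
    fun_table = table;
}

/* Everything the module needs from the host goes through the table. */
int module_dot2(int x0, int y0, int x1, int y1) {
    return fun_table->add(fun_table->mul(x0, y0), fun_table->mul(x1, y1));
}
```

Anything that isn't in the table simply can't be reached from the module, which is the same kind of restriction as with two dynamic modules that want to call each other.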

v923z · Post by **v923z** » Fri Jul 31, 2020 2:24 pm

Jim, many thanks for your very exhaustive reply! Sometimes I have the feeling that the forum is a one-man show, and refrain from asking a question.

jimmo wrote: ↑
Fri Jul 31, 2020 12:56 am

v923z wrote: ↑
Thu Jul 30, 2020 4:27 pm
and even worse, I am not sure I could decipher all the limitations properly.
Yes, unfortunately this is quite a complicated feature and getting to the bottom of it requires quite a lot of understanding of how linking and loading and stuff works... On regular operating systems you almost never think about this, but that's just because there's an awful lot more metadata available in a .elf file and the linker/loader can afford to be a lot more sophisticated (and not very "micro").

v923z wrote: ↑
Thu Jul 30, 2020 4:27 pm
"So, if your C code has writable data, make sure the data is defined globally, without an initialiser, and only written to within functions." from the micropython manual mean that user-defined structures containing writable data, like an ndarray, would not qualify? (An ndarray, since it supports the buffer protocol, could be written to by a number of things, not only by the user.)
This paragraph is just providing a rewording of the technical explanation above -- there is no support for .data, only .bss.

I could re-format your explanation here, and add it to the docs. Should I?

jimmo wrote: ↑
Fri Jul 31, 2020 12:56 am
I'm not aware of any slow-down. The main problem is just that because it's dynamically loaded, it's using up RAM. (Same argument as loading a regular .mpy from the filesystem vs frozen modules).

That would probably be a show-stopper. Someone has got ulab running on an ESP8266 with virtually no RAM. In fact, that was when I started to ponder native modules, but if the whole package has to run from RAM, then it is a bit of a predicament.

rcolistete · Post by **rcolistete** » Fri Jul 31, 2020 3:38 pm

.mpy disadvantages:
- longer import time to load the .mpy file from the local file system;
- more RAM usage during and after the import, although many boards have a lot of RAM (Pyboard D, ESP32 PSRAM, MAix BiT with K210, OpenMV M7/H7, Teensy 4.x, etc).

.mpy advantages for native C modules:
- no need to build the firmware;
- only one .mpy file for each architecture listed in the "Native machine code in .mpy files" documentation:
  * armv7emsp for Pyboard v1.1/Lite v1.0/Pyboard D SF2/SF3, independent of the firmware variant (sp/dp, thread, network, etc);
  * armv7emdp for Pyboard D SF6/OpenMV M7 (?), independent of the firmware variant (sp/dp, thread, etc);
  * xtensa for ESP8266, independent of the firmware variant (BOARD=GENERIC/GENERIC_1M/GENERIC_512k);
  * xtensawin for ESP32, independent of the firmware variant (sp/dp, BOARD=GENERIC/GENERIC_SPIRAM/etc).

I think it is worth trying ulab, or a subset of ulab, in .mpy format.

v923z · Post by **v923z** » Sun Aug 02, 2020 6:49 pm

rcolistete wrote: ↑
Fri Jul 31, 2020 3:38 pm
.mpy disadvantages :
- longer import time to load the .mpy file from local file system;
- more RAM usage during and after the import, but many boards have a lot of RAM (Pyboard D, ESP32 PSRAM, MAix BiT with K210, OpenMV M7/H7, Teensy 4.x, etc);

I would like to have something that works on all platforms. You pointed out that you have 30 kB of RAM on the ESP8266. That is just a bit more than the compiled size of ndarray.c. So, you might be able to Fourier transform arrays of length 8, but I guess that is not particularly exciting.

Even on a standard pyboard, a native module of this size could be a stretch. One of the advantages of using ulab instead of python is that the data containers are compact. But the very advantage of compactness is thrown out as soon as you rely on a module that eats up all the RAM. So, at the moment, I am not sold on the idea.

jimmo · Post by **jimmo** » Mon Aug 03, 2020 6:13 am

v923z wrote: ↑
Fri Jul 31, 2020 2:24 pm
That would probably be a show-stopper. Someone has got ulab running on an ESP8266 with virtually no RAM. In fact, that was when I started to ponder native modules, but if the whole package has to run from RAM, then it is a bit of a predicament.

Yeah at this stage (until we get some way to have a "scratch space" in flash for the mpy loader) they're really only designed for very small performance-critical functions.

In the future, with a flash-backed loader, I think it'll be super useful for something like ulab.

v923z wrote: ↑
Fri Jul 31, 2020 2:24 pm
I could re-format your explanation here, and add it to the docs. Should I?

Yes, if you think it'll help people understand this better, please do!

v923z · Post by **v923z** » Mon Aug 03, 2020 5:33 pm

jimmo wrote: ↑
Mon Aug 03, 2020 6:13 am

v923z wrote: ↑
Fri Jul 31, 2020 2:24 pm
That would probably be a show-stopper. Someone has got ulab running on an ESP8266 with virtually no RAM. In fact, that was when I started to ponder native modules, but if the whole package has to run from RAM, then it is a bit of a predicament.
Yeah at this stage (until we get some way to have a "scratch space" in flash for the mpy loader) they're really only designed for very small performance-critical functions.

In the future, with a flash-backed loader, I think it'll be super useful for something like ulab.

OK, I'll keep this in mind, but put this on the back-burner for now.

jimmo wrote: ↑
Mon Aug 03, 2020 6:13 am

v923z wrote: ↑
Fri Jul 31, 2020 2:24 pm
I could re-format your explanation here, and add it to the docs. Should I?
Yes, if you think it'll help people understand this better, please do!

Well, it helped me a great deal, so I will.

nw0428 · Post by **nw0428** » Mon May 16, 2022 5:35 pm

I would love this feature! I am trying to load ulab onto the LEGO SPIKE Prime and would rather not flash it.

MicroPython Forum (Archive)
