ulab, or what you will - numpy on bare metal

Post by v923z » Tue Nov 05, 2019 7:14 am

Hi all,

I am a bit confused by the specified flash size of the D series. According to the store's website, https://store.micropython.org/product/PYBD-SF3-W4F2, they should all have at least 2 MB of flash (external, with execute capability), yet I can't compile the firmware with ulab enabled for the SF2 and SF3. Is the 2 MB extra for the file system only? How much leeway has one for SF6?

To put the question in context: in numpy, many operations can be called in two ways, either as a method of the ndarray or as a function. So, e.g.,

a = array([1, 2, 3, 4])
flip(a)
and

a = array([1, 2, 3, 4])
a.flip()
produce the same result, except that the first creates a copy of a and then operates on that, while the second manipulates a in place. If flash space is very tight, I would yank all out-of-place function objects and keep only the in-place versions. You can always do

a = array([1, 2, 3, 4])
b = a
b.flip()
if you want to work on the copy.
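
One caveat on the last snippet: in standard Python, `b = a` binds a second name to the same object rather than copying it, so an in-place method called through `b` would also change `a`. Whether ulab offers an explicit copy call at this point is not stated in the post, so the sketch below uses plain Python lists (not ulab) purely to illustrate the difference between an alias and a copy; `list.reverse()` stands in for an in-place `flip`.

```python
# Plain Python lists stand in for ndarrays here (an assumption for illustration).
a = [1, 2, 3, 4]

alias = a            # second name for the same object, no copy is made
alias.reverse()      # in-place operation, visible through both names

c = [1, 2, 3, 4]
copy_of_c = list(c)  # explicit copy: a new, independent list
copy_of_c.reverse()  # only the copy is reversed

print(a, alias is a)   # [4, 3, 2, 1] True
print(c, copy_of_c)    # [1, 2, 3, 4] [4, 3, 2, 1]
```

If only in-place methods are kept, an explicit copy step of this kind is what preserves the original array.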

I would like to stress that these functions/methods do not have to be implemented twice; they are basically two ways of calling the same internal C function. But there is still an overhead, because the function objects, the QStrings, and the function wrappers are more or less duplicated. I believe one can save a couple of kB there.

How do you feel about this issue? Can we go with the solution I outlined, or do you think that keeping both options is important?

Cheers,

Zoltán

Post by jimmo » Tue Nov 05, 2019 7:54 am

v923z wrote:
Tue Nov 05, 2019 7:14 am
Is the 2 MB extra for the file system only? How much leeway has one for SF6?
The firmware for the SF2 and SF3 is spread across the internal flash and the external QSPI flash. By default everything goes to internal flash, except for things explicitly mentioned in the linker script (e.g. nimble and mbedtls).

Especially on the SF2, the firmware is quite close to the limit of the internal flash. So either:
- we move some more core functionality to external (e.g. some more stuff from extmod?)
- we make user modules go in external flash by default (I'm not sure exactly off the top of my head how to specify this in the linker script but I'm sure it's doable)
- you annotate functions explicitly inside the ulab code to specify which section they should end up in. (there's an attribute you can apply to symbols)

I'll have a play around later and get back to you.

Either way, this can definitely be made to work, no need to cut down useful functionality. (Although maybe making it a configuration option is still worth doing for other more constrained boards).

Post by jimmo » Tue Nov 05, 2019 7:56 am

I should add, on the SF6 the external QSPI flash is currently not used for storing code, as the main firmware fits entirely in internal flash.

Post by rcolistete » Sun Jun 07, 2020 2:07 pm

Any news about firmware for Pyboard D SF2W/SF3W with ulab?

Post by jcw » Sun Jun 07, 2020 3:30 pm

I don't know what the status of ulab is, but here is some information about how external flash is used on the SF2/SF3/SF6.

From the info at https://store.micropython.org/product/PYBD-SF2-W4F2 :
  • 2MiB external QSPI flash with execute capabilities to extend internal flash
  • Additional 2MiB external QSPI flash for user filesystem and storage
So it appears that there are two external flash chips. The first one is mapped into the address space and can execute code in place (XIP). The second one is a more conventional setup and can be used as a file system (it's probably quite a bit faster than an SD card).

By adding entries here, code can be placed in external flash - BT and TLS in this case:

https://github.com/micropython/micropyt ... ld#L48-L57

FWIW, there's plenty of code space left, the entire firmware needs under 1 MB so far:

LINK build-PYBD_SF2/firmware.elf
   text	   data	    bss	    dec	    hex	filename
 994696	    284	  75840	1070820	 1056e4	build-PYBD_SF2/firmware.elf
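
For reading this output: `dec` is the sum of `text`, `data`, and `bss`, and `hex` is the same total in hexadecimal. The sketch below additionally assumes the usual embedded convention that `text` and `data` are stored in flash while `bss` only occupies RAM; that split is general linker behaviour, not something stated in the post.

```python
# Values copied from the `size` output above (PYBD_SF2 build).
text, data, bss = 994696, 284, 75840

total = text + data + bss   # should match the `dec` column
flash_bytes = text + data   # assumption: text and data are stored in flash

print(total, hex(total))            # 1070820 0x1056e4
print(flash_bytes)                  # 994980
print(flash_bytes < 1024 * 1024)    # True: under 1 MB
```
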
And it appears that currently about half of the code ends up in that first QSPI external flash chip at 0x90000000:

Writing build-PYBD_SF2/firmware.dfu to the board
File: build-PYBD_SF2/firmware.dfu
    b'DfuSe' v1, image size: 995285, targets: 1
    b'Target' 0, alt setting: 0, name: "ST...", size: 995000, elements: 2
      0, address: 0x08008000, size: 450616
      1, address: 0x90000000, size: 544368
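
The sizes in this listing can be cross-checked. The sketch below assumes the usual DfuSe container layout (an 11-byte file prefix, a 274-byte target prefix, and an 8-byte header per element); those constants are not given in the post, but they reproduce the reported totals exactly. It also shows what share of the payload goes to the QSPI flash at 0x90000000.

```python
# Element sizes copied from the DFU listing above (PYBD_SF2).
elements = {0x08008000: 450616, 0x90000000: 544368}

# Assumed DfuSe layout constants (not stated in the post).
DFU_PREFIX = 11       # file prefix
TARGET_PREFIX = 274   # per-target prefix
ELEMENT_HEADER = 8    # address + size fields, per element

payload = sum(elements.values())
target_size = payload + ELEMENT_HEADER * len(elements)
image_size = DFU_PREFIX + TARGET_PREFIX + target_size
external_share = elements[0x90000000] / payload

print(payload)                   # 994984
print(target_size, image_size)   # 995000 995285
print(f"{external_share:.1%}")   # 54.7%
```
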
Here is the upload output for an SF6 board with more internal flash memory:

Writing build-PYBD_SF6/firmware.dfu to the board
File: build-PYBD_SF6/firmware.dfu
    b'DfuSe' v1, image size: 983741, targets: 1
    b'Target' 0, alt setting: 0, name: "ST...", size: 983456, elements: 1
      0, address: 0x08008000, size: 983448
As Jimmo mentioned, the SF6 doesn't place any code in external flash (which is slower):

https://github.com/micropython/micropyt ... 767.ld#L23

Adding entries in the linker script is perhaps slightly more flexible than adding "__attribute__((section(...)))" in the source code, as it allows placing code differently depending on the board type.

Post by v923z » Sun Jun 07, 2020 3:41 pm

rcolistete wrote:
Sun Jun 07, 2020 2:07 pm
Any news about firmware for Pyboard D SF2W/SF3W with ulab?
You should compile the firmware yourself from the source: https://github.com/v923z/micropython-ulab
Let me know if there are any difficulties.

You can also easily customise your ulab version (exclude sub-modules), if you find that you are running out of space.

I hope this helps,

Zoltán

Post by rcolistete » Sat Jul 11, 2020 1:57 pm

I've finally succeeded in compiling ulab for the Pyboard Lite, v1.1, and D SF2W/3W/6W.

It took some hours to discover an incompatibility between gcc-arm 10.1.0 and ulab + MicroPython. For example, on Manjaro Linux v20.0.3 with gcc-arm installed via 'arm-none-eabi-gcc v10.1.0-1' from the official repository, the build fails with a link error:

[stm32]$ make -j8 BOARD=PYBD_SF6 USER_C_MODULES=../../../ulab all
    ...
    Including User C Module from ../../../ulab/code
    ...
    LINK build-PYBD_SF6/firmware.elf
    arm-none-eabi-ld: build-PYBD_SF6/code/ulab.o:(.bss.ulab_extras_module+0x0): multiple definition of `ulab_extras_module'; build-PYBD_SF6/code/extras.o:(.data.ulab_extras_module+0x0): first defined here
    arm-none-eabi-ld: build-PYBD_SF6/code/ulab.o:(.bss.ulab_vectorise_module+0x0): multiple definition of `ulab_vectorise_module'; build-PYBD_SF6/code/vectorise.o:(.data.ulab_vectorise_module+0x0): first defined here
    make: *** [Makefile:605: build-PYBD_SF6/firmware.elf] Error 1
The solution was to use an older version of gcc-arm from the GNU Arm Embedded Toolchain Downloads, e.g. 'gcc-arm-none-eabi-8-2019-q3-update' or 'gcc-arm-none-eabi-9-2020-q2-update'. The 'gcc-arm-none-eabi-8-2019-q3-update' release generates slightly smaller code, so I've chosen it.

@v923z, would you like me to post this as a GitHub issue?

Post by rcolistete » Sat Jul 11, 2020 2:14 pm

v923z wrote:
Sun Jun 07, 2020 3:41 pm
You can also easily customise your ulab version (exclude sub-modules), if you find that you are running out of space.
Thanks for the tip. Editing 'ulab/code/ulab.h' to enable/disable sub-modules is very useful.

Pyboard D SF2W (512 kB of internal flash + 2 x 2 MB QSPI flash) with ulab v0.51.1 + current MicroPython (v1.12-623-gf743bd3d2):
- with all sub-modules, the build overflows by 3196 bytes, so no firmware can be created;
- without the 'compare' sub-module, which takes 3584 bytes in the firmware, ulab fits in region `FLASH_APP', taking 41380 bytes:

       text    data     bss     dec     hex filename
    1034544     428   75848 1110820  10f324 build-PYBD_SF2/firmware.elf
Pyboard D SF3W (512 kB of internal flash + 2 x 2 MB QSPI flash) with ulab v0.51.1 + current MicroPython (v1.12-623-gf743bd3d2):
- with all sub-modules, the build overflows by 3664 bytes, so no firmware can be created;
- without the 'compare' and 'extras' sub-modules, which take (3584 + 80 = 3664) bytes in the firmware, ulab fits in region `FLASH_APP', taking 41308 bytes (yeah, with 0 bytes left):

       text    data     bss     dec     hex filename
    1034920     432   76744 1112096  10f820 build-PYBD_SF3/firmware.elf
ulab v0.51.1 compiles OK, taking 45052 to 45056 bytes in the firmware, for:
- Pyboard Lite v1.0, which has 512 kB of internal flash:

       text    data     bss     dec     hex filename
     375712     124   43604  419440   66670 build-PYBLITEV10/firmware.elf
- Pyboard v1.1, which has 1 MB of internal flash:

       text    data     bss     dec     hex filename
     401896     124   27244  429264   68cd0 build-PYBV11/firmware.elf
- Pyboard D SF6W, which has 2 MB of internal flash + 2 x 2 MB QSPI flash:

       text    data     bss     dec     hex filename
    1042552     436  105824 1148812  11878c build-PYBD_SF6/firmware.elf
Built with the GNU Arm toolchain v8.3.1 20190703 (gcc-arm-none-eabi-8-2019-q3-update-linux.tar.bz2) on Manjaro Linux v20.0.3.

I'll post these firmware images, with FP32 and FP64, later and announce them here.

Post by pythoncoder » Sun Jul 12, 2020 9:42 am

On the SF2W you can put firmware in external flash: I have found a need for this when frozen modules made the firmware image too large for on-chip flash. Perhaps a corresponding fix can be done for SF3W (I don't have one of these). Here is my note on this:

There is space (up to 2 MB) in the external flash, which is memory mapped.
This requires the following patch:

--- a/ports/stm32/boards/PYBD_SF2/f722_qspi.ld
+++ b/ports/stm32/boards/PYBD_SF2/f722_qspi.ld
@@ -48,6 +48,7 @@ SECTIONS
     .text_ext :
     {
         . = ALIGN(4);
+        *frozen_content.o(.text* .rodata*)
         *lib/mbedtls/*(.text* .rodata*)
         . = ALIGN(512);
         *(.big_const*)
Peter Hinch

Post by rcolistete » Sun Jul 12, 2020 3:50 pm

Thanks @pythoncoder, this solves the issue of ulab (which takes approx. 45056 bytes in the firmware) not fitting in the 512 kB of internal flash:
- writing firmware 'PYBD-SF2-v1.12-623-gf743bd3d2_20200712.dfu' with the default configuration, 449,640 bytes are written to FLASH_APP (480 kB), so there are only ((480*1024=491520) - 449640 = 41880) bytes free in FLASH_APP, not enough for ulab;
- writing firmware 'PYBD-SF2_frozen-2MB_v1.12-623-gf743bd3d2_20200712.dfu' with the frozen .py modules in FLASH_EXT (external 2 MB QSPI flash), 423,168 bytes are written to FLASH_APP (480 kB), so there are ((480*1024=491520) - 423168 = 68352) bytes free in FLASH_APP, enough for ulab.
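
The arithmetic in this post can be laid out as a small budget check. The 45056-byte figure for ulab is the approximate size quoted above, so the resulting margins are estimates rather than linker output.

```python
FLASH_APP = 480 * 1024   # size of the FLASH_APP region, in bytes
ULAB_SIZE = 45056        # approximate ulab footprint quoted above

# Bytes written to FLASH_APP in the two builds described above.
builds = {
    "default": 449640,
    "frozen modules in FLASH_EXT": 423168,
}

# Free space per build, and what would remain after adding ulab.
free = {name: FLASH_APP - used for name, used in builds.items()}
margin = {name: free[name] - ULAB_SIZE for name in builds}

for name in builds:
    # A negative margin means ulab does not fit.
    print(f"{name}: free={free[name]}, margin for ulab={margin[name]}")
```

With these numbers the default build comes out roughly 3 kB short, while moving the frozen modules to external flash leaves about 23 kB to spare.
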
Last edited by rcolistete on Sun Jul 12, 2020 4:46 pm, edited 1 time in total.