Another option is to move only selected 'ulab' sub-modules to FLASH_EXT (external 2 MB QSPI flash), instead of moving the whole of ulab.
Just add one line per sub-module at line 51 of '../micropython/ports/stm32/boards/PYBD_SF2/f722_qspi.ld' (which is also used by Pyboard D SF3). Here the sub-modules 'compare' and 'user' are moved :
*code/compare/*(.text* .rodata*)
*code/user/*(.text* .rodata*)
The advantages :
- granularity : we can choose which parts of 'ulab' stay in the faster 480 kB FLASH_APP (inside the 512 kB internal flash memory) and which ones are moved to the slower FLASH_EXT (external 2 MB QSPI flash);
- so the parts of 'ulab' that are more time-critical from the user's point of view (where high performance is needed) can stay in FLASH_APP.
Moving to FLASH_EXT :
- moving only the '.py frozen modules' is enough to build firmware with ulab + SP (single precision) + optionally threads for Pyboard D SF2/SF3;
- moving only the '.py frozen modules' is not enough to build firmware with ulab + DP (double precision) + optionally threads for Pyboard D SF2/SF3, so in this case it is worth selecting the minimum number of non-time-critical sub-modules to move to FLASH_EXT, to avoid overflowing FLASH_APP.
To understand the speed difference between the internal and external flash memories, see these preliminary benchmarks for a 1024-point 'ulab.fft.fft()' on Pyboard D SF2 + MicroPython v1.12 with ulab, default clock (120 MHz), without threads :
- full ulab in 512 kB internal flash memory, FP32 : 1.873 ms;
- full ulab in external 2 MB QSPI flash, FP32 : 2.070 ms (10.5% more);
- full ulab in 512 kB internal flash memory, FP64 : 31.856 ms;
- full ulab in external 2 MB QSPI flash, FP64 : 60.528 ms (90.0% more).
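The percentages in parentheses follow directly from the timings. A small sketch of the arithmetic (the function name 'overhead_percent' is just an illustrative helper, not part of MicroPython or ulab) :

```python
def overhead_percent(internal_ms, external_ms):
    """Extra time of the external-flash run relative to the internal-flash run."""
    return (external_ms - internal_ms) / internal_ms * 100.0


# Timings taken from the benchmark list above (milliseconds).
fp32 = overhead_percent(1.873, 2.070)
fp64 = overhead_percent(31.856, 60.528)

print("FP32 overhead: {:.1f}%".format(fp32))  # prints "FP32 overhead: 10.5%"
print("FP64 overhead: {:.1f}%".format(fp64))  # prints "FP64 overhead: 90.0%"
```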
So :
- for single precision (SP/FP32), the overhead is small (10.5%) as the MCU does many FP32 calculations in hardware;
- for double precision (DP/FP64), the overhead is huge (90.0%) as the MCU does all FP64 calculations in software, so it reads more code from the slower external flash memory.
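For anyone who wants to repeat this kind of measurement on their own build, a timing loop along the following lines could be used. This is only a sketch, not the exact script behind the numbers above : the helper name 'time_call_ms' and the repeat count are assumptions, and the ulab call is left as a comment because the ulab API differs between versions.

```python
import time


def time_call_ms(func, repeats=100):
    """Return the average duration of func() in milliseconds."""
    ticks_us = getattr(time, "ticks_us", None)
    if ticks_us is not None:
        # MicroPython : microsecond tick counter, wrap-around safe via ticks_diff().
        start = ticks_us()
        for _ in range(repeats):
            func()
        elapsed_us = time.ticks_diff(ticks_us(), start)
    else:
        # CPython fallback, so the helper can be tried on a PC as well.
        start = time.perf_counter()
        for _ in range(repeats):
            func()
        elapsed_us = (time.perf_counter() - start) * 1e6
    return elapsed_us / repeats / 1000.0


# On the board, func would wrap the call being benchmarked, for example
# (hypothetical, 'a' being a 1024-element ulab array) :
#     print(time_call_ms(lambda: ulab.fft.fft(a)))
print(time_call_ms(lambda: sum(range(1000)), repeats=10))
```

Running the same script on a build with ulab in FLASH_APP and on a build with ulab in FLASH_EXT gives the two values to compare.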
As for using the Pyboard D SF2/SF3 flash memories, the logic seems to be :
- use the maximum capacity (480 kB) of FLASH_APP inside the 512 kB internal flash memory;
- move to the slower FLASH_EXT (external 2 MB QSPI flash) only parts (modules and sub-modules) which don't fit in FLASH_APP and aren't time-critical.
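This logic can be sketched as a small host-side Python helper. It is only an illustration under stated assumptions : the function names are hypothetical, and all sizes in the example are made-up placeholders (real values would have to come from the build output). It moves the largest non-time-critical sub-modules first, which keeps the number of moved sub-modules as small as possible, then formats them like the linker script lines shown earlier.

```python
FLASH_APP_BYTES = 480 * 1024  # FLASH_APP capacity quoted in this post


def pick_for_flash_ext(base_size, sizes, time_critical, capacity=FLASH_APP_BYTES):
    """Return the sub-modules to move to FLASH_EXT so the rest fits in FLASH_APP.

    base_size     : bytes used in FLASH_APP by everything except the listed sub-modules
    sizes         : dict mapping sub-module name -> code size in bytes
    time_critical : names that must stay in FLASH_APP
    """
    used = base_size + sum(sizes.values())
    moved = []
    # Largest non-critical sub-modules first : fewest moves for a given overflow.
    candidates = sorted(
        (name for name in sizes if name not in time_critical),
        key=lambda name: sizes[name],
        reverse=True,
    )
    for name in candidates:
        if used <= capacity:
            break
        used -= sizes[name]
        moved.append(name)
    if used > capacity:
        raise ValueError("does not fit even after moving all non-critical sub-modules")
    return moved


def linker_lines(moved):
    """Format input-section patterns in the style used for f722_qspi.ld."""
    return ["*code/{}/*(.text* .rodata*)".format(name) for name in moved]


# Hypothetical sizes in bytes, NOT measured values.
example_sizes = {"fft": 20000, "linalg": 15000, "compare": 6000, "user": 1000}
example_base = FLASH_APP_BYTES - 30000  # firmware without these sub-modules
moved = pick_for_flash_ext(example_base, example_sizes, time_critical={"fft"})
for line in linker_lines(moved):
    print(line)
```

With these placeholder numbers the firmware overflows FLASH_APP by 12000 bytes, so moving 'linalg' alone is enough and 'fft' stays in the faster internal flash.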