ulab, or what you will - numpy on bare metal
Posted: Wed Sep 25, 2019 4:12 pm
Hi all,
As advertised in the first post of viewtopic.php?f=3&t=6874, and also under https://micropython-usermod.readthedocs ... ds_14.html, I am releasing a C module with a numpy-like interface.
For a whopping 12 kB of extra flash space, you are going to get the following:
- compact, iterable and sliceable containers of numerical data in one and two dimensions (arrays and matrices). Arrays and matrices can be initialised by passing iterables, or iterables of iterables, into the constructor. The type of the array can be restricted by supplying the dtype keyword argument.
Code:
import ulab
a = ulab.ndarray([1, 2, 3], dtype=ulab.uint8)
b = ulab.ndarray([[1, 2, 3], [4, 5, 6]], dtype=ulab.float)
Arrays and matrices are pretty-printed, and can be sliced and iterated over, even if the container is two-dimensional (in this case, the iterator returns a new ndarray).
Code:
import ulab
a = ulab.ndarray([1, 2, 3, 4, 5], dtype=ulab.uint8)
print(a[-1], a[0], a[0:2])
b = ulab.ndarray([[1, 2, 3], [4, 5, 6]], dtype=ulab.float)
print(b[0])
When it makes sense for a particular matrix operator, the axis keyword argument can be supplied. Thus, e.g.,
Code:
import ulab
b = ulab.ndarray([[1, 2, 3], [4, 5, 6]], dtype=ulab.float)
ulab.max(b, axis=0)
will return a new ndarray, whose elements are the maxima along the vertical axis. The axis keyword is defined for the min/max, argmin/argmax, sum, mean, std, and roll functions.
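For reference, the axis=0 semantics can be sketched in plain Python (the helper name below is illustrative, and not part of ulab's API):

```python
# Plain-Python sketch of what max with axis=0 computes for a 2D container:
# the maximum taken down each column.
def max_axis0(matrix):
    # zip(*matrix) yields the columns of the row-major nested list
    return [max(column) for column in zip(*matrix)]

print(max_axis0([[1, 2, 3], [4, 5, 6]]))  # [4, 5, 6]
```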
- In addition, ndarrays support all the relevant unary and binary operators (len, ==, +, *, etc.), i.e., the following is valid:
Code:
from ulab import ndarray, float
a = ndarray([1, 2, 3], dtype=float)
print(a + a*5.0)
Binary operators work element-wise. For this reason, you can easily get a weighted rolling average as follows:
Code:
from ulab import ndarray, roll, mean
weight = ndarray([1, 2, 3, 4, 5]) # These are the weights; the last entry is the most dominant
samples = ndarray([0]*5) # initial array of samples
for i in range(5):
    # a new datum is inserted on the right hand side. This simply overwrites whatever was in the last slot
    samples[-1] = 5-i
    print(mean(samples*weight))
    # the data are then shifted by one position to the left
    roll(samples, 1)
- universal functions and vectorised computations on micropython iterables and numerical arrays/matrices. Amongst other things, this means that the following all work:
Code:
from ulab import exp, ndarray
exp(1.5) # single numerical value
exp([1.5, 2.5]) # a list
exp(range(5)) # a range
a = ndarray([1, 2, 3]) # a 1D ndarray
exp(a)
a = ndarray([[1, 2, 3], [4, 5, 6]]) # a 2D ndarray
exp(a)
- basic linear algebra routines (matrix inversion, matrix reshaping, and transposition)
Code:
from ulab import ndarray, inv
a = ndarray([[1, 2], [3, 4]])
b = inv(a)
b.transpose()
b.reshape((1, 4))
- polynomial fits to numerical data
Code:
import ulab
x = ulab.ndarray([-3, -2, -1, 0, 1, 2, 3])
y = ulab.ndarray([10, 5, 1, 0, 1, 4.2, 9.1])
p = ulab.polyfit(x, y, 2)
ulab.polyval(p, x)
- fast Fourier transforms of linear arrays (I don't know whether 2D transforms are worth the trouble): since one is often interested only in the power spectrum of the signal, this sub-module also implements, beyond the actual FFT, a function called spectrum that returns only the absolute value of the transformed signal. The FFT routine calculates the transform in place, i.e., there should be very little RAM overhead.
Code:
import ulab
a = ulab.ndarray([0, 1, 2, 3, 0, 1, 2, 3])
re, im = ulab.fft(a) # this returns two new ndarrays
print('real part: ', re)
print('imag part: ', im)
ulab.spectrum(a) # this overwrites the input array
print(ulab.log10(a)) # you get the power in dB
@pythoncoder would surely like to know whether an FFT costs a king's ransom. No, it doesn't: a 1024-point float transform takes less than 2 ms on the pyboard.
Code:
>>> import ulab, utime
>>> x = ulab.linspace(0, 100, 1024)
>>> y = ulab.sin(x)
>>> t = utime.ticks_us()
>>> a, b = ulab.fft(y)
>>> print(utime.ticks_diff(utime.ticks_us(), t))
1948
A couple of very general remarks:
1. ndarrays are built on top of micropython's binary arrays, thus some of the binary facilities are re-used. However, in the long run, ndarrays could be detached from binary arrays. I bring this point up because the binary array utilities are not actually exposed in the C code; hence, I had to copy parts verbatim. If I might ask a favour of the maintainers of micropython, it would be this: think about what could/should be exposed at the C level, in other words, which functions could find applications in external modules. array_new is definitely such a function.
2. upcasting:
Since ndarrays can be of different types, one has to decide what should happen, if two dissimilar arrays are added, multiplied, etc. The upcasting rules of numpy are not entirely consistent; I chose mine, but they might not be perfect.
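As an illustration of the problem only (the table below is a made-up example, not ulab's actual rule set), an upcasting rule can be as simple as picking the "wider" of the two types in some fixed order:

```python
# Illustrative sketch only: one possible upcasting table; ulab's actual rules
# differ in the details. Type codes follow micropython's array module:
# 'b' int8, 'B' uint8, 'h' int16, 'H' uint16, 'f' float.
ORDER = ['b', 'B', 'h', 'H', 'f']  # an assumed "widening" order

def upcast(t1, t2):
    # the result of a binary operation takes the wider of the two dtypes
    return max(t1, t2, key=ORDER.index)

print(upcast('b', 'H'))  # H
print(upcast('h', 'f'))  # f
```

The inconsistency mentioned above shows up in the corner cases, e.g., whether adding a signed and an unsigned integer of the same width should widen to the next integer size or fall back to float.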
3. slicing and indexing:
Code:
from ulab import ndarray, uint8
a = ndarray([range(10), range(10), range(10)], dtype=uint8)
a[0, 1]
While a single slice works as expected, a two-dimensional index such as a[0, 1] won't be valid. This has to do with the fact that the slice object in micropython does not support the comma. I think a fix might be difficult, and it would probably have to happen in micropython itself; I would be happy to get feedback on this from more knowledgeable people.
At present, not all possible combinations of slices are defined, e.g., backwards slices won't work (though, you can still retrieve the very last element of an array, see the rolling average example above).
4. parsing of arguments:
A huge chunk of the code is really boring stuff, namely the parsing of input arguments, and building a decision tree based on whether the input is a standard python iterable or an ndarray. Now, the problem is not the iterator itself, because ndarrays are also iterable (though iterating over the C array is faster). But depending on the exact C type of the input argument, different intermediate containers have to be defined. So, e.g., when finding the maximum of an array, one must define a temporary variable that holds the current maximum, and its type depends on the input's. This applies to all operations. If anyone knows a sensible universal (i.e., independent of the input data type) solution to this problem, I would like to hear it.
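In Python the problem is invisible, because everything is an object; the sketch below (illustrative names, not the actual C code) only hints at why C needs a separate branch, with its own temporary variable, per input type:

```python
import array

# Illustrative Python analogue of the C-level dispatch. In Python, 'best' can
# hold a value of any type, so one loop serves all inputs; in C, the type of
# the temporary must be fixed up front, hence one branch per dtype.
def maximum(obj):
    if isinstance(obj, array.array):
        typecode = obj.typecode   # 'b', 'h', 'f', ...: would pick the C temporary
    else:
        typecode = 'f'            # generic iterables would fall back to float
    best = None
    for item in obj:
        if best is None or item > best:
            best = item
    return best

print(maximum(array.array('h', [3, 1, 4, 1, 5])))  # 5
print(maximum(range(7)))                           # 6
```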
5. additional functions:
I would also like to know which other numpy functions you would find useful. At the moment, only 12 kB of extra space is used, so there is still a lot of room for additional tools. My vision was to thematically separate functions into sub-modules that one can easily exclude from the compilation, but I don't see that flash is going to be a problem anytime soon.
For the impatient: you can download the firmware for the pyboard v.1.1 from https://github.com/v923z/micropython-ul ... rmware.dfu . Otherwise, the source code can be found under https://github.com/v923z/micropython-ul ... aster/code .
I hope that you find something useful in this module, and I would be glad to have feedback: if it pertains to general discussion, post it here; if it is related to a very specific implementation detail, then github is probably a better place.
You will find a lot of examples under https://github.com/v923z/micropython-ul ... ulab.ipynb. Incidentally, the notebook also contains the heavily commented source code itself.
If you don't fancy the jupyter notebook, you can read its content under https://github.com/v923z/micropython-ul ... e/ulab.rst. I would like to stress that the notebook should be treated as a technical document, or developer manual, and not as a user guide: it contains mostly my verbose comments on various implementation details and difficulties. As soon as I have a bit more time, and all the dust settles, I will write up a proper user manual; in any case, the conventions follow numpy's.
As stated on github, the MIT licence applies.
Cheers,
Zoltán