How to combine bytearrays fast ?

General discussions and questions about development of code with MicroPython that is not hardware specific.
Target audience: MicroPython Users.
Blechi
Posts: 8
Joined: Mon Apr 23, 2018 5:55 pm

How to combine bytearrays fast ?

Post by Blechi » Wed Feb 12, 2020 9:30 pm

Hi folks,
I'm a little stuck here and maybe someone can help.
There are four bytearrays which have to be combined (for lack of a better word) into a fifth one using the fastest way possible.

Code: Select all

a = bytearray(240)
b = bytearray(240)
c = bytearray(240)
d = bytearray(240)
result = bytearray(240)

a = [<arbitrary byte>, 0, 0, 0, <arbitrary byte>, 0, 0, 0, ... and so on ..., <arbitrary byte>, 0, 0, 0]
b = [0, <arbitrary byte>, 0, 0, 0, <arbitrary byte>, 0, 0, ... and so on ..., 0, <arbitrary byte>, 0, 0]
c = [0, 0, <arbitrary byte>, 0, 0, 0, <arbitrary byte>, 0, ... and so on ..., 0, 0, <arbitrary byte>, 0]
d = [0, 0, 0, <arbitrary byte>, 0, 0, 0, <arbitrary byte>, ... and so on ..., 0, 0, 0, <arbitrary byte>]
The result should look like this:

Code: Select all

result = [<byte 0 from a>, <byte 1 from b>, <byte 2 from c>, <byte 3 from d>, <byte 4 from a>, ... and so on ..., <byte 236 from a>, <byte 237 from b>, <byte 238 from c>, <byte 239 from d>]
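For anyone who wants to try the ideas in this thread, the layout described above can be reproduced with a few lines. This is only a minimal sketch: `make_sparse` and the sample values 10, 20, 30 and 40 are made up for illustration and do not come from the original post.

```python
LENGTH = 240  # buffer size used in the post


def make_sparse(offset, value):
    """Return a bytearray with `value` at offset, offset + 4, ... and 0 elsewhere."""
    buf = bytearray(LENGTH)
    for i in range(offset, LENGTH, 4):
        buf[i] = value
    return buf


# Hypothetical sample data: one constant per input, so the origin of each byte is visible.
a = make_sparse(0, 10)
b = make_sparse(1, 20)
c = make_sparse(2, 30)
d = make_sparse(3, 40)

# The desired result takes byte i from whichever input is non-zero at position i.
expected = bytearray([10, 20, 30, 40] * (LENGTH // 4))
```

Any merge function discussed here can then be checked by comparing its output with `expected`.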
The easiest way would be:

Code: Select all

for i in range(0,240):
	result[i] = a[i] + b[i] + c[i] + d[i]
Unfortunately the operation has to be as fast as possible.
Something like

Code: Select all

result = a | b | c | d
would be nice, but of course it doesn't work that way.
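A bytearray does not implement `|`, but Python integers do. One way to get close to the one-liner, shown here purely as a sketch, is to convert each buffer to a big integer, OR the integers, and convert back. It runs on desktop Python. On MicroPython it assumes a build with arbitrary-precision integers, and it has not been timed on a board, so it may well be slower than a plain loop. `merge_via_int` is a hypothetical name, and the function allocates temporary objects on every call.

```python
def merge_via_int(a, b, c, d, result):
    """OR four equally sized buffers together by going through big integers."""
    n = len(result)
    combined = (int.from_bytes(a, "little")
                | int.from_bytes(b, "little")
                | int.from_bytes(c, "little")
                | int.from_bytes(d, "little"))
    # Write back into the existing buffer instead of rebinding the name.
    result[:] = combined.to_bytes(n, "little")
    return result


# Small made-up example with 8-byte buffers instead of 240-byte ones.
a = bytearray([1, 0, 0, 0, 5, 0, 0, 0])
b = bytearray([0, 2, 0, 0, 0, 6, 0, 0])
c = bytearray([0, 0, 3, 0, 0, 0, 7, 0])
d = bytearray([0, 0, 0, 4, 0, 0, 0, 8])
merged = merge_via_int(a, b, c, d, bytearray(8))
```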
Does anyone have a clever idea for doing this as fast as possible?

Thanks a lot in advance
Blechi

jimmo
Posts: 1156
Joined: Tue Aug 08, 2017 1:57 am
Location: Sydney, Australia

Re: How to combine bytearrays fast ?

Post by jimmo » Wed Feb 12, 2020 9:47 pm

Have you tried experimenting with the @native or @viper decorators?

Code: Select all

import micropython

@micropython.native
def merge(a, b, c, d, result):
  for i in range(0,240):
    result[i] = a[i] + b[i] + c[i] + d[i]
Or

Code: Select all

import micropython

@micropython.viper
def merge(a : ptr8, b : ptr8, c : ptr8, d : ptr8, result : ptr8):
  for i in range(0,240):
    result[i] = a[i] + b[i] + c[i] + d[i]
I'd need to test the viper one; some additional annotations might be required, but the native one should just work.
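To check the logic of the native version on a desktop Python before flashing it, a small stand-in for the `micropython` module is enough. The sketch below assumes the `@micropython.native` spelling is kept unchanged in the source; the fallback class is a hypothetical shim for testing only and gives no speed-up.

```python
try:
    import micropython  # present on MicroPython firmware
except ImportError:
    class micropython:  # hypothetical stand-in, used on desktop Python only
        @staticmethod
        def native(func):
            return func  # no speed-up, the function just runs as plain Python


@micropython.native
def merge(a, b, c, d, result):
    for i in range(len(result)):
        result[i] = a[i] + b[i] + c[i] + d[i]
    return result


# Made-up 8-byte inputs that follow the sparse layout from the first post.
a = bytearray([9, 0, 0, 0, 9, 0, 0, 0])
b = bytearray([0, 8, 0, 0, 0, 8, 0, 0])
c = bytearray([0, 0, 7, 0, 0, 0, 7, 0])
d = bytearray([0, 0, 0, 6, 0, 0, 0, 6])
out = merge(a, b, c, d, bytearray(8))
```

The viper version is left out here because its `ptr8` annotations are specific to MicroPython.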

OutoftheBOTS_
Posts: 714
Joined: Mon Nov 20, 2017 10:18 am

Re: How to combine bytearrays fast ?

Post by OutoftheBOTS_ » Wed Feb 12, 2020 10:59 pm

First I would ask: how are you going to add 4 bytes together and then fit the result within a byte? Remember that a byte can only hold values between 0 and 255.

I'm not sure what has been implemented in the uLAB module (uLAB is a NumPy-like module for MicroPython).

Re: How to combine bytearrays fast ?

Post by jimmo » Wed Feb 12, 2020 11:22 pm

OutoftheBOTS_ wrote:
Wed Feb 12, 2020 10:59 pm
First I would ask: how are you going to add 4 bytes together and then fit the result within a byte? Remember that a byte can only hold values between 0 and 255.
My impression was that the arrays are zero other than at every fourth element, i.e. this code would do the same thing:

Code: Select all

def merge(a, b, c, d, result):
  i = 0
  while i < 240:
    result[i] = a[i]
    i += 1
    result[i] = b[i]
    i += 1
    result[i] = c[i]
    i += 1
    result[i] = d[i]
    i += 1
(Which may actually be faster than the OP's code, and would also work well with native/viper.)
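The claim that both loops do the same thing holds only while the inputs keep the sparse layout. A quick way to convince yourself is to run both on test data, as in the sketch below. The names `merge_add`, `merge_pick` and `make_inputs` are hypothetical, and the test values are arbitrary.

```python
def merge_add(a, b, c, d, result):
    """Original approach: add all four inputs at every position."""
    for i in range(len(result)):
        result[i] = a[i] + b[i] + c[i] + d[i]
    return result


def merge_pick(a, b, c, d, result):
    """Unrolled approach: read only the input that owns each position."""
    i = 0
    n = len(result)  # assumed to be a multiple of 4
    while i < n:
        result[i] = a[i]
        i += 1
        result[i] = b[i]
        i += 1
        result[i] = c[i]
        i += 1
        result[i] = d[i]
        i += 1
    return result


def make_inputs(n=240):
    """Hypothetical test data: input k is non-zero only where i % 4 == k."""
    bufs = [bytearray(n) for _ in range(4)]
    for i in range(n):
        bufs[i % 4][i] = (i % 255) + 1  # any value in 1..255
    return bufs


a, b, c, d = make_inputs()
same = merge_add(a, b, c, d, bytearray(240)) == merge_pick(a, b, c, d, bytearray(240))
```

If the inputs were not sparse, the two functions would diverge, because the sum could mix several non-zero bytes while the unrolled loop would still copy only one of them.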

(Speculating wildly, but my guess is it's something like APA102 pixels, and different code generates the R G B and Brightness values??)

Re: How to combine bytearrays fast ?

Post by OutoftheBOTS_ » Thu Feb 13, 2020 5:18 am

jimmo wrote:
Wed Feb 12, 2020 11:22 pm
My impression was that the arrays are zero other than at every fourth element.
Ahh ok that makes more sense.

So then this

Code: Select all

a = bytearray(240)
b = bytearray(240)
c = bytearray(240)
d = bytearray(240)
result = bytearray(240)
really needs to be this

Code: Select all

a = bytearray(240)
b = bytearray(240)
c = bytearray(240)
d = bytearray(240)
result = bytearray(240*4)
If speed is really needed, then instead of creating 4 separate arrays and combining them afterwards, I would suggest just creating the needed array to begin with.

Something like this:

Code: Select all

from micropython import const

a_pos = const(0)
b_pos = const(1)
c_pos = const(2)
d_pos = const(3)

result = bytearray(240*4)


# fill the array with some data

# change a at position 187 to 25
result[187*4+a_pos] = 25

# change b at position 122 to 231
result[122*4+b_pos] = 231

# change c at position 100 to 29
result[100*4+c_pos] = 29

# change d at position 123 to 126
result[123*4+d_pos] = 126

Re: How to combine bytearrays fast ?

Post by jimmo » Thu Feb 13, 2020 7:58 am

OutoftheBOTS_ wrote:
Thu Feb 13, 2020 5:18 am
really needs to be this
My impression was that a, b, c and d only have non-zero values at every fourth position, with each array's positions offset by one relative to the previous array's.

i.e.
a = r1,0,0,0, r2,0,0,0, r3,0,0,0
b = 0,g1,0,0, 0,g2,0,0, 0,g3,0,0
c = 0,0,b1,0, 0,0,b2...
...
which would combine into
result = r1,g1,b1,x1,r2,g2,b2....

i.e. the + is a convenient way of ignoring the zeros. (Could also be | ).
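Put as an index formula, the combined layout stores channel `k` of pixel `p` at byte `4 * p + k`. The sketch below assumes 60 pixels (240 bytes divided by 4 channels) and uses hypothetical helper names. It does not assume any particular colour order for the strip.

```python
NUM_PIXELS = 60  # assumption: 240 bytes / 4 channels per pixel
CHANNELS = 4


def set_channel(buf, pixel, channel, value):
    """Write one channel (0..3) of one pixel directly into the combined buffer."""
    buf[pixel * CHANNELS + channel] = value


def get_pixel(buf, pixel):
    """Return the four channel bytes of one pixel as a tuple."""
    start = pixel * CHANNELS
    return tuple(buf[start:start + CHANNELS])


frame = bytearray(NUM_PIXELS * CHANNELS)
set_channel(frame, 0, 0, 255)   # first channel of the first pixel
set_channel(frame, 59, 3, 7)    # last channel of the last pixel
```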

Re: How to combine bytearrays fast ?

Post by Blechi » Thu Feb 13, 2020 4:40 pm

Wow, this is really impressive.
I can't believe that this can be done so fast.
This

Code: Select all

@micropython.native
def merge(a, b, c, d, result):
    for i in range(0,240):
        result[i] = a[i] + b[i] + c[i] + d[i]
    return result
is about 7 times faster than this

Code: Select all

for i in range(0,240):
    result[i] = a[i] + b[i] + c[i] + d[i]
But this

Code: Select all

@micropython.native
def merge(a, b, c, d, result):
    i = 0
    while i < 240:
        result[i] = a[i]
        i += 1
        result[i] = b[i]
        i += 1
        result[i] = c[i]
        i += 1
        result[i] = d[i]
        i += 1
    return result
is about 15 (fifteen !) times faster than the simple 'for' loop.
The @micropython.viper version of the while loop is about as fast as the @micropython.native one.
And yes these are RGBW values for SK6812 neopixels which are calculated separately and then combined to shove them into the strip via esp.neopixel_write.
Now I have something to work with.
It's now possible to create something like a ninth bit per color by doubling the frame rate.

Code: Select all

possible LED brightness values in the  first frame: 0,1, ... ,255,255,255, ... ,255
possible LED brightness values in the second frame: 0,0, ... ,  0,  1,  2, ... ,255
This of course doesn't make the LEDs any brighter, but it allows smoother dimming of a lot more LEDs.
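Read from the two rows above, a brightness level between 0 and 510 is split so that the first frame saturates at 255 and the second frame carries the remainder. A minimal sketch of that split follows; `split_level` is a hypothetical name and the 0..510 range is inferred from the table, not stated in the post.

```python
def split_level(level):
    """Split a level in 0..510 into the byte values for the two frames."""
    if not 0 <= level <= 510:
        raise ValueError("level must be in 0..510")
    first = min(level, 255)        # first frame: 0, 1, ..., 255, 255, ..., 255
    second = max(level - 255, 0)   # second frame: 0, 0, ..., 0, 1, ..., 255
    return first, second


pairs = [split_level(v) for v in (0, 1, 255, 256, 510)]
```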
Thank you so much for your help.
Thanks again
Blechi

Re: How to combine bytearrays fast ?

Post by jimmo » Thu Feb 13, 2020 10:15 pm

Yes, @native is really great when you need it. Note that the generated native code will be larger than the equivalent bytecode, but for short perf-critical things like this it's very useful. (This is why it's not enabled by default, though.)

My guess as to why the second version with the while loop is faster is that it does way fewer reads of the buffer. Similar number of total ops though.

Like OutOfTheBots said though, I don't really understand why the input arrays have to be sparse. Why can't you combine four 60-byte arrays into the final 240-byte one?

(Or why can't you just work entirely in the output array).
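For comparison, combining four dense 60-byte arrays would look roughly like the sketch below. The names `r`, `g`, `b`, `w` and `interleave` are hypothetical, and nothing here has been timed on hardware.

```python
def interleave(r, g, b, w, result):
    """Copy four dense channel buffers into one interleaved buffer."""
    j = 0
    for i in range(len(r)):
        result[j] = r[i]
        result[j + 1] = g[i]
        result[j + 2] = b[i]
        result[j + 3] = w[i]
        j += 4
    return result


# Made-up constant channels, 60 bytes each.
r = bytearray([1] * 60)
g = bytearray([2] * 60)
b = bytearray([3] * 60)
w = bytearray([4] * 60)
frame = interleave(r, g, b, w, bytearray(240))
```

Each input byte is read exactly once, so the loop body does the same number of buffer writes as the unrolled while loop, but with a quarter of the input memory.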

Re: How to combine bytearrays fast ?

Post by Blechi » Thu Feb 13, 2020 11:45 pm

The data for each of the four LED colors is a 240-byte window of the same bytearray, which is quite a bit longer than 240 bytes.
The four windows slide along this array at different speeds in a way that keeps the 0 bytes at the same position for each basic color array.
This is faster than calculating the content for the four bytearrays separately.
Imagine a clock where the hands all look the same but each one moves by itself.
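One possible way to take such windows without copying is a `memoryview` over the long array. The sketch below rests on assumptions that are not in the post: the source array is taken to be non-zero only at indices divisible by 4, and each window starts at an offset chosen so that its non-zero bytes land on its own channel position. The names `window`, `source` and `view` are hypothetical.

```python
WINDOW = 240


def window(source_view, step, channel):
    """Return a 240-byte view whose non-zero bytes sit where i % 4 == channel.

    Assumes the source is non-zero only at indices divisible by 4.
    """
    start = 4 * step + (4 - channel) % 4
    return source_view[start:start + WINDOW]


# Hypothetical source data: a non-zero value at every fourth byte.
source = bytearray(1024)
for i in range(0, len(source), 4):
    source[i] = (i // 4) % 255 + 1

view = memoryview(source)  # slicing a memoryview does not copy the data
a = window(view, 3, 0)
b = window(view, 10, 1)
c = window(view, 20, 2)
d = window(view, 40, 3)
```

Sliding a window is then just a matter of changing `step`, and the four views can be passed straight to a merge function.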

Re: How to combine bytearrays fast ?

Post by OutoftheBOTS_ » Fri Feb 14, 2020 6:53 am

Correct me if I am wrong, but I think it is the range(240) that takes up all the time, because it doesn't create a list to iterate through but is rather a global function that has to be called on each iteration of the for loop. If you create a local list, it should run faster but use more memory.

Code: Select all

@micropython.native
def merge(a, b, c, d, result):
    iteration_list = range(0,240)
    for i in iteration_list:
        result[i] = a[i] + b[i] + c[i] + d[i]
    return(result)  
I would still assume the while loop will be faster though but would be interesting to see :)
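Questions like this are easiest to settle by measuring. In standard Python semantics the `range(...)` expression in a `for` statement is evaluated once, before the loop starts, so whether hoisting it helps is an empirical matter. The harness below is a sketch: it uses `time.ticks_us` and `time.ticks_diff` where available and falls back to `time.perf_counter` on desktop Python. All function names are hypothetical, and the decorators are left out so it runs anywhere; on a board, `@micropython.native` would be added to the functions under test.

```python
import time

try:
    ticks_us = time.ticks_us      # MicroPython
    ticks_diff = time.ticks_diff
except AttributeError:            # desktop Python fallback
    def ticks_us():
        return int(time.perf_counter() * 1000000)

    def ticks_diff(end, start):
        return end - start


def merge_for(a, b, c, d, result):
    iteration_range = range(0, len(result))
    for i in iteration_range:
        result[i] = a[i] + b[i] + c[i] + d[i]
    return result


def merge_while(a, b, c, d, result):
    i = 0
    n = len(result)  # assumed to be a multiple of 4
    while i < n:
        result[i] = a[i]
        i += 1
        result[i] = b[i]
        i += 1
        result[i] = c[i]
        i += 1
        result[i] = d[i]
        i += 1
    return result


def average_us(func, args, repeats=50):
    """Average run time of func(*args) in microseconds."""
    start = ticks_us()
    for _ in range(repeats):
        func(*args)
    return ticks_diff(ticks_us(), start) / repeats


# Made-up sparse inputs: input k is non-zero only where i % 4 == k.
inputs = [bytearray(240) for _ in range(4)]
for i in range(240):
    inputs[i % 4][i] = (i % 255) + 1
a, b, c, d = inputs

for_us = average_us(merge_for, (a, b, c, d, bytearray(240)))
while_us = average_us(merge_while, (a, b, c, d, bytearray(240)))
print("for loop:", for_us, "us, while loop:", while_us, "us")
```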
Last edited by OutoftheBOTS_ on Fri Feb 14, 2020 6:54 am, edited 1 time in total.
