Most efficient searching within large bytearrays

The official pyboard running MicroPython.
This is the reference design and main target board for MicroPython.
You can buy one at the store.
Target audience: Users with a pyboard.
marc4444
Posts: 11
Joined: Sat Aug 03, 2019 12:16 pm

Most efficient searching within large bytearrays

Post by marc4444 » Sun Mar 15, 2020 6:28 pm

Hi All,

I've created a large (2000-byte) bytearray with a memoryview to read a UART into. The commands the UART is receiving can sometimes take a while to arrive, so I'd like to check for a specific finishing character sequence within the bytearray in a loop.

I've noticed that MicroPython doesn't support the .find() method. What is the most efficient way of doing this without creating a copy of the bytearray? Ideally I could specify a start/finish index too, like the find method. I have something rough below that I'm using, but it doesn't seem very efficient.

Code:

def has_substring(input_string, pattern, start=0, end=-1):
    # Naive scan: returns the index just past the first match, or -1.
    i = start            # current position in input_string
    j = 0                # current position in pattern
    k = start            # where this match attempt began (starting at 0 broke start > 0)
    l = len(input_string) if end == -1 else end
    m = len(pattern)
    while i < l and j < m:
        if input_string[i] == pattern[j]:
            i += 1
            j += 1
        else:
            j = 0        # mismatch: restart one position past the last attempt
            k += 1
            i = k
    return i if j == m else -1
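For context, here's a rough sketch of the read loop I mean. The name read_command, the stream argument, and the slice-compare terminator check are illustrative, not my actual code; io.BytesIO stands in for the UART so it runs off-board.

```python
import io  # BytesIO stands in for the UART in the example at the bottom

def read_command(stream, buf, terminator=b'\r\n'):
    # Sketch only: fill preallocated `buf` from `stream` until `terminator`
    # arrives. `stream` is anything with readinto() -- a pyb.UART on the
    # board, io.BytesIO here. Returns the index just past the terminator,
    # or -1 if buf fills up first.
    mv = memoryview(buf)                  # readinto() fills slices copy-free
    tlen = len(terminator)
    filled = 0
    while filled < len(buf):
        n = stream.readinto(mv[filled:])
        if not n:                         # a UART timeout gives None/0
            continue                      # nothing arrived yet; keep polling
        filled += n
        # rescan only the stretch that could hold a newly completed match
        for i in range(max(0, filled - n - tlen + 1), filled - tlen + 1):
            if mv[i:i + tlen] == terminator:
                return i + tlen
    return -1

# off-board usage with an in-memory stream standing in for the UART
rxbuf = bytearray(2000)
end = read_command(io.BytesIO(b'PING\r\n'), rxbuf)   # end == 6
```

On the board the call would look the same with a pyb.UART in place of the BytesIO.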
Thanks in advance for the help!

Roberthh
Posts: 1889
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Most efficient searching within large bytearrays

Post by Roberthh » Sun Mar 15, 2020 7:59 pm

find is supported. How did you try?

marc4444
Posts: 11
Joined: Sat Aug 03, 2019 12:16 pm

Re: Most efficient searching within large bytearrays

Post by marc4444 » Sun Mar 15, 2020 8:05 pm

Hi Robert,

Thanks for the quick reply; see the REPL output below. I'm just using a pyboard; do I have some old version or something? I should have added to my original post that I'm trying to do the .find() on the bytearray.

Thanks!

Print from Pyboard:

Code:

>>> a = bytearray(20)
>>> a.find(b'\r\n')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'bytearray' object has no attribute 'find'
>>> print(type(a))
<class 'bytearray'>
>>> print(dir(a))
['__class__', 'append', 'extend', 'decode']
Print on python 3 on PC:

Code:

>>> a = bytearray(b'qwerty')
>>> a.find(b'ty')
4

Roberthh
Posts: 1889
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Most efficient searching within large bytearrays

Post by Roberthh » Sun Mar 15, 2020 8:59 pm

You are right: there is no find for bytearrays. I had looked at strings and bytes objects.

marc4444
Posts: 11
Joined: Sat Aug 03, 2019 12:16 pm

Re: Most efficient searching within large bytearrays

Post by marc4444 » Mon Mar 16, 2020 9:25 am

Thanks Robert. Any ideas from anyone on a more efficient way than the function I posted would be appreciated.

Best,
Marc

Roberthh
Posts: 1889
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Most efficient searching within large bytearrays

Post by Roberthh » Mon Mar 16, 2020 10:13 am

You can use comparison on substrings, like equality. Using a memoryview avoids copying the data. But for short patterns the overhead may be similar; the code just looks tidier. You can also use something like

pattern in input_string

as a fast test of whether the pattern is in the buffer at all. If input_string were not so large, then bytes(input_string).find(pattern) would be useful, but bytes() creates a temporary copy.
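A minimal sketch of that slice-comparison idea (the function name, signature, and defaults are mine, not an existing API):

```python
def mv_find(buf, pattern, start=0, end=None):
    # find()-style search over a buffer without copying its contents:
    # memoryview slicing gives windows into buf, and comparing a
    # memoryview slice against bytes compares the underlying data.
    mv = memoryview(buf)
    plen = len(pattern)
    if end is None:
        end = len(buf)
    for i in range(start, end - plen + 1):
        if mv[i:i + plen] == pattern:
            return i          # index where the match starts, like find()
    return -1
```

Each iteration still allocates a small memoryview object for the slice, but the buffer data itself is never copied, so it stays cheap even for a 2000-byte buffer.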
