Having trouble with ure.match()

RP2040 based microcontroller boards running MicroPython.
Target audience: MicroPython users with RP2040 boards.
This does not include conventional Linux-based Raspberry Pi boards.
hwiguna
Posts: 7
Joined: Sat Aug 17, 2019 8:07 pm

Post by hwiguna » Thu Apr 22, 2021 2:40 am

Maybe I wrote the regex wrong, but according to regex101.com the regex below should return four matches.
ure.match("[abcd][01][01]","d00ba11c01d10")

When I use ure on "MicroPython v1.14 on 2021-04-11; Raspberry Pi Pico with RP2040", that expression returns only one match (the first one). How do I get the other matches? Thanks!

>>> import ure
>>> r=ure.match("[abcd][01][01]","d00ba11c01d10")
>>> r
<match num=1>

>>> dir(r)
['__class__', 'end', 'start', 'group', 'groups', 'span']

>>> r.group(0)
'd00'

>>> r.group(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: 1

>>> r.span(0)
(0, 3)

>>> r.span(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: 1

taPIQoLEHUMA
Posts: 15
Joined: Thu Mar 04, 2021 2:59 am

Re: Having trouble with ure.match()

Post by taPIQoLEHUMA » Fri Apr 23, 2021 8:51 pm

Wow. I hope it doesn't return four matches. I would expect one match with three characters in it, and that is what I get in MicroPython on the laptop.

The regex has three character classes: it matches one of a/b/c/d, followed by a 0 or 1, followed by another 0 or 1. If it matches, it returns three characters (which it did: "d00").

On MicroPython for Unix, dir(r) showed only 'group' and '__class__', but there was only one entry in group, as expected.

We're back to the drawing board if you need to match multiple three-character entries in one string (and your input will break it at the second entry, "ba11").

Christian Walther
Posts: 169
Joined: Fri Aug 19, 2016 11:55 am

Re: Having trouble with ure.match()

Post by Christian Walther » Sat Apr 24, 2021 11:01 am

Read the docs to learn what match() does. It is not what you want: it only matches at the beginning of the string.
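To illustrate the difference, here is a minimal sketch. The sample string "xd00" is made up for the example, and the import fallback is only there so the same lines also run under CPython.

```python
try:
    import ure as re  # MicroPython builds that ship the ure module
except ImportError:
    import re         # fallback for CPython or builds without ure

text = "xd00"  # hypothetical input: the pattern does not fit at index 0

# match() only tries the pattern at the very start of the string
at_start = re.match("[abcd][01][01]", text)

# search() scans forward and returns the first place the pattern fits
anywhere = re.search("[abcd][01][01]", text)

print(at_start)           # None, because 'x' is not in [abcd]
print(anywhere.group(0))  # 'd00'
```

In the original post match() succeeded only because the string happens to start with "d00".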

ure doesn’t seem to have a function to get multiple matches (that would be re.findall() or re.finditer() in CPython), so you’ll have to do that yourself using repeated application of ure.search(). Or you might be able to misuse the function-as-replacement capability of ure.sub() if it’s available on the RP2040 port.
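A rough sketch of the repeated-search() approach might look like the following. The helper name find_all is made up for this example and is not part of ure. It relies on the match object having an end() method, which the dir(r) output in the first post suggests is present on the RP2040 build (the Unix build mentioned above apparently lacks it).

```python
try:
    import ure as re  # MicroPython builds that ship the ure module
except ImportError:
    import re         # fallback for CPython or builds without ure

def find_all(pattern, text):
    """Hypothetical helper: collect non-overlapping matches, left to right."""
    matches = []
    while True:
        m = re.search(pattern, text)
        if m is None:
            break
        matches.append(m.group(0))
        if not text:
            break  # nothing left to scan (guards against empty matches)
        # Drop everything up to the end of this match; advance at least
        # one character so an empty match cannot loop forever.
        text = text[max(m.end(), 1):]
    return matches

print(find_all("[abcd][01][01]", "d00ba11c01d10"))
```

Because the string is sliced on every pass, anchors such as ^ would apply to each remaining slice rather than to the original string, so this is only a reasonable stand-in for findall() with simple patterns like the one here.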
