re.sub regex won't match

General discussions and questions about development of code with MicroPython that is not hardware specific.
Target audience: MicroPython Users.
devnull
Posts: 473
Joined: Sat Jan 07, 2017 1:52 am
Location: Singapore / Cornwall

Post by devnull » Wed Oct 17, 2018 6:44 am

https://regex101.com/r/vouYnV/1

Any idea why my regex won't find & remove this hex value embedded in carriage return / line feeds:

Code:

>>> re.sub(r'^\\r\\n[0-9,a-f]+\\r\\n','','\r\nfe5\r\nabcdefghxxxxx')
'\r\nfe5\r\nabcdefghxxxxx'
On regex101 it matches, but I can't get it to work in uPython!

What it should return is:

Code:

abcdefghxxxxx
This is used to remove 'chunked' data from HTTP 1.1 requests.

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: re.sub regex won't match

Post by jickster » Wed Oct 17, 2018 2:39 pm

Somewhere I read that MicroPython has issues with r-strings; create the string as a non-r-string.

pfalcon
Posts: 1155
Joined: Fri Feb 28, 2014 2:05 pm

Re: re.sub regex won't match

Post by pfalcon » Wed Oct 17, 2018 5:02 pm

MicroPython doesn't have issues with r-strings (unless someone provides a testcase).

As often happens, this case has 2 layers of confusion.

First of all, the regexp is not correct for the purpose stated (can be easily verified in CPython).

Secondly, the ure (sic!) module doesn't implement superfluous escapes that can otherwise be done in Python itself. So r"\n" isn't going to match "\n", but of course "\n" will match "\n".

Module re-pcre from micropython-lib provides (almost) complete regex syntax.
Awesome MicroPython list
Pycopy - A better MicroPython https://github.com/pfalcon/micropython
MicroPython standard library for all ports and forks - https://github.com/pfalcon/micropython-lib
More up to date docs - http://pycopy.readthedocs.io/

devnull
Posts: 473
Joined: Sat Jan 07, 2017 1:52 am
Location: Singapore / Cornwall

Re: re.sub regex won't match

Post by devnull » Wed Oct 17, 2018 11:09 pm

Sorry pfalcon, I don't understand your reply, and also I don't use cpython.

EDITED

Without using 'r' strings, this works in uPython:

Code:

>>> re.sub('^\r\n[0-9,a-f]+\r\n','','\r\nfe5\r\nabcdefghxxxxx')
'abcdefghxxxxx'
>>> 
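For anyone landing here later, the working call above can be wrapped up roughly as follows. This is only a sketch: strip_chunk_header is a made-up helper name, and the comma has been dropped from the character class on the assumption that it was not intended (inside [...] it matches a literal comma). The pattern is a normal string, so Python converts \r and \n to control characters before the regex engine ever sees them.

```python
import re  # on some MicroPython builds the module is named ure instead

def strip_chunk_header(data):
    # Hypothetical helper: remove a leading CRLF + hex chunk size + CRLF.
    # Non-raw pattern: Python itself expands \r and \n, so the regex
    # engine only has to match plain control characters.
    return re.sub('^\r\n[0-9a-fA-F]+\r\n', '', data)

print(repr(strip_chunk_header('\r\nfe5\r\nabcdefghxxxxx')))
```

Data that does not start with a chunk-size line is returned unchanged, since re.sub leaves the string alone when nothing matches.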

efahl
Posts: 15
Joined: Sat Dec 23, 2017 2:02 pm

Re: re.sub regex won't match

Post by efahl » Thu Oct 18, 2018 2:48 pm

Paul is saying that r'\n' and '\n' are completely different strings, which can be verified in any version of python.

r'\n' -> two-character string containing '\' and 'n'.
'\n' -> string containing a single newline character.

This is not a bug in r-strings, it is in fact the reason they exist.
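A quick way to see the difference for yourself, at any Python prompt (the variable names are just for illustration):

```python
raw = r'\n'     # raw string: backslash followed by the letter n
cooked = '\n'   # normal string: Python turns the escape into a newline

print(len(raw), len(cooked))   # 2 1
print(raw == '\\n')            # True: same two characters
print(ord(cooked))             # 10: the newline character
```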

dhylands
Posts: 3821
Joined: Mon Jan 06, 2014 6:08 pm
Location: Peachland, BC, Canada

Re: re.sub regex won't match

Post by dhylands » Thu Oct 18, 2018 4:15 pm

The bug is in the re module: when passed a pattern containing the two characters \ followed by r (i.e. r'\r'), it should match the single carriage return character (i.e. '\r').

i.e. in CPython:

Code:

>>> re.sub(r'a\rb', '', 'aa\rbb')
'ab'
whereas MicroPython (on my pyboard) gives:

Code:

>>> re.sub(r'a\rb', '', 'aa\rbb')
'aa\rbb'
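One way to sidestep the difference is to keep control characters out of raw strings, so the regex engine never has to interpret \r or \n itself. The sketch below was only reasoned through for CPython; it assumes (not checked on every port) that MicroPython's engine accepts a backslash-escaped metacharacter such as \. in the pattern.

```python
import re

# Non-raw pattern: Python expands \r before the regex engine sees it,
# so no regex-level escape handling is needed.
print(repr(re.sub('a\rb', '', 'aa\rbb')))

# Mixing both: raw string for the regex escape (a literal dot),
# normal string for the control characters.
pattern = r'\.' + '\r\n'
print(repr(re.sub(pattern, '', 'end.\r\nnext')))
```

In CPython both calls remove the matched text, giving 'ab' and 'endnext'.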

pfalcon
Posts: 1155
Joined: Fri Feb 28, 2014 2:05 pm

Re: re.sub regex won't match

Post by pfalcon » Sat Oct 20, 2018 9:18 am

devnull wrote:
Wed Oct 17, 2018 11:09 pm
and also I don't use cpython.
Well, you should. The MicroPython Unix port and the reference implementation of the Python language, CPython, should be at the fingertips of every MicroPython programmer; otherwise they complicate their lives unnecessarily, literally looking at the vast (Micro)Python landscape through the narrow tube of a serial connection.

pfalcon
Posts: 1155
Joined: Fri Feb 28, 2014 2:05 pm

Re: re.sub regex won't match

Post by pfalcon » Sat Oct 20, 2018 9:28 am

dhylands wrote:
Thu Oct 18, 2018 4:15 pm
The bug is in the re module
As mentioned above, there's no (builtin) "re" module in MicroPython, only "ure". And "ure" works per its specification (that of being as small as possible and not duplicating functionality already provided by MicroPython, like quoting).

However, there's a bug in some ports, where native MicroPython modules, whose names start with "u", are accessible under a different name (the Unix port, for example, doesn't have this bug, so testing with it would show the problem right away). Yes, it was initially conceived as a convenience, but it has long since become an inconvenience. The rule of thumb is simple: someone developing for MicroPython should always use the "u" modules, exactly as they're named in the docs: http://docs.micropython.org/en/latest/l ... -libraries

The only reason to use a non-"u" module is to write code which is compatible with CPython. That can be called an "advanced topic" given confessions like "I don't use cpython".
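For code meant to run on both, a commonly seen pattern is an import fallback like the sketch below. Which of the two names exists depends on the port and version, so treat this as an assumption about your build rather than a guarantee.

```python
try:
    import ure as re   # MicroPython's minimal regex module, where present
except ImportError:
    import re          # CPython, or builds exposing only the plain name

# Keep patterns to the subset both engines share: control characters
# written as normal-string escapes, not regex-level \r or \n.
print(repr(re.sub('a\rb', '', 'aa\rbb')))
```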


Oh, and the docs for ure module are updated to be more complete: https://pycopy.readthedocs.io/en/latest ... y/ure.html

devnull
Posts: 473
Joined: Sat Jan 07, 2017 1:52 am
Location: Singapore / Cornwall

Re: re.sub regex won't match

Post by devnull » Sat Oct 20, 2018 2:11 pm

@pfalcon - I think you have your wires crossed, my friend!

I have never implied any bug (other posters may have), I was simply asking for help with a regex query.

I'm sorry if asking for this kind of help is not acceptable to you?!

pfalcon
Posts: 1155
Joined: Fri Feb 28, 2014 2:05 pm

Re: re.sub regex won't match

Post by pfalcon » Sat Oct 20, 2018 10:24 pm

devnull, I'm not sure what you're talking about. There are two replies of mine above, one clearly addressed to you, the other to dhylands. Feel free to skip anything not addressed to you (or anything at all).
