
Invalid encoding for ISO-8859-1 (extended ASCII including Latin1) under MicroPython?

Posted: Sat Oct 03, 2020 12:18 pm
by MCHobby
Hi,
I wanted to convert a string to Latin-1 (ISO-8859-1, a kind of extended ASCII containing accented characters such as é and è).
I got a weird result when encoding a string with ISO-8859-1 under MicroPython. It seems to fail... or maybe I'm wrong!
So I wrote a small script with Atom on my Linux machine, so it should be saved as UTF-8 (it can be downloaded here from GitHub).

Code:

#!/usr/local/bin/python3
# -*- coding: UTF-8 -*-

s = 'é' # Latin1 - Small letter e with acute accent - decimal 233

print(s[0])       # é should be displayed properly in the Python output
print(ord(s[0]))  # should display 233

data = s.encode('ISO-8859-1')  # Extended ASCII for Latin-1
print(data)     # should contain only 1 byte, with value 233
print(data[0])  # should display 233

Test with Python 3 --> OK

When I run this script from the Linux command prompt, I get the expected result:
a bytes() object containing a single byte (the Latin-1 code 233).

Code:

$ python3 encoding.py
é
233
b'\xe9'
233

Test with MicroPython (1.10) --> FAILS

When I transfer the file to my Pyboard (with RShell) and run the same test from the REPL, I get the following result, which is invalid.

Code:

MicroPython v1.10 on 2019-01-25; PYBv1.1 with STM32F405RG
Type "help()" for more information.
>>>
>>> import encoding
é
233
b'\xc3\xa9'
195

This time, the encoded result is 2 bytes long, which is obviously invalid for ISO-8859-1.

Am I wrong, or is MicroPython wrong?

Re: Invalid encoding for ISO-8859-1 (extended ASCII including Latin1) under MicroPython?

Posted: Sat Oct 03, 2020 9:45 pm
by scruss
You're running quite an old version of MicroPython there, but even running on the current stable version (1.13) it seems that UTF-8 is the only codec around. Specifying ISO-8859-1 doesn't seem to make any difference.
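
For reference, the two bytes in the Pyboard output are exactly the UTF-8 encoding of é, which fits with the codec name making no difference. A quick check, meant to be run under CPython (where both codecs exist); the variable names are just for illustration:

```python
s = '\u00e9'  # é, code point 233 (written as an escape to avoid source-encoding issues)

utf8 = s.encode('utf-8')         # what the Pyboard returned
latin1 = s.encode('iso-8859-1')  # what was expected

print(utf8)     # b'\xc3\xa9'
print(latin1)   # b'\xe9'
print(utf8[0])  # 195, the first byte of the 2-byte UTF-8 sequence
```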

Re: Invalid encoding for ISO-8859-1 (extended ASCII including Latin1) under MicroPython?

Posted: Tue Oct 20, 2020 7:09 am
by jimmo
MCHobby wrote:
Sat Oct 03, 2020 12:18 pm
Am I wrong, or is MicroPython wrong?
MicroPython only supports UTF-8.

s.encode(...) will always encode as UTF-8. (I'm fairly sure the argument is ignored.)
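
If Latin-1 bytes are still needed, one possible workaround is to build them by hand. ISO-8859-1 maps code points 0 to 255 directly onto byte values, so ord() of each character is already the byte to emit. The sketch below uses a hypothetical helper name, encode_latin1 (not part of MicroPython), and has only been reasoned through for CPython. Since the Pyboard output above shows ord() returning 233 for é, it should behave the same way there, but that is an assumption.

```python
def encode_latin1(text):
    """Hypothetical helper: encode text as ISO-8859-1 without using a codec."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 0xFF:
            # Code points above 255 have no ISO-8859-1 representation
            raise ValueError('character %r is not representable in ISO-8859-1' % ch)
        out.append(code)
    return bytes(out)


data = encode_latin1('\u00e9')  # é
print(data)     # b'\xe9'
print(data[0])  # 233
```

Characters outside Latin-1 (for example the euro sign) raise ValueError here rather than being silently replaced; whether that is the right policy depends on the application.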