flash directory corrupted with \x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00

Post by kurt » Fri Jun 22, 2018 10:20 am

Hi,
I'm working with v1.9.4-198-g25ae98f on an ESP8266.
From time to time the problem in the title occurs.
I'm using asyncio. One task writes log messages to a file named syslog; the messages are generated in other tasks and queued with "uheapq.heappush(logQueue, ...".
Once syslog holds a predefined number of lines, a new file (syslog.1) is created and all the content of syslog except the first line is copied to it. Then syslog is removed using uos.remove('syslog') and syslog.1 is renamed to syslog using uos.rename('syslog.1', 'syslog').
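
The queueing half of this setup might look roughly like the sketch below. It is not the original program: the entry layout (sequence number, message), the helper names log and flush_log, and the plain non-async flush are assumptions made for illustration. Only the heappush call on logQueue and the file name syslog come from the description. On the ESP8266 the module is uheapq; the import falls back to heapq so the sketch also runs on desktop Python.

```python
try:
    import uheapq as heapq   # MicroPython 1.9.x module name
except ImportError:
    import heapq             # CPython fallback

logQueue = []   # shared heap; other tasks push entries into it
_seq = 0        # hypothetical counter that keeps entries in arrival order


def log(message):
    """Queue one message (may be called from any task)."""
    global _seq
    _seq += 1
    heapq.heappush(logQueue, (_seq, message))


def flush_log(path='syslog'):
    """Append all queued messages to the file, oldest first."""
    written = 0
    with open(path, 'a') as f:
        while logQueue:
            _, message = heapq.heappop(logQueue)
            f.write(message + '\n')
            written += 1
    return written   # the file has been closed again at this point
```

Because the file is opened in a with block, no handle to syslog stays open between flushes, which matters for the remove and rename steps that follow.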

The following error message is shown in the REPL right after printing "nach rename" (German for "after rename"):
Fatal exception 0(IllegalInstructionCause):
epc1=0x402773b4, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000

The flash directory shows:

\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00
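
One way to check for this symptom from the REPL is to look for NUL characters in the names returned by listdir(). The helper name corrupted_entries is made up for this sketch, and it assumes the damaged entries come back as strings containing \x00, as in the listing above.

```python
try:
    import uos as os   # MicroPython
except ImportError:
    import os          # CPython fallback


def corrupted_entries(names):
    """Return the directory entries whose names contain a NUL character."""
    return [name for name in names if '\x00' in name]


# One healthy name next to an entry like the ones in the listing above.
sample = ['syslog', '\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00']
print(corrupted_entries(sample))

# On the board: print(corrupted_entries(os.listdir()))
```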

The instructions I use:
...
# wrap around syslog file if lines exceeds config.syslog_lines
try:
    lines = 0
    for line in open('syslog', 'r'):
        lines += 1
    if lines >= config.SYSLOG_LINES:
        with open('syslog', 'r') as f:
            for i in range(lines - config.SYSLOG_LINES + 1):
                line = f.readline()
            with open('syslog.1', 'w') as output:
                for line in f:
                    output.write(line)
        print('vor remove')
        uos.remove('syslog')
        await asyncio.sleep(0.5)
        print('vor rename')
        uos.rename('syslog.1', 'syslog')
        print('nach rename')
except OSError: # file not found
    print('syslog not found')
    pass
...

I inserted "await asyncio.sleep(0.5)" to try to solve the problem, but unfortunately it did not help.

Any help is highly welcome
Kurt

Post by kevinkk525 » Fri Jun 22, 2018 8:51 pm

I sometimes get that problem when my unit reboots too often; it occasionally goes crazy when I connect it to my PC and just keeps booting over and over again.
It shouldn't be caused by your script, I think.
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

Post by pythoncoder » Sat Jun 23, 2018 6:05 am

@kurt I don't really understand your code.

Code: Select all

for line in open('syslog', 'r'):
    lines += 1
I'm not clear what that's supposed to do. More worryingly you've opened the file here (before opening it again in the context manager). I think the CM will open (and implicitly close) a second file handle, but the first will still be open. If I'm right you're then renaming an open file.

However this should raise an exception rather than cause a crash so you may have uncovered a MicroPython bug here.
[EDIT]
I tried renaming an open file on a Pyboard and it neither raised an exception nor crashed: the rename worked and the file handle continued to work too.
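
The exact test is not shown here. A sketch of such an experiment could look like the following; the function name and the file names rename_test.txt and rename_test.new are made up, and it assumes a filesystem that permits renaming an open file (the Pyboard behaved this way, as do POSIX systems).

```python
try:
    import uos as os   # MicroPython
except ImportError:
    import os          # CPython fallback


def rename_while_open(old='rename_test.txt', new='rename_test.new'):
    """Rename a file while a read handle is open, then keep reading."""
    with open(old, 'w') as f:
        f.write('first\nsecond\n')

    f = open(old, 'r')
    try:
        first = f.readline()
        os.rename(old, new)   # the file is still open at this point
        rest = f.readline()   # does the old handle still work?
    finally:
        f.close()
    os.remove(new)            # clean up the test file
    return first, rest
```

Calling rename_while_open() at the REPL returns both lines if the handle survived the rename; an OSError would point to a filesystem that refuses the operation.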

In practice the only way to track down a crash is progressively to reduce the program to a minimal test case which causes the crash to occur.
Peter Hinch
Index to my micropython libraries.

Post by kurt » Sat Jun 23, 2018 9:01 am

Hi,
many thanks for the answers.
After changing to:
try:
    lines = 0
    with open('syslog', 'r') as f:
        for line in f:
            lines += 1
    print('lines read: ', lines)
    if lines >= config.SYSLOG_LINES:
        with open('syslog', 'r') as f:
            for i in range(lines - config.SYSLOG_LINES + 1):
                line = f.readline()
            with open('syslog.1', 'w') as output:
                for line in f:
                    output.write(line)
        print('vor remove')
        uos.remove('syslog')
        #await asyncio.sleep(0.5)
        print('vor rename')
        uos.rename('syslog.1', 'syslog')
        print('nach rename')
except OSError: # file not found
    print('syslog not found')
    pass
Unfortunately, the same error occurs.
It seems the error only happens when the wifi connection is established. I cannot see what the relation between the two could be.

Many thanks for your help
Kurt

Update:
Free memory seems not to be the problem:
noumber lines in file: 10
mem_free before remove: 13616
mem_free before rename: 13040
mem_free after rename: 13040
noumber lines in file: 10
mem_free before remove: 15072
Fatal exception 0(IllegalInstructionCause):
epc1=0x402773b4, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000
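
The figures above look like readings of gc.mem_free(); how they were taken is not shown, so the following is only a guess at the instrumentation. The helper name report_free is invented here, and the getattr fallback exists solely so that the sketch also runs on desktop Python, where gc.mem_free() is not available.

```python
import gc


def report_free(label):
    """Print the free heap size with a label and return it (None if unknown)."""
    mem_free = getattr(gc, 'mem_free', None)   # present on MicroPython only
    free = mem_free() if mem_free else None
    print('mem_free ' + label + ':', free)
    return free


# Hypothetical placement around the two filesystem calls:
# report_free('before remove'); uos.remove('syslog')
# report_free('before rename'); uos.rename('syslog.1', 'syslog')
# report_free('after rename')
```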

Post by kurt » Sun Jun 24, 2018 6:07 am

Hi,
the problem can be reproduced by flashing the ESP8266 with the latest MicroPython build from the MicroPython download page:
esp8266-20180511-v1.9.4.bin (elf, map) (latest)

using this small sketch:

Code: Select all

import network
import uos  # needed for remove() and rename() below

wlan_sta = network.WLAN(network.STA_IF)
wlan_sta.active(True)
wlan_sta.connect('SSID','PASSWORD')

while not wlan_sta.isconnected():
    pass

while True:
    try:
        lines = 0
        with open('syslog', 'r') as f1:
            for line in f1:
                lines += 1
        print('number lines in syslog: ', lines)
        if lines >= 10:
            with open('syslog', 'r') as f2:
                for i in range(lines - 10 + 1):
                    line = f2.readline()
                with open('syslog.1', 'w') as output:
                    for line in f2:
                        output.write(line)
            uos.remove('syslog')
            uos.rename('syslog.1', 'syslog')
    except OSError:     # syslog not found
        pass

    with open('syslog', 'a') as f3:
        f3.write('testdata\n')
Let the sketch run until "number lines in syslog: 10" has been shown more than once, then interrupt it with ctrl-C.

The error does not occur every time. After some attempts, I get:
...
number lines in syslog: 10
number lines in syslog: 10
number lines in syslog: 10

ets Jan 8 2013,rst cause:2, boot mode:(3,6)

load 0x40100000, len 31108, room 16
tail 4
chksum 0x28
load 0x3ffe8000, len 1100, room 4
tail 8
chksum 0x4e
load 0x3ffe8450, len 3268, room 0
tail 4
chksum 0x09
csum 0x09
(unreadable serial output)
ets_task(40100130, 3, 3fff83ec, 4)
OSError: [Errno 2] ENOENT
OSError: [Errno 2] ENOENT

MicroPython v1.9.4-8-ga9a3caad0 on 2018-05-11; ESP module with ESP8266
Type "help()" for more information.
>>>

and the directory is corrupted as mentioned in the first post.

Many thanks for any help.
Kurt

Post by pythoncoder » Mon Jun 25, 2018 7:30 am

Your code ran indefinitely on my hardware (the Adafruit Feather Huzzah reference board). If I arrange a counter so it runs N times before quitting it seems reliable.
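
A bounded variant along those lines might look like this. It is a sketch rather than the exact code that was run: the function name run_cycles and its parameters are assumptions, the wifi setup is left out, and the import falls back to os so it can also be tried on desktop Python.

```python
try:
    import uos as os   # MicroPython
except ImportError:
    import os          # CPython fallback


def run_cycles(n, max_lines=10, path='syslog'):
    """Run the append/rotate loop n times; return the last line count seen."""
    tmp = path + '.1'
    lines = 0
    for _ in range(n):
        try:
            lines = 0
            with open(path, 'r') as f1:
                for _line in f1:
                    lines += 1
            if lines >= max_lines:
                with open(path, 'r') as f2:
                    # skip the oldest lines so that max_lines - 1 remain
                    for _i in range(lines - max_lines + 1):
                        f2.readline()
                    with open(tmp, 'w') as output:
                        for line in f2:
                            output.write(line)
                # both files are closed before remove and rename
                os.remove(path)
                os.rename(tmp, path)
        except OSError:   # syslog does not exist yet
            pass
        with open(path, 'a') as f3:
            f3.write('testdata\n')
    return lines
```

Since the loop ends on its own, no ctrl-C is needed to stop it.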

However, interrupting your code with ctrl-C sometimes causes an 'illegal instruction' crash and a corrupted filesystem. This seems to be a minimal test case. First I create a file with this script:

Code: Select all

with open('syslog', 'w') as f3:
    for _ in range(10):
        f3.write('testdata\n')
Then I reboot and run this:

Code: Select all

while True:
    with open('syslog', 'r') as f2:
        with open('syslog.1', 'w') as output:
            for line in f2:
                output.write(line)
When I interrupt this with ctrl-C I get an illegal instruction crash. I have raised an issue.
Peter Hinch
Index to my micropython libraries.

Post by pythoncoder » Mon Jun 25, 2018 8:53 am

There is a separate issue: with recent builds I have seen corrupted filesystems with a lot of zeros immediately after a flash erase and install. This was fixed by using the very latest firmware, so I suggest you do likewise. Alas new firmware doesn't fix the above issue.
Peter Hinch
Index to my micropython libraries.

Post by kurt » Mon Jun 25, 2018 11:38 am

Hi Peter,
many thanks for your effort to help me.
Unfortunately, there is the same crash as before.
I flashed the latest daily build downloaded from http://micropython.org/download:
esp8266-20180625-v1.9.4-199-g6fc84a74.bin
I used uPyLoader to transfer your sketch to the flash:

Code: Select all

with open('syslog', 'w') as f3:
    for _ in range(10):
        f3.write('testdata\n')

while True:
    with open('syslog', 'r') as f2:
        with open('syslog.1', 'w') as output:
            for line in f2:
                output.write(line)
After a short while the following is shown, without pressing ctrl-C:

Fatal exception 0(IllegalInstructionCause):
epc1=0x40276cb4, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000

ets Jan 8 2013,rst cause:2, boot mode:(3,6)

load 0x40100000, len 31048, room 16
tail 8
chksum 0xb6
load 0x3ffe8000, len 1092, room 0
tail 4
chksum 0xf2
load 0x3ffe8450, len 3252, room 4
tail 0
chksum 0x9b
csum 0x9b
(unreadable serial output)
SError: [Errno 2] ENOENT

MicroPython v1.9.4-199-g6fc84a74 on 2018-06-25; ESP module with ESP8266
Type "help()" for more information.
>>>

The flash directory is also corrupted as described above.
I don't know what could be wrong on my side.
My hardware is a similar version of the following:
https://www.exp-tech.de/en/platforms/in ... ed-esp8266

Regards
Kurt

Post by pythoncoder » Tue Jun 26, 2018 8:03 am

If you revisit the issue I raised (and referenced above) you'll see that @Damien has acknowledged and diagnosed the bug. I'm sure it will be fixed shortly.
Peter Hinch
Index to my micropython libraries.

Post by kurt » Sat Jun 30, 2018 10:07 am

Hi Peter,
I saw dpgeorge's fix in your issue https://github.com/micropython/micropython/issues/3897 and tested it using esp8266-20180630-v1.9.4-227-gab02abe9.bin with this code:

Code: Select all

with open('syslog', 'w') as f3:
    for _ in range(10):
        f3.write('testdata\n')

n = 0
while True:
    with open('syslog', 'r') as f2:
        with open('syslog.1', 'w') as output:
            for line in f2:
                output.write(line)
                n += 1
                if n >= 20:
                    raise KeyboardInterrupt
                    
which runs fine, without destroying the file directory.

But, if I run

Code: Select all

with open('syslog', 'w') as f3:
    for _ in range(10):
        f3.write('testdata\n')
one time, do a hard reset, and then run this

Code: Select all

while True:
    with open('syslog', 'r') as f2:
        with open('syslog.1', 'w') as output:
            for line in f2:
                output.write(line)
without pressing ctrl-C, after a while I get:

Fatal exception 0(IllegalInstructionCause):
epc1=0x40276fb4, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000

ets Jan 8 2013,rst cause:2, boot mode:(3,7)

load 0x40100000, len 30904, room 16
tail 8
chksum 0x74
load 0x3ffe8000, len 1092, room 0
tail 4
chksum 0xdd
load 0x3ffe8450, len 3252, room 4
tail 0
chksum 0x94
csum 0x94
(unreadable serial output)
OSError: [Errno 2] ENOENT

MicroPython v1.9.4-227-gab02abe9 on 2018-06-30; ESP module with ESP8266
Type "help()" for more information.
>>>

Unfortunately, my problem is not yet fixed.

Regards
Kurt
