ESP8266 gets corrupted after a few hours of running

jeanl
Posts: 6
Joined: Sun Dec 20, 2020 9:36 pm

Post by jeanl » Wed Dec 30, 2020 8:59 pm

Hi guys!
I'm new to MicroPython, but not to Python or the Raspberry Pi, both of which I've used extensively.
I'm trying to use an ESP8266 board to create a remote temperature/humidity sensor in the back room of our house. I flashed it with esp8266-20200911-v1.13.bin.

I'm using an SHT30 I2C temperature sensor. The board measures the temperature every 2 seconds using a timer and prints it out (using print(), for debugging).
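For reference, a minimal sketch of how a raw SHT30 reading can be checked and converted. It assumes the sensor's standard 6-byte measurement frame (temperature MSB, LSB, CRC, then humidity MSB, LSB, CRC) and the conversion formulas from the Sensirion datasheet. The names `crc8` and `decode_sht30` are made up for illustration, and the I2C read itself is not shown.

```python
def crc8(data):
    """CRC-8 as used by the SHT3x family: polynomial 0x31, initial value 0xFF."""
    crc = 0xFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x31) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def decode_sht30(frame):
    """Convert a 6-byte frame to (temperature in deg C, relative humidity in %)."""
    if len(frame) != 6:
        raise ValueError("expected 6 bytes")
    # Each 16-bit word is followed by its own checksum byte.
    if crc8(frame[0:2]) != frame[2] or crc8(frame[3:5]) != frame[5]:
        raise ValueError("CRC mismatch")
    raw_t = (frame[0] << 8) | frame[1]
    raw_rh = (frame[3] << 8) | frame[4]
    temperature = -45 + 175 * raw_t / 65535
    humidity = 100 * raw_rh / 65535
    return temperature, humidity
```

Rejecting frames with a bad CRC makes it easier to tell bus glitches apart from genuine readings when a board misbehaves after hours of running.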

Things work very well for quite a while, but when I leave the board running overnight (connected to my PC so I can monitor the prints), I often come back to find the system so heavily corrupted that the only thing I can do is re-flash the board.
Specifically, the board appears to spontaneously reboot at some point, and the last messages are typically these:

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)
load 0x40100000, len 30768, room 16
tail 0
chksum 0xc4
load 0x3ffe8000, len 1024, room 8
tail 8
chksum 0xd8
load 0x3ffe8400, len 1080, room 0
tail 8
chksum 0xc4
csum 0xc4
The reset cause is 2, yet I didn't press any button and there is no reset() call in my program. There was no brownout or blackout either: my computer is just as I left it, with the board still connected to it.
At this point the board needs a reflash; nothing else works.
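When collecting logs overnight, it can help to pull the reset cause out of the boot ROM banner automatically. The sketch below does that for a line like the one above. The lookup table reflects the meanings commonly documented for the ESP8266 boot ROM and should be treated as an assumption; `RST_CAUSES` and `parse_rst_cause` are illustrative names.

```python
# Meanings as commonly documented for the ESP8266 boot ROM (assumption, not
# taken from this thread). Codes not listed are reported as "unknown".
RST_CAUSES = {
    1: "power-on",
    2: "external reset pin (also reported on wake from deep sleep)",
    4: "hardware watchdog",
}


def parse_rst_cause(boot_line):
    """Return (code, description) from a boot ROM line, or None if absent."""
    marker = "rst cause:"
    start = boot_line.find(marker)
    if start < 0:
        return None
    digits = ""
    for ch in boot_line[start + len(marker):]:
        if ch.isdigit():
            digits += ch
        else:
            break
    if not digits:
        return None
    code = int(digits)
    return code, RST_CAUSES.get(code, "unknown")


print(parse_rst_cause(" ets Jan  8 2013,rst cause:2, boot mode:(3,6)"))
```

A cause of 2 with nobody touching the reset button is consistent with something electrical toggling the reset line, which is one reason to look at wiring and power before software.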

I started out using FAT32, then switched to littlefs, thinking it would be more robust, but I get the same issue.

Does this ring a bell for anybody?
I've already pruned a lot of the code to try to pinpoint where the issue might come from (I had code that responded to http requests, but removed it). I can do more pruning but before I dig deeper I wanted to ask around.

Thanks in advance!
Jean

pythoncoder
Posts: 4787
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Post by pythoncoder » Thu Dec 31, 2020 6:47 am

You might like to read this guide first.

I have two libraries which aim to support long term communication with ESP8266: micropython-iot and resilient asynchronous MQTT.
Peter Hinch

jeanl
Posts: 6
Joined: Sun Dec 20, 2020 9:36 pm

Post by jeanl » Thu Dec 31, 2020 4:54 pm

pythoncoder wrote:
Thu Dec 31, 2020 6:47 am
You might like to read this guide first.

I have two libraries which aim to support long term communication with ESP8266: micropython-iot and resilient asynchronous MQTT.
Ah thanks, that's helpful, but my problems are not related to WiFi as far as I can tell (though I like your investigation, and I've also found that you have to work fairly hard to make the WiFi reliable, including using timeouts etc.).
Since my first message I've simplified my code to the point where all I have is one loop that reads the temperature and a timer that also attempts to read the temperature (and no HTTP code at all). This crashes the board reliably, to the point where it has to be reflashed and all files are lost.

So I conclude that accessing the I2C interface from both a while loop and a timer creates lots of problems! I wonder if that's a known issue. Now that I think of it, I can see why it would upset the I2C interface, but I don't see why it would erase the filesystem!
jean
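One common way to avoid two code paths touching the same bus is to let the timer callback do nothing but raise a flag, and have the main loop perform the actual read. The sketch below shows that pattern in plain Python so it can be tried off-board. `Sampler` and the placeholder reading are hypothetical; on the ESP8266 the `tick` method would be handed to `machine.Timer` as its callback.

```python
class Sampler:
    """The timer callback only raises a flag; the main loop owns the bus."""

    def __init__(self, read_sensor):
        self._read_sensor = read_sensor  # the only code path that touches I2C
        self._due = False

    def tick(self, timer=None):
        # Timer callback: no I2C traffic here, just record that a sample is due.
        self._due = True

    def poll(self):
        # Called from the main loop; returns a reading if one is due, else None.
        if not self._due:
            return None
        self._due = False
        return self._read_sensor()


# On the board this would be wired up roughly as (not run here):
#   from machine import Timer
#   Timer(-1).init(period=2000, mode=Timer.PERIODIC, callback=sampler.tick)
sampler = Sampler(lambda: 21.5)  # placeholder reading instead of a real sensor
sampler.tick()
print(sampler.poll())
print(sampler.poll())
```

With this arrangement an I2C transaction started by the loop can never be interrupted halfway by a second transaction started from the callback.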

jeanl
Posts: 6
Joined: Sun Dec 20, 2020 9:36 pm

Post by jeanl » Fri Jan 01, 2021 3:19 am

Well, it's not even that... Even without timers and without HTTP, the board gets seriously corrupted. The code is still running and I can see the printouts, but the filesystem is gone, and if you reset, the board just sits there...
I'm not sure whether it's just me, but I've tried 2 ESP8266 boards so far (different types of boards) running MicroPython and I can't get either of them to work reliably. One uses a one-wire temperature sensor, the other an I2C one; both crash after a few hours.

pythoncoder
Posts: 4787
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Post by pythoncoder » Fri Jan 01, 2021 5:41 am

In my view this is highly unusual. As you may have gathered, working with another user, we have spent a great deal of time trying to achieve long term reliability of ESPx applications. While ESP8266 crashes do occur, typically after a few days of continuous WiFi operation, they do not take out the filesystem.

I would strongly suspect power problems, which are commonplace. Wall warts vary greatly in quality. USB cables can suffer voltage drops if the current increases (e.g. when WiFi operates). You need a short, thick USB lead and a wall wart designed for continuous operation rather than just for charging phones. Official Raspberry Pi wall warts are good in my experience.
Peter Hinch

jeanl
Posts: 6
Joined: Sun Dec 20, 2020 9:36 pm

Post by jeanl » Fri Jan 01, 2021 5:06 pm

OK, that's good to know. I'll do some more experiments. In many of my tests the boards were connected to my PC (which has a decent power supply, definitely able to support a chip like that) with 3-foot cables. But who knows, it could indeed be that. What's strange is that power supply problems like that would take out the filesystem (littlefs) so systematically. But maybe that's what it is.
I'm planning to create a short, reproducible example and see if other people can confirm the issue.

Thanks
J.

davef
Posts: 134
Joined: Thu Apr 30, 2020 1:03 am
Location: Christchurch, NZ

Post by davef » Fri Jan 01, 2021 8:12 pm

Looking at the 3V3 rail with a DSO, even a cheap unit like a DSO138, was enough to show the "dips" in supply voltage that were causing my problems.

On a USB to UART converter board I had to place an extra 1000uF on the output to get reliable operation. I also found that a LM317N linear regulator (typical current limit 200mA) couldn't hack it either.
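A rough back-of-envelope sketch of why a large capacitor on the output helps: if the capacitor alone has to supply a short current burst, the voltage sags by roughly dV = I * dt / C. The 300 mA burst and 1 ms duration below are illustrative assumptions, not measurements from this thread, and `droop_volts` is a made-up helper name.

```python
def droop_volts(current_a, duration_s, capacitance_f):
    """Voltage drop of a capacitor supplying a constant current: dV = I * dt / C."""
    return current_a * duration_s / capacitance_f


# Illustrative numbers only: a 300 mA burst lasting 1 ms.
for cap_uf in (10, 100, 1000):
    dv = droop_volts(0.3, 0.001, cap_uf * 1e-6)
    print(cap_uf, "uF ->", round(dv, 2), "V")
```

In practice the regulator supplies part of the burst as well, so real dips are smaller than this worst case, but the scaling with capacitance is the point.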

jeanl
Posts: 6
Joined: Sun Dec 20, 2020 9:36 pm

Post by jeanl » Sat Jan 09, 2021 11:43 pm

Thanks for the responses and sorry for the delay (I'm not getting notifications when new posts appear, not sure why). In any case, minimizing my code to get a reproducible failure turned out not to be that easy! My test .py file no longer crashes, which is a good problem to have!
Occasionally I lose connectivity (even though I restart the WiFi and reconnect when the board detects it can't reach the gateway), but I no longer lose the filesystem! :D
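A minimal sketch of the "restart the WiFi and reconnect" step with a bounded wait, so that a failed reconnect cannot block the main loop forever. It assumes `wlan` behaves like MicroPython's `network.WLAN(network.STA_IF)` object (`active`, `connect`, `isconnected`, `disconnect`). The function name, the attempt count and the poll interval are illustrative choices, not taken from the original code.

```python
import time


def ensure_connected(wlan, ssid, password, attempts=20, poll_s=0.5, sleep=time.sleep):
    """Return True once wlan reports a connection, False if the attempts run out."""
    if wlan.isconnected():
        return True
    # Restart the interface so a stale association does not linger.
    wlan.active(False)
    wlan.active(True)
    wlan.connect(ssid, password)
    for _ in range(attempts):
        if wlan.isconnected():
            return True
        sleep(poll_s)
    wlan.disconnect()  # give up cleanly; the caller decides when to retry
    return False
```

Returning False instead of looping indefinitely lets the caller keep sampling the sensor and try again later.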

@Peter: I took a look at your resilient libraries, but they require me to switch away from regular http requests, right? In my case the 8266 board acts as a server and a client queries it regularly using http requests. If I were to use your libraries, I would have to change the client code to use MQTT instead, is that correct?

pythoncoder
Posts: 4787
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Post by pythoncoder » Sun Jan 10, 2021 6:52 am

jeanl wrote:
Sat Jan 09, 2021 11:43 pm
...
@Peter: I took a look at your resilient libraries, but they require me to switch away from regular http requests, right? In my case the 8266 board acts as a server and a client queries it regularly using http requests. If I were to use your libraries, I would have to change the client code to use MQTT instead, is that correct?
Yes. The async MQTT library supports only MQTT. The IOT library supports a simple resilient, asynchronous socket-like object that communicates with a server module. The latter can run on something very minimal such as a Raspberry Pi. The server module can communicate with the internet using HTTP or any other protocol.

We encountered significant difficulties making these modules resilient in the face of outages. The official MQTT modules lack the necessary code and mileage may vary with HTTP frameworks (I haven't studied all the available offerings). The fundamental problem is that, on a PC, the OS does an amazing job of hiding the fact that wireless communication is inherently unreliable. Web programmers therefore tend to work on the assumption that the TCP/IP guarantees will be met. Where there is no OS, this is not the case: the link will suffer outages which tend to break protocols. You have to code round this. My libraries aim to save users from the trouble.
Peter Hinch
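To illustrate "coding round" outages in the simplest possible form, here is a generic retry helper. This is only a sketch and not how the libraries mentioned above work internally. It assumes that a failed network operation surfaces as an `OSError`, as socket errors generally do in MicroPython. `call_with_retries` and its parameters are hypothetical names.

```python
import time


def call_with_retries(operation, retries=3, base_delay_s=1.0, sleep=time.sleep):
    """Call operation(); on OSError wait and retry, doubling the delay each time."""
    delay = base_delay_s
    for attempt in range(retries + 1):
        try:
            return operation()
        except OSError:
            if attempt == retries:
                raise  # out of retries: let the caller see the failure
            sleep(delay)
            delay *= 2
```

A real application would typically also re-establish the WiFi link and reopen sockets between attempts; the helper only shows the control flow.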

jeanl
Posts: 6
Joined: Sun Dec 20, 2020 9:36 pm

Post by jeanl » Sun Jan 10, 2021 6:37 pm

Cool! Thanks a bunch.