Recovering from failure
My ESP8266 seems to hang after a couple of days. I can't reproduce the issue unless I put the device into use (i.e. where I cannot hook a computer up to the serial port). Is there any way to get a crash log? The problem doesn't seem to be Wi-Fi related, as the device reconnects if I restart my router. Is it possible to have an external circuit that monitors for failure (e.g. via a keep-alive) and automatically resets the device?
-
- Posts: 969
- Joined: Sat Feb 03, 2018 7:02 pm
Re: Recovering from failure
I have a similar problem in this thread: viewtopic.php?f=16&t=4706&start=20
My ESP8266 hangs after 1-2 weeks for 1 h 10 min, then recovers.
You could try a watchdog implementation (software WDT only, driven by timer interrupts, as MicroPython on the ESP8266 does not offer a hardware WDT with a configurable timeout).
You can easily adapt the code below to your environment by removing the logging and pysmartnode references (and asyncio if you don't use it):
Code: Select all
import gc
import uasyncio as asyncio
import machine
from pysmartnode.utils import sys_vars

gc.collect()
from pysmartnode import logging

log = logging.getLogger("WDT")


class WDT:
    def __init__(self, id=0, timeout=120):
        # The timer fires every timeout/10 seconds; ten unfed ticks trigger a reset.
        self._timeout = timeout / 10
        self._counter = 0
        self._timer = machine.Timer(id)
        self.init()
        asyncio.get_event_loop().create_task(self._resetCounter())
        if sys_vars.hasFilesystem():
            # Report whether the previous reset was caused by the watchdog.
            try:
                with open("watchdog.txt", "r") as f:
                    if f.read() == "True":
                        log.warn("Reset reason: Watchdog")
            except Exception as e:
                print(e)  # file probably just does not exist
            # Clear the flag for the current run.
            try:
                with open("watchdog.txt", "w") as f:
                    f.write("False")
            except Exception as e:
                log.error("Error saving to file: {!s}".format(e))

    def _wdt(self, t):
        # Timer callback: count the time elapsed since the last feed().
        self._counter += self._timeout
        if self._counter >= self._timeout * 10:
            if sys_vars.hasFilesystem():
                try:
                    with open("watchdog.txt", "w") as f:
                        f.write("True")
                except Exception as e:
                    print("Error saving to file: {!s}".format(e))
            machine.reset()

    def feed(self):
        self._counter = 0

    def init(self, timeout=None):
        timeout = timeout or self._timeout
        self._timeout = timeout
        self._timer.init(period=int(self._timeout * 1000), mode=machine.Timer.PERIODIC, callback=self._wdt)

    def deinit(self):  # will not stop coroutine
        self._timer.deinit()

    async def _resetCounter(self):
        # Feeds the watchdog for as long as the uasyncio event loop keeps running.
        while True:
            await asyncio.sleep(self._timeout)
            self.feed()
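As a rough illustration of that adaptation (no logging, no pysmartnode, no asyncio), here is a minimal sketch of the same counter idea. The names `SimpleWDT`, `tick` and the injected `reset` callable are made up for this example and are not part of pysmartnode or MicroPython. Passing the reset function in as an argument is only there so the logic can be tried on a PC without the `machine` module.

```python
class SimpleWDT:
    """Counter-based software watchdog (illustrative sketch, hypothetical names)."""

    def __init__(self, reset, ticks=10):
        self._reset = reset   # callable that reboots the board, e.g. machine.reset
        self._ticks = ticks   # number of unfed timer ticks tolerated before a reset
        self._count = 0

    def tick(self, timer=None):
        # Meant to be used as the periodic timer callback; the timer argument is ignored.
        self._count += 1
        if self._count >= self._ticks:
            self._reset()

    def feed(self):
        # Call this regularly from the main loop to show that it is still running.
        self._count = 0


# Assumed wiring on an ESP8266 (not executed here):
#   import machine
#   wdt = SimpleWDT(machine.reset, ticks=10)
#   machine.Timer(-1).init(period=12000, mode=machine.Timer.PERIODIC, callback=wdt.tick)
#   while True:
#       do_work()      # hypothetical application code
#       wdt.feed()
```

With a 12 s period and 10 ticks this would reset the board after roughly 120 s without a `feed()`, matching the default timeout of the class above. Like any software watchdog, it only helps while the timer callbacks are still being serviced.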
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode
- pythoncoder
- Posts: 5956
- Joined: Fri Jul 18, 2014 8:01 am
- Location: UK
Re: Recovering from failure
For a software watchdog with uasyncio you could look at this module, which has a Delay_ms class. It runs a callback if it is not retriggered before its timeout expires.
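To show the retriggerable-delay idea in isolation, here is a minimal sketch written against standard asyncio. It is not the Delay_ms class from the module and does not reproduce its API. `RetriggerableDelay` and its methods are names invented for this example, and `time.monotonic()` is used for simplicity (on MicroPython you would presumably use `time.ticks_ms()` and `time.ticks_diff()` instead).

```python
import asyncio
import time


class RetriggerableDelay:
    """Runs callback once if trigger() is not called again within duration_ms."""

    def __init__(self, callback, duration_ms):
        self._callback = callback
        self._duration = duration_ms / 1000
        self._deadline = None
        self._task = None

    def trigger(self):
        # (Re)start the countdown; must be called while the event loop is running.
        self._deadline = time.monotonic() + self._duration
        if self._task is None:
            self._task = asyncio.create_task(self._run())

    async def _run(self):
        # Sleep until the deadline, which trigger() may have moved in the meantime.
        while True:
            remaining = self._deadline - time.monotonic()
            if remaining <= 0:
                break
            await asyncio.sleep(remaining)
        self._task = None
        self._callback()
```

Used as a watchdog, the callback would be whatever recovery action you want (for instance a reset), and the application calls `trigger()` wherever it would otherwise feed the watchdog.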
The ultimate solution is a hardware watchdog. A retriggerable monostable is repeatedly retriggered by a software-generated pulse on a pin, and it activates the hardware reset if it times out. The key advantage is that the system will recover even from a total crash where the CPU stops running code.
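On the software side, retriggering the monostable only needs a short pulse on an output pin. A minimal sketch, assuming an active-high pulse and a pin object with a `value(x)` method as `machine.Pin` provides. The function name `kick`, the `width_ms` parameter and the GPIO number in the comment are hypothetical, and the required pulse width and polarity depend on the actual circuit.

```python
import time


def kick(pin, width_ms=1):
    """Emit one pulse on pin to retrigger an external watchdog circuit."""
    pin.value(1)
    time.sleep(width_ms / 1000)
    pin.value(0)


# Assumed use on the board (not executed here):
#   import machine
#   wd_pin = machine.Pin(5, machine.Pin.OUT)   # GPIO number is an example only
#   while True:
#       do_work()      # hypothetical application code
#       kick(wd_pin)
```

If the main loop stops calling `kick()`, for whatever reason, the monostable times out and pulls the reset line without any help from the CPU.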
Peter Hinch
Index to my micropython libraries.