ESP8266 Code crashes after ~30 hours

ThomasChr
Posts: 121
Joined: Sat Nov 25, 2017 7:50 am

ESP8266 Code crashes after ~30 hours

Post by ThomasChr » Sun Jan 13, 2019 4:33 pm

Hello Forum,

I have this rather simple code for an ESP8266 which works fine but crashes after ~30 hours. Did I do anything wrong in those few lines of code?

Thanks for your help!

(PS: The reset reason is 2, so an exception happened, but not one I could catch. Maybe it happened in the ISR, where I had no exception handling?)
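
For reference, a small sketch for decoding that number. It assumes the 2 came from `machine.reset_cause()` and that the ESP8266 port passes the SDK's `rst_info` reason code through unchanged; the `describe_reset` helper is hypothetical. If both assumptions hold, 2 would mean a fatal CPU exception inside the firmware rather than a Python exception, which would fit with no `try`/`except` ever seeing it.

```python
# ESP8266 SDK reset reason codes (enum rst_reason in user_interface.h).
RESET_REASONS = {
    0: "power-on (REASON_DEFAULT_RST)",
    1: "hardware watchdog (REASON_WDT_RST)",
    2: "fatal exception in firmware (REASON_EXCEPTION_RST)",
    3: "software watchdog (REASON_SOFT_WDT_RST)",
    4: "soft restart (REASON_SOFT_RESTART)",
    5: "wake from deep sleep (REASON_DEEP_SLEEP_AWAKE)",
    6: "external reset pin (REASON_EXT_SYS_RST)",
}


def describe_reset(code):
    """Return a readable name for a reset reason code (hypothetical helper)."""
    return RESET_REASONS.get(code, "unknown reason %d" % code)


# On the board this would be: print(describe_reset(machine.reset_cause()))
print(describe_reset(2))
```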

Code:

import time
import machine
import network
import usocket
import micropython

# What is our sensorid?
sensorid = 5
# How many seconds between sending data?
sendseconds = 60
# How many seconds do we try until we abort connecting to our Wifi?
secondstrywificonnect = 45
# How many blinks does our power meter give per kWh
impulses_per_kwh = 10000
# GPIO 5 is Pin D1 on NodeMCU
pinwithIRsensor = 5
# Wifidata
wifiname = '1'
wifipass = '2'
# Serverdata
serveraddress = 'myserver.de'
passforsending = '3'
# Blinks we've seen from our power meter since startup
totblinks = 0
# We save our timestamp here to calculate the time between two blinks
messA = 0
messB = 0

# Exception in an ISR should be handled, reserve memory for that
micropython.alloc_emergency_exception_buf(100)

# This function is called everytime we get a pulse (Pin Change Interrupt)
def blinkarrived(pin):
    global messA
    global messB
    global totblinks
    totblinks = totblinks + 1
    if messA > messB:
        messB = time.ticks_ms()
    else:
        messA = time.ticks_ms()
    return

# This function will send our Data to the Internet
def senddata(timer):
    # Any exception will return
    try:
        global messA
        global messB
        global totblinks
        if messA == 0 or messB == 0:
            return
        # Check for overflow
        if abs(messA - messB) > 3600000:
            if messA > messB:
                messA = 0
            else:
                messB = 0
            return
        if messA > messB:
            timebetweenpulses = messA - messB
        else:
            timebetweenpulses = messB - messA
        # Calculate Watts. One pulse is 3.6e6 / impulses_per_kwh joules and
        # timebetweenpulses is in milliseconds.
        watt = 3.6e9 / (impulses_per_kwh * timebetweenpulses)
        # Calculate kWh since start
        kwh_since_start = totblinks / impulses_per_kwh
        # Connect to WiFi -> Get interface
        sta_if = network.WLAN(network.STA_IF)
        # Now connect
        connectcount = 0
        if not sta_if.isconnected():
            sta_if.active(True)
            sta_if.connect(wifiname, wifipass)
            while not sta_if.isconnected():
                connectcount = connectcount + 1
                time.sleep(1)
                if connectcount > secondstrywificonnect:
                    # We didn't connect after secondstrywificonnect seconds. Return for now
                    return
        content = b'sensorid=' + str(sensorid) + '&power=' + str(watt) + '&kwh_since_start=' + str(kwh_since_start) + '&password=' + passforsending
        addr_info = usocket.getaddrinfo(serveraddress, 80)
        addr = addr_info[0][-1]
        sock = usocket.socket()
        sock.connect(addr)
        sock.send(b'POST /tempsensor.php HTTP/1.1\r\n')
        sock.send(b'Host: ' + serveraddress + b'\r\n')
        sock.send(b'Content-Type: application/x-www-form-urlencoded\r\n')
        sock.send(b'Content-Length: ' + str(len(content)) + '\r\n')
        sock.send(b'\r\n')
        sock.send(content)
        sock.send(b'\r\n\r\n')
        sock.close()
        # Done
        messA = 0
        messB = 0
        return
    except:
        return

# And now we are in main!
# Any exception will reset us
try:
    # Activate a timer which will send our last sample every sendseconds Seconds
    tim = machine.Timer(-1)
    tim.init(period = sendseconds * 1000, mode = machine.Timer.PERIODIC, callback = senddata)
    # Activate a callback everytime we get a blink
    # We're using a Pin Change Interrupt, a hard one. To be as quick as possible.
    # Beware: The ISR can't allocate any memory and should be as short as possible!
    irsensor = machine.Pin(pinwithIRsensor, machine.Pin.IN)
    irsensor.irq(trigger = machine.Pin.IRQ_RISING, handler = blinkarrived, hard = True)
    # Configure WiFi
    # Get interfaces
    sta_if = network.WLAN(network.STA_IF)
    ap_if = network.WLAN(network.AP_IF)
    # Deactivate access point, we're station only
    ap_if.active(False)
except:
    machine.reset()

GitHub repo: https://github.com/ThomasChr/ESP8266-read-power-meter
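
One detail worth a sketch is the "Check for overflow" step in the listing. `time.ticks_ms()` wraps around, and MicroPython provides `time.ticks_diff()` to subtract two tick values safely across the wrap. The block below emulates that in plain Python so it can be tried on a PC; the wrap period of 2**30 is an assumption about this port, and `watts_from_interval` is a hypothetical helper that derives power from the pulse interval.

```python
# Assumption: ticks_ms() wraps at 2**30 on this port. On the board, use
# time.ticks_diff() instead of this emulation.
TICKS_PERIOD = 1 << 30
TICKS_HALF = TICKS_PERIOD // 2


def ticks_diff(new, old):
    """Signed difference new - old that stays correct across one wraparound."""
    return ((new - old + TICKS_HALF) % TICKS_PERIOD) - TICKS_HALF


def watts_from_interval(interval_ms, impulses_per_kwh):
    """Average power between two pulses that are interval_ms apart."""
    joules_per_pulse = 3.6e6 / impulses_per_kwh  # 1 kWh = 3.6e6 J
    return joules_per_pulse / (interval_ms / 1000)


# Two pulses 10 ms apart, straddling the wraparound point.
older = TICKS_PERIOD - 5
newer = 5
interval = ticks_diff(newer, older)
print(interval, watts_from_interval(interval, 10000))
```

With a signed difference there is no need to zero one of the timestamps when the counter wraps; the interval simply comes out right.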

Thomas

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: ESP8266 Code crashes after ~30 hours

Post by kevinkk525 » Sun Jan 13, 2019 5:00 pm

What hardware are you using? I did not look at the code too closely, but it looks good to me. The ISR is short and should always work.

Try to catch the exception and write it to a file so you can look it up after a crash. Just resetting the esp won't give you much information.
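
A minimal sketch of that idea, assuming the exception is a Python-level one raised outside the hard ISR (a file cannot be written from inside a hard interrupt handler). The `log_exception` helper and the `crash.log` file name are made up for illustration. `sys.print_exception` is the MicroPython call; the fallback branch only exists so the sketch also runs on a PC.

```python
import sys


def log_exception(exc, path="crash.log"):
    """Append the traceback of exc to a file so it survives a reset."""
    with open(path, "a") as f:
        if hasattr(sys, "print_exception"):
            # MicroPython
            sys.print_exception(exc, f)
        else:
            # CPython fallback, so the sketch can be tried on a PC
            import traceback
            traceback.print_exception(type(exc), exc, exc.__traceback__, file=f)


def risky():
    return 1 / 0


try:
    risky()
except Exception as e:
    log_exception(e)
    # On the board, machine.reset() could follow here.
```

After a reset, the file can be read back from the REPL to see what happened last.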
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

ThomasChr
Posts: 121
Joined: Sat Nov 25, 2017 7:50 am

Re: ESP8266 Code crashes after ~30 hours

Post by ThomasChr » Sun Jan 13, 2019 5:02 pm

I tried to catch the exception but never got one. It's a cheap NodeMCU board.

Glad to know that my code has no obvious flaws!

pythoncoder
Posts: 5956
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: ESP8266 Code crashes after ~30 hours

Post by pythoncoder » Sun Jan 13, 2019 5:58 pm

Your senddata routine instantiates a socket. If an exception occurs after that point the socket is not closed. This can occur if you get a WiFi outage. It is worth ensuring that sockets are always closed as they use RAM.
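
A minimal sketch of one way to guarantee the close, using `try`/`finally`. The `send_chunks` helper and its `make_socket` parameter are hypothetical; the parameter is only there so the function can be exercised without a network.

```python
try:
    import usocket as socket  # MicroPython
except ImportError:
    import socket  # CPython, for trying the sketch on a PC


def send_chunks(addr, chunks, make_socket=socket.socket):
    """Connect to addr, send each bytes chunk, and always close the socket."""
    sock = make_socket()
    try:
        sock.connect(addr)
        for chunk in chunks:
            sock.send(chunk)
    finally:
        # Runs on success and on any exception, so the socket is released.
        sock.close()
```

Any exception still propagates to the caller, so an outer `try`/`except` like the one in `senddata` keeps working as before.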

The reliability of ESP8266 devices depends also on the quality of the device and its power supply. Even with very carefully written code and a good quality power supply they can crash. I have kept them running for a cumulative total of about 1000 device-hours but I have experienced one inexplicable crash where the device shut down without apparent reason. For true 24/7 reliability I think you need a hardware watchdog.

You might like to read these thoughts on writing resilient ESP8266 applications. This goes beyond what you need to do, but it does give an overview of the problems.
Peter Hinch
Index to my micropython libraries.

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: ESP8266 Code crashes after ~30 hours

Post by kevinkk525 » Sun Jan 13, 2019 6:50 pm

That is true. Did not catch that. I'd try running it with a socket.close() statement before trying anything else.

As you say it is "crashing", I assume it resets itself? This can happen with an unstable power supply, or just occasionally with the cheap NodeMCUs. I have those too. It typically does not happen often, and when it does it is actually a freeze, and I have to reset the controller either with a hard reset or with a software watchdog, which surprisingly works. You don't describe a freeze, so I assume it is something else.
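
A rough sketch of what such a software watchdog could look like. The `SoftWatchdog` class is hypothetical and not taken from any library; the clock and the timeout action are passed in so the logic can be tried on a PC.

```python
import time


class SoftWatchdog:
    """Calls on_timeout() if feed() has not been called for timeout_s seconds."""

    def __init__(self, timeout_s, on_timeout, now=time.time):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout
        self.now = now
        self.last_feed = now()

    def feed(self):
        # Call this whenever the main work path completes successfully.
        self.last_feed = self.now()

    def check(self, _timer=None):
        # Call this periodically, e.g. as a machine.Timer callback.
        if self.now() - self.last_feed > self.timeout_s:
            self.on_timeout()
            return True
        return False
```

On the board, `on_timeout` would presumably be `machine.reset`. A watchdog written in Python only helps while the interpreter and its timers are still running; for a full lockup, the port's `machine.WDT` class (fed with `wdt.feed()`) or external hardware is the stronger option.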
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

ThomasChr
Posts: 121
Joined: Sat Nov 25, 2017 7:50 am

Re: ESP8266 Code crashes after ~30 hours

Post by ThomasChr » Sun Jan 13, 2019 7:08 pm

@pythoncoder: That is exactly the kind of error I was looking for, thank you!

I'll go with this code. Any errors?

Code:

import time
import machine
import network
import usocket
import micropython

# What is our sensorid?
sensorid = 5
# How many seconds between sending data?
sendseconds = 60
# How many seconds do we try until we abort connecting to our Wifi?
secondstrywificonnect = 45
# How many blinks does our power meter give per kWh
impulses_per_kwh = 10000
# GPIO 5 is Pin D1 on NodeMCU
pinwithIRsensor = 5
# Wifidata
wifiname = '1'
wifipass = '2'
# Serverdata
serveraddress = 'myserver.de'
passforsending = '3'
# Blinks we've seen from our power meter since startup
totblinks = 0
# We save our timestamp here to calculate the time between two blinks
messA = 0
messB = 0

# Exception in an ISR should be handled, reserve memory for that
micropython.alloc_emergency_exception_buf(100)

# This function is called everytime we get a pulse (Pin Change Interrupt)
def blinkarrived(pin):
    global messA
    global messB
    global totblinks
    totblinks = totblinks + 1
    if messA > messB:
        messB = time.ticks_ms()
    else:
        messA = time.ticks_ms()
    return

# This function will send our Data to the Internet
def senddata(timer):
    # Any exception will return
    try:
        global messA
        global messB
        global totblinks
        if messA == 0 or messB == 0:
            return
        # Check for overflow
        if abs(messA - messB) > 3600000:
            if messA > messB:
                messA = 0
            else:
                messB = 0
            return
        if messA > messB:
            timebetweenpulses = messA - messB
        else:
            timebetweenpulses = messB - messA
        # Calculate Watts. One pulse is 3.6e6 / impulses_per_kwh joules and
        # timebetweenpulses is in milliseconds.
        watt = 3.6e9 / (impulses_per_kwh * timebetweenpulses)
        # Calculate kWh since start
        kwh_since_start = totblinks / impulses_per_kwh
        # Connect to WiFi -> Get interface
        sta_if = network.WLAN(network.STA_IF)
        # Now connect
        connectcount = 0
        if not sta_if.isconnected():
            sta_if.active(True)
            sta_if.connect(wifiname, wifipass)
            while not sta_if.isconnected():
                connectcount = connectcount + 1
                time.sleep(1)
                if connectcount > secondstrywificonnect:
                    # We didn't connect after secondstrywificonnect seconds. Return for now
                    return
        content = b'sensorid=' + str(sensorid) + '&power=' + str(watt) + '&kwh_since_start=' + str(kwh_since_start) + '&password=' + passforsending
        addr_info = usocket.getaddrinfo(serveraddress, 80)
        addr = addr_info[0][-1]
        sock = usocket.socket()
        try:
            sock.connect(addr)
            sock.send(b'POST /tempsensor.php HTTP/1.1\r\n')
            sock.send(b'Host: ' + serveraddress + b'\r\n')
            sock.send(b'Content-Type: application/x-www-form-urlencoded\r\n')
            sock.send(b'Content-Length: ' + str(len(content)) + '\r\n')
            sock.send(b'\r\n')
            sock.send(content)
            sock.send(b'\r\n\r\n')
        except:
            return
        finally:
            # Always close the socket, whether or not sending succeeded
            sock.close()
        # Done
        messA = 0
        messB = 0
        return
    except:
        return

# And now we are in main!
# Any exception will reset us
try:
    # Configure WiFi
    # Get interfaces
    sta_if = network.WLAN(network.STA_IF)
    ap_if = network.WLAN(network.AP_IF)
    # Deactivate access point, we're station only
    ap_if.active(False)
    # Activate a timer which will send our last sample every sendseconds Seconds
    tim = machine.Timer(-1)
    tim.init(period = sendseconds * 1000, mode = machine.Timer.PERIODIC, callback = senddata)
    # Activate a callback everytime we get a blink
    # We're using a Pin Change Interrupt, a hard one. To be as quick as possible.
    # Beware: The ISR can't allocate any memory and should be as short as possible!
    irsensor = machine.Pin(pinwithIRsensor, machine.Pin.IN)
    irsensor.irq(trigger = machine.Pin.IRQ_RISING, handler = blinkarrived, hard = True)
except:
    machine.reset()
 
Here is the diff: https://github.com/ThomasChr/ESP8266-re ... 7b8d793ee8
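
Two smaller points in the request itself, offered as a sketch rather than a required fix. The listing appears to rely on MicroPython accepting `bytes + str`, which CPython rejects, and it sends an extra `\r\n\r\n` after the body that is not counted in `Content-Length`. Building the whole request in one place avoids both. The `build_post` helper is hypothetical, and the `Connection: close` header is an addition that is not in the original code.

```python
def build_post(host, path, fields):
    """Return a complete HTTP/1.1 POST request as bytes.

    fields is a list of (name, value) pairs. Values are not URL-escaped here,
    so this only suits plain numbers and simple tokens.
    """
    body = "&".join("%s=%s" % (k, v) for k, v in fields).encode()
    head = (
        "POST %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: %d\r\n"
        "Connection: close\r\n"
        "\r\n" % (path, host, len(body))
    ).encode()
    # Content-Length covers exactly the body that follows the blank line.
    return head + body


request = build_post("myserver.de", "/tempsensor.php",
                     [("sensorid", 5), ("power", 1.5)])
print(request)
```

The result can then be sent in a single call instead of seven separate `sock.send()` calls.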

@kevin: The NodeMCU reset itself. Yes, I know those boards are not really stable, and I'm looking forward to a new Pyboard D (with WiFi) to read my power meter in the cellar. Surely it will be much more stable!

ThomasChr
Posts: 121
Joined: Sat Nov 25, 2017 7:50 am

Re: ESP8266 Code crashes after ~30 hours

Post by ThomasChr » Mon Jan 21, 2019 11:23 am

The program has been up for a week now, and that on a cheap NodeMCU ESP8266 board from China. Very cool, I'm impressed!

Thanks for all your help!

Thomas
