Error messages via mqtt

Posted: Wed Sep 23, 2020 3:54 pm
by Divergentti
Hello.

I was under the impression that I could set up an mqtt-broker so that it handles str-only data, something like this: https://github.com/divergentti/kotiauto ... rheille.py

and I can see error messages at the broker, such as:

"MQTT to InfluxDB VirheSilta
Connected with result code 0
errors/sensors/esp32/error b'23.9.2020 klo 14:25:00 uptime: 20712 device: ESP32-kanalaPIR error: [Errno 113] EHOSTUNREACH memfree: 87056'"

where the mqtt-message is generated from an ESP32 like this: https://github.com/divergentti/kotiauto ... in/main.py

... in practice, errors are first collected to a file (translated to English):

Code: Select all

def report_error(error):
    # IN: str error = error text part
    # Build one log line; uptime comes from utime.ticks_ms() (milliseconds)
    errormsg = str(resolve_time()) + " uptime: " + str(utime.ticks_ms()) + " device: " \
        + str(CLIENT_ID) + " error: " + str(error) + " memfree: " + str(gc.mem_free())
    # Append mode creates the file if it does not exist and keeps earlier errors.
    # (Opening with "r" and then writing would fail when the file already exists.)
    with open('errors.txt', 'a') as f:
        f.write(errormsg + "\n")

def check_errorfile():
    try:
        f = open('errors.txt', "r")
    except OSError:  # no error file, nothing to report
        return
    #  Read the file and send to mqtt broker, one row per message
    row = f.readline()
    while row:
        try:
            client.publish(AIHE_VIRHEET, row.strip(), retain=False)
        except OSError:
            # Not working: close the file (it is kept for the next attempt) and boot
            f.close()
            restart_and_reconnect()
        row = f.readline()
    #  File read and reported, close and delete it
    f.close()
    os.remove('errors.txt')
... but for some reason I do not see anything in InfluxDB (or Grafana), which I think means that I can pass only numeric values (not long strings) to InfluxDB.
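
For what it is worth, the InfluxDB line protocol does accept string field values, as long as they are double-quoted. The bridge script is not shown in this thread, so whether it drops text payloads is only a guess. The sketch below just shows how a text error could be formatted as a string field. The measurement name `device_errors`, the tag `device`, the field `message` and the timestamp are made up for illustration.

```python
def escape_tag(value):
    """Escape commas, equals signs and spaces in a line-protocol tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def escape_string_field(value):
    """Escape backslashes and double quotes inside a string field value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def error_to_line(device, message, timestamp_ns):
    """Build one line-protocol point with the error text as a string field."""
    # Hypothetical schema: measurement "device_errors", tag "device", field "message"
    return 'device_errors,device=%s message="%s" %d' % (
        escape_tag(device), escape_string_field(message), timestamp_ns)


line = error_to_line("ESP32-kanalaPIR", "[Errno 113] EHOSTUNREACH",
                     1600871100000000000)  # example timestamp in nanoseconds
print(line)
# device_errors,device=ESP32-kanalaPIR message="[Errno 113] EHOSTUNREACH" 1600871100000000000
```

Numeric parts such as memfree could still go into their own numeric fields, so Grafana can plot them, while the error text stays a string.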

For better error logging I should probably write a broker which collects error messages via mqtt into a flat file on the mqtt server (mosquitto). Or should I try some ready-to-go solution for this type of logging? I guess SNMP traps or remote syslog are out of the question?

Re: Error messages via mqtt

Posted: Wed Sep 23, 2020 7:16 pm
by kevinkk525
In my project I send log messages over mqtt like this: https://github.com/kevinkk525/pysmartno ... ng_full.py
Every log message will be sent directly over mqtt, no files on the device itself, but it could easily be changed to also write log files on the device itself if that is needed.

And on the server where the mqtt broker is running I have an mqtt client running which logs those messages to a file per device (optionally integrated into homeassistant if one uses that): https://github.com/kevinkk525/SmartServer

This solution might be a lot easier than what you try to do with InfluxDB etc.
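
A rough sketch of the "one log file per device" idea on the server side. This is not the SmartServer code, only an illustration. It assumes that the device name is the last level of the topic. `log_message`, the topic layout `log/<device>` and the log directory are hypothetical.

```python
import os
import tempfile
import time


def log_message(log_dir, topic, payload, now=None):
    """Append one MQTT message to a log file named after the device."""
    # Assumption: the device name is the last topic level, e.g. "log/ESP32-kanalaPIR"
    device = topic.split("/")[-1] or "unknown"
    if isinstance(payload, bytes):
        text = payload.decode("utf-8", "replace")
    else:
        text = str(payload)
    # now=None means "current time"
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    path = os.path.join(log_dir, device + ".log")
    with open(path, "a") as f:
        f.write("%s %s\n" % (stamp, text))
    return path


log_dir = tempfile.mkdtemp()  # a temporary directory, just for this demonstration
path = log_message(log_dir, "log/ESP32-kanalaPIR", b"[Errno 113] EHOSTUNREACH")
print(path)
```

With paho-mqtt, something like this could be called from the `on_message(client, userdata, msg)` callback as `log_message(log_dir, msg.topic, msg.payload)`.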

Re: Error messages via mqtt

Posted: Thu Sep 24, 2020 7:00 am
by Divergentti
Thank you for your advice! Your solution looks like an excellent approach to this problem. I will test it asap.

Collecting running configurations to a server is actually something I have to implement too. I should have some sort of ESP32 (or other IoT device) management system which can both push configurations to the devices and pull them back. Something like an OTA updater, which I have not tested yet: https://medium.com/@ronald.dehuysser/mi ... fde670d4eb.

In my simple error handler code, an error is first written to a logfile; then, when connectivity is ok, it is reported to an mqtt-broker, and once that is complete, the logfile (errors.txt) is deleted. With this setup I do not lose errors if the ESP32 reboots. As an example, last night I got these errors at my mqtt-broker REPL (which does not push these values to InfluxDB as it should):

(virheet = errors, sensorit = sensors, virhe = error, muistia = memory)

virheet/sensorit/esp32/virhe b'23.9.2020 klo 18:43:05 uptime: 21117 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 870566'
virheet/sensorit/esp32/virhe b'23.9.2020 klo 19:54:40 uptime: 21017 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87216'
virheet/sensorit/esp32/virhe b'23.9.2020 klo 21:47:19 uptime: 21265 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87056'
virheet/sensorit/esp32/virhe b'23.9.2020 klo 23:44:30 uptime: 20713 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87056'
virheet/sensorit/esp32/virhe b'24.9.2020 klo 00:44:58 uptime: 21140 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87216'
virheet/sensorit/esp32/virhe b'24.9.2020 klo 02:02:49 uptime: 21219 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87056'

So, in this case I have some issues with mqtt connectivity (EHOSTUNREACH), which I have to look into with Wireshark. The uptime, on the other hand, may relate to some sort of memory overflow as well. The other ESP32s worked just fine last night (they pushed information to InfluxDB and Grafana).
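
To look at these rows a bit more systematically, they can be split into fields. A minimal sketch, using three of the rows above as sample data; `parse_error` and the regular expression are only an illustration. Since report_error() fills the uptime from utime.ticks_ms(), the value is in milliseconds, so every error here was logged roughly 21 seconds into the count. That might point to a failure shortly after a (re)boot, but that is only a guess.

```python
import re

# Field labels as they appear in the payload (laite = device, virhe = error,
# muistia = memory)
PATTERN = re.compile(r"uptime: (\d+) laite: (\S+) virhe: (.*) muistia: (\d+)")


def parse_error(payload):
    """Split one error payload into a dict, or return None if it does not match."""
    text = payload.decode("utf-8") if isinstance(payload, bytes) else payload
    m = PATTERN.search(text)
    if m is None:
        return None
    return {
        "uptime_ms": int(m.group(1)),
        "device": m.group(2),
        "error": m.group(3),
        "memfree": int(m.group(4)),
    }


samples = [
    b"23.9.2020 klo 19:54:40 uptime: 21017 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87216",
    b"23.9.2020 klo 21:47:19 uptime: 21265 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87056",
    b"23.9.2020 klo 23:44:30 uptime: 20713 laite: ESP32-kanalaPIR virhe: [Errno 113] EHOSTUNREACH muistia: 87056",
]

parsed = [parse_error(s) for s in samples]
for p in parsed:
    print(p["device"], p["error"], "uptime %.1f s" % (p["uptime_ms"] / 1000))
```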

Re: Error messages via mqtt

Posted: Thu Sep 24, 2020 7:12 am
by kevinkk525
Then I guess you should find out why your mqtt connection breaks this often. I never have any connection issues except for the occasional wifi outage that triggers a reconnect but those take less than 10 seconds so I never actually lose any mqtt messages even though I send most of them with a timeout of less than 30 seconds.

Do you use a local mqtt broker? Connect by ip or hostname?

Re: Error messages via mqtt

Posted: Thu Sep 24, 2020 8:59 am
by Divergentti
The mqtt-broker is a Raspberry Pi running mosquitto in the same subnet. I connect by IP address (DNS does not resolve local addresses).

The problem was most likely in boot.py, but I do not understand why the other ESP32s work fine with my old boot.py.

I changed boot.py like this and now it seems to work:

Code: Select all

import utime
import time
import machine
import network
import ntptime
import esp
import webrepl
from time import sleep
# Parameters from parametrit.py (salasana = password)
from parametrit import SSID1, SSID2, SALASANA1, SALASANA2, WEBREPL_SALASANA, NTPPALVELIN

machine.freq(240000000)  # first full power
esp.osdebug(None)
webrepl.start(password=WEBREPL_SALASANA)
wificlient_if = network.WLAN(network.STA_IF)
wificlient_if.active(False)
ntptime.host = NTPPALVELIN


def yhdista_wifi(ssid_nimi, salasana):  # "connect wifi"; returns True on success
    global wificlient_if
    print("Let's try %s" % ssid_nimi)
    wificlient_if.active(True)
    wificlient_if.connect(ssid_nimi, salasana)
    time.sleep(2)
    if wificlient_if.isconnected():
        print('Network setup:', wificlient_if.ifconfig())
        print("Signal level %s" % (wificlient_if.status('rssi')))
        aseta_aika()  # set time
        return True
    else:
        return False


def ei_voida_yhdistaa():  # "cannot connect": blink the LED, then reset
    print("No connection. Boot in 1 s.")
    vilkuta_ledi(10)
    sleep(1)
    machine.reset()


def vilkuta_ledi(kertaa):  # "blink LED"; kertaa = number of blinks
    ledipinni = machine.Pin(2, machine.Pin.OUT)
    for i in range(kertaa):
        ledipinni.on()
        utime.sleep_ms(100)
        ledipinni.off()
        utime.sleep_ms(100)


def aseta_aika():
    try:
        ntptime.settime()
        print(utime.localtime(utime.time()))
    except OSError as e:
        print("No time from %s! Error %s" % (NTPPALVELIN, e))
        ei_voida_yhdistaa()
    try:
        webrepl.start()  # WebREPL
        machine.freq(80000000)  # slow down
    except OSError as e:
        print("WebREPL does not start. Error %s" % e)
        ei_voida_yhdistaa()


# yhdista_wifi() returns True or False (it does not raise),
# so test the return value instead of using try/except
if not yhdista_wifi(SSID1, SALASANA1):
    if not yhdista_wifi(SSID2, SALASANA2):
        print("No go, let's boot!")
        ei_voida_yhdistaa()
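
One more thought on the fixed time.sleep(2) in yhdista_wifi: it gives the access point exactly two seconds, so polling isconnected() a few times might be more forgiving. A sketch only: `wait_for` and its parameters are hypothetical, and `FakeWlan` merely stands in for network.WLAN so that the example also runs on a PC.

```python
import time


def wait_for(predicate, attempts=20, interval_s=0.5, sleep=time.sleep):
    """Call predicate() up to `attempts` times, sleeping between failed tries."""
    for _ in range(attempts):
        if predicate():
            return True
        sleep(interval_s)
    return False


class FakeWlan:
    """Stand-in for network.WLAN: reports 'connected' from the n-th poll on."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def isconnected(self):
        self.calls += 1
        return self.calls >= self.ready_after


wlan = FakeWlan(ready_after=3)
connected = wait_for(wlan.isconnected, attempts=5, interval_s=0.01)
print(connected, wlan.calls)
```

In boot.py this would be used as `wait_for(wificlient_if.isconnected)` right after `wificlient_if.connect(...)`, in place of the fixed sleep.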

Re: Error messages via mqtt

Posted: Thu Sep 24, 2020 10:04 am
by kevinkk525
I put time.sleep_ms(100) before executing any code, which lets the ESP background tasks settle. Maybe that helps a bit, maybe not :D

In your code I can't see an mqtt client, but I see that you are using a synchronous programming approach (using utime.sleep). Maybe that is a source of problems too, because a longer LED blinking sequence spends a long time in that function, and the connection to your broker could be lost during that time. However, I wouldn't expect an error like EHOSTUNREACH in that case, but I don't know.
I use a local mqtt broker and (u)asyncio with an async mqtt client and have no problems with the mqtt connection.
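
To illustrate the difference, here is a small sketch of a non-blocking blink with asyncio: while the blink coroutine sleeps, other tasks get to run. It is written for desktop Python so it can be tried anywhere. On the ESP32 it would presumably be `import uasyncio as asyncio`, with a machine.Pin in place of the hypothetical `FakeLed`. `other_work` only stands in for something like an mqtt client task.

```python
import asyncio


class FakeLed:
    """Stand-in for machine.Pin: just counts how often it is switched."""

    def __init__(self):
        self.toggles = 0

    def on(self):
        self.toggles += 1

    def off(self):
        self.toggles += 1


async def blink(led, times, interval_s=0.1):
    """Blink without blocking: every sleep hands control to other tasks."""
    for _ in range(times):
        led.on()
        await asyncio.sleep(interval_s)
        led.off()
        await asyncio.sleep(interval_s)


async def other_work(ticks, rounds, interval_s):
    """Placeholder for any other task that must keep running meanwhile."""
    for _ in range(rounds):
        ticks.append(1)
        await asyncio.sleep(interval_s)


async def main():
    led = FakeLed()
    ticks = []
    # Both coroutines run interleaved until each has finished
    await asyncio.gather(blink(led, 3, 0.01), other_work(ticks, 5, 0.01))
    return led.toggles, len(ticks)


toggles, ticks = asyncio.run(main())
print(toggles, ticks)
```

With utime.sleep_ms() inside the blink loop, nothing else would run until all blinks are done.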

Re: Error messages via mqtt

Posted: Thu Sep 24, 2020 11:42 am
by Divergentti
Ahh ... thank you for that advice!

I was about to get rid of the LED blinking routine anyway, but I never really thought through how much time the CPU spends waiting. The primary purpose of the LED blinking was to signal errors, something like 1 blink = success, 5 blinks = error, etc.

+ with the ESP32-Wroom32U (the model with an external antenna) there is no blue LED to blink anymore. Worst of all, the PCB of the DevBoard is 2 mm wider than in the S-model (without antenna)... so I had to make a new case for it -> https://www.thingiverse.com/thing:4604568

Re: Error messages via mqtt

Posted: Thu Sep 24, 2020 12:30 pm
by kevinkk525
good luck!
nice design :D