ESP8266 does not connect to MQTT server anymore, ESP32 does, since new WiFi cablemodem is installed

this_andre
Posts: 12
Joined: Mon Dec 24, 2018 2:26 pm
Location: Germany - near Stuttgart

Post by this_andre » Mon Mar 16, 2020 5:19 pm

The ESP8266, running MicroPython, has been unable to communicate with the MQTT server over WiFi since I installed a new cable modem (Vodafone-branded, but actually an Arris TG3442DE). The ESP32, however, communicates without problems.
Before the new modem was installed, I had no problems at all communicating with either the ESP8266 or the ESP32 over WiFi.
On the ESP8266, an ‘ECONNABORTED’ error is raised when opening the IP socket (in this case, while setting up the MQTT connection).
The WiFi logs from the cable modem for both cases (ESP8266 and ESP32) and the Wireshark logs from the machine running the MQTT server are available if anyone is interested in seeing them. On the MQTT side, one can see repeated ‘Who has the IP address’ requests coming in from the ESP8266. The machine running the MQTT server answers every time, but the ESP8266 apparently never sees the answer and times out with ECONNABORTED during the socket connect. As far as I can see on the router side, the WiFi connection (802.11n) is established successfully.
Does anyone have an idea? I have seen messages circulating on the internet about this problem that discuss the older TCP/IP stack version running on the ESP8266 compared with the ESP32, but no solution was suggested.
The MicroPython version running on the ESP8266 is 1.12, but I also tried 1.9.

tve
Posts: 216
Joined: Wed Jan 01, 2020 10:12 pm
Location: Santa Barbara, CA

Post by tve » Mon Mar 16, 2020 7:25 pm

I don't have any good suggestions... I assume you have reset the esp8266 and that you see no TCP SYN arriving at the mqtt server end. You can verify with the esp32 that you have the wireshark filter right... I do know that the esp8266 often requires duplicate arp requests and responses for some mysterious reason. I assume you have a connection retry with a little delay in your code. I also assume your mqtt server is on the lan so it's not some firewall issue. Grrr, sounds frustrating!
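
A connection retry with a short delay, as assumed above, could look roughly like the sketch below. The helper name `connect_with_retry` and its parameters are made up for illustration. It only relies on the standard `time` module, so it should run on both MicroPython and CPython. It retries on `OSError`, which is how socket failures such as ECONNABORTED surface in MicroPython.

```python
import time


def connect_with_retry(connect, attempts=5, delay_s=2, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are used up.

    connect  -- any zero-argument callable that raises OSError on failure
                (hypothetical: e.g. the connect method of an MQTT client)
    attempts -- maximum number of tries (must be >= 1)
    delay_s  -- pause between tries, in seconds
    sleep    -- injectable sleep function, handy for testing
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:
            last_exc = exc
            print("connect attempt", attempt, "failed:", exc)
            if attempt < attempts:
                sleep(delay_s)  # give ARP / WiFi a moment before retrying
    # Every attempt failed: re-raise the last error for the caller.
    raise last_exc
```

With an MQTT client object that exposes a `connect()` method (for example `umqtt.simple.MQTTClient`, if that is the library in use), this could be called as `connect_with_retry(client.connect)`.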

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Post by kevinkk525 » Mon Mar 16, 2020 10:13 pm

Do you connect to your mqtt broker by ip or hostname?
This all sounds very strange.
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

Post by this_andre » Tue Mar 17, 2020 11:17 am

I connect to the MQTT server by IP address. As I can see in the Wireshark log, the machine running the MQTT server receives a request from the ESP8266 asking 'who has 192.168.0.17? tell 192.168.0.21' (the ESP). The answer is a message with the IP address and the MAC address of the MQTT machine. This goes on 7 times and then the ESP times out with ECONNABORTED. As you said, very strange, since the same code runs fine on the ESP32!
And yes, I set the Wireshark filters correctly!
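
To rule out the MQTT library as a factor, a bare TCP connect to the broker address can be tried on both boards. The following is only a sketch: the function name `probe_tcp` is hypothetical, and it assumes the `socket` module as shipped with MicroPython (the same code also runs on CPython).

```python
import socket


def probe_tcp(host, port, timeout_s=5):
    """Try a plain TCP connect, without any MQTT involved.

    Returns (True, None) on success, or (False, detail) where detail is the
    first argument of the OSError (the errno on MicroPython).
    """
    # Resolve to a socket address; with a literal IPv4 address this is
    # just (ip, port). Hostnames resolving to IPv6 are not handled here.
    addr = socket.getaddrinfo(host, port)[0][-1]
    s = socket.socket()
    try:
        s.settimeout(timeout_s)
        s.connect(addr)
        return True, None
    except OSError as exc:
        return False, (exc.args[0] if exc.args else None)
    finally:
        s.close()
```

Using the broker address from this thread and assuming the default MQTT port 1883, the call would be `probe_tcp("192.168.0.17", 1883)`. If this fails on the ESP8266 but succeeds on the ESP32, the problem sits below the MQTT layer.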

Post by this_andre » Wed Mar 18, 2020 12:16 pm

Can anyone tell me whether the ESP8266 IP stack uses the MAC address or the IP address to send out IP packets?
Since I installed the new modem, I have observed that the MAC address of the destination module is corrupted by the new modem (the last 3 groups are set to '0', e.g. 3c:7f:cb:00:00:00).
When the ESP sends the MQTT connect to a corrupted MAC address, this will fail!
I strongly believe that this is the cause of my problem.
If that's true, is there any chance of getting an updated IP stack for the ESP8266 that behaves the same way as the one on the ESP32, which doesn't suffer from this?

Post by tve » Wed Mar 18, 2020 4:36 pm

Packets sent by the esp find their destination on the local lan via their mac address. In this situation, the IP address is only used in the ARP-level IP->MAC translation. (The dest computer also checks the dest IP address to ensure it is the actual destination and to pick the correct local interface if there are multiple, which is uncommon.)
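
As a rough illustration of the ARP-level IP->MAC translation described above, the sketch below decodes the 28-byte ARP payload for Ethernet/IPv4 (the part that follows the 14-byte Ethernet header in a capture). The helper names are hypothetical and the MAC addresses are made-up placeholders. Only the two IP addresses are taken from this thread.

```python
import struct

# ARP payload for Ethernet/IPv4: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes, network order).
ARP_FMT = "!HHBBH6s4s6s4s"


def mac_str(raw):
    """Format 6 raw bytes as aa:bb:cc:dd:ee:ff."""
    return ":".join("%02x" % b for b in raw)


def ip_str(raw):
    """Format 4 raw bytes as dotted decimal."""
    return ".".join(str(b) for b in raw)


def parse_arp(payload):
    """Decode an Ethernet/IPv4 ARP payload into a dict of readable fields."""
    (htype, ptype, hlen, plen, oper,
     sha, spa, tha, tpa) = struct.unpack(ARP_FMT, payload[:28])
    return {
        "op": {1: "request", 2: "reply"}.get(oper, oper),
        "sender_mac": mac_str(sha),
        "sender_ip": ip_str(spa),
        "target_mac": mac_str(tha),
        "target_ip": ip_str(tpa),
    }


# Example: a reply from the MQTT machine (192.168.0.17) to the ESP
# (192.168.0.21). Both MAC addresses are placeholders, not real devices.
example = struct.pack(
    ARP_FMT, 1, 0x0800, 6, 4, 2,
    bytes([0x02, 0x00, 0x00, 0xAA, 0xBB, 0xCC]), bytes([192, 168, 0, 17]),
    bytes([0x02, 0x00, 0x00, 0x11, 0x22, 0x33]), bytes([192, 168, 0, 21]),
)
print(parse_arp(example))
```

The `sender_mac` in a reply is what the requesting device should cache and then use as the destination MAC for the following TCP SYN, so a zeroed or truncated value there would show up directly in such a decode.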

I don't know what you mean by "the MAC address of the destination module is corrupted". Perhaps you mean that the source MAC address of ARP packets sent by the esp has 3 groups of zeroes? In that case it would look to me like corruption in the esp, namely where the MAC address is stored. I forgot how the esp generates its mac address, but perhaps an erase_chip followed by reflashing fixes it?

If I misunderstood what you're observing it may help to post some packets as output by wireshark/tcpdump...

Post by this_andre » Wed Mar 18, 2020 5:35 pm

Thanks for your comment!
The problem seems to be the router, which receives the IP and MAC address of the MQTT machine in the ARP response and corrupts the MAC address. I opened a ticket with my ISP to report the issue.
The Wireshark log on the MQTT machine shows that the ARP response from the MQTT machine is OK. However, if I look in the router log (taken while the WiFi connection is set up, because this is the only place where I can get logging information from the router), I can see the MAC address being corrupted by zeroing the last 3 groups, as I explained in my previous comment.
My question remains: why does the ESP8266 suffer from this while the ESP32 runs fine?

Post by tve » Thu Mar 19, 2020 7:29 pm

TBH I very much doubt that your analysis is correct. Your cable modem runs some form of embedded linux (openwrt et al) and whether it switches the packets in hardware or bridges them in the kernel I really have a hard time believing anything non-standard happens.
If you want help here please post packet traces.

Post by tve » Thu Mar 19, 2020 7:35 pm

NB: in a PM you asked whether there are ways to capture the esp8266 packets. Yes, there is. You need an atheros Wifi adapter (typ. old USB dongle) on a linux box and set up packet sniffing ("monitoring mode"). Then use wireshark to sniff the packets "in the air". Best done using an open wifi network (make sure no-one hacks you...) but also possible when using WPA, just more manual pain. When I set that up a year ago it was a royal PITA to get all the parts just right, but gave me invaluable insights on what was happening (I was looking at low power and the time the esp takes to associate on wake-up). YMMV...

Post by this_andre » Sat Mar 21, 2020 4:53 pm

Hi TVE,
I believe your remark about my faulty analysis is correct :oops:
I cannot find a corrupted MAC address in the Wireshark log. (Most likely, the router log displays it wrongly.)
The main issue is still valid: the ESP8266 keeps sending ARP requests until the maximum number of retransmissions is reached. After this, the ESP8266 raises the ECONNABORTED error.
This seems to pin the problem down to the ESP itself. One of the differences between the ESP8266 and the ESP32 is apparently the lwIP version (TCP/IP stack). As far as I understand, the ESP32 runs lwIP v2 (including IPv6 and bug fixes). Since the ESP32 works fine, lwIP could be the determining factor. Can you tell me which lwIP version the ESP8266 uses? I can find information on the lwIP version topic, but there does not seem to be a procedure for linking lwIP v2 into the esp-open-sdk, which is needed to build MicroPython. Do you have any hints on how to get further with this issue?