ntptime kills the wifi station, hard reset required to fix

kwiley
Posts: 140
Joined: Wed May 16, 2018 5:53 pm

Post by kwiley » Tue Apr 06, 2021 5:07 am

There is a module, ntptime.py, buried in the ESP8266 subproject. It isn't included in the PyBoardD firmware, but I copied the file onto the flash anyway. It works perfectly well to sync the PyBoardD's RTC from an NTP server (modulo the fact that it even admits in a comment that there is no clear way to set a local timezone).

So, great. Problem is, you can only do it once. After that, the Wifi station is dead. Subsequent calls to the NTP module fail at socket.getaddrinfo, with various combinations of OSError -2 (I can't find a description of any of the special negative error codes that getaddrinfo is admittedly documented to produce) and occasionally -6 (which although also undescribed, I've discovered indicates a closed connection) and 110 (ETIMEDOUT).
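One way to see exactly which code a failed lookup produced is to catch the OSError and read its first argument. The sketch below is an illustration only: `try_resolve` and its `resolver` parameter are hypothetical names, and it makes no attempt to interpret the negative codes, whose meaning is not established in this thread.

```python
import socket


def try_resolve(host, port, resolver=socket.getaddrinfo):
    """Return (True, sockaddr) on success or (False, error_code) on OSError.

    `resolver` is a hypothetical injection point so the helper can be
    exercised without a network; by default it is socket.getaddrinfo.
    """
    try:
        # Same indexing as ntptime.py: first result, last field (the address).
        return True, resolver(host, port)[0][-1]
    except OSError as exc:
        # args[0] holds the numeric code (e.g. -2, -6 or 110 in this thread).
        code = exc.args[0] if exc.args else None
        return False, code
```

Calling `try_resolve("pool.ntp.org", 123)` before and after the first NTP query would show whether the code changes from one failure to the next.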

Other uses of the Wifi also fail. For example, https://docs.micropython.org/en/latest/ ... k_tcp.html (also buried in the ESP8266 section, curiously, even though the PyBoardD has built-in Wifi and I would expect it to be documented in a more general sense for that reason) has a section "5.2. HTTP GET request" which shows how to build and call a minimal http request. That call works perfectly at any point before I call the NTP module, but just as the NTP module itself will no longer work after it has been called once, this basic web request example also fails at any point after the call to the NTP module.

There must be something about the NTP module that, although running successfully the first time, nevertheless leaves the socket or the station in a bad state.

Here's the code from ntptime.py:

Code: Select all

def time():
    NTP_QUERY = bytearray(48)
    NTP_QUERY[0] = 0x1B
    addr = socket.getaddrinfo(host, 123)[0][-1]
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.settimeout(1)
        res = s.sendto(NTP_QUERY, addr)
        msg = s.recv(48)
    finally:
        s.close()
    val = struct.unpack("!I", msg[40:44])[0]
    return val - NTP_DELTA

Call that once, and you're golden. Call it a second time and it dies at getaddrinfo() with a varying mix of -2, -6, and 110. Similarly, any call via unrelated code that would make use of the Wifi station, such as the code in section 5.2 of network_tcp.html, will also fail (but will run just fine before calling the time() function above the first time).

BTW, to avoid any possible name collisions between the function above named "time()" and similar functions in the time.py/utime.py modules, I actually renamed that function above in ntptime.py when I copied the file to the PyBoardD flash. So, the error surely has nothing to do with time() being reused.

Something about the code above is "bad socket form". What do you think?

To replicate this, simply paste the following into the REPL via ^E paste mode:

Code: Select all

# From micropython/ports/esp8266/modules/ntptime.py with some compaction and function renaming for demo purposes
try:
	import usocket as socket
except ImportError:
	import socket
try:
	import ustruct as struct
except ImportError:
	import struct
def get_ntp_time():
	NTP_QUERY = bytearray(48)
	NTP_QUERY[0] = 0x1B
	addr = socket.getaddrinfo("pool.ntp.org", 123)[0][-1]
	s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
	try:
		s.settimeout(1)
		res = s.sendto(NTP_QUERY, addr)
		msg = s.recv(48)
	finally:
		s.close()
	val = struct.unpack("!I", msg[40:44])[0]
	return val - 3155673600
# From https://docs.micropython.org/en/latest/esp8266/tutorial/network_tcp.html
def http_get(url):
	import socket
	_, _, host, path = url.split('/', 3)
	addr = socket.getaddrinfo(host, 80)[0][-1]
	s = socket.socket()
	s.connect(addr)
	s.send(bytes('GET /%s HTTP/1.0\r\nHost: %s\r\n\r\n' % (path, host), 'utf8'))
	while True:
		data = s.recv(100)
		if data:
			print(str(data, 'utf8'), end='')
		else:
			break
	s.close()
# Connect
import time, network, pyb
print("Connecting...")
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect(ssid, pw) # Obviously, you'll have to fill in the SSID and password for your own router
while not wlan.isconnected():
	pyb.LED(3).on()
	time.sleep(.1)
	pyb.LED(3).off()
	time.sleep(.9)
print("Connected")
# Test http
http_get('http://micropython.org/ks/test.html')
# Query NTP
get_ntp_time()
# At this point, despite the two successful calls just made, the Wifi station is now dead.
# The following calls will fail.
# A soft reset either via ^D or Reset button won't fix the problem.
# Only disconnecting the USB cable will fix the problem.
get_ntp_time()
http_get('http://micropython.org/ks/test.html')

kwiley
Posts: 140
Joined: Wed May 16, 2018 5:53 pm

Re: ntptime kills the wifi station, hard reset required to fix

Post by kwiley » Tue Apr 06, 2021 6:15 am

I've determined that simply attempting to build the DGRAM socket is sufficient to cause this problem. Note the following modification to the code in the previous post:

Code: Select all

def get_ntp_time():
	NTP_QUERY = bytearray(48)
	NTP_QUERY[0] = 0x1B
	addr = socket.getaddrinfo("pool.ntp.org", 123)[0][-1]
	s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
	try:
		pass
		# s.settimeout(1)
		# res = s.sendto(NTP_QUERY, addr)
		# msg = s.recv(48)
	finally:
		s.close()
	# val = struct.unpack("!I", msg[40:44])[0]
	# return val - 3155673600

If you substitute the code above into the example in the previous post, the same problem will occur.

davef
Posts: 811
Joined: Thu Apr 30, 2020 1:03 am
Location: Christchurch, NZ

Re: ntptime kills the wifi station, hard reset required to fix

Post by davef » Tue Apr 06, 2021 6:34 am

kwiley,

Good to see someone trying to sort this. See my last post here:
viewtopic.php?f=16&t=3675&start=10
and the "performance" I had to go through to get reliable ntptime().

Good luck!

BTW, -202 is MicroPython's form of Python's 202, i.e. the minus sign is specific to MicroPython. Evidently this applies to errors that Python would raise as socket.gaierror.
Last edited by davef on Sun Apr 18, 2021 3:35 am, edited 1 time in total.

kwiley
Posts: 140
Joined: Wed May 16, 2018 5:53 pm

Re: ntptime kills the wifi station, hard reset required to fix

Post by kwiley » Wed Apr 07, 2021 12:35 pm

Has anyone attempted to replicate my problem posted above? I included a single conglomerated code snippet that you can simply copy out of the forum and paste into the REPL. You'll have to assign your Wi-Fi endpoint name and password on one line, but other than that, it's a straight copy and paste test. I would like to know if other people have the same problem.

Thanks.

davef
Posts: 811
Joined: Thu Apr 30, 2020 1:03 am
Location: Christchurch, NZ

Re: ntptime kills the wifi station, hard reset required to fix

Post by davef » Thu Apr 08, 2021 9:18 am

I ran the script on an ESP32, which connects to a hotspot, and got this:

Code: Select all

>>> import ntp_test
Connecting...
Connected
HTTP/1.1 200 OK
Server: nginx/1.10.3
Date: Thu, 08 Apr 2021 09:15:02 GMT
Content-Type: text/html
Content-Length: 180
Last-Modified: Tue, 03 Dec 2013 00:16:26 GMT
Connection: close
Vary: Accept-Encoding
ETag: "529d22da-b4"
Strict-Transport-Security: max-age=15768000
Accept-Ranges: bytes

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Test</title>
    </head>
    <body>
        <h1>Test</h1>
        It's working if you can read this!
    </body>
</html>
HTTP/1.1 200 OK
Server: nginx/1.10.3
Date: Thu, 08 Apr 2021 09:15:04 GMT
Content-Type: text/html
Content-Length: 180
Last-Modified: Tue, 03 Dec 2013 00:16:26 GMT
Connection: close
Vary: Accept-Encoding
ETag: "529d22da-b4"
Strict-Transport-Security: max-age=15768000
Accept-Ranges: bytes

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Test</title>
    </head>
    <body>
        <h1>Test</h1>
        It's working if you can read this!
    </body>
</html>

Any help?

kwiley
Posts: 140
Joined: Wed May 16, 2018 5:53 pm

Re: ntptime kills the wifi station, hard reset required to fix

Post by kwiley » Thu Apr 08, 2021 10:58 pm

That would seem to indicate that it ran to the end successfully. I have an ESP32 I could test it with too, I suppose, and scads of ESP8266s lying around. That would seem to suggest that the PyBoard D specific firmware has a problem. I'd love for someone else with a D to try this. I have three Ds myself, I suppose. Just curious what others get.

Thanks again.

davef
Posts: 811
Joined: Thu Apr 30, 2020 1:03 am
Location: Christchurch, NZ

Re: ntptime kills the wifi station, hard reset required to fix

Post by davef » Fri Apr 09, 2021 2:50 am

I'd like to have 1 or 2 PYboard Ds. I think Peter Hinch has one.

The issue I had, I think, was that ntptime() was throwing an error that I couldn't catch, which led to the ESP32 just freezing. That was in addition to the TIMEOUT errors it often threw.

A software WDT "sorted" that issue on the ESP32 and wasn't required on the ESP8266.

pythoncoder
Posts: 5956
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: ntptime kills the wifi station, hard reset required to fix

Post by pythoncoder » Fri Apr 09, 2021 1:02 pm

Perhaps I'm misunderstanding the problem here, but this works for me on a Pyboard D SF2W:

Code: Select all

MicroPython v1.13-107-g18518e26a-dirty on 2020-11-02; PYBD-SF2W with STM32F722IEK
Type "help()" for more information.
>>> import do_connect
>>> do_connect.do_connect()
connecting to network...
network config: ('192.168.0.36', '255.255.255.0', '192.168.0.1', '208.67.220.220')
MAC 48:4a:30:01:9d
>>> import ntptime
>>> ntptime.time()
671288087
>>> ntptime.time()
671288091
>>> ntptime.time()
671288094
>>> 

The do_connect script is:

Code: Select all

def do_connect():
    import network
    sta_if = network.WLAN(network.STA_IF)
    ap = network.WLAN(network.AP_IF) # create access-point interface
    ap.active(False)         # deactivate the interface
    if not sta_if.isconnected():
        print('connecting to network...')
        sta_if.active(True)
        sta_if.connect('ssid', 'password')  # redacted
        while not sta_if.isconnected():
            pass
    print('network config:', sta_if.ifconfig())
    a = sta_if.config('mac')
    print('MAC {:02x}:{:02x}:{:02x}:{:02x}:{:02x}'.format(a[0],a[1],a[2],a[3],a[4]))

Peter Hinch
Index to my micropython libraries.

davef
Posts: 811
Joined: Thu Apr 30, 2020 1:03 am
Location: Christchurch, NZ

Re: ntptime kills the wifi station, hard reset required to fix

Post by davef » Fri Apr 16, 2021 10:17 pm

Peter,

Thank you for posting a test procedure for this issue. I ran through this procedure multiple times with esp32-idf4-20210202-v1.14.bin.

Code: Select all

...
671927753
>>> ntptime.time()
671927754
>>> ntptime.time()
671927754
>>> ntptime.time()
-3155673600
>>> ntptime.time()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "ntptime.py", line 25, in time
OSError: [Errno 110] ETIMEDOUT

Is that -3155673600 indicative of the problem? Actually, after trying this a few more times, it seems that if I don't wait more than one (?) second between tries, I get the error.

Thanks
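
The value -3155673600 is exactly 0 minus the NTP_DELTA constant (3155673600) that ntptime.py subtracts, so the four bytes unpacked from that reply were all zero. A minimal sketch of a guard against that case follows. `parse_ntp_reply` is a hypothetical helper, not part of ntptime.py, and why the server sent such a reply is not established in this thread.

```python
import struct

# Seconds between 1900-01-01 (NTP epoch) and 2000-01-01, as used in ntptime.py.
NTP_DELTA = 3155673600


def parse_ntp_reply(msg):
    """Convert an NTP reply to seconds since 2000-01-01.

    Raises ValueError instead of returning -NTP_DELTA when the
    transmit-timestamp seconds field (bytes 40..43) is zero.
    """
    if len(msg) < 44:
        raise ValueError("short NTP reply: %d bytes" % len(msg))
    val = struct.unpack("!I", msg[40:44])[0]
    if val == 0:
        raise ValueError("NTP reply carries a zero transmit timestamp")
    return val - NTP_DELTA
```

With a check like this, a caller can treat the zero reply as a failed query and try again later rather than setting the clock from it.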

davef
Posts: 811
Joined: Thu Apr 30, 2020 1:03 am
Location: Christchurch, NZ

Re: ntptime kills the wifi station, hard reset required to fix

Post by davef » Tue Apr 20, 2021 12:34 am

Still struggling with this issue on an ESP32. I managed to log one error calling ntptime.settime(). Unfortunately, I am not able to catch the system error, because the device appears to lock up on an unsuccessful call to ntptime.settime(), i.e. "pass" is not executed so that it can try again.

Code: Select all

    while (count < 10):
        count += 1

        try:
            my_time = ntptime.settime() # set the rtc datetime from the remote server
        except:
            try:
                with open('errors.txt', 'a') as outfile:
                    outfile.write('ntptime.settime() failed'  + '\n')
            except OSError:
                pass

        print('.', end = '')
        utime.sleep(5)

        if (str(my_time) == 'None'):
            count = 0
            break

Without modifying ntptime.settime(), is there some other way to display what is going wrong? I have changed the logging so that the next time it happens I will see what "my_time" is.
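
One way to record what went wrong, without modifying ntptime.settime(), is to bind the exception in the handler and log its type and arguments instead of a fixed string. The sketch below assumes the failure surfaces as a Python exception; if the board hangs inside the call, no handler will run. `retry_call`, `log` and `sleep` are hypothetical names chosen for illustration.

```python
import time


def retry_call(func, attempts=10, delay=5, log=print, sleep=time.sleep):
    """Call func() up to `attempts` times; return True once it succeeds.

    Each failure is reported through `log` with the exception's type and
    arguments, so the log shows the actual error rather than a fixed string.
    """
    for attempt in range(1, attempts + 1):
        try:
            func()
            return True
        except Exception as exc:
            log("attempt %d failed: %s %r" % (attempt, type(exc).__name__, exc.args))
            sleep(delay)
    return False
```

For example, `retry_call(ntptime.settime, log=append_line)` would retry the sync, where `append_line` stands for a function of your own that appends one line to errors.txt.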
