[SOLVED] uasyncio is blocked for 1 hour, 11 minutes on ESP8266

All ESP8266 boards running MicroPython.
Official boards are the Adafruit Huzzah and Feather boards.
Target audience: MicroPython users with an ESP8266 board.
kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by kevinkk525 » Tue May 21, 2019 9:12 am

jomas wrote:
Tue May 21, 2019 9:04 am
MICROPY_PY_UTIME_TICKS_PERIOD = 2^30.
So MICROPY_PY_UTIME_TICKS_PERIOD-1 is used as a mask of 30bits (0b111111111111111111111111111111)
Ah thanks. This however gives exactly the same result.
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by kevinkk525 » Tue May 21, 2019 10:25 am

The micropython test revealed the offset (same as Roberthh's finding):

Code: Select all

>>> while True:
...     gc.collect()
...     t=time.ticks_ms()
...     if time.ticks_us()<20:
...         print("us overflow")
...     time.sleep_us(20)
...     delta = time.ticks_diff(time.ticks_ms(), t)
...     if delta < 0 or delta >5:
...         print("Fail time interval", delta, "t",t,"now:",time.ticks_ms())
...
...
...
Fail time interval 4294967 t 1 now: 4294968
I realized I had a mistake in my test logic in the arduino code, this is the corrected version:

Code: Select all

uint32_t system_time_high_word=0;
uint32_t current_time = 4294967295;  //just before overflow
uint32_t MICROPY_PY_UTIME_TICKS_PERIOD =1073741824; 

uint32_t system_get_time(void) {
  return current_time;
}

uint32_t mp_hal_ticks_ms(void) {
    return ((uint64_t)system_time_high_word << 32 | (uint64_t)system_get_time()) / 1000;
}

int32_t time_ticks_diff(uint32_t end_in, uint32_t start_in) {
    // we assume that the arguments come from ticks_xx so are small ints
    uint64_t start = uint64_t(start_in);
    uint64_t end = uint64_t(end_in);
    // Optimized formula avoiding if conditions. We adjust difference "forward",
    // wrap it around and adjust back.
    int64_t diff = ((end - start + MICROPY_PY_UTIME_TICKS_PERIOD / 2) & (MICROPY_PY_UTIME_TICKS_PERIOD - 1))
                   - MICROPY_PY_UTIME_TICKS_PERIOD / 2;
    return int32_t(diff);
}

uint32_t time_ticks_ms(void) {
  return uint32_t(mp_hal_ticks_ms() & (MICROPY_PY_UTIME_TICKS_PERIOD -1));
}

void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  uint32_t cur_time_hal = mp_hal_ticks_ms();
  uint32_t cur_time_py = time_ticks_ms();
  Serial.print("Current time hal: ");Serial.print(cur_time_hal);Serial.println("ms");
  Serial.print("Current time upy: ");Serial.print(cur_time_py);Serial.println("ms");
  Serial.print("Current time counter uint32_t: ");Serial.println(current_time);
  
  current_time++; // let counter overflow
  Serial.println("Next ms, overflow of counter");
  cur_time_hal = mp_hal_ticks_ms();
  cur_time_py = time_ticks_ms();
  Serial.print("Current time hal with overflow in system_get_time without high word increment: ");Serial.print(cur_time_hal);Serial.println("ms");
  Serial.print("Current time upy with overflow in system_get_time without high word increment: ");Serial.print(cur_time_py);Serial.println("ms");

  Serial.println("Increment system_time_high_word");
  system_time_high_word++;
  uint32_t later_time_hal = mp_hal_ticks_ms();
  uint32_t later_time_py = time_ticks_ms();
  
  Serial.print("Later time hal with high word increment: ");Serial.print(later_time_hal);Serial.println("ms");
  Serial.print("Later time upy with high word increment: ");Serial.print(later_time_py);Serial.println("ms");
  Serial.print("time diff hal: ");Serial.print(later_time_hal-cur_time_hal);Serial.println("ms");
  Serial.print("time diff upy: ");Serial.print(time_ticks_diff(later_time_py, cur_time_py));Serial.println("ms");
}
Output:

Code: Select all

12:16:06.603 -> Current time hal: 4294967ms
12:16:06.603 -> Current time upy: 4294967ms
12:16:06.603 -> Current time counter uint32_t: 4294967295
12:16:06.603 -> Next ms, overflow of counter
12:16:06.603 -> Current time hal with overflow in system_get_time without high word increment: 0ms
12:16:06.603 -> Current time upy with overflow in system_get_time without high word increment: 0ms
12:16:06.637 -> Increment system_time_high_word
12:16:06.637 -> Later time hal with high word increment: 4294967ms
12:16:06.637 -> Later time upy with high word increment: 4294967ms
12:16:06.637 -> time diff hal: 4294967ms
12:16:06.637 -> time diff upy: 4294967ms
It clearly shows that if you poll the time before system_time_high_word gets incremented, you will create a time.ticks_diff of 2^32 us.
This directly affects uasyncio which will completely block for this period.

This behaviour can be observed with the (faster) testing code above every time the tick_us() counter (system_get_time) overflows.

A possible fix would be to change mp_hal_ticks_ms() [it's the only function using system_high_word] to this:

Code: Select all

uint32_t mp_hal_ticks_ms(void) {
    uint32_t system_time_cur = system_get_time();
    if (system_time_cur < system_time_prev) {
        system_time_high_word +=1; //record overflow of low 32-bits
    }
    system_time_prev = system_time_cur
    return ((uint64_t)system_time_high_word << 32 | (uint64_t)system_time_cur) / 1000;
}
and record system_time_prev in esp_mphal.c instead of ets_alt_task.c and in ets_alt_task just call mp_hal_ticks_ms to ensure that the overflow is recognized in case no other program is calling time.ticks_ms().

I'm of course not that familiar with the rest of the micropython code so there may be problems with that fix. Also I haven't tested it yet.
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

jomas
Posts: 59
Joined: Mon Dec 25, 2017 1:48 pm
Location: Netherlands

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by jomas » Tue May 21, 2019 11:00 am

That could be tricky. If for example in an interrupt occurs which calls 'ets_loop_iter', in the middle of 'mp_hal_ticks_ms', then
'system_time_high_word' may be incremented 2 times. I think you should protect the code with ets_intr_lock();/ets_intr_unlock();
just as in 'ets_loop_iter'.

Or 'mp_hal_ticks_ms' should have it's own counter or there should be a safe 64-bit uSecond-counter be implemented somewhere.
I think it's now time for the core-developers of Micropython to fix this bug. Without this fix uasyncio cannot be used reliable on an esp8266 for programs that last for long times.

User avatar
Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by Roberthh » Tue May 21, 2019 11:12 am

if not already done, someone of you should raise an issue and link to this post, which contains all the information. Mayb @jomas as the one who started this thread.

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by kevinkk525 » Tue May 21, 2019 11:28 am

jomas wrote:
Tue May 21, 2019 11:00 am
That could be tricky. If for example in an interrupt occurs which calls 'ets_loop_iter', in the middle of 'mp_hal_ticks_ms', then
'system_time_high_word' may be incremented 2 times. I think you should protect the code with ets_intr_lock();/ets_intr_unlock();
just as in 'ets_loop_iter'.

Or 'mp_hal_ticks_ms' should have it's own counter or there should be a safe 64-bit uSecond-counter be implemented somewhere.
I think it's now time for the core-developers of Micropython to fix this bug. Without this fix uasyncio cannot be used reliable on an esp8266 for programs that last for long times.
You are right, I forgot about the lock.

I will add all this information with an explanation to the issue I opened. With all this the devs should have no problem confirming and fixing the problem.
Roberthh: I created an issue for jomas as he has no github account: https://github.com/micropython/micropython/issues/4795
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

User avatar
pythoncoder
Posts: 5956
Joined: Fri Jul 18, 2014 8:01 am
Location: UK
Contact:

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by pythoncoder » Wed May 22, 2019 8:10 am

Congratulations, guys, on identifying this bug. Great work :!:
Peter Hinch
Index to my micropython libraries.

User avatar
pythoncoder
Posts: 5956
Joined: Fri Jul 18, 2014 8:01 am
Location: UK
Contact:

Re: uasyncio is blocked for 1 hour, 11 minutes (Bug?)

Post by pythoncoder » Wed May 22, 2019 8:18 am

jomas wrote:
Tue May 21, 2019 8:36 am
@pythoncoder,
Your suggested mp_hal_ticks_ms will not work because system_get_time() will roll-over after 1:11H.
...
You are absolutely right. I'd misread the code, so my observations about it were wrong :oops:
Peter Hinch
Index to my micropython libraries.

Post Reply