[SOLVED]garbage collector + stack of startup task

C programming, build, interpreter/VM.
Target audience: MicroPython Developers.
BramPeeters
Posts: 54
Joined: Wed Jan 31, 2018 3:10 pm

[SOLVED]garbage collector + stack of startup task

Post by BramPeeters » Tue Sep 11, 2018 9:46 am

Hi,

I am running into a problem that a thread that was blocked some time and then continues crashes if the garbage collector was active in the mean time, so obviously something gets cleaned up that shouldn't (it works if i do not do collect).

So I am trying to understand the GC code to figure out what goes wrong, and one of the problems I see is that it seems that it only accepts pointers that are located within the heap (VERIFY_PTR). This implies that the stack of the startup task (eg in the stm32 port , the task that runs 'main') is ignored (not sure this is my problem but it sure looks suspicious). Can anyone with more insight in the GC confirm that this is indeed a bug, or if not why ?
As for a solution, i assume that would be to make a new thread with a stack located in the heap to run python code (instead of running python code from the startup task directly)?

Speaking of stacks, i came accross the following in 'mp_thread_gc_others'
gc_collect_root(th->stack, th->stack_len); // probably not needed

Why would it not be needed to check the stacks of the tasks ? Or is it because gc_collect_root((void**)&th, 1); is alrady added so any pointers (suck as the stack) in the th 'block' will already be covered ?

EDIT/ TLDR : problem was caused due to incorrectly configured stack size in thread init call of the main/initial task. VERIFY_PTR is not a problem for a stack not in the heap because the stack addresses are dereferenced before they are passed to VERIFY_PTR.

Regards,
Bram
Last edited by BramPeeters on Mon Sep 17, 2018 8:15 pm, edited 1 time in total.

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: garbage collector + stack of startup task

Post by jickster » Tue Sep 11, 2018 3:55 pm

BramPeeters wrote:
Tue Sep 11, 2018 9:46 am
Hi,

I am running into a problem that a thread that was blocked some time and then continues crashes if the garbage collector was active in the mean time, so obviously something gets cleaned up that shouldn't (it works if i do not do collect).

So I am trying to understand the GC code to figure out what goes wrong, and one of the problems I see is that it seems that it only accepts pointers that are located within the heap (VERIFY_PTR). This implies that the stack of the startup task (eg in the stm32 port , the task that runs 'main') is ignored (not sure this is my problem but it sure looks suspicious). Can anyone with more insight in the GC confirm that this is indeed a bug, or if not why ?
It's not a bug.


There's things called "root-pointers" aka "GC-roots"
Image
These are objects that the GC is never supposed to collect.
Go to mpstate.h and search for "END ROOT POINTER SECTION" and you'll find the objects that are never supposed to be collected.
Furthermore, if there is a reference chain "downward" from the root objects, those objects are not collected either.

How is that implemented?

Code: Select all

void **ptrs = (void**)(void*)&mp_state_ctx;
    size_t root_start = offsetof(mp_state_ctx_t, thread.dict_locals);
    size_t root_end = offsetof(mp_state_ctx_t, vm.qstr_last_chunk);
    gc_collect_root(ptrs + root_start / sizeof(void*), (root_end - root_start) / sizeof(void*));
That code is saying: "this area of memory holds root objects. Any object they hold a reference to (and any objects those hold a reference to . . . etc) MARK as 'do not collect' for when we call the SWEEP phase of the gc"

The VERIFY_PTR macro says "if the pointer doesn't point into the heap, don't worry about it . . . because the gc is only supposed to touch the heap"

But this doesn't take care of the stack which the following code does.

Code: Select all

/ get the registers and the sp
    uintptr_t regs[10];
    uintptr_t sp = gc_helper_get_regs_and_sp(regs);

    // trace the stack, including the registers (since they live on the stack in this function)
    #if MICROPY_PY_THREAD
    gc_collect_root((void**)sp, ((uint32_t)MP_STATE_THREAD(stack_top) - sp) / sizeof(uint32_t));
    #else
    gc_collect_root((void**)sp, ((uint32_t)&_ram_end - sp) / sizeof(uint32_t));
    #endif

    // trace root pointers from any threads
    #if MICROPY_PY_THREAD
    mp_thread_gc_others();
    #endif
BramPeeters wrote:
Tue Sep 11, 2018 9:46 am
Speaking of stacks, i came accross the following in 'mp_thread_gc_others'
gc_collect_root(th->stack, th->stack_len); // probably not needed

Why would it not be needed to check the stacks of the tasks ? Or is it because gc_collect_root((void**)&th, 1); is alrady added so any pointers (suck as the stack) in the th 'block' will already be covered ?

Regards,
Bram

Code: Select all

void mp_thread_gc_others(void) {
    mp_thread_mutex_lock(&thread_mutex, 1);
    for (thread_t *th = thread; th != NULL; th = th->next) {
        gc_collect_root((void**)&th, 1);
        gc_collect_root(&th->arg, 1); // probably not needed
        if (th->id == xTaskGetCurrentTaskHandle()) {
            continue;
        }
        if (!th->ready) {
            continue;
        }
        gc_collect_root(th->stack, th->stack_len); // probably not needed
    }
    mp_thread_mutex_unlock(&thread_mutex);
}
Because these 3 statements accomplish the same thing

Code: Select all

gc_collect_root((void**)&th, 1);
        gc_collect_root(&th->arg, 1); // probably not needed
        gc_collect_root(th->stack, th->stack_len); // probably not needed
        

BramPeeters
Posts: 54
Joined: Wed Jan 31, 2018 3:10 pm

Re: garbage collector + stack of startup task

Post by BramPeeters » Tue Sep 11, 2018 7:45 pm

Hi,

Actually, I am now pretty convinced it is a bug, because if i fixed it my problem was gone :)

>The VERIFY_PTR macro says "if the pointer doesn't point into the heap, don't worry about it . . . because the gc is only supposed to touch the heap"

Exactly, which is why there is a bug because the initial stack is not on the heap in the stm32 port and others, but it very much might hold references to objects that are (and that are still alive). So these objects are not 'marked', and therefore cleaned up while still being referenced to.

I have no idea what exactly is put on there but it must be pretty important since my task crashes somewhere in the VM after resuming before even running the next python instruction.

In the initial version in verbose mode i got around 20 sweep messages, after moving code around to first init the GC and then malloc a stack using the GC for the initial task in which the script is ran, only around 5 sweep messages appear.
So around 15 things got cleaned up incorrectly, and this was with a 'simple' test script.

Further remarks:
But this doesn't take care of the stack which the following code does.

Code: Select all

gc_collect_root((void**)sp, ((uint32_t)MP_STATE_THREAD(stack_top) - sp) / sizeof(uint32_t));
Actually that takes care of checking the stack of the task that called gc.collect(), which in my test script is not the initial task.
The stack of the initial task would have been added via the mp_thread_gc_others if it weren't rejected.

Now that i think about it, since the thread_t object in which the initial thread parameters are stored (the tail of the list of thread_ objects) is a static object
STATIC thread_t *thread; // root pointer, handled bp mp_thread_gc_others
it is also being ignored when an attempt is made via
gc_collect_root((void**)&th, 1);
in mp_thread_gc_others

So these 3 things

Code: Select all

gc_collect_root((void**)&th, 1);
gc_collect_root(&th->arg, 1); // probably not needed
gc_collect_root(th->stack, th->stack_len); // probably not needed
do *not* accomplish the same thing with a th that is not in the GC heap (which is the case for the th configured by mp_thread_init) !

And my fix would still not have worked if gc_collect_root(th->stack, th->stack_len); was not there, because even though my stack is now in the GC heap, STATIC thread_t *thread is not so gc_collect_root((void**)&th, 1); would still refuse to check the stack.

The more i think about it, the wronger it seems. Am i missing something obvious here ? The problem comes down to the fact that any pointer that you try to pass to the gc_collect_root function that is not in the GC heap is useless because it will be completely ignored !?

BramPeeters
Posts: 54
Joined: Wed Jan 31, 2018 3:10 pm

Re: garbage collector + stack of startup task

Post by BramPeeters » Tue Sep 11, 2018 7:53 pm

(also: after moving the stack to the GC heap which is in external sram, the lexer is now dead slow ( 9 seconds which triggers my watchdog versus 3s with the stack still in internal memory of the microcontroller) so i am still not out of the woods :( )

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: garbage collector + stack of startup task

Post by jickster » Tue Sep 11, 2018 8:23 pm

You ignored my words about root pointers and END ROOT POINTER SECTION.


Do you understand what this macro does?

Code: Select all

#define MICROPY_PORT_ROOT_POINTERS \
    const char *readline_hist[8]; \
    \
    mp_obj_t pyb_hid_report_desc; \
    \
    mp_obj_t pyb_config_main; \
    \
    mp_obj_t pyb_switch_callback; \
    \
    mp_obj_t pin_class_mapper; \
    mp_obj_t pin_class_map_dict; \
    \
    mp_obj_t pyb_extint_callback[PYB_EXTI_NUM_VECTORS]; \
    \
    /* pointers to all Timer objects (if they have been created) */ \
    struct _pyb_timer_obj_t *pyb_timer_obj_all[MICROPY_HW_MAX_TIMER]; \
    \
    /* stdio is repeated on this UART object if it's not null */ \
    struct _pyb_uart_obj_t *pyb_stdio_uart; \
    \
    /* pointers to all UART objects (if they have been created) */ \
    struct _pyb_uart_obj_t *pyb_uart_obj_all[MICROPY_HW_MAX_UART]; \
    \
    /* pointers to all CAN objects (if they have been created) */ \
    struct _pyb_can_obj_t *pyb_can_obj_all[2]; \
    \
    /* list of registered NICs */ \
    mp_obj_list_t mod_network_nic_list;

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: garbage collector + stack of startup task

Post by jickster » Tue Sep 11, 2018 9:12 pm

BramPeeters wrote:
Tue Sep 11, 2018 7:45 pm

The more i think about it, the wronger it seems. Am i missing something obvious here ? The problem comes down to the fact that any pointer that you try to pass to the gc_collect_root function that is not in the GC heap is useless because it will be completely ignored !?
Yes, you're missing something obvious: END ROOT POINTER SECTION.

BramPeeters
Posts: 54
Joined: Wed Jan 31, 2018 3:10 pm

Re: garbage collector + stack of startup task

Post by BramPeeters » Tue Sep 11, 2018 9:41 pm

Ok seriously I do not know what to make from your replies, either I am having a brain fart and there is something I am missing in which case i apologize or you do not understand what I am trying to point out (or you are having a bad day and you are just trolling with me here).

Assuming you are serious and trying to help (you did make an attempt yourself to understand the GC in viewtopic.php?f=3&t=4069&start=30 which i found really helpful) i will go over it one last time like I understand the GC so you can check where it does not match with your interpretation.

I did not ignore the "END ROOT POINTER SECTION", i did not further address it because
[1] as you mention yourself it is not relevant for adding the stack as a root pointer. It is interesting background info about the GC, but nothing more for this particular issue.
[2] as you point out, even those elements ultimately just pass via gc_collect_root which is the problem area, which I did further address

From what I understand the macro points to references the GC process uses as a start point for looking for further references to 'mark' so they are not 'collected'.
It contains a bunch of 'defaults' and it can be extended with MICROPY_PORT_ROOT_POINTERS

Nothing in there is remotely relevant for the stack problem I am talking about. If I am wrong and it is relevant for the 'initial' stack, can you elaborate how ?

And do you agree that anything that is directly passed to gc_collect_root, or indirectly via gc_drain_stack passes via

Code: Select all

// ptr should be of type void*
#define VERIFY_PTR(ptr) ( \
        ((uintptr_t)(ptr) & (BYTES_PER_BLOCK - 1)) == 0      /* must be aligned on a block */ \
        && ptr >= (void*)MP_STATE_MEM(gc_pool_start)     /* must be above start of pool */ \
        && ptr < (void*)MP_STATE_MEM(gc_pool_end)        /* must be below end of pool */ \
    )

// ptr should be of type void*
#define VERIFY_MARK_AND_PUSH(ptr) \
    do { \
        if (VERIFY_PTR(ptr)) { \
            size_t _block = BLOCK_FROM_PTR(ptr); \
            if (ATB_GET_KIND(_block) == AT_HEAD) { \
                /* an unmarked head, mark it, and push it on gc stack */ \
                DEBUG_printf("gc_mark(%p)\n", ptr); \
                ATB_HEAD_TO_MARK(_block); \
                if (MP_STATE_MEM(gc_sp) < &MP_STATE_MEM(gc_stack)[MICROPY_ALLOC_GC_STACK_SIZE]) { \
                    *MP_STATE_MEM(gc_sp)++ = _block; \
                } else { \
                    MP_STATE_MEM(gc_stack_overflow) = 1; \
                } \
            } \
        } \
    } while (0)
and that these macro's will ignore anything that is not on the GC heap.

So therefore a stack (such as the initial stack) or other element (eg static variable) that that is not in the heap area will not be further searched for references, no matter if you put it in the macro (since the elements of the macro are passed via gc_collect_root).

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

Re: garbage collector + stack of startup task

Post by jickster » Tue Sep 11, 2018 10:14 pm

This DOES have to do with the stack and I'll explain how; I experienced this issue of crashing after a gc_collect().

Say that some initialization function is called that creates objects on the heap. Could be a float, a module, etc.

Code: Select all

main() -> ... -> /* init functions to set-up  micropython heap and stack stack*/
main() -> ... -> init_uPyObj() // create heap objects and expose to Python
Now we've left function init_uPyObj() so the stack is gone.

Code: Select all

main() -> ... -> execute_script() // this will trigger a gc eventually
Since the stack of init_uPyObj() is totally and completely erased, what will happen to the heap-objects created in init_uPyObj() ABSENT SOME OTHER MECHANISM?
They'll be collected and you'll see a crash when you try accessing those heap-objects.

So how do we protect them from a gc?
We place the objects that we created in MICROPY_PORT_ROOT_POINTERS so when a gc happens, the collection will consider those pointers "roots".

Here's my example:

Code: Select all

// Any root pointers for GC scanning - see mpstate.c
#define MICROPY_PORT_ROOT_POINTERS \
	mp_obj_module_t device_module;
In my case, I created a module that always appears in the environment which is accessed like

Code: Select all

device.<HWmodule>.<message_type>
Everything below device was created on the heap in my version of init_uPyObj() .
Before I declared mp_obj_module_t device_module as a root pointer, it would crash when I'd try to access device.<HWmodule>

BramPeeters
Posts: 54
Joined: Wed Jan 31, 2018 3:10 pm

Re: garbage collector + stack of startup task

Post by BramPeeters » Tue Sep 11, 2018 11:33 pm

Ok, thanks for the elaboration.
I now fully understand what you are saying, and i should absolutely recheck my code to see if I do stuff like that but that is a problem you will have with any thread/stack (no matter where it is located). (I have threading enabled and have several threads running)

The issue I am addressing is specific to the initial stack, the one that is running main. It is that eg even while you are still executing in the init_uPyObj() (to use your example), and then the GC runs, it will clean up the object because it is ignoring this and only this particular initial stack ( because the stack is outside of the GC heap ).

So to make a super artificial example to clearly illustrate the problem in a way you easily should be able to try/reproduce (mp_new/gc_collect function names might be wrong as i am typing this out of my head but you get the idea):

Code: Select all

main() -> ... -> init_uPyObj()
with

Code: Select all

void init_uPyObj (void ) 
{
  someobject myobject = mp_new( xxx );
  gc_collect();

  /* At this point myobject is already incorrectly cleaned 
   * up by the GC since this stack is ignored so gc_collect is under 
   * the impression that myobject is not referenced anymore 
   */
  ...
}

This is of course not something i am actually doing , I am triggering gc.collect from an extra thread every 100ms from within python , so from within execute_script(). And it is in the VM itself i crash when resuming from yet another task (not the initial one) even before running the next python level instruction, so not when trying to access some object I created . I am still not sure what it is exactly that is referred to from the initial stack that gets cleaned up if i do not move the stack to within the heap so that it keeps existing.

Further complication:
For stacks that are not ignored by the GC, it will scan the ENTIRE (*) stack and not just the stack in use, with the sole exception if it happens to be the initial thread that calls the gc.collect() and has a stack that is not ignored.
So if you had a deeply nested function call that created some object , and then do not refer to the object anymore but you don't go so deep anymore for a while, the object will keep existing. This might hide some issues (and they will only sporadically surface the next time you have a deeply nested call).
(*) The reason is an attempt is made to limit the scanning to the used portion in gc_collect_root((void**)sp, ((uint32_t)MP_STATE_THREAD(stack_top) - sp) / sizeof(uint32_t));, but then mp_thread_gc_others it will add the entire stack memory range of all tasks anyway via gc_collect_root((void**)&th, 1); or via gc_collect_root(th->stack, th->stack_len); for the initial task.
The exception for the initial task if it is calling gc.collect() because then th is pointing to a static object, and it is also not added via gc_collect_root(th->stack, th->stack_len); because of the if (th->id == xTaskGetCurrentTaskHandle()) test.

jickster
Posts: 629
Joined: Thu Sep 07, 2017 8:57 pm

garbage collector + stack of startup task

Post by jickster » Wed Sep 12, 2018 1:22 am

If you call gc_collect(), the correct stack is correctly taken into account so your entire first example is incorrect.


Sent from my iPhone using Tapatalk Pro

Post Reply