[UNIX] Setting stack size/max recursion depth

bwyld
Posts: 7
Joined: Tue Oct 19, 2021 3:03 pm

Post by bwyld » Tue Oct 19, 2021 3:15 pm

Hi

I am trying to use the unix port as a test environment (for code targeting a pycom board).
However, my code is relatively complex, and on startup I get a 'max recursion depth exceeded' error.
I understand this is actually a 'max stack size exceeded' check (i.e. I get a clean error because MICROPY_STACK_CHECK is set to (1)). Indeed, if I set the check to (0), it just core dumps, which is not the desired behaviour.

I had experienced this kind of error on the pycom _thread port, and used the _thread.set_stack_size() call to fix it (mostly, there are other restrictions on the max stack size in that ESP32 port).

However, on the UNIX port this occurs on the 'main' thread, and I cannot work out how to increase its stack size. I can't find a define in the header files to configure this.

Can someone point me in the right direction?
By the way, I am running all of this in a Docker image, so I would like to be able to configure this via a parameter to the makefile if possible.

thanks!

PS: my code is NOT using recursion, it's just (IMHO) correctly structured, so refactoring to reduce the maximum function call depth is not a viable option.

stijn
Posts: 735
Joined: Thu Apr 24, 2014 9:13 am

Re: [UNIX] Setting stack size/max recursion depth

Post by stijn » Wed Oct 20, 2021 6:13 am

A lot of nested function calls can trigger this error as well, not necessarily recursion.
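To see roughly how many nested calls the current limit permits, the depth can be probed directly. This is only a minimal sketch: it assumes that hitting the limit raises RuntimeError (as the 'max recursion depth exceeded' message suggests), and the helper name max_call_depth is made up for illustration.

```python
def max_call_depth():
    """Nest calls until the interpreter refuses, then report the depth reached."""
    depth = 0

    def descend():
        nonlocal depth
        depth += 1
        descend()  # keep nesting until the stack check trips

    try:
        descend()
    except RuntimeError:
        # Raised when the stack/recursion limit is exceeded.
        pass
    return depth


print("nested calls before the limit:", max_call_depth())
```

The number is only indicative: frames of real application code with more locals and arguments will likely use more stack per call than this trivial function, so the usable depth in practice can be lower.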

See main() in unix/main.c, where it sets the stack limit: that should be the number you want to change. It is not configurable via the build at the moment, but that functionality is easy to add.

bwyld
Posts: 7
Joined: Tue Oct 19, 2021 3:03 pm

Re: [UNIX] Setting stack size/max recursion depth

Post by bwyld » Sun Oct 24, 2021 5:48 pm

Ok, I modified main.c around line 455:
#ifndef STACK_LIMIT
#define STACK_LIMIT (40000)
#endif
// Define a reasonable stack limit to detect stack overflow.
mp_uint_t stack_limit = STACK_LIMIT * (sizeof(void *) / 4);

And then built it with -DSTACK_LIMIT=200000 (so a 5x increase from the previous hardcoded value of 40000)
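To make the arithmetic in the modified line concrete: the macro value is multiplied by sizeof(void *) / 4, so the resulting byte limit depends on the pointer size of the build. A small sketch of that calculation follows; the function name is illustrative, and pointer sizes of 4 and 8 bytes are assumed.

```python
def effective_stack_limit(stack_limit_macro, pointer_size_bytes):
    """Mirror `STACK_LIMIT * (sizeof(void *) / 4)` using integer division, as C does."""
    return stack_limit_macro * (pointer_size_bytes // 4)


# Assumed pointer sizes: 4 bytes (32-bit build) and 8 bytes (64-bit build).
for macro in (40000, 200000):
    for ptr in (4, 8):
        print(macro, ptr, effective_stack_limit(macro, ptr))
```

Under these assumptions, a 64-bit build goes from a check limit of 80000 bytes to 400000 bytes. This is only the value the check compares against, not an allocation.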
However, it now gives a segmentation fault...
/flash/run.sh: line 3: 90 Segmentation fault /micropython/ports/unix/micropython -X heapsize=16M /flash/boot.py

This is a line that does socket.getaddrinfo(). The actual getaddrinfo() call works if I do it from the REPL, so I guess it's running off the end of the stack...

This implies that the call to
mp_stack_set_limit(stack_limit);
just sets the limit and doesn't actually increase the stack size? I set a 16 MB heap size on the grounds that this would give me loads of space...

Any pointers on how to set up the stack/heap correctly?

Thanks!

stijn
Posts: 735
Joined: Thu Apr 24, 2014 9:13 am

Re: [UNIX] Setting stack size/max recursion depth

Post by stijn » Mon Oct 25, 2021 8:46 am

Isn't the stack in use the C stack, limited only by memory? The stack limit is indeed just a number used to check the size. Can you attach a C debugger and see what is going on?
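On Unix-like systems the C stack of a process's main thread is normally capped by the RLIMIT_STACK resource limit, so it can help to check what the environment (for example the Docker container) allows. A minimal sketch, meant to be run with CPython on the same host or container: the resource module used here is part of CPython's standard library and is assumed not to be available in MicroPython, and the helper name is illustrative.

```python
import resource


def describe_os_stack_limit():
    """Return the soft and hard RLIMIT_STACK values as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return "%d bytes" % value

    return fmt(soft), fmt(hard)


soft, hard = describe_os_stack_limit()
print("stack soft limit:", soft)
print("stack hard limit:", hard)
```

If the soft limit is far larger than the value passed to mp_stack_set_limit(), the main thread's C stack is unlikely to be what runs out first.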

bwyld
Posts: 7
Joined: Tue Oct 19, 2021 3:03 pm

Re: [UNIX] Setting stack size/max recursion depth

Post by bwyld » Mon Nov 01, 2021 4:54 pm

Well, it's been a while since I tried debugging un*x code... not really what I was hoping to do with this project..

bwyld
Posts: 7
Joined: Tue Oct 19, 2021 3:03 pm

Re: [UNIX] Setting stack size/max recursion depth

Post by bwyld » Mon Nov 01, 2021 5:20 pm

So, gdb tells me:

Thread 2 "micropython" received signal SIGSEGV, Segmentation fault.
0x00007fb84c790dd1 in __GI___res_nameinquery (name=name@entry=0x7fb84db06380 "mqtt.infrafon.club", type=1, class=1, buf=buf@entry=0x7fb84db07340 "L\353\201\200",
eom=eom@entry=0x7fb84db07b40 "\200\226\260M\270\177") at res_send.c:272

No source for res_send.c of course...

backtrace of the stack looks ok but not informative...
(gdb) bt
#0 0x00007fb84c790dd1 in __GI___res_nameinquery (name=name@entry=0x7fb84db06380 "mqtt.infrafon.club", type=1, class=1,
buf=buf@entry=0x7fb84db07340 "L\353\201\200", eom=eom@entry=0x7fb84db07b40 "\200\226\260M\270\177") at res_send.c:272
#1 0x00007fb84c790f86 in __GI___res_queriesmatch (buf1=buf1@entry=0x7fb84db06b50 "L\353\001", eom1=eom1@entry=0x7fb84db06b74 "$\367\001",
buf2=0x7fb84db07340 "L\353\201\200", eom2=0x7fb84db07b40 "\200\226\260M\270\177") at res_send.c:401
#2 0x00007fb84c791712 in send_dg (statp=0x7fb84db09db8, buf=<optimized out>, buflen=<optimized out>, buf2=0x7fb84db068d0 "2 \200a", buflen2=<optimized out>,
ansp=<optimized out>, anssizp=<optimized out>, terrno=<optimized out>, ns=<optimized out>, v_circuit=<optimized out>, gotsomewhere=<optimized out>,
anscp=<optimized out>, ansp2=<optimized out>, anssizp2=<optimized out>, resplen2=<optimized out>, ansp2_malloced=<optimized out>) at res_send.c:1372
#3 0x00007fb84c791f89 in __res_context_send (ctx=ctx@entry=0x7fb848000de0, buf=buf@entry=0x7fb84db06b50 "L\353\001", buflen=36,
buf2=buf2@entry=0x7fb84db06b74 "$\367\001", buflen2=buflen2@entry=36, ans=<optimized out>, ans@entry=0x7fb84db07340 "L\353\201\200", anssiz=<optimized out>,
ansp=<optimized out>, ansp2=<optimized out>, nansp2=<optimized out>, resplen2=<optimized out>, ansp2_malloced=<optimized out>) at res_send.c:530
#4 0x00007fb84c78ed4a in __GI___res_context_query (ctx=ctx@entry=0x7fb848000de0, name=name@entry=0x7fb84c831180 "mqtt.infrafon.club", class=class@entry=1,
type=type@entry=439963904, answer=answer@entry=0x7fb84db07340 "L\353\201\200", anslen=anslen@entry=2048, answerp=0x7fb84db07b90, answerp2=0x7fb84db07b98,
nanswerp2=0x7fb84db07b80, resplen2=0x7fb84db07b84, answerp2_malloced=0x7fb84db07b88) at res_query.c:216
#5 0x00007fb84c78f9bf in __res_context_querydomain (answerp2_malloced=0x7fb84db07b88, resplen2=0x7fb84db07b84, nanswerp2=0x7fb84db07b80, answerp2=0x7fb84db07b98,
answerp=0x7fb84db07b90, anslen=2048, answer=0x7fb84db07340 "L\353\201\200", type=439963904, class=1, domain=0x0, name=0x7fb84c831180 "mqtt.infrafon.club",
ctx=0x7fb848000de0) at res_query.c:601
#6 __GI___res_context_search (ctx=ctx@entry=0x7fb848000de0, name=name@entry=0x7fb84c831180 "mqtt.infrafon.club", class=class@entry=1, type=type@entry=439963904,
answer=answer@entry=0x7fb84db07340 "L\353\201\200", anslen=anslen@entry=2048, answerp=<optimized out>, answerp2=<optimized out>, nanswerp2=<optimized out>,
resplen2=<optimized out>, answerp2_malloced=<optimized out>) at res_query.c:370
#7 0x00007fb84c7a4d0a in _nss_dns_gethostbyname4_r (name=name@entry=0x7fb84c831180 "mqtt.infrafon.club", pat=pat@entry=0x7fb84db07d28,
buffer=0x7fb84db08030 "\254\021", buflen=1024, errnop=errnop@entry=0x7fb84db09680, herrnop=herrnop@entry=0x7fb84db096e4, ttlp=<optimized out>)
at nss_dns/dns-host.c:372
#8 0x00007fb84d8ad3a6 in gaih_inet (name=<optimized out>, name@entry=0x7fb84c831180 "mqtt.infrafon.club", service=service@entry=0x7fb84db07f40,
req=req@entry=0x7fb84db08490, pai=pai@entry=0x7fb84db07f28, naddrs=naddrs@entry=0x7fb84db07f24, tmpbuf=tmpbuf@entry=0x7fb84db08020)
at ../sysdeps/posix/getaddrinfo.c:765
#9 0x00007fb84d8ae225 in __GI_getaddrinfo (name=<optimized out>, service=<optimized out>, hints=0x7fb84db08490, pai=0x7fb84db08488)
at ../sysdeps/posix/getaddrinfo.c:2256
#10 0x000055cd1558d864 in ?? ()
#11 0x000055cd1557a229 in ?? ()
#12 0x000055cd1556d089 in ?? ()
#13 0x000055cd1557a229 in ?? ()
#14 0x000055cd1556d089 in ?? ()
#15 0x000055cd1557a229 in ?? ()
#16 0x000055cd1556d089 in ?? ()
#17 0x000055cd1557a229 in ?? ()
#18 0x000055cd1556d089 in ?? ()
#19 0x000055cd1557a229 in ?? ()
#20 0x000055cd1556d089 in ?? ()
#21 0x000055cd1556acd0 in ?? ()
#22 0x000055cd15567e65 in ?? ()
#23 0x000055cd1557a29b in ?? ()
#24 0x000055cd1556d089 in ?? ()
#25 0x000055cd1556acd0 in ?? ()
#26 0x000055cd1557988a in ?? ()
#27 0x00007fb84dae9ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#28 0x00007fb84d8c3def in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Not sure whether this helps...

stijn
Posts: 735
Joined: Thu Apr 24, 2014 9:13 am

Re: [UNIX] Setting stack size/max recursion depth

Post by stijn » Fri Nov 05, 2021 5:07 pm

Those are all libc calls; micropython is missing from the call stack (perhaps it was not built with debug info?). What you could do here is put a breakpoint at the point where micropython is about to call getaddrinfo, then look at what micropython is passing into it and what overall memory usage is like. But if it's memory corruption (because the garbage collector is doing something wrong), it might be hard to track down. You're sure this is a single-threaded application, right? Is it all micropython code, or are there custom bits as well? Any chance you could post a minimal case which reproduces this?
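One detail in the gdb output above: it reports 'Thread 2', and the backtrace ends in start_thread and clone, which suggests the crash happened on a created thread rather than on the main thread. If that is the case, the thread's own stack size may matter more than the limit set in main(). A minimal sketch of starting a worker with a larger stack, assuming the build's _thread module provides stack_size(), start_new_thread() and allocate_lock(); run_with_stack and the 256 KiB figure are illustrative only.

```python
import _thread


def run_with_stack(func, stack_bytes):
    """Run func() on a new thread created with the requested stack size."""
    result = {}
    done = _thread.allocate_lock()
    done.acquire()  # held until the worker finishes

    def entry():
        try:
            result["value"] = func()
        finally:
            done.release()

    # stack_size() applies to threads created afterwards and returns the old value.
    previous = _thread.stack_size(stack_bytes)
    try:
        _thread.start_new_thread(entry, ())
    finally:
        _thread.stack_size(previous)  # restore the earlier setting

    done.acquire()  # block until the worker releases the lock
    return result.get("value")


print(run_with_stack(lambda: sum(range(10)), 256 * 1024))
```

Whether a larger thread stack actually avoids the segmentation fault in getaddrinfo() is untested here; it is just one thing to rule out before suspecting memory corruption.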
