How do I debug a stack overflow?

BetterAutomations
Posts: 83
Joined: Mon Mar 20, 2017 10:22 pm

Post by BetterAutomations » Sat Dec 04, 2021 3:12 pm

How should I go about debugging this stack overflow? It seems to be triggered in the urequests module. From mem_info it looks like I might just be low on stack; if so, I'm not quite sure how to free any up. I've followed the advice here. I am using frozen modules, const(), .format() on strings, etc. This code did work at one time, but a lot has changed since I last had to use it, so I'm not sure where I went wrong.

Sometimes instead of a stack overflow it generates an ENOMEM. I have an ESP32 with 8MB SPIRAM, have compiled the code for the larger RAM, and I load the SSL cert into memory at device bootup. I modified urequests to close the socket after it's done, but this is the first call to the socket after boot, so it's not having a problem with previously-opened sockets. There is a Bluetooth connection open while this runs.

[DEBUG] urequests.request() url: 'https://****************/api/device/test_login'
[DEBUG] urequests.request() proto: 'https:'
[DEBUG] urequests.request() proto == "https:" 1
[DEBUG] urequests.request() host: '****************'
[DEBUG] urequests.request() port: 443
[DEBUG] urequests.request() ai: (2, 1, 0, '****************', ('****************', 443))
***ERROR*** A stack overflow in task mp_thread has been detected.
abort() was called at PC 0x400930d0 on core 0

ELF file SHA256: d500251a578fe87f

Backtrace: 0x40092cd3:0x3ffd5cf0 0x400930b9:0x3ffd5d10 0x400930d0:0x3ffd5d30 0x40095515:0x3ffd5d50 0x400947cc:0x3ffd5d70 0x40094782:0xdeadbeef |<-CORRUPTED

Rebooting...

Relevant code:

import gc
from micropython import mem_info

gc.collect()
# Trigger a collection once a quarter of the currently free heap has been allocated.
gc.threshold(gc.mem_free() // 4 + gc.mem_alloc())
print(repr(mem_info(1)))
result = requests.get(
    url=url,
    auth=auth(),
    ssl_params={'ca_certs': "*************", 'cert_reqs': 0xffffff},
    data=data)

Here's mem_info(1) just before executing urequests.request():

stack: 2516 out of 4096
GC: total: 4098240, used: 30016, free: 4068224
 No. of 1-blocks: 323, 2-blocks: 92, max blk sz: 463, max free sz: 253781
GC memory layout; from 3f817740:
00000: MDhhhSMDDMDhhhh=hhhhBShDDhDhh==Dhh==h=================hh=======h
00400: =======h=hT=hh=MDhhSMDMDhhh=================Mh=hShThShTShhShh==S
00800: h===h=======h=======hSThhBBBBBSB=MDh=hMDh===========hBBBBBSSh===
00c00: ====h=====MDhTh====hSDhBh=hTB==BBBBBMDhBBh==hShhh=Sh==BBBBBBAB=h
01000: B=hMDBBBBBBMDhMDhh==hBBBBhh===h=======================BBBDDh=h=h
01400: hB=B=BhhMDhh=====Bh===========h=SBBMDh=MDBBBhMDh=BBh=h=Dh=======
01800: =h==BBBBBBhBhDDBBh===h===h=B==BBBBBhB=Bh==h=====MDh=hh=Dh=hBh===
01c00: ===========h=B=BBBh=MDhBBh=MDh======h=hBh=hBBBB==hBh=h====h=hh=h
02000: =hh=Th=====h=h=ThSShh=Th=h=Th=Th=h=h=Th=h=h=TSTh=h=Thhhh=h=hSh=T
02400: h=SSh=TDhh=h=Th=h=h=Thh=hh=Th=hhh=Th=h=h=Th=====================
02800: ========hh=TT====TBhDBFB=BBBBBBBBBBBBBBAB=BBBBBBBBBBBBh===BhDBBB
02c00: BBMDSB.D.BBh==h===h==================h===========h==============
03000: BDB=MD.h===h===h===T=T=T=T=T=T=T=T=T=h==.B=Bh=B=Bh===.Bh=====h==
03400: =h=..h===SB.BBBBB.hB.B=BBBBBBT=BBh==============h==B.B=BBBBBBBBB
03800: BBBh====.h=====h==h=h=======..hSh==h=======h=h=h====B.h=========
03c00: =======================================h========================
04000: ======================================h=========================
04400: =====================================h==========================
04800: ====================================h===========================
04c00: ===================================h============================
05000: ==================================h=====h==========........hS...
05400: .............................h==================================
05800: ============================..........................h=.....BhB
05c00: hBhBhBh==========h==============================================
06000: ================================================================
06400: ================================================================
06800: ================================================================
06c00: ================================================================
07000: ================================================================
07400: ================================================================
07800: ================================h=========================h=====
07c00: ======h==================hBhBhBh=BhBhBhBhBhB....................
       (4 lines all free)
09000: ................................................S..h=h=.........
       (3965 lines all free)
e8800: ............
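
To make the numbers in that report easier to read, here is a minimal sketch that pulls the two summary lines apart and computes the remaining headroom. The helper name `summarize` is made up for illustration, and the script is meant to run on a desktop Python against text copied from the report, not on the device.

```python
import re

# The two summary lines copied from the mem_info(1) output above.
MEM_INFO = """\
stack: 2516 out of 4096
GC: total: 4098240, used: 30016, free: 4068224
"""


def summarize(mem_info_text):
    """Return (stack_free_bytes, heap_free_fraction) parsed from mem_info text."""
    stack = re.search(r"stack: (\d+) out of (\d+)", mem_info_text)
    heap = re.search(r"GC: total: (\d+), used: (\d+), free: (\d+)", mem_info_text)
    stack_used, stack_total = int(stack.group(1)), int(stack.group(2))
    heap_total, heap_free = int(heap.group(1)), int(heap.group(3))
    return stack_total - stack_used, heap_free / heap_total


stack_free, heap_free_fraction = summarize(MEM_INFO)
print("stack bytes left:", stack_free)
print("heap free: {:.1%}".format(heap_free_fraction))
```

With the figures above, 1580 bytes of stack remain while the GC heap is almost entirely free, which is consistent with the problem being stack space rather than heap.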

BetterAutomations
Posts: 83
Joined: Mon Mar 20, 2017 10:22 pm

Re: How do I debug a stack overflow?

Post by BetterAutomations » Sat Dec 04, 2021 6:46 pm

I think I'll try rebooting, doing the web operation, and then rebooting back to Bluetooth. It adds time and complexity, but it would solve the problem of a diminished stack.

I would still like to know the proper procedure for debugging a stack overflow, though. It may come up again.
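
One common first step with an ESP32 crash like the one above is to turn the Backtrace line into source locations. Each entry in that line is a PC:SP pair, and only the PC half is needed. The sketch below (the function name `backtrace_pcs` is made up for illustration) extracts the PC addresses on a desktop Python so they can be passed to an address-to-line tool.

```python
# The Backtrace line copied from the crash log above.
BACKTRACE = (
    "Backtrace: 0x40092cd3:0x3ffd5cf0 0x400930b9:0x3ffd5d10 "
    "0x400930d0:0x3ffd5d30 0x40095515:0x3ffd5d50 0x400947cc:0x3ffd5d70 "
    "0x40094782:0xdeadbeef |<-CORRUPTED"
)


def backtrace_pcs(line):
    """Return the program-counter half of every PC:SP pair in the line."""
    pcs = []
    for token in line.split():
        if token.startswith("0x") and ":" in token:
            pc, _sp = token.split(":", 1)
            pcs.append(pc)
    return pcs


print(" ".join(backtrace_pcs(BACKTRACE)))
```

The printed addresses can then be given to the toolchain's xtensa-esp32-elf-addr2line (for example with the options -pfiaC -e followed by the ELF file from the same firmware build that is running on the device). For a stack overflow the decoded frames may only show the path that detected the overflow, so this is a starting point and not a complete diagnosis.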

Rahul B
Posts: 3
Joined: Thu Mar 03, 2022 1:08 pm

Re: How do I debug a stack overflow?

Post by Rahul B » Wed Mar 09, 2022 11:04 am

Hello BetterAutomations, have you solved this issue? I am facing a similar one. I am testing this on an ESP-WROOM-32.

BetterAutomations
Posts: 83
Joined: Mon Mar 20, 2017 10:22 pm

Re: How do I debug a stack overflow?

Post by BetterAutomations » Wed Mar 09, 2022 2:35 pm

I worked around my problem but never figured out how to debug a stack overflow. I create a flag file, /start_ble or /start_web, and on boot I look for the file and branch the logic based on which one exists. Only one of Bluetooth or web can be running at a time; the device then reboots back to the other. Remember to clear the flag files and have a default mode.
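
The flag-file approach described above could look roughly like the following. This is a minimal sketch and not the actual code: the function names, the `root` parameter, and the choice of "ble" as the default mode are all assumptions made for illustration.

```python
import os

MODES = ("ble", "web")


def request_mode(mode, root="/"):
    """Create the flag file that selects the mode for the next boot."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    with open(root + "start_" + mode, "w") as f:
        f.write("1")


def consume_mode(root="/", default="ble"):
    """Return the requested mode and clear every flag file that exists."""
    selected = None
    for mode in MODES:
        path = root + "start_" + mode
        try:
            os.stat(path)
        except OSError:
            continue  # this flag file is not present
        os.remove(path)  # clear it so the next boot falls back to the default
        if selected is None:
            selected = mode
    return selected if selected is not None else default
```

At boot, the startup script would call `consume_mode()` and start either Bluetooth or the web code depending on the result. To switch, the running code would call `request_mode("web")` (or `"ble"`) and then restart the board, for example with `machine.reset()`.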

You may not be able to do it on an ESP32-WROOM; I had to upgrade to a WROVER and compile .mpy files to save memory. MicroPython is a memory hog.
