Memory allocation and deallocation

C programming, build, interpreter/VM.
Target audience: MicroPython Developers.

Post by elliotwoods » Mon Jul 13, 2020 12:10 pm

OK, looking at the source code more, it looks like memory allocated in the standard ways (e.g. malloc) shouldn't be affected by the garbage collector (the GC calls malloc once at the start of the run to create its own block of memory, 'the heap').

Investigating...

Post by elliotwoods » Mon Jul 13, 2020 12:49 pm

I'm still having issues with memory that's allocated with malloc.

From the other comments, it sounds like memory allocated using the standard functions (e.g. new and malloc) can still be affected. Is this true?

Post by elliotwoods » Mon Jul 13, 2020 1:32 pm

Hi all,

I see the same issue with

uint8_t heapAreaGlobal[2 * 1024 + alignment];

That is, even when the memory is defined statically at the top of the .cpp file, it is still (somehow) affected by the garbage collector.

Let me explain what happens:

1. Allocate a section of memory to store tensors, objects, models, etc (this is called tensorArena)
2. Execute the model (invoke) 100 times. In the debugger, I see that tensorArena does not change between calls.
3. Garbage collect
4. Execute the model more times. From here, I can see that the memory in the tensorArena starts to change between every call
5. After a few calls (e.g. 5) I get a read access violation from a method being called on an object which is now a nullptr

Note: the calls are being made from a loop in Python, so some allocation happens between calls to create the return objects.

If any of you want to see, here's a link to the code as it is right now:
https://github.com/elliotwoods/micropyt ... tensorflow

* Model.cpp (tensorflow wrapper cpp object)
* extern.c (export all the actions on Model.cpp into C syntax)
* module.c (micropython code)

Thank you
Elliot

P.S. It's really great to be able to use the Windows port and debug with Visual Studio!

Post by pythoncoder » Mon Jul 13, 2020 3:03 pm

Since nobody more knowledgeable has responded, I'll chip in. Be warned, I haven't studied the code and the following is based on things I've read over the last six years.

The GC operates only on the Python heap. This is malloc'd permanently before the bytecode interpreter and GC start. The GC can't operate on other malloc'd areas because these will contain arbitrary data. As you've pointed out, the GC couldn't distinguish pointers from code and data anywhere other than in the Python heap.

The heap contains bytecode, integers, floats and pointers. All are constrained to 30 bits, with the remaining 2 bits used to distinguish between these types. Pointers are inherently constrained by microcontroller RAM sizes/address spaces.
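The tagging idea in the paragraph above can be sketched in a few lines of C. This is only an illustration under assumed names: the tag values, `word_t` and the helper functions are hypothetical, and MicroPython's actual object representations differ in detail and vary with the build configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 2 bits of a machine word are a type tag,
 * the remaining bits are the payload. Not MicroPython's real encoding. */
typedef uintptr_t word_t;

enum { TAG_POINTER = 0, TAG_SMALL_INT = 1, TAG_MASK = 3 };

#define SMALL_INT_MAX ((int32_t)0x1FFFFFFF)   /* 2^29 - 1 */
#define SMALL_INT_MIN (-SMALL_INT_MAX - 1)    /* -2^29    */

/* Does a value fit in the 30-bit payload, or must it become a big int? */
static int fits_small_int(int64_t v) {
    return v >= SMALL_INT_MIN && v <= SMALL_INT_MAX;
}

/* Store a 30-bit signed integer directly in the word, no heap allocation. */
static word_t box_small_int(int32_t v) {
    assert(v >= SMALL_INT_MIN && v <= SMALL_INT_MAX);
    return (((word_t)(uint32_t)v & 0x3FFFFFFFu) << 2) | TAG_SMALL_INT;
}

static int is_small_int(word_t w) {
    return (w & TAG_MASK) == TAG_SMALL_INT;
}

/* A 4-byte-aligned address has both low bits clear, so it reads as a pointer. */
static int is_pointer(word_t w) {
    return (w & TAG_MASK) == TAG_POINTER;
}

/* Recover the integer, sign-extending from 30 bits without relying on
 * implementation-defined shifts of negative values. */
static int32_t unbox_small_int(word_t w) {
    uint32_t payload = (uint32_t)((w >> 2) & 0x3FFFFFFFu);
    return (int32_t)(payload ^ 0x20000000u) - 0x20000000;
}
```

With a scheme like this, any integer outside the 30-bit range no longer fits in a single word and has to be stored as a heap object, which is the "early transition from small to large ints" mentioned in this post.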

The GC can then reliably operate. Somewhere there is documentation on how this is done: I rather think it's not how you might expect. The main consequence for Python coders is the early transition from small to large ints and the loss of FP precision.
Peter Hinch

Post by dhylands » Mon Jul 13, 2020 5:59 pm

Right - if you allocate something in C land (using malloc or whatever) then a pointer to that object either needs to be stored inside a python object or needs to be stored someplace in the root pointers area or the garbage collector will consider the alloced memory to be "unused" and will free it the next time that gc_collect is called.

Post by Roberthh » Mon Jul 13, 2020 8:38 pm

dhylands wrote:
Mon Jul 13, 2020 5:59 pm
Right - if you allocate something in C land (using malloc or whatever) then a pointer to that object either needs to be stored inside a python object or needs to be stored someplace in the root pointers area or the garbage collector will consider the alloced memory to be "unused" and will free it the next time that gc_collect is called.
@dhylands, does that also apply to storage allocated in C-land with malloc() and only used in C-land, especially if it is never used for a Python object? Then how should the garbage collector get hold of it? My impression, and my experience with crashes, is that the trouble starts as soon as you mix these memory management domains.

Post by dhylands » Mon Jul 13, 2020 10:38 pm

Roberthh wrote:
Mon Jul 13, 2020 8:38 pm
Right - if you allocate something in C land (using malloc or whatever) then a pointer to that object either needs to be stored inside a python object or needs to be stored someplace in the root pointers area or the garbage collector will consider the alloced memory to be "unused" and will free it the next time that gc_collect is called.
@dhylands, does that also apply to storage allocated in C-land with malloc() and only used in C-land, especially if it is never used for a Python object? Then how should the garbage collector get hold of it? My impression, and my experience with crashes, is that the trouble starts as soon as you mix these memory management domains.
Hmm. So I took a closer look at this and it looks like malloc will go through to the underlying OS malloc (for those ports which have an underlying OS). For the bare-metal ports, malloc doesn't appear to exist at all.

m_malloc gets mapped to gc_alloc inside the py/malloc.c file.

So if there is an underlying OS which provides malloc, then it will allocate outside of the micropython heap and won't be gc'd.
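A minimal sketch of why memory from the OS allocator is invisible to the collector, assuming the collector only considers addresses that fall on a block boundary inside its own pool. The pool size, `BLOCK_SIZE`, and the function name are hypothetical and chosen for illustration; they are not taken from the MicroPython source.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16u              /* assumed block size, illustration only */

static uint8_t gc_pool[1024];       /* stands in for the Python heap */
static uint8_t os_side[64];         /* stands in for memory from the OS malloc */

/* Returns 1 only if p is a block-aligned address inside the collector's pool.
 * Anything else is ignored by the collector: it is neither marked nor freed. */
static int points_into_gc_pool(const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t start = (uintptr_t)gc_pool;
    uintptr_t end = start + sizeof gc_pool;
    if (addr < start || addr >= end) {
        return 0;                   /* outside the pool: not ours */
    }
    return ((addr - start) % BLOCK_SIZE) == 0;
}
```

Under this assumption, `points_into_gc_pool(&gc_pool[32])` is true while `points_into_gc_pool(os_side)` is false, so a buffer obtained from the OS allocator can never be swept directly. It can still be corrupted indirectly if it holds pointers to pool blocks that the collector frees.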

Post by elliotwoods » Tue Jul 14, 2020 9:47 am

Thanks all
So we can confirm that the GC shouldn't touch memory which is created by the platform's own malloc (e.g. ESP32 heap_caps_malloc).
I am getting an issue with this memory becoming corrupted after GC sweeps, and I presume there's something 'indirect' going on.

My next step is to track down how this is happening. I'm going to spend some time with the VS debugger looking for the exact place and time the memory is changed. I don't have time to check this today, so I will get back on this soon. (Again, thankfully I can recreate the memory issue on the Windows port, so it becomes much easier to debug.)
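One way to narrow down when the memory changes, as a complement to debugger data breakpoints, is to hash the arena before and after each suspect call. The sketch below is a generic helper under assumed names (`arena_guard_t`, `guard_arm`, `guard_changed` are hypothetical, not part of MicroPython or TensorFlow); it uses the 32-bit FNV-1a hash.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash over a byte buffer. */
static uint32_t arena_hash(const uint8_t *buf, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;
    }
    return h;
}

/* Remembers the last hash seen for one memory region. */
typedef struct {
    const uint8_t *buf;
    size_t len;
    uint32_t last;
} arena_guard_t;

/* Start watching a region and record its current contents. */
static void guard_arm(arena_guard_t *g, const uint8_t *buf, size_t len) {
    g->buf = buf;
    g->len = len;
    g->last = arena_hash(buf, len);
}

/* Returns 1 if the hash differs from the previous arm/check, then
 * records the new hash so the next call compares against it. */
static int guard_changed(arena_guard_t *g) {
    uint32_t now = arena_hash(g->buf, g->len);
    int changed = (now != g->last);
    g->last = now;
    return changed;
}
```

Calling `guard_changed` immediately before and after the collection, and again after each invoke, would show whether the arena is modified by the collection itself or by a later allocation that reuses freed memory.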

Thank you all for your responses, it's extremely helpful
Elliot

Post by pythoncoder » Tue Jul 14, 2020 3:32 pm

dhylands wrote:
Mon Jul 13, 2020 10:38 pm
...For the bare-metal ports, malloc doesn't appear to exist at all.

m_malloc gets mapped to gc_alloc inside the py/malloc.c file...
Does that mean that, on bare-metal ports, C allocates memory from the Python heap?

If so, you must have to take steps to ensure that the GC "knows" this, presumably by creating a Python root pointer to the block and removing the pointer when it's deallocated. Or is this done automagically?
Peter Hinch

Post by dhylands » Tue Jul 14, 2020 4:42 pm

pythoncoder wrote:
Tue Jul 14, 2020 3:32 pm
dhylands wrote:
Mon Jul 13, 2020 10:38 pm
...For the bare-metal ports, malloc doesn't appear to exist at all.

m_malloc gets mapped to gc_alloc inside the py/malloc.c file...
Does that mean that, on bare-metal ports, C allocates memory from the Python heap?
The python heap is the only heap available on the bare metal ports. The code would need to call m_malloc (which comes from the python heap). Calling malloc should result in a linker failure.
pythoncoder wrote:
Tue Jul 14, 2020 3:32 pm
If so, you must have to take steps to ensure that the GC "knows" this, presumably by creating a Python root pointer to the block and removing the pointer when it's deallocated. Or is this done automagically?
You would need to take steps to ensure that the pointer to the m_malloc allocated object is referenceable via the root pointers (i.e. either directly in the root pointers or stored inside an object which is reachable via the root pointers).

On the STM32 port, the root pointers are found here:
https://github.com/micropython/micropyt ... #L312-L345

For example, line 329 is the root pointers for the timers.
This line is where the timer object is allocated from the heap: https://github.com/micropython/micropyt ... mer.c#L902 and then on line 911 it stores that pointer in the root pointer structure (line 329 from mpconfigport.h)
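The root-pointer rule described in this post can be illustrated with a toy mark-and-sweep collector. Everything here (`toy_alloc`, `toy_collect`, the `roots` array, the block layout) is a hypothetical stand-in written for this example, not MicroPython's implementation. The point it shows: a block survives only if it is reachable from a root, either directly or through another reachable block.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_BLOCKS 8
#define SLOTS_PER_BLOCK 4
#define N_ROOTS 4

/* Each block can hold a few pointers, possibly to other blocks. */
typedef struct { void *slot[SLOTS_PER_BLOCK]; } block_t;

static block_t heap[N_BLOCKS];
static unsigned char used[N_BLOCKS], marked[N_BLOCKS];
static void *roots[N_ROOTS];        /* the only memory the collector scans */

/* Map an address to a block index, or -1 if it is not a block start. */
static int block_index(const void *p) {
    uintptr_t addr = (uintptr_t)p, start = (uintptr_t)heap;
    if (addr < start || addr >= start + sizeof heap) return -1;
    if ((addr - start) % sizeof(block_t) != 0) return -1;
    return (int)((addr - start) / sizeof(block_t));
}

/* Hand out the first free block, zeroed; NULL when the heap is full. */
static void *toy_alloc(void) {
    for (int i = 0; i < N_BLOCKS; i++) {
        if (!used[i]) {
            used[i] = 1;
            memset(&heap[i], 0, sizeof heap[i]);
            return &heap[i];
        }
    }
    return NULL;
}

/* Mark a block and everything it points to (depth bounded by N_BLOCKS). */
static void mark(const void *p) {
    int i = block_index(p);
    if (i < 0 || !used[i] || marked[i]) return;
    marked[i] = 1;
    for (int s = 0; s < SLOTS_PER_BLOCK; s++) mark(heap[i].slot[s]);
}

/* Mark from the roots, then free every unmarked block. Returns the count freed. */
static int toy_collect(void) {
    int freed = 0;
    memset(marked, 0, sizeof marked);
    for (int r = 0; r < N_ROOTS; r++) mark(roots[r]);
    for (int i = 0; i < N_BLOCKS; i++) {
        if (used[i] && !marked[i]) { used[i] = 0; freed++; }
    }
    return freed;
}

static int toy_is_live(const void *p) {
    int i = block_index(p);
    return i >= 0 && used[i];
}
```

If a pointer returned by `toy_alloc` is kept only in an ordinary C variable (for example a static in a wrapper class) and never stored in `roots` or in a rooted block, `toy_collect` frees the block and a later `toy_alloc` may hand the same memory to someone else. The stale pointer then reads data that changes from call to call.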
