Memory allocation and deallocation

sameid
Posts: 4
Joined: Mon Feb 27, 2017 7:30 am

Memory allocation and deallocation

Postby sameid » Mon Feb 27, 2017 11:43 am

First of all I would like to say that I find your work amazing. Bringing the Python language to this standard is really beautiful - I can finally use its embeddable properties as a replacement for Lua - with better syntax (generators look way better) and better language support (exceptions are just great).

I started by developing a small module in C as an extension and I still haven't figured out how the GC finds "abandoned" memory.

I've read the wiki page regarding the gc and still it's a mystery for me.

Question 1.
Suppose I create my own type with a "new" function.
In that function I allocate some memory using m_new and return it. I understand that the object is now referenced by the Python VM, and that once no references remain the GC will be able to reclaim that memory.
But suppose my object has a pointer field to an array of ints that I want to allocate dynamically on the same Python heap in my "new" function - how does the GC know to connect the outer object seen by the VM with my inner allocation?
At first I thought I should implement a specific __del__ function for my type - but __del__ doesn't even get called!

Question 2.
What happens if an exception is thrown in my C code using nlr magic after I have allocated some memory using m_new but haven't returned it yet?

Question 3.
Connected to question 2 - in objstr.c we have:

STATIC mp_obj_t str_encode(size_t n_args, const mp_obj_t *args) {
    mp_obj_t new_args[2];
    if (n_args == 1) {
        new_args[0] = args[0];
        new_args[1] = MP_OBJ_NEW_QSTR(MP_QSTR_utf_hyphen_8);
        args = new_args;
        n_args++;
    }
    return bytes_make_new(NULL, n_args, 0, args);
}

1. New qstr - heap allocation?
2. New bytes - heap allocation?

What happens if the second one fails? nlr magic? What happens to the memory allocated first?

(Nothing special about objstr.c, I just tried to find two allocations without checks in between.)

Thanks!
Sam

dhylands
Posts: 2264
Joined: Mon Jan 06, 2014 6:08 pm
Location: Shuswap, BC, Canada

Re: Memory allocation and deallocation

Postby dhylands » Mon Feb 27, 2017 4:31 pm

The way that the GC (garbage collector) works is that any object which has a pointer to it is kept.

MicroPython has the notion of root pointers. These are maintained in the mp_state_t struct:
https://github.com/micropython/micropyt ... #L100-L169

Any object which is pointed to through a root pointer, or another object, or a pointer on the stack or from one of the registers will be kept. Other objects will be freed.

So if you allocate something and then throw an exception, the first allocated object will generally wind up being freed (at garbage collection time), unless a pointer to it was ultimately stored in a root pointer or other python object.
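The exception case can be illustrated with a toy model. This is a minimal sketch, not MicroPython code: `toy_alloc`, `toy_collect`, `failing_constructor` and `run_and_collect` are hypothetical names, the heap is a fixed array of blocks, and `setjmp`/`longjmp` stand in for the nlr mechanism. One difference to keep in mind: the toy collector only looks at an explicit root list, whereas the real GC also scans the stack and registers, so a stale copy of a pointer could keep a block alive a little longer.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define NBLOCKS 8                      /* hypothetical toy heap size */

typedef struct { int words[4]; } block_t;

static block_t heap[NBLOCKS];
static unsigned char used[NBLOCKS];    /* 1 = block is allocated */
static jmp_buf handler;                /* stands in for the nlr handler */

/* First-fit allocation from the toy heap; NULL when full. */
static block_t *toy_alloc(void) {
    for (size_t i = 0; i < NBLOCKS; i++) {
        if (!used[i]) {
            used[i] = 1;
            return &heap[i];
        }
    }
    return NULL;
}

/* Keep only blocks named in roots[]; free the rest. Returns blocks freed. */
static size_t toy_collect(block_t *const *roots, size_t nroots) {
    unsigned char marked[NBLOCKS] = {0};
    size_t freed = 0;
    for (size_t r = 0; r < nroots; r++) {
        if (roots[r] != NULL) {
            marked[roots[r] - heap] = 1;   /* roots must point into heap[] */
        }
    }
    for (size_t i = 0; i < NBLOCKS; i++) {
        if (used[i] && !marked[i]) {
            used[i] = 0;
            freed++;
        }
    }
    return freed;
}

/* Allocates, then "raises" before the pointer is stored anywhere. */
static void failing_constructor(void) {
    block_t *tmp = toy_alloc();
    assert(tmp != NULL);
    tmp->words[0] = 42;
    longjmp(handler, 1);               /* tmp is abandoned here */
}

/* Runs the constructor under a handler, then collects with no roots. */
static size_t run_and_collect(void) {
    if (setjmp(handler) == 0) {
        failing_constructor();
    }
    return toy_collect(NULL, 0);
}
```

Nothing in the error path frees the block explicitly. It is reclaimed only because no root refers to it when the collection runs.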

QSTRs are otherwise known as "interned" strings. These are referenced by index and not by pointer, which allows some performance improvements. Once a QSTR is allocated it never gets freed. There is a pool of them. QSTRs are also only allocated once. If a QSTR already exists, then the index of it will be returned rather than allocating a new one.
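The "allocated once, referenced by index" behaviour can be sketched with a tiny intern pool. This is only an illustration under simplifying assumptions: `toy_intern`, `pool` and `POOL_MAX` are hypothetical names, the pool has a fixed capacity, and lookup is a linear scan rather than whatever the real qstr pool does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_MAX 32                  /* hypothetical fixed capacity */

static const char *pool[POOL_MAX];   /* interned strings, never freed */
static size_t pool_len = 0;

/* Return the index of s in the pool, adding it only if it is missing.
 * The caller must keep s alive; this toy pool stores the pointer as-is. */
static size_t toy_intern(const char *s) {
    for (size_t i = 0; i < pool_len; i++) {
        if (strcmp(pool[i], s) == 0) {
            return i;                /* already interned: nothing new stored */
        }
    }
    assert(pool_len < POOL_MAX);     /* the sketch does not grow the pool */
    pool[pool_len] = s;
    return pool_len++;
}
```

Interning "utf-8" twice yields the same index both times, and the pool grows by one entry only. For a name that is already known at build time, such as MP_QSTR_utf_hyphen_8 in the str_encode example, MP_OBJ_NEW_QSTR just packs the existing index into an object word, so no heap allocation should be involved there.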

sameid
Posts: 4
Joined: Mon Feb 27, 2017 7:30 am

Re: Memory allocation and deallocation

Postby sameid » Tue Feb 28, 2017 6:46 am

Thanks for the explanation, but I still haven't gotten my answer.

Suppose I return my own object to the VM from a C function, of a type I created, allocated with m_new as 10 bytes. The first 4 bytes of the object (32-bit machine) are an internal pointer in the struct that points to another piece of memory, let's say 4 bytes that were also allocated by m_new, to hold a dynamic int.

When my own object gets GC'd, how can the GC possibly know about my 4 internally allocated bytes? Shouldn't they leak? Isn't some kind of destructor needed for my object?

Thanks,
Sam

dhylands
Posts: 2264
Joined: Mon Jan 06, 2014 6:08 pm
Location: Shuswap, BC, Canada

Re: Memory allocation and deallocation

Postby dhylands » Tue Feb 28, 2017 7:21 am

The GC will examine all of the objects and follow any pointers contained within them.

For example, here's the C structure for the UART: https://github.com/micropython/micropython/blob/1f549a3496ac543390a170a7eb7b242475063016/stmhal/uart.c#L78-L92 which contains a pointer to the read buffer (line 91).

A consequence of this is that the GC will scan the UART object, see the pointer, and thus not free the read buffer. A side effect of the way the GC works is that if an object contains a bit pattern which merely looks like a heap pointer, the GC might accidentally keep a block which should otherwise be freed.

When the UART object gets GC'd, then there will no longer be a pointer to the read_buf (assuming that there isn't one someplace else), so it will in turn be GC'd.
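The outer-object/inner-buffer case can be modelled with a toy conservative mark-and-sweep. This is a sketch under stated assumptions, not the MicroPython collector: `toy_alloc`, `block_index`, `mark_from` and `toy_collect` are hypothetical names, every allocation is exactly one fixed-size block, and roots are passed in explicitly instead of being found in root pointers, the stack and registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBLOCKS 8
#define WORDS_PER_BLOCK 4

typedef struct { uintptr_t w[WORDS_PER_BLOCK]; } block_t;

static block_t heap[NBLOCKS];
static unsigned char used[NBLOCKS], marked[NBLOCKS];

/* First-fit allocation of one zeroed block; NULL when the heap is full. */
static block_t *toy_alloc(void) {
    for (size_t i = 0; i < NBLOCKS; i++) {
        if (!used[i]) {
            used[i] = 1;
            for (size_t k = 0; k < WORDS_PER_BLOCK; k++) heap[i].w[k] = 0;
            return &heap[i];
        }
    }
    return NULL;
}

/* Map a raw word to a block index, or -1 if it is not a block address. */
static int block_index(uintptr_t word) {
    uintptr_t base = (uintptr_t)heap;
    uintptr_t end = (uintptr_t)(heap + NBLOCKS);
    if (word < base || word >= end) return -1;
    if ((word - base) % sizeof(block_t) != 0) return -1;
    return (int)((word - base) / sizeof(block_t));
}

/* Mark the block a word refers to, then scan every word inside it. */
static void mark_from(uintptr_t word) {
    int i = block_index(word);
    if (i < 0 || !used[i] || marked[i]) return;
    marked[i] = 1;
    for (size_t k = 0; k < WORDS_PER_BLOCK; k++) {
        mark_from(heap[i].w[k]);     /* follows outer -> inner pointers */
    }
}

/* Mark everything reachable from roots[], free the rest. Returns blocks freed. */
static size_t toy_collect(const uintptr_t *roots, size_t nroots) {
    size_t freed = 0;
    for (size_t i = 0; i < NBLOCKS; i++) marked[i] = 0;
    for (size_t r = 0; r < nroots; r++) mark_from(roots[r]);
    for (size_t i = 0; i < NBLOCKS; i++) {
        if (used[i] && !marked[i]) {
            used[i] = 0;
            freed++;
        }
    }
    return freed;
}
```

If an outer block stores the address of an inner block in one of its words and the outer block is a root, a collection frees nothing. Once the outer block is no longer a root, the same collection frees both blocks, without any destructor being involved.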

sameid
Posts: 4
Joined: Mon Feb 27, 2017 7:30 am

Re: Memory allocation and deallocation

Postby sameid » Tue Feb 28, 2017 7:50 am

Interesting, so let's see if I got it right:

1. If the heap contains a pointer to the heap - the GC will treat it as a referenced memory. (It must be aligned?)

2. The GC has to scan all bytes in the heap to clean internally referenced memory

3. If, god forbid, my internal pointer is not aligned - the memory it points to will eventually be GC'd? Making it point to garbage.

4. The pattern is something like pointer size in heap range with a certain suffix? (Depending on block size?)

5. If I have a large array of ints on the heap which never gets freed and by 'accident' these ints are following the pattern of the heap - memory will not be GC'd?

6. Finally, if objects have internal state - the correct way to use them is with a ".close()" function like in sockets

Thanks,
Sam

pythoncoder
Posts: 1393
Joined: Fri Jul 18, 2014 8:01 am

Re: Memory allocation and deallocation

Postby pythoncoder » Tue Feb 28, 2017 12:22 pm

Small integers are constrained such that b31==b30. This enables uPython to reliably distinguish between a small integer and a pointer which is guaranteed not to follow this pattern. This is possible because pointers are word aligned, so have two 'spare' bits. It's binary blobs like bytes instances which can (rarely) be mistaken for pointers and needlessly retained.
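A sketch of how this kind of tagging can look on a 32-bit word follows. It reflects my understanding of MicroPython's default object representation and should be checked against py/obj.h: the low bit of the word marks a small integer, and the b31==b30 condition is the test that a value still fits after being shifted left by one to make room for that tag. The helper names (`small_int_fits`, `new_small_int`, `is_small_int`, `small_int_value`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t word_t;             /* one 32-bit object word */

/* A value fits if bit 31 equals bit 30, so shifting left loses nothing. */
static bool small_int_fits(int32_t n) {
    uint32_t u = (uint32_t)n;
    return ((u ^ (u << 1)) & 0x80000000u) == 0;
}

/* Encode: shift left by one and set the low tag bit. */
static word_t new_small_int(int32_t n) {
    assert(small_int_fits(n));
    return ((uint32_t)n << 1) | 1u;
}

/* Word-aligned pointers have low bits 00, so bit 0 set means small int. */
static bool is_small_int(word_t o) {
    return (o & 1u) != 0;
}

/* Decode: shift right by one, restoring the sign bit by hand. */
static int32_t small_int_value(word_t o) {
    uint32_t u = o >> 1;
    if (o & 0x80000000u) {
        u |= 0x80000000u;
    }
    return (int32_t)u;
}
```

A word-aligned address such as 0x20001000 has its low bit clear, so `is_small_int` rejects it, while every encoded small integer has the low bit set. Raw data inside a bytes object carries no such tag, which is why it can still be mistaken for a pointer.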

Pointers created by uPython will always be word aligned. I guess you could create a non-aligned pointer in C or inline assembler but making this visible to uPython is probably a bad idea.
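The alignment point can be made concrete with a small predicate of the kind a conservative collector applies to each word it scans. This is a hedged sketch: `looks_like_heap_ptr`, `toy_heap`, `BYTES_PER_BLOCK` and `HEAP_BYTES` are illustrative names and values, and alignment is measured relative to the start of the toy heap rather than as an absolute address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BYTES_PER_BLOCK 16           /* assumed allocation granularity */
#define HEAP_BYTES 256               /* assumed toy heap size */

static unsigned char toy_heap[HEAP_BYTES];

/* True if word lies inside the heap and sits on a block boundary. */
static bool looks_like_heap_ptr(uintptr_t word) {
    uintptr_t base = (uintptr_t)toy_heap;
    assert(HEAP_BYTES % BYTES_PER_BLOCK == 0);
    if (word < base || word >= base + HEAP_BYTES) {
        return false;                /* outside the heap: ignored */
    }
    return (word - base) % BYTES_PER_BLOCK == 0;
}
```

In this model a pointer into the middle of a block is ignored, so the block it refers to would not be kept alive by it. Conversely, an ordinary integer whose value happens to equal a block address passes the test and keeps that block alive.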

If an object has internal state its RAM will be reclaimed when the last pointer to the object goes out of scope. There is usually no need to code explicit closure.
Peter Hinch

sameid
Posts: 4
Joined: Mon Feb 27, 2017 7:30 am

Re: Memory allocation and deallocation

Postby sameid » Tue Feb 28, 2017 3:43 pm

I see,

It's a shame that there is no clear, thorough documentation on this matter - it feels like one of the basics for anyone who wants to contribute.

In any case, big thanks for the explanations!

Sam

