laukejas wrote: ↑Fri Apr 03, 2020 6:42 pm
list4 = array.array('H', (0 for _ in range(1000))) - takes 2092 bytes according to Python 3.7, 2336 bytes according to MicroPython. This seems consistent with the fact that Short datatype takes 2 bytes per member. Overhead is quite large, though, especially in MicroPython.
Let's take this particular example and find out why your conclusion is wrong. For this I built MicroPython with verbose debug printing for gc.c so we can see what gets allocated, and stepped through the code with a debugger. So that means the linux or windows ports.
To make things a bit less complicated I'm declaring the initial value first as a list, for 2 reasons: the statement you show also has to allocate the generator, and the length of a generator can only be known by iterating it, meaning the array cannot be allocated in one go simply because the constructor doesn't know yet how big it is. But we want to know exactly what memory the array takes up, so passing it a list with a known length is going to make reading the output easier, versus having to look at a bunch of reallocations.
Code: Select all
>>> initializer = [0 for _ in range(1000)]
>>> gc.collect() # To make sure no dangling things left
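To illustrate the generator point: a generator has no length, so the array constructor has to grow its buffer while iterating, whereas a list with a known length lets it allocate the data in one go. A quick sketch (plain CPython is enough for this part):

Code: Select all

```python
import array

gen = (0 for _ in range(1000))
try:
    len(gen)                    # generators don't support len()
except TypeError:
    print("length unknown up front -> buffer must grow while iterating")

lst = [0 for _ in range(1000)]
arr = array.array('H', lst)     # len(lst) is known, so the data fits one allocation
print(len(arr))                 # 1000
```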
Note I'm still doing this on the REPL to show why that has problems; now let's start with the actual business, including the debug output from gc.c:
Code: Select all
>>> start = gc.mem_free()
gc_alloc(22 bytes -> 1 blocks)
gc_alloc(0000025EA457D260)
gc_alloc(32 bytes -> 1 blocks)
gc_alloc(0000025EA457D2C0)
gc_alloc(160 bytes -> 5 blocks)
gc_alloc(0000025EA457E460)
gc_alloc(20 bytes -> 1 blocks)
gc_alloc(0000025EA457D300)
gc_alloc(32 bytes -> 1 blocks)
gc_alloc(0000025EA457D340)
gc_alloc(1024 bytes -> 32 blocks)
gc_alloc(0000025EA4580720)
gc_alloc(256 bytes -> 8 blocks)
gc_alloc(0000025EA457E540)
gc_alloc(144 bytes -> 5 blocks)
gc_alloc(0000025EA457E640)
gc_free(0000000000000000)
gc_free(0000025EA4580720)
gc_free(0000025EA457E540)
gc_free(0000025EA457D2C0)
gc_free(0000025EA457D340)
gc_free(0000025EA457D300)
gc_free(0000025EA457E460)
gc_alloc(72 bytes -> 3 blocks)
gc_alloc(0000025EA457D460)
gc_alloc(24 bytes -> 1 blocks)
gc_alloc(0000025EA457D2C0)
gc_alloc(64 bytes -> 2 blocks)
gc_alloc(0000025EA457D340)
gc_alloc(144 bytes -> 5 blocks)
gc_alloc(0000025EA457E460)
gc_alloc(0 bytes -> 0 blocks)
gc_alloc(21 bytes -> 1 blocks)
gc_alloc(0000025EA457D300)
gc_alloc(0 bytes -> 0 blocks)
gc_free(0000000000000000)
gc_free(0000025EA457E460)
gc_free(0000025EA457E640)
gc_free(0000025EA457D340)
gc_free(0000025EA457D460)
gc_alloc(32 bytes -> 1 blocks)
gc_alloc(0000025EA457D340)
>>>
Wait, what?? Yes, that's something like 2000 bytes which have been allocated under the hood, just to run that statement. Now most of that has been deallocated, but not all of it: having to store that variable 'start' also requires memory. Since I'm running this on a 64-bit build that should be 2 times 8 bytes (one 'pointer' for the name, one for the integer value), but as you can see there's no allocation of just 16 bytes. The reason is that when MicroPython stores a new name in the globals() dictionary and needs to allocate more space, it allocates some extra so it doesn't constantly have to reallocate.
This shows 2 possible problems with using mem_free() as a substitute for getsizeof():
- it takes the overhead of storing the name into account, which might or might not matter, but it shows why your question 'what is the size of the object' is by itself flawed with respect to what you're trying to do. Sidenote: multiple names can point to the same object. I.e. if your array object is 64 bytes, but you then do x = array, y = array, z = array, you are effectively using 48 bytes extra. That's almost as much as the size of the object itself. Not gonna matter for large arrays, but still worth consideration.
- if you look at that list of allocations, and know there is fragmentation, and know that garbage collection is not deterministic, meaning gc.collect() does not always succeed in collecting everything right away, it's extra clear that what mem_free() returns next time might not always reflect the exact size of the object
Next (now not showing the allocations for compilation etc, explanation shown inline):
Code: Select all
>>> arr = array.array('H',initializer)
# This is the allocation for the internal MicroPython array object, i.e. a struct
# with things like typecode, length, pointer to data
gc_alloc(32 bytes -> 1 blocks)
gc_alloc(0000025EA457D4C0)
# Now allocating the actual data
gc_alloc(2000 bytes -> 63 blocks)
gc_alloc(0000025EA4580720)
# The data needs to be initialized, for which an iterator needs to be allocated;
# on the next call to gc.collect() this should be gone though.
gc_alloc(32 bytes -> 1 blocks)
gc_alloc(0000025EA457D500)
# Storing the name 'arr' in globals() extra preallocation
gc_alloc(128 bytes -> 4 blocks)
gc_alloc(0000025EA457DB20)
So in the strict sense, the actual amount of bytes taken up by just the array object here is 2032, or 2048 if you count the storage for the name. Better than CPython, and much better than what you thought it was.
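For completeness, those numbers can be checked with a bit of arithmetic; the 32-byte object header is just what was observed on this 64-bit build, so treat it as build-specific:

Code: Select all

```python
object_struct = 32        # array.array object allocation seen above (64-bit build)
data = 2 * 1000           # 1000 'H' items, 2 bytes each
name_storage = 2 * 8      # name pointer + value pointer, two 64-bit words

print(object_struct + data)                 # 2032
print(object_struct + data + name_storage)  # 2048
```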
So what does mem_free() say?
Code: Select all
>>> gc.mem_free() - start
-2368
# Without collect() that's rather bogus though, try again.
>>> gc.collect()
>>> gc.mem_free() - start
-2080
It's getting close but not there yet. Not running on the REPL might shave off some more. I hope this gives some insight as to why mem_free(), and the way you're using it, has problems.
As far as solutions go: if you go over the code you could cook your own sys.getsizeof() implementation for the types of which you have looked up what they actually need. But as said, that only answers the question 'how do I get the size of an object', which is not too interesting as it is not going to tell you how much RAM will get used by an application, or how much you have left to allocate, in an actual program of more than a couple of lines.
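Such a homemade sizeof could look something like this; the header size is just the one observed on this particular 64-bit build, so you'd have to look up the real numbers in the MicroPython source for your port:

Code: Select all

```python
import array

# Object header sizes as observed on this 64-bit build; look these up
# in the MicroPython source for your actual port before trusting them.
HEADER = {
    array.array: 32,
}

def my_sizeof(obj):
    """Rough size estimate for a few known types (sketch, not exhaustive)."""
    if isinstance(obj, array.array):
        return HEADER[array.array] + obj.itemsize * len(obj)
    raise TypeError("size not known for %s" % type(obj).__name__)

arr = array.array('H', [0] * 1000)
print(my_sizeof(arr))   # 2032
```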
If I had to figure that out I'd probably make a prototype application which does roughly what the specs say it has to do, then run it while repeatedly calling collect()/mem_free() at points of interest. Then crank up the allocation size and try again. Then change the number of allocated objects and repeat. Get all the data, use interpolation to get to the amounts of memory and allocations the actual application would have, add some error margin, and that should give a rough estimate. And unless I'm mistaken, that's more accurate than just adding the results of a bunch of getsizeof() calls together, since it takes other allocations into account as well. It would however be pretty interesting to compare it with the sum of getsizeof() calls over all objects, to see if the relationship is linear, for instance.
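A minimal sketch of that measuring step might look like the function below. The mem_free parameter is only there so the sketch can be exercised outside MicroPython; on a board you'd just let it default to gc.mem_free:

Code: Select all

```python
import gc

def measure_alloc(make_obj, mem_free=None):
    """Return (obj, bytes consumed) around a single allocation.

    On MicroPython pass nothing and gc.mem_free is used; the parameter
    mainly exists so this sketch can be tested elsewhere.
    """
    if mem_free is None:
        mem_free = gc.mem_free  # MicroPython only
    gc.collect()                # start from a clean heap
    before = mem_free()
    obj = make_obj()
    gc.collect()                # collect temporaries (iterators etc.) first
    after = mem_free()
    return obj, before - after
```

You'd then call this for a range of sizes and object counts at the points of interest and interpolate, as described above.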