Max number of files to iterate with os.listdir()
So I'm making a data logger using my Pyboard D,
I need to store data on the SD card in folders.
From experimenting I've found that the max number of files per folder is 32768 (2^15).
However, I run into trouble when trying to index files using os.listdir(); I'm guessing this is due to limited memory on the Pyboard.
The issue is that the number of files I can index changes; there doesn't seem to be a constant limit.
Does anyone know the maximum files os.listdir() can index before throwing an error?
Re: Max number of files to iterate with os.listdir()
It would help if you explained your goal. Do you really need os.listdir()? As far as I know it returns a list, so you need enough memory to store all the names. But if you don't need the list as a whole, which is not unlikely, you could use ilistdir() instead: that gives you an iterator, so you can access entries one by one and it hardly uses any memory.
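A minimal sketch of the idea (the fallback to os.scandir() is only there so the snippet can be tried on desktop CPython; on the Pyboard the os.ilistdir() branch runs, and the directory path is just for illustration):

```python
import os

def iter_names(path):
    # MicroPython's os.ilistdir() yields one (name, type, inode) tuple
    # at a time; CPython's os.scandir() plays the same role, so fall
    # back to it when ilistdir() is unavailable.
    if hasattr(os, "ilistdir"):
        for entry in os.ilistdir(path):
            yield entry[0]
    else:
        for entry in os.scandir(path):
            yield entry.name

def count_entries(path):
    # Only one directory entry is held in RAM at any moment, unlike
    # os.listdir(), which builds the whole list before returning.
    return sum(1 for _ in iter_names(path))
```
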
Does anyone know the maximum files os.listdir() can index before throwing an error?
That is impossible to answer: it's limited by memory, which depends on your hardware and software, so unless you have a program which does exactly the same thing on each run, the results will vary.
- pythoncoder
- Posts: 5956
- Joined: Fri Jul 18, 2014 8:01 am
- Location: UK
- Contact:
Re: Max number of files to iterate with os.listdir()
Do you really need a huge number of files? This is asking the filesystem to perform data lookup which gets unwieldy when the number of files becomes excessive.
There are usually better options which boil down to letting Python do the lookup. Use MicroPython's btree database. Or store an object such as a JSON encoded Python dict in a file. Or append lines of text to a single file.
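The "append lines of text to a single file" option is the simplest of the three. A minimal sketch (the file name and record format are made up for illustration):

```python
# One record per line, appended to a single log file: no directory
# bloat, and opening in "a" mode means each write lands at the end.
def append_record(path, values):
    with open(path, "a") as f:
        f.write(",".join(str(v) for v in values) + "\n")

# e.g. a timestamp, a temperature and a pressure reading
append_record("log.csv", [1592731200, 23.5, 1013])
```
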
Peter Hinch
Index to my micropython libraries.
Re: Max number of files to iterate with os.listdir()
The FAT filesystem performs terribly with more than a few thousand entries per directory.
Try a multi-level hierarchy with fewer than 1000 entries per directory.
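One way to do that is to derive the subdirectory from a running record index, so no directory ever exceeds the per-directory cap. The naming scheme below is purely hypothetical:

```python
def log_path(index, per_dir=1000):
    # Spread records across numbered subdirectories:
    # log/000/rec_0000.csv ... log/000/rec_0999.csv, then log/001/...
    sub = "%03d" % (index // per_dir)
    return "log/%s/rec_%04d.csv" % (sub, index % per_dir)
```
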
Re: Max number of files to iterate with os.listdir()
pythoncoder wrote: ↑Sun Jun 21, 2020 8:53 am
Do you really need a huge number of files? This is asking the filesystem to perform data lookup which gets unwieldy when the number of files becomes excessive.
There are usually better options which boil down to letting Python do the lookup. Use MicroPython's btree database. Or store an object such as a JSON encoded Python dict in a file. Or append lines of text to a single file.
Well, I'm transmitting data to my server using the Pyboard and a SIM module, so I really just need a way to back up my data on the SD card in the event it can't send the data.
I'll look into the btree module. Is there any limit to the number of entries that can be stored in it?
- pythoncoder
Re: Max number of files to iterate with os.listdir()
I don't think there's any limit other than available space for the file. The btree database behaves like a persistent dictionary. From your description items can be removed once they have been sent, so excessive growth sounds unlikely.
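As a sketch of what that looks like (untested here, and only runnable on a firmware with btree compiled in; the file name, key and value formats are made up for illustration):

```python
import btree  # MicroPython built-in, if present in the firmware

# Open (or create) the database file on the SD card.
try:
    f = open("/sd/backlog.db", "r+b")
except OSError:
    f = open("/sd/backlog.db", "w+b")
db = btree.open(f)

db[b"2020-06-22T15:00"] = b"23.5,1013"  # keys and values are bytes
db.flush()                               # commit to the SD card

# Walk outstanding records in key order; delete each once sent.
for key in list(db.keys()):
    # send(db[key])  # hypothetical transmit step
    del db[key]
db.flush()
db.close()
f.close()
```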
Peter Hinch
Index to my micropython libraries.
Re: Max number of files to iterate with os.listdir()
pythoncoder wrote: ↑Mon Jun 22, 2020 3:56 pm
I don't think there's any limit other than available space for the file. The btree database behaves like a persistent dictionary. From your description items can be removed once they have been sent, so excessive growth sounds unlikely.
Hi, I'm using the latest daily build for my Pyboard D and I'm unable to import the btree module:
Code: Select all
>>> import btree
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: no module named 'btree'
- pythoncoder
Grrr.
Sorry - I find all these configuration options extremely confusing. The docs indicate that btree exists, but probing around in the source tree suggests that it needs to be explicitly enabled in the build in mpconfigboard.mk:
Code: Select all
# btree module using Berkeley DB 1.xx
MICROPY_PY_BTREE = 1
For reasons I can't even guess at, the module is compiled in the Unix build but not (AFAICS) in network-capable hardware builds.
Another option is to maintain a Python dict, and make it persist using ujson.
Peter Hinch
Index to my micropython libraries.
Re: Max number of files to iterate with os.listdir()
Ah that's a shame.
Regarding your idea of a dict maintained by ujson, wouldn't I need to load this dict every time I want to store more data?
Wouldn't this hit the same bottleneck with the 2MB of on-board memory?
Do you have a code snippet as an example?
- pythoncoder
Re: Max number of files to iterate with os.listdir()
Here is the kind of thing I have in mind. In this instance the queue of outstanding items is stored as a list.
Note that I haven't actually tested this.
Code: Select all
import ujson
import time

def log_data(filename):
    try:
        with open(filename, 'r') as f:  # Resume after power failure
            q = ujson.load(f)
        q_changed = False
    except OSError:  # First time run: no file yet created
        q = []
        q_changed = True  # Force a write
    while True:
        d = get_data()  # return some kind of object
        if link_is_open():  # Check communication status
            for item in q:
                send(item)  # Transmit the object
            q = []  # queue is now empty
            q_changed = True
            send(d)
        else:
            q.append(d)
            q_changed = True
        if q_changed:
            q_changed = False
            with open(filename, 'w') as f:
                ujson.dump(q, f)  # in case of power failure
        time.sleep(10)  # However long you want to wait between samples
Peter Hinch
Index to my micropython libraries.