[SOLVED] Using btree for key-value-database (OSError: 0)


Post by crizeo » Sun Aug 06, 2017 9:57 pm

I want to map ~500000 strings to integers. As I cannot keep this amount of data in the memory of the ESP8266, I have to use some kind of database that can be accessed whenever I need a value (not that often).

I tried btree, but it does not seem to work. I keep getting an "OSError: 0" after flushing 4 to 5 kB to the btree. The size of the database does not seem to be the problem. Garbage collection (GC) and heap usage are OK as well. I have been trying to fix this for days, but I can't get rid of the error. Does anybody have an idea? I don't even have to add the values on the ESP; I could do it on Windows/Unix instead, but I did not find a way to create a database based on Berkeley DB 1.8.5 (the bsddb185 module is not working for me). :roll:

Or is there a simple alternative to btree? I could even write the values to a standard file (open(..., "w")), but it would take minutes/hours to find a value...
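
One possible alternative to a linear scan of a standard file, sketched here under assumptions that are not part of the original question: if the pairs are written on the PC as sorted, fixed-width records, a value can be found by binary search with file seeks, which needs roughly 19 record reads for 500000 entries. The record layout (a NUL-padded key of at most 16 bytes followed by a 32-bit unsigned integer) and the names build and lookup are illustrative only.

```python
import struct

KEY_LEN = 16                   # assumed maximum key length in bytes
FMT = "<%dsI" % KEY_LEN        # NUL-padded key followed by a 32-bit unsigned value
REC_SIZE = struct.calcsize(FMT)


def build(path, mapping):
    """Write the pairs sorted by key as fixed-width records (run on the PC)."""
    with open(path, "wb") as f:
        for key in sorted(mapping):
            # Keys that are too long or contain NUL would break the padding scheme.
            if len(key) > KEY_LEN or b"\0" in key:
                raise ValueError("unsupported key: %r" % key)
            f.write(struct.pack(FMT, key, mapping[key]))


def lookup(path, key):
    """Binary-search the record file; raise KeyError if the key is missing."""
    target = key + b"\0" * (KEY_LEN - len(key))
    with open(path, "rb") as f:
        f.seek(0, 2)                      # seek to the end to get the file size
        lo, hi = 0, f.tell() // REC_SIZE
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * REC_SIZE)
            stored, value = struct.unpack(FMT, f.read(REC_SIZE))
            if stored == target:
                return value
            if stored < target:
                lo = mid + 1
            else:
                hi = mid
    raise KeyError(key)
```

Only one 20-byte record is held in RAM at a time, so heap usage should stay small. The trade-off is that adding entries later means rewriting the sorted file.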


Here is the sample code (sample database with random values just growing until the OSError occurs):

Code: Select all

import btree
import uos, urandom  # for random bytestring + db size
import gc  # for showing free heap


DATABASE_FILENAME = "btree.db"


def randbytes():
    length = 1 + urandom.getrandbits(4)  # 1 to 16
    return uos.urandom(length)


dbfile = open(DATABASE_FILENAME, 'w+b')
db = btree.open(dbfile)

counter_entries = 0  # number of entries (pairs of k+v) in db
counter_bytes = 0  # number of bytes (keys + values) in database
for i in range(500):
    key, value = randbytes(), randbytes()
    db[key] = value  # here is where the error occurs
    counter_entries += 1
    counter_bytes += len(key) + len(value)
    print("total: %d entries (%d B); db size: %d B; free heap: %d B" %
          (counter_entries, counter_bytes, uos.stat(DATABASE_FILENAME)[6], gc.mem_free()))
    db.flush()


db.close()
dbfile.close()

And here's what I get when I run the file with the above code (via a serial connection using PuTTY and the command "import btree_test"); the error always occurs at ~4.4 kB:

Code: Select all

...
total: 249 entries (4168 B); db size: 16384 B; free heap: 512 B
total: 250 entries (4186 B); db size: 16384 B; free heap: 176 B
total: 251 entries (4205 B); db size: 16384 B; free heap: 8608 B
total: 252 entries (4222 B); db size: 16384 B; free heap: 8256 B
total: 253 entries (4242 B); db size: 16384 B; free heap: 7904 B
total: 254 entries (4259 B); db size: 16384 B; free heap: 7536 B
total: 255 entries (4271 B); db size: 16384 B; free heap: 7184 B
total: 256 entries (4280 B); db size: 16384 B; free heap: 6832 B
total: 257 entries (4304 B); db size: 16384 B; free heap: 6480 B
total: 258 entries (4312 B); db size: 16384 B; free heap: 6128 B
total: 259 entries (4329 B); db size: 16384 B; free heap: 5776 B
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "btree_test.py", line 21, in <module>
OSError: 0

Edit: Solution was posted by Roberthh (see page 2):

Code: Select all

db = btree.open(dbfile, pagesize=1024)  # or 512
Last edited by crizeo on Fri Aug 11, 2017 8:39 am, edited 2 times in total.
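
A minimal sketch of how the fix might be applied to a fill loop like the one above. It assumes a MicroPython build that includes the btree module; the import guard, the function name fill_db and the key format are illustrative and not from the original post. On an interpreter without btree the function does nothing and returns 0.

```python
try:
    import btree  # MicroPython builds with btree support
except ImportError:
    btree = None  # e.g. CPython: the sketch below becomes a no-op


def fill_db(filename, count, pagesize=1024):
    """Store `count` small entries using a reduced page size.

    Returns the number of entries written (0 if btree is unavailable).
    """
    if btree is None:
        return 0
    dbfile = open(filename, "w+b")
    db = btree.open(dbfile, pagesize=pagesize)  # 1024 or 512, as in the fix above
    try:
        for i in range(count):
            db[("key%d" % i).encode()] = str(i).encode()
        db.flush()
    finally:
        # Close the database before the underlying file.
        db.close()
        dbfile.close()
    return count
```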

Post by pythoncoder » Mon Aug 07, 2017 8:28 am

I don't know why this is failing: perhaps it should be raised as an issue.

As a workaround, why not use the Unix build to create the database file, then copy it to the ESP8266 filesystem?

Post by Roberthh » Mon Aug 07, 2017 10:33 am

With 500,000 key-value pairs you will need to add an SD card to your ESP8266. The flash file system has a size of only about 2 MB.

Post by crizeo » Mon Aug 07, 2017 10:57 am

@pythoncoder: I had the same idea, but do you know how to create a btree database (based on Berkeley DB 1.8.5, which is very old and which I can't get installed) on Windows or Linux/Unix? Do you maybe know of a tutorial on how to do that?

@Roberthh: I know, I will use external storage (SD card), but the problem is that even without the SD card, my test (!) program fails.

Post by Roberthh » Mon Aug 07, 2017 3:41 pm

If you add the line:

Code: Select all

MICROPY_PY_BTREE ?= 1
to the Makefile in the unix directory, you can build a Linux version that supports btree. That works if you replace 'uos' with 'os' in your sample code.

Update: 500,000 entries with the test code created a 20 MByte database file.
Update 2: Transferring the smaller test file to the ESP8266 works, but a simple attempt to print all keys runs into an EINVAL error.
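
As a rough cross-check of the figures quoted in this thread (a 20 MByte file for 500,000 test entries, and a flash file system of about 2 MB), the on-disk cost per entry and the number of entries that would fit in flash can be estimated as follows. This is back-of-the-envelope arithmetic only; real keys and values of a different size would change the result.

```python
FLASH_BYTES = 2 * 1024 * 1024      # "about 2 MB" flash file system
DB_BYTES = 20 * 1024 * 1024        # "20 MByte" file reported for the Unix build
ENTRIES = 500000

# Average on-disk cost of one key/value pair, including btree overhead.
bytes_per_entry = DB_BYTES / ENTRIES

# Integer arithmetic avoids rounding surprises in the entry count.
max_entries_in_flash = FLASH_BYTES * ENTRIES // DB_BYTES

print("about %.1f B per entry on disk" % bytes_per_entry)
print("about %d entries would fit in flash" % max_entries_in_flash)
```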

Post by crizeo » Mon Aug 07, 2017 5:28 pm

It worked for me (after installing os "./micropython -m micropython-os")! Thank you so much, Roberthh! :D

Will try adding the keys and values now...

Edit:
Still not working... :cry: I created a btree database containing 1000 items on Unix and transferred it to the ESP. This is what I get when I search for an entry (OSError: 0):

Code: Select all

>>> import btree, os
>>> os.listdir()
['System Volume Information', 'btree.db']
>>> dbfile = open("btree.db", "r+b")
>>> db = btree.open(dbfile)
>>> db["TEST"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: 0
>>> os.listdir()
['\x00\x00\x00\x0f']

Post by Roberthh » Mon Aug 07, 2017 7:59 pm

I made a similar test, creating a database with increasing numbers (as text) as keys and random hex strings as values, so I know which keys are there. My result:
a) I created the database on the ESP. At most, I can create about 200 entries before getting OSError 0. But these can be retrieved. I modified the test file to call gc.collect() after every flush. This is the log:

Code: Select all

total: 10 entries (58 B); db size: 8192 B; free heap: 12608 B
total: 20 entries (130 B); db size: 8192 B; free heap: 12576 B
total: 30 entries (212 B); db size: 8192 B; free heap: 12576 B
total: 40 entries (284 B); db size: 8192 B; free heap: 12576 B
total: 50 entries (356 B); db size: 8192 B; free heap: 12576 B
total: 60 entries (416 B); db size: 8192 B; free heap: 12576 B
total: 70 entries (494 B); db size: 8192 B; free heap: 12576 B
total: 80 entries (548 B); db size: 8192 B; free heap: 12576 B
total: 90 entries (618 B); db size: 8192 B; free heap: 12576 B
total: 100 entries (690 B); db size: 8192 B; free heap: 12576 B
total: 110 entries (776 B); db size: 8192 B; free heap: 12576 B
total: 120 entries (852 B); db size: 8192 B; free heap: 12576 B
total: 130 entries (938 B); db size: 8192 B; free heap: 12608 B
total: 140 entries (1022 B); db size: 8192 B; free heap: 12576 B
total: 150 entries (1090 B); db size: 8192 B; free heap: 12576 B
total: 160 entries (1172 B); db size: 8192 B; free heap: 12576 B
total: 170 entries (1254 B); db size: 8192 B; free heap: 12576 B
total: 180 entries (1322 B); db size: 8192 B; free heap: 12576 B
total: 190 entries (1412 B); db size: 8192 B; free heap: 12576 B
total: 200 entries (1486 B); db size: 8192 B; free heap: 12576 B
total: 210 entries (1558 B); db size: 16384 B; free heap: 4320 B
total: 220 entries (1644 B); db size: 16384 B; free heap: 4320 B
total: 230 entries (1716 B); db size: 16384 B; free heap: 4320 B
total: 240 entries (1788 B); db size: 16384 B; free heap: 4320 B
total: 250 entries (1864 B); db size: 16384 B; free heap: 4320 B
total: 260 entries (1950 B); db size: 16384 B; free heap: 4320 B
total: 270 entries (2014 B); db size: 16384 B; free heap: 4320 B
total: 280 entries (2098 B); db size: 16384 B; free heap: 4320 B
total: 290 entries (2170 B); db size: 16384 B; free heap: 4320 B
total: 300 entries (2248 B); db size: 16384 B; free heap: 4320 B
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "test_btree.py", line 22, in <module>
It's funny that at a certain point the free memory drops from 12576 to 4320, which is a little more than 8 kB and matches the increase in database size. So it looks like it tries to keep the database in memory, and the next time the database needs to grow, the code runs out of memory and stops. This memory restriction also seems to apply if the database was created on a different machine.

Update: to be tested tomorrow: the cachesize=xxxx parameter of btree.open(). Maybe it's just an RTFM issue.
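
The heap drop described above can be checked against the figures in the log. The numbers are taken from the 200-entry and 210-entry lines; the interpretation in the comments is a guess, not a confirmed description of what btree does internally.

```python
heap_before, heap_after = 12576, 4320   # free heap at 200 and 210 entries
db_before, db_after = 8192, 16384       # database file size at the same points

heap_drop = heap_before - heap_after
db_growth = db_after - db_before

print("heap drop: %d B, db growth: %d B, difference: %d B"
      % (heap_drop, db_growth, heap_drop - db_growth))

# If the next growth step needed another block of the same size,
# the remaining free heap would not be enough to hold it.
print("remaining heap covers another such block:", heap_after >= db_growth)
```

The small difference between the two values is consistent with one block of that size plus some allocation overhead being held in RAM.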

Post by crizeo » Mon Aug 07, 2017 8:49 pm

Unfortunately, you can't find much about the btree module on the internet, but a few hours ago I read something about this issue on the MicroPython GitHub repo, where they say that at some point you cannot store/read data. But 200 entries or so shouldn't cause this error... Anyway, thank you so much for helping me!

As my data can be stored as a tree, I will try something different: Simply create folders named by the letters and then create value files containing the value. Then I will simply try to open the file and if it does not exist, I will get an error that I can catch/except:
  • A
  • B
  • ...
  • T
    • E
      • S
        • T
          • value.txt (I will name it '_' to save memory)
    • ...
  • S

Code: Select all

try:
    print("value: " + open('/'.join(list("TEST")) + "/value.txt").read())
except OSError:
    raise KeyError
The problem with this method is that it takes extremely long to write the 500,000 entries to the SD card (and it requires more memory; I am using FAT32 with 512 B clusters). I am currently writing 8.8 entries per second (-> about 16 hours). But I don't know what else to use and, in return, access seems to be quite fast...
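
The folder scheme described above could be wrapped in two small helpers, sketched here with hypothetical names (key_to_dir, put, get). The sketch assumes that keys contain only characters that are valid directory names, that no key contains "_" or "/" (a "_" character would collide with the value file name), and that the root directory already exists.

```python
import os

VALUE_FILE = "_"  # short name for the value file, as suggested above


def key_to_dir(root, key):
    """Map a key such as "TEST" to the directory root/T/E/S/T."""
    return "/".join([root] + list(key))


def put(root, key, value):
    """Create one directory per character, then write the value file."""
    path = root
    for ch in key:
        path = path + "/" + ch
        try:
            os.mkdir(path)
        except OSError:
            pass  # assumption: the directory already exists
    with open(path + "/" + VALUE_FILE, "w") as f:
        f.write(str(value))


def get(root, key):
    """Return the stored integer, or raise KeyError if the key is absent."""
    try:
        with open(key_to_dir(root, key) + "/" + VALUE_FILE) as f:
            return int(f.read())
    except OSError:
        raise KeyError(key)
```

A call such as put(root, "TEST", 42) followed by get(root, "TEST") would then behave like a dictionary lookup, with a missing key raising KeyError as in the snippet above.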

Post by pythoncoder » Tue Aug 08, 2017 4:51 am

@Roberthh Keeping a database in RAM surely negates its purpose - you might as well pickle a dict. Are you sure it's doing this?

Post by Roberthh » Tue Aug 08, 2017 5:15 am

Setting cachesize=2048 did NOT do the trick. And yes, Peter, keeping the data in RAM seems odd and is contrary to the intention of the module, as written in the docs.
About the memory loss: The same code on unix shows a constant available heap size. So I think (and hope) it's an implementation glitch.

log output on Unix:

Code: Select all

total: 10 entries (64 B); db size: 8192 B; free heap: 2049216 B
total: 20 entries (126 B); db size: 8192 B; free heap: 2049216 B
total: 30 entries (200 B); db size: 8192 B; free heap: 2049216 B
total: 40 entries (268 B); db size: 8192 B; free heap: 2049216 B
total: 50 entries (352 B); db size: 8192 B; free heap: 2049216 B
total: 60 entries (410 B); db size: 8192 B; free heap: 2049216 B
total: 70 entries (464 B); db size: 8192 B; free heap: 2049216 B
total: 80 entries (522 B); db size: 8192 B; free heap: 2049216 B
total: 90 entries (582 B); db size: 8192 B; free heap: 2049216 B
total: 100 entries (660 B); db size: 8192 B; free heap: 2049216 B
total: 110 entries (746 B); db size: 8192 B; free heap: 2049216 B
total: 120 entries (824 B); db size: 8192 B; free heap: 2049216 B
total: 130 entries (916 B); db size: 8192 B; free heap: 2049216 B
total: 140 entries (996 B); db size: 8192 B; free heap: 2049216 B
total: 150 entries (1082 B); db size: 8192 B; free heap: 2049216 B
total: 160 entries (1158 B); db size: 8192 B; free heap: 2049216 B
total: 170 entries (1250 B); db size: 8192 B; free heap: 2049216 B
total: 180 entries (1330 B); db size: 8192 B; free heap: 2049216 B
total: 190 entries (1416 B); db size: 8192 B; free heap: 2049216 B
total: 200 entries (1504 B); db size: 8192 B; free heap: 2049216 B
total: 210 entries (1578 B); db size: 16384 B; free heap: 2049216 B
total: 220 entries (1662 B); db size: 16384 B; free heap: 2049216 B
total: 230 entries (1754 B); db size: 16384 B; free heap: 2049216 B
total: 240 entries (1832 B); db size: 16384 B; free heap: 2049216 B
total: 250 entries (1914 B); db size: 16384 B; free heap: 2049216 B
total: 260 entries (1988 B); db size: 16384 B; free heap: 2049216 B
total: 270 entries (2066 B); db size: 16384 B; free heap: 2049216 B
total: 280 entries (2148 B); db size: 16384 B; free heap: 2049216 B
total: 290 entries (2214 B); db size: 16384 B; free heap: 2049216 B
total: 300 entries (2306 B); db size: 16384 B; free heap: 2049216 B
total: 310 entries (2382 B); db size: 20480 B; free heap: 2049216 B
total: 320 entries (2460 B); db size: 20480 B; free heap: 2049216 B
total: 330 entries (2540 B); db size: 20480 B; free heap: 2049216 B
total: 340 entries (2630 B); db size: 20480 B; free heap: 2049216 B
total: 350 entries (2710 B); db size: 20480 B; free heap: 2049216 B
total: 360 entries (2782 B); db size: 20480 B; free heap: 2049216 B
total: 370 entries (2868 B); db size: 20480 B; free heap: 2049216 B
total: 380 entries (2932 B); db size: 20480 B; free heap: 2049216 B
total: 390 entries (3016 B); db size: 20480 B; free heap: 2049216 B
total: 400 entries (3098 B); db size: 20480 B; free heap: 2049216 B
total: 410 entries (3172 B); db size: 20480 B; free heap: 2049216 B
total: 420 entries (3254 B); db size: 24576 B; free heap: 2049216 B
total: 430 entries (3336 B); db size: 24576 B; free heap: 2049216 B
total: 440 entries (3404 B); db size: 24576 B; free heap: 2049216 B
total: 450 entries (3476 B); db size: 24576 B; free heap: 2049216 B
total: 460 entries (3556 B); db size: 24576 B; free heap: 2049216 B
total: 470 entries (3628 B); db size: 24576 B; free heap: 2049216 B
total: 480 entries (3704 B); db size: 24576 B; free heap: 2049216 B
total: 490 entries (3782 B); db size: 24576 B; free heap: 2049216 B
total: 500 entries (3858 B); db size: 24576 B; free heap: 2049216 B
>>> 
