How to add/cache string to qstr pool in runtime in C code

General discussions and questions about development of code with MicroPython that is not hardware specific.
Target audience: MicroPython Users.
bhcuong2008
Posts: 14
Joined: Sun May 30, 2021 7:38 am

Post by bhcuong2008 » Sat Jun 05, 2021 11:09 am

Hi,

I build a list of strings by calling ujson.loads from C code:

char *s = "[\"a\", \"b\"]";
mp_obj_t mp_list_1 = mp_call_function_1(mp_loads_fn, mp_obj_new_str(s, strlen(s)));

Now I want to cache the strings "a" and "b" in the QSTR pool, so that further processing that uses these values is faster. After processing, I would like to remove them from the QSTR pool to save RAM.

Is there any MicroPython C API that can check, add, or delete a qstr in the QSTR pool?

Thank you.

stijn
Posts: 735
Joined: Thu Apr 24, 2014 9:13 am

Re: How to add/cache string to qstr pool in runtime in C code

Post by stijn » Sat Jun 05, 2021 11:34 am

mp_obj_new_str_via_qstr creates (or uses an existing) qstr. I'm not sure that is going to be faster though, so better measure that first. But I also don't really understand what you're after: your list will have "a" and "b" created by mp_obj_new_str, so at that point do you mean you're going to replace those elements with qstrs? And all of that together should then be faster somehow? Or are you first going to create "a" and "b" in the pool, so that when mod_ujson_load creates the list, mp_obj_new_str will use the qstrs from the pool? That would require knowing all strings beforehand.

bhcuong2008
Posts: 14
Joined: Sun May 30, 2021 7:38 am

Re: How to add/cache string to qstr pool in runtime in C code

Post by bhcuong2008 » Sat Jun 05, 2021 11:47 am

I just want it to work like in the REPL. In the REPL, if you define a list like this,

list_1 = ['a', 'b']

then MPY will automatically create qstrs for "a" and "b". You can check this by importing the micropython module and running micropython.qstr_info(1).

You can check the run time by executing operations with the strings "a" and "b" in C/MPY code when "a" and "b" are not in the QSTR pool. For example, check whether "b" is in a dict or is an attribute of an object. It seems to be much slower. Don't do this in the REPL, because the REPL creates the qstr "b" automatically.

Code:

if 'b' in dict_ex:
   do_something()

Specifically, you define a MPY function like this:

Code:

def test_qstr(list_args, dict_args):
   for e in list_args:
      if e in dict_args:
         pass

Then run this function both in the REPL and from C code. In C code, you provide the arguments to test_qstr like this:

Code:

const char *s_list_1 = "[\"a\", \"b\", \"c\", \"d\", \"e\"]";
const char *s_dict_1 = "{\"a\": 1, \"b\": 2}";
/* Parse both JSON documents; mp_ujson_loads_fn is expected to hold ujson.loads. */
mp_obj_t mp_list_1 = mp_call_function_1(mp_ujson_loads_fn, mp_obj_new_str(s_list_1, strlen(s_list_1)));
mp_obj_t mp_dict_1 = mp_call_function_1(mp_ujson_loads_fn, mp_obj_new_str(s_dict_1, strlen(s_dict_1)));
/* Look up the Python function test_qstr in the global namespace. */
mp_obj_t mp_test_qstr_fn = mp_load_global(qstr_from_str("test_qstr"));

mp_call_function_2(mp_test_qstr_fn, mp_list_1, mp_dict_1);

In the MPY REPL, define:

Code:

list_1 = ['a', 'b', 'c', 'd', 'e']
dict_1 = {'a': 1, 'b': 2}
test_qstr(list_1, dict_1)

You will see that the execution time in the MPY REPL is shorter than via the C API. This is because the qstrs 'a', 'b', 'c', 'd', 'e' have already been created automatically by the REPL.

bhcuong2008
Posts: 14
Joined: Sun May 30, 2021 7:38 am

Re: How to add/cache string to qstr pool in runtime in C code

Post by bhcuong2008 » Sat Jun 05, 2021 1:13 pm

I got it. Just call qstr_from_str("a"), and QSTR(a) is added to the QSTR pool. But I don't know how to release these qstrs :(

I also wonder how to add a string value, such as list_args[0], to the QSTR pool from Python code.

stijn
Posts: 735
Joined: Thu Apr 24, 2014 9:13 am

Re: How to add/cache string to qstr pool in runtime in C code

Post by stijn » Sat Jun 05, 2021 3:04 pm

bhcuong2008 wrote: I got it. Just call qstr_from_str("a")
That is indeed what mp_obj_new_str_via_qstr (or mp_obj_str_intern) uses as well. The difference is that it returns an mp_obj_t, and you'll need that anyway if you want to store the qstr in a list. I.e. if you already have a list with "a" in it from JSON, the strings in that list will still need to be replaced with qstrs. Just calling qstr_from_str("a") only has that effect when it is called before parsing the JSON. Also note that qstr length is limited, so you cannot just throw arbitrary strings at it.

bhcuong2008 wrote: I also wonder how to add string value, such as list_args[0], to QSTR pool in python code.
I don't think mp_obj_str_intern or similar is exposed; you'd have to add that in a custom module. And again, replace all your existing strings.

bhcuong2008 wrote: But I dont know how to release these qstrs
There is no function for that.

bhcuong2008 wrote: I just want it like in REPL.
Just a note: this has little to do with the REPL, it's because of the parse/compile phase. In other words, it works for any Python code, also in files. There's no way to get that behaviour here though, especially not if you're parsing arbitrary strings from JSON.

bhcuong2008 wrote: It seems to be slower so much
If I run test_qstr 100000 times it takes about 47 ms with qstrs and 48 ms without. So, ok, it seems faster, but nothing extreme at all. At least not for this particular test, on the unix port, on my PC.

bhcuong2008
Posts: 14
Joined: Sun May 30, 2021 7:38 am

Re: How to add/cache string to qstr pool in runtime in C code

Post by bhcuong2008 » Sun Jun 06, 2021 7:57 am

Thanks, stijn.

I found the issue linked below. In it, dpgeorge explains that if a string is not interned, the comparison uses strcmp, which is much slower than the number comparison used for interned strings.

https://github.com/micropython/micropython/issues/4053
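
The difference between the two kinds of comparison can be illustrated with a small, self-contained sketch. This is a toy model only, not MicroPython's actual qstr implementation: the names toy_pool_t, toy_find and toy_intern and the fixed capacity TOY_POOL_MAX are all made up for illustration. It shows the general idea of interning: each distinct string is stored once and identified by a small integer, so two interned strings can be compared with a single integer comparison instead of a byte-by-byte strcmp.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: NOT MicroPython's qstr implementation. */
#define TOY_POOL_MAX 64 /* hypothetical fixed capacity */

typedef struct {
    char *strs[TOY_POOL_MAX]; /* owned copies of the interned strings */
    size_t len;               /* number of entries in use */
} toy_pool_t;

/* Check: return the id (index + 1) of s, or 0 if s is not interned.
 * The slow strcmp scan happens here, once per lookup. */
size_t toy_find(const toy_pool_t *pool, const char *s) {
    for (size_t i = 0; i < pool->len; i++) {
        if (strcmp(pool->strs[i], s) == 0) {
            return i + 1;
        }
    }
    return 0;
}

/* Add: return the existing id of s, or store a copy and return a new id.
 * Returns 0 if the pool is full or memory allocation fails. */
size_t toy_intern(toy_pool_t *pool, const char *s) {
    size_t id = toy_find(pool, s);
    if (id != 0) {
        return id; /* already interned: reuse the same id */
    }
    if (pool->len == TOY_POOL_MAX) {
        return 0;
    }
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL) {
        return 0;
    }
    memcpy(copy, s, n);
    pool->strs[pool->len++] = copy;
    return pool->len; /* id of the new entry is its index + 1 */
}
```

Once two strings have been interned, checking whether they are equal is just id_a == id_b. In this toy model an id is an index into the pool, so removing an entry would invalidate ids that other code may still be holding; that is one way to see why deleting from an intern pool is harder than adding to it.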

You will see that qstrs are important for both speed and saving memory. But, as in that issue, I need to release the qstrs that I created, and currently it is impossible to delete a qstr.

I have already written a custom function that adds a new qstr to the pool from a MPY program, but I would like to be able to manage the pool as well.

In my test, I can add qstrs to the pool automatically from code: Q(state_l1), Q(state_l2), Q(state_l3), Q(state_l4), Q(battery), Q(contact). With these qstrs, data processing is about 30-40% faster in my case.

Code:

>>> micropython.qstr_info(1)
qstr pool: n_pool=1, n_qstr=16, n_str_data_bytes=145, n_total_bytes=241
Q(O)
Q(I)
Q(S)
Q(B)
Q(L)
Q(N)
Q(F)
Q(D)
Q(uasyncio.core)
Q(uasyncio.event)
Q(state_l1)
Q(state_l4)
Q(state_l3)
Q(state_l2)
Q(battery)
Q(contact)
>>> 
