Hi,
I have a list of strings created by calling ujson's loads from C code:
char *s = "[\"a\", \"b\"]";
mp_obj_t mp_list_1 = mp_call_function_1(mp_loads_fn, mp_obj_new_str(s, strlen(s)));
Now I want to cache the strings "a" and "b" in the QSTR pool, so that further processing that uses these values is faster. After processing, I would like to remove them from the QSTR pool again to save RAM.
Is there any MicroPython C API that can check/add/delete a qstr in the QSTR pool?
Thank you.
How to add/cache string to qstr pool in runtime in C code
-
- Posts: 14
- Joined: Sun May 30, 2021 7:38 am
Re: How to add/cache string to qstr pool in runtime in C code
mp_obj_new_str_via_qstr creates a qstr (or uses an existing one). I'm not sure that is going to be faster though; better measure that first. But I also don't really understand what you're after: your list will have "a" and "b" created by mp_obj_new_str, so do you mean you're going to replace those elements with qstrs afterwards? And all of that together should then be faster somehow? Or are you first going to create "a" and "b" in the pool, so that when mod_ujson_load creates the list, mp_obj_new_str will use the qstrs from the pool? That would require knowing all the strings beforehand.
Re: How to add/cache string to qstr pool in runtime in C code
I just want it to work like in the REPL. In the REPL, if you define a list like this,
list_1 = ['a', 'b']
then MPY will automatically create qstrs for "a" and "b". You can check this by importing the micropython module and running micropython.qstr_info(1).
You can check the run time by executing operations on the strings "a" and "b" in C/MPY code while "a" and "b" are not in the QSTR pool. For example, check whether "b" is in a dict or is an attribute of an object:
Code: Select all
if 'b' in dict_ex:
    do_something()
It seems to be much slower. Don't do this in the REPL, because the REPL creates the qstr "b" automatically.
Specifically, define an MPY function like this:
Code: Select all
def test_qstr(list_args, dict_args):
    for e in list_args:
        if e in dict_args:
            pass
Then run this function both in the REPL and from C code. In C code, provide the args to test_qstr like this:
Code: Select all
char *s_list_1 = "[\"a\", \"b\", \"c\", \"d\", \"e\"]";
char *s_dict_1 = "{\"a\": 1, \"b\": 2}";
mp_obj_t mp_list_1 = mp_call_function_1(mp_ujson_loads_fn, mp_obj_new_str(s_list_1, strlen(s_list_1)));
mp_obj_t mp_dict_1 = mp_call_function_1(mp_ujson_loads_fn, mp_obj_new_str(s_dict_1, strlen(s_dict_1)));
mp_obj_t mp_test_qstr_fn = mp_load_global(qstr_from_str("test_qstr"));
mp_call_function_2(mp_test_qstr_fn, mp_list_1, mp_dict_1);
In the MPY REPL, define:
Code: Select all
list_1 = ['a', 'b', 'c', 'd', 'e']
dict_1 = {'a': 1, 'b': 2}
test_qstr(list_1, dict_1)
You will see that the execution time in the MPY REPL is shorter than via the C API. This is because the qstrs 'a', 'b', 'c', 'd', 'e' were already created automatically by the REPL.
Re: How to add/cache string to qstr pool in runtime in C code
I got it. Just call qstr_from_str("a"), and QSTR(a) will be added to the QSTR pool. But I don't know how to release these qstrs.
I also wonder how to add a string value, such as list_args[0], to the QSTR pool from Python code.
Re: How to add/cache string to qstr pool in runtime in C code
Quote: I got it. Just call qstr_from_str("a")
That is indeed what mp_obj_new_str_via_qstr (or mp_obj_str_intern) uses as well. The difference is that it returns an mp_obj_t, and you'll need that anyway if you want to store the qstr in a list. I.e. if you already have a list with "a" in it from JSON, the strings in that list will still need to be replaced with qstrs. Just calling qstr_from_str("a") only has that effect when it is called before parsing the JSON. Also note that qstr length is limited, so you cannot just throw arbitrary strings at it.
Quote: I also wonder how to add string value, such as list_args[0], to QSTR pool in python code.
I don't think mp_obj_str_intern or similar is exposed to Python; you'd have to add that in a custom module. And again, replace all your existing strings.
Quote: But I don't know how to release these qstrs
There is no function for that.
Quote: I just want it like in REPL.
Just a note: this has little to do with the REPL; it's because of the parse/compile phase. In other words, it works for any Python code, also in files. There's no way to have that here though, especially not if you're parsing arbitrary strings from JSON.
Quote: It seems to be slower so much
If I run test_qstr 100000 times it takes about 47 ms with qstrs and 48 ms without. So, OK, it seems faster, but nothing extreme at all. At least not for this particular test, on the unix port, on my PC.
Re: How to add/cache string to qstr pool in runtime in C code
Thanks, stijn.
I found this issue. In that post, dpgeorge said that if a string is not interned, it will be compared with strcmp, which is much slower than the integer comparison used for interned strings.
https://github.com/micropython/micropython/issues/4053
You can see there that qstrs are important for speed and for saving memory. But as discussed in that issue, I need to release the qstrs that I created, and currently it's impossible to delete a qstr.
I already made a custom function to add a new qstr to the pool from an MPY program, but I would also like to be able to manage (i.e. remove) these entries.
In my test, I could automatically add qstrs from code to the pool: Q(state_l1), Q(state_l2), Q(state_l3), Q(state_l4), Q(battery), Q(contact). These qstrs make data processing about 30-40% faster in my case.
Code: Select all
>>> micropython.qstr_info(1)
qstr pool: n_pool=1, n_qstr=16, n_str_data_bytes=145, n_total_bytes=241
Q(O)
Q(I)
Q(S)
Q(B)
Q(L)
Q(N)
Q(F)
Q(D)
Q(uasyncio.core)
Q(uasyncio.event)
Q(state_l1)
Q(state_l4)
Q(state_l3)
Q(state_l2)
Q(battery)
Q(contact)
>>>
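To make the idea behind the speed-up more concrete, here is a toy model of an append-only intern pool. It is not MicroPython's implementation; the ToyQstrPool class and its methods are hypothetical names chosen for illustration. It only shows the principle discussed in the thread: once two strings are interned, comparing them reduces to comparing two small integers instead of comparing their characters.

```python
class ToyQstrPool:
    """Toy model of an append-only intern pool (not MicroPython's real code)."""

    def __init__(self):
        self._ids = {}      # string -> small integer id
        self._strings = []  # id -> string

    def find(self, s):
        # Return the id of an already interned string, or None if it is absent.
        return self._ids.get(s)

    def intern(self, s):
        # Return the existing id, or append the string and hand out a new id.
        idx = self._ids.get(s)
        if idx is None:
            idx = len(self._strings)
            self._strings.append(s)
            self._ids[s] = idx
        return idx

    def lookup(self, idx):
        # Map an id back to its string.
        return self._strings[idx]


pool = ToyQstrPool()
id_a = pool.intern("state_l1")
id_b = pool.intern("state_l1")  # same text, so the same id is returned
id_c = pool.intern("battery")

# Equality of interned strings is now just an integer comparison.
print(id_a == id_b, id_a == id_c)
```

In this toy model an id is simply a position in the list, so removing an entry would invalidate ids that callers may still be holding. That is one illustration of why deleting entries from such a pool is awkward.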