A review of serialisation libraries

pythoncoder
Posts: 3931
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

A review of serialisation libraries

Post by pythoncoder » Mon Feb 10, 2020 6:43 pm

I have written a doc in this repo describing the need for serialisation and summarising the relative advantages and drawbacks of four serialisation libraries available to MicroPython. These are:
  • ujson
  • pickle
  • ustruct
  • Protocol Buffers (a third party library)
There is a tutorial on Protocol Buffers. This library has a slightly challenging learning curve but offers unique advantages.

Any comments or corrections are welcome, here or on GitHub.
Peter Hinch

dhylands
Posts: 3268
Joined: Mon Jan 06, 2014 6:08 pm
Location: Peachland, BC, Canada

Re: A review of serialisation libraries

Post by dhylands » Tue Feb 11, 2020 5:35 pm

I've also recently been looking into FlatBuffers, which offer some advantages over Protocol Buffers.
https://google.github.io/flatbuffers/

Admittedly, I've been using FlatBuffers in Rust, not in Python, but I wouldn't expect the programming language to make much difference.

stijn
Posts: 409
Joined: Thu Apr 24, 2014 9:13 am

Re: A review of serialisation libraries

Post by stijn » Tue Feb 11, 2020 7:07 pm

Another one for consideration would be MessagePack. I didn't compare it with other implementations performance-wise, but we've been using it for a while to interface with external code and haven't found problems so far. Reasons for our use case: it's straightforward and has implementations in other languages (we've been using it for interfacing with CPython, C++ and Lua). Why not JSON? Mainly because the standard doesn't have nan/inf, which is really problematic for numerical data. Why not protobuf? It's quite involved to set up and maintain, and essentially overkill.
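
To illustrate the nan/inf point, here is a minimal sketch using CPython's standard json module (MicroPython's ujson may behave differently; I haven't checked that here). The `reading` dict is a made-up example. By default the encoder emits the tokens NaN and Infinity, which are not part of the JSON standard, and in strict mode it refuses to encode them at all.

```python
import json

# Hypothetical numerical record containing non-finite values.
reading = {"temp": float("nan"), "gain": float("inf")}

# Default behaviour: non-standard NaN / Infinity tokens appear in the output,
# which a strictly conforming JSON parser on the other end may reject.
text = json.dumps(reading)
print(text)

# Strict mode: the encoder raises ValueError instead of emitting them.
try:
    json.dumps(reading, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

print("strict encoding succeeded:", strict_ok)
```

Either way the data needs special handling, which is the problem a format with native float support avoids.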

tve
Posts: 60
Joined: Wed Jan 01, 2020 10:12 pm
Location: Santa Barbara, CA

Re: A review of serialisation libraries

Post by tve » Wed Feb 12, 2020 2:23 am

There's a huge performance difference between self-describing formats, such as JSON or MessagePack, and pre-defined formats, such as protobufs, because optimized code can be generated for the latter while the former require some form of interpretation no matter what. However, when using a slow/dynamic language such as Python, it really doesn't make much of a difference...

mattyt
Posts: 274
Joined: Mon Jan 23, 2017 6:39 am

Re: A review of serialisation libraries

Post by mattyt » Wed Feb 12, 2020 2:38 am

Looks good Peter, thanks!

But I tend to use FlatBuffers and MessagePack (and JSON/TOML for human-readable) depending on the use case. MessagePack when I want to allow dynamic messages to be constructed, FlatBuffers when implementing a protocol. At some point I'd like to see support for both of these formats in MicroPython...

Cap'n Proto and Protobuf are two others worth mentioning, but I have generally moved away from them in favour of FlatBuffers.

pythoncoder
Posts: 3931
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: A review of serialisation libraries

Post by pythoncoder » Wed Feb 12, 2020 8:31 am

tve wrote:
Wed Feb 12, 2020 2:23 am
There's a huge performance difference between self-describing formats, such as JSON or MessagePack, and pre-defined formats, such as protobufs, because optimized code can be generated for the latter while the former require some form of interpretation no matter what. However, when using a slow/dynamic language such as Python, it really doesn't make much of a difference...
That depends on what you're doing with the data. For example, if you're sending it over a radio link, data volume can be crucial. Some channels, such as LoRaWAN, have bandwidth restrictions; others are just slow.
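
As a rough illustration of the data volume point, this sketch encodes the same record as JSON and as a fixed binary layout. The record contents and the `"<Hff"` layout are assumptions made up for the example, and the CPython module names are used (on MicroPython these would be ujson and ustruct).

```python
import json
import struct

# Hypothetical sensor record: node id, temperature, humidity.
node_id, temperature, humidity = 7, 21.5, 48.25

# Self-describing: field names and punctuation travel with every message.
as_json = json.dumps({"id": node_id, "t": temperature, "h": humidity}).encode()

# Pre-agreed layout: little-endian unsigned 16-bit int plus two 32-bit floats.
# Both ends must know this format string in advance.
RECORD = "<Hff"
as_struct = struct.pack(RECORD, node_id, temperature, humidity)

print("json bytes:  ", len(as_json))
print("struct bytes:", len(as_struct))
```

The binary record is 10 bytes here; the JSON version is several times larger even with single-character keys.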
Peter Hinch

pythoncoder
Posts: 3931
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: A review of serialisation libraries

Post by pythoncoder » Wed Feb 12, 2020 8:36 am

I wrote this because I have used all three official solutions but was intrigued by the properties of Protocol Buffers. In my testing the library works well. If there are good MicroPython implementations of any of these other schemes I'd be interested to see them.

Even better would be PRs to enhance my doc to describe their benefits and usage ;)
Peter Hinch

stijn
Posts: 409
Joined: Thu Apr 24, 2014 9:13 am

Re: A review of serialisation libraries

Post by stijn » Wed Feb 12, 2020 8:51 am

tve wrote:
Wed Feb 12, 2020 2:23 am
There's a huge performance difference between self-describing formats, such as JSON or MessagePack, and pre-defined formats, such as protobufs
I find that pretty hard to believe. At least in the case of MessagePack. And also because it's a super general non-qualified claim. I searched around a bit and also didn't find anything supporting it, sometimes even the opposite.

stijn
Posts: 409
Joined: Thu Apr 24, 2014 9:13 am

Re: A review of serialisation libraries

Post by stijn » Wed Feb 12, 2020 9:04 am

pythoncoder wrote:
Wed Feb 12, 2020 8:36 am
If there are good MicroPython implementations of any of these other schemes I'd be interested to see them.
You mean implementations of MessagePack etc?
You can pip install msgpack and add the module to MicroPython's search path. Modify exceptions.py so that it doesn't use multiple inheritance and then it works.
There's also u-msgpack-python, but I do not remember the modifications required to make it work on MicroPython. It was twice as slow as msgpack-python for our use case, though.

tve
Posts: 60
Joined: Wed Jan 01, 2020 10:12 pm
Location: Santa Barbara, CA

Re: A review of serialisation libraries

Post by tve » Thu Feb 13, 2020 5:02 pm

stijn wrote:
Wed Feb 12, 2020 8:51 am
tve wrote:
Wed Feb 12, 2020 2:23 am
There's a huge performance difference between self-describing formats, such as JSON or MessagePack, and pre-defined formats, such as protobufs
I find that pretty hard to believe. At least in the case of MessagePack. And also because it's a super general non-qualified claim. I searched around a bit and also didn't find anything supporting it, sometimes even the opposite.
Fair enough. See https://github.com/alecthomas/go_serial ... benchmarks and for the results the tables are easiest to read in the raw doc: https://raw.githubusercontent.com/alect ... /README.md The time/iter column shows the time taken to serialize a struct with random data.

Some examples: standard MessagePack takes 1958 ns/iter, while a special pre-compiled version of MessagePack takes 420 ns/iter. Protobuf: 319 ns/iter. JSON: 4892 ns/iter. A lot of the performance difference comes from dynamic type dispatch, either to figure out what needs to be serialized or to figure out what needs to be allocated to deserialize. These benchmarks are obviously somewhat specific to Golang, but the principle holds: interpreting types (whether using language introspection or some explicit type specifier, as in Python's struct) is always going to carry a penalty. Now, in Python (or JavaScript or Ruby) the difference may be insignificant due to the overall overhead of the language.
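
To make the dispatch point concrete, here is a toy sketch and not any real library's wire format: `encode_tagged` and `decode_tagged` are hypothetical helpers that inspect each value's type at run time and prefix it with a one-byte tag, while `SCHEMA` is a layout fixed in advance. Note that `struct.Struct` still interprets its format string, so this only illustrates the principle; it is not generated code of the kind protobuf compilers emit.

```python
import struct

def encode_tagged(values):
    """Self-describing: inspect each value's type and emit a one-byte tag."""
    out = bytearray()
    for v in values:
        if isinstance(v, bool):      # bool must be tested before int
            out += b"b" + struct.pack("<?", v)
        elif isinstance(v, int):
            out += b"i" + struct.pack("<i", v)
        elif isinstance(v, float):
            out += b"f" + struct.pack("<f", v)
        else:
            raise TypeError("unsupported type")
    return bytes(out)

def decode_tagged(data):
    """Read each tag to discover which type comes next."""
    formats = {ord("b"): "<?", ord("i"): "<i", ord("f"): "<f"}
    values, pos = [], 0
    while pos < len(data):
        fmt = formats[data[pos]]
        values.append(struct.unpack_from(fmt, data, pos + 1)[0])
        pos += 1 + struct.calcsize(fmt)
    return values

# Pre-defined: both ends agree on bool, int32, float32 in that order.
SCHEMA = struct.Struct("<?if")

record = (True, 42, 1.5)
tagged = encode_tagged(record)   # per-value type checks and tags
fixed = SCHEMA.pack(*record)     # no per-value type decisions, no tags

print(len(tagged), len(fixed))
```

The tagged form needs a type check per value on the way out and a tag lookup per value on the way in; the fixed form needs neither, at the cost of both ends sharing the schema.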
