A review of serialisation libraries

Post by stijn » Thu Feb 13, 2020 6:55 pm

It's also going to depend on the type of data. This got me interested, so I quickly compared them in Python on data we typically use. For Protobuf vs MessagePack (the pure Python implementation rather than the binary one, to make it fair), MessagePack comes out a tiny bit faster than Protobuf. But Protobuf also gives you type checking etc. And both are completely destroyed by json, being a native implementation, which does the trick in about 1 second. Or possibly I messed something up; it's late already and I'm not sure the comparisons are fair. The code is:

mydata.proto

syntax = "proto2";

package tutorial;

message Data {
  required string id = 1;
  required string type = 2;
  repeated float values1 = 3;
  repeated int32 values2 = 4;
}

test.py

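import os
# Must be set before importing msgpack to force its pure Python fallback.
os.environ['MSGPACK_PUREPYTHON'] = '1'
import mydata_pb2  # generated from mydata.proto with: protoc --python_out=. mydata.proto
import msgpack
import cProfile
import json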

# Replicate class we use with protobuf.
class Data:
  def __init__(self, **kwargs):
    self.id = kwargs.get('id', '')
    self.type = kwargs.get('type', '')
    self.values1 = kwargs.get('values1', [])
    self.values2 = kwargs.get('values2', [])

def PopulateData(d):
  d.id = 'foo'
  d.type = 'bar'
  d.values1.extend([float(x) for x in range(1000)])
  d.values2.extend([x for x in range(1000)])
  return d

rawData = PopulateData(Data())
protoData = PopulateData(mydata_pb2.Data())

iters = 1000

def RunProtoBuf():
  for _ in range(iters):
    mydata_pb2.Data().ParseFromString(protoData.SerializeToString())

def RunMessagePack():
  for _ in range(iters):
    Data(**msgpack.unpackb(msgpack.packb(rawData.__dict__), raw=False))

def RunJSon():
  for _ in range(iters):
    Data(**json.loads(json.dumps(rawData.__dict__)))

cProfile.run('RunProtoBuf()')
cProfile.run('RunMessagePack()')
cProfile.run('RunJSon()')

results in (in order: Protobuf, MessagePack, JSON):

32723695 function calls (32723688 primitive calls) in 16.911 seconds
29686004 function calls (25670004 primitive calls) in 14.861 seconds
21004 function calls in 0.806 seconds
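
One possible source of unfairness in the numbers above is the profiler itself: cProfile adds overhead to every function call it records, which weighs on the pure Python Protobuf and MessagePack runs (around 30 million calls each) far more than on the C-backed json run (about 21 thousand calls). A minimal sketch of timing the same kind of round trip with timeit instead, assuming only the standard library is guaranteed to be present; `time_round_trip` is a hypothetical helper, `payload` is a plain dict standing in for the Data class, and the msgpack part only runs if that package happens to be installed:

```python
import json
import os
import timeit

# Same shape as the Data class in test.py, as a plain dict (assumption:
# a dict is a fair stand-in for rawData.__dict__).
payload = {
  'id': 'foo',
  'type': 'bar',
  'values1': [float(x) for x in range(1000)],
  'values2': list(range(1000)),
}

def time_round_trip(dumps, loads, data, iters=1000):
  """Hypothetical helper: seconds taken by `iters` serialise/deserialise round trips."""
  return timeit.timeit(lambda: loads(dumps(data)), number=iters)

results = {'json': time_round_trip(json.dumps, json.loads, payload)}

try:
  # As in test.py, ask msgpack for its pure Python fallback before importing it.
  os.environ['MSGPACK_PUREPYTHON'] = '1'
  import msgpack  # optional; skipped when not installed
  results['msgpack'] = time_round_trip(
      msgpack.packb, lambda b: msgpack.unpackb(b, raw=False), payload)
except ImportError:
  pass

for name, seconds in results.items():
  print('%s: %.3f s' % (name, seconds))
```

The absolute numbers will differ from the cProfile ones, so only compare results measured the same way.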

tl;dr: same old story: first check your requirements, then test a couple of implementations for your specific use case, then choose.
