Fast crc8

Discussion about programs, libraries and tools that work with MicroPython. Mostly these are provided by a third party.
Target audience: All users and developers of MicroPython.
User avatar
Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Fast crc8

Post by Roberthh » Fri Feb 12, 2021 4:33 pm

Since the crc's can be designed in various ways, I tried one with right-shift's, which is a little bit faster, since the masking is not required

Code: Select all

POLYNOMIAL = 0x8408
PRESET = 0x0000

def _initial(c):
    crc = 0
    for j in range(8):
        if (crc ^ c) & 0x1:
            crc = (crc >> 1) ^ POLYNOMIAL
        else:
            crc = crc >> 1
        c = c >> 1
    return crc

import array
_tab = array.array("H", [_initial(i) for i in range(256)])

@micropython.native
def crc16(str):
    crc = PRESET
    for c in str:
        crc = (crc >> 8) ^ _tab[(crc ^ c) & 0xff]
    return crc

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: Fast crc8

Post by kevinkk525 » Fri Feb 12, 2021 5:38 pm

Thanks so much for your efforts! This version is indeed 700-900us faster on 500Bytes than the old one (once you make _tab a local variable otherwise the access to a global variable makes it slower).
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

User avatar
Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Fast crc8

Post by Roberthh » Fri Feb 12, 2021 8:34 pm

Two notes:
a) If you use in the last version (table with right shift) as POLYNIAL the value 0x8408, then you get the same values as with the first variant without tables. It puzzled me that the resutls were different. If you look at the bit pattern 0x8408 and 0o10201 = 0x1081 are quite similar.

b) One can also use a linear congruential generator (LCG) as hash function. It is for instance used in the hash method of micropython.
two variants:

b.1)

Code: Select all

@micropython.native
def hash(str):
    result = 0xceed

    for v in str:
        result = (result * 17 ^ v) & 0xffff
    return result
b.2) The same with an additional lookup table, which is told to be better. The lookup table is just a table of random values.

Code: Select all

@micropython.native
def hash(str):
    table = b'\\~A!\x90\xb2\xe03H\x81rQI\xd3#\'\x83b\x90\xe8"+s56\x1b2u'\
            b'\xe4)[t\x01\x0f\x8fR\x04&&[X\xfe\xbc\x80e64\r\x1b\xc1\xa0\x951\xec7'\
            b'\x1f\x00\xc9\xb7q@\xfa\'P\x01?&\x04Q\xa7\x17\xafS\xe6\xcf\xad\xa1i'\
            b'\xb6Drs\tO\xe5i\xf9\xc8_\x18\xdb\xad;\xdbq\xca\xcc\xe4\xbd\x0c\x84Gv'\
            b'\xfd\xeb\xfdq\xcd\x04\xa3B\x1e\xa0M\xb4\xfe\x82\xf6\x0c\xdf\xd73\xef'\
            b'\xdc\xbb\x855\x8b^D\xd7\x10\xf2I\x80\xea\x95\x99n\xe2\xeb\x11\xec\xf7'\
            b'\xed\x17\x15y\xd8F\x97\xf13\xe9\xc6\xa0\x91\x0eh\x00\xbd\xa1<\x1f\x9d'\
            b'\x0f\xbb\x07\x16\xe3\x07\xc1\xc0\xee\xc4\t\xc0\xd3\xc2\xeaqQ?\x84h\x95'\
            b'\x08\x98\xf6\xf3\xab8G\x86\\\xba\xa3\'M\xa0\x82\x99\xc0\xe3*:\x88\x8f'\
            b'\xb0=\xe7\xca\xfe\t\xd5\x87\xf6\xe5\xb7\xd6\xeaL@\xe3\x06\xfa\xc3\xb4'\
            b'\xf4E\xa3<W\x92m\xb8Y\x15\xae\x91G\xcf\x1e\xc2\x12\x9f$\x99%\x1a]\xef\x0bjE\x
    result = 0xceed

    for v in str:
        result = (result * 17 ^ table[v]) & 0xffff
    return result
In my test, the numbers on a rp2040 for 400 bytes are
4.6 ms for the non.table CRC
4.0 ms for the table based crc, table locally linked
1.7 ms for the hash without table
2.2 ms for the hash with table

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: Fast crc8

Post by kevinkk525 » Fri Feb 12, 2021 8:49 pm

This is an interesting approach.

For 400 Bytes on the esp32 I get:
3ms for the table based crc, table outside the function but locally cached
2.5ms for the hash without table
2.8ms for the hash with table


These are interesting results. The RP2040 is a lot slower on the crc calculations but quite a bit faster on the hash functions.

I think I'll go with the hash function then, since the client will be running on the RP2040.
How much better is using the lookup table for data <500 Bytes?
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

User avatar
Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Fast crc8

Post by Roberthh » Fri Feb 12, 2021 8:56 pm

The lookup table has to have a size of 256 bytes for all possible values in a byte of the message.
The 17 is just a prime number, not specifically chosen. It just has to be a prime smaller than 8192, such that the multiplication does not overflow a small int.
Edit: The table in the hash function could be made more carefully as a random permutation of all possible values. But for now I just made it as a first shot for testing.

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: Fast crc8

Post by kevinkk525 » Fri Feb 12, 2021 8:59 pm

Thanks for that explanation. Guess I might have expressed my question unclear though.
Roberthh wrote:
Fri Feb 12, 2021 8:34 pm
The same with an additional lookup table, which is told to be better
How much better is that table? I can't grasp how important that table is. Like is the hash without table worse than the crc16?
Especially for packets that will likely not be bigger than 500Bytes.
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

User avatar
Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Fast crc8

Post by Roberthh » Fri Feb 12, 2021 9:07 pm

I'm not sure that the hash with table is better than the hash without table. There are recommendations to do so, but ...

Hashes may works equally to crc for that purpose. I am not a mathematician, so I cannot really prove. I just used both variants and both worked. CRC has been used long time because it can be easily implemented in hardware for bit streams. For any cryptographic applications, hash functions are used. Obviously not such a simple one.

So.. end of story for today.

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: Fast crc8

Post by kevinkk525 » Fri Feb 12, 2021 9:23 pm

Ok, so I guess for my use-case a simple hash without the table will be good enough then.

Thank you so much for all your help and answers! This has been immensely helpful.
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

User avatar
Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: Fast crc8

Post by Roberthh » Sat Feb 13, 2021 1:47 pm

No algorithm without testing. So I made a test series, filling a 512 byte block with random value and then changing the content of the data at one to four places with following variants:
- changing at random or consecutive places
- replace by a random number of by incrementing the existing value
Each modificaton was repeated 100000 times, and the events counted, where the changed data had the same checksum.
For the hash algorithms the prime number was changed from 17 to 73, which gave much better results. Other values are fine too, like 227, as long as it is a prime number < 256.
Table below

Code: Select all

            Increment Test             # of trials   CRC16   Simple hash   Tabled hash  
Single byte change at random position    1000000       0          0             0       
Two byte change at random positions      100000        0          3             2       
Two byte change at consecutive position  100000        0          2             9       
Three bytes changed at random positions  100000        2          0             0       
Three bytes changed at consecutive posi  100000        0          0             1       
Four bytes changed at random positions   100000        1          2             3       
Four bytes changed at consecutive posit  100000        0          3             9       
                                                                         
        Random replace                                                           
Single byte change at random position    1000000       0          0             0       
Two byte change at random positions      100000        2          3             3       
Two byte change at consecutive position  100000        0          5             2       
Three bytes changed at random positions  100000        1          1             1       
Three bytes changed at consecutive posi  100000        1          1             2       
Four bytes changed at random positions   100000        1          1             1       
Four bytes changed at consecutive posit  100000        2          1             2       
As you can see, the CRC16 performed best. Next is the simple hash with 73 as prime. There is a interesting peak at two byte changes for consecutive places. Changing more places did not change the figure much. The collisions count stayed at 1 - 2.

kevinkk525
Posts: 969
Joined: Sat Feb 03, 2018 7:02 pm

Re: Fast crc8

Post by kevinkk525 » Sat Feb 13, 2021 2:49 pm

Wow that is a very thorough test!

When you say "single byte change" do you mean a whole byte has changed or just a single bit within that byte?

Quite interesting that the tabled hash is worse than the simple hash. But crc16 beats them all..
I'd stick with crc16 if not for the pico being so slow with crc16 and impressively fast with the simple hash.

A bit funny that I'm thinking about the best algorithm for ensuring data integrity in a project whose "competitors" (adafruit esp32spi and espressif AT firmwae) don't even use any kind of checksum. :lol:
Kevin Köck
Micropython Smarthome Firmware (with Home-Assistant integration): https://github.com/kevinkk525/pysmartnode

Post Reply