bits to bytes

General discussions and questions about development of code with MicroPython that is not hardware specific.
Target audience: MicroPython Users.
jimmo
Posts: 2754
Joined: Tue Aug 08, 2017 1:57 am
Location: Sydney, Australia

Re: bits to bytes

Post by jimmo » Thu Jul 07, 2022 2:38 am

KJM wrote:
Thu Jul 07, 2022 2:00 am
converts to b'\xf7\xf0' <class 'bytes'> and takes 921 ms to send
Well, that's the correct data in the correct format. I suspect the delay is entirely the LoRa radio... LoRa bitrates can be very low, like tens of bits per second. Remember the radio is sending more than just your two bytes; there are also headers and other overhead.

KJM
Posts: 158
Joined: Sun Nov 18, 2018 10:53 pm
Location: Sydney AU

Re: bits to bytes

Post by KJM » Thu Jul 07, 2022 7:02 am

I double-checked this to make sure I got it right.

Code:

def _Tx(r): s.send(M); m=lora.stats(); print('P(dBm)',m[6], ' sf',m.sftx, ' Tx(ms)',m[7]); print()

bis='1111011111111010'
d=int(bis, 2); h=hex(d); r=chr(int(h)); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
d=int(bis, 2); h=hex(d); r=(h)[2:7]; print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
r=struct.pack('<H', sum(int(b)*(1<<(i^7)) for i, b in enumerate(bits))); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)

bys  <class 'str'> 1 bytes
P(dBm) 20  sf 12  Tx(ms) 992

bys f7fa <class 'str'> 4 bytes
P(dBm) 20  sf 12  Tx(ms) 992

bys b'\xf7\xfa' <class 'bytes'> 2 bytes
P(dBm) 20  sf 12  Tx(ms) 992

Looks like no matter how I represent the contact closures, it takes the same amount of time to send their status via raw lora. I'd lean towards the chr representation though, because it sends fewer bytes.

Interestingly, to me anyway, if I add encryption the situation changes:

Code:

def _Tx(r):
  key=b'must be 16 chars'; iv=crypto.getrandbits(128); cipher=AES(key, AES.MODE_CFB, iv)
  code=cipher.encrypt(r); M=iv+code; print(repr(r),'-->',iv,'+',code,'=',M,len(M),'bytes')
  s.send(M); m=lora.stats(); print('P(dBm)',m[6], ' sf',m.sftx, ' Tx(ms)',m[7]); print()


bis='1111011111111010'
d=int(bis, 2); h=hex(d); r=chr(int(h)); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
d=int(bis, 2); h=hex(d); r=(h)[2:7]; print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
r=struct.pack('<H', sum(int(b)*(1<<(i^7)) for i, b in enumerate(bits))); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)

bys  <class 'str'> 1 bytes
'\uf7fa' --> b'i\xd3+\xaa\xa3\xf0\xae\xe4Ga\xda\xa3\xb7R.1' + b'\xf8\xe2c' = b'i\xd3+\xaa\xa3\xf0\xae\xe4Ga\xda\xa3\xb7R.1\xf8\xe2c' 19 bytes
P(dBm) 20  sf 12  Tx(ms) 1319

bys f7fa <class 'str'> 4 bytes
'f7fa' --> b'\x0bK\x13\x03#\xd4\x8e&\xd4\xd0@\xe9\xe7\x9f\xfc\x9a' + b'\xe1\xa5\x0f\xc6' = b'\x0bK\x13\x03#\xd4\x8e&\xd4\xd0@\xe9\xe7\x9f\xfc\x9a\xe1\xa5\x0f\xc6' 20 bytes
P(dBm) 20  sf 12  Tx(ms) 1319

bys b'\xf7\xfa' <class 'bytes'> 2 bytes
b'\xf7\xfa' --> b'\xb2\xa2\xd9\xddnf\xaa\xa6\x94\xfc\xe3I*\x0fY9' + b'\x03\xf6' = b'\xb2\xa2\xd9\xddnf\xaa\xa6\x94\xfc\xe3I*\x0fY9\x03\xf6' 18 bytes
P(dBm) 20  sf 12  Tx(ms) 1319

and struct sends fewer bytes.

Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: bits to bytes

Post by Roberthh » Thu Jul 07, 2022 7:42 am

At SF12, the data rate is 250 bit/s. And, as Jimmo said, LoRa adds quite a bit of protocol overhead to the messages. When you encrypt, your payload gets larger, because you have to send a full encrypted frame of 16 bytes, and then it does not matter whether you send 1 or 16 bytes. At SF12, the difference in transmission time between encrypted and clear corresponds to 12 bytes of data.
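
The effect of payload size on transmission time can be estimated with the time-on-air formula from the Semtech SX127x datasheet. The sketch below is only an illustration, not what the radio driver itself computes. The thread states SF12 but none of the other radio settings, so the 125 kHz bandwidth, 4/5 coding rate, 8-symbol preamble, explicit header, CRC and low-data-rate optimisation are all assumptions, and the function name `lora_airtime_ms` is hypothetical.

```python
import math

def lora_airtime_ms(payload_len, sf=12, bw_hz=125000, coding_rate=1,
                    preamble_symbols=8, explicit_header=True, crc=True,
                    low_data_rate_opt=True):
    """Estimate LoRa time on air in ms (Semtech SX127x datasheet formula).

    All default radio settings are assumptions; only SF12 comes from the thread.
    coding_rate=1 means 4/5, 2 means 4/6, and so on.
    """
    symbol_ms = (2 ** sf) / bw_hz * 1000
    implicit_header = 0 if explicit_header else 1
    de = 1 if low_data_rate_opt else 0
    numerator = (8 * payload_len - 4 * sf + 28
                 + 16 * (1 if crc else 0) - 20 * implicit_header)
    denominator = 4 * (sf - 2 * de)
    # Payload is sent in blocks of (coding_rate + 4) symbols after 8 fixed ones.
    payload_symbols = 8 + max(math.ceil(numerator / denominator) * (coding_rate + 4), 0)
    return (preamble_symbols + 4.25 + payload_symbols) * symbol_ms

for length in (1, 2, 4, 18, 19, 20):
    print(length, 'bytes ->', round(lora_airtime_ms(length)), 'ms')
```

With these assumed settings, payloads of 18, 19 and 20 bytes all need the same number of symbols and come out at roughly 1319 ms, which is consistent with the encrypted timings reported above. At SF12 the airtime only changes in steps of several symbols, so one or two extra bytes often make no difference.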

KJM
Posts: 158
Joined: Sun Nov 18, 2018 10:53 pm
Location: Sydney AU

Re: bits to bytes

Post by KJM » Thu Jul 07, 2022 9:13 am

Something is troubling me Rob. It should take 2 bytes to represent 16 bits. But if I convert my 16 bit string of 1s & 0s to a character with d=int(bis, 2); bys=hex(d); r=chr(int(bys)) & send that over raw lora (without encryption) the receiver says it's received 1 byte. How is it possible to cram 65536 combinations into a single byte? I mean there could be 65536 unique characters, but I can't understand how they can all be sent as variations of a single byte. To my mind a single byte can only handle 8 bits, not 16.

karfas
Posts: 193
Joined: Sat Jan 16, 2021 12:53 pm
Location: Vienna, Austria

Re: bits to bytes

Post by karfas » Thu Jul 07, 2022 9:25 am

KJM wrote:
Thu Jul 07, 2022 9:13 am
Something is troubling me Rob. It should take 2 bytes to represent 16 bits. But if I convert my 16 bit string of 1s & 0s to a character with d=int(bis, 2); bys=hex(d); r=chr(int(bys)) & send that over raw lora (without encryption) the receiver says it's received 1 byte. How is it
I would assume that chr(something) will return ONE character.

And I'm surprised that this doesn't raise an exception. With CPython (I have no MicroPython to play with here), I get:

Code:

>>> bis='1111011111111010'
>>> d = int(bis, 2)
>>> hex(d)
'0xf7fa'
>>> int(hex(d))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '0xf7fa'
A few hours of debugging might save you from minutes of reading the documentation! :D
My repositories: https://github.com/karfas

karfas
Posts: 193
Joined: Sat Jan 16, 2021 12:53 pm
Location: Vienna, Austria

Re: bits to bytes

Post by karfas » Thu Jul 07, 2022 9:43 am

karfas wrote:
Thu Jul 07, 2022 9:25 am
I would assume that chr(something) will return ONE character.
Found out that chr(something) might return a Unicode character (more than one byte).

You can also use bit operations to split the large integer to real bytes, like in:

Code:

>>> bis='1111011111111010'
>>> d = int(bis, 2)
>>> d >> 8
247
>>> (hex(d >> 8), hex(d & 0xff))
('0xf7', '0xfa')

Roberthh
Posts: 3667
Joined: Sat May 09, 2015 4:13 pm
Location: Rhineland, Europe

Re: bits to bytes

Post by Roberthh » Thu Jul 07, 2022 9:47 am

r=chr(int(bys))
Creates a single byte. To create a two byte representation of your integer, use struct.pack(), like @jimmo suggested, or you can do a 'manual' packing like:

Code:

msg = bytearray(2)
msg[0] = d >> 8
msg[1] = d & 0xff
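
For comparison, the manual packing above and `struct.pack` with the big-endian format `'>H'` should give the same two bytes. This is a minimal sketch with illustrative variable names; it uses nothing hardware specific, so it should behave the same in CPython and MicroPython.

```python
import struct

bit_string = '1111011111111010'
value = int(bit_string, 2)      # the 16-bit integer 0xf7fa

# Manual packing: high byte first, then low byte.
manual = bytearray(2)
manual[0] = value >> 8
manual[1] = value & 0xff

# Same layout with struct: '>' is big-endian, 'H' is an unsigned 16-bit integer.
packed = struct.pack('>H', value)

print(bytes(manual), packed)
```

Using `'<H'` instead would swap the two bytes, so the receiver has to unpack with the same byte order the sender used.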

karfas
Posts: 193
Joined: Sat Jan 16, 2021 12:53 pm
Location: Vienna, Austria

Re: bits to bytes

Post by karfas » Thu Jul 07, 2022 10:24 am

Roberthh wrote:
Thu Jul 07, 2022 9:47 am
r=chr(int(bys))
Creates a single byte.
No.
This might be the case in micropython (and would be a bug, I think).

CPython(3) returns a (Unicode!) character. This might be a byte. See https://docs.python.org/3/library/functions.html#chr
chr(i)
Return the string representing a character whose Unicode code point is the integer i. For example, chr(97) returns the string 'a', while chr(8364) returns the string '€'. This is the inverse of ord().
For the value we are talking about (int('1111011111111010', 2) => 0xf7fa), this is for sure not (can't be) a byte.

I'm nitpicking here because the whole thread mixes bytes, characters and the integer representation of these things.
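
The difference between "one character" and "one byte" can be checked directly. The sketch below assumes the string is UTF-8 encoded when it goes out over the radio, which is not confirmed anywhere in the thread; the variable names are illustrative.

```python
code_point = int('1111011111111010', 2)   # 0xf7fa
character = chr(code_point)               # a str holding one Unicode character

utf8_bytes = character.encode('utf-8')    # what would actually be transmitted

print(len(character))    # length in characters
print(len(utf8_bytes))   # length in bytes
print(utf8_bytes)
```

Here `len(character)` is 1 but the UTF-8 encoding is the three bytes `b'\xef\x9f\xba'`, which would fit the 3-byte ciphertext shown for the chr variant in the encrypted output earlier in the thread.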

martincho
Posts: 96
Joined: Mon May 16, 2022 9:59 pm

Re: bits to bytes

Post by martincho » Thu Jul 07, 2022 3:10 pm

KJM wrote:
Thu Jul 07, 2022 7:02 am

Code:

def _Tx(r): s.send(M); m=lora.stats(); print('P(dBm)',m[6], ' sf',m.sftx, ' Tx(ms)',m[7]); print()
bis='1111011111111010'
d=int(bis, 2); h=hex(d); r=chr(int(h)); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
d=int(bis, 2); h=hex(d); r=(h)[2:7]; print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
r=struct.pack('<H', sum(int(b)*(1<<(i^7)) for i, b in enumerate(bits))); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
def _Tx(r):
  key=b'must be 16 chars'; iv=crypto.getrandbits(128); cipher=AES(key, AES.MODE_CFB, iv)
  code=cipher.encrypt(r); M=iv+code; print(repr(r),'-->',iv,'+',code,'=',M,len(M),'bytes')
  s.send(M); m=lora.stats(); print('P(dBm)',m[6], ' sf',m.sftx, ' Tx(ms)',m[7]); print()
bis='1111011111111010'
d=int(bis, 2); h=hex(d); r=chr(int(h)); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
d=int(bis, 2); h=hex(d); r=(h)[2:7]; print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
r=struct.pack('<H', sum(int(b)*(1<<(i^7)) for i, b in enumerate(bits))); print('bys', r, type(r), len(r), 'bytes'); _Tx(r)
Note: This isn't criticism, just a request to make it easier to be able to help anyone with coding questions.

Code like this is very difficult to read and understand. I'd have to copy this code to an editor and break it down into properly indented individual lines and whitespace. Otherwise it looks like a wall of gibberish. Yes, I did mash together three separate paragraphs in the quote to illustrate the point further. Sometimes taking things to extremes makes things clearer.

If anyone is doing this to post (thinking it makes it more compact), I would strongly advise against it. Just post easy-to-read code. Consider that people who really want to help have to devote an amount of time and effort to read and understand your code. Take the time to make it easy and painless for them to understand and navigate. Clean and clear formatting. Comments. No cryptic variable, function or class names, etc.

The alternative, for the reader, is to take the time and make the effort to create a file, copy, paste and edit the code --and maybe even add their own comments, rename variables, etc.-- just so that they can begin to understand it and then try to help.

When I post I do my best to write something anyone can copy, paste and run without modifications and see the problem. Sometimes this means doing some work to reduce problems to the most fundamental minimal case (which can take a lot of time and testing) and removing any hardware/library dependencies. In other words, I really try my best to help those willing to look at my code understand what I am talking about and my code with the least possible effort and in the shortest possible time.

Frankly, I'd advise anyone not to ever do this in real code. If you need code minified, that can be done in post-processing and code can become really compact with super short variable names. I never, ever, use short variable names except for quick iteration constants or things like file pointers (i, j, n, fp, etc.). All my variable names --in any language-- are as long and descriptive as necessary to allow anyone to read the code now or in ten years and easily understand intent and meaning, for example:

Code:

self._pwm_max_rate_of_change

# rather than...

self._pmrc
When minified, this might become "self.b2". I never have to deal with minified code; the short names don't have to mean anything. I've had to come back to my own code 10 to 15 years later and I am always glad I err on the side of more information.

If you can edit or re-post this code in a clear and easy-to-parse form and, perhaps, add some comments clarifying intent where appropriate, I am happy to have a look and see if I can help.
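
As an illustration of the style being asked for, here is one possible readable version of the bit-string conversion, with the radio call left out so it runs anywhere. The function and variable names are hypothetical and not taken from the original code.

```python
def bits_to_bytes(bit_string):
    """Convert a string of '0'/'1' characters (most significant bit first) to bytes."""
    if len(bit_string) % 8 != 0:
        raise ValueError('bit string length must be a multiple of 8')
    byte_count = len(bit_string) // 8
    value = int(bit_string, 2)
    # Passing the byte count explicitly keeps leading zero bytes in the result.
    return value.to_bytes(byte_count, 'big')

contact_states = '1111011111111010'
payload = bits_to_bytes(contact_states)
print(payload, len(payload), 'bytes')
```

Splitting the conversion into a named function with one step per line makes it easy to test on a desktop before it goes anywhere near the radio.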

TheSilverBullet
Posts: 50
Joined: Thu Jul 07, 2022 7:40 am

Re: bits to bytes

Post by TheSilverBullet » Thu Jul 07, 2022 3:39 pm

Code:

#!/usr/bin/env python3
from time import ticks_us, ticks_diff

def converter(arr):
    n = 0
    for b in arr:
        n <<= 1
        n |= 1 if b else 0
    return bytearray(((n >> 8), n & 0xff))

bits = [1,1,1,1,0,1,1,1,1,1,1,1,1,0,1,0]

result = converter(bits)
print(type(result), result, hex(result[0]), hex(result[1]))

t0 = ticks_us()
for _ in range(1000):
    result = converter(bits)
t1 = ticks_us()
print(f'time: {ticks_diff(t1,t0)//1000} µs/call ')
 
# <class 'bytearray'> bytearray(b'\xf7\xfa') 0xf7 0xfa
# time: 183 µs/call 
Here's another thought…
