Reverse engineer MicroPython bytecode (.mpy) vs Python bytecode (.pyc)

Dugite
Posts: 21
Joined: Thu Jan 18, 2018 1:29 pm

Post by Dugite » Fri Nov 15, 2019 2:19 pm

I would like to know how easy it is to reverse engineer MicroPython bytecode (.mpy) compared with Python bytecode (.pyc).

I have a product for which I will need to supply updates in DFU format for the STM32. You can set fuses to prevent the flash memory on the STM32 from being read, but how easy would it be to reverse engineer the DFU file that is distributed for updates?

Thank you, any information on this would be appreciated :)

OutoftheBOTS_
Posts: 847
Joined: Mon Nov 20, 2017 10:18 am

Post by OutoftheBOTS_ » Fri Nov 15, 2019 9:32 pm

Any code anywhere can be reverse engineered. I have seen many tools on the net for inspecting machine instructions, but you need to be a pretty clever cookie to use them to work out what's going on.

jimmo
Posts: 2754
Joined: Tue Aug 08, 2017 1:57 am
Location: Sydney, Australia

Post by jimmo » Mon Nov 18, 2019 1:48 am

If the question is about converting a .mpy file to a disassembly, i.e. from bytecode into a text description of opcodes and arguments (e.g. load_fast "foo"), then it's not particularly complicated: the format doesn't use complicated encoding or variable-length patterns* (I've had to do this by hand for short sequences while debugging). I'm not aware of any pre-existing tools, but you could probably write one in Python. I'd love to see a radare plugin :)

The thing about Python, though, is that once you have the bytecode, converting it into something that looks like Python is not nearly as complicated as it is for other languages; x64 assembly back to C, for example, is _very_ challenging. The main thing you lose is the names of function-local variables.
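
For comparison on the CPython (.pyc) side, here is a minimal sketch of how little effort a disassembly takes with the standard library. The function `area` is a made-up example, and a marshal round trip stands in for reading a real .pyc file, which is a short header followed by a marshalled code object. Note that CPython, unlike the .mpy case described above, keeps local variable names in `co_varnames`.

```python
import dis
import marshal


def area(width, height):
    # Hypothetical function used only as disassembly input.
    scale = 2
    return width * height * scale


# Serialise and reload the code object, roughly what happens when a
# .pyc file is written and later imported.
code = marshal.loads(marshal.dumps(area.__code__))

# Argument and local variable names survive the round trip in CPython.
print(code.co_varnames)

# Walk the instructions; opcode names differ between Python versions.
for ins in dis.get_instructions(code):
    print(ins.offset, ins.opname, ins.argrepr)
```

The exact opcodes printed depend on the interpreter version, so treat the listing as illustrative rather than as a stable format.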

* The one exception is the function preamble, which is encoded with some compression, but again the scheme isn't overly complicated, and the code for it is obviously available in the VM.
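
As a starting point for the kind of Python tool suggested above, here is a rough sketch that reads the fixed four-byte .mpy header and one variable-length unsigned integer of the sort used for counts and lengths in the file. The header layout (magic byte 'M', version, feature flags, small-int bits) and the 7-bits-per-byte integer scheme are assumptions based on the loader in py/persistentcode.c, so check them against the MicroPython version you target. The function names and the sample bytes are hypothetical. This does not decode the function preamble or any opcodes.

```python
import io


def read_uint(stream):
    # Assumed scheme: 7 payload bits per byte, most significant group
    # first, high bit set on every byte except the last.
    value = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("truncated integer")
        byte = raw[0]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value


def read_mpy_header(stream):
    # Assumed layout: 'M', version, feature flags, small-int bit count.
    header = stream.read(4)
    if len(header) != 4 or header[0] != ord("M"):
        raise ValueError("not an .mpy file")
    return {
        "version": header[1],
        "feature_flags": header[2],
        "small_int_bits": header[3],
    }


# Made-up bytes for illustration, not taken from a real .mpy file.
sample = io.BytesIO(bytes([ord("M"), 6, 0, 31, 0x81, 0x00]))
info = read_mpy_header(sample)
count = read_uint(sample)
print(info, count)
```

Everything after the header varies between .mpy versions, so a real disassembler would branch on `info["version"]` before reading further.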
