improve time machine

nicho2 · Post by **nicho2** » Wed Jul 06, 2016 8:10 am

Hello,

I set and then reset a pin inside an interrupt handler to measure how long the interrupt takes.
The CPU frequency is 168 MHz (STM32F407 Discovery).
The interrupt fires every 34.7 µs (that's OK).
But the pulse width is 4.8 µs with all optimizations, which is far too long.
In C, the pulse is only a few nanoseconds wide.
Does anyone have an idea how to improve the speed further?

import micropython
import pyb

class Driver_SCS(object):
    def __init__(self, InputPinId, OutputPinId, Timer=4):
        # create a timer object (timer 4 by default) firing at 28.8 kHz, i.e. every 34.7 µs
        self.timer = pyb.Timer(Timer, freq=28800)
        self.timer.callback(self.SCS_tick)  # set the callback to our tick function

    @micropython.native
    def SCS_tick(self, tim):  # the callback is passed the timer
        """
        Called every 34.7 µs
        """
        import stm
        BIT15 = const(1 << 15)
        stm.mem16[stm.GPIOD + stm.GPIO_ODR] = BIT15
        stm.mem16[stm.GPIOD + stm.GPIO_ODR] ^= BIT15
        #self.Pin_TX.value(True)
        #self.Pin_TX.value(False)
        return

Post by **torwag** » Wed Jul 06, 2016 10:36 am

Does this happen with the code you posted?
Just to understand your case better:

You get a 4.8 µs pulse when calling the function, but you expect a much shorter pulse since you flip the I/O pin back and forth directly.
In C you get at least 5.95 ns, right? That would be one clock cycle. I would expect even a few cycles more.
Did you test what happens if you add a time.sleep_us(2) between the two I/O writes? Does that change anything? If the delay adds up, one of the commands (or both) might simply take too long for some reason. If it does not add up, that might indicate some sort of timing problem.
What happens if you simply call your SCS_tick function in a loop several hundred times, without using a timer?
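The figures quoted so far can be checked with a little arithmetic. The sketch below is plain Python that also runs on a PC; it uses only numbers from the thread (168 MHz core clock, 28800 Hz timer, 4.8 µs pulse). The helper name `to_cycles` is made up for illustration.

```python
CPU_HZ = 168000000    # core clock reported in the thread
TIMER_HZ = 28800      # timer frequency passed to pyb.Timer in Driver_SCS

def to_cycles(seconds, cpu_hz=CPU_HZ):
    """Convert a duration in seconds to CPU clock cycles (hypothetical helper)."""
    return seconds * cpu_hz

cycle_ns = 1e9 / CPU_HZ             # duration of one clock cycle
timer_period_us = 1e6 / TIMER_HZ    # time between two timer interrupts
pulse_cycles = to_cycles(4.8e-6)    # the measured 4.8 us pulse, in cycles

print("one cycle    : %.2f ns" % cycle_ns)
print("timer period : %.1f us" % timer_period_us)
print("4.8 us pulse : %.0f cycles" % pulse_cycles)
```

So one cycle is about 5.95 ns, the timer period is about 34.7 µs, and the 4.8 µs pulse corresponds to roughly 800 clock cycles between the two pin writes.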

nicho2 · Post by **nicho2** » Wed Jul 06, 2016 11:29 am

Yes, it is with the code I posted!

If I run a loop:

while True:
    scs.mon_driver_SCS.SCS_tick(0)

the pulse width is again 4.8 µs and the time between two pulses is 45 µs.

If I insert a sleep:

    @micropython.native
    def SCS_tick(self, tim):  # the callback is passed the timer
        import stm
        import time
        BIT15 = const(1 << 15)
        stm.mem16[stm.GPIOD + stm.GPIO_ODR] = BIT15
        time.sleep_us(2)
        stm.mem16[stm.GPIOD + stm.GPIO_ODR] ^= BIT15
        return

the pulse width is now 11 µs (very unstable) and some timer interrupts are lost.

If I try:

    @micropython.native
    def SCS_tick(self, tim):  # the callback is passed the timer
        import stm
        BIT15 = const(1 << 15)
        stm.mem16[stm.GPIOD + stm.GPIO_ODR] = BIT15
        pyb.udelay(2)
        stm.mem16[stm.GPIOD + stm.GPIO_ODR] ^= BIT15
        return

the pulse width is 15 µs and the time between pulses is 34.7 µs (OK).

If I call pyb.freq(), it confirms that I am really running at 168 MHz:
(168000000, 168000000, 42000000, 84000000)

This seems very strange to me.

Regards
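The "does it add up" question can be answered from the numbers reported above. This is a small sketch in plain Python using only the measured values from this post; the function name `unexplained_us` is hypothetical, and reading the leftover time as call overhead is an assumption, not something measured in the thread.

```python
BASELINE_US = 4.8   # pulse width with no delay between the two writes

# (label, requested delay in us, measured pulse width in us), as reported above
measurements = [
    ("time.sleep_us(2)", 2.0, 11.0),
    ("pyb.udelay(2)", 2.0, 15.0),
]

def unexplained_us(requested_us, measured_us, baseline_us=BASELINE_US):
    """Time left after subtracting the baseline pulse and the requested delay."""
    return round(measured_us - baseline_us - requested_us, 1)

for label, requested, measured in measurements:
    print("%-17s extra: %.1f us" % (label, unexplained_us(requested, measured)))
```

In both cases the pulse grows by several microseconds more than the requested 2 µs, so the delay does not simply add up. A plausible reading is that each extra call carries its own overhead, but note that the 11 µs figure was reported as very unstable.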

Roberthh · Post by **Roberthh** » Wed Jul 06, 2016 1:03 pm

You could try viper code, which is translated directly into machine instructions, like:

Code: Select all

BIT15 = const(1 << 15)
import stm

@micropython.viper        
def SCS_tick(self, tim):
    gpiod = ptr16(stm.GPIOD + stm.GPIO_BSRRL)
    gpiod[0] = BIT15     # set BIT15 high
    gpiod[1] = BIT15     # set BIT15 low

That should result in a pulse width of about 20 ns. Calling the function takes about 4 µs. I have used this kind of code for fast I/O to a device, but I have not tested it as a callback function yet. I noticed that the memxx functions are rather slow.
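The viper version writes to the set/reset register instead of the output data register. The following is a toy model in plain Python, not real hardware access, sketching the difference under the usual STM32F4 semantics: a 1 written to the low half of BSRR sets a pin, a 1 written to the high half clears it, and 0 bits leave pins untouched. The class `GpioModel` and its method names are invented for this illustration.

```python
BIT15 = 1 << 15

class GpioModel:
    """Toy model of one GPIO port's 16 output bits (hypothetical, host-side only)."""

    def __init__(self, odr=0):
        self.odr = odr & 0xFFFF

    def write_odr(self, value):
        # A plain write to ODR replaces all 16 output bits at once.
        self.odr = value & 0xFFFF

    def write_bsrrl(self, mask):
        # Low half of BSRR: each 1 bit sets the matching pin, 0 bits do nothing.
        self.odr |= mask & 0xFFFF

    def write_bsrrh(self, mask):
        # High half of BSRR: each 1 bit clears the matching pin, 0 bits do nothing.
        self.odr &= ~mask & 0xFFFF

# Pretend pins 0 and 2 are already high before the pulse on pin 15.
via_bsrr = GpioModel(0x0005)
via_bsrr.write_bsrrl(BIT15)      # pin 15 high, pins 0 and 2 unchanged
high_state = via_bsrr.odr
via_bsrr.write_bsrrh(BIT15)      # pin 15 low again

via_odr = GpioModel(0x0005)
via_odr.write_odr(BIT15)         # pin 15 high, but pins 0 and 2 are cleared

print(hex(high_state), hex(via_bsrr.odr), hex(via_odr.odr))
```

In this model the BSRR writes leave the other pins of the port alone, while the `= BIT15` write to ODR from the first post would also drive every other pin of the port low.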

dhylands · Post by **dhylands** » Wed Jul 06, 2016 4:26 pm

You have to keep in mind that there are other interrupts firing in the background which may cause jitter in your timings. You either need to disable interrupts or use the hardware to get jitter-free pulse widths.
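A minimal sketch of the "disable interrupts" option, assuming the `pyb.disable_irq()` and `pyb.enable_irq()` functions of the pyboard port. The wrapper `run_uninterrupted` is a hypothetical helper, and the stand-in functions exist only so the sketch can also be run on a PC.

```python
calls = []   # only filled by the PC stand-ins below

try:
    import pyb                      # available on the pyboard / STM32 ports
    disable_irq, enable_irq = pyb.disable_irq, pyb.enable_irq
except ImportError:
    # Stand-ins for running on a PC: they just record that they were called.
    def disable_irq():
        calls.append("disable")
        return True                 # pretend IRQs were enabled before

    def enable_irq(state=True):
        calls.append("enable")

def run_uninterrupted(action):
    """Run action() with interrupts masked, then restore the previous IRQ state."""
    state = disable_irq()
    try:
        return action()
    finally:
        enable_irq(state)

# Example: the timing-critical part goes into the function passed in.
result = run_uninterrupted(lambda: "pulse done")
print(result)
```

While interrupts are masked, the timer callback and the system tick are held off as well, so the protected section should stay as short as possible. This pattern is meant for main-loop code; for strictly jitter-free pulses, generating them with a hardware timer is the other route mentioned above.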

nicho2 · Post by **nicho2** » Thu Jul 07, 2016 6:21 am

Roberthh wrote: You could try viper code, which is translated directly into machine instructions, like:
Code: Select all
BIT15 = const(1 << 15)
import stm

@micropython.viper        
def SCS_tick(self, tim):
    gpiod = ptr16(stm.GPIOD + stm.GPIO_BSRRL)
    gpiod[0] = BIT15     # set BIT15 high
    gpiod[1] = BIT15     # set BIT15 low
That should result in a pulse width of about 20 ns. Calling the function takes about 4 µs. I have used this kind of code for fast I/O to a device, but I have not tested it as a callback function yet. I noticed that the memxx functions are rather slow.

Thanks.
Now the pulse width is 48 ns.
I will continue my evaluation.

Roberthh · Post by **Roberthh** » Thu Jul 07, 2016 9:09 am

Hello @nicho2. If 48 ns is still too long, you could use the inline assembler. A sample sketch:

Code: Select all

BIT15 = const(1 << 15)
import stm

@micropython.asm_thumb
def SCS_tick(r0, r1):  # r0: ptr to self, r1: tim
    # set up pointers to GPIO
    # r5: bit mask for control lines
    # r7: GPIOD BSRRL register ptr

    movwt(r7, stm.GPIOD)
    add(r7, stm.GPIO_BSRRL)
    movw(r5, BIT15)    # movw takes a 16-bit immediate; mov only accepts 8 bits

    strh(r5, [r7, 0])  # BIT15 high
    strh(r5, [r7, 2])  # BIT15 low

You may also want to have a look at this part of the manual: docs.micropython.org/en/latest/pyboard/reference/speed_python.html

MicroPython Forum (Archive)
