
Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Mon Feb 01, 2016 10:02 pm
by mad474
In THE German microcontroller forum, a poor MicroPython guy is under heavy pressure, trailing by a factor of 500:
https://www.mikrocontroller.net/topic/388161#4450525

Any suggestions to get better results in

Code: Select all

Micro Python v1.3.9-44-gb5a790d on 2015-02-08; PYBv1.0 with STM32F405RG
Type "help()" for more information.
>>> def togglePerformance():
...     stop = pyb.millis() + 1000
...     count = 0
...     while pyb.millis() < stop:
...         pyb.LED(1).toggle()
...         count += 1
...     print("Counted: ", count)
... 
>>> togglePerformance()
Counted:  39715
>>> togglePerformance()
Counted:  39720
>>>
are welcome. Or post an alternative toggle counter there directly. No login required.
Thanks!

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Mon Feb 01, 2016 11:23 pm
by Damien
You could try viper mode, or inline assembler (depending on the rules of the competition....).

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Mon Feb 01, 2016 11:58 pm
by dhylands
Here are a couple of improved versions:

Code: Select all

import pyb

def togglePerformance():
    stop = pyb.millis() + 1000
    count = 0
    millis = pyb.millis
    toggle = pyb.LED(1).toggle
    while millis() < stop:
        toggle()
        count += 1
    print("Counted: ", count)

@micropython.native
def togglePerformance2():
    stop = pyb.millis() + 1000
    count = 0
    millis = pyb.millis
    toggle = pyb.LED(1).toggle
    while millis() < stop:
        toggle()
        count += 1
    print("Counted: ", count)

togglePerformance()
togglePerformance2()
which yields the following:

Code: Select all

>>> import togglePerf
Counted:  130893
Counted:  214670
To be comparable to C, inline assembler could be used.
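
The gain in both versions above comes largely from binding `pyb.millis` and `pyb.LED(1).toggle` to local names once, so the loop no longer repeats the attribute lookups (and the `pyb.LED(1)` call) on every pass. A minimal sketch of the same idea that runs on desktop Python; `FakeLED` is a made-up stand-in for `pyb.LED`, and a fixed iteration count replaces the one-second timer:

```python
class FakeLED:
    """Hypothetical stand-in for pyb.LED so the sketch runs off-board."""

    def __init__(self):
        self.state = False
        self.toggles = 0

    def toggle(self):
        self.state = not self.state
        self.toggles += 1


def toggle_slow(led, n):
    # Looks up led.toggle again on every iteration.
    for _ in range(n):
        led.toggle()
    return led.toggles


def toggle_fast(led, n):
    # Bind the method once; the loop then calls a local name directly.
    toggle = led.toggle
    for _ in range(n):
        toggle()
    return led.toggles


slow_count = toggle_slow(FakeLED(), 1000)
fast_count = toggle_fast(FakeLED(), 1000)
print(slow_count, fast_count)
```

Both loops do the same work; only the per-iteration lookup differs, which is where the bytecode version picks up its speed.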

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 12:53 am
by dhylands
Here's my solution including inline assembler:

Code: Select all

import pyb
import stm

def togglePerformance():
    stop = pyb.millis() + 1000
    count = 0
    millis = pyb.millis
    toggle = pyb.LED(1).toggle
    while millis() < stop:
        toggle()
        count += 1
    print('Counted: {:10,} (bytecode)'.format(count))


@micropython.native
def togglePerformance2():
    stop = pyb.millis() + 1000
    count = 0
    millis = pyb.millis
    toggle = pyb.LED(1).toggle
    while millis() < stop:
        toggle()
        count += 1
    print('Counted: {:10,} (native)'.format(count))

@micropython.viper
def togglePerformance3Core(stop):
    count = 0
    millis = pyb.millis
    toggle = pyb.LED(1).toggle
    while millis() < stop:
        toggle()
        count += 1
    print('Counted: {:10,} (viper)'.format(count))

def togglePerformance3():
    stop = pyb.millis() + 1000
    togglePerformance3Core(stop)  # prints its own result; nothing is returned

@micropython.asm_thumb
def _counter():
    mov(r0, 0)
    movwt(r1, stm.GPIOA)    # LED(1) is A13
    add(r1, stm.GPIO_BSRRL)
    movw(r5, 1 << 13)       # r5 has mask for BSRR register
    movwt(r2, stm.TIM2)
    ldr(r3, [r2, stm.TIM_CNT])
    movw(r4, 2000)
    add(r4, r4, r3)         # r4 is our ending count
# loop
    label(loop)
    strh(r5, [r1, 0])       # Turn the LED on
    add(r0,1)               # increment counter
    strh(r5, [r1, 2])       # Turn the LED off
    add(r0,1)               # increment counter
    ldr(r3, [r2, stm.TIM_CNT])
    cmp(r3, r4)
    bls(loop)

def togglePerformance4():
    # Setup a timer to increment twice per millisecond
    # Timer 2 runs at 84 MHz, so we divide by 42000 and when the count
    # gets to 2000 then one second will have passed.
    t2 = pyb.Timer(2, prescaler=41999, period=0x0fffffff)
    count = _counter()
    print('Counted: {:10,} (asm_thumb)'.format(count))

togglePerformance()
togglePerformance2()
togglePerformance3()
togglePerformance4()
which produces this as output:

Code: Select all

MicroPython v1.6 on 2016-02-01; PYBv1.0 with STM32F405RG
Type "help()" for more information.
>>> 
>>> import togglePerf
Counted:    130,922 (bytecode)
Counted:    214,641 (native)
Counted:    239,252 (viper)
Counted: 11,975,836 (asm_thumb)
>>> 
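
The timer arithmetic in `togglePerformance4` can be checked off-board. A small sketch, assuming only what the comment in the code states (Timer 2 clocked at 84 MHz); the 168 MHz core clock used for the cycles-per-count estimate is an assumption about the PYBv1.0 default, not something measured here:

```python
TIMER_CLOCK_HZ = 84000000   # Timer 2 input clock, from the comment in the code
CPU_CLOCK_HZ = 168000000    # assumption: default PYBv1.0 core clock


def tick_rate_hz(prescaler):
    # The hardware divides the timer clock by (prescaler + 1).
    return TIMER_CLOCK_HZ // (prescaler + 1)


def ticks_for(seconds, prescaler):
    # Number of timer ticks that elapse in the given number of seconds.
    return seconds * tick_rate_hz(prescaler)


rate = tick_rate_hz(41999)                   # ticks per second
end_count = ticks_for(1, 41999)              # ticks in one second
cycles_per_count = CPU_CLOCK_HZ / 11975836   # asm_thumb count from the output
print(rate, end_count, round(cycles_per_count, 1))
```

Under the 168 MHz assumption, that works out to roughly 14 core cycles per counted step.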

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 9:58 am
by mad474
@Damien
No rules, everything allowed, as usual in this context :)

@dhylands
Very much appreciated, thank you Dave! I'll repulse this evening.

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 12:39 pm
by Damien
Nice work Dave!

Here is an improved viper version that uses native viper pointers to do the toggling:

Code: Select all

@micropython.viper
def togglePerformance3b():
    count = 0 
    stop = int(pyb.millis()) + 1000
    millis = pyb.millis
    odr = ptr16(stm.GPIOA + stm.GPIO_ODR)
    while int(millis()) < stop:
        odr[0] ^= 1 << 13 # PA13 = LED_RED
        count += 1
    print('Counted: {:10,} (viper2)'.format(count))
On PYBv1.0 I get:

Code: Select all

Counted:    128,519 (bytecode)
Counted:    218,550 (native)
Counted:    244,130 (viper)
Counted:    537,699 (viper2)
Counted: 11,975,820 (asm_thumb)
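
What `odr[0] ^= 1 << 13` does to the register can be tried on the desktop. In this sketch a one-element `array('H')` stands in for the 16-bit view that `ptr16` gives of GPIOA's output data register; the array and the helper name are illustrative only, and nothing here touches hardware:

```python
from array import array

LED_RED_BIT = 13                 # PA13, as in the viper code above
odr = array('H', [0])            # stand-in for ptr16(stm.GPIOA + stm.GPIO_ODR)


def toggle_pin(reg, bit):
    # XOR flips just the selected bit (a read-modify-write of the halfword).
    reg[0] ^= 1 << bit
    return reg[0]


first = toggle_pin(odr, LED_RED_BIT)    # bit 13 now set
second = toggle_pin(odr, LED_RED_BIT)   # bit 13 cleared again
print(hex(first), hex(second))
```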

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 1:34 pm
by dhylands
So it turns out the previous assembler version wasn't incrementing the counter the same way that the C version was.

So I coded up an inline assembler version which does things more or less the same way as the C version, but my counter is still only reaching 10 million, so I'm not sure what's wrong.

Code: Select all

import pyb
import stm
import uctypes

run_buf = bytearray(1)
run_buf[0] = 1

def timer_irq(tim):
    global run_buf
    run_buf[0] = 0

@micropython.asm_thumb
def counter(r0):    # r0 has address of run_buf
    movwt(r1, stm.GPIOA)    # LED(1) is A13
    add(r1, stm.GPIO_BSRRL)
    movw(r2, 1 << 13)       # r2 has mask for setting LED
    mov(r3, r2)
    mov(r4, 16)
    lsl(r3, r4)             # r3 has mask for clearing LED

    mov(r4, 0)              # r4 is counter
# loop
    label(loop)
    ldrb(r5, [r0, 0])       # load run flag into r5 so the clear mask in r3 survives
    cmp(r5, 1)
    bne(endloop)
    str(r2, [r1, 0])
    add(r4, 1)
    str(r3, [r1, 0])
    b(loop)
# endloop
    label(endloop)
    mov(r0, r4) # return counter in r0

def main():
    # Disable SysTick (clear the SysTick control register)
    stm.mem32[0xe000e010] = 0
    # Set up Timer 2 to count at 10 kHz; with period=9999 the callback fires after one second
    t2 = pyb.Timer(2, prescaler=84000000 // 10000 - 1, period=9999, callback=timer_irq)
    print('PSC =', stm.mem32[stm.TIM2 + stm.TIM_PSC])
    count = counter(run_buf)
    print('Counted: {:10,} (asm_thumb)'.format(count))

main()
I threw in the line to disable SysTick, which raised the count a bit, but not much.

This is still a factor of 2 worse than the C example (I'll need to compile the C version and verify independently)
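
For reference, the two masks built at the top of `counter` follow from how BSRR works on the STM32: writing a 1 to bit n (0-15) sets pin n, and writing a 1 to bit n+16 resets it. A sketch that models this in plain Python; `apply_bsrr` is a made-up helper, not a MicroPython API:

```python
PIN = 13                         # LED(1) is A13
SET_MASK = 1 << PIN              # what r2 holds
RESET_MASK = SET_MASK << 16      # what r3 is meant to hold after the lsl


def apply_bsrr(odr, value):
    """Model a 32-bit write to BSRR and return the new 16-bit ODR value."""
    set_bits = value & 0xFFFF
    reset_bits = (value >> 16) & 0xFFFF
    # Set takes priority if the same pin is both set and reset.
    return ((odr & ~reset_bits) | set_bits) & 0xFFFF


odr = 0
odr = apply_bsrr(odr, SET_MASK)      # LED on
led_on = odr
odr = apply_bsrr(odr, RESET_MASK)    # LED off
led_off = odr
print(hex(SET_MASK), hex(RESET_MASK), hex(led_on), hex(led_off))
```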

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 3:07 pm
by blmorris
I don't read German, but I still clicked through to see if I could understand any of it.

At least one comment was perfectly clear in the original German :D
Autor: Hans (Gast)
Datum: 2016-02-01 15:18
Hehe, Pintoggeln, der ultimative Benchmark für Mikrocontroller :-)
(In English: "Hehe, pin toggling, the ultimate benchmark for microcontrollers :-)")
It is still a nice opportunity to show off MicroPython's assembler capabilities.

-Bryan

ETA: After seeing Dave's post below, it's clear that this little contest could actually drive a legitimate improvement to the assembler capabilities, and of course it is important to be able to control pins close to the hardware limit. I was just amused to see this little dose of humorous perspective in the thread.

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 3:17 pm
by dhylands
One thing I did notice was that when I compiled the C code, it generates a cbz instruction (compare and branch on zero; there is also cbnz, compare and branch on non-zero), and these seem not to be present in MicroPython's inline assembler. They seem like useful instructions to have. I'm going to see if I can encode this using a data word.
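
A sketch of what such a data word could look like. The bit layout follows the 16-bit Thumb encoding of CBZ/CBNZ as I read it in the ARM architecture manual (`1011 op 0 i 1 imm5 Rn`, branch target = instruction address + 4 + offset); `encode_cbz` is a made-up helper, and emitting the result through the assembler's `data()` directive is an assumption:

```python
def encode_cbz(rn, offset, nonzero=False):
    """Return the 16-bit halfword for cbz/cbnz rn, <instruction + 4 + offset>."""
    if not 0 <= rn <= 7:
        raise ValueError('rn must be a low register r0-r7')
    if offset % 2 or not 0 <= offset <= 126:
        raise ValueError('offset must be even and in 0..126 (forward only)')
    imm6 = offset >> 1               # offset is stored in halfword units
    op = 1 if nonzero else 0         # 0 = cbz, 1 = cbnz
    return (0xB100 | (op << 11) | ((imm6 >> 5) << 9)
            | ((imm6 & 0x1F) << 3) | rn)


word = encode_cbz(3, 6)          # cbz r3, target = this instruction + 4 + 6
print(hex(word))
# On the board this might then be emitted as: data(2, word)
```

The catch is that the offset has to be worked out by hand, since a raw data word cannot refer to a `label()`.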

Re: Pin Toggle Frequency Contest against C. Please Help! :)

Posted: Tue Feb 02, 2016 5:18 pm
by mad474
You guys rock! I'm going to post Dave's first inline assembler version and Damien's viper version and, well, the other versions as well. Thanks, I'm learning a lot (yet understanding only a fraction). Side note: to avoid hard-faulting the board (the 4 LEDs cycle on and off; recoverable via the reset pin), Dave's assembler version requires newer firmware.