Multiaxis stepper motors using RMT
-
- Posts: 847
- Joined: Mon Nov 20, 2017 10:18 am
Re: Multiaxis stepper motors using RMT
If you reach what you're after, you will have created a motion controller well above anything else on the market.
SmoothStepper is a bit of a benchmark for high-performance professional motion controllers used by professional CNC machinists. During SmoothStepper setup you have to set the max step frequency, and SmoothStepper suggests starting at 1 kHz and working up from there; see https://warp9td.com/index.php/gettingst ... r-and-mach
Many of the latest generation of professional motion controllers boast step frequencies of up to 100 kHz, which of course is not usable in the real world and is more of a marketing point against the majority of motion controllers, which only do 50 kHz.
- OlivierLenoir
- Posts: 126
- Joined: Fri Dec 13, 2019 7:10 pm
- Location: Picardie, FR
Re: Multiaxis stepper motors using RMT
Using @micropython.native
Test 6: expected frequency on ch1 is 15.686 kHz, measured 13.64 kHz. The delay between two ch1_wp(ch1_frame) calls is shorter.
50µs per div.
A delay of 13µs exists between ch1 and ch2. Good improvement using @micropython.native.
5µs per div.
Code: Select all
# Test 6
from machine import Pin
from esp32 import RMT

ch1 = RMT(0, pin=Pin(2), clock_div=255)
ch1_wp = ch1.write_pulses  # cache the bound method outside the loop
ch1_frame = (10, 10) * 3   # three high/low pulse pairs per frame
ch2 = RMT(1, pin=Pin(4), clock_div=255)
ch2_wp = ch2.write_pulses
ch2_frame = (10, 10)       # one high/low pulse pair per frame


@micropython.native
def steps(ch1_wp, ch2_wp, ch1_frame, ch2_frame, loop):
    lp = range(loop)
    for _ in lp:
        ch1_wp(ch1_frame)
        ch2_wp(ch2_frame)


steps(ch1_wp, ch2_wp, ch1_frame, ch2_frame, 1000000)
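The expected 15.686 kHz on ch1 can be checked with a little arithmetic. This is a minimal sketch, assuming the ESP32 RMT peripheral is clocked from the 80 MHz APB clock; the helper name rmt_frame_freq is hypothetical, and the snippet is plain Python so it also runs off the board.

```python
RMT_SOURCE_HZ = 80000000  # assumption: RMT source clock is the 80 MHz APB clock


def rmt_frame_freq(clock_div, high_ticks, low_ticks):
    """Frequency in Hz of a repeating high/low pulse pair."""
    tick_hz = RMT_SOURCE_HZ / clock_div
    return tick_hz / (high_ticks + low_ticks)


# Test 6 uses clock_div=255 and pulses of (10, 10) ticks
print(round(rmt_frame_freq(255, 10, 10)))  # 15686
```

The gap between this figure and the measured 13.64 kHz presumably comes from the time spent in the Python loop between successive write_pulses() calls.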
Olivier Lenoir
https://gitlab.com/olivierlenoir
- OlivierLenoir
- Posts: 126
- Joined: Fri Dec 13, 2019 7:10 pm
- Location: Picardie, FR
Re: Multiaxis stepper motors using RMT
Without RMT, I've run other tests to benchmark how fast I can generate steps, and with how many axes. Of course, this does not take into consideration the calculation required to decide whether or not to step. Each test was run with @micropython.native and compared to the same code without @micropython.native.
Measurements performed with an oscilloscope, MicroPython v1.12 on an ESP32-WROOM-32 @ 160 MHz.
Here below, Test 14:
- 1 axis: 89.54 kHz vs 27.18 kHz without @micropython.native
- 2 axes: 54.47 kHz vs 17.46 kHz without @micropython.native
- 3 axes: 39.14 kHz vs 12.86 kHz without @micropython.native
- 4 axes: 30.55 kHz vs 10.18 kHz without @micropython.native
- 5 axes: 25.05 kHz vs 8.524 kHz without @micropython.native
- 6 axes: 21.23 kHz vs 7.268 kHz without @micropython.native
- 7 axes: 18.41 kHz vs 6.336 kHz without @micropython.native
- 8 axes: 16.26 kHz vs 5.616 kHz without @micropython.native
Code: Select all
# Test 14
# +Width = 3.85µs
from machine import Pin
from utime import sleep_us

ch1 = Pin(2, Pin.OUT)
ch2 = Pin(4, Pin.OUT)
ch3 = Pin(15, Pin.OUT)
ch4 = Pin(18, Pin.OUT)
ch5 = Pin(19, Pin.OUT)
ch6 = Pin(21, Pin.OUT)
ch7 = Pin(22, Pin.OUT)
ch8 = Pin(23, Pin.OUT)


@micropython.native
def steps(ch, loop, delay=0):
    # Pulse each axis in turn: high, wait, low
    for _ in range(loop):
        for c in ch:
            c(1)
            sleep_us(delay)
            c(0)


steps((ch1,), 1000000)  # 89.54 kHz vs 27.18 kHz without @micropython.native
steps((ch1, ch2,), 100000)  # 54.47 kHz vs 17.46 kHz without @micropython.native
steps((ch1, ch2, ch3,), 1000000)  # 39.14 kHz vs 12.86 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4,), 1000000)  # 30.55 kHz vs 10.18 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5,), 1000000)  # 25.05 kHz vs 8.524 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5, ch6,), 1000000)  # 21.23 kHz vs 7.268 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5, ch6, ch7,), 1000000)  # 18.41 kHz vs 6.336 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5, ch6, ch7, ch8,), 1000000)  # 16.26 kHz vs 5.616 kHz without @micropython.native
Here below, Test 16:
- 1 axis: 62.70 kHz vs 25.04 kHz without @micropython.native
- 2 axes: 44.27 kHz vs 21.42 kHz without @micropython.native
- 3 axes: 34.21 kHz vs 18.72 kHz without @micropython.native
- 4 axes: 27.88 kHz vs 16.62 kHz without @micropython.native
- 5 axes: 23.53 kHz vs 14.94 kHz without @micropython.native
- 6 axes: 20.35 kHz vs 13.57 kHz without @micropython.native
- 7 axes: 17.93 kHz vs 12.44 kHz without @micropython.native
- 8 axes: 16.02 kHz vs 11.47 kHz without @micropython.native
Code: Select all
# Test 16
from machine import Pin
from utime import sleep_us

ch1 = Pin(2, Pin.OUT)
ch2 = Pin(4, Pin.OUT)
ch3 = Pin(15, Pin.OUT)
ch4 = Pin(18, Pin.OUT)
ch5 = Pin(19, Pin.OUT)
ch6 = Pin(21, Pin.OUT)
ch7 = Pin(22, Pin.OUT)
ch8 = Pin(23, Pin.OUT)


@micropython.native
def steps(ch, loop, delay=0):
    # Raise all axes, wait once, then lower all axes
    for _ in range(loop):
        for c in ch:
            c(1)
        sleep_us(delay)
        for c in ch:
            c(0)


steps((ch1,), 1000000)  # 62.70 kHz vs 25.04 kHz without @micropython.native
steps((ch1, ch2,), 1000000)  # 44.27 kHz vs 21.42 kHz without @micropython.native
steps((ch1, ch2, ch3,), 1000000)  # 34.21 kHz vs 18.72 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4,), 1000000)  # 27.88 kHz vs 16.62 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5,), 1000000)  # 23.53 kHz vs 14.94 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5, ch6,), 1000000)  # 20.35 kHz vs 13.57 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5, ch6, ch7,), 1000000)  # 17.93 kHz vs 12.44 kHz without @micropython.native
steps((ch1, ch2, ch3, ch4, ch5, ch6, ch7, ch8,), 1000000)  # 16.02 kHz vs 11.47 kHz without @micropython.native
Conclusion:
The time spent in sleep_us(delay) in Test 14 and Test 16 could instead be used to calculate the next step. Another approach would be to disable axes that don't need to move, to save calculation time.
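The second idea, skipping axes that do not need to move, could look like the sketch below. It is only an illustration: active_axes and the deltas argument (one pending step count per axis) are hypothetical names that do not appear in Test 14 or Test 16, and plain strings stand in for Pin objects so the snippet runs anywhere.

```python
def active_axes(ch, deltas):
    """Return only the channels whose axis still has steps to make."""
    return tuple(c for c, d in zip(ch, deltas) if d != 0)


# Stand-ins for Pin objects (assumption, for illustration only)
ch = ('ch1', 'ch2', 'ch3')
deltas = (-41000, 0, 3000)  # ch2 has nothing to do

moving = active_axes(ch, deltas)
print(moving)  # ('ch1', 'ch3')
```

The resulting tuple can then be passed to steps() in place of the full channel tuple, so the inner loop only visits axes that actually move.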
Olivier Lenoir
https://gitlab.com/olivierlenoir
- OlivierLenoir
- Posts: 126
- Joined: Fri Dec 13, 2019 7:10 pm
- Location: Picardie, FR
Re: Multiaxis stepper motors using RMT
MultiAxis linear interpolation.
Consider the linear integer movement of the following 3 axes, ch1, ch2 and ch3, and their derivatives ch1', ch2' and ch3' (see the diagram below). If the derivative is 0, no step is made; if it is 1 or -1, a step is made.
I've not been able to find the derivative of the function round(x), so I've used round(x) - round(x - 1) as the derivative. But this solution uses floats and it's slow.
Code: Select all
             _/
           _/
         _/
       _/
ch1:  /
ch1': |_|_|_|_|
             __
         ___/
ch2:  __/
ch2': __|___|__
           ____
ch3:  ____/
ch3': ____|____

[ch1, ch2, ch3]
[1, 0, 0]
[1, 1, 0]
[1, 0, 1]
[1, 1, 0]
[1, 0, 0]
Here is my integer-only solution. Its results are very, very close to the round() version and it is about 2 times faster.
Code: Select all
import gc
from utime import ticks_us, ticks_diff


def do_float(s, d, m):
    # Reference version: difference of two rounded positions (floats)
    return round(s * d / m) - round((s - 1) * d / m)


def do_int(s, d, m):
    # Same idea with integers only; m // 2 rounds to nearest
    return (s * d + m // 2) // m - ((s - 1) * d + m // 2) // m


def do_int2(s, d, m):
    # As do_int(), reusing the common term s * d + m // 2
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m


dist = [-41000, 3000, -17000]
max_dist = max(map(abs, dist))

gc.collect()
t0 = ticks_us()
for s in range(1, max_dist + 1):
    f = [do_float(s, d, max_dist) for d in dist]
laps = ticks_diff(ticks_us(), t0)
print('do_float {}µs'.format(laps))  # 18,940,161µs

gc.collect()
t0 = ticks_us()
for s in range(1, max_dist + 1):
    i = [do_int(s, d, max_dist) for d in dist]
laps = ticks_diff(ticks_us(), t0)
print('do_int {}µs'.format(laps))  # 9,513,301µs

gc.collect()
t0 = ticks_us()
for s in range(1, max_dist + 1):
    i = [do_int2(s, d, max_dist) for d in dist]
laps = ticks_diff(ticks_us(), t0)
print('do_int2 {}µs'.format(laps))  # 8,879,925µs
Conclusion
do_int2() is the fastest solution so far (18.9 s improved to 8.9 s for 123,000 operations). If you have a better solution, please let me know.
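As a quick check, do_int2() reproduces the small [ch1, ch2, ch3] step table shown earlier when the three axes travel 5, 2 and 1 steps. This is a sketch under that assumption (the distances [5, 2, 1] are inferred from the table, they are not stated in the post); step_table is a hypothetical helper name.

```python
def do_int2(s, d, m):
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m


def step_table(dist):
    """One row per tick of the longest axis; each entry is -1, 0 or 1."""
    m = max(map(abs, dist))
    return [[do_int2(s, d, m) for d in dist] for s in range(1, m + 1)]


for row in step_table([5, 2, 1]):
    print(row)
# [1, 0, 0]
# [1, 1, 0]
# [1, 0, 1]
# [1, 1, 0]
# [1, 0, 0]
```

Summing each column gives back the distance of that axis, so no steps are lost to rounding.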
Last edited by OlivierLenoir on Thu Jan 23, 2020 6:12 am, edited 1 time in total.
Olivier Lenoir
https://gitlab.com/olivierlenoir
Re: Multiaxis stepper motors using RMT
"8.9ms for 123,000 operations"
Sure? That would be ~72ns for each call of do_int2(). That's hard to believe.
- OlivierLenoir
- Posts: 126
- Joined: Fri Dec 13, 2019 7:10 pm
- Location: Picardie, FR
Re: Multiaxis stepper motors using RMT
Oops, oops, oops, you are right: 8,879,925µs = 8.9s. I've updated my post. Thanks.
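The corrected figure is easy to sanity-check: dividing the measured total by the number of calls gives the time per do_int2() call. A short worked example using only the numbers quoted above:

```python
total_us = 8879925   # measured do_int2 total, in microseconds
calls = 41000 * 3    # 41,000 steps x 3 axes = 123,000 calls

per_call_us = total_us / calls
print('{:.1f} µs per call'.format(per_call_us))  # 72.2 µs per call
```

So the ~72 figure holds, but in microseconds rather than nanoseconds. Note that it includes the loop and list-building overhead of the benchmark, not only the function body.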
Olivier Lenoir
https://gitlab.com/olivierlenoir
- pythoncoder
- Posts: 5956
- Joined: Fri Jul 18, 2014 8:01 am
- Location: UK
- Contact:
Re: Multiaxis stepper motors using RMT
You might like to try
Code: Select all
@micropython.viper
def do_int2(s: int, d: int, m: int) -> int:
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m
I can't replicate your timings; even increasing the clock to 240MHz, mine are slower (tested on an ESP32 with SPIRAM). But Viper did produce a significant improvement. 160MHz clock, your code:
Code: Select all
do_float 27830976µs
do_int 15934537µs
do_int2 14929394µs
240MHz, your code:
Code: Select all
do_float 25827778µs
do_int 13942139µs
do_int2 12975179µs
240MHz, Viper:
Code: Select all
do_float 25017065µs
do_int 14129904µs
do_int2 9369835µs
Peter Hinch
Index to my micropython libraries.
Re: Multiaxis stepper motors using RMT
Using the script of @OlivierLenoir with the viper code provided by @pythoncoder for do_int2, I get the following times:
do_float 18039118µs
do_int 8656320µs
do_int2 3180131µs (Viper code)
Tested on a Wemos Lolin32 Lite WITHOUT SPIRAM, at 240 MHz. On a Wemos Lolin32 Pro with SPIRAM @240MHz, I get:
do_float 28680178µs
do_int 13879738µs
do_int2 8312655µs (Viper code)
For comparison, on a generic ESP32 with SPIRAM and the Pycom firmware, @160MHz, no Viper code, I get:
do_float 12550005µs
do_int 8396078µs
do_int2 7738536µs (Python code)
That matches my observation that the ESP32 port here with SPIRAM is slow.
- pythoncoder
- Posts: 5956
- Joined: Fri Jul 18, 2014 8:01 am
- Location: UK
- Contact:
Re: Multiaxis stepper motors using RMT
Indeed, that seemed the only explanation for the discrepancy.
In particular, gc is slow as it has to trawl through the whole RAM. This shouldn't affect these tests, as it is performed prior to each run, but in general it will add to latency and sluggishness, especially of soft IRQs.
Peter Hinch
Index to my micropython libraries.
- OlivierLenoir
- Posts: 126
- Joined: Fri Dec 13, 2019 7:10 pm
- Location: Picardie, FR
Re: Multiaxis stepper motors using RMT
pythoncoder wrote: ↑Thu Jan 23, 2020 9:54 am
You might like to try
Code: Select all
@micropython.viper
def do_int2(s: int, d: int, m: int) -> int:
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m
I can't replicate your timings, even increasing the clock to 240MHz mine are slower: tested on an ESP32 with SPIRAM. But Viper did produce a significant improvement. 160MHz clock, your code:

I did try @micropython.native when writing my previous post, and I'm now testing your @micropython.viper solution. The gc.collect() is just there so that each test runs under the same memory conditions.
Do you know what the impact on memory is when using plain Python vs @micropython.native vs @micropython.viper?
Here are my results using an ESP32-WROOM-32 @ 160MHz, MicroPython v1.12 on 2019-12-20; ESP32 module with ESP32:
Code: Select all
import gc
from utime import ticks_us, ticks_diff


def do_int2(s, d, m):
    # Plain bytecode version
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m


@micropython.native
def do_int2_n(s, d, m):
    # Same body, compiled with the native emitter
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m


@micropython.viper
def do_int2_v(s: int, d: int, m: int) -> int:
    # Same body, compiled with the viper emitter (typed integers)
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m


dist = [-41000, 3000, -17000]
max_dist = max(map(abs, dist))

gc.collect()
t0 = ticks_us()
for s in range(1, max_dist + 1):
    i = [do_int2(s, d, max_dist) for d in dist]
laps = ticks_diff(ticks_us(), t0)
print('do_int2 {}µs'.format(laps))

gc.collect()
t0 = ticks_us()
for s in range(1, max_dist + 1):
    i = [do_int2_n(s, d, max_dist) for d in dist]
laps = ticks_diff(ticks_us(), t0)
print('do_int2_n {}µs'.format(laps))

gc.collect()
t0 = ticks_us()
for s in range(1, max_dist + 1):
    i = [do_int2_v(s, d, max_dist) for d in dist]
laps = ticks_diff(ticks_us(), t0)
print('do_int2_v {}µs'.format(laps))
Code: Select all
do_int2 8987548µs
do_int2_n 7540658µs
do_int2_v 6868351µs
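The three timing loops differ only in the function under test, so they could be folded into one helper. A minimal sketch, assuming a hypothetical bench() function: the try/except fallback is only there so the snippet also runs on desktop Python, and unlike the loops above it does not build a list on every step, so absolute numbers will differ from the results posted here.

```python
try:
    from utime import ticks_us, ticks_diff  # MicroPython
except ImportError:
    # Fallback for desktop Python (assumption, for off-board testing only)
    from time import perf_counter_ns

    def ticks_us():
        return perf_counter_ns() // 1000

    def ticks_diff(end, start):
        return end - start


def do_int2(s, d, m):
    sdm = s * d + m // 2
    return sdm // m - (sdm - d) // m


def bench(fn, dist):
    """Return elapsed microseconds for one full pass over all steps."""
    max_dist = max(map(abs, dist))
    t0 = ticks_us()
    for s in range(1, max_dist + 1):
        for d in dist:
            fn(s, d, max_dist)
    return ticks_diff(ticks_us(), t0)


print('do_int2 {}µs'.format(bench(do_int2, [-41000, 3000, -17000])))
```

On the board, the same call can be repeated with the native and viper variants to compare them under identical loop overhead.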
Olivier Lenoir
https://gitlab.com/olivierlenoir