Best driver for ili9341 display

Minyiky
Posts: 26
Joined: Sat Oct 24, 2020 5:53 pm

Re: Best driver for ili9341 display

Post by Minyiky » Fri Dec 04, 2020 8:54 am

Update time: I have what I can only call mixed success. On the positive side I now have nano_GUI running on an ili9341 display (specifically the clock demo). On the negative side the write times are slow, around 3 seconds per screen refresh (it was 22 seconds before I got the micropython.viper decorator working). I am almost certain that is due to how I am handling the type conversion from GS4_HMSB to RGB565, but I'm not sure what the problem is.

I have styled my driver to be similar to yours in structure, just with a change in self.style and buffer size to account for screen size and style (320 * 240 // 2). My show function is also almost identical, just with 2 extra commands to set the limits of the write as required by the ili9341 display. The main difference I can see is in the _lcopy function.

Code: Select all

@micropython.viper
def _lcopy(dest, source, length:int):
    n = 0
    for x in range(length):
        # First nibble is pixel 1
        c = source[x]
        color = color4to565(int(c) >> 4)
        dest[n] = int(color) >> 8  # Blue green
        n += 1
        dest[n] = int(color) & 0xff   # Red
        n += 1

        # Second nibble is pixel 2
        color = color4to565(int(c) & 15)
        dest[n] = int(color) >> 8  # Blue green
        n += 1
        dest[n] = int(color) & 0xff   # Red
        n += 1
I am also using a lookup table for the 4-bit to 565 conversion, which I believe should be quick.

Code: Select all

@micropython.viper
def color4to565(c:int) -> int:
    clup = [    0,30720,  992,31712,
               15,30735, 1007,25388,
            52825,63488, 2016,65504,
               31,63519, 2047,65535]
    return int(clup[c])
If you have any thoughts on why this is so slow I would very much appreciate it.

pythoncoder
Posts: 5068
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Best driver for ili9341 display

Post by pythoncoder » Fri Dec 04, 2020 2:54 pm

There are several reasons. Every time color4to565 is called, you are instantiating a list. This involves allocations. Secondly, list lookups aren't easily optimised by Viper. It's important to provide Viper with type hints. Also there are rather a lot of type conversions going on. I would work with bytes throughout. There is also the function call overhead.

I'm also puzzled that you iterate length times. In my code length denotes the number of physical pixels in a row. With 4-bit pixels shouldn't you be iterating through length//2 bytes?

My approach would be to instantiate the lookup table (LUT) once, probably in the constructor, either as a bytes object or as a bytearray if you want to be able to alter it at runtime. The code would then look something like this.

Code: Select all

@micropython.viper
def _lcopy(dest:ptr8, source:ptr8, lut:ptr8, length:int):
    n = 0
    for x in range(length // 2):
        c = source[x]  # Pixel pair
        d = c >> 4  # MS nibble
        c &= 0xf  # LS nibble
        dest[n] = lut[c]
        n += 1
        dest[n] = lut[c + 1]
        n += 1
        dest[n] = lut[d]
        n += 1
        dest[n] = lut[d + 1]
        n += 1
You could write a routine to populate the LUT: this would only need to be called once in the constructor so would have zero impact on performance.

Code: Select all

def make_lut():
    lut = bytearray(32)
    clup = [    0,30720,  992,31712,
               15,30735, 1007,25388,
            52825,63488, 2016,65504,
               31,63519, 2047,65535]
    i = 0
    for color in clup:
        lut[i] = color & 0xff
        i += 1
        lut[i] = color >> 8
        i += 1
    return lut
Please note that these are untested ideas. I haven't properly checked byte ordering and suchlike - it's just meant as a guide to how to maximise runtime performance. My brain isn't at its best today so there may be a howler or two :(

Rather than specify colors as numbers I would use RGB tuples as they are a bit more intuitive:

Code: Select all

def make_lut():
    lut = bytearray(32)
    clup = [(0, 0, 0), (255, 0, 0)... (255, 255, 255)]  # black, red, white (other elements omitted)

    def rgb(r, g, b):
        return ((r & 0xf8) << 5) | ((g & 0x1c) << 11) | (b & 0xf8) | ((g & 0xe0) >> 5)

    i = 0
    for color in clup:
        c = rgb(*color)
        lut[i] = c & 0xff
        i += 1
        lut[i] = c >> 8
        i += 1
    return lut
Peter Hinch

pythoncoder
Posts: 5068
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Best driver for ili9341 display

Post by pythoncoder » Fri Dec 04, 2020 4:32 pm

I benchmarked this on an ESP32 at standard clock rate. 240 calls to _lcopy with 320 pixels per row took 74ms. Add the theoretical time of 122ms to transfer 240*320*16 bits at 100ns per bit and we get ~200ms refresh time. Here is the benchmark that times a single row:

Code: Select all

from time import ticks_us, ticks_diff
@micropython.viper
def _lcopy(dest:ptr8, source:ptr8, lut:ptr8, length:int):
    n = 0
    for x in range(length // 2):
        c = source[x]  # Pixel pair
        d = c >> 4  # MS nibble
        c &= 0xf  # LS nibble
        dest[n] = lut[c]
        n += 1
        dest[n] = lut[c + 1]
        n += 1
        dest[n] = lut[d]
        n += 1
        dest[n] = lut[d + 1]
        n += 1

def make_lut():
    lut = bytearray(32)
    clup = [    0,30720,  992,31712,
               15,30735, 1007,25388,
            52825,63488, 2016,65504,
               31,63519, 2047,65535]
    i = 0
    for color in clup:
        lut[i] = color & 0xff
        i += 1
        lut[i] = color >> 8
        i += 1
    return lut

def test():
    length = 320  # pixels
    source = bytearray(length // 2)
    dest = bytearray(length * 2)
    lut = make_lut()
    t = ticks_us()
    _lcopy(dest, source, lut, length)
    print(ticks_diff(ticks_us(), t))
In real code you need to increment the source pointer for each line: to do this fast without allocation either use a memoryview object or uctypes.addressof(), as per my other drivers.
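As a rough illustration of stepping through the frame buffer one row at a time, here is a plain-Python sketch (no Viper, so it also runs on a desktop interpreter). The names for_each_row and handler are hypothetical; in a driver the handler would be the call to _lcopy followed by the SPI write. It assumes a 4-bit buffer with two pixels per byte. Slicing a memoryview yields a view onto the same memory rather than a copy of the pixel bytes.

```python
def for_each_row(framebuf, width, height, handler):
    """Call handler(row_view) once per display row of a 4-bit frame buffer."""
    row_bytes = width // 2        # two 4-bit pixels per source byte
    mv = memoryview(framebuf)     # created once, outside the loop
    for row in range(height):
        start = row * row_bytes
        # Slicing a memoryview gives a view onto the same memory:
        # the pixel bytes are not copied.
        handler(mv[start:start + row_bytes])


# Tiny 4x3 pixel buffer: 2 bytes per row, 3 rows.
fb = bytearray(b"\x01\x23\x45\x67\x89\xab")
rows = []
for_each_row(fb, 4, 3, lambda view: rows.append(bytes(view)))
print(rows)
```

The bytes() call in the usage lines exists only to make the result printable; a real driver would pass the view straight on.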
Peter Hinch

Minyiky
Posts: 26
Joined: Sat Oct 24, 2020 5:53 pm

Re: Best driver for ili9341 display

Post by Minyiky » Fri Dec 04, 2020 8:44 pm

Thank you for the help with this. I knew that my Viper code was very inefficient, and that was down to not being familiar with it; for example, the type conversions were there to remove errors I was encountering.

I have restructured my code to make proper use of Viper, with only a single instantiation of the array (done before I saw your update). Working at 10MHz I get a transfer time of 217ms, almost exactly what you expected.
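For reference, the estimate can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name spi_transfer_ms is made up for illustration, the 74ms copy time is the figure from the benchmark post, and all other overheads are ignored.

```python
def spi_transfer_ms(width, height, bits_per_pixel, spi_hz):
    """Time in ms to clock one full frame over SPI, ignoring all overheads."""
    bits = width * height * bits_per_pixel
    return bits / spi_hz * 1000


transfer = spi_transfer_ms(320, 240, 16, 10_000_000)  # 10MHz = 100ns per bit
copy_ms = 74  # measured time for 240 _lcopy calls, taken from the benchmark post
print(round(transfer, 1), round(transfer + copy_ms))
```

That gives roughly 123ms on the wire and about 197ms in total, so a measured 217ms leaves around 20ms for everything else in show().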

A quick note, the length in the function I provided was actually a half line due to how I was handling the show function.

Interestingly, if I double the size of the line buffer the time taken to refresh the screen decreases, down to 191ms with a 480*2 line buffer. I am not sure where this comes from, as I am not making any call other than spi.write. I'm guessing the call itself must have a small overhead.

Out of interest, why do you quote 10MHz as the limit for SPI transfer? The ESP-IDF documentation lists it as being higher (up to 80MHz).
[Attachment: Screenshot 2020-12-04 203853.png (17.34 KiB)]
I will admit there is the very real possibility I am reading this wrong, or missing a limitation to do with micropython itself.

As an experiment, writing at 80MHz with a 240*12*2 line buffer (12 lines at once), I was able to reduce the refresh time to 64ms, timed using the @timed_function decorator on the show() function as per the standard of your GUI. The calls to _lcopy alone take 43ms to write the full buffer out in the needed format.

pythoncoder
Posts: 5068
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Best driver for ili9341 display

Post by pythoncoder » Sat Dec 05, 2020 6:47 am

The ESP can indeed go to 80MHz. I was quoting the ili9341 datasheet. A convoy travels at the speed of the slowest ship ;)

I'm glad to hear you've got this working at a speed that matches my expectations. I have decided to include a 4-bit ili9341 driver in nano-gui. I'll need to order a unit. Touch support would be an entirely different project, and given the number of different touch controllers I have little enthusiasm for it. But I think there are uses for non-touch ili9341 displays as they are large and low cost.

As you have found, there is a tradeoff between buffer size and performance. My drivers buffer a single line, but this is fairly arbitrary. You could buffer any rectangular area. Say you buffer N lines. The size of the "line" buffer goes up by a factor of N (to N*320*2 bytes). But the number of calls to _lcopy and SPI write drops by a factor of N, as does the number of times you have to increment the source pointer. All this provides a small performance boost.

In the limit you'd buffer the entire framebuf, but that would be daft: you'd just use an RGB565 framebuf. This would avoid the need for _lcopy, provide 16 bit color and enable the highest possible performance - but at great cost in RAM.

There is a law of diminishing returns here. I buffer a single line because it's conceptually simple, quite quick and economical in RAM.
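To put rough numbers on the tradeoff, here is a small sketch. The helper name buffer_tradeoff is hypothetical, and it assumes a 320*240 display with 2 bytes per RGB565 pixel, as discussed above.

```python
def buffer_tradeoff(n_lines, width=320, height=240):
    """Return (line buffer bytes, number of _lcopy/SPI write calls per refresh)."""
    buf_bytes = n_lines * width * 2     # 2 bytes per RGB565 pixel
    calls = -(-height // n_lines)       # ceiling division
    return buf_bytes, calls


for n in (1, 2, 12, 240):
    print(n, buffer_tradeoff(n))

full_rgb565 = 320 * 240 * 2    # bytes for a full 16-bit frame buffer
four_bit = 320 * 240 // 2      # bytes for the 4-bit frame buffer
print(full_rgb565, four_bit)
```

Buffering one line costs 640 bytes for 240 calls, while 12 lines cost 7680 bytes for 20 calls. The full RGB565 buffer would need 153600 bytes against 38400 for the 4-bit one.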
Peter Hinch

Minyiky
Posts: 26
Joined: Sat Oct 24, 2020 5:53 pm

Re: Best driver for ili9341 display

Post by Minyiky » Sat Dec 05, 2020 11:54 am

pythoncoder wrote: The ESP can indeed go to 80MHz. I was quoting the ili9341 datasheet. A convoy travels at the speed of the slowest ship ;)

Ahh, see, I knew I would be missing something very obvious.

It looks like this is about the limit without some overclocking and DMA magic, and is probably perfectly fast enough for me as I will likely only be using simple graphics.

I wonder why my _lcopy speed seems to be a bit lower than yours, as far as I can see my code is very similar to yours (although I think I did check it while set to write 16 line blocks).

One small thing I noticed, I believe you would want to call lut[c*2] (and lut[c*2+1]) as you are mapping 16 values on to a byte array of size 32.
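A quick plain-Python check of that indexing, as a sketch only: make_lut here takes the colour list as a parameter and lookup is a hypothetical helper, but the byte order (low byte first) follows the make_lut posted earlier in the thread.

```python
def make_lut(colors):
    """Pack 16-bit colour values into a bytearray, low byte first."""
    lut = bytearray(len(colors) * 2)
    for k, color in enumerate(colors):
        lut[2 * k] = color & 0xff
        lut[2 * k + 1] = color >> 8
    return lut


def lookup(lut, k):
    """Rebuild the 16-bit value for 4-bit colour index k."""
    return lut[2 * k] | (lut[2 * k + 1] << 8)


colors = [0, 30720, 992, 31712, 15, 30735, 1007, 25388,
          52825, 63488, 2016, 65504, 31, 63519, 2047, 65535]
lut = make_lut(colors)
print(all(lookup(lut, k) == colors[k] for k in range(16)))
```

Entry k occupies bytes 2k and 2k+1, so indexing with k directly would straddle two neighbouring entries.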

Going through all of this has certainly taught me a lot though, thank you for your patience with a complete beginner.

I have also seen an IPS display which I am tempted to try at some point. It uses a different driver, the st7789, which is actually what I based my C version of the driver on, so it shouldn't be too complicated to implement. If I am reading the datasheet right it would also allow for a faster clock, with a clock write cycle of 16ns (so ~60MHz).
[Attachment: Screenshot 2020-12-05 at 11.53.09.png (154 KiB)]
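The conversion from write cycle time to clock rate is just a reciprocal. A minimal sketch, using a made-up helper name and the cycle times quoted in this thread (100ns for the ili9341, 16ns as my reading of the st7789 datasheet):

```python
def max_clock_mhz(cycle_ns):
    """Clock rate in MHz whose period equals the given cycle time in ns."""
    return 1000 / cycle_ns


ili9341_mhz = max_clock_mhz(100)   # 100ns write cycle
st7789_mhz = max_clock_mhz(16)     # 16ns write cycle
print(ili9341_mhz, st7789_mhz)
```

This works out to 10MHz and 62.5MHz respectively, so 60MHz would sit just inside the limit if the 16ns figure is right.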

pythoncoder
Posts: 5068
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Best driver for ili9341 display

Post by pythoncoder » Sat Dec 05, 2020 1:26 pm

DMA magic won't save you from the physical limit of SPI.

Attitudes to overclocking vary. For one-off amateur projects it's arguably OK, but there is always the risk that one day it will stop working and need a power cycle. My personal view, from a career in electronics design, is that datasheet limits are gospel. When an entire factory is churning out your design, you don't want a new batch of parts to come in and everything to fail test.

On the other hand if the LVGL guys really are running at those speeds...

You are right about the need to multiply by 2. Well spotted - my brain was well below par yesterday. It's possible that bit twiddling might be slightly quicker than multiplication, but I haven't tested it:

Code: Select all

@micropython.viper
def _lcopy(dest:ptr8, source:ptr8, lut:ptr8, length:int):
    n = 0
    for x in range(length // 2):
        c = source[x]  # Pixel pair
        d = (c & 0xf0) >> 3  # MS nibble * 2
        c = (c & 0xf) << 1  # LS nibble * 2
        dest[n] = lut[c]
        n += 1
        dest[n] = lut[c + 1]
        n += 1
        dest[n] = lut[d]
        n += 1
        dest[n] = lut[d + 1]
        n += 1
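The shift-based index arithmetic can be checked against the multiplication form on a desktop interpreter. This sketch only confirms the two give the same LUT offsets for every possible source byte; it says nothing about which is faster under Viper.

```python
mismatches = []
max_index = 0
for c in range(256):                 # every possible pixel-pair byte
    hi_shift = (c & 0xf0) >> 3       # as in _lcopy above
    lo_shift = (c & 0x0f) << 1
    hi_mul = (c >> 4) * 2            # the multiplication form
    lo_mul = (c & 0x0f) * 2
    if (hi_shift, lo_shift) != (hi_mul, lo_mul):
        mismatches.append(c)
    max_index = max(max_index, hi_shift + 1, lo_shift + 1)

print(mismatches, max_index)
```

An empty mismatch list and a highest index of 31 mean every access stays inside the 32-byte LUT.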
You are also right about the st7789: you should be able to run at 60MHz. You'll need to keep wires short and direct at these speeds. Also note that (as with many of these chips) read speeds are slower; if you're porting to nano-gui it doesn't do readbacks.

If you have a driver working with nano-gui and ili9341 you might like to post it - I'd be interested to study it. I need to give some thought on how to integrate 4-bit color into the project as a whole, but I have a scheme in mind.
Peter Hinch

Minyiky
Posts: 26
Joined: Sat Oct 24, 2020 5:53 pm

Re: Best driver for ili9341 display

Post by Minyiky » Sun Dec 06, 2020 2:08 pm

pythoncoder wrote: If you have a driver working with nano-gui and ili9341 you might like to post it - I'd be interested to study it. I need to give some thought on how to integrate 4-bit color into the project as a whole, but I have a scheme in mind.

I have forked your nano-gui module and added some files that should allow the GUI to work.

The fork can be found here: https://github.com/minyiky/micropython-nano-gui

I have added the following files:
ili9341_setup.py # Replaces color_setup.py file in the demo
gui\core\colors_4bit.py # Certainly not a final solution, but these are the 4-bit integers which match the ones defined in the lookup table in the driver. Ideally I believe both the 4-bit and RGB565 values would be stored here
gui\demos\aclock_ili9341.py # A slightly modified version of the aclock.py demo I was using to test the GUI
drivers\ili9XXX\ili9341.py # My implementation of the display driver, keeping to a similar line as your drivers, but there are almost certainly things you would want to change

Hope this is helpful to you

pythoncoder
Posts: 5068
Joined: Fri Jul 18, 2014 8:01 am
Location: UK

Re: Best driver for ili9341 display

Post by pythoncoder » Sun Dec 06, 2020 3:04 pm

Thank you very much. It will give me something to start with when I receive some hardware.

I have a couple of changes in mind. One is to make the handling of 4-bit color more generic, in case I decide to implement further 4-bit drivers. This really just moves code around between modules.

The second change is a minor RAM optimisation and is a matter of program style. Most drivers, including the Adafruit ones, declare bytearrays for constants such as self.PWCTRB. My approach is to write

Code: Select all

self.write_cmd(b'\xcf', 0x00, 0xC1, 0x30)  # PWCTRB
It saves a bit of RAM and I find it just as easy to switch between code and datasheet so long as the comment has the register name.

I'll post a message when this is complete. Thanks again for your help - it has convinced me that this is the way forward with the ili9341.
Peter Hinch

Minyiky
Posts: 26
Joined: Sat Oct 24, 2020 5:53 pm

Re: Best driver for ili9341 display

Post by Minyiky » Sun Dec 06, 2020 3:47 pm

I'm glad I could be of some assistance.

Interestingly, I originally had the commands set up the way you do. However, when I was working with hundreds of individual calls, the RAM write sequence was being slowed by the continuous byte array conversion, so I changed them all to constants to keep things consistent for me to read. Now that the write is done as a whole buffer this is no longer a problem, and even if it were, I agree they could be reverted.
