TouchGFX on NeoChrom
This section discusses how to use TouchGFX on hardware having the NeoChrom graphics accelerator. This graphics accelerator improves the performance significantly for operations like texture mapping, image scaling and rotation. This means that more advanced UIs can be built while still keeping a high frame rate.
The NeoChrom graphics accelerator is currently only available in the STM32U5x9 microcontrollers, as found on the STM32U599 Discovery Kit, for example.
The NeoChrom accelerator is also known by the name GPU2D in source code and in CubeMX.
NeoChrom Graphical Capabilities
The NeoChrom accelerator is capable of performing basic blitting (drawing images), blending, scaling, rotation, and texture mapping. TouchGFX uses all of these operations automatically when running on a microcontroller with NeoChrom.
Compared to the DMA2D graphics accelerator, NeoChrom can accelerate more graphical operations and has a richer feature set:
Graphic feature | DMA2D | GPU2D |
---|---|---|
Supported formats (with TouchGFX) | ARGB8888, RGB888, RGB565, A8, A4, L8 | ARGB8888, RGB888, RGB565, A8, A4, A2, A1 |
Command list based | No | Yes |
Drawing | Rectangles | Rectangles, pixels, lines, triangles, quadrilaterals with multi-sample anti-aliasing (MSAA) |
Blitting | Copy, alpha blending, pixel format conversion | Copy, alpha blending, pixel format conversion, color keying |
Texture Mapping | No | Yes |
With these capabilities available, more TouchGFX widgets are accelerated with NeoChrom than with DMA2D:
Widget | DMA2D | GPU2D |
---|---|---|
Box, BoxWithBorder | Yes | Yes |
Image, AnimatedImage, TiledImage, SnapshotWidget | Yes | Yes |
Button, ButtonWithIcon, ButtonWithLabel, ToggleButton | Yes | Yes |
RadioButton, RepeatButton | Yes | Yes |
PixelDataWidget | Yes | Yes |
TextArea, TextAreaWithWildcard, Keyboard | Partly | Yes |
ScalableImage | No | Yes |
TextureMapper, AnimatedTextureMapper | No | Yes |
Circle, Line, Graph, Gauge | No | No |
Operations that are not hardware accelerated are rendered in software, which implies a higher CPU load and lower performance. As the table above shows, NeoChrom can accelerate widgets like ScalableImage and TextureMapper. This means that those widgets can be used to a greater extent while maintaining high performance.
Rendering Time Improvements with NeoChrom
The following examples illustrate the speed-up provided by NeoChrom over DMA2D and software rendering. We have created two projects with the Designer. The first project shows an Image on a Box background. The second project shows a TextureMapper Widget on a Box background. The widget is redrawn in every frame. In both cases the bitmap is of size 128x128, in ARGB8888 format, and stored in the internal flash. The framebuffer is in RGB565 format.
Both projects have been executed on an STM32F746 and an STM32U599 Discovery board.
We have measured the rendering times by connecting a GPIO to a logic analyzer:
The figure above shows the frame rate and rendering time when running on the STM32F746. Channel 00 shows the VSYNC signal. We see that the display runs with a frame interval of 16.9 ms (A1 to A2), corresponding to a frame rate of 59.2 Hz. Channel 01 shows the render time (high when rendering, B1 to B2). The time to render the Image is thus 1.3 ms. Image rendering is fast on the STM32F746.
The figure above is the texture mapper project running on the STM32F746. The rendering time of the TextureMapper is 4.5 ms. The TextureMapper widget is much slower than the Image.
Here is the STM32U599 Discovery kit running the Image project. The STM32U599 Discovery kit display has a display frame interval of 12.26 ms corresponding to a frame rate of 81.6 Hz. The render time of the Image is 0.7 ms. We see that the Image widget is faster than on the STM32F746 kit.
The render time of the TextureMapper is 1.7 ms. The TextureMapper is also faster on STM32U599 than on the STM32F746.
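The frame rates quoted above follow directly from the measured frame intervals. As a minimal sketch of that conversion (the function names `frameRateHz` and `renderLoad` are hypothetical helpers, not part of TouchGFX), the arithmetic can be written as:

```cpp
#include <cassert>

// Convert a measured frame interval in milliseconds to a frame rate in Hz.
double frameRateHz(double frameIntervalMs)
{
    assert(frameIntervalMs > 0.0);
    return 1000.0 / frameIntervalMs;
}

// Fraction of one frame interval that is spent rendering.
// A value below 1.0 means the frame is finished before the next VSYNC.
double renderLoad(double renderMs, double frameIntervalMs)
{
    assert(frameIntervalMs > 0.0);
    return renderMs / frameIntervalMs;
}
```

With the measurements from the logic analyzer, `frameRateHz(16.9)` gives roughly 59.2 Hz and `frameRateHz(12.26)` roughly 81.6 Hz. `renderLoad(4.5, 16.9)` shows that the TextureMapper alone uses about a quarter of the frame interval on the STM32F746.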
Rendering Time Summary
The table below shows the rendering times:
Element | STM32F746 | STM32U599 | Speed up |
---|---|---|---|
Frequency | 200 MHz | 160 MHz | 0.8 |
Image | 1.3 ms | 0.7 ms | ~2x |
TextureMapper | 4.5 ms | 1.7 ms | ~3x |
We see that even with a reduced clock frequency the STM32U599 greatly outperforms the STM32F746, especially with the TextureMapper.
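The speed-up column in the table is rounded. The sketch below recomputes it from the table values and also expresses the render times in core clock cycles, which makes the comparison independent of the different clock frequencies. The helper names `speedUp` and `renderCycles` are assumptions made for this illustration only.

```cpp
#include <cassert>

// Wall-clock speed-up: how many times faster the new platform renders.
double speedUp(double baselineMs, double newMs)
{
    assert(newMs > 0.0);
    return baselineMs / newMs;
}

// Render time expressed in core clock cycles.
// 1 ms at 1 MHz corresponds to 1000 cycles.
double renderCycles(double renderMs, double clockMHz)
{
    return renderMs * clockMHz * 1000.0;
}
```

Using the table values, `speedUp(1.3, 0.7)` is about 1.9 and `speedUp(4.5, 1.7)` about 2.6. Measured in clock cycles, `renderCycles(4.5, 200) / renderCycles(1.7, 160)` is about 3.3, so the gap widens once the lower clock frequency of the STM32U599 is taken into account.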
These measurements are taken with the image in internal flash and, for the STM32F746, the framebuffer in external SDRAM. The framebuffer is in internal SRAM for the STM32U599 (as this will be the typical scenario). Moving the image to external flash hurts the STM32F746 as it uses QSPI flash (4-bit bus), whereas the STM32U599 uses a faster OSPI flash (8-bit bus).
The STM32F746 Discovery kit can run with a 480x272 RGB565 framebuffer in the internal RAM. This improves the performance (Image down to 1.03 ms), but it is not the standard configuration for the STM32F746, as the framebuffer uses a very large part of the internal SRAM, leaving little RAM for other application components.
Running with a single framebuffer is also not suitable for all applications.
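To see why an internal framebuffer is costly, it helps to compute its size. The following is a minimal sketch, assuming tightly packed pixels without row padding. The enum `FramebufferFormat` and the functions `bytesPerPixel` and `framebufferBytes` are hypothetical names for this illustration and are not TouchGFX APIs.

```cpp
#include <cstddef>

// Framebuffer formats named in this section.
enum class FramebufferFormat { RGB565, RGB888, ARGB8888 };

// Bytes needed to store one pixel in the given format.
constexpr std::size_t bytesPerPixel(FramebufferFormat format)
{
    return format == FramebufferFormat::RGB565 ? 2
         : format == FramebufferFormat::RGB888 ? 3
         : 4;
}

// Total memory for `bufferCount` framebuffers of width x height pixels,
// assuming no padding between rows.
constexpr std::size_t framebufferBytes(std::size_t width,
                                       std::size_t height,
                                       FramebufferFormat format,
                                       std::size_t bufferCount = 1)
{
    return width * height * bytesPerPixel(format) * bufferCount;
}

// A single 480x272 RGB565 framebuffer takes 261120 bytes (255 KiB).
static_assert(framebufferBytes(480, 272, FramebufferFormat::RGB565) == 261120,
              "480 * 272 * 2 bytes");
```

Passing `bufferCount = 2` doubles the requirement to 522240 bytes, which illustrates why double buffering at this resolution usually ends up in external memory.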
Richer User Interfaces
The improved rendering performance can be used to create user interfaces with more advanced animations, for example with more scaled or rotated elements.
For the STM32F746, the frame refresh time was 16.8 ms. This means that the application must keep the render time below this to keep a frame rate of 60 fps. We can therefore have at most 3.75 texture mappers of that complexity (16.8 ms / 4.48 ms) on the screen, or a single larger texture mapper of size 247 x 247 (the same number of pixels and approximately the same rendering time).
If we assume the same screen refresh rate, but use the STM32U599 CPU, we can have 14.36 texture mappers (16.8 ms / 1.17 ms), or a single texture mapper of size 485 x 485.
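The budget estimate in the two paragraphs above can be reproduced with a few lines of arithmetic. This is a sketch that uses the figures exactly as quoted there and assumes, as the text does, that render time scales approximately linearly with the number of pixels. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Number of equally complex widgets that can be rendered in one frame interval.
double widgetsPerFrame(double frameIntervalMs, double renderMs)
{
    assert(renderMs > 0.0);
    return frameIntervalMs / renderMs;
}

// Side length of one square widget that covers the same number of pixels
// as `count` square widgets with side length `baseSidePx`.
double equivalentSquareSide(double baseSidePx, double count)
{
    assert(count >= 0.0);
    return baseSidePx * std::sqrt(count);
}
```

For the STM32F746, `widgetsPerFrame(16.8, 4.48)` is 3.75, and `equivalentSquareSide(128, 3.75)` is just under 248 pixels, matching the 247 x 247 estimate. With the 1.17 ms figure used for the STM32U599, the same functions give about 14.36 widgets and a side length of about 485 pixels.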
The following figure shows two applications running on respectively the STM32F746 and the STM32U599. The idea is to make a carousel-like menu where elements are scaled up when going to the center and scaled down when going out (here we just use the same texture for all the elements).
The STM32F746 is capable of showing three icons, one big icon, scaled up by a factor of 1.9, and two smaller icons. The STM32U599 is capable of showing 7 icons. The largest icon is scaled up by factor 2.7.
The rendering time of the application with 3 icons on the STM32F746 is 14.38 ms. The rendering time of the application with 7 icons on the STM32U599 is 14.93 ms. Both UIs can thus run at 60 fps, with the STM32U599 showing much more content at a higher resolution.
Creating a project
CubeMX and the TouchGFX Generator support NeoChrom. In CubeMX the accelerator is known by its code name GPU2D. The GPU2D accelerator is only available to TouchGFX if GPU2D is enabled in the TouchGFX configuration in CubeMX.
If you use the STM32U599 TBS (template project) provided with the TouchGFX Designer this is already done. If you make your own custom project, make sure to enable the GPU2D Accelerator as shown below:
After enabling GPU2D, press "Generate Code" in CubeMX. This regenerates the target configuration code. Now open the project in the TouchGFX Designer and generate code there as well (F4).
The Designer generates assets (images, fonts, and texts) and simulator code that match the target configuration. You are now ready to compile the code with IAR.
If you are starting a project from the Designer there is no need to open CubeMX unless you need to change some hardware settings.
Supported IDEs
The STM32U599 TBS (version 3.0.0) is currently only supported with IAR Embedded Workbench.
A recent version of IAR is required (8.5x.x). Make sure that the processor variant (STM32U599NJ for the Discovery board) is supported by checking the General Options:
Framebuffer Formats
The STM32U599 Discovery board supports three framebuffer formats: RGB565, RGB888, and ARGB8888. These are configurable from CubeMX.
NeoChrom Limitations
The NeoChrom graphics accelerator in the STM32U599 does not support the L8 image formats (L8_RGB565, L8_RGB888, L8_ARGB8888).
If you use these image formats in a TouchGFX application running on STM32U599 the images will be drawn using DMA2D. If you use these formats with ScalableImage or TextureMapper a software fallback will be used.
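The fallback behavior described above can be summarized as a small decision function. This is only a model of the rules stated in this section, not TouchGFX code. All type and function names (`ImageFormat`, `WidgetKind`, `RenderPath`, `isL8`, `renderPathOnU599`) are hypothetical, and non-L8 formats are simply assumed to take the NeoChrom path.

```cpp
// Image formats considered in this sketch (hypothetical enum, not a TouchGFX type).
enum class ImageFormat { RGB565, RGB888, ARGB8888, L8_RGB565, L8_RGB888, L8_ARGB8888 };

// Widgets that draw bitmaps, as discussed in this section.
enum class WidgetKind { Image, ScalableImage, TextureMapper };

// Where the drawing work ends up on an STM32U599.
enum class RenderPath { NeoChrom, DMA2D, Software };

// True for the palette-based L8 formats that NeoChrom does not support.
constexpr bool isL8(ImageFormat format)
{
    return format == ImageFormat::L8_RGB565
        || format == ImageFormat::L8_RGB888
        || format == ImageFormat::L8_ARGB8888;
}

// Model of the rules in the text: L8 images are drawn by DMA2D,
// L8 with ScalableImage or TextureMapper falls back to software,
// and everything else is assumed to be handled by NeoChrom.
constexpr RenderPath renderPathOnU599(WidgetKind widget, ImageFormat format)
{
    return !isL8(format) ? RenderPath::NeoChrom
         : (widget == WidgetKind::Image ? RenderPath::DMA2D
                                        : RenderPath::Software);
}

static_assert(renderPathOnU599(WidgetKind::TextureMapper, ImageFormat::L8_RGB565)
                  == RenderPath::Software,
              "L8 with TextureMapper falls back to software");
```

The model makes the cost of L8 visible: the same TextureMapper that runs on NeoChrom with an ARGB8888 bitmap drops to software rendering when the bitmap is stored as L8.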
It is therefore recommended not to use L8 images with the STM32U599.