
TouchGFX on NeoChrom

This section discusses how to use TouchGFX on hardware with the NeoChrom graphics accelerator. The accelerator significantly improves performance for operations such as texture mapping, image scaling, and rotation. More advanced UIs can therefore be built while still keeping a high frame rate.

The NeoChrom graphics accelerator is currently only available in the STM32U5x9 microcontrollers, as found on, for example, the STM32U599 Discovery Kit.

STM32U599 Discovery Board

The NeoChrom accelerator is also known by the name GPU2D in source code and in CubeMX.

NeoChrom Graphical Capabilities

The NeoChrom accelerator is capable of performing basic blitting (drawing images), blending, scaling, rotation, and texture mapping. All such operations are automatically used by TouchGFX when running on a microcontroller with NeoChrom.

Compared to the DMA2D graphics accelerator, NeoChrom is capable of accelerating more graphical operations and has a richer feature set:

| Graphic feature | DMA2D | GPU2D |
| --- | --- | --- |
| Supported formats (with TouchGFX) | ARGB8888, RGB888, RGB565, A8, A4, L8 | ARGB8888, RGB888, RGB565, A8, A4, A2, A1 |
| Command list based | No | Yes |
| Drawing | Rectangles | Rectangles, Pixels, Line, Triangle, Quadrilaterals with multi-sample anti-aliasing (MSAA) |
| Blitting | Copy, alpha blending, pixel format conversion | Copy, alpha blending, pixel format conversion, color keying |
| Texture Mapping | No | Yes |

With these capabilities available, more TouchGFX widgets are accelerated with NeoChrom than with DMA2D:

| Widget | DMA2D | GPU2D |
| --- | --- | --- |
| Box, BoxWithBorder | Yes | Yes |
| Image, AnimatedImage, TiledImage, SnapshotWidget | Yes | Yes |
| Button, ButtonWithIcon, ButtonWithLabel, ToggleButton | Yes | Yes |
| RadioButton, RepeatButton | Yes | Yes |
| PixelDataWidget | Yes | Yes |
| TextArea, TextAreaWithWildcard, Keyboard | Partly | Yes |
| ScalableImage | No | Yes |
| TextureMapper, AnimatedTextureMapper | No | Yes |
| Circle, Line, Graph, Gauge | No | No |

Operations that are not hardware accelerated are rendered in software, which implies a higher CPU load and lower performance. As the table above shows, NeoChrom can accelerate widgets like ScalableImage and TextureMapper. This means that we can use those widgets to a greater extent while keeping a high performance.

Rendering Time Improvements with NeoChrom

The following examples illustrate the speed-up provided by NeoChrom over DMA2D and software rendering. We have created two projects with the Designer. The first project shows an Image on a Box background. The second project shows a TextureMapper Widget on a Box background. The widget is redrawn in every frame. In both cases the bitmap is of size 128x128, in ARGB8888 format, and stored in the internal flash. The framebuffer is in RGB565 format.

The Image project

The TextureMapper project

Both projects have been executed on an STM32F746 and an STM32U599 Discovery board.

We have measured the rendering times by connecting a GPIO to a logic analyzer:

STM32F746 running the Image project

The figure above shows the frame rate and rendering time when running on the STM32F746. Channel 00 shows the VSYNC signal. We see that the display runs with a frame interval of 16.9 ms (A1 to A2), corresponding to a frame rate of 59.2 Hz. Channel 01 shows the render time (high when rendering, B1 to B2). The time to render the Image is thus 1.3 ms. Image rendering is fast on the STM32F746.

STM32F746 running the TextureMapper project

The figure above shows the TextureMapper project running on the STM32F746. The rendering time of the TextureMapper is 4.5 ms. The TextureMapper widget is much slower than the Image.

STM32U599 running the Image project

Here is the STM32U599 Discovery kit running the Image project. The STM32U599 Discovery kit display has a display frame interval of 12.26 ms corresponding to a frame rate of 81.6 Hz. The render time of the Image is 0.7 ms. We see that the Image widget is faster than on the STM32F746 kit.

STM32U599 running the TextureMapper project

The render time of the TextureMapper is 1.7 ms. The TextureMapper is also faster on STM32U599 than on the STM32F746.
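
The frame rates quoted above follow directly from the measured VSYNC intervals. As a minimal sketch (the function names are illustrative and not part of TouchGFX), the conversion and the share of each frame interval spent rendering can be written as:

```cpp
#include <cassert>

// Convert a measured frame interval (VSYNC to VSYNC, in ms) to a frame rate in Hz.
double frameRateHz(double frameIntervalMs)
{
    assert(frameIntervalMs > 0.0);
    return 1000.0 / frameIntervalMs;
}

// Fraction of one frame interval that is spent rendering.
// A value above 1.0 means the frame deadline is missed.
double renderLoad(double renderTimeMs, double frameIntervalMs)
{
    assert(frameIntervalMs > 0.0);
    return renderTimeMs / frameIntervalMs;
}
```

With the measurements above, `frameRateHz(16.9)` is about 59.2 and `frameRateHz(12.26)` is about 81.6. For the TextureMapper, `renderLoad(4.5, 16.9)` is about 0.27 on the STM32F746, against about 0.14 for `renderLoad(1.7, 12.26)` on the STM32U599.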

Rendering Time Summary

The table below shows the rendering times:

| Element | STM32F746 | STM32U599 | Speed up |
| --- | --- | --- | --- |
| Frequency | 200 MHz | 160 MHz | 0.8 |
| Image | 1.3 ms | 0.7 ms | ~2x |
| TextureMapper | 4.5 ms | 1.7 ms | ~3x |

We see that even with a reduced clock frequency the STM32U599 greatly outperforms the STM32F746, especially with the TextureMapper.
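
The speed-up column for the two render-time rows is simply the ratio of the two render times. A small sketch of that calculation, with a hypothetical helper name:

```cpp
#include <cassert>

// Speed-up of a candidate over a baseline render time.
// Values above 1.0 mean the candidate renders faster.
double speedUp(double baselineMs, double candidateMs)
{
    assert(baselineMs > 0.0 && candidateMs > 0.0);
    return baselineMs / candidateMs;
}
```

`speedUp(1.3, 0.7)` is about 1.86 and `speedUp(4.5, 1.7)` is about 2.65, which the table rounds to ~2x and ~3x. The 0.8 in the Frequency row is the clock ratio 160 / 200 instead, because a higher clock is better whereas a lower render time is better.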

These measurements are taken with the image in internal flash and the framebuffer in external SDRAM for the STM32F746. The framebuffer is in internal SRAM for the STM32U599 (as this will be the typical scenario). Moving the image to external flash hurts the STM32F746 as it uses QSPI flash (4-bit bus), whereas the STM32U599 uses a faster OSPI flash (8-bit bus).

The STM32F746 Discovery kit can run with a 480x272 RGB565 framebuffer in the internal RAM. This improves the performance (Image down to 1.03 ms), but it is not the standard configuration for STM32F746, as it uses a very large part of the internal SRAM for the framebuffer, leaving little RAM for other application components.
Running with a single framebuffer is also not suitable for all applications.
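
To see why the framebuffer takes such a large share of internal RAM, it helps to compute its size. The sketch below is an illustration only; the enum and function names are hypothetical and not TouchGFX API.

```cpp
#include <cassert>
#include <cstddef>

// Framebuffer pixel formats mentioned in this section.
enum class FramebufferFormat { RGB565, RGB888, ARGB8888 };

// Bytes needed to store one pixel in the given format.
std::size_t bytesPerPixel(FramebufferFormat format)
{
    switch (format)
    {
    case FramebufferFormat::RGB565:   return 2;
    case FramebufferFormat::RGB888:   return 3;
    case FramebufferFormat::ARGB8888: return 4;
    }
    return 0; // Not reached for valid enum values.
}

// Total RAM needed for `bufferCount` framebuffers of the given size and format.
std::size_t framebufferBytes(std::size_t width, std::size_t height,
                             FramebufferFormat format, std::size_t bufferCount)
{
    assert(bufferCount >= 1);
    return width * height * bytesPerPixel(format) * bufferCount;
}
```

For a single 480x272 RGB565 framebuffer, `framebufferBytes(480, 272, FramebufferFormat::RGB565, 1)` gives 261120 bytes (255 KiB). Two framebuffers double that to 522240 bytes. Compare these figures with the SRAM size in the datasheet of your device.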

Richer User Interfaces

The improved rendering performance can be used to create user interfaces with more advanced animations, for example with more scaled or rotated elements.
For the STM32F746, the frame refresh time was 16.8 ms. This means that the application must keep the render time below this to keep a frame rate of 60 fps. We can therefore have at most 3.75 texture mappers of that complexity (16.8 ms / 4.48 ms) on the screen, or a single larger texture mapper of size 247 x 247 (the same number of pixels and approximately the same rendering time).
If we assume the same screen refresh rate, but use the STM32U599, we can have 14.36 texture mappers (16.8 ms / 1.17 ms), or a single texture mapper of size 485 x 485.
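
The budget arithmetic above can be reproduced with the following sketch. It assumes, as the text does, that render time scales roughly linearly with the number of pixels drawn; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// How many widgets of a given render time fit into one frame interval.
double widgetsPerFrame(double frameIntervalMs, double renderTimeMs)
{
    assert(frameIntervalMs > 0.0 && renderTimeMs > 0.0);
    return frameIntervalMs / renderTimeMs;
}

// Side length of a single square widget covering the same number of pixels
// as `count` square widgets with side `sidePx`.
double equivalentSidePx(double sidePx, double count)
{
    assert(sidePx > 0.0 && count >= 0.0);
    return sidePx * std::sqrt(count);
}
```

`widgetsPerFrame(16.8, 4.48)` is 3.75 and `equivalentSidePx(128, 3.75)` is about 247.9, which the text truncates to 247. `widgetsPerFrame(16.8, 1.17)` is about 14.36 and `equivalentSidePx(128, 14.36)` is about 485. Note that the summary table lists 1.7 ms for the TextureMapper on the STM32U599; with that value the same formula gives about 9.9 widgets, so use your own measured render time as input.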

The following figure shows two applications, one running on the STM32F746 and one on the STM32U599. The idea is to make a carousel-like menu where elements are scaled up as they move toward the center and scaled down as they move away from it (here we just use the same texture for all the elements).

STM32F746 (left) and STM32U599 (right) running the texture mapper based carousel project

The STM32F746 is capable of showing 3 icons: one big icon, scaled up by a factor of 1.9, and two smaller icons. The STM32U599 is capable of showing 7 icons. The largest icon is scaled up by a factor of 2.7.

The rendering time of the application with 3 icons on the STM32F746 is 14.38 ms. The rendering time of the application with 7 icons on the STM32U599 is 14.93 ms. Both UIs can thus run at 60 fps, with the STM32U599 showing much more content in a higher resolution.
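
The carousel idea can be sketched as a function that maps an icon's horizontal distance from the screen center to a scale factor. This is an illustration under assumptions: the linear fall-off, the function name, and the parameters are hypothetical and not taken from the demo project.

```cpp
#include <cassert>
#include <cmath>

// Scale factor for a carousel icon whose center is `distancePx` away from the
// screen center. Icons at the center get `maxScale`; icons at or beyond
// `halfWidthPx` get `minScale`. The fall-off in between is linear (an assumption).
double carouselScale(double distancePx, double halfWidthPx,
                     double minScale, double maxScale)
{
    assert(halfWidthPx > 0.0 && minScale <= maxScale);
    double t = std::fabs(distancePx) / halfWidthPx;
    if (t > 1.0)
    {
        t = 1.0; // Clamp icons outside the visible range to the minimum scale.
    }
    return maxScale - (maxScale - minScale) * t;
}
```

For example, with an assumed half width of 240 pixels and scales between 1.0 and 2.7, an icon at the center gets a scale of 2.7 and an icon 120 pixels away gets about 1.85. In a TouchGFX application the result would typically be passed to `TextureMapper::setScale`.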

Creating a project

CubeMX and the TouchGFX Generator support NeoChrom. In CubeMX the accelerator is known by its code name GPU2D. The GPU2D accelerator is only available to TouchGFX if GPU2D is enabled in the TouchGFX configuration in CubeMX.
If you use the STM32U599 TBS (template project) provided with the TouchGFX Designer, this is already done. If you make your own custom project, make sure to enable the GPU2D Accelerator as shown below:

Enabling GPU2D (NeoChrom) in CubeMX

After enabling GPU2D, press "Generate Code" in CubeMX. This regenerates the target configuration code. Now open the project in the TouchGFX Designer and generate code there as well (F4).

The Designer generates assets (images, fonts, and texts) and simulator code that match the target configuration. You are now ready to compile the code with IAR.

If you are starting a project from the Designer, there is no need to open CubeMX unless you need to change some hardware settings.

Supported IDEs

The STM32U599 TBS (version 3.0.0) is currently only supported with IAR Embedded Workbench.

A recent version of IAR is required (8.5x.x). Make sure that the processor variant (STM32U599NJ for the Discovery board) is supported by checking the General Options:

IAR General Options

Framebuffer Formats

The STM32U599 Discovery board supports three framebuffer formats: RGB565, RGB888, and ARGB8888. These are configurable from CubeMX.

NeoChrom Limitations

The NeoChrom graphics accelerator in STM32U599 does not support the L8 image formats (L8_RGB565, L8_RGB888, L8_ARGB8888).
If you use these image formats in a TouchGFX application running on STM32U599, the images will be drawn using DMA2D. If you use these formats with ScalableImage or TextureMapper, a software fallback will be used.

It is therefore recommended not to use L8 images with STM32U599.
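
The fallback rule described above can be summarized as a small decision function. This is a model of the behavior as stated in this section, not TouchGFX code; all type and function names are hypothetical, and only the three widget kinds discussed here are covered.

```cpp
// Image formats relevant to the fallback rule.
enum class ImageFormat { RGB565, RGB888, ARGB8888, L8_RGB565, L8_RGB888, L8_ARGB8888 };

// Widgets that draw bitmaps, as discussed in this section.
enum class WidgetKind { Image, ScalableImage, TextureMapper };

// Where the drawing work ends up on an STM32U599.
enum class RenderPath { GPU2D, DMA2D, Software };

// True for the palette-based L8 formats that NeoChrom does not support.
constexpr bool isL8(ImageFormat format)
{
    return format == ImageFormat::L8_RGB565 ||
           format == ImageFormat::L8_RGB888 ||
           format == ImageFormat::L8_ARGB8888;
}

// Non-L8 images go to NeoChrom (GPU2D). L8 images drawn unscaled fall back to
// DMA2D; L8 images in ScalableImage or TextureMapper are rendered in software.
constexpr RenderPath renderPathOnU599(WidgetKind widget, ImageFormat format)
{
    return !isL8(format)               ? RenderPath::GPU2D
         : widget == WidgetKind::Image ? RenderPath::DMA2D
                                       : RenderPath::Software;
}
```

For example, `renderPathOnU599(WidgetKind::TextureMapper, ImageFormat::L8_RGB565)` yields `RenderPath::Software`, which is the slow case the recommendation is meant to avoid.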