TouchGFX on NeoChrom/NeoChromVG
This section discusses how to use TouchGFX on hardware with the NeoChrom graphics accelerator. This accelerator significantly improves the performance of operations like texture mapping, image scaling, and rotation. This means that more advanced UIs can be built while still keeping a high frame rate.
The NeoChrom graphics accelerator is currently only available in the STM32U5x9 microcontrollers, found for example on the STM32U599 Discovery Kit.
The NeoChrom accelerator is also known by the name GPU2D in source code and in CubeMX.
NeoChrom and NeoChromVG
The NeoChrom accelerator was updated with extra capabilities with the introduction of STM32U5G9. The improved accelerator is named NeoChromVG. Its extended capabilities allow hardware-accelerated vector graphics.
Microcontroller | Accelerator |
---|---|
STM32U599 | NeoChrom |
STM32U5A9 | NeoChrom |
STM32U5F9 | NeoChromVG |
STM32U5G9 | NeoChromVG |
NeoChrom Graphical Capabilities
The NeoChrom accelerators are capable of performing basic blitting (drawing images), blending, scaling, rotation, and texture mapping. All such operations are automatically used by TouchGFX when running on a microcontroller with NeoChrom.
Compared to the DMA2D graphics accelerator, NeoChrom can accelerate more graphical operations and has a richer feature set:
Graphic feature | DMA2D | GPU2D |
---|---|---|
Supported formats (with TouchGFX) | ARGB8888, RGB888, RGB565, A8, A4, L8 | ARGB8888, RGB888, RGB565, A8, A4, A2, A1 |
Command list based | No | Yes |
Drawing | Rectangles | Rectangles, Pixels, Line, Triangle, Quadrilaterals with multi-sample anti-aliasing (MSAA) |
Blitting | Copy, alpha blending, pixel format conversion | Copy, alpha blending, pixel format conversion, color keying |
Texture Mapping | No | Yes |
Vector Graphics | No | No* |
* Hardware accelerated Vector Graphics is available with NeoChromVG.
With these capabilities available, more TouchGFX widgets are accelerated with NeoChrom than with DMA2D:
Widget | DMA2D | GPU2D |
---|---|---|
Box, BoxWithBorder | Yes | Yes |
Image, AnimatedImage, TiledImage, SnapshotWidget | Yes | Yes |
Button, ButtonWithIcon, ButtonWithLabel, ToggleButton | Yes | Yes |
RadioButton, RepeatButton | Yes | Yes |
PixelDataWidget | Yes | Yes |
TextArea, TextAreaWithWildcard, Keyboard | Partly | Yes |
ScalableImage | No | Yes |
TextureMapper, AnimatedTextureMapper | No | Yes |
Circle, Line, Graph, Gauge | No | No |
SVG | No | No* |
* Hardware accelerated SVG is available with NeoChromVG.
Operations that are not hardware accelerated are rendered in software, which implies a higher CPU load and lower performance. As the table above shows, NeoChrom can accelerate widgets like ScalableImage and TextureMapper. This means that those widgets can be used to a greater extent while maintaining high performance.
Vector Graphics
The new NeoChromVG accelerator can accelerate vector graphics. This capability is used when rendering SVG images with TouchGFX. The graphics accelerator requires an extra buffer called the stencil buffer. This buffer has the same dimensions as the frame buffer, but uses only 1 byte per pixel.
For example, if your frame buffer is 480 x 480 in 24bpp, the stencil buffer must be 480 * 480 = 230,400 bytes. It is important to allocate the stencil buffer in fast SRAM for best performance.
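The sizing rule above can be sketched as a small calculation. This is only an illustration: `stencilBufferBytes` and `frameBufferBytes` are hypothetical helper names, not part of the TouchGFX API, and the TouchGFX Generator normally performs the allocation for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a TouchGFX function).
// The stencil buffer has the same dimensions as the frame buffer,
// but always uses 1 byte per pixel.
std::size_t stencilBufferBytes(std::uint32_t width, std::uint32_t height)
{
    return static_cast<std::size_t>(width) * height;
}

// Hypothetical helper: frame buffer size for comparison.
// bitsPerPixel is expected to be a whole number of bytes (16, 24 or 32).
std::size_t frameBufferBytes(std::uint32_t width, std::uint32_t height,
                             std::uint32_t bitsPerPixel)
{
    assert(bitsPerPixel % 8 == 0);
    return static_cast<std::size_t>(width) * height * (bitsPerPixel / 8);
}
```

For the 480 x 480, 24bpp example, `stencilBufferBytes(480, 480)` gives 230,400 bytes, one third of the 691,200 bytes used by the frame buffer itself.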
The stencil buffer is allocated by the TouchGFX Generator. See this guide.
Rendering Time Improvements with NeoChrom
The following examples illustrate the speed-up provided by NeoChrom over DMA2D and software rendering. We have created two projects with the Designer. The first project shows an Image on a Box background. The second project shows a TextureMapper Widget on a Box background. The widget is redrawn in every frame. In both cases the bitmap is of size 128x128, in ARGB8888 format, and stored in the internal flash. The framebuffer is in RGB565 format.
Both projects have been executed on an STM32F746 Discovery board and an STM32U599 Discovery board.
We have measured the rendering times by connecting a GPIO to a logic analyzer:
The figure above shows the frame rate and rendering time when running on the STM32F746. Channel 00 shows the VSYNC signal. We see that the display runs with a frame interval of 16.9 ms (A1 to A2), corresponding to a frame rate of 59.2 Hz. Channel 01 shows the render time (high when rendering, B1 to B2). The time to render the Image is thus 1.3 ms. Image rendering is fast on the STM32F746.
The figure above is the texture mapper project running on the STM32F746. The rendering time of the TextureMapper is 4.5 ms. The TextureMapper widget is much slower than the Image.
Here is the STM32U599 Discovery kit running the Image project. The STM32U599 Discovery kit display has a display frame interval of 12.26 ms corresponding to a frame rate of 81.6 Hz. The render time of the Image is 0.7 ms. We see that the Image widget is faster than on the STM32F746 kit.
The render time of the TextureMapper is 1.17 ms. The TextureMapper is also faster on the STM32U599 than on the STM32F746.
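The relationship between the measured VSYNC interval, the frame rate, and the share of a frame spent rendering can be sketched as follows. The function names are hypothetical and only illustrate the arithmetic behind the logic analyzer readings.

```cpp
#include <cassert>

// Convert a measured VSYNC interval in milliseconds to a frame rate in Hz.
double frameRateHz(double frameIntervalMs)
{
    assert(frameIntervalMs > 0.0);
    return 1000.0 / frameIntervalMs;
}

// Fraction of one frame interval that is spent rendering.
// A value below 1.0 means the rendering fits within a single frame.
double renderLoad(double renderTimeMs, double frameIntervalMs)
{
    assert(frameIntervalMs > 0.0);
    return renderTimeMs / frameIntervalMs;
}
```

With the readings above, `frameRateHz(16.9)` is roughly 59.2 Hz and `frameRateHz(12.26)` is roughly 81.6 Hz. Rendering the Image on the STM32F746 takes about 8 % of the frame interval (`renderLoad(1.3, 16.9)`).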
Rendering Time Summary
The table below shows the rendering times:
Element | STM32F746 | STM32U599 | Speed up |
---|---|---|---|
Frequency | 200 MHz | 160 MHz | 0.8 |
Image | 1.3 ms | 0.7 ms | ~2x |
TextureMapper | 4.5 ms | 1.17 ms | ~4x |
We see that even with a reduced clock frequency the STM32U599 greatly outperforms the STM32F746, especially with the TextureMapper.
These measurements were taken with the image in internal flash and, for the STM32F746, the framebuffer in external SDRAM. For the STM32U599 the framebuffer is in internal SRAM, as this will be the typical scenario. Moving the image to external flash hurts the STM32F746, as it uses QSPI flash (4-bit bus), whereas the STM32U599 uses a faster OSPI flash (8-bit bus).
The STM32F746 Discovery kit can run with a 480x272 RGB565 framebuffer
in the internal RAM. This improves the performance (Image down to 1.03
ms), but it is not the standard configuration for STM32F746, as it
uses a very large part of the internal SRAM for the frame buffer,
leaving little RAM for other application components.
Running with a single frame buffer is also not suitable for all
applications.
Richer User Interfaces
The improved rendering performance can be used to create user interfaces with
more advanced animations, for example with more scaled or rotated
elements.
For the STM32F746, the frame refresh time was 16.8 ms. This means that
the application must keep the render time below this to keep a frame
rate of 60 fps. We can therefore have at most 3.75 texture mappers of
that complexity (16.8 ms / 4.48 ms) on the screen, or a single larger
texture mapper of size 247 x 247 (the same number of pixels and
approximately the same rendering time).
If we assume the same screen refresh rate, but use the STM32U599 CPU,
we can have 14.36 texture mappers (16.8 ms / 1.17 ms), or a single
texture mapper of size 485 x 485.
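The budget estimate above can be reproduced with a short calculation. This is a sketch under the same assumption the text makes, namely that render time scales linearly with the number of pixels drawn. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Number of widgets with a given render time that fit in one frame interval.
double widgetsPerFrame(double frameIntervalMs, double renderTimeMs)
{
    assert(frameIntervalMs > 0.0 && renderTimeMs > 0.0);
    return frameIntervalMs / renderTimeMs;
}

// Side length of one square texture covering the same number of pixels
// as 'widgetCount' square textures of side 'baseSide'.
// Assumes render time is proportional to the pixel count.
double equivalentSquareSide(double baseSide, double widgetCount)
{
    assert(baseSide > 0.0 && widgetCount >= 0.0);
    return baseSide * std::sqrt(widgetCount);
}
```

For the STM32F746, `widgetsPerFrame(16.8, 4.48)` gives 3.75, and `equivalentSquareSide(128.0, 3.75)` gives about 247.9, which the text truncates to 247. For the STM32U599, `widgetsPerFrame(16.8, 1.17)` gives about 14.36, corresponding to a side length of about 485.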
The following figure shows two applications running on respectively the STM32F746 and the STM32U599. The idea is to make a carousel-like menu where elements are scaled up when going to the center and scaled down when going out (here we just use the same texture for all the elements).
The STM32F746 is capable of showing three icons, one big icon, scaled up by a factor of 1.9, and two smaller icons. The STM32U599 is capable of showing 7 icons. The largest icon is scaled up by factor 2.7.
The rendering time of the application with 3 icons on the STM32F746 is 14.38 ms. The rendering time of the application with 7 icons on the STM32U599 is 14.93 ms. Both UIs can thus run at 60 fps, with the STM32U599 showing much more content at a larger scale.
Accelerated Vector Graphics
The new NeoChromVG accelerator is capable of accelerating vector graphics. This opens up the possibility of a new class of applications, where vector-based graphics, rather than bitmaps, play the central role.
One example is a map-application. Maps can be built from bitmaps, but that often requires a very large storage or that specific map sections are downloaded in advance.
The video below shows a demonstration application running on an STM32U5F9. The application zooms, rotates, and pans a map that is drawn from a vector definition (multiple polygons that are filled with different colors and stroked). The application runs full screen on an 800 x 480 display with 16bpp colors.
STM32U5F9 showing a moving map.
Coordinate limitations
Vector graphics with coordinates above 1024 are by default discarded on NeoChrom and NeoChromVG. See here for a work-around. Note that this happens both when an SVG image is scaled up and when the SVG itself contains large coordinates.
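The limit can be checked ahead of time with a calculation like the one below. This is a sketch only: the helpers are hypothetical, and treating the limit as a bound on the absolute value of the scaled coordinate is an assumption, since the text does not state which coordinate space the hardware checks.

```cpp
#include <cassert>
#include <cmath>

// Limit quoted in the text above.
const float kMaxVectorCoordinate = 1024.0f;

// Hypothetical check: true if a coordinate would exceed the limit after scaling.
// Assumption: the limit applies to the absolute value of the scaled coordinate.
bool exceedsCoordinateLimit(float coordinate, float scale)
{
    return std::fabs(coordinate * scale) > kMaxVectorCoordinate;
}

// Largest uniform scale that keeps an SVG within the limit, given the
// largest absolute coordinate used in the SVG.
float maxSafeScale(float maxAbsCoordinate)
{
    assert(maxAbsCoordinate > 0.0f);
    return kMaxVectorCoordinate / maxAbsCoordinate;
}
```

Under these assumptions, an SVG whose largest coordinate is 512 can be scaled by at most a factor of 2 before parts of it risk being discarded.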
Creating a project
CubeMX and the TouchGFX Generator support NeoChrom. In CubeMX the
accelerator is known by its code name GPU2D. The GPU2D accelerator is
only available to TouchGFX if GPU2D is enabled in the TouchGFX
configuration in CubeMX.
If you use the STM32U599 TBS (template project) provided with the
TouchGFX Designer this is already done. If you make your own custom
project, make sure to enable the GPU2D Accelerator as shown below:
After enabling GPU2D, press "Generate Code" in CubeMX. This regenerates the target configuration code. Now open the project in the TouchGFX Designer and generate code there as well (F4).
The Designer generates assets (images, fonts, and texts) and simulator code that match the target configuration. You are now ready to compile the code with IAR.
If you are starting a project from the Designer there is no need to open CubeMX unless you need to change some hardware settings.
Supported IDEs
The STM32U599 TBS (version 3.0.0) is currently only supported with IAR Embedded Workbench.
A recent version of IAR is required (8.5x.x). Make sure that the processor variant (STM32U599NJ for the Discovery board) is supported by checking the General Options:
Framebuffer Formats
The STM32U599 discovery board supports three framebuffer formats: RGB565, RGB888, ARGB8888. These are configurable from CubeMX.
NeoChrom Limitations
The NeoChrom graphics accelerator in STM32U599 does not support the L8
image formats (L8_RGB565, L8_RGB888, L8_ARGB8888).
If you use these image formats in a TouchGFX application running on
STM32U599 the images will be drawn using DMA2D. If you use these
formats with ScalableImage or TextureMapper a software fallback will
be used.
It is therefore recommended not to use L8 images with STM32U599.
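The fallback behavior described above can be summarized as a small decision function. This is a model of the text, not TouchGFX code: the enums and `renderPathOnU599` are hypothetical names introduced for illustration, and only the widgets and formats mentioned in this section are covered.

```cpp
#include <cassert>

// Hypothetical types used only to model the rules described in the text.
enum class ImageFormat { RGB565, RGB888, ARGB8888, L8_RGB565, L8_RGB888, L8_ARGB8888 };
enum class WidgetKind { Image, ScalableImage, TextureMapper };
enum class RenderPath { NeoChrom, DMA2D, Software };

// True for the palette-based L8 formats that NeoChrom on STM32U599 does not support.
bool isL8(ImageFormat format)
{
    return format == ImageFormat::L8_RGB565 ||
           format == ImageFormat::L8_RGB888 ||
           format == ImageFormat::L8_ARGB8888;
}

// Which renderer draws the bitmap, according to the rules above.
RenderPath renderPathOnU599(WidgetKind widget, ImageFormat format)
{
    if (!isL8(format))
    {
        return RenderPath::NeoChrom; // Supported formats are accelerated by NeoChrom
    }
    if (widget == WidgetKind::Image)
    {
        return RenderPath::DMA2D;    // Plain L8 images are drawn using DMA2D
    }
    return RenderPath::Software;     // Scaled or texture-mapped L8 images fall back to software
}
```

The last case is the costly one: an L8 bitmap in a ScalableImage or TextureMapper loses hardware acceleration entirely, which is why converting such images to a non-L8 format is preferable on this microcontroller.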