TouchGFX on NeoChrom/NeoChromVG
This section discusses how to use TouchGFX on hardware having the NeoChrom graphics accelerator. This graphics accelerator improves the performance significantly for operations like texture mapping, image scaling and rotation. This means that more advanced UIs can be built while still keeping a high frame rate.
The NeoChrom graphics accelerator is currently only available in the STM32U5x9 and STM32H7R/S microcontrollers. Development boards are available for both families.
The NeoChrom accelerator is known by the name GPU2D in source code and in CubeMX.
NeoChrom and NeoChromVG
The NeoChrom accelerator was updated with extra capabilities with the introduction of the STM32U5G9. The improved accelerator is named NeoChromVG. Its extended capabilities allow hardware-accelerated vector graphics.
Microcontroller | Accelerator |
---|---|
STM32U599 | NeoChrom |
STM32U5A9 | NeoChrom |
STM32U5F9 | NeoChromVG |
STM32U5G9 | NeoChromVG |
STM32H7Rx | NeoChrom |
STM32H7Sx | NeoChrom |
NeoChrom Graphical Capabilities
The NeoChrom accelerators are capable of performing basic blitting (drawing images), blending, scaling, rotation, and texture mapping. All such operations are automatically used by TouchGFX when running on a microcontroller with NeoChrom.
Compared to the DMA2D graphics accelerator, NeoChrom can accelerate more graphical operations and has a richer feature set:
Graphic feature | DMA2D | GPU2D |
---|---|---|
Supported formats (with TouchGFX) | ARGB8888, RGB888, RGB565, A8, A4, L8 | ARGB8888, RGB888, RGB565, A8, A4, A2, A1 |
Command list based | No | Yes |
Drawing | Rectangles | Rectangles, pixels, lines, triangles, and quadrilaterals with multi-sample anti-aliasing (MSAA) |
Blitting | Copy, alpha blending, pixel format conversion | Copy, alpha blending, pixel format conversion, color keying |
Texture Mapping | No | Yes |
Vector Graphics | No | No* |
* Hardware accelerated Vector Graphics is available with NeoChromVG.
With these capabilities available, more TouchGFX widgets are accelerated with NeoChrom than with DMA2D:
Widget | DMA2D | GPU2D |
---|---|---|
Box, BoxWithBorder | Yes | Yes |
Image, AnimatedImage, TiledImage, SnapshotWidget | Yes | Yes |
Button, ButtonWithIcon, ButtonWithLabel, ToggleButton | Yes | Yes |
RadioButton, RepeatButton | Yes | Yes |
PixelDataWidget | Yes | Yes |
TextArea, TextAreaWithWildcard, Keyboard | Partly | Yes |
ScalableImage | No | Yes |
TextureMapper, AnimatedTextureMapper | No | Yes |
Circle, Line, Graph, Gauge | No* | No* |
SVG | No | No** |
* The drawing/blending of pixels to the framebuffer is done by DMA2D, but the shape calculations are done in software.
** Hardware accelerated SVG is available with NeoChromVG.
The operations that are not hardware accelerated are software rendered (implying a higher CPU-load, and lower performance). As the table above shows, NeoChrom can accelerate widgets like ScalableImage and TextureMapper. This means that we can use those widgets to a greater extent while keeping a high performance.
Vector Graphics
The new NeoChromVG accelerator can accelerate vector graphics. This capability is used when rendering SVG images with TouchGFX. The GPU2D graphics accelerator requires an extra buffer called the stencil buffer. This buffer has the same dimensions as the frame buffer, but uses only 1 byte per pixel.
For example, if your frame buffer is 480 x 480 in 24bpp, the stencil buffer must be 480 * 480 = 230,400 bytes. It is important to allocate the stencil buffer in fast SRAM for best performance.
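The sizing rule above can be checked at compile time. The following is a minimal sketch, not code produced by the TouchGFX Generator; the function name `stencilBufferBytes` is hypothetical and only illustrates the arithmetic (one byte per frame buffer pixel, independent of the frame buffer color depth).

```cpp
#include <cstddef>

// Hypothetical helper (not part of TouchGFX): the stencil buffer has the
// same width and height as the frame buffer and uses 1 byte per pixel,
// regardless of the frame buffer's own pixel format.
constexpr std::size_t stencilBufferBytes(std::size_t width, std::size_t height)
{
    return width * height;
}

// The 480 x 480 example from the text: 230,400 bytes.
static_assert(stencilBufferBytes(480, 480) == 230400,
              "480 x 480 stencil buffer should be 230,400 bytes");
```

A check like this can help when budgeting fast SRAM, since the stencil buffer comes on top of the frame buffer itself.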
The stencil buffer is allocated by the TouchGFX Generator. See this guide.
Rendering Time Improvements with NeoChrom
The following examples illustrate the speed-up provided by NeoChrom over DMA2D and software rendering. We have created two projects with the Designer. The first project shows an Image on a Box background. The second project shows a TextureMapper Widget on a Box background. The widget is redrawn in every frame. In both cases the bitmap is of size 128x128, in ARGB8888 format, and stored in the internal flash. The framebuffer is in RGB565 format.
Both projects were run on an STM32F746 Discovery board and an STM32U5A9 Discovery board.
We have measured the rendering times by connecting a GPIO to a logic analyzer:
The figure above shows the frame rate and rendering time when running on the STM32F746. Channel 00 shows the VSYNC signal. We see that the display runs with a frame interval of 16.9 ms (A1 to A2) corresponding to a frame rate of 59.2 Hz. Channel 01 shows the render time (high when rendering, B1 to B2). The time to render the Image is thus 1.3 ms. Image rendering is fast on the STM32F746.
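A pin that is high while rendering, as on channel 01, can be driven with a small scope guard. This is only a sketch of the measurement idea: `PinWriter` and `RenderTimeProbe` are hypothetical names, and the function that actually writes the GPIO is assumed to be supplied by the board code.

```cpp
// Hypothetical hook: on real hardware this would set or clear the GPIO
// output that is connected to the logic analyzer.
using PinWriter = void (*)(bool level);

// Scope guard that holds the pin high for as long as it is alive.
// Create it at the start of rendering; it drops the pin when rendering ends.
class RenderTimeProbe
{
public:
    explicit RenderTimeProbe(PinWriter writePin) : writePin_(writePin)
    {
        writePin_(true); // pin high: rendering starts (B1 on the analyzer)
    }

    ~RenderTimeProbe()
    {
        writePin_(false); // pin low: rendering finished (B2 on the analyzer)
    }

    RenderTimeProbe(const RenderTimeProbe&) = delete;
    RenderTimeProbe& operator=(const RenderTimeProbe&) = delete;

private:
    PinWriter writePin_;
};
```

The render time is then the width of the high pulse on the analyzer. Using a destructor to clear the pin means that early returns in the render path still end the pulse.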
The figure above is the texture mapper project running on the STM32F746. The rendering time of the TextureMapper is 4.5 ms. The TextureMapper widget is much slower than the Image.
Here is the STM32U5A9 Discovery kit running the Image project. The STM32U5A9 Discovery kit display has a display frame interval of 12.26 ms corresponding to a frame rate of 81.6 Hz. The render time of the Image is 0.7 ms. We see that the Image widget is faster than on the STM32F746 kit.
The render time of the TextureMapper is 1.17 ms. The TextureMapper is also faster on the STM32U5A9 than on the STM32F746.
Rendering Time Summary
The table below shows the rendering times:
Element | STM32F746 | STM32U5A9 | Speed up |
---|---|---|---|
Frequency | 200 MHz | 160 MHz | 0.8x |
Image | 1.3 ms | 0.7 ms | ~2x |
TextureMapper | 4.5 ms | 1.17 ms | ~4x |
We see that even with a reduced clock frequency the STM32U5A9 greatly outperforms the STM32F746, especially with the TextureMapper.
These measurements were taken with the image in internal flash. For the STM32F746 the framebuffer is in external SDRAM. For the STM32U5A9 the framebuffer is in internal SRAM (as this will be the typical scenario). Moving the image to external flash hurts the STM32F746 more, as it uses QSPI flash (4-bit bus), whereas the STM32U5A9 uses a faster OSPI flash (8-bit bus).
The STM32F746 Discovery kit can run with a 480x272 RGB565 framebuffer in the internal RAM. This improves the performance (Image down to 1.03 ms), but it is not the standard configuration for STM32F746, as it uses a very large part of the internal SRAM for the frame buffer, leaving little RAM for other application components. Running with a single frame buffer is also not suitable for all applications.
Richer User Interfaces
The improved rendering performance can be used to create user interfaces with more advanced animations, for example with more scaled or rotated elements. For the STM32F746, the display refresh time was 16.8 ms. This means that the application must keep the render time below this value to maintain a frame rate of 60 fps. We can therefore have at most 3.75 texture mappers of that complexity (16.8 ms / 4.48 ms) on the screen, or a single larger texture mapper of size 247 x 247 (the same number of pixels and approximately the same rendering time). If we assume the same screen refresh rate, but use the STM32U5A9, we can have 14.36 texture mappers (16.8 ms / 1.17 ms), or a single texture mapper of size 485 x 485.
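The budget arithmetic above can be reproduced with a few lines of code. This is a rough sketch that assumes, as the text does, that render time scales with the number of pixels drawn; the function names are hypothetical and the numbers are the measurements quoted above.

```cpp
#include <cmath>

// Number of widgets with the given render time that fit into one frame.
constexpr double widgetsPerFrame(double frameIntervalMs, double renderTimeMs)
{
    return frameIntervalMs / renderTimeMs;
}

// Side length of one square texture that covers the same number of pixels
// as `count` square textures of side `side`:
// side^2 * count = result^2, so result = side * sqrt(count).
inline double equivalentSquareSide(double side, double count)
{
    return side * std::sqrt(count);
}
```

With a 16.8 ms frame and the 128x128 texture, `widgetsPerFrame(16.8, 4.48)` gives 3.75 and an equivalent side of about 247.9 pixels for the STM32F746, while `widgetsPerFrame(16.8, 1.17)` gives about 14.36 and an equivalent side of about 485.0 pixels for the STM32U5A9. Treat these as upper bounds, since a real frame also spends time on other widgets.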
The following figure shows two applications running on respectively the STM32F746 and the STM32U5A9. The idea is to make a carousel-like menu where elements are scaled up when going to the center and scaled down when going out (here we just use the same texture for all the elements).
STM32F746 (left) with a 480x272 pixels display and STM32U5A9 (right) running the texture mapper based carousel project on a 480x480 display.
The STM32F746 is capable of showing three icons: one big icon, scaled up by a factor of 1.9, and two smaller icons. The STM32U5A9 is capable of showing seven icons. The largest icon is scaled up by a factor of 2.7.
The rendering time of the application with 3 icons on the STM32F746 is 14.38 ms. The rendering time of the application with 7 icons on the STM32U5A9 is 14.93 ms. Both UIs can thus run at 60 fps, with the STM32U5A9 showing much more content at a higher resolution.
Accelerated Vector Graphics
The new NeoChromVG accelerator is capable of accelerating vector graphics. This opens the possibility of a new class of applications, where vector-based graphics, rather than bitmaps, play the central role.
One example is a map-application. Maps can be built from bitmaps, but that often requires a very large storage or that specific map sections are downloaded in advance.
The video below shows a demonstration application running on an STM32U5G9. The application zooms, rotates, and pans a map that is drawn from a vector definition (multiple polygons that are filled with different colors and stroked). The application runs full screen on an 800 x 480 display with 16bpp colors.
STM32U5G9 showing a moving map.
SVG
The NeoChromVG accelerator drastically improves the performance of drawing SVG images. TouchGFX automatically leverages the available hardware. A simple example shows the improvement. Here is an SVG image that we will draw in size 152x152 pixels on an STM32F746 with software rendering, an STM32H7S with NeoChrom, and an STM32U5G9 with NeoChromVG:
The render time of the SVG image is shown in the table below:
Microcontroller | Accelerator | Render time (ms) |
---|---|---|
STM32F746 | None (software rendering) | 4.12 |
STM32H7S8 | NeoChrom | 2.8 |
STM32U5G9 | NeoChromVG | 0.97 |
Read more about using SVG in TouchGFX here: SVG.
Vector fonts
NeoChrom and NeoChromVG also accelerate the drawing of vector fonts. You can read more about how to use vector fonts in this article: Vector fonts.
Creating a project
CubeMX and the TouchGFX Generator support NeoChrom. In CubeMX the accelerator is known by its code name GPU2D. The GPU2D accelerator is only available to TouchGFX if GPU2D is enabled in the TouchGFX configuration in CubeMX. If you use any of the TouchGFX TBSs (template projects) provided with the TouchGFX Designer, this is already done. If you make your own custom project, make sure to enable the GPU2D Accelerator as shown below:
After enabling GPU2D, press "Generate Code" in CubeMX. This regenerates the target configuration code. Now open the project in the TouchGFX Designer and generate code there as well (F4).
The Designer generates assets (images, fonts, and texts) and simulator code that match the target configuration. You are now ready to compile the code.
If you are starting a project from the Designer there is no need to open CubeMX unless you need to change some hardware settings.
Framebuffer Formats
The STM32U5A9 discovery board supports three framebuffer formats: RGB565, RGB888, ARGB8888. These are configurable from CubeMX.
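The choice of framebuffer format directly sets the memory needed per framebuffer. The sketch below illustrates this under the assumption of the 480x480 display mentioned earlier; the enum and function names are hypothetical and are not TouchGFX or CubeMX identifiers.

```cpp
#include <cstddef>

// Hypothetical names for the three framebuffer formats listed above.
enum class FramebufferFormat { RGB565, RGB888, ARGB8888 };

// Bytes used by one pixel in each format: 16, 24 and 32 bits per pixel.
constexpr std::size_t bytesPerPixel(FramebufferFormat format)
{
    return format == FramebufferFormat::RGB565 ? 2
         : format == FramebufferFormat::RGB888 ? 3
         : 4;
}

// Memory needed for a single framebuffer of the given size and format.
constexpr std::size_t framebufferBytes(std::size_t width, std::size_t height,
                                       FramebufferFormat format)
{
    return width * height * bytesPerPixel(format);
}

static_assert(framebufferBytes(480, 480, FramebufferFormat::RGB565) == 460800,
              "480 x 480 RGB565 framebuffer should be 460,800 bytes");
```

For the same 480x480 size, RGB888 needs 691,200 bytes and ARGB8888 needs 921,600 bytes per framebuffer, so the format choice matters when the framebuffer is placed in internal SRAM.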
NeoChrom Limitations
The NeoChrom and NeoChromVG graphics accelerators do not support the L8 image formats (L8_RGB565, L8_RGB888, L8_ARGB8888) with TouchGFX. If you use these image formats in a TouchGFX application running on STM32U5A9, the images will be drawn using DMA2D. If you use these formats with ScalableImage or TextureMapper, a software fall-back will be used.
It is therefore recommended not to use L8 images on microcontrollers with NeoChrom/NeoChromVG accelerators.
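The fall-back rules above can be summarized as a small decision function. This is a model of the behavior described in this section for a device such as the STM32U5A9, not TouchGFX code; all type and function names below are hypothetical.

```cpp
// Hypothetical model types; these are not TouchGFX identifiers.
enum class ImageFormat { RGB565, RGB888, ARGB8888, L8_RGB565, L8_RGB888, L8_ARGB8888 };
enum class WidgetKind { Image, ScalableImage, TextureMapper };
enum class RenderPath { NeoChrom, DMA2D, Software };

// True for the palette-based L8 formats that NeoChrom does not handle.
constexpr bool isL8(ImageFormat format)
{
    return format == ImageFormat::L8_RGB565
        || format == ImageFormat::L8_RGB888
        || format == ImageFormat::L8_ARGB8888;
}

// Which renderer draws the bitmap, following the rules stated above:
// non-L8 formats go to NeoChrom, plain L8 images fall back to DMA2D,
// and scaled or texture-mapped L8 images fall back to software.
constexpr RenderPath renderPathFor(WidgetKind widget, ImageFormat format)
{
    return !isL8(format) ? RenderPath::NeoChrom
         : widget == WidgetKind::Image ? RenderPath::DMA2D
         : RenderPath::Software;
}

static_assert(renderPathFor(WidgetKind::TextureMapper, ImageFormat::L8_RGB565)
                  == RenderPath::Software,
              "L8 textures in a TextureMapper use the software fall-back");
```

The software path is the costly one, which is why L8 images are best avoided for ScalableImage and TextureMapper on these devices.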
Compared to NeoChromVG, the NeoChrom graphics accelerator produces suboptimal anti-aliasing on graphics elements drawn with the non-zero fill rule. This can happen with SVG files, which can specify the fill-rule as "nonzero". A work-around is to use the "evenodd" fill-rule, but it does not give the correct result for all drawings.