
Achieving Better Performance with CacheableContainer

In this section you will see how to achieve better performance in some animation scenarios by using RAM to save some reusable drawings.

When you move widgets in your application (like Image or TextArea), either through dragging or animation, TouchGFX needs to redraw these widgets in their new positions in every frame. In most cases it must also redraw the part of the background that was previously covered by these widgets.

When these widgets are computationally complex, such as the TextureMapper widget, Shapes, or large transparent Images, it is hard for the MCU to render them efficiently, as they are rendered without hardware acceleration. The result is a screen redraw that takes many milliseconds and degrades the performance of the application.

In this section we will see how to use the CacheableContainer to speed up animations that involve computationally complex elements by avoiding costly redrawing. While the measurements in this article were performed on an STM32F429Discovery board, the CacheableContainer technique applies generally to other hardware platforms. Some available RAM is required to create a bitmap cache.

Further reading
Read also about Dynamic Bitmaps.

Performance Impact

Due to the performance implications of moving computationally expensive widgets with the MCU, an animation that evolves in many small steps will appear slow and sluggish due to a high render time for each frame. Programming the animation to complete faster (in time) will cause individual steps to be large, and the animation will not appear smooth to the user.

The following is an example running on an STM32F429-DISCO board (240x320), where a fullscreen Container is moved up vertically, while a similar Container is moved in from the bottom.

In the video below, the ToggleButton switches between CacheableContainer being enabled and disabled. The performance difference is clearly visible.

The two Containers that are moved each consist of a background Box, a TextArea, and a TextureMapper. The TextureMapper is configured to use the bilinear rendering algorithm and a global alpha of 174, making it very expensive to draw. The rendering time for the whole screen is around 100 ms on the STM32F429-DISCO board.

Test Application

In order to move the two elements relative to each other, they are put in a parent Container named masterContainer which is given twice the height of either child Container, giving it a size of 240 x 640 (2*320). By declaring the container as a move animator in TouchGFX Designer, it will be able to receive application ticks and animate over time during which performance can be measured.

CacheableContainer test application overview

The upper container named container1 is placed at position x=0, y=0. The lower container named container2 is placed at position x=0, y=320 directly below container1 in the parent masterContainer.

Since container1 and container2 are placed in the masterContainer, the two elements will move together when we move the masterContainer. For example, if we move the masterContainer to position x=0, y=-320, container1 will be invisible, but container2 will be fully visible. The animation between these two states can be created using an interaction in TouchGFX Designer.

The code below moves the masterContainer up if it is down, and down if it is already up. For simplicity, the code is placed in the handleClickEvent event handler of the view, so it is executed whenever the user touches the screen anywhere below the ToggleButton:

Screen1View.cpp
void Screen1View::handleClickEvent(const ClickEvent& evt)
{
    // Forward event to base View (for the ToggleButton to work)
    View::handleClickEvent(evt);

    // If touch is released and y > 50 (below the ToggleButton), move masterContainer
    if (evt.getType() == ClickEvent::RELEASED && evt.getY() > 50)
    {
        const int endPosition = masterContainer.getY() >= 0 ? -320 : 0;
        masterContainer.startMoveAnimation(masterContainer.getX(), endPosition,
                                           20 /* ticks */,
                                           EasingEquations::cubicEaseInOut,
                                           EasingEquations::cubicEaseInOut);
    }
}
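To get a feel for how the 20-tick move distributes its 320 pixels, the sketch below evaluates a textbook cubic ease-in-out curve. It is an illustration only: `cubicEaseInOutSketch` and `yAtTick` are hypothetical helper names, the formula is the commonly used Penner form in floating point, and it is not taken from the TouchGFX sources, whose `EasingEquations::cubicEaseInOut` may round differently.

```cpp
#include <cassert>
#include <cmath>

// Textbook cubic ease-in-out (assumed formula, NOT TouchGFX's implementation).
// t: current tick, b: start value, c: total change, d: duration in ticks.
double cubicEaseInOutSketch(double t, double b, double c, double d)
{
    assert(d > 0);
    t /= d / 2.0;
    if (t < 1.0)
    {
        return c / 2.0 * t * t * t + b;       // accelerating first half
    }
    t -= 2.0;
    return c / 2.0 * (t * t * t + 2.0) + b;   // decelerating second half
}

// Hypothetical helper: rounded Y position at a given tick when moving
// from startY to endY over totalTicks ticks.
int yAtTick(int tick, int startY, int endY, int totalTicks)
{
    return static_cast<int>(std::lround(
        cubicEaseInOutSketch(tick, startY, endY - startY, totalTicks)));
}
```

For a move from y=0 to y=-320 over 20 ticks, this curve reaches y=-160 at tick 10. The steps are small near the start and end and largest around the middle, which is where a slow renderer is most noticeable.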

Performance of Redrawing Complex Containers

As mentioned, the render time for one frame is around 100 ms when the MCU has to redraw the expensive TextureMapper at each small step of the animation. This gives us 10 frames per second (fps). The whole animation is 20 frames and will therefore take around two seconds.

On the STM32F429-DISCO evaluation kit, the rendering time is available as a digital signal on GPIO G14. The VSYNC signal is available on G13. The GPIO configuration is set up in the GPIO.cpp file.

The following image is a measurement of VSYNC and RENDER_TIME for the application when moving the masterContainer upwards:

Saleae Logic Software vsync and render time measurement

The rendering time is the first signal (active low). We can see that the rendering time for the first frame in the move animation is 99.29 ms.

The lower signal is the VSYNC, which transitions high to low on every frame when pixels are clocked out to the display. We can see on the measurement above that drawing a single frame covers the time for 7 frames on the display. On the 8th VSYNC signal the rendering of the next frame starts. During the rendering, the display is repeatedly showing the previously drawn frame (in the other framebuffer).

Improving Performance Through Caching

We can improve the performance of the above move animation by caching the rendering of the container to memory. After that, we can simply copy the pixels from that memory to the framebuffer (using DMA), rather than redrawing a complex widget with the MCU. Even if an application could achieve 60 frames per second using the MCU alone, the MCU would be busy (perhaps at 100% load) making the same calculations repeatedly rather than doing something more important.

This "in-memory-image" of the Container can now be shown on the screen at different places, instead of re-rendering the Container.

The first thing to do is to enable caching through TouchGFX Designer by checking the Cacheable property on the two Containers container1 and container2:

CacheableContainer option on Container widget

The next step is to create two dynamic bitmaps in RAM that the Containers can be cached into.

Decide on an address in RAM where the bitmap cache should be located. In this particular example, we placed it in SDRAM (starts at address 0xd0000000 on an STM32F429) just after the framebuffers.

For the Windows simulator, the cache is allocated in a global variable:

Screen1View.hpp
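#ifdef SIMULATOR
// Simulator: back the cache with an 8 MB global array instead of real SDRAM
uint32_t sdramBuffer[8*1024*1024/4];
uint16_t* sdram = (uint16_t*)sdramBuffer;
#else
// Target: start of SDRAM, skipping past the framebuffers (320*240*2*2 bytes)
uint16_t* sdram = (uint16_t*)(0xd0000000 + 320*240*2*2);
#endif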
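The target address above can be checked with a small compile-time calculation. This sketch assumes the factors in `320*240*2*2` mean 2 bytes per pixel and 2 framebuffers placed at the start of SDRAM. The constant and function names are made up for illustration.

```cpp
#include <cstdint>

// Assumed layout: two 16-bit 240x320 framebuffers at the start of SDRAM.
constexpr uint32_t kSdramBase      = 0xD0000000u;
constexpr uint32_t kDisplayWidth   = 240;
constexpr uint32_t kDisplayHeight  = 320;
constexpr uint32_t kBytesPerPixel  = 2;   // assumption: 16 bits per pixel
constexpr uint32_t kFramebuffers   = 2;   // assumption: double buffering

// Size in bytes of a single framebuffer.
constexpr uint32_t framebufferBytes()
{
    return kDisplayWidth * kDisplayHeight * kBytesPerPixel;
}

// First address after the framebuffers, where the bitmap cache can start.
constexpr uint32_t cacheStartAddress()
{
    return kSdramBase + framebufferBytes() * kFramebuffers;
}

static_assert(framebufferBytes() == 153600u, "one framebuffer is 150 KB");
static_assert(cacheStartAddress() == 0xD004B000u, "cache starts 300 KB into SDRAM");
```

Under these assumptions the cache begins at 0xD004B000. If your board uses a different framebuffer format, buffer count, or location, the offset has to be adjusted accordingly.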

Initialize the bitmap cache and create two dynamic bitmaps for caching:

Screen1View.cpp
//Create bitmap cache and two dynamic bitmaps for caching, each bitmap is 150 KB
Bitmap::setCache(sdram, 320*1024, 2); //320 KB cache
dynamicBitmap1 = Bitmap::dynamicBitmapCreate(240, 320, Bitmap::RGB565);
dynamicBitmap2 = Bitmap::dynamicBitmapCreate(240, 320, Bitmap::RGB565);
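The sizes used above follow from the bitmap dimensions. The following sketch works through the arithmetic for two 240x320 RGB565 bitmaps in a 320 KB cache. The names are illustrative, and the sketch does not model any bookkeeping data the bitmap cache itself may need.

```cpp
#include <cstdint>

// Pixel data size of a bitmap with the given dimensions and bytes per pixel.
constexpr uint32_t bitmapBytes(uint32_t width, uint32_t height, uint32_t bytesPerPixel)
{
    return width * height * bytesPerPixel;
}

constexpr uint32_t kOneBitmap   = bitmapBytes(240, 320, 2); // RGB565: 2 bytes per pixel
constexpr uint32_t kPixelData   = 2 * kOneBitmap;           // two cached containers
constexpr uint32_t kCacheBytes  = 320 * 1024;               // size passed to Bitmap::setCache
constexpr uint32_t kHeadroom    = kCacheBytes - kPixelData; // left over for anything else

static_assert(kOneBitmap == 150 * 1024, "each bitmap is 150 KB");
static_assert(kPixelData == 300 * 1024, "two bitmaps need 300 KB of pixel data");
static_assert(kHeadroom == 20 * 1024, "20 KB remain in the 320 KB cache");
```

The 20 KB of headroom suggests the cache was deliberately sized a little larger than the raw pixel data. In application code it is worth checking the ids returned by `Bitmap::dynamicBitmapCreate` against `BITMAP_INVALID`, since creation fails when the cache is too small.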

Assign the dynamic bitmaps to the Containers and set them in caching mode:

Screen1View.cpp
//Assign the bitmaps to the CacheableContainers
container1.setCacheBitmap(dynamicBitmap1);
container2.setCacheBitmap(dynamicBitmap2);

//Enable caching
container1.enableCachedMode(true);
container2.enableCachedMode(true);

//Finally update the cached bitmaps
container1.updateCache();
container2.updateCache();

Calls to CacheableContainer::updateCache() render the two Containers into their respective bitmaps. Call this method whenever the cached image must reflect a new state of the container, for example after changing a child widget. TouchGFX does not do this automatically; it must be handled in application code by the developer.
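One possible way to organize these updates is a dirty flag: the application marks the container as changed and refreshes the cache at most once per tick. The sketch below models only that bookkeeping with a stand-in class. `CachedPanelSketch`, `markContentChanged`, `refreshIfNeeded`, and `rendersDuringAnimation` are hypothetical names and are not part of TouchGFX; in a real view, the marked line would call `updateCache()` on the container.

```cpp
#include <cassert>

// Hypothetical stand-in for a cached container; NOT a TouchGFX class.
class CachedPanelSketch
{
public:
    // The application changed something inside the container (e.g. new text).
    void markContentChanged() { dirty = true; }

    // Call once per tick before the container is moved.
    // Returns true when the expensive re-render was performed.
    bool refreshIfNeeded()
    {
        if (!dirty)
        {
            return false;
        }
        ++renderCount;   // real code would call container.updateCache() here
        dirty = false;
        return true;
    }

    int renders() const { return renderCount; }

private:
    bool dirty = true;   // the cache starts empty, so the first refresh renders
    int renderCount = 0;
};

// Simulates an animation of `ticks` steps whose content changes only at
// `changeTick` (pass -1 for "never") and returns the number of re-renders.
int rendersDuringAnimation(int ticks, int changeTick)
{
    assert(ticks >= 0);
    CachedPanelSketch panel;
    for (int tick = 0; tick < ticks; ++tick)
    {
        if (tick == changeTick)
        {
            panel.markContentChanged();
        }
        panel.refreshIfNeeded();
    }
    return panel.renders();
}
```

In this model a 20-tick animation with one content change in the middle costs two expensive renders instead of twenty. The remaining ticks only move the cached bitmap.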

With caching enabled for container1 and container2, performance measurements now show the render time dropping from ~99 ms to ~5 ms, an improvement of roughly a factor of 20. Each frame is now rendered well within one display frame at 60 frames per second, so the entire animation completes within 20 display frames.
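The numbers in this article can be related to each other with a few lines of arithmetic. The sketch below uses the rounded figures from the text (about 99 ms before, about 5 ms after, 20 animation frames) and assumes a 60 Hz display refresh. The function names are illustrative only.

```cpp
#include <cassert>

// Ratio between the render time before and after caching.
double speedup(double beforeMs, double afterMs)
{
    assert(afterMs > 0.0);
    return beforeMs / afterMs;
}

// True if one frame can be rendered within a single display refresh period.
bool fitsFrameBudget(double renderMs, double refreshHz)
{
    assert(refreshHz > 0.0);
    return renderMs <= 1000.0 / refreshHz;
}

// Total animation time when every frame takes frameIntervalMs.
double animationMs(int frames, double frameIntervalMs)
{
    return frames * frameIntervalMs;
}
```

With these inputs, `speedup(99.0, 5.0)` is 19.8, a 5 ms frame fits the roughly 16.7 ms budget of a 60 Hz display while a 99 ms frame does not, and `animationMs(20, 100.0)` gives the two seconds observed without caching. At one frame per 60 Hz refresh, the same 20 frames take about a third of a second.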

Saleae Logic Software vsync and render time measurement

Conclusion

Using CacheableContainer with a dynamic bitmap when animating (frequent moves) can improve the render time dramatically when the subject is computationally complex and does not change between animation steps. If the cached content must change (e.g. a watch face when the time is updated), the application can recompute the cache at points of its own choosing during the animation.