Achieving Better Performance with Cacheable Container
In this section you will see how to achieve better performance in some animation scenarios by using RAM to save some reusable drawings.
When moving widgets in your application (like Image or TextArea), either through dragging or animation, TouchGFX needs to redraw these widgets in their new positions in every frame and, in most cases, also redraw the part of the background that was previously covered by these widgets.
When these widgets are computationally complex, such as the Texture Mapper widget, Shapes, and large transparent Images, it is hard for the MCU to render them efficiently, as they are rendered without hardware acceleration. This results in a screen redraw that takes many milliseconds and impacts the performance of the application.
In this section we will see how to use the Cacheable Container to speed up animations that involve computationally complex elements by avoiding costly redrawing. While measurements in this article were performed using an STM32F429Discovery board, the Cacheable Container technique applies generally to other hardware platforms. Some available RAM is required for creation of a bitmap cache.
Performance Impact
Due to the performance implications of moving computationally expensive widgets with the MCU, an animation that evolves in many small steps will appear slow and sluggish due to a high render time for each frame. Programming the animation to complete faster (in time) will cause individual steps to be large, and the animation will not appear smooth to the user.
The following is an example running on an STM32F429-DISCO board (240x320), where a fullscreen Container is moved up vertically, while a similar Container is moved in from the bottom.
In the video below, the ToggleButton switches between Cacheable Container being enabled and disabled. The performance difference is clearly visible.
The two Containers that are moved each consist of a background Box, a TextArea, and a Texture Mapper. The Texture Mapper is configured to use the bilinear rendering algorithm and a global alpha of 174, making it very expensive to draw. The rendering time for the whole screen is around 100 ms on the STM32F429-DISCO board.
Test Application
In order to move the two elements relative to each other, they are put in a parent Container named masterContainer, which is given twice the height of either child Container, giving it a size of 240 x 640 (2*320). By declaring the container as a move animator in TouchGFX Designer, it will be able to receive application ticks and animate over time, during which performance can be measured.
The upper container, named container1, is placed at position x=0, y=0. The lower container, named container2, is placed at position x=0, y=320, directly below container1 in the parent masterContainer.
Since container1 and container2 are placed in the masterContainer, the two elements will move together when we move the masterContainer. For example, if we move the masterContainer to position x=0, y=-320, container1 will be invisible, but container2 will be fully visible.
The animation between these two states can be created using an interaction in TouchGFX Designer.
The code below will move the masterContainer up if it is down, and down if it is already up. For simplicity, the code is inserted into the handleClickEvent event handler of the view, and is therefore executed whenever the user touches anywhere on the screen (below the ToggleButton):
Screen1View.cpp
void Screen1View::handleClickEvent(const ClickEvent& evt)
{
    // Forward event to base View (for the ToggleButton to work)
    View::handleClickEvent(evt);

    // If touch is released and y > 50 (below the ToggleButton), move masterContainer
    if (evt.getType() == ClickEvent::RELEASED && evt.getY() > 50)
    {
        const int endPosition = masterContainer.getY() >= 0 ? -320 : 0;
        masterContainer.startMoveAnimation(masterContainer.getX(), endPosition,
                                           20 /* ticks */,
                                           EasingEquations::cubicEaseInOut,
                                           EasingEquations::cubicEaseInOut);
    }
}
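To get a feel for how large the individual steps of this 20-tick animation are, the following sketch computes the y position per tick using the commonly published cubic ease-in-out formula in floating point. This is an illustration only: the helper names cubicEaseInOutSketch and yAtTick are hypothetical, and TouchGFX's own EasingEquations::cubicEaseInOut may round differently.

```cpp
#include <cassert>
#include <cmath>

// Commonly published cubic ease-in-out formula, in floating point.
// t: current step, b: start value, c: total change, d: total number of steps.
// Hypothetical helper for illustration; not TouchGFX's own implementation.
double cubicEaseInOutSketch(double t, double b, double c, double d)
{
    assert(d > 0.0 && t >= 0.0 && t <= d);
    t /= d / 2.0;
    if (t < 1.0)
    {
        // First half: accelerate away from the start value
        return c / 2.0 * t * t * t + b;
    }
    // Second half: decelerate towards the end value
    t -= 2.0;
    return c / 2.0 * (t * t * t + 2.0) + b;
}

// Rounded y position of masterContainer at a given tick
// when moving from y=0 to y=-320 in 20 ticks (values taken from the text).
int yAtTick(int tick)
{
    return static_cast<int>(std::lround(cubicEaseInOutSketch(tick, 0.0, -320.0, 20.0)));
}
```

With these numbers the container moves only a few pixels in the first and last ticks and covers most of the distance in the middle of the animation, so every tick still triggers a redraw of the whole visible area.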
Performance of Redrawing Complex Containers
As mentioned, the render time for one frame is around 100 ms when the MCU has to redraw the expensive Texture Mapper at each small step of the animation. This gives us 10 frames per second (fps). The whole animation is 20 frames and will therefore take around two seconds.
On the STM32F429-DISCO evaluation kit, the rendering time is available as a digital signal on GPIO G14. The VSYNC signal is available on G13. The GPIO configuration is set up in the GPIO.cpp file.
The following image is a measurement of VSYNC and RENDER_TIME for the application when moving the masterContainer upwards:
The rendering time is the first signal (active low). We can see that the rendering time for the first frame in the move animation is 99.29 ms.
The lower signal is the VSYNC, which transitions high to low on every frame when pixels are clocked out to the display. We can see on the measurement above that drawing a single frame covers the time for 7 frames on the display. On the 8th VSYNC signal the rendering of the next frame starts. During the rendering, the display is repeatedly showing the previously drawn frame (in the other framebuffer).
Improving Performance Through Caching
We can improve the performance of the above move animation by caching the rendering of the container to memory. After doing that, we can simply copy the pixels located in that memory (using DMA) to the framebuffer, rather than redrawing a complex widget using the MCU. Even if an application could achieve 60 frames per second using the MCU alone, it would be busy (perhaps with 100% MCU load) making the same calculations repeatedly rather than doing something more important.
This "in-memory-image" of the Container can now be shown on the screen at different places, instead of re-rendering the Container.
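The idea of rendering once and copying many times can be sketched in plain C++, independent of TouchGFX. The class CachedSurfaceSketch and its members below are hypothetical names for illustration, the "expensive" rendering is a placeholder pattern, and the copy is done by the CPU here rather than by DMA.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a container with expensive content; not the TouchGFX API.
class CachedSurfaceSketch
{
public:
    CachedSurfaceSketch(int width, int height)
        : w(width), h(height),
          cache(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0)
    {
        assert(width > 0 && height > 0);
    }

    // Expensive path: compute every pixel once and store it in the cache.
    void updateCache()
    {
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                cache[static_cast<std::size_t>(y) * w + x] = renderPixel(x, y);
        ++renderPasses;
    }

    // Cheap path: copy cached rows into a framebuffer of the same width at
    // vertical offset dstY, skipping rows that fall outside the framebuffer.
    void blit(std::vector<std::uint16_t>& framebuffer, int fbHeight, int dstY) const
    {
        assert(fbHeight >= 0 &&
               framebuffer.size() >= static_cast<std::size_t>(w) * static_cast<std::size_t>(fbHeight));
        for (int y = 0; y < h; ++y)
        {
            const int fy = y + dstY;
            if (fy < 0 || fy >= fbHeight) continue;
            const auto src = cache.begin() + static_cast<std::ptrdiff_t>(y) * w;
            std::copy(src, src + w, framebuffer.begin() + static_cast<std::ptrdiff_t>(fy) * w);
        }
    }

    int renderPasses = 0; // Number of times the expensive path has run

private:
    // Placeholder for costly per-pixel work (e.g. texture mapping with alpha).
    static std::uint16_t renderPixel(int x, int y)
    {
        return static_cast<std::uint16_t>((x * 31 + y * 17) & 0xFFFF);
    }

    int w;
    int h;
    std::vector<std::uint16_t> cache;
};
```

In this sketch, a move animation would call updateCache() once and then blit() once per frame with a new dstY, so the per-frame cost is a row copy rather than a full re-render.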
The first thing to do is to enable caching through TouchGFX Designer by checking the Cacheable property on the two Containers container1 and container2:
The next step is to create two dynamic bitmaps in RAM that the Containers can be cached into.
Decide on an address in RAM where the bitmap cache should be located. In this particular example, we placed it in SDRAM (starts at address 0xd0000000 on an STM32F429) just after the framebuffers.
For the Windows simulator, the cache is allocated in a global variable:
Screen1View.hpp
#ifdef SIMULATOR
uint32_t sdramBuffer[8*1024*1024/4];
uint16_t* sdram = (uint16_t*)sdramBuffer;
#else
uint16_t* sdram = (uint16_t*)(0xd0000000 + 320*240*2*2);
#endif
Initialize the bitmap cache and create two dynamic bitmaps for caching:
Screen1View.cpp
//Create bitmap cache and two dynamic bitmaps for caching, each bitmap is 150 KB (240*320*2 bytes)
Bitmap::setCache(sdram, 320*1024, 2); //320 KB cache
dynamicBitmap1 = Bitmap::dynamicBitmapCreate(240, 320, Bitmap::RGB565);
dynamicBitmap2 = Bitmap::dynamicBitmapCreate(240, 320, Bitmap::RGB565);
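The sizes and the address used above can be checked with a little arithmetic. The sketch below is plain C++ with hypothetical constant and function names; it only restates the numbers from the text (240x320 pixels, 2 bytes per RGB565 pixel, two framebuffers at the start of SDRAM, a 320 KB cache) and makes no claim about how much of the cache TouchGFX itself needs for bookkeeping.

```cpp
#include <cassert>
#include <cstdint>

// Values taken from the text; the names are hypothetical.
constexpr std::uint32_t kWidth = 240;
constexpr std::uint32_t kHeight = 320;
constexpr std::uint32_t kBytesPerPixel = 2; // RGB565

// One full-screen RGB565 image: 240 * 320 * 2 = 153600 bytes = 150 KB
constexpr std::uint32_t kFrameBytes = kWidth * kHeight * kBytesPerPixel;

// The cache starts right after two framebuffers at the start of SDRAM.
constexpr std::uint32_t kSdramBase = 0xD0000000u;
constexpr std::uint32_t kCacheStart = kSdramBase + 2 * kFrameBytes; // 0xD004B000

// Cache size passed to Bitmap::setCache in the text: 320 KB
constexpr std::uint32_t kCacheBytes = 320 * 1024;

static_assert(kFrameBytes == 150 * 1024, "each cached bitmap is 150 KB");
static_assert(2 * kFrameBytes <= kCacheBytes, "two bitmaps must fit in the cache");

// Bytes left in the cache after the pixel data of all bitmaps.
std::uint32_t spareBytes(std::uint32_t cacheBytes, std::uint32_t bitmapCount,
                         std::uint32_t bytesPerBitmap)
{
    assert(bitmapCount * bytesPerBitmap <= cacheBytes);
    return cacheBytes - bitmapCount * bytesPerBitmap;
}
```

With these values, the two bitmaps occupy 300 KB of pixel data, leaving 20 KB of the 320 KB cache unused by pixels.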
Assign the dynamic bitmaps to the Containers and set them in caching mode:
Screen1View.cpp
//Assign the bitmaps to the Cacheable Containers
container1.setCacheBitmap(dynamicBitmap1);
container2.setCacheBitmap(dynamicBitmap2);
//Enable caching
container1.enableCachedMode(true);
container2.enableCachedMode(true);
//Finally update the cached bitmaps
container1.updateCache();
container2.updateCache();
Calls to CacheableContainer::updateCache() will render the two Containers into their respective bitmaps. Call this method whenever an updated state of the containers is needed. This must be handled in application code by the developer.
With caching enabled for container1 and container2, performance measurements now show an improvement in render time by a factor of about 20, from ~99 ms to ~5 ms. This means we can easily render at 60 frames per second and complete the entire animation within 20 frames.
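The effect on the animation as a whole can be estimated with a simplified model, sketched below in plain C++ with hypothetical function names. It assumes a 60 Hz display and that a frame takes either the render time or one display period, whichever is longer; it ignores the rounding up to whole VSYNC periods seen in the measurement above, so the numbers are rough estimates.

```cpp
#include <algorithm>
#include <cassert>

// Time one animation step occupies: the render time, but never less than
// one display period (simplified; ignores alignment to VSYNC).
double effectiveFrameMs(double renderMs, double displayHz)
{
    assert(renderMs > 0.0 && displayHz > 0.0);
    return std::max(renderMs, 1000.0 / displayHz);
}

// Estimated duration of an animation with the given number of steps.
double animationMs(int frames, double renderMs, double displayHz)
{
    assert(frames >= 0);
    return frames * effectiveFrameMs(renderMs, displayHz);
}

// Ratio between two render times.
double speedUp(double beforeMs, double afterMs)
{
    assert(beforeMs > 0.0 && afterMs > 0.0);
    return beforeMs / afterMs;
}
```

Under these assumptions, 20 steps at ~99 ms each take about two seconds, while 20 steps at ~5 ms are limited by the display and take roughly a third of a second.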
Conclusion
Using a Cacheable Container with a dynamic bitmap when animating (frequent moves) can improve the render time dramatically when the content is computationally complex and does not change between animation steps. If the cached content must change (e.g. a watch face when the time is updated), the application can recompute the contents of the cache at chosen points during the animation.