When developing embedded vision applications, performance is crucial since the resulting data is needed by critical system functions, often with real-time requirements. For instance, in a self-driving car this can mean the difference between life and death. Moreover, the performance of your vision processing can greatly impact control system performance, for instance in robotics. More frequent inputs, with less latency, mean better control and reduced cycle times.
Today’s powerful vision processors and GPUs allow for great performance, but how do you ensure your solution really makes efficient use of the hardware? Perhaps some node requires much more processing time than expected and overloads a core, while the other cores are mostly idle? Perhaps the application is spending a lot of time waiting for DMA transfers to complete? Perhaps you have tried to improve performance by adding more compute resources, but the performance gain is less than expected? Fixing bottlenecks like these may yield vast improvements in performance, often with only small changes. This requires good insight into the runtime system.
Percepio Tracealyzer for OpenVX allows you to visualize the execution of OpenVX applications and identify bottlenecks where optimization can make a big difference. Tracealyzer for OpenVX is initially available for Synopsys EV6x embedded vision processors, leveraging the built-in trace support in Synopsys ARC MetaWare EV Development Toolkit.
Tracealyzer for OpenVX provides a variety of graphical views showing different perspectives of the recorded behavior, ranging from a detailed trace view to high-level overviews and statistics. The trace view, shown on the left side in the screenshot above, displays a timeline of the OpenVX graph execution on the available cores and accelerators, so you can study the scheduling, pipelining and timing in great detail. The trace view can be adapted in many ways and supports both horizontal and vertical display.
In the center part of the screenshot you see two instances of Tracealyzer’s CPU Load Graph, showing how the processor cores are utilized by the application. This way, you can see not only the overall load, but also how the load varies over time and how much processing time is used by each graph node. It also serves as an overview of the trace where you can spot anomalies. For instance, you can see two significant spikes in the “sobel3x3” node (shown in red). To see what causes them, simply double-click in the CPU Load Graph to show the corresponding section in the trace view. All views in Tracealyzer are interconnected in similar ways, which makes it easy to drill down from high-level overviews into the detailed trace. Colors make it easier to identify the OpenVX graph nodes, and the same color coding is used across all Tracealyzer views.
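To make the idea behind a per-node load graph concrete, here is a minimal sketch of how such a graph could be derived from recorded execution intervals. It is not Tracealyzer's implementation or API; the function `cpu_load`, the trace format (node name, start, end) and the sample timestamps are all assumptions made for illustration.

```python
from collections import defaultdict


def cpu_load(intervals, window, trace_end):
    """Return {node: [load per window]} for ONE core.

    intervals: (node_name, start, end) tuples with integer timestamps,
               all ending at or before trace_end (hypothetical format).
    window:    width of each load-graph bucket, in the same time unit.
    Load is the fraction of each window the node spent executing.
    """
    n_windows = -(-trace_end // window)  # ceiling division
    load = defaultdict(lambda: [0.0] * n_windows)
    for node, start, end in intervals:
        first = start // window
        last = (end - 1) // window
        for w in range(first, last + 1):
            w_start, w_end = w * window, (w + 1) * window
            # Only the part of the interval inside this window counts.
            overlap = min(end, w_end) - max(start, w_start)
            load[node][w] += overlap / window
    return dict(load)


# Hypothetical trace of one core, timestamps in microseconds.
trace = [
    ("sobel3x3", 0, 200),
    ("gaussian3x3", 200, 500),
    ("sobel3x3", 1000, 1900),
    ("gaussian3x3", 1900, 2000),
]
load = cpu_load(trace, window=1000, trace_end=2000)

# Windows where one node takes more than 80% of the core are candidate spikes.
spikes = [w for w, share in enumerate(load["sobel3x3"]) if share > 0.8]
print(load)
print(spikes)
```

The window index of a spike gives the time range to inspect in the detailed trace, which is the same drill-down that double-clicking in the load graph performs in the tool.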
In the screenshot above, we have changed the trace view to use horizontal mode and also added two instances of the Actor Instance Graph below it, where the Y-axis shows the execution times of the graph nodes. This way, you can see where nodes execute longer than normal and inspect the trace view to see the details. Notice the blue selection, made with a simple “click-and-drag”. The selection is displayed in all views and naturally shows the same time frame in all of them. As you can see, the upper trace view is zoomed in on a shorter time interval while the lower graphs are zoomed out.
The Actor Statistics Report seen in the screenshot above shows the highest, lowest and average values for several timing properties, including execution time, for each graph node. All extreme values in this report (highest/lowest) are actually links into the trace, so by clicking the values you see the corresponding location in the trace view.
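The kind of summary this report presents can be sketched in a few lines. The example below is an illustration under assumptions, not the tool's actual algorithm: `node_statistics` and the (node name, start, end) trace format are hypothetical. It keeps the start timestamp next to each extreme value, which is what lets a highest or lowest value act as a link back to a location in the trace.

```python
def node_statistics(instances):
    """Summarize execution times per graph node.

    instances: (node_name, start, end) tuples (hypothetical format).
    Returns {node: {"count", "total", "min", "max", "avg"}} where "min"
    and "max" are (duration, start_timestamp) pairs, so each extreme
    value also records where in the trace it occurred.
    """
    stats = {}
    for node, start, end in instances:
        duration = end - start
        entry = stats.setdefault(
            node,
            {"count": 0, "total": 0,
             "min": (duration, start), "max": (duration, start)},
        )
        entry["count"] += 1
        entry["total"] += duration
        entry["min"] = min(entry["min"], (duration, start))
        entry["max"] = max(entry["max"], (duration, start))
    for entry in stats.values():
        entry["avg"] = entry["total"] / entry["count"]
    return stats


# Hypothetical node instances, timestamps in microseconds.
instances = [
    ("sobel3x3", 0, 200),
    ("gaussian3x3", 200, 500),
    ("sobel3x3", 1000, 1900),
    ("gaussian3x3", 1900, 2000),
]
report = node_statistics(instances)
for node, entry in report.items():
    longest, at = entry["max"]
    print(f"{node}: avg {entry['avg']} us, longest {longest} us at t={at} us")
```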
The trace view can also show user events, i.e. custom events logged by the application developer. User events allow you to visualize just about anything in your application, like diagnostic messages, variable values and states. In the screenshot above, we see an example where user events have been logged on two user event channels: “MyVariable” showing values of an integer variable and “MyState” showing state names. Tracealyzer can display such user events in several ways, e.g. as event labels in the trace view (1) and as entries in the Event Log (2). The User Event Signal Plot (3) allows for plotting numerical data from user events.
Moreover, if you have important state variables in your system, you can log the state changes as user events and define a State Machine in Tracealyzer to see the states in the trace view timeline (4). You can also see a summary of the state changes as a state machine graph (5). You can even get statistics on the time spent in each state, or the time between any two events, by defining a custom interval.
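As a rough illustration of what such state statistics involve, the sketch below computes the time spent in each state and the transition counts from a list of logged state changes. The event format, the state names and the functions `time_in_states` and `transition_counts` are assumptions for this example only; they do not represent Tracealyzer's data model or API.

```python
from collections import Counter


def time_in_states(events, trace_end):
    """Total time spent in each state.

    events: (timestamp, state_name) pairs sorted by time, each marking
            the moment the state is entered (hypothetical format).
    trace_end: timestamp where the recording stops.
    """
    totals = {}
    # Pair every state change with the next one; the last state lasts
    # until the end of the trace.
    following = events[1:] + [(trace_end, None)]
    for (t, state), (t_next, _) in zip(events, following):
        totals[state] = totals.get(state, 0) + (t_next - t)
    return totals


def transition_counts(events):
    """Count transitions (from_state, to_state): the edges of a state graph."""
    states = [state for _, state in events]
    return Counter(zip(states, states[1:]))


# Hypothetical state changes logged as user events, in microseconds.
events = [
    (0, "Idle"),
    (100, "Capturing"),
    (400, "Processing"),
    (900, "Idle"),
    (1000, "Capturing"),
]
durations = time_in_states(events, trace_end=1200)
edges = transition_counts(events)
print(durations)
print(edges)
```

The same pairing of consecutive events also gives the time between any two chosen events, which is the idea behind a custom interval.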
Tracealyzer is a powerful and integrated toolbox for trace visualization and analysis, with many more features and options than mentioned here. It supports several software platforms, not just OpenVX and vision processing but also general RTOS- and Linux-based systems.