Jetson Orin Nano: How to achieve real-time performance for video encoding
Many multimedia projects require efficient video encoding on the embedded system. In the past, RidgeRun has taken advantage of the dedicated hardware encoders available on the Jetson modules to meet our customers' encoding requirements while reducing the impact of encoding on the CPU load.
The Jetson Orin Nano does not include hardware units for video encoding (NVENC), unlike the other modules of the NVIDIA Orin family. Therefore, users must find alternatives to encode their video other than the hardware-accelerated NVENC module, such as CPU-based encoding. CPU-based encoding solutions leave less CPU power for additional tasks and in some cases may not achieve the same performance as NVENC-based encoding.
In this article, we present the performance results of two H.264 software-based encoding alternatives on Jetson Orin Nano. To carry out the tests we used the capabilities of the Jetson AGX Orin to emulate a Jetson Orin Nano 8GB. To learn more about the emulation, visit RidgeRun's wiki on the emulation of the Jetson Orin modules.
The CPU-based H.264 video encoding results presented in this article were obtained using two popular solutions: FFmpeg and GStreamer. FFmpeg is a multimedia framework able to decode, encode, transcode, mux, demux, stream, filter, and play multimedia files. GStreamer is a multimedia framework for constructing graphs of media-handling components. It supports applications ranging from simple Ogg/Vorbis playback and audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
For each software solution, we tested three different encoding presets that affect the quality of the video compression: veryslow, medium, and ultrafast. A preset is a set of options that determine the balance between encoding speed and compression ratio. A slower preset results in better compression, which translates to higher quality per file size. This means that if you have a target file size or constant bit rate, choosing a slower preset will result in better quality. Similarly, for constant quality encoding, selecting a slower preset will save on bitrate.
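As a rough illustration of how a preset and a bitrate are selected, the following Python sketch builds an FFmpeg command line for libx264. It is not the exact command used in these tests: the file names are hypothetical, and using `-b:v` as the "fixed" bitrate setting is an assumption.

```python
from typing import List, Optional

# The three presets tested in this article.
PRESETS = ("ultrafast", "medium", "veryslow")


def ffmpeg_x264_command(src: str, dst: str, preset: str,
                        bitrate_kbps: Optional[int] = None) -> List[str]:
    """Build an ffmpeg argument list for software H.264 encoding (libx264)."""
    if preset not in PRESETS:
        raise ValueError(f"unexpected preset: {preset}")
    cmd = ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-preset", preset]
    if bitrate_kbps is not None:
        # Assumption: a target bitrate stands in for the "fixed" configuration.
        cmd += ["-b:v", f"{bitrate_kbps}k"]
    # With no bitrate given, libx264 uses its default quality-based rate
    # control (CRF), which produces a variable bitrate.
    cmd += ["-an", dst]  # -an drops audio so only video encoding is exercised
    return cmd


# Hypothetical file names, 10 Mbps target, fastest preset.
print(" ".join(ffmpeg_x264_command("input.mp4", "output.mp4", "ultrafast", 10_000)))
```

The resulting list can be passed to `subprocess.run` on a system where FFmpeg is built with libx264.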
We set two main goals: finding the maximum frame rate achievable at a fixed resolution, and measuring the CPU load while encoding streams of 1920x1080 at 30 fps and 60 fps. For all cases, we tested three bitrate configurations: variable, fixed at 1 Mbps, and fixed at 10 Mbps.
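The combinations described above can be enumerated with a short Python sketch. The dictionary keys and the use of `None` for the variable-bitrate case are illustrative choices, not part of the original test scripts.

```python
from itertools import product

ENCODERS = ("ffmpeg", "gstreamer")
PRESETS = ("veryslow", "medium", "ultrafast")
BITRATES_KBPS = (None, 1_000, 10_000)  # None marks the variable-bitrate case


def build_matrix():
    """Return one dict per encoder/preset/bitrate combination."""
    return [
        {"encoder": enc, "preset": preset, "bitrate_kbps": rate}
        for enc, preset, rate in product(ENCODERS, PRESETS, BITRATES_KBPS)
    ]


cases = build_matrix()
print(len(cases))  # 2 encoders x 3 presets x 3 bitrate settings = 18
for case in cases[:3]:
    print(case)
```

For the real-time tests, each of these combinations is run twice, once with a 30 fps stream and once with a 60 fps stream.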
The results were obtained using the following hardware and software setup:
NVIDIA Jetson Orin Nano (Emulated on a Jetson AGX Orin Developer Kit).
8 GB RAM.
FFmpeg version 4.2.7 with the x264 codec.
GStreamer version 1.16 with the x264enc element.
Maximum Frame Rate
The objective of the tests in this section is to determine the maximum frame rate and the CPU usage required to encode videos at 1080p resolution with different settings. More details with other resolutions can be found in our Software Encoders For Jetson Orin Nano wiki. The frame rate mentioned in these tests corresponds to the number of frames being encoded per second.
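The frame rate defined above is the number of encoded frames divided by the wall-clock time the encoder needed. A minimal Python sketch of that calculation follows. The helper names are hypothetical and the numbers in the example are illustrative, not measurements from this article.

```python
import time
from typing import Callable


def encoding_fps(frames_encoded: int, elapsed_seconds: float) -> float:
    """Frames encoded per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frames_encoded / elapsed_seconds


def measure_fps(encode: Callable[[], None], frame_count: int) -> float:
    """Time a callable that encodes `frame_count` frames and return its fps."""
    start = time.perf_counter()
    encode()
    return encoding_fps(frame_count, time.perf_counter() - start)


# Illustrative numbers only: 600 frames encoded in 10 seconds.
print(encoding_fps(600, 10.0))  # 60.0
```

In practice `encode` would launch the encoder process and block until it exits, for example through `subprocess.run`.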
As seen in Fig. 1, while encoding with FFmpeg at a fixed bitrate of 10 Mbps, we get up to 71 fps with the ultrafast encoding preset, 24 fps with the medium preset, and only 5 fps with the veryslow preset. This pattern repeats across all tests: the veryslow preset consumes more resources to provide better quality and therefore encodes fewer frames per second, while faster presets trade quality for encoding speed.
With the variable bitrate configuration, a small drop in performance is observed. In this case, the encoder adjusts the bitrate as needed to maintain the desired quality, which can impact the overall performance.
In terms of CPU usage, as seen in Fig. 2, the veryslow preset shows high usage for all configurations, close to 80% of CPU load. The usage for the medium preset is between 63% and 68%, and the ultrafast preset ranges between 47% and 51%.
Fig. 3, 4, and 5 show a frame of the test video (Copyright Blender Foundation | www.blender.org) for each preset tested on a 1080p video using FFmpeg. The ultrafast preset has a noticeable reduction in image quality in comparison with the veryslow preset. Therefore, the preset selection would depend on your specific use case. If a lower image quality does not have a significant impact, a faster preset would be the best option. If good quality is needed, a slower preset would be preferred, with the downside of higher CPU load and lower encoding frame rates. However, by controlling the bitrate you can still get acceptable quality while using a faster encoding preset.
With GStreamer, Fig. 6 shows that a variable bitrate gives us up to 95 fps when encoding with the ultrafast preset, 43 fps with medium, and only 4.75 fps with veryslow. A 10 Mbps bitrate gives up to 64.7 fps with ultrafast, 22.5 fps with medium, and 4.6 fps with the veryslow preset. As with FFmpeg, the results depend on the configuration we set on the GStreamer pipeline.
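To make the pipeline configuration more concrete, the Python sketch below assembles a `gst-launch-1.0` pipeline description around the x264enc element. It is an assumption about how such a test could be set up, not the pipeline used for these results: the file name is hypothetical, and `pass=qual` is used here as a stand-in for the variable-bitrate configuration.

```python
from typing import Optional


def x264enc_pipeline(location: str, preset: str,
                     bitrate_kbps: Optional[int] = None) -> str:
    """Build a gst-launch-1.0 pipeline description using x264enc."""
    encoder = f"x264enc speed-preset={preset}"
    if bitrate_kbps is not None:
        # The x264enc bitrate property is expressed in kbit/s.
        encoder += f" bitrate={bitrate_kbps}"
    else:
        # Assumption: constant-quality mode stands in for "variable bitrate".
        encoder += " pass=qual"
    # fakesink discards the encoded output so only encoding speed is measured.
    return (f"filesrc location={location} ! decodebin ! videoconvert ! "
            f"{encoder} ! h264parse ! fakesink")


# Hypothetical input file, ultrafast preset, 10 Mbps.
print("gst-launch-1.0 " + x264enc_pipeline("input.mp4", "ultrafast", 10_000))
```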
The CPU usage is similar to FFmpeg, as GStreamer still takes a considerable amount of the CPU resources, as seen in Fig. 7. The CPU load with the veryslow preset ranges between 77% and 82%. The medium preset goes from 57% to 70%, and the ultrafast preset stays close to 52%.
The results, in terms of video quality, are very similar to FFmpeg. The video encoded with the fastest preset shows a significant drop in quality compared to the veryslow and medium presets, which is expected. Fig. 8, 9, and 10 show an example of the output video (Copyright Blender Foundation | www.blender.org). The images were taken from a 1080p resolution video.
For these tests, the goal is to take video streams of 30 and 60 fps and evaluate whether FFmpeg and GStreamer are able to fully encode the stream in real time, as if it were taken from a live source (like a camera), and also to evaluate resource usage with different configurations. We focus on the 1080p resolution since it is the most demanding in terms of resources. For results on other common resolutions, visit our Software Encoders For Jetson Orin Nano wiki.
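The real-time criterion can be expressed as a simple comparison: an encoder keeps up with a live source only if it encodes at least as many frames per second as the source delivers. The Python sketch below applies that check to the FFmpeg maximum frame rates reported for the 10 Mbps configuration (71, 24, and 5 fps). The function names are hypothetical, and a maximum frame rate measured on a file is only a rough predictor of live-stream behavior.

```python
def is_real_time(encoded_fps: float, stream_fps: float) -> bool:
    """True when the encoder keeps up with the source frame rate."""
    return encoded_fps >= stream_fps


def speed_factor(encoded_fps: float, stream_fps: float) -> float:
    """Encoding speed relative to real time (1.0 means exactly real time)."""
    return encoded_fps / stream_fps


# FFmpeg maximum frame rates at 1080p with a fixed 10 Mbps bitrate (Fig. 1).
max_fps = {"ultrafast": 71.0, "medium": 24.0, "veryslow": 5.0}

for stream_fps in (30, 60):
    for preset, fps in max_fps.items():
        verdict = "keeps up" if is_real_time(fps, stream_fps) else "falls behind"
        print(f"{preset} at {stream_fps} fps: {verdict} "
              f"({speed_factor(fps, stream_fps):.2f}x real time)")
```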
With FFmpeg, while encoding a 1080p 60 fps stream, the only tested preset able to encode the whole stream at 60 fps is ultrafast. The medium preset only reaches between 19 fps and 42 fps, depending on the bitrate configuration. The veryslow preset remains between 3 fps and 8 fps.
The tests for the 1080p 30 fps stream showed that only the ultrafast preset was able to encode the stream at 30 fps for every bitrate configuration. The medium preset was able to encode at 30 fps only with the 1 Mbps bitrate setting.
In Fig. 11 we can see that, as usual, the slowest preset takes up a considerable amount of the CPU reaching almost 80%, while the fastest only needs around 42% to complete the encoding.
With GStreamer, the tests for the 1080p 60 fps stream showed that, like FFmpeg, only the ultrafast preset was able to encode the stream at the full 60 fps rate, except at a 10 Mbps bitrate. The veryslow preset encoding speed is between 5 fps and 10 fps, and the medium preset ranges between 19 fps and 39 fps.
With a 1080p 30 fps stream, once again, the ultrafast preset was able to encode at the full stream rate for all bitrate settings. The encoding speed of the veryslow preset ranges between 3 fps and 6 fps, and the medium preset between 16 fps and 30 fps.
Fig. 12 below shows that the veryslow preset presents the highest CPU usage ranging from 66% to 82%. The medium varies from 49% to 70%, and the ultrafast from 30% to 42%.
It would be possible to encode approximately four 1080p@30 fps streams in parallel without maxing out the CPU, as long as a lower bitrate is used and a faster preset is selected.
It would be possible to encode approximately two 1080p@60 fps streams in parallel if a low bitrate is selected with a faster preset.
It is possible to achieve good quality while using a faster preset by using a higher bitrate. For instance, a video encoded with a slow preset may be comparable with another video encoded with a faster preset and a higher bitrate.
The maximum frame rate achievable depends on the bitrate configuration: the higher the bitrate, the fewer frames can be encoded per second, but the higher the resulting quality.
At 1080p, the encoders configured with the veryslow or medium preset were not able to sustain the 30 and 60 fps streams across the tested bitrates. The only exception was the medium preset, which reached 30 fps at the 1 Mbps setting. The ultrafast preset was able to encode both streams at their respective frame rates, except for the GStreamer 60 fps test at 10 Mbps.
A slower preset will result in better usage of the target bitrate (better quality with the same bitrate) but at the cost of a lower framerate and higher CPU usage.
When using a variable bitrate, the encoder adjusts the bitrate as needed to maintain the desired quality, which may result in a decrease in performance.