Published August 10, 2018. Updated August 11, 2018.
Ever wonder what exactly makes some streams look great, and others not-so-great? Read on...
This report contains an analysis of the picture quality of x264 and NVENC H.264 encoded video game footage, across different common resolutions, bitrates, and encoder settings. Analysis was performed using Netflix VMAF, a machine learning algorithm trained by Netflix to detect perceivable quality degradation in video footage, when compared to a high quality source video.
One minute of Heroes of the Storm gameplay was captured on a Windows 10 version 1803 machine running an Intel Core i7-8700K 6-core CPU and an nVidia GeForce GTX 980 Ti GPU. Source footage was recorded using OBS Studio at 1920x1080 @ 60fps, with the NVENC H.264 encoder on the Lossless rate control. This resulted in a 1.5GB source file with video content encoded at a variable bitrate, averaging roughly 200Mbps - far beyond what any Twitch broadcast would ever transmit.
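As a quick sanity check on the figures above, the average bitrate follows directly from the file size and duration. This is a minimal sketch that assumes decimal gigabytes and exactly 60 seconds of footage; the function name is illustrative only.

```python
def average_bitrate_mbps(size_bytes: float, duration_s: float) -> float:
    """Average bitrate in megabits per second (1 Mbps = 1,000,000 bits/s)."""
    return size_bytes * 8 / duration_s / 1_000_000


# 1.5 GB (decimal) of video spread over one minute of footage
print(average_bitrate_mbps(1.5e9, 60))
```

That works out to roughly 200 Mbps, in line with the figure quoted above.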
It's also worth mentioning that the GPU was running BIOS version 84.00.32.00.01 and driver version 24.21.13.9882 (398.82). No overclocking or modification of any kind was used.
For source footage and all re-encodes (both x264 and NVENC), the High profile was used to enable the full capabilities of the encoder. Older, more limited profiles such as Main aren't necessary any more, unless your audience is watching on devices from 2010 or before (iPhone 4, original iPad).
The source footage was then run through a series of re-encodes using ffmpeg at various resolutions, bitrates, and x264 presets. Encoding was performed on an AMD Ryzen 7 2700X 8-core CPU running Ubuntu 18.04.
If you're not familiar with x264 presets, this quote from the ffmpeg wiki sums it up nicely:
"A preset is a collection of options that will provide a certain encoding speed to compression ratio. A slower preset will provide better compression (compression is quality per filesize). This means that, for example, if you target a certain file size or constant bit rate, you will achieve better quality with a slower preset."
The resolutions and bitrates tested were selected based on the minimum and maximum Twitch-recommended bitrates, plus one step beyond and one step below Twitch's recommendations. A re-encode for each x264 preset, starting from ultrafast, was created at each of the bitrates listed below, resulting in a total of 76 different encodings to compare, plus additional encodings for NVENC H.264.
Re-encodes were performed using the source footage as input, with constant bitrates and a keyframe interval of 2 seconds, as recommended by Twitch:
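The exact commands are not preserved in this text, so the following is only a sketch of how such a re-encode could be assembled. It builds an ffmpeg argument list for a constant-bitrate x264 encode with a 2-second keyframe interval (120 frames at 60fps). The file names, the choice to drop audio, and the buffer size equal to the bitrate are assumptions, not the settings actually used here.

```python
def x264_cbr_args(src, dst, bitrate_k, preset, fps=60, size=None):
    """Build an ffmpeg argument list for a CBR x264 re-encode (illustrative)."""
    gop = fps * 2              # keyframe every 2 seconds
    rate = f"{bitrate_k}k"
    args = [
        "ffmpeg", "-i", src,
        "-an",                              # assumption: video only
        "-c:v", "libx264",
        "-preset", preset,
        "-profile:v", "high",
        "-bf", "3",                         # 3 B-frames, as stated in the text
        "-b:v", rate, "-minrate", rate, "-maxrate", rate,
        "-bufsize", rate,                   # assumption: 1-second buffer
        "-g", str(gop), "-keyint_min", str(gop),
    ]
    if size is not None:                    # e.g. (1280, 720) for 720p60
        args += ["-vf", f"scale={size[0]}:{size[1]}"]
    args.append(dst)
    return args


print(" ".join(x264_cbr_args("source.mp4", "1080p60_6000k_veryfast.mp4",
                             6000, "veryfast")))
```

Returning a list rather than a shell string makes it easy to loop over every preset and bitrate and hand each result to `subprocess.run`.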
To analyze picture quality, we have to first extract the raw video data from the source, and from each of the 76 re-encodes:
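A hedged sketch of the extraction step: the first function builds an ffmpeg argument list that decodes a file to raw YUV 4:2:0 frames, and the second estimates how large the raw output will be. The file names and the yuv420p pixel format are assumptions.

```python
def raw_extract_args(src, dst_yuv):
    """ffmpeg arguments to decode a video into raw yuv420p frames."""
    return ["ffmpeg", "-i", src, "-an",
            "-c:v", "rawvideo", "-pix_fmt", "yuv420p",
            "-f", "rawvideo", dst_yuv]


def yuv420p_size_bytes(width, height, frames):
    """yuv420p stores 1.5 bytes per pixel (full-res luma, quarter-res chroma)."""
    return width * height * 3 // 2 * frames


# One minute of 1080p60 is 3600 frames
print(yuv420p_size_bytes(1920, 1080, 60 * 60))  # 11197440000 bytes, about 11.2 GB
```

At roughly 11.2 GB per minute of 1080p60, it is worth deleting each raw file once it has been analyzed.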
Then, we use Netflix VMAF to compare the re-encoded video against the source footage. The analysis compares each frame of the re-encoded video against the matching frame from the source footage, and scores that frame with a value between 0 and 100:
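The original command is not preserved here. As a sketch, this assumes the `run_vmaf` script that shipped with VMAF's Python tooling at the time, which takes the pixel format, frame size, reference file and distorted file as positional arguments. Newer VMAF releases use a different command-line tool, and the file names below are hypothetical.

```python
def run_vmaf_args(ref_yuv, dis_yuv, width=1920, height=1080, pool="perc5"):
    """Argument list for VMAF's legacy run_vmaf script (assumed CLI)."""
    return ["./run_vmaf", "yuv420p", str(width), str(height),
            ref_yuv, dis_yuv,
            "--out-fmt", "json",
            "--pool", pool]


print(" ".join(run_vmaf_args("source.yuv", "1080p60_7500k_slow.yuv")))
```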
The perc5 pool option indicates that we'd like a single final score for the whole file to be displayed at the end of the analysis, and that we want that score to be calculated as the 5th percentile of all the frames analyzed. Thus, 5% of the frames analyzed will have a score lower than the final "perc5" score.
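To illustrate why a 5th-percentile pool is stricter than a simple average, here is a minimal sketch of percentile pooling over per-frame scores. It uses linear interpolation between ranks; the exact interpolation VMAF applies may differ, and the sample scores are made up for illustration.

```python
def percentile(scores, pct):
    """pct-th percentile of scores, linearly interpolated between ranks."""
    ordered = sorted(scores)
    if not ordered:
        raise ValueError("no scores to pool")
    rank = (len(ordered) - 1) * pct / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


# Hypothetical clip: 90 clean frames and 10 badly degraded ones
frame_scores = [90.0] * 90 + [40.0] * 10
print(sum(frame_scores) / len(frame_scores))  # mean: 85.0
print(percentile(frame_scores, 5))            # perc5: 40.0
```

The mean hides the ten bad frames, while the perc5 pool reports them directly, which is closer to how a viewer notices a stream breaking up during fast motion.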
Here's an example of VMAF output for 1080p60, at a bitrate of 7500K, using the slow preset, and receiving a perc5 score of 88.3:
For comparison, here's VMAF output for ultrafast, receiving a perc5 score of 51.1:
In-depth analysis of cinematic footage has shown that, roughly speaking, a VMAF score of 93+ is essentially flawless cinematic footage, and a score of 20 is completely unwatchable.
However, to my knowledge, no proper scientific study has been performed to train VMAF specifically on video game footage. Video game footage, in addition to having more fine detail than cinematic footage (smaller text and graphics), is also typically viewed much closer to the screen than cinematic footage is.
For comparison, here is a sample source frame at full quality, and that same frame from several of the re-encodes at various bitrates and presets:
Disclaimer: Keep in mind that this frame was selected specifically because it was one of the more difficult frames for the encoder, due to a fast camera pan. The image quality in the screenshots below does not represent the picture quality for the entire encode, just the quality of one particularly difficult frame.
|1080p60 7500K slow - VMAF frame score: 81.9||1080p60 6000K fast - VMAF frame score: 76.9|
|1080p60 4500K veryfast - VMAF frame score: 57.6||1080p60 3000K ultrafast - VMAF frame score: 55.0|
After subjectively analyzing these frames, and other select frames at various qualities with various VMAF scores, my personal scale when observing VMAF scores for video game footage is as follows:
- 95+ is essentially flawless
- 85-95 is excellent
- 75-85 is good
- 65-75 is fair
- 55-65 is poor
- below 55 is unwatchable
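The scale above can be expressed as a small lookup. This is a sketch of my personal scale only; treating each lower bound as inclusive is an assumption, since the ranges as written share their endpoints.

```python
def rate_vmaf(score):
    """Map a VMAF score to the subjective labels used in this article."""
    thresholds = [
        (95, "essentially flawless"),
        (85, "excellent"),
        (75, "good"),
        (65, "fair"),
        (55, "poor"),
    ]
    for lower_bound, label in thresholds:
        if score >= lower_bound:   # assumption: lower bound is inclusive
            return label
    return "unwatchable"


for s in (88.3, 78.9, 72.7, 61.8, 51.1):
    print(s, rate_vmaf(s))
```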
With the scale above in mind, here are the VMAF quality results for the 28 different 1080p60 re-encodes:
Above we can see that picture quality is largely unchanged when using the faster preset, and more difficult ones, but drops off quickly once you use veryfast or easier presets. At 6000K, picture quality dips below the "excellent" threshold for all presets except medium. This suggests that even when using Twitch's maximum recommended bitrates, it is quite difficult to get high picture quality.
In fact, many of the top Twitch streamers that stream at 1080p60 do so with an expensive multi-PC setup capable of using the medium presets, and they do so at bitrates far above Twitch's recommended maximum, in the 7500-8000K range. These results demonstrate why such setups are necessary to maintain excellent picture quality at 1080p60.
Many smaller Twitch streamers that are broadcasting in 1080p60 resort to encoding using the veryfast preset, as more difficult presets are simply too demanding for typical hardware, especially single-PC setups. Unfortunately, the veryfast preset using Twitch's maximum recommended bitrate of 6000K results in a VMAF score of just 78.9, firmly in the "good" range.
Streamers that don't have the bandwidth to encode at 6000K suffer greatly, as the 4500K encode using the veryfast preset scores "fair" at 72.7. Dropping below Twitch's recommended minimum bandwidth down to 3000K, the veryfast preset scores "poor" at 61.8.
NVENC H.264 has become a popular alternative to x264, since NVENC utilizes hardware-accelerated encoding that greatly reduces the impact on the CPU. However, NVENC has also been criticized for generating poor quality output compared to x264 encodes at the same bitrate.
Different generations of nVidia GPUs, as well as different SDK versions and drivers, support different features in NVENC. An overview of the differences is summarized in the Video Encode and Decode GPU Support Matrix, as well as in the NVENC Application Note and the NVENC Programming Guide. Most of the differences relate to encoding speed, 4K H.264 support, and H.265/HEVC support (so not currently relevant to Twitch streaming), but there are also some variations in H.264 optimizations across the generations of GPUs and SDKs. Just remember that it's not necessarily apples-to-apples when comparing across GPU generations, models, and SDK versions. As a reminder, these results were obtained from a GTX 980 Ti with 398.82 drivers.
One fantastic thing about NVENC is its encoding speed. As long as you are using a Maxwell GPU (GTX 750) or newer, NVENC is capable of encoding 1080p at 180+fps, even with 2-pass encoding enabled. Older Kepler cards may struggle with 2-pass enabled, but for the majority of streamers, enabling 2-pass should not be a problem. As such, all of the analysis below used CBR and 2-pass encoding.
It's also important to note that NVENC uses a different set of presets than x264:
- High Quality
- High Performance
- Low Latency High Quality
- Low Latency High Performance
Default supports CABAC, and should provide a good quality baseline.
High Quality uses CABAC, and also adds support for B-Frames, which should increase quality beyond Default.
High Performance doesn't utilize CABAC or B-Frames. Instead, it uses an easier CAVLC algorithm which, given a fixed bitrate, should lower quality relative to CABAC presets.
Bluray is essentially High Quality, but with B-Frames capped at 3 for compatibility with Blu-ray standards. All of the presets we tested (both x264 and NVENC) also used 3 B-Frames (where B-Frames were supported). Thus, Bluray was excluded from this test, as the results did not differ from High Quality in any meaningful way.
Low Latency HQ / HP presets use CABAC, but they don't use B-Frames. They also reduce difficulty in other ways such as reduced deblocking and smaller motion estimation search ranges. You should note that these presets were designed for applications that are hyper-sensitive to render times, such as GPU-streaming services (using a remote GPU over the internet) or video conferencing.
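The preset descriptions above can be collected into a small lookup table. This is only a restatement of the text as data, not something read from the NVENC SDK. The Default entry's lack of B-Frames is inferred from the statement that High Quality "adds" them, so treat it as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NvencPreset:
    name: str
    cabac: bool      # False means the simpler CAVLC entropy coder
    b_frames: bool


PRESETS = [
    NvencPreset("Default", cabac=True, b_frames=False),   # B-Frames: inferred
    NvencPreset("High Quality", cabac=True, b_frames=True),
    NvencPreset("High Performance", cabac=False, b_frames=False),
    NvencPreset("Bluray", cabac=True, b_frames=True),      # capped at 3
    NvencPreset("Low Latency High Quality", cabac=True, b_frames=False),
    NvencPreset("Low Latency High Performance", cabac=True, b_frames=False),
]

# Presets expected to score best at a fixed bitrate: CABAC plus B-Frames
print([p.name for p in PRESETS if p.cabac and p.b_frames])
```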
Here are the VMAF quality results for the 20 different 1080p60 NVENC H.264 re-encodes:
|NVENC||Default||High Quality||Low Latency HQ||Low Latency HP||High Performance|
NVENC's results are interesting, because the preset selected has only a slight effect on picture quality. By a very small margin, High Quality is the best quality preset, and by a fair margin, High Performance is the worst.
Notably missing from the NVENC results are any configurations that can generate an excellent picture. This is likely where the critics of NVENC make their argument, since, with enough CPU power, x264 will certainly generate a better picture.
Since it's likely that anyone who uses NVENC will be able to use the High Quality preset without issue (again, unless you're on an older Kepler GPU), it's perhaps more useful to compare just the High Quality NVENC preset to the x264 presets across a range of bitrates.
Compared to x264, NVENC HQ is approximately the same quality as x264 veryfast at the same bitrate. The breakeven is likely somewhere around 4000K, with higher bitrates slightly favoring NVENC, and lower bitrates slightly favoring x264 veryfast.
Another interesting difference is in which specific frames NVENC has difficulty with. When pulling the same frame that was examined previously in x264 (frame 3370), the NVENC samples were higher quality, across the board. This indicates that NVENC and x264 will degrade in slightly different ways, and at slightly different times.
If we check the VMAF output for a better example of a difficult frame for NVENC, we can see that, although it did relatively well with frame 3370, it had a much harder time with frame 944:
|1080p60 7500K NVENC HQ - VMAF frame score: 77.8||1080p60 6000K NVENC HQ - VMAF frame score: 72.6|
|1080p60 4500K NVENC HQ - VMAF frame score: 62.6||1080p60 3000K NVENC HQ - VMAF frame score: 50.9|
For analyzing 720p60 picture quality, there was one additional step of re-scaling the 1280x720 output back up to 1920x1080 so a frame-by-frame analysis against the 1080p60 source footage could be performed:
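The original command is not preserved here. As a sketch, the upscale and the raw extraction can be done in a single ffmpeg pass using the scale filter. The bicubic scaling algorithm and the file names are assumptions, and a different scaler would shift the resulting VMAF scores somewhat.

```python
def upscale_to_raw_args(src, dst_yuv, width=1920, height=1080, algo="bicubic"):
    """ffmpeg arguments to upscale a re-encode and dump raw yuv420p frames."""
    return ["ffmpeg", "-i", src, "-an",
            "-vf", f"scale={width}:{height}:flags={algo}",
            "-c:v", "rawvideo", "-pix_fmt", "yuv420p",
            "-f", "rawvideo", dst_yuv]


print(" ".join(upscale_to_raw_args("720p60_5000k_faster.mp4",
                                   "720p60_5000k_faster.yuv")))
```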
After scaling output back up, here are the VMAF quality results for the 28 different 720p60 x264 re-encodes:
Moving down to 720p60 brings a slight drop in perceivable quality across the board. One interesting difference is that Twitch's recommended maximum bitrate of 5000K at 720p60 is not capable of generating an excellent image, even with the most difficult presets.
Since the quality is lower across the board at 720p60, streamers with low bandwidth should consider dropping framerate to 30fps in order to maintain high image quality - especially since more difficult presets don't seem to make as much of a difference at 720p60 as they do at 1080p60.
In addition to measuring picture quality, the re-encodes were also benchmarked, and the user CPU time (utime) of each was recorded so the difficulty of each configuration could be assessed. Differences in platforms and configurations make it difficult to use this data as a way to estimate your own system's capabilities, but we can at least use this data to compare how difficult one resolution + preset combination is compared to others.
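One way to record utime for a child process on Linux is the standard `resource` module. This is a sketch of the idea, not necessarily how the benchmark here was run, and it is Unix-only. The demo command is a stand-in for an ffmpeg invocation.

```python
import resource
import subprocess
import sys


def child_user_seconds(cmd):
    """Run cmd and return the user-mode CPU seconds it consumed."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
    return after - before


# Stand-in workload; replace with an ffmpeg argument list to time an encode
print(child_user_seconds([sys.executable, "-c", "sum(range(2_000_000))"]))
```

Because utime sums CPU time across all threads, it reflects total work done rather than wall-clock duration, which makes it a fairer difficulty measure for a multi-threaded encoder.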
Bitrate, while it does have some effect on CPU difficulty, is less impactful than resolution, framerate, or preset. To simplify the comparison, only Twitch's recommended minimum and maximum bitrates are displayed below.
The benchmark data allows us to make a few interesting observations. For example, 1080p60 4500K veryfast is about as difficult as 720p60 5000K faster. If we compare the quality scores for those same two configurations, the former had a total score of 72.7, and the latter had a total score of 81.4. So there are certainly cases where dropping resolution in order to use a more difficult preset, along with slightly raising the bitrate, would noticeably increase quality.
If we look at the same difficult frame that we did previously, the quality difference is very apparent:
|1080p60 4500K veryfast - VMAF frame score: 57.6||720p60 5000K faster - VMAF frame score: 78.8|
This analysis demonstrates that streaming excellent quality video demands substantial processing power and bandwidth. Twitch's minimum suggested bitrates will result in "fair" quality video, unless you have the computing power to utilize a demanding preset such as faster. Bitrate is clearly an important factor, and makes large differences in perceivable quality.
For streamers that just don't have a lot of bandwidth to work with, it may not make sense to stream at 1080p60, as the quality penalty is quite severe for 4500K and below. In cases such as this, it may be wise to cut framerate to 30fps, or to consider 720p60 or even 720p30 if bandwidth is extremely limited.
Chances are, however, that the bigger limiting factor for most streamers is going to be CPU power. While the veryfast preset has become very common on Twitch, this analysis shows that moving even one preset slower to faster results in a noticeable increase in picture quality. In some cases, it may in fact be wise to move down from 1080p60 to 720p60 in order to utilize the faster preset, or more difficult ones.
If you're considering what settings to use for your stream, my personal recommendation would be to start by measuring your upload bandwidth, assume that you'll be able to use 80% of that for streaming, and see which of Twitch's recommended maximum bitrates you'd be able to hit. If you don't have 5Mbps or faster upload bandwidth, you probably won't be able to get high picture quality without dropping to resolutions below the ones we tested.
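The 80% rule of thumb can be sketched as a small calculation. The table below contains only the two recommended maximums quoted in this article (6000K for 1080p60 and 5000K for 720p60). The function and variable names are illustrative.

```python
# Twitch's recommended maximum bitrates, as quoted in this article (Kbps)
TWITCH_MAX_KBPS = {"1080p60": 6000, "720p60": 5000}


def reachable_targets(upload_kbps, usable_fraction=0.8):
    """Configurations whose recommended maximum fits the usable upload budget."""
    budget = upload_kbps * usable_fraction
    return [name for name, kbps in TWITCH_MAX_KBPS.items() if kbps <= budget]


print(reachable_targets(10_000))  # ['1080p60', '720p60']
print(reachable_targets(7_000))   # ['720p60']
print(reachable_targets(5_000))   # []
```

With a 5Mbps upload, the usable budget is only 4000K, which is below both maximums.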
The importance of CPU power cannot be overstated. If you don't have an extremely high end setup, you may not want to stream at 1080p60. In fact, unless you can use very high bitrates combined with reasonably difficult presets, you're likely to get better picture quality by lowering the resolution, lowering the framerate, or both.
NVENC becomes an interesting option for streamers that have a recent nVidia GPU and plenty of bandwidth, but not a lot of CPU power. If you stick to the High Quality preset, and keep bitrate high, you'll get results roughly equal to x264 veryfast, or slightly better.
We also learned that NVENC and x264 have trouble with different types of frames. Even though x264 veryfast and NVENC High Quality are very similar on average, it's possible that certain games "just look better" on one or the other.
In general, if you are able to take advantage of more difficult x264 presets (faster and up), it still seems wise to do so.
Disclaimer: This analysis is not comprehensive. It covers only a specific game (Heroes of the Storm) and analyzes just one minute of gameplay footage. Results may not be applicable to other games, or even to the same game if there is significant variation in game settings, especially graphics and camera settings.