Saturday, May 12, 2018

BC7 opaque encoding sweetspots

By running our non-RDO codec in an automated test thousands of times, I've identified 3-4 major encoding time vs. quality sweetspots. These regions are somewhat specific to our codec and how it's written and optimized, but I suspect roughly similar regions exist in any correctly written BC7 codec (CPU or GPU). Excluding ispc_texcomp, most codecs have key pbit handling flaws that prevent them from exploiting some of these regions at all.

It's commonly believed that BC7 has an incredibly large search space (which it does) and is therefore slow to encode, but only a few mode combinations and encoder search settings are actually valuable (i.e. give the highest quality for a given amount of encoding time). The valuable codec settings appear to fall into the four major regions described below (three if you ignore the last, impractically slow one).

Note that the maximum practical quality for this test corpus is ~51.39 dB average RGB PSNR. For reference, non-perceptual BC1 is around 10 dB lower than BC7 on average, so even real time BC7 is massively better than BC1.
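For anyone who wants to compare their own numbers against these, here's a minimal sketch of an RGB PSNR calculation. It assumes "RGB PSNR" means PSNR computed from the mean squared error pooled over all R, G and B samples of 8-bit images; the function name and the flat list-of-tuples layout are just for illustration and aren't taken from our codec.

```python
import math

def rgb_psnr(original, encoded):
    """PSNR in dB over the R, G, B samples of two equally sized 8-bit images.

    Each image is a sequence of (r, g, b) tuples. Alpha is not considered,
    an assumption that fits the opaque-only test described in this post.
    """
    if len(original) != len(encoded) or not original:
        raise ValueError("images must be non-empty and the same size")
    sq_err = 0
    for (r0, g0, b0), (r1, g1, b1) in zip(original, encoded):
        sq_err += (r0 - r1) ** 2 + (g0 - g1) ** 2 + (b0 - b1) ** 2
    mse = sq_err / (3 * len(original))
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(255.0 ** 2 / mse)
```

Because PSNR is logarithmic in the MSE, a 10 dB gap means roughly a 10x difference in mean squared error.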

Region 1: Real time


The first sweetspot is the "extremely fast" or real time region. Amazingly, it's possible to produce results in this region with minimal or no obvious block artifacts.

In this region you either use mode 6 alone (which will have noticeable block artifacts), or you combine modes 0 and 6 while limiting the # of mode 0 partitions examined by your partition estimator to a small number, like 1-4. Only a single partition (the best one returned by the estimator) is evaluated.

You can increase the # of partitions examined in mode 0 to improve quality a little more, which will reduce or eliminate block artifacts. You must handle pbits correctly (as I've detailed in previous posts) to exploit this mode combination, otherwise mode 0 will be useless due to massive banding.
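To make the pbit point concrete, here's a rough sketch of exhaustive endpoint rounding for one endpoint with its own pbit (mode 0 has 4 color bits plus a pbit, mode 6 has 7 plus a pbit). It is not the code from our codec or necessarily the method from the earlier posts: the function names are made up for illustration, it rounds each endpoint independently of the selectors, and it doesn't cover modes where two endpoints share a pbit (there you'd sum the error over both endpoints before picking the pbit).

```python
def expand_to_8(value, bits):
    """Expand a `bits`-wide value to 8 bits by replicating its top bits."""
    value <<= 8 - bits
    return value | (value >> bits)

def quantize_with_pbit(target, color_bits, pbit):
    """Best `color_bits`-wide code for one 8-bit target, given a fixed pbit.

    Returns (code, squared_error). Every code is tried, so the rounding is
    exact with respect to the value the decoder will actually produce.
    """
    best = None
    for code in range(1 << color_bits):
        decoded = expand_to_8((code << 1) | pbit, color_bits + 1)
        err = (decoded - target) ** 2
        if best is None or err < best[1]:
            best = (code, err)
    return best

def quantize_endpoint(rgb, color_bits):
    """Try both pbits for one endpoint and keep the one with lower total error.

    Returns (codes, pbit, total_squared_error).
    """
    results = []
    for pbit in (0, 1):
        codes, total = [], 0
        for c in rgb:
            code, err = quantize_with_pbit(c, color_bits, pbit)
            codes.append(code)
            total += err
        results.append((total, pbit, tuple(codes)))
    total, pbit, codes = min(results)
    return tuple(codes), pbit, total
```

For example, `quantize_endpoint((255, 255, 255), 4)` picks pbit 1, because with pbit 0 a 4-bit endpoint can't reach 255 after expansion. Ignoring this is the kind of mistake that shows up as banding in the low-precision modes.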

Encoding time: 0.5-3.5 secs, average quality: 48.2-49.09 dB RGB PSNR

Region 2: Fast


The second region uses modes 1, 5, 6, and possibly 3. The max # of partitions examined by the estimator ranges from 1-34, and the encoder uses the strongest pbit handling it supports (trying all combinations of pbits using correct rounding). The encoder can also try varying the selectors a bit to improve the endpoints. In mode 5 the encoder tries all component rotations. In this region there will be no visible block artifacts if the # of partitions examined in mode 1 is high enough (not sure of the threshold yet, probably at least 16).
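As an illustration of what "tries all component rotations" means for mode 5, here's a small sketch. BC7's 2-bit rotation field swaps the alpha channel with R, G or B, so the encoder can choose which channel gets the separate scalar endpoints and selectors. The `encode_error` callback is hypothetical: it stands in for whatever mode 5 encoder you have and returns the error for a block that has already been rotated.

```python
def rotate_rgba(pixel, rotation):
    """Swap alpha with R, G or B according to a 2-bit BC7 rotation value."""
    r, g, b, a = pixel
    if rotation == 0:
        return (r, g, b, a)   # no swap
    if rotation == 1:
        return (a, g, b, r)   # swap A and R
    if rotation == 2:
        return (r, a, b, g)   # swap A and G
    if rotation == 3:
        return (r, g, a, b)   # swap A and B
    raise ValueError("rotation must be 0-3")

def best_rotation(block, encode_error):
    """Try all four rotations and return (rotation, error) for the best one.

    `encode_error` is a hypothetical callback that encodes the rotated block
    in mode 5 and returns its error.
    """
    best = None
    for rotation in range(4):
        rotated = [rotate_rgba(p, rotation) for p in block]
        err = encode_error(rotated)
        if best is None or err < best[1]:
            best = (rotation, err)
    return best
```

On opaque textures alpha is constant, so a non-zero rotation effectively moves one color channel into the scalar slot, which helps on blocks where that channel isn't correlated with the other two.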

Encoding time: 5.5-16.4 secs, average quality: 49.8-50.8 dB RGB PSNR

Region 3: Basic


Most of the third major region uses modes 1, 3 and 6. Mode 2 can be added for slightly more quality. The # of partitions examined by the estimator ranges from 42-55, and the # of partitions evaluated ranges from 1-9. This region uses exhaustive pbit evaluation with correct rounding, and the evaluator tries several different ways of varying the selectors. There are no visible block artifacts in this region.
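The "examined" vs. "evaluated" distinction is a two-stage search, which can be sketched roughly like this. Everything here is an assumption made for illustration: the estimator is a crude proxy (squared distance to each subset's mean color), the patterns examined are simply the first N in table order, and `evaluate` is a hypothetical callback standing in for the full, expensive encode of a block with a given partition. The real BC7 partition tables are not reproduced.

```python
def estimate_partition_error(block, pattern):
    """Cheap proxy: squared distance of each pixel to its subset's mean color."""
    subsets = {}
    for pixel, subset in zip(block, pattern):
        subsets.setdefault(subset, []).append(pixel)
    err = 0.0
    for pixels in subsets.values():
        n = len(pixels)
        mean = [sum(p[c] for p in pixels) / n for c in range(3)]
        err += sum((p[c] - mean[c]) ** 2 for p in pixels for c in range(3))
    return err

def search_partitions(block, patterns, max_examined, max_evaluated, evaluate):
    """Rank the first `max_examined` patterns cheaply, then fully evaluate
    only the best `max_evaluated` of them.

    Returns (partition_index, error) for the best fully evaluated candidate.
    """
    examined = range(min(max_examined, len(patterns)))
    ranked = sorted(
        examined, key=lambda i: estimate_partition_error(block, patterns[i])
    )
    best = None
    for index in ranked[:max_evaluated]:
        err = evaluate(block, patterns[index])
        if best is None or err < best[1]:
            best = (index, err)
    return best
```

The two knobs map onto the numbers above: `max_examined` is the # of partitions the estimator looks at, and `max_evaluated` is how many of the top candidates get the full treatment. The estimator is cheap, so raising the first knob costs much less than raising the second.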

Encoding time: 21-54.0 secs, average quality: 50.98-51.24 dB RGB PSNR

Region 4: Slow to Exhaustive


Beyond this you can enable more modes, more partitions, etc. to very slightly increase quality. I have breakdowns on some interesting configurations here, but the massive increase in encoding time just isn't worth the tiny imperceptible quality gains.

Encoding time: 132.8-401.97 secs, average quality: 51.36-51.39 dB RGB PSNR
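To put a number on "isn't worth it", here's a quick back-of-the-envelope calculation of the quality gained per extra second when stepping from one region to the next. It pairs the slowest time with the highest PSNR quoted for each region, which is an assumption: the figures above are ranges and don't say which time goes with which quality.

```python
# (name, slowest time in seconds, highest PSNR in dB) from the figures quoted
# above. Pairing the two upper bounds is an assumption made for illustration.
REGIONS = [
    ("Real time", 3.5, 49.09),
    ("Fast", 16.4, 50.8),
    ("Basic", 54.0, 51.24),
    ("Slow to Exhaustive", 401.97, 51.39),
]

def marginal_gains(regions):
    """dB gained per extra second when moving from each region to the next."""
    gains = []
    for (_, t0, q0), (name, t1, q1) in zip(regions, regions[1:]):
        gains.append((name, (q1 - q0) / (t1 - t0)))
    return gains

for name, gain in marginal_gains(REGIONS):
    print(f"{name}: {gain:.4f} dB per extra second")
```

Under that pairing, the step into Fast buys roughly 0.13 dB per extra second, the step into Basic roughly 0.012, and the step into Slow to Exhaustive roughly 0.0004, i.e. each step is around an order of magnitude (or more) less efficient than the previous one.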

