Thursday, August 17, 2017

Why crunch likes uncompressed texture data

We've recently gotten some interest in creating a RDO compressor specifically for already compressed textures, which is why I'm writing this.

crunch works best with (and is designed for) uncompressed RGBA texture data. You can feed crunch already compressed data (by compressing to DXT, unpacking the blocks, and throwing the unpacked pixels into the compressor), but it won't perform as well. Why you ask?

crunch uses top down clusterization on the block endpoints. It tries to create groups of blocks that share similar endpoints. Once it finds a group of blocks that seem similar enough, it then uses its DXT endpoint optimizers on these block clusters to create the near-optimal set of endpoints for that cluster. These clusters can be very big, which is why crunch/Basis can't use off the self DXT/ETC compressors which assume 4x4 blocks.

DXT/ETC are lossy formats, so there is no single "correct" encoding for each input (ignoring trivial inputs like solid-color blocks). There are many possible valid encodings that will look very similar. Because of this, creating a good DXT/ETC block encoder that also performs fast is harder than it looks, and adding additional constraints or requirements on top of this (such as rate distortion optimization on both the endpoints and the selectors) just adds to the fun.

Anyhow, imagine the data has already been compressed, and the encoder creates a cluster containing just a single block. Because the data has already been compressed, the encoder now has the job of determining exactly which endpoints were used originally to pack that block. crunch tries to do this for DXT1 blocks, but it doesn't always succeed. There are many DXT compressors out there, each using different algorithms. (crunch could be modified to also accept the precompressed DXT data itself, which would allow it to shortcut this problem.)

Things get worse if the endpoint clusterization step assigns 2+ blocks with different endpoints to the same cluster. The compressor now has to find a single set of endpoints to represent both blocks. Because the input pixels have already been compressed, we're now forcing the input pixels to lie along a quantized colorspace line (using 555/565 endpoints!) two times in a row. Quality takes a nosedive.

Basis improves this situation, although I still favor working with uncompressed texture data because that's what the majority of our customers work with.

Another option is to use bottom-up clusterization (which crunch doesn't use). This approach would work well on already compressed data. Quantizing just the selector data is the easiest thing to do first.

Sunday, June 25, 2017

Seattle

I'm by no means an expert on anything San Diego, having been there only around 1.5 months since leaving Seattle. I did spend 8 years in Seattle though, and here's what I think:

- Seattle is just way too dark of a city for me to live there year round. Here's Seattle vs. San Diego's sunshine (according to city-data.com).



The winter rain didn't bother me much at all. It was the lack of sun. (Hint to Seattle-area corporate recruiters: Fly in candidates from sunnier climates like Dallas to interview during July-August.)

- There's a constant background noise and auditory clutter to Seattle and the surrounding areas that's just getting louder and louder as buildings pop up and people (and their cars) move in.

Eventually this background noise got really annoying. Even downtown San Diego is surprisingly peaceful and quiet by comparison.

- Seattle's density is both a blessing and a curse. It's a very walkable city, so going without a car is possible if you live and work in the right places.

The eastside and westside buses can be incredibly, ridiculously over packed. Seattle needs to seriously get its public transportation act together.

- As a pedestrian, I've found Seattle's drivers to be so much nicer and peaceful on the road vs. San Diego's. CA drivers seem a lot more aggressive.

- San Diego is loaded with amazing beaches. Seattle - not so much.

A few misc. thoughts on Seattle and the eastside tech workers I encountered:

I lived and worked on the eastside (near downtown Bellevue) and westside (U District) for enough time to compare and contrast the two areas. The people in Seattle itself are generally quite friendly and easy going. Things seem to change quickly once you get to the eastside, which feels almost like a different state entirely.

I found eastside people to be much less friendly and living in their own little worlds. I wish I had spent more of my time living in Seattle itself instead of Bellevue. Culturally Bellevue feels cold and very corporate.

The wealthier areas on the eastside seemed the worse. Wealth and rudeness seem highly correlated. So far, I've yet to meet a Bellevue/Redmond tech 10-100 millionaire (or billionaire) that I found to be truly pleasant to be around or work with. I also learned over and over that there is only a weak correlation between someone's wealth and their ability to actually code. In many cases someone's tech wealth seemed to be related to luck of the draw, timing, personality, and even popularity. Some of the wealthiest programmers I met here were surprisingly weak software engineers.

I've seen this happen repeatedly over the years: Average software engineers get showered with mad cash and suddenly they turn inward, become raging narcissistic assholes, and firmly believe they and their code is godly. Money seems to bring out the worse personality traits in people.

Monday, June 19, 2017

Basis's RDO DXTc compression API

This is a work in progress, but here's the API to the new rate distortion optimizing DXTc codec I've been working on for Basis. There's only one function (excluding basis_get_version()): basis_rdo_dxt_encode(). You call it with some encoding parameters and an array of input images (or "slices"), and it gives you back a blob of DXTc blocks which you then feed to any LZ codec like zlib, zstd, LZHAM, Oodle, etc.

The output DXTc blocks are organized in simple raster order, with slice 0's blocks first, then slice 1's, etc. The slices could be mipmap levels, or cubemap faces, etc. For highest compression, it's very important to feed the output blocks to the LZ codec in the order that this function gives them back to you.

On my near-term TODO list is to allow the user to specify custom per-channel weightings, and to add more color distance functions. Right now it supports either uniform weights, or a custom model for sRGB colorspace photos/textures. Also, I may expose optional per-slice weightings (for mipmaps).

I'm shipping the first version (as a Windows DLL) tomorrow.

// File: basis_rdo_dxt_public.h
#pragma once

#include <stdlib.h>
#include <memory.h>

#ifdef BASIS_DLL_EXPORTS
#define BASIS_DLL_EXPORT __declspec(dllexport)  
#else  
#define BASIS_DLL_EXPORT
#endif  

#if defined(_MSC_VER)
#define BASIS_CDECL __cdecl
#else
#define BASIS_CDECL
#endif

namespace basis
{
    // The codec's current version number.
    const int BASIS_CODEC_VERSION = 0x0106; 
    
    // The codec can accept rdo_dxt_params's from previous versions for backwards compatibility purposes. This is the oldest version it accepts.
    const int BASIS_CODEC_MIN_COMPATIBLE_VERSION = 0x0106;

    typedef unsigned int basis_uint;
    typedef basis_uint rdo_dxt_bool;
    typedef float basis_float;

    enum rdo_dxt_format
    {
        cRDO_DXT1 = 0,
        cRDO_DXT5,
        cRDO_DXN,
        cRDO_DXT5A,

        cRDO_DXT_FORCE_DWORD = 0xFFFFFFFF
    };


    enum rdo_dxt_encoding_speed_t
    {
        cEncodingSpeedSlowest,
        cEncodingSpeedFaster,
        cEncodingSpeedFastest
    };

    const basis_uint RDO_DXT_STRUCT_VERSION = 0xABCD0001;

    const basis_uint RDO_DXT_QUALITY_MIN = 1;
    const basis_uint RDO_DXT_QUALITY_MAX = 255;
    const basis_uint RDO_DXT_MAX_CLUSTERS = 32768;
        
    struct rdo_dxt_params
    {
        basis_uint m_struct_size;
        basis_uint m_struct_version;

        rdo_dxt_format m_format;

        basis_uint m_quality;

        basis_uint m_alpha_component_indices[2];

        basis_uint m_lz_max_match_dist;

        // Output block size to use in RDO optimization stage, note this does NOT impact the blocks written to pOutput_blocks by basis_rdo_dxt_encode()
        basis_uint m_output_block_size;

        basis_uint m_num_color_endpoint_clusters;
        basis_uint m_num_color_selector_clusters;

        basis_uint m_num_alpha_endpoint_clusters;
        basis_uint m_num_alpha_selector_clusters;

        basis_float m_l;
        basis_float m_selector_rdo_quality_threshold;
        basis_float m_selector_rdo_quality_threshold_low;

        basis_float m_block_max_y_std_dev_rdo_quality_scaler;

        basis_uint m_endpoint_refinement_steps;
        basis_uint m_selector_refinement_steps;
        basis_uint m_final_block_refinement_steps;

        basis_float m_adaptive_tile_color_psnr_derating;
        basis_float m_adaptive_tile_alpha_psnr_derating;

        basis_uint m_selector_rdo_max_search_distance;

        basis_uint m_endpoint_search_height;
        basis_uint m_endpoint_search_width_first_line;
        basis_uint m_endpoint_search_width_other_lines;

        rdo_dxt_bool m_optimize_final_endpoint_clusters;
        rdo_dxt_bool m_optimize_final_selector_clusters;

        rdo_dxt_bool m_srgb_metrics;
        rdo_dxt_bool m_debugging;
        rdo_dxt_bool m_debug_output;
        rdo_dxt_bool m_hierarchical_mode;
        rdo_dxt_bool m_multithreaded;
        rdo_dxt_bool m_use_sse41_if_available;
    };

    inline void rdo_dxt_params_set_encoding_speed(rdo_dxt_params *p, rdo_dxt_encoding_speed_t encoding_speed)
    {
        if (encoding_speed == cEncodingSpeedFaster)
        {
            p->m_endpoint_refinement_steps = 1;
            p->m_selector_refinement_steps = 1;
            p->m_final_block_refinement_steps = 1;

            p->m_selector_rdo_max_search_distance = 3072;
        }
        else if (encoding_speed == cEncodingSpeedFastest)
        {
            p->m_endpoint_refinement_steps = 1;
            p->m_selector_refinement_steps = 1;
            p->m_final_block_refinement_steps = 0;

            p->m_selector_rdo_max_search_distance = 2048;
        }
        else
        {
            p->m_endpoint_refinement_steps = 2;
            p->m_selector_refinement_steps = 2;
            p->m_final_block_refinement_steps = 1;

            p->m_endpoint_search_width_first_line = 2;
            p->m_endpoint_search_height = 3;

            p->m_selector_rdo_max_search_distance = 4096;
        }
    }

    inline void rdo_dxt_params_set_to_defaults(rdo_dxt_params *p, rdo_dxt_encoding_speed_t default_speed = cEncodingSpeedFaster)
    {
        memset(p, 0, sizeof(rdo_dxt_params));

        p->m_struct_size = sizeof(rdo_dxt_params);
        p->m_struct_version = RDO_DXT_STRUCT_VERSION;

        p->m_format = cRDO_DXT1;

        p->m_quality = 128;

        p->m_alpha_component_indices[0] = 0;
        p->m_alpha_component_indices[1] = 1;

        p->m_l = .001f;

        p->m_selector_rdo_quality_threshold = 1.75f;
        p->m_selector_rdo_quality_threshold_low = 1.3f;

        p->m_block_max_y_std_dev_rdo_quality_scaler = 8.0f;

        p->m_lz_max_match_dist = 32768;
        p->m_output_block_size = 8;

        p->m_endpoint_refinement_steps = 1;
        p->m_selector_refinement_steps = 1;
        p->m_final_block_refinement_steps = 1;

        p->m_adaptive_tile_color_psnr_derating = 1.5f;
        p->m_adaptive_tile_alpha_psnr_derating = 1.5f;
        p->m_selector_rdo_max_search_distance = 0;

        p->m_optimize_final_endpoint_clusters = true;
        p->m_optimize_final_selector_clusters = true;

        p->m_selector_rdo_max_search_distance = 3072;

        p->m_endpoint_search_height = 1;
        p->m_endpoint_search_width_first_line = 1;
        p->m_endpoint_search_width_other_lines = 1;

        p->m_hierarchical_mode = true;

        p->m_multithreaded = true;
        p->m_use_sse41_if_available = true;

        rdo_dxt_params_set_encoding_speed(p, default_speed);
    }
        
    const basis_uint RDO_DXT_MAX_IMAGE_DIMENSION = 16384;

    struct rdo_dxt_slice_desc
    {
        // Pixel dimensions of this slice. A slice may be a mipmap level, a cubemap face, a video frame, or whatever.
        basis_uint m_image_width;
        basis_uint m_image_height;
        basis_uint m_image_pitch_in_pixels;

        // Pointer to 32-bit raster image. Format in memory: RGBA (R is first byte, A is last)
        const void *m_pImage_pixels;
    };

} // namespace basis

extern "C" BASIS_DLL_EXPORT basis::basis_uint BASIS_CDECL basis_get_version();
extern "C" BASIS_DLL_EXPORT basis::basis_uint BASIS_CDECL basis_get_minimum_compatible_version();

extern "C" BASIS_DLL_EXPORT bool BASIS_CDECL basis_rdo_dxt_encode(
    const basis::rdo_dxt_params *pEncoder_params,
    basis::basis_uint total_input_image_slices, const basis::rdo_dxt_slice_desc *pInput_image_slices,
    void *pOutput_blocks, basis::basis_uint output_blocks_size_in_bytes);

Sunday, April 30, 2017

Binomial stuff

One MS employee recently said to Stephanie (my partner) that (paraphrasing) "your company isn't stable and can't possibly last". My reply: We've been in business for over a year now, and our business is just a natural extension and continuation of our careers. I've been programming since 1985, and developing commercial data compression and other software since 1993. I've been doing this for a while and I'm not going to stop anytime soon.

Having my own small consulting company vs. just working full-time for a single corporation is just a natural next step to me. One thing I really liked about working at Valve was the ability to wheel my desk to virtually anywhere in the company and start adding value. I can now "wheel my desk" to anywhere in the world, and the freedom this gives us is amazing.

Binomial is a self-funded startup. We work on both development contracts and our current product (Basis). We haven't taken any investment money. Our "runway" is basically infinite.

Wednesday, April 19, 2017

Basis status

Just a small update. We've put like 99% of our effort into ETC1 and ETC1+DXT1 over the previous 5-6 months. Our ETC1 encoder supports RDO and an intermediate format, and has shipped on OSX/Linux/Windows. I've been modifying the ETC1 encoder to also support DXT1 (for our universal format) over the previous few weeks.

Our ETC1 encoder was written almost from scratch. The next major step is to roll back all the improvements and things I've learned while implementing our ETC1 encoder back into our DXT-specific encoder. crunch's support for DXT has a bunch of deficiencies which hurt ratio. (Roy Eltham and Fabian Giesen have recently pointed this issue out to me. I've actually been aware of inefficiencies in crunch's codebook generator for a few months, since working on the new codebook generator for ETC1.) I'm definitely fixing this problem (and others!) in Basis.

Saturday, March 18, 2017

Probiotic yogurt making

Just got back from a wonderful business trip to Portland Maine, visiting ForeFlight. Making more probiotic yogurt tonight because I ate up almost my entire stock on the trip. (It didn't help that we got stuck in a blizzard while there, but that turned out to be really fun.) The food in Portland is amazing!

The pot of boiling water is for sterilizing the growth medium, in this case 2% organic grassfed milk+raw sugar. After the milk is boiled (repastuerized) and cooled I inoculate it using a 10 strain probiotic blend from Safeway. I tried a bunch of probiotics before finding this particular brand, which seems magical for me. Without this extremely strong yogurt I have no idea how long it would have taken my gut to heal after the antibiotics I had to take in 2015.

Yogurt making like this is tricky. Early on, around 30% of my attempts failed in spectacular ways. These days my success rate is almost 100%. Sterilization of basically everything (including tools, spoons, etc.) over and over throughout the process is critical to success.



Nerd/Brogrammer spectrum

Okay, I've been watching Silicon Valley. This show is so realistic and exaggerated that I find it painful to watch at times (especially the first couple episodes), but it's also funny as hell. It really helps to put things into perspective about why I'm now a consultant and not "exclusive" to a single corporation anymore.


Some observations:

- Notice that the interpersonal relationships between basically all the characters are pretty toxic. Everyone seems to be exploiting everyone else in some way, and money/wealth is a large motivator to most characters.

- Look at the above frame. Where's the programmer in the middle in this series? The person who works out, is empathetic, loves to program, but isn't a total and complete asshole.