Graphics conventions for dummies

2016-12-11

Unless you have a compelling reason to do otherwise, use the following conventions:

Assume 8-bit data is sRGB

It’s the de facto standard color space. In the absence of an embedded color profile, it is safe to assume that the data is sRGB. In any case, sRGB is a better bet than linear RGB.

Do math in linear RGB[1]

This includes interpolation, blending, convolutions, etc. sRGB is neither physically correct nor perceptually uniform, so treat it only as a storage format, whether in memory or on disk.

Use premultiplied alpha

Always.

In conjunction with a linear color space, it is the only correct way to alpha blend and filter.
It can encode additive and ‘over’ regions within the same image (e.g. flames and smoke).
It properly weights the color information by its contribution to the final image, thus facilitating compression.[2]

The selling point of straight alpha is that it is intuitively understood as covering the area with a diluted paint of the given RGB color. However, premultiplied alpha becomes just as intuitive once you teach yourself to think of the alpha value as the percentage of the background light absorbed and the RGB part as the extra light emitted on top.

When encoding to sRGB, first premultiply by alpha, then encode the resulting color components with the sRGB transfer function. The alpha value is kept linear. This is what the built-in GPU hardware expects.

In OpenGL, configure blending with:

glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);

The same factors apply to the alpha channel, so the destination alpha comes out right too.
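For reference, here is a CPU-side sketch of the arithmetic those blend factors perform, assuming premultiplied colors in linear light. The Rgba type and the over() name are illustrative only.

```c
typedef struct { float r, g, b, a; } Rgba;  /* premultiplied, linear light */

/* 'Over' with factors (ONE, ONE_MINUS_SRC_ALPHA): the same formula
   is applied to all four channels, alpha included. */
static Rgba over(Rgba src, Rgba dst)
{
    float k = 1.0f - src.a;  /* fraction of the background that survives */
    Rgba out;
    out.r = src.r + k * dst.r;
    out.g = src.g + k * dst.g;
    out.b = src.b + k * dst.b;
    out.a = src.a + k * dst.a;
    return out;
}
```

A source with alpha 0 and nonzero RGB absorbs nothing and only adds light, which is how a flame can sit in the same image as ordinary ‘over’ smoke.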

Never use HSV/HSB/HSL/HSI

They are ad hoc parametrizations of the RGB color space with no relation to color science. Their only use is in color pickers designed for artists who grew up using those color pickers. Use CIELUV (or relatives) instead.

Need 16-bit on GPU?

Prefer half-floats to integers. The usual reason for storing 16 bits is not the precision gained but the higher dynamic range, and this is where floating-point formats excel. Storing the logarithm of the value in an integer is also a widely used option for storage formats. Unfortunately, it is inconvenient for processing.

Rasters aren’t matrices

Matrices are representations of linear transformations or bilinear forms. Mathematicians index their elements by (row, column).

Rasters, on the other hand, are discrete approximations of vector fields C(x,y), and as such should be indexed by (column, row).

Some think that it’s appropriate to store rasters in matrix datatypes, so they confusingly switch the order of the indices. However, it’s important to keep the two apart. E.g. you would never want to do a matrix multiplication between two rasters, but a multiplication operator that performs elementwise multiplication on rasters is extremely useful.
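A minimal sketch of a raster type along these lines: access is by (x, y), and multiplication is elementwise. The Raster struct, raster_at and raster_mul are hypothetical names, and row-major storage is an assumption.

```c
#include <stddef.h>

typedef struct {
    int width, height;
    float *data;  /* row-major: the pixels of one row are contiguous */
} Raster;

/* Pointer to the pixel in column x, row y. */
static float *raster_at(const Raster *r, int x, int y)
{
    return &r->data[(size_t)y * (size_t)r->width + (size_t)x];
}

/* Elementwise product out = a * b.
   Returns 0 on success, -1 if the sizes do not match. */
static int raster_mul(const Raster *a, const Raster *b, Raster *out)
{
    if (a->width != b->width || a->height != b->height ||
        a->width != out->width || a->height != out->height)
        return -1;
    for (int y = 0; y < a->height; ++y)
        for (int x = 0; x < a->width; ++x)
            *raster_at(out, x, y) = *raster_at(a, x, y) * *raster_at(b, x, y);
    return 0;
}
```

The memory layout stays an implementation detail. Only the (x, y) order is visible to the caller.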

Offender: OpenCV.

Pixels are squares with centers on half-integer coordinates

This plays well with scaling[3], and is nicely symmetric.

On screen Y axis points upwards

This is already the prevailing convention in mathematics. Working with GUI and text? It’s OK to use Y downwards there.

In 3D world XY is the horizontal plane

Because heightmaps are functions z(x,y), with higher z values going ‘upwards’. Consequently an unrotated camera hangs downwards, looking towards the negative Z axis (nadir).

Use quaternions for orientation

They are more robust than rotation matrices, and take less space.[4] But most importantly they follow the ‘every value is valid’ doctrine.

Footnotes

[1] With ITU-R BT.709 primaries, i.e. sRGB before the gamma encoding is applied.
[2] PNG should have used premultiplied alpha; alas, it does not.
[3] When building a mipmap/pyramid, the origin doesn’t shift.
[4] A good application is interpolating the tangent space on the GPU: a single vertex attribute is enough to do correct bump mapping. One can also store quaternions in the normal map, which allows anisotropic materials with a direction that varies per fragment.