Data Conversion by Multi-Bit Neighborhood Mask Dithered Interpolation


(input)-o----->[quantizer]------>[adder]-->[table]
        |                           ^ 
        |--->[neighborhood sample]--|
Fig.1 Neighborhood Mask Dither

  Single-bit mask dither requires 2**k samples to converge for k truncated bits.
Multi-bit mask dither reduces sample neighborhood size
required for dithered interpolation convergence to weighted averages.
A k-bit mask dither yields a weighted average for each sample with k truncated bits.
In general, m-bit mask dither convergence with k truncated bits requires 2**(k-m) samples.
More table entries must be accessed, on average, for sampling with multi-bit mask dither. 

For example, consider the case of a 17x17x17 table T[][][] and sets of three 8-bit inputs,
each with 4 truncated bits and using 4-bit mask dither.
Up to 5 different table entries are accessed for each input tuple.
Truncated index values are R,G,B and corresponding dither bits are r,g,b:

  R = red >> 4   ; G = green >> 4   ; B = blue >> 4   
  r = red & 0x0F ; g = green & 0x0F ; b = blue & 0x0F
...then 5 table entries and their weights would be:
  ( T[R + (r & 0x08)>>3][G + (g & 0x08)>>3][B + (b & 0x08)>>3]<<3
  + T[R + (r & 0x04)>>2][G + (g & 0x04)>>2][B + (b & 0x04)>>2]<<2
  + T[R + (r & 0x02)>>1][G + (g & 0x02)>>1][B + (b & 0x02)>>1]<<1
  + T[R + (r & 0x01)][G + (g & 0x01)][B + (b & 0x01)] + T[R][G][B] ) >> 4 
This implementation generates a weighted average for each sample.

Alternatively, here is code for a 2-bit mask dither of 4 truncated bits
that converges to weighted average over neighborhoods of 8 samples:

 shift[] = {3,2,3,1,3,2,3,0};

for(i=0; i < end; i++) {		/* for all samples */
   r = RGB[0][i]; g = RGB[1][i]; b = RGB[2][i]; /* sample data */
   R = r >> 4; G = g >> 4; B == B >> 4;	/* truncate to index sparse table T[][][] */
   m = shift[i & 0x07];			/* select the shift for this sample */
   if( 0 == m ) {			/* special case lsb */
     /* 4 bits have 16 possible values;
    /*  for convergence with 8 samples, the lsb and 0 each have weight 0.5 */
     weighted_average[i] = (T[R+(r & 1)][G+(g & 1)][B+(b & 1)] + T[R][G][B])>>1;
   }
   else {
#define  mask(c) ((c>>m) & 1);	/* test a bit */
	/* increment indices with masked bits */
     weighted_average[i] = T[R+mask(r)][G+mask(g)][B+mask(b)]<<(m-1);
   }
}

For n-dimensional interpolation, up to 2**n table entries are accessed.
Considering 3-dimensional interpolations, trilinear accesses all 8 nearest entries,
while tetrahedral accesses 4 of the 8 but requires calculating which 4 entries to select.
Mask dither accesses up to one more table entry than the number of truncated bits.
Four-bit dither of 4 truncated bits accesses up to 5 table entries to interpolate each sample.
This is up to one more table access than tetrahedral, but with less computational complexity.
Geometric interpolation over more than 3 dimensions requires more than 4 table accesses,
but mask dither accesses <= 1 + truncated bit count, independent of table dimensions.
For some truncated bit count, dither using more bits trades average table accesses per sample
for convergence to average over smaller neighborhoods than dithering with fewer bits.
This is a classic instance of oversampling frequency vs signal precision. 

maintained by blekenbleu