Guys,<br><br>There is a way to do this much cheaper, but I would have to investigate a bit first. The basic idea is that you first create 'bitgaps' between the color components by exploiting the fact that a 565 color occupies only 16 bits of a 32-bit register. If you shift red a lot to the left and green also a little, gaps of zeroes are created between the components. The good thing about the gaps is that you can scale R, G and B with a single multiply; after that you shift the whole bunch back and there you have your scaled color (which is basically what you want with alpha blending).<br><br>This technique can be improved by precalculating the bitgaps. This is of course only useful when you use a palette.<br><br>Which brings me to another solution that I use often: if you use a palette anyway, why not precalculate 16 or 32 scaled versions of the entire palette? If you want to do 25%/75% blending of two colors, this is simply a matter of looking up the 25% version of color 1 and the 75% version of color 2, adding them, and voilà.<br><br>I used this technique for very fast bilinear filtering; my texture mapper with 5-bit bilerp ran faster without MMX code than the code that Intel did WITH their MMX.
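<br><br>To make the bitgap idea and the scaled palette a bit more concrete, here is a minimal sketch in C. It is not my exact code: the mask layout (green moved to the upper half of a 32-bit word), the 0..32 level range, the 33-entry table and all function names are assumptions picked for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout after spreading: green in bits 21-26, red in 11-15,
   blue in 0-4. Every component has at least 5 zero bits above it. */
#define GAP_MASK 0x07E0F81Fu

/* Spread a 565 color so that the components are separated by gaps. */
static uint32_t spread565(uint16_t c) {
    return (((uint32_t)c << 16) | c) & GAP_MASK;
}

/* Drop the fraction bits left in the gaps and fold back to 16 bits. */
static uint16_t pack565(uint32_t s) {
    s &= GAP_MASK;
    return (uint16_t)(s | (s >> 16));
}

/* Scale all three components with one multiply; level is 0..32 (32 = 100%). */
static uint16_t scale565(uint16_t c, uint32_t level) {
    assert(level <= 32);
    return pack565((spread565(c) * level) >> 5);
}

/* Blend: level/32 of c1 plus (32 - level)/32 of c2, two multiplies total. */
static uint16_t blend565(uint16_t c1, uint16_t c2, uint32_t level) {
    assert(level <= 32);
    return pack565((spread565(c1) * level + spread565(c2) * (32 - level)) >> 5);
}

/* Precalculated scaled palettes: one 256-entry copy per level (about 17 KB).
   33 levels instead of 32 is an assumption, so that 100% is available too. */
#define LEVELS 33
static uint16_t scaled_palette[LEVELS][256];

static void build_scaled_palette(const uint16_t palette[256]) {
    for (uint32_t level = 0; level < LEVELS; level++)
        for (int i = 0; i < 256; i++)
            scaled_palette[level][i] = scale565(palette[i], level);
}
```

With the table built, a 25%/75% blend of palette entries i and j is just scaled_palette[8][i] + scaled_palette[24][j]. As long as the two levels add up to at most 32, no component can overflow into its neighbour, so a plain 16-bit add is enough. Because each term is truncated separately, the result can be one step lower per component than what blend565 gives.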
<br><br>About LUTs: if you have an array like this: int a[16][256], and you want item [10][1], the compiler does not compute *(a + 1 + 10 * 256) with a multiply, but rather *(a + 1 + (10 << 8)), because the row length is a power of two. It's just a matter of picking your array sizes smartly. If you are uncertain about these compiler optimizations, simply always build 1D arrays and do the shift yourself.<br><br>Final note, about LUTs and the cache: I found that on the PC an integer multiply is almost always better than a lookup in a huge table (where 'huge' is 32K or above). On the PocketPC, I have no idea how this compares.
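<br><br>For the array indexing point, a small sketch of the "build a 1D array and do the shift yourself" variant. The sizes (16 rows of 256 ints) and the names are assumptions for illustration; the only thing that matters is that the row length is a power of two.

```c
#include <stddef.h>

enum {
    ROW_SHIFT = 8,              /* log2 of the row length */
    ROW_LEN   = 1 << ROW_SHIFT, /* 256 entries per row */
    ROWS      = 16
};

/* 2D version: the compiler works out row * ROW_LEN + col by itself. */
static int table2d[ROWS][ROW_LEN];

/* 1D version of the same table, indexed by hand. */
static int table1d[ROWS << ROW_SHIFT];

/* Flat index of element [row][col]: a shift and an add, no multiply. */
static size_t flat_index(size_t row, size_t col) {
    return (row << ROW_SHIFT) + col;
}

static int lookup1d(size_t row, size_t col) {
    return table1d[flat_index(row, col)];
}
```

Both tables have the same layout in memory, so table2d[10][1] and table1d[flat_index(10, 1)] sit at the same offset from the start of their array. If the row length is not a power of two (say 5), the shift no longer applies and you are back to a multiply or a shift-and-add sequence.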
<br><br>Greets,<br>- Jacco.