Were Compaq on something when they designed the totally illogical video memory layout for the 38xx?
Having conducted some initial tests, I can confirm that my graphics application currently runs at 20 fps with a correct conversion from an off-screen buffer to video memory, and at 30 fps with a direct memcpy (which, of course, displays a rearranged version of my buffer on the screen).
Has anyone had any experience with writing specialised graphics functions (polygon blitters and so on) to "natively" write to the 38xx screen layout, therefore allowing a direct memcpy from the offscreen buffer to display memory?
I'm slightly concerned that the extra work in my graphics functions will negate the benefit of being able to memcpy from the offscreen buffer to display memory, especially given that my application uses alpha transparency and therefore incurs a certain amount of overdraw.
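To make the idea concrete, here is a rough sketch in C of what I mean by "native" functions. The Layout struct, its field names, the helper names and the stride values are all made up for illustration; I'm not claiming they match the real 38xx figures. The drawing code works in logical (x, y) coordinates but writes into an offscreen buffer that already has the display's arrangement, so presenting a frame is a single memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout descriptor, with strides measured in 16-bit pixels.
   A column-major display would have y_step == 1 and x_step == height. */
typedef struct {
    int width, height;   /* logical size in pixels */
    ptrdiff_t x_step;    /* elements to move one pixel right */
    ptrdiff_t y_step;    /* elements to move one pixel down */
    ptrdiff_t origin;    /* element index of logical pixel (0,0) */
} Layout;

/* Map a logical coordinate to an index in the display-ordered buffer. */
static ptrdiff_t pixel_index(const Layout *l, int x, int y)
{
    assert(x >= 0 && x < l->width && y >= 0 && y < l->height);
    return l->origin + x * l->x_step + y * l->y_step;
}

/* Fill a rectangle (caller clips it to the layout first). The inner loop
   walks the axis with the smaller stride, so writes stay as close to
   sequential as the layout allows. */
static void fill_rect(uint16_t *buf, const Layout *l,
                      int x0, int y0, int w, int h, uint16_t colour)
{
    ptrdiff_t ax = l->x_step < 0 ? -l->x_step : l->x_step;
    ptrdiff_t ay = l->y_step < 0 ? -l->y_step : l->y_step;

    if (ay < ax) {
        /* Columns are (nearly) contiguous: one span per column. */
        for (int x = x0; x < x0 + w; x++) {
            ptrdiff_t i = pixel_index(l, x, y0);
            for (int n = 0; n < h; n++, i += l->y_step)
                buf[i] = colour;
        }
    } else {
        /* Rows are (nearly) contiguous: one span per row. */
        for (int y = y0; y < y0 + h; y++) {
            ptrdiff_t i = pixel_index(l, x0, y);
            for (int n = 0; n < w; n++, i += l->x_step)
                buf[i] = colour;
        }
    }
}

/* Present a frame: the buffer is already in display order, so no
   per-pixel conversion is needed. */
static void present(uint16_t *vram, const uint16_t *buf, const Layout *l)
{
    memcpy(vram, buf, (size_t)l->width * (size_t)l->height * sizeof *buf);
}
```

In this sketch the stride arithmetic is done once per span, and the inner loop is a single store plus an index add whichever way the layout runs. Whether the same holds for a polygon blitter with alpha blending is exactly what I'm unsure about.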