[quote]My experiments have shown that writing 32 bits at a time is much faster than writing 16 bits.[/quote]
It would be. The StrongARM can't play with 16-bit memory very well, so for each pixel it ends up loading the 32-bit word, replacing the upper or lower half-word with the new value, and storing all 32 bits again... then loading the same word, replacing the next pixel and storing it again... which isn't exactly efficient.
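
If you want to do the packing yourself, here's a rough sketch of the idea: put two 16-bit pixels into one 32-bit word and store that, so each word gets touched once instead of twice. The function name `fill16_packed` and the solid-colour fill are just my example, not anything from your code. It assumes `dst` points at 16-bit pixels and is at least 2-byte aligned. I've used `memcpy` for the 32-bit store to stay within C's aliasing rules; a decent compiler should turn it into a single word store, but check the generated code on your toolchain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `count` 16-bit pixels with `color`,
 * writing two pixels per 32-bit store wherever possible. */
void fill16_packed(uint16_t *dst, uint16_t color, size_t count)
{
    /* If dst is not 32-bit aligned, write one pixel to get aligned. */
    if (count > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = color;
        count--;
    }

    /* Same colour in both halves, so byte order doesn't matter here. */
    uint32_t pair = ((uint32_t)color << 16) | color;

    /* Main loop: one 32-bit store covers two pixels. */
    while (count >= 2) {
        memcpy(dst, &pair, sizeof pair);
        dst += 2;
        count -= 2;
    }

    /* Odd pixel left over at the end. */
    if (count > 0) {
        *dst = color;
    }
}
```

If the two pixels differ (say, when copying a sprite), which one goes in the upper half depends on the byte order, so you'd have to get that right for your setup.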