Flushing caches...
On an ARM system, look at the coprocessor (MCR) instructions.
On the StrongARM, cache control lives in register c7 of coprocessor 15 (written p15 in most assemblers). No idea about the XScale: it should be the same, but knowing Intel...
mcr 15, 0, r0, c7, c5, 0  ; flush the I cache
mcr 15, 0, r0, c7, c10, 4 ; drain (flush) the write buffer
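I seem to recall the data flush being horribly slow, so to flush the data cache just read a cache-sized block of data (contiguous in virtual address space) from RAM. Each read that misses pulls in a new line and pushes an old one out. Something like: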
; r0 = address of your buffer
; r1 = number of 32-byte lines to flush
.flush
ldr r2,[r0],#32 ; read one word, then step r0 on by 32 bytes (one cache line)
subs r1,r1,#1   ; one line fewer to go
bne flush       ; loop until r1 reaches zero
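If you're setting that loop up from C, here's a rough sketch of the arithmetic for r1 plus a portable version of the same read-one-word-per-line walk. The function names are made up for illustration, and the cache size you pass in is an assumption you need to check against your own part. Only the 32-byte line size comes from the loop above. On real hardware the buffer also has to be cacheable for this to push anything out.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u /* line size used by the loop above */

/* Hypothetical helper: how many line-sized reads cover cache_bytes,
   rounded up. This is the value that would go in r1. */
static size_t lines_to_touch(size_t cache_bytes)
{
    return (cache_bytes + CACHE_LINE_BYTES - 1u) / CACHE_LINE_BYTES;
}

/* Hypothetical helper: read one word from each 32-byte line of buf.
   The volatile qualifier stops the compiler from dropping the loads.
   Returns the number of reads performed. */
static size_t touch_lines(const volatile uint32_t *buf, size_t lines)
{
    const size_t words_per_line = CACHE_LINE_BYTES / sizeof(uint32_t);
    size_t i;

    for (i = 0; i < lines; i++) {
        uint32_t sink = buf[i * words_per_line]; /* one read per line */
        (void)sink;
    }
    return i;
}
```

So for an (assumed) 16 KB data cache you'd call touch_lines(buf, lines_to_touch(16384)), with buf pointing at least 16 KB of readable memory.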
You can also flush single entries - it's all in the StrongARM developer manual at page 59.
Oh, and Microsoft ported the Q2 engine to their CLR:
http://msdn.microsoft.com/visualc/quake/default.aspx