
PostPosted: Aug 25, 2005 @ 4:11pm
by warmi

PostPosted: Aug 25, 2005 @ 4:57pm
by joshbu [MSFT]

PostPosted: Aug 26, 2005 @ 6:14am
by mm40

PostPosted: Aug 26, 2005 @ 6:41am
by refractor

PostPosted: Aug 26, 2005 @ 7:46am
by mm40
Awesome, what is the best way to use it?

Also, has anyone seen huge improvements from using the latest EVC compiler and SDK? I use the 2002 SDK for backward compatibility with all ARM devices, but I was wondering: if you use the 2005 SDK and compile specifically for XScale (if that is possible), does it speed up your code?

PostPosted: Aug 26, 2005 @ 12:57pm
by drgoldie
if you believe that your code is purely memory bound i suggest you have a look at this thread:

http://www.pocketmatrix.com/forums/view ... hp?t=19742

it turned out that by prefetching (using pure C code) it was possible to speed up the algorithm by several hundred percent.
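A minimal sketch of what prefetching in plain C can look like, assuming a 32-byte cache line (typical for XScale, but check your CPU). This is not the code from the linked thread. The function name and the line-size constant are made up for illustration. The idea is to issue a load from the next cache line early and only use its result one iteration later, so the memory fetch can overlap with the work on the current line. Whether it actually helps depends on the compiler and the core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte cache lines. Adjust for the target CPU. */
#define CACHE_LINE_BYTES 32u
#define WORDS_PER_LINE   (CACHE_LINE_BYTES / sizeof(uint32_t))

/* Hypothetical example: sums `count` words, loading the first word of the
   NEXT cache line before working through the current one. */
uint32_t sum_with_early_load(const uint32_t *src, size_t count)
{
    uint32_t sum = 0;
    uint32_t first;
    size_t i = 0;

    assert(src != NULL || count == 0);
    if (count == 0)
        return 0;

    first = src[0];
    while (i < count) {
        size_t end = i + WORDS_PER_LINE;
        size_t j;
        uint32_t next_first = 0;

        if (end > count)
            end = count;

        /* Early load: touches the next line now, value is used next iteration. */
        if (end < count)
            next_first = src[end];

        sum += first;                     /* word i, loaded one iteration ago */
        for (j = i + 1; j < end; ++j)
            sum += src[j];                /* rest of the current line */

        first = next_first;
        i = end;
    }
    return sum;
}
```

On compilers that expose a real prefetch hint, that would normally be preferable to this trick, since the early load still occupies a register.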

bye,
Daniel

PostPosted: Aug 26, 2005 @ 6:36pm
by Kzinti

PostPosted: Aug 26, 2005 @ 7:19pm
by fast_rx

PostPosted: Aug 26, 2005 @ 10:48pm
by joshbu [MSFT]
Prefetch the memory areas you are going to read.

Thus, in a blending or multisampling routine, yes, there would be some value to prefetching the destination, unless the destination is the primary surface.

On XScale devices, we encourage driver writers to mark the primary uncached with combining. This gives us the maximum write speed possible without cluttering the cache. It also means that prefetching will have no effect.

General graphics tip: reading back from the primary is only going to get more expensive on PPCs/SPs. Don't do it.

We've also seen some good speed-ups from making other buffers that are overwhelmingly written to (back buffers or composition buffers) uncached with combining.
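To illustrate the blending case above, here is a rough sketch of a 50/50 RGB565 blend that reads both source and destination, and loads the first pixel of the next cache line of each before processing the current line. The function names, the 32-byte line size, and the non-overlapping-buffers requirement are assumptions for the example, not anything from GAPI or a driver. As noted above, it only makes sense when the destination is cached memory. On an uncached primary surface the early read would just add an expensive read-back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte cache lines, i.e. 16 RGB565 pixels per line. */
#define PIXELS_PER_LINE (32u / sizeof(uint16_t))

/* Average two RGB565 pixels: clear each channel's low bit, halve, add. */
static uint16_t avg565(uint16_t a, uint16_t b)
{
    return (uint16_t)(((a & 0xF7DEu) >> 1) + ((b & 0xF7DEu) >> 1));
}

/* Hypothetical example: dst = (dst + src) / 2 per channel.
   dst and src must not overlap. */
void blend_half_rgb565(uint16_t *dst, const uint16_t *src, size_t count)
{
    uint16_t d_first, s_first;
    size_t i = 0;

    assert((dst != NULL && src != NULL) || count == 0);
    if (count == 0)
        return;

    d_first = dst[0];
    s_first = src[0];
    while (i < count) {
        size_t end = i + PIXELS_PER_LINE;
        size_t j;
        uint16_t d_next = 0, s_next = 0;

        if (end > count)
            end = count;

        /* Early loads from the next line of both buffers we will read. */
        if (end < count) {
            d_next = dst[end];
            s_next = src[end];
        }

        dst[i] = avg565(d_first, s_first);
        for (j = i + 1; j < end; ++j)
            dst[j] = avg565(dst[j], src[j]);

        d_first = d_next;
        s_first = s_next;
        i = end;
    }
}
```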

PostPosted: Aug 27, 2005 @ 1:38am
by mm40

PostPosted: Aug 27, 2005 @ 2:42am
by Andy

PostPosted: Aug 27, 2005 @ 5:55am
by mm40
I think I'm not seeing any improvements because of some other barrier. I just compiled blit.exe without any drawing except for the frame counter and the word 'pocketfrog' in the corner, and I only get 47fps! Obviously, trying to inch towards 47fps from 41fps by optimizing one function isn't going to cut it. The barrier must be somewhere in PocketFrog, GAPI, or the QVGA-to-VGA scaler in the x50v. Time to move benchmarking to my other Axim, I think...
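To put numbers on that, here is a small sketch of the frame-time arithmetic, assuming 41fps is the figure with drawing enabled and 47fps is the empty-frame ceiling. The helper names are made up for illustration. At 41fps a frame takes about 24.4 ms and at 47fps about 21.3 ms, so the drawing accounts for only about 3.1 ms per frame, roughly 13% of the total.

```c
#include <assert.h>

/* Milliseconds per frame at a given frame rate. */
double frame_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Fraction of the full frame time spent in drawing, given the rate with
   drawing enabled and the rate of an (almost) empty frame. */
double drawing_share(double fps_with_drawing, double fps_empty)
{
    assert(fps_with_drawing > 0.0 && fps_empty >= fps_with_drawing);
    return 1.0 - fps_with_drawing / fps_empty;
}
```

Calling `frame_ms(41.0) - frame_ms(47.0)` gives the per-frame cost of the drawing, and `drawing_share(41.0, 47.0)` gives its share of the frame. The remaining time is fixed overhead that no amount of tuning the drawing function can recover.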

PostPosted: Aug 29, 2005 @ 6:18am
by mm40

PostPosted: Aug 29, 2005 @ 8:46pm
by drgoldie

PostPosted: Aug 29, 2005 @ 10:09pm
by mm40