This site is no longer active and is available for archival purposes only. Registration and login is disabled.

assembly bitmap scaler


Postby mm40 » Aug 12, 2005 @ 6:29pm

User avatar
mm40
pm Member
 
Posts: 135
Joined: Feb 21, 2003 @ 9:11pm


Postby refractor » Aug 12, 2005 @ 7:10pm

User avatar
refractor
pm Insider
 
Posts: 2304
Joined: Feb 5, 2002 @ 1:12pm
Location: Luxembourg


Postby mm40 » Aug 12, 2005 @ 8:02pm



Postby mm40 » Aug 18, 2005 @ 5:49am

Sorry for the delay. I have created a testbed and a C scaler function. The testbed uses PocketFrog, which you can download here:

http://pocketfrog.droneship.com/download.html

Use EVC 3.0:
1) Download and unzip PocketFrog
2) Open the PocketFrog workspace
3) Replace the example blit.cpp file with the one attached
4) Compile the PocketFrog lib
5) Compile Blit and run it

You could convert the whole blitStretch function and its sub-function, but I think that might be a waste of time, as it is only doing some simple clipping and probably won't yield much of an increase in performance.

The workhorse function is blitStretchClippedFixed, and it is the one that should be tuned in ASM if possible.
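For readers without the attachment, here is a minimal sketch of what a fixed-point stretch loop of this kind can look like. It is not the code from blit.cpp: the name stretchNearest16, the 16.16 fixed-point format, nearest-neighbour sampling, 16-bit pixels and pitches measured in pixels are all assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fixed-point stretch blit (NOT the blit.cpp code).
// Nearest-neighbour scaling of 16-bit pixels using 16.16 fixed-point steps.
// Pitches are in pixels; source dimensions are assumed to be below 32768 so
// that (size << 16) fits in 32 bits.
void stretchNearest16(const uint16_t* src, int srcW, int srcH, int srcPitch,
                      uint16_t* dst, int dstW, int dstH, int dstPitch)
{
    if (srcW <= 0 || srcH <= 0 || dstW <= 0 || dstH <= 0) return;

    // Source advance per destination pixel, computed once per blit.
    const uint32_t stepX = (static_cast<uint32_t>(srcW) << 16) / dstW;
    const uint32_t stepY = (static_cast<uint32_t>(srcH) << 16) / dstH;

    uint32_t v = 0;  // vertical source position, 16.16
    for (int y = 0; y < dstH; ++y, v += stepY) {
        const uint16_t* srow = src + (v >> 16) * srcPitch;
        uint16_t* drow = dst + y * dstPitch;
        uint32_t u = 0;  // horizontal source position, 16.16
        for (int x = 0; x < dstW; ++x, u += stepX)
            drow[x] = srow[u >> 16];  // integer part selects the source pixel
    }
}
```

The inner loop is one add, one shift, one load and one store per pixel, which is the part an ARM rewrite would target.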

I have some benchmark results in the file. Let me know if you need me to explain anything else or need some code changes or comments. I think the blitStretchClippedFixed function is about as optimized as it can be in C.
Attachments
blit.cpp
(9.69 KiB) Downloaded 600 times


Postby Andy » Aug 18, 2005 @ 6:42am

Andy
Troll++
 
Posts: 1288
Joined: Nov 1, 2003 @ 7:36am


Postby Dan East » Aug 18, 2005 @ 1:35pm

User avatar
Dan East
Site Admin
 
Posts: 5264
Joined: Jan 25, 2001 @ 5:19pm
Location: Virginia, USA


Postby mm40 » Aug 18, 2005 @ 3:48pm



Postby refractor » Aug 18, 2005 @ 8:28pm

It's looking fairly bus-bound so I'm not expecting a huge gain in performance from an ARM version, but I'll do it tomorrow anyway. :mrgreen:

You should definitely sort out the blitStretch function though -- ideally you should do all of that in fixed point (the float divisions will cost a lot).
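As a rough illustration of that suggestion, the clipping setup for one axis can be done with a single integer division and no floats. This is a sketch under assumptions, not the blitStretch code from the attachment: the Span struct, the clipAxisFixed name and the 16.16 format are hypothetical.

```cpp
#include <cstdint>

// Hypothetical result of clipping one axis of a stretch blit.
struct Span {
    int      dstStart;  // first visible destination pixel
    int      dstCount;  // number of visible destination pixels
    uint32_t srcStart;  // 16.16 source position of the first visible pixel
    uint32_t step;      // 16.16 source advance per destination pixel
};

// Clip one axis to [0, clipMax) using integer math only.
// dstPos may be negative; srcLen is assumed to be below 32768.
Span clipAxisFixed(int dstPos, int dstLen, int srcLen, int clipMax)
{
    Span s = {0, 0, 0, 0};
    if (dstLen <= 0 || srcLen <= 0) return s;

    s.step = (static_cast<uint32_t>(srcLen) << 16) / dstLen;

    const int first = dstPos < 0 ? 0 : dstPos;
    const int last  = dstPos + dstLen > clipMax ? clipMax : dstPos + dstLen;
    if (last <= first) return s;  // fully clipped, dstCount stays 0

    s.dstStart = first;
    s.dstCount = last - first;
    // Skip the source by however many destination pixels were clipped off.
    s.srcStart = static_cast<uint32_t>(first - dstPos) * s.step;
    return s;
}
```

Calling this once for x and once for y gives the start offsets and steps the inner loop needs.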


Postby refractor » Aug 19, 2005 @ 1:20pm

Right.

Attached is a blit.exe with blitStretchClippedFixed coded in ARM (the original function, not Andy's). I'll post the source for the ARM function in a few hours after I've tidied it up a bit.

I haven't had time to properly speed-test it yet (I just wrote it and made sure it ran). As I said before, it looks pretty bus-bound and I wouldn't be surprised if there was little to no gain (other than giving me something interesting to do during my lunch hour).

I could probably kill another cycle or two out of the setup, but the loops are looking pretty tight (tighter than the compiler's version at least). I could also modify the ARM based on Andy's suggestion and see if that helped at all.
Attachments
Blit.zip
Compiled with evc++3 default settings in release mode. blitStretchClippedFixed has been replaced with one written in ARM code.
(43.33 KiB) Downloaded 561 times


Postby mm40 » Aug 19, 2005 @ 4:24pm

I get the same results as with Andy's function, but this is on a very fast device (x50v). On a device with a slow CPU I think this may make a big difference; I will test that specifically in the future.


Postby refractor » Aug 19, 2005 @ 6:16pm

Attachments
blitStretchClippedFixed.asm.txt
blitStretchClippedFixed in ARM assembler
(7.03 KiB) Downloaded 635 times


Postby mm40 » Aug 21, 2005 @ 2:52am



Postby joshbu [MSFT] » Aug 23, 2005 @ 10:30pm

joshbu AT microsoft dot-you-know-where
Windows CE Software Design Engineer

“This posting is provided “AS IS” with no warranties, and confers no rights.”
joshbu [MSFT]
pm Member
 
Posts: 60
Joined: Apr 10, 2004 @ 12:28am
Location: Redmond, WA


Postby mm40 » Aug 25, 2005 @ 12:55am

Hi joshbu, thanks for your suggestions.

I tried a 32-bit implementation and it provided no speed improvement at all. It also brought the mess of making sure everything was aligned and cleaning up when it wasn't, so it didn't seem to be worth it.
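To show the kind of alignment handling involved, here is a sketch of one scaled row written two pixels at a time. It is an illustration only, not the implementation that was tested: stretchRowPacked is a hypothetical name, and the code assumes 16-bit pixels, 16.16 fixed-point stepping and a little-endian target.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical row scaler that packs two 16-bit pixels into each 32-bit store.
// Assumes a little-endian target (the low half-word lands at the lower address).
void stretchRowPacked(const uint16_t* srow, uint16_t* drow, int dstW, uint32_t stepX)
{
    uint32_t u = 0;  // 16.16 source position
    int x = 0;

    // Leading single pixel when the destination is not 4-byte aligned.
    if (dstW > 0 && (reinterpret_cast<uintptr_t>(drow) & 3u) != 0) {
        drow[x++] = srow[u >> 16];
        u += stepX;
    }

    // Main loop: two source reads, one 32-bit write.
    for (; x + 1 < dstW; x += 2) {
        const uint32_t lo = srow[u >> 16]; u += stepX;
        const uint32_t hi = srow[u >> 16]; u += stepX;
        const uint32_t packed = lo | (hi << 16);
        std::memcpy(drow + x, &packed, sizeof packed);  // compiles to a single store
    }

    // Trailing single pixel when an odd count remains.
    if (x < dstW) drow[x] = srow[u >> 16];
}
```

The source reads are still one half-word each, so only the write side gets wider, which may be why the gain is small when the loop is bus-bound.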

To ensure scan lines are cache aligned, wouldn't that be specific to each type of processor or device? Or do they all have the same cache sizes? I'm not sure this would matter, since my data is all in a tightly packed array, so it should already be missing the cache as little as possible.
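For reference, aligning scan lines usually comes down to padding the row pitch to a multiple of the cache line size. A minimal sketch, with the line size passed in as a parameter because it does vary between processors; alignedPitchBytes is a hypothetical helper, and the 32-byte value in the usage note is an assumption, not a figure from this thread.

```cpp
#include <cstddef>

// Hypothetical helper: round a row's byte length up to a multiple of lineBytes.
// lineBytes must be a power of two.
inline std::size_t alignedPitchBytes(std::size_t widthPixels,
                                     std::size_t bytesPerPixel,
                                     std::size_t lineBytes)
{
    return (widthPixels * bytesPerPixel + lineBytes - 1) & ~(lineBytes - 1);
}
```

With an assumed 32-byte line, a 240-pixel row of 16-bit pixels is already a whole number of lines (480 bytes), while a 100-pixel row would be padded from 200 to 224 bytes. For the pitch to matter, the buffer's base address would also have to be aligned to the same boundary.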

Adding special cases for certain scale factors would speed it up a lot, but it is so rare to hit those specific sizes that it's not worth doing (of course, this depends a lot on your app).
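As an example of such a special case, an exact 2x enlargement needs no fixed-point stepping at all. The sketch below is hypothetical (stretchDouble16 is not a function from blit.cpp) and assumes 16-bit pixels with pitches measured in pixels; it returns false when the sizes don't match so the caller can fall back to the general path.

```cpp
#include <cstdint>

// Hypothetical fast path: handles only an exact 2x scale in both axes.
// Returns false without touching dst if the sizes are not an exact doubling.
bool stretchDouble16(const uint16_t* src, int srcW, int srcH, int srcPitch,
                     uint16_t* dst, int dstW, int dstH, int dstPitch)
{
    if (dstW != srcW * 2 || dstH != srcH * 2) return false;

    for (int y = 0; y < srcH; ++y) {
        const uint16_t* s  = src + y * srcPitch;
        uint16_t*       d0 = dst + (2 * y) * dstPitch;  // first output row
        uint16_t*       d1 = d0 + dstPitch;             // second output row
        for (int x = 0; x < srcW; ++x) {
            const uint16_t p = s[x];  // read once, write four times
            d0[2 * x] = d0[2 * x + 1] = p;
            d1[2 * x] = d1[2 * x + 1] = p;
        }
    }
    return true;
}
```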


Postby refractor » Aug 25, 2005 @ 6:42am


