
Some help to optimize my new ATARI ST emu for PocketPC...


Postby AndrewGower » Oct 21, 2002 @ 1:26am

These are the best I can manage for endian swap routines. I can't help but feel there should be an even better solution, but I can't work out what it is, if there is one.

-----
word endian swap
-input in r0
-big-endian answer returned in r1
-corrupts r2

mov r1, r0, lsr #8 ;swap top half into r1
mov r2, r0, lsl #24 ;swap bottom half into r2
add r1, r1, r2, lsr #16 ;combine parts together
-----
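
For readers who find it easier to check the shifts in C, here is a rough equivalent of the three instructions above. This is only a sketch: the names `swap16` and `check_swap16` are made up for illustration, and, like the assembly, it assumes the top 16 bits of the input are zero (the `uint16_t` parameter guarantees that here).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C equivalent of the 'word endian swap' routine.
   Each statement mirrors one ARM instruction from the post. */
static inline uint16_t swap16(uint16_t v)
{
    uint32_t r0 = v;              /* input, upper 16 bits are zero */
    uint32_t r1 = r0 >> 8;        /* mov r1, r0, lsr #8      */
    uint32_t r2 = r0 << 24;       /* mov r2, r0, lsl #24     */
    r1 = r1 + (r2 >> 16);         /* add r1, r1, r2, lsr #16 */
    return (uint16_t)r1;
}

/* Usage check: a swap exchanges the two bytes, and two swaps cancel out. */
void check_swap16(void)
{
    assert(swap16(0x1234) == 0x3412);
    assert(swap16(swap16(0xBEEF)) == 0xBEEF);
}
```

Because the two shifted parts never overlap, the `add` behaves exactly like an `orr` here.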

-----
long endian swap
-input in r0
-big-endian answer returned in r1
-corrupts r2,r0
(If you wish to retain the input, it could corrupt r3 instead of r0.)

mov r1, r0, lsr #24 ;put byte 4 in position
add r1, r1, r0, lsl #24 ;put byte 1 in position
mov r0, r0, ror #16 ;move bytes 2 and 3 to edge
mov r2, r0, lsr #24 ;put byte 2 in position
add r2, r2, r0, lsl #24 ;put byte 3 in position
add r1, r1, r2, ror #16 ;combine final answer
-----
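
The six instructions above can be traced in C as well. The following is a sketch only; `ror32`, `swap32` and `check_swap32` are illustrative names, and `ror32` stands in for the ARM `ror` shift.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits (0 < n < 32), standing in for ARM's 'ror'. */
static inline uint32_t ror32(uint32_t v, unsigned n)
{
    return (v >> n) | (v << (32u - n));
}

/* Hypothetical C equivalent of the 'long endian swap' routine.
   Each statement mirrors one ARM instruction from the post. */
uint32_t swap32(uint32_t r0)
{
    uint32_t r1, r2;
    r1 = r0 >> 24;              /* mov r1, r0, lsr #24     */
    r1 = r1 + (r0 << 24);       /* add r1, r1, r0, lsl #24 */
    r0 = ror32(r0, 16);         /* mov r0, r0, ror #16     */
    r2 = r0 >> 24;              /* mov r2, r0, lsr #24     */
    r2 = r2 + (r0 << 24);       /* add r2, r2, r0, lsl #24 */
    r1 = r1 + ror32(r2, 16);    /* add r1, r1, r2, ror #16 */
    return r1;
}

/* Usage check: all four bytes are reversed, and two swaps cancel out. */
void check_swap32(void)
{
    assert(swap32(0x11223344u) == 0x44332211u);
    assert(swap32(swap32(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```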

To use these in your reads and writes, just run the snippets above to swap the endianness of the data in the register before writing it normally, or after reading it normally.

e.g.

for read16, do a normal 16 bit read, then run 'word endian swap' on the register

for write16 run 'word endian swap' on the register, and then do a normal 16 bit write

for read32, do a normal 32 bit read, then run 'long endian swap' on the register

for write32 run 'long endian swap' on the register, and then do a normal 32 bit write
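
As a rough C illustration of the four cases above, assuming a little-endian host: the array `st_ram` and every function name below are hypothetical, and the `host_*` helpers spell out what a "normal" little-endian access does so the example behaves the same on any machine.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t st_ram[16];   /* hypothetical emulated ST memory */

/* "Normal" little-endian host accesses, written out byte by byte. */
static inline uint16_t host_read16(uint32_t a)
{
    return (uint16_t)(st_ram[a] | (st_ram[a + 1] << 8));
}
static inline void host_write16(uint32_t a, uint16_t v)
{
    st_ram[a] = (uint8_t)v;
    st_ram[a + 1] = (uint8_t)(v >> 8);
}
static inline uint32_t host_read32(uint32_t a)
{
    return (uint32_t)host_read16(a) | ((uint32_t)host_read16(a + 2) << 16);
}
static inline void host_write32(uint32_t a, uint32_t v)
{
    host_write16(a, (uint16_t)v);
    host_write16(a + 2, (uint16_t)(v >> 16));
}

/* Plain byte swaps (any correct swap routine would do here). */
static inline uint16_t bswap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}
static inline uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0xFF00u) | ((v << 8) & 0xFF0000u) | (v << 24);
}

/* Swap after a normal read, swap before a normal write. */
uint16_t read16(uint32_t a)              { return bswap16(host_read16(a)); }
void     write16(uint32_t a, uint16_t v) { host_write16(a, bswap16(v)); }
uint32_t read32(uint32_t a)              { return bswap32(host_read32(a)); }
void     write32(uint32_t a, uint32_t v) { host_write32(a, bswap32(v)); }

/* Usage check: memory ends up in 68000 (big-endian) byte order. */
void check_swapped_access(void)
{
    write32(4, 0xDEADBEEFu);
    assert(st_ram[4] == 0xDE && st_ram[7] == 0xEF);
    assert(read16(6) == 0xBEEF);   /* word access inside the long still works */
}
```

With this approach the emulated memory keeps the 68000's byte order, so byte accesses and overlapping word/long accesses need no special handling.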



I haven't tested this :-) But it's getting late here in the UK, so I'll leave it at that for now.


Postby Dave H » Oct 21, 2002 @ 1:38am

>These are the best I can manage for endian swap routines. I can't help but feel there should be an even better solution, but I can't work out what it is, if there is one.

Well there is a better solution - the best optimisation, as always, is to not need to byteswap in the first place :)
Dave H.
Lead Programmer (Repton PPC/7650)


Postby Guest » Oct 21, 2002 @ 10:23am

The problems with not byte swapping, and just storing everything back to front, are:

a) Byte reads/writes on the 68000 are not word aligned, so if we just store it backwards and someone does:
move.w d0,(a0)
move.b 1(a0),d1
it's going to go wrong. This could be compensated for relatively easily in the byte access routine, but probably not in fewer than the 3 instructions it took to fix up the word access routine.

b) Memory access is only word aligned, not long aligned, so if someone does:
move.l d0,(a0)
move.l 2(a0),d1
it's going to go wrong. Again, it could be compensated for, but I'm not convinced it would be any quicker than the endian swap code I already posted.
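
To make the two failure cases concrete, here is a small C sketch that stores values in plain little-endian order with no swapping and then performs the follow-up access naively. The array `ram` and all function names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t ram[8];   /* hypothetical emulated memory */

/* Unswapped little-endian accesses ("storing everything back to front"). */
static inline void le_store16(uint32_t a, uint16_t v)
{
    ram[a] = (uint8_t)v;
    ram[a + 1] = (uint8_t)(v >> 8);
}
static inline void le_store32(uint32_t a, uint32_t v)
{
    le_store16(a, (uint16_t)v);
    le_store16(a + 2, (uint16_t)(v >> 16));
}
static inline uint32_t le_load32(uint32_t a)
{
    return (uint32_t)ram[a] | ((uint32_t)ram[a + 1] << 8) |
           ((uint32_t)ram[a + 2] << 16) | ((uint32_t)ram[a + 3] << 24);
}

/* Case (a): move.w d0,(a0) then move.b 1(a0),d1.
   A real 68000 returns the LOW byte of d0. */
uint8_t case_a_naive(uint16_t d0)
{
    le_store16(0, d0);
    return ram[1];               /* naive byte read at offset 1 */
}

/* Case (b): move.l d0,(a0) then move.l 2(a0),d1.
   A real 68000 returns the low word of d0 in the HIGH half of d1. */
uint32_t case_b_naive(uint32_t d0)
{
    le_store32(4, 0);            /* the following long is zero */
    le_store32(0, d0);
    return le_load32(2);         /* naive long read at offset 2 */
}

void check_naive_cases(void)
{
    assert(case_a_naive(0x1234) == 0x12);               /* 68000 gives 0x34 */
    assert(case_b_naive(0x11223344u) == 0x00001122u);   /* 68000 gives 0x33440000 */
}
```

In both cases the naive access returns the wrong half of the stored value, which is why some form of compensation is needed.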


Perhaps you could post some code showing how to compensate for these issues quickly, because if it's shorter than the endian swap code it sounds really great.

Thanks
Andrew


Postby schtruck » Oct 21, 2002 @ 12:08pm



Postby schtruck » Oct 21, 2002 @ 12:14pm



Postby Guest » Oct 21, 2002 @ 12:17pm



Postby refractor » Oct 21, 2002 @ 1:49pm



Postby Dave H » Oct 21, 2002 @ 2:24pm



Postby Guest » Oct 21, 2002 @ 2:43pm



Postby schtruck » Oct 21, 2002 @ 3:30pm



Postby AndrewGower » Oct 21, 2002 @ 9:19pm

Hi,

Here are the macros for the technique of storing words (but not long words) backwards.

#define ReadB(addr) (*(uint8*)((addr)^1))

#define WriteB(addr,value) (*(uint8*)((addr)^1)=(value))

#define ReadW(addr) (*(uint16*)(addr))

#define WriteW(addr,value) (*(uint16*)(addr)=(value))

#define ReadL(addr) ((*(uint32*)(addr)<<16)|(*(uint32*)(addr)>>16))

#define WriteL(addr,value) (*(uint32*)(addr)=((value)<<16)|((value)>>16))
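
The same scheme can be sketched with functions instead of macros, which makes it easier to test. Everything below is an assumption for illustration: the array `mem`, the function names, and the use of array offsets rather than host pointers. The long accesses are built from two 16-bit accesses, which gives the same result as a little-endian 32-bit access followed by a rotate by 16, and also avoids a 32-bit access at an address that is only 2-byte aligned (something that may not behave like a plain load on every ARM core).

```c
#include <assert.h>
#include <stdint.h>

static uint8_t mem[16];   /* hypothetical emulated memory */

/* Little-endian host 16-bit accesses, written out byte by byte. */
static inline uint16_t host16(uint32_t a)
{
    return (uint16_t)(mem[a] | (mem[a + 1] << 8));
}
static inline void set_host16(uint32_t a, uint16_t v)
{
    mem[a] = (uint8_t)v;
    mem[a + 1] = (uint8_t)(v >> 8);
}

/* Bytes: flip the low address bit, because each word is stored swapped. */
uint8_t  read_b(uint32_t a)              { return mem[a ^ 1u]; }
void     write_b(uint32_t a, uint8_t v)  { mem[a ^ 1u] = v; }

/* Words: plain host access, no swap needed. */
uint16_t read_w(uint32_t a)              { return host16(a); }
void     write_w(uint32_t a, uint16_t v) { set_host16(a, v); }

/* Longs: high word at a, low word at a + 2 (68000 word order). */
uint32_t read_l(uint32_t a)
{
    return ((uint32_t)host16(a) << 16) | host16(a + 2);
}
void write_l(uint32_t a, uint32_t v)
{
    set_host16(a, (uint16_t)(v >> 16));
    set_host16(a + 2, (uint16_t)v);
}

/* Usage check: mixed-size accesses agree with 68000 semantics. */
void check_word_swapped(void)
{
    write_l(0, 0x11223344u);
    assert(read_b(0) == 0x11 && read_b(3) == 0x44);
    assert(read_w(2) == 0x3344);
}
```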

Unfortunately, as far as I am aware, C++ doesn't have an operator for rotate (correct me if I'm wrong!), so I just hope the compiler is smart enough to spot that (value<<16)|(value>>16) is in fact just a simple ror #16.
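
A common way to write the rotate so that a compiler has a fair chance of recognising it is a small function on an unsigned 32-bit type. This is a sketch, with `rot16` as an illustrative name; whether a given compiler turns it into a single rotate is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two 16-bit halves of a 32-bit value, i.e. rotate by 16.
   The operand must be an unsigned 32-bit type: with a signed type the
   right shift may sign-extend, and with a wider type the bits shifted
   out of the top would not be discarded. */
static inline uint32_t rot16(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* Usage check: the halves trade places, and two rotates cancel out. */
void check_rot16(void)
{
    assert(rot16(0x11223344u) == 0x33441122u);
    assert(rot16(rot16(0x80000001u)) == 0x80000001u);
}
```

C++20 later added `std::rotl` and `std::rotr` in `<bit>`, but there is still no rotate operator in either language.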

Note that if you store the words backwards like this, it's going to mess up any other code that accesses the memory. For instance, the screen draw code I gave earlier will now end up drawing the columns back to front. This could be corrected easily enough.

I guess the best thing to do would be to try the code, see if it works and how much faster it is (or isn't), and then, if it gives a good speed-up, I'll rewrite the screen redraw code (again) to handle getting everything backwards.

If you could post what the compiler produces from the above defines, it would help to see how good a job it has managed to do.

Thanks
Andrew


Postby schtruck » Oct 21, 2002 @ 10:26pm



Postby schtruck » Oct 21, 2002 @ 10:35pm

If so, here is what the compiler generates:

; 205 : WriteB(address + membase, value);

eor r3, r1, #1
ldr r1, [pc, #8] ; pc+8+8 = 00000014
ldr r1, [r1]
strb r3, [r1, +r0]

; 211 : WriteW(address + membase, value);

ldr r2, [pc, #8] ; pc+8+8 = 00000010
ldr r2, [r2]
strh r1, [r2, +r0]

; 217 : WriteL(address + membase, value);

mov r3, r1, lsr #16
orr r3, r3, r1, lsl #16
ldr r1, [pc, #8] ; pc+8+8 = 00000018
ldr r1, [r1]
str r3, [r1, +r0]

; 222 : return ReadB(address + membase);

ldr r1, [pc, #0xC] ; pc+8+12 = 00000014
ldr r1, [r1]
ldrb r3, [r1, +r0]
eor r0, r3, #1

; 227 : return ReadW(address + membase);

ldr r1, [pc, #8] ; pc+8+8 = 00000010
ldr r1, [r1]
ldrh r0, [r1, +r0]


; 232 : return ReadL(address + membase);

ldr r1, [pc, #0x10] ; pc+8+16 = 00000018
ldr r1, [r1]
ldr r0, [r1, +r0]
mov r3, r0, lsr #16
orr r0, r3, r0, lsl #16


Postby AndrewGower » Oct 21, 2002 @ 10:41pm

ReadB and WriteB looked strange, but they were in fact right :-) You have to xor the address, not the value! It's to compensate for the fact that the bytes aren't stored in the correct place.

The compiler hasn't done an amazing job with ReadL and WriteL, but at least it's still better than what it was producing before, so it should still be a good speed-up.
Last edited by AndrewGower on Oct 21, 2002 @ 10:44pm, edited 1 time in total.


Postby schtruck » Oct 21, 2002 @ 10:42pm

Just to confirm what I think: if we use this method, we'll have to apply the Read and Write macros to the ROM (TOS) and to floppy disk sector reads, won't we?

But how? Just ReadB and WriteB, or do we need to apply ReadB/WriteB first and then apply ReadL and WriteL?
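
One possible reading of this, offered as a sketch rather than a confirmed answer from the thread: a big-endian image such as a ROM file or a disk sector only needs each pair of bytes exchanged once while it is copied into the word-swapped memory, which is the same as writing every source byte to its address with the low bit flipped. The function names below are hypothetical, and an even length is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy a big-endian image (e.g. a ROM file or disk sector) into memory
   that keeps each 16-bit word byte-swapped. 'len' is assumed to be even;
   a trailing odd byte would be ignored. Equivalent to dst[i ^ 1] = src[i]. */
void load_swapped(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2) {
        dst[i]     = src[i + 1];
        dst[i + 1] = src[i];
    }
}

/* Usage check with four made-up image bytes. */
void check_load_swapped(void)
{
    const uint8_t image[4] = { 0x60, 0x2E, 0x01, 0x02 };
    uint8_t ram_copy[4] = { 0 };

    load_swapped(ram_copy, image, sizeof image);
    assert(ram_copy[0] == 0x2E && ram_copy[1] == 0x60);
    assert(ram_copy[2] == 0x02 && ram_copy[3] == 0x01);
}
```

After such a copy, word reads need no swap and long reads only need the rotate by 16, so no second pass with the long macros should be required.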