PocketMatrix

Posted: **Dec 14, 2004 @ 7:55pm**

hi stephC,

thanks for that hint.
the strange thing is that using the _Preload function actually makes the code slower.
i changed my previous code to only read from memory instead of adding, and the time went down to 2.04ms. the PLD version, on the other hand, takes 3.71ms.

the problem seems to be that in the LDR case the compiler creates the following:

ldr r0, [r2]
ldr lr, [r2, #0x20]
ldr r11, [r2, #0x40]
ldr r10, [r2, #0x60]

while in the PLD case it updates the address register all the time, which prevents that register from being used by a PLD in the next clock cycle.

any ideas on this? i'd still be interested to see how fast an optimal PLD solution would be. maybe somebody could create a 'precompiled' assembler function that can preload x bytes using PLD (i'm not an assembler programmer). i'd love to check how fast that would be...
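for reference, here is a rough sketch in C of the kind of "preload x bytes" helper asked about above. it is not a precompiled assembler routine and has not been timed. it assumes a GCC-compatible compiler (`__builtin_prefetch` is a GCC/Clang builtin, not the `_Preload` intrinsic used in this thread) and a 32-byte cache line, taken from the 0x20 stride in the LDR listing. the names `preload_range` and `CACHE_LINE` are made up for illustration. on ARM targets that support it the builtin should turn into PLD, but whether the compiler keeps one fixed base register with growing offsets is up to the compiler.

```c
#include <stddef.h>

/* Assumption: 32-byte cache lines, matching the 0x20 stride in the LDR listing. */
#define CACHE_LINE 32

/* Hypothetical helper: issue one prefetch hint per cache line for the
   `bytes` bytes starting at `p`. Returns the number of hints issued.
   The base pointer is never modified; only the offset grows. */
static size_t preload_range(const void *p, size_t bytes)
{
    const char *base = (const char *)p;
    size_t issued = 0;

    for (size_t off = 0; off < bytes; off += CACHE_LINE) {
        /* 0 = prefetch for reading, 3 = keep in all cache levels */
        __builtin_prefetch(base + off, 0, 3);
        issued++;
    }
    return issued;
}
```

calling `preload_range(src, 128)` issues four hints, at offsets 0x00, 0x20, 0x40 and 0x60, which are the same addresses the four LDRs touch. if `p` is not aligned to a cache line, the last partially covered line may be missed.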

besides that i'm happy to see that this code now got more than 3x faster than the original implementation that looked already fully optimized...

Daniel

Scaling an Image to Half-Res