
Writing Efficient C for ARM

PostPosted: Dec 26, 2001 @ 9:11pm
by Digby
I recently came across an article on ARM's website. While it strictly deals with optimization techniques using the ARM C compiler, some things do carry over to the Microsoft ARM C compiler provided with eVC. I haven't got around to testing the topics covered in the article and looking at the resulting assembly language output from the MS compiler. Perhaps one of you lads with tons of time on your hands can do this and share your findings? Anyway, it makes good bathroom reading if nothing else.

Re: Writing Efficient C for ARM

PostPosted: Dec 27, 2001 @ 4:40am
by Dan East

Re: Writing Efficient C for ARM

PostPosted: Dec 27, 2001 @ 5:16pm
by Digby
Man, you're quite the typist there Dan. Just a reminder that those things are only known to be valid for the ARM C compiler, not the ARM compiler provided with the Microsoft Embedded Visual Tools SDK (clarm.exe).

There's also a whitepaper up on the ARM site regarding fixed-point math. This is probably old hat for someone like yourself, but others might be interested. Here's the link:

Re: Writing Efficient C for ARM

PostPosted: Jan 3, 2002 @ 1:26pm
by Mole
Please, please don't rely on eVC to compile efficient code, because it won't! Apart from the natural speed of the ARM, it has some very good features that the compiler just doesn't use :( (see the assembler thread that I started). All instructions can be conditional (not just branches), depending on the flags set by a previous instruction, e.g.

a=a-b;
if (a<0)
{c=0;}
else
{c=100;}

can be represented in assembler, with

r0=a;
r1=b;
r2=c;

as

subs r0,r0,r1
movmi r2,#0
movpl r2,#100

3 instructions!
eVC will push/pull all the vars on/off the stack, then use a CMP instruction for the compare!

Dan,
send me a complex bottleneck function from your Q2 source code and I will show you the difference hand compilation can make (I bet 3-5 times quicker). But forget pushing your C code about to get better performance, because eVC just doesn't compile efficient ARM code.
Last modification: Mole - 01/03/02 at 10:26:49
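
For anyone who wants to check this against their own compiler output, here is a minimal sketch of the C side of that snippet as a standalone function. The name select_by_sign is made up for illustration; nothing here is specific to eVC. A compiler that uses conditional execution could implement the body with one flag-setting subtract followed by two conditional moves. Whether a given compiler actually does so is something to confirm from its assembly listing, not something this sketch guarantees.

```c
/* Hypothetical helper wrapping the snippet from the post:
   subtract b from *a in place, then pick 0 or 100 from the sign. */
static int select_by_sign(int *a, int b)
{
    *a -= b;                    /* a = a - b; ideally one flag-setting subtract */
    return (*a < 0) ? 0 : 100;  /* candidates for two conditional moves */
}
```

Calling it with a = 5 and b = 7 leaves a at -2 and returns 0; with a = 9 and b = 4 it leaves a at 5 and returns 100.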

Re: Writing Efficient C for ARM

PostPosted: Jan 3, 2002 @ 2:42pm
by Phantom
EVC indeed messes up in a major fashion, but sadly, things that I typically used to turn into asm on the x86 are compiled just fine. I have this huge matrix / vector multiplication that compiles to 'optimal' code. I agree that the case you mentioned can be done much faster, but that's also code that I would rarely turn into hand-optimized asm. :)

One other thing that I noticed: When I compile with maximum optimizations, the resulting .asm file doesn't compile. It only works without optimizations. That could also explain the bad compiles you mentioned.

Re: Writing Efficient C for ARM

PostPosted: Jan 4, 2002 @ 12:14am
by Digby

Re: Writing Efficient C for ARM

PostPosted: Jan 4, 2002 @ 6:26am
by BadBazza

Re: Writing Efficient C for ARM

PostPosted: Jan 4, 2002 @ 10:02am
by Dan East
See the thread

Dan East

Re: Writing Efficient C for ARM

PostPosted: Jan 4, 2002 @ 10:06am
by BadBazza
Thanks again Dan,

I did look at that thread but thought I would need a basic understanding before progressing to these documents.

Cheers
Bad

PostPosted: Dec 2, 2004 @ 3:48pm
by Dan East

PostPosted: Dec 2, 2004 @ 5:47pm
by Structure
Sticky !!

PostPosted: Dec 3, 2004 @ 11:06am
by Crayfish

PostPosted: Dec 8, 2004 @ 4:18am
by bitbank
Allow me to add a few choice words to this discussion...

I've found that no matter how good the compiler is (and the eVC++ one is not great), the C language does not allow you to specify things which will use all of the capabilities of the target CPU. Depending on the application, good old hand-written assembly language can usually get you 2-3X the performance of C code. It certainly is a good idea to create clean C code for the compiler and I've got two good suggestions for speeding things up:

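1) The ARM CPU does not do well with global (static) variables because it can only use register-indirect addressing: the address of each global has to be loaded into a register before the variable itself can be read or written. Keep statics in a structure, with a single pointer variable pointing to the structure.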

2) Looping can be done a lot quicker if a loop variable is avoided. e.g.

Slow way:
s = <source pointer>;
d = <destination pointer>;
for (i = 0; i < count; i++)
{
    *d++ = *s++;
}

Fast way:
s = <source pointer>;
d = <destination pointer>;
pEnd = &d[count];

while (d < pEnd)
{
    *d++ = *s++;
}
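
Both suggestions above can be sketched together as compilable C. The names RenderState, g_state, pixel_count and copy_words are invented for this example, and the field values are arbitrary. Treat it as an illustration of the shape of the code, not a measured result for any particular compiler.

```c
#include <stddef.h>

/* Hypothetical module state: what might otherwise be separate globals
   is grouped so that one base pointer reaches every field. */
typedef struct {
    int width;
    int height;
    unsigned frames_drawn;
} RenderState;

static RenderState g_state = { 240, 320, 0 };

/* Suggestion 1: take the state by pointer instead of naming globals,
   so fields are reached as small offsets from one register. */
static int pixel_count(const RenderState *st)
{
    return st->width * st->height;
}

/* Suggestion 2: loop against an end pointer instead of a counter. */
static void copy_words(int *d, const int *s, size_t count)
{
    int *end = d + count;
    while (d < end) {
        *d++ = *s++;
    }
}
```

A caller would pass &g_state (or any other RenderState) to pixel_count, and call copy_words(dst, src, n) with two non-overlapping buffers of at least n ints.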

Enjoy,
Larry B.

PostPosted: Jul 11, 2005 @ 8:34pm
by fdave

PostPosted: Oct 6, 2005 @ 1:03am
by frasse