PocketMatrix

Posted: **Jun 7, 2007 @ 4:15pm**

Hi, Daniel.

I still can't understand, though, why rotating in small blocks would give a performance gain.

A 2-dimensional array is stored as a linear chain of WORDs. Rotating in 40x40 blocks, for example, means that I need to copy 40 elements 40 times, each time shifting the source pointer by (1 line - 40 elements).

In my loop I copy one element after another, without jumping.
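For reference, a single-pass loop of this kind might look like the sketch below. This is an assumption about the loop being described, not the poster's actual code; the name `rotate90_naive`, the clockwise direction, and the use of `uint16_t` for a WORD pixel are all chosen for illustration. Note that the source is read without jumping, but every write lands one full destination line further on.

```c
#include <stdint.h>
#include <stddef.h>

/* Rotate a w x h image of 16-bit pixels 90 degrees clockwise.
   dst must hold w * h pixels and is laid out as h pixels per line, w lines. */
void rotate90_naive(const uint16_t *src, uint16_t *dst, size_t w, size_t h)
{
    for (size_t y = 0; y < h; ++y) {
        /* Source row y ends up in destination column (h - 1 - y). */
        uint16_t *d = dst + (h - 1 - y);
        for (size_t x = 0; x < w; ++x) {
            *d = *src++;  /* sequential read from the source */
            d += h;       /* strided write: jump one destination line */
        }
    }
}
```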

I assume you've seen the performance gain in action, so maybe I'm wrong somewhere in my reasoning.

If it's not too much trouble, could you explain it in more detail, or do you have links to places where it is explained?

Nomad.

Posted: **Jun 7, 2007 @ 4:31pm**

The reason why working with small blocks is faster is simple: it is cache friendly!

Cache misses are one of the major performance problems on mobile phones, because main memory is very slow. In my experience the performance drop is a lot heavier than on a PC.

So how does using small blocks help?

While you walk through the source image in 1-pixel steps, you move through the target image in 1-line steps (image width * 2 bytes). The first column of target pixels will always result in cache misses. With a small block, the 2nd target column will result in hits, because the cache lines that were loaded for the first target column also hold the pixels for the 2nd target column. With a large block (the full image), though, you have already written to so many areas of memory that the cache lines that held the first pixels of the first column are by then caching other parts of memory (the pixels at the bottom of the image). So for the large image you will have cache misses on almost every target column.
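A minimal sketch of the block-wise approach, assuming 16-bit pixels and a clockwise rotation. The function name `rotate90_blocked` and the tile edge `BLOCK` are hypothetical, and 32 is only a starting guess that would need measuring on the actual device. Inside one tile the writes touch at most `BLOCK` destination lines, so the cache lines loaded for the first column can still be there when the neighbouring columns are written.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK 32  /* assumed tile edge in pixels; tune per device */

/* Rotate a w x h image of 16-bit pixels 90 degrees clockwise, tile by tile.
   dst must hold w * h pixels and is laid out as h pixels per line, w lines. */
void rotate90_blocked(const uint16_t *src, uint16_t *dst, size_t w, size_t h)
{
    for (size_t by = 0; by < h; by += BLOCK) {
        size_t y_end = (by + BLOCK < h) ? by + BLOCK : h;  /* clip last tile row */
        for (size_t bx = 0; bx < w; bx += BLOCK) {
            size_t x_end = (bx + BLOCK < w) ? bx + BLOCK : w;  /* clip last tile column */
            for (size_t y = by; y < y_end; ++y) {
                const uint16_t *s = src + y * w + bx;
                /* Source pixel (x, y) goes to destination line x, column (h - 1 - y). */
                uint16_t *d = dst + bx * h + (h - 1 - y);
                for (size_t x = bx; x < x_end; ++x) {
                    *d = *s++;  /* short sequential read within the tile */
                    d += h;     /* strided write, but only across the tile's lines */
                }
            }
        }
    }
}
```

The result is the same as with a plain single-pass loop; only the order in which pixels are visited changes.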

While this explanation might not be entirely technically correct (somebody please correct me if I'm wrong), it should give you an idea of what is going on. The general rule is: data locality is king.

In an extreme form you could store the complete image in small blocks, which is what modern graphics cards are doing.
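To illustrate what storing an image in small blocks means, here is a sketch of the address calculation for a simple tiled layout. It is not the format of any particular graphics card; the name `tiled_offset`, the tile edge `TILE`, and the requirement that the image width be a multiple of `TILE` are assumptions made to keep the example short.

```c
#include <stddef.h>

#define TILE 8  /* assumed tile edge in pixels */

/* Offset, in pixels, of pixel (x, y) in an image stored tile by tile:
   tiles are laid out left to right, top to bottom, and the pixels
   inside each tile are stored row by row.
   Assumes width is a multiple of TILE. */
size_t tiled_offset(size_t x, size_t y, size_t width)
{
    size_t tiles_per_row = width / TILE;
    size_t tile_index = (y / TILE) * tiles_per_row + (x / TILE);
    size_t in_tile = (y % TILE) * TILE + (x % TILE);
    return tile_index * (TILE * TILE) + in_tile;
}
```

With this layout, pixels that are neighbours vertically inside a tile are only `TILE` pixels apart in memory instead of a full image line.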

Here's a good book on this topic:

Although the book is about PC programming, the general ideas also hold for mobile devices.

BTW: If you optimize your code well, you can even do filtered (sub-pixel accurate) rotations at any angle in real time. We recently did a project where we needed to rotate and zoom a 2D map in real time. Without filtering it looks ugly, but with some nice tricks filtering can be done almost as fast as without...

bye,
Daniel

Posted: **Jun 8, 2007 @ 2:05pm**

Posted: **Jun 8, 2007 @ 3:17pm**


Rotating image in memory by 90 degrees
