Non-scanline rasterization of triangles

Posted: Jan 30, 2006 @ 1:18am
by hm
I am currently revisiting the triangle rasterization code in Vincent. The current implementation is more or less based on the ideas in Chris Hecker's article. The problem with this approach is that triangle setup is quite bulky and expensive. In addition, more advanced features like anti-aliasing are *very* difficult to incorporate.

That said, I want to experiment with half-space based rasterization, along the lines of Nicolas Capens' article.
I believe that many if not most hardware implementations are actually based on this approach nowadays.
Has anyone here actually tried this type of rasterizer in a software implementation for textured polygons on a PPC, and if so, what was your experience?
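
For readers unfamiliar with the approach, here is a minimal C++ sketch of the half-space idea: a pixel is inside the triangle when it lies on the inner side of all three edges. This is not the code from the article mentioned above. The names `Point`, `edge` and `rasterizeHalfSpace` are made up for illustration, there is no top-left fill rule (pixels exactly on an edge are counted as covered), and the edge functions are re-evaluated per pixel instead of being stepped incrementally or tested per block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

struct Point { int x, y; };  // hypothetical integer pixel coordinate

// Edge function: positive on one side of the line a->b, negative on the
// other side, zero exactly on the line.
inline std::int64_t edge(Point a, Point b, Point p) {
    return std::int64_t(b.x - a.x) * (p.y - a.y)
         - std::int64_t(b.y - a.y) * (p.x - a.x);
}

// Calls plot(x, y) for every pixel inside or on the triangle and returns
// the number of pixels visited. No screen clipping is done here.
template <typename Plot>
int rasterizeHalfSpace(Point v0, Point v1, Point v2, Plot plot) {
    if (edge(v0, v1, v2) == 0) return 0;          // degenerate triangle
    if (edge(v0, v1, v2) < 0) std::swap(v1, v2);  // normalize the winding
    assert(edge(v0, v1, v2) > 0);

    // Only pixels inside the bounding box can be covered.
    const int minX = std::min({v0.x, v1.x, v2.x});
    const int maxX = std::max({v0.x, v1.x, v2.x});
    const int minY = std::min({v0.y, v1.y, v2.y});
    const int maxY = std::max({v0.y, v1.y, v2.y});

    int covered = 0;
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const Point p{x, y};
            // Inside means: on the non-negative side of all three edges.
            if (edge(v0, v1, p) >= 0 && edge(v1, v2, p) >= 0 &&
                edge(v2, v0, p) >= 0) {
                plot(x, y);
                ++covered;
            }
        }
    }
    return covered;
}
```

Because each edge function is linear in x and y, a practical version would add a constant per step instead of recomputing the products, and would usually test whole blocks of pixels at once.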

Also, has anyone experimented with texture swizzling in such a rasterizer to improve cache behavior?
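
To make the swizzling question concrete, one common layout is Morton (Z-order) addressing, where the bits of the two texel coordinates are interleaved so that texels close together in 2D are also close together in memory. The sketch below assumes a square power-of-two texture with 16-bit texels. The thread does not say which swizzle pattern is meant, so Morton order is an assumption, and `part1By1`, `mortonIndex` and `swizzleTexture` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Spreads the low 16 bits of v so that a zero bit sits between each pair
// of neighbouring bits (abcd -> 0a0b0c0d).
inline std::uint32_t part1By1(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Morton index of texel (u, v): u bits in even positions, v bits in odd.
inline std::uint32_t mortonIndex(std::uint32_t u, std::uint32_t v) {
    return part1By1(u) | (part1By1(v) << 1);
}

// Converts a row-major size x size texture into Morton order.
// Assumption: size is a power of two no larger than 65536.
std::vector<std::uint16_t> swizzleTexture(
        const std::vector<std::uint16_t>& linear, std::uint32_t size) {
    assert(size > 0 && size <= 65536u && (size & (size - 1)) == 0);
    assert(linear.size() == std::size_t(size) * size);
    std::vector<std::uint16_t> out(linear.size());
    for (std::uint32_t v = 0; v < size; ++v)
        for (std::uint32_t u = 0; u < size; ++u)
            out[mortonIndex(u, v)] = linear[std::size_t(v) * size + u];
    return out;
}
```

The texture would be swizzled once at load time. The rasterizer then has to compute `mortonIndex(u, v)` per texel fetch (or step the interleaved coordinates directly), so whether this is a net win on a given device would have to be measured.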

Thanks,
HM

Re: Non-scanline rasterization of triangles

Posted: Jan 30, 2006 @ 4:13am
by torus

Posted: Jan 30, 2006 @ 5:04am
by Dan East

Posted: Jan 30, 2006 @ 10:21am
by hm

Re: Non-scanline rasterization of triangles

Posted: Jan 30, 2006 @ 10:22am
by hm

Re: Non-scanline rasterization of triangles

Posted: Jan 30, 2006 @ 11:55am
by torus

Posted: Jan 30, 2006 @ 4:28pm
by Dan East
I plugged the half-space rasterizer into my engine (the-engine-formerly-known-as-Varium), and the results are dismal. This is rendering 340 polys with a 64x64 texture (when used), full screen at 240x320 (every pixel is rendered to).

[edit]These half space benchmarks are flawed. See my posts below[/edit]
15 FPS: Exact half-space C++ implementation as provided (flat shading, q=8)
10 FPS: Half-space with texture mapping, q=8
07 FPS: Half-space with texture mapping, q=16
09 FPS: Half-space with texture mapping, q=4
51 FPS: My own standard span-based rasterizer, no perspective correction
40 FPS: My own rasterizer with Gouraud shading but no perspective correction
32 FPS: My own rasterizer with perspective correction and Gouraud shading

The half-space test I performed did not include occlusion testing (span-based or z-buffer), a skybox (which would be completely occluded), or Gouraud shading. The tests with my own rasterizers used my C++ implementation (which could use more optimization), and included occlusion testing as well as a hidden skybox. To make things even worse, that is using my n-gon rasterizer, which is 10-20% slower than the tri rasterizer I used to use. I benchmarked rendering only tris, so my own routines would have performed even better if I were rendering polys with more than 3 sides (which is typical of content generated from MAP files).

The ASM version of my rasterizers achieves another 50% or so over the C++ implementation.

I just don't see how I could pick up 500% performance over his C++ code just by implementing it in ASM. The inner loops are already very tight, so I don't think the performance would even double.

It would be interesting to know exactly why his implementation performs so much worse, considering his code appears to execute substantially fewer instructions than mine. I also have 2-3 function calls per span. All I can figure is that there is some major stalling going on due to the cache.

Dan East

Posted: Jan 30, 2006 @ 4:37pm
by Dan East
Actually, I just realized what is causing part of the disparity. The half-space test does not reject polys that lie entirely outside the display, which my own routines do.
I'll post another set of benchmarks shortly.

Dan East

Posted: Jan 30, 2006 @ 4:51pm
by Dan East
Okay, after adding some quick tests to see if the poly is visible at all, as well as to drop out of the loop when the poly is outside the bottom or right side of the display, the flat-shading FPS jumped to 47, and with texture mapping it is 33.

An additional optimization would be to start directly at the first visible row or column of the output buffer for polys that extend off the top or left sides of the buffer.
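
Both ideas (rejecting invisible polys up front, and not iterating over off-screen rows or columns) can be expressed as clamping the triangle's bounding box to the display before the pixel loop runs. The C++ sketch below is an illustration under assumptions, not the code that was benchmarked; `Rect` and `clipBounds` are made-up names.

```cpp
#include <algorithm>

struct Rect { int minX, minY, maxX, maxY; };  // inclusive pixel bounds

// Clamps the bounding box of a triangle to a width x height display.
// Returns false if nothing of the bounding box is on screen, in which
// case the triangle can be skipped without entering the pixel loop.
bool clipBounds(int x0, int y0, int x1, int y1, int x2, int y2,
                int width, int height, Rect& out) {
    // Clamping the minimum to 0 handles polys that hang off the top or
    // left: the loop starts at the first visible column and row.
    out.minX = std::max(std::min({x0, x1, x2}), 0);
    out.minY = std::max(std::min({y0, y1, y2}), 0);
    // Clamping the maximum ends the loop at the right and bottom borders.
    out.maxX = std::min(std::max({x0, x1, x2}), width - 1);
    out.maxY = std::min(std::max({y0, y1, y2}), height - 1);
    return out.minX <= out.maxX && out.minY <= out.maxY;
}
```

If the edge functions are stepped incrementally, their start values need to be evaluated at the clamped `(minX, minY)` corner rather than at the original bounding-box corner. Note that the bounding-box test is conservative: a thin diagonal triangle can pass it and still cover no visible pixel.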

So with texture mapping it is about 35% slower than my non-perspective-corrected routine (33 vs. 51 FPS), and the half-space routine is still not doing any occlusion testing.

Dan East

Posted: Jan 30, 2006 @ 5:06pm
by Dan East
As I suspected, the difference in performance between the two routines shrinks as the poly count goes up. With my routines, the poly setup and span routine overhead start adding up.

1200 tris
17 FPS: half space
16 FPS: my routine

That's without Gouraud shading, but again all my spans are being checked for occlusion, whereas the half-space routine is just pure rendering. Also, a standard tri rasterizer, as opposed to my n-gon routine, should still outperform the half-space routine at that poly count.

Dan East

Posted: Jan 30, 2006 @ 5:13pm
by hm

Posted: Mar 19, 2006 @ 8:33pm
by hm

Posted: Mar 22, 2006 @ 1:56am
by Dan East

Posted: Mar 22, 2006 @ 7:06am
by hm

Posted: May 12, 2006 @ 5:14am
by StephC