VoxelWar Discussion thread

**Warp** » Mon Oct 12, 2015 7:28 am

LeCom wrote:
Oh good, then I'm not the only one seeing faults in the development of IB (not saying that it's crap though). GM and his team just have the wrong priorities. They seem to emphasize miscellaneous or unimportant things, while IB itself has nothing to offer as a game. It's more like a game engine, but very complex and hard to mod.

I think that's right, well put. Being complex and hard to mod is definitely an issue when someone doesn't have much experience with these things.

**jadedctl** » Sun Nov 08, 2015 8:30 pm

Anyone happen to have a copy of VoxelWar's source?
It looks like the forums and site are down....

**longbyte1** » Tue Nov 10, 2015 4:05 am

Uhh, did he make his own raycasting engine from scratch?

**LeCom** » Tue Nov 10, 2015 2:51 pm

longbyte1 wrote:
Uhh, did he make his own raycasting engine from scratch?

If you mean me, I wrote both from scratch, yes, how else.
PS: The SVO 4DOF raycaster is kind of finished and is actually shitty on CPUs, something like 10 FPS for 320x240 on a 2.4 GHz CPU with DDR3 RAM (single thread only and infinite visibility tho'), and there are still a few rendering glitches. But it shows that it's possible and that it's not far away from our 60 FPS Full HD target (and consider how fast GPUs are in comparison).
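
For scale, here is a rough comparison of raw pixel throughput (pixels per frame times frames per second) between the Full HD target and the figure above. It assumes the cost per pixel stays the same at the higher resolution, which is a simplification:

$$
\frac{1920 \times 1080 \times 60}{320 \times 240 \times 10} = \frac{124{,}416{,}000}{768{,}000} = 162
$$

Under that assumption the target needs about 162 times the current single-thread throughput, which is the gap that a better algorithm, more threads, SIMD or a GPU port would have to close.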

**Marisa Kirisame** » Tue Nov 10, 2015 10:38 pm

LeCom wrote:
longbyte1 wrote:
Uhh, did he make his own raycasting engine from scratch?
If you mean me, I wrote both from scratch, yes, how else.
PS: The SVO 4DOF raycaster is kind of finished and is actually shitty on CPUs, something like 10 FPS for 320x240 on a 2.4 GHz CPU with DDR3 RAM (single thread only and infinite visibility tho'), and there are still a few rendering glitches. But it shows that it's possible and that it's not far away from our 60 FPS Full HD target (and consider how fast GPUs are in comparison).

Port it to GLSL for great justice performance

Also, use beamcasting starting with e.g. 16x16 regions, then recursing down to suit

Note I haven't done the latter yet but if you combine the two you'd probably use a mipmap pyramid for your texture and do several FBO stages

**LeCom** » Wed Nov 11, 2015 4:35 pm

Tbh the only way I ever programmed a GPU was via SDL2's rendering API because I was always a voxel and raycasting/tracing fanboy. But fine, I think my hardware supports PND3D's GLSL mode, so why not try it.
At the moment I AM using a mipmap pyramid. Couldn't figure out anything else that would be good in a game with changing and possibly complex terrain.
I had confirmed with some tests that the main reason for the slowness was the fact that the CPU has to cycle through every single pixel. That's why I'm figuring out how to implement an algorithm that casts only a single ray for the upper mip levels, and then splits it into several new rays when going down a level.

I see, people are still asking for source. Dunno if I can even get the external drive containing it, but may I ask for what purpose at least? Note that it's pretty much what used to be called spaghetti code back then, aka weird and mostly makeshift stuff, aka it's better to rewrite it than to continue working with it (also the problem with Voxlap).

**Marisa Kirisame** » Wed Nov 11, 2015 10:22 pm

If you want to improve performance, throw more cores and SIMD lanes at it ;) OpenMP makes the "more cores" thing easy. Just make sure you do something like this:

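int y;
#pragma omp parallel for
for(y = 0; y < height; y++)
{
    /* x and dest are declared inside the loop body, so every thread gets its own copy */
    int x;
    /* pitch is in bytes, so do the row offset on a byte pointer before casting */
    Uint32 *dest = (Uint32 *)((Uint8 *)screen->pixels + screen->pitch*y);

    for(x = 0; x < width; x++, dest++)
    {
        /* trace the ray for pixel (x, y) and store the colour in *dest */
    }
}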

and not this:

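int x, y;                                 /* x is declared outside the loop: shared by all threads */
Uint32 *dest = (Uint32 *)screen->pixels;  /* dest is shared as well */
#pragma omp parallel for
for(y = 0; y < height; y++, dest += screen->pitch/4 - width)
{
    for(x = 0; x < width; x++, dest++)
    {
        /* every thread bumps the same x and dest here */
    }
}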

as the threads will share the same x and dest (both are declared outside the parallel loop), and the output will dissolve into a scattered mess.

The SIMD approach works better if you do something like this:

for(x = 0; x < width; x += 4)
{
    /* one ray per SSE lane: each register holds one component of four rays */
    __m128 posx, posy, posz;
    __m128 velx, vely, velz;
    __m128 time;

    ...

    /* advance all four rays at once */
    posx = _mm_add_ps(posx, _mm_mul_ps(time, velx));
    posy = _mm_add_ps(posy, _mm_mul_ps(time, vely));
    posz = _mm_add_ps(posz, _mm_mul_ps(time, velz));

    ...
}

Rather than:

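for(x = 0; x < width; x++)
{
    /* one ray per iteration: x, y, z of a single ray packed into one register */
    __m128 pos;
    __m128 vel;
    float time;

    ...

    /* advances only one ray per instruction */
    pos = _mm_add_ps(pos, _mm_mul_ps(_mm_set1_ps(time), vel));

    ...
}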

Oh, and it's best to code that sort of stuff in C, as even I am not better at coding assembly than a C compiler.

LeCom wrote:
That's why I'm figuring out how to implement an algorithm that casts only a single ray for the upper mip levels, and then splits it into several new rays when going down a level.

Yep, that's beamtracing.
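
A minimal sketch of that idea, under heavy simplifying assumptions: rays are orthographic and run along +z (one pixel per voxel column), the world is a small occupancy grid, and the mip pyramid marks a coarse cell as solid if any child is solid. All names (`mip`, `beam`, `beam_demo`, the sizes) are made up for illustration and are not from VoxelWar or Voxlap. A perspective tracer would need a conservative footprint test per beam instead of a straight column lookup.

```c
#include <assert.h>
#include <string.h>

#define N      32   /* hypothetical world size: N x N x N voxels */
#define LEVELS 5    /* mip levels 0..4, cell edge 1, 2, 4, 8, 16 voxels */

/* Occupancy pyramid, indexed [level][z][y][x]; level l uses only (N >> l)^3 cells. */
static unsigned char mip[LEVELS][N][N][N];
static int  depth_beam[N][N];  /* first solid z per pixel, N means "miss" */
static long beam_tests;        /* occupancy lookups made by the beam tracer */

/* A coarse cell is solid if any of its eight children is solid. */
static void build_pyramid(void)
{
    for (int l = 1; l < LEVELS; l++) {
        int n = N >> l;
        for (int z = 0; z < n; z++)
            for (int y = 0; y < n; y++)
                for (int x = 0; x < n; x++) {
                    unsigned char any = 0;
                    for (int i = 0; i < 8; i++)
                        any |= mip[l - 1][2 * z + (i >> 2)]
                                         [2 * y + ((i >> 1) & 1)]
                                         [2 * x + (i & 1)];
                    mip[l][z][y][x] = any;
                }
    }
}

/* Trace one beam covering screen tile (cx, cy) of level l, starting at voxel depth z0. */
static void beam(int l, int cx, int cy, int z0)
{
    assert(l >= 0 && l < LEVELS);
    int n  = N >> l;
    int cz = z0 >> l;

    while (cz < n) {                       /* march through coarse cells */
        beam_tests++;
        if (mip[l][cz][cy][cx]) break;
        cz++;
    }
    if (cz == n) {                         /* empty tile: every pixel in it misses */
        for (int y = cy << l; y < (cy + 1) << l; y++)
            for (int x = cx << l; x < (cx + 1) << l; x++)
                depth_beam[y][x] = N;
        return;
    }
    if (l == 0) { depth_beam[cy][cx] = cz; return; }

    /* Split into four child beams that start where the parent first hit something. */
    for (int i = 0; i < 4; i++)
        beam(l - 1, 2 * cx + (i & 1), 2 * cy + (i >> 1), cz << l);
}

/* Reference: march every pixel through the full-resolution grid. */
static long brute(int out[N][N])
{
    long tests = 0;
    for (int y = 0; y < N; y++)
        for (int x = 0; x < N; x++) {
            int z = 0;
            while (z < N) { tests++; if (mip[0][z][y][x]) break; z++; }
            out[y][x] = z;
        }
    return tests;
}

/* Builds a toy scene and traces it both ways; returns 1 if the depth maps match. */
static int beam_demo(long *brute_tests)
{
    static int ref[N][N];
    memset(mip, 0, sizeof mip);
    for (int y = 0; y < N; y++)
        for (int x = 0; x < N; x++)
            if (!(x >= 16 && y >= 16))
                mip[0][24][y][x] = 1;      /* back wall with one open corner */
    mip[0][5][3][7] = 1;                   /* a lone voxel in front of the wall */
    build_pyramid();

    beam_tests = 0;
    for (int cy = 0; cy < N >> (LEVELS - 1); cy++)
        for (int cx = 0; cx < N >> (LEVELS - 1); cx++)
            beam(LEVELS - 1, cx, cy, 0);   /* one beam per 16x16 tile */

    *brute_tests = brute(ref);
    return memcmp(ref, depth_beam, sizeof ref) == 0;
}
```

Calling `beam_demo(&n)` fills `depth_beam` and leaves the lookup counts in `beam_tests` and `n` for comparison. In this toy scene the beams need far fewer lookups than the per-pixel march, because large empty regions are skipped with a handful of coarse tests.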

**LeCom** » Thu Nov 12, 2015 6:16 am

I'm not relying that much on multithreading. Especially since the speed up per core is only around 70% of a core's power. Then most hardware usually only has 4-8, and changing to beamtracing would make it hard to parallelise (btw any idea why google returns basically nothing usable for beamtracing?). Dunno about SIMD yet, my usage of MMX in the voxlap hack was pretty much of a fail. Btw I don't really get your recommendation, you want me to use 128 bit registers for stuff I can do with floats, instead of doing the actual SIMD thing?
Also, yes, C rules when it comes to these things. I was considering D for some time because of its high-level stuff (compiled+high-level=speed?), but I doubt it now.

**Marisa Kirisame** » Fri Nov 13, 2015 2:02 am

LeCom wrote:
I'm not relying that much on multithreading. Especially since the speed up per core is only around 70% of a core's power. Then most hardware usually only has 4-8,

For raytracing I get very close to double speed for two actual cores. For hyperthreading I still see an improvement, although not as much. It still helps.

LeCom wrote:
and changing to beamtracing would make it hard to parallelise (btw any idea why google returns basically nothing usable for beamtracing?).

Just group the work into, say, 32x32 regions. It will be serial within the regions, but the regions will be traceable in parallel.
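
A rough sketch of that tiling, assuming a 320x240 framebuffer and OpenMP. `trace_tile` is a hypothetical stand-in for whatever serial (beam) tracer runs inside a region, and the buffer layout and names are made up for the example:

```c
#include <assert.h>
#include <stdint.h>

#define WIDTH  320
#define HEIGHT 240
#define TILE   32   /* region edge in pixels */

static uint32_t framebuffer[HEIGHT][WIDTH];

/* Hypothetical serial tracer for one region; writes a non-zero placeholder per pixel. */
static void trace_tile(int x0, int y0, int x1, int y1)
{
    for (int y = y0; y < y1; y++)
        for (int x = x0; x < x1; x++)
            framebuffer[y][x] = (uint32_t)(y * WIDTH + x + 1);
}

/* Regions are independent, so the loop over regions can run in parallel. */
static void render_tiled(void)
{
    int tiles_x = (WIDTH  + TILE - 1) / TILE;
    int tiles_y = (HEIGHT + TILE - 1) / TILE;
    int t;

    /* schedule(dynamic) hands out tiles one at a time, so threads that
       finish cheap tiles early can pick up the remaining ones. */
    #pragma omp parallel for schedule(dynamic)
    for (t = 0; t < tiles_x * tiles_y; t++) {
        int x0 = (t % tiles_x) * TILE;
        int y0 = (t / tiles_x) * TILE;
        int x1 = x0 + TILE < WIDTH  ? x0 + TILE : WIDTH;   /* clamp edge tiles */
        int y1 = y0 + TILE < HEIGHT ? y0 + TILE : HEIGHT;
        trace_tile(x0, y0, x1, y1);
    }
}

/* Counts pixels no tile has written (0 is never used as a placeholder value). */
static int count_unwritten(void)
{
    int n = 0;
    for (int y = 0; y < HEIGHT; y++)
        for (int x = 0; x < WIDTH; x++)
            n += framebuffer[y][x] == 0;
    return n;
}
```

Built without OpenMP support the pragma is ignored and the tiles simply run one after another, so the same code doubles as the single-threaded version.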

LeCom wrote:
Dunno about SIMD yet, my usage of MMX in the voxlap hack was pretty much of a fail. Btw I don't really get your recommendation, you want me to use 128 bit registers for stuff I can do with floats, instead of doing the actual SIMD thing?

No, I'm saying use the 128-bit registers to operate on 4 rays at once, rather than using them as a 4-float vector.

Also if you arrange the 4 rays into 2x2 blocks it'll be easier to beamtrace and you may also slightly reduce the level of divergence. With that said, _mm_movemask_ps() is also useful (maskmove = copy while masking some values out, movemask = copy the top bit of each value and shove it into an int).
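
A small sketch of the 4-rays-per-register layout together with `_mm_movemask_ps()`, assuming an x86 target with SSE. The `RayPacket` struct, the function names and the "hit" test against a plane at a fixed z are invented for illustration; a real tracer would test against voxels instead.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Four rays in structure-of-arrays form: lane i of every register belongs to ray i. */
typedef struct {
    __m128 posx, posy, posz;
    __m128 velx, vely, velz;
} RayPacket;

/* Advance all four rays by dt along their directions. */
static void packet_step(RayPacket *r, __m128 dt)
{
    r->posx = _mm_add_ps(r->posx, _mm_mul_ps(dt, r->velx));
    r->posy = _mm_add_ps(r->posy, _mm_mul_ps(dt, r->vely));
    r->posz = _mm_add_ps(r->posz, _mm_mul_ps(dt, r->velz));
}

/* Bit i of the result is set if ray i has reached z >= plane_z (toy hit test). */
static int packet_hit_mask(const RayPacket *r, float plane_z)
{
    __m128 hit = _mm_cmpge_ps(r->posz, _mm_set1_ps(plane_z));
    return _mm_movemask_ps(hit);   /* top bit of each lane packed into an int */
}

/* March until all four lanes have hit or max_steps is reached.
   hit_step[i] receives the step at which ray i first hit, or -1.
   Returns the mask of lanes that hit. */
static int packet_march(RayPacket *r, float plane_z, float dt,
                        int max_steps, int hit_step[4])
{
    int done = 0;
    assert(max_steps >= 0);
    for (int i = 0; i < 4; i++) hit_step[i] = -1;

    for (int step = 0; step < max_steps && done != 0xF; step++) {
        packet_step(r, _mm_set1_ps(dt));
        int mask  = packet_hit_mask(r, plane_z);
        int fresh = mask & ~done;              /* lanes that hit on this step */
        for (int i = 0; i < 4; i++)
            if (fresh & (1 << i)) hit_step[i] = step + 1;
        done |= mask;
    }
    return done;
}

/* Toy 2x2 packet: four neighbouring pixels with different speeds along z. */
static int packet_demo(int hit_step[4])
{
    RayPacket r;
    r.posx = _mm_setr_ps(0.0f, 1.0f, 0.0f, 1.0f);  /* 2x2 block of pixels */
    r.posy = _mm_setr_ps(0.0f, 0.0f, 1.0f, 1.0f);
    r.posz = _mm_setzero_ps();
    r.velx = _mm_setzero_ps();
    r.vely = _mm_setzero_ps();
    r.velz = _mm_setr_ps(1.0f, 2.0f, 4.0f, 0.5f);
    return packet_march(&r, 4.0f, 1.0f, 16, hit_step);
}
```

Lanes that have already hit keep being stepped until the slowest ray finishes, which is the divergence cost. Packing neighbouring pixels into one packet tends to keep the four rays closer to finishing together.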

If you're curious and would like to have a nosey around with SSE stuff, the guide you want is the Intel Intrinsics Guide, which used to be a Java program but is now a notably nicer web app: https://software.intel.com/sites/landin ... sicsGuide/

LeCom wrote:
Also, yes, C rules when it comes to these things. I was considering D for some time because of its high-level stuff (compiled+high-level=speed?), but I doubt it now.

I once wrote a raytracer in C++ which used virtual functions. The fact that I used virtual functions had negligible performance impact, because compilers these days are actually pretty good.

I don't know how good D's compiler is though. If you have no issues with C, then just stick with C.

----

Icarus North wrote:
And how tf does this topic have nearly 70 pages, It's like the biggest topic in this forum

Biggest active thread. Biggest thread used to be the original Iceball thread but jdrew's shitty clan managed to beat that record by treating the thread like a chat room.

**LeCom** » Fri Nov 13, 2015 9:25 am

I could implement a stack for the rays and distribute them among cores and SSE# registers, or other stuff, I know. But first I want a working and good implementation of the tracer. If one's algorithm isn't good enough, not even SIMD or hyperthreading will help.

As for D, there's a GCC port (GDC) and an LLVM implementation. I don't know if LLVM's optimization code is language-independent and therefore as fast as the C one, but there's still GDC. C works pretty fine though, I just had some thoughts that certain high-level stuff in D could be faster than doing it by hand in C (pretty much like the ASM vs. C comparison). However, I don't think so anymore and stick to C anyway.

Edit: page 69

**longbyte1** » Sat Nov 14, 2015 6:43 am

I actually once considered using D (heck, it lets you do inline asm!) but wasn't really sure of the performance given the limited assortment of available compilers. I'm glad you made the same consideration too.

**LeCom** » Sat Nov 14, 2015 8:45 am

longbyte1 wrote:
I actually once considered using D (heck, it lets you do inline asm!) but wasn't really sure of the performance given the limited assortment of available compilers. I'm glad you made the same consideration too.

I don't really get the link between the number of available compilers and language performance. Out of the 3 main D compilers available, one is the reference implementation with a focus on reliability and correctness, and the other two are ports of the two most important and fastest compilers out there (GCC and LLVM). Plus, benchmarks say that D is almost as fast as C/C++, the main slowdown reason being the shitty garbage collector (that you can disable ofc).
Moreover, inline asm is nothing compared to the freshly added OOP-based SIMD implementation.

**longbyte1** » Sat Nov 14, 2015 3:55 pm

LeCom wrote:
longbyte1 wrote:
I actually once considered using D (heck, it lets you do inline asm!) but wasn't really sure of the performance given the limited assortment of available compilers. I'm glad you made the same consideration too.
I don't really get the link between the number of available compilers and language performance. Out of the 3 main D compilers available, one is the reference implementation with a focus on reliability and correctness, and the other two are ports of the two most important and fastest compilers out there (GCC and LLVM). Plus, benchmarks say that D is almost as fast as C/C++, the main slowdown reason being the shitty garbage collector (that you can disable ofc).
Moreover, inline asm is nothing compared to the freshly added OOP-based SIMD implementation.

What are you trying to tell me?

**LeCom** » Sun Nov 15, 2015 12:02 pm

longbyte1 wrote:
What are you trying to tell me?

Same here

Icarus North wrote:
That's basically all it is anyways since people aren't actively discussing and playing it anymore.

Can't you just, like, not give a fuck?

**bloodfox** » Sun Nov 15, 2015 9:22 pm

LeCom wrote:
longbyte1 wrote:
What are you trying to tell me?
Same here
Icarus North wrote:
That's basically all it is anyways since people aren't actively discussing and playing it anymore.
Can't you just, like, not give a fuck?

impossibruuuuuuuuuuuuuuuuuuuuu
