Porting Voxlap

Miscellaneous projects by the Build and Shoot community.
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


As cautiously hoped for, fixing the gcc inline assembly in updatereflects has fixed the kv6s. Check the temp branch on my github shortly. no_asm will not benefit because the C is still broken, and that branch contains no assembly, as its name states.

Also, running the game from share to load the test map works again (it didn't for a while). That means the game crashing when you shoot something appears to be the last major bug.
VladVP
Post Demon
Posts: 1425
Joined: Fri Dec 14, 2012 10:48 pm


Phew, it's finally holidays next week! I'll have plenty of time to finish that function map. To be honest, I'm hardly through the file-related functions xD
Handles
League Participant
Posts: 1087
Joined: Tue Jan 08, 2013 9:46 pm


Holidays next week? Where are you? In Australia we just finished 8 weeks of holiday!
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


Oh cool! I forgot you were working on that, Vlad. With luck, Conservative will compile by then too.
VladVP
Post Demon
Posts: 1425
Joined: Fri Dec 14, 2012 10:48 pm


I just spent two hours tracing all the function calls for loadnul() and all of its subfunctions... I'm not done yet.... Here's my progress: http://www.lucidchart.com/invitations/a ... 100a005798
I had to create an entire new document just to contain the function map of this single function...

Why Ken Silverman? WHYYY did you have to make Voxlap this complicated!?!?!
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


Cool work! Yeah, voxlap is over the top. It's all in the name of optimization, though with compilers these days I am not sure how much of that still applies. Also, have you heard of cflow (http://www.gnu.org/software/cflow/manual/cflow.html)? I may not understand correctly what you are doing, but this might automate it to a certain extent.
Handles
League Participant
Posts: 1087
Joined: Tue Jan 08, 2013 9:46 pm


If VladVP is doing this the way I think he is, the software could help.
VladVP
Post Demon
Posts: 1425
Joined: Fri Dec 14, 2012 10:48 pm


Heeyyy, I didn't think of making an application that automates this for me... thanks Sonar, I'll look into it ;)
Cajun Style
Deuced Up
Posts: 145
Joined: Fri Dec 07, 2012 11:04 am


GJ Sonar and Vlad. I have been slacking off >_>;
Interesting to see white models were introduced with the removal of assembly. It isn't far-fetched that all the original C alternatives were untested... I had assumed Sonar rewrote the C up till now >,< Oh, fail is me. Hopefully you can translate the assembly in question... oh wait, it works >_<
BTW I did a rebase, because it seemed I had some funky manual merges in my tree. Also I discovered git merge-base <3
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


VladVP: I doubt it can handle function pointers in the general case, but voxlap doesn't have too many of those, so it should do a decent job.

Cajun: Don't worry, we all were for a while there. Yeah, I would have assumed Ken made the C first to test the assembly against it, but evidently that is not the case. Alternatively, there was a time when the C did work but some change elsewhere broke it. I wonder if he made any tests at all, as those would be extremely useful for finishing this.
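For what it's worth, a test like that could start from a plain C reference for the MMX sequence in the updatereflects loop (pmaddwd, pshufw 0x4e, paddd, pshufw 0x55, then pmulhuw). This is only a sketch based on the documented behaviour of those instructions. The function names are made up for illustration and are not from voxlap, and I am assuming each iunivec entry and the light vector are four signed 16-bit lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference, not voxlap code.
 * Mirrors: pmaddwd, pshufw $0x4e, paddd, pshufw $0x55 on one 64-bit MMX value.
 *   pmaddwd      -> two 32-bit sums: v0*l0 + v1*l1 and v2*l2 + v3*l3
 *   pshufw 0x4e  -> copy with the two 32-bit halves swapped
 *   paddd        -> both halves now hold the full 4-term dot product (wraps mod 2^32)
 *   pshufw 0x55  -> broadcast 16-bit word 1, i.e. bits 16..31 of that sum
 */
static void dot4_broadcast_hi(const int16_t v[4], const int16_t l[4],
                              uint16_t out[4])
{
    /* Do the additions in unsigned arithmetic so wrap-around is well defined in C. */
    uint32_t lo  = (uint32_t)((int32_t)v[0] * l[0]) + (uint32_t)((int32_t)v[1] * l[1]);
    uint32_t hi  = (uint32_t)((int32_t)v[2] * l[2]) + (uint32_t)((int32_t)v[3] * l[3]);
    uint32_t sum = lo + hi;
    uint16_t w   = (uint16_t)(sum >> 16);

    for (int i = 0; i < 4; i++)
        out[i] = w;
}

/* Hypothetical reference for one lane of pmulhuw:
 * high 16 bits of an unsigned 16x16 multiply. */
static uint16_t mulhi_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}
```

Feeding the same inputs to this and to the asm path and comparing the kv6colmul entries would show fairly quickly whether a converted block still behaves the same.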

I can see that Vlad has updated to my latest version. I guess you haven't pushed to github yet, Cajun. Also, if either of you could see what I've done to fix the assembly in the last couple of commits on conservative and repeat the process on the remaining blocks, that would be MUCH appreciated. It's a very mechanical process, I just have a busy week.
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


Liking the new forum!
Cajun Style
Deuced Up
Posts: 145
Joined: Fri Dec 07, 2012 11:04 am


I have just pushed.
Here's a little callgraph I wrote while plowing through the functions. It is exhaustive for the top/parent functions listed.
Spoiler:
genmipkv6: calls
umulshr32

loadkv6: calls
kzclose
kzopen
kzread

kzread: calls
getbits
hufgencode
peekbits
putbuf4zip
qhufgencode
suckbits

expandbitstack: calls
clearbuf
expandbit256

estnorm: calls
expandbitstack
isvoxelsolid
That program looks very promising, and might deprecate all manual efforts though.
Just to be sure, are these the kind of changes you meant, Sonar?
git diff 680ea132 a2307944
Spoiler:
Code:
diff --git a/source/voxlap5.cpp b/source/voxlap5.cpp
index d518273..1f11765 100644
--- a/source/voxlap5.cpp
+++ b/source/voxlap5.cpp
@@ -9055,14 +9055,14 @@ void drawtile (long tf, long tp, long tx, long ty, long tcx, long tcy,
 	long p, i, j, a;
 
 	#if defined(__GNUC__) && !defined(NOASM) //only for gcc inline asm
-	register float reg0 asm("mm0");
-	register float reg1 asm("mm1");
-	register float reg2 asm("mm2");
-	//register float reg3 asm("mm3");
-	register float reg4 asm("mm4");
-	register float reg5 asm("mm5");
-	register float reg6 asm("mm6");
-	register float reg7 asm("mm7");
+	register lpoint2d reg0 asm("mm0");
+	register lpoint2d reg1 asm("mm1");
+	register lpoint2d reg2 asm("mm2");
+	//register lpoint2d reg3 asm("mm3");
+	register lpoint2d reg4 asm("mm4");
+	register lpoint2d reg5 asm("mm5");
+	register lpoint2d reg6 asm("mm6");
+	register lpoint2d reg7 asm("mm7");
 	#endif
 
 	if (!tf) return;
@@ -10723,6 +10723,17 @@ static void updatereflects (vx5sprite *spr)
 	float f, g, h, fx, fy, fz;
 	long i, j;
 
+	#if defined(__GNUC__) && !defined(NOASM) //only for gcc inline asm
+	register lpoint2d reg0 asm("mm0");
+	register lpoint2d reg1 asm("mm1");
+	register lpoint2d reg2 asm("mm2");
+	register lpoint2d reg3 asm("mm3");
+	//register lpoint2d reg4 asm("mm4");
+	register lpoint2d reg5 asm("mm5");
+	register lpoint2d reg6 asm("mm6");
+	//register lpoint2d reg7 asm("mm7");
+	#endif
+
 #if 0
 	//KV6 lighting calculations for: fog, white, black, intens(normal dot product), black currently not supported!
 
@@ -10830,25 +10841,27 @@ static void updatereflects (vx5sprite *spr)
 			#ifdef __GNUC__ //gcc inline asm
 			__asm__ __volatile__
 			(
-				".intel_syntax noprefix\n"
-				"movq	mm6, lightlist[0]\n"
-				"mov	ecx, 255*8\n"
 			".Lnolighta:\n"
-				"movq	mm0, iunivec[ecx]\n"
-				"movq	mm1, iunivec[ecx-8]\n"
-				"pmaddwd	mm0, mm6\n" //mm0: [tp.a*iunivec.a + tp.z*iunivec.z][tp.y*iunivec.y + tp.x*iunivec.x]
-				"pmaddwd	mm1, mm6\n"
-				"pshufw	mm2, mm0, 0x4e\n"  //Before: mm0: [ 0 ][ a ][   ][   ][ 0 ][ b ][   ][   ]
-				"pshufw	mm3, mm1, 0x4e\n"
-				"paddd	mm0, mm2\n"
-				"paddd	mm1, mm3\n"
-				"pshufw	mm0, mm0, 0x55\n"
-				"pshufw	mm1, mm1, 0x55\n"  //After:  mm0: [   ][   ][   ][a+b][   ][a+b][   ][a+b]
-				"movq	kv6colmul[ecx], mm0\n"
-				"movq	kv6colmul[ecx-8], mm1\n"
-				"sub	ecx, 2*8\n"
-				"jnc	short .Lnolighta\n"
-				".att_syntax prefix\n"
+				"movq	%c[uv](%[c]), %[y0]\n"
+				"movq	%c[uv]-8(%[c]), %[y1]\n"
+				"pmaddwd	%[y6], %[y0]\n"      //mm0: [tp.a*iunivec.a + tp.z*iunivec.z][tp.y*iunivec.y + tp.x*iunivec.x]
+				"pmaddwd	%[y6], %[y1]\n"
+				"pshufw	$0x4e, %[y0], %[y2]\n"   //Before: mm0: [ 0 ][ a ][   ][   ][ 0 ][ b ][   ][   ]
+				"pshufw	$0x4e, %[y1], %[y3]\n"
+				"paddd	%[y2], %[y0]\n"
+				"paddd	%[y3], %[y1]\n"
+				"pshufw $0x55, %[y0], %[y0]\n"
+				"pshufw	$0x55, %[y1], %[y1]\n"   //After:  mm0: [   ][   ][   ][a+b][   ][a+b][   ][a+b]
+				"movq	%[y0], %c[kvcm](%[c])\n"
+				"movq	%[y1], %c[kvcm]-8(%[c])\n"
+				"sub	$2*8, %[c]\n"
+				"jnc    .Lnolighta\n"
+				: [y0] "=y" (reg0), [y1] "=y" (reg1),
+				  [y2] "=y" (reg2), [y3] "=y" (reg3),
+				  [y6] "=y" (reg6)
+				: [c]  "r" (255*8), "4" (*(int64_t *)lightlist),
+				  [uv] "p" (iunivec), [kvcm] "p" (kv6colmul)
+				:
 			);
 			#endif
 			#ifdef _MSC_VER //msvc inline asm
@@ -10884,28 +10897,30 @@ static void updatereflects (vx5sprite *spr)
 			#ifdef __GNUC__ //gcc inline asm
 			__asm__ __volatile__
 			(
-				".intel_syntax noprefix\n"
-				"punpcklbw	mm5, vx5.kv6col\n"
-				"movq	mm6, lightlist[0]\n"
-				"mov	ecx, 255*8\n"
+				"punpcklbw	%[vxpart], %[y5]\n"
 			".Lnolightb:\n"
-				"movq	mm0, iunivec[ecx]\n"
-				"movq	mm1, iunivec[ecx-8]\n"
-				"pmaddwd	mm0, mm6\n" //mm0: [tp.a*iunivec.a + tp.z*iunivec.z][tp.y*iunivec.y + tp.x*iunivec.x]
-				"pmaddwd	mm1, mm6\n"
-				"pshufw	mm2, mm0, 0x4e\n" //Before: mm0: [ 0 ][ a ][   ][   ][ 0 ][ b ][   ][   ]
-				"pshufw	mm3, mm1, 0x4e\n"
-				"paddd	mm0, mm2\n"
-				"paddd	mm1, mm3\n"
-				"pshufw	mm0, mm0, 0x55\n"
-				"pshufw	mm1, mm1, 0x55\n" //After:  mm0: [   ][   ][   ][a+b][   ][a+b][   ][a+b]
-				"pmulhuw	mm0, mm5\n"
-				"pmulhuw	mm1, mm5\n"
-				"movq	kv6colmul[ecx], mm0\n"
-				"movq	kv6colmul[ecx-8], mm1\n"
-				"sub	ecx, 2*8\n"
-				"jnc short .Lnolightb\n"
-				".att_syntax prefix\n"
+				"movq	%c[uv](%[c]), %[y0]\n"
+				"movq	%c[uv]-8(%[c]), %[y1]\n"
+				"pmaddwd	%[y6], %[y0]\n"      //mm0: [tp.a*iunivec.a + tp.z*iunivec.z][tp.y*iunivec.y + tp.x*iunivec.x]
+				"pmaddwd	%[y6], %[y1]\n"
+				"pshufw	$0x4e, %[y0], %[y2]\n"   //Before: mm0: [ 0 ][ a ][   ][   ][ 0 ][ b ][   ][   ]
+				"pshufw	$0x4e, %[y1], %[y3]\n"
+				"paddd	%[y2], %[y0]\n"
+				"paddd	%[y3], %[y1]\n"
+				"pshufw $0x55, %[y0], %[y0]\n"
+				"pshufw	$0x55, %[y1], %[y1]\n"   //After:  mm0: [   ][   ][   ][a+b][   ][a+b][   ][a+b]
+				"pmulhuw	%[y5], %[y0]\n"
+				"pmulhuw	%[y5], %[y1]\n"
+				"movq	%[y0], %c[kvcm](%[c])\n"
+				"movq	%[y1], %c[kvcm]-8(%[c])\n"
+				"sub	$2*8, %[c]\n"
+				"jnc .Lnolightb\n"
+				: [y0] "+y" (reg0), [y1] "+y" (reg1),
+				  [y2] "+y" (reg2), [y3] "+y" (reg3),
+				  [y5] "=y" (reg5), [y6] "=y" (reg6)
+				: "5" (*(int64_t *)lightlist),
+				  [c]  "r" (255*8), [vxpart] "m" (vx5.kv6col),
+				  [uv] "p" (iunivec), [kvcm] "p" (kv6colmul)
 			);
 			#endif
 			#ifdef _MSC_VER //msvc inline asm

That looks like an Intel to AT&T conversion...which I thought wasn't necessary???
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


Fetched your changes. I still need to look through all of them, but thanks. Cajun_no_asm is merged and not rebased, but I guess that is fine for now. You haven't done anything with conservative or normal no_asm, so you might as well reset those to where mine are.
Cajun Style wrote:
That looks like an Intel to AT&T conversion...which I thought wasn't necessary???
Yeah, that's it all right. Here's the thing: the lines that use constraints (like %[asdf_name] ) need to be converted to GCC syntax. I could have left fixed registers like "mm1", but instead I converted them all to constraints, so just about every line had to be converted to gcc syntax.

Functions like hrendzsse, hrendnozsse, etc. are much bigger, so it's probably too much effort to do that, at least initially. If you look at hrendzsse or expandbit256, you can see I only needed to change the lines that touch an array or variable accessed by its symbol (a constant address after linking). gcc doesn't mangle those symbols for you, so I had to make a "p" constraint (a constant resolved at link time) for each symbol/address used like that, and convert those lines, but only those lines, to gcc inline assembly.

So if it's shorter, try your hand at converting the whole thing to gcc syntax with no fixed registers (most efficient), but initially, just to get conservative to build, converting only the lines with symbols will suffice. Again, expandbit256 is probably the best example of that.
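To make the constraint idea a bit more concrete, here is a minimal sketch that is deliberately much smaller than anything in voxlap. The names add_with_constraints, demo_table and load_demo_entry are made up for illustration. It uses general-purpose registers ("r") and a memory operand ("m") instead of the "y" and "p" constraints from the commits above, and it falls back to plain C on non-x86 targets so it builds anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical demo data, standing in for a global such as iunivec. */
static uint32_t demo_table[4] = {10, 20, 30, 40};

/* Adds b to a. %[dst] and %[src] are named operands; the compiler picks the
 * actual registers and substitutes them into the template string. */
static uint32_t add_with_constraints(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "addl %[src], %[dst]\n"   /* AT&T order: source first, destination last */
        : [dst] "+r" (a)          /* "+r": read and written, any general register */
        : [src] "r"  (b)          /* "r": read only, any general register */
        : "cc"                    /* add modifies the flags */
    );
    return a;
#else
    return a + b;                 /* portable fallback, same result */
#endif
}

/* Reads a global through an operand instead of naming its symbol directly
 * in the asm string. */
static uint32_t load_demo_entry(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t out;
    __asm__ (
        "movl %[src], %[dst]\n"
        : [dst] "=r" (out)             /* "=r": write-only output register */
        : [src] "m"  (demo_table[2])   /* "m": let the compiler form the address */
    );
    return out;
#else
    return demo_table[2];
#endif
}
```

The part before the first colon is the template, the second section lists outputs, the third lists inputs, and the last lists clobbers. The second function shows the general shape for the symbol case: the asm string never mentions demo_table itself, only the operand name.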
Cajun Style
Deuced Up
Posts: 145
Joined: Fri Dec 07, 2012 11:04 am


I'm sorry but it seems those things that need converting are the hardest to understand. I couldn't find a tutorial or reference as to what "constraints" mean exactly in this context. (Or what the syntax is exactly.) I hope Vlad understands. Otherwise a script, teaching us, or doing it yourself are the options.
Sonarpulse
Coder
Posts: 443
Joined: Thu Dec 13, 2012 7:18 pm


If you look through the thread, I describe it in pretty good detail a couple of times, but I'll admit searching for that is annoying. I mean, I would probably get to it within the next month, but if you and Vlad take a stab at it, it can get done sooner. If you make an attempt I'll certainly comment on it; that's probably the fastest way to learn.