Label addresses in inline assembly

Remco Hoogenboezem

Hello, I would like to emit the label addresses from a bit of inline assembly code into a branch table, but I could not figure out how to do it. I would expect it to look something like this; at least, the compiler emits something like this for a switch statement. I am using Visual Studio 2005.

__asm
{
	mov eax,BRANCHTABLE
	mov ecx,"Number between 0 and N"
	jmp [eax+ecx*4]
BRANCHTABLE
	&LABLE0
	&LABLE1
	&LABLE2
	...
	&LABLEN
LABLE0:
	...
LABLE1:
	...
LABLE2:
	...
...
LABLEN
}

Thanks ;)

Stuart Dootson

In MASM, possibly - but in inline assembly, you can't use operators such as & and you can't declare initialised data (which is what you're doing). I really don't know how you'd go about that. Maybe replace the branch table with a switch statement with interspersed inline assembly?

Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p

Lost User

Hi Remco, I created this little inline assembly sample for you. It loads the addresses of code labels into a jump table and allows you to jump to the offset based on a user choice. I commented each line.

#include "stdafx.h"

int _tmain(int argc, _TCHAR* argv[])
{
	#pragma pack(push, 1)
		unsigned long table[3] = {0};
	#pragma pack(pop)

	char szError[] = "Invalid choice.\n";
	char szPrompt[] = "Enter a number between 1 and 3:\n";
	char szNotify[] = "You entered jump number: %d";
	char szFmt[] = "%d";
	int iChoice =0;
	int iSizeArray = (sizeof(table) / sizeof(table[0]));

	__asm
	{
		lea esi, table			;Load address of table into esi
		mov edx, DWORD PTR jump1		;Move address of jump1 into edx
		mov [esi], edx			;Move edx into table[0]
		add esi, 4				;Increment esi by size of unsigned long
		mov edx, DWORD PTR jump2		;Move address of jump2 into edx
		mov [esi], edx			;Move edx into table[1]
		add esi, 4				;Increment esi by size of unsigned long
		mov edx, DWORD PTR jump3		;Move address of jump3 into edx
		mov [esi], edx			;Move edx into table[2]
		lea eax, szPrompt			;Load effective address of prompt
		push eax				;push eax onto stack
#ifdef _DLL						;Are we dynamically linked to C runtime?
		call DWORD PTR printf		;Call dynamic linked printf
#else
		call printf				;Call static linked printf
#endif
		add esp, 4				;adjust stack pointer because we pushed eax
		lea eax, iChoice			;Load effective address of iChoice
		push eax				;Push it onto the stack
		lea ebx, szFmt			;Load effective address of fmt
		push ebx				;Push it on the stack
#ifdef _DLL						;Are we dynamically linked to C runtime?
		call DWORD PTR scanf		;Call dynamic linked scanf
#else
		call scanf				;Call static linked scanf
#endif
		add esp, 8				;adjust stack pointer because we pushed eax and ebx
		mov eax, iChoice			;Move iChoice value into eax
		dec eax					;Decrement iChoice by 1 because the table is a zero-based array
		mov ebx, iSizeArray		;Move the size of our array into ebx
		cmp eax, ebx				;compare (unsigned, so a choice of 0 wraps to a huge value)
		jae error				;Jump to the error label if the index is outside the table array
		lea esi, table			;Load address of table into esi
		mov eax,[esi + 4*eax]		;Move value of table array into eax by calculating offset
		jmp eax				;Jump to address stored in eax
jump1:
		mov eax, 1				;Move number 1 into eax
		jmp notify				;Unconditional jump to notify label
jump2:
		mov eax, 2				;Move number 2 into eax
		jmp notify				;Unconditional jump to notify label
jump3:
		mov eax, 3				;Move number 3 into eax

Stuart Dootson

Oooh - nice. I'd wondered if there was a way to create the jump table as a C variable, but hadn't thought of initialising it in assembly code.
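
For readers who do not need inline assembly, the same "jump table as a variable" idea has a portable analogue: standard C++ cannot take the address of a label, but an array of function pointers gives the same indexed dispatch. The following is only a sketch; the names handler1 to handler3, kTable and dispatch are hypothetical and do not come from the posted sample.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical handlers standing in for the code at jump1..jump3.
static int handler1() { return 1; }
static int handler2() { return 2; }
static int handler3() { return 3; }

// The jump table is an ordinary variable, initialised at compile time.
using Handler = int (*)();
static const Handler kTable[] = { handler1, handler2, handler3 };
static const std::size_t kTableSize = sizeof(kTable) / sizeof(kTable[0]);

// Dispatch on a 1-based choice; returns -1 for an out-of-range choice.
int dispatch(int choice)
{
    // Converting to an unsigned type before subtracting 1 turns 0 and
    // negative choices into huge values, so one comparison rejects them all.
    const std::size_t index = static_cast<std::size_t>(choice) - 1;
    if (index >= kTableSize)
        return -1;
    assert(kTable[index] != nullptr);
    return kTable[index]();
}
```

Calling dispatch(2) runs handler2, while dispatch(0) and dispatch(4) are rejected by the single bounds check.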

Remco Hoogenboezem

Hello Stuart and David, thank you for your replies. Both solutions are practical, I think. I intended to use the branch table inside a time-critical part of my ray tracer (the ray vs. axis-aligned bounding box intersection test). I am building a stream ray tracer that uses the SSE ALU to process four rays in parallel. The number of rays to process is not always a multiple of four, so I thought I would use a branch table to process the last 1, 2 or 3 rays. Instead, I think I will write one function for each case and use a switch statement outside of the filter functions. That makes the code easier to read, and the C++ compiler does emit the code I would like. Thank you for your help!
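
The "one function per case plus a switch outside" plan might look roughly like the sketch below. It is an illustration under assumptions, not the actual ray-tracer code: filterGroup and filterStream are hypothetical names, and a plain scalar loop that doubles each value stands in for the SSE intersection test.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-group work; N is a compile-time constant, so each
// instantiation is a separate function the compiler can fully unroll.
template <std::size_t N>
static void filterGroup(float* rays)
{
    for (std::size_t i = 0; i < N; ++i)
        rays[i] *= 2.0f;   // placeholder for the real intersection test
}

// Processes full groups of four, then dispatches the 1 to 3 leftover rays
// through a switch that sits outside the filter functions.
void filterStream(float* rays, std::size_t count)
{
    assert(rays != nullptr || count == 0);
    const std::size_t full = count & ~static_cast<std::size_t>(3);
    for (std::size_t i = 0; i < full; i += 4)
        filterGroup<4>(rays + i);

    switch (count - full) {
    case 1: filterGroup<1>(rays + full); break;
    case 2: filterGroup<2>(rays + full); break;
    case 3: filterGroup<3>(rays + full); break;
    default: break;   // count was a multiple of four
    }
}
```

The switch runs once per stream, so any jump table the compiler generates for it is hit only at the tail.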

Luc Pattyn

A lot of jumps, and certainly those through a jump table, introduce a hiccup in the instruction flow, as they are hard to predict; so I'd rather avoid them. Assuming lots of rays, I would take a different approach: if the count is not a multiple of four, calculate one of the rays multiple times, e.g. duplicate the last ray one to three times so the number is always a multiple of 4. That will probably be simpler and may be faster. :)
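
A minimal sketch of the padding idea, assuming the rays live in a std::vector of float values (a stand-in for the real ray type) and that copying the stream is acceptable. The function name padToMultipleOfFour is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a copy of the stream padded to a multiple of four elements by
// repeating the last element one to three times.
std::vector<float> padToMultipleOfFour(const std::vector<float>& rays)
{
    std::vector<float> padded(rays);
    if (padded.empty())
        return padded;   // nothing to duplicate

    const std::size_t remainder = padded.size() % 4;
    if (remainder != 0) {
        const float last = padded.back();   // copy before the vector grows
        padded.insert(padded.end(), 4 - remainder, last);
    }
    assert(padded.size() % 4 == 0);
    return padded;
}
```

The duplicated results are simply discarded afterwards; the cost is up to three redundant computations per stream.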

Luc Pattyn

:badger: :jig: :badger:

Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.

:jig: :badger: :jig:

Remco Hoogenboezem

Yes, conditional branches can be expensive, causing pipeline stalls. But in my implementation using streams of rays there is only one hard-to-predict branch, and that branch is taken only once per processed stream, so it is not really that important. But a C++/assembly mix looks a little bit messy. That's why I removed the branch table from my inline assembly function, to improve code readability at the expense of code size. Expanding the number of rays per stream to a multiple of four is not an option for me because rays are partitioned in place, meaning that the filtered (output) stream must be the same length as the input stream. Thanks :)

Luc Pattyn

I forgot to mention one technique I often use in cases like this, where the expensive jump is taken only once (upon entry); I'll describe it in pseudo-code, it basically is a loop unroll by 4:

// assumes count > 0
switch(count%4) {
case 0:
// do step
goto case3;
case 3:
// do step
goto case2;
case 2:
// do step
goto case1;
case 1:
// do step
count-=4;
if (count>0) goto case0;
}

You can do this in any language, with a switch or with labels and jumps (and if the language allows fall-through, you may skip most of the goto's). In assembly, you would still need labels. :)
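
In C++ the technique described above can be written with switch fall-through, a construct usually known as Duff's device. The sketch below is illustrative only: runUnrolled is a hypothetical name, and incrementing a counter stands in for "do step". The computed jump happens once on entry; after that the loop body runs four steps per pass.

```cpp
#include <cassert>

// Performs "step" exactly count times by jumping once into a loop that is
// unrolled by four. Returns the number of steps executed.
int runUnrolled(int count)
{
    assert(count >= 0);
    int steps = 0;
    if (count == 0)
        return steps;   // the do-while below would otherwise run four steps

    int passes = (count + 3) / 4;   // number of (possibly partial) passes
    switch (count % 4) {            // the only hard-to-predict jump
    case 0: do { ++steps;           // each ++steps is one "do step"
    case 3:      ++steps;
    case 2:      ++steps;
    case 1:      ++steps;
            } while (--passes > 0);
    }
    return steps;
}
```

The first pass executes count % 4 steps (or four when the remainder is zero), and every later pass executes all four.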
