But, fundamentally, this is how lambda functions are implemented anyway in those functional programming languages. The lambda functions you write in those high-level languages are ultimately translated into similar code and (usually) executed dynamically on the stack. A lot of these high-level languages compile down to C too (or C--).
The idea of executing machine code directly on the stack should be familiar to those who know how buffer overflow exploits work. So first, we create a reasonably sized buffer on the stack (in main()) to hold the code:
char lambda[5000];Then, we declare some functions above main():
/**
* Definition of function #1:
*/
int add( int x, int y )
{
return x + y;
}
void dummyfunc1() {}
// We need to end with a dummy function so we can easily compute the size of
// the function by calculating the differences in addresses.
/**
* Definition of function #2:
*/
int sumsquares( int x, int y )
{
// Just for fun, we make this function bigger:
int part1 = x * x;
int part2 = y * y;
return part1 + part2;
}
void dummyfunc2() {}Right. Now comes the power of the C programming language. The reason why I padded dummy functions after those declarations is just enhance clarity in this demonstration. They strictly aren't necessary and I could have simply computed the offsets from the next function. But function order must be strictly obeyed because of the way the code subsequently gets laid out in memory.
A function is simply an array of machine instructions. So we just need to compute the size of those two functions:
unsigned long function_add_size = (long)dummyfunc1 - (long)add;unsigned long function_sumsquares_size = (long)dummyfunc2 - (long)sumsquares;Now all have left to do is load the function into the stack buffer, by copying memory directly:
memcpy( &lambda, (char*)add, function_add_size );Now we can simply execute the code anytime we wish by casting the stack buffer to the appropriate function pointer type and calling it:
a = ((int (*)(int, int)) lambda)(4, 5);We do the same with the second function, then we print the result:
memcpy( &lambda, (char*)sumsquares, function_sumsquares_size );
b = ((int (*)(int, int)) lambda)(4, 5);Lo and behold (this is what printf prints):
a = 9
b = 41But we can go further than that. Let's dynamically load in the machine code at runtime. First, we write a function that does subtraction in C, then we disassemble the function:
(gdb) disassemble subtract
Dump of assembler code for function subtract:
0x0804845f <subtract+0>: push %ebp
0x08048460 <subtract+1>: mov %esp,%ebp
0x08048462 <subtract+3>: mov 0xc(%ebp),%edx
0x08048465 <subtract+6>: mov 0x8(%ebp),%eax
0x08048468 <subtract+9>: sub %edx,%eax
0x0804846a <subtract+11>: pop %ebp
0x0804846b <subtract+12>: ret
End of assembler dump.GDB tells us that the function is 13 bytes in size. So let's directly read off the machine instructions:
(gdb) x/13xb subtract
0x804845f <subtract>: 0x55 0x89 0xe5 0x8b 0x55 0x0c 0x8b 0x45
0x8048467 <subtract+8>: 0x08 0x29 0xd0 0x5d 0xc3Some fun in Ruby turns that into the following string (representing the machine code):
irb(main):026:0> puts "0x55 0x89 0xe5 0x8b 0x55 0x0c 0x8b 0x45 0x08 0x29 0xd0 0x5d 0xc3".split.map {|code| '\\'+code[1..3]}.join
\x55\x89\xe5\x8b\x55\x0c\x8b\x45\x08\x29\xd0\x5d\xc3Which we directly load into the buffer...
strcpy( lambda, "\x55\x89\xe5\x8b\x55\x0c\x8b\x45\x08\x29\xd0\x5d\xc3" );And execute...
c = ((int (*)(int, int)) lambda)(10, 2);Which returns (the result of 10 - 2):
c = 8Another reason why C is my favorite programming language. Source code is located here for the curious.
0 comments:
Post a Comment