Writing a compiler, got an obfuscator?
-
Hello, my post isn't about a programming bug, but rather about a funny Reflector bug that I found while developing my compiler. Consider this code:
float getRange(float2 x, float2 y) { return sqrt((x.x - y.x)*(x.x-y.x) + (x.y-y.y)*(x.y-y.y)); }
It's a pretty trivial function, and we all know what it does. My compiler is for a Cg-like language, so float2 is a native type for it, but to stay compatible with external .NET modules it represents float2 as an ordinary struct. When it compiled the function to IL, I got code like this (only the first few lines of IL):
L_0000: ldarg.1
L_0001: stloc.0
L_0002: ldloca.s num
L_0004: ldfld float32 [R.Languages]R.Languages.Compilers.Shady.StdLib.Float2::x
L_0009: ldarg.2
L_000a: stloc.0
L_000b: ldloca.s num
L_000d: ldfld float32 [R.Languages]R.Languages.Compilers.Shady.StdLib.Float2::x
L_0012: sub
ldarg.1 and ldarg.2 load the vectors passed as arguments. So the code loads argument 1 (the x vector), stores it in a local variable, loads the address of that local and reads its x field onto the stack. It then does the same for the second vector and subtracts the two values on the stack. In other words, this is the IL for the x.x - y.x subexpression above.
And now, here is what Reflector.NET printed:
private float getRange(Float2 num1, Float2 num2)
{
    Float2 num = num1;
    num = num2;
    num = num1;
    num = num2;
    num = num1;
    num = num2;
    num = num1;
    num = num2;
    return CgLib.sqrt((float) (((num.x - num.x) * (num.x - num.x)) + ((num.y - num.y) * (num.y - num.y))));
}
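To make the difference concrete, here is a minimal Python sketch of a toy evaluation stack that replays the nine IL instructions from the listing, next to a literal reading of the printed C#. The names run_il and decompiled_reading and the Float2 dataclass are made up for this illustration; this is not how the CLR or Reflector is implemented, only a model of the order in which the stores and field reads happen.

```python
from dataclasses import dataclass


@dataclass
class Float2:
    """Stand-in for the Float2 struct from the post (a value type with x and y)."""
    x: float
    y: float


def run_il(arg1: Float2, arg2: Float2) -> float:
    """Replay the nine IL instructions from the listing on a toy stack."""
    stack = []

    # L_0000..L_0004: copy arg1 into the local, then read its x field right away
    stack.append(Float2(arg1.x, arg1.y))   # ldarg.1 (structs are copied by value)
    num = stack.pop()                      # stloc.0
    stack.append(num)                      # ldloca.s num (stands in for the address)
    stack.append(stack.pop().x)            # ldfld x: pops the address, pushes the float

    # L_0009..L_000d: overwrite the local with arg2, then read its x field again
    stack.append(Float2(arg2.x, arg2.y))   # ldarg.2
    num = stack.pop()                      # stloc.0
    stack.append(num)                      # ldloca.s num
    stack.append(stack.pop().x)            # ldfld x

    # L_0012: sub pops the right operand first, then the left one
    right = stack.pop()
    left = stack.pop()
    return left - right


def decompiled_reading(arg1: Float2, arg2: Float2) -> float:
    """What the printed C# computes: both stores first, both reads afterwards."""
    num = arg1
    num = arg2
    return num.x - num.x


if __name__ == "__main__":
    a, b = Float2(5.0, 1.0), Float2(2.0, 1.0)
    print(run_il(a, b))              # 3.0
    print(decompiled_reading(a, b))  # 0.0
```

The IL reads each field before the local is overwritten, so the floats already on the stack are unaffected by the second store. The printed C# hoists all the stores in front of the reads, which is why it would always produce zero if it were compiled as written.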
As you can see, Reflector interpreted the instructions that load each argument and store it in the temporary, but it forgot that the field values on the stack come from different vectors, even though the same temporary vector "num" was used to hold both of them. The function works correctly, by the way :). And by the way - do you know other methods of cheating Reflector?
-
Ravadre wrote:
And by the way - do you know other methods of cheating reflector?
Well, one obfuscation technique I know of, which I think is quite clever, relies on the fact that the .NET CLR allows overloading on return type as well as on parameter types. I don't think any of the CLS languages allow this, so an obfuscator can give the same name to two methods with the same parameters but differing return types. This cannot be 'decompiled' without renaming one of the methods. I doubt it will cheat Reflector though; it will just report something impossible.
Regards, Rob Philpott.
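A rough way to picture the trick, sketched in Python: treat each method as a (name, parameters, return type) record and compare two identity rules. The Method record and the two key functions are hypothetical and are not a real reflection API; the sketch assumes, as described above, that metadata-level identity includes the return type while source-level overload resolution looks only at the name and the parameters.

```python
from typing import NamedTuple, Tuple


class Method(NamedTuple):
    """Hypothetical, simplified method descriptor (not a real reflection API)."""
    name: str
    params: Tuple[str, ...]
    returns: str


def metadata_key(m: Method):
    # Assumption: identity at the metadata level includes the return type.
    return (m.name, m.params, m.returns)


def source_key(m: Method):
    # Assumption: a C#-like source language distinguishes overloads
    # by name and parameter types only.
    return (m.name, m.params)


def source_collisions(methods):
    """Return groups of methods that differ in metadata but clash in source."""
    groups = {}
    for m in methods:
        groups.setdefault(source_key(m), []).append(m)
    return [g for g in groups.values() if len({metadata_key(m) for m in g}) > 1]


if __name__ == "__main__":
    methods = [
        Method("a", ("int32",), "int32"),
        Method("a", ("int32",), "string"),   # same name and parameters, other return type
        Method("a", ("string",), "void"),    # ordinary overload, no clash
    ]
    for group in source_collisions(methods):
        print(group)
```

Only the first two entries end up in a collision group. That pair is what a decompiler cannot print as valid source without renaming one of them.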
-
Rob Philpott wrote:
This cannot be 'decompiled' without renaming one of the methods.
I would start off by giving all the methods a unique name and then decompiling. Of course, nobody would simply obfuscate by just renaming everything…
My GUID: ca2262a7-0026-4830-a0b3-fe5d66c4eb1d :) Now I can Google this value and find all my Code Project posts!
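As a sketch of that first step, assuming methods are modelled as plain (name, parameters, return type) tuples (a made-up representation, not the format any real deobfuscator uses), a renaming pass could be as small as this:

```python
def rename_uniquely(signatures):
    """Give every method a fresh, unique name.

    `signatures` is a list of (name, params, return_type) tuples; this tuple
    layout is an assumption made for the example.
    """
    renamed = []
    for index, (_old_name, params, returns) in enumerate(signatures):
        # The position in the list is unique, so the new name is unique too.
        renamed.append((f"method_{index}", params, returns))
    return renamed


if __name__ == "__main__":
    obfuscated = [
        ("a", ("int32",), "int32"),
        ("a", ("int32",), "string"),  # differs from the first only by return type
    ]
    for signature in rename_uniquely(obfuscated):
        print(signature)
```

After the pass no two methods share a name, so return-type-only overloads no longer clash at the source level. A real tool would also have to rewrite every call site to use the new names, which this sketch leaves out.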
-
There are lots of ways to make Reflector act all wonky if you're the one emitting the IL. I used Boo full-time for about a year and Reflector's C# formatter would often translate the IL into an error message like "This is not a valid method." or would just crash Reflector altogether. The IL was valid, though.
"we must lose precision to make significant statements about complex systems." -deKorvin on uncertainty
-
Hi, if you truly value your code, obfuscation is definitely not the answer. No matter what tool you use, if someone is patient enough they will be able to reverse engineer it. It should only be used as a deterrent and not for code protection.
Regards Julian Mummery
Please Visit my FREE Bug / Fault Logging Website at FaultLogger.com
-
Firstly, I would like to point out that what I am working on is a compiler; the fact that the code came out obfuscated is a side effect :). Secondly, I don't know if I can agree with you completely. Let's say we have a project of 50k+ lines of code, which includes 500 lines of very innovative code that alone is worth a lot. By obfuscating our code, we increase the chance that it won't be stolen. And even if someone wants to steal it, he will of course succeed, but at a higher cost (more time spent on reverse engineering means more money needed). It's the same situation as writing crack-protection code: you can't write an uncrackable app, but you can "buy" yourself some time before it is cracked.