Protect Your Source Code: Obfuscation 101
Obfuscation Example
The University of Arizona has developed a very nice Java tool called Sandmark that is used to test and study software watermarking. As part of that study, it implements many well-known obfuscation algorithms and provides a GUI for a tool called Soot, an optimization tool that can also decompile bytecode. These two tools together allow you to take Java code, obfuscate it, and then decompile it to see the effect of the obfuscations. Let's get started with an example to put all of this together:
Download Sandmark. Get the executable sandmark.jar file and all the supporting jar files: BCEL.jar, bloat-1.0.jar, dynamicjava.jar, and junit.jar. Place these files in /Library/Java/Home.
Download Soot. You'll want the three precompiled jar files: sootclasses-2.2.1.jar, jasminclasses-2.2.1.jar, and polyglotclasses-1.3.jar available from the main page. Place them in /Library/Java/Home.
Download the jDecompile script. Add this script to your $PATH by typing export PATH=$PATH:/path/to/file/jDecompile. Also, change its permissions to executable with chmod u+x jDecompile. If you decide to use this tool a lot, you'll want to permanently add it to your path by modifying your .bashrc file. Make sure you're running Sandmark from the same Terminal you used to place this script in your path, or else Sandmark won't find the script and you'll get errors.
To start up Sandmark, navigate to the sandmark.jar file with Terminal and execute it by typing java -jar sandmark.jar. The toolbar up top expands with a button on the far right, and each tab has its own specialized help menu, which actually is pretty helpful. For instance, the jDecompile script you downloaded is an adaptation of a script from the "Decompile" tab that I tailored for OS X.
Sandmark has an easy-to-use GUI interface and a great help system.
To do a quick obfuscation of source code, you can either choose a particular obfuscation algorithm from the "Obfuscate" tab or let Sandmark apply a variety of obfuscation algorithms from the "Quick Protect" tab.
For an overview of the algorithms available in Sandmark, check here. The only caveat is that Sandmark expects a jar file. (If you'd like an overview of creating and working with jar files before jumping into an example with Sandmark, check here.)
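Since Sandmark works on jar files, it can help to see what makes a jar executable: a manifest whose Main-Class attribute names the entry point, plus the class entries themselves. The following is only a minimal sketch using the standard java.util.jar API; the class name ExecutableJarSketch, its method names, and the placeholder bytes standing in for a compiled class are all assumptions made for illustration, and it builds the jar in memory rather than on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper class, not part of Sandmark or Soot.
public class ExecutableJarSketch {

    // Build an in-memory jar whose manifest names mainClass as the entry point.
    public static byte[] buildJar(String mainClass, String entryName, byte[] entryBytes) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, or the other attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buffer, manifest)) {
            jar.putNextEntry(new JarEntry(entryName));
            jar.write(entryBytes);
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the Main-Class attribute back out of a jar's manifest.
    public static String readMainClass(byte[] jarBytes) {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            return in.getManifest().getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for a real compiled class file.
        byte[] jar = buildJar("IfElseDemo", "IfElseDemo.class", new byte[] {0});
        System.out.println("Main-Class: " + readMainClass(jar));
    }
}
```

The jar command-line tool does the same job from a manifest file on disk; the sketch just makes the structure visible.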
Save the following Java code to a file called IfElseDemo.java:
public class IfElseDemo {
    public static void main(String[] args) {
        int testscore = 76;
        char grade;
        if (testscore >= 90) {
            grade = 'A';
        } else if (testscore >= 80) {
            grade = 'B';
        } else if (testscore >= 70) {
            grade = 'C';
        } else if (testscore >= 60) {
            grade = 'D';
        } else {
            grade = 'F';
        }
        System.out.println("Grade = " + grade);
    }
}
Let's apply an obfuscation algorithm to this simple example and then decompile the obfuscated bytecode to see the difference.
Compile IfElseDemo in Terminal.
Save a file called IfElseDemo.java containing the example code above.
Type javac IfElseDemo.java to compile.
Type java IfElseDemo to verify that the code runs.
Wrap IfElseDemo.class into an executable jar file.
With VIM or a text editor of your choice, create a file called "mainClass" in this same directory.
In "mainClass", place this line: "Main-Class: IfElseDemo" (no quotes).
Back on the command line, type jar cmf mainClass IfElseDemo.jar IfElseDemo.class.
Verify that the jar file is created and type java -jar IfElseDemo.jar to verify that it executes properly.
Obfuscate the IfElseDemo.jar in Sandmark.
Choose the "Obfuscate" tab, and select the "Merge Local Integers" algorithm. Since our example code is primarily dependent upon integers for its logic, this looks like a good choice.
Name the output file IfElseDemo_obfuscated.jar.
Click on "Obfuscate".
Verify that IfElseDemo_obfuscated.jar exists and execute it with java -jar IfElseDemo_obfuscated.jar.
Decompile IfElseDemo_obfuscated.jar with Sandmark to see the difference.
Choose the "Decompile" tab by extending the tabs with the arrow button on the far right.
Choose the IfElseDemo_obfuscated.jar as your input file.
Type "IfElseDemo" (no quotes) into the "Class" text box.
Leave the "Classpath" text box blank.
Click on "Decompile".
If all goes well, a preview of the obfuscated source code opens up that is quite a bit harder to understand. If you have trouble with the decompiling portion, make sure your path is set correctly for the Terminal window in which you're running Sandmark.
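To give a feel for why the result is harder to read, here is a rough, hand-written sketch of the general idea behind merging integer locals. It assumes the transformation packs two int-sized locals into a single long and unpacks them with shifts and masks. This is not actual Sandmark or Soot output, and the class name MergedLocalsSketch and the variable v are made up for illustration.

```java
// Hand-written illustration only; real decompiler output will look different.
public class MergedLocalsSketch {

    // Same grading logic as IfElseDemo, but both locals share one long:
    // the upper 32 bits hold testscore, the lower 32 bits hold grade.
    public static char grade(int testscore) {
        long v = ((long) testscore) << 32;              // pack testscore into the high half

        if ((int) (v >>> 32) >= 90) {                   // unpack testscore for each comparison
            v = (v & 0xFFFFFFFF00000000L) | 'A';        // store grade in the low half
        } else if ((int) (v >>> 32) >= 80) {
            v = (v & 0xFFFFFFFF00000000L) | 'B';
        } else if ((int) (v >>> 32) >= 70) {
            v = (v & 0xFFFFFFFF00000000L) | 'C';
        } else if ((int) (v >>> 32) >= 60) {
            v = (v & 0xFFFFFFFF00000000L) | 'D';
        } else {
            v = (v & 0xFFFFFFFF00000000L) | 'F';
        }
        return (char) (v & 0xFFFFFFFFL);                // unpack grade
    }

    public static void main(String[] args) {
        System.out.println("Grade = " + grade(76));     // prints "Grade = C"
    }
}
```

The behavior matches the original program, but a reader now has to track bit positions just to work out which value is being compared.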
Although this example doesn't unlock any of the secrets of the universe, it does illustrate how effective obfuscation can be for even a simple example. Now imagine applying various obfuscation techniques to thousands of lines of more complex code.
If you take a look at the algorithms Sandmark offers, you'll notice that there are scores of confusing possibilities. Refactoring inheritance hierarchies, introducing confusing arithmetic operations, and introducing buggy variations of existing code blocks that never get executed are just a few of them. Keep in mind that you might want to obfuscate only the sensitive portions of your code, because obfuscation can impose size and performance penalties. The penalties may or may not make a difference; it's a trade-off you have to measure and consider.
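Two of those ideas, confusing arithmetic and buggy code that never runs, can be combined in what is often called an opaque predicate: a condition whose value is fixed but not obvious at a glance. The sketch below is a hand-written illustration under that assumption, not Sandmark's implementation; the names OpaquePredicateSketch, alwaysTrue, and sum are hypothetical.

```java
// Hand-written illustration of an opaque predicate guarding a decoy branch.
public class OpaquePredicateSketch {

    // x * (x + 1) is a product of two consecutive integers, so it is always
    // even, and int overflow does not change its parity. The check is
    // therefore always true, though that is not obvious to a casual reader.
    public static boolean alwaysTrue(int x) {
        return (x * (x + 1)) % 2 == 0;
    }

    public static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            if (alwaysTrue(i)) {
                total += values[i];              // the real code path
            } else {
                total -= values[i] * 2 + 1;      // buggy decoy, never executed
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3, 4}));   // prints 10
    }
}
```

The decoy branch also shows where the size and performance cost comes from: the extra test runs on every iteration, and the dead code still ships in the class file.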
Final Thoughts
In a world where everyone follows license agreements and no one wants to reverse engineer government secrets, obfuscation techniques wouldn't be of much use. Since we don't live in the Shire, however, security measures have their place and are just one of the many things that keep the world spinning. Hopefully, you now have a better feel for the compilation process and understand how obfuscation is a powerful tool you can use to protect your code from exploitation and hacking.
Matthew Russell is a computer scientist from middle Tennessee and serves Digital Reasoning Systems as the Director of Advanced Technology. Hacking and writing are two activities essential to his renaissance man regimen.
Showing messages 1 through 6 of 6.
-
Obfuscators can compact/ improve code
2005-04-14 12:54:38 JohnGrant
Pro obfuscators need not hurt performance.
DashO http://preemptive.com/products/dasho/Benefits.html
can compact / improve performance.
John
-
Article purpose
2005-04-13 09:23:25 Florijan
I am a beginner programmer, and I work with Java a lot at the moment. I read the article, and my thoughts were: oh well, not such a big wisdom described, however I can imagine it to be useful. I also thought that obfuscation is NOT such a nice method of "hiding" your source code, performance and size being the main reasons.
But I still thought it was a decent article because it introduces some tools and techniques I (and many others I think) have considered. I did not see any nasty presumptuous sentences like: "If you obfuscate (and obfuscation is beautiful and costs you nothing), you are free from all the possible hacking threats there are or ever will be in this universe!" And those comments I read on the article seemed to be attacking exactly that one line which WASN'T THERE. So, read carefully people.
I think it is an OK article, even though I don't think it will help many people.
-
Those who do not learn from history.
2005-04-12 00:37:11 Michael Schwern
the vast majority of software pirates won't spend 500 hours reverse engineering and patching a simple $10 shareware application
Every anti-piracy method is based on this assumption. Every one fails because it is not true. It was not true back when it was a bunch of computer nerds copying 5 1/4" floppies using BitNibbler downloaded off a dial-up BBS and it's not even remotely true now with your grandma downloading music over BitTorrent.
The vast majority do not need to break the obfuscation. Just one. The Internet takes care of the rest. And there's always somebody who does this sort of thing for fun and is shockingly good at it.
An article investigating automated obfuscation and discussing how it's done might have been interesting. But to conclude that obfuscation is security? Sophomoric.
-
God save us all...
2005-04-11 16:35:55 matthewmusgrove
...from developers like you.
Obfuscated source code can still be deobfuscated. Obfuscation only slows would-be attackers. You should be fired for claiming that obfuscation is a means of securing programs.
-
Extremely slow applications
2005-04-09 13:50:15 elanthis
If this is something that Java professionals really do, it's no wonder Java has such a record for being dog slow. ;-)
You are inserting gobs and gobs and gobs of useless, pointless code that the compiler doesn't optimize out (and is in fact incapable of optimizing out). What might have been done in a dozen opcodes now takes hundreds or thousands. Doesn't matter if you're using C, Java, or any other language, such obfuscation techniques make your final application far worse performance-wise.






people to believe that obfuscation is pointless. Especially schwern,
who tries stupidly to raise a red flag by stating that obfuscation :
| ... increases code complexity which increases code maintenance costs.
| If you obfuscate the code by hand, woe be unto the next person who has
| to maintain that code.
He obviously doesn't understand (or doesn't want potential customers of
obfuscating software to understand) that the original project source code
(on which the maintenance is performed) is never obfuscated, just the
source code that can be reverse engineered from the executables.
Sure there may still be some hackers who see obfuscation as a challenge
but the longer those criminal idiots spend hacking one piece of
software the less time they have to hack the next. And every day
a hack is unavailable is another day that unscrupulous downloaders
who want the product now but can't find a hack may consider buying a
licence.
So in short - yes .. obfuscation is a waste of time and resources...
the hackers' time and resources! Which is a good thing.