Analyzing "Obad.a" a.k.a. "The most sophisticated Android Trojan"

    Recently we came across a blog post from Kaspersky that introduces a new Android backdoor Trojan, "Obad.a", as "The most sophisticated Android Trojan", so we were curious whether Joe Sandbox Mobile 1.0 would be able to handle the APK (MD5: E1064BFD836E4C895B569B2DE4700284). Besides obfuscated class and method names, heavy string encryption and heavy use of Java reflection, the APK uses two relatively new exploits to make static and dynamic analysis more difficult for sandbox systems and human analysts.

    A brief test showed that no public sandbox system can handle the APK correctly. The reported results range from "No threat detected" and "no malicious activity detected" to "No executable file", probably because none of the systems managed to run the APK in the first place. To be fair, our analysis system wasn't able to handle the APK without an engine update either.

    The first exploit: Incomplete/corrupt AndroidManifest.xml

    The AndroidManifest.xml shipped in the APK is incomplete. It exploits weaknesses in the Android OS manifest checking code, which installs and runs an APK even if certain key attribute names are missing, although such an APK should be rejected. This is a major hurdle for sandbox systems that rely on extracting key information from the manifest (e.g. the activity name). Here is a small excerpt of the original, incomplete AndroidManifest.xml:


    As we can see, some important attribute namespaces and names are missing, such as "android:name" for the uses-permission tag or the "minSdkVersion" field in the manifest tag. To be able to run the APK in our system, we updated our engine to fix up the AndroidManifest.xml with a best-effort approach. This is an excerpt of the same manifest after the fix-up:


    With the fix-up in place, our engine was able to properly extract static information and install the APK at the correct target device API level.
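    To illustrate what a best-effort fix-up could look like, here is a minimal Python sketch. It is not the code of our engine. It assumes that the decoded manifest attributes still carry their Android resource IDs, so that a missing name can be looked up in a small excerpt of the public android.R.attr table. The decoded attribute layout (dicts with "ns", "name", "res_id" and "value"), the function name and the sample values are all hypothetical.

```python
# Illustrative only: restore attribute names that are missing from a decoded
# manifest, using the Android resource ID each attribute is assumed to carry.

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Small excerpt of the public android.R.attr table (resource ID -> name).
KNOWN_ATTRS = {
    0x01010003: "name",
    0x0101020C: "minSdkVersion",
    0x01010270: "targetSdkVersion",
}


def fix_up_attributes(attrs):
    """Return a copy of `attrs` with missing names and namespaces filled in.

    `attrs` is a hypothetical decoded form: a list of dicts with the keys
    "ns", "name", "res_id" and "value".
    """
    fixed = []
    for attr in attrs:
        attr = dict(attr)  # work on a copy, leave the caller's data untouched
        known_name = KNOWN_ATTRS.get(attr.get("res_id"))
        if known_name is not None:
            if not attr.get("name"):
                attr["name"] = known_name
            if not attr.get("ns"):
                attr["ns"] = ANDROID_NS
        fixed.append(attr)
    return fixed


# Made-up sample data: two attributes that can be repaired, one that cannot.
broken = [
    {"ns": "", "name": "", "res_id": 0x01010003,
     "value": "android.permission.SEND_SMS"},
    {"ns": "", "name": "", "res_id": 0x0101020C, "value": "8"},
    {"ns": "", "name": "", "res_id": None, "value": "?"},
]

for attr in fix_up_attributes(broken):
    print(attr["name"] or "<unknown>", "=", attr["value"])
```

    Attributes without a known resource ID are left untouched, which is what makes the approach best effort rather than a full repair.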

    The second exploit: Inserting data into the instruction stream

    The next problem we had was decoding and recompiling the classes.dex file (a necessary step for API call instrumentation), because the APK was modified in such a way that a packed data switch was inserted into the instruction stream. This in turn caused the Dalvik VM verifier to reject our repacked APK when loading a specific sample class. A typical error of this type is:

    VFY: encountered data table in instruction stream
    VFY: rejecting opcode 0x00 at 0x002a
    VFY: rejected Lcom/android/system/admin/oCIlCll;.oCIlCll ([B)[B
    Verifier rejected class Lcom/android/system/admin/oCIlCll;


    This is a pretty neat exploit, as it targets a weakness in existing dedexer tools and thereby hinders repacking APKs for automatic analysis. To solve the problem, we had to update our preprocessor to move the packed data switch back to its proper location before recompiling the classes.dex file.

    Dynamically analysing the APK

    Before running the APK, we created two signatures that match on these two special APK characteristics (corrupt manifest and packed data switch exploit). After quickly implementing the signatures, we ran the APK with Joe Sandbox Mobile 1.0 and here are some of the results:

    Corrupt Manifest Signature

    Exploit Signature

    Obfuscated Method Name Detection Signature (New!)

    In no time flat we were able to find some interesting code locations with dynamic data and reproduce the results described in the Kaspersky blog post mentioned at the top. Check out some of our results:


    Here we see the "decryptString" function call (instrumented generically by function signature) returning "su -c 'id'" and passing the string to Runtime.exec in an attempt to create a superuser shell. This demonstrates the power of the "Hybrid Code Analysis" (HCA) engine that we use in Joe Sandbox Mobile 1.0, which combines dynamic data with static code analysis (in this case, the disassembly of the function).


    Checking internet connection:


    "calling home" to androfox.com:



    Building a JSON command with device specific data:


    Internal DB structure used to save user data (the last 5 datasets are displayed by default).


    Querying for device admin privileges:



    All of the above information was extracted and discovered within minutes.

    Conclusion

    All in all, it is remarkable how quickly we were able to find the key behavior and code locations of the sample and get a very deep look into the sample's functionality using the disassembly listings. The generic instrumentation engine decrypted a large number of strings, and the reflective invoke translation made the real behavior clearly visible. It is equally striking how advanced and sophisticated malware is becoming on Android (as on Windows); in our opinion this sample shows that specialists and advanced technology are required to tackle threats at this level. That every public sandbox system failed shows how advanced this sample is: it uses two relatively new exploits that directly target automatic analysis systems. On the other hand, it was great to see how powerful our engine is and that we were able to react quickly to this new threat with an engine update. With our new generic signatures we expect to detect future variants of this malware, and we will be looking out for the next generation.


    Outlining some new Joe Sandbox Mobile 1.0 features with "Opfake.C", an SMS-based trojan for Android

    Joe Sandbox Mobile 1.0 is in the final stretch before release, and over the past weeks we have been working on some cool new features. We would like to briefly outline them using the "Opfake.C" SMS trojan (MD5: 001a42a555b4bd39bf6ecd8b11441870), which uses heavy obfuscation techniques (encrypted strings, reflective invokes) to hide its real behavior.

    Detecting Custom String Decryption Routines

    The instrumentation engine of Joe Sandbox Mobile is very versatile. Using special keywords, it is possible to instrument not only Android SDK API calls, but also any locally defined function of the current target. This is useful, for example, for instrumenting functions that might reveal interesting data for further analysis. A typical example is a static string decryption method that follows a function signature such as "public static String decryptRoutine(String encryptedString)". The following configuration option (enabled by default) instruments all functions that match this type of function signature:

    LOGAPI_SPECIAL0=__STATIC____ANYLOCALCLASS__;->__ANYFUNC__(Ljava/lang/String;)Ljava/lang/String;
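    To make the idea of such a template more concrete, here is a small Python sketch of how a template like the one above could be matched against method descriptors. This is not the engine's implementation. The keyword semantics are our simplified reading (__STATIC__ requires a static method, __ANYLOCALCLASS__ matches any class defined in the APK, __ANYFUNC__ matches any method name), and the helper names and the local class set are hypothetical.

```python
import re

TEMPLATE = ("__STATIC____ANYLOCALCLASS__;->__ANYFUNC__"
            "(Ljava/lang/String;)Ljava/lang/String;")


def compile_template(template):
    """Translate a template into (requires_static, compiled regex)."""
    requires_static = template.startswith("__STATIC__")
    if requires_static:
        template = template[len("__STATIC__"):]
    # Escape everything, then swap the escaped keywords for wildcards.
    pattern = re.escape(template)
    pattern = pattern.replace(re.escape("__ANYLOCALCLASS__"), r"L[^;]+")
    pattern = pattern.replace(re.escape("__ANYFUNC__"), r"[^(]+")
    return requires_static, re.compile(pattern)


def should_instrument(method, is_static, local_classes, template=TEMPLATE):
    """Decide whether a method descriptor matches the template.

    `method` uses the smali form "Lpkg/Class;->name(params)ret".
    `local_classes` is the set of class descriptors defined in the APK.
    """
    requires_static, regex = compile_template(template)
    if requires_static and not is_static:
        return False
    if regex.fullmatch(method) is None:
        return False
    class_name = method.split(";->", 1)[0] + ";"
    return class_name in local_classes


local = {"Lmkfkejkpu/mkfkejkpu;"}
decryptor = ("Lmkfkejkpu/mkfkejkpu;->mkfkejkpu"
             "(Ljava/lang/String;)Ljava/lang/String;")
sdk_method = ("Ljava/net/URLDecoder;->decode"
              "(Ljava/lang/String;)Ljava/lang/String;")

print(should_instrument(decryptor, True, local))
print(should_instrument(sdk_method, True, local))
```

    In this sketch the locally defined decryption routine is selected, while the SDK method with the same String-to-String shape is skipped because its class is not part of the APK.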


    As "Opfake.C" uses encrypted strings heavily, this feature comes in quite handy, as the following screenshot shows:


    The rows marked in red are executed API calls that were instrumented. As we can see, the string "_+J+:s.j:xjjFXQvo3ss.]j-3s" is decrypted to "java.net.HttpURLConnection" by calling the function "mkfkejkpu" in the "mkfkejkpu" class in package "mkfkejkpu". This information would have stayed hidden if the Joe Sandbox Mobile engine did not offer such rich configuration options, such as the template-style instrumentation of local functions. Of course, if we discover an interesting function call during analysis that is not instrumented, it is possible to update the configuration and rerun the sample to extract more live data.

    Resolving Reflective Invokes

    Another nice feature is that we associate objects returned by "getMethod" or "getDeclaredMethod" with reflective "invoke" calls to resolve the real method call and its parameter data. Obfuscation techniques often involve heavy use of reflective invokes. The following two screenshots illustrate this quite nicely:



    As can be seen in the first screenshot, the reflective invoke at line 94 is resolved properly to "java.net.URL.openConnection" and param0 reveals the URL "http://gogos1.net/index.php". These "invoke" translations are injected into the API call stream and passed to signatures, just as if they had been called directly by the APK. That way the full signature set is applied to the APK, regardless of whether reflection is being used. The second screenshot shows the same mechanism plus parameter name resolving. The "getIntExtra" call is resolved properly and the parameters "name" and "defaultValue" are extracted by our Android SDK parameter name resolving engine, making the obfuscation efforts moot.
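    The association between "getMethod" results and later "invoke" calls can be sketched in a few lines of Python. This is a simplified illustration, not our engine code. The event format (dicts with "api", "class_name", "args", "ret_id" and "this_id") and the object ID are hypothetical, and the receiver of the reflective call is assumed to be passed as the first "invoke" argument.

```python
LOOKUP_APIS = (
    "java.lang.Class.getMethod",
    "java.lang.Class.getDeclaredMethod",
)


def resolve_reflection(events):
    """Yield the call stream with resolvable Method.invoke calls rewritten."""
    methods = {}  # Method object id -> (class name, method name)
    for event in events:
        api = event["api"]
        if api in LOOKUP_APIS:
            # Remember which method the returned Method object stands for.
            methods[event["ret_id"]] = (event["class_name"], event["args"][0])
            yield event
        elif (api == "java.lang.reflect.Method.invoke"
              and event["this_id"] in methods):
            class_name, method_name = methods[event["this_id"]]
            # Inject the call as if the APK had made it directly.
            yield {
                "api": class_name + "." + method_name,
                "args": event["args"],
                "via_reflection": True,
            }
        else:
            yield event


trace = [
    {"api": "java.lang.Class.getMethod", "class_name": "java.net.URL",
     "args": ["openConnection"], "ret_id": 7},
    {"api": "java.lang.reflect.Method.invoke", "this_id": 7,
     "args": ["http://gogos1.net/index.php"]},
]

for call in resolve_reflection(trace):
    print(call["api"], call["args"])
```

    Because the rewritten call looks like a direct API call, any signature that matches on "java.net.URL.openConnection" would see it without knowing that reflection was involved.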

    Signature Source Linked to Disassembly

    Another new feature is that the signature source is linked to the disassembly code and is clickable. That way it is very easy to find an entry point into the disassembly using the matched signatures at the top of the report. Check out the following screenshot:



    Here we see that the reflective invoke "openConnection" caused the "Opens an internet connection" signature to trigger. Clicking the "show sources" link allows us to view all disassembly locations that caused the signature to match. Clicking the "source" link in turn takes us to the corresponding disassembly code location for further analysis. This is a very powerful intertwining of dynamic and static data, a technology that we call "Hybrid Code Analysis" (HCA).

    Basic Data-Flow Analysis

    Something we have been working on is adding data-flow analysis algorithms to the static code analysis engine of Joe Sandbox Mobile 1.0. The first release will include a basic version that follows string data propagation within a function scope. That way signatures that depend on string parameter values will match even if an API call was never executed, as the string data associated with the register list is passed to the signature interface (e.g. Uri.parse being called with a "content://telephony" string). The following screenshot is a simple example:
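    Independent of the screenshot, the principle of string propagation within a function scope can be shown with a toy Python sketch. It is deliberately simplified and not our engine code: it handles straight-line code only (no branches), uses a made-up tuple encoding for smali-like instructions, and treats every opcode it does not know as overwriting its destination register with an unknown value.

```python
URI_PARSE = "Landroid/net/Uri;->parse(Ljava/lang/String;)Landroid/net/Uri;"


def propagate_strings(instructions):
    """Return (target, argument strings) for every invoke in the listing.

    Argument values are the constant strings known to be in the argument
    registers at the call site, or None when the value is unknown.
    """
    registers = {}  # register name -> known constant string
    calls = []
    for ins in instructions:
        opcode = ins[0]
        if opcode == "const-string":
            _, dst, value = ins
            registers[dst] = value
        elif opcode == "move-object":
            _, dst, src = ins
            if src in registers:
                registers[dst] = registers[src]
            else:
                registers.pop(dst, None)
        elif opcode.startswith("invoke"):
            _, arg_regs, target = ins
            calls.append((target, [registers.get(r) for r in arg_regs]))
        else:
            # Unknown opcode: assume the destination register is clobbered.
            registers.pop(ins[1], None)
    return calls


listing = [
    ("const-string", "v0", "content://telephony"),
    ("move-object", "v1", "v0"),
    ("invoke-static", ["v1"], URI_PARSE),
    ("const/4", "v1", 0),
    ("invoke-static", ["v1"], URI_PARSE),
]

for target, args in propagate_strings(listing):
    print(target, args)
```

    The first call site carries the "content://telephony" string even though nothing was executed, so a string-based signature could match on it. The second call site reports an unknown value because the register was overwritten in between.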



    Installation / Miscellaneous Section

    Sometimes samples register receivers or start services at runtime that are not listed in the manifest XML. Joe Sandbox Mobile 1.0 keeps track of these events and lists the specific data in the "Installation" section at the top of the behavior data. Again, it is possible to navigate to the relevant disassembly code by clicking the link. The following screenshot shows the new behavior section:


    Also, we added a new command that simulates a GPS satellite location change and calls the associated listeners registered via "requestLocationUpdates" to trigger even more payload. The latitude and longitude values can be set through the cookbook script used in the analysis.

    Conclusion

    In this blog post we presented some new Joe Sandbox Mobile 1.0 features: deobfuscation techniques (detection of string decryption methods, resolving of reflective invokes), data-flow analysis, parameter name resolving, new cookbook commands and an improved integration of signatures with the disassembly code. We are looking forward to Joe Sandbox Mobile 1.0!