Happy New Year!


    The Joe Security team wishes you success, satisfaction and many pleasant moments in 2013!

    Analyzing "Android-Trojan/FakeInst": Plug & Play premium SMS fraud

    Introduction

    As may be known by now, Joe Security offers free services to analyze APKs and other files. We often check submitted files to see if we come across anything special. Today, we found an interesting sample (MD5 0123078fac53446ab5d9527b6da1ab14), a typical SMS fraud APK that sends premium SMS to a specific number. AV products label the software "Android-Trojan/FakeInst" (and similar), although it actually does implement a working installation mechanism and has some kind of "License Agreement" that outlines the SMS costs. Nevertheless, what is interesting about it is the way it works: it seems to implement a "Plug & Play" mechanism that makes it very easy for anyone to take advantage of the dubious functionality and "turn over" the APK to implement their own premium SMS "registration service". ;-) In this blogpost we will see how easy it is to understand the sample in only a couple of minutes using the comprehensive Joe Sandbox report. Let us take a look at what information we can extract using the report alone and then get down to the configuration mechanisms.

    Getting an overview

    The first thing we always do when we look at a sample is take a look at the signatures and static information to get a quick idea of what the APK might be up to. Here are the matched signatures and the permissions overview at the top of the report:


    As we can see in the signature overview, the APK does a couple of suspicious things. APKs that combine these behavior signatures are usually malicious:

    • Connects to the internet (posts data, downloads data)
    • Executes code after phone reboot
    • Sends SMS
    • Obfuscated method names/uses reflection
    Malware often comes with a lot more "bad stuff", but in this case it is bad enough, and looking at the signatures alone gives us a good idea of what this might be doing (e.g. potentially leaking sensitive data, sending premium SMS, trying to hide behavior). Taking a look at the static information overview:


    As highlighted, we have some typical patterns: the APK is "play store compatible", which means it is meant to be spread as widely as possible; it is signed by a valid certificate from Russia (which suits the Cyrillic text); and it requires some unusual permissions (like the "MOCK_LOCATION" permission).

    Next, we take a look at the network traffic, if there is anything generated. Here we go:


    Taking a look at the dependency graph and the first HTTP packets, we quickly understand that it tried to download something from "trashbox.ru", which is actually a legitimate Russian file host/news site. On a side note: we took a look at the downloaded APK, and it is actually the "play store" app, certificate-signed by Google. Obviously no one wants to pay money for something that is available for free and pre-installed on Android to begin with (maybe that is where the "Fake Installer" name comes from). ;-)

    Getting into the nitty-gritty

    Everything we have learned about the APK so far we might have been able to extract from competing sandbox systems, but let's take a deeper look now. Before we dive into the disassembly, let us see what our "automatic button interaction" engine clicked and check out some more screenshots.


    As we see at the top, Joe Sandbox Mobile clicked quite a few buttons and managed to advance the "installation process" of the APK that way. Also, we can see that the APK created a "lasttime" file in a ".pay" directory on our SD card (which is downloadable as part of the full report in Joe Sandbox Mobile). The following screenshots show that the "button clicking" actually worked:


    After clicking the "далее" (Continue) button (see first screenshot in the Introduction section).


    The final screen of the analysis. Google translates it as "Thank you for using our services. Now you will receive a SMS-message with a password to the private site. By clicking on the link you will be able to download the file.". The two buttons скачать and выход read "Download" and "Exit" (Google renders the latter as "Output"). As we have seen, the paid app was already downloaded and it is the play store APK. Possibly an SMS really is received later on in the process, but unfortunately our analysis system does not really send (or receive) SMS over a mobile carrier, so we could not follow this pathway. Nevertheless, a costly SMS was sent already.

    Now let us take a look at some interesting functions that outline the configuration.


    The above function reads the ".dat" files (you can find them in the APK under the "assets" folder), which basically act as "configuration files" that determine the way the installer behaves. The most important asset files are:

    • link.dat: Contains, in cleartext, the URL from which the APK is downloaded, in this case hxxp://trashbox.ru/files2/74704_96f1f8/googleplaymarket_3.8.17.apk
    • data.res: Contains the "parameter data" for the "pay process". It is Base64-encoded UTF-8 text.
    • command.dat, title.dat, etc.: Contain "Label Texts" that are printed on the buttons.
     

    Here we see the "data.res" file contents being "decoded" into the parameter data "SMSNum-1: 2011SMSText-1: PM04333000276". Essentially, it is a "framework" that can be used to generate premium SMS apps for different countries in slightly different fashions. Of course, the downloaded APK could also be malware and does not need to be something legitimate as in this case.
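
    To make the decoding step concrete, here is a minimal Python sketch of how an analyst might decode and parse such a configuration. The sample bytes are a stand-in that we encode ourselves from the decoded text quoted above; the one-entry-per-line layout and the function name parse_pay_config are assumptions for illustration, not taken from the APK.

```python
import base64
import re

# Hypothetical stand-in for the bytes of assets/data.res: we Base64-encode the
# decoded text quoted in the post ourselves, ASSUMING one entry per line.
SAMPLE_DATA_RES = base64.b64encode(
    "SMSNum-1: 2011\nSMSText-1: PM04333000276\n".encode("utf-8")
)

# Matches entries such as "SMSNum-1: 2011" or "SMSText-1: PM04333000276".
ENTRY = re.compile(r"^(SMSNum|SMSText)-(\d+):\s*(.*)$")


def parse_pay_config(raw: bytes) -> dict:
    """Base64-decode the config and group entries by their numeric suffix."""
    text = base64.b64decode(raw).decode("utf-8")
    config = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            key, index, value = match.groups()
            config.setdefault(int(index), {})[key] = value
    return config


config = parse_pay_config(SAMPLE_DATA_RES)
print(config)
```

    Grouping by the numeric suffix reflects the idea that one configuration could carry several number/text pairs; whether the real framework uses more than one pair is not shown in this sample.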

    Now let us take a look at how the configuration is loaded and executed:


    As we can see at the top, a "JSON"-String is actually used to execute functions with parameters as defined in data.res. In this case the "pay process" is obfuscated as the following JSON command:

    {\"c0\":\"android.telephony.SmsManager\",\"m0\":\"getDefault\",\"m1\":\"sendTextMessage\",\"p0\":\"java.lang.String\",\"p1\":\"android.app.PendingIntent\"}

    Obviously, entire program code could be encapsulated in this format. The cX variables indicate class lookup instructions, the mX variables indicate the associated method lookup instructions and the pX variables indicate the parameter types for the lookups. The lookups are performed purely through the Java reflection API. Luckily for us, Joe Sandbox Mobile resolves reflective invokes automatically, so we can follow this tricky process quite easily.


    In the screenshot above we see the call to "getDefault" of "android.telephony.SmsManager", which returns a SmsManager object that is later used to send the text message (as indicated by the JSON command). A few lines further down we see what we were looking for:


    As we can see, an alleged premium SMS "PM04333000276" is sent to the phone number "2011". Finally, a "lasttime" file with the device timestamp is created under /mnt/sdcard/.pay/lasttime, which we again see quite nicely in the enriched disassembly listing.
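
    The JSON command format described in this section can be taken apart with a few lines of Python. This is only an analyst-side sketch: the command string is the one quoted above with the backslash escapes removed, while the function name split_command and the output layout are our own assumptions. In the APK itself the values are resolved through Java reflection rather than parsed like this.

```python
import json
import re

# The command as quoted in the post, with the backslash escapes removed.
COMMAND = (
    '{"c0":"android.telephony.SmsManager","m0":"getDefault",'
    '"m1":"sendTextMessage","p0":"java.lang.String",'
    '"p1":"android.app.PendingIntent"}'
)

KEY = re.compile(r"^([cmp])(\d+)$")
KINDS = {"c": "classes", "m": "methods", "p": "param_types"}


def split_command(command: str) -> dict:
    """Group the cX/mX/pX entries by kind, ordered by their index X."""
    entries = []
    for key, value in json.loads(command).items():
        match = KEY.match(key)
        if match:
            entries.append((match.group(1), int(match.group(2)), value))
    grouped = {name: [] for name in KINDS.values()}
    for kind, _, value in sorted(entries, key=lambda e: (e[0], e[1])):
        grouped[KINDS[kind]].append(value)
    return grouped


plan = split_command(COMMAND)
for kind, values in plan.items():
    print(kind, values)
```

    Seeing the class names, method names and parameter types side by side makes it obvious which API the reflective calls will end up in, even before looking at the instrumented disassembly.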

    Conclusion

    What we basically learned in this blogpost is that the Hybrid Code Analysis (HCA) technology implemented in Joe Sandbox Mobile, which combines dynamic and static analysis, is a fundamental key to understanding any kind of software and its mechanisms. Today, it is far from enough to analyze malware purely statically (see the previous blogposts dealing with heavy obfuscation), nor is it enough to analyze malware purely dynamically. We can only understand targeted threats and malicious behavior if we get down into the "nitty-gritty" and take a deep look at the inner workings. Of course, we do not want to spend a huge amount of time analyzing anything "by hand", which is why we need comprehensive reports and a fully automated system that takes care of the tedious work. As shown in this blogpost, we were able to fully understand the sample in just 15 minutes. Also, we learned that it is possible to dynamically execute code using the Java reflection API. Understanding these threats requires fine-grained instrumentation as implemented by Joe Sandbox Mobile.


    Fully-Automated String Decryption and Data Leakage Detection using Hybrid Code Analysis (HCA)

    Introduction

    In June earlier this year we demonstrated our generic instrumentation engine with Opfake.C (MD5: 001a42a555b4bd39bf6ecd8b11441870) and showed how easy it is to hook calls to local methods matching certain method signatures (see this blogpost). In this concrete case, we log all invokes to static methods that take a String as input parameter and return a String, e.g.
     
    public static String method(String s)

    We refer to methods of this type as "DecryptString" methods, and to their shape as the "DecryptString" method signature. Often, mildly sophisticated malware stores its Strings in an encrypted form in order to hinder pattern-based matches from static analysis AV engines. Malware authors usually do not make the effort to implement complex decryption algorithms, though, and use simple techniques such as substitution-based ciphers.
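
    As an illustration of matching by method signature, the following Python sketch flags candidate methods in smali-style disassembly. It assumes the disassembly uses ".method" lines with Dalvik type descriptors; the function name is_decrypt_string_candidate is hypothetical, and this is not how the Joe Sandbox Mobile engine is implemented.

```python
import re

# A smali-style method header: ".method <flags...> name(params)return".
METHOD_LINE = re.compile(
    r"^\.method\s+(?P<flags>(?:[\w-]+\s+)*)"
    r"(?P<name>[^\s(]+)\((?P<params>[^)]*)\)(?P<ret>\S+)$"
)
STRING = "Ljava/lang/String;"  # Dalvik descriptor for java.lang.String


def is_decrypt_string_candidate(line: str) -> bool:
    """True for static methods that take exactly one String and return a String."""
    match = METHOD_LINE.match(line.strip())
    if not match:
        return False
    flags = match.group("flags").split()
    return (
        "static" in flags
        and match.group("params") == STRING
        and match.group("ret") == STRING
    )


headers = [
    ".method public static mkfkejkpu(Ljava/lang/String;)Ljava/lang/String;",
    ".method public toString()Ljava/lang/String;",
    ".method public static a(Ljava/lang/String;I)Ljava/lang/String;",
]
candidates = [h for h in headers if is_decrypt_string_candidate(h)]
print(candidates)
```

    Matching on the shape of the method rather than on its name is what keeps the approach generic: obfuscated names such as mkfkejkpu do not matter.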

    Usually, the encrypted strings are spread throughout the entire package and need to be decrypted quickly on the fly. The decrypted payload is usually a class or method name used for class lookups, method lookups and reflective invokes, or often a C&C URL. Also, samples using encrypted strings usually try to encrypt all possible strings, so we can assume there are going to be a lot of "DecryptString" method calls overall during runtime. So we had an idea: what if we record all I/O Strings of all invokes matching the "DecryptString" method signature, build a character-based "conversion map" and use that to try to decrypt the inputs of other, non-executed invokes to the same method? And if that succeeds, can we build behavior signatures off of that data? After all, combining dynamic analysis results with static analysis to obtain behavior data is what Hybrid Code Analysis (HCA) is all about. Let's get to work.


    Building Input/Output Character Maps

    The first step was to improve our engine to build input/output character maps for all runtime invokes to methods matching the "DecryptString" method signature as noted above. Of course, the character maps need to take overloaded method names into account so that we can reliably attribute the logged data to each individual method. Also, we only considered input/output data if the input and output Strings have the same length and their characters differ. In the case of Opfake.C, there is really only one method that gives good results and which is used heavily to decrypt Strings. Here is the calculated Input/Output Character Map for "public static String mkfkejkpu.mkfkejkpu.mkfkejkpu(String s)" based on 400+ observed runtime calls:
     
    Input  Output    Input  Output    Input  Output
    n      0         +      a         Z      m
    9      1         8      A         0      N
    C      2         2      B         s      n
    R      3         @      b         3      o
    ;      4         ]      c         l      O
    ,      5         o      C         F      p
    E      6         <      d         Y      P
    i      7         A      D         p      q
    M      8         .      e         Q      R
    7      -         B      E         e      r
    K      (         k      f         t      s
    *      )         h      F         z      S
    U      *         H      g         ^      T
    4      ,         w      G         j      t
    :      .         r      h         1      u
    ?      /         x      H         X      U
    b      :         -      i         D      V
    g      ?         u      I         J      v
    `      [         _      j         L      w
    V      _         m      J         S      W
    G      }         N      K         c      X
    W      +         P      k         a      x
    O      <         6      l         >      y
    )      =         v      L         y      Y
    f      >         5      M         (      Z
                                      [      z

    Wow! :-) With the exception of a few characters (like the number "9"), we have an almost complete table of the main human-readable ASCII characters. Also, the conversion map does not seem to be a simple shift-based substitution such as "ROT-13" or the like. Before we check whether the character map gives good results on other, non-executed invokes to the same method, let us take a look at what a typical non-executed code sequence looks like:


    As we can see above, without reverse engineering the "Decryption"-method mkfkejkpu and implementing some custom decryption algorithm, it will not be possible to understand what is going on there. Using some data flow analysis for the parameter (which is easy) and our previously calculated character map, it is possible for Joe Sandbox Mobile to fully automatically decrypt Strings for these calls, even though the code is never executed. This is what the results look like for the same code sequence:


    Aha! The code seems to be part of a routine that is building a C&C URL http://m-l1g.net/q.php that is probably used to post some data. Scrolling down a bit, we find this code sequence in the same method:


    which confirms our assumption that an HTTP based request will be executed (the reflective invoke happens shortly after). The "synthetically" (or heuristically) calculated return values are marked as "Synthetic Return" instead of "Return", as usual.
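
    The map-building and re-application steps described in this section can be sketched in a few lines of Python. The two "observed" pairs below are constructed from entries of the character map above purely for illustration; they are not logged calls from the sample. The function names and the "#" placeholder for unknown characters are assumptions as well.

```python
def build_char_map(observed_calls):
    """Build an input->output character map from (input, output) string pairs.

    Pairs are skipped unless both strings have the same length and differ.
    Characters seen with conflicting outputs are dropped from the map.
    """
    char_map = {}
    conflicts = set()
    for encrypted, decrypted in observed_calls:
        if len(encrypted) != len(decrypted) or encrypted == decrypted:
            continue
        for src, dst in zip(encrypted, decrypted):
            if char_map.setdefault(src, dst) != dst:
                conflicts.add(src)
    for src in conflicts:
        del char_map[src]
    return char_map


def apply_char_map(char_map, encrypted, unknown="#"):
    """Synthesize a return value for an invoke that was never executed."""
    return "".join(char_map.get(ch, unknown) for ch in encrypted)


# Illustrative pairs derived from the table above (not real log data).
observed = [("rjjFb??", "http://"), ("p:FrF", "q.php")]
char_map = build_char_map(observed)

print(apply_char_map(char_map, "rjjFb??p:FrF"))  # all characters are known
print(apply_char_map(char_map, "FrFZ"))          # "Z" was never observed
```

    Marking unknown characters instead of guessing keeps the synthetic result honest: a partially decrypted string is still useful for signatures, while a wrongly guessed one would be misleading.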

    Creating Behavior Signatures based on Synthetic Strings

    Because the decryption mechanism is applied fully automatically at every non-executed invoke to the same method, we were able to understand the entire payload of Opfake.C. Using the data, we built a proof-of-concept signature that detects SMS sending code, even if the code isn't executed and the lookup Strings reside in the package fully encrypted. Here is the code sequence:


    .. and here is the Signature:


    The signature matches if the Strings "android.telephony.SmsManager" and "sendTextMessage" and a reflective invoke occur within the same code context. Of course, the signature offers a "Source" link to quickly jump to the relevant code location. Besides the signature above, we came up with two more signatures that help to get a quick overview of decrypted strings in the package, especially if decrypted Strings appear in the same code context as a reflective invoke (a good indicator for hidden payload):



    See the "Uses an encrypted string to lookup and invoke a method via reflection" Signature for "payload hiding" code locations and the "Probably tries to hide strings using a DecryptString routine" signature for a full list.
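
    The matching rule for the SMS signature can be expressed as a small Python sketch. The CodeContext data model and all names below are hypothetical and only illustrate the logic; they do not reflect the actual Joe Sandbox Mobile signature interface.

```python
from dataclasses import dataclass, field


@dataclass
class CodeContext:
    """Hypothetical view of one method: its (real or synthetic) strings and
    whether it contains a reflective invoke."""
    method: str
    strings: set = field(default_factory=set)
    has_reflective_invoke: bool = False


REQUIRED_STRINGS = {"android.telephony.SmsManager", "sendTextMessage"}


def matches_hidden_sms_sending(ctx: CodeContext) -> bool:
    """Both lookup strings and a reflective invoke must share one code context."""
    return ctx.has_reflective_invoke and REQUIRED_STRINGS <= ctx.strings


contexts = [
    CodeContext("a.b.c()", {"android.telephony.SmsManager", "sendTextMessage"}, True),
    CodeContext("a.b.d()", {"android.telephony.SmsManager"}, True),
    CodeContext("a.b.e()", {"android.telephony.SmsManager", "sendTextMessage"}, False),
]
hits = [ctx.method for ctx in contexts if matches_hidden_sms_sending(ctx)]
print(hits)
```

    Requiring all three indicators in the same context keeps the rule narrow: a package that merely mentions SmsManager somewhere does not trigger it.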

    Detecting Sensitive Information Leakage

    Besides the really cool "auto-decryption" feature that we added to Joe Sandbox Mobile, we also added a second signature that detects if sensitive phone information is possibly being leaked. As outlined in the Chuli.A blogpost from August, we have been creating signatures that are more context-aware and work on dynamic session data, such as critical phone identifying information being leaked. In that post we showed how sensitive phone information was being posted in a base64 encoded format as part of HTTP POST parameters. Posting data to a C&C server PHP file is not new, and a lot of malware encrypts its payload rather than relying on simple encodings. In the case of Opfake.C, the malware authors decided to encrypt sensitive phone information using the AES cipher algorithm. Here is the relevant code location:


    In the figure above, we see an AES cipher instance being initialized.


    Shortly after the initialization code, we see a call to Cipher.doFinal with a String that contains sensitive phone information, such as the IMEI/IMSI. In the new version of Joe Sandbox Mobile, whenever a Cipher encrypts a payload that contains sensitive phone information, the following signature triggers:


    The "Leaked:" part of the comment indicates which sensitive phone information has been identified, and a quick entrypoint to the relevant code location is provided as well. Of course, implementing this signature would not have been possible without context-awareness (the session information) and full parameter data of the runtime invoke.

    Conclusion

    In this blogpost we demonstrated the power of Hybrid Code Analysis (HCA) that combines dynamic and static analysis in Joe Sandbox Mobile. Using HCA, it was possible to understand how Strings are decrypted in Opfake.C and re-apply the learned character mapping to other encrypted Strings on non-executed invokes (essentially "simulating" a decryption). That way, it was possible to understand the full payload of Opfake.C and create intelligent behavior signatures. Furthermore, we outlined that context-awareness and parameter-level instrumentation, as implemented in Joe Sandbox Mobile, can open doors to more complex signatures that detect data leakage.


    Analyzing "Chuli.A" even if the C&C server is down

    Introduction

    In the past weeks we have been working on Joe Sandbox Mobile, implementing engine improvements to enhance code coverage, adding more powerful behavior signatures and extracting even more dynamic data from APK sample analysis. The previous blogposts focused more on resolving reflective invokes and on string decryption techniques.

    In this post, we will introduce some of the new features and show how to trigger the payload even if the C&C server of a trojan is down. To do so, we will take a deeper look at "Chuli.A" (MD5 c4c4077e9449147d754afd972e247efc), an interesting Android Trojan that was found in targeted attacks against Tibetan and Uyghur activists (see Android Trojan Found in Targeted Attack). The goal is to see if we can extract the same (in the best case, even more) information as the three Kaspersky Lab experts using only the Joe Sandbox Mobile report. This is a useful exercise that we undertake regularly to quality test our sandboxing system, as it quickly reveals weaknesses and room for improvement whenever the desired goal cannot be achieved.



    Taking a deeper look at Chuli.A

    Running the sample on our free apk-analyzer.net service does not show much activity (see first run here). We see some "relatively harmless" signatures, very little dynamic analysis data and no internet traffic, as can be seen in the following screenshots:



    In Kaspersky's blogpost the authors claim that "It is important to note that the data won't be uploaded to C&C server automatically. The Trojan waits for incoming SMS messages (...)". This is not quite true, as we will later see. So, why is the sample not showing its real behavior? Let's take a look at the receivers defined in the AndroidManifest.xml (see "Static File Info" - "Receivers" in the report):


    We noticed that none of the intents are simulated by our default cookbook and that most of the intent actions are protected intents that are only sent by the operating system. This means that it is not possible to receive these intents by declaring components in the manifest, e.g. android.intent.action.TIME_TICK:

    "Broadcast Action: The current time has changed. Sent every minute. You can not receive this through components declared in manifests, only by explicitly registering for it with Context.registerReceiver(). This is a protected intent that can only be sent by the system." (see the Android Reference Manual)

    Taking a look at the "onReceive" method of ScreenReceiver in the disassembly shows that it checks whether the "com.google.services.PhoneService" is running and then starts it accordingly. Following this, a chain of receivers and services is registered and executed.


    Before we continued the analysis here, we decided to quickly implement a new cookbook command "_JBSimulateTimeTick()" which sends the android.intent.action.TIME_TICK action to all components that have an intent-filter specified. Why? Because it is easier to understand malware if we combine static analysis with dynamic data. After rerunning the sample, the first glance at the detected signatures and internet traffic looks very promising:



    Now that we obviously triggered a lot of behavior, let us take a look at the PhoneService. To summarize the analysis, when the PhoneService.onCreate() is executed this is what happens:

    • The service calls sendInfo using the "create" command (see first HTTP POST above): hxxp://64.78.161.133/android.php?create=phone<timestamp>
    • It checks if the sendInfo was successful
    • *IF* sendInfo was successful (status code 200), it continues execution and does this:
      • Another receiver "sendReceiver" is registered for the action "com.google.system.receiver":
      • Next, PhoneService.serviceInit() is called, which registers an "AlarmService":

    Let's continue with AlarmService. AlarmService.onCreate() does the following:

    • Sets up a receiver for android.provider.Telephony.SMS_RECEIVED (the "alarmReceiver" mentioned in the blogpost by Kaspersky)
    • Gathers sensitive information, such as phone name, location, contacts and sms data
    • Creates a "com.google.system.receiver" action, stores the gathered data in a Bundle and sends a broadcast

    The last broadcast causes the previously registered "sendReceiver" to be triggered, which in turn sends the sensitive phone information. A good entrypoint into these code locations is the "APK behavior" -> "Installation" section of the report:


    As can be seen in the screenshot above, the services started and the receivers registered at runtime are outlined quite nicely. Clicking the links navigates directly into the associated disassembly code.
     

    Dynamic Data Refined: HTTP POST parameters

    As outlined previously, sensitive phone information is being sent to the C&C server. This data is sent using the following URL scheme: hxxp://64.78.161.133/data/phone<timestamp>/process.php?datatype=<base64encodeddata>
    The actual data transmission is visible in the report's dynamic data column as one of the calls to DefaultHttpClient.execute() in SendInfo.run() (duplicate code exists in SendInfo.reSendInfo()):


    Besides presenting simple toString() output per parameter object/primitive, the Hybrid Code Analysis engine of Joe Sandbox Mobile also displays special meta-data for certain parameters. The "HttpPost" object is one of these. In this case, we also print the "getURI", "getEntity", "getEntity.getContentType" and a special base64 decode result for "getEntity", which also URL decodes the getEntity string (see EntityUtils.toString()). This way the POST variables can become visible to the analyst in some cases where a simple encoding, such as base64 is chosen by malware authors.
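
    The decoding chain described above (URL decoding followed by a base64 attempt) can be sketched in Python as follows. The parameter names and values are made up for illustration, and decode_post_entity is a hypothetical helper, not part of Joe Sandbox Mobile.

```python
import base64
from urllib.parse import parse_qsl, urlencode


def decode_post_entity(entity: str) -> dict:
    """URL-decode a form-encoded POST body and try to base64-decode each value."""
    decoded = {}
    for name, value in parse_qsl(entity, keep_blank_values=True):
        try:
            plain = base64.b64decode(value, validate=True).decode("utf-8")
        except ValueError:
            # Covers invalid base64 (binascii.Error) and non-UTF-8 results.
            plain = None
        decoded[name] = {"raw": value, "base64_decoded": plain}
    return decoded


# Illustrative entity with one base64-encoded and one plain parameter.
secret = base64.b64encode(b"name=Test Phone").decode("ascii")
entity = urlencode({"datatype": secret, "note": "hello world"})

result = decode_post_entity(entity)
for name, views in result.items():
    print(name, views)
```

    Keeping the raw value next to the decoded one lets an analyst (or a signature) look at both, which matters when only some parameters are encoded.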

    Code Coverage Improved: Spoofing the C&C Server Status Code

    The reason why the action broadcast might have slipped by is that the "sendReceiver" registration and the "AlarmService" registration depend fully on whether or not the HTTP POST request to hxxp://64.78.161.133/android.php succeeds (status code 200). It is a lot easier to detect these kinds of checks when you see "live data" in the disassembly. Here is where the status code is checked (part of SendInfo.sendInfo()):


    As we can see, the status code prevents execution from properly advancing (the C&C server is down). That is why we added a new feature to our engine that allows us to spoof the status code to 200 (OK) no matter what the real status code was. This feature is turned off by default and can be enabled using the _JBSetEngineOption('spoofHttpStatusCode', 'true') command as part of a custom cookbook:


    Rerunning the sample with the new cookbook command provides the desired result:



    Using this spoofing technique, we were able to enhance code coverage and progress execution as far as possible. Without this trick (or patching the code manually) the APK would not execute its malicious payload.
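
    The effect of the status code check, and of spoofing it, can be modelled with a tiny Python simulation. This is a simplified model of the control flow described above, not the trojan's code or the engine's implementation; the 503 status used for the dead server and all function names are assumptions.

```python
def run_phone_service(http_post, spoof_status_code=False):
    """Return the steps reached by a simplified model of PhoneService.onCreate()."""
    steps = ["sendInfo(create)"]
    status = http_post("hxxp://64.78.161.133/android.php?create=phone<timestamp>")
    if spoof_status_code:
        # What the sandbox option does conceptually: report 200 (OK) to the
        # app no matter what the real status code was.
        status = 200
    if status == 200:
        steps.append("register sendReceiver")
        steps.append("serviceInit() registers AlarmService")
    return steps


def dead_server(url):
    """Stand-in for a C&C server that is down (the status value is an assumption)."""
    return 503


print(run_phone_service(dead_server))
print(run_phone_service(dead_server, spoof_status_code=True))
```

    Without spoofing, only the first step is reached; with spoofing, the receiver and service registrations that carry the payload become reachable.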

    New Behavior Signature: Detecting Phone Information Leakage

    As the most important static and dynamic data is propagated and made available to the behavior signature interface of Joe Sandbox Mobile, we decided to write a new signature that checks whether or not phone information, such as the device id, SIM serial number, phone number, etc. is "leaked" via HTTP POST parameters to web servers. In this case, not only the "raw" POST bytes are analyzed, but also the base64-decoded version. Here is how the signature looks in the Chuli.A sample when triggered:


    How easy is it to create signatures like that? Well, the open signature interface allows users to quickly implement and activate new signatures. In this case, the signature itself was not all that difficult to implement, here is an excerpt:


    As can be seen, the signature interface is quite straightforward and offers the ability to detect malicious behavior in a generic way and mark samples as malicious (only 1/46 antivirus solutions detect Chuli.A as malicious).
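
    To give an idea of the logic behind such a leakage check, here is a minimal Python sketch. It is not the signature shown above and does not use the real signature interface; the session values are invented placeholders, and find_leaked_fields is a hypothetical name.

```python
import base64


def find_leaked_fields(views, session):
    """Return the labels of session values found in any view of the POST data.

    `views` holds different renderings of the same payload, e.g. the raw
    parameter value and its base64-decoded form.
    """
    leaked = set()
    for label, secret in session.items():
        if secret and any(secret in view for view in views):
            leaked.add(label)
    return sorted(leaked)


# Invented session data of the analysis device (placeholders only).
session = {
    "device_id": "000000000000000",
    "phone_number": "15555215554",
    "sim_serial": "89014103211118510720",
}

raw_value = base64.b64encode(b"id=000000000000000&num=15555215554").decode("ascii")
views = [raw_value, base64.b64decode(raw_value).decode("utf-8")]

leaks = find_leaked_fields(views, session)
print("Leaked:", leaks)
```

    Checking the decoded view in addition to the raw one is the important part: on the raw base64 value alone, the same check finds nothing.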

    Conclusion

    In this blogpost we were able to see that sophisticated sandboxing systems can be a powerful tool for getting a deep understanding of malware without needing to be a reverse engineering expert. The comprehensive report offers a lot of entrypoints into the code off the shelf. In this context, a sophisticated sandbox system means triggering malicious behavior not only by simulating user interaction, but at times also by manipulating data, such as optionally spoofing the HTTP status code, which makes it possible to analyze malware beyond the day of release (when C&C servers are down). Using refined (e.g. decoded) dynamic data, it is possible to achieve more accurate analysis results for both the analysis system and the analyst. In this case, we were able to quickly detect the base64 decoded parameters and create a "phone information leakage" behavior signature that will be used to classify similar malware in the future.

    Analyzing "Obad.a" a.k.a. "The most sophisticated Android Trojan"

    Recently we came across this blogpost from Kaspersky that introduces its readers to a new Android Backdoor Trojan named "Obad.a" as "The most sophisticated Android Trojan", so we got curious to see whether or not Joe Sandbox Mobile 1.0 would be able to handle the APK (MD5: E1064BFD836E4C895B569B2DE4700284). Besides obfuscated class/method names, heavy string encryption and heavy Java reflection usage, the APK actually uses two relatively new exploits to make static and dynamic analysis more difficult for sandbox systems and human analysts.

    A brief test has shown that no public sandbox system can handle the APK correctly. Results range from "No threat detected" through "no malicious activity detected" to "No executable file", probably because none of the systems actually managed to run the APK in the first place. To be fair, our analysis system wasn't able to handle the APK without an engine update either.

    The first exploit: Incomplete/corrupt AndroidManifest.xml

    The AndroidManifest.xml provided by the APK is actually incomplete, but it exploits weaknesses in the Android OS manifest checking code, which installs and runs an APK even if certain key attribute names are missing, although it should be rejected. Of course, this is a major hurdle for sandboxing systems that rely on extracting key information from the manifest (e.g. the activity name, etc.). Here is a small excerpt of the original, incomplete AndroidManifest.xml:


    As we can see, some important attribute namespaces and names are missing, such as "android:name" for the uses-permission tag or the "minSdkVersion" field in the manifest tag. So, in order to be able to run the APK in our system, we updated our engine to fix up the AndroidManifest.xml with a best-effort approach. This is an excerpt of the same manifest after fixup:


    Using the fixup our engine was able to properly extract static information and install the APK at the correct target device API level.

    The second exploit: Inserting data into the instruction stream

    The next problem we had was decoding and recompiling the classes.dex file (a necessary step as part of API call instrumentation), as the APK was modified in such a way that a packed data switch was inserted into the instruction stream. This in turn caused the DalvikVM verifier to reject our APK upon loading a specific sample class. A typical error of this type is:

    VFY: encountered data table in instruction stream
    VFY: rejecting opcode 0x00 at 0x002a
    VFY: rejected Lcom/android/system/admin/oCIlCll;.oCIlCll ([B)[B
    Verifier rejected class Lcom/android/system/admin/oCIlCll;


    This is a pretty neat exploit, as it uses a weakness in existing dedexer tools and thereby hinders repacking APKs for automatic analysis. In order to solve the problem, we had to update our preprocessor to move the packed data switch back to its proper location before recompiling the classes.dex file.
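
    When triaging such failures, the verifier messages quoted above can be summarized automatically. The following Python sketch extracts the rejected classes and methods from log lines in the format shown; the function name summarize_verifier_log is our own, and real logcat output would carry additional prefixes, which the regular expressions tolerate by searching within each line.

```python
import re
from collections import defaultdict

# "VFY: rejected <class>.<method> <descriptor>"
REJECTED_METHOD = re.compile(r"VFY: rejected (L[^;]+;)\.(\S+) (\(\S*\)\S+)")
# "Verifier rejected class <class>"
REJECTED_CLASS = re.compile(r"Verifier rejected class (L[^;]+;)")


def summarize_verifier_log(lines):
    """Map each rejected class to the methods the verifier complained about."""
    rejected = defaultdict(list)
    for line in lines:
        method = REJECTED_METHOD.search(line)
        if method:
            cls, name, descriptor = method.groups()
            rejected[cls].append(name + descriptor)
            continue
        cls_only = REJECTED_CLASS.search(line)
        if cls_only:
            rejected.setdefault(cls_only.group(1), [])
    return dict(rejected)


log = [
    "VFY: encountered data table in instruction stream",
    "VFY: rejecting opcode 0x00 at 0x002a",
    "VFY: rejected Lcom/android/system/admin/oCIlCll;.oCIlCll ([B)[B",
    "Verifier rejected class Lcom/android/system/admin/oCIlCll;",
]
summary = summarize_verifier_log(log)
print(summary)
```

    The summary points directly at the class and method whose instruction stream needs to be repaired before recompiling.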

    Dynamically analysing the APK

    Before running the APK, we created two signatures that match on these two special APK characteristics (corrupt manifest and packed data switch exploit). After quickly implementing the signatures, we ran the APK with Joe Sandbox Mobile 1.0 and here are some of the results:

    Corrupt Manifest Signature

    Exploit Signature

    Obfuscated Method Name Detection Signature (New!)

    In no time flat we were able to find some interesting code locations with dynamic data and reproduce results found in the Kaspersky blogpost linked at the top. Check out some of our results:


    Here we see the "decryptString" function call (instrumented generically by function signature) returning "su -c 'id'" and passing the string to Runtime.exec in an attempt to create a superuser shell. This really shows the power of the "Hybrid Code Analysis" (HCA) engine that we use in Joe Sandbox Mobile 1.0, which combines dynamic data with static code analysis (in this case, the disassembly of the function).


    Checking internet connection:


    "calling home" to androfox.com:



    Building a JSON command with device specific data:


    Internal DB structure used to save user data (the last 5 datasets are displayed by default).


    Querying for device admin privileges:



    All of the above information was extracted and discovered within minutes.

    Conclusion

    All in all, it is amazing to see how quickly we were able to find key behavior and code locations of the sample and get a very deep look into the sample's functionality using the disassembly listings. The generic instrumentation engine decrypted a ton of strings and the reflective invoke translation made the real behavior quite visible. Of course, it is quite amazing to see how advanced and sophisticated malware is becoming on Android (like on Windows), and in our opinion this sample really shows that it requires specialists and advanced technology to tackle threats at this level. That all sandboxing systems on the market failed shows how advanced this sample really is, using two relatively new exploits that directly target automatic analysis systems. On the other hand, it was great to see how powerful our engine is and that we were able to react very quickly to this new threat and update our engine. Using our new generic signatures, we will be able to detect any variant of this malware in the future and will be looking out for the next generation.