This article covers the building blocks of Android malware analysis, getting you ready to go, with everything needed when it comes to reverse engineering malware on Android!
Types of Android Malware
For Android devices running Google Play Services, and in turn using the Google Play Store, one of the biggest application security defences is the Google Play Protect utility. Google Play Protect identifies malware in two forms; on device, and off device (also known as Cloud). On device protection works by daily scanning all applications installed on Android devices, while cloud protection works by vetting and reviewing applications that are uploaded to the Google Play Store.
As an authority on Android Malware, we’ll be using the definitions provided by Google Play Protect for Android Malware (also referred to as Potentially Harmful Applications). In addition to device malware, some antivirus providers also class personally unwanted software (POS) or Mobile Unwanted Software (MUwS) as harmful to a device. These won’t be included here as, while they pose a danger to the device ecosystem, they do not strictly fall into the category of malware. These can include:
- Ad fraud
- Unauthorized Use or Imitation of System Functionality
- Disruptive ads
- Social Engineering
- Data collection and restricted permissions abuse
|Trojan||Used in combination with other malware categories, a Trojan will usually appear as something it is not. For example a legitimate game, application, or useful piece of software. While it may appear to be benign it will then perform undesirable actions against the user.|
|Spyware||Spyware is any form of application or software that transmits personal or personally identifiable information to a third party without adequate control, notice, or consent - this can include contact information, photos or other files, messages from email, call logs or sms, web history and bookmarks and information from the device '/data/' directory. This can also include actions such as recording video, audio, calls, and acquiring application data.|
|Stalkerware||A subset of this type of malware and is often seen used as a commercial/ spyware-as-a-service, where the data is often sent to a third party that is not the PHD provider. Legitimate versions of such software can be used by parents to track their children, however, a persistent notification should be displayed at all times.|
|Spam||Applications that send unsolicited messages to the user's contacts, the user themselves, or others without adequate consent from the user.
Rooting - An application or code that roots the target device without consent from the user and in turn executes some form of further malicious code onto the device.
|Ransomware||Defined as an application or code that gains partial or full control over a device and in turn offers to relinquish that control for a performed action such as payment. This can include encrypting device data or enabling device admin controls to lock the user out of the device.|
|Elevated privilege abuse||Similar to rooting, this is where an application or code compromises system integrity by breaking the app sandbox, gaining elevated privileges, or changing or disabling access to core security-related functions. This can include everything from disabling SELinux, abusing features so that the application cannot be uninstalled, abusing permission or authentication models.|
|Phishing||Similar to a Trojan, this is where an application or code masquerades as a legitimate piece of software and requests user authentication credentials or other personal and private information. This data is then sent to a malicious third party.|
|Non-Android threat||This malware category applies to Android applications and code that do not provide a direct threat to the device ecosystem or user, and instead leverage the device to target other platforms - such as connected devices or devices on the same network.|
|Hostile downloaders||This category covers applications and code that are utilised to download other malware. Google Play Protect has a collection of rules it uses for identifying if a given piece of downloader software if classed as hostile, these being:
There is reason to believe it was created to spread PHAs and it has downloaded PHAs or contains code that could download and install apps; or
At least 5% of apps downloaded by it are PHAs with a minimum threshold of 500 observed app downloads (25 observed PHA downloads).
They don't drive downloads without user interaction
All PHA downloads are initiated by consenting users.
|Denial of service (DoS)||An application or code that performs a denial of service (DoS) / distributed DoS attack against a third party system or resource.|
|Billing fraud||Billing fraud summarises several types of an application or code that leads to the user being charged for a service in an intentionally deceptive way. This is commonly broken down into: SMS, Call, and Toll fraud.|
|Backdoor||A backdoor will often allow for a malicious actor to gain unwanted, unauthorised, and potentially harmful remote control of the target device.|
Android Applications 101
Android application’s are commonly written in either Java or Kotlin. When a software engineer wants to create an APK (the Android pacKage), that contains the code and materials that are run on an Android device, they will need to compile that Java or Kotlin source code to a Dalvik executable/ bytecode. This Dalvik executable is a binary that is run on the Android device. This works where each process on the device uses its own virtual machine (VM) which segregates applications. Prior to Android API level 21 (Android 5), this would have been a Dalvik Virtual Machine, and in later versions will instead use the Android Runtime (ART). Both operate in similar fashions, where they simulate a device’s CPUs, registers, and other features while running an application’s compiled Dalvik bytecode.
While it is the Dalvik bytecode that needs to be run on a device, this is not human readable and so if we are to reverse engineer an application we’ll need to decompile it back into a human readable form. This is where Jadx comes in. Using Jadx we can decompile the Dalvik bytecode back into Java. This is often called pseudo Java, as it is not a one for one representation of what the original source code would have been, and instead is the decompiler’s best guess.
Android application architecture
An Android application (APK) is an archive-like file format, that contains several other files and folders. This including:
- assets— A directory for application assets. This is for arbitrary storage; anything provided by the application creator can be stored here.
- res— A directory with all resources that are not compiled into
arsc(icons, images, etc.).
- lib— A directory for native libraries used by the application. Contains multiple directories for each supported CPU architecture that the application has been compiled for.
- META-INF— A directory for APK metadata – including signatures.
- xml— The application manifest in a binary XML formatted file that contains application metadata — for example, its name, version, permissions, etc.
- dex— The classes.dex file contains the compiled application code in the Dex file format. There can be additional .dex files (named classes2.dex, etc.) when the application uses multidex.
- arsc— This file contains precompiled resources—such as strings, colours, or styles.
Android Manifest File
Android APK files also include a file detailing the application configuration – AndroidManifest.xml. The Android manifest includes information such as:
- Package name and application ID
- Application components
- Intent filters
- Icons and labels information
- Device compatibility information
Building and Reverse Engineering an ordinary application
First things first, especially if you’re new to Android as a whole, my recommendation would be to build a simple Android application. As an example open up Android Studio and create one of the following simple applications. This will give you a better understanding of Java, how Android applications are written, and some experience using a device/ emulator when running an application.
- A simple note-taking application
- A quote of the day application
- A simple password manager
- An Android file explorer
After compiling the created application to an APK, your next step will be to decompile that application to pseudo-Java, using Jadx. This will provide you with an understanding of how code goes from source code, to Dalvik assembly, then back to decompiled pseudo Java.
Reverse Engineering Android Malware
Kicking things off, it’s important to bear in mind that any malware analysis should be performed on a segregated system, such as a VM, with minimal connection to the outside world.
All of the malware samples used today can be found on the Android Malware GiHub repository by Ashishb.
Malicious Skype Application
rouge_skype folder you can find a file named
skype.apk. If loaded onto a device this application would present itself as a regular, if old, Skype application. That being the case, we can use some tools to identify a more nefarious purpose.
Before this, however, we can upload the APK onto Virus Total (a tool for analysing and sharing suspicious files, domains, IPs and URLs). With an immediate look, we can see that we’re not the first people to be looking at this APK, with the first submission of the file to Virus Total (VT) being back in 2015. Straight off the bat, VT gives us a clear indication that this APK is potentially (and probably)malicious.
The next step is to disassemble this APK and begin diving into it’s contents. Using APKTool, the contents of the APK archive can be identified (for information on these files see the Android Application Architecture section above).
assets folder of the application we can see several obfuscated file names. Using the
file command it can be identified that one of these files,
a , is actually a jar file. As an APK is simply an instance of a Jar (Java Archive) file, with some additional Android files dotted in there, this can be reverse engineered with ease.
Using APKTool a second time, this time on the Jar file, it can be seen that the jar file is successfully de-bundled into it’s core components, and the internals of a second APK can be seen. This is a good indication that the original Skype application was patched in some way and is dynamically loading the jar file inside of the
assets directory at runtime.
Further analysis could be performed on both this APK and the jar file, including reviewing the manifest file, reviewing interesting functions, and identifying the link between the two files.
The next malware sample we’ll be reviewing can be found in the
dendroid folder in the GitHub repository. Dendroid malware was quite sophisticated in it’s time, being first identified in 2014 by Symantec. Falling into several of the above categories, Dendroid was capable of deleting call logs, opening web pages, dialling numbers, recording calls, SMS interception, uploading files, opening applications, and performing denial of service attacks.
Instead of manually reviewing the file structure and disassembled SMALI of this application, we’ll be using APKLab (A Mobile Threat Intelligence Platform by Avast), as an example of automated reverse engineering tooling (if you do not have access to APKLab, you can use other automated tooling such as Quark).
Using APKLab, this APK can be uploaded and an analysis performed on it. Some key information that APKLab provides, includes:
- Suspicious permissions
- URL strings
- Generated files
- Emulator check for known build properties
- A network dump
By now, you have an entry level understanding into the internals of Android applications, Android malware types, and several areas and variables to look out for during Android malware analysis. See below for additional resources on the topic!
Reverse Engineering Android Malware Course
I’m in the process of developing a Udemy course on reverse engineering Android malware. Enter your email address below to receive an update and early-bird discount code once the course is live, along with subscribing to my mailing list.
Introduction To Reverse Engineering Course
I’ve released an Android reverse engineering course inspired by my interest in game design. This course walks through the fundamentals of reverse engineering and uses Android games as a fun and practical starting point.
10% off Android Malware Reverse Engineering Cheat Sheets
Free and premium resources, available on everything from Android and iOS security fundamentals, reverse engineering basics, and study guides for my Udemy courses. Use code ‘MALWARE-ARTICLE’ for 10% off on the Android Malware Reverse Engineering Cheat Sheet.