Introduction

In this article I am going to show you how to use various tools to reverse engineer Android applications. The goal is to reverse engineer an Android application that uses Akamai’s Bot Manager Premier (BMP), aka the mobile protection SDK. You can recognise this from the x-acf-sensor-data header that is sent with requests.

The Android OS and Dalvik Bytecode

Android is built on a modified version of the Linux kernel and is open-source. One of the key components of the Android OS is the Dalvik virtual machine, which is responsible for running applications written in Java or Kotlin (since Android 5.0 its successor, the Android Runtime or ART, does this job, but it still consumes the same bytecode). When an Android application is compiled, the Java or Kotlin code is converted into Dalvik bytecode, a set of low-level instructions (think of it as assembly for the virtual machine) that the runtime executes. The Dalvik bytecode is optimised for use on mobile devices with limited processing power and memory, which ultimately allows apps to run efficiently on a wide range of devices.

Since Android is based on Linux, it can also make use of compiled shared library binaries (.so format). The Dynamic Linker is responsible for loading and accessing resources compiled in a shared library. Since version 3.3.0, Akamai BMP uses shared libraries to offload some routines to native libraries with the intent to make it harder to reverse engineer. It will also give a slight speed improvement since the code is running natively.

Getting Started

We’re going to need the following tools for this exercise: jadx, apktool, Ghidra and Frida.

The first thing we want to do is ensure that our target actually has Akamai BMP installed. The easiest way to do this is to open the APK in jadx, and search for the x-acf-sensor-data string:

image-20230309180017125

Great, this is a valid target. The next thing we need to do is decompile the APK. The structure of an APK is organized into several directories and files. The root directory contains files such as the manifest file, which provides information about the app’s components, permissions, and metadata. The assets directory contains static files such as images and sound files. The lib directory contains compiled libraries that are specific to a particular CPU architecture. The res directory contains resources such as strings, layouts, and drawables that are used by the app’s user interface. The classes.dex file contains the app’s compiled code in the form of Dalvik bytecode.
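To make the lib layout a bit more concrete, here's a minimal sketch in plain JavaScript (runnable in Node, no Frida needed) that groups native libraries by CPU architecture. It assumes we already have the list of entry paths from the APK (an APK is a ZIP archive, so any ZIP tool can produce it). The function name groupNativeLibs and the sample entries are made up for illustration; only the x86_64 Akamai path comes from this article.

```javascript
// Group the native libraries of an APK by ABI (CPU architecture).
// Input: entry paths as they appear inside the APK, e.g. "lib/x86_64/libfoo.so".
function groupNativeLibs(entryPaths) {
    const byAbi = {};
    for (const path of entryPaths) {
        // Only entries of the form lib/<abi>/<name>.so are native libraries
        const match = /^lib\/([^/]+)\/([^/]+\.so)$/.exec(path);
        if (!match) continue;
        const abi = match[1];
        const file = match[2];
        if (!byAbi[abi]) byAbi[abi] = [];
        byAbi[abi].push(file);
    }
    return byAbi;
}

// Hypothetical entry list; the arm64-v8a entry is an assumption
const sampleEntries = [
    "AndroidManifest.xml",
    "classes.dex",
    "res/layout/main.xml",
    "lib/x86_64/libakamaibmp.so",
    "lib/arm64-v8a/libakamaibmp.so"
];

console.log(groupNativeLibs(sampleEntries));
```

Pick the architecture that matches the device or emulator you plan to run Frida on, since that is the copy of the library that will actually be loaded.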

Decompiling the APK

To decompile the APK we will use apktool:

java -jar apktool_2.5.0.jar d "ALDI UK_7.9.0.124_Apkpure.apk"

After we’ve run apktool we will have a folder with all the decompiled parts of the APK:

image-20230321234428989

If we take a look in the lib folder we can find the compiled .so binaries for Akamai, for example lib\x86_64\libakamaibmp.so

Native Library Analysis

Next we are going to analyze the .so native library file. A .so file is a shared object, often referred to as a shared library, and is the equivalent of a Dynamic Link Library (.dll) on Windows. To analyze this file we will use Ghidra. Open up Ghidra and drag in the file so we can start the analysis process.

image-20230322000553178

If we go back to JADX and have a look at this code, we can see the references to some functions inside the shared object file.

package com.cyberfend.cyfsecurity;

import android.util.Pair;
import java.util.ArrayList;

/* loaded from: classes.dex */
public final class SensorDataBuilder {

    /* renamed from: a */
    private static final SensorDataBuilder f387a = new SensorDataBuilder();

    public final native synchronized String buildN(ArrayList<Pair<String, String>> arrayList);

    public final native synchronized void initializeKeyN();

    SensorDataBuilder() {
    }

    static {
        System.loadLibrary("akamaibmp");
    }

    /* renamed from: a */
    public static SensorDataBuilder m5136a() {
        return f387a;
    }

    /* renamed from: com.cyberfend.cyfsecurity.SensorDataBuilder$1 */
    /* loaded from: classes.dex */
    final class AnonymousClass1 implements Runnable {
        /* JADX INFO: Access modifiers changed from: package-private */
        public AnonymousClass1() {
        }

        @Override // java.lang.Runnable
        public final void run() {
            SensorDataBuilder.this.initializeKeyN();
        }
    }
}

We can see that native functions are called: buildN and initializeKeyN. This must mean that the shared library exports these functions to make them available in Java. If we go back to Ghidra, we can search for these functions in the symbol tree:

image-20230322001140993
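The export names in the symbol tree follow the JNI naming convention, so we can predict them from the Java side before even opening Ghidra. Here's a minimal sketch in plain JavaScript; the helper names mangle and jniExportName are my own, and it ignores overloaded native methods (which get an extra signature suffix). Libraries can also bind native methods at runtime with RegisterNatives, in which case no such export exists, but that is not the case here.

```javascript
// Escape one name component according to the JNI short-name rules.
function mangle(part) {
    let out = "";
    for (let i = 0; i < part.length; i++) {
        const ch = part[i];
        if (/[A-Za-z0-9]/.test(ch)) out += ch;
        else if (ch === "_") out += "_1";
        else if (ch === ";") out += "_2";
        else if (ch === "[") out += "_3";
        // Anything else becomes _0xxxx (four lowercase hex digits per UTF-16 unit)
        else out += "_0" + part.charCodeAt(i).toString(16).padStart(4, "0");
    }
    return out;
}

// Build the exported symbol name for a (non-overloaded) native method.
function jniExportName(className, methodName) {
    const mangledClass = className.split(".").map(mangle).join("_");
    return "Java_" + mangledClass + "_" + mangle(methodName);
}

console.log(jniExportName("com.cyberfend.cyfsecurity.SensorDataBuilder", "buildN"));
console.log(jniExportName("com.cyberfend.cyfsecurity.SensorDataBuilder", "initializeKeyN"));
```

The second name is the one we are about to decompile.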

Double-clicking on the pink text under Exports will decompile the function into pseudo-C:

void Java_com_cyberfend_cyfsecurity_SensorDataBuilder_initializeKeyN(void)

{
  int iVar1;
  uchar **ppuVar2;
  uchar *puVar3;
  long in_FS_OFFSET;
  uchar *puStack88;
  uchar *apuStack80 [2];
  void *pvStack64;
  Crypto CStack56;
  uchar auStack55 [15];
  uchar *puStack40;
  long lStack32;
  
  ppuVar2 = (uchar **)SensorDataBuilder::getInstance();
  lStack32 = *(long *)(in_FS_OFFSET + 0x28);
  if (*(char *)(ppuVar2 + 5) == '\0') {
    puVar3 = (uchar *)operator.new[](0x11);
    *ppuVar2 = puVar3;
    Crypto::randomBytes(0x10,ppuVar2);
    std::__ndk1::basic_string<char,std::__ndk1::char_traits<char>,std::__ndk1::allocator<char>>::
    basic_string<decltype(nullptr)>
              ((basic_string<char,std::__ndk1::char_traits<char>,std::__ndk1::allocator<char>> *)
               apuStack80,
               "-j0ZOfGt%xoJ$.p%U<#~.Bnx#M\nk?-%PwI&Yg+>#|;0W1F{?0@WVJE+#8d 6]Jy2V2_<uqM:HbEfN8j/fy, L^(Prg}yLPi^Xp&ot43flfpXu`h AmT).TJ;*fdo^f;G@J84LcY!U-QKo[:]Be5)h>v6HN*rjS,^|*<K+(6|| yxRxH:S#4>FSYVwK=z<_SH&*L+qWor+.fNpo_Q@o_8@t{KAqQxc#Z(%X,r^[q)~*;+b8Plb<Mrc\n8(&U++!| Z8HPGT5oa/BqAbX6"
              );
    Crypto::rotate_string(&CStack56,(basic_string *)apuStack80,0x3f,-1);
    if (((ulong)apuStack80[0] & 1) != 0) {
      operator.delete(pvStack64);
    }
....

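The first thing that immediately jumps out is the reference to another function: rotate_string. Ghidra shows it with three explicit parameters plus the implicit this pointer, so the call passes four arguments. &CStack56 is typed by Ghidra as a pointer to an instance of the Crypto class (note that the code later reads the resulting string back out of it), apuStack80 is the encoded string, 0x3f is the rotation value (63 in decimal) and the last argument, -1, could indicate the rotation direction.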

If we double click the rotate_string function, we again get pseudo-C code for the function. We could try to reverse engineer this code, but it’s quite complicated, with a lot of bit-shifting and memory allocation. Instead we can find another attack vector that will make our lives a lot easier, which I’ll explain below.

Crypto::rotate_string(Crypto *this,basic_string *param_1,uint param_2,int param_3)

{
  char cVar1;
  undefined *puVar2;
  ulong uVar3;
  ulong uVar4;
  int iVar5;
  ulong uVar6;
  basic_string *pbVar7;
  basic_string bVar8;
  long lVar9;
  ulong uVar10;
  long in_FS_OFFSET;
  bool bVar11;
  undefined local_178 [4];
  undefined4 uStack372;
  undefined8 uStack368;
  undefined *local_168;
  undefined local_158 [8];
  ulong uStack336;
  undefined *local_148;
  undefined local_138 [16];
  undefined local_128 [16];
  undefined local_118 [16];
...
  local_48 = local_138;
  do {
    cVar1 = (char)lVar9;
    if ((0x3a < (byte)(cVar1 - 0x22U)) ||
       ((0x400000000000021U >> ((ulong)(byte)(cVar1 - 0x22U) & 0x3f) & 1) == 0)) {
      if (cVar1 == '\x7f') break;
      bVar11 = (_local_158 & (undefined  [16])0x1) != (undefined  [16])0x0;
      if (bVar11) {
        uVar3 = (local_158 & 0xfffffffffffffffe) - 1;
      }
      else {
        uStack336 = (ulong)((byte)local_158[0] >> 1);
        uVar3 = 0x16;
      }
      local_138[lVar9] = (char)uStack336;
      if (uStack336 == uVar3) {
                    /* try { // try from 001a868f to 001a86ac has its CatchHandler @ 001a8955 */
...

So it looks as if this function is designed to decode/decrypt a string, and there is some interesting text that we need to decode.
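One small piece of that loop can be unpacked without reversing the rest: the bitmask test on cVar1. Here's a minimal sketch in plain JavaScript that replays just that condition for every printable ASCII character and lists the ones that do not enter the branch. Only the condition is taken from the listing; what the branch does with each character is elided above, so I'm not claiming anything beyond "these characters are treated differently". The function name entersBranch is my own.

```javascript
// The constant from the decompiled condition, as a BigInt (it does not fit in 32 bits)
const MASK = 0x400000000000021n;

// Replays: (0x3a < (byte)(c - 0x22)) || ((MASK >> ((c - 0x22) & 0x3f) & 1) == 0)
function entersBranch(code) {
    const offset = (code - 0x22) & 0xff; // emulate the (byte) cast
    if (offset > 0x3a) return true;
    return ((MASK >> BigInt(offset & 0x3f)) & 1n) === 0n;
}

const skipped = [];
for (let code = 0x20; code < 0x7f; code++) {
    if (!entersBranch(code)) skipped.push(String.fromCharCode(code));
}

// Prints the double quote, the single quote and the backslash
console.log(skipped);
```

So the mask singles out exactly three characters (0x22, 0x27 and 0x5c), and the loop stops at 0x7f, which suggests the function works over the printable ASCII range.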

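Looking again at the pseudo-C code of initializeKeyN, we can see that the decoded string is later passed as a pointer parameter to a function called RSAEncrypt (look at puVar3, which points at the string data held in CStack56; apuStack80 is reassigned to a freshly allocated buffer before the call).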

    basic_string<decltype(nullptr)>
              ((basic_string<char,std::__ndk1::char_traits<char>,std::__ndk1::allocator<char>> *)
               apuStack80,
               "-j0ZOfGt%xoJ$.p%U<#~.Bnx#M\nk?-%PwI&Yg+>#|;0W1F{?0@WVJE+#8d 6]Jy2V2_<uqM:HbEfN8j/fy, L^(Prg}yLPi^Xp&ot43flfpXu`h AmT).TJ;*fdo^f;G@J84LcY!U-QKo[:]Be5)h>v6HN*rjS,^|*<K+(6|| yxRxH:S#4>FSYVwK=z<_SH&*L+qWor+.fNpo_Q@o_8@t{KAqQxc#Z(%X,r^[q)~*;+b8Plb<Mrc\n8(&U++!| Z8HPGT5oa/BqAbX6"
              );
    Crypto::rotate_string(&CStack56,(basic_string *)apuStack80,0x3f,-1);
    if (((ulong)apuStack80[0] & 1) != 0) {
      operator.delete(pvStack64);
    }
    apuStack80[0] = (uchar *)operator.new[](0x81);
    puVar3 = auStack55;
    if (((byte)CStack56 & 1) != 0) {
      puVar3 = puStack40;
    }
    iVar1 = Crypto::RSAEncrypt(puVar3,0x10,*ppuVar2,apuStack80);

If we double click the RSAEncrypt function, we can see that it is a function that accepts 4 parameters: a pointer, int, pointer and pointer to a pointer. This will be our alternative attack vector.

int Crypto::RSAEncrypt(uchar *param_1,int param_2,uchar *param_3,uchar **param_4)

The parameter we’re interested in is param_1, as this is going to be the string decoded by the rotate_string function.

Hooking Native Functions using Frida

Okay, so we know that we want to somehow read the value of the param_1 pointer. We can use Frida to do just that. I am going to assume that you have already set up a rooted Android device and installed Frida on it. We’re now going to write a Frida hook in JS that will hook into the native RSAEncrypt function, intercept its parameters and dump them to the console.

We first create our basic hook.js file:

console.log("Started");

function waitForLibLoading(libraryName) {
    var isLibLoaded = false;

    Interceptor.attach(Module.findExportByName(null, "android_dlopen_ext"), {
        onEnter: function (args) {
            var libraryPath = Memory.readCString(args[0]);
            //The path can be null, so guard before calling includes()
            if (libraryPath !== null && libraryPath.includes(libraryName)) {
                console.log("[+] Loading library " + libraryPath + "...");
                isLibLoaded = true;
            }
        },
        onLeave: function (retval) {
            if (isLibLoaded) {                
                isLibLoaded = false;
            }
        }
    });
}

waitForLibLoading("libakamaibmp.so");

This code will hook the android_dlopen_ext API call, which is an entry point into the Dynamic Linker on Android. The Dynamic Linker is responsible for resolving dependencies between shared libraries at runtime. When an application loads a shared library with android_dlopen_ext, the Dynamic Linker checks the library’s dependencies and recursively loads any additional libraries required by the application. The linker then resolves symbol references between the libraries, allowing the application to use functions and data defined in the shared libraries.
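It's worth spelling out how the name in System.loadLibrary("akamaibmp") relates to the path that android_dlopen_ext receives. Here's a minimal sketch in plain JavaScript: the short name is turned into a file name by adding the lib prefix and the .so suffix, and the hook can then compare the file name part of the path. The helper names and the sample path are assumptions for illustration; the real install path on your device will differ.

```javascript
// Mirrors what System.mapLibraryName does on Android: "akamaibmp" -> "libakamaibmp.so"
function mapLibraryName(shortName) {
    return "lib" + shortName + ".so";
}

// Stricter than includes(): compare only the file name part of the path.
function isTargetLibrary(libraryPath, fileName) {
    if (typeof libraryPath !== "string") return false; // covers a null path
    const baseName = libraryPath.slice(libraryPath.lastIndexOf("/") + 1);
    return baseName === fileName;
}

const target = mapLibraryName("akamaibmp");

// Hypothetical path, made up for this example
const samplePath = "/data/app/example/lib/x86_64/libakamaibmp.so";

console.log(target);
console.log(isTargetLibrary(samplePath, target));
```

The includes() check used in the hook is fine for this target; the basename comparison just avoids accidental matches on longer file names that happen to contain the same text.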

So we’ve now hooked the Dynamic Linker and have found where libakamaibmp has been loaded. The next step is we need to find the memory address of the library and also the memory address of the RSAEncrypt function, so we can read the memory locations.

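If we search for RSAEncrypt in the Symbol Tree, we can see its virtual memory address in Ghidra. In our case the address is 0x001a4258. This address isn’t quite correct just yet, because it is relative to the base address Ghidra chose for the image, not to where the library ends up in the running process.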

image-20230322112310046

We now need to find the Base Image Address. When a program is loaded into memory, the operating system assigns it a base address where its code and data will be located, and this is fixed (for the most part) for the duration of the program’s execution. The Base Image Address is used as an offset for all memory references made by the program. For example, if a program tries to access a memory location at offset 0x1000 and its Base Image Address is 0x80000000, the actual memory address being accessed will be 0x80001000. Ghidra picks its own Base Image Address when it imports the file, so to hook a function at runtime we subtract Ghidra’s base from the address Ghidra shows and add the result to the base address of the library in the running process.

To get the Base Image Address of this particular shared library, you can grab it from Ghidra by clicking Window -> Memory Map. In this case, the Base Image Address is 0x00100000:

image-20230322110858847
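Before touching Frida, here's the address arithmetic on its own as a minimal sketch in plain JavaScript. The Ghidra address and image base are the ones from this article; the runtime base 0x7a12345000 is a made-up value, since the real one changes between runs and devices. The function name rebase is my own. BigInt is used so that 64-bit addresses stay exact.

```javascript
// Translate an address shown in Ghidra into an address in the running process.
function rebase(ghidraAddress, ghidraImageBase, runtimeBase) {
    const offset = ghidraAddress - ghidraImageBase; // position inside the library
    if (offset < 0n) {
        throw new RangeError("address is below the image base");
    }
    return runtimeBase + offset;
}

const rsaEncryptInGhidra = 0x001a4258n;   // from the Symbol Tree
const ghidraImageBase = 0x00100000n;      // from Window -> Memory Map
const hypotheticalRuntimeBase = 0x7a12345000n; // assumption for this example

const runtimeAddress = rebase(rsaEncryptInGhidra, ghidraImageBase, hypotheticalRuntimeBase);
console.log("offset:  0x" + (rsaEncryptInGhidra - ghidraImageBase).toString(16)); // 0xa4258
console.log("runtime: 0x" + runtimeAddress.toString(16)); // 0x7a123e9258
```

In the Frida script the same calculation is done with memBase.add(...), where memBase comes from the running process instead of being hard-coded.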

We can now write the following Frida code to find the correct memory location of RSAEncrypt

function process(libraryName){
    const rsaEncryptAddress = 0x001a4258; //Address of RSAEncrypt, as found in Ghidra
    const imageBase = 0x00100000; //Base Image Address from Ghidra's Memory Map
    const memBase = Module.findBaseAddress(libraryName);
    console.log("-> Base address is " + memBase);

    //Find the actual address: subtract Ghidra's image base, then add the runtime base
    const actualRsaEncryptAddress = memBase.add(rsaEncryptAddress - imageBase);
    console.log("[+] Actual RSAEncrypt Address " + actualRsaEncryptAddress);
}

We now have the address of RSAEncrypt as it’s loaded in memory. Next we need to write a hook to intercept the function and its parameters. Remember I mentioned earlier that the first parameter of RSAEncrypt is the decoded string, which is the plain-text RSA key, so we can hook that parameter and read it as a UTF-8 string.

Interceptor.attach(actualRsaEncryptAddress, {
    onEnter: function(args) {
        console.log("Hooked RSAEncrypt");
        const rsaPublicKey = Memory.readUtf8String(args[0]); //public key is the first arg, and is type uchar *
        console.log(rsaPublicKey);
    },
    onLeave: function(retval) {
        console.log("Leaving RSAEncrypt");
    }
});
    

If we now put all this together and run the code in Frida, we can see the results!

frida -l hook.js -U -f de.apptiv.business.android.aldi_uk

image-20230322112515124

And here’s our final code:

console.log("Started");

function process(libraryName){
    const rsaEncryptAddress = 0x001a4258; //Address of RSAEncrypt, as found in Ghidra
    const imageBase = 0x00100000; //Base Image Address from Ghidra's Memory Map
    const memBase = Module.findBaseAddress(libraryName);
    console.log("-> Base address is " + memBase);

    //Find the actual address: subtract Ghidra's image base, then add the runtime base
    const actualRsaEncryptAddress = memBase.add(rsaEncryptAddress - imageBase);
    console.log("[+] Actual RSA Encrypt Address " + actualRsaEncryptAddress);
    
    Interceptor.attach(actualRsaEncryptAddress, {
        onEnter: function(args) {
            console.log("Hooked RSAEncrypt");
            const rsaPublicKey = Memory.readUtf8String(args[0]); //public key is the first arg, and is type uchar *
            console.log(rsaPublicKey);
        },
        onLeave: function(retval) {
            console.log("Leaving RSAEncrypt");
        }
    });
}

function waitForLibLoading(libraryName) {
    var isLibLoaded = false;

    Interceptor.attach(Module.findExportByName(null, "android_dlopen_ext"), {
        onEnter: function (args) {
            var libraryPath = Memory.readCString(args[0]);
            //The path can be null, so guard before calling includes()
            if (libraryPath !== null && libraryPath.includes(libraryName)) {
                console.log("[+] Loading library " + libraryPath + "...");
                isLibLoaded = true;
            }
        },
        onLeave: function (retval) {
            if (isLibLoaded) {            
                process(libraryName);			
                isLibLoaded = false;
            }
        }
    });
}

waitForLibLoading("libakamaibmp.so");

Going One Step Further

Akamai BMP uses AES to encrypt sensors, which is also handled in the shared library. Digging further around in Ghidra, we find another function of interest: AESEncrypt.

image-20230322113338426

We can take the information we’ve learnt from this article to apply the same theory to this function:

  1. Find the virtual address of the function
  2. Translate it to an address in the running process
  3. Build a hook
  4. Profit!

//AESEncrypt Hook (encrypts the sensor)
const aesEncryptAddr = 0x001a3ff4; //Address of the AESEncrypt method, as found in Ghidra
const ghidraImageBase = 0x00100000; //Base Image Address from Ghidra's Memory Map
const aesEncryptBase = Module.findBaseAddress(libraryName); //libraryName as passed to process()
console.log("[+] Library base address is " + aesEncryptBase);
const actualAesEncryptAddress = aesEncryptBase.add(aesEncryptAddr - ghidraImageBase);
console.log("[+] Actual AESEncrypt Address " + actualAesEncryptAddress);

Interceptor.attach(actualAesEncryptAddress, {
    onEnter: function(args) {
        console.log("Hooked AESEncrypt");
        const plainSensor = Memory.readUtf8String(args[0]); //sensor is the first arg, and is type uchar *
        console.log(plainSensor);
    },
    onLeave: function(retval) {
        console.log("Leaving AESEncrypt");
    }
});

And we can see the results in our console!

image-20230322113605443
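Since both hooks follow the same recipe, they can be driven from a small table instead of being copy-pasted. Here's a minimal sketch, assuming the two Ghidra addresses and the image base from this article and that the first argument of each function is the string we want to print. The names TARGETS, toOffsets and hookAll are my own. hookAll relies on the same Frida calls already used above (Module.findBaseAddress, Interceptor.attach, Memory.readUtf8String), which may differ in newer Frida releases, and it only works inside a Frida script.

```javascript
const GHIDRA_IMAGE_BASE = 0x00100000;

// Function name -> address as shown in Ghidra
const TARGETS = {
    RSAEncrypt: 0x001a4258,
    AESEncrypt: 0x001a3ff4
};

// Pure helper: turn Ghidra addresses into offsets inside the library.
function toOffsets(targets, imageBase) {
    const offsets = {};
    for (const [name, address] of Object.entries(targets)) {
        offsets[name] = address - imageBase;
    }
    return offsets;
}

// Frida-only part: attach one hook per table entry.
function hookAll(libraryName) {
    const base = Module.findBaseAddress(libraryName);
    if (base === null) return; // library not loaded yet
    const offsets = toOffsets(TARGETS, GHIDRA_IMAGE_BASE);
    for (const [name, offset] of Object.entries(offsets)) {
        Interceptor.attach(base.add(offset), {
            onEnter: function (args) {
                // Assumes the first argument is a NUL-terminated string
                console.log("[" + name + "] " + Memory.readUtf8String(args[0]));
            }
        });
    }
}
```

To use it, call hookAll(libraryName) from the onLeave handler of the android_dlopen_ext hook in place of process(libraryName). Adding another function then only means adding one line to TARGETS.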

Until next time!