Architecture

Introduction

Tweasel provides a toolbox of several tools aimed at analyzing network traffic to detect mobile privacy violations. Here, we document how these tools work in a general overview to allow you to understand and recreate the results the tools produce.

The setup that the tools were developed for, and that we assume here, is the following: a computer, which we call the “host”, runs the tweasel tools, and a mobile device running either a distribution of Android or iOS is connected to it via USB and is also on the same network. The device might instead be an emulator running on the host.

Programming language and ecosystem

Tweasel is mostly written in TypeScript, a programming language that is transpiled to JavaScript. For transpiling and packaging, the project uses Parcel. The scripts are run on the Node.js runtime. Dependencies are loaded from the npm package registry.

Parts of the project, in particular appstraction and cyanoacrylate, also depend on Python scripts. These load and create their own Python runtime environment using our TypeScript wrapper library for the Python venv module, autopy. It loads and installs the dependencies from the Python Package Index (PyPI).

appstraction and cyanoacrylate also require the Android development tools to be installed and in the correct version for working with Android devices and emulators. This is automated using the andromatic helper library. It downloads and installs all the required tools and libraries and manages their usage.

How to do the setup manually

If you want to recreate the results without using our libraries, you’ll need to install the dependencies yourself. You will also need to set up your host for interaction with physical devices, if you want to. For that, you can follow the documentation of appstraction.

Python dependencies

  1. Install Python (or make sure it is already installed).
  2. Create a virtual environment using python -m venv .venv (read more in Python’s documentation of venvs).
  3. Install mitmproxy, pymobiledevice3 and frida-tools using pip: .venv/bin/pip install mitmproxy pymobiledevice3 frida-tools (You might need a specific version to have the same environment as we are using. You can check the scripts/common/python.js file in the sources of our tools to find the right versions).
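If you want to script the steps above from Node.js, the commands could be assembled roughly as follows. This is only a sketch: venvSetupCommands and the Command type are hypothetical names and not part of autopy, and the Windows pip path is included only because the venv layout differs there. The resulting entries are meant to be passed to something like Node's child_process.execFile.

```typescript
// A command to run: the executable plus its arguments (hypothetical helper type).
type Command = { file: string; args: string[] };

/**
 * Build the commands that create a virtual environment and install packages
 * into it. `platform` follows Node's process.platform values (e.g. 'linux',
 * 'darwin', 'win32').
 */
function venvSetupCommands(packages: string[], platform: string = 'linux', venvDir: string = '.venv'): Command[] {
    // On Windows, venv places executables in Scripts\ instead of bin/.
    const pip = platform === 'win32' ? venvDir + '\\Scripts\\pip.exe' : venvDir + '/bin/pip';
    return [
        { file: 'python', args: ['-m', 'venv', venvDir] },
        { file: pip, args: ['install', ...packages] },
    ];
}
```

To pin versions, each package can be given as a pip requirement specifier of the form name==version, using the versions listed in the tools' sources.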

Android development tools

If you want to use Android, you also need to install the Android development tools, which you can for example install via Android Studio. Note that these need to be included in your PATH, e.g. by including something like this in your .zshrc/.bashrc:

# Android SDK
export ANDROID_HOME="$HOME/Android/Sdk"
export ANDROID_API_VERSION="<Your API version, e.g. 33.0.0>"
export PATH="$PATH:$ANDROID_HOME/platform-tools:$ANDROID_HOME/build-tools/$ANDROID_API_VERSION:$ANDROID_HOME/cmdline-tools/latest/bin/:$ANDROID_HOME/emulator"

Instrumentation of devices and emulators

To ensure a consistent test environment, to place honey data on the device, and to automatically control apps and other operating system functions, we employ a number of ways to instrument devices and emulators. Depending on what you want to reproduce, you might or might not need to do these things. Here, we give an overview of the techniques we use to control device functions. For the specific details on how to do them independently, we want to refer you to appstraction’s source code.

Android

On Android, we use two techniques to control the operating system. For one, we make use of the Android Debug Bridge (adb), which is the official way to interact with Android devices (and emulators). With adb, we read out system information, install apps, change their settings, push files to the device’s file system or simulate user interactions. In particular, we make use of adb’s ability to run commands in a shell on the device using adb shell.

Because some things we do require elevated privileges, we try to elevate the adb shell using the su binary: adb shell su root /bin/sh -c '<command>' [1]. To ensure a su binary and root privileges are available, we expect the device to either be rooted with an external rooting tool such as Magisk, or have “Rooted Debugging” enabled, which is an option for USB debugging on some Android distributions. You can find more details on setting up a device for rooted access in appstraction’s README.
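When building this command programmatically, the main pitfall is quoting: to our understanding, adb joins its arguments with spaces and hands the result to the shell on the device, so the inner command has to be quoted for that shell. The following is a minimal sketch of how the argument list could be assembled; rootShellArgs and quoteForDeviceShell are hypothetical helper names, not appstraction's actual implementation.

```typescript
/** Wrap a string in single quotes for a POSIX-like shell, escaping embedded single quotes. */
function quoteForDeviceShell(command: string): string {
    // Close the quote, emit an escaped single quote, and reopen the quote.
    return "'" + command.replace(/'/g, "'\\''") + "'";
}

/**
 * Arguments for the adb binary that run `command` as root on the device.
 * `serial` optionally selects a specific device via adb's -s option.
 */
function rootShellArgs(command: string, serial?: string): string[] {
    const target = serial ? ['-s', serial] : [];
    return [...target, 'shell', 'su', 'root', '/bin/sh', '-c', quoteForDeviceShell(command)];
}
```

The returned array can be passed to Node's child_process.execFile('adb', args), which avoids a second layer of quoting on the host.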

Additionally, we use the instrumentation toolkit Frida, which allows hooking native functions and interacting with the runtime context of a process. This allows us to create scripts that use internal, unexposed APIs to manipulate the device state. We use this for example to access an app’s settings storage and to set the content of the clipboard to place honey data in it. Frida needs root access on the device and the frida-server to be installed, which is done automatically in appstraction.

Manual Frida setup on Android

You can follow these steps if you want to manually set up Frida on your device:

  1. Check the Frida tools version on your host (see previous section for how to install Frida on your host): frida --version
  2. Download frida-server in the matching version (make sure at least the major versions match).
    You need to choose the frida-server binary for the correct architecture of your device (you can determine that using adb shell getprop ro.product.cpu.abi).
  3. Push the binary to your device using adb push <path on host>/frida-server /data/local/tmp/frida-server.
  4. Make sure the binary is executable: adb shell su root /bin/sh -c 'chmod 755 /data/local/tmp/frida-server'.

Now, if you want to use Frida on the device, you can start it using adb shell su root /bin/sh -c '/data/local/tmp/frida-server --daemonize'.
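Steps 1 and 2 can be partly automated. The sketch below maps the ABI reported by the device to the architecture suffix used in the file names of Frida's GitHub releases and checks that the major versions match. The function names are hypothetical, and the file name pattern reflects how Frida releases are named at the time of writing, so treat it as an assumption and verify it against the release page.

```typescript
// Android ABI names (as reported by `getprop ro.product.cpu.abi`) mapped to
// the architecture suffix in frida-server release file names (assumed naming).
const abiToFridaArch: Record<string, string> = {
    'arm64-v8a': 'arm64',
    'armeabi-v7a': 'arm',
    x86: 'x86',
    x86_64: 'x86_64',
};

/** File name of the frida-server release asset for a given version and device ABI. */
function fridaServerAssetName(fridaVersion: string, abi: string): string {
    const arch = abiToFridaArch[abi.trim()];
    if (!arch) throw new Error('Unsupported ABI: ' + abi);
    return 'frida-server-' + fridaVersion.trim() + '-android-' + arch + '.xz';
}

/** Check that two version strings such as "16.1.4" share the same major version. */
function majorVersionsMatch(hostVersion: string, serverVersion: string): boolean {
    const major = (version: string) => version.trim().split('.')[0];
    return major(hostVersion) === major(serverVersion);
}
```

Note that the release assets are xz-compressed, so the binary has to be decompressed on the host before pushing it to the device.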

iOS

To control iOS devices, we use pymobiledevice3, a Python library that interacts with iOS’s lockdownd service and implements internal Apple APIs which are used for communication with the host via USB. Similarly to adb, this allows us to install apps and retrieve system information, but it does not allow opening a shell on the device.

To do the latter, the device needs to be jailbroken and an SSH server needs to be installed (more on which device configuration we require). We then connect to the device as root using SSH over the network (the host and the device need to be able to reach each other) and run commands using this shell. We also install all other required dependencies automatically. Using the shell access, we can manipulate app permissions, start apps, and sync files from the host to the device.

We also use Frida on iOS, which can be installed using Cydia/Sileo. We use Frida to access several internal APIs, e.g. to set the clipboard content, access the app settings storage, read out device information, and start apps.

Collecting traffic

Device traffic is collected using a machine-in-the-middle (MITM) proxy: the device’s network traffic is rerouted through the host machine, saved, and then forwarded to the original receiver. This type of proxy is (apart from cryptographic checks) transparent to the software running on the device. However, a MITM proxy cannot read TLS-encrypted content by default. To get around that, the proxy presents certificates issued by its own certificate authority (CA). If the device is made to trust this CA, the encrypted traffic can be decrypted and saved by the host. Apps can prevent that by only accepting a specific certificate, a technique called certificate pinning. To circumvent certificate pinning, additional measures are required.
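To illustrate why installing the proxy's CA is not sufficient for apps that pin certificates, here is a toy model of the decision an app makes when it opens a TLS connection. It is a deliberately simplified sketch, not how any particular TLS library implements pinning; ServerCertificate and appAcceptsConnection are hypothetical names.

```typescript
interface ServerCertificate {
    /** Whether the certificate chain validates against the device's trust store. */
    chainTrusted: boolean;
    /** Fingerprint of the presented certificate, e.g. a hash of its public key. */
    fingerprint: string;
}

/**
 * Decide whether an app accepts a connection. Without pins, a chain that the
 * device trusts is enough. With pins, the presented certificate must also be
 * one of the pinned ones.
 */
function appAcceptsConnection(cert: ServerCertificate, pinnedFingerprints: string[] = []): boolean {
    if (!cert.chainTrusted) return false;
    if (pinnedFingerprints.length === 0) return true;
    return pinnedFingerprints.indexOf(cert.fingerprint) !== -1;
}
```

In this model, a certificate issued by the proxy passes the first check once the device trusts the proxy's CA, but it fails the second check for a pinning app because its fingerprint differs from the pinned one. That is the check the unpinning measures described below have to disable.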

cyanoacrylate uses mitmproxy as the MITM proxy on the host. mitmproxy provides the certificate authority that we need to install on the devices, and it saves the decrypted traffic. cyanoacrylate then exports this traffic in the HAR file format, which contains only the HTTP(S) traffic and ignores other types of network transmissions.
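For orientation, a HAR file is a JSON document whose log.entries array holds one entry per request. The following sketch types a small subset of the HAR 1.2 structure and counts the recorded requests per host. The interface names and requestsPerHost are chosen for illustration only, and real HAR files contain many more fields than shown here.

```typescript
// Minimal subset of the HAR 1.2 structure, enough to look at requests.
interface HarHeader { name: string; value: string }
interface HarRequest {
    method: string;
    url: string;
    headers: HarHeader[];
    postData?: { mimeType: string; text?: string };
}
interface HarEntry { startedDateTime: string; request: HarRequest }
interface Har { log: { entries: HarEntry[] } }

/** Count the recorded requests per host name. */
function requestsPerHost(har: Har): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const entry of har.log.entries) {
        const host = new URL(entry.request.url).hostname;
        counts[host] = (counts[host] || 0) + 1;
    }
    return counts;
}
```

A HAR file read from disk can be passed in after JSON.parse, for example requestsPerHost(JSON.parse(fileContents)).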

Android

To reroute the traffic through the host, cyanoacrylate sets up a VPN tunnel on the device using WireGuard. The host uses mitmproxy --mode wireguard to set up a VPN server that can receive the device’s traffic. On the device, we install the WireGuard app and then set up the host as the VPN server. WireGuard can either route all traffic via the VPN or only the network communication generated by a number of allowlisted apps. The latter makes it possible to attribute requests to a specific app, but it might miss transmissions that are made through a different process.
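The tunnel on the device is described by a WireGuard configuration. As a rough sketch of what such a configuration could look like, the function below renders one, including the per-app allowlist. Several things here are assumptions to check against your own setup: the Address and DNS values and the default port 51820 follow what mitmproxy typically displays for its WireGuard mode, the IncludedApplications key is the one understood by the WireGuard Android app, and TunnelOptions and renderTunnelConfig are hypothetical names. In practice, use the keys and values that mitmproxy shows for your session.

```typescript
interface TunnelOptions {
    clientPrivateKey: string;
    serverPublicKey: string;
    /** Address under which the device can reach the host. */
    hostAddress: string;
    /** UDP port of the WireGuard server on the host (assumed default: 51820). */
    port?: number;
    /** Package IDs of the apps whose traffic should go through the tunnel. */
    includedApps?: string[];
}

/** Render a WireGuard client configuration for the device. */
function renderTunnelConfig(opts: TunnelOptions): string {
    const iface = [
        '[Interface]',
        'PrivateKey = ' + opts.clientPrivateKey,
        'Address = 10.0.0.1/32', // illustrative value
        'DNS = 10.0.0.53', // illustrative value
    ];
    // Without an allowlist, all of the device's traffic is routed via the VPN.
    if (opts.includedApps && opts.includedApps.length > 0) {
        iface.push('IncludedApplications = ' + opts.includedApps.join(', '));
    }
    const peer = [
        '[Peer]',
        'PublicKey = ' + opts.serverPublicKey,
        'AllowedIPs = 0.0.0.0/0',
        'Endpoint = ' + opts.hostAddress + ':' + (opts.port === undefined ? 51820 : opts.port),
    ];
    return [...iface, '', ...peer, ''].join('\n');
}
```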

In order to read TLS-encrypted traffic, we install the certificate authority generated by mitmproxy on the device using appstraction. To disable certificate pinning, we use the HTTP Toolkit certificate unpinning script, which we load in Frida to hook and disable certificate pinning functions in the app’s code at runtime. Our investigation found that the script is adequate to bypass most certificate pinning methods.

How to set up traffic collection on Android

To manually set up traffic collection on Android, do the following steps:

  1. Download and install the WireGuard app on the device.
  2. Start the proxy on the host: mitmweb --mode wireguard. A browser should open.
  3. Make sure host and device are on the same network and can reach each other.
  4. On the device, open the WireGuard app and tap the plus icon to add a tunnel. Choose to “Scan from QR code” and scan the code on the host.
  5. Activate the tunnel and navigate to http://mitm.it on the device. Follow the instructions to install the CA.
  6. If necessary, edit the tunnel settings to only tunnel specific apps.
  7. Follow the HTTP Toolkit guide to disable certificate pinning.

iOS

On iOS, VPNs are not easily available to tunnel the device’s traffic. Instead, we use the HTTP proxy settings in the network preferences to route the (HTTP(S)) traffic to the host: The host runs mitmproxy in its HTTP proxy mode (the default) and is then set as the proxy server on the device. This sends all of the device’s HTTP(S) traffic to the host and there is no option to filter out specific apps.

To read the TLS-encrypted traffic, the mitmproxy CA certificate is installed on the device via the system preferences. Disabling certificate pinning can be done on jailbroken devices using the SSL Kill Switch 2 tweak. It only needs to be installed on the device once and then breaks the certificate pinning in (almost) all apps.

How to set up traffic collection on iOS

To manually set up traffic collection on iOS, do the following steps:

  1. (If your device is jailbroken: Use Cydia or Sileo to install SSL Kill Switch 2. You need to add https://julioverne.github.io/ to the repositories)
  2. Start the proxy on the host: mitmweb. A browser should open.
  3. Make sure host and device are on the same network and can reach each other.
  4. Set up the proxy on the device:
    1. Open the “Settings” app.
    2. Go to “Wi-Fi” and choose the network your device and host are currently connected to.
    3. Scroll down and tap “Configure proxy” and then choose “Manual”.
    4. For “Server”, enter your host’s local IP address; for “Port”, enter the mitmproxy port (default 8080).
  5. Navigate to http://mitm.it on the device. Follow the instructions to install the CA.

Detecting trackers

The recorded traffic in the HAR files is passed to TrackHAR to detect transmissions of data to tracking endpoints. To do so, TrackHAR uses an adapter-based matching approach in addition to, optionally, the more common indicator matching. That is, TrackHAR does not only try to recognize specific, previously known data in the traffic. Instead, it primarily uses a database of common tracking endpoints and the schema of the requests we have seen sent to them. This way, TrackHAR can decode the requests and use the schema to find what data has been transmitted. [2]

To create the adapters, we analyze requests recorded in test runs with thousands of apps and do additional research to determine what data is transmitted. For every adapter in TrackHAR, we document our research and the reasoning behind each data type determination.
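The difference between the two approaches can be sketched in a few lines. The adapter shape below is hypothetical and heavily simplified (a single JSON decoding step and plain property paths). It is not TrackHAR's actual adapter format, and the tracker endpoint and property names in the usage note are made up.

```typescript
interface RecordedRequest { url: string; body?: string }

// Hypothetical, simplified adapter: which endpoint it applies to, and where
// in the decoded JSON body each data type is found.
interface TrackerAdapter {
    tracker: string;
    endpoint: RegExp;
    dataPaths: Record<string, string[]>;
}

/** Follow a property path into a decoded value, returning undefined if it does not exist. */
function getPath(value: unknown, path: string[]): unknown {
    let current = value;
    for (const key of path) {
        if (typeof current !== 'object' || current === null) return undefined;
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

/** Adapter matching: decode the request using the known schema and extract the data. */
function applyAdapter(adapter: TrackerAdapter, request: RecordedRequest): Record<string, unknown> | undefined {
    if (!adapter.endpoint.test(request.url)) return undefined;
    let decoded: unknown;
    try {
        decoded = JSON.parse(request.body || '');
    } catch (error) {
        return undefined; // The body does not have the expected encoding.
    }
    const found: Record<string, unknown> = {};
    for (const dataType of Object.keys(adapter.dataPaths)) {
        const value = getPath(decoded, adapter.dataPaths[dataType] || []);
        if (value !== undefined) found[dataType] = value;
    }
    return found;
}

/** Indicator matching: look for known honey data values anywhere in the raw body. */
function indicatorMatches(request: RecordedRequest, indicators: Record<string, string>): string[] {
    const body = request.body || '';
    return Object.keys(indicators).filter((dataType) => body.indexOf(indicators[dataType] || '\u0000') !== -1);
}
```

For a request to https://tracker.example/v1/events with the body {"device":{"adId":"abc","os":"13"}}, an adapter with dataPaths { advertisingId: ['device', 'adId'], osVersion: ['device', 'os'] } would report both values, whereas indicator matching would only report the advertising ID if "abc" had been placed on the device as honey data beforehand.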


  1. This command proved to be consistent across different su implementations, since Android uses a non-POSIX-compliant su.

  2. You can also read a longer explanation of why we chose this approach.