Malicious Code Forensics: A Practitioner’s Guide to Reverse Engineering Malware from Compromised IoT Devices

By 2025, there will be over 41 billion IoT devices connected to our networks. That isn’t just a number: it’s a massive, expanding attack surface. For every smart camera, router, and connected toaster we install, we create a new potential foothold for attackers. Unlike the familiar battleground of x86 systems, the threats targeting these devices are a different beast entirely. They are built for resource-constrained environments and non-x86 architectures like ARM and MIPS. For security practitioners, this new reality presents a sharp, technical challenge. The old playbooks don’t always apply, and the difficulty in extracting and analyzing firmware can feel like a significant barrier. If you’re struggling to build a structured process for dissecting these unique threats, you’re not alone. This guide provides a practical, hands-on workflow for malicious code forensics on embedded systems.

The First Hurdle: Getting the Code Off the Device

Before you can perform any malicious code forensics, you need the code. With IoT devices, this is often the most challenging step. You can’t just download an executable. You need to extract the firmware directly from the device’s hardware to find the malicious binaries hidden within. This requires a hands-on approach that blends hardware and software skills.

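Your first task is to gain access to the device’s console, which is usually possible through a serial (UART) connection. Look for a row of three or four pins or pads on the device’s printed circuit board (PCB). They may be labeled VCC, GND, TX, and RX, but they are often unmarked and have to be identified with a multimeter. Use a USB-to-TTL serial adapter that matches the board’s logic level (commonly 3.3 V), connect GND to GND, cross TX and RX, and leave VCC unconnected. Once the baud rate is set correctly, you can read the boot log and often reach the bootloader prompt or a shell. This is your primary entry point for reconnaissance.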
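Serial consoles rarely advertise their baud rate, and a wrong setting shows up as garbage characters. The sketch below is one simple way to pick the right rate, assuming you have already captured a short burst of boot output at each candidate speed (the capture itself needs a serial library and is not shown). The function names and the sample captures are hypothetical.

```python
import string

# Bytes that normally appear in a readable boot log.
PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII (0.0 for empty input)."""
    if not data:
        return 0.0
    return sum(b in PRINTABLE for b in data) / len(data)


def best_baud(captures: dict[int, bytes]) -> int:
    """Return the baud rate whose capture looks most like text."""
    return max(captures, key=lambda rate: printable_ratio(captures[rate]))


if __name__ == "__main__":
    # Hypothetical captures: readable at 115200, line noise elsewhere.
    captures = {
        9600: bytes([0x00, 0xFE, 0x80, 0xF8, 0x00, 0xE0]),
        57600: bytes([0xC3, 0x7F, 0x91, 0x41, 0xFF, 0x02]),
        115200: b"U-Boot 1.1.3\r\nDRAM: 64 MB\r\n",
    }
    print(best_baud(captures))
```

A printable-character ratio is a crude heuristic, but it is usually enough to tell a readable boot log from noise.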

When a shell isn’t enough, you need to go deeper with direct memory access. Interfaces like JTAG (Joint Test Action Group) provide low-level debugging access to the CPU, allowing you to halt the processor and dump the entire contents of memory. This is invaluable for live forensics on a running device. However, the most reliable method is often to read the firmware directly from the flash memory chip. Using a tool like a Bus Pirate or a dedicated SPI flash programmer, you can physically clip onto the chip and download its contents, creating a complete binary image of the device’s software.
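Reads taken through a test clip can suffer bit errors from poor contact, so a common precaution is to read the chip more than once and confirm that the images are identical before analyzing any of them. The sketch below assumes the dumps were saved as ordinary files; the file names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large flash images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dumps_match(paths: list[Path]) -> bool:
    """True when every dump has the same SHA-256 digest."""
    return len({sha256_file(p) for p in paths}) == 1


if __name__ == "__main__":
    # Hypothetical file names for two reads of the same chip.
    reads = [Path("dump_read1.bin"), Path("dump_read2.bin")]
    if all(p.exists() for p in reads):
        if dumps_match(reads):
            print("reads are consistent")
        else:
            print("MISMATCH: re-seat the clip and read again")
    else:
        print("dump files not found")
```

Recording the digest of the verified image also gives you a fixed reference point for the rest of the investigation.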

Once you have this firmware image, the real software work begins. The single most important tool in your arsenal is binwalk. It scans the binary image for known file signatures and data structures, allowing you to carve out the different components: bootloaders, the Linux kernel, and most importantly, the compressed filesystem. This filesystem, often SquashFS or CramFS, is where you’ll find the operating system’s executables, configuration files, and the malware itself.
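To make the idea of signature scanning concrete, here is a minimal sketch that searches an image for a handful of well-known magic numbers. It is not how binwalk is implemented, and it is far less thorough: binwalk knows many more formats and validates the headers it finds, whereas raw byte matching like this will produce false positives. The function name and the choice of signatures are illustrative assumptions.

```python
import sys

# Well-known magic numbers as they appear on disk.
SIGNATURES = {
    b"hsqs": "SquashFS (little-endian)",
    b"sqsh": "SquashFS (big-endian)",
    b"\x45\x3d\xcd\x28": "CramFS (little-endian)",
    b"\x27\x05\x19\x56": "U-Boot uImage header",
    b"\x1f\x8b\x08": "gzip stream",
}


def scan(image: bytes) -> list[tuple[int, str]]:
    """Return (offset, label) for every signature match, sorted by offset."""
    hits = []
    for magic, label in SIGNATURES.items():
        offset = image.find(magic)
        while offset != -1:
            hits.append((offset, label))
            offset = image.find(magic, offset + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Usage: python scan.py firmware.bin
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as fh:
            for offset, label in scan(fh.read()):
                print(f"{offset:#010x}  {label}")
```

The offsets it reports are the same kind of information binwalk uses to decide where each component starts and what to carve out.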

Deconstructing the Threat: Static and Dynamic Analysis for Non-x86 Malware

Analyzing malware built for ARM or MIPS architectures requires a mental shift from traditional x86 reverse engineering. These RISC (Reduced Instruction Set Computing) architectures use simpler, mostly fixed-width instructions (ARM’s Thumb mode mixes 16- and 32-bit encodings) and a load/store design in which only dedicated instructions access memory. Both can also run in either byte order. This changes how you approach both static and dynamic analysis.
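Because the architecture, word size, and byte order determine how every later tool has to be set up, a useful first check is to read them from the binary’s ELF header. The sketch below parses only the fields needed for that, using offsets from the ELF specification. The function name and the short machine table are illustrative choices, not a complete list.

```python
import struct
import sys

# A few e_machine values from the ELF specification.
MACHINES = {3: "x86", 8: "MIPS", 20: "PowerPC", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def identify_elf(header: bytes) -> dict:
    """Report word size, byte order, and machine type from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])              # EI_CLASS
    endian = {1: "little", 2: "big"}.get(header[5])   # EI_DATA
    if bits is None or endian is None:
        raise ValueError("invalid ELF identification bytes")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)
    return {
        "bits": bits,
        "endian": endian,
        "machine": MACHINES.get(machine, f"unknown ({machine})"),
    }


if __name__ == "__main__":
    # Usage: python identify_elf.py path/to/binary
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as fh:
            print(identify_elf(fh.read(20)))
```

The standard file and readelf -h utilities report the same fields. The point of the sketch is only to show where the information lives.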

For static analysis, your primary tools remain disassemblers and decompilers like Ghidra, IDA Pro, or Radare2. The key is to configure them for the correct architecture (e.g., ARM 32-bit, Little Endian). One of the biggest challenges you’ll face is that a significant portion of IoT malware is written in C, statically linked, and stripped of symbols. This means that instead of calling out to shared system libraries, all the necessary library code is compiled directly into the malware executable, with no names attached. Your disassembler won’t automatically label standard functions like printf or strcpy unless you apply matching library signatures. You’ll spend a good portion of your time identifying these common functions to clean up the code and focus on the malware’s unique, malicious logic. Your goal here is to map out the program’s structure, identify its core capabilities such as command-and-control (C2) communication protocols, and find any embedded encryption keys or command strings.
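A quick first pass for embedded command strings and possible C2 addresses is to pull printable strings out of the binary, much as the strings utility does, and filter them for patterns of interest. The sketch below assumes plain ASCII strings; obfuscated or encrypted strings will not show up this way. The function names and the sample bytes are hypothetical.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def candidate_ipv4(strings: list[str]) -> list[str]:
    """Return dotted-quad substrings whose octets are all in 0..255."""
    found = []
    for s in strings:
        for match in IPV4.findall(s):
            if all(int(octet) <= 255 for octet in match.split(".")):
                found.append(match)
    return found


if __name__ == "__main__":
    # Hypothetical bytes standing in for part of a sample's data section.
    blob = b"\x00\x01/bin/busybox\x00\xff\xfe192.0.2.10:23\x00POST /cmd HTTP/1.1\x00ab\x00"
    found = extract_strings(blob)
    print(found)
    print(candidate_ipv4(found))
```

Anything this turns up is only a lead: confirm it in the disassembler by finding the code that references the string.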

Dynamic analysis, or running the code to observe its behavior, is where things get really different. You can’t just execute an ARM binary on your Intel-based analysis machine: without an emulation layer, the kernel rejects it with an “Exec format error.” This is why a properly configured emulation environment is not just a nice-to-have. It’s an absolute necessity for effective malicious code forensics on IoT devices.

Building a Safe Lab: Emulating IoT Environments with QEMU

To safely execute and analyze IoT malware, you need a sandboxed lab that mimics the device’s native environment. QEMU (Quick EMUlator) is the perfect tool for this job. It can perform full-system emulation to boot an entire IoT operating system or, more efficiently, use user-space emulation to run a single binary from a different architecture on your host system.

Here is a practical, step-by-step process to build your analysis environment:

  1. Install QEMU User-Mode Emulators: On a Linux analysis machine, you can install the static user-mode binaries. For example, on Debian/Ubuntu, you’d run sudo apt-get install qemu-user-static. This provides the interpreters needed to run foreign binaries.

  2. Extract the Root Filesystem: Using binwalk -eM firmware.bin, extract the device’s filesystem from the firmware image you dumped earlier. This will create a directory (e.g., _firmware.bin.extracted/squashfs-root/) containing the full file structure of the IoT device.

  3. Prepare the Emulation Environment: Copy the appropriate QEMU static binary into the extracted filesystem’s /usr/bin directory. For an ARM binary, you would copy qemu-arm-static.

  4. Enter the Emulated System: Use the chroot command to change the root directory into the extracted filesystem. This effectively places you inside the IoT device’s environment. The command would look something like this: sudo chroot squashfs-root /usr/bin/qemu-arm-static /bin/sh. This command tells the system to use the QEMU static binary as the interpreter for the shell you are launching.

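  5. Execute and Observe: You are now in a shell running inside the emulated ARM environment. You can navigate the filesystem and execute the malicious binary much as it would run on the real device. Because user-mode emulation shares the host’s kernel and network stack, do this only inside an isolated, disposable virtual machine. To trace the system calls the binary makes, use QEMU’s built-in -strace option (for example, qemu-arm-static -strace /path/to/sample inside the chroot). Running the host’s strace against the emulator instead mixes QEMU’s own calls in with the guest’s. From the host, use lsof on the emulator process to see what files it opens, and run tcpdump to capture any network traffic it generates. This is how you’ll discover its C2 servers, observe its propagation methods, and understand its true purpose.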
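Steps 3 through 5 can be sketched as a small helper that assembles the commands for a given architecture. It only builds and prints them so you can review each one before running it by hand inside an isolated analysis VM. Three things are assumptions: the emulator names follow Debian’s qemu-user-static package, the emulators are taken to live in /usr/bin on the host, and the helper name is hypothetical. It also adds QEMU’s user-mode -strace option, which logs guest system calls from the emulator itself.

```python
import shlex
from pathlib import Path

# Static user-mode emulators shipped by Debian's qemu-user-static package.
QEMU_STATIC = {
    "arm": "qemu-arm-static",
    "aarch64": "qemu-aarch64-static",
    "mips": "qemu-mips-static",      # big-endian MIPS
    "mipsel": "qemu-mipsel-static",  # little-endian MIPS
}


def build_commands(rootfs: str, arch: str, target: str, trace: bool = True) -> list[list[str]]:
    """Return the copy and chroot commands for running target under QEMU."""
    emulator = QEMU_STATIC[arch]
    # Assumption: the host keeps the static emulators in /usr/bin.
    copy = ["sudo", "cp", f"/usr/bin/{emulator}",
            str(Path(rootfs) / "usr" / "bin" / emulator)]
    run = ["sudo", "chroot", rootfs, f"/usr/bin/{emulator}"]
    if trace:
        run.append("-strace")  # QEMU user-mode option: log guest system calls
    run.append(target)
    return [copy, run]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in build_commands("squashfs-root", "arm", "/bin/sh"):
        print(shlex.join(cmd))
```

Passing the path of the suspect binary (relative to the extracted root) instead of /bin/sh traces that binary directly rather than the shell.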

This emulation technique is critical for modern malicious code forensics. It allows you to safely detonate IoT malware in a controlled lab, turning a static, unknown binary into a live process you can analyze in real-time.

Reverse engineering IoT malware is a discipline that sits at the intersection of hardware hacking, firmware analysis, and software reverse engineering. The process of extracting the code from a chip, identifying malicious binaries, and analyzing them in an emulated environment is a foundational skill for any practitioner defending against these evolving threats. As IoT technology becomes further embedded in our homes and critical infrastructure, the malware targeting it will only grow in sophistication. Mastering these techniques isn’t just about analyzing today’s botnets; it’s about preparing for the threats of tomorrow.
