Disassembling router firmware

Disassembling router firmware is an interesting exercise because it means opening up and understanding something usually thought of as a black box: the home router. Some time ago I heard that home router firmware is in many cases a downsized but otherwise fully functional Linux system which can be reversed, explored, and sometimes even modified.


Getting our hands on the firmware


The first step of software reversing is obtaining the binary image of the firmware. In most cases this is done in one of two ways:



  1. Opening the router up and hacking its serial port. This approach is beyond the scope of this post, but its benefit is that the dump includes any changes made to your firmware, whether by your operator or by a piece of malware. You can read more about this approach here.

  2. Downloading the firmware from either the router vendor or your Internet provider. This one is easier, as it only involves a simple download.


I will be investigating firmware available from here.


Initial research


The first steps in our reversing session give us some important information needed to achieve our goal. For each tool I will also explain why its output is useful and how we can make use of it.


File


The Linux file command identifies the type of many different files. In our case we expect it to return just "data", because our binary file is a conglomerate of several different parts: the bootloader, the kernel, the file system, and so on. We will use another tool to discover those details later in the process.


# file BR-6208AC_1.30.bin 
BR-6208AC_1.30.bin: data
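
To illustrate why file gives up here, below is a minimal sketch of the idea behind it: compare the first bytes of a file against a table of known signatures. The table and the helper name guess_type are my own illustration, not how file is actually implemented, and the real tool knows far more formats.

```python
# Hypothetical helper: a tiny subset of what `file` does. It only matches
# the first bytes of a buffer against a few well-known signatures.
KNOWN_MAGIC = {
    b"\x7fELF": "ELF executable",
    b"hsqs": "Squashfs filesystem, little endian",
    b"\x1f\x8b": "gzip compressed data",
    b"\x27\x05\x19\x56": "U-Boot legacy uImage",
}


def guess_type(header: bytes) -> str:
    """Return a description if the header starts with a known magic, else 'data'."""
    for magic, description in KNOWN_MAGIC.items():
        if header.startswith(magic):
            return description
    return "data"
```

An image that starts with an unknown vendor header, like the cr73 bytes seen later in this post, falls through to "data" even though recognizable formats are embedded further inside.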


Strings


This command shows all the human-readable strings in any binary file. Skimming the output gives a good hint whether the image is encrypted, packed, or obfuscated. If the result contains many readable and actually understandable terms, it is a strong indication that there was no attempt to hide or compress that part of the image. This technique can also expose other important data, like headers, file systems, and so on.


# strings BR-6208AC_1.30.bin
...
decompressing kernel:
done decompressing kernel.
start address: 0x%08x
...
IoxNI
mn,t`
U    YIK
...
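
For the curious, the core of what strings does can be sketched in a few lines of Python: find runs of printable ASCII characters of at least four bytes. This is a simplified sketch of the default behaviour and the function name extract_strings is my own; the sample bytes are made up and not taken from the firmware.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    # Printable ASCII (0x20-0x7e) plus tab.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Made-up sample mixing readable text with binary noise.
sample = b"\x00\x01decompressing kernel:\x00\xff\x02ab\x00start address: 0x%08x\n"
for offset, text in extract_strings(sample):
    print(f"{offset:#x}: {text}")
```

Returning the offset together with the text is handy, because it tells us where in the image an interesting string lives.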


The strings utility revealed one more interesting thing to me: the string cr73. It caught my attention because it was the first string in the output. Once I saw it, I ran the hexdump tool to get more details.


Hexdump


Although hex is very hard to read for an untrained eye, hexdump shows where in the file the strings found by the strings utility are located. I wanted to see where the string cr73 sits, as it was first in the strings output and looked like it might be a header of sorts. Hexdump showed it at offset 0, the very start of the image, which supports that guess.



# hexdump -C BR-6208AC_1.30.bin | more
00000000 |cr73.P........o.|
00000010 |.......!@.`.....|
00000020 |...........a&.|.|
00000030 |...b&1.(..@!....|
00000040 |!............ @!|
00000050 |!......!...x.. !|
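
The same kind of view can be produced with a short Python sketch, which also makes it easy to search for a string and print its offset. The output only resembles hexdump -C (the real tool adds an extra gap after the eighth byte), the function name is my own, and the sample bytes are invented for illustration.

```python
def hexdump(data: bytes, start: int = 0, width: int = 16):
    """Yield lines resembling `hexdump -C`: offset, hex bytes, ASCII column."""
    pad = width * 3 - 1  # width of a full hex column
    for pos in range(0, len(data), width):
        chunk = data[pos:pos + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as in the listing above.
        ascii_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        yield f"{start + pos:08x}  {hex_part:<{pad}}  |{ascii_part}|"


# Invented sample; only the leading "cr73" mirrors the real image.
sample = b"cr73\x00P" + bytes(26)
print(sample.find(b"cr73"))  # offset of the suspected header
for line in hexdump(sample):
    print(line)
```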


I will investigate this header in the future and report my findings in another blog post.


Entropy


Figure 1: Binary image entropy


According to Wikipedia, entropy is a measure of the information content of a message source, but it is also a measure of unpredictability. This is very useful because it lets us pinpoint the parts of the binary image which have high entropy (they might be compressed, encrypted, or obfuscated) and the parts that seem to follow some rules (lower entropy), which are either code or strings.


Binvis.io lets us visualize the layout of the whole binary image in graphical form. The screenshot shows the image entropy. Looking at it, we can observe that the data is mostly unordered, but there are two chunks that have different colors. The bigger one shows all zeros in the hexadecimal view on the right side, while the smaller one looks like ordinary binary data. My assumption is that the smaller one is the bootloader code, which needs to be confirmed later in the process.
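
If you prefer numbers over pictures, entropy is easy to compute yourself. The sketch below calculates Shannon entropy in bits per byte for fixed-size blocks; the block size of 4096 bytes and the function names are arbitrary choices of mine, not something binvis.io documents.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniformly spread bytes."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((n / total) * math.log2(n / total) for n in Counter(block).values())


def entropy_profile(data: bytes, block_size: int = 4096):
    """Yield (offset, entropy) for each consecutive block of the image."""
    for offset in range(0, len(data), block_size):
        yield offset, shannon_entropy(data[offset:offset + block_size])
```

A run of zero padding scores 0, compressed or encrypted data scores close to 8, and code or text usually lands somewhere in between.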


Binwalk


This tool doesn't come with Linux by default. It can be obtained from this site. Binwalk discovers the different file types embedded in the binary file that we are investigating. It can also be used to extract those files, although I sometimes had trouble using it directly. The ways that worked for me are described in the following chapter.


# binwalk BR-6208AC_1.30.bin 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
10264         0x2818          LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 3905452 bytes
1179648       0x120000        Squashfs filesystem, little endian, version 4.0, compression:lzma, size: 2777727 bytes, 643 inodes, blocksize: 131072 bytes, created: 1902-03-06 16:02:08
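
The general idea behind such a scan can be sketched as follows: search the image for known magic bytes, then decode the header that follows. This is only a sketch with two hand-picked signatures matching the output above, not binwalk's real rule set, and the sample image is synthetic. The 13-byte layout used for LZMA (one properties byte, a 4-byte dictionary size, and an 8-byte uncompressed size, all little endian) is the classic standalone .lzma header.

```python
import struct

# Illustrative signatures only; binwalk's real rule set is far larger.
SIGNATURES = {
    b"hsqs": "Squashfs filesystem, little endian",
    b"\x5d\x00\x00\x80\x00": "LZMA compressed data (properties 0x5D, 8 MiB dictionary)",
}


def scan(data: bytes):
    """Return sorted (offset, description) pairs for every signature hit."""
    hits = []
    for magic, description in SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, description))
            pos = data.find(magic, pos + 1)
    return sorted(hits)


def parse_lzma_header(header: bytes):
    """Split a 13-byte LZMA header into (properties, dictionary size, uncompressed size)."""
    return struct.unpack("<BIQ", header[:13])


# Synthetic image: padding, an LZMA header, filler, then a Squashfs magic.
sample = (bytes(16) + struct.pack("<BIQ", 0x5D, 8388608, 3905452)
          + b"\xff" * 32 + b"hsqs" + bytes(8))
for offset, description in scan(sample):
    print(f"{offset:<8} {offset:#x}  {description}")
```

Short signatures like these produce false positives on real images, which is one reason the real tool validates the header fields after a match.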


Extracting the firmware


A firmware image usually has at least three parts that it needs in order to work properly.



  1. Bootloader - The bootloader is the starting point of the whole boot process and prepares the most basic functions of the embedded system. In some cases, there are hidden gems added to the bootloader, like an administrator menu which can be reached through a serial port while booting.

  2. Kernel - The embedded system's kernel is responsible for running the required processes, just like in other systems.

  3. Filesystem - The filesystem contains the whole file and directory structure, similar to PC Linux systems.


Firmware Mod Kit


Firmware mod kit (FMK) is an excellent tool to disassemble, modify, and repack a firmware image. In the background, it uses the binwalk and dd Linux tools to do the job. While unpacking, it stores some intermediary data so that it can use it for repacking after the desired changes have been made. For this tutorial, I will be using only the unpacking options.
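
The dd part of that job is simple: copy a byte range out of the image into its own file. Below is a minimal Python sketch of that step. The function name carve and the output file name in the usage note are hypothetical, and this is not how FMK itself is written.

```python
def carve(src_path, dst_path, offset, size=None, chunk=1 << 20):
    """Copy `size` bytes starting at `offset` from src to dst (to end of file if size is None).

    Returns the number of bytes written.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        while size is None or written < size:
            want = chunk if size is None else min(chunk, size - written)
            buf = src.read(want)
            if not buf:  # reached end of file
                break
            dst.write(buf)
            written += len(buf)
    return written
```

With the offset and size that binwalk reported for the Squashfs filesystem, a call would look like carve("BR-6208AC_1.30.bin", "rootfs.squashfs", 1179648, 2777727).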


FMK results


The firmware mod kit successfully extracted the file system from the image, but that was pretty much it. There are three image parts available as well: header, footer, and rootfs. I'm pretty sure rootfs is the actual filesystem in its binary form. I still have to discover what header and footer image parts contain.


FMK gotchas


I encountered two gotchas while using FMK. The first one is the Python version. FMK is written to use Python 3, while the Kali Linux I'm using defaults to Python 2. The best description of the problem is in this Stack Overflow post. To solve it, I used the virtualenv tool; the solution is in this Stack Overflow post.


The second gotcha was that the firmware mod kit I downloaded uses the old binwalk script instead of the newer binwalk binary executable. The solution to this one was to go into the fmk script, find the call to ${BINWALK}, and change it to plain binwalk.


Conclusion


After the initial investigation and filesystem extraction, I have a few paths where I can continue my research and learning.



  1. I can start investigating and reverse-engineering the bootloader in an attempt to unpack other parts of the binary image which are potentially hidden from me at this time. For this task, I can start learning about reverse engineering and radare2, which I have wanted to learn for a long time.

  2. I can start investigating the extracted binaries and scripts in search of backdoors and vulnerabilities. For this task, I can start learning about emulating the binaries with QEMU.
