This post is the second one in the Windows Reversing Sessions series, in which I'm learning reverse engineering and describing the approaches and procedures I pick up along the way.
As with any reversing project, the first step is to get a feel for the app. I ran the app a couple of times and tried some random passwords, which all failed. This gave me some insight into the program: it runs from the console but displays a windowed input dialog.
I also checked the string constants in the program and found a couple of potential passwords, but none of them was valid on its own. These will prove useful later on while reversing the code.
Finding the method of interest
As always when reverse engineering a program, the first task is to locate the code of interest. Like most other debuggers, IDA immediately takes us to the program
In our case, we are dealing with a program with visual elements. This usually means that there are several levels of control:
- The main function initializes all the major visual elements. Each such element usually has a window or dialog function
- The dialog function usually contains a switch/case statement, which handles all the necessary events passed to the dialog.
- We need to find the correct case statement that leads us to the desired code. As there are usually many events triggering in each second, it is sometimes hard to know where to put the breakpoint and, if it comes to that, which condition to use.
In this program, these levels map to the following functions:
- WinMain - this is a simple method, which sets up a dialog callback
- DialogFunc - this is the callback method containing the switch/case statement with the event handling code
- sub_401080 - in the event handler callback there was only one custom method, so this was an easy thing to figure out
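The dispatch structure described above can be sketched in portable C. This is only an illustration of the switch/case pattern, not the program's actual dialog function: the message constants carry the real Win32 values, but IDC_CHECK_BUTTON, check_password_stub and the simplified dialog_func signature are assumptions made so the sketch builds without the Windows headers.

```c
#include <assert.h>
#include <stdint.h>

/* Real Win32 message values, redefined here so the sketch builds anywhere. */
#define WM_CLOSE      0x0010u
#define WM_INITDIALOG 0x0110u
#define WM_COMMAND    0x0111u

/* Hypothetical control ID; the real ID is not given in the post. */
#define IDC_CHECK_BUTTON 1001u

/* Counts how often the stand-in for sub_401080 was reached. */
static int check_calls = 0;

/* Hypothetical stand-in for the one custom method called by the handler. */
void check_password_stub(void)
{
    check_calls++;
}

/* Simplified dialog procedure: returns 1 when the message was handled. */
int dialog_func(uint32_t msg, uint32_t wparam)
{
    switch (msg) {
    case WM_INITDIALOG:
        return 1;
    case WM_COMMAND:
        /* For WM_COMMAND the low word of wParam carries the control ID. */
        if ((wparam & 0xFFFFu) == IDC_CHECK_BUTTON) {
            check_password_stub();
            return 1;
        }
        return 0;
    case WM_CLOSE:
        return 1;
    default:
        /* Most of the messages arriving every second end up here. */
        return 0;
    }
}
```

With a structure like this, a breakpoint inside the WM_COMMAND case is hit only when a control is activated, which avoids stopping on every message.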
So this is it, we will be investigating this method.
At first glance, the method lacks the usual base pointer setup in its prologue, which indicates that the arguments and local variables are going to be referenced through ESP. While this is not the usual approach when programming in assembly by hand (ESP changes with every push and pop, so it is hard to maintain the proper offsets while developing the program), it is no problem for a compiler.
After reserving some space for local variables and saving EDI so that it can be restored at the end of the function, the first part of the code initializes the string variables to which the password will be copied. There is one instruction that does not belong in this section but is placed here, probably because of a compiler optimization.
mov ecx, 18h
xor eax, eax
lea edi, [esp+68h+SecondChar] ; ESP: 0x0019F7A8 ; Offset: 0x68 ; var_63: byte offset -63 ; EDI becomes 0x0019F7AD
mov [esp+68h+PasswordText], 0
push 64h
rep stosd ; EDI is now 0x0019F80D
stosw ; EDI: 0x0019F80F
stosb ; EDI: 0x0019F810
When I first saw this code segment, the first thing I did was find out what the stos commands do and what happens when rep is used. It turns out that these instructions are used to initialize different kinds of variables. The last letter in the stos family of instructions decides how many bytes are affected:
- stosb - one byte
- stosw - word (two bytes)
- stosd - double word (four bytes)
Each of these instructions stores the value from the EAX register (only its low byte AL for stosb, and its low word AX for stosw) at the address held in the EDI register. When the store finishes, EDI is updated to point to the next location (this assumes the direction flag is clear, which is the normal case; with the flag set, EDI would decrease instead):
- stosb increases EDI by one
- stosw increases EDI by two
- stosd increases EDI by four
When stos is prefixed by the rep keyword, the ECX register defines the number of times the stos operation is executed, so it functions as a simple loop statement. If we compare this to
memset(void * ptr, int value, size_t num);
the ptr variable would be stored in EDI, the value variable would be stored in EAX, and the num variable would be stored in ECX. One difference is that ECX counts stores rather than bytes, so rep stosd with ECX = 18h clears 18h * 4 = 60h bytes.
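To make the comparison with memset concrete, here is a minimal C sketch that emulates the three stos instructions from the snippet on a plain byte array. The Regs struct and the function names are my own illustration, EDI is modelled as an offset into the array rather than a real address, and the direction flag is assumed to be clear.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Minimal register model: only what stos needs. */
typedef struct {
    uint32_t eax;   /* value to store */
    uint32_t ecx;   /* repeat count used by rep */
    uint32_t edi;   /* destination, modelled as an offset into mem[] */
} Regs;

/* Store the low `width` bytes of EAX at EDI (little endian), then
   advance EDI by `width`. Assumes the direction flag is clear. */
void stos(Regs *r, uint8_t *mem, uint32_t width)
{
    for (uint32_t i = 0; i < width; i++)
        mem[r->edi + i] = (uint8_t)(r->eax >> (8 * i));
    r->edi += width;
}

/* rep prefix: repeat the store until ECX reaches zero. */
void rep_stos(Regs *r, uint8_t *mem, uint32_t width)
{
    while (r->ecx != 0) {
        stos(r, mem, width);
        r->ecx--;
    }
}

/* Replays the snippet starting at offset `start` and returns the
   number of bytes that were cleared. */
uint32_t clear_password_tail(uint8_t *mem, uint32_t start)
{
    Regs r = { .eax = 0, .ecx = 0x18, .edi = start };
    rep_stos(&r, mem, 4);   /* rep stosd: 0x18 * 4 = 0x60 bytes */
    stos(&r, mem, 2);       /* stosw: 2 more bytes */
    stos(&r, mem, 1);       /* stosb: 1 more byte */
    return r.edi - start;
}
```

The three instructions together clear 0x60 + 2 + 1 = 99 bytes, which matches the EDI values in the comments (0x0019F7AD + 99 = 0x0019F810) and is equivalent to a single memset of 99 bytes at the same start address.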
You can read more details about these instructions here.
Getting the user input from the global object
The next thing I noticed is the use of the GetDlgItemText method. We can find the details here. Whenever there is a call to a function, the first step is to find all the changes on the stack that set up the function parameters. In most cases this is done by pushing the parameters to the stack in right-to-left order (the last parameter is pushed first), but I have seen different approaches in the past (I once saw the compiler first decrease the stack pointer to make space and then move the parameters to the proper locations).
Now it becomes obvious that the first push happens a bit early - in the previous block of code. This one sets the maximum count, the last parameter of the call we are inspecting. The second push is the out parameter, the buffer that receives the retrieved text. The third push is the dialog item ID, and the final push supplies the dialog handle, which is the first parameter.
Once executed, our local string buffer will contain user-inputted text.
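The push order can be modelled with a tiny simulated stack in C. The parameter order follows the documented GetDlgItemText(hDlg, nIDDlgItem, lpString, cchMax) signature; the Stack type and helper names are hypothetical and exist only for this sketch, and the handle, item ID and buffer address used with it are made-up values.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 16

/* Simulated stack that grows toward lower indices, like the x86 stack. */
typedef struct {
    uint32_t slots[STACK_SLOTS];
    int top;   /* index of the most recent push; STACK_SLOTS when empty */
} Stack;

void stack_init(Stack *s)
{
    s->top = STACK_SLOTS;
}

void push(Stack *s, uint32_t value)
{
    assert(s->top > 0);
    s->slots[--s->top] = value;
}

/* Argument i (0-based, left to right) as seen just before the call
   instruction pushes the return address. */
uint32_t arg(const Stack *s, int i)
{
    return s->slots[s->top + i];
}

/* Pushes the four arguments right to left: the last one goes first. */
void push_getdlgitemtext_args(Stack *s, uint32_t hdlg, uint32_t item_id,
                              uint32_t buf_addr, uint32_t max_count)
{
    push(s, max_count);   /* cchMax: the early "push 64h" */
    push(s, buf_addr);    /* lpString: the output buffer */
    push(s, item_id);     /* nIDDlgItem */
    push(s, hdlg);        /* hDlg */
}
```

After the four pushes the first parameter sits at the top of the stack, so the callee finds the arguments in their declared order at increasing addresses.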
When observing the local variables in IDA, the layout seems fine at first glance, but as soon as the user input is retrieved it becomes obvious that they are all parts of the same string. This can be considered unusual programming, as these are not separate pointers but variables that follow one another in memory and get filled all at once by a single function call. I didn't spot this during the static analysis, only once I started to actually debug the program with real input.
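One way to picture this layout is a C union that views the same 100 bytes both as one string and as a run of adjacent fields. This is a sketch of the idea and not the program's source: only PasswordText and SecondChar are names from the disassembly, while third_fourth, rest and fill_password (a stand-in for the text retrieval call) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PASSWORD_BUF_SIZE 0x64   /* the 64h pushed as the maximum count */

/* The same bytes seen as one buffer and as adjacent "variables". */
typedef union {
    char text[PASSWORD_BUF_SIZE];
    struct {
        char first_char;                    /* PasswordText in IDA */
        char second_char;                   /* SecondChar in IDA */
        char third_fourth[2];               /* hypothetical name */
        char rest[PASSWORD_BUF_SIZE - 4];   /* hypothetical name */
    } parts;
} PasswordView;

/* Stand-in for the retrieval call: copies at most size - 1 characters,
   terminates the string and returns the number of characters copied. */
size_t fill_password(PasswordView *view, const char *input)
{
    size_t n = strlen(input);
    if (n > PASSWORD_BUF_SIZE - 1)
        n = PASSWORD_BUF_SIZE - 1;
    memcpy(view->text, input, n);
    view->text[n] = '\0';
    return n;
}
```

A single call that fills text also fills every field of parts, which is exactly the effect that only became visible while debugging.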
There are a couple of steps during which the whole password is compared:
- The second character is compared first, with number 0x61.
- _strncmp is used to compare the third and fourth characters to a constant string.
- A loop is used to compare the rest of the input with a constant string. It is important to note that the loop is partially unrolled: two almost identical blocks of code follow one another, with the second one looking at the position right after the first. At the end of each iteration the pointers are increased by two. This means that the string is compared two characters at a time, which halves the number of loop iterations. As the check that decides when to stop a loop has a cost, this serves to speed things up.
- The first character is compared to number 0x45.
As practice, I will leave it to the reader to find out what the numbers mean and which constant strings are used. Better yet, fire up your favorite debugger and follow along!
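For orientation, the order of the checks can be sketched in C. The two string constants below are deliberate placeholders and not the program's real strings, the function names are hypothetical, and the empty-input guard is an addition of this sketch (the original can skip it because its buffer is zeroed beforehand). Only the values 0x61 and 0x45 and the order of the steps come from the analysis above.

```c
#include <assert.h>
#include <string.h>

/* Placeholders only: NOT the strings used by the real program. */
#define MIDDLE_PLACEHOLDER "XX"
#define TAIL_PLACEHOLDER   "placeholder"

/* Compares two strings two characters per iteration, mirroring the
   partially unrolled loop. Returns 1 when the strings are equal. */
int tail_matches(const char *a, const char *b)
{
    for (;;) {
        if (a[0] != b[0]) return 0;
        if (a[0] == '\0') return 1;
        if (a[1] != b[1]) return 0;
        if (a[1] == '\0') return 1;
        a += 2;   /* both pointers advance by two per iteration */
        b += 2;
    }
}

/* Runs the four checks in the order they appear in the disassembly. */
int password_ok(const char *pw)
{
    if (pw[0] == '\0')
        return 0;                           /* guard added for this sketch */
    if (pw[1] != 0x61)
        return 0;                           /* second character first */
    if (strncmp(pw + 2, MIDDLE_PLACEHOLDER, 2) != 0)
        return 0;                           /* third and fourth characters */
    if (!tail_matches(pw + 4, TAIL_PLACEHOLDER))
        return 0;                           /* rest of the input */
    return pw[0] == 0x45;                   /* first character last */
}
```

Because every check returns early on a mismatch, a debugger session shows which stage rejected the input, which is a handy way to recover the real constants one step at a time.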
There are several things I learned or became better at by doing this:
- I learned more about pointer arithmetic and the LEA instruction
- I learned the constructs used for variable initialization
- I learned that when variables follow each other in memory, they can be initialized all at once with a single string copy. Note that this might not work with newer compilers, as I know they sometimes reorder local variables for security purposes.
- I generally improved my understanding of assembly instructions and my ability to follow the instruction path