x86 Disassembly/Variable Examples

Example: Identify C++ Code

Can you tell what the original C++ source code looks like, in general, for the following accessor method?

 push ebp
 mov ebp, esp
 mov eax, [ecx + 8] ;THISCALL function, passes "this" pointer in ecx
 mov esp, ebp
 pop ebp
 ret

We don't know the name of the class, so we will use the generic name MyClass (or whatever you would like to call it). We will lay out a simple class definition that contains a data value at offset +8. Offset +8 is the only location accessed, so we don't know what the first 8 bytes of data look like, but we will assume (for our purposes) that our class looks like this:

 class MyClass
 {
   int value1;
   int value2;
   int value3; //offset +8
   ...
 };

We will then create our function, which I will call "GetValue3()". We know that the data value being accessed is located at [ecx+8] (which we have defined above to be "value3"). We also know that the data is read into a 4-byte register (eax, which carries the return value) and is not truncated, so we can assume that value3 is a 4-byte data value. Since ecx holds the this pointer, the element at offset +8 from that pointer is this->value3:

 int MyClass::GetValue3()
 {
   return this->value3;
 }

The this pointer is not necessary here, but I use it anyway to illustrate that the variable was accessed as an offset from the this pointer.
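The reasoning above can be checked with a small sketch. It assumes a 4-byte int (true for mainstream x86 compilers) and makes the members public so that offsetof can be applied; ReadDwordAtOffset8 and CheckGetValue3 are hypothetical helpers added only for illustration, not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed layout: three 4-byte ints, made public so offsetof can be used.
class MyClass {
public:
    int value1;
    int value2;
    int value3;  // expected at offset +8

    int GetValue3() { return this->value3; }
};

static_assert(sizeof(int) == 4, "this sketch assumes a 4-byte int");
static_assert(offsetof(MyClass, value3) == 8, "value3 should sit at offset +8");

// Hypothetical helper that mimics `mov eax, [ecx + 8]`:
// read 4 bytes located 8 bytes past the start of the object.
int ReadDwordAtOffset8(const MyClass* self) {
    int result;
    std::memcpy(&result,
                reinterpret_cast<const unsigned char*>(self) + 8,
                sizeof(result));
    return result;
}

// Hypothetical self-check: the accessor and the raw read see the same field.
void CheckGetValue3() {
    MyClass obj{10, 20, 30};
    assert(obj.GetValue3() == 30);
    assert(ReadDwordAtOffset8(&obj) == 30);
}
```

Calling CheckGetValue3() from a test program should pass silently if the assumed layout holds on the compiler in use.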

Note: Remember, we don't know what the first 8 bytes of our class actually look like. We only have a single accessor method, which accesses a single data value at offset +8. The class could also have looked like this:

 class MyClass /*Alternate Definition*/
 {
   byte byte1;
   byte byte2;
   short short1;
   long value2;
   long value3; //offset +8
   ...
 };

Or any other combination of fields that occupies the first 8 bytes.
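As a rough check that both guesses are indistinguishable from the accessor's point of view, the sketch below compares the two layouts. The struct names IntLayout and AltLayout are hypothetical, and fixed-width types stand in for byte, short and long, because the size of long differs between platforms.

```cpp
#include <cstddef>
#include <cstdint>

// First guess: three 4-byte integers.
struct IntLayout {
    std::int32_t value1;  // offset 0
    std::int32_t value2;  // offset 4
    std::int32_t value3;  // offset 8
};

// Alternate guess: two bytes, a short, then two 4-byte integers.
struct AltLayout {
    std::uint8_t byte1;   // offset 0
    std::uint8_t byte2;   // offset 1
    std::int16_t short1;  // offset 2
    std::int32_t value2;  // offset 4
    std::int32_t value3;  // offset 8
};

// Both layouts put value3 at the same place, so `mov eax, [ecx + 8]`
// cannot tell them apart.
static_assert(offsetof(IntLayout, value3) == 8, "int layout: value3 at +8");
static_assert(offsetof(AltLayout, value3) == 8, "alternate layout: value3 at +8");
```

Only code that touches the first 8 bytes (another accessor, a constructor, and so on) would let us choose between such candidates.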

Example: Identify C++ Code

Can you tell what the original C++ source code looks like, in general, for the following setter method?

 push ebp
 mov ebp, esp
 cmp [ebp + 8], 0
 je error
 mov eax, [ebp + 8]
 mov [ecx + 0], eax
 mov eax, 1
 jmp end
 error:
 mov eax, 0
 end:
 mov esp, ebp
 pop ebp
 ret

This code looks a little complicated, but don't panic! We will walk through it slowly. The first two lines of code set up the stack frame:

 push ebp
 mov ebp, esp

The next two lines of code compare the value at [ebp + 8] (which we know to be the first parameter) to zero. If it is zero, the function jumps to the label "error", which sets eax to 0 and returns. We haven't seen it before, but this looks conspicuously like an if statement: "If the parameter is zero, return zero".

If, on the other hand, the parameter is not zero, we move the value into eax, and then move eax into [ecx + 0], which we know is the first data field in MyClass. We also see from this code that the first data field must be 4 bytes long (because we are using eax). After we move eax into [ecx + 0], we set eax to 1 and jump to the end of the function.
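Before tidying anything up, the walk-through so far can be transcribed almost instruction by instruction. The sketch below is only an illustration: SetValue1Literal is a hypothetical free function whose explicit self parameter stands in for ecx, the local variable eax stands in for the register, and the three-int layout of MyClass is the assumption made in the first example.

```cpp
#include <cassert>

// Assumed layout from the first example (a struct, so fields are public).
struct MyClass {
    int value1;  // offset +0
    int value2;
    int value3;
};

// Hypothetical literal transcription of the listing; `self` plays the role
// of ecx and `val` the role of [ebp + 8].
int SetValue1Literal(MyClass* self, int val) {
    int eax;
    if (val == 0) goto error;  // cmp [ebp + 8], 0 / je error
    eax = val;                 // mov eax, [ebp + 8]
    self->value1 = eax;        // mov [ecx + 0], eax
    eax = 1;                   // mov eax, 1
    goto end;                  // jmp end
error:
    eax = 0;                   // mov eax, 0
end:
    return eax;                // value left in eax is the return value
}

// Hypothetical self-check of both paths through the function.
void CheckSetValue1Literal() {
    MyClass obj{5, 0, 0};
    assert(SetValue1Literal(&obj, 0) == 0);  // zero is rejected
    assert(obj.value1 == 5);                 // field left unchanged
    assert(SetValue1Literal(&obj, 42) == 1); // non-zero is stored
    assert(obj.value1 == 42);
}
```

The gotos mirror the jumps in the listing one for one; replacing them with structured control flow gives a much more natural function.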

If we use the same MyClass definition as in the first example above, we can get the following code for our function, "SetValue1(int val)":

 int MyClass::SetValue1(int val)
 {
   if(val == 0) return 0;
   this->value1 = val;
   return 1;
 }

Notice that since the function returns 0 on failure and 1 on success, it looks like it has a bool return value. We can't be sure, however: the return value here is 4 bytes wide (eax is used), while the size of a bool is implementation-defined. A bool usually has a size of 1 byte, but a true/false result is often stored and returned as an int (the Windows BOOL type, for example, is a typedef for int).
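To make the ambiguity concrete, here is a sketch of both candidate signatures side by side. SetValue1Bool and CheckSetters are hypothetical names introduced for illustration, and the class layout is the one assumed in the first example. From the caller's side the two behave the same; only the declared width of the return type differs, which is why the disassembly alone does not settle the question.

```cpp
#include <cassert>

class MyClass {
public:
    int value1;
    int value2;
    int value3;

    // Candidate 1: int return value, as reconstructed in the text.
    int SetValue1(int val) {
        if (val == 0) return 0;
        this->value1 = val;
        return 1;
    }

    // Candidate 2 (hypothetical): the same logic with a bool return value.
    bool SetValue1Bool(int val) {
        if (val == 0) return false;
        this->value1 = val;
        return true;
    }
};

// Hypothetical self-check: both variants reject zero and store other values.
void CheckSetters() {
    MyClass obj{0, 0, 0};
    assert(obj.SetValue1(0) == 0);
    assert(obj.SetValue1(7) == 1 && obj.value1 == 7);
    assert(obj.SetValue1Bool(0) == false);
    assert(obj.SetValue1Bool(9) == true && obj.value1 == 9);
}
```

Comparing the compiler's output for the two variants, where a compiler is available, is one way to see which form matches a given listing more closely.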