In this post our goal should be to learn how to recognize different type of variables, structs and operations in a disassembly code.
For to do the disassemble I’ll work with Radare, that is a great tool to reversing engineering and for practicality to use command line structured. But if you prefer you can use other tools like Cutter, x64dbg, objdump or other.
The idea here, is comparing a C code and show how we can assimilate the assembly code with it.
Global x Local Variable
In this section, we’ll cover how local and global variables are converted. Pay attention in:
- ‘x’ as a global variable,
- ‘y’ as a constant global variable and
- ‘total’ as a local variable.
I`ll always start each section with a piece of code that we’ll use as base to analyse de assembly code.
#include <stdio.h>
int x = 1;
const int y = 2;
void main(){
int total = 0;
total = x + y;
printf("Total = %d\n", total);
}
Decompiling with Radare.
[0x00001060]> s main
[0x00001149]> pd 17
;-- main:
0x00001149 f30f1efa endbr64
0x0000114d 55 push rbp
0x0000114e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001151 4883ec10 sub rsp, 0x10
0x00001155 c745fc000000. mov dword [rbp - 4], 0
0x0000115c 8b05ae2e0000 mov eax, dword [obj.x] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
; [0x4010:4]=1
0x00001162 ba02000000 mov edx, 2
0x00001167 01d0 add eax, edx
0x00001169 8945fc mov dword [rbp - 4], eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000116c 8b45fc mov eax, dword [rbp - 4] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000116f 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001171 488d3d900e00. lea rdi, str.Total___d_n ; 0x2008 ; "Total = %d\n"
0x00001178 b800000000 mov eax, 0
0x0000117d e8cefeffff call sym.imp.printf
0x00001182 90 nop
0x00001183 c9 leave
0x00001184 c3 ret
Before starting comparing, it’s important to remember how functions are initialized in a stack, in our case the function main. So, pay attention in this stretch of code:
push rbp
mov rbp, rsp
sub rsp, 0x10
That’s how the stack frame is created for this function, we start with the base pointer at the same place of the stack pointer, so the top and bottom of the stack are the same. And after that are allocated 0x10 (16) bytes / positions in the stack that will be used by the function.
Convertion table
C | Disassembly Code |
---|---|
int x = 1; | mov eax, dword [obj.x] |
cont int y = 2; | mov edx, 2 |
int total = 0; | mov dword [rbp - 4], 0 |
total = x + y; | add eax, edx mov dword [rbp - 4], eax |
printf(“Total = %d\n”, total); | mov esi, eax lea rdi, str.Total___d_n mov eax, 0 call sym.imp.printf |
With this table, we can verify that the global variables are referenced by memory address, and the local variable are referenced by the stack addresses.
Variable type | Disassembly view |
---|---|
Global | mov eax, dword [obj.x] |
Global constant | mov edx, 2 |
Local | mov dword [rbp - 4], 0 |
Notice how ‘total’ variable is initialized, the number 4 is used because each address is 4 bytes in size because we are working with an int (sizeof(int) == 4).
The ESI and EDI registers are used in buffer data manipulation operations. ESI is the source index register, and EDI is the destination index register. If ECX appears, it is used as the counting variable.
The stack is used to store local variables, functions parameters, and the return address. Two registers are commonly used, ESP and EBP. ESP is the stack pointer and typically contains a memory address that points to the top of stack. It’s value will change as items are pushed on and popped off the stack. The EBP is the base pointer that stays consistent within a given function, so that the program can use it as a placeholder to keep track of the location of a local variable and parameter. You will find it always as a base for local variables as [rbp - 4].
It’s possible to read data from the stack without using the push or pop instructions. For example, the instruction mov eax, ss:[esp] will directly access the top of the stack, because the ss is not exactly a register but a stack segment (ss).
With you find some of these register here, you can get some hits of the use if you remember the meaning of them.
- CS = Code Segment (used for IP)
- DS = Data Segment (used for MOV)
- ES = Destination Segment (used for MOVS, etc.)
- SS = Stack Segment (used for SP)
Arithmetic Operations
Our C code will cover some basic operation of mathematics that are really common to find.
void main(){
const int a = 2;
int b = 3;
int total = 0;
b++;
total = a + b;
b--;
total = b - a;
total = a/b;
b = a%b;
total = a*b;
}
Decompiling with Radare.
[0x00001129]> pd 26
;-- main:
0x00001129 f30f1efa endbr64
0x0000112d 55 push rbp
0x0000112e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001131 c745f4020000. mov dword [rbp - 0xc], 2
0x00001138 c745f8030000. mov dword [rbp - 8], 3
0x0000113f c745fc000000. mov dword [rbp - 4], 0
0x00001146 8345f801 add dword [rbp - 8], 1
0x0000114a 8b55f4 mov edx, dword [rbp - 0xc] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000114d 8b45f8 mov eax, dword [rbp - 8] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001150 01d0 add eax, edx
0x00001152 8945fc mov dword [rbp - 4], eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001155 8b45f8 mov eax, dword [rbp - 8] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001158 2b45f4 sub eax, dword [rbp - 0xc]
0x0000115b 8945fc mov dword [rbp - 4], eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000115e 8b45f4 mov eax, dword [rbp - 0xc] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001161 99 cdq
0x00001162 f77df8 idiv dword [rbp - 8]
0x00001165 8945f8 mov dword [rbp - 8], eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001168 836df801 sub dword [rbp - 8], 1
0x0000116c 8b45f4 mov eax, dword [rbp - 0xc] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000116f 99 cdq
0x00001170 f77df8 idiv dword [rbp - 8]
0x00001173 8955f8 mov dword [rbp - 8], edx ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001176 8b45f4 mov eax, dword [rbp - 0xc] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001179 0faf45f8 imul eax, dword [rbp - 8]
0x0000117d 8945fc mov dword [rbp - 4], eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001180 90 nop
0x00001181 5d pop rbp
0x00001182 c3 ret
As we expected, we start with 3 local variables in stack that are declared as:
mov dword [rbp - 0xc], 2
mov dword [rbp - 8], 3
mov dword [rbp - 4], 0
Remember the multiples of 4 are used because all of our variable have 4 bytes in size, that is, they are integers.
Operations in C | Disassembly Code |
---|---|
b++; | add dword [rbp - 8], 1 |
total = a + b; | mov edx, dword [rbp - 0xc] mov eax, dword [rbp - 8] add eax, edx mov dword [rbp - 4], eax |
total = b - a; | sub eax, dword [rbp - 0xc] mov dword [rbp - 4], eax |
b=a/b; | mov eax, dword [rbp - 0xc] cdq idiv dword [rbp - 8] mov dword [rbp - 8], eax |
b–; | sub dword [rbp - 8], 1 |
b = a%b; | mov eax, dword [rbp - 0xc] cdq idiv dword [rbp - 8] mov dword [rbp - 8], edx |
total = a*b; | mov eax, dword [rbp - 0xc] imul eax, dword [rbp - 8] mov dword [rbp - 4], eax |
The mul value instruction always multiples EAX by value. The result is stored as a 64-bit value across the register EAX. But consider a environment of 32 bits and you would like to save a variable with 64 bits, so you’ll nedd two registers, EDX:EAX. EDX stores the most significant 64 bits and EAX the stores the least significant 64 bits.
When performing the div or idiv instruction, you are dividing EDX:EAX by the value and storing the result in EAX and the remainder in EDX.
Functions
Functions are one kind of scope that have the parameters and variables allocated in a stack to do local operations. This block of data is called stack frame, that count with respectively this item pushed into the stack: arguments, return address, EBP (Base Pointer), local variables and ESP (Stack Pointer), this last one always point to the top item in the stack.
Stack Struct |
Our base code written in C.
int sum(int value1, int value2){
return value1+value2;
}
void main(){
int value = sum(2,3);
printf("Value = %d", value);
}
Decompiling with Radare.
[0x00001161]> pd 16
;-- main:
0x00001161 f30f1efa endbr64
0x00001165 55 push rbp
0x00001166 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001169 4883ec10 sub rsp, 0x10
0x0000116d be03000000 mov esi, 3
0x00001172 bf02000000 mov edi, 2
0x00001177 e8cdffffff call sym.sum
0x0000117c 8945fc mov dword [rbp - 4], eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000117f 8b45fc mov eax, dword [rbp - 4] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001182 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001184 488d3d790e00. lea rdi, str.Value___d ; 0x2004 ; "Value = %d"
0x0000118b b800000000 mov eax, 0
0x00001190 e8bbfeffff call sym.imp.printf
0x00001195 90 nop
0x00001196 c9 leave
0x00001197 c3 ret
Operations in C | Disassembly Code |
---|---|
int value = sum(2,3); | mov esi, 3 mov edi, 2 call sym.sum mov dword [rbp - 4], eax |
printf(“Value = %d”, value); | mov eax, dword [rbp - 4] mov esi, eax lea rdi, str.Value___d mov eax, 0 call sym.imp.printf |
The parameters are pushed onto the stack from right to left, the stack is cleans up when the function is complete, and the return value is stored in EAX.
Adding data to Stack |
If Statements
To identify if statements, basically you’ll need to understand this kind of code below.
cmp dst, src
jmp address
The cmp instruction is identical to the sub instruction, however the variables are not affected. The cmp instruction only set the flags: Zero Flag (ZF) and Carry Flag (CF). The result of this, is the base of operations in the if conditionals.
cmp operations | ZF | CF |
---|---|---|
dst = src | 1 | 0 |
dst < src | 0 | 1 |
dst > src | 0 | 0 |
Our base code written in C.
void main(){
const int a = 2;
int b = 3;
if (a == b) {
puts("a and b are equal\n");
if(a > b) {
puts("a > b");
} else {
puts("b > a");
}
} else {
puts("a and b are different");
}
}
Decompiling with Radare.
[0x00001149]> pd 25
;-- main:
0x00001149 f30f1efa endbr64
0x0000114d 55 push rbp
0x0000114e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001151 4883ec10 sub rsp, 0x10
0x00001155 c745f8020000. mov dword [rbp - 8], 2
0x0000115c c745fc030000. mov dword [rbp - 4], 3
0x00001163 8b45f8 mov eax, dword [rbp - 8] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001166 3b45fc cmp eax, dword [rbp - 4]
┌─< 0x00001169 750e jne 0x1179
│ 0x0000116b 488d3d920e00. lea rdi, str.a_and_b_are_equal_n ; 0x2004 ; "a and b are equal\n"
│ 0x00001172 e8d9feffff call sym.imp.puts
┌──< 0x00001177 eb2e jmp 0x11a7
│└─> 0x00001179 488d3d970e00. lea rdi, str.a_and_b_are_different ; 0x2017 ; "a and b are different"
│ 0x00001180 e8cbfeffff call sym.imp.puts
│ 0x00001185 8b45f8 mov eax, dword [rbp - 8] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
│ 0x00001188 3b45fc cmp eax, dword [rbp - 4]
│┌─< 0x0000118b 7e0e jle 0x119b
││ 0x0000118d 488d3d990e00. lea rdi, str.a___b ; 0x202d ; "a > b"
││ 0x00001194 e8b7feffff call sym.imp.puts
┌───< 0x00001199 eb0c jmp 0x11a7
││└─> 0x0000119b 488d3d910e00. lea rdi, str.b___a ; 0x2033 ; "b > a"
││ 0x000011a2 e8a9feffff call sym.imp.puts
└└──> 0x000011a7 90 nop
0x000011a8 c9 leave
0x000011a9 c3 ret
The local variables
mov dword [rbp - 8], 2
mov dword [rbp - 4], 3
Operations in C | Disassembly Code |
---|---|
if (a == b) | mov eax, dword [rbp - 8] cmp eax, dword [rbp - 4] jne 0x1179 |
puts(“a and b are equal\n”); | lea rdi, str.a_and_b_are_equal_n call sym.imp.puts |
puts(“a and b are different”); | lea rdi, str.a_and_b_are_different call sym.imp.puts |
if(a > b) | mov eax, dword [rbp - 8] cmp eax, dword [rbp - 4] jle 0x119b |
puts(“a > b”); | lea rdi, str.a___b call sym.imp.puts |
puts(“b > a”); | lea rdi, str.b___a call sym.imp.puts |
Common Instruction | Description |
---|---|
jne | Jump if ZF is not set, that is ZF=0 |
jle | Jump after a cmp if the destination value is less than the source value |
lea | “Load effective address”, is used to put a memory address into the destination |
jz or je | ZF = 1 |
jnz or jne | ZF=0 |
Loops
In this section you’ll discover how practically not exist difference between kinds of loops, all have group of conditions and jump address.
For
Our base code written in C.
void main(){
for (int i = 0; i < 10; i++)
{
printf("%d\n", i);
}
}
Decompiling with Radare.
[0x00001149]> pd 18
;-- main:
0x00001149 f30f1efa endbr64
0x0000114d 55 push rbp
0x0000114e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001151 4883ec10 sub rsp, 0x10
0x00001155 c745fc000000. mov dword [rbp - 4], 0
┌─< 0x0000115c eb1a jmp 0x1178
┌──> 0x0000115e 8b45fc mov eax, dword [rbp - 4] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
╎│ 0x00001161 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
╎│ 0x00001163 488d3d9a0e00. lea rdi, [0x00002004] ; "%d\n"
╎│ 0x0000116a b800000000 mov eax, 0
╎│ 0x0000116f e8dcfeffff call sym.imp.printf
╎│ 0x00001174 8345fc01 add dword [rbp - 4], 1
╎└─> 0x00001178 837dfc09 cmp dword [rbp - 4], 9
└──< 0x0000117c 7ee0 jle 0x115e
0x0000117e 90 nop
0x0000117f 90 nop
0x00001180 c9 leave
0x00001181 c3 ret
Operations in C | Disassembly Code |
---|---|
for (int i = 0; i < 10; i++) | mov dword [rbp - 4], 0 jmp 0x1178 cmp dword [rbp - 4], 9 jle 0x115e mov eax, dword [rbp - 4] add dword [rbp - 4], 1 |
printf(“%d\n”, i); | mov esi, eax lea rdi, [0x00002004] mov eax, 0 call sym.imp.printf |
While
Our base code written in C.
void main(){
int i = 10;
while(i > 0) {
i--;
}
}
Decompiling with Radare.
[0x00001129]> pd 12
;-- main:
0x00001129 f30f1efa endbr64
0x0000112d 55 push rbp
0x0000112e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001131 c745fc0a0000. mov dword [rbp - 4], 0xa
┌─< 0x00001138 eb04 jmp 0x113e
┌──> 0x0000113a 836dfc01 sub dword [rbp - 4], 1
╎└─> 0x0000113e 837dfc00 cmp dword [rbp - 4], 0
└──< 0x00001142 7ff6 jg 0x113a
0x00001144 90 nop
0x00001145 90 nop
0x00001146 5d pop rbp
0x00001147 c3 ret
Operations in C | Disassembly Code |
---|---|
int i = 10; | mov dword [rbp - 4], 0xa |
while(i > 0) | jmp 0x113e cmp dword [rbp - 4], 0 jg 0x113a |
i–; | sub dword [rbp - 4], 1 |
Switch Case
From disassembly, it’s almost impossible, to know whether the original code was a switch statement or a sequence of if and else.
void main(){
int value = 0;
switch (value)
{
case 1:
case 2:
printf("Option 1: %d",value);
break;
case 3:
printf("Option 2: %d",value);
break;
default:
printf("Default: %d",value);
break;
}
}
Decompiling with Radare.
[0x00001149]> pd 34
;-- main:
0x00001149 f30f1efa endbr64
0x0000114d 55 push rbp
0x0000114e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001151 4883ec10 sub rsp, 0x10
0x00001155 c745fc000000. mov dword [rbp - 4], 0
0x0000115c 837dfc02 cmp dword [rbp - 4], 2
┌─< 0x00001160 7f08 jg 0x116a
│ 0x00001162 837dfc00 cmp dword [rbp - 4], 0
┌──< 0x00001166 7f0a jg 0x1172
┌───< 0x00001168 eb38 jmp 0x11a2
││└─> 0x0000116a 837dfc03 cmp dword [rbp - 4], 3
││┌─< 0x0000116e 741a je 0x118a
┌────< 0x00001170 eb30 jmp 0x11a2
││└──> 0x00001172 8b45fc mov eax, dword [rbp - 4] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
││ │ 0x00001175 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
││ │ 0x00001177 488d3d860e00. lea rdi, str.Option_1:__d ; 0x2004 ; "Option 1: %d"
││ │ 0x0000117e b800000000 mov eax, 0
││ │ 0x00001183 e8c8feffff call sym.imp.printf
││┌──< 0x00001188 eb2f jmp 0x11b9
│││└─> 0x0000118a 8b45fc mov eax, dword [rbp - 4] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
│││ 0x0000118d 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
│││ 0x0000118f 488d3d7b0e00. lea rdi, str.Option_2:__d ; 0x2011 ; "Option 2: %d"
│││ 0x00001196 b800000000 mov eax, 0
│││ 0x0000119b e8b0feffff call sym.imp.printf
│││┌─< 0x000011a0 eb17 jmp 0x11b9
└└───> 0x000011a2 8b45fc mov eax, dword [rbp - 4] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
││ 0x000011a5 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
││ 0x000011a7 488d3d700e00. lea rdi, str.Default:__d ; 0x201e ; "Default: %d"
││ 0x000011ae b800000000 mov eax, 0
││ 0x000011b3 e898feffff call sym.imp.printf
││ 0x000011b8 90 nop
└└─> 0x000011b9 90 nop
0x000011ba c9 leave
0x000011bb c3 ret
Operations in C | Disassembly Code |
---|---|
int value = 0; | mov dword [rbp - 4], 0 |
case 2: | cmp dword [rbp - 4], 2 jg 0x116a |
case 3: | cmp dword [rbp - 4], 3 |
printf(“Default: %d”,value); | mov eax, dword [rbp - 4] mov esi, eax lea rdi, str.Default:__d mov eax, 0 call sym.imp.printf |
Arrays
In assembly, arrays are accessed using a base address as starting point. The size of each item isn’t explicit, but we can discovery seeing how the array is being indexed.
void main(){
int ip_address[] = {127,0,0,1};
for (int i = 0; i < 4; i++)
{
printf("%d\n", ip_address[i]);
}
}
Decompiling with Radare.
[0x00001169]> pd 30
;-- main:
0x00001169 f30f1efa endbr64
0x0000116d 55 push rbp
0x0000116e 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001171 4883ec30 sub rsp, 0x30
0x00001175 64488b042528. mov rax, qword fs:[0x28] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x0000117e 488945f8 mov qword [rbp - 8], rax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x00001182 31c0 xor eax, eax
0x00001184 c745e07f0000. mov dword [rbp - 0x20], 0x7f ; '\x7f'
0x0000118b c745e4000000. mov dword [rbp - 0x1c], 0
0x00001192 c745e8000000. mov dword [rbp - 0x18], 0
0x00001199 c745ec010000. mov dword [rbp - 0x14], 1
0x000011a0 c745dc000000. mov dword [rbp - 0x24], 0
┌─< 0x000011a7 eb20 jmp 0x11c9
┌──> 0x000011a9 8b45dc mov eax, dword [rbp - 0x24] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
╎│ 0x000011ac 4898 cdqe
╎│ 0x000011ae 8b4485e0 mov eax, dword [rbp + rax*4 - 0x20] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
╎│ 0x000011b2 89c6 mov esi, eax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
╎│ 0x000011b4 488d3d490e00. lea rdi, [0x00002004] ; "%d\n"
╎│ 0x000011bb b800000000 mov eax, 0
╎│ 0x000011c0 e8abfeffff call sym.imp.printf
╎│ 0x000011c5 8345dc01 add dword [rbp - 0x24], 1
╎└─> 0x000011c9 837ddc03 cmp dword [rbp - 0x24], 3
└──< 0x000011cd 7eda jle 0x11a9
0x000011cf 90 nop
0x000011d0 488b45f8 mov rax, qword [rbp - 8] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011d4 644833042528. xor rax, qword fs:[0x28]
┌─< 0x000011dd 7405 je 0x11e4
│ 0x000011df e87cfeffff call sym.imp.__stack_chk_fail
└─> 0x000011e4 c9 leave
0x000011e5 c3 ret
Operations in C | Disassembly Code |
---|---|
int ip_address[] = {127,0,0,1}; | mov rax, qword fs:[0x28] mov qword [rbp - 8], rax xor eax, eax mov dword [rbp - 0x20], 0x7f mov dword [rbp - 0x1c], 0 mov dword [rbp - 0x18], 0 mov dword [rbp - 0x14], 1 |
for (int i = 0; i < 4; i++) | mov dword [rbp - 0x24], 0 jmp 0x11c9 cmp dword [rbp - 0x24], 3 jle 0x11a9 add dword [rbp - 0x24], 1 |
printf(“%d\n”, ip_address[i]); | mov eax, dword [rbp - 0x24] cdqe mov eax, dword [rbp + rax*4 - 0x20] mov esi, eax lea rdi, [0x00002004] mov eax, 0 call sym.imp.printf |
Struct
Structures are accessed with a base address used as a starting pointer.
struct book {
int unit;
char* title;
double price;
};
void to_string(struct book* b){
printf("Title: %s\nPrice: %f\nUnit: %d\n", b->title, b->price, b->unit);
}
void main(){
struct book math;
math.price = 50.5;
math.title = "Math for dummies";
math.unit = 5;
to_string(&math);
}
Decompiling with Radare.
[0x000011ab]> pd 22
;-- main:
0x000011ab f30f1efa endbr64
0x000011af 55 push rbp
0x000011b0 4889e5 mov rbp, rsp ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011b3 4883ec20 sub rsp, 0x20
0x000011b7 64488b042528. mov rax, qword fs:[0x28] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011c0 488945f8 mov qword [rbp - 8], rax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011c4 31c0 xor eax, eax
0x000011c6 f20f10056a0e. movsd xmm0, qword [0x00002038] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
; [0x2038:8]=0x4049400000000000
0x000011ce f20f1145f0 movsd qword [rbp - 0x10], xmm0 ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011d3 488d054c0e00. lea rax, str.Math_for_dummies ; 0x2026 ; "Math for dummies"
0x000011da 488945e8 mov qword [rbp - 0x18], rax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011de c745e0050000. mov dword [rbp - 0x20], 5
0x000011e5 488d45e0 lea rax, [rbp - 0x20]
0x000011e9 4889c7 mov rdi, rax ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011ec e878ffffff call sym.to_string
0x000011f1 90 nop
0x000011f2 488b45f8 mov rax, qword [rbp - 8] ; '\xff\xff\xff\xff\xff\xff\xff\xff'
0x000011f6 644833042528. xor rax, qword fs:[0x28]
┌─< 0x000011ff 7405 je 0x1206
│ 0x00001201 e85afeffff call sym.imp.__stack_chk_fail
└─> 0x00001206 c9 leave
0x00001207 c3 ret
It’s important to know that in this struct book, the size allocate for char* is the size of a address in your system 32 or 64 bits. So, if we wolud calculate the size needed for it is: int (4 bytes) + char* (8 bytes) + double (8 bytes) = 20 bytes.
Operations in C | Disassembly Code |
---|---|
math.price = 50.5; | movsd xmm0, qword [0x00002038] movsd qword [var_10h], xmm0 |
math.title = “Math for dummies”; | lea rax, str.Math_for_dummies mov qword [var_18h], rax |
math.unit = 5; | mov dword [var_20h], 5 |
to_string(&math); | lea rax, [var_20h] mov rdi, rax call sym.to_string |
Conclusion
In this post we learn how the variable, functions, and operations could be “translated” from C language to assembly. This knowledge is very useful to start to understand binary code and work with reverse engineering.