While other programming languages are also susceptible to buffer overflows, C and C++ are the languages most often associated with this class of vulnerability, because neither performs automatic bounds checking on arrays. In this article, we’ll explore some of the causes of buffer overflows and how someone can abuse them to take control of a vulnerable program.
What is buffer overflow?
A buffer overflow is a class of vulnerability that typically arises from the use of functions that do not perform bounds checking. Put simply, it occurs when more data is written into a fixed-length buffer than the buffer can hold. This is best explained with an example, so consider the following program.
#include <string.h>
void vuln_func(char *input);
int main(int argc, char *argv[])
{
    /* Pass the first command-line argument to the vulnerable function. */
    if (argc > 1)
        vuln_func(argv[1]);
}
void vuln_func(char *input)
{
    char buffer[256];
    /* strcpy copies up to the terminating null byte, whatever the size of buffer. */
    strcpy(buffer, input);
}
This is a simple C program that is vulnerable to a buffer overflow. The function vuln_func receives the first command-line argument through its parameter input and copies it into buffer, a character array with a length of 256. The copy is performed with the strcpy function, which does no bounds checking. An input of 256 characters or more therefore writes past the end of buffer, and a buffer overflow occurs. Because buffer lives on the stack, the overflow can reach the saved return address of the function; if we can overwrite that address, we can control the flow of the entire program. That is why this is called a stack-based buffer overflow.
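For contrast, here is a minimal sketch of what a bounds-checked version of the same copy could look like. The names copy_checked and safe_func are hypothetical and are not part of the original program; this is one possible approach, not the only way to avoid the overflow.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `input` into `dst`, which has room for
 * `dst_size` bytes. Inputs that do not fit (including the terminating
 * null byte) are rejected instead of truncated.
 * Returns 0 on success, -1 if the input is too long. */
int copy_checked(char *dst, size_t dst_size, const char *input)
{
    size_t len = strlen(input);

    if (dst_size == 0 || len >= dst_size)
        return -1;                 /* would overflow: refuse the copy */

    memcpy(dst, input, len + 1);   /* len + 1 includes the null byte */
    return 0;
}

/* Bounds-checked counterpart of vuln_func. Returns 0 if the input fit. */
int safe_func(const char *input)
{
    char buffer[256];

    return copy_checked(buffer, sizeof buffer, input);
}
```

With this check in place, an input of 300 characters is rejected before any byte is written outside buffer.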
Types of buffer overflow
We have just discussed an example of stack-based buffer overflow. However, a buffer overflow is not limited to the stack. The following are some of the common buffer overflow types.
Stack-based buffer overflow
When the buffer being overflowed is stored on the stack, the vulnerability is referred to as a stack-based buffer overflow. As mentioned earlier, a stack-based buffer overflow can be exploited by overwriting the return address of a function on the stack.
Heap-based buffer overflow
When the buffer being overflowed is stored in the heap data area, the vulnerability is referred to as a heap-based buffer overflow. Heap overflows are generally harder to exploit than stack overflows. Successful exploitation depends on various factors, because there is no return address to overwrite as there is with the stack-based technique. Instead, the overflowing input typically overwrites other data on the heap to manipulate the program’s state in an unexpected manner.
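To illustrate how overwriting neighboring heap data can change program behavior, here is a small, self-contained sketch. The struct account, its fields and the function name are hypothetical. The sketch also assumes that is_admin is placed directly after name (at offset 16) and that int is 4 bytes wide, which is typical on mainstream platforms but not guaranteed by the C standard. The write is deliberately clamped to the size of the allocation so that the demo itself stays well defined.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap object: a fixed-size name followed by a flag. */
struct account {
    char name[16];
    int  is_admin;
};

/* Simulate an unchecked copy of `len` 'A' bytes that starts at the name
 * field of a heap-allocated account. The length is clamped to the size
 * of the allocation so this demo never leaves its own object; a real
 * overflow has no such clamp.
 * Returns the resulting value of is_admin, or -1 if malloc fails. */
int simulate_heap_overwrite(size_t len)
{
    struct account *acct = malloc(sizeof *acct);
    int flag;

    if (acct == NULL)
        return -1;

    memset(acct, 0, sizeof *acct);      /* start with is_admin == 0 */
    if (len > sizeof *acct)
        len = sizeof *acct;
    memset(acct, 'A', len);             /* bytes past name[15] land in is_admin */

    flag = acct->is_admin;
    free(acct);
    return flag;
}
```

Under the stated assumptions, writing 16 bytes leaves is_admin at 0, while writing 20 bytes turns it into 0x41414141, a non-zero value that a later check such as if (acct->is_admin) would treat as true.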
Understanding debuggers
Understanding how to use debuggers is a crucial part of exploiting buffer overflows. When writing buffer overflow exploits, we often need to understand the stack layout, memory maps, instruction mnemonics, CPU registers and so on. A debugger lets us inspect these details while the program runs or after it has crashed. On Windows, OllyDbg and Immunity Debugger are freely available debuggers. On Linux, the GNU Debugger (GDB) is the most commonly used debugger.
Exploit mitigation techniques
To be able to exploit a buffer overflow vulnerability on a modern operating system, we often need to deal with various exploit mitigation techniques such as stack canaries, data execution prevention, address space layout randomization and more. To keep it simple, let’s proceed with disabling all these protections. For the purposes of understanding buffer overflow basics, let’s look at a stack-based buffer overflow.
Crashing and analyzing core dumps
In this section, let’s explore how one can crash the vulnerable program to be able to write an exploit later. The following makefile can be used to compile this program with all the exploit mitigation techniques disabled in the binary. We are simply using gcc, passing the program vulnerable.c as input and producing the binary vulnerable as output.
clean:
	rm vulnerable
Let’s disable ASLR by writing the value 0 into the file /proc/sys/kernel/randomize_va_space.
Now we are fully ready to exploit this vulnerable program. Let’s compile it and produce the executable binary. To do this, run the command make, which should create a new binary in the current directory. Let’s run the file command against the binary and observe the details: it is a 64-bit ELF binary.
Let’s run the binary with an argument. Nothing happens. This is intentional: the program does nothing apart from taking its input and copying it into another variable using the strcpy function.
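Before moving on, it can be worth double-checking that the ASLR setting mentioned above has taken effect. The following is a minimal sketch of such a check in C; the function names are hypothetical, and the meanings of the values 0, 1 and 2 follow the Linux kernel documentation for randomize_va_space.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse the text read from an ASLR control file.
 * Returns 0, 1 or 2, or -1 if the text is not one of those values. */
int parse_aslr_value(const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);

    if (end == text || value < 0 || value > 2)
        return -1;
    return (int)value;
}

/* Read the setting from `path`, for example
 * /proc/sys/kernel/randomize_va_space. Returns -1 on any failure. */
int read_aslr_setting(const char *path)
{
    char line[32];
    FILE *f = fopen(path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) != NULL)
        value = parse_aslr_value(line);
    fclose(f);
    return value;
}

/* Short human-readable description of each setting. */
const char *describe_aslr(int value)
{
    switch (value) {
    case 0:  return "disabled";
    case 1:  return "stack, mmap base and VDSO randomized";
    case 2:  return "as 1, plus heap randomization";
    default: return "unknown";
    }
}
```

Calling read_aslr_setting("/proc/sys/kernel/randomize_va_space") should return 0 once ASLR has been disabled as described.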
Crashing the program
Now let’s see how we can crash this application. We’re going to create a simple Perl program that we can also use as a template for the rest of the exploit.
Let’s create a file called exploit1.pl that defines a single variable holding three hundred “A”s and prints it. We will use these 300 characters in our attempt to crash the application.
exploit1.pl
$junk = "A" x 300;
print $junk;
Let us also ensure that the file has executable permissions.
Now, let’s write the output of this file into a file called payload1.
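For readers who would rather stay in C, the same junk data can be produced with a few lines of code. This is only a sketch equivalent to the Perl one-liner above; the function names make_junk and write_junk are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Fill `out` with `count` copies of 'A' followed by a null byte.
 * `out_size` is the capacity of `out` in bytes.
 * Returns `count` on success, or 0 if the buffer is too small. */
size_t make_junk(char *out, size_t out_size, size_t count)
{
    if (out_size == 0 || count >= out_size)
        return 0;

    memset(out, 'A', count);
    out[count] = '\0';
    return count;
}

/* Write `count` copies of 'A' to an already opened stream.
 * Returns 0 on success, -1 on a write error. */
int write_junk(FILE *stream, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (fputc('A', stream) == EOF)
            return -1;
    }
    return 0;
}
```

Calling write_junk(stdout, 300) from a small main function and redirecting the output to a file produces the same 300 bytes as the Perl script.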
Let’s simply run the vulnerable program and pass the contents of payload1 as input to the program.
As you can see, there is a segmentation fault and the application crashes. Now let’s type ls and check whether any core dumps are available in the current directory.
$ ls
exploit1.pl Makefile payload1 vulnerable vulnerable.c
As the listing shows, there is no crash dump in the current directory: the segmentation fault did not create any new files. Let’s enable core dumps so we can understand what caused the segmentation fault.
Once core dumps are enabled, let’s crash the application again using the same command that we used earlier. Type ls once again and you should see a new file called core.
$ ls
core exploit1.pl Makefile payload1 vulnerable* vulnerable.c
This file is a core dump, which captures the state of the program at the time of the crash. We can use this core file to analyze the crash. Let’s see how we can analyze the core file using gdb.
75 commands loaded for GDB 9.1 using Python engine 3.8
[*] 5 commands could not be loaded, run `gef missing` to know why.
[New LWP 34966]
[!] './vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' not found/readable
[!] Failed to get file debug information, most of gef features will not work
Core was generated by `./vulnerable AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005555555551ad in ?? ()
gef➤
If you look at this gdb output, the program crashed while executing the instruction at address 0x00005555555551ad. (RIP is the instruction pointer, the register that holds the address of the instruction to be executed.)
Note that RIP itself has not been overwritten. 0x00005555555551ad is a valid code address: it is the ret instruction at the end of vuln_func, as the disassembly later in this article shows. The application crashed because ret tried to return to the address stored at the top of the stack, which our long input had overwritten with As, and 0x4141414141414141 is not a valid address. As mentioned earlier, we can use this core dump to analyze the crash. We can also type info registers to see what value each register held at the time of the crash.
In the register listing, RIP still holds 0x00005555555551ad, while the RBP register holds 0x4141414141414141, which is 8 of the As from our junk. This is how core dumps can be used.
rbx 0x5555555551b0 0x5555555551b0
rcx 0x80008 0x80008
rdx 0x414141 0x414141
rsi 0x7fffffffe3e0 0x7fffffffe3e0
rdi 0x7fffffffde89 0x7fffffffde89
rbp 0x4141414141414141 0x4141414141414141
rsp 0x7fffffffde68 0x7fffffffde68
r8 0x0 0x0
r9 0x7ffff7fe0d50 0x7ffff7fe0d50
r10 0x0 0x0
r11 0x0 0x0
r12 0x555555555060 0x555555555060
r13 0x7fffffffdf70 0x7fffffffdf70
r14 0x0 0x0
r15 0x0 0x0
rip 0x5555555551ad 0x5555555551ad
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 0x33
ss 0x2b 0x2b
ds 0x0 0x0
es 0x0 0x0
fs 0x0 0x0
gs 0x0 0x0
gef➤
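A detail that helps when reading register values like these: on x86-64, only canonical addresses can be used. Assuming 48-bit virtual addresses (the common 4-level paging configuration; this is an assumption, and systems with 5-level paging use 57 bits), an address is canonical only if bits 63 through 47 are all zeros or all ones. The following sketch, with a hypothetical helper name, checks that property.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if `addr` is a canonical x86-64 address, assuming 48-bit
 * virtual addresses: bits 63..47 must be all 0 or all 1. */
bool is_canonical_48(uint64_t addr)
{
    uint64_t upper = addr >> 47;        /* the top 17 bits */

    return upper == 0 || upper == 0x1FFFF;
}
```

Applied to the dump above, the value in rip (0x5555555551ad) and the value in rsp (0x7fffffffde68) pass this check, while 0x4141414141414141, the value found in rbp, does not. A ret that pops such a value from the stack faults instead of jumping to it.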
Let’s run the program itself in gdb by typing gdb ./vulnerable and disassemble main using disass main.
The listing below is the disassembly of our main function. Notice that main contains a call to a function named vuln_func. We will disassemble that function next using disass vuln_func.
0x0000000000001149 <+0>: endbr64
0x000000000000114d <+4>: push rbp
0x000000000000114e <+5>: mov rbp,rsp
0x0000000000001151 <+8>: sub rsp,0x10
0x0000000000001155 <+12>: mov DWORD PTR [rbp-0x4],edi
0x0000000000001158 <+15>: mov QWORD PTR [rbp-0x10],rsi
0x000000000000115c <+19>: cmp DWORD PTR [rbp-0x4],0x1
0x0000000000001160 <+23>: jle 0x1175 <main+44>
0x0000000000001162 <+25>: mov rax,QWORD PTR [rbp-0x10]
0x0000000000001166 <+29>: add rax,0x8
0x000000000000116a <+33>: mov rax,QWORD PTR [rax]
0x000000000000116d <+36>: mov rdi,rax
0x0000000000001170 <+39>: call 0x117c <vuln_func>
0x0000000000001175 <+44>: mov eax,0x0
0x000000000000117a <+49>: leave
0x000000000000117b <+50>: ret
End of assembler dump.
gef➤
In the disassembly of vuln_func below, notice the call to strcpy@plt within this function.
0x000000000000117c <+0>: endbr64
0x0000000000001180 <+4>: push rbp
0x0000000000001181 <+5>: mov rbp,rsp
0x0000000000001184 <+8>: sub rsp,0x110
0x000000000000118b <+15>: mov QWORD PTR [rbp-0x108],rdi
0x0000000000001192 <+22>: mov rdx,QWORD PTR [rbp-0x108]
0x0000000000001199 <+29>: lea rax,[rbp-0x100]
0x00000000000011a0 <+36>: mov rsi,rdx
0x00000000000011a3 <+39>: mov rdi,rax
0x00000000000011a6 <+42>: call 0x1050 <strcpy@plt>
0x00000000000011ab <+47>: nop
0x00000000000011ac <+48>: leave
0x00000000000011ad <+49>: ret
End of assembler dump.
gef➤
Now run the program by passing the contents of payload1 as input.
In the current environment, a GDB extension called GEF is installed. It displays the registers, stack, code and threads together, much like a debugger with a GUI.
Program received signal SIGSEGV, Segmentation fault.
0x00005555555551ad in vuln_func ()
[ Legend: Modified register | Code | Heap | Stack | String ]
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$rax : 0x00007fffffffdd00 → “AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[…]”
$rbx : 0x00005555555551b0 → <__libc_csu_init+0> endbr64
$rcx : 0x20000
$rdx : 0x11
$rsp : 0x00007fffffffde08 → "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
$rbp : 0x4141414141414141 ("AAAAAAAA"?)
$rsi : 0x00007fffffffe3a0 → "AAAAAAAAAAAAAAAAA"
$rdi : 0x00007fffffffde1b → "AAAAAAAAAAAAAAAAA"
$rip : 0x00005555555551ad → <vuln_func+49> ret
$r8 : 0x0
$r9 : 0x00007ffff7fe0d50 → endbr64
$r10 : 0x0
$r11 : 0x0
$r12 : 0x0000555555555060 → <_start+0> endbr64
$r13 : 0x00007fffffffdf10 → 0x0000000000000002
$r14 : 0x0
$r15 : 0x0
$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffde08│+0x0000: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" ← $rsp
0x00007fffffffde10│+0x0008: "AAAAAAAAAAAAAAAAAAAAAAAAAAAA"
0x00007fffffffde18│+0x0010: "AAAAAAAAAAAAAAAAAAAA"
0x00007fffffffde20│+0x0018: "AAAAAAAAAAAA"
0x00007fffffffde28│+0x0020: 0x00007f0041414141 ("AAAA"?)
0x00007fffffffde30│+0x0028: 0x00007ffff7ffc620 → 0x0005042c00000000
0x00007fffffffde38│+0x0030: 0x00007fffffffdf18 → 0x00007fffffffe25a → "/home/dev/x86_64/simple_bof/vulnerable"
0x00007fffffffde40│+0x0038: 0x0000000200000000
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x5555555551a6 <vuln_func+42> call 0x555555555050 <strcpy@plt>
0x5555555551ab <vuln_func+47> nop
0x5555555551ac <vuln_func+48> leave
→ 0x5555555551ad <vuln_func+49> ret
[!] Cannot disassemble from $PC
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "vulnerable", stopped 0x5555555551ad in vuln_func (), reason: SIGSEGV
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x5555555551ad → vuln_func()
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
gef➤
This output matches what we saw in the core dump: RBP holds 8 As, and the program stopped at the ret instruction of vuln_func. The stack at RSP, where ret expects to find the return address, is also filled with As. However, we passed 300 As, and we do not yet know which 8 of them ended up in RBP or which 8 overwrote the saved return address.
When exploiting buffer overflows, being able to crash the application is the first step in the process. Using this knowledge, an attacker can begin to work out the exact offsets required to overwrite the saved return address, and thereby control RIP and the flow of the program.
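For this particular binary, the offsets can be estimated from the vuln_func disassembly shown earlier. The instruction lea rax,[rbp-0x100] places buffer 0x100 bytes below RBP. Assuming the conventional frame layout set up by push rbp and mov rbp,rsp (saved RBP at [rbp], return address at [rbp+8]), the arithmetic can be sketched as follows. The function names are hypothetical, and the result is an estimate from static reading that should still be confirmed in the debugger.

```c
#include <stddef.h>

/* Offset, in bytes from the start of the buffer, at which the saved RBP
 * begins. `disp_below_rbp` is the displacement from the lea instruction,
 * for example 0x100 for lea rax,[rbp-0x100]. */
size_t offset_to_saved_rbp(size_t disp_below_rbp)
{
    return disp_below_rbp;
}

/* Offset at which the saved return address begins: it sits directly
 * above the 8-byte saved RBP. */
size_t offset_to_return_address(size_t disp_below_rbp)
{
    return disp_below_rbp + 8;
}
```

With a displacement of 0x100, these functions return 256 and 264. Under the stated assumptions, bytes 256 to 263 of the input would land in the saved RBP and bytes 264 to 271 in the saved return address.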
Conclusion
In this article, we discussed what buffer overflow vulnerabilities are, their types and how they can be exploited. We also analyzed a vulnerable application to understand how crashing an application generates core dumps, which will in turn be helpful in developing a working exploit. In the next article, we will discuss how we can use this knowledge to exploit a buffer overflow vulnerability.