Using Radare2 to patch a binary Dec 28 2019

When reversing a binary, sometimes it's useful to modify how the binary behaves. We can accomplish this by changing the binary itself. If we had the source code, it'd be easy, but without it we have to look at the disassembled code and decide which bytes to modify to get our desired behaviour. We can, for example, change the control flow by changing a jump condition, or modify a string that is used in a comparison. Modifying a binary this way is known as patching. In this post, we are going to learn how to use radare2 to patch a binary.

Before we start, we'll have a refresher on control flow in assembly language.

Control flow on Assembly
A simple example
Using radare2 to analyse binary
Patching
Final thoughts
Related topics/notes of interest

Control flow on Assembly

To keep track of the execution flow, the CPU uses the instruction pointer register, also called the program counter (EIP on 32-bit x86, RIP on 64-bit). RIP stores the address of the next instruction to be executed. In a simple program, we might just have a sequence of instructions one after the other, so after each instruction RIP advances by the length of that instruction (x86 instructions vary in size, from 1 to 15 bytes). In non-trivial programs, we want to change the flow of the program depending on different conditions.

The simplest form of control flow is an unconditional jump. We use the instruction jmp location, where location can be an address or a label:

jmp label

We can also have conditional jumps. As the name implies, a conditional jump goes to a specific location only if a condition is met. For example, using je (jump if equal):

mov eax, 7
mov ebx, 7
cmp eax, ebx
je 0x0000007c
....

If eax and ebx are equal, execution jumps to address 0x0000007c. Otherwise it continues with the next instruction. cmp subtracts one operand from the other, throws the result away and only updates the flags register. If the result is zero, the Z (Zero) flag is set, which means the operands are equal. The je instruction checks the Z flag and makes its decision based on it.
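The decision logic can be modelled outside the CPU. Here is a toy sketch in shell, with hypothetical helper names (cmp_sets_zf, je_taken) that are not part of any tool. It only mimics how cmp and je cooperate through the Z flag:

```shell
# Toy model only: real flags live in the CPU's flags register.

# cmp a, b: compute a - b, discard the result, report the Z flag (1 if zero).
cmp_sets_zf() {
    if [ $(( $1 - $2 )) -eq 0 ]; then echo 1; else echo 0; fi
}

# je: the jump is taken only when the Z flag is set.
je_taken() {
    [ "$(cmp_sets_zf "$1" "$2")" -eq 1 ]
}

if je_taken 7 7; then echo "7 vs 7: jump taken"; else echo "7 vs 7: fall through"; fi
if je_taken 7 8; then echo "7 vs 8: jump taken"; else echo "7 vs 8: fall through"; fi
```

jne is the mirror image: it jumps when the Z flag is clear.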

That should be enough for our example. But if you would like to read more, or need a refresher on assembly control flow, visit this link.

We can now have a look at a simple example that uses control flow, and we'll later use the generated binary as an example for patching.

A simple example

Let's create a simple program that reads a number and verifies if it's the four-digit code that will "unlock" and print the information. We'll name the file bankdetails.c and add the following content:

#include <stdio.h>

int main(int argc, char* argv[]) {
 int code = 4477;
 int pcode;
 printf("Hello, please enter your 4 pin code: ");
 scanf("%d",&pcode);
 if(code == pcode) {
     printf("Ok, this is your bank account.\n");
 } else {
     printf("Wrong pin, bye!\n");
 }
 return 0;
}

We'll compile it, and use that binary as our test subject for binary patching.

$ gcc bankdetails.c
# this generates the binary: a.out

We can see the content of a.out using the xxd(1) command. It'll display a hex dump of the file.

$ xxd a.out | less

We pipe the output to less so we can read the content more comfortably. I'll show the first part of the file:

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0300 3e00 0100 0000 6006 0000 0000 0000 ..>.....`.......
00000020: 4000 0000 0000 0000 c019 0000 0000 0000 @...............
00000030: 0000 0000 4000 3800 0900 4000 1d00 1c00 ....@.8...@.....
00000040: 0600 0000 0400 0000 4000 0000 0000 0000 ........@.......
00000050: 4000 0000 0000 0000 4000 0000 0000 0000 @.......@.......
00000060: f801 0000 0000 0000 f801 0000 0000 0000 ................
00000070: 0800 0000 0000 0000 0300 0000 0400 0000 ................
00000080: 3802 0000 0000 0000 3802 0000 0000 0000 8.......8.......
00000090: 3802 0000 0000 0000 1c00 0000 0000 0000 8...............
000000a0: 1c00 0000 0000 0000 0100 0000 0000 0000 ................
000000b0: 0100 0000 0500 0000 0000 0000 0000 0000 ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000d0: 300a 0000 0000 0000 300a 0000 0000 0000 0.......0.......
000000e0: 0000 2000 0000 0000 0100 0000 0600 0000 .. .............
000000f0: a00d 0000 0000 0000 a00d 2000 0000 0000 .......... .....
00000100: a00d 2000 0000 0000 7002 0000 0000 0000 .. .....p.......
00000110: 7802 0000 0000 0000 0000 2000 0000 0000 x......... .....
00000120: 0200 0000 0600 0000 b00d 0000 0000 0000 ................
00000130: b00d 2000 0000 0000 b00d 2000 0000 0000 .. ....... .....
00000140: f001 0000 0000 0000 f001 0000 0000 0000 ................
00000150: 0800 0000 0000 0000 0400 0000 0400 0000 ................
00000160: 5402 0000 0000 0000 5402 0000 0000 0000 T.......T.......
00000170: 5402 0000 0000 0000 4400 0000 0000 0000 T.......D.......
00000180: 4400 0000 0000 0000 0400 0000 0000 0000 D...............
00000190: 50e5 7464 0400 0000 e808 0000 0000 0000 P.td............
000001a0: e808 0000 0000 0000 e808 0000 0000 0000 ................
000001b0: 3c00 0000 0000 0000 3c00 0000 0000 0000 <.......<.......
000001c0: 0400 0000 0000 0000 51e5 7464 0600 0000 ........Q.td....
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 1000 0000 0000 0000 ................
00000200: 52e5 7464 0400 0000 a00d 0000 0000 0000 R.td............
00000210: a00d 2000 0000 0000 a00d 2000 0000 0000 .. ....... .....
00000220: 6002 0000 0000 0000 6002 0000 0000 0000 `.......`.......
00000230: 0100 0000 0000 0000 2f6c 6962 3634 2f6c ......../lib64/l
...

The first column shows the offset of each row within the file. The following eight columns show the raw content in hex, two bytes per column, and the last column shows an ASCII representation of those 16 bytes. In the last column, we can see some of the strings that are part of the file.
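The very first row already tells us something: the bytes 7f 45 4c 46 are the ELF magic number (0x7f followed by the letters "ELF"). As a small sketch, we can check for it with od instead of reading the dump by eye. The helper name is_elf is made up for this example, and the demonstration runs on a scratch file that only imitates the first bytes of the dump:

```shell
# Hypothetical helper: succeed if FILE starts with the ELF magic number.
is_elf() {
    # -An: no offset column, -tx1: one hex byte at a time, -N 4: first 4 bytes.
    magic=$(od -An -tx1 -N 4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

scratch=$(mktemp)
printf '\177ELF\002\001\001' > "$scratch"   # same leading bytes as the dump
if is_elf "$scratch"; then echo "ELF file"; else echo "not an ELF file"; fi
rm -f "$scratch"
```

Running is_elf a.out on the compiled binary should succeed in the same way.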

Ok, if we knew what those hex values represent that would be great. But just by looking at the hex values, it's impossible to decipher the program's intent. So we first need to analyse it, and once we have it in an easy to read format, we can go from there.

Using radare2 to analyse binary

We could try to make sense of all the bits in the hex dump and come up with all the object segments, and then all the tables, etcetera. Or we can make use of a disassembler. Using a disassembler will be faster than anything we could do manually. The disassembler will analyse the binary format and come up with an assembly representation.

We are going to use the disassembler functionality provided by radare2. If you don't have radare2 installed on your computer, you can clone the GitHub repository here, and follow the installation instructions there.

After installing radare2, we'll get access to the r2 command. Let's run it on our binary a.out.

$ r2 a.out
 -- Come here, we are relatively friendly
[0x00000700]>

Alright, r2 gives us a REPL where we can do static and dynamic analysis of binaries. The binary is loaded now. We can begin inspecting it.

Radare has many options and to master all of them will take some time, but the more you use the tool, the more you'll be discovering and taking advantage of all of its capabilities. One way to explore the tool's options is by using the internal help.

> ?

A single question mark will show you all the options available to you. Because there are many sections, it'll be impractical to display everything there. So you can ask for help on each of the commands. For example, we would like to get some general information about the object file. We will use the i(nformation) command. To see its help, type:

> i?

And you'll get a description of the i command and its subcommands. We see there that we can list the symbols using is. You can explore other options on your own.

We are now going to move our focus to the main function. We are going to navigate to the address where the main symbol is located. So we are going to use the s(eek) command.

> s main

Now we can print the hex located in our current address using the p(rint) command (check p? for more details):

> px
- offset -  0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x00000680 4883 ec18 488d 350d 0200 00bf 0100 0000 H...H.5.........
0x00000690 6448 8b04 2528 0000 0048 8944 2408 31c0 dH..%(...H.D$.1.
0x000006a0 e8ab ffff ff48 8d74 2404 488d 3d2f 0200 .....H.t$.H.=/..
0x000006b0 0031 c0e8 a8ff ffff 817c 2404 7d11 0000 .1.......|$.}...
0x000006c0 7423 488d 3d1a 0200 00e8 62ff ffff 31c0 t#H.=.....b...1.
0x000006d0 488b 5424 0864 4833 1425 2800 0000 7513 H.T$.dH3.%(...u.
0x000006e0 4883 c418 c348 8d3d d401 0000 e83f ffff H....H.=.....?..
0x000006f0 ffeb dbe8 48ff ffff 0f1f 8400 0000 0000 ....H...........
0x00000700 31ed 4989 d15e 4889 e248 83e4 f050 544c 1.I..^H..H...PTL
0x00000710 8d05 6a01 0000 488d 0df3 0000 0048 8d3d ..j...H......H.=
0x00000720 5cff ffff ff15 b608 2000 f40f 1f44 0000 \....... ....D..
0x00000730 488d 3dd9 0820 0055 488d 05d1 0820 0048 H.=.. .UH.... .H
0x00000740 39f8 4889 e574 1948 8b05 8a08 2000 4885 9.H..t.H.... .H.
0x00000750 c074 0d5d ffe0 662e 0f1f 8400 0000 0000 .t.]..f.........
0x00000760 5dc3 0f1f 4000 662e 0f1f 8400 0000 0000 ]...@.f.........
0x00000770 488d 3d99 0820 0048 8d35 9208 2000 5548 H.=.. .H.5.. .UH

Alright, that looks familiar. It looks like the output we get from xxd. We can see the content, but that is not very useful, since we could already do that with xxd. What we want now is to disassemble that content and see which instructions the disassembler recovers from those bytes.

To do that, we can use the print command again, this time with d for disassemble and f for function:

> pdf
p: Cannot find function at 0x0000076a

Oh, what happened there? pdf prints a whole function, and r2 has not analysed the binary yet, so it doesn't know where any function starts or ends. Let's drop the f and just disassemble the next 32 instructions:

> pd 32
      ;-- main:
      0x0000076a   55       push rbp
      0x0000076b   4889e5     mov rbp, rsp
      0x0000076e   4883ec20    sub rsp, 0x20
      0x00000772   897dec     mov dword [rbp - 0x14], edi
      0x00000775   488975e0    mov qword [rbp - 0x20], rsi
      0x00000779   64488b042528. mov rax, qword fs:[0x28]
      0x00000782   488945f8    mov qword [rbp - 8], rax
      0x00000786   31c0      xor eax, eax
      0x00000788   c745f47d1100. mov dword [rbp - 0xc], 0x117d
      0x0000078f   488d3df20000. lea rdi, str.Hello__please_enter_your_4_pin_code: ; 0x888 ; "Hello, please enter your 4 pin code: "
      0x00000796   b800000000   mov eax, 0
      0x0000079b   e890feffff   call sym.imp.printf
      0x000007a0   488d45f0    lea rax, [rbp - 0x10]
      0x000007a4   4889c6     mov rsi, rax
      0x000007a7   488d3d000100. lea rdi, [0x000008ae]    ; "%d"
      0x000007ae   b800000000   mov eax, 0
      0x000007b3   e888feffff   call sym.imp.__isoc99_scanf
      0x000007b8   8b45f0     mov eax, dword [rbp - 0x10]
      0x000007bb   3945f4     cmp dword [rbp - 0xc], eax
    ┌─< 0x000007be   750e      jne 0x7ce
    │  0x000007c0   488d3df10000. lea rdi, str.Ok_this_are_your_bank_details. ; 0x8b8 ; "Ok this are your bank details."
    │  0x000007c7   e844feffff   call sym.imp.puts
    ┌──< 0x000007cc   eb0c      jmp 0x7da
    │└─> 0x000007ce   488d3d020100. lea rdi, str.Wrong_pin__bye ; 0x8d7 ; "Wrong pin, bye!"
    │  0x000007d5   e836feffff   call sym.imp.puts
    └──> 0x000007da   b800000000   mov eax, 0
      0x000007df   488b55f8    mov rdx, qword [rbp - 8]
      0x000007e3   644833142528. xor rdx, qword fs:[0x28]
    ┌─< 0x000007ec   7405      je 0x7f3
    │  0x000007ee   e82dfeffff   call sym.imp.__stack_chk_fail
    └─> 0x000007f3   c9       leave
      0x000007f4   c3       ret

That is what the radare2 disassembler came up with. It looks like our main! To make the disassembly more useful, r2 needs more context. If it knows which functions, variables and cross-references exist in the program, it can annotate the listing with them. To get that, we are going to run the analyser with aaa (check a? for more options).

> aaa
Cannot analyse at 0x00000650
[x] Analyse all flags starting with sym. and entry0 (aa)
Cannot analyse at 0x00000650
[x] Analyse function calls (aac)
[x] Analyse len bytes of instructions for references (aar)
[x] Check for objc references
[x] Check for vtables
[x] Type matching analysis for all functions (aaft)
[x] Propagate noreturn information
[x] Use -AA or aaaa to perform additional experimental analysis.
>

Alright, now radare2 has more context to draw from, and pdf, the print-disassembled-function command that failed before, works:

> pdf
      ; DATA XREF from entry0 @ 0x67d
┌ 139: int main (int argc, char **argv, char **envp);
│      ; var char **var_20h @ rbp-0x20
│      ; var int64_t var_14h @ rbp-0x14
│      ; var int64_t var_10h @ rbp-0x10
│      ; var uint32_t var_ch @ rbp-0xc
│      ; var int64_t canary @ rbp-0x8
│      ; arg int argc @ rdi
│      ; arg char **argv @ rsi
│      0x0000076a   55       push rbp
│      0x0000076b   4889e5     mov rbp, rsp
│      0x0000076e   4883ec20    sub rsp, 0x20
│      0x00000772   897dec     mov dword [var_14h], edi  ; argc
│      0x00000775   488975e0    mov qword [var_20h], rsi  ; argv
│      0x00000779   64488b042528. mov rax, qword fs:[0x28]
│      0x00000782   488945f8    mov qword [canary], rax
│      0x00000786   31c0      xor eax, eax
│      0x00000788   c745f47d1100. mov dword [var_ch], 0x117d
│      0x0000078f   488d3df20000. lea rdi, str.Hello__please_enter_your_4_pin_code: ; 0x888 ; "Hello, please enter your 4 pin code: " ; const char *format
│      0x00000796   b800000000   mov eax, 0
│      0x0000079b   e890feffff   call sym.imp.printf     ; int printf(const char *format)
│      0x000007a0   488d45f0    lea rax, [var_10h]
│      0x000007a4   4889c6     mov rsi, rax
│      0x000007a7   488d3d000100. lea rdi, [0x000008ae]    ; "%d" ; const char *format
│      0x000007ae   b800000000   mov eax, 0
│      0x000007b3   e888feffff   call sym.imp.__isoc99_scanf ; int scanf(const char *format)
│      0x000007b8   8b45f0     mov eax, dword [var_10h]
│      0x000007bb   3945f4     cmp dword [var_ch], eax
│    ┌─< 0x000007be   750e      jne 0x7ce
│    │  0x000007c0   488d3df10000. lea rdi, str.Ok_this_are_your_bank_details. ; 0x8b8 ; "Ok, this is your bank account." ; const char *s
│    │  0x000007c7   e844feffff   call sym.imp.puts      ; int puts(const char *s)
│   ┌──< 0x000007cc   eb0c      jmp 0x7da
│   ││  ; CODE XREF from main @ 0x7be
│   │└─> 0x000007ce   488d3d020100. lea rdi, str.Wrong_pin__bye ; 0x8d7 ; "Wrong pin, bye!" ; const char *s
│   │  0x000007d5   e836feffff   call sym.imp.puts      ; int puts(const char *s)
│   │  ; CODE XREF from main @ 0x7cc
│   └──> 0x000007da   b800000000   mov eax, 0
│      0x000007df   488b55f8    mov rdx, qword [canary]
│      0x000007e3   644833142528. xor rdx, qword fs:[0x28]
│    ┌─< 0x000007ec   7405      je 0x7f3
│    │  0x000007ee   e82dfeffff   call sym.imp.__stack_chk_fail ; void __stack_chk_fail(void)
│    │  ; CODE XREF from main @ 0x7ec
│    └─> 0x000007f3   c9       leave
└      0x000007f4   c3       ret

That's better. Now we even get a comment on top with all the local variables and their types.

Nice!

Ok, time to do some patching. That's what we came here for.

Patching

Look at the instruction at address 0x000007be: we jump to address 0x000007ce if the result of the comparison at 0x000007bb is NOT EQUAL. That means that if we input a number that is not 4477, we jump to the code that displays "Wrong pin, bye!". But if it's equal, execution continues with the following instructions at 0x000007c0, which display "Ok, this is your bank account.".

We can see that the opcode for jne is 0x75 and the opcode for je (Jump if equal) is 0x74. If we modify that opcode, we would be able to input anything but the code 4477, and we'll see the "bank details".
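Both instructions are two bytes long: the opcode followed by a signed 8-bit displacement, counted from the address of the next instruction. A small sketch (short_jump_target is a hypothetical helper, not an r2 command) reproduces the targets shown in the listing:

```shell
# Hypothetical helper: target of a two-byte short jump (opcode + rel8)
# located at ADDR with displacement byte REL8.
short_jump_target() {
    addr=$(( $1 ))
    rel8=$(( $2 ))
    # rel8 is a signed byte: values 0x80..0xff mean a backwards jump.
    if [ "$rel8" -ge 128 ]; then rel8=$(( rel8 - 256 )); fi
    # The displacement is relative to the next instruction, 2 bytes further on.
    printf '0x%x\n' $(( addr + 2 + rel8 ))
}

short_jump_target 0x7be 0x0e   # 75 0e at 0x7be -> prints 0x7ce
short_jump_target 0x7cc 0x0c   # eb 0c at 0x7cc -> prints 0x7da
```

Because only the opcode byte changes from 75 to 74, the displacement 0e, and therefore the target 0x7ce, stays the same after patching.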

Open a new shell. If we inspect the hex dump we got from xxd, we'll see that at offset 0x000007be we find the bytes 75 0e. We want them to be 74 0e. Let's see how we can accomplish that.

First we are going to patch it using a regular editor and the hex dump from xxd. Later we'll use r2 to do the same. I'll use vim, but you can use any editor you want. Let's begin by saving the hex dump to a file that we'll then modify using vim:

$ xxd a.out > a.hex
$ vi a.hex

If you see line 124 we get:

000007b0: 0000 00e8 88fe ffff 8b45 f039 45f4 750e .........E.9E.u.

You can see 750e :) perfect! Let's change it to 740e, and save. We can now restore from the hex dump format to a binary again using xxd with the -r option.

$ xxd -r a.hex > a-patched.out
# Let's add execution permissions
$ chmod u+x a-patched.out
# And run it
$ ./a-patched.out
Hello, please enter your 4 pin code: 5555
Ok, this is your bank account.

I entered 5555, which in the previous version should have shown "Wrong pin, bye!". But now we see "Ok, this is your bank account.". Nice :).
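The xxd round trip rewrites the whole file to change one byte. As an alternative sketch, the same kind of edit can be done in place with printf and dd. patch_byte is a hypothetical helper written for this post, and the demonstration runs on a four-byte scratch file rather than on a.out:

```shell
# Hypothetical helper: overwrite one byte of FILE at OFFSET with VALUE,
# but only if the byte currently stored there equals EXPECTED.
patch_byte() {
    file=$1
    offset=$(( $2 ))
    expected=$(( $3 ))
    value=$(( $4 ))
    # Read the current byte as an unsigned decimal number.
    current=$(od -An -tu1 -j "$offset" -N 1 "$file" | tr -d ' \n')
    if [ "$current" != "$expected" ]; then
        echo "refusing to patch: found $current at offset $offset" >&2
        return 1
    fi
    # printf emits the byte through an octal escape; conv=notrunc makes dd
    # overwrite in place instead of truncating the rest of the file.
    printf "\\$(printf '%03o' "$value")" |
        dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

scratch=$(mktemp)
printf '\220\165\016\220' > "$scratch"   # bytes 90 75 0e 90
patch_byte "$scratch" 1 0x75 0x74        # turn the jne opcode into je
od -An -tx1 "$scratch"                   # the bytes are now 90 74 0e 90
rm -f "$scratch"
```

On a copy of the tutorial's binary the call would be patch_byte with offset 0x7be, expected value 0x75 and new value 0x74, assuming the jne sits at the same offset in your build, which depends on your compiler. Checking the expected byte first guards against patching the wrong place.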

Ok, let's do the same using r2. Back to the shell where we had r2 running. If you closed it, these are the commands we ran to get here:

$ r2 a.out
# analyse
> aaa
# seek to main
> s main
# print disasm function
> pdf

We need to move our focus to the address 0x000007be.

> s 0x000007be

If we print the hex dump at that address we'll see:

> px
- offset -  0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 750e 488d 3df1 0000 00e8 44fe ffff eb0c u.H.=.....D.....
0x000007ce 488d 3d02 0100 00e8 36fe ffff b800 0000 H.=.....6.......

So we need to change the first byte, 75, to 74. When r2 loads a binary it opens it read-only, so we need to reopen it in read-write mode using oo+ (read more using o?).

# We could have loaded the file using `-w` flag when calling `r2`
# $ r2 -w a.out
> oo+
> wv 0x740e

Good, Let's print it:

> px
- offset -  0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 0e74 0000 3df1 0000 00e8 44fe ffff eb0c .t..=.....D.....

Oh no! We ruined it!

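Well, wv wrote a 32-bit value, that is 4 bytes, so besides the two bytes we cared about we also overwrote the next two (48 8d) with zeros. And the bytes are not in the order we expected either: the value was stored in little-endian order, least significant byte first.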
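The byte order is easier to see outside r2. A minimal sketch, assuming the value is stored as a 32-bit little-endian integer (which matches the four bytes 0e 74 00 00 in the dump above); le32_bytes is a hypothetical helper:

```shell
# Hypothetical helper: print the four bytes of a 32-bit value in the order
# a little-endian machine stores them (least significant byte first).
le32_bytes() {
    v=$(( $1 ))
    printf '%02x %02x %02x %02x\n' \
        $(( v & 0xff )) $(( (v >> 8) & 0xff )) \
        $(( (v >> 16) & 0xff )) $(( (v >> 24) & 0xff ))
}

le32_bytes 0x740e       # prints: 0e 74 00 00
le32_bytes 0x8d480e74   # prints: 74 0e 48 8d
```

The first call shows the damage: 0x740e is padded to 0x0000740e and stored backwards. The second shows which value has to be written for the bytes 74 0e 48 8d to land in memory in that order.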

This is the perfect error. It teaches so much. Let's see the lessons learned. First, we need to be very careful: we can ruin a binary if we modify it carelessly. Second, always make a backup. Third, always take into account the endianness of the binary (the order in which the bytes of a multi-byte value are stored). Let's fix it:

We had:

- offset -  0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 750e 488d

We now have:

- offset -  0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 0e74 0000

So we need to write:

> wv 0x8d480e74

Let's hex dump:

> px
- offset -  0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 740e 488d 3df1 0000 00e8 44fe ffff eb0c t.H.=.....D.....
0x000007ce 488d 3d02 0100 00e8 36fe ffff b800 0000 H.=.....6.......

That's better. Now we can close r2 and run a.out and it should now accept any number (except 4477) as a valid pin to display "Ok, this is your bank account.".

$ ./a.out
Hello, please enter your 4 pin code: 7777
Ok, this is your bank account.

Great! That's it. Now it's your turn to explore and play more on your own with r2.
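Before experimenting further, it helps to keep an untouched copy and to check exactly which bytes a patch changed. A minimal sketch using cmp -l follows. diff_bytes is a hypothetical helper, and the two scratch files stand in for the original and the patched binary:

```shell
# Hypothetical helper: list the bytes that differ between two files,
# one line per byte, as "offset: old -> new" in hex.
diff_bytes() {
    # cmp exits non-zero when the files differ, which is expected here.
    { cmp -l "$1" "$2" || true; } | while read -r pos old new; do
        # cmp -l reports 1-based decimal positions and octal byte values.
        printf '0x%x: %02x -> %02x\n' $(( pos - 1 )) $(( 0$old )) $(( 0$new ))
    done
}

orig=$(mktemp)
patched=$(mktemp)
printf '\220\165\016\220' > "$orig"      # bytes 90 75 0e 90
cp "$orig" "$patched"                    # always work on a copy
printf '\164' | dd of="$patched" bs=1 seek=1 conv=notrunc 2>/dev/null
diff_bytes "$orig" "$patched"            # prints: 0x1: 75 -> 74
rm -f "$orig" "$patched"
```

Had we run something like this after the first wv attempt, it would have listed four changed bytes instead of one, which is a quick way to spot an over-eager write.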

Final thoughts

As you can probably see, r2 is a very handy tool to master. You'll be able to work with binaries like magic. If you are inquisitive and read the help for w, you'll see that it would have been easier to use:

> wx 740e
# wx writes the hex bytes exactly as typed, so there is no byte order to get wrong
# Or wv1 0x74 to write a single byte
# check w? :) you'll see lots of options

But using wv gave us a chance to explore some common pitfalls when working with binaries. Remember, it's always a good idea to read the manuals to learn more.

Every tutorial or blog post is written to show a specific point, so the examples you'll see are biased towards it. Don't take them as the only way to do things; dig deeper. As we saw, you could have accomplished the same result using xxd and vi. We could also have changed the number 4477 in the binary to something else and entered that number, or just read the disassembly and seen that we needed to enter 4477.

It is important to know that there are always many avenues to solve the same problems. A useful skill is knowing which tools are better for which case.

Ok, that's it for this post. Until next time.
