Using Radare2 to patch a binary Dec 28 2019
When reversing a binary, sometimes it's useful to modify how the binary behaves. We can accomplish this by changing the binary itself. If we had the source code, it'd be easy, but without it we have to look at the disassembled code and decide which bytes to modify to get our desired behaviour. We can, for example, change the control flow by changing a jump condition, or modify a string that is used in a comparison. Modifying a binary this way is known as patching. In this post, we are going to learn how to use radare2 to patch a binary.
Before we start, we'll have a refresher on control flow in assembly language.
Control flow on Assembly
To keep track of the execution flow, the CPU uses the program counter register (EIP on 32-bit x86, RIP on 64-bit). RIP stores the address of the next instruction to be executed. In a simple program, we might just have a sequence of instructions one after the other, so RIP advances by the length of each instruction as it executes (x86 instructions vary in size). In non-trivial programs, we want to change the flow of the program depending on different conditions.
The simplest form of control flow is an unconditional jump. We use the instruction jmp location, where location could be an address or a label:
jmp label
We can also have conditional jumps. As the name implies, a conditional jump goes to a specific location only if a condition is met. For example, using je (jump on equal):
mov eax, 7
mov ebx, 7
cmp eax, ebx
je 0x0000007c
....
If eax and ebx are equal, execution jumps to address 0x0000007c. Otherwise it continues with the next instruction. cmp subtracts the second operand from the first, discards the result, and updates the flags register; if the result is zero, the Z (Zero) flag is set, which means the operands are equal. The instruction je checks the Z flag and makes its decision based on that.
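The relationship between cmp, the Zero flag and je can be modelled in a few lines of Python. This is only an illustrative sketch: the function names are made up for this post, and it models just the Zero flag on 32-bit values, not the other flags a real cmp updates.

```python
MASK32 = 0xFFFFFFFF  # eax and ebx hold 32-bit values


def cmp_zero_flag(a: int, b: int) -> bool:
    """Model the Zero flag after `cmp a, b`: set when a - b is zero."""
    return ((a - b) & MASK32) == 0


def je_taken(a: int, b: int) -> bool:
    """`je` jumps exactly when the Zero flag is set."""
    return cmp_zero_flag(a, b)


if __name__ == "__main__":
    print(je_taken(7, 7))  # True: equal operands, the jump is taken
    print(je_taken(7, 8))  # False: execution falls through
```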
That should be enough for our example. But if you would like to read more, or need a refresher on assembly control flow, visit this link.
We can now have a look at a simple example that uses control flow, and we'll later use the generated binary as an example for patching.
A simple example
Let's create a simple program that reads a number and verifies if it's the four-digit code that will "unlock" and print the information. We'll name the file bankdetails.c and add the following content:
#include <stdio.h>
int main(int argc, char* argv[]) {
  int code = 4477;
  int pcode;
  printf("Hello, please enter your 4 pin code: ");
  scanf("%d",&pcode);
  if(code == pcode) {
    printf("Ok, this is your bank account.\n");
  } else {
    printf("Wrong pin, bye!\n");
  }
  return 0;
}
We'll compile it, and use that binary as our test subject for binary patching.
$ gcc bankdetails.c
# this generates the binary: a.out
We can see the content of a.out using the xxd(1) command. It'll display a hex dump of the file.
$ xxd a.out | less
We pipe the output to less so we can read the content more comfortably. I'll show the first part of the file:
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0300 3e00 0100 0000 6006 0000 0000 0000 ..>.....`.......
00000020: 4000 0000 0000 0000 c019 0000 0000 0000 @...............
00000030: 0000 0000 4000 3800 0900 4000 1d00 1c00 ....@.8...@.....
00000040: 0600 0000 0400 0000 4000 0000 0000 0000 ........@.......
00000050: 4000 0000 0000 0000 4000 0000 0000 0000 @.......@.......
00000060: f801 0000 0000 0000 f801 0000 0000 0000 ................
00000070: 0800 0000 0000 0000 0300 0000 0400 0000 ................
00000080: 3802 0000 0000 0000 3802 0000 0000 0000 8.......8.......
00000090: 3802 0000 0000 0000 1c00 0000 0000 0000 8...............
000000a0: 1c00 0000 0000 0000 0100 0000 0000 0000 ................
000000b0: 0100 0000 0500 0000 0000 0000 0000 0000 ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000000d0: 300a 0000 0000 0000 300a 0000 0000 0000 0.......0.......
000000e0: 0000 2000 0000 0000 0100 0000 0600 0000 .. .............
000000f0: a00d 0000 0000 0000 a00d 2000 0000 0000 .......... .....
00000100: a00d 2000 0000 0000 7002 0000 0000 0000 .. .....p.......
00000110: 7802 0000 0000 0000 0000 2000 0000 0000 x......... .....
00000120: 0200 0000 0600 0000 b00d 0000 0000 0000 ................
00000130: b00d 2000 0000 0000 b00d 2000 0000 0000 .. ....... .....
00000140: f001 0000 0000 0000 f001 0000 0000 0000 ................
00000150: 0800 0000 0000 0000 0400 0000 0400 0000 ................
00000160: 5402 0000 0000 0000 5402 0000 0000 0000 T.......T.......
00000170: 5402 0000 0000 0000 4400 0000 0000 0000 T.......D.......
00000180: 4400 0000 0000 0000 0400 0000 0000 0000 D...............
00000190: 50e5 7464 0400 0000 e808 0000 0000 0000 P.td............
000001a0: e808 0000 0000 0000 e808 0000 0000 0000 ................
000001b0: 3c00 0000 0000 0000 3c00 0000 0000 0000 <.......<.......
000001c0: 0400 0000 0000 0000 51e5 7464 0600 0000 ........Q.td....
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 1000 0000 0000 0000 ................
00000200: 52e5 7464 0400 0000 a00d 0000 0000 0000 R.td............
00000210: a00d 2000 0000 0000 a00d 2000 0000 0000 .. ....... .....
00000220: 6002 0000 0000 0000 6002 0000 0000 0000 `.......`.......
00000230: 0100 0000 0000 0000 2f6c 6962 3634 2f6c ......../lib64/l
...
The first column shows the offset into the file. The following eight columns show the raw content in hex, and the last column shows an ASCII representation of that content. In the last column, we can see some of the strings that are part of the file.
Ok, if we knew what those hex values represent that would be great. But just by looking at the hex values, it's impossible to decipher the program's intent. So we first need to analyse it, and once we have it in an easy to read format, we can go from there.
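Some of those bytes can be decoded by hand, though. As a small sketch, the Python below takes the first 32 bytes from the dump above and decodes a few fields of the ELF header. The field layout follows the standard 64-bit ELF header; the function name and the returned dictionary keys are just illustrative choices.

```python
import struct

# First 32 bytes of a.out, copied from the xxd output above.
HEADER = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "03003e00010000006006000000000000"
)


def parse_elf_start(data: bytes) -> dict:
    """Decode a few fields from the start of a 64-bit little-endian ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # At offset 16: e_type (2 bytes), e_machine (2), e_version (4), e_entry (8).
    e_type, e_machine, _e_version, e_entry = struct.unpack_from("<HHIQ", data, 16)
    return {
        "class_64bit": data[4] == 2,    # EI_CLASS: 2 means 64-bit
        "little_endian": data[5] == 1,  # EI_DATA: 1 means little-endian
        "type": e_type,                 # 3 is ET_DYN (shared object / PIE)
        "machine": e_machine,           # 0x3e is x86-64
        "entry": e_entry,               # entry point address
    }


if __name__ == "__main__":
    info = parse_elf_start(HEADER)
    print(hex(info["machine"]), hex(info["entry"]))  # 0x3e 0x660
```

Even this tiny header takes some effort to read by hand, which is why a dedicated tool is the practical choice for the rest of the file.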
Using radare2 to analyse binary
We could try to make sense of all the bits in the hex dump and come up with all the object segments, and then all the tables, etcetera. Or we can make use of a disassembler. Using a disassembler will be faster than anything we could do manually. The disassembler will analyse the binary format and come up with an assembly representation.
We are going to use the disassembler functionality provided by radare2. If you don't have radare2 installed on your computer, you can clone the GitHub repository here, and follow the installation instructions there.
After installing radare2, we'll get access to the r2 command. Let's run it on our binary a.out.
$ r2 a.out
-- Come here, we are relatively friendly
[0x00000700]>
Alright, r2 gives us a REPL where we can do static and dynamic analysis of binaries. The binary is loaded now. We can begin inspecting it.
Radare has many options and to master all of them will take some time, but the more you use the tool, the more you'll be discovering and taking advantage of all of its capabilities. One way to explore the tool's options is by using the internal help.
> ?
A single question mark will show you all the options available to you. Because there are many sections, it'll be impractical to display everything there. So you can ask for help on each of the commands. For example, we would like to get some general information about the object file. We will use the i(nformation) command. To see its help, type:
> i?
And you'll get a description of the i option. We see there that we can display the list of symbols using is. You can explore other options on your own.
We are now going to move our focus to the main function. We are going to navigate to the address where the main symbol is located. So we are going to use the s(eek) command.
> s main
Now we can print the hex located at our current address using the p(rint) command (check p? for more details):
> px
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x00000680 4883 ec18 488d 350d 0200 00bf 0100 0000 H...H.5.........
0x00000690 6448 8b04 2528 0000 0048 8944 2408 31c0 dH..%(...H.D$.1.
0x000006a0 e8ab ffff ff48 8d74 2404 488d 3d2f 0200 .....H.t$.H.=/..
0x000006b0 0031 c0e8 a8ff ffff 817c 2404 7d11 0000 .1.......|$.}...
0x000006c0 7423 488d 3d1a 0200 00e8 62ff ffff 31c0 t#H.=.....b...1.
0x000006d0 488b 5424 0864 4833 1425 2800 0000 7513 H.T$.dH3.%(...u.
0x000006e0 4883 c418 c348 8d3d d401 0000 e83f ffff H....H.=.....?..
0x000006f0 ffeb dbe8 48ff ffff 0f1f 8400 0000 0000 ....H...........
0x00000700 31ed 4989 d15e 4889 e248 83e4 f050 544c 1.I..^H..H...PTL
0x00000710 8d05 6a01 0000 488d 0df3 0000 0048 8d3d ..j...H......H.=
0x00000720 5cff ffff ff15 b608 2000 f40f 1f44 0000 \....... ....D..
0x00000730 488d 3dd9 0820 0055 488d 05d1 0820 0048 H.=.. .UH.... .H
0x00000740 39f8 4889 e574 1948 8b05 8a08 2000 4885 9.H..t.H.... .H.
0x00000750 c074 0d5d ffe0 662e 0f1f 8400 0000 0000 .t.]..f.........
0x00000760 5dc3 0f1f 4000 662e 0f1f 8400 0000 0000 ]...@.f.........
0x00000770 488d 3d99 0820 0048 8d35 9208 2000 5548 H.=.. .H.5.. .UH
Alright, that looks familiar. It looks like the output we get from xxd. Ok, so we can see the content, but that is not that useful; we were already able to do that using xxd. What we want now is to disassemble that content and see which instructions the disassembler gets from the binary format.
To do that, we can use the print command again, but with the option d for disassemble and f for function:
> pdf
p: Cannot find function at 0x0000076a
Oh, what happened there? radare2 doesn't know about any function at this address yet, so let's drop the f and just print the disassembly of the next 32 instructions:
> pd 32
;-- main:
0x0000076a 55 push rbp
0x0000076b 4889e5 mov rbp, rsp
0x0000076e 4883ec20 sub rsp, 0x20
0x00000772 897dec mov dword [rbp - 0x14], edi
0x00000775 488975e0 mov qword [rbp - 0x20], rsi
0x00000779 64488b042528. mov rax, qword fs:[0x28]
0x00000782 488945f8 mov qword [rbp - 8], rax
0x00000786 31c0 xor eax, eax
0x00000788 c745f47d1100. mov dword [rbp - 0xc], 0x117d
0x0000078f 488d3df20000. lea rdi, str.Hello__please_enter_your_4_pin_code: ; 0x888 ; "Hello, please enter your 4 pin code: "
0x00000796 b800000000 mov eax, 0
0x0000079b e890feffff call sym.imp.printf
0x000007a0 488d45f0 lea rax, [rbp - 0x10]
0x000007a4 4889c6 mov rsi, rax
0x000007a7 488d3d000100. lea rdi, [0x000008ae] ; "%d"
0x000007ae b800000000 mov eax, 0
0x000007b3 e888feffff call sym.imp.__isoc99_scanf
0x000007b8 8b45f0 mov eax, dword [rbp - 0x10]
0x000007bb 3945f4 cmp dword [rbp - 0xc], eax
┌─< 0x000007be 750e jne 0x7ce
│ 0x000007c0 488d3df10000. lea rdi, str.Ok_this_are_your_bank_details. ; 0x8b8 ; "Ok this are your bank details."
│ 0x000007c7 e844feffff call sym.imp.puts
┌──< 0x000007cc eb0c jmp 0x7da
│└─> 0x000007ce 488d3d020100. lea rdi, str.Wrong_pin__bye ; 0x8d7 ; "Wrong pin, bye!"
│ 0x000007d5 e836feffff call sym.imp.puts
└──> 0x000007da b800000000 mov eax, 0
0x000007df 488b55f8 mov rdx, qword [rbp - 8]
0x000007e3 644833142528. xor rdx, qword fs:[0x28]
┌─< 0x000007ec 7405 je 0x7f3
│ 0x000007ee e82dfeffff call sym.imp.__stack_chk_fail
└─> 0x000007f3 c9 leave
0x000007f4 c3 ret
That is the disassembly radare2 came up with. It looks like our main! To make the listing more informative, radare2 needs more context. If it knows which variables and functions the program uses, it can annotate the disassembly with them. To accomplish this, we are going to run the analyser with aaa (check a? for more options).
> aaa
[Cannot analyse at 0x00000650g with sym. and entry0 (aa)
[x] Analyse all flags starting with sym. and entry0 (aa)
[Cannot analyse at 0x00000650ac)
[x] Analyse function calls (aac)
[x] Analyse len bytes of instructions for references (aar)
[x] Check for objc references
[x] Check for vtables
[x] Type matching analysis for all functions (aaft)
[x] Propagate noreturn information
[x] Use -AA or aaaa to perform additional experimental analysis.
>
Alright, now radare2 has more context to draw from, and we can go back to pdf, which failed before because no function was known at this address:
> pdf
; DATA XREF from entry0 @ 0x67d
┌ 139: int main (int argc, char **argv, char **envp);
│ ; var char **var_20h @ rbp-0x20
│ ; var int64_t var_14h @ rbp-0x14
│ ; var int64_t var_10h @ rbp-0x10
│ ; var uint32_t var_ch @ rbp-0xc
│ ; var int64_t canary @ rbp-0x8
│ ; arg int argc @ rdi
│ ; arg char **argv @ rsi
│ 0x0000076a 55 push rbp
│ 0x0000076b 4889e5 mov rbp, rsp
│ 0x0000076e 4883ec20 sub rsp, 0x20
│ 0x00000772 897dec mov dword [var_14h], edi ; argc
│ 0x00000775 488975e0 mov qword [var_20h], rsi ; argv
│ 0x00000779 64488b042528. mov rax, qword fs:[0x28]
│ 0x00000782 488945f8 mov qword [canary], rax
│ 0x00000786 31c0 xor eax, eax
│ 0x00000788 c745f47d1100. mov dword [var_ch], 0x117d
│ 0x0000078f 488d3df20000. lea rdi, str.Hello__please_enter_your_4_pin_code: ; 0x888 ; "Hello, please enter your 4 pin code: " ; const char *format
│ 0x00000796 b800000000 mov eax, 0
│ 0x0000079b e890feffff call sym.imp.printf ; int printf(const char *format)
│ 0x000007a0 488d45f0 lea rax, [var_10h]
│ 0x000007a4 4889c6 mov rsi, rax
│ 0x000007a7 488d3d000100. lea rdi, [0x000008ae] ; "%d" ; const char *format
│ 0x000007ae b800000000 mov eax, 0
│ 0x000007b3 e888feffff call sym.imp.__isoc99_scanf ; int scanf(const char *format)
│ 0x000007b8 8b45f0 mov eax, dword [var_10h]
│ 0x000007bb 3945f4 cmp dword [var_ch], eax
│ ┌─< 0x000007be 750e jne 0x7ce
│ │ 0x000007c0 488d3df10000. lea rdi, str.Ok_this_are_your_bank_details. ; 0x8b8 ; "Ok, this is your bank account." ; const char *s
│ │ 0x000007c7 e844feffff call sym.imp.puts ; int puts(const char *s)
│ ┌──< 0x000007cc eb0c jmp 0x7da
│ ││ ; CODE XREF from main @ 0x7be
│ │└─> 0x000007ce 488d3d020100. lea rdi, str.Wrong_pin__bye ; 0x8d7 ; "Wrong pin, bye!" ; const char *s
│ │ 0x000007d5 e836feffff call sym.imp.puts ; int puts(const char *s)
│ │ ; CODE XREF from main @ 0x7cc
│ └──> 0x000007da b800000000 mov eax, 0
│ 0x000007df 488b55f8 mov rdx, qword [canary]
│ 0x000007e3 644833142528. xor rdx, qword fs:[0x28]
│ ┌─< 0x000007ec 7405 je 0x7f3
│ │ 0x000007ee e82dfeffff call sym.imp.__stack_chk_fail ; void __stack_chk_fail(void)
│ │ ; CODE XREF from main @ 0x7ec
│ └─> 0x000007f3 c9 leave
└ 0x000007f4 c3 ret
That's better. Now we even get a comment on top with all the local variables and their types.
Nice!
Ok, time to do some patching. That's what we came here for.
Patching
Look at the instruction at address 0x000007be: we jump to address 0x000007ce if the result of the comparison at 0x000007bb is NOT EQUAL. That means that if we input a number that is not 4477, we jump to the code that displays "Wrong pin, bye!". But if it's equal, execution continues with the instructions at 0x000007c0, which display "Ok, this is your bank account.".
We can see that the opcode for jne is 0x75 and the opcode for je (jump if equal) is 0x74. If we modify that opcode, we'll be able to input anything but the code 4477 and still see the "bank details".
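To see why swapping one byte is enough, here is a small Python sketch that decodes the two-byte short conditional jump from the listing. It only knows the two opcodes discussed here, and the function name is an illustrative choice. The second byte is a signed 8-bit displacement counted from the end of the instruction, so the jump target stays the same after the patch; only the condition flips.

```python
from typing import Tuple

SHORT_JUMPS = {0x74: "je", 0x75: "jne"}  # only the two opcodes used in this post


def decode_short_jump(address: int, raw: bytes) -> Tuple[str, int]:
    """Decode a 2-byte short conditional jump located at `address`."""
    opcode, rel8 = raw[0], raw[1]
    if opcode not in SHORT_JUMPS:
        raise ValueError(f"unsupported opcode 0x{opcode:02x}")
    # The displacement is a signed byte relative to the next instruction.
    displacement = rel8 - 0x100 if rel8 >= 0x80 else rel8
    return SHORT_JUMPS[opcode], address + 2 + displacement


if __name__ == "__main__":
    name, target = decode_short_jump(0x7BE, bytes.fromhex("750e"))
    print(name, hex(target))  # jne 0x7ce
    name, target = decode_short_jump(0x7BE, bytes.fromhex("740e"))
    print(name, hex(target))  # je 0x7ce
```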
Open a new shell. If we inspect the hex dump we got from xxd, we'll see that at address 0x000007be we find 0x750e. We want it to be 0x740e. Let's see how we can accomplish that.
We are first going to patch it using a regular editor and the hex dump from xxd. Later we'll use r2 to do the same. I'll use vim, but you can use anything you want. Let's begin by saving the hex dump to a file that we'll later modify using vim:
$ xxd a.out > a.hex
$ vi a.hex
If you see line 124 we get:
000007b0: 0000 00e8 88fe ffff 8b45 f039 45f4 750e .........E.9E.u.
You can see 750e :) perfect! Let's change it to 740e, and save. We can now restore from the hex dump format to a binary again using xxd with the -r option.
$ xxd -r a.hex > a-patched.out
# Let's add execution permissions
$ chmod u+x a-patched.out
# And run it
$ ./a-patched.out
Hello, please enter your 4 pin code: 5555
Ok, this is your bank account.
I entered 5555, which in the previous version should have shown "Wrong pin, bye!". But now we see "Ok, this is your bank account.". Nice :).
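The same edit can be scripted instead of done by hand in an editor. The following is a minimal Python sketch, not part of the original workflow: patch_copy and the output file name are hypothetical, and it assumes the file offset is 0x7be, which is where the xxd dump shows 750e. It checks the expected bytes before writing so a wrong offset doesn't silently corrupt the file.

```python
from pathlib import Path


def patch_copy(src: str, dst: str, offset: int, expected: bytes, new: bytes) -> None:
    """Write a copy of `src` to `dst` with `expected` at `offset` replaced by `new`."""
    if len(expected) != len(new):
        raise ValueError("replacement must have the same length as the original bytes")
    data = bytearray(Path(src).read_bytes())
    found = bytes(data[offset:offset + len(expected)])
    if found != expected:
        raise ValueError(f"expected {expected.hex()} at {offset:#x}, found {found.hex()}")
    data[offset:offset + len(new)] = new
    Path(dst).write_bytes(bytes(data))


# Example call for the binary in this post (the output name is arbitrary):
# patch_copy("a.out", "a-patched-py.out", 0x7BE, b"\x75\x0e", b"\x74\x0e")
```

As with the xxd route, the copy still needs chmod u+x before it can be run.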
Ok, let's do the same using r2. Back to the shell where we had r2 running. If you closed it, these are the commands we ran on r2:
$ r2 a.out
# analyse
> aaa
# seek to main
> s main
# print disasm function
> pdf
We need to move our focus to the address 0x000007be.
> s 0x000007be
If we print the hex dump at that address we'll see:
> px
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 750e 488d 3df1 0000 00e8 44fe ffff eb0c u.H.=.....D.....
0x000007ce 488d 3d02 0100 00e8 36fe ffff b800 0000 H.=.....6.......
So we need to change it to 74. When we load the binary into r2, it is opened read-only, so we need to switch to read-write mode using oo+ (read more using o?).
# We could have loaded the file using `-w` flag when calling `r2`
# $ r2 -w a.out
> oo+
> wv 0x740e
Good, let's print it:
> px
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 0e74 0000 3df1 0000 00e8 44fe ffff eb0c .t..=.....D.....
Oh no! We ruined it!
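Well, wv wrote a 4-byte (32-bit) value, so besides the two bytes we wanted to change, it also overwrote the next two (48 8d) with zeros. And the bytes are not in the order we expected either: the value was stored little-endian, least significant byte first.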
This is the perfect error. It teaches so much. Let's see the lessons learned. First, we need to be very careful. We can ruin a binary if we modify it carelessly. Second, always make a backup. Third, always take into account the binary's endianness (the order in which the bytes of a multi-byte value are stored). Let's fix it:
We had:
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 750e 488d
We now have:
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 0e74 0000
So we need to write:
> wv 0x8d480e74
Let's hex dump:
> px
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x000007be 740e 488d 3df1 0000 00e8 44fe ffff eb0c t.H.=.....D.....
0x000007ce 488d 3d02 0100 00e8 36fe ffff b800 0000 H.=.....6.......
That's better. Now we can close r2 and run a.out, and it should now accept any number (except 4477) as a valid pin to display "Ok, this is your bank account.".
$ ./a.out
Hello, please enter your 4 pin code: 7777
Ok, this is your bank account.
Great! That's it. Now it's your turn to explore and play more on your own with r2.
Final thoughts
As you can probably see, r2 is a very handy tool to master. You'll be able to work with binaries like magic. If you are inquisitive and read the help for w, you'll see that it would have been easier to use:
> wB 0x740e
# Or we could have used wx, for writing hex
# Or wv2 to write only 2 bytes (mind the byte order)
# check w? :) you'll see lots of options
But using wv gave us a way to explore some common pitfalls when working with binaries. Remember, it's always a good idea to read the manuals to learn more.
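The byte order behind that pitfall is easy to reproduce outside r2. This Python sketch (the function name is made up for illustration) shows which bytes a little-endian write of a given width stores, and matches the dumps from the patching session: a 4-byte write of 0x740e leaves 0e 74 00 00, while 0x8d480e74 restores 74 0e 48 8d.

```python
def little_endian_bytes(value: int, size: int = 4) -> bytes:
    """Return the bytes a little-endian write of `value` with `size` bytes stores."""
    return value.to_bytes(size, "little")


if __name__ == "__main__":
    print(little_endian_bytes(0x740E).hex())      # 0e740000
    print(little_endian_bytes(0x8D480E74).hex())  # 740e488d
    print(little_endian_bytes(0x0E74, 2).hex())   # 740e
```

The last line shows that a 2-byte write would have been enough, as long as the value is given with the bytes swapped.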
Every tutorial or blog post sets out to show a specific point, so the examples you'll see are biased towards it. Don't take it as the only way to do things; search deeper. As we saw, you could have accomplished the same result using xxd and vi. Also, we could have changed the number "4477" to something else and input that number, or just read the code and seen that we needed to input "4477".
It is important to know that there are always many avenues to solve the same problems. A useful skill is knowing which tools are better for which case.
Ok, that's it for this post. Until next time.
Related topics/notes of interest
- Radare2 GitHub repository.
- You could do a simple binary diffing, using diff or vimdiff, and generating a hex dump using xxd:
$ xxd binary1 > binary1.hex
$ xxd binary2 > binary2.hex
$ vimdiff binary1.hex binary2.hex
You could generate a diff there and use the patch tool to apply it to another hex dump. Not sure when you would use this method, but it's good to know it's available.
- Another solution to our problem could be to use ddisasm to get the assembly language program, modify it, and generate the binary from the modified assembly code. You can see the proposed code in this thread on reddit.
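For equally sized files, the hex-dump diffing idea from the list above can also be sketched directly on the raw bytes. This is a minimal Python illustration with a hypothetical function name; it simply lists every offset where two byte strings differ.

```python
from typing import List, Tuple


def diff_bytes(old: bytes, new: bytes) -> List[Tuple[int, int, int]]:
    """List (offset, old_byte, new_byte) for each position where the inputs differ."""
    if len(old) != len(new):
        raise ValueError("inputs must have the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


if __name__ == "__main__":
    before = bytes.fromhex("750e488d")
    after = bytes.fromhex("740e488d")
    for offset, a, b in diff_bytes(before, after):
        print(f"{offset:#x}: {a:02x} -> {b:02x}")  # 0x0: 75 -> 74
```

Run on the contents of a backup of a.out and the patched file, it should report a single changed byte if the patch went as planned.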